Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-17676][CORE] FsHistoryProvider should ignore hidden files #15250

Closed
wants to merge 3 commits into from

Conversation

squito
Copy link
Contributor

@squito squito commented Sep 27, 2016

What changes were proposed in this pull request?

FsHistoryProvider was writing a hidden file (to check the fs's clock).
Even though it deleted the file immediately, sometimes another thread
would try to scan the files on the fs in-between, and then there would
be an error msg logged which was very misleading for the end-user.
(The logged error was harmless, though.)

How was this patch tested?

I added one unit test, but to be clear, that test was passing before. The actual change in behavior in that test is just logging (after the change, there is no more logged error), which I just manually verified.

FsHistoryProvider was writing a hidden file (to check the fs's clock).
Even though it deleted the file immediately, sometimes another thread
would try to scan the files on the fs in-between, and then there would
be an error msg logged which was very misleading for the end-user.
(The logged error was harmless, though.)
@SparkQA
Copy link

SparkQA commented Sep 27, 2016

Test build #65942 has finished for PR 15250 at commit b125f2f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@squito
Copy link
Contributor Author

squito commented Sep 27, 2016

@vanzin woudl appreciate it if you can take a look at this -- related to the change you reviewed back here #5886 (comment)

@SparkQA
Copy link

SparkQA commented Sep 27, 2016

Test build #65997 has finished for PR 15250 at commit 9c9012e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -394,6 +393,30 @@ class FsHistoryProviderSuite extends SparkFunSuite with BeforeAndAfter with Matc
}
}

test("ignore hidden files") {
// FsHistoryProvider should ignore hidden files. (It even writes out a hidden file itself
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A better test would be to write a valid log as a hidden file, and check that it wasn't read, if that's the goal of the change.

@@ -290,7 +290,12 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
.filter { entry =>
try {
val prevFileSize = fileToAppInfo.get(entry.getPath()).map{_.fileSize}.getOrElse(0L)
!entry.isDirectory() && prevFileSize < entry.getLen()
!entry.isDirectory() &&
// FsHistoryProvider generates a hidden file which can't be read, and the user may
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor, but I'd avoid mentioning user-created files here. Because users can create non-hidden files too. The mod time tracking code should take care of not parsing those files multiple times.

@vanzin
Copy link
Contributor

vanzin commented Sep 28, 2016

Looks ok but I think the test could be changed to actually ensure the change is working.

@SparkQA
Copy link

SparkQA commented Sep 29, 2016

Test build #66121 has finished for PR 15250 at commit 2d37bd3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Copy link
Contributor

vanzin commented Sep 29, 2016

LGTM, merging to master.

@asfgit asfgit closed this in 3993ebc Sep 29, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants