-
Notifications
You must be signed in to change notification settings - Fork 28.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-17676][CORE] FsHistoryProvider should ignore hidden files #15250
Conversation
FsHistoryProvider was writing a hidden file (to check the fs's clock). Even though it deleted the file immediately, sometimes another thread would try to scan the files on the fs in-between, and then there would be an error msg logged which was very misleading for the end-user. (The logged error was harmless, though.)
Test build #65942 has finished for PR 15250 at commit
|
@vanzin woudl appreciate it if you can take a look at this -- related to the change you reviewed back here #5886 (comment) |
Test build #65997 has finished for PR 15250 at commit
|
@@ -394,6 +393,30 @@ class FsHistoryProviderSuite extends SparkFunSuite with BeforeAndAfter with Matc | |||
} | |||
} | |||
|
|||
test("ignore hidden files") { | |||
// FsHistoryProvider should ignore hidden files. (It even writes out a hidden file itself |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A better test would be to write a valid log as a hidden file, and check that it wasn't read, if that's the goal of the change.
@@ -290,7 +290,12 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock) | |||
.filter { entry => | |||
try { | |||
val prevFileSize = fileToAppInfo.get(entry.getPath()).map{_.fileSize}.getOrElse(0L) | |||
!entry.isDirectory() && prevFileSize < entry.getLen() | |||
!entry.isDirectory() && | |||
// FsHistoryProvider generates a hidden file which can't be read, and the user may |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor, but I'd avoid mentioning user-created files here. Because users can create non-hidden files too. The mod time tracking code should take care of not parsing those files multiple times.
Looks ok but I think the test could be changed to actually ensure the change is working. |
Test build #66121 has finished for PR 15250 at commit
|
LGTM, merging to master. |
What changes were proposed in this pull request?
FsHistoryProvider was writing a hidden file (to check the fs's clock).
Even though it deleted the file immediately, sometimes another thread
would try to scan the files on the fs in-between, and then there would
be an error msg logged which was very misleading for the end-user.
(The logged error was harmless, though.)
How was this patch tested?
I added one unit test, but to be clear, that test was passing before. The actual change in behavior in that test is just logging (after the change, there is no more logged error), which I just manually verified.