
[SPARK-7336][HistoryServer] Fix bug that applications status uncorrect on JobHistory UI. #5886

Closed

Conversation

ArcherShao
Contributor

No description provided.

@ArcherShao ArcherShao changed the title Fix bug that applications status uncorrect on JobHistory UI. [SPARK-7336][HistoryServer] Fix bug that applications status uncorrect on JobHistory UI. May 4, 2015
@SparkQA

SparkQA commented May 4, 2015

Test build #31755 has finished for PR 5886 at commit e70bedc.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented May 4, 2015

On inspection, I don't agree with the scenario you suggest. You say that if a last modification time is T, and a check happens at T + t/2, then it won't notice a file update at T + t/4. But the logic in this case finds files modified >= T. Why would this not update?
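
For reference, the check under discussion is essentially the following (a paraphrased sketch of the polling filter, not an exact quote of FsHistoryProvider; statusList and lastModifiedTime stand in for the real fields):

    import org.apache.hadoop.fs.FileStatus

    // Each poll keeps only entries at least as new as the last recorded
    // modification time; anything older is assumed to be already processed.
    def newLogs(statusList: Seq[FileStatus], lastModifiedTime: Long): Seq[FileStatus] =
      statusList.filter(_.getModificationTime >= lastModifiedTime)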

@vanzin
Contributor

vanzin commented May 4, 2015

I'm not convinced either. Please write a test that fails with the current code and passes with your changes. Also, storing another huge map in FsHistoryProvider is the wrong approach to solving any bug.

@ArcherShao
Contributor Author

@srowen Checking all the log files once takes some time, say t. Suppose the first file (app1) is checked at time T and the last file (appN) is checked at time T + t, while app1's last modification time is T + t/4 and appN's is T + t/2. Then app1 will be ignored in this check, lastModifiedTime will be updated to T + t/2, and app1 will keep being ignored in subsequent checks.

@vanzin
Contributor

vanzin commented May 5, 2015

I still don't understand. The "last modification time" is not based on how long it took to parse logs. It's based on the modification time of the newest log found.

If you scan files at time t, all logs will have modification time <= t, unless some code is doing something funny. The lastModifiedTime value in FsHistoryProvider will be at most t. So if app1's log file changes after that poll, its modification time will be > t and it will be re-loaded the next time a poll happens.

But feel free to prove me wrong by writing a test for this.

@ArcherShao
Contributor Author

Scanning the files takes time: say the scan starts at time T and finishes at T + t. If an update happens to the first file between T and T + t, the bug reproduces.

@srowen
Member

srowen commented May 5, 2015

I think this may be the problem scenario:

  • Log file last modified at time 80
  • lastModifiedTime begins at 90 due to other log files
  • Scan starts at 100
  • Log file is excluded from the update (80 < 90)
  • Log file is modified at time 101
  • Scan finds another log file modified at time 102
  • Scan ends and lastModifiedTime becomes 102
  • The next run won't pick up the log file since 101 < 102, even though it had modifications

- time >= lastModifiedTime
+ val fileName = entry.getPath.getName
+ val oldLastModifiedTime = lastModifiedTimes.getOrElse(fileName, -1L)
+ time > oldLastModifiedTime
Member

I think technically this has to be >= since you might have an update that occurred in the same millisecond. That's very unlikely. But is there harm in reprocessing logs in this code? Does it have to guarantee a log isn't read twice?

Contributor Author

If the app is in progress, it doesn't matter if we don't reprocess the log file. After the app completes, its file name changes (the .inprogress suffix is dropped), so it gets oldLastModifiedTime = -1, time > oldLastModifiedTime is true, and the log file is reprocessed.

@SparkQA

SparkQA commented May 5, 2015

Test build #31875 has finished for PR 5886 at commit 9dcedd0.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -319,6 +318,15 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)

applications = appsToRetain

val modifiedTimesToRetain = new mutable.LinkedHashMap[String, Long]()
Member

What about simply deleting the entries that are too old? Then the member can be a val and doesn't need to be @volatile.

@vanzin
Contributor

vanzin commented May 5, 2015

Scan finds another log file modified at time 102

This shouldn't technically happen, since the scan started at time 100. The listing is done in one call, so it's very unlikely such a situation would exist. Still, it might be possible. But in any case, this needs a test.

@vanzin
Contributor

vanzin commented May 5, 2015

@srowen I guess the part that I don't follow in your example is: how can a scan see a file with mod time = 102 but not see a file with mod time = 101 in the same listing call? Unless HDFS is really screwed up (unlikely), or the user is manually messing with the files' times, this shouldn't happen.

@liyezhang556520
Contributor

@vanzin, there is a time interval between getting the first file's modification time and getting the last file's. Assume there are 3 files: F1, F2, F3, and that before scanning their modification times are TF1=100, TF2=101, TF3=102 respectively.
At time T1=103, we start scanning.
At time T2=104, we finish loading F1's mod time and start loading F2's mod time.
At time T3=107, we finish loading F2's mod time. At this point, lastModifiedTime is 101, which equals F2's mod time, TF2. And while F2's mod time was being loaded, two writes happened:
First, at time T4=105, contents were written to F1, changing F1's mod time from TF1=100 to TF1'=105.
Second, at time T5=106, contents were written to F3, changing F3's mod time from TF3=102 to TF3'=106.

Then we continue to load F3's mod time, and at time T6=108 we finish. At this point, lastModifiedTime is 106.

So in the next round we would not pick up F1, even though it was modified at time T4=105.

val fileName = entry.getPath.getName
val oldLastModifiedTime = lastModifiedTimes.getOrElse(fileName, -1L)
lastModifiedTimes += (fileName -> time)
time > oldLastModifiedTime
Contributor

We cannot use > to check whether the file was modified; we must use >=. A file might be modified immediately before and immediately after getModificationTime is called, with the modification time staying the same. So we need to use >= along with a check that the file size changed. For some discussion, please refer to SPARK-7189.
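
A minimal sketch of the combined check described here (hypothetical names; not code from this PR):

    // A file counts as updated if its mod time moved forward, or if the mod
    // time is unchanged but the size differs, that is, a write landed in the
    // same millisecond as the previous check.
    def isUpdated(time: Long, size: Long, oldTime: Long, oldSize: Long): Boolean =
      time > oldTime || (time == oldTime && size != oldSize)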

@vanzin
Contributor

vanzin commented May 6, 2015

At time T2=104, we finish loading F1's mod time and start loading F2's mod time.

That step doesn't exist. All mod times are retrieved in the same call: there's a single call to "listStatus" which returns all available log files. And log files are now files, not directories as they used to be; if that were still the case, you'd be correct.

So you should never, in the same call to listStatus, get "f2 with mod time 102" and not get "f1 with mod time 101".

@vanzin
Contributor

vanzin commented May 6, 2015

And BTW I'm still waiting for that test that shows an issue exists here.

@liyezhang556520
Contributor

Hi @vanzin, you are correct that a single call to listStatus will return the information for all the available log files, but I'm not sure about the details of listStatus in HDFS: it seems that when there are too many items in the directory, HDFS fetches them in batches, and between batches there is a time gap in which write operations can be interleaved. So my concern is whether the getModificationTime provided by HDFS FileStatus can be fully trusted; maybe there are scenarios that HDFS can guarantee.

@srowen
Member

srowen commented May 6, 2015

@vanzin yes, the question is really what can happen inside listStatus. It's not a question of correctness or trust, @liyezhang556520, but just of what can happen concurrently.

  • listStatus begins at 99
  • Finds old last modified time at 100
  • File modified at 101
  • Call returns at 102 with last modified time earlier than 101

This could be my ignorance, but is this metadata update atomic in the NameNode? Meaning, as far as HDFS is concerned, the file could not have been modified at 101, since it was busy with listStatus and would record the modification at some time >= 102?

... but this boxes the issue into a really small corner. What happens in this case? I realize that some log file doesn't get processed until a bit later, but does the subsequent processing then go wrong? If the application state isn't corrupted or wrong afterwards, I think this isn't worth addressing.

@vanzin
Contributor

vanzin commented May 6, 2015

Ok, I buy that scenario. But as I mentioned before, the current solution is not very good: it increases the memory usage of the history server too much. If you have n log files, now during polling your HS will need order of 3 * n memory (instead of the current 2 * n, which is not great but we shouldn't make it worse).

My suggestion: before doing a listStatus, create an empty file in the log directory and retrieve its mod time. Use that as newLastModifiedTime. So you don't need to keep another map with every single log file available. Next time you poll, any modifications that happen during the listStatus call would be caught since lastModificationTime is guaranteed to be before anything that happened during the listStatus. (And lastModificationTime basically becomes lastPollTime.)

That may exacerbate the issue raised in SPARK-7189, though.

I realize that some log file doesn't get processed until a bit later

That's a good question. Because if the mod time changes, then the file is being written to, and its mod time will eventually change again. But I guess the same race can occur if the file is closed / renamed during an app shutdown, while the HS is doing the listing?

@ArcherShao
Contributor Author

@vanzin Thanks for your suggestion. Should the empty file's name be a fixed value, or a random value each time it's created?

@vanzin
Contributor

vanzin commented May 7, 2015

Should the empty file's name be a fixed value, or a random value each time it's created?

Random is probably better, potentially prefixed by . to follow the "hidden file" Unix convention. Also, it should be deleted so that these files don't accumulate (and so you don't need to filter it out when doing listStatus).
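
Taken together, the suggestion looks roughly like this (a sketch assuming the Hadoop FileSystem API; the helper name getNewLastScanTime matches the one that later shows up in the diff, but the body here is illustrative, with error handling omitted):

    import java.util.UUID
    import org.apache.hadoop.fs.{FileSystem, Path}

    // Touch a hidden, randomly named marker file, read back its modification
    // time, and delete it. The timestamp is taken before listStatus runs, so
    // anything modified during the listing compares >= to it and gets picked
    // up again on the next poll.
    def getNewLastScanTime(fs: FileSystem, logDir: String): Long = {
      val path = new Path(logDir, "." + UUID.randomUUID().toString)
      val out = fs.create(path)
      try {
        out.close()
        fs.getFileStatus(path).getModificationTime
      } finally {
        fs.delete(path, true) // don't let marker files accumulate
      }
    }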

@SparkQA

SparkQA commented May 8, 2015

Test build #32180 has finished for PR 5886 at commit d966321.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -184,15 +183,14 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
*/
private[history] def checkForLogs(): Unit = {
try {
val newLastScanTime = getNewLastScanTime()
Member

So, doesn't this have the opposite 'problem': that a log file will get scanned twice in some cases, even if it hasn't changed?

If lastScanTime is 90, and newLastScanTime is 100, and a file is modified (once) at 101 (just after newLastScanTime is established), then it will be read twice. I'm just double-checking that this is fine if it happens only once in a while.

Touching a file seems a little icky, but I understand the logic. I can't think of something better that doesn't involve listing the dir again, processing files again frequently, or taking arbitrary guesses about how long the listing takes.

This is worth it in the sense that the cost of missing an update in this very rare case is high? Like a correctness issue? You'd miss a bit of history forever?

Contributor

That's what I meant when I said my suggestion would make SPARK-7189 worse. There are ways to fix it, such as those discussed in the bug, that don't require too much extra memory in the HS: basically, keep track of all log files that have lastModified > newLastScanTime and, if they haven't changed by the next poll, don't re-parse them. That's a lot less state to keep track of than the previous approach in this PR; in fact, most of the time there will be nothing to keep track of.
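
A rough sketch of that bookkeeping (assumed names; not the code in this PR). Only files whose mod time is at or past the scan marker need remembering, so the map stays empty almost all of the time:

    import scala.collection.mutable

    // file name -> mod time of files parsed at/after the last scan time
    val recentlyParsed = new mutable.HashMap[String, Long]()

    // For a file that already passed the `modTime >= lastScanTime` filter:
    // re-parse only if its mod time actually moved since we last saw it.
    def shouldReparse(name: String, modTime: Long): Boolean = {
      val changed = !recentlyParsed.get(name).contains(modTime)
      recentlyParsed.update(name, modTime)
      changed
    }

    // After each poll, drop entries that can no longer be double-read.
    def prune(newLastScanTime: Long): Unit =
      recentlyParsed.retain((_, t) => t >= newLastScanTime)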

@srowen
Member

srowen commented May 8, 2015

Yeah, you have a good point that there's not really a way to 100% solve similar very rare scenarios, like a file rename or deletion. This isn't an issue that affects app correctness, right? Just collecting logs? Which is important, but I'm wondering if this approach would only open up other rare problems: log collection failing entirely because touching a file fails, etc. Is it even worth trying to address this? It could at least be documented in comments.

@vanzin
Contributor

vanzin commented May 8, 2015

This isn't an issue that affects app correctness, right?

It depends on what you're calling "app correctness" here. Isn't missing an updated / renamed log a correctness issue?

log collection failing entirely because touching a file fails

True, but that should only cause one poll to fail. It shouldn't cause the HS to go down. If the HS can't persistently write to HDFS, that means some configuration is wrong, since the history directory is expected to be world-writable.

Is it even worth trying to address this?

I think it's worth it if it can be done without harming other parts of the HS (like requiring way more memory than before). I think the latest approach is close; if some code is added to address SPARK-7189, then it would be ready to go.

The issue raised by this PR should be a really, really rare situation, though.

@SparkQA

SparkQA commented May 11, 2015

Test build #32359 has finished for PR 5886 at commit e0c82ec.

  • This patch passes all tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented May 11, 2015

Test build #32376 has finished for PR 5886 at commit 89436bb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented May 12, 2015

Test build #32456 has finished for PR 5886 at commit 7081da0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@liyezhang556520
Contributor

@vanzin, the current implementation will make SPARK-7189 worse. What about introducing a hash map that maintains each file's name, modification time, and size, say mutable.HashMap[String, (Long, Long)]? It can handle not only the modification case but also the rename/delete cases. Since each file's modification time is maintained, this could solve both the race condition in this issue and SPARK-7189, and it only adds extra memory proportional to the map's size.
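
For concreteness, the proposal amounts to something like this (a hypothetical sketch; a Scala HashMap takes a single value type, so the two longs are paired):

    import scala.collection.mutable

    // file name -> (modification time, size), one entry per known log file.
    // A changed pair means the file was modified; a name missing from the
    // listing was renamed or deleted.
    val fileMeta = new mutable.HashMap[String, (Long, Long)]()

    def changed(name: String, modTime: Long, size: Long): Boolean =
      !fileMeta.get(name).contains((modTime, size))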

@vanzin
Contributor

vanzin commented May 12, 2015

What about introducing a hash map that maintains each file's name, modification time, and size?

That is pretty expensive for all files; hash maps use a ton of extra memory per entry. If you do it just for the files that have modTime > lastScanTime, that's a lot more manageable. Not great, but I can't think of anything better at the moment.

@SparkQA

SparkQA commented May 22, 2015

Test build #33317 has finished for PR 5886 at commit 67c074c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented May 23, 2015

Test build #33401 has finished for PR 5886 at commit 68a9633.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

val path = new Path(logDir, fileName)
val fos = fs.create(path)

val newLastScanTime =
Contributor

You don't actually need to declare this variable.

@vanzin
Contributor

vanzin commented Jul 23, 2015

The problem @srowen mentions exists, but I think in the end it's not really a problem; that will mostly happen when applications are actively writing to their logs, and in those cases we do want to reload the file to get new events.

The current change looks ok to me (with one minor nit), but you'll need to update the code to match the current master.

@vanzin
Contributor

vanzin commented Jul 23, 2015

(Also, if there's a test that can be written - or an existing one modified - to verify the new behavior, it wouldn't hurt.)

@andrewor14
Contributor

@ArcherShao can you rebase to master?

@SparkQA

SparkQA commented Sep 2, 2015

Test build #41909 has finished for PR 5886 at commit a4a876a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -19,6 +19,7 @@ package org.apache.spark.deploy.history

import java.io.{BufferedInputStream, FileNotFoundException, InputStream, IOException, OutputStream}
import java.util.concurrent.{ExecutorService, Executors, TimeUnit}
import java.util.UUID
Contributor

nit: out of order

@vanzin
Contributor

vanzin commented Sep 2, 2015

LGTM, I'll merge and fix the nit in the process.
