
[SPARK-7336][HistoryServer] Fix bug that applications status uncorrect on JobHistory UI. #5886

Closed

Conversation

ArcherShao
Contributor

No description provided.

@ArcherShao ArcherShao changed the title Fix bug that applications status uncorrect on JobHistory UI. [SPARK-7336][HistoryServer] Fix bug that applications status uncorrect on JobHistory UI. May 4, 2015
@SparkQA

SparkQA commented May 4, 2015

Test build #31755 has finished for PR 5886 at commit e70bedc.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented May 4, 2015

On inspection, I don't agree with the scenario you suggest. You say that if a last modification time is T, and a check happens at T + t/2, then it won't notice a file update at T + t/4. But the logic in this case finds files modified >= T. Why would this not update?
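
For reference, the check under discussion is essentially the following (a paraphrased sketch of the polling filter, not an exact quote of FsHistoryProvider; statusList and lastModifiedTime stand in for the real fields):

    import org.apache.hadoop.fs.FileStatus

    // Each poll keeps only entries at least as new as the last recorded
    // modification time; anything older is assumed to be already processed.
    def newLogs(statusList: Seq[FileStatus], lastModifiedTime: Long): Seq[FileStatus] =
      statusList.filter(_.getModificationTime >= lastModifiedTime)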

@vanzin
Contributor

vanzin commented May 4, 2015

I'm not convinced either. Please write a test that fails with the current code and passes with your changes. Also, storing another huge map in FsHistoryProvider is the wrong approach to solving any bug.

@ArcherShao
Contributor Author

@srowen Checking all the log files once takes some time, say t. Suppose the first file (app1) is checked at time T and the last file (appN) is checked at time T + t, while app1's last modification time is T + t/4 and appN's is T + t/2. Then app1 will be ignored in this check, lastModifiedTime will be updated to T + t/2, and app1 will keep being ignored in subsequent checks.

@vanzin
Contributor

vanzin commented May 5, 2015

I still don't understand. The "last modification time" is not based on how long it took to parse logs. It's based on the modification time of the newest log found.

If you scan files at time t, all logs will have modification time <= t, unless some code is doing something funny. The lastModifiedTime value in FsHistoryProvider will be at most t. So if app1's log file changes after that poll, its modification time will be > t and it will be re-loaded the next time a poll happens.

But feel free to prove me wrong by writing a test for this.

@ArcherShao
Contributor Author

Scanning the files takes time: say the scan starts at time T and finishes at T + t. If an update happens to the first file between T and T + t, the bug reproduces.

@srowen
Member

srowen commented May 5, 2015

I think this may be the problem scenario:

  • Log file last modified at time 80
  • lastModifiedTime begins at 90 due to other log files
  • Scan starts at 100
  • Log file is excluded from the update (80 < 90)
  • Log file is modified at time 101
  • Scan finds another log file modified at time 102
  • Scan ends and lastModifiedTime becomes 102
  • The next run won't pick up the log file since 101 < 102, even though it had modifications

- time >= lastModifiedTime
+ val fileName = entry.getPath.getName
+ val oldLastModifiedTime = lastModifiedTimes.getOrElse(fileName, -1L)
+ time > oldLastModifiedTime
Member

I think technically this has to be >= since you might have an update that occurred in the same millisecond. That's very unlikely. But is there harm in reprocessing logs in this code? Does it have to guarantee a log isn't read twice?

Contributor Author

If the app is in progress, it doesn't matter if we don't reprocess the log file. After the app completes, its file name changes (the .inprogress suffix is dropped), so it gets oldLastModifiedTime = -1, time > oldLastModifiedTime is true, and the log file is reprocessed.

@SparkQA

SparkQA commented May 5, 2015

Test build #31875 has finished for PR 5886 at commit 9dcedd0.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -319,6 +318,15 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)

applications = appsToRetain

val modifiedTimesToRetain = new mutable.LinkedHashMap[String, Long]()
Member

What about simply deleting the entries that are too old? Then the member can be a val and doesn't need to be @volatile.

@vanzin
Contributor

vanzin commented May 5, 2015

Scan finds another log file modified at time 102

This shouldn't technically happen, since the scan started at time 100. The listing is done in one call, so it's very unlikely such a situation would exist. Still, it might be possible. But in any case, this needs a test.

@vanzin
Contributor

vanzin commented May 5, 2015

@srowen I guess the part that I don't follow in your example is: how can a scan see a file with mod time = 102 but not see a file with mod time = 101 in the same listing call? Unless HDFS is really screwed up (unlikely), or the user is manually messing with the files' times, this shouldn't happen.

@liyezhang556520
Contributor

@vanzin, there is a time interval between getting the first file's modification time and getting the last file's. Assume there are 3 files: F1, F2, F3, and that before scanning their modification times are TF1=100, TF2=101, TF3=102 respectively.
At time T1=103, we start scanning.
At time T2=104, we finish loading F1's mod time and start loading F2's mod time.
At time T3=107, we finish loading F2's mod time. At this point, lastModifiedTime is 101, which equals F2's mod time, TF2. And while F2's mod time was being loaded, two writes happened:
First, at time T4=105, contents were written to F1, changing F1's mod time from TF1=100 to TF1'=105.
Second, at time T5=106, contents were written to F3, changing F3's mod time from TF3=102 to TF3'=106.

Then we continue to load F3's mod time, and at time T6=108 we finish. At this point, lastModifiedTime is 106.

So in the next round we would not pick up F1, even though it was modified at time T4=105.

val fileName = entry.getPath.getName
val oldLastModifiedTime = lastModifiedTimes.getOrElse(fileName, -1L)
lastModifiedTimes += (fileName -> time)
time > oldLastModifiedTime
Contributor

We cannot use > to check whether the file was modified; we must use >=. A file might be modified immediately before and immediately after getModificationTime is called, with the modification time staying the same. So we need to use >= along with a check that the file size changed. For some discussion, please refer to SPARK-7189.
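
A minimal sketch of the combined check described here (hypothetical names; not code from this PR):

    // A file counts as updated if its mod time moved forward, or if the mod
    // time is unchanged but the size differs, that is, a write landed in the
    // same millisecond as the previous check.
    def isUpdated(time: Long, size: Long, oldTime: Long, oldSize: Long): Boolean =
      time > oldTime || (time == oldTime && size != oldSize)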

@vanzin
Contributor

vanzin commented May 6, 2015

At time T2=104, we finish loading F1's mod time and start loading F2's mod time.

That step doesn't exist. All mod times are retrieved in the same call: there's a single call to "listStatus" which returns all available log files. And log files are now files, not directories as they used to be; if that were still the case, you'd be correct.

So you should never, in the same call to listStatus, get "f2 with mod time 102" and not get "f1 with mod time 101".

@vanzin
Contributor

vanzin commented May 6, 2015

And BTW I'm still waiting for that test that shows an issue exists here.

@liyezhang556520
Contributor

Hi @vanzin, you are correct that a single call to listStatus will return the information for all the available log files, but I'm not sure about the details of listStatus in HDFS: it seems that when there are too many items in the directory, HDFS fetches them in batches, and between batches there is a time gap in which write operations can be interleaved. So my concern is whether the getModificationTime provided by HDFS FileStatus can be fully trusted; maybe there are scenarios that HDFS can guarantee.

@srowen
Member

srowen commented May 6, 2015

@vanzin yes, the question is really what can happen inside listStatus. It's not a question of correctness or trust, @liyezhang556520, but just of what can happen concurrently.

  • listStatus begins at 99
  • Finds old last modified time at 100
  • File modified at 101
  • Call returns at 102 with last modified time earlier than 101

This could be my ignorance, but is this metadata update atomic in the NameNode? Meaning, as far as HDFS is concerned, the file could not have been modified at 101, since it was busy with listStatus and would record the modification at some time >= 102?

... but this boxes the issue into a really small corner. What happens in this case? I realize that some log file doesn't get processed until a bit later, but does the subsequent processing then go wrong? If the application state isn't corrupted or wrong afterwards, I think this isn't worth addressing.

@vanzin
Contributor

vanzin commented May 6, 2015

Ok, I buy that scenario. But as I mentioned before, the current solution is not very good: it increases the memory usage of the history server too much. If you have n log files, now during polling your HS will need order of 3 * n memory (instead of the current 2 * n, which is not great but we shouldn't make it worse).

My suggestion: before doing a listStatus, create an empty file in the log directory and retrieve its mod time. Use that as newLastModifiedTime. So you don't need to keep another map with every single log file available. Next time you poll, any modifications that happen during the listStatus call would be caught since lastModificationTime is guaranteed to be before anything that happened during the listStatus. (And lastModificationTime basically becomes lastPollTime.)

That may exacerbate the issue raised in SPARK-7189, though.

I realize that some log file doesn't get processed until a bit later

That's a good question. Because if the mod time changes, then the file is being written to, and its mod time will eventually change again. But I guess the same race can occur if the file is closed / renamed during an app shutdown, while the HS is doing the listing?

@ArcherShao
Contributor Author

@vanzin Thanks for your suggestion. Should the empty file's name be a fixed value, or a random value each time it's created?

@vanzin
Contributor

vanzin commented May 7, 2015

Should the empty file's name be a fixed value, or a random value each time it's created?

Random is probably better, potentially prefixed by . to follow the "hidden file" Unix convention. Also, it should be deleted so that these files don't accumulate (and so you don't need to filter it out when doing listStatus).
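
Taken together, the suggestion looks roughly like this (a sketch assuming the Hadoop FileSystem API; the helper name getNewLastScanTime matches the one that later shows up in the diff, but the body here is illustrative, with error handling omitted):

    import java.util.UUID
    import org.apache.hadoop.fs.{FileSystem, Path}

    // Touch a hidden, randomly named marker file, read back its modification
    // time, and delete it. The timestamp is taken before listStatus runs, so
    // anything modified during the listing compares >= to it and gets picked
    // up again on the next poll.
    def getNewLastScanTime(fs: FileSystem, logDir: String): Long = {
      val path = new Path(logDir, "." + UUID.randomUUID().toString)
      val out = fs.create(path)
      try {
        out.close()
        fs.getFileStatus(path).getModificationTime
      } finally {
        fs.delete(path, true) // don't let marker files accumulate
      }
    }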

@SparkQA

SparkQA commented May 8, 2015

Test build #32180 has finished for PR 5886 at commit d966321.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -184,15 +183,14 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
*/
private[history] def checkForLogs(): Unit = {
try {
val newLastScanTime = getNewLastScanTime()
Member

So, doesn't this have the opposite 'problem': that a log file will get scanned twice in some cases, even if it hasn't changed?

If lastScanTime is 90, and newLastScanTime is 100, and a file is modified (once) at 101 (just after newLastScanTime is established), then it will be read twice. I'm just double-checking that this is fine if it happens only once in a while.

Touching a file seems a little icky, but I understand the logic. I can't think of something better that doesn't involve listing the dir again, processing files again frequently, or taking arbitrary guesses about how long the listing takes.

This is worth it in the sense that the cost of missing an update in this very rare case is high? Like a correctness issue? You'd miss a bit of history forever?

Contributor

That's what I meant when I said my suggestion would make SPARK-7189 worse. There are ways to fix it, such as those discussed in the bug, that don't require too much extra memory in the HS: basically, keep track of all log files that have lastModified > newLastScanTime and, if they haven't changed by the next poll, don't re-parse them. That's a lot less state to keep track of than the previous approach in this PR; in fact, most of the time there will be nothing to keep track of.
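
A rough sketch of that bookkeeping (assumed names; not the code in this PR). Only files whose mod time is at or past the scan marker need remembering, so the map stays empty almost all of the time:

    import scala.collection.mutable

    // file name -> mod time of files parsed at/after the last scan time
    val recentlyParsed = new mutable.HashMap[String, Long]()

    // For a file that already passed the `modTime >= lastScanTime` filter:
    // re-parse only if its mod time actually moved since we last saw it.
    def shouldReparse(name: String, modTime: Long): Boolean = {
      val changed = !recentlyParsed.get(name).contains(modTime)
      recentlyParsed.update(name, modTime)
      changed
    }

    // After each poll, drop entries that can no longer be double-read.
    def prune(newLastScanTime: Long): Unit =
      recentlyParsed.retain((_, t) => t >= newLastScanTime)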

@srowen
Member

srowen commented May 8, 2015

Yeah, you have a good point that there's not really a way to 100% solve similar very rare scenarios, like a file rename or deletion. This isn't an issue that affects app correctness, right? Just collecting logs? Which is important, but I'm wondering if this approach would only open up other rare problems: log collection failing entirely because touching a file fails, etc. Is it even worth trying to address this? It could at least be documented in comments.

@vanzin
Contributor

vanzin commented May 8, 2015

This isn't an issue that affects app correctness, right?

It depends on what you're calling "app correctness" here. Isn't missing an updated / renamed log a correctness issue?

log collection failing entirely because touching a file fails

True, but that should only cause one poll to fail. It shouldn't cause the HS to go down. If the HS can't persistently write to HDFS, that means some configuration is wrong, since the history directory is expected to be world-writable.

Is it even worth trying to address this?

I think it's worth it if it can be done without harming other parts of the HS (like requiring way more memory than before). I think the latest approach is close; if some code is added to address SPARK-7189, then it would be ready to go.

The issue raised by this PR should be a really, really rare situation, though.

@SparkQA

SparkQA commented May 11, 2015

Test build #32359 has finished for PR 5886 at commit e0c82ec.

  • This patch passes all tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented May 11, 2015

Test build #32376 has finished for PR 5886 at commit 89436bb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented May 12, 2015

Test build #32456 has finished for PR 5886 at commit 7081da0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@liyezhang556520
Contributor

@vanzin, the current implementation will make SPARK-7189 worse. What about introducing a hash map that maintains each file's name, modification time, and size, say mutable.HashMap[String, (Long, Long)]? It can handle not only the modification case but also the rename/delete cases. Since each file's modification time is maintained, this could solve both the race condition in this issue and SPARK-7189, and it only adds extra memory proportional to the map's size.
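
For concreteness, the proposal amounts to something like this (a hypothetical sketch; a Scala HashMap takes a single value type, so the two longs are paired):

    import scala.collection.mutable

    // file name -> (modification time, size), one entry per known log file.
    // A changed pair means the file was modified; a name missing from the
    // listing was renamed or deleted.
    val fileMeta = new mutable.HashMap[String, (Long, Long)]()

    def changed(name: String, modTime: Long, size: Long): Boolean =
      !fileMeta.get(name).contains((modTime, size))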

@vanzin
Contributor

vanzin commented May 12, 2015

What about introducing a hash map that maintains each file's name, modification time, and size?

That is pretty expensive for all files; hash maps use a ton of extra memory per entry. If you do it just for the files that have modTime > lastScanTime, that's a lot more manageable. Not great, but I can't think of anything better at the moment.

@SparkQA

SparkQA commented May 22, 2015

Test build #33317 has finished for PR 5886 at commit 67c074c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented May 23, 2015

Test build #33401 has finished for PR 5886 at commit 68a9633.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

val path = new Path(logDir, fileName)
val fos = fs.create(path)

val newLastScanTime =
Contributor

You don't actually need to declare this variable.

@vanzin
Contributor

vanzin commented Jul 23, 2015

The problem @srowen mentions exists, but I think in the end it's not really a problem; that will mostly happen when applications are actively writing to their logs, and in those cases we do want to reload the file to get new events.

The current change looks ok to me (with one minor nit), but you'll need to update the code to match the current master.

@vanzin
Contributor

vanzin commented Jul 23, 2015

(Also, if there's a test that can be written - or an existing one modified - to verify the new behavior, it wouldn't hurt.)

@andrewor14
Contributor

@ArcherShao can you rebase to master?

@SparkQA

SparkQA commented Sep 2, 2015

Test build #41909 has finished for PR 5886 at commit a4a876a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -19,6 +19,7 @@ package org.apache.spark.deploy.history

import java.io.{BufferedInputStream, FileNotFoundException, InputStream, IOException, OutputStream}
import java.util.concurrent.{ExecutorService, Executors, TimeUnit}
import java.util.UUID
Contributor

nit: out of order

@vanzin
Contributor

vanzin commented Sep 2, 2015

LGTM, I'll merge and fix the nit in the process.
