Skip to content

[SPARK-30481][CORE][FOLLOWUP] Execute log compaction only when merge application listing is successful#27408

Closed
HeartSaVioR wants to merge 2 commits intoapache:masterfrom
HeartSaVioR:SPARK-30481-FOLLOWUP-MINOR-FIXES
Closed

[SPARK-30481][CORE][FOLLOWUP] Execute log compaction only when merge application listing is successful#27408
HeartSaVioR wants to merge 2 commits intoapache:masterfrom
HeartSaVioR:SPARK-30481-FOLLOWUP-MINOR-FIXES

Conversation

@HeartSaVioR
Copy link
Contributor

What changes were proposed in this pull request?

This PR fixes a couple of minor issues on SPARK-30481:

  • SHS runs "compaction" regardless of the result of "merge application listing".

If "merge application listing" fails, most likely the application log will have some issue and "compaction" won't work properly then. We can just skip trying compaction when "merge application listing" fails.

  • When "compaction" throws exception we don't handle it.

It's expected to swallow exception, but we don't even log the exception for now. It should be logged properly.

Why are the changes needed?

Described in above section.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Existing UTs.

@HeartSaVioR
Copy link
Contributor Author

cc. @vanzin @squito @gaborgsomogyi

@dongjoon-hyun
Copy link
Member

Hi, @HeartSaVioR . If you don't mind, could you revise the PR title more specifically? The current title repeats SPARK-30481 twice and minor issues doesn't give us any information.

[SPARK-30481][CORE][FOLLOWUP] Fix minor issues on SPARK-30481

@SparkQA
Copy link

SparkQA commented Jan 31, 2020

Test build #117589 has finished for PR 27408 at commit 20878e8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HeartSaVioR HeartSaVioR changed the title [SPARK-30481][CORE][FOLLOWUP] Fix minor issues on SPARK-30481 [SPARK-30481][CORE][FOLLOWUP] Execute compaction only when merge application listing is successful Jan 31, 2020
@HeartSaVioR HeartSaVioR changed the title [SPARK-30481][CORE][FOLLOWUP] Execute compaction only when merge application listing is successful [SPARK-30481][CORE][FOLLOWUP] Execute log compaction only when merge application listing is successful Jan 31, 2020
@HeartSaVioR
Copy link
Contributor Author

Thanks @dongjoon-hyun I just updated the title to describe one of issue. Other is really minor and don't think we should ever need to mention it.

@dongjoon-hyun
Copy link
Member

Thanks!

case None => // This is not applied to single event log file.
}
} catch {
case e: Exception => logError(s"Exception while compacting log for $rootPath", e)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ERROR level seems to be very high. Can we lower the log level, WARN? This is printed at every period, isn't it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's consistent with existing exception handling for mergeApplicationListing.

private def mergeApplicationListing(
reader: EventLogFileReader,
scanTime: Long,
enableOptimizations: Boolean): Unit = {
val rootPath = reader.rootPath
try {
val lastEvaluatedForCompaction: Option[Long] = try {
listing.read(classOf[LogInfo], rootPath.toString).lastEvaluatedForCompaction
} catch {
case _: NoSuchElementException => None
}
pendingReplayTasksCount.incrementAndGet()
doMergeApplicationListing(reader, scanTime, enableOptimizations, lastEvaluatedForCompaction)
if (conf.get(CLEANER_ENABLED)) {
checkAndCleanLog(rootPath.toString)
}
} catch {
case e: InterruptedException =>
throw e
case e: AccessControlException =>
// We don't have read permissions on the log file
logWarning(s"Unable to read log $rootPath", e)
blacklist(rootPath)
// SPARK-28157 We should remove this blacklisted entry from the KVStore
// to handle permission-only changes with the same file sizes later.
listing.delete(classOf[LogInfo], rootPath.toString)
case e: Exception =>
logError("Exception while merging application listings", e)
} finally {
endProcessing(rootPath)
pendingReplayTasksCount.decrementAndGet()
// triggering another task for compaction task
submitLogProcessTask(rootPath) { () => compact(reader) }
}
}

compact is even executed only when mergeApplicationListing succeeds, which would prune many of possible error cases for compact, so less possible to be logged.

If we feel the log level of both are too high, I'll change both altogether. WDYT?

Copy link
Member

@dongjoon-hyun dongjoon-hyun Jan 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above error handling is more fine-grained than this PR.
For example, AccessControlException is the one which most users will hit frequently with this feature. (That's the reason why I asked at the documentation.) For AccessControlException, the above code shows logWarning instead of logError. That seems to be the difference between your and my perspectives.

Let's see @vanzin or other people's opinion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah that's good point. I agree with you that mergeApplicationListing has more fine-grained exception handling and I even think compact should have same handling except blacklisting. Let me fix it.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM. Thank you, @HeartSaVioR .
Pending Jenkins.

@SparkQA
Copy link

SparkQA commented Jan 31, 2020

Test build #117610 has finished for PR 27408 at commit e2eebd6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member

Merged to master.

@HeartSaVioR
Copy link
Contributor Author

Thanks for reviewing and merging!

@HeartSaVioR HeartSaVioR deleted the SPARK-30481-FOLLOWUP-MINOR-FIXES branch January 31, 2020 05:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

Comments