
[SPARK-18010][Core] Reduce work performed for building up the application list for the History Server app list UI page #15556

Closed · vijoshi wants to merge 9 commits into apache:master from vijoshi:SAAS-467_master

Conversation

@vijoshi (Contributor) commented Oct 19, 2016

What changes were proposed in this pull request?

Allow ReplayListenerBus to skip deserialising and replaying certain events, using an inexpensive check of the raw event log entry. Use this to ensure that when event log replay is triggered to build the application list, the ReplayListenerBus skips all but the few events needed for that immediate purpose. Refer to [SPARK-18010] for the motivation behind this change.
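As a minimal sketch of the idea (illustrative only, not the exact patch): the inexpensive check is a plain string predicate evaluated against each raw JSON line before any deserialization, e.g. a prefix test for the application start/end events. The start prefix below matches the constant added in this change; the end prefix is assumed to follow the same pattern for SparkListenerApplicationEnd.

    // Sketch: cheap prefix test on the raw event-log line, applied before JSON parsing.
    val appStartPrefix = "{\"Event\":\"SparkListenerApplicationStart\""
    val appEndPrefix = "{\"Event\":\"SparkListenerApplicationEnd\""
    val listingFilter: String => Boolean =
      line => line.startsWith(appStartPrefix) || line.startsWith(appEndPrefix)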

How was this patch tested?

Tested with the existing HistoryServer and ReplayListener unit test suites. All tests pass.


       */
      def replay(
          logData: InputStream,
          sourceName: String,
    -     maybeTruncated: Boolean = false): Unit = {
    +     maybeTruncated: Boolean = false,
    +     eventsFilter: (String) => Boolean = (s: String) => true): Unit = {
        var currentLine: String = null
        var lineNumber: Int = 1
        try {
          val lines = Source.fromInputStream(logData).getLines()
Member:
Just call filter on the result of getLines() and pass the new predicate
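For example, a minimal sketch of the suggestion (assuming logData and the new eventsFilter predicate as in the diff above):

    // Sketch: apply the cheap predicate to raw lines before any JSON parsing.
    val lines = Source.fromInputStream(logData).getLines().filter(eventsFilter)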

vijoshi (Contributor Author):
I didn't use a filter so we could log the correct line numbers (in the catch block) in case parsing failed for one of them.

Member:
Ah right. Still it seems like a case where zipWithIndex would be easier, followed by filter? While we're at it here.

vijoshi (Contributor Author):
I gave this a try in my "incorporate review comments" commit, but backed it out after I realized it would have broken the "we can only ignore an exception from the last line of a file that might be truncated" logic. The only way to fix that with zipWithIndex/filter in place was to determine the total event log line count before iterating over the log entries at all, which seemed to defeat the purpose of the optimization, hence I backed it out to the older way. Let me know.

Member:
I don't think that logic changes. It already doesn't actually know if there are more lines in the file, because the error might cause the iterator to close and not indicate any more lines available. It's pretty best-effort. I guess the scenario is: error occurs, and there are no more matching lines, but there are some unmatching lines available? Now we'd continue instead of fail, which seems equally reasonable given the purpose.

vijoshi (Contributor Author):
Okay, agreed. Updated the replay implementation.

    @@ -81,6 +81,10 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)

      private val SPARK_HISTORY_FS_NUM_REPLAY_THREADS = "spark.history.fs.numReplayThreads"

    + private val AppStartEvent = "{\"Event\":\"SparkListenerApplicationStart\""
Member:
Nit: use CONSTANT_CASE and maybe call it a _PREFIX? that's just my taste.

Member:
These should really be in a companion object anyway, including existing constants.

vijoshi (Contributor Author):
Moved the new constants to a companion object. Did you want the existing, unrelated ones moved into it as well?

Member:
Hm, seems like it would be nice for consistency. You are welcome to.

vijoshi (Contributor Author):
Done

    @@ -557,7 +561,8 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
       */
      private def replay(
          eventLog: FileStatus,
    -     bus: ReplayListenerBus): Option[FsApplicationAttemptInfo] = {
    +     bus: ReplayListenerBus,
    +     eventsFilter: (String) => Boolean = (s: String) => true): Option[FsApplicationAttemptInfo] = {
Member:
Does this need to be parameterized? It seems like this method can always apply the filter you pass here. It's called from two places.

vijoshi (Contributor Author):
Did you mean this way?

  private def replay(
      eventLog: FileStatus,
      bus: ReplayListenerBus,
      eventsFilter: (String) => Boolean): Option[FsApplicationAttemptInfo] = {

and have both of the call sites provide a filter:

    val appAttemptInfo = replay(fs.getFileStatus(new Path(logDir, attempt.logPath)),
      replayBus, (s) => true)

and

    val res = replay(fileStatus, bus, (eventString => eventString.startsWith(AppStartEvent)
      || eventString.startsWith(AppEndEvent)))

Member:
No, I mean it looked like the method would never need any filter except the start/end-event one, so it doesn't need it as a parameter. That is, why is it up to the caller? The method filters in what it needs to produce the info it needs. I could be missing something.

vijoshi (Contributor Author):
Oh, OK. This internal replay helper method is called from two places in FsHistoryProvider, and the two want different filtering: one wants all events replayed (the call from getAppUI()), while the other (the call from mergeApplicationListing()) wants just the app start/end events, so each needs to pass in its respective filter.

@ajbozarth (Member) left a comment:
I've checked out your code; it looks good and works well, a huge speed-up. A couple of nits below, but otherwise I really like this: it solves the speed issue without added/updated metadata. I'd like to see some more committer feedback, but overall LGTM.

    @@ -243,7 +243,7 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
          val appListener = new ApplicationEventListener()
          replayBus.addListener(appListener)
          val appAttemptInfo = replay(fs.getFileStatus(new Path(logDir, attempt.logPath)),
    -       replayBus)
    +       replayBus, ReplayListenerBus.SELECT_ALL_FILTER)
Member:
Since ReplayListenerBus.SELECT_ALL_FILTER is the default, this is unnecessary now.

        } catch {
          case jpe: JsonParseException =>
            // We can only ignore exception from last line of the file that might be truncated
            if (!maybeTruncated || lines.hasNext) {
              throw jpe
            } else {
              logWarning(s"Got JsonParseException from log file $sourceName" +
    -           s" at line $lineNumber, the file might not have finished writing cleanly.")
    +            s" at line $lineNumber, the file might not have finished writing cleanly.")
Member:
Adding this space is unnecessary and breaks scalastyle line length

    { eventString =>
      eventString.startsWith (APPL_START_EVENT_PREFIX) ||
      eventString.startsWith (APPL_END_EVENT_PREFIX)
    })
Member:
Though this passes scalastyle I think this could be styled cleaner

      }
    } catch {
      case ioe: IOException =>
        throw ioe
      case e: Exception =>
        logError(s"Exception parsing Spark event log: $sourceName", e)
        logError(s"Malformed line #$lineNumber: $currentLine\n")
Contributor:
Might still be useful to include the line number in there, if relevant. Not that it helps debug things much: if your file is corrupt, it will take effort to fix.

@vijoshi (Contributor Author) commented Oct 20, 2016:

Note that if the exception occurs while iterating over lines, I've added code to the inner catch block to log the line number etc. I had to move things around due to the way this method had to be refactored. Now, when we get to this outer catch block:

  • either the line number has already been logged, or
  • there was a problem building the iterator over the event stream, in which case there is no line number / current line to display.

Contributor:

OK

@tgravescs (Contributor):

So this still reads the entire file, and just filters for the start and end log entries? So basically the listener bus replays only those events instead of all of them.

Have you run this on history files with GBs? I'll give it a try here when I have a chance and do a further review.

If it's a good speedup without keeping the metadata separate, it's definitely good to have, and if needed we can always do the separate metadata later.

@srowen (Member) commented Oct 20, 2016:

That's right, it still reads the whole file but skips posting most of it.

@vijoshi (Contributor Author) commented Oct 20, 2016:

@tgravescs I have tried it with a 150MB file and the speedup is very noticeable (refer to the Spark issue for the results I see). I have noticed that the JSON deserialization is particularly slow, while simply running through the file is not. I would expect a proportionate benefit on larger files wherever reading the file/event stream is much less expensive than the JSON deserialization + replay.

@tgravescs (Contributor):

I tried this out on one of our larger clusters. Normally the history UI shows around 4500-5000 applications and takes a long time to load all of them. I don't have an exact time, but I'm sure it's over an hour. We have history files that can be tens of GBs.

It is loading noticeably faster, but to me it's still not as good as it could be for just listing the apps. It took 30 minutes to create the entire listing. That is the initial startup.

Anyway, I'm fine with it as it's definitely an improvement, but I think we should leave the other one open. Or, if no one else has this problem, leave it closed and I'll reopen it when I have time to work on it. I won't have time to review the code in detail today, but if others have already, you don't have to wait for me.

@ajbozarth (Member):

@tgravescs I've been looking into (and will continue to) a metadata solution for this, but for now I think this fixes the "Bug" that was reported. I'd consider any metadata update an "improvement" at this point and a new JIRA could be opened for it at that time.

Also, @srowen and @tgravescs, a side note: I've noticed that work done on the History Server is usually labeled [Core], not [Web UI]. Is there a reason we don't consider the History Server part of the Web UI on the code side? (It can make tracking UI-related JIRAs and PRs a little difficult.)

@vanzin (Contributor) commented Oct 20, 2016:

ok to test

@vanzin (Contributor) commented Oct 20, 2016:

is there a reason we don't consider the History Server part of the Web UI

It would probably be better to have a separate component altogether for it.

    val lineEntries = Source.fromInputStream(logData)
      .getLines()
      .zipWithIndex
      .filter(entry => eventsFilter(entry._1))
@vanzin (Contributor) commented Oct 20, 2016:
nit: .filter { case (line, _) => eventsFilter(line) }

    try {
      postToAll(JsonProtocol.sparkEventFromJson(parse(currentLine)))
      entry = lineEntries.next()
Contributor:
nit: val (line, lineno) = lineEntries.next()

    }

    case e: Exception =>
Contributor:

This exception handler looks weird. It's logging then re-throwing the exception, which is later caught again and another log message is written for it. So this can probably just be removed.

vijoshi (Contributor Author):
I needed the line number at which the failure happened, since we had that in the earlier implementation. The line/line-number values are not in scope in the outer catch, so they can't be logged there. Open to a better way to do this, if any.

vijoshi (Contributor Author):

Changed it anyway, to remove that exception handler.


    // utility filter that selects all event logs during replay
    val SELECT_ALL_FILTER = (eventString: String) => true
Contributor:
Instead of this I have a slight preference for an overloaded replay method that does not take a filter. Or a default value for the filter argument.

Member:
Having this as a default value is what I was trying to get at with some of my previous comments as well

vijoshi (Contributor Author):
Updated with the latest review comments.
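For reference, a sketch of the default-argument form the reviewers converged on (names assumed to match the later diffs below; not the exact patch):

    import java.io.InputStream

    object ReplayListenerBusSketch {
      // A filter that defaults to selecting every line, so existing callers need no change.
      type ReplayEventsFilter = (String) => Boolean
      val SELECT_ALL_FILTER: ReplayEventsFilter = (eventString: String) => true

      def replay(
          logData: InputStream,
          sourceName: String,
          maybeTruncated: Boolean = false,
          eventsFilter: ReplayEventsFilter = SELECT_ALL_FILTER): Unit = {
        // deserialize and post to listeners only the lines that pass eventsFilter
      }
    }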

@SparkQA commented Oct 20, 2016:

Test build #67285 has finished for PR 15556 at commit 962e524.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin (Contributor) commented Oct 20, 2016:

BTW I agree with Tom that SPARK-6951 should remain open since this issue helps, but does not fix the issue completely.

@SparkQA commented Oct 21, 2016:

Test build #67328 has finished for PR 15556 at commit 1a53388.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

    @@ -557,7 +560,8 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
       */
      private def replay(
          eventLog: FileStatus,
    -     bus: ReplayListenerBus): Option[FsApplicationAttemptInfo] = {
    +     bus: ReplayListenerBus,
    +     eventsFilter: ReplayEventsFilter = SELECT_ALL_FILTER): Option[FsApplicationAttemptInfo] = {
Member:
It may be a bit out of scope to fix this, but this method is now convoluted. Its return value only depends on start/end events, so a filter that filters these out is invalid. A filter that filters in anything else is superfluous. Except that the second call site also depends on this to replay more events to the bus so that it can also optionally parse out ACLs etc from a listener to the bus. Why not push all that logic in here, and then return a structure that contains whatever info can be parsed from what the filtered-in events contain? I don't feel strongly that it must be part of this change but may be simpler than adding more overloads.

vijoshi (Contributor Author):
@srowen you're right that the return value could be empty if the specified filter filtered out the app start/end events. But ideally, the way to parse data out of replayed events is for callers to set appropriate listeners of their choice on the ReplayListenerBus instance used for the replay, and have the listeners capture the event data. This particular replay() method in FsHistoryProvider looks like a helper for the two call sites and not much more, so maybe it's not so much of a problem?

Member:
I think I agree, but then, the other call site actually uses the return value of this method, but, also supplies a listener/bus to collect other info, and then also parses that. It seems like this method could simply do all that work for both callers and not even need an external bus?

vijoshi (Contributor Author):
This helper replay method needs to be supplied a listener bus externally, because the two call sites need very different sets of listeners on the bus. The only thing it looked like we could do was enrich the helper's return value, to prevent the second call site (getAppUI()) from having to add, and parse from, an additional listener of its own. I've pushed the updated commit.

@SparkQA commented Oct 21, 2016:

Test build #67329 has finished for PR 15556 at commit 552c405.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


    val res = replay(fileStatus, bus, eventsFilter)

    res.map {
Member:

Is the idea here to try to get rid of some of the redundancy, because the None case is kind of addressed twice?

If it doesn't work to just return from the existing None case, that's simpler. Otherwise, pull the check up to before this line? Otherwise there's no real point in mapping nothing.

And then: map should be foreach, and there's no need to return a value in the statement below

    @@ -399,20 +396,26 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
        }
      }

      /**
       * Replay the log files in the list and merge the list of old applications with new ones
       */
      private def mergeApplicationListing(fileStatus: FileStatus): Unit = {
        val newAttempts = try {
          val bus = new ReplayListenerBus()
Member:
Do you even need a reference to this here now? You could just pass new ReplayListenerBus() to the replay method.


    private val SPARK_HISTORY_FS_NUM_REPLAY_THREADS = "spark.history.fs.numReplayThreads"

    val APPL_START_EVENT_PREFIX = "{\"Event\":\"SparkListenerApplicationStart\""
Member:
Can these be private? not 100% sure

    try {
      val entry = lineEntries.next()
Member:
Echoing Marcelo's comment that this can be val (currentLine, lineNumber) = ..., I believe.

vijoshi (Contributor Author):
I needed lineNumber assigned to something in scope within the outer Exception catch block, in order to log the line number where the failure occurred.

@SparkQA commented Oct 24, 2016:

Test build #67451 has finished for PR 15556 at commit b36d51b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin (Contributor) left a comment:
A few minor things left to take care of.

    -       replayBus)
    -     appAttemptInfo.map { info =>
    +     res.map { case (attemptInfo, envInfo) =>
Contributor:

nit: seems like attemptInfo is not used, so it should be _.

          throw jpe
        } else {
          logWarning(s"Got JsonParseException from log file $sourceName" +
    -       s" at line $lineNumber, the file might not have finished writing cleanly.")
    +       s" at line ${lineNumber + 1}, the file might not have finished writing cleanly.")
Contributor:

Since you're adjusting line number every time you use it, might as well do it when you assign the variable. (Also because without the adjustment the variable name is wrong - it's the 0-based index, not the "line number".)
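A sketch of that adjustment (assuming the lineEntries iterator shown earlier in this thread):

    import scala.io.Source

    // Sketch: make the index 1-based where it is produced, so every later
    // use of it reads as a true line number.
    val lineEntries = Source.fromInputStream(logData)
      .getLines()
      .zipWithIndex
      .map { case (line, index) => (line, index + 1) }
      .filter { case (line, _) => eventsFilter(line) }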

    @@ -422,6 +425,8 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
      }

      if (newAttempts.isEmpty) {
    +   logWarning(s"Failed to load application log ${fileStatus.getPath}. " +
Contributor:
I think this was clearer where it was before. (Which might be the reason for Sean's other comment.)

In fact, I think this log message is scarier than it is useful. An application without an ID is a valid case (it may be still starting up), so printing warnings here is a little misleading.

    -     bus: ReplayListenerBus): Option[FsApplicationAttemptInfo] = {
    +     bus: ReplayListenerBus,
    +     eventsFilter: ReplayEventsFilter = SELECT_ALL_FILTER)
    +     : (Option[(FsApplicationAttemptInfo, FsApplicationEnvACLInfo)]) = {
Contributor:
What about returning the ApplicationEventListener instead of creating this new type (FsApplicationEnvACLInfo) for the info that exists there?

vijoshi (Contributor Author):
Given that the other value returned is an FsApplicationAttemptInfo: for a replay, the ApplicationEventListener is always there, but the FsApplicationAttemptInfo may or may not get generated. So the return value would probably have to be (Option[FsApplicationAttemptInfo], ApplicationEventListener), which could get awkward to deal with; i.e., you couldn't simply say that the method returns an Option[(FsApplicationAttemptInfo, ApplicationEventListener)].

Contributor:
Given the way the return value is used in the two call sites, you could get away with returning just ApplicationEventListener and move some of the logic currently in this method to mergeApplicationListing, since getAppUI doesn't seem to need it.

vijoshi (Contributor Author):

Okay, yes. Updated the pull request.

@vijoshi force-pushed the SAAS-467_master branch 2 times, most recently from 1a2ae5f to c686337, on October 24, 2016.
@SparkQA commented Oct 24, 2016:

Test build #67470 has finished for PR 15556 at commit ce2c337.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Oct 24, 2016:

Test build #67472 has finished for PR 15556 at commit c686337.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Oct 24, 2016:

Test build #67473 has finished for PR 15556 at commit 85ffbad.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Oct 25, 2016:

Test build #67500 has finished for PR 15556 at commit 43ce952.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen (Member) left a comment:

I'm OK with it if @vanzin is

@vanzin (Contributor) commented Oct 25, 2016:

LGTM. Merging to master.

@asfgit closed this in c5fe3dd on Oct 25, 2016.
@vijoshi (Contributor Author) commented Oct 26, 2016:

@vanzin @srowen @ajbozarth, can this go into branch-2.0 as well? Is there deciding criteria? I can open a pull request.

@srowen (Member) commented Oct 26, 2016:

Hm, usually it's more for important or small fixes. This is a moderate performance improvement, in a place where it's important. I don't think it changes behavior. I'd be OK with back-porting it. Anyone else?

@vanzin (Contributor) commented Oct 26, 2016:

If it's a clean backport I'm ok with it.

@vijoshi (Contributor Author) commented Oct 27, 2016:

@vanzin @srowen - thanks, I've opened pull request #15655 against branch-2.0.

robert3005 pushed a commit to palantir/spark that referenced this pull request on Nov 1, 2016 (closes apache#15556 from vijoshi/SAAS-467_master).
vijoshi added a commit to vijoshi/spark that referenced this pull request on Nov 11, 2016 (closes apache#15556 from vijoshi/SAAS-467_master).
asfgit pushed a commit that referenced this pull request on Nov 14, 2016: the backport of pull request #15556 to branch-2.0 (closes #15655 from vijoshi/SAAS-467_branch-2.0).
@vijoshi deleted the SAAS-467_master branch on January 24, 2017.
uzadude pushed a commit to uzadude/spark that referenced this pull request on Jan 27, 2017 (closes apache#15556 from vijoshi/SAAS-467_master).