

SPARK-4705:[core] Write event logs of different application attempts to different files. #4845

Closed

Conversation

twinkle-sachdeva

Hi,

Following is the approach taken:

  1. For applications with an attempt ID, "_<attempt_id>" is appended to the file name when the event log file is created. For example, with attempt ID 2, the file name will be: application_1423546284151_0031_2
  2. The attempt ID is also added inside the SparkListenerApplicationStart event, so that it can be read back while replaying the event log file.
  3. If no application has attempt ID info (e.g. all applications ran in client mode), the old UI continues to be displayed; once an application with an attempt ID has been logged, the new UI below starts appearing.
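Step 1 above could be sketched as follows. This is an illustrative, self-contained sketch, not the actual EventLoggingListener code; the object and method names are hypothetical:

```scala
// Hypothetical sketch of step 1: append "_<attempt_id>" to the event
// log file name when an attempt ID is present, otherwise use the
// application ID alone.
object EventLogName {
  def logName(appId: String, appAttemptId: Option[String]): String =
    appAttemptId match {
      case Some(attempt) => s"${appId}_$attempt"
      case None          => appId
    }
}
```

For example, `EventLogName.logName("application_1423546284151_0031", Some("2"))` yields the file name `application_1423546284151_0031_2` from the description above.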

[Screenshot: updated UI (II)]

@AmplabJenkins

Can one of the admins verify this patch?

@srowen
Member

srowen commented Mar 1, 2015

(Please rebase, and make the title of the PR descriptive.)

@twinkle-sachdeva twinkle-sachdeva changed the title Pull request for SPARK-4705 from master branch SPARK-4705:[ For Cluster mode ] Pull request for being able to retain the event logs for all the application attempts. Currently they need to be overridden using property "spark.eventLog.overwrite" Mar 2, 2015
twinkle-g and others added 4 commits March 2, 2015 13:58
…the master branch. 2) Added the attempt id inside the SparkListenerApplicationStart, to make the info available independent of directory structure. 3) Changes in History Server to render the UI as per the snapshot II
@@ -26,7 +26,8 @@ private[spark] case class ApplicationHistoryInfo(
    endTime: Long,
    lastUpdated: Long,
    sparkUser: String,
-    completed: Boolean = false)
+    completed: Boolean = false,
+    appAttemptId: String = "")
Member


Option[String]?
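The reviewer's suggestion is to model the attempt ID as `Option[String]` rather than an empty-string default, so "no attempt" is explicit. A minimal, self-contained sketch (the diff above shows only the trailing fields; the leading fields here are assumed for illustration):

```scala
// Sketch of the Option[String] suggestion: None means "no attempt ID"
// instead of using "" as a sentinel value. Leading fields are assumed,
// not taken from the diff above.
case class ApplicationHistoryInfo(
    id: String,
    name: String,
    startTime: Long,
    endTime: Long,
    lastUpdated: Long,
    sparkUser: String,
    completed: Boolean = false,
    appAttemptId: Option[String] = None)
```

An application logged without an attempt keeps `appAttemptId = None`, while an attempt carries e.g. `Some("2")`, which callers can pattern-match on instead of comparing against `""`.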

@srowen
Member

srowen commented Mar 2, 2015

(Title should start with something like SPARK-4705 [YARN] ...)
This seems like a big change for a fairly narrow problem, which is that retrying the driver in cluster mode fails because the same logs dir is used. I think we just need to pick semantics, like using a subdir of the given dir, or removing old logs on a retry, and implement that. I think adding a new UI sounds like overkill, at least to address this.

@twinkle-sachdeva
Author

Hi @srowen ,

Please have a look at the discussion/comments in JIRA https://issues.apache.org/jira/browse/SPARK-4705 regarding avoiding the creation of a directory; the UI-related changes were discussed there with @vanzin.

Initially, this issue was created for yarn-cluster mode only, but it was later broadened to cluster mode in general. I will make the change for the other cluster schedulers too (only one API needs to be overridden, so I thought of creating the PR now, so that the rest of the changes are final by then).

Thanks,
Twinkle

@vanzin
Contributor

vanzin commented Mar 2, 2015

Hi @srowen ,

I think tracking all attempts and showing them on the UI is the more correct fix. This is especially important for streaming jobs that can run for a long time and go through a bunch of AM instances in the process. This makes sure the logs for all attempts are available.

@vanzin
Contributor

vanzin commented Mar 2, 2015

Hi @twinkle-sachdeva , could you write a better title? I'd suggest:

[SPARK-4705] [core] Write event logs of different application attempts to different files.

-    val logger =
-      new EventLoggingListener(applicationId, eventLogDir.get, conf, hadoopConfiguration)
+    val logger = new EventLoggingListener(
+        applicationId, applicationAttemptId, eventLogDir.get, conf, hadoopConfiguration)
Contributor


nit: indented too far

@vanzin
Contributor

vanzin commented Mar 3, 2015

Hi @twinkle-sachdeva,

The code in HistoryPage.scala feels a little confusing, and I think it could be written more cleanly by choosing more appropriate data structures. Since it needs to build new data structures from the app listing, we should avoid translating the whole list of apps every time; instead, translate only the info needed to fill the current page, to keep memory usage in check.

I took a quick look at UIUtils.listingTable and it seems like it's easy to generate multiple rows for each generateDataRow callback, so using the data structure I suggested should be relatively easy.
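The data structure vanzin suggests could be sketched as follows: group the listing by application ID so each app maps to its attempts, and translate only the slice for the current page. This is an illustrative sketch under those assumptions, not the actual HistoryPage.scala code; the names are hypothetical:

```scala
// Hypothetical sketch: group history entries by app ID, then take only
// the apps needed for the requested page, keeping memory usage bounded.
case class AttemptInfo(attemptId: Option[String], completed: Boolean)

def pageOfApps(
    entries: Seq[(String, AttemptInfo)],
    page: Int,
    pageSize: Int): Seq[(String, Seq[AttemptInfo])] = {
  val grouped = entries
    .groupBy(_._1)                                        // appId -> rows
    .toSeq
    .map { case (appId, rows) => (appId, rows.map(_._2)) } // drop the key copy
    .sortBy(_._1)
  // Translate only the slice needed for the current page.
  grouped.slice(page * pageSize, (page + 1) * pageSize)
}
```

Each `(appId, attempts)` pair can then emit one table row per attempt from a single `generateDataRow`-style callback, matching the observation that `UIUtils.listingTable` can generate multiple rows per entry.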

@twinkle-sachdeva twinkle-sachdeva changed the title SPARK-4705:[ For Cluster mode ] Pull request for being able to retain the event logs for all the application attempts. Currently they need to be overridden using property "spark.eventLog.overwrite" SPARK-4705:[core] Write event logs of different application attempts to different files. Mar 3, 2015
@twinkle-sachdeva
Author

Hi @vanzin ,

I am already using UIUtils.listingTable; it looks a bit messy because of the multiple function calls/usages. I am not very well versed with inline XML, so maybe there is a cleaner way to do this. I will look around a bit more.

Thanks,
Twinkle

@srowen
Member

srowen commented Apr 24, 2015

It looks like this work is being continued in #5432 which is currently more active. Do you mind closing this PR and focusing discussion on that PR?
