

SPARK-4705:[core] Write event logs of different application attempts to different files. #4845

Closed

Conversation

twinkle-sachdeva

Hi,

Following is the approach taken:

  1. For applications with an attempt ID, "_<attempt_id>" is appended to the file name when the event log file is created. For example, with attempt ID 2, the file name will be: application_1423546284151_0031_2
  2. The attempt ID is also added inside the SparkListenerApplicationStart event, so that it can be read back while replaying the event log file.
  3. If no application has attempt ID info (e.g. all applications ran in client mode), the old UI continues to be displayed; once an application with an attempt ID has been logged, the new UI below starts appearing.
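Step 1 above could be sketched as follows. This is an illustrative, self-contained sketch, not the actual EventLoggingListener code; the object and method names are hypothetical:

```scala
// Hypothetical sketch of step 1: append "_<attempt_id>" to the event
// log file name when an attempt ID is present, otherwise use the
// application ID alone.
object EventLogName {
  def logName(appId: String, appAttemptId: Option[String]): String =
    appAttemptId match {
      case Some(attempt) => s"${appId}_$attempt"
      case None          => appId
    }
}
```

For example, `EventLogName.logName("application_1423546284151_0031", Some("2"))` yields the file name `application_1423546284151_0031_2` from the description above.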

[Screenshot: updated UI (II)]

@AmplabJenkins

Can one of the admins verify this patch?

@srowen
Member

srowen commented Mar 1, 2015

(Please rebase, and make the title of the PR descriptive.)

@twinkle-sachdeva twinkle-sachdeva changed the title Pull request for SPARK-4705 from master branch SPARK-4705:[ For Cluster mode ] Pull request for being able to retain the event logs for all the application attempts. Currently they need to be overridden using property "spark.eventLog.overwrite" Mar 2, 2015
twinkle-g and others added 4 commits March 2, 2015 13:58
…the master branch. 2) Added the attempt id inside the SparkListenerApplicationStart, to make the info available independent of directory structure. 3) Changes in History Server to render the UI as per the snapshot II
@@ -26,7 +26,8 @@ private[spark] case class ApplicationHistoryInfo(
    endTime: Long,
    lastUpdated: Long,
    sparkUser: String,
-    completed: Boolean = false)
+    completed: Boolean = false,
+    appAttemptId: String = "")
Member


Option[String]?
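The reviewer's suggestion is to model the attempt ID as `Option[String]` rather than an empty-string default, so "no attempt" is explicit. A minimal, self-contained sketch (the diff above shows only the trailing fields; the leading fields here are assumed for illustration):

```scala
// Sketch of the Option[String] suggestion: None means "no attempt ID"
// instead of using "" as a sentinel value. Leading fields are assumed,
// not taken from the diff above.
case class ApplicationHistoryInfo(
    id: String,
    name: String,
    startTime: Long,
    endTime: Long,
    lastUpdated: Long,
    sparkUser: String,
    completed: Boolean = false,
    appAttemptId: Option[String] = None)
```

An application logged without an attempt keeps `appAttemptId = None`, while an attempt carries e.g. `Some("2")`, which callers can pattern-match on instead of comparing against `""`.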

@srowen
Member

srowen commented Mar 2, 2015

(Title should start with something like SPARK-4705 [YARN] ...)
This seems like a big change for a fairly narrow problem, which is that retrying the driver in cluster mode fails because the same logs dir is used. I think we just need to pick semantics, like using a subdir of the given dir, or removing old logs on a retry, and implement that. I think adding a new UI sounds like overkill, at least to address this.

@twinkle-sachdeva
Author

Hi @srowen ,

Please have a look at the discussion/comments in JIRA https://issues.apache.org/jira/browse/SPARK-4705 regarding avoiding the creation of a directory; the UI-related changes were discussed there with @vanzin.

Initially, this issue was created for yarn-cluster mode only, but it was later broadened to cluster mode in general. I will make the change for the other cluster schedulers too (only one API needs to be overridden, so I thought of creating the PR now, so that the rest of the changes are final by then).

Thanks,
Twinkle

@vanzin
Contributor

vanzin commented Mar 2, 2015

Hi @srowen ,

I think tracking all attempts and showing them on the UI is the more correct fix. This is especially important for streaming jobs that can run for a long time and go through a bunch of AM instances in the process. This makes sure the logs for all attempts are available.

@vanzin
Contributor

vanzin commented Mar 2, 2015

Hi @twinkle-sachdeva , could you write a better title? I'd suggest:

[SPARK-4705] [core] Write event logs of different application attempts to different files.

-    val logger =
-      new EventLoggingListener(applicationId, eventLogDir.get, conf, hadoopConfiguration)
+    val logger = new EventLoggingListener(
+        applicationId, applicationAttemptId, eventLogDir.get, conf, hadoopConfiguration)
Contributor


nit: indented too far

@vanzin
Contributor

vanzin commented Mar 3, 2015

Hi @twinkle-sachdeva,

The code in HistoryPage.scala feels a little confusing, and I think it could be written more cleanly by choosing more appropriate data structures. Since it needs to build new data structures from the app listing, we should avoid translating the whole list of apps every time; instead, translate only the info needed to fill the current page, to keep memory usage in check.

I took a quick look at UIUtils.listingTable and it seems like it's easy to generate multiple rows for each generateDataRow callback, so using the data structure I suggested should be relatively easy.
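The data structure vanzin suggests could be sketched as follows: group the listing by application ID so each app maps to its attempts, and translate only the slice for the current page. This is an illustrative sketch under those assumptions, not the actual HistoryPage.scala code; the names are hypothetical:

```scala
// Hypothetical sketch: group history entries by app ID, then take only
// the apps needed for the requested page, keeping memory usage bounded.
case class AttemptInfo(attemptId: Option[String], completed: Boolean)

def pageOfApps(
    entries: Seq[(String, AttemptInfo)],
    page: Int,
    pageSize: Int): Seq[(String, Seq[AttemptInfo])] = {
  val grouped = entries
    .groupBy(_._1)                                        // appId -> rows
    .toSeq
    .map { case (appId, rows) => (appId, rows.map(_._2)) } // drop the key copy
    .sortBy(_._1)
  // Translate only the slice needed for the current page.
  grouped.slice(page * pageSize, (page + 1) * pageSize)
}
```

Each `(appId, attempts)` pair can then emit one table row per attempt from a single `generateDataRow`-style callback, matching the observation that `UIUtils.listingTable` can generate multiple rows per entry.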

@twinkle-sachdeva twinkle-sachdeva changed the title SPARK-4705:[ For Cluster mode ] Pull request for being able to retain the event logs for all the application attempts. Currently they need to be overridden using property "spark.eventLog.overwrite" SPARK-4705:[core] Write event logs of different application attempts to different files. Mar 3, 2015
@twinkle-sachdeva
Author

Hi @vanzin ,

I am already using UIUtils.listingTable; it looks a bit messy because of the multiple function calls/usages. I am not very well versed with inline XML, so maybe there is a cleaner way to do this. I will look around a bit more.

Thanks,
Twinkle

@srowen
Member

srowen commented Apr 24, 2015

It looks like this work is being continued in #5432 which is currently more active. Do you mind closing this PR and focusing discussion on that PR?
