[SPARK-13885][YARN] Fix attempt id regression for Spark running on Yarn #11721

Closed
wants to merge 4 commits into from
Changes from 2 commits
@@ -64,10 +64,10 @@
   <tbody>
   {{#applications}}
     <tr>
-      <td class="rowGroupColumn"><span title="{{id}}"><a href="{{url}}">{{id}}</a></span></td>
+      <td class="rowGroupColumn"><span title="{{id}}"><a href="/history/{{id}}/{{num}}/jobs/">{{id}}</a></span></td>
       <td class="rowGroupColumn">{{name}}</td>
       {{#attempts}}
-      <td class="attemptIDSpan"><a href="history/{{id}}/{{attemptId}}/">{{attemptId}}</a></td>
+      <td class="attemptIDSpan"><a href="/history/{{id}}/{{attemptId}}/jobs/">{{attemptId}}</a></td>
       <td>{{startTime}}</td>
       <td>{{endTime}}</td>
       <td><span title="{{duration}}" class="durationClass">{{duration}}</span></td>
19 changes: 2 additions & 17 deletions core/src/main/resources/org/apache/spark/ui/static/historypage.js
@@ -123,28 +123,13 @@ $(document).ready(function() {
       if (app["attempts"].length > 1) {
         hasMultipleAttempts = true;
       }
-
-      var maxAttemptId = null
+      var num = app["attempts"].length;
       for (j in app["attempts"]) {
         var attempt = app["attempts"][j];
-        if (attempt['attemptId'] != null) {
-          if (maxAttemptId == null || attempt['attemptId'] > maxAttemptId) {
-            maxAttemptId = attempt['attemptId']
-          }
-        }
-
         attempt["startTime"] = formatDate(attempt["startTime"]);
         attempt["endTime"] = formatDate(attempt["endTime"]);
         attempt["lastUpdated"] = formatDate(attempt["lastUpdated"]);
-
-        var url = null
-        if (maxAttemptId == null) {
-          url = "history/" + id + "/"
-        } else {
-          url = "history/" + id + "/" + maxAttemptId + "/"
-        }
-
-        var app_clone = {"id" : id, "name" : name, "url" : url, "attempts" : [attempt]};
+        var app_clone = {"id" : id, "name" : name, "num" : num, "attempts" : [attempt]};
         array.push(app_clone);
       }
     }
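
To make the effect of the historypage.js change easier to follow, here is a minimal standalone sketch of the new row-building logic and the two link shapes used by the template. The function names `flattenApplications`, `applicationLink`, and `attemptLink` are hypothetical and exist only for illustration. Only the field names (`id`, `name`, `num`, `attempts`, `attemptId`) and the URL patterns are taken from the diff. Date formatting is left out because `formatDate` is defined elsewhere on the page.

```javascript
// Illustrative sketch only: helper names are hypothetical, not part of Spark.
// One row is produced per attempt, and every row carries the attempt count
// of its application as "num" (rendered in the template as {{num}}).
function flattenApplications(applications) {
  var rows = [];
  applications.forEach(function (app) {
    var num = app.attempts.length;
    app.attempts.forEach(function (attempt) {
      rows.push({ id: app.id, name: app.name, num: num, attempts: [attempt] });
    });
  });
  return rows;
}

// Mirrors the template's application link: /history/{{id}}/{{num}}/jobs/
function applicationLink(row) {
  return "/history/" + row.id + "/" + row.num + "/jobs/";
}

// Mirrors the template's attempt link: /history/{{id}}/{{attemptId}}/jobs/
function attemptLink(row) {
  return "/history/" + row.id + "/" + row.attempts[0].attemptId + "/jobs/";
}

var rows = flattenApplications([
  { id: "app-1", name: "demo", attempts: [{ attemptId: "2" }, { attemptId: "1" }] }
]);
console.log(applicationLink(rows[0])); // /history/app-1/2/jobs/
console.log(attemptLink(rows[1]));     // /history/app-1/1/jobs/
```

Note that the application link uses the number of attempts as the attempt segment of the path. That appears to rely on attempt IDs being plain counters starting at 1, so that the count equals the latest attempt's ID. This reading is an inference from the diff, not something the PR states.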
6 changes: 0 additions & 6 deletions core/src/main/scala/org/apache/spark/SparkContext.scala
@@ -374,12 +374,6 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
       throw new SparkException("An application name must be set in your configuration")
     }

-    // System property spark.yarn.app.id must be set if user code ran by AM on a YARN cluster
-    if (master == "yarn" && deployMode == "cluster" && !_conf.contains("spark.yarn.app.id")) {
-      throw new SparkException("Detected yarn cluster mode, but isn't running on a cluster. " +
-        "Deployment to YARN is not supported directly by SparkContext. Please use spark-submit.")
-    }
-
     if (_conf.getBoolean("spark.logConf", false)) {
       logInfo("Spark configuration:\n" + _conf.toDebugString)
Contributor

Hmm, this check was introduced so that people don't try `new SparkContext(new SparkConf().setMaster("yarn-cluster"))` and then complain when they get really odd error messages.

Is there another way to do this check?

Contributor Author

I see. With this check removed, trying to run cluster mode that way will still fail with other error messages, though they may be a little odd, as you mentioned.

}
@@ -132,13 +132,6 @@ private[spark] class ApplicationMaster(
       // Set the master and deploy mode property to match the requested mode.
       System.setProperty("spark.master", "yarn")
       System.setProperty("spark.submit.deployMode", "cluster")
-
-      // Propagate the application ID so that YarnClusterSchedulerBackend can pick it up.
-      System.setProperty("spark.yarn.app.id", appAttemptId.getApplicationId().toString())
-
-      // Propagate the attempt if, so that in case of event logging,
-      // different attempt's logs gets created in different directory
-      System.setProperty("spark.yarn.app.attemptId", appAttemptId.getAttemptId().toString())
     }

     logInfo("ApplicationAttemptId: " + appAttemptId)
@@ -95,11 +95,12 @@ private[spark] abstract class YarnSchedulerBackend(
   /**
    * Get the attempt ID for this run, if the cluster manager supports multiple
    * attempts. Applications run in client mode will not have attempt IDs.
+   * This attempt ID only includes attempt counter, like "1", "2".
    *
    * @return The application attempt id, if available.
    */
   override def applicationAttemptId(): Option[String] = {
-    attemptId.map(_.toString)
+    attemptId.map(_.getAttemptId.toString)
   }

   /**
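
The difference between the full attempt ID string and the bare attempt counter matters because the history page builds URLs of the form `/history/{{id}}/{{attemptId}}/jobs/`. The JavaScript sketch below is only an illustration of that difference and is not what the patch does: the Scala change calls `getAttemptId` on the attempt ID object instead of parsing a string. The helper names `attemptCounter` and `historyPath` are hypothetical, and the `appattempt_<clusterTimestamp>_<appSeq>_<attemptCounter>` layout of the full string is an assumption made for the example.

```javascript
// Illustrative sketch only: helper names and the sample id are hypothetical.
// Assumed layout of a full attempt id string:
//   "appattempt_<clusterTimestamp>_<appSeq>_<attemptCounter>"
function attemptCounter(fullAttemptId) {
  var parts = fullAttemptId.split("_");
  // parseInt drops the zero padding, so "000002" becomes 2.
  return String(parseInt(parts[parts.length - 1], 10));
}

// Same URL shape as the attempt link in the history page template.
function historyPath(appId, attemptId) {
  return "/history/" + appId + "/" + attemptId + "/jobs/";
}

var full = "appattempt_1457000000000_0001_000002";
var appId = "application_1457000000000_0001";

// With the bare counter, the path matches the "1", "2" style described
// in the doc comment above.
console.log(historyPath(appId, attemptCounter(full)));
// With the full string, the attempt segment is not a counter.
console.log(historyPath(appId, full));
```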