-
Notifications
You must be signed in to change notification settings - Fork 28.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-10311][Streaming]Reload appId and attemptId when app starts with checkpoint file in cluster mode #8477
Conversation
Test build #41670 has finished for PR 8477 at commit
|
@harishreedharan @vanzin Could you guys take a look at this? |
@@ -49,6 +49,8 @@ class Checkpoint(@transient ssc: StreamingContext, val checkpointTime: Time) | |||
// Reload properties for the checkpoint application since user wants to set a reload property | |||
// or spark had changed its value and user wants to set it back. | |||
val propertiesToReload = List( | |||
"spark.yarn.app.id", | |||
"spark.yarn.app.attemptId", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't look right. If the AM restarted, the attempt ID will be different.
I'm not familiar with how this data is used by the streaming backend, but it looks odd to be manually reloading it from a checkpoint. In client mode, AM restarts do not affect the driver. In cluster mode, the app id is the same, but the attempt id changes; and those will be re-populated anyway when the SparkContext for the app is initialized by the new AM process. |
Sorry that I don't declare the problem clearly. When an app starts with CheckPoint file using getOrCreate method, the new AM process will new a SparkContext object, but just using the old SparkConf, So the new attemptId set by new AM process doesn't do anything.
Also the appId is the same. |
+1. This looks good. @vanzin - I assume your concern is with the reloading. Reloading here is to ignore the old values from the serlialized spark conf so we pick up the values in the new one which was created by the current attempt. |
@tdas Hari knows a lot more about this area than me so feel free to ignore my comments. |
@harishreedharan Please take a look. |
I already +1-ed it. LGTM |
@harishreedharan Oops, yeah, missed that. Merging this to master and 1.5 |
…with checkpoint file in cluster mode Author: xutingjun <xutingjun@huawei.com> Closes #8477 from XuTingjun/streaming-attempt.
…with checkpoint file in cluster mode Author: xutingjun <xutingjun@huawei.com> Closes apache#8477 from XuTingjun/streaming-attempt. (cherry picked from commit dc39658)
No description provided.