[SPARK-22955][DSTREAMS] - graceful shutdown shouldn't lead to job gen… #25511
Conversation
Hi @srowen, could you please take a look?
I tend to agree, but let's see what tests say. CC @tdas
Test build #4840 has finished for PR 25511 at commit
Merged to master
Yes, I think that's OK. We're in a 'code freeze' for the 2.4.4 release at the moment, so I hesitate to merge anything but critical fixes until it's finalized. But it could go in for 2.4.5.
OK, thanks.
@dongjoon-hyun did you say you're having problems with the merge script not being able to backport? I'm seeing the same. Are we just manually cherry-picking and pushing?
### What changes were proposed in this pull request?

During graceful shutdown of ``StreamingContext``, ``graph.stop()`` is invoked right after stopping the ``timer`` which generates new jobs. Thus it's possible that the latest jobs generated by the timer are still in the middle of generation, but the invocation of ``graph.stop()`` closes some objects required for job generation (e.g. the consumer for Kafka), and generation fails. That also forces waiting out the full ``spark.streaming.gracefulStopTimeout``, which is equal to 10 batch intervals by default. Stopping of the graph should be performed later, after ``haveAllBatchesBeenProcessed`` completes.

### How was this patch tested?

Added a test to the existing test suite.

Closes #25511 from choojoyq/SPARK-22955-job-generation-error-on-graceful-stop.

Authored-by: Nikita Gorbachevsky <nikitag@playtika.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
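The ordering issue can be sketched in miniature. The following is a hedged illustration using hypothetical stand-in classes (`Graph`, `JobGenerator` here are simplified toys, not Spark's real `DStreamGraph` or `JobGenerator`): if the graph is stopped while batches are still mid-generation, generation fails; the fix is to drain pending batches first and stop the graph last.

```java
// Minimal sketch of the fixed shutdown ordering. All classes here are
// simplified stand-ins for illustration only, not Spark's actual internals.
import java.util.concurrent.atomic.AtomicInteger;

public class GracefulStopSketch {
    static class Graph {
        volatile boolean stopped = false;
        void stop() { stopped = true; }
    }

    static class JobGenerator {
        private final Graph graph;
        // Batches the (already-stopped) timer has queued but not yet generated.
        private final AtomicInteger pendingBatches = new AtomicInteger(2);

        JobGenerator(Graph graph) { this.graph = graph; }

        void processOneBatch() {
            // Generating a job needs live graph resources (e.g. a Kafka consumer);
            // if the graph were already stopped, generation would fail here.
            if (graph.stopped) {
                throw new IllegalStateException("job generation failed: graph already stopped");
            }
            pendingBatches.decrementAndGet();
        }

        boolean haveAllBatchesBeenProcessed() { return pendingBatches.get() == 0; }

        // Fixed ordering: (timer already stopped) -> wait for all pending
        // batches to finish -> only then stop the graph.
        void stopGracefully() {
            while (!haveAllBatchesBeenProcessed()) {
                processOneBatch();
            }
            graph.stop(); // safe now: nothing is mid-generation
        }
    }

    public static void main(String[] args) {
        Graph graph = new Graph();
        new JobGenerator(graph).stopGracefully();
        System.out.println("graph stopped cleanly: " + graph.stopped);
    }
}
```

Calling `graph.stop()` before the drain loop would make `processOneBatch()` throw, mirroring the failure mode the patch fixes; moving it after the loop is the essence of the change.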
Seemed to work fine. This is backported to 2.4.
Thanks @srowen
Yes, @srowen. The current merge script has two issues.
BTW, sorry for the late response; I've been on vacation since last Saturday. I'm connecting here from time to time.