various cpp improvement in src folder #538

GulajavaMinistudio · 2019-03-10T00:17:18Z

…terruptedException

What changes were proposed in this pull request?

Before a Kafka consumer gets assigned with partitions, its offset will contain 0 partitions. However, runContinuous will still run and launch a Spark job having 0 partitions. In this case, there is a race that epoch may interrupt the query execution thread after lastExecution.toRdd, and either epochEndpoint.askSync[Unit](StopContinuousExecutionWrites) or the next runContinuous will get interrupted unintentionally.

To handle this case, this PR has the following changes:

Clean up the resources in queryExecutionThread.runUninterruptibly. This may increase the waiting time of stop but should be minor because the operations here are very fast (just sending an RPC message in the same process and stopping a very simple thread).
Clear the interrupted status at the end so that it won't impact the runContinuous call. We may clear the interrupted status set by stop, but it doesn't affect the query termination because runActivatedStream will check state and exit accordingly.

I also updated the clean up codes to make sure exceptions thrown from epochEndpoint.askSync[Unit](StopContinuousExecutionWrites) won't stop the clean up.

How was this patch tested?

Jenkins

Closes apache#24034 from zsxwing/SPARK-27111.

Authored-by: Shixiong Zhu zsxwing@gmail.com
Signed-off-by: Shixiong Zhu zsxwing@gmail.com

What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)

How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

Please review http://spark.apache.org/contributing.html before opening a pull request.

…terruptedException ## What changes were proposed in this pull request? Before a Kafka consumer gets assigned with partitions, its offset will contain 0 partitions. However, runContinuous will still run and launch a Spark job having 0 partitions. In this case, there is a race that epoch may interrupt the query execution thread after `lastExecution.toRdd`, and either `epochEndpoint.askSync[Unit](StopContinuousExecutionWrites)` or the next `runContinuous` will get interrupted unintentionally. To handle this case, this PR has the following changes: - Clean up the resources in `queryExecutionThread.runUninterruptibly`. This may increase the waiting time of `stop` but should be minor because the operations here are very fast (just sending an RPC message in the same process and stopping a very simple thread). - Clear the interrupted status at the end so that it won't impact the `runContinuous` call. We may clear the interrupted status set by `stop`, but it doesn't affect the query termination because `runActivatedStream` will check `state` and exit accordingly. I also updated the clean up codes to make sure exceptions thrown from `epochEndpoint.askSync[Unit](StopContinuousExecutionWrites)` won't stop the clean up. ## How was this patch tested? Jenkins Closes #24034 from zsxwing/SPARK-27111. Authored-by: Shixiong Zhu <zsxwing@gmail.com> Signed-off-by: Shixiong Zhu <zsxwing@gmail.com>

GulajavaMinistudio merged commit ea1700d into GulajavaMinistudio:master Mar 10, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

various cpp improvement in src folder #538

various cpp improvement in src folder #538

GulajavaMinistudio commented Mar 10, 2019

various cpp improvement in src folder #538

various cpp improvement in src folder #538

Conversation

GulajavaMinistudio commented Mar 10, 2019

What changes were proposed in this pull request?

How was this patch tested?

What changes were proposed in this pull request?

How was this patch tested?