@GJL GJL commented Oct 11, 2018

What is the purpose of the change

Wait for the result of asynchronous operations to be served before shutting down the cluster. This is necessary for the "cancel with savepoint" operation. If we do not wait for the result to be accessed by the client, we may shut down the cluster before the client has fetched it, and the client then gets a ConnectionException.
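
For illustration only, the idea of delaying shutdown until results have been served might be sketched as below. This is not the code in this pull request: the class name `ResultCacheSketch` and its methods are hypothetical, and the sketch assumes that a result counts as "served" as soon as a client fetches it once.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch; not Flink's actual CompletedOperationCache. */
class ResultCacheSketch<K, R> {
    // Results of finished operations that no client has fetched yet.
    private final Map<K, R> unserved = new ConcurrentHashMap<>();
    // Completes once shutdown was requested and nothing is left unserved.
    private final CompletableFuture<Void> terminated = new CompletableFuture<>();
    private volatile boolean closing = false;

    /** Stores the result of a finished asynchronous operation. */
    void complete(K key, R result) {
        unserved.put(key, result);
    }

    /** Hands the result to the client and marks it as served (assumption: one fetch suffices). */
    Optional<R> fetch(K key) {
        Optional<R> result = Optional.ofNullable(unserved.remove(key));
        maybeTerminate();
        return result;
    }

    /** Requests shutdown; the returned future completes once every result was fetched. */
    CompletableFuture<Void> closeAsync() {
        closing = true;
        maybeTerminate();
        return terminated;
    }

    private void maybeTerminate() {
        if (closing && unserved.isEmpty()) {
            terminated.complete(null);
        }
    }
}
```

In this sketch the cluster shutdown would be chained onto the future returned by `closeAsync()`. Operations that are still running are not tracked here, and a real implementation would probably also want a timeout so that a client that never polls cannot block shutdown indefinitely.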

cc: @zentol

Brief change log

  • Before shutting down the cluster, wait for the results of asynchronous operations to be served.
  • Log the stack trace if a checkpoint cannot be acknowledged.

Verifying this change

This change added tests and can be verified as follows:

  • Added a test to RestServerEndpointITCase to verify that handlers are closed first.
  • Added unit tests for CompletedOperationCache.
  • Verified the changes by repeatedly submitting a job and cancelling it with a savepoint in a loop.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no)
  • The serializers: (yes / no / don't know)
  • The runtime per-record code paths (performance sensitive): (yes / no / don't know)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / no / don't know)
  • The S3 file system connector: (yes / no / don't know)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@GJL GJL force-pushed the FLINK-10309-release-1.5 branch from 4ff61b1 to 18c1458 Compare October 11, 2018 13:39
…us operations.

Wait for the result of asynchronous operations to be served before shutting down
the cluster. This is necessary for the "cancel with savepoint" operation. If we
do not wait for the result to be accessed by the client, we may shut down the
cluster, and the client gets a ConnectionException.

Extract CompletedOperationCache from AbstractAsynchronousOperationHandlers to
ease unit testing.

Ensure that there are no in-flight HTTP requests when we close the server
channel.
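
As a rough illustration of what "no in-flight HTTP requests" could mean in practice, the following sketch counts requests as they enter and leave a handler and exposes a future that completes once the count reaches zero. The class `InFlightRequestsSketch` and its method names are hypothetical and not taken from this change; rejecting new requests after shutdown has begun is an assumption of the sketch.

```java
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch of in-flight request tracking; names are illustrative. */
class InFlightRequestsSketch {
    private final Object lock = new Object();
    private final CompletableFuture<Void> drained = new CompletableFuture<>();
    private int inFlight = 0;
    private boolean closing = false;

    /** Called when a request arrives; returns false if shutdown has already begun. */
    boolean register() {
        synchronized (lock) {
            if (closing) {
                return false;
            }
            inFlight++;
            return true;
        }
    }

    /** Called when the response for a registered request has been written. */
    void deregister() {
        synchronized (lock) {
            inFlight--;
            if (closing && inFlight == 0) {
                drained.complete(null);
            }
        }
    }

    /** Starts shutdown; the future completes once no request is in flight. */
    CompletableFuture<Void> awaitDrained() {
        synchronized (lock) {
            closing = true;
            if (inFlight == 0) {
                drained.complete(null);
            }
            return drained;
        }
    }
}
```

Under these assumptions, the server channel would be closed only in a callback attached to the future returned by `awaitDrained()`.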

Move class AbstractHandler to handler package.

This closes apache#6820.