Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-9196][flip6, yarn] Cleanup application files when deregistering YARN AM #5938

Closed
wants to merge 8 commits into from

Conversation

GJL
Copy link
Member

@GJL GJL commented Apr 28, 2018

What is the purpose of the change

Ensure that YARN application files are removed if cluster is shutdown.

cc: @StephanEwen @tillrohrmann

Brief change log

  • Enable graceful cluster shut down via HTTP.
  • Remove Flink application files from remote file system when the YarnResourceManager deregisters the YARN ApplicationMaster.

Verifying this change

This change added tests and can be verified as follows:

  • Manually verified that files are removed from HDFS when running stream (attached/detached) and batch jobs (attached).
  • Manually verified that files are removed from HDFS when running stopping a yarn session gracefully.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no)
  • The serializers: (yes / no / don't know)
  • The runtime per-record code paths (performance sensitive): (yes / no / don't know)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / no / don't know)
  • The S3 file system connector: (yes / no / don't know)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)


/**
* Tests for {@link Utils}.
*/
public class UtilsTest extends TestLogger {
Copy link
Member Author

@GJL GJL Apr 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed class to YarnFlinkResourceManagerTest.


/**
* Tests for {@link Utils}.
*/
public class UtilsTest extends TestLogger {
public class UtilsTest {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TestLogger missing.

LOG.info("Sending shutdown request to the Application Master");
try {
final Future<Object> response = Patterns.ask(applicationClient.get(),
new YarnMessages.LocalStopYarnSession(ApplicationStatus.SUCCEEDED,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Always SUCCEEDED because previous logic in 1.4.2 used

YarnApplicationState appState = lastReport.getYarnApplicationState();
			ApplicationStatus status =
				(appState == YarnApplicationState.FAILED || appState == YarnApplicationState.KILLED) ?
					ApplicationStatus.FAILED : ApplicationStatus.SUCCEEDED;
			if (status != ApplicationStatus.SUCCEEDED) {
				LOG.warn("YARN reported application state {}", appState);
				LOG.warn("Diagnostics: {}", lastReport.getDiagnostics());
			}
			return status;

which does not make sense imo.

Copy link
Contributor

@StephanEwen StephanEwen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good to me, with on minor minor comment.

} catch (InterruptedException e) {
Thread.currentThread().interrupt();
} catch (ExecutionException e) {
log.error("Error while shutting down cluster", e);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Throw the cause of the ExecutionException?

@StephanEwen
Copy link
Contributor

Merging this...

@asfgit asfgit closed this in 37df3c8 Apr 30, 2018
StephanEwen pushed a commit to StephanEwen/flink that referenced this pull request Apr 30, 2018
…ng YARN AM

Enable graceful cluster shut down via HTTP.
Remove Flink application files from remote file system when the
YarnResourceManager deregisters the YARN ApplicationMaster.

This closes apache#5938
sampathBhat pushed a commit to sampathBhat/flink that referenced this pull request Jul 26, 2018
…ng YARN AM

Enable graceful cluster shut down via HTTP.
Remove Flink application files from remote file system when the
YarnResourceManager deregisters the YARN ApplicationMaster.

This closes apache#5938
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants