Skip to content

Conversation

@tillrohrmann
Copy link
Contributor

This PR is based on #4259, #4260, #4261 and #4272.

Add the YarnSesssionClusterEntrypoint and the YarnJobClusterEntrypoint which extend
SessionClusterEntrypoint and JobClusterEntrypoint, respectively.

Add new Yarn session and per-job cluster entry points

Remove old Flip-6 Yarn per job entry point

return SecurityUtils.getInstalledContext();
}

public static Configuration loadConfiguration(String workingDirectory, Map<String, String> env) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much of this could be simplified by moving to standard dynamic properties. The code seems to convert from YarnConfigKeys settings to native Flink config settings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this should be refactored. We should write the dynamic properties to the configuration file and make it available to the started YARN container. It's mainly due to historical reasons that we still send the dynamic properties encoded in an environment variable. Since this is an orthogonal issue, I would like to address this with a different PR. It is also related to: https://issues.apache.org/jira/browse/FLINK-7269

String workingDirectory) throws Exception {
org.apache.hadoop.conf.Configuration hadoopConfiguration = null;

//To support Yarn Secure Integration Test Scenario
Copy link
Contributor

@EronWright EronWright Jul 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was a workaround for an issue with the Kerberos variant of the Yarn integration test. There must be a simpler way; it seems wrong to have test code in the production codepath. I gave that feedback in the original PR but it didn't get resolved at the time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that including test code in the production codepath is not nice. Since this PR is mainly restructuring the existing YARN code, I would like to address this issue with a separate PR: https://issues.apache.org/jira/browse/FLINK-7288


@Override
protected JobGraph retrieveJobGraph(Configuration configuration) throws FlinkException {
String jobGraphFile = configuration.getString(JOB_GRAPH_FILE_PATH, "job.graph");
Copy link
Contributor

@EronWright EronWright Jul 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally a standard approach to locating the job to launch could be developed across K8/YARN/etc. It hadn't occurred to me that the docker image would have a serialized job graph within it (as opposed to the packaged program w/ main method) , but it definitely simplifies the recovery model.

This also has me wondering about how flink run would combine with this. The overlay system in the runtime.clusterframework package might also be implicated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is indeed still up to discussions how we do it in the container world. I could imagine different approaches: Adding it to the container image (if possible), retrieve the JobGraph from a service (HTTP server) or a network mounted disk. For the moment, this should not affect this PR, though.

@tillrohrmann
Copy link
Contributor Author

Thanks for your review @EronWright. I will rebase this PR to see whether it passes Travis.

@tillrohrmann tillrohrmann force-pushed the yarnEntrypoints branch 2 times, most recently from 3cfb12a to 040981c Compare July 28, 2017 10:18
…point

Add the YarnSesssionClusterEntrypoint and the YarnJobClusterEntrypoint which extend
SessionClusterEntrypoint and JobClusterEntrypoint, respectively.

Add new Yarn session and per-job cluster entry points

Remove old Flip-6 Yarn per job entry point
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants