[FLINK-13946] Remove job session related code from ExecutionEnvironment #9607

Closed
wants to merge 12 commits into from

Conversation

kl0u
Contributor

@kl0u kl0u commented Sep 4, 2019

What is the purpose of the change

This PR removes code related to JobSessions from the ExecutionEnvironment and the PlanExecutors. This code was added in the context of FLINK-2097 but it was never activated, as illustrated by the comment at ExecutionEnvironment.java#L285. The work in this PR is part of the preparation for the upcoming redesign of the whole Client/Executor API.

Brief change log

The changes in the subclasses of the ExecutionEnvironment remove methods that were setting session-related parameters and reflect the simplification of the PlanExecutor lifecycle explained below (for Local and RemoteEnvironment).

The changes to the PlanExecutors concern the executor's lifecycle. The executor itself now controls its lifecycle (start() and stop() are private), and we instantiate a new executor for each call to executePlan(). This allows us to get rid of the reapers in the Local and RemoteEnvironments and of the lock that protected concurrent access to the executor's state.

The lifecycle is more explicit now and aligned with the current use of the ExecutionEnvironment. If in the future we choose to change this and decide to re-use execution environments, then we can add this functionality back, potentially under a different design/architecture.
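
The lifecycle described in this change log can be illustrated with a minimal sketch. It is not the actual Flink code: the class names ExecutorLifecycleSketch, SelfManagedExecutor and Environment, and the String-based plan, are hypothetical stand-ins chosen only to show private start()/stop() and one executor per executePlan() call.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; these are not the Flink PlanExecutor classes. */
public class ExecutorLifecycleSketch {

    /** Stand-in for an executor that manages its own lifecycle. */
    static final class SelfManagedExecutor {
        // Records lifecycle events so the ordering can be inspected.
        final List<String> events = new ArrayList<>();

        // start() and stop() are private: callers cannot drive the lifecycle.
        private void start() {
            events.add("start");
        }

        private void stop() {
            events.add("stop");
        }

        String executePlan(String plan) {
            start();
            try {
                events.add("run:" + plan);
                return "result-of-" + plan;
            } finally {
                // Always release resources, even if execution fails.
                stop();
            }
        }
    }

    /** Stand-in for an environment that creates one executor per execution. */
    static final class Environment {
        SelfManagedExecutor lastExecutor;

        String execute(String plan) {
            // A fresh executor per call: nothing is shared between executions,
            // so neither a lock nor a shutdown reaper is needed in this sketch.
            lastExecutor = new SelfManagedExecutor();
            return lastExecutor.executePlan(plan);
        }
    }

    public static void main(String[] args) {
        Environment env = new Environment();
        System.out.println(env.execute("wordcount"));
        System.out.println(env.lastExecutor.events);
    }
}
```

Because each execution owns its executor from start to stop, no executor state outlives a call, which is why the lock and the reapers become unnecessary under this design.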

Verifying this change

This change is a code cleanup so it is covered by existing tests.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no)
  • The serializers: (yes / no / don't know)
  • The runtime per-record code paths (performance sensitive): (yes / no / don't know)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / no / don't know)
  • The S3 file system connector: (yes / no / don't know)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

Please have a look at this one @aljoscha and @tillrohrmann.

@flinkbot
Collaborator

flinkbot commented Sep 4, 2019

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit 5267206 (Fri Sep 06 13:30:35 UTC 2019)

Warnings:

  • No documentation files were touched! Remember to keep the Flink docs up to date!

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.


The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@flinkbot
Collaborator

flinkbot commented Sep 4, 2019

CI report:

Member

@tisonkun tisonkun left a comment

Thanks @kl0u for opening this pull request. I left some inline comments.

}
ClusterClient<?> client = null;
try {
    client = startClusterClient();
Member

Inlining startClusterClient and stopClusterClient makes more sense to me.
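
One possible reading of this suggestion, as a rough sketch only: the client is created and shut down directly inside the executing method instead of through separate start/stop helpers. FakeClient, submit() and InlineClientSketch are hypothetical names for illustration and are not the Flink ClusterClient API; the sketch also assumes the client can be treated as AutoCloseable.

```java
/** Hypothetical sketch; FakeClient is not the Flink ClusterClient. */
public class InlineClientSketch {

    /** Stand-in for a cluster client that releases resources on close(). */
    static final class FakeClient implements AutoCloseable {
        boolean closed = false;

        String submit(String plan) {
            return "submitted:" + plan;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    // Kept only so the sketch can show that the client was closed.
    static FakeClient lastClient;

    static String executePlan(String plan) {
        // Creation and shutdown are inlined here rather than living in
        // separate startClusterClient()/stopClusterClient() helpers.
        try (FakeClient client = new FakeClient()) {
            lastClient = client;
            return client.submit(plan);
        }
    }

    public static void main(String[] args) {
        System.out.println(executePlan("wordcount"));
        System.out.println("closed: " + lastClient.closed);
    }
}
```

With the lifecycle visible in one place, it is easier to see that the client is closed on every path, including when submission throws.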

@kl0u
Contributor Author

kl0u commented Sep 4, 2019

Hi @tisonkun ! Thanks for the review. I integrated your comments. Please have a look and let me know what you think.

Contributor

@aljoscha aljoscha left a comment

This looks good! But there is still some superfluous/needlessly complicated code, IMO.

private Configuration createExecutorServiceConfig() {
    final Configuration newConfiguration = new Configuration();
    newConfiguration.setInteger(TaskManagerOptions.NUM_TASK_SLOTS, taskManagerNumSlots);
    newConfiguration.setBoolean(CoreOptions.FILESYTEM_DEFAULT_OVERRIDE, defaultOverwriteFiles);
Contributor

I only see this setter used in a single test. Maybe we can remove it as well, or at least stop using a field for this.

Contributor Author

You mean the setTaskManagerNumSlots() setter?

Member

@tisonkun tisonkun left a comment

Thanks for your updates @kl0u! Generally looks good to me.

This pull request has conflicts with the current master and fails with a compile error. Please rebase and fix the compile error.

@kl0u
Contributor Author

kl0u commented Sep 4, 2019

I integrated your comments @aljoscha . Please have a look and let me know what you think.

@@ -93,4 +93,9 @@ public String stopWithSavepoint(JobID jobId, boolean advanceToEndOfEventTime, @N
public Map<String, OptionalFailure<Object>> getAccumulators(JobID jobID, ClassLoader loader) {
    return Collections.emptyMap();
}

@Override
public void close() throws Exception {
Member

Weird. The latest CI run complains with

/home/travis/build/flink-ci/flink/flink-yarn-tests/src/test/java/org/apache/flink/yarn/util/FakeClusterClient.java:[43,8] org.apache.flink.yarn.util.FakeClusterClient is not abstract and does not override abstract method close() in java.lang.AutoCloseable

but actually we implement it.

Contributor Author

Let's see what travis says now.

@tisonkun
Member

tisonkun commented Sep 5, 2019

Hi @kl0u, thanks for your update. I noticed a strange state in Travis: it reports a compile error that should not be there. Please take a look.

Also, you can mark the conversations above that have been addressed or have reached consensus as "resolved" to fold them, which makes the ongoing review clearer.

Member

@tisonkun tisonkun left a comment

Hi @kl0u, this pull request now looks good to me.

+1 to merge. It will also unblock FLINK-13961, where I would be glad to implement what we discussed about JobExecutor(Service). Please also take a look at the JIRA. Thanks.

Contributor

@aljoscha aljoscha left a comment

I think this looks good now!

@kl0u
Contributor Author

kl0u commented Sep 6, 2019

Thanks for the reviews @tisonkun and @aljoscha . Merged.

@kl0u kl0u closed this Sep 6, 2019
@kl0u kl0u deleted the remove-session branch January 28, 2020 15:23
6 participants