Skip to content

Conversation

@aljoscha
Copy link
Contributor

What is the purpose of the change

  • Change the SQL cli to use the new Executor interface and JobClient instead of manually going through cluster descriptor/cluster specification.

This also does not print the cluster ID and web interface URL anymore
when submitting because there is no portable way of retrieving these.
Previously, even for YARN the printed cluster id was not usable because
it was just the class name of the implementation class.

Verifying this change

This change is covered by existing tests and ITcases that were adapted.

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? not applicable

@aljoscha aljoscha requested review from kl0u and tisonkun November 25, 2019 12:50
@flinkbot
Copy link
Collaborator

flinkbot commented Nov 25, 2019

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit 64339da (Wed Dec 04 17:08:48 UTC 2019)

Warnings:

  • No documentation files were touched! Remember to keep the Flink docs up to date!

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into to Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.

Details
The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@flinkbot
Copy link
Collaborator

flinkbot commented Nov 25, 2019

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run travis re-run the last Travis build

@aljoscha aljoscha force-pushed the executor-sql-cli-without-job-client-changes branch 2 times, most recently from df63633 to 4a5ca62 Compare November 25, 2019 13:48
Copy link
Contributor

@kl0u kl0u left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work @aljoscha !
I had some comments with the main ones being the one about not closing the jobClient and the one about changing the threading model of the MaterializedCollectBatchResult.

Copy link
Member

@tisonkun tisonkun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for opening this pull request @aljoscha ! I left several comments while response to @zjffdu 's & @kl0u 's comments also. Please ping me when you address them :-)

@twalthr
Copy link
Contributor

twalthr commented Nov 26, 2019

Thanks for the PR @aljoscha. I would like to take a final look before merging. Please ping me when there are no more concerns.

@aljoscha
Copy link
Contributor Author

I think I addressed all your comments. Please take another look.

@aljoscha
Copy link
Contributor Author

Is everyone OK with me rebasing and squashing this so that Timo can have a look?

Copy link
Member

@tisonkun tisonkun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No more concerns from my side. Go ahead :-)

Copy link
Contributor

@kl0u kl0u left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aljoscha I did my pass and I left some comments mainly having to do with style rather than essence. After integrating whichever you find meaningful, you have my +1, but I would feel more comfortable if someone who also knows more about the SQL Client, like @twalthr, has a look before merging, as I am not that familiar with this part of the codebase.

CommandLine commandLine,
Options commandLineOptions,
List<CustomCommandLine> availableCommandLines) throws FlinkException {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This options is only used for logging. Do we need it or we can simply print the final executionConfig ?

}

private Pipeline createPipeline(String name, Configuration flinkConfig) {
public Pipeline createPipeline(String name, Configuration flinkConfig) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this method it seems we do not use the flinkConfig or the parallelism in the else branch. Is it ok to remove them? Also the else is not needed as we return in the if branch. So this can become:

public Pipeline createPipeline(String name) {
			if (streamExecEnv != null) {
				// special case for Blink planner to apply batch optimizations
				// note: it also modifies the ExecutionConfig!
				if (executor instanceof ExecutorBase) {
					return ((ExecutorBase) executor).getStreamGraph(name);
				}
				return streamExecEnv.getStreamGraph(name);
			}
			return execEnv.createProgramPlan(name);
		}

boolean awaitJobResult) {
this.context = context;
this.jobGraph = jobGraph;
this.pipeline = pipeline;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checkNotNull?

* @param awaitJobResult block for a job execution result from the cluster
*/
public ProgramDeployer(
ExecutionContext<C> context,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also here, for core readability, I think it is easier to pass the configuration directly instead of creating in the deploy and when we fetch the result, have a checkState on the ATTACHED flag so that the reader of the code knows what to expect. The way it is now, it seems like in the deploy, we are setting the ATTACHED flag but it is never used.

WDYT?

throw e;
} catch (Exception e) {
throw new SqlExecutionException("Could not locate a cluster.", e);
if (awaitJobResult) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about configuration.set(DeploymentOptions.ATTACHED, awaitJobResult); ?

} catch (Exception e) {
// ignore
}
jobClient = executor.execute(pipeline, configuration);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not directly return executor.execute(pipeline, configuration); ?

@aljoscha aljoscha force-pushed the executor-sql-cli-without-job-client-changes branch 6 times, most recently from 64339da to c10cc10 Compare December 5, 2019 09:22
@aljoscha
Copy link
Contributor Author

aljoscha commented Dec 5, 2019

@flinkbot run travis

@aljoscha aljoscha force-pushed the executor-sql-cli-without-job-client-changes branch 3 times, most recently from b50be35 to 588f931 Compare December 6, 2019 22:05
This also does not print the cluster ID and web interface URL anymore
when submitting because there is no portable way of retrieving these.
Previously, even for YARN the printed cluster id was not usable because
it was just the class name of the implementation class.
@aljoscha aljoscha force-pushed the executor-sql-cli-without-job-client-changes branch from 588f931 to 170f6e3 Compare December 7, 2019 16:15
@aljoscha aljoscha closed this Dec 7, 2019
@aljoscha aljoscha deleted the executor-sql-cli-without-job-client-changes branch December 7, 2019 22:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants