
[FLINK-11026][ES6] Rework creation of fat sql-client jars #7251

Merged
merged 3 commits on Dec 18, 2018

Conversation

@zentol (Contributor) commented Dec 6, 2018

Based on #7247.

What is the purpose of the change

This PR is a PoC for reworking the packaging of jars specific to the sql-client (which are basically just fat jars). Only the flink-connector-elasticsearch6 module is covered here; if accepted, the same principle should be applied to the kafka connectors (0.10, 0.11, 2) and all formats.

Instead of defining a separate shade-plugin execution with a custom artifactSuffix, this PR adds a dedicated flink-sql-connector-elasticsearch6 module that only contains the packaging logic. This is similar to the approach we've already been using for flink-shaded-hadoop2-uber.

The main motivation for this is licensing: to produce accurate licensing information, it must be possible to supply each artifact with its own distinct NOTICE file.

This cannot be done within a single module in a reasonable way. We would have to un-package each created jar, add the appropriate license files, and re-pack it again. We'd end up with tightly coupled plugin definitions (since the names have to match!) and an overall more complicated (and slower!) build.

Brief change log

  • add a new flink-sql-connector-elasticsearch6 module containing the sql-client-specific shade-plugin configuration, and apply the following modifications (see the sketch after this list)
    • set executionId to shade-flink
    • disable shadedArtifactAttached so only a single jar is deployed
    • remove sql-jar suffix as it is no longer necessary
  • remove the sqlJars profile from flink-connector-elasticsearch6
  • add a sqlJars profile to flink-connectors to support skipping the creation of sql jars (sketched below)
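
For illustration, a minimal sketch of what the shade-plugin section of the new flink-sql-connector-elasticsearch6 pom.xml could look like; only the executionId, shadedArtifactAttached, and suffix changes listed above are taken from this PR, the surrounding layout is assumed:

<build>
	<plugins>
		<plugin>
			<groupId>org.apache.maven.plugins</groupId>
			<artifactId>maven-shade-plugin</artifactId>
			<executions>
				<execution>
					<!-- Reuse the id of the execution inherited from the parent pom,
					     so this configuration overrides it rather than adding a
					     second shade pass. -->
					<id>shade-flink</id>
					<phase>package</phase>
					<goals>
						<goal>shade</goal>
					</goals>
					<configuration>
						<!-- Deploy only the single shaded jar, with no classifier
						     (i.e. no sql-jar suffix). -->
						<shadedArtifactAttached>false</shadedArtifactAttached>
					</configuration>
				</execution>
			</executions>
		</plugin>
	</plugins>
</build>

Similarly, a sketch of how the sqlJars profile in flink-connectors could support skipping the sql jars; the skipSqlJars activation property is an assumption for illustration:

<profile>
	<id>sqlJars</id>
	<activation>
		<property>
			<!-- Active by default; skip the sql jars with -DskipSqlJars. -->
			<name>!skipSqlJars</name>
		</property>
	</activation>
	<modules>
		<module>flink-sql-connector-elasticsearch6</module>
	</modules>
</profile>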

Verifying this change

Covered by the sql-client E2E test.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes)

Documentation

I have not checked the documentation yet for references that would have to be changed.

@twalthr self-assigned this Dec 6, 2018
@zentol (Contributor, Author) commented Dec 7, 2018

There's an issue in this PR that causes mvn install to fail if the module was already built before. Currently investigating what is causing this.

@zentol (Contributor, Author) commented Dec 7, 2018

The issue should be fixed; it was the uncommon problem of shading test-jars in modules that don't have any test classes.
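
Concretely (a minimal sketch; the actual change is visible in the diff excerpt further down), the new module's shade configuration disables test-jar shading explicitly:

<configuration>
	<!-- This module has no test classes, so don't attempt to shade a test-jar. -->
	<shadeTestJar>false</shadeTestJar>
</configuration>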

@tillrohrmann (Contributor) left a comment:

Thanks for the rework of the modules @zentol. I think it makes sense to have dedicated modules for different jars. I was wondering whether it wouldn't also make sense to move the table classes Elasticsearch6UpsertTableSink and Elasticsearch6UpsertTableSinkFactory into the newly created module. That way we would complete the separation (this could also be a follow-up issue). Moreover, I wanted to ask whether you've tried your changes out.

</relocations>
<!-- Relocate the table format factory service file. -->
<transformers>
<transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
Contributor:
Why can we delete this transformer in the flink-sql-connector-elasticsearch6 module?

zentol (Contributor, Author):
Pulled in through the shade-flink execution.

<includes>
<include>com/facebook/presto/hadoop/**</include>
</includes>
</filter>
Contributor:
Why are these changes necessary?

zentol (Contributor, Author):
Eh... whoops. These don't belong in here; reverting.

<goal>shade</goal>
</goals>
<configuration>
<shadeTestJar>false</shadeTestJar>
Contributor:
Why do we disable the shadeTestJar property? What is the default, btw?

Contributor:
Forget this comment. You've already answered it here: #7251 (comment). Maybe the default is false and thus not needed here.

zentol (Contributor, Author):
The shade-flink execution that we inherit from sets it to true.
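
Putting the two answers together, a sketch of the assumed shape of the inherited shade-flink execution in Flink's parent pom; the exact layout is an assumption, but the two settings shown are the ones confirmed in this thread:

<execution>
	<id>shade-flink</id>
	<goals>
		<goal>shade</goal>
	</goals>
	<configuration>
		<!-- Enabled here, which is why flink-sql-connector-elasticsearch6
		     overrides it with false. -->
		<shadeTestJar>true</shadeTestJar>
		<transformers>
			<!-- Merges META-INF/services entries, so child modules do not need
			     to declare this transformer themselves. -->
			<transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
		</transformers>
	</configuration>
</execution>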

@zentol (Contributor, Author) commented Dec 7, 2018

It might make sense to move the SQL classes, but I believe this is an issue orthogonal to how the modules are structured. This PR is only about making the required module-structure changes to support licensing; moving these classes, however, is more about how to structure our own code.

In the long run, particularly for consistency (which is my main concern in this PR), I would very much like to have sql-client-specific extensions (and possibly table-related code in general) in separate modules for all connectors/formats, regardless of packaging, but that's a separate issue.

@tillrohrmann (Contributor) left a comment:

Thanks for the clarification @zentol. Then please create a follow-up issue to move the Table API-specific classes into the new sql-client-specific module. +1 for merging after trying the new module out with the SQL client.

@zentol (Contributor, Author) commented Dec 13, 2018

@twalthr What's your take on this? I wouldn't want to change the SQL packaging rules without someone from the SQL side signing off on it.

@twalthr (Contributor) left a comment:

Thank you @zentol. I tried to run the SQL Client end-to-end test but it seems to fail when executing the Elasticsearch query. I guess this is related to your changes.

Please also update the documentation link in docs/dev/table/connect.md.

2018-12-13 12:52:32,628 WARN  org.apache.flink.table.client.cli.CliClient                   - Could not execute SQL statement.
org.apache.flink.table.client.gateway.SqlExecutionException: Could not retrieve or create a cluster.
	at org.apache.flink.table.client.gateway.local.ProgramDeployer.deployJob(ProgramDeployer.java:102)
	at org.apache.flink.table.client.gateway.local.ProgramDeployer.run(ProgramDeployer.java:78)
	at org.apache.flink.table.client.gateway.local.LocalExecutor.executeUpdateInternal(LocalExecutor.java:393)
	at org.apache.flink.table.client.gateway.local.LocalExecutor.executeUpdate(LocalExecutor.java:310)
	at org.apache.flink.table.client.cli.CliClient.callInsertInto(CliClient.java:414)
	at org.apache.flink.table.client.cli.CliClient.lambda$submitUpdate$0(CliClient.java:213)
	at java.util.Optional.map(Optional.java:215)
	at org.apache.flink.table.client.cli.CliClient.submitUpdate(CliClient.java:210)
	at org.apache.flink.table.client.SqlClient.openCli(SqlClient.java:125)
	at org.apache.flink.table.client.SqlClient.start(SqlClient.java:105)
	at org.apache.flink.table.client.SqlClient.main(SqlClient.java:187)
Caused by: org.apache.flink.client.program.ProgramInvocationException: Could not submit job (JobID: ee04dad94643e5ea2a90a00848a620ee)
	at org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:250)
	at org.apache.flink.table.client.gateway.local.ProgramDeployer.deployJobOnExistingCluster(ProgramDeployer.java:170)
	at org.apache.flink.table.client.gateway.local.ProgramDeployer.deployJob(ProgramDeployer.java:99)
	... 10 more
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
	at org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$8(RestClusterClient.java:380)
	at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
	at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852)
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
	at org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$5(FutureUtils.java:203)
	at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
	at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
	at java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:561)
	at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:929)
	at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.flink.runtime.rest.util.RestClientException: [Not found.]
	at org.apache.flink.runtime.rest.RestClient.parseResponse(RestClient.java:380)
	at org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$3(RestClient.java:364)
	at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:952)
	at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)
	... 4 more
2018-12-13 12:52:32,632 ERROR org.apache.flink.table.client.SqlClient                       - SQL Client must stop.
org.apache.flink.table.client.SqlClientException: Could not submit given SQL update statement to cluster.
	at org.apache.flink.table.client.SqlClient.openCli(SqlClient.java:127)
	at org.apache.flink.table.client.SqlClient.start(SqlClient.java:105)
	at org.apache.flink.table.client.SqlClient.main(SqlClient.java:187)

@twalthr (Contributor) commented Dec 13, 2018

I just noticed that I have another Flink instance running on my machine. Will check the failure again.

@twalthr (Contributor) commented Dec 13, 2018

@zentol sorry for the false alarm. The changes look good to me (modulo my docs comment).

@zentol (Contributor, Author) commented Dec 13, 2018

Don't scare me like that 🗡. I've updated the docs, rebased the PR, and will merge it once Travis is green.
