[FLINK-11026][ES6] Rework creation of fat sql-client jars #7251
Conversation
There's an issue in this PR that causes …
The issue should be fixed; it was the uncommon problem of shading test-jars for modules that don't have any test classes.
Thanks for the rework of the modules @zentol. I think it makes sense to have dedicated modules for different jars. I was wondering whether it wouldn't also make sense to move the table classes `Elasticsearch6UpsertTableSink` and `Elasticsearch6UpsertTableSinkFactory` into the newly created module. That way we would complete the separation (this could also be a follow-up issue). Moreover, I wanted to ask whether you've tried your changes out?
```xml
        </relocations>
        <!-- Relocate the table format factory service file. -->
        <transformers>
            <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
```
Why can we delete this transformer in the `flink-sql-connector-elasticsearch6` module?
It is pulled in through the `shade-flink` execution.
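For context, here is a hedged sketch of how a `ServicesResourceTransformer` would be declared in an inherited shade-plugin execution such as `shade-flink`. The exact contents of Flink's parent pom are an assumption here; the point is that child modules inherit the `META-INF/services` merging and do not need to repeat it:

```xml
<!-- Hypothetical sketch of an inherited shade execution; the real
     shade-flink configuration lives in the Flink parent pom and may differ. -->
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <executions>
        <execution>
            <id>shade-flink</id>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <transformers>
                    <!-- Merges META-INF/services files (e.g. table factory
                         service files) from all shaded dependencies. -->
                    <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
                </transformers>
            </configuration>
        </execution>
    </executions>
</plugin>
```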
```xml
            <includes>
                <include>com/facebook/presto/hadoop/**</include>
            </includes>
        </filter>
```
Why are these changes necessary?
Eh... whoops. These don't belong in here, reverting.
```xml
                <goal>shade</goal>
            </goals>
            <configuration>
                <shadeTestJar>false</shadeTestJar>
```
Why do we disable the `shadeTestJar` property? What is the default, btw?
Forget this comment. You've already answered it here: #7251 (comment). Maybe the default is `false` and thus it's not needed here.
The `shade-flink` execution that we inherit from sets it to `true`.
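To illustrate the inheritance, a hedged sketch (the actual parent pom contents are an assumption here): the inherited `shade-flink` execution enables test-jar shading, and a module without any test classes overrides it to avoid the shading failure mentioned earlier in this thread:

```xml
<!-- Hypothetical parent pom (shade-flink execution): test-jar shading is on
     by default for all inheriting modules. -->
<configuration>
    <shadeTestJar>true</shadeTestJar>
</configuration>

<!-- Child module override: this module has no test classes, so disable
     test-jar shading to keep the shade goal from failing. -->
<configuration>
    <shadeTestJar>false</shadeTestJar>
</configuration>
```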
It might make sense to move the SQL classes, but I believe this is an issue orthogonal to how the modules are structured. This PR is only about making the module-structure changes required to support licensing; moving these classes is more about how we structure our own code. In the long run I would very much like to have sql-client-specific extensions (and possibly table stuff in general) in separate modules for all connectors/formats, regardless of packaging, particularly for consistency, which is my main concern in this PR; but that's a separate issue.
Thanks for the clarification @zentol. Then please create a follow-up issue to move the Table API-specific classes into the new sql-client-specific module. +1 for merging after trying the new module out with the SQL Client.
@twalthr What's your take on this? I wouldn't want to change the SQL packaging rules without anyone from the SQL side signing off on it.
Thank you @zentol. I tried to run the SQL Client end-to-end test, but it seems to fail when executing the Elasticsearch query. I guess this is related to your changes.
Please also update the documentation link in `docs/dev/table/connect.md`.
```
2018-12-13 12:52:32,628 WARN  org.apache.flink.table.client.cli.CliClient - Could not execute SQL statement.
org.apache.flink.table.client.gateway.SqlExecutionException: Could not retrieve or create a cluster.
	at org.apache.flink.table.client.gateway.local.ProgramDeployer.deployJob(ProgramDeployer.java:102)
	at org.apache.flink.table.client.gateway.local.ProgramDeployer.run(ProgramDeployer.java:78)
	at org.apache.flink.table.client.gateway.local.LocalExecutor.executeUpdateInternal(LocalExecutor.java:393)
	at org.apache.flink.table.client.gateway.local.LocalExecutor.executeUpdate(LocalExecutor.java:310)
	at org.apache.flink.table.client.cli.CliClient.callInsertInto(CliClient.java:414)
	at org.apache.flink.table.client.cli.CliClient.lambda$submitUpdate$0(CliClient.java:213)
	at java.util.Optional.map(Optional.java:215)
	at org.apache.flink.table.client.cli.CliClient.submitUpdate(CliClient.java:210)
	at org.apache.flink.table.client.SqlClient.openCli(SqlClient.java:125)
	at org.apache.flink.table.client.SqlClient.start(SqlClient.java:105)
	at org.apache.flink.table.client.SqlClient.main(SqlClient.java:187)
Caused by: org.apache.flink.client.program.ProgramInvocationException: Could not submit job (JobID: ee04dad94643e5ea2a90a00848a620ee)
	at org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:250)
	at org.apache.flink.table.client.gateway.local.ProgramDeployer.deployJobOnExistingCluster(ProgramDeployer.java:170)
	at org.apache.flink.table.client.gateway.local.ProgramDeployer.deployJob(ProgramDeployer.java:99)
	... 10 more
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
	at org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$8(RestClusterClient.java:380)
	at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
	at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852)
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
	at org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$5(FutureUtils.java:203)
	at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
	at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
	at java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:561)
	at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:929)
	at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.flink.runtime.rest.util.RestClientException: [Not found.]
	at org.apache.flink.runtime.rest.RestClient.parseResponse(RestClient.java:380)
	at org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$3(RestClient.java:364)
	at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:952)
	at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)
	... 4 more
2018-12-13 12:52:32,632 ERROR org.apache.flink.table.client.SqlClient - SQL Client must stop.
org.apache.flink.table.client.SqlClientException: Could not submit given SQL update statement to cluster.
	at org.apache.flink.table.client.SqlClient.openCli(SqlClient.java:127)
	at org.apache.flink.table.client.SqlClient.start(SqlClient.java:105)
	at org.apache.flink.table.client.SqlClient.main(SqlClient.java:187)
```
I just noticed that I have another Flink instance running on my machine. Will check the failure again.
@zentol sorry for the false alarm. The changes look good to me (modulo my docs comment).
Don't scare me like that 🗡. I've updated the docs, rebased the PR and will merge it once Travis is green.
Based on #7247.
What is the purpose of the change
This PR is a PoC for reworking the packaging of jars specific to the sql-client (which are basically just fat jars). Only the `flink-connector-elasticsearch6` module is covered here; if accepted, the same principle should be applied to the Kafka connectors (0.10, 0.11, 2) and all formats.

Instead of defining a separate shade-plugin execution with a custom `artifactSuffix`, this PR adds a dedicated `flink-sql-connector-elasticsearch6` module which only contains the packaging logic. This is similar to the approach we've already been using for `flink-shaded-hadoop2-uber`.

The main motivation for this is licensing: for accurate notice files it is necessary to supply each artifact with a distinct NOTICE file. This cannot be done within a single module in a reasonable way; we would have to un-package each created jar, add the appropriate license files, and re-pack it again. We'd end up with tightly coupled plugin definitions (since the names have to match!) and an overall more complicated (and slower!) build.
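As an illustration of the per-artifact NOTICE handling (the exact paths below are assumptions for the sketch, not taken from this PR), the module split lets each jar carry its own notice file:

```
flink-connectors/
├── flink-connector-elasticsearch6/            # connector code; thin jar
│   └── src/main/resources/META-INF/NOTICE     # notice for the thin artifact
└── flink-sql-connector-elasticsearch6/        # packaging-only module; fat sql jar
    └── src/main/resources/META-INF/NOTICE     # notice covering all bundled dependencies
```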
Brief change log
- Add a `flink-sql-connector-elasticsearch6` module containing the sql-client-specific shade-plugin configuration, and apply the following modifications: set the `executionId` to `shade-flink` and remove `shadedArtifactAttached` so only a single jar is deployed
- Remove the sql-jar packaging from `flink-connector-elasticsearch6`
- Add a `sqlJars` profile to `flink-connectors` to support skipping the creation of sql jars

Verifying this change
Covered by sql-client E2E test.
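The `sqlJars` profile mentioned in the change log could look roughly like the following. This is a hypothetical sketch: the profile id, the activation property name, and the module list are assumptions, not taken from this PR.

```xml
<!-- Hypothetical sketch of a sql-jars profile in flink-connectors/pom.xml;
     the actual profile id and activation property may differ. -->
<profile>
    <id>sql-jars</id>
    <activation>
        <property>
            <!-- Active by default; skipped when -DskipSqlJars is passed. -->
            <name>!skipSqlJars</name>
        </property>
    </activation>
    <modules>
        <module>flink-sql-connector-elasticsearch6</module>
    </modules>
</profile>
```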
Does this pull request potentially affect one of the following parts:
Documentation
I have not checked the documentation yet for references that would have to be changed.