
[FLINK-11026][ES6] Rework creation of fat sql-client jars #7251

Merged
merged 3 commits on Dec 18, 2018

Conversation

@zentol (Contributor) commented Dec 6, 2018

Based on #7247.

What is the purpose of the change

This PR is a PoC for reworking the packaging of jars specific to the sql-client (which are basically just fat jars). Only the flink-connector-elasticsearch6 module is covered here; if accepted, the same principle should be applied to the kafka connectors (0.10, 0.11, 2) and all formats.

Instead of defining a separate shade-plugin execution with a custom artifactSuffix, this PR adds a dedicated flink-sql-connector-elasticsearch6 module that only contains the packaging logic. This is similar to the approach we've already been using for flink-shaded-hadoop2-uber.

The main motivation for this is licensing: to produce accurate licensing information, it must be possible to supply each artifact with its own distinct NOTICE file.

This cannot be done within a single module in a reasonable way. We would have to un-package each created jar, add the appropriate license files, and re-pack it again. We'd end up with tightly coupled plugin definitions (since the names have to match!) and an overall more complicated (and slower!) build.

Brief change log

  • add a new flink-sql-connector-elasticsearch6 module containing the sql-client-specific shade-plugin configuration, and apply the following modifications (see the sketch after this list)
    • set executionId to shade-flink
    • disable shadedArtifactAttached so only a single jar is deployed
    • remove sql-jar suffix as it is no longer necessary
  • remove the sqlJars profile from flink-connector-elasticsearch6
  • add a sqlJars profile to flink-connectors to support skipping the creation of sql jars (sketched below)
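
For illustration, a minimal sketch of what the shade-plugin section of the new flink-sql-connector-elasticsearch6 pom.xml could look like; only the executionId, shadedArtifactAttached, and suffix changes listed above are taken from this PR, the surrounding layout is assumed:

<build>
	<plugins>
		<plugin>
			<groupId>org.apache.maven.plugins</groupId>
			<artifactId>maven-shade-plugin</artifactId>
			<executions>
				<execution>
					<!-- Reuse the id of the execution inherited from the parent pom,
					     so this configuration overrides it rather than adding a
					     second shade pass. -->
					<id>shade-flink</id>
					<phase>package</phase>
					<goals>
						<goal>shade</goal>
					</goals>
					<configuration>
						<!-- Deploy only the single shaded jar, with no classifier
						     (i.e. no sql-jar suffix). -->
						<shadedArtifactAttached>false</shadedArtifactAttached>
					</configuration>
				</execution>
			</executions>
		</plugin>
	</plugins>
</build>

Similarly, a sketch of how the sqlJars profile in flink-connectors could support skipping the sql jars; the skipSqlJars activation property is an assumption for illustration:

<profile>
	<id>sqlJars</id>
	<activation>
		<property>
			<!-- Active by default; skip the sql jars with -DskipSqlJars. -->
			<name>!skipSqlJars</name>
		</property>
	</activation>
	<modules>
		<module>flink-sql-connector-elasticsearch6</module>
	</modules>
</profile>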

Verifying this change

Covered by the sql-client E2E test.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes)

Documentation

I have not checked the documentation yet for references that would have to be changed.

@twalthr self-assigned this Dec 6, 2018
@zentol (Contributor, Author) commented Dec 7, 2018

There's an issue in this PR that causes mvn install to fail if the module was already built before. Currently investigating what is causing this.

@zentol (Contributor, Author) commented Dec 7, 2018

The issue should be fixed; it was the uncommon problem of shading test-jars in modules that don't have any test classes.
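
Concretely (a minimal sketch; the actual change is visible in the diff excerpt further down), the new module's shade configuration disables test-jar shading explicitly:

<configuration>
	<!-- This module has no test classes, so don't attempt to shade a test-jar. -->
	<shadeTestJar>false</shadeTestJar>
</configuration>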

@tillrohrmann (Contributor) left a comment:

Thanks for the rework of the modules @zentol. I think it makes sense to have dedicated modules for different jars. I was wondering whether it wouldn't also make sense to move the table classes Elasticsearch6UpsertTableSink and Elasticsearch6UpsertTableSinkFactory into the newly created module. That way we would complete the separation (this could also be a follow-up issue). Moreover, I wanted to ask whether you've tried your changes out.

</relocations>
<!-- Relocate the table format factory service file. -->
<transformers>
<transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
Contributor:
Why can we delete this transformer in the flink-sql-connector-elasticsearch6 module?

zentol (Contributor, Author):
Pulled in through the shade-flink execution.

<includes>
<include>com/facebook/presto/hadoop/**</include>
</includes>
</filter>
Contributor:
Why are these changes necessary?

zentol (Contributor, Author):
Eh... whoops. These don't belong in here; reverting.

<goal>shade</goal>
</goals>
<configuration>
<shadeTestJar>false</shadeTestJar>
Contributor:
Why do we disable the shadeTestJar property? What is the default, btw?

Contributor:
Forget this comment. You've already answered it here: #7251 (comment). Maybe the default is false and thus not needed here.

zentol (Contributor, Author):
The shade-flink execution that we inherit from sets it to true.
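
Putting the two answers together, a sketch of the assumed shape of the inherited shade-flink execution in Flink's parent pom; the exact layout is an assumption, but the two settings shown are the ones confirmed in this thread:

<execution>
	<id>shade-flink</id>
	<goals>
		<goal>shade</goal>
	</goals>
	<configuration>
		<!-- Enabled here, which is why flink-sql-connector-elasticsearch6
		     overrides it with false. -->
		<shadeTestJar>true</shadeTestJar>
		<transformers>
			<!-- Merges META-INF/services entries, so child modules do not need
			     to declare this transformer themselves. -->
			<transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
		</transformers>
	</configuration>
</execution>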

@zentol (Contributor, Author) commented Dec 7, 2018

It might make sense to move the SQL classes, but I believe this is an issue orthogonal to how the modules are structured. This PR is only about making the required module-structure changes to support licensing; moving these classes, however, is more about how to structure our own code.

In the long run, particularly for consistency (which is my main concern in this PR), I would very much like to have sql-client-specific extensions (and possibly table-related code in general) in separate modules for all connectors/formats, regardless of packaging, but that's a separate issue.

@tillrohrmann (Contributor) left a comment:

Thanks for the clarification @zentol. Then please create a follow-up issue to move the Table API-specific classes into the new sql-client-specific module. +1 for merging after trying the new module out with the SQL client.

@zentol (Contributor, Author) commented Dec 13, 2018

@twalthr What's your take on this? I wouldn't want to change the SQL packaging rules without someone from the SQL side signing off on it.

@twalthr (Contributor) left a comment:

Thank you @zentol. I tried to run the SQL Client end-to-end test but it seems to fail when executing the Elasticsearch query. I guess this is related to your changes.

Please also update the documentation link in docs/dev/table/connect.md.

2018-12-13 12:52:32,628 WARN  org.apache.flink.table.client.cli.CliClient                   - Could not execute SQL statement.
org.apache.flink.table.client.gateway.SqlExecutionException: Could not retrieve or create a cluster.
	at org.apache.flink.table.client.gateway.local.ProgramDeployer.deployJob(ProgramDeployer.java:102)
	at org.apache.flink.table.client.gateway.local.ProgramDeployer.run(ProgramDeployer.java:78)
	at org.apache.flink.table.client.gateway.local.LocalExecutor.executeUpdateInternal(LocalExecutor.java:393)
	at org.apache.flink.table.client.gateway.local.LocalExecutor.executeUpdate(LocalExecutor.java:310)
	at org.apache.flink.table.client.cli.CliClient.callInsertInto(CliClient.java:414)
	at org.apache.flink.table.client.cli.CliClient.lambda$submitUpdate$0(CliClient.java:213)
	at java.util.Optional.map(Optional.java:215)
	at org.apache.flink.table.client.cli.CliClient.submitUpdate(CliClient.java:210)
	at org.apache.flink.table.client.SqlClient.openCli(SqlClient.java:125)
	at org.apache.flink.table.client.SqlClient.start(SqlClient.java:105)
	at org.apache.flink.table.client.SqlClient.main(SqlClient.java:187)
Caused by: org.apache.flink.client.program.ProgramInvocationException: Could not submit job (JobID: ee04dad94643e5ea2a90a00848a620ee)
	at org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:250)
	at org.apache.flink.table.client.gateway.local.ProgramDeployer.deployJobOnExistingCluster(ProgramDeployer.java:170)
	at org.apache.flink.table.client.gateway.local.ProgramDeployer.deployJob(ProgramDeployer.java:99)
	... 10 more
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
	at org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$8(RestClusterClient.java:380)
	at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
	at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852)
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
	at org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$5(FutureUtils.java:203)
	at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
	at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
	at java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:561)
	at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:929)
	at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.flink.runtime.rest.util.RestClientException: [Not found.]
	at org.apache.flink.runtime.rest.RestClient.parseResponse(RestClient.java:380)
	at org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$3(RestClient.java:364)
	at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:952)
	at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)
	... 4 more
2018-12-13 12:52:32,632 ERROR org.apache.flink.table.client.SqlClient                       - SQL Client must stop.
org.apache.flink.table.client.SqlClientException: Could not submit given SQL update statement to cluster.
	at org.apache.flink.table.client.SqlClient.openCli(SqlClient.java:127)
	at org.apache.flink.table.client.SqlClient.start(SqlClient.java:105)
	at org.apache.flink.table.client.SqlClient.main(SqlClient.java:187)

@twalthr (Contributor) commented Dec 13, 2018

I just noticed that I have another Flink instance running on my machine. Will check the failure again.

@twalthr (Contributor) commented Dec 13, 2018

@zentol sorry for the false alarm. The changes look good to me (modulo my docs comment).

@zentol (Contributor, Author) commented Dec 13, 2018

Don't scare me like that 🗡. I've updated the docs, rebased the PR, and will merge it once Travis is green.
