
[FLINK-24687][table] Move FileSystemTableSource/Sink in flink-connector-files #17897

Closed
wants to merge 15 commits

Conversation

slinkydeveloper (Contributor)

What is the purpose of the change

This PR moves FileSystemTableSource/Sink out of flink-table-runtime and into the flink-connector-files module. The PR ended up being bigger than expected, so I tried to factor out the preliminary changes to the actual file move into separate hotfix commits. More details in the Commit details section below.

Brief change log

Class changes:

  • Renamed package org.apache.flink.table.filesystem to org.apache.flink.connector.file.table
  • Moved every production class in org.apache.flink.connector.file.table from flink-table-runtime to flink-connector-files
  • Moved some test classes in org.apache.flink.connector.file.table from flink-table-runtime to flink-connector-files, and others to flink-table-planner. No filesystem-related test lives in flink-table-runtime anymore
  • Reworked the testcsv format and moved it into flink-table-planner
  • Moved the columnar RowData implementations from flink-table-runtime to flink-table-common, under the package org.apache.flink.table.data.columnar
  • Moved FileSystemFormatFactory, BulkWriterFormatFactory and BulkReaderFormatFactory to flink-connector-files, into a new package org.apache.flink.connector.file.table.factories
  • Moved BulkDecodingFormat to flink-connector-files, into a new package org.apache.flink.connector.file.table.formats

Module changes:

  • No flink-table-* module depends on flink-connector-files in its production classpath anymore
  • flink-table-planner still depends on flink-connector-files in its test classpath
  • flink-orc and flink-parquet don't depend on flink-table-runtime anymore, so their Scala suffix is dropped
  • Every format module now depends optionally on flink-connector-files to implement BulkWriterFormatFactory and BulkReaderFormatFactory

Commit details

  • Fixed the FactoryUtil loading mechanism to tolerate NoClassDefFoundError: this is necessary when users bring in a format module implementing BulkReaderFormatFactory/BulkWriterFormatFactory but don't have flink-connector-files on the classpath because they don't use it. E.g. you're using flink-connector-kafka and you want to use flink-avro, which brings in an implementation of BulkWriterFormatFactory. This scenario is already exercised by every test in flink-connector-kafka, whose test classpath does not include flink-connector-files. (A sketch of this tolerant discovery follows this list.)
  • Removed the planner dependency on FileSystemConnectorOptions by hardcoding the used option
  • Copied DecimalDataUtils#is32BitDecimal and DecimalDataUtils#is64BitDecimal into ParquetSchemaConverter to remove the dependency on DecimalDataUtils (which lives in the runtime module); see the precision-check sketch after this list
  • Refactored the testcsv format:
    • Reimplemented the writing side using SerializationSchemaFormatFactory, since the implementation of BulkReaderFormatFactory was causing issues when using the flink-table-planner test-jar in other modules.
    • Tried to make the format implementation as independent as possible from planner-internal classes, but this cannot be done completely due to an issue in FileSystemTableSink#createSourceContext, which cannot provide the DataStructureConverter. This issue can potentially come up with other formats used by FileSystemFormatFactory as well, so we probably need to address it separately from this PR.
    • Moved it to a new ad-hoc package inside flink-table-planner
  • Big commit that moves code. Besides moving code and fixing imports as described above, it does the following:
    • Both the parquet and orc formats were using InternalTypeInfo to create the TypeInformation. Now they use the context.
    • Changed some javadocs of deprecated APIs in org.apache.flink.table.sinks to remove the dependency on flink-connector-files types
    • All the code moved into flink-table-common is now annotated with @Internal
    • Some tests couldn't be moved to flink-connector-files because they depend on several test utilities/test bases from the runtime module. Nevertheless, they have been adapted to the new package name.
    • Every module depending on either the moved format factories or the filesystem table sink/source internals now depends on flink-connector-files
  • Other packaging changes are kept in separate commits for clarity:
    • Dropped the Scala suffix from flink-orc and flink-parquet
    • flink-table-uber now needs to explicitly include flink-connector-files (as it's no longer a transitive dependency via flink-table-common)
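
The tolerant discovery described in the first commit can be sketched as follows. This is a minimal sketch built on a plain ServiceLoader loop; the actual FactoryUtil internals (and the loadResult wrapper visible in the review snippets below) may differ.

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.ServiceLoader;

final class TolerantDiscovery {

    // Discovers all providers of factoryClass, skipping those that reference
    // classes missing from the classpath (e.g. an absent optional
    // flink-connector-files dependency).
    static <T> List<T> discover(Class<T> factoryClass, ClassLoader classLoader) {
        List<T> factories = new ArrayList<>();
        Iterator<T> iterator = ServiceLoader.load(factoryClass, classLoader).iterator();
        while (true) {
            try {
                if (!iterator.hasNext()) {
                    return factories;
                }
                factories.add(iterator.next());
            } catch (Throwable t) {
                // ServiceLoader may wrap the linkage failure, so check the cause too.
                if (t instanceof NoClassDefFoundError
                        || t.getCause() instanceof NoClassDefFoundError) {
                    continue; // provider depends on an absent optional module: skip it
                }
                throw new RuntimeException("Unexpected error during factory discovery", t);
            }
        }
    }
}

The copied decimal helpers boil down to simple precision checks. This is a hedged reconstruction; the exact thresholds in ParquetSchemaConverter are assumed from the int/long value ranges.

final class DecimalPrecision {
    // A DECIMAL(p, s) value fits in an int when p <= 9 (Integer.MAX_VALUE has
    // 10 digits) and in a long when p <= 18 (Long.MAX_VALUE has 19 digits).
    static boolean is32BitDecimal(int precision) {
        return precision <= 9;
    }

    static boolean is64BitDecimal(int precision) {
        return precision > 9 && precision <= 18;
    }
}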

Verifying this change

Every change is already covered by existing tests

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): moves dependencies around
  • The public API, i.e., is any changed class annotated with @Public(Evolving): changes the package name of flink-orc and flink-parquet
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? not applicable

@flinkbot (Collaborator) commented Nov 24, 2021

CI report:

Bot commands — The @flinkbot bot supports the following commands:
  • @flinkbot run azure — re-run the last Azure build

@flinkbot (Collaborator)

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit 620501a (Wed Nov 24 15:32:22 UTC 2021)

Warnings:

  • 16 pom.xml files were touched: Check for build and licensing issues.
  • No documentation files were touched! Remember to keep the Flink docs up to date!

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.


The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands — The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@Airblader (Contributor)

Awesome stuff! Please make sure to rebase to current master and add this connector to flink-architecture-tests 😃

@slinkydeveloper (Contributor, Author)

> Awesome stuff! Please make sure to rebase to current master and add this connector to flink-architecture-tests

The files connector is already in flink-architecture-tests. I updated the violations file; apparently I fixed one violation without knowing :)

@fapaul left a comment

Thanks for all the effort disentangling the dependencies 👏

Overall, the changes look correct, but I would propose changing the commit structure. In general, [hotfix] commits are only useful for really trivial changes (e.g. fixing license header formatting), not for work related to a certain ticket.

I'd propose that all commits on this PR include the ticket number, and it would also be great to add the affected modules, e.g. [table][file][orc]....

Can you also double-check that the classes you moved to the file connector package are annotated with @Internal where users should not use them directly?

loadResult -> {
    if (loadResult.failed()) {
        if (loadResult.getError() instanceof NoClassDefFoundError) {
            LOG.debug(

We should definitely write a test to verify that the discovery does not fail if the base interface is not found

.forEachRemaining(
        loadResult -> {
            if (loadResult.failed()) {
                if (loadResult.getError() instanceof NoClassDefFoundError) {

Can we be more precise about which error we want to catch? So far we only want to catch the exception for the flink-connector-files interfaces but I do not think we should continue in general for NoClassDefFoundError.

slinkydeveloper (Contributor, Author):

The issue is that NoClassDefFoundError and its cause ClassNotFoundException both have no method to get the name of the class that failed to load.

I could extract the class name from the exception message, but I feel like that could be an unstable method and might break with certain JDKs... WDYT? Do you have some alternatives in mind?
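
For illustration, such a message-based extraction might look like this (hypothetical helper, not part of the PR; the message format is unspecified and may differ across JDKs, which is exactly the concern above):

final class MissingClassHeuristics {
    // Hypothetical: NoClassDefFoundError reports the missing class only via
    // its message, typically in internal form like "org/apache/flink/...".
    // Parsing it is fragile since the format is not part of any specification.
    static boolean isMissingFileConnectorClass(NoClassDefFoundError error) {
        String message = error.getMessage();
        return message != null
                && message.replace('/', '.').startsWith("org.apache.flink.connector.file.");
    }
}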

public boolean hasNext() {
    try {
        return iterator.hasNext();
    } catch (Throwable t) {

What kind of Throwable do we expect here?

slinkydeveloper (Contributor, Author):

Check https://docs.oracle.com/javase/8/docs/api/java/util/ServiceLoader.html#iterator-- for more details. In short, both hasNext() and next() might fail according to that javadoc, and they can throw any kind of error:

> Its hasNext and next methods can therefore throw a ServiceConfigurationError if a provider-configuration file violates the specified format, or if it names a provider class that cannot be found and instantiated, or if the result of instantiating the class is not assignable to the service type, or if any other kind of exception or error is thrown as the next provider is located and instantiated.

@@ -53,12 +52,14 @@ class BatchPhysicalLegacySinkRule extends ConverterRule(
    val dynamicPartIndices =
      dynamicPartFields.map(partitionSink.getTableSchema.getFieldNames.indexOf(_))

    // TODO This option is hardcoded to remove the dependency of planner from

Do we already have a follow-up ticket to introduce a separate option?


@JingsongLi do you know if this option is still actively used?

Contributor:

I know of others who use it, because we don't have the PARTITION BY syntax in SQL.

flink-connectors/flink-connector-files/pom.xml (outdated; resolved)
@@ -88,6 +88,11 @@ under the License.
            <artifactId>flink-cep</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>

Why do we again include the file connector in the uber-jar?

slinkydeveloper (Contributor, Author):

This PR doesn't change how the uber jar is built; on master it already includes flink-connector-files. The reason for this specific change is that, before this PR, flink-table-common depended on flink-connector-files, so flink-table-uber depended on flink-connector-files through flink-table-common. Now none of the flink-table-* modules depends on flink-connector-files anymore, so we need to import it here explicitly.

Contributor:

I'm also skeptical about this change. It would be better to explicitly add flink-connector-files as a separate dependency to lib. It must not be part of the table jars.

Contributor:

We should also add a comment in the docs that the filesystem connector is in lib by default.

slinkydeveloper (Contributor, Author):

> I'm also skeptical about this change

But this is no different from master. If you build flink-table-uber on master, you'll see that flink-connector-files is inside the resulting jar, because it's brought in transitively from flink-table-common. If we want to make that change, I suggest we do it in a separate PR with a separate issue, as it's user facing.


Hmm, but when reading the ticket https://issues.apache.org/jira/browse/FLINK-24687, isn't that exactly what we want to solve? We want to exclude flink-connector-files from the uber jar to prevent conflicts with user jars; if we keep including it, we haven't fixed the problem.

slinkydeveloper (Contributor, Author):

Uhm you're right, I forgot what the initial issue title was about 😃

slinkydeveloper (Contributor, Author):

I've pushed a commit that removes the flink-connector-files shading from table-uber and adds flink-connector-files to the lib folder of the distribution. I've also manually tested the new distribution output with a sample source-to-sink SQL batch job using the SQL client (a sketch of such a job follows).
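
For reference, that kind of smoke-test job could look roughly like this. This is a sketch only: the table schemas, paths, and format are made up, since the comment doesn't say which exact job was used.

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class FilesystemConnectorSmokeTest {
    public static void main(String[] args) throws Exception {
        TableEnvironment env =
                TableEnvironment.create(EnvironmentSettings.newInstance().inBatchMode().build());

        // Hypothetical source table backed by the filesystem connector.
        env.executeSql(
                "CREATE TABLE src (id INT, name STRING) WITH ("
                        + " 'connector' = 'filesystem',"
                        + " 'path' = '/tmp/src',"
                        + " 'format' = 'csv')");

        // Hypothetical sink table with the same schema.
        env.executeSql(
                "CREATE TABLE dst (id INT, name STRING) WITH ("
                        + " 'connector' = 'filesystem',"
                        + " 'path' = '/tmp/dst',"
                        + " 'format' = 'csv')");

        // Run the batch job and wait for completion.
        env.executeSql("INSERT INTO dst SELECT id, name FROM src").await();
    }
}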

This is the distribution tree now:

$ tree
.
├── bin
│   ├── bash-java-utils.jar
│   ├── config.sh
│   ├── find-flink-home.sh
│   ├── flink
│   ├── flink-console.sh
│   ├── flink-daemon.sh
│   ├── historyserver.sh
│   ├── jobmanager.sh
│   ├── kubernetes-jobmanager.sh
│   ├── kubernetes-session.sh
│   ├── kubernetes-taskmanager.sh
│   ├── pyflink-shell.sh
│   ├── sql-client.sh
│   ├── standalone-job.sh
│   ├── start-cluster.sh
│   ├── start-zookeeper-quorum.sh
│   ├── stop-cluster.sh
│   ├── stop-zookeeper-quorum.sh
│   ├── taskmanager.sh
│   ├── yarn-session.sh
│   └── zookeeper.sh
├── conf
│   ├── flink-conf.yaml
│   ├── log4j-cli.properties
│   ├── log4j-console.properties
│   ├── log4j.properties
│   ├── log4j-session.properties
│   ├── logback-console.xml
│   ├── logback-session.xml
│   ├── logback.xml
│   ├── masters
│   ├── workers
│   └── zoo.cfg
├── examples
│   ├── batch
│   │   ├── ConnectedComponents.jar
│   │   ├── DistCp.jar
│   │   ├── EnumTriangles.jar
│   │   ├── KMeans.jar
│   │   ├── PageRank.jar
│   │   ├── TransitiveClosure.jar
│   │   ├── WebLogAnalysis.jar
│   │   └── WordCount.jar
│   ├── gelly
│   │   └── flink-gelly-examples_2.12-1.15-SNAPSHOT.jar
│   ├── python
│   │   ├── datastream
│   │   │   ├── event_time_timer.py
│   │   │   ├── __init__.py
│   │   │   ├── process_json_data.py
│   │   │   ├── state_access.py
│   │   │   └── word_count.py
│   │   └── table
│   │       ├── __init__.py
│   │       ├── mixing_use_of_datastream_and_table.py
│   │       ├── multi_sink.py
│   │       ├── pandas
│   │       │   ├── conversion_from_dataframe.py
│   │       │   ├── __init__.py
│   │       │   └── pandas_udaf.py
│   │       ├── process_json_data.py
│   │       ├── process_json_data_with_udf.py
│   │       ├── windowing
│   │       │   ├── __init__.py
│   │       │   ├── over_window.py
│   │       │   ├── session_window.py
│   │       │   ├── sliding_window.py
│   │       │   └── tumble_window.py
│   │       └── word_count.py
│   ├── streaming
│   │   ├── Iteration.jar
│   │   ├── SessionWindowing.jar
│   │   ├── SocketWindowWordCount.jar
│   │   ├── StateMachineExample.jar
│   │   ├── TopSpeedWindowing.jar
│   │   ├── Twitter.jar
│   │   ├── WindowJoin.jar
│   │   └── WordCount.jar
│   └── table
│       ├── AdvancedFunctionsExample.jar
│       ├── ChangelogSocketExample.jar
│       ├── GettingStartedExample.jar
│       ├── StreamSQLExample.jar
│       ├── StreamWindowSQLExample.jar
│       ├── UpdatingTopCityExample.jar
│       └── WordCountSQLExample.jar
├── lib
│   ├── flink-connector-files-1.15-SNAPSHOT.jar
│   ├── flink-csv-1.15-SNAPSHOT.jar
│   ├── flink-dist-1.15-SNAPSHOT.jar
│   ├── flink-json-1.15-SNAPSHOT.jar
│   ├── flink-scala_2.12-1.15-SNAPSHOT.jar
│   ├── flink-shaded-zookeeper-3.4.14.jar
│   ├── flink-table_2.12-1.15-SNAPSHOT.jar
│   ├── log4j-1.2-api-2.14.1.jar
│   ├── log4j-api-2.14.1.jar
│   ├── log4j-core-2.14.1.jar
│   └── log4j-slf4j-impl-2.14.1.jar
├── LICENSE
├── log
│   ├── flink-slinkydeveloper-sql-client-fedora.log
│   ├── flink-slinkydeveloper-standalonesession-0-fedora.log
│   ├── flink-slinkydeveloper-standalonesession-0-fedora.out
│   ├── flink-slinkydeveloper-taskexecutor-0-fedora.log
│   └── flink-slinkydeveloper-taskexecutor-0-fedora.out
├── opt
│   ├── flink-azure-fs-hadoop-1.15-SNAPSHOT.jar
│   ├── flink-cep-1.15-SNAPSHOT.jar
│   ├── flink-cep-scala_2.12-1.15-SNAPSHOT.jar
│   ├── flink-gelly-1.15-SNAPSHOT.jar
│   ├── flink-gelly-scala_2.12-1.15-SNAPSHOT.jar
│   ├── flink-oss-fs-hadoop-1.15-SNAPSHOT.jar
│   ├── flink-python_2.12-1.15-SNAPSHOT.jar
│   ├── flink-queryable-state-runtime-1.15-SNAPSHOT.jar
│   ├── flink-s3-fs-hadoop-1.15-SNAPSHOT.jar
│   ├── flink-s3-fs-presto-1.15-SNAPSHOT.jar
│   ├── flink-shaded-netty-tcnative-dynamic-2.0.39.Final-14.0.jar
│   ├── flink-shaded-zookeeper-3.5.9.jar
│   ├── flink-sql-client_2.12-1.15-SNAPSHOT.jar
│   ├── flink-state-processor-api-1.15-SNAPSHOT.jar
│   └── python
│       ├── cloudpickle-1.2.2-src.zip
│       ├── py4j-0.10.8.1-src.zip
│       └── pyflink.zip
├── plugins
│   ├── external-resource-gpu
│   │   ├── flink-external-resource-gpu-1.15-SNAPSHOT.jar
│   │   ├── gpu-discovery-common.sh
│   │   └── nvidia-gpu-discovery.sh
│   ├── metrics-datadog
│   │   └── flink-metrics-datadog-1.15-SNAPSHOT.jar
│   ├── metrics-graphite
│   │   └── flink-metrics-graphite-1.15-SNAPSHOT.jar
│   ├── metrics-influx
│   │   └── flink-metrics-influxdb-1.15-SNAPSHOT.jar
│   ├── metrics-jmx
│   │   └── flink-metrics-jmx-1.15-SNAPSHOT.jar
│   ├── metrics-prometheus
│   │   └── flink-metrics-prometheus-1.15-SNAPSHOT.jar
│   ├── metrics-slf4j
│   │   └── flink-metrics-slf4j-1.15-SNAPSHOT.jar
│   ├── metrics-statsd
│   │   └── flink-metrics-statsd-1.15-SNAPSHOT.jar
│   └── README.txt
└── README.txt

@slinkydeveloper force-pushed the FLINK-24687 branch 2 times, most recently from 80e8bf5 to 6793541 on November 29, 2021 15:22
@twalthr (Contributor) left a comment

Great work @slinkydeveloper. I added some comments. Same feedback as before: we could have split this PR into a couple of commits. E.g. the relocation of the columnar classes to flink-table-common would have been a nice commit on its own. It is very annoying for reviewers to scan single commits that contain both refactorings and crucial changes.


import static org.apache.flink.api.java.io.CsvOutputFormat.DEFAULT_FIELD_DELIMITER;

class TestCsvSerializationSchema implements SerializationSchema<RowData> {
Contributor:

Again: please add JavaDocs. What is this class good for?

slinkydeveloper (Contributor, Author):

There is already a doc on TestCsvFormatFactory; let me link it.


@slinkydeveloper (Contributor, Author)

@fapaul @twalthr this is ready for another pass. I added the test for discoverFactories, fixed the pom, and also took the chance to convert FactoryUtilTest to use AssertJ.

@twalthr (Contributor) left a comment

Thanks for the update. We should also update sql_connectors.yml to remove the Scala suffixes. Please scan the docs one more time to check whether there is more to update.

@@ -129,6 +129,12 @@ under the License.
        <!-- Default file system support. The Hadoop and MapR dependencies -->
        <!-- are optional, so not being added to the dist jar -->

        <dependency>
Contributor:

With this you add flink-connector-files to the flink-dist jar, no? So it is in the release two times?

slinkydeveloper (Contributor, Author):

What makes you say it is in the release two times? My understanding is that on master, flink-connector-files is included as part of the table uber jar, which is imported here and copied into /lib. With this PR, flink-connector-files is not part of the table uber jar but is imported directly and copied into /lib.

I might be missing something here...

slinkydeveloper (Contributor, Author):

Also, in the tree here I see it only once: #17897 (comment)

// Delete the sub interface now, so it can't be loaded
Files.delete(tempDir.resolve(subInterfaceName + ".class"));

assertThat(FactoryUtil.discoverFactories(classLoaderIncludingTheInterface))
Contributor:

nice test 👍
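
Filling in the context around the quoted snippet, the shape of such a test might be as follows. This is a hedged sketch (JUnit 5 + AssertJ assumed); the preparation of the compiled factory classes and the real assertions in FactoryUtilTest are left as assumptions.

import static org.assertj.core.api.Assertions.assertThat;

import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import org.apache.flink.table.factories.FactoryUtil;
import org.junit.jupiter.api.Test;

class TolerantDiscoveryTest {

    // Assumed to be prepared beforehand (e.g. by compiling test sources into
    // it): a directory containing a factory class plus the sub interface it
    // implements.
    private final Path tempDir = Paths.get("target/generated-factories");
    private final String subInterfaceName = "TestSubInterface";

    @Test
    void discoveryToleratesMissingClasses() throws Exception {
        URLClassLoader classLoader =
                new URLClassLoader(new URL[] {tempDir.toUri().toURL()});

        // Delete the sub interface so the factory implementing it can no
        // longer be linked, provoking NoClassDefFoundError during discovery.
        Files.delete(tempDir.resolve(subInterfaceName + ".class"));

        // Discovery must succeed and simply omit the unloadable factory.
        assertThat(FactoryUtil.discoverFactories(classLoader)).isNotNull();
    }
}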

@slinkydeveloper force-pushed the FLINK-24687 branch 4 times, most recently from fc4cc48 to 2217c68 on December 3, 2021 09:06
Signed-off-by: slinkydeveloper <francescoguard@gmail.com>
…ies in the flink-dist to enforce the reactor order

Signed-off-by: slinkydeveloper <francescoguard@gmail.com>
…ase in its uber jar

Signed-off-by: slinkydeveloper <francescoguard@gmail.com>
…o tolerate NoClassDefFoundError. Added a test and converted FactoryUtil to use assertj.

Signed-off-by: slinkydeveloper <francescoguard@gmail.com>
…onnectorOptions

Signed-off-by: slinkydeveloper <francescoguard@gmail.com>
…imalDataUtils#is32BitDecimal in ParquetSchemaConverter to remove the dependency on DecimalDataUtils (from planner)

Signed-off-by: slinkydeveloper <francescoguard@gmail.com>
…dent of planner (except ScanRuntimeProviderContext.INSTANCE::createDataStructureConverter) and to implement SerializationSchema more than BulkWriterFormatFactory. Moved to a specific package

Signed-off-by: slinkydeveloper <francescoguard@gmail.com>
@twalthr (Contributor) left a comment

Thanks for the update. One last comment. Should be good in the next iteration. Please rebase and resolve merge conflicts.

docs/data/sql_connectors.yml (resolved)
…FileSystemTableSource, BulkDecodingFormat, FileSystemFormatFactory, BulkWriterFormatFactory and BulkReaderFormatFactory to flink-connector-files. Move columnar data types to flink-table-common. Now table packages don't depend on flink-connector-files anymore. Fix orc and parquet format to use only common classes and not planner nor runtime classes.

Signed-off-by: slinkydeveloper <francescoguard@gmail.com>
…rfaces

Signed-off-by: slinkydeveloper <francescoguard@gmail.com>
…k-parquet

Signed-off-by: slinkydeveloper <francescoguard@gmail.com>
Signed-off-by: slinkydeveloper <francescoguard@gmail.com>
…endency, which was previously brought in through flink-table-api-java-bridge -> flink-table-api-java -> flink-table-common -> flink-connector-files -> flink-connector-base.

Signed-off-by: slinkydeveloper <francescoguard@gmail.com>
Signed-off-by: slinkydeveloper <francescoguard@gmail.com>
@slinkydeveloper force-pushed the FLINK-24687 branch 2 times, most recently from fd6cc2f to 004179a on December 3, 2021 09:45
@slinkydeveloper (Contributor, Author)

@flinkbot run azure

…ide table-uber anymore but it's loaded in /lib in the distribution as flink-connector-files

Signed-off-by: slinkydeveloper <francescoguard@gmail.com>
Signed-off-by: slinkydeveloper <francescoguard@gmail.com>
@twalthr (Contributor) left a comment

LGTM

@twalthr closed this in 9bbadb9 on Dec 3, 2021
@slinkydeveloper deleted the FLINK-24687 branch on December 3, 2021 16:58
niklassemmler pushed a commit to niklassemmler/flink that referenced this pull request Feb 3, 2022
…TableSource to flink-connector-files and columnar support to flink-table-common

Now table packages don't depend on flink-connector-files anymore. Fix orc and parquet format to use only common classes and not planner nor runtime classes.

- [connector-files] Add @internal to all public classes and interfaces
- [orc][parquet][hive] Drop scala suffix from flink-orc and flink-parquet
- [architecture-tests] Updated the violations file
- [connector-elasticsearch-base] Add flink-connector-base as dependency, which was previously brought in through flink-table-api-java-bridge -> flink-table-api-java -> flink-table-common -> flink-connector-files -> flink-connector-base.
- [orc][parquet] Add issue link for partition keys handling
- [table-uber][dist] Now flink-connector-files is not shaded inside table-uber anymore but it's loaded in /lib in the distribution as flink-connector-files
- [docs] Update sql_connectors.yml

This closes apache#17897.