[FLINK-29880][hive] Introduce auto compaction for Hive sink in batch mode #21703
Conversation
0a2a9f6 to 079e6bd (compare)
@flinkbot run azure
@luoyuxia Thanks for your contribution, I left some comments.
            final int parallelism)
            throws IOException {
        return dataStream
                .map((MapFunction<RowData, Row>) value -> (Row) converter.toExternal(value))
Suggested change:
- .map((MapFunction<RowData, Row>) value -> (Row) converter.toExternal(value))
+ .map(value -> ((Row) converter.toExternal(value)))
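As context for dropping the cast: a lambda only needs an explicit functional-interface cast when the compiler cannot otherwise infer the target type. A minimal stdlib-only sketch, where the `MapFn` interface and `applyOnce` helper are hypothetical stand-ins for Flink's `MapFunction` and the `map` call, not the real API:

```java
public class LambdaCastSketch {
    // Hypothetical stand-in for Flink's MapFunction<IN, OUT> interface.
    interface MapFn<I, O> {
        O map(I value);
    }

    // Hypothetical helper that plays the role of DataStream#map here.
    static <I, O> O applyOnce(MapFn<I, O> fn, I value) {
        return fn.map(value);
    }

    public static void main(String[] args) {
        // Form 1: explicit cast to the functional interface.
        String a = applyOnce((MapFn<Integer, String>) v -> "row-" + v, 1);
        // Form 2: the cast is redundant, since the parameter position
        // already fixes the lambda's target type.
        String b = applyOnce(v -> "row-" + v, 1);
        System.out.println(a + " " + b);
    }
}
```

Both forms compile to the same behavior; whether Flink's own type extraction still needs the explicit `MapFunction` cast is a separate question for the real code.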
            DynamicTableSink.DataStructureConverter converter,
            FileSystemOutputFormat<Row> fileSystemOutputFormat,
            final int parallelism)
            throws IOException {
There is no need to throw an exception here.
            boolean isToLocal,
            boolean overwrite,
            final int compactParallelism)
            throws IOException {
Ditto
        DataStream<CoordinatorInput> writerDataStream =
                dataStream
                        .map((MapFunction<RowData, Row>) value -> (Row) converter.toExternal(value))
Suggested change:
- .map((MapFunction<RowData, Row>) value -> (Row) converter.toExternal(value))
+ .map(value -> (Row) converter.toExternal(value))
public class BatchSink {
    private BatchSink() {}

    public static DataStreamSink<Row> createBatchNoAutoCompactSink(
I think `createBatchSink` is enough.
I still prefer `createBatchNoAutoCompactSink`, which is consistent with `createBatchCompactSink`.
        PartitionCommitPolicyFactory partitionCommitPolicyFactory =
                new PartitionCommitPolicyFactory(
                        conf.get(HiveOptions.SINK_PARTITION_COMMIT_POLICY_KIND),
Should we also check that the commit policy is not null when the table has a partition key, like the streaming sink does?
No, we don't need to. In batch mode, it will always commit partitions even though the metastore policy has been configured.
        catalogTable.getOptions().forEach(conf::setString);
        boolean autoCompaction = conf.getBoolean(FileSystemConnectorOptions.AUTO_COMPACTION);
        if (autoCompaction) {
            if (batchShuffleMode != BatchShuffleMode.ALL_EXCHANGES_BLOCKING) {
I think in pipeline shuffle mode, auto compaction can also work.
Thanks for pointing it out. After thinking it over, yes, it should still work in pipelined shuffle mode.
    public static final ConfigOption<MemorySize> COMPACT_SMALL_FILES_AVG_SIZE =
            key("compaction.small-files.avg-size")
                    .memoryType()
                    .defaultValue(MemorySize.ofMebiBytes(16))
How did you get this default value? Is it reasonable for users?
I'm not sure. But it comes from Hive, so I think it should be reasonable at least for Hive users.
    private TableEnvironment createNoBlockingModeTableEnv() {
        EnvironmentSettings settings = EnvironmentSettings.inBatchMode();
        settings.getConfiguration()
                .set(ExecutionOptions.BATCH_SHUFFLE_MODE, BatchShuffleMode.ALL_EXCHANGES_PIPELINED);
I think this option is job level, so we don't need to create a new TableEnvironment.
tableEnv.getConfig().set(ExecutionOptions.BATCH_SHUFFLE_MODE, BatchShuffleMode.ALL_EXCHANGES_PIPELINED);
@lsyldliu Thanks for reviewing. I have addressed your comments.
    }

    @Test
    public void testNoCompaction() throws Exception {
I think we should use a JUnit parameterized test to cover the two cases: ALL_EXCHANGES_PIPELINED and ALL_EXCHANGES_BLOCKING.
Indeed, I didn't intend to add a case to cover ALL_EXCHANGES_BLOCKING, for it'll increase the test time. We always try to reduce the test time, as the Hive module already costs much time. Also, from the perspective of the file compaction pipeline, the shuffle mode makes no difference. And if we cover these two cases, what about the other shuffle modes?
I think we should test it manually.
Yes, I have tested it manually. The test passes for the different shuffle modes.
#### Batch Mode

When it's in batch mode and auto compaction is enabled, after finishing writing files, Flink will calculate the average size of written files for each partition. And if the average size is less than the threshold configured, Flink will then try to compact these files to files with a target size. The following is the table's options for file compactions.
Suggested change:
- threshold configured, Flink will then try to compact these files to files with a target size. The following is the table's options for file compactions.
+ configured threshold, then Flink will try to compact these files to files with the target size. The following are the table's options for file compaction.
I accept it, except that I still think we should use "a target size" instead of "the target size".
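For reference, the trigger condition described in the quoted documentation (compact a partition when its average written-file size falls below `compaction.small-files.avg-size`) can be sketched with plain Java; the class and method names here are illustrative, not the PR's actual code:

```java
import java.util.List;

public class CompactionTriggerSketch {
    // Compact a partition's files when their average size is below the threshold.
    static boolean shouldCompact(List<Long> fileSizesBytes, long avgSizeThresholdBytes) {
        if (fileSizesBytes.isEmpty()) {
            return false;
        }
        long total = fileSizesBytes.stream().mapToLong(Long::longValue).sum();
        return total / fileSizesBytes.size() < avgSizeThresholdBytes;
    }

    public static void main(String[] args) {
        long threshold = 16L << 20;  // 16 MB, the option's default value
        // Three 4 MB files: average 4 MB < 16 MB, so compaction is triggered.
        System.out.println(shouldCompact(List.of(4L << 20, 4L << 20, 4L << 20), threshold));
        // Two 32 MB files: average 32 MB >= 16 MB, so the files are left as-is.
        System.out.println(shouldCompact(List.of(32L << 20, 32L << 20), threshold));
    }
}
```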
#### Stream Mode

In stream mode, the behavior is same to `FileSystem` sink. Please refer to [File Compaction]({{< ref "docs/connectors/table/filesystem" >}}#file-compaction) for more details.
Suggested change:
- In stream mode, the behavior is same to `FileSystem` sink. Please refer to [File Compaction]({{< ref "docs/connectors/table/filesystem" >}}#file-compaction) for more details.
+ In stream mode, the behavior is the same as `FileSystem` sink. Please refer to [File Compaction]({{< ref "docs/connectors/table/filesystem" >}}#file-compaction) for more details.
        <td>Integer</td>
        <td>
          The parallelism to compact files. If not set, it will use the <a href="{{< ref "docs/connectors/table/filesystem" >}}#sink-parallelism">sink parallelism</a>.
          When use <a href="{{< ref "docs/deployment/elastic_scaling" >}}#adaptive-batch-scheduler">adaptive batch scheduler</a>, the parallelism may be small, which will cause taking much time to finish compaction.
Suggested change:
- When use <a href="{{< ref "docs/deployment/elastic_scaling" >}}#adaptive-batch-scheduler">adaptive batch scheduler</a>, the parallelism may be small, which will cause taking much time to finish compaction.
+ When using <a href="{{< ref "docs/deployment/elastic_scaling" >}}#adaptive-batch-scheduler">adaptive batch scheduler</a>, the parallelism of the compact operator deduced by the scheduler may be small, which will cause taking much time to finish compaction.
        <td>
          The parallelism to compact files. If not set, it will use the <a href="{{< ref "docs/connectors/table/filesystem" >}}#sink-parallelism">sink parallelism</a>.
          When use <a href="{{< ref "docs/deployment/elastic_scaling" >}}#adaptive-batch-scheduler">adaptive batch scheduler</a>, the parallelism may be small, which will cause taking much time to finish compaction.
          In such case, please remember to set this value to a bigger value manually.
Suggested change:
- In such case, please remember to set this value to a bigger value manually.
+ In such a case, please remember to set this option to a bigger value manually.
#### Batch Mode

在批模式,并且自动合并小文件已经开启的情况下,在结束写 Hive 表后,Flink 会计算每个分区下的文件平均大小,如果文件的平均大小小于用户指定的一个阈值,Flink 则会将这些文件合并成指定大小的文件。下面是文件合并涉及到的参数:
Suggested change:
- 在批模式,并且自动合并小文件已经开启的情况下,在结束写 Hive 表后,Flink 会计算每个分区下的文件平均大小,如果文件的平均大小小于用户指定的一个阈值,Flink 则会将这些文件合并成指定大小的文件。下面是文件合并涉及到的参数:
+ 在批模式,并且自动合并小文件已经开启的情况下,在结束写 Hive 表后,Flink 会计算每个分区下文件的平均大小,如果文件的平均大小小于用户指定的一个阈值,Flink 则会将这些文件合并成指定大小的文件。下面是文件合并涉及到的参数:
@flinkbot run azure
        <td>no</td>
        <td style="word-wrap: break-word;">false</td>
        <td>Boolean</td>
        <td>Whether to enable automatic compaction in Hive sink or not. The data will be written to temporary files. The temporary files are invisible before compaction.</td>
Suggested change:
- <td>Whether to enable automatic compaction in Hive sink or not. The data will be written to temporary files. The temporary files are invisible before compaction.</td>
+ <td>Whether to enable automatic compaction in Hive sink or not. The data will be written to temporary files first. The temporary files are invisible before compaction.</td>
        <td>yes</td>
        <td style="word-wrap: break-word;">16MB</td>
        <td>MemorySize</td>
        <td>合并文件的阈值,当文件的平均大小小于该阈值, Flink 将对这些文件进行合并. 默认值是 16MB.</td>
Suggested change:
- <td>合并文件的阈值,当文件的平均大小小于该阈值, Flink 将对这些文件进行合并. 默认值是 16MB.</td>
+ <td>合并文件的阈值,当文件的平均大小小于该阈值,Flink 将对这些文件进行合并。 默认值是 16MB。</td>
Please note the difference between Chinese and English punctuation marks.
LGTM
@wuchong Could you please help merge? Thanks.
What is the purpose of the change

Introduce auto compaction for Hive sink in batch mode.

Brief change log

- Introduce options compaction.small-files.avg-size / compaction.file-size / compaction.parallelism for auto compaction.
- Introduce BatchFileWriter -> BatchCompactCoordinator -> BatchCompactOperator -> BatchPartitionCommitter, which will support auto compaction.
- Extract the common logic of BatchCompactOperator and CompactOperator to CompactFileUtils#doCompact.

Verifying this change

Added ITCase in HiveTableCompactSinkITCase.

Does this pull request potentially affect one of the following parts:

- @Public(Evolving): (yes / no)
- Documentation
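A rough, Flink-free sketch of the four-stage flow named in the change log (writer -> compact coordinator -> compact operator -> partition committer). Every class, method, and value below is an illustrative stand-in, not the PR's actual operator API:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BatchCompactPipelineSketch {
    // Stand-in for a temporary file emitted by the batch file writer.
    record WrittenFile(String partition, long sizeBytes) {}

    // Stage 1 (writer): emits per-partition temporary files (hard-coded sample data).
    static List<WrittenFile> write() {
        return List.of(
                new WrittenFile("p=1", 4L << 20),
                new WrittenFile("p=1", 4L << 20),
                new WrittenFile("p=2", 64L << 20));
    }

    // Stage 2 (coordinator): group files by partition and select the partitions
    // whose average file size is below the small-files threshold.
    static Map<String, List<WrittenFile>> planCompaction(List<WrittenFile> files, long threshold) {
        Map<String, List<WrittenFile>> byPartition = new HashMap<>();
        for (WrittenFile f : files) {
            byPartition.computeIfAbsent(f.partition(), k -> new ArrayList<>()).add(f);
        }
        Map<String, List<WrittenFile>> toCompact = new HashMap<>();
        byPartition.forEach((p, fs) -> {
            long avg = fs.stream().mapToLong(WrittenFile::sizeBytes).sum() / fs.size();
            if (avg < threshold) {
                toCompact.put(p, fs);
            }
        });
        return toCompact;
    }

    // Stage 3 (compact operator): merge each selected partition's files into one.
    static List<WrittenFile> compact(Map<String, List<WrittenFile>> plan) {
        List<WrittenFile> out = new ArrayList<>();
        plan.forEach((p, fs) ->
                out.add(new WrittenFile(p, fs.stream().mapToLong(WrittenFile::sizeBytes).sum())));
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<WrittenFile>> plan = planCompaction(write(), 16L << 20);
        // Stage 4 (committer) would then commit partitions and make files visible.
        System.out.println(plan.keySet());        // only p=1 is small enough to compact
        System.out.println(compact(plan).size());
    }
}
```

This only models the data flow between the stages; the real operators also have to deal with temporary-file paths, overwrite semantics, and partition commit policies.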