
[HUDI-3981] Flink engine support for comprehensive schema evolution #5830

Merged
merged 1 commit into apache:master Nov 30, 2022

Conversation

trushev
Contributor

@trushev trushev commented Jun 10, 2022

Change Logs

This PR adds support for reading with Flink when comprehensive schema evolution (RFC-33) is enabled and the table has undergone add column, rename column, change column type, and drop column operations.

Impact

User-facing feature change: comprehensive schema evolution in Flink.

Risk level: medium

This change added tests and can be verified as follows:

  • Added unit test TestCastMap to verify that type conversion is correct (a minimal illustrative sketch follows this list).
  • Added integration test ITTestSchemaEvolution to verify that a table with added, renamed, cast, and dropped columns is read as expected.
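For illustration only, a minimal sketch of the kind of check TestCastMap performs, assuming the CastMap API (add(pos, fromType, toType) / castIfNeeded(pos, value)) quoted later in this review thread; this is not the actual test class:

import static org.junit.jupiter.api.Assertions.assertEquals;

import org.apache.flink.table.types.logical.BigIntType;
import org.apache.flink.table.types.logical.IntType;
import org.junit.jupiter.api.Test;

class CastMapSketchTest {

  @Test
  void intIsCastToLong() {
    // Register a cast from INT to BIGINT at field position 0, then check the converted value.
    CastMap castMap = new CastMap();
    castMap.add(0, new IntType(), new BigIntType());
    assertEquals(1L, castMap.castIfNeeded(0, 1));
  }
}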

Documentation Update

There is a schema evolution doc: https://hudi.apache.org/docs/schema_evolution

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

@trushev trushev changed the title [HUDI-3981] Flink engine support for comprehensive schema evolution(R… [HUDI-3981] Flink engine support for comprehensive schema evolution(RFC-33) Jun 10, 2022
@danny0405
Contributor

If it is ready for review, you can ping someone for help :)

@trushev
Contributor Author

trushev commented Jun 13, 2022

@xiarixiaoyao could you pls review this PR :)

@xiarixiaoyao
Contributor

@trushev thanks for your contribution, I will review it in the next few days.

}

public static LogicalType[] project(List<DataType> fieldTypes, int[] selectedFields) {
return Arrays.stream(selectedFields)
Contributor

It would be better to support nested column projection in the future.
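For context, top-level projection in the spirit of the snippet above just maps each selected field index to its logical type; the sketch below assumes Flink's DataType/LogicalType API and is not the exact Hudi utility. Supporting nested columns would additionally require descending into RowType fields.

import java.util.Arrays;
import java.util.List;
import org.apache.flink.table.types.DataType;
import org.apache.flink.table.types.logical.LogicalType;

public final class ProjectionSketch {

  // Pick the logical type of each selected top-level field by index.
  public static LogicalType[] project(List<DataType> fieldTypes, int[] selectedFields) {
    return Arrays.stream(selectedFields)
        .mapToObj(i -> fieldTypes.get(i).getLogicalType())
        .toArray(LogicalType[]::new);
  }
}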

@xiarixiaoyao
Contributor

Partial review, still looking, @trushev. Overall, looks good.

@xiarixiaoyao
Contributor

@danny0405 @XuQianJin-Stars could you please help review this PR? Thanks very much.

@trushev
Contributor Author

trushev commented Jul 1, 2022

Sorry for the force push; rebased on the latest master to pick up the fix for [HUDI-4258].

@danny0405 danny0405 self-assigned this Jul 3, 2022
@trushev
Contributor Author

trushev commented Jul 7, 2022

Resolved conflict with master

@trushev
Contributor Author

trushev commented Jul 10, 2022

Resolved conflict with master

@trushev
Contributor Author

trushev commented Jul 12, 2022

I think it is ready to merge

@danny0405
Contributor

Thanks, I will take a look this week; please do not merge before that.

@trushev
Contributor Author

trushev commented Jul 15, 2022

Ok, then I will fix the typo in the commit message (HUDI-3983 => HUDI-3981) along with the comment fixes.

if (!internalSchemaOption.isPresent()) {
throw new HoodieException(String.format("cannot find schema for current table: %s", config.getBasePath()));
}
return Pair.of(internalSchemaOption.get(), metaClient);
Contributor

I took a quick look at the PR and feel that the schema-related code is too invasive, scattered everywhere, which is hard to maintain and prone to bugs; we need a neater approach to the code engineering.

Contributor Author

Ok, thanks for the review. I will think about decoupling schema evolution from the rest of the code.

boolean needToReWriteRecord = false;
Map<String, String> renameCols = new HashMap<>();
// TODO support bootstrap
if (querySchemaOpt.isPresent() && !baseFile.getBootstrapBaseFile().isPresent()) {
Contributor

Do we need to check schema evolution for each file, or for each read/commit?

Contributor Author

I just moved this code snippet from HoodieMergeHelper to BaseMergeHelper as is. Anyway, I will think about avoiding the unnecessary checks you pointed out.

Contributor

@trushev
Can we avoid moving this code snippet? I don't think Flink evolution needs to modify this code;
#6358 and #7183 will optimize it.

@danny0405
We need to check evolution for each base file.
Once multiple column changes have been made, different base files may have different schemas, and we cannot use the current table schema to read those files directly; an exception would be thrown.

tableA: a int, b string, c double, and there exist three files in this table: f1, f2, f3.

Drop column c from tableA and add a new column d, then update tableA, but only f2 and f3 are updated; f1 is not touched.
Now the schemas are:

schema1 from tableA: a int, b string, d long
schema2 from f2, f3: a int, b string, d long
schema3 from f1:     a int, b string, c double

We should not use schema1 to read f1.
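Purely as an editorial illustration of that per-file decision (the helper below is hypothetical and not code from this PR; InternalSchema refers to org.apache.hudi.internal.schema.InternalSchema):

import org.apache.hudi.internal.schema.InternalSchema;

final class PerFileSchemaCheckSketch {

  // A base file only needs rewrite/cast handling when the schema it was written with
  // differs from the current query schema, e.g. f1 (a, b, c) vs. schema1 (a, b, d).
  static boolean needsRewrite(InternalSchema querySchema, InternalSchema fileSchema) {
    return !fileSchema.isEmptySchema() && !fileSchema.equals(querySchema);
  }
}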

Contributor Author

@trushev trushev Nov 15, 2022

@trushev Can we avoid moving this code snippet? I don't think Flink evolution needs to modify this code; #6358 and #7183 will optimize it.

@xiarixiaoyao This code should be moved from HoodieMergeHelper to BaseMergeHelper due to the current class hierarchy.

I don't want to modify that code; I just want to reuse it in Flink.

@@ -135,10 +137,15 @@ public Builder withLogRecordScannerCallback(LogRecordScannerCallback callback) {
return this;
}

public Builder withInternalSchema(InternalSchema internalSchema) {
this.internalSchema = internalSchema;
return this;
Contributor

There is already a read schema; why do we pass around another schema? Whatever it is, please use just one schema!

Contributor Author

@trushev trushev Jul 22, 2022

I used the second schema here to be consistent with HoodieMergedLogRecordScanner, which already uses this approach to scan logs in HoodieMergeOnReadRDD#scanLog. Do you think it is a bad practice?

Contributor

Yes, a very confusing practice. The reader/format should be deterministic against one static given schema; it should not care about how the schema is generated or where it comes from, that is, it should not be coupled to the evolution logic.

Contributor Author

I reverted the changes in HoodieMergedLogRecordScanner. Now there is only one schema -- InternalSchema, which wraps org.apache.avro.Schema. The same approach is used in HoodieUnMergedLogRecordScanner.

@trushev trushev changed the title [HUDI-3981] Flink engine support for comprehensive schema evolution(RFC-33) [WIP][HUDI-3981] Flink engine support for comprehensive schema evolution(RFC-33) Jul 23, 2022
@yihua yihua added schema-and-data-types flink Issues related to flink priority:major degraded perf; unable to move forward; potential bugs labels Sep 13, 2022
@flashJd
Contributor

flashJd commented Sep 23, 2022

@trushev Good job. I've tested it and it works on the whole, but there are a few small defects, which I'll point out.

@flashJd
Contributor

flashJd commented Sep 23, 2022

@danny0405 @xiarixiaoyao this PR has been pending for two months; when can we merge it? Spark only supports full schema evolution in Spark 3.x.x, and my Spark version is 2.4.

@@ -120,6 +121,12 @@ private FlinkOptions() {
.withDescription("The default partition name in case the dynamic partition"
+ " column value is null/empty string");

public static final ConfigOption<Boolean> SCHEMA_EVOLUTION_ENABLED = ConfigOptions
.key(HoodieCommonConfig.SCHEMA_EVOLUTION_ENABLE.key())
.booleanType()
Contributor

There is no need to add the option if the key is the same as Hoodie core's.

Contributor

No worries, just add a tool in OptionsResolver

Contributor Author

Replaced with deprecated conf.getBoolean(HoodieCommonConfig.SCHEMA_EVOLUTION_ENABLE.key(), false)

Contributor Author

Added OptionsResolver.isSchemaEvolutionEnabled
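A minimal sketch of that helper, assuming it simply wraps the conf.getBoolean fallback mentioned above; the class name here is a placeholder, and the real method lives in Hudi's Flink OptionsResolver:

import org.apache.flink.configuration.Configuration;
import org.apache.hudi.common.config.HoodieCommonConfig;

public final class OptionsResolverSketch {

  // True when comprehensive schema evolution is enabled for the Flink job.
  public static boolean isSchemaEvolutionEnabled(Configuration conf) {
    return conf.getBoolean(HoodieCommonConfig.SCHEMA_EVOLUTION_ENABLE.key(), false);
  }
}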

@@ -102,4 +108,9 @@ public <T extends SpecificRecordBase> Option<HoodieTableMetadataWriter> getMetad
return Option.empty();
}
}

private static void setLatestInternalSchema(HoodieWriteConfig config, HoodieTableMetaClient metaClient) {
Option<InternalSchema> internalSchema = new TableSchemaResolver(metaClient).getTableInternalSchemaFromCommitMetadata();
Contributor

Add pre-condition check in case of null values.

Contributor Author

Replaced with an isPresent() check.
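A hedged sketch of the guarded version, assuming the schema is only applied when present; applyInternalSchema below is a hypothetical placeholder for however the write config actually consumes it in the PR:

import org.apache.hudi.common.table.HoodieTableMetaClient;
import org.apache.hudi.common.table.TableSchemaResolver;
import org.apache.hudi.common.util.Option;
import org.apache.hudi.config.HoodieWriteConfig;
import org.apache.hudi.internal.schema.InternalSchema;

final class InternalSchemaGuardSketch {

  static void setLatestInternalSchema(HoodieWriteConfig config, HoodieTableMetaClient metaClient) {
    Option<InternalSchema> internalSchema =
        new TableSchemaResolver(metaClient).getTableInternalSchemaFromCommitMetadata();
    // Guard instead of calling get() unconditionally on a possibly empty Option.
    if (internalSchema.isPresent()) {
      applyInternalSchema(config, internalSchema.get()); // hypothetical application step
    }
  }

  private static void applyInternalSchema(HoodieWriteConfig config, InternalSchema schema) {
    // Placeholder: the actual PR wires the latest schema into the write config here.
  }
}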

Option<RowDataProjection> castProjection;
InternalSchema fileSchema = internalSchemaManager.getFileSchema(path.getName());
if (fileSchema.isEmptySchema()) {
castProjection = Option.empty();
Contributor

Can return HoodieParquetReader directly here when we know castProjection is empty.

Contributor Author

copy-pasted

}

private static void assertSchemasAreNotEmpty(InternalSchema schema1, InternalSchema schema2) {
Preconditions.checkArgument(!schema1.isEmptySchema(), "InternalSchema cannot be empty here");
Contributor

There is no need to bind the schema validations together, and we can give a more detailed exception message for the different schemas.

Contributor Author

Removed the method and replaced the message "InternalSchema..." with "querySchema...".

@@ -87,4 +88,8 @@ public Object[] projectAsValues(RowData rowData) {
}
return values;
}

protected @Nullable Object rewriteVal(int pos, @Nullable Object val) {
return val;
Contributor

rewriteVal => getVal; usually we do not override impl methods, only abstract methods, and the override is not very friendly to base class performance.

Contributor Author

@trushev trushev Nov 28, 2022

renamed rewriteVal => getVal

castMap.add(1, new DecimalType(), new VarCharType());
DecimalData val = DecimalData.fromBigDecimal(BigDecimal.ONE, 2, 1);
assertEquals(DecimalData.fromBigDecimal(BigDecimal.ONE, 3, 2), castMap.castIfNeeded(0, val));
assertEquals(BinaryStringData.fromString("1.0"), castMap.castIfNeeded(1, val));
Contributor

For float, double, and decimal data types, what happens when the target data type has precision loss? Do we throw an exception here? Exactly what is the data type precedence (i.e., which data types are castable) for each type?

Contributor Author

do we throw exception here

No, we follow Spark's implementation, org.apache.hudi.client.utils.SparkInternalSchemaConverter#convertDoubleType.

what kind of data type is castable here

  • Float => Double, Decimal
  • Double => Decimal
  • Decimal => Decimal (change precision or scale)
  • String => Decimal
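As a rough, editorial illustration of those rules (not the actual CastMap code; the class name and string-keyed dispatch here are simplifications), a converter is only created for the pairs listed above and everything else is rejected up front:

import java.math.BigDecimal;
import java.util.function.Function;
import org.apache.flink.table.data.DecimalData;

final class CastRulesSketch {

  // Returns a converter for a supported source => target pair, otherwise fails fast,
  // mirroring the "Cannot create cast" error quoted later in this thread.
  static Function<Object, Object> createCast(String from, String to, int precision, int scale) {
    if ("FLOAT".equals(from) && "DOUBLE".equals(to)) {
      return val -> ((Float) val).doubleValue();
    }
    if ("FLOAT".equals(from) && "DECIMAL".equals(to)) {
      return val -> DecimalData.fromBigDecimal(BigDecimal.valueOf((Float) val), precision, scale);
    }
    if ("DOUBLE".equals(from) && "DECIMAL".equals(to)) {
      return val -> DecimalData.fromBigDecimal(BigDecimal.valueOf((Double) val), precision, scale);
    }
    if ("DECIMAL".equals(from) && "DECIMAL".equals(to)) {
      return val -> DecimalData.fromBigDecimal(((DecimalData) val).toBigDecimal(), precision, scale);
    }
    if ("STRING".equals(from) && "DECIMAL".equals(to)) {
      return val -> DecimalData.fromBigDecimal(new BigDecimal(val.toString()), precision, scale);
    }
    throw new IllegalArgumentException("Cannot create cast " + from + " => " + to);
  }
}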

Contributor

Thanks, I see we return null when CastMap casts to a type that is not in its precedence list; is that reasonable?

Contributor Author

The following example throws an exception:

CastMap castMap = new CastMap();
castMap.add(0, new BigIntType(), new IntType()); // <---- error, cast long to int is unsupported
java.lang.IllegalArgumentException: Cannot create cast BIGINT => INT at pos 0

Contributor Author

@trushev trushev Nov 28, 2022

The following example throws an exception as well:

CastMap castMap = new CastMap();
castMap.add(0, new IntType(), new BigIntType()); // cast int => long
castMap.castIfNeeded(0, "wrong arg"); // <----- error, expected int but actual is string
java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Number

InternalSchema fileSchema = internalSchemaManager.getFileSchema(path.getName());
if (fileSchema.isEmptySchema()) {
return new HoodieParquetSplitReader(
ParquetSplitReaderUtil.genPartColumnarRowReader(
Contributor

Can the HoodieParquetSplitReader be shared?

Contributor Author

You mean shared with another file split? I guess not, because ParquetColumnarRowSplitReader is not shareable. Currently, we always create a new parquet reader for each file.


public HoodieParquetSplitReader(ParquetColumnarRowSplitReader reader) {
this.reader = reader;
}
Contributor

ParquetColumnarRowSplitReader can implement HoodieParquetReader directly.

Contributor Author

@trushev trushev Nov 28, 2022

I avoided it on purpose because:

  1. ParquetColumnarRowSplitReader is copied from Flink; I'd like to avoid any changes in this class.
  2. We would have to maintain 3 versions of it: 1.13.x, 1.14.x, 1.15.x.
  3. There is a note in ParquetSplitReaderUtil:
 * <p>NOTE: reference from Flink release 1.11.2 {@code ParquetSplitReaderUtil}, modify to support INT64
 * based TIMESTAMP_MILLIS as ConvertedType, should remove when Flink supports that.

I think if we remove ParquetSplitReaderUtil then we will want to remove ParquetColumnarRowSplitReader as well.

@danny0405
Contributor

@hudi-bot run azure

@danny0405
Contributor

Can you squash and force push here? I didn't see the Azure CI history; let's re-trigger it.

@trushev
Contributor Author

trushev commented Nov 29, 2022

Can you squash and force push here? I didn't see the Azure CI history; let's re-trigger it.

Done, waiting for azure

@trushev
Contributor Author

trushev commented Nov 29, 2022

The CI build failure is due to the broken master branch. I've pushed the fix: #7319

@trushev
Contributor Author

trushev commented Nov 29, 2022

It looks like Azure doesn't run on this PR anymore. A verifying PR has been opened: #7321

@trushev trushev changed the title [HUDI-3981][RFC-33] Flink engine support for comprehensive schema evolution [HUDI-3981] Flink engine support for comprehensive schema evolution Nov 30, 2022
@hudi-bot

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

Contributor

@danny0405 danny0405 left a comment

+1

@danny0405 danny0405 merged commit da89e12 into apache:master Nov 30, 2022
@trushev
Contributor Author

trushev commented Nov 30, 2022

@danny0405 @xiarixiaoyao @flashJd thank you for reviewing this PR.

@xiarixiaoyao
Contributor

@trushev
Thanks a lot for contributing this feature and for waiting patiently for the review.

@XuQianJin-Stars
Contributor

@trushev Thanks a lot for contributing this feature.

@waywtdcc
Contributor

waywtdcc commented Dec 4, 2022

@trushev @danny0405 Hello, can this PR be merged into 0.12.1 to support Flink schema evolution? Do I need to merge other PRs?

@trushev
Contributor Author

trushev commented Dec 5, 2022

@trushev @danny0405 Hello, can this PR be merged into 0.12.1 to support Flink schema evolution? Do I need to merge other PRs?

Yes, there are several commits that this PR depends on. I think it is not a big deal to backport the feature; I'm just not sure about the release policy. Is such a change suitable for a minor update, 0.12.1 -> 0.12.2?

@waywtdcc
Contributor

Hope this PR can be merged into 0.12.2.

neverdizzy pushed a commit to neverdizzy/hudi that referenced this pull request Dec 13, 2022
@voonhous
Member

@trushev I've read through the PR and noticed that the scope of the changes included here is limited to supporting Hudi Full Schema Evolution (HFSE).

Prior to HFSE, Hudi has been relying on Avro's native Schema-Resolution (ASR) to perform schema evolution when performing UPSERTs via Spark, where schema changes are applied implicitly.

These implicit schema changes do not write to .schema and hence, the feature here will not support ASR reads via Flink.

I provided some examples (Mainly on Spark) in this issue here: #7444.

I was wondering if you have any plans on supporting ASR reads via Flink.

If there are none, I plan on adding this support for ASR reads via Flink. Wanted to clarify to prevent repeated effort on the same feature.

@trushev
Contributor Author

trushev commented Dec 16, 2022

@voonhous
As I understand it, you are talking about this case:

Flink SQL>
-- write with schema1
create table tbl(`id` int primary key, `value` int)
    partitioned by (`id`)
    with ('connector'='hudi', 'path'='/tmp/tbl');
insert into tbl values (1, 10);

-- write with schema2 int => double
drop table tbl;
create table tbl(`id` int primary key, `value` double)
    partitioned by (`id`)
    with ('connector'='hudi', 'path'='/tmp/tbl');
insert into tbl values (2, 20.0);

-- read all data
select * from tbl; -- throws an exception because tbl consists of two partitioned files (1, 10) and (2, 20.0)
Caused by: java.lang.IllegalArgumentException: Unexpected type: INT32

Whereas if we remove partitioned by (`id`) from the SQL above, tbl will consist of two unpartitioned files (1, 10.0) and (2, 20.0), and the read query will work fine:

select * from tbl;
+----+-------------+--------------------------------+
| op |          id |                          value |
+----+-------------+--------------------------------+
| +I |           1 |                           10.0 |
| +I |           2 |                           20.0 |
+----+-------------+--------------------------------+

In my opinion, it's a good idea to support the scenario you described for Spark in #7480.
Currently, I have no plans to implement it, so you can do it if you wish.

@voonhous
Member

@trushev

Yes, this is what I intend to work on.

What you described are operations made entirely in Flink SQL. I was thinking of cross-engine operations.

i.e. for tables that were evolved using Avro Schema Resolution (ASR) via Spark but read using Flink, the same error will be thrown in such cases too.
