
[HUDI-6872] Simplify Out Of Box Schema Evolution Functionality #9743

Merged
62 commits merged into apache:master on Nov 10, 2023

Conversation

@jonvex (Contributor) commented Sep 18, 2023

Change Logs

Change how out-of-the-box schema evolution works so it is easier to understand, for both users and Hudi developers.

Things you can't do:

  • reorder columns
  • add new meta columns to nested structs

Added and Dropped Columns

  • New fields can be added to the end of the schema or to the end of nested structs. Those fields will be part of the schema of any future write.
  • Fields in the latest table schema that are missing from the incoming schema will be added to the incoming data with null values (see the sketch below).
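A minimal sketch of the behaviour above, using the Spark DataFrame API (the table name, key fields, and the spark/basePath values are illustrative placeholders; the exact options a pipeline needs may differ):

import org.apache.spark.sql.SaveMode

val hudiOpts = Map(
  "hoodie.table.name" -> "schema_evolution_demo",
  "hoodie.datasource.write.recordkey.field" -> "id",
  "hoodie.datasource.write.precombine.field" -> "ts"
)

// First write establishes the schema (id, ts, name).
spark.sql("select 1 as id, 1L as ts, 'a' as name").write.format("hudi")
  .options(hudiOpts).mode(SaveMode.Overwrite).save(basePath)

// A new column `city` appended at the end of the schema is accepted,
// and future writes will carry it in the table schema.
spark.sql("select 2 as id, 2L as ts, 'b' as name, 'sf' as city").write.format("hudi")
  .options(hudiOpts).mode(SaveMode.Append).save(basePath)

// A batch that omits `name` has that column filled with nulls
// (subject to the ADD_NULL_FOR_DELETED_COLUMNS option discussed later in this PR).
spark.sql("select 3 as id, 3L as ts, 'nyc' as city").write.format("hudi")
  .options(hudiOpts).mode(SaveMode.Append).save(basePath)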

Type Promotion

Promotions also work within complex types such as arrays and maps.
Promotions:

  • int is promotable to long, float, double, or string
  • long is promotable to float, double, or string
  • float is promotable to double or string
  • string is promotable to bytes
  • bytes is promotable to string

Rules:

  • If the incoming schema has a column whose type is a promotion of the table schema's column type, that field will use the promoted type in the table's schema from now on
  • If the incoming schema has a column whose type is a demotion of the table schema's column type, the incoming batch's data will be promoted to the table schema's type (see the sketch below)
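A minimal sketch of the type promotion rules, reusing hudiOpts, spark and basePath from the previous sketch:

// First write: `amount` starts as an int.
spark.sql("select 1 as id, 1L as ts, 10 as amount").write.format("hudi")
  .options(hudiOpts).mode(SaveMode.Overwrite).save(basePath)

// The incoming batch promotes `amount` to long: the table schema uses long from now on.
spark.sql("select 2 as id, 2L as ts, 10000000000L as amount").write.format("hudi")
  .options(hudiOpts).mode(SaveMode.Append).save(basePath)

// The incoming batch carries `amount` as an int again (a demotion relative to the table
// schema): the batch's data is promoted to long on write and the table schema stays long.
spark.sql("select 3 as id, 3L as ts, 7 as amount").write.format("hudi")
  .options(hudiOpts).mode(SaveMode.Append).save(basePath)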

Impact

Change how out-of-the-box schema evolution works so it is easier to understand, for both users and Hudi developers.

Risk level (write none, low medium or high below)

High

Lots of testing has been done, including performance testing on TPC-DS 1 TB to ensure MOR Avro log block reading has not degraded.

Documentation Update

jira

pr

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

@jonvex changed the title from "[6872] Test out of box schema evolution for deltastreamer" to "[HUDI-6872] Test out of box schema evolution for deltastreamer" on Sep 18, 2023
@voonhous (Member) commented:
Hello @jonvex, a little curious as to why we are maintaining 2 implementations of schema evolution:

  1. Out-of-box (supported by Avro)
  2. Comprehensive (using Hudi's InternalSchemaManager)

I believe it's a little confusing for users (especially new users).

@jonvex (Contributor, Author) commented Sep 19, 2023

@voonhous there is also hoodie.datasource.write.reconcile.schema. Very confusing.
It would be great to unify everything. It is difficult to make changes like this because different users may rely on all of these. With 1.0, I think changes like this may be allowed. Right now we are evaluating the current capabilities of schema evolution, and hopefully we can make this feature better.

@kazdy (Contributor) commented Sep 28, 2023

@voonhous there is also hoodie.datasource.write.reconcile.schema. Very confusing.
It would be great to unify everything. It is difficult to make changes like this because different users may rely on all of these. With 1.0, I think changes like this may be allowed. Right now we are evaluating the current capabilities of schema evolution, and hopefully we can make this feature better.

Can we discuss this as a community in an email thread? Let people share ideas and needs around schema evolution and enforcement.

@nsivabalan (Contributor) commented:
@hudi-bot run azure

@nsivabalan added the priority:critical label (production down; pipelines stalled; need help asap) on Oct 26, 2023
@nsivabalan (Contributor) left a review comment:

Good job on the patch man.

By the way, can you do the following as well:

  1. Check the meta sync flows to see how schema changes are deduced. We have an optimization where we trigger remote sync only in case of partition adds/removes and schema changes, so let's make sure we are good on that part.
  2. Also, did we test all the new behavior with the different meta syncs (Hive, Presto, Trino, etc.)? I am sure we have tested with Spark, but did you get a chance to test the other query engines?
  3. Also, let's ensure we do a round of testing with Spark SQL. Most common schema evolutions come from this writer.
  4. After evolving the schema, if we do a savepoint and restore to an older commit, I assume the table will still be intact, i.e. it might go back to the older schema and remain readable.
  5. I am halfway through reviewing the patch, but tell me something: if a commit that is trying to evolve the schema fails just before creating the completed commit metadata in the timeline, we are still intact, right? There is complete isolation and a new writer is not impacted by it, i.e. it should not see the evolved schema at any point in time. For example, in case it is a MOR table, all log blocks rely on the table schema and not on the previous log file's schema; just calling out an example.
  6. Can you write a mini RFC on this patch? Even if it only covers the current state of things and the flow of schemas and evolution, so that developers can understand the end-to-end flows, it would be great. Or you can publish it as a Hudi blog, whatever works. For example: where schema reconciliation comes into play, what reconcile means, what happens in write handles, how partial file groups are updated when the schema evolves, or how reads happen with different file groups in different schemas.
  7. Can you call out how time travel and incremental queries behave with schema evolution?
    For example:
    C1: 5-column schema (V1)
    C2: 6-column schema (V2)
    C3: 6-column schema (V2) // no change in schema
    C4: 7-column schema (V3)

For a time travel query at commit time C2, what does the schema look like? Is it V2 or V3 (assuming C4 is complete) when the query is issued?

And for an incremental query:
a. until C1: is it V1?
b. begin C1, end C2: is it V2?
c. begin C1, end C4: is it V3?
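For reference, a minimal sketch of the two query types being asked about; the commit instant times (c1Time, c2Time, c4Time) are placeholders that would come from the timeline, and spark/basePath are assumed to be in scope:

// Time travel query as of commit C2.
val timeTravelDf = spark.read.format("hudi")
  .option("as.of.instant", c2Time)
  .load(basePath)

// Incremental query with begin instant C1 and end instant C4.
val incrementalDf = spark.read.format("hudi")
  .option("hoodie.datasource.query.type", "incremental")
  .option("hoodie.datasource.read.begin.instanttime", c1Time)
  .option("hoodie.datasource.read.end.instanttime", c4Time)
  .load(basePath)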

if (schema.getType() == Schema.Type.NULL) {
  return schema;
}
return convert(convert(schema), schema.getFullName());
Contributor:

Curious to know why we need two conversions. Can't we directly fix or create the Avro schema by fixing the null ordering?

- public static Schema reconcileNullability(Schema sourceSchema, Schema targetSchema, Map<String, String> opts) {
-   if (sourceSchema.getFields().isEmpty() || targetSchema.getFields().isEmpty()) {
+ public static Schema reconcileSchemaRequirements(Schema sourceSchema, Schema targetSchema, Map<String, String> opts) {
+   if (sourceSchema.getType() == Schema.Type.NULL || sourceSchema.getFields().isEmpty() || targetSchema.getFields().isEmpty()) {
Contributor:

If the source schema's fields are empty, shouldn't we be returning targetSchema?

Contributor:

@jonvex: do you have any pointers here?

@@ -545,33 +552,37 @@ class HoodieSparkSqlWriterInternal {
      latestTableSchemaOpt: Option[Schema],
      internalSchemaOpt: Option[InternalSchema],
      opts: Map[String, String]): Schema = {
    val addNullForDeletedColumns = opts.getOrDefault(DataSourceWriteOptions.ADD_NULL_FOR_DELETED_COLUMNS.key(),
Contributor:

We definitely need good docs for this method; let's enhance the docs at L545. Even having illustrative examples is totally fine. We have been soft/low-key on schema evolution in general; let's button up and ensure we get it right this time.

Contributor:

The schema handling has grown to be sizable now. Let's move it to a separate static class instead of adding everything to HoodieSparkSqlWriter.

Contributor:

I am moving deduceWriterSchema() to HoodieSchemaUtils

|Table's schema ${latestTableSchema.toString(true)}
|""".stripMargin)
throw new SchemaCompatibilityException("Incoming batch schema is not compatible with the table's one")
}
Contributor:

Also, can you throw light on why we don't call AvroSchemaEvolutionUtils.reconcileSchema(canonicalizedSourceSchema, latestTableSchema) in the else block at L658?

That is, when reconcile schema is set to false and AVRO_SCHEMA_VALIDATE_ENABLE is set to true, it looks like we never call AvroSchemaEvolutionUtils.reconcileSchema(canonicalizedSourceSchema, latestTableSchema).

Also, curious to know what the difference is between AvroSchemaEvolutionUtils.reconcileSchema(canonicalizedSourceSchema, latestTableSchema) and HoodieSparkSqlWriter.canonicalizeSchema(), because both take the source schema and the table schema as args.

Contributor:

@jonvex ping

@nsivabalan (Contributor) commented:
@hudi-bot run azure

@hudi-bot commented:

CI report:

@hudi-bot supports the following commands:
  • @hudi-bot run azure: re-run the last Azure build

@nsivabalan (Contributor) commented with a screenshot (image not reproduced).

@nsivabalan nsivabalan merged commit d859c46 into apache:master Nov 10, 2023
30 checks passed
nsivabalan pushed a commit to nsivabalan/hudi that referenced this pull request Nov 23, 2023
@voonhous (Member) commented Dec 22, 2023

@jonvex @xiarixiaoyao

Apologies for necro-ing this PR. I was revisiting it today and noticed that Hudi's comprehensive schema evolution, which is managed via Hudi's InternalSchema, is still supported on write paths using the DataFrame API,

i.e. dataframe.write.format("hudi").options(...).mode(...).save(basePath)

where the options require these two configurations:

hoodie.schema.on.read.enable=true
hoodie.datasource.write.reconcile.schema=true
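For concreteness, a minimal sketch of such a write (df, basePath, the table name and the key fields are illustrative placeholders):

df.write.format("hudi")
  .option("hoodie.table.name", "my_table")
  .option("hoodie.datasource.write.recordkey.field", "id")
  .option("hoodie.datasource.write.precombine.field", "ts")
  // the two configurations mentioned above
  .option("hoodie.schema.on.read.enable", "true")
  .option("hoodie.datasource.write.reconcile.schema", "true")
  .mode("append")
  .save(basePath)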

This means that we can still make use of Hudi's comprehensive schema evolution to perform schema evolution. This behaviour is consistent with what we used to have in the past, which is good!

However, this way of using schema evolution is not really documented, and I am wondering whether the community has any plans to ensure that the end-to-end flow for this use case is error-free.

For now, there are two entry points for Spark through which Hudi's comprehensive schema evolution can be done: one via Spark SQL, and the other via the DataFrame API as I described above.

Consider the case where tables are synced to a Hive catalogue, which is one of the more common use cases. Spark SQL currently does this via sparkSession.sessionState.catalog.externalCatalog.alterTableDataSchema, hence hive-sync is done by Spark's code.

However, when using the DataFrame API, and especially DeltaStreamer, hive-sync is done using hudi-hive-sync, something that Hudi manages internally.

Referring to the code below, if one performs a Hudi comprehensive schema evolution outside of the scope defined here, hive-sync will fail even though the UPSERT succeeds.

public static boolean isSchemaTypeUpdateAllowed(String prevType, String newType) {
  if (prevType == null || prevType.trim().isEmpty() || newType == null || newType.trim().isEmpty()) {
    return false;
  }
  prevType = prevType.toLowerCase();
  newType = newType.toLowerCase();
  if (prevType.equals(newType)) {
    return true;
  } else if (prevType.equalsIgnoreCase(INT_TYPE_NAME) && newType.equalsIgnoreCase(BIGINT_TYPE_NAME)) {
    return true;
  } else if (prevType.equalsIgnoreCase(FLOAT_TYPE_NAME) && newType.equalsIgnoreCase(DOUBLE_TYPE_NAME)) {
    return true;
  } else {
    return prevType.contains("struct") && newType.toLowerCase().contains("struct");
  }
}

Questions

  1. Is the community planning to support Hudi comprehensive schema evolution via the DataFrame API?
  2. If so, some refactoring might be in store to move InternalSchema into the different hudi-sync implementations, such that there is a translation from Hudi's InternalSchema type to each XYZ-sync type.
  3. If we are not doing this, should we document this behaviour and explicitly let users know what the intended usage pattern for schema evolution is? i.e. users should stop all their write jobs, perform Hudi comprehensive schema evolution via Spark SQL, then resume their write jobs via DeltaStreamer/other non-Spark-SQL writers.

CC @TengHuo

@TengHuo (Contributor) commented Dec 26, 2023

Questions

  1. Is the community planning to support Hudi comprehensive schema evolution via the DataFrame API?
  2. If so, some refactoring might be in store to move InternalSchema into the different hudi-sync implementations, such that there is a translation from Hudi's InternalSchema type to each XYZ-sync type.
  3. If we are not doing this, should we document this behaviour and explicitly let users know what the intended usage pattern for schema evolution is? i.e. users should stop all their write jobs, perform Hudi comprehensive schema evolution via Spark SQL, then resume their write jobs via DeltaStreamer/other non-Spark-SQL writers.

Sorry for necro-ing this PR again. To complement question 1 above:

  1. Is the community planning to support Hudi comprehensive schema evolution via the DataFrame API?
     Rephrased: Is the community planning to support Hudi comprehensive schema evolution via the SparkRDDWriteClient API and HiveSyncTool?

This issue mainly impacts HoodieDeltaStreamer pipelines: e.g. an UPSERT of a batch of data with schema evolution succeeds, but the hive sync can fail, causing schema inconsistency issues when users run a query.

And to add a little bit more about the current HoodieSyncTool:

As we understand it, HoodieSyncTool can be used as an independent tool to sync Hudi table information to an external metadata management service, e.g. HMS, with the method HoodieSyncTool#syncHoodieTable. Since there are no input parameters for this method, it needs to infer the schema and partition changes by itself.

Thus, in the method HiveSyncTool#syncHoodieTable, it will load the Hudi table schema information from HDFS and HMS, and do the schema comparison work in the method HiveSchemaUtil#getSchemaDifference. As Voon mentioned, this method has its own logic to decide which schema type updates are allowed, which is not fully compatible with the current Hudi schema evolution.

To solve this issue, we are thinking about utilising the information from InternalSchema in HoodieSyncTool, and letting each meta sync tool decide how to translate these TableChanges to its own meta service (a hypothetical sketch follows).
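A purely hypothetical sketch of that idea; none of the types or method names below exist in Hudi today, they only illustrate the shape of the proposed translation layer:

// Hypothetical column-level changes that could be derived from InternalSchema's TableChange.
sealed trait ColumnChange
case class AddColumn(name: String, sqlType: String) extends ColumnChange
case class PromoteColumn(name: String, fromType: String, toType: String) extends ColumnChange

// Each meta sync tool (Hive, Glue, ...) would decide how to apply the changes to its own catalog.
trait MetaSyncSchemaTranslator {
  def applyChanges(table: String, changes: Seq[ColumnChange]): Unit
}

class HiveSchemaTranslatorSketch extends MetaSyncSchemaTranslator {
  override def applyChanges(table: String, changes: Seq[ColumnChange]): Unit =
    changes.foreach {
      case AddColumn(name, t) =>
        println(s"ALTER TABLE $table ADD COLUMNS ($name $t)")
      case PromoteColumn(name, _, to) =>
        println(s"ALTER TABLE $table CHANGE COLUMN $name $name $to")
    }
}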
