0.10.0 by guoke111 · Pull Request #4482 · apache/hudi

guoke111 · 2021-12-31T08:01:29Z

Tips

Thank you very much for contributing to Apache Hudi.
Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.

What is the purpose of the pull request

(For example: This pull request adds quick-start document.)

Brief change log

(for example:)

Modify AnnotationLocation checkstyle rule in checkstyle.xml

Verify this pull request

(Please pick either of the following options)

This pull request is a trivial rework / code cleanup without any test coverage.

(or)

This pull request is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

Added integration tests for end-to-end.
Added HoodieClientWriteTest to verify the change.
Manually verified the change by running a job locally.

Committer checklist

Has a corresponding JIRA in PR title & commit
Commit message is descriptive of the change
CI is green
Necessary doc changes done or have another open PR
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

…on base files over S3 (#4185)

- Fetching partition files or all partitions from the metadata table fails when run over S3. The metadata table uses the HFile format for its base files, and record lookup uses the HFile.Reader and HFileScanner interfaces to get records by partition key. When the backing storage is S3, this lookup fails with an IOException, which in turn fails the calling commit/update operations.
- The metadata table looks up HFile records with positional read enabled, which performs better for random lookups. Over S3, however, the positional-read key lookup returns partial read sizes, causing the HFile scanner to throw an IOException. This does not happen over HDFS. Although the metadata table uses HFile for random key lookups, positional read is not mandatory, because the keys are sorted before a multi-key lookup.
- The fix is to disable HFile positional read for all HFile scanner based key lookups.
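The argument that sorted keys make positional read unnecessary can be illustrated with a minimal Python sketch. This is not Hudi or HBase code: `ForwardScanner` and `lookup_sorted` are hypothetical names, and an in-memory list stands in for the key-sorted entries of an HFile. The only point it shows is that once the requested keys are sorted, every seek moves forward, so a single sequential pass is enough.

```python
from bisect import bisect_left
from typing import Dict, List, Optional, Tuple


class ForwardScanner:
    """Hypothetical stand-in for a scanner over key-sorted file entries.

    It only ever moves forward, mimicking a sequential (non-positional) read.
    """

    def __init__(self, entries: List[Tuple[str, str]]) -> None:
        self._entries = sorted(entries)            # entries are stored in key order
        self._keys = [k for k, _ in self._entries]
        self._pos = 0                              # current read position

    def seek_forward(self, key: str) -> Optional[str]:
        """Advance to the first entry >= key; the position never moves backwards."""
        self._pos = bisect_left(self._keys, key, lo=self._pos)
        if self._pos < len(self._keys) and self._keys[self._pos] == key:
            return self._entries[self._pos][1]
        return None


def lookup_sorted(scanner: ForwardScanner, keys: List[str]) -> Dict[str, str]:
    """Look up many keys in one forward pass by sorting the keys first."""
    found: Dict[str, str] = {}
    for key in sorted(keys):
        value = scanner.seek_forward(key)
        if value is not None:
            found[key] = value
    return found
```

With unsorted keys the same scanner would miss any key that sorts before the current position, which is why the sort is what makes the forward-only read sufficient.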

…concurrent operations (#4211)

* Fix kafka connect readme
* Fix handling of errors in write records for kafka connect
* By default, ensure we skip error records and keep the pipeline alive
* Fix indentation

Co-authored-by: Rajesh Mahindra <rmahindra@Rajeshs-MacBook-Pro.local>

…3173)

* [HUDI-2154] Add index key field to HoodieKey
* [HUDI-2157] Add the bucket index and its read/write implemention of Spark engine.
* revert HUDI-2154 add index key field to HoodieKey
* fix all comments and introduce a new tricky way to get index key at runtime support double insert for bucket index
* revert spark read optimizer based on bucket index
* add the storage layout
* index tag, hash function and add ut
* fix ut
* address partial comments
* Code review feedback
* add layout config and docs
* fix ut
* rename hoodie.layout and rebase master

Co-authored-by: Vinoth Chandar <vinoth@apache.org>

hudi-bot · 2021-12-31T08:08:12Z

CI report:

ef9923f Azure: PENDING

Bot commands

@hudi-bot supports the following commands:

@hudi-bot run azure: re-run the last Azure build

danny0405 and others added 30 commits November 27, 2021 17:22

Moving to 0.11.0-SNAPSHOT on master branch.

a1d0ff4

[MINOR] fix typo (#4140)

eca1693

[MINOR] Fixing integ test suite for hudi-aws and archival validation (#4142)

52aae36

Removing rfc from release package and fixing release validation script (#4147)

38e75ea

[MINOR] Fix syntax error in create_source_release.sh (#4150)

536af4b

[MINOR] Fix typo,rename 'getUrlEncodePartitoning' to 'getUrlEncodePartitioning' (#4130)

3433f00

[HUDI-2642] Add support ignoring case in update sql operation (#3882)

a398aad

[HUDI-2891] Fix write configs for Java engine in Kafka Connect Sink (#4161)

ea009b5

Revert "[HUDI-2855] Change the default value of 'PAYLOAD_CLASS_NAME' to 'DefaultHoodieRecordPayload' (#4115)" (#4169)

24380c2

This reverts commit 88067f5.

Revert "[HUDI-2856] Bit cask disk map delete modified (#4116)" (#4171)

9b254b6

This reverts commit 257a6a7.

[HUDI-2880] Fixing loading of props from default dir (#4167)

f4c25ba

* Fixing loading of props from default dir
* addressing comments

[HUDI-2881] Compact the file group with larger log files to reduce write amplification (#4152)

5284730

Fixed partitions produced by layout optimization in case order-by key is composed of a single column (#4183)

772f5ca

[MINOR] Fix the wrong usage of timestamp length variable bug (#4179)

61a03bc

Signed-off-by: zzzhy <candle_1667@163.com>

[HUDI-2904] Fix metadata table archival overstepping between regular writers and table services (#4186)

91d2e61

- Co-authored-by: Rajesh Mahindra <rmahindra@Rajeshs-MacBook-Pro.local>
- Co-authored-by: Sivabalan Narayanan <n.siva.b@gmail.com>

[HUDI-2914] Fix remote timeline server config for flink (#4191)

934fe54

[minor] Refactor write profile to always generate fs view (#4198)

f74b3d1

[HUDI-2924] Refresh the fs view on successful checkpoints for write profile (#4199)

0699521

[MINOR] use catalog schema if can not find table schema (#4182)

ca42724

[HUDI-2902] Fixing populate meta fields with Hfile writers and Disabling virtual keys by default for metadata table (#4194)

e483f7c

[HUDI-2911] Removing default value for PARTITIONPATH_FIELD_NAME resulting in incorrect `KeyGenerator` configuration (#4195)

bed7f98

Revert "[HUDI-2495] Resolve inconsistent key generation for timestamp types by GenericRecord and Row (#3944)" (#4201)

2f96f43

Revert "[HUDI-2489]Tuning HoodieROTablePathFilter by caching hoodieTableFileSystemView, aiming to reduce unnecessary list/get requests"

5616830

Co-authored-by: yuezhang <yuezhang@freewheel.tv>

[MINOR] Mitigate CI jobs timeout issues (#4173)

a799fae

* skip shutdown zookeeper in `@AfterAll` in TestHBaseIndex
* rebalance CI tests

[HUDI-2933] DISABLE Metadata table by default (#4213)

0fd6b2d

[HUDI-2923] Fixing metadata table reader when metadata compaction is inflight (#4206)

1d4fb82

* [HUDI-2923] Fixing metadata table reader when metadata compaction is inflight
* Fixing retry of pending compaction in metadata table and enhancing tests

[HUDI-2934] Optimize RequestHandler code style

568181a

close #4215

[HUDI-2935] Remove special casing of clustering in deltastreamer checkpoint retrival (#4216)

36b69d8

- We now seek backwards to find the checkpoint
- No need to return empty anymore
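The "seek backwards" approach in this commit message can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the deltastreamer implementation: the `latest_checkpoint` function, the dictionary shape of a commit, and the `"checkpoint"` metadata key are all hypothetical. The idea is to walk the timeline from newest to oldest and take the first commit that carries a checkpoint, so commits without one (for example, a clustering commit) need no special handling.

```python
from typing import Any, Dict, List, Optional

# Hypothetical commit shape: {"action": str, "metadata": dict}
Commit = Dict[str, Any]


def latest_checkpoint(timeline: List[Commit], key: str = "checkpoint") -> Optional[str]:
    """Walk the timeline newest-first and return the first checkpoint found."""
    for commit in reversed(timeline):
        metadata = commit.get("metadata") or {}
        value = metadata.get(key)
        if value:
            return value
    return None  # no commit in the timeline carries a checkpoint
```

For a timeline whose newest commit has empty metadata, the function skips it and returns the checkpoint of the most recent commit that has one.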

danny0405 and others added 27 commits December 22, 2021 11:10

[HUDI-3032] Do not clean the log files right after compaction for metadata table (#4336)

f1286c2

[HUDI-2547] Schedule Flink compaction in service (#4254)

15eb7e8

Co-authored-by: yuzhaojing <yuzhaojing@bytedance.com>

Merge pull request #4308 from harsh1231/HUDI-3008

b5890cd

[HUDI-3008] Fixing HoodieFileIndex partition column parsing for nested fields

[HUDI-3011] Adding ability to read entire data with HoodieIncrSource with empty checkpoint (#4334)

1a5f869

* Adding ability to read entire data with HoodieIncrSource with empty checkpoint
* Addressing comments

[HUDI-3060] drop table for spark sql (#4364)

5d93edc

[MINOR] Fix DedupeSparkJob typo (#4418)

57f43de

[HUDI-3014] Add table option to set utc timezone (#4306)

032b883

[MINOR] Remove unused method in HoodieActiveTimeline (#4435)

4721073

[HUDI-3101] Excluding compaction instants from pending rollback info (#4443)

7b07aac

[HUDI-3102] Do not store rollback plan in inflight instant (#4445)

c81df99

[HUDI-3099] Purge drop partition for spark sql (#4436)

282aa68

[HUDI-2374] Fixing AvroDFSSource does not use the overridden schema to deserialize Avro binaries (#4353)

6409fc7

[HUDI-3093] fix spark-sql query table that write with TimestampBasedKeyGenerator (#4416)

1f7afba

[HUDI-3106] Fix HiveSyncTool not sync schema (#4452)

32505d5

[HUDI-2811] Support Spark 3.2 (#4270)

05942e0

Fixing dynamoDbLockConfig required prop check (#4422)

3d7a869

[HUDI-2983] Remove Log4j2 transitive dependencies (#4281)

9412281

[MINOR] HoodieInstantTimeGenerator improve method used (#4462)

a29b27c

[HUDI-3108] Fix Purge Drop MOR Table Cause error (#4455)

504747e

Revert "[HUDI-3043] Revert async cleaner leak commit to unblock CI failure (#4343)" (#4465)

5c0e4ce

This reverts commit 7e7ad15.

[HUDI-3083] Support component data types for flink bulk_insert (#4470)

674c149

* [HUDI-3083] Support component data types for flink bulk_insert
* add nested row type test

[HUDI-2675] Fix the exception 'Not an Avro data file' when archive and clean (#4016)

436becf

[HUDI-3124] Bootstrap when timeline have completed instant (#4467)

0f0088f

Co-authored-by: yuzhaojing <yuzhaojing@bytedance.com>

[HUDI-3120] Cache compactionPlan in buffer (#4463)

e88b5fd

Co-authored-by: yuzhaojing <yuzhaojing@bytedance.com>

[HUDI-3095] abstract partition filter logic to enable code reuse (#4454)

2444f40

* [HUDI-3095] abstract partition filter logic to enable code reuse
* [HUDI-3095] address reviews

[HUDI-3107]Fix HiveSyncTool drop partitions using JDBC or hivesql or hms (#4453)

ef9923f

* constructDropPartitions when drop partitions using jdbc
* done
* done
* code style
* code review

Co-authored-by: yuezhang <yuezhang@freewheel.tv>

guoke111 changed the title from ~~分支0.10.0~~ ("Branch 0.10.0") to 0.10.0 on Dec 31, 2021

guoke111 closed this Dec 31, 2021

guoke111 wants to merge 132 commits into apache:release-0.10.0 from guoke111:master


20 participants
