FixNPEIncremental #9

Closed
wants to merge 254 commits into from
254 commits
c2f9094
[HUDI-2756] Fix flink parquet writer decimal type conversion (#3988)
danny0405 Nov 14, 2021
0bb6d8f
[HUDI-2706] refactor spark-sql to make consistent with DataFrame api …
YannByron Nov 14, 2021
a14d104
[HUDI-2589] Claiming RFC-37 for Metadata based bloom index feature. (…
manojpec Nov 15, 2021
a0dae41
[HUDI-2758] remove redundant code in the hoodieRealtimeInputFormatUit…
xiarixiaoyao Nov 15, 2021
3c43197
[MINOR] Fix typo in IntervalTreeBasedGlobalIndexFileFilter (#3993)
dufeng1010 Nov 15, 2021
53d2d6a
[HUDI-2744] Fix parsing of metadadata table compaction timestamp when…
nsivabalan Nov 15, 2021
38b6934
[HUDI-2683] Parallelize deleting archived hoodie commits (#3920)
zhangyue19921010 Nov 15, 2021
bff8769
[HUDI-2712] Fixing a bug with rollback of partially failed commit whi…
nsivabalan Nov 16, 2021
6f5e661
[HUDI-2769] Fix StreamerUtil#medianInstantTime for very near instant …
danny0405 Nov 16, 2021
cbcbec4
[MINOR] Fixed checkstyle config to be based off Maven root-dir (requi…
Nov 17, 2021
04eb5fd
[HUDI-2753] Ensure list based rollback strategy is used for restore (…
nsivabalan Nov 17, 2021
ce7d233
[HUDI-2151] Part3 Enabling marker based rollback as default rollback …
nsivabalan Nov 17, 2021
aec5d11
Check --source-avro-schema-path parameter (#3987)
0x3E6 Nov 17, 2021
4d884bd
[MINOR] Fix typo,'Hooide' corrected to 'Hoodie' (#4007)
dongkelun Nov 17, 2021
826414c
[MINOR] Add the Schema for GooseFS to StorageSchemes (#3982)
lubo212 Nov 17, 2021
1ee12cf
[HUDI-2314] Add support for DynamoDb based lock provider (#3486)
zhedoubushishi Nov 17, 2021
f715cf6
[HUDI-2716] InLineFS support for S3FS logs (#3977)
manojpec Nov 17, 2021
2d3f2a3
[HUDI-2734] Setting default metadata enable as false for Java (#4003)
nsivabalan Nov 17, 2021
71a2ae0
[HUDI-2789] Flink batch upsert for non partitioned table does not wor…
danny0405 Nov 18, 2021
8772cec
[HUDI-2790] Fix the changelog mode of HoodieTableSource (#4029)
danny0405 Nov 18, 2021
24def0b
[HUDI-2362] Add external config file support (#3416)
zhedoubushishi Nov 18, 2021
4e067ca
[HUDI-2641] Avoid deleting all inflight commits heartbeats while roll…
umehrot2 Nov 18, 2021
7a00f86
[HUDI-2791] Allows duplicate files for metadata commit (#4033)
danny0405 Nov 19, 2021
bf00876
[HUDI-2798] Fix flink query operation fields (#4041)
danny0405 Nov 19, 2021
eba354e
[HUDI-2731] Make clustering work regardless of whether there are base…
codope Nov 19, 2021
459b342
[HUDI-2593] Virtual keys support for metadata table (#3968)
manojpec Nov 19, 2021
c8617d9
[HUDI-2472] Enabling metadata table for TestHoodieMergeOnReadTable an…
manojpec Nov 20, 2021
0230d40
[HUDI-2796] Metadata table support for Restore action to first commit…
manojpec Nov 20, 2021
3dc6262
[HUDI-2242] Add configuration inference logic for few options (#3359)
zhedoubushishi Nov 20, 2021
6cc97cc
Remove the aws packages from hudi flink bundle jar (#4050)
lsyldliu Nov 20, 2021
f4b974a
[HUDI-2742] Added S3 object filter to support multiple S3EventsHoodie…
h7kanna Nov 20, 2021
ae0c67d
[HUDI-2795] Add mechanism to safely update,delete and recover table p…
vinothchandar Nov 20, 2021
1a5484d
[MINOR] Claim RFC number for RFC for debezium source for deltastreame…
rmahindra123 Nov 21, 2021
305d160
[MINOR] optimize in constructor of inputbatch class (#4040)
dufeng1010 Nov 21, 2021
74b59a4
[HUDI-2813] Claim RFC number for RFC for spark datasource V2 Integrat…
leesf Nov 21, 2021
0411f73
[HUDI-2804] Add option to skip compaction instants for streaming read…
danny0405 Nov 21, 2021
520538b
[HUDI-2392] Make flink parquet reader compatible with decimal BINARY …
danny0405 Nov 21, 2021
887787e
[HUDI-1932] Update Hive sync timestamp when change detected (#3053)
nateradtke Nov 21, 2021
2533a9c
[MINOR] Fix typos (#4053)
dongkelun Nov 21, 2021
8281cbf
[HUDI-2799] Fix the classloader of flink write task (#4042)
danny0405 Nov 22, 2021
02f7ca2
[HUDI-1870] Add more Spark CI build tasks (#4022)
xushiyan Nov 22, 2021
a2c91a7
[HUDI-2533] New option for hoodieClusteringJob to check, rollback and…
zhangyue19921010 Nov 22, 2021
7f3b89f
[HUDI-2472] Enabling metadata table for TestHoodieIndex test case (#4…
manojpec Nov 22, 2021
8945206
[MINOR] Fix instant parsing in HoodieClusteringJob (#4071)
codope Nov 22, 2021
fc9ca6a
[HUDI-2559] Converting commit timestamp format to millisecs (#4024)
nsivabalan Nov 22, 2021
fe57e9b
[HUDI-2599] Make addFilesToview and fetchLatestBaseFiles public (#4066)
codope Nov 22, 2021
3bdab01
[HUDI-2550] Expand File-Group candidates list for appending for MOR t…
Nov 23, 2021
772af93
[HUDI-2737] Use earliest instant by default for async compaction and …
yihua Nov 23, 2021
0d1e7ec
[MINOR] Fix typo,'multipe' corrected to 'multiple' (#4068)
YongjinZhou Nov 23, 2021
e22150f
[HUDI-1937] Rollback unfinished replace commit to allow updates (#3869)
codope Nov 23, 2021
6aa710e
[MINOR] Add more configuration to Kafka setup script (#3992)
yihua Nov 23, 2021
c88c2af
[HUDI-2743] Assume path exists and defer fs.exists() in AbstractTable…
codope Nov 23, 2021
9de9951
[HUDI-2778] Optimize statistics collection related codes and add some…
xiarixiaoyao Nov 23, 2021
9ed28b1
[HUDI-2409] Using HBase shaded jars in Hudi presto bundle (#3623)
zhangyue19921010 Nov 23, 2021
ca9bfa2
[HUDI-2332] Add clustering and compaction in Kafka Connect Sink (#3857)
yihua Nov 23, 2021
969a5bf
[MINOR] Fix typo,rename 'HooodieAvroDeserializer' to 'HoodieAvroDeser…
dongkelun Nov 23, 2021
fbff079
[HUDI-2325] Add hive sync support to kafka connect (#3660)
rmahindra123 Nov 23, 2021
18cf595
[HUDI-2831] Securing usages of `SimpleDateFormat` to be thread-safe (…
Nov 24, 2021
5078d29
[HUDI-2818] Fix 2to3 upgrade when set `hoodie.table.keygenerator.clas…
xushiyan Nov 24, 2021
0cf2f10
[HUDI-2838] refresh table after drop partition (#4084)
YannByron Nov 24, 2021
323be33
Revert "[HUDI-2799] Fix the classloader of flink write task (#4042)" …
danny0405 Nov 24, 2021
0bb506f
[HUDI-2847] Flink metadata table supports virtual keys (#4096)
danny0405 Nov 24, 2021
a234833
[HUDI-2759] extract HoodieCatalogTable to coordinate spark catalog ta…
YannByron Nov 24, 2021
9af219b
[HUDI-2688] Claim the next rfc 40 for Hudi connector for Trino (#4105)
codope Nov 24, 2021
90f2ea2
[HUDI-2671] Fix kafka offset handling in Kafka Connect protocol (#4021)
rmahindra123 Nov 24, 2021
973f78f
[HUDI-2443] Hudi KVComparator for all HFile writer usages (#3889)
manojpec Nov 24, 2021
60b23b9
[HUDI-2788] Fixing issues w/ Z-order Layout Optimization (#4026)
Nov 24, 2021
ff94d92
[HUDI-2766] Cluster update strategy should not be fenced by write con…
codope Nov 24, 2021
435ea15
[HUDI-2793] Fixing deltastreamer checkpoint fetch/copy over (#4034)
nsivabalan Nov 24, 2021
7286b56
[HUDI-2853] Add JMX deps in hudi utilities and kafka connect bundles …
rmahindra123 Nov 25, 2021
5129773
[HUDI-2844][CLI] Fixing archived Timeline crashing if timeline contai…
Nov 25, 2021
bef373f
[MINOR] Fix build failure due to checkstyle issues (#4111)
yihua Nov 25, 2021
abc0175
[HUDI-1290] [RFC-39] Deltastreamer avro source for Debezium CDC (#4048)
rmahindra123 Nov 25, 2021
83f8ed2
[HUDI-1290] Add Debezium Source for deltastreamer (#4063)
rmahindra123 Nov 25, 2021
a9bd208
[HUDI-2792] Configure metadata payload consistency check (#4035)
nsivabalan Nov 25, 2021
88067f5
[HUDI-2855] Change the default value of 'PAYLOAD_CLASS_NAME' to 'Defa…
dongkelun Nov 25, 2021
a2eb2b0
[HUDI-2480] FileSlice after pending compaction-requested instant-time…
danny0405 Nov 25, 2021
264e1ce
[HUDI-1290] fixing mysql debezium source (#4119)
data-storyteller Nov 25, 2021
b972aa5
[HUDI-2800] Remove rdd.isEmpty() validation to prevent CreateHandle b…
codope Nov 25, 2021
7bb90e8
[HUDI-2794] Guarding table service commits within a single lock to co…
nsivabalan Nov 25, 2021
f692078
[HUDI-2671] Making error -> warn logs from timeline server with concu…
nsivabalan Nov 25, 2021
6a0f079
[HUDI-2858] Fixing handling of cluster update reject exception in del…
nsivabalan Nov 25, 2021
8e13793
[HUDI-2841] Fixing lazy rollback for MOR with list based strategy (#4…
nsivabalan Nov 25, 2021
e0125a7
[HUDI-2801] Add Amazon CloudWatch metrics reporter (#4081)
umehrot2 Nov 25, 2021
6f5d8d0
[HUDI-2840] Fixed DeltaStreaemer to properly respect configuration pa…
Nov 25, 2021
8340ccb
[HUDI-2005] Removing direct fs call in HoodieLogFileReader (#3865)
nsivabalan Nov 25, 2021
38585e4
[HUDI-2851] Shade org.apache.hadoop.hive.ql.optimizer package for fli…
lsyldliu Nov 26, 2021
f5da9b5
[MINOR] Include hudi-aws in flink bundle jar (#4127)
danny0405 Nov 26, 2021
e554c7f
[HUDI-2852] Table metadata returns empty for non-exist partition (#4117)
minchowang Nov 26, 2021
e9efbdb
[HUDI-2863] Rename option 'hoodie.parquet.page.size' to 'write.parque…
danny0405 Nov 26, 2021
3d75aca
[HUDI-2850] Fixing Clustering CLI - schedule and run command fixes to…
manojpec Nov 26, 2021
5755ff2
[HUDI-2814] Addressing issues w/ Z-order Layout Optimization (#4060)
Nov 26, 2021
a88691f
[MINOR] Fixing test failure to fix CI build failure (#4132)
nsivabalan Nov 26, 2021
f8e0176
[HUDI-2861] Re-use same rollback instant time for failed rollbacks (#…
nsivabalan Nov 26, 2021
d1e83e4
[HUDI-2767] Enabling timeline-server-based marker as default (#4112)
yihua Nov 26, 2021
445208a
[HUDI-2845] Metadata CLI - files/partition file listing fix and new v…
manojpec Nov 26, 2021
8402cac
[HUDI-2848] Excluse guava from hudi-cli pom (#4100)
huleilei Nov 26, 2021
9028e6e
[HUDI-2864] Fix README and scripts with current limitations of hive s…
rmahindra123 Nov 26, 2021
257a6a7
[HUDI-2856] Bit cask disk map delete modified (#4116)
xuzifu666 Nov 26, 2021
9c059ef
[MINOR] Follow ups from HUDI-2861 (re-use same rollback instant for f…
nsivabalan Nov 27, 2021
3a8d64e
[HUDI-2868] Fix skipped HoodieSparkSqlWriterSuite (#4125)
xushiyan Nov 27, 2021
2c7656c
[HUDI-2475] [HUDI-2862] Metadata table creation and avoid bootstrappi…
manojpec Nov 27, 2021
780a2ac
[HUDI-2102] Support hilbert curve for hudi (#3952)
xiarixiaoyao Nov 27, 2021
a1d0ff4
Moving to 0.11.0-SNAPSHOT on master branch.
danny0405 Nov 27, 2021
eca1693
[MINOR] fix typo (#4140)
vortual Nov 28, 2021
52aae36
[MINOR] Fixing integ test suite for hudi-aws and archival validation …
nsivabalan Nov 29, 2021
38e75ea
Removing rfc from release package and fixing release validation scrip…
nsivabalan Nov 29, 2021
536af4b
[MINOR] Fix syntax error in create_source_release.sh (#4150)
danny0405 Nov 29, 2021
3433f00
[MINOR] Fix typo,rename 'getUrlEncodePartitoning' to 'getUrlEncodePar…
dongkelun Nov 30, 2021
a398aad
[HUDI-2642] Add support ignoring case in update sql operation (#3882)
dongkelun Nov 30, 2021
ea009b5
[HUDI-2891] Fix write configs for Java engine in Kafka Connect Sink (…
yihua Nov 30, 2021
24380c2
Revert "[HUDI-2855] Change the default value of 'PAYLOAD_CLASS_NAME' …
Dec 1, 2021
9b254b6
Revert "[HUDI-2856] Bit cask disk map delete modified (#4116)" (#4171)
yihua Dec 1, 2021
f4c25ba
[HUDI-2880] Fixing loading of props from default dir (#4167)
nsivabalan Dec 1, 2021
5284730
[HUDI-2881] Compact the file group with larger log files to reduce wr…
minihippo Dec 2, 2021
772f5ca
Fixed partitions produced by layout optimization in case order-by key…
Dec 2, 2021
61a03bc
[MINOR] Fix the wrong usage of timestamp length variable bug (#4179)
zzzhy Dec 2, 2021
91d2e61
[HUDI-2904] Fix metadata table archival overstepping between regular …
rmahindra123 Dec 2, 2021
934fe54
[HUDI-2914] Fix remote timeline server config for flink (#4191)
danny0405 Dec 3, 2021
f74b3d1
[minor] Refactor write profile to always generate fs view (#4198)
danny0405 Dec 3, 2021
0699521
[HUDI-2924] Refresh the fs view on successful checkpoints for write p…
danny0405 Dec 3, 2021
ca42724
[MINOR] use catalog schema if can not find table schema (#4182)
YannByron Dec 3, 2021
e483f7c
[HUDI-2902] Fixing populate meta fields with Hfile writers and Disabl…
nsivabalan Dec 3, 2021
bed7f98
[HUDI-2911] Removing default value for `PARTITIONPATH_FIELD_NAME` res…
Dec 3, 2021
2f96f43
Revert "[HUDI-2495] Resolve inconsistent key generation for timestamp…
YannByron Dec 3, 2021
383d5ed
[HUDI-2894][HUDI-2905] Metadata table - avoiding key lookup failures …
manojpec Dec 3, 2021
5616830
Revert "[HUDI-2489]Tuning HoodieROTablePathFilter by caching hoodieTa…
zhangyue19921010 Dec 4, 2021
a799fae
[MINOR] Mitigate CI jobs timeout issues (#4173)
xushiyan Dec 4, 2021
0fd6b2d
[HUDI-2933] DISABLE Metadata table by default (#4213)
vinothchandar Dec 4, 2021
94f45e9
[HUDI-2890] Kafka Connect: Fix failed writes and avoid table service …
rmahindra123 Dec 4, 2021
1d4fb82
[HUDI-2923] Fixing metadata table reader when metadata compaction is …
nsivabalan Dec 4, 2021
568181a
[HUDI-2934] Optimize RequestHandler code style
lsyldliu Dec 4, 2021
36b69d8
[HUDI-2935] Remove special casing of clustering in deltastreamer chec…
vinothchandar Dec 4, 2021
a8fb696
[HUDI-2877] Support flink catalog to help user use flink table conven…
lsyldliu Dec 5, 2021
63b1560
[HUDI-2937] Introduce a pulsar implementation of hoodie write commit …
XuQianJin-Stars Dec 5, 2021
734c9f5
[HUDI-2418] Support HiveSchemaProvider (#3671)
fengjian428 Dec 5, 2021
f0e46bf
[HUDI-2916] Add IssueNavigationLink for IDEA (#4192)
leesf Dec 6, 2021
84b531a
[HUDI-2900] Fix corrupt block end position (#4181)
lsyldliu Dec 6, 2021
57c4bf8
[HUDI-2876] for hive/presto hudi should remove the temp file which cr…
xiarixiaoyao Dec 6, 2021
2d66451
[MINOR] Fix partition path formatting in error log (#4168)
yihua Dec 6, 2021
4a437f2
[MINOR] Use maven-shade-plugin version for hudi-timeline-server-bundl…
zhedoubushishi Dec 6, 2021
6dab307
[MINOR] Remove redundant and conflicting spark-hive dependency (#4228)
codope Dec 7, 2021
e8473b9
[HUDI-2951] Disable remote view storage config for flink (#4237)
danny0405 Dec 7, 2021
c9e18d1
[HUDI-2942] add error message log in HoodieCombineHiveInputFormat (#4…
xuzifu666 Dec 8, 2021
c56d93e
[MINOR] Update DOAP with 0.10.0 Release (#4246)
danny0405 Dec 8, 2021
082faa3
[HUDI-2832][RFC-41] Proposal to integrate Hudi on Snowflake platform …
Dec 8, 2021
7c3f077
[HUDI-2964] Fixing aws lock configs to inherit from HoodieConfig (#4258)
nsivabalan Dec 9, 2021
bd08470
[HUDI-2957] Shade kryo jar for flink bundle jar (#4251)
danny0405 Dec 9, 2021
9c8ad0f
[HUDI-2665] Fix overflow of huge log file in HoodieLogFormatWriter (#…
guanziyue Dec 9, 2021
5ac9ce7
[MINOR] Fix Compile broken (#4263)
leesf Dec 9, 2021
f612a20
[HUDI-2779] Cache BaseDir if HudiTableNotFound Exception thrown (#4014)
Dec 9, 2021
68f8597
[HUDI-2966] Add TaskCompletionListener for HoodieMergeOnReadRDD to cl…
xiarixiaoyao Dec 9, 2021
3fb2f97
[MINOR] FAQ link in SUPPORT_REQUEST template (#4266)
Arun-kc Dec 9, 2021
8321d20
Claiming RFC for data skipping index for updated version (#4271)
nsivabalan Dec 10, 2021
ea154bc
Revert "Claiming RFC for data skipping index for updated version (#42…
nsivabalan Dec 10, 2021
456d74c
[HUDI-2901] Fixed the bug clustering jobs cannot running in parallel …
xiarixiaoyao Dec 10, 2021
c7473a7
[HUDI-2936] Add data count checks in async clustering tests (#4236)
codope Dec 10, 2021
f194566
[HUDI-2849] Improve SparkUI job description for write path (#4222)
YuweiXiao Dec 10, 2021
be36826
[HUDI-2952] Fixing metadata table for non-partitioned dataset (#4243)
nsivabalan Dec 10, 2021
3ad9b12
[HUDI-2912] Fix CompactionPlanOperator typo (#4187)
yuzhaojing Dec 10, 2021
3ce0526
Adding verbose output for metadata validate files command (#4166)
nsivabalan Dec 10, 2021
3ba2909
[HUDI-2892][BUG] Pending Clustering may stain the ActiveTimeLine and …
zhangyue19921010 Dec 10, 2021
72901a3
[HUDI-2784] Add a hudi-trino-bundle for Trino (#4279)
yihua Dec 10, 2021
2d864f7
[HUDI-2814] Make Z-index more generic Column-Stats Index (#4106)
Dec 10, 2021
c48a2a1
[HUDI-2527] Multi writer test with conflicting async table services (…
manojpec Dec 11, 2021
9797fdf
[HUDI-2974] Make the prefix for metrics name configurable (#4274)
rmahindra123 Dec 11, 2021
9bdcee0
[HUDI-2959] Fix the thread leak of cleaning service (#4252)
danny0405 Dec 11, 2021
2dcb3f0
[HUDI-2985] Shade jackson for hudi flink bundle jar (#4284)
danny0405 Dec 11, 2021
b5f05fd
[HUDI-2906] Add a repair util to clean up dangling data and log files…
yihua Dec 11, 2021
8dd0444
[HUDI-2984] Implement #close for AbstractTableFileSystemView (#4285)
danny0405 Dec 11, 2021
15444c9
[HUDI-2946] Upgrade maven plugins to be compatible with higher Java v…
zhedoubushishi Dec 12, 2021
b22c2c6
[HUDI-2938] Metadata table util to get latest file slices for reader/…
manojpec Dec 12, 2021
dd96129
[HUDI-2990] Sync to HMS when deleting partitions (#4291)
XuQianJin-Stars Dec 13, 2021
46de25d
[HUDI-2994] Add judgement to existed partitionPath in the catch code …
minchowang Dec 13, 2021
29bc5fd
[HUDI-2996] Flink streaming reader 'skip_compaction' option does not …
Fugle666 Dec 14, 2021
c8d6bd8
[HUDI-2997] Skip the corrupt meta file for pending rollback action (#…
danny0405 Dec 14, 2021
bc8bf04
[HUDI-2995] Enabling metadata table by default (#4295)
manojpec Dec 14, 2021
dbec6c5
[HUDI-3022] Fix NPE for isDropPartition method (#4319)
XuQianJin-Stars Dec 15, 2021
9a2030a
[HUDI-3024] Add explicit write handler for flink (#4329)
minchowang Dec 15, 2021
3b89457
[HUDI-3025] Add additional wait time for namenode availability during…
yihua Dec 15, 2021
27907de
[HUDI-3028] Use blob storage to speed up CI downloads (#4331)
xushiyan Dec 15, 2021
f5b07a7
[HUDI-2998] claiming rfc number for consistent hashing index (#4303)
YuweiXiao Dec 15, 2021
ea2eba1
[HUDI-3015] Implement #reset and #sync for metadata filesystem view (…
danny0405 Dec 16, 2021
a8a192a
[Minor] Catch and ignore all the exceptions in quietDeleteMarkerDir (…
zhangyue19921010 Dec 16, 2021
294d712
[HUDI-3001] Clean up the marker directory when finish bootstrap opera…
xiarixiaoyao Dec 16, 2021
7e7ad15
[HUDI-3043] Revert async cleaner leak commit to unblock CI failure (#…
nsivabalan Dec 17, 2021
d0087d4
[HUDI-3037] Add back remote view storage config for flink (#4338)
danny0405 Dec 17, 2021
e4cfb42
[HUDI-3046] Claim RFC number for RFC for Compaction / Clustering Serv…
yuzhaojing Dec 17, 2021
9246b16
[HUDI-2958] Automatically set spark.sql.parquet.writelegacyformat, wh…
xiarixiaoyao Dec 17, 2021
6eba834
[HUDI-3043] Adding some test fixes to continuous mode multi writer te…
nsivabalan Dec 17, 2021
7784249
[HUDI-2962] InProcess lock provider to guard single writer process wi…
manojpec Dec 18, 2021
4785244
[HUDI-3043] De-coupling multi writer tests (#4362)
nsivabalan Dec 18, 2021
d1d48ed
[HUDI-3029] Transaction manager: avoid deadlock when doing begin and…
manojpec Dec 18, 2021
733732b
[HUDI-3029] Transaction manager: avoid deadlock when doing begin and…
manojpec Dec 18, 2021
dc40397
[HUDI-3064] Fixing a bug in TransactionManager and FileSystemTestLock…
nsivabalan Dec 18, 2021
77abb5c
[HUDI-3054] Fixing default lock configs for FileSystemBasedLock and f…
nsivabalan Dec 18, 2021
f57e28f
[MINOR] Azure CI IT tasks clean up (#4337)
xushiyan Dec 19, 2021
bb99836
[HUDI-3052] Fix flaky testJsonKafkaSourceResetStrategy (#4381)
xushiyan Dec 19, 2021
478f9f3
[minor] fix NetworkUtils#getHostname (#4355)
danny0405 Dec 19, 2021
03f71ef
[HUDI-2970] Adding tests for archival of replace commit actions (#4268)
nsivabalan Dec 19, 2021
4a48f99
[HUDI-3064][HUDI-3054] FileSystemBasedLockProviderTestClass tryLock f…
manojpec Dec 19, 2021
3ca9210
remove unused import (#4349)
xuzifu666 Dec 20, 2021
f166dda
[MINOR] Remove unused method in HoodieActiveTimeline (#4401)
xuzifu666 Dec 20, 2021
982ae3d
[MINOR] Increasing CI timeout to 90 mins (#4407)
nsivabalan Dec 21, 2021
f3f6112
[HUDI-3070] Add rerunFailingTestsCount for flakly testes (#4398)
zhangyue19921010 Dec 21, 2021
32a44bb
[HUDI-2970] Add test for archiving replace commit (#4345)
xushiyan Dec 21, 2021
7d046f9
[HUDI-3008] Fixing HoodieFileIndex partition column parsing for neste…
harsh1231 Dec 14, 2021
92f54ce
[HUDI-3027] Update hudi-examples README.md (#4330)
Aimiyoo Dec 21, 2021
f1286c2
[HUDI-3032] Do not clean the log files right after compaction for met…
danny0405 Dec 22, 2021
15eb7e8
[HUDI-2547] Schedule Flink compaction in service (#4254)
yuzhaojing Dec 22, 2021
b5890cd
Merge pull request #4308 from harsh1231/HUDI-3008
xiarixiaoyao Dec 22, 2021
1a5f869
[HUDI-3011] Adding ability to read entire data with HoodieIncrSource …
nsivabalan Dec 22, 2021
5d93edc
[HUDI-3060] drop table for spark sql (#4364)
XuQianJin-Stars Dec 22, 2021
57f43de
[MINOR] Fix DedupeSparkJob typo (#4418)
Aimiyoo Dec 22, 2021
032b883
[HUDI-3014] Add table option to set utc timezone (#4306)
xuzifu666 Dec 23, 2021
4721073
[MINOR] Remove unused method in HoodieActiveTimeline (#4435)
xuzifu666 Dec 24, 2021
7b07aac
[HUDI-3101] Excluding compaction instants from pending rollback info …
danny0405 Dec 25, 2021
c81df99
[HUDI-3102] Do not store rollback plan in inflight instant (#4445)
danny0405 Dec 25, 2021
282aa68
[HUDI-3099] Purge drop partition for spark sql (#4436)
XuQianJin-Stars Dec 28, 2021
6409fc7
[HUDI-2374] Fixing AvroDFSSource does not use the overridden schema t…
harsh1231 Dec 28, 2021
1f7afba
[HUDI-3093] fix spark-sql query table that write with TimestampBasedK…
YannByron Dec 28, 2021
32505d5
[HUDI-3106] Fix HiveSyncTool not sync schema (#4452)
XuQianJin-Stars Dec 28, 2021
05942e0
[HUDI-2811] Support Spark 3.2 (#4270)
YannByron Dec 28, 2021
3d7a869
Fixing dynamoDbLockConfig required prop check (#4422)
nsivabalan Dec 28, 2021
9412281
[HUDI-2983] Remove Log4j2 transitive dependencies (#4281)
umehrot2 Dec 28, 2021
a29b27c
[MINOR] HoodieInstantTimeGenerator improve method used (#4462)
xuzifu666 Dec 29, 2021
504747e
[HUDI-3108] Fix Purge Drop MOR Table Cause error (#4455)
XuQianJin-Stars Dec 29, 2021
5c0e4ce
Revert "[HUDI-3043] Revert async cleaner leak commit to unblock CI fa…
nsivabalan Dec 30, 2021
674c149
[HUDI-3083] Support component data types for flink bulk_insert (#4470)
lsyldliu Dec 30, 2021
436becf
[HUDI-2675] Fix the exception 'Not an Avro data file' when archive an…
dongkelun Dec 30, 2021
0f0088f
[HUDI-3124] Bootstrap when timeline have completed instant (#4467)
yuzhaojing Dec 30, 2021
a4e622a
[HUDI-1951] Add bucket hash index, compatible with the hive bucket (#…
minihippo Dec 30, 2021
e88b5fd
[HUDI-3120] Cache compactionPlan in buffer (#4463)
yuzhaojing Dec 31, 2021
2444f40
[HUDI-3095] abstract partition filter logic to enable code reuse (#4454)
YuweiXiao Dec 31, 2021
ef9923f
[HUDI-3107]Fix HiveSyncTool drop partitions using JDBC or hivesql or …
zhangyue19921010 Dec 31, 2021
bfa169d
[HUDI-3040] Fix HoodieSparkBootstrapExample error info for usage (#4341)
Aimiyoo Jan 1, 2022
188d033
[HUDI-3134] Fix insert error after adding columns on Spark 3.2.0 (#4488)
leesf Jan 2, 2022
1622b52
[HUDI-3136] Fix merge/insert/show partitions error on Spark3.2 (#4490)
YannByron Jan 2, 2022
fe9406d
[HUDI-3131] fix ctas error in spark3.1.1 (#4476)
YannByron Jan 2, 2022
1e2d2c4
[HUDI-3138] Fix broken UT test for TestHiveSyncTool.testDropPartition…
zhangyue19921010 Jan 3, 2022
0273f2e
[MINOR] Update README.md (#4492)
xushiyan Jan 3, 2022
2b2ae34
[HUDI-2558] Fixing Clustering w/ sort columns with null values fails …
harsh1231 Jan 3, 2022
29ab6fb
[HUDI-3140] Fix bulk_insert failure on Spark 3.2.0 (#4498)
leesf Jan 4, 2022
7329d22
Adding tests to validate different key generators (#4473)
nsivabalan Jan 4, 2022
aaf5727
[HUDI-2774] Handle duplicate instants when fetching pending clusterin…
codope Jan 4, 2022
bf4e3d6
[HUDI-3141] Metadata merged log record reader - avoiding NullPointerE…
manojpec Jan 4, 2022
37b15ff
[HUDI-3147] Add endpoint_url to dynamodb lock provider (#4500)
parisni Jan 4, 2022
a66212d
[HUDI-2966] Closing LogRecordScanner in compactor (#4478)
nsivabalan Jan 5, 2022
0e297c0
[HUDI-3171] Sync empty table to hive metastore (#4511)
danny0405 Jan 5, 2022
8307160
Fixing null schema with empty commit in incremental relation
vinishjail97 Jan 5, 2022
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/SUPPORT_REQUEST.md
@@ -8,7 +8,7 @@ labels: question

 **_Tips before filing an issue_**

-- Have you gone through our [FAQs](https://cwiki.apache.org/confluence/display/HUDI/FAQ)?
+- Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)?

 - Join the mailing list to engage in conversations and get faster support at dev-subscribe@hudi.apache.org.
12 changes: 12 additions & 0 deletions .github/workflows/bot.yml
@@ -18,8 +18,20 @@ jobs:
        include:
          - scala: "scala-2.11"
            spark: "spark2"
          - scala: "scala-2.11"
            spark: "spark2,spark-shade-unbundle-avro"
          - scala: "scala-2.12"
            spark: "spark3,spark3.0.x"
          - scala: "scala-2.12"
            spark: "spark3,spark3.0.x,spark-shade-unbundle-avro"
          - scala: "scala-2.12"
            spark: "spark3,spark3.1.x"
          - scala: "scala-2.12"
            spark: "spark3,spark3.1.x,spark-shade-unbundle-avro"
          - scala: "scala-2.12"
            spark: "spark3"
          - scala: "scala-2.12"
            spark: "spark3,spark-shade-unbundle-avro"
    steps:
      - uses: actions/checkout@v2
      - name: Set up JDK 8
3 changes: 2 additions & 1 deletion .gitignore
@@ -61,7 +61,8 @@ local.properties
 # IntelliJ specific files/directories #
 #######################################
 .out
-.idea
+.idea/*
+!.idea/vcs.xml
 *.ipr
 *.iws
 *.iml
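
Note on the `.gitignore` hunk above: git cannot re-include a file whose parent directory is itself excluded, which is presumably why `.idea` becomes `.idea/*` before `!.idea/vcs.xml` is added. The Python sketch below models only the "last matching rule wins, `!` negates" part of that behaviour. The rule list is copied from the hunk; `is_ignored` is a hypothetical helper built on `fnmatch`, not how git itself matches paths (real gitignore matching also handles anchoring, `**`, and directory-only patterns).

```python
from fnmatch import fnmatch

# Rules as they appear in the hunk after the change, in file order.
RULES = [".out", ".idea/*", "!.idea/vcs.xml", "*.ipr", "*.iws", "*.iml"]


def is_ignored(path, rules=RULES):
    """Simplified model: the last rule that matches decides the outcome."""
    ignored = False
    for rule in rules:
        negated = rule.startswith("!")
        pattern = rule[1:] if negated else rule
        if fnmatch(path, pattern):
            # A plain rule ignores the path; a "!" rule re-includes it.
            ignored = not negated
    return ignored


if __name__ == "__main__":
    for candidate in (".idea/workspace.xml", ".idea/vcs.xml", "hudi.iml"):
        print(candidate, "ignored" if is_ignored(candidate) else "tracked")
```

Under this model everything inside `.idea/` stays ignored except `vcs.xml`, which matches the intent of the later commit adding `.idea/vcs.xml` to the repository.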
36 changes: 36 additions & 0 deletions .idea/vcs.xml

Generated file; diff not rendered by default.

6 changes: 6 additions & 0 deletions NOTICE
@@ -159,3 +159,9 @@ its NOTICE file:
 This product includes software developed at
 StreamSets (http://www.streamsets.com/).

+--------------------------------------------------------------------------------
+
+This product includes code from hilbert-curve project
+* Copyright https://github.com/davidmoten/hilbert-curve
+* Licensed under the Apache-2.0 License
+
13 changes: 10 additions & 3 deletions README.md
@@ -51,7 +51,7 @@ Prerequisites for building Apache Hudi:
 * Unix-like system (like Linux, Mac OS X)
 * Java 8 (Java 9 or 10 may work)
 * Git
-* Maven
+* Maven (>=3.3.1)

 ```
 # Checkout code and build
@@ -78,12 +78,19 @@ The default Scala version supported is 2.11. To build for Scala 2.12 version, bu
 mvn clean package -DskipTests -Dscala-2.12
 ```

-### Build with Spark 3.0.0
+### Build with Spark 3

-The default Spark version supported is 2.4.4. To build for Spark 3.0.0 version, build using `spark3` profile
+The default Spark version supported is 2.4.4. To build for different Spark 3 versions, use the corresponding profile

 ```
+# Build against Spark 3.2.0 (default build shipped with the public jars)
 mvn clean package -DskipTests -Dspark3
+
+# Build against Spark 3.1.2
+mvn clean package -DskipTests -Dspark3.1.x
+
+# Build against Spark 3.0.3
+mvn clean package -DskipTests -Dspark3.0.x
 ```

 ### Build without spark-avro module
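
The Spark 3 section of the README hunk above pairs each Spark version with a Maven flag. As a minimal sketch, the mapping can be written down as a lookup table. The version/flag pairs are copied from the README text; the helper `build_command` and its error handling are hypothetical and not part of the Hudi build tooling.

```python
# Spark version -> Maven flag, as listed in the README hunk.
SPARK_PROFILE_FLAGS = {
    "3.2.0": "-Dspark3",
    "3.1.2": "-Dspark3.1.x",
    "3.0.3": "-Dspark3.0.x",
}

# The README states that 2.4.4 is the default and needs no extra flag.
DEFAULT_SPARK_VERSION = "2.4.4"


def build_command(spark_version=DEFAULT_SPARK_VERSION):
    """Return the mvn invocation (as an argument list) for a Spark version."""
    command = ["mvn", "clean", "package", "-DskipTests"]
    if spark_version == DEFAULT_SPARK_VERSION:
        return command
    if spark_version not in SPARK_PROFILE_FLAGS:
        raise ValueError(f"no profile listed in the README for Spark {spark_version}")
    return command + [SPARK_PROFILE_FLAGS[spark_version]]


if __name__ == "__main__":
    print(" ".join(build_command("3.1.2")))
```

Versions the README does not list raise an error here rather than being guessed at.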
43 changes: 27 additions & 16 deletions azure-pipelines.yml
@@ -26,13 +26,14 @@ variables:
 MAVEN_OPTS: '-Dmaven.repo.local=$(MAVEN_CACHE_FOLDER) -Dcheckstyle.skip=true -Drat.skip=true -Djacoco.skip=true'
 SPARK_VERSION: '2.4.4'
 HADOOP_VERSION: '2.7'
-SPARK_HOME: $(Pipeline.Workspace)/spark-$(SPARK_VERSION)-bin-hadoop$(HADOOP_VERSION)
+SPARK_ARCHIVE: spark-$(SPARK_VERSION)-bin-hadoop$(HADOOP_VERSION)

 stages:
 - stage: test
 jobs:
 - job: UT_FT_1
 displayName: UT FT common & flink & UT client/spark-client
+timeoutInMinutes: '90'
 steps:
 - task: Cache@2
 displayName: set cache
@@ -47,7 +48,7 @@ stages:
 inputs:
 mavenPomFile: 'pom.xml'
 goals: 'install'
-options: -DskipTests
+options: -T 2.5C -DskipTests
 publishJUnitResults: false
 jdkVersionOption: '1.8'
 mavenOptions: '-Xmx2g $(MAVEN_OPTS)'
@@ -71,6 +72,7 @@ stages:
 mavenOptions: '-Xmx2g $(MAVEN_OPTS)'
 - job: UT_FT_2
 displayName: FT client/spark-client
+timeoutInMinutes: '90'
 steps:
 - task: Cache@2
 displayName: set cache
@@ -85,7 +87,7 @@ stages:
 inputs:
 mavenPomFile: 'pom.xml'
 goals: 'install'
-options: -DskipTests
+options: -T 2.5C -DskipTests
 publishJUnitResults: false
 jdkVersionOption: '1.8'
 mavenOptions: '-Xmx2g $(MAVEN_OPTS)'
@@ -99,7 +101,8 @@ stages:
 jdkVersionOption: '1.8'
 mavenOptions: '-Xmx2g $(MAVEN_OPTS)'
 - job: UT_FT_3
-displayName: UT FT cli & utilities & sync/hive-sync
+displayName: UT FT clients & cli & utilities & sync/hive-sync
+timeoutInMinutes: '90'
 steps:
 - task: Cache@2
 displayName: set cache
@@ -114,30 +117,31 @@ stages:
 inputs:
 mavenPomFile: 'pom.xml'
 goals: 'install'
-options: -DskipTests
+options: -T 2.5C -DskipTests
 publishJUnitResults: false
 jdkVersionOption: '1.8'
 mavenOptions: '-Xmx2g $(MAVEN_OPTS)'
 - task: Maven@3
-displayName: UT cli & utilities & sync/hive-sync
+displayName: UT clients & cli & utilities & sync/hive-sync
 inputs:
 mavenPomFile: 'pom.xml'
 goals: 'test'
-options: -Punit-tests -pl hudi-cli,hudi-utilities,hudi-sync/hudi-hive-sync
+options: -Punit-tests -pl hudi-client/hudi-client-common,hudi-client/hudi-flink-client,hudi-client/hudi-java-client,hudi-cli,hudi-utilities,hudi-sync/hudi-hive-sync
 publishJUnitResults: false
 jdkVersionOption: '1.8'
 mavenOptions: '-Xmx2g $(MAVEN_OPTS)'
 - task: Maven@3
-displayName: FT cli & utilities & sync/hive-sync
+displayName: FT clients & cli & utilities & sync/hive-sync
 inputs:
 mavenPomFile: 'pom.xml'
 goals: 'test'
-options: -Pfunctional-tests -pl hudi-cli,hudi-utilities,hudi-sync/hudi-hive-sync
+options: -Pfunctional-tests -pl hudi-client/hudi-client-common,hudi-client/hudi-flink-client,hudi-client/hudi-java-client,hudi-cli,hudi-utilities,hudi-sync/hudi-hive-sync
 publishJUnitResults: false
 jdkVersionOption: '1.8'
 mavenOptions: '-Xmx2g $(MAVEN_OPTS)'
 - job: UT_FT_4
 displayName: UT FT other modules
+timeoutInMinutes: '90'
 steps:
 - task: Cache@2
 displayName: set cache
@@ -152,7 +156,7 @@ stages:
 inputs:
 mavenPomFile: 'pom.xml'
 goals: 'install'
-options: -DskipTests
+options: -T 2.5C -DskipTests
 publishJUnitResults: false
 jdkVersionOption: '1.8'
 mavenOptions: '-Xmx2g $(MAVEN_OPTS)'
@@ -161,7 +165,7 @@ stages:
 inputs:
 mavenPomFile: 'pom.xml'
 goals: 'test'
-options: -Punit-tests -pl !hudi-common,!hudi-flink,!hudi-client/hudi-spark-client,!hudi-cli,!hudi-utilities,!hudi-sync/hudi-hive-sync
+options: -Punit-tests -pl !hudi-common,!hudi-flink,!hudi-client/hudi-spark-client,!hudi-client/hudi-client-common,!hudi-client/hudi-flink-client,!hudi-client/hudi-java-client,!hudi-cli,!hudi-utilities,!hudi-sync/hudi-hive-sync
 publishJUnitResults: false
 jdkVersionOption: '1.8'
 mavenOptions: '-Xmx2g $(MAVEN_OPTS)'
@@ -170,16 +174,23 @@ stages:
 inputs:
 mavenPomFile: 'pom.xml'
 goals: 'test'
-options: -Pfunctional-tests -pl !hudi-common,!hudi-flink,!hudi-client/hudi-spark-client,!hudi-cli,!hudi-utilities,!hudi-sync/hudi-hive-sync
+options: -Pfunctional-tests -pl !hudi-common,!hudi-flink,!hudi-client/hudi-spark-client,!hudi-client/hudi-client-common,!hudi-client/hudi-flink-client,!hudi-client/hudi-java-client,!hudi-cli,!hudi-utilities,!hudi-sync/hudi-hive-sync
 publishJUnitResults: false
 jdkVersionOption: '1.8'
 mavenOptions: '-Xmx2g $(MAVEN_OPTS)'
 - job: IT
 steps:
+- task: AzureCLI@2
+displayName: Prepare for IT
+inputs:
+azureSubscription: apachehudici-service-connection
+scriptType: bash
+scriptLocation: inlineScript
+inlineScript: |
+echo 'Downloading $(SPARK_ARCHIVE)'
+az storage blob download -c ci-caches -n $(SPARK_ARCHIVE).tgz -f $(Pipeline.Workspace)/$(SPARK_ARCHIVE).tgz --account-name apachehudici
+tar -xvf $(Pipeline.Workspace)/$(SPARK_ARCHIVE).tgz -C $(Pipeline.Workspace)/
+mkdir /tmp/spark-events/
 - script: |
-echo 'Downloading spark-$(SPARK_VERSION)-bin-hadoop$(HADOOP_VERSION)'
-wget https://archive.apache.org/dist/spark/spark-$(SPARK_VERSION)/spark-$(SPARK_VERSION)-bin-hadoop$(HADOOP_VERSION).tgz -O $(Pipeline.Workspace)/spark-$(SPARK_VERSION).tgz
-tar -xvf $(Pipeline.Workspace)/spark-$(SPARK_VERSION).tgz -C $(Pipeline.Workspace)/
-mkdir /tmp/spark-events/
 mvn $(MAVEN_OPTS) -Pintegration-tests verify
 displayName: IT
26 changes: 26 additions & 0 deletions conf/hudi-defaults.conf.template
@@ -0,0 +1,26 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# Default system properties included when running Hudi jobs.
# This is useful for setting default environmental settings.

# Example:
# hoodie.datasource.hive_sync.jdbcurl jdbc:hive2://localhost:10000
# hoodie.datasource.hive_sync.use_jdbc true
# hoodie.datasource.hive_sync.support_timestamp false
# hoodie.index.type BLOOM
# hoodie.metadata.enable false
5 changes: 5 additions & 0 deletions doap_HUDI.rdf
@@ -76,6 +76,11 @@
<created>2021-08-26</created>
<revision>0.9.0</revision>
</Version>
<Version>
<name>Apache Hudi 0.10.0</name>
<created>2021-12-08</created>
<revision>0.10.0</revision>
</Version>
</release>
<repository>
<GitRepository>
2 changes: 1 addition & 1 deletion docker/hoodie/hadoop/base/pom.xml
@@ -19,7 +19,7 @@
<parent>
<artifactId>hudi-hadoop-docker</artifactId>
<groupId>org.apache.hudi</groupId>
<version>0.10.0-SNAPSHOT</version>
<version>0.11.0-SNAPSHOT</version>
</parent>
<modelVersion>4.0.0</modelVersion>
<packaging>pom</packaging>
2 changes: 1 addition & 1 deletion docker/hoodie/hadoop/datanode/pom.xml
@@ -19,7 +19,7 @@
<parent>
<artifactId>hudi-hadoop-docker</artifactId>
<groupId>org.apache.hudi</groupId>
<version>0.10.0-SNAPSHOT</version>
<version>0.11.0-SNAPSHOT</version>
</parent>
<modelVersion>4.0.0</modelVersion>
<packaging>pom</packaging>
2 changes: 1 addition & 1 deletion docker/hoodie/hadoop/historyserver/pom.xml
@@ -19,7 +19,7 @@
<parent>
<artifactId>hudi-hadoop-docker</artifactId>
<groupId>org.apache.hudi</groupId>
<version>0.10.0-SNAPSHOT</version>
<version>0.11.0-SNAPSHOT</version>
</parent>
<modelVersion>4.0.0</modelVersion>
<packaging>pom</packaging>
2 changes: 1 addition & 1 deletion docker/hoodie/hadoop/hive_base/pom.xml
@@ -19,7 +19,7 @@
<parent>
<artifactId>hudi-hadoop-docker</artifactId>
<groupId>org.apache.hudi</groupId>
<version>0.10.0-SNAPSHOT</version>
<version>0.11.0-SNAPSHOT</version>
</parent>
<modelVersion>4.0.0</modelVersion>
<packaging>pom</packaging>
2 changes: 1 addition & 1 deletion docker/hoodie/hadoop/namenode/pom.xml
@@ -19,7 +19,7 @@
<parent>
<artifactId>hudi-hadoop-docker</artifactId>
<groupId>org.apache.hudi</groupId>
<version>0.10.0-SNAPSHOT</version>
<version>0.11.0-SNAPSHOT</version>
</parent>
<modelVersion>4.0.0</modelVersion>
<packaging>pom</packaging>
4 changes: 2 additions & 2 deletions docker/hoodie/hadoop/pom.xml
@@ -19,7 +19,7 @@
<parent>
<artifactId>hudi</artifactId>
<groupId>org.apache.hudi</groupId>
<version>0.10.0-SNAPSHOT</version>
<version>0.11.0-SNAPSHOT</version>
<relativePath>../../../pom.xml</relativePath>
</parent>
<modelVersion>4.0.0</modelVersion>
@@ -54,7 +54,7 @@
<docker.hive.version>2.3.3</docker.hive.version>
<docker.hadoop.version>2.8.4</docker.hadoop.version>
<docker.presto.version>0.217</docker.presto.version>
<dockerfile.maven.version>1.4.3</dockerfile.maven.version>
<dockerfile.maven.version>1.4.13</dockerfile.maven.version>
<checkstyle.skip>true</checkstyle.skip>
<main.basedir>${project.parent.basedir}</main.basedir>
</properties>
2 changes: 1 addition & 1 deletion docker/hoodie/hadoop/prestobase/pom.xml
@@ -20,7 +20,7 @@
<parent>
<artifactId>hudi-hadoop-docker</artifactId>
<groupId>org.apache.hudi</groupId>
<version>0.10.0-SNAPSHOT</version>
<version>0.11.0-SNAPSHOT</version>
</parent>
<modelVersion>4.0.0</modelVersion>
<packaging>pom</packaging>
2 changes: 1 addition & 1 deletion docker/hoodie/hadoop/spark_base/pom.xml
@@ -19,7 +19,7 @@
<parent>
<artifactId>hudi-hadoop-docker</artifactId>
<groupId>org.apache.hudi</groupId>
<version>0.10.0-SNAPSHOT</version>
<version>0.11.0-SNAPSHOT</version>
</parent>
<modelVersion>4.0.0</modelVersion>
<packaging>pom</packaging>
2 changes: 1 addition & 1 deletion docker/hoodie/hadoop/sparkadhoc/pom.xml
@@ -19,7 +19,7 @@
<parent>
<artifactId>hudi-hadoop-docker</artifactId>
<groupId>org.apache.hudi</groupId>
<version>0.10.0-SNAPSHOT</version>
<version>0.11.0-SNAPSHOT</version>
</parent>
<modelVersion>4.0.0</modelVersion>
<packaging>pom</packaging>
2 changes: 1 addition & 1 deletion docker/hoodie/hadoop/sparkmaster/pom.xml
@@ -19,7 +19,7 @@
<parent>
<artifactId>hudi-hadoop-docker</artifactId>
<groupId>org.apache.hudi</groupId>
<version>0.10.0-SNAPSHOT</version>
<version>0.11.0-SNAPSHOT</version>
</parent>
<modelVersion>4.0.0</modelVersion>
<packaging>pom</packaging>