docs: Add release 1.2.0 documentation by yihua · Pull Request #18880 · apache/hudi

yihua · 2026-05-29T07:35:42Z

Describe the issue this Pull Request addresses

Cuts the 1.2.0 documentation by snapshotting the current docs into versioned_docs/version-1.2.0 and making 1.2.0 the default version in docusaurus.config.js. Also refreshes hardcoded Hudi versions in code examples (1.0.x / 1.1.1 → 1.2.0) within the 1.2.0 docs.

Release-page redirects continue to point at release-1.1; they will be flipped to release-1.2 in the follow-up PR that adds releases/release-1.2.md.

Summary and Changelog

Add versioned_docs/version-1.2.0 + sidebar, register 1.2.0 in versions.json
Set 1.2.0 as the default released version in docusaurus.config.js
Bump hardcoded Maven / bundle / Dataproc examples to 1.2.0 in the 1.2.0 docs

Impact

As above

Risk Level

none

Documentation Update

Release 1.2.0 documentation

Contributor's checklist

Read through contributor's guide
Enough context is provided in the sections above
Adequate tests were added if applicable

Snapshot the current docs into versioned_docs/version-1.2.0 and make 1.2.0 the default in docusaurus.config.js. Bump release-1.1 redirect targets to release-1.2 and refresh hardcoded Hudi versions (1.0.x/1.1.1) in code examples to 1.2.0.

hudi-agent

🤖 This review was generated by an AI agent and may contain mistakes. Please verify any suggestions before applying.

Thanks for the contribution! This PR snapshots the current docs into versioned_docs/version-1.2.0 and bumps hardcoded versions, registering 1.2.0 as the default, while also introducing new 1.2.0 content around Lance, vector search, BLOB/VARIANT types, HMS 4.x JDBC fallback, and new Flink/Spark configs. The version bumps are mostly consistent, but a few stale version strings appear to have been missed (e.g., hudi-gcp-bundle-0.13.0.jar, hudi-flink-bundle_2.11-0.9.0.jar, the CLI table-version mapping), and there are some typos, a leftover author TODO, and minor inconsistencies in the new feature descriptions worth a cleanup pass. Please take a look at the inline comments, and this should be ready for a Hudi committer or PMC member to take it from here.

hudi-agent · 2026-05-29T07:45:39Z

+|----------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| Flink global RLI with native MDT access            | `index.type=GLOBAL_RECORD_LEVEL_INDEX`<br />`metadata.enabled=true`<br />`index.global.enabled=true`<br />`index.bootstrap.enabled=false`<br />`hoodie.metadata.record.level.index.defer.init=false`                                                                                                                                                                                                                                      | Flink reads global record locations directly from the metadata table and uses the in-task RLI cache for recently accessed keys. Use this when a shared metadata-table index is preferred over task-local RocksDB state. |
+| Flink global RLI with local RocksDB cache          | `index.type=GLOBAL_RECORD_LEVEL_INDEX`<br />`metadata.enabled=true`<br />`index.global.enabled=true`<br />`index.bootstrap.enabled=true`<br />`index.bootstrap.rocksdb.path=<local-path>`<br />`hoodie.metadata.record.level.index.defer.init=false`                                                                                                                                                                                     | Flink bootstraps the global RLI into a local RocksDB backend. This can reduce repeated metadata-table lookups at the cost of local disk usage and bootstrap time.                                                       |
+| Dynamic bucket scaling with partitioned RLI        | `index.type=RECORD_LEVEL_INDEX`<br />`metadata.enabled=true`<br />`index.global.enabled=false`<br />`hoodie.metadata.record.level.index.min.filegroup.count=<initial-file-groups-per-partition>`<br />`hoodie.metadata.record.level.index.max.filegroup.count=<max-file-groups-per-partition>`<br />Optionally tune `index.rli.cache.size` and `index.rli.cache.concurrent.partitions.num` for the partition cache. | Flink uses partition-scoped RLI to route existing keys to their recorded file group and assigns new keys through dynamic bucket assignment. This supports streaming upsert and insert overwrite workloads.              |


🤖 The default column for index.bootstrap.enabled says false (Optional), but the description says "If not explicitly set for global RLI, Flink enables bootstrap by default." These appear contradictory. The companion table in ingestion_flink.md simply lists the default as false. Could you clarify whether the actual runtime default flips to true for GLOBAL_RECORD_LEVEL_INDEX, or whether the description is overstating it? Otherwise users wiring this up for the first time may set it explicitly only because they're unsure.
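For readers comparing the three rows of the table quoted above, the sketch below writes them as plain option maps. The option names and values are copied from the table. The helper names, the dict layout, the `min <= max` sanity check, and the rendering into Flink SQL `WITH`-style `'key' = 'value'` pairs are illustrative assumptions, not part of Hudi. The sketch does not answer the question about the runtime default of `index.bootstrap.enabled`; it sets that option explicitly in both global profiles.

```python
# Option names/values come from the quoted table; everything else is hypothetical.
COMMON_GLOBAL_RLI = {
    "index.type": "GLOBAL_RECORD_LEVEL_INDEX",
    "metadata.enabled": "true",
    "index.global.enabled": "true",
    "hoodie.metadata.record.level.index.defer.init": "false",
}


def global_rli_native_mdt():
    """Global RLI read directly from the metadata table (bootstrap disabled)."""
    return {**COMMON_GLOBAL_RLI, "index.bootstrap.enabled": "false"}


def global_rli_rocksdb(local_path):
    """Global RLI bootstrapped into a local RocksDB backend."""
    return {
        **COMMON_GLOBAL_RLI,
        "index.bootstrap.enabled": "true",
        "index.bootstrap.rocksdb.path": local_path,
    }


def partitioned_rli(min_file_groups, max_file_groups):
    """Partition-scoped RLI used with dynamic bucket scaling."""
    # Assumed sanity check; the table does not state how Hudi validates this.
    if min_file_groups > max_file_groups:
        raise ValueError("min file group count exceeds max")
    return {
        "index.type": "RECORD_LEVEL_INDEX",
        "metadata.enabled": "true",
        "index.global.enabled": "false",
        "hoodie.metadata.record.level.index.min.filegroup.count": str(min_file_groups),
        "hoodie.metadata.record.level.index.max.filegroup.count": str(max_file_groups),
    }


def to_with_clause(options):
    """Render options as 'key' = 'value' pairs, one per line, sorted by key."""
    return ",\n".join(f"  '{k}' = '{v}'" for k, v in sorted(options.items()))
```

Setting `index.bootstrap.enabled` explicitly in both global profiles sidesteps the ambiguity raised in the comment, at the cost of exactly the defensive configuration the reviewer hopes users can avoid once the documented default is clarified.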

AI-generated; verify before applying. React 👍/👎 to flag quality.

hudi-agent · 2026-05-29T07:45:39Z

+
+|  Option Name  | Description | Default | Remarks |
+|  -----------  | -------  | ------- | ------- |
+| `write.buffer.disruptor.ring.size` | Size of the Disruptor ring buffer (must be a power of 2) | `16384` | Larger values absorb write bursts but consume more heap memory |
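The power-of-2 requirement on `write.buffer.disruptor.ring.size` is easy to check before submitting a job. The following is a minimal sketch with hypothetical helper names. Whether Hudi rejects or rounds a non-conforming value is not stated in the quoted table, so the sketch only validates a candidate and suggests the next valid size.

```python
def is_power_of_two(n: int) -> bool:
    """True when n is a positive integer with exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0


def next_power_of_two(n: int) -> int:
    """Smallest power of two that is greater than or equal to n (n >= 1)."""
    if n < 1:
        raise ValueError("ring size must be at least 1")
    return 1 << (n - 1).bit_length()


# The documented default, 16384, is 2**14 and therefore already valid.
DEFAULT_RING_SIZE = 16384
```

A candidate such as 10000 would not satisfy the constraint; the helper suggests 16384 instead.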


🤖 The wording "As of Hudi 1.2.0, Flink writers support TIMELINE_SERVER_BASED marker type" reads as if this is a brand-new capability in 1.2.0. TIMELINE_SERVER_BASED markers themselves have been around since 0.9.0. Is the intent to call out that Flink specifically gained support in 1.2.0, or is this restating an existing feature? @danny0405 could you confirm when Flink writers gained TIMELINE_SERVER_BASED marker support so this can be worded accurately?

hudi-agent · 2026-05-29T07:45:40Z

+a query vector under the chosen distance metric.
+
+```sql
+SELECT *


🤖 Minor wording cleanup: secondary keys stores in the secondary index partition → secondary keys stored in the secondary index partition. Also two lines later, All the different index types can be queries by specifying... → queried.

hudi-agent · 2026-05-29T07:45:40Z

+-- COW
+CREATE TABLE my_ai_table (
+    id        STRING,
+    embedding VECTOR(768),


🤖 The Lance CREATE TABLE example sets hoodie.record.merger.impls = 'org.apache.hudi.DefaultSparkRecordMerger'. This config typically accepts a comma-separated list of merger class names. Is DefaultSparkRecordMerger the correct FQCN required here, and is it a required property for Lance-backed tables, or only recommended? A short sentence clarifying why this needs to be set explicitly (and that it's a single value here, not the full list) would help users not familiar with the merger API. @yihua could you confirm the recommended value here?

hudi-agent · 2026-05-29T07:45:40Z

+hoodie.datasource.hive_sync.jdbcurl=jdbc:hive2://hiveserver:10000
+hoodie.datasource.hive_sync.username=<username>
+hoodie.datasource.hive_sync.password=<password>
+```


🤖 The HMS 4.x JDBC fallback section says fallback only happens for callers that go through Thrift first, and the flag never resets within a HoodieHiveSyncClient instance. Should we explicitly note here whether long-running writers (e.g., Hudi Streamer continuous mode) will re-probe Thrift on each sync run, or whether the same sync-client instance is reused across runs and therefore stays in JDBC-fallback mode for the lifetime of the process? Users tuning continuous pipelines on HMS 4.x will want to know.
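The distinction raised here can be made concrete with a toy model. The sketch below is not Hudi code: `StickyFallbackClient`, `ThriftUnavailable`, and the two callables are hypothetical stand-ins that only mirror the behavior described in the comment (Thrift is tried first, and the fallback flag never resets within one client instance).

```python
class ThriftUnavailable(Exception):
    """Hypothetical error standing in for a failed Thrift call."""


class StickyFallbackClient:
    """Toy model: try Thrift first, then stay on JDBC for the instance lifetime."""

    def __init__(self, thrift_call, jdbc_call):
        self._thrift_call = thrift_call
        self._jdbc_call = jdbc_call
        self.use_jdbc = False  # never reset once set

    def sync(self):
        if not self.use_jdbc:
            try:
                return self._thrift_call()
            except ThriftUnavailable:
                # Remember the failure so later syncs skip the Thrift probe.
                self.use_jdbc = True
        return self._jdbc_call()
```

Under this model, a long-running writer that reuses one client probes Thrift once and then stays in JDBC mode, while a writer that builds a new client per sync run probes Thrift every time. Which of the two applies to Hudi Streamer continuous mode is the open question in the comment above.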

docs: cut 1.2.0 docs

94e9bd6

yihua changed the title ~~[DOCS] Cut 1.2.0 docs~~ docs: Cut release 1.2.0 documentation May 29, 2026

yihua changed the title ~~docs: Cut release 1.2.0 documentation~~ docs: Add release 1.2.0 documentation May 29, 2026

Keep release redirects at release-1.1 until 1.2 release notes land

ecab379

github-actions bot added the docs and size:XL (PR with lines of changes > 1000) labels May 29, 2026

voonhous approved these changes May 29, 2026

hudi-agent reviewed May 29, 2026

yihua added 2 commits May 29, 2026 01:03

Address review comments: typos, broken links, bundle versions, version mappings

445f3db

Fix typo in skip_compaction remarks (created for tables created by)

b9c03bd

yihua mentioned this pull request May 29, 2026

docs: Fix typos and stale references in current docs #18881

Merged

3 tasks

Make skip_compaction remarks less verbose

76bd9f3

yihua merged commit 51a185c into apache:asf-site May 29, 2026
1 check passed

yihua merged 5 commits into apache:asf-site from yihua:cut-1.2.0-docs
