[core] [flink] [spark] Preserve blob view references during forwarding#7970

Merged: JingsongLi merged 2 commits into apache:master from leaves12138:blob-view-preserve-reference on May 26, 2026

@leaves12138
Contributor

Purpose

Blob view currently resolves BlobViewStruct to upstream BLOB content on reads. This makes it hard to forward one blob view table into another blob view table while preserving the original upstream reference.

This PR adds a dynamic read option to preserve blob view references during such forwarding:

  • blob-view.resolve.enabled=true keeps the existing default behavior and resolves blob view fields at read time.
  • blob-view.resolve.enabled=false skips read-time resolution and returns the original BlobViewStruct as a BlobView, so writing it into another blob-view-field stores the same upstream reference.
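The two settings can be pictured with a small toy model. This is only a sketch of the semantics described above, not the Paimon implementation: `BlobViewReadSketch`, `ViewRef`, `UPSTREAM`, and `readField` are hypothetical names invented for illustration, and only the option key and its default come from this PR description.

```java
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Toy model of the read-time switch; none of these types are Paimon classes. */
final class BlobViewReadSketch {

    // Option key as named in the PR description; "true" mirrors the stated default.
    static final String RESOLVE_KEY = "blob-view.resolve.enabled";

    /** Hypothetical stand-in for a stored reference to an upstream blob. */
    static final class ViewRef {
        final String upstreamTable;
        final int rowId;

        ViewRef(String upstreamTable, int rowId) {
            this.upstreamTable = upstreamTable;
            this.rowId = rowId;
        }
    }

    // Hypothetical upstream storage, keyed by "table#rowId".
    static final Map<String, byte[]> UPSTREAM = new HashMap<>();

    static {
        UPSTREAM.put("T0#7", "hello".getBytes(StandardCharsets.UTF_8));
    }

    /** Reads the flag from dynamic options, falling back to the default of true. */
    static boolean resolveEnabled(Map<String, String> options) {
        return Boolean.parseBoolean(options.getOrDefault(RESOLVE_KEY, "true"));
    }

    /** Returns the upstream bytes when resolving, or the untouched reference when not. */
    static Object readField(ViewRef stored, Map<String, String> options) {
        if (!resolveEnabled(options)) {
            return stored; // no upstream lookup, the reference is passed through
        }
        return UPSTREAM.get(stored.upstreamTable + "#" + stored.rowId);
    }

    public static void main(String[] args) {
        ViewRef stored = new ViewRef("T0", 7);
        Object resolved = readField(stored, Collections.emptyMap());
        Object preserved = readField(stored, Collections.singletonMap(RESOLVE_KEY, "false"));
        System.out.println(resolved instanceof byte[]); // true
        System.out.println(preserved == stored);        // true
    }
}
```

In the model, leaving the option unset behaves like `true`, which matches the statement that the existing behavior stays the default.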

Changes

  • Add CoreOptions.BLOB_VIEW_RESOLVE_ENABLED.
  • Make DataEvolutionTableRead skip blob view resolving when this option is disabled.
  • Add a test that forwards T1 blob view rows into T2 while preserving references to T0.
  • Document the option and the forwarding pattern.
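The shape of the forwarding test (T1 rows copied into T2 while still pointing at T0) might look roughly like the following toy model. It does not use the Paimon API or reproduce `BlobTableTest#testForwardBlobViewReference`; `ForwardBlobViewSketch`, `ViewRef`, `forwardPreservingReferences`, and `resolve` are hypothetical names, and the in-memory maps and lists merely stand in for the three tables.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the T0 -> T1 -> T2 forwarding pattern; not Paimon API. */
final class ForwardBlobViewSketch {

    /** Hypothetical reference from a view row to a row in an upstream table. */
    static final class ViewRef {
        final String table;
        final int rowId;

        ViewRef(String table, int rowId) {
            this.table = table;
            this.rowId = rowId;
        }
    }

    /**
     * Copies view rows unchanged. This models a read with resolution disabled:
     * the target receives the same references rather than blob bytes.
     */
    static List<ViewRef> forwardPreservingReferences(List<ViewRef> source) {
        return new ArrayList<>(source);
    }

    /** Looks up a reference in the stand-in for T0. */
    static byte[] resolve(ViewRef ref, Map<Integer, byte[]> t0) {
        if (!"T0".equals(ref.table)) {
            throw new IllegalArgumentException("unknown upstream table: " + ref.table);
        }
        return t0.get(ref.rowId);
    }

    public static void main(String[] args) {
        // T0 holds the actual blob content.
        Map<Integer, byte[]> t0 = new HashMap<>();
        t0.put(1, new byte[] {1, 2, 3});

        // T1 holds views that reference T0.
        List<ViewRef> t1 = new ArrayList<>();
        t1.add(new ViewRef("T0", 1));

        // T2 receives T1's rows with the references left intact.
        List<ViewRef> t2 = forwardPreservingReferences(t1);

        System.out.println(t2.get(0).table); // T0
        System.out.println(Arrays.equals(resolve(t2.get(0), t0), t0.get(1))); // true
    }
}
```

The point of the model is that T2 never refers to T1: a later resolving read of T2 goes straight to T0.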

Tests

  • mvn -pl paimon-format -am -DskipTests package
  • mvn -pl paimon-api,paimon-common,paimon-format -am -DskipTests install
  • mvn -pl paimon-core -Pfast-build -Dtest=BlobTableTest#testForwardBlobViewReference test
  • mvn -pl paimon-core -Pfast-build -DskipTests compile
  • mvn -pl paimon-api,paimon-core spotless:check -DskipTests
  • git diff --check

@leaves12138 marked this pull request as draft May 26, 2026 03:52
@leaves12138 force-pushed the blob-view-preserve-reference branch from c406fb8 to be01fe6 May 26, 2026 04:24
@leaves12138 force-pushed the blob-view-preserve-reference branch from be01fe6 to b8d8dd3 May 26, 2026 05:21
@leaves12138 changed the title from "[blob] Preserve blob view references during forwarding" to "[core] [flink] [spark] Preserve blob view references during forwarding" May 26, 2026
@leaves12138 marked this pull request as ready for review May 26, 2026 08:15
@leaves12138
Contributor Author

Thanks for the update. I reviewed the latest patch and the blob-view forwarding path looks good to me.

The new blob-view.resolve.enabled=false read option cleanly bypasses the blob-view resolving reader, so unresolved BlobViewStruct values can be forwarded without triggering upstream lookup or materializing the actual blob bytes. The Flink and Spark row wrappers also preserve unresolved BlobView values via Blob.serializeBlob, while keeping the existing resolved/blob-as-descriptor behavior unchanged.

The added core/Flink/Spark coverage exercises the important end-to-end forwarding case, and the current CI is green. I do not see remaining blocking issues.

Contributor

@JingsongLi left a comment

+1

@JingsongLi merged commit 386a85e into apache:master May 26, 2026
13 of 15 checks passed