refactor(blob): refactor BlobFileContext to honor the actual schema when classifying BLOB fields #332

Open
lxy-9602 wants to merge 4 commits into alibaba:main from lxy-9602:fix-blob-file-context

Conversation

lxy-9602 (Collaborator) commented Jun 2, 2026

Purpose

Linked issue: #283

Classify BLOB fields by intersecting the option-configured fields with the columns actually present in the given schema, so partial (projected) schemas no longer pull in absent fields. Also removes the unused Is*Field query helpers.
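For readers skimming the description, the intersection step can be sketched roughly as follows. This is a simplified illustration only, not the code in this PR: the helper name `IntersectWithSchema` and its parameters are hypothetical, and plain `std::set<std::string>` stands in for the Arrow schema lookup.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (not part of the actual BlobFileContext API):
// keep only the option-configured field names that are also BLOB
// columns present in the given (possibly projected) schema.
std::set<std::string> IntersectWithSchema(
    const std::vector<std::string>& configured_fields,
    const std::set<std::string>& schema_blob_fields) {
    std::set<std::string> result;
    for (const auto& name : configured_fields) {
        // A configured field that is absent from the schema is skipped,
        // so a projected schema never pulls it in.
        if (schema_blob_fields.count(name) > 0) {
            result.insert(name);
        }
    }
    return result;
}
```

With options naming fields `a` and `b` and a projected schema that contains only `a`, the sketch returns just `a`, which is the behavior the partial-schema tests are meant to cover.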

Tests

BlobFileContextTest.PartialSchemaIgnoresAbsentFields
BlobFileContextTest.PartialSchemaWithOnlyBlobFileField

API and Format

Documentation

Generative AI tooling

Generated-by: Claude-4.8-Opus

Copilot AI left a comment

Pull request overview

This PR refactors BlobFileContext::Create to classify BLOB fields based on the intersection of (a) option-configured field lists and (b) the BLOB columns actually present in the provided (possibly projected) Arrow schema, preventing absent columns from being incorrectly treated as inline/view/descriptor/external-storage fields. It also removes now-unused per-field query helpers and updates unit tests accordingly.

Changes:

  • Filter descriptor/view/inline/external-storage field sets to only include BLOB fields present in the provided schema.
  • Derive blob_file_fields from “schema BLOB fields minus inline fields” (schema-aware).
  • Remove Is*Field query helpers and adjust/add tests for partial-schema behavior.
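The "schema BLOB fields minus inline fields" derivation in the second bullet amounts to a set difference. A minimal sketch, assuming both sets are held as `std::set<std::string>`; the function name `DeriveBlobFileFields` is hypothetical, and the actual implementation in `blob_file_context.cpp` may be structured differently:

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>

// Hypothetical illustration: blob-file fields are the BLOB columns present
// in the schema that are not classified as inline.
std::set<std::string> DeriveBlobFileFields(
    const std::set<std::string>& schema_blob_fields,
    const std::set<std::string>& inline_fields) {
    std::set<std::string> blob_file_fields;
    // std::set iterates in sorted order, which std::set_difference requires.
    std::set_difference(schema_blob_fields.begin(), schema_blob_fields.end(),
                        inline_fields.begin(), inline_fields.end(),
                        std::inserter(blob_file_fields, blob_file_fields.end()));
    return blob_file_fields;
}
```

Because the left operand is taken from the schema, an inline field that is absent from a projected schema cannot leak into the result.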

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/paimon/core/operation/blob_file_context.h | Removes unused Is*Field helpers from the public API. |
| src/paimon/core/operation/blob_file_context.cpp | Makes field classification schema-aware by intersecting option lists with BLOB fields present in the given schema. |
| src/paimon/core/operation/blob_file_context_test.cpp | Updates tests to use set getters and adds coverage for projected/partial schema scenarios. |
