feat: add custom_metadata support to RecordBatch with IPC read/write #9445

Open
rustyconover wants to merge 1 commit into apache:main from rustyconover:feat/recordbatch-custom-metadata
@rustyconover
Which issue does this PR close?

What changes are included in this PR?

Add per-batch custom_metadata to RecordBatch, matching the custom_metadata field on the IPC Message flatbuffer envelope. This allows attaching per-batch metadata separate from schema-level metadata, bringing parity with PyArrow's write_batch(custom_metadata=...) API (available since PyArrow v11.0.0).

Changes:

  • Add custom_metadata: HashMap<String, String> field to RecordBatch with custom_metadata(), custom_metadata_mut(), with_custom_metadata(), and into_parts_with_custom_metadata() accessors
  • IPC writer: serialize custom_metadata to Message flatbuffer
  • IPC reader: extract custom_metadata from Message at FileDecoder, StreamReader, and StreamDecoder call sites
  • arrow-flight: extract and propagate custom_metadata in flight_data_to_arrow_batch
  • arrow-select: preserve custom_metadata through filter_record_batch and take_record_batch
  • Metadata preserved through slice(), project(), normalize(), with_schema(), and remove_column()
  • PyArrow-generated test data for cross-language interop validation
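
The accessor and preservation behavior listed above can be illustrated with a small standalone sketch. This is not the arrow-rs implementation: `ToyBatch` is a hypothetical stand-in type with a single `i32` column, and the method signatures are assumptions modeled on the accessor names in this PR. It only shows the intended semantics, namely that per-batch metadata is set on the batch itself and is carried along by an operation such as `slice()`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a record batch: one column of i32 values
/// plus per-batch metadata. This is NOT the arrow-rs `RecordBatch` type.
#[derive(Debug, Clone, PartialEq)]
struct ToyBatch {
    values: Vec<i32>,
    custom_metadata: HashMap<String, String>,
}

impl ToyBatch {
    /// Create a batch with empty per-batch metadata.
    fn new(values: Vec<i32>) -> Self {
        Self { values, custom_metadata: HashMap::new() }
    }

    /// Builder-style setter (signature assumed from the name in the PR).
    fn with_custom_metadata(mut self, metadata: HashMap<String, String>) -> Self {
        self.custom_metadata = metadata;
        self
    }

    /// Read-only access to the per-batch metadata.
    fn custom_metadata(&self) -> &HashMap<String, String> {
        &self.custom_metadata
    }

    /// Mutable access to the per-batch metadata.
    fn custom_metadata_mut(&mut self) -> &mut HashMap<String, String> {
        &mut self.custom_metadata
    }

    /// Take `length` rows starting at `offset`, copying the metadata along.
    fn slice(&self, offset: usize, length: usize) -> Self {
        Self {
            values: self.values[offset..offset + length].to_vec(),
            custom_metadata: self.custom_metadata.clone(),
        }
    }
}

/// Build a batch, attach metadata, and return a slice of it.
fn example() -> ToyBatch {
    let mut batch = ToyBatch::new(vec![1, 2, 3, 4]).with_custom_metadata(HashMap::from([(
        "source".to_string(),
        "sensor-a".to_string(),
    )]));
    batch
        .custom_metadata_mut()
        .insert("seq".to_string(), "7".to_string());
    batch.slice(1, 2)
}
```

In this sketch the sliced batch keeps both metadata entries while holding only the selected rows. The metadata lives on the batch and not on the schema, which is the distinction the PR description draws.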

Are these changes tested?

Yes, the PR includes tests, along with PyArrow-generated test data for cross-language interop validation.

Are there any user-facing changes?

The new RecordBatch accessors listed above are user-facing. There are no breaking changes.

Written with AI assistance; all changes reviewed by the author.

@github-actions bot added the arrow (Changes to the arrow crate) and arrow-flight (Changes to the arrow-flight crate) labels on Feb 20, 2026
@rustyconover
Author

This actually doesn't have anything to do with arrow-flight

Labels

arrow (Changes to the arrow crate), arrow-flight (Changes to the arrow-flight crate)

Development

Successfully merging this pull request may close these issues.

Support per-batch custom_metadata on RecordBatch (IPC Message field)