[python] Add row kind support for TableRead #7394
Merged
JingsongLi merged 1 commit into apache:master on Mar 11, 2026
Add include_row_kind option to TableRead that prepends a _row_kind string column to Arrow output. For RecordBatchReader (append-only tables) all rows default to "+I"; for RowIterator (primary-key tables) row kind is read per-row via OffsetRow.get_row_kind().

The feature is opt-in (include_row_kind=False by default) so existing read paths are unaffected. StreamReadBuilder in the next PR will enable it for changelog/streaming reads.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
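To make the described behavior concrete, here is a minimal sketch of prepending a `_row_kind` column to a batch. It is not the code from this PR: the `prepend_row_kind` helper is hypothetical, and a plain dict of column lists stands in for an Arrow record batch so the example runs without any dependencies. Only the column name and the "+I" default come from the description above.

```python
from typing import Dict, List, Optional

ROW_KIND_COLUMN = "_row_kind"  # column name taken from the PR description
DEFAULT_ROW_KIND = "+I"        # default for append-only batches, per the PR


def prepend_row_kind(
    batch: Dict[str, List],
    row_kinds: Optional[List[str]] = None,
) -> Dict[str, List]:
    """Return a new column-oriented batch with _row_kind as the first column.

    `batch` is a hypothetical stand-in for an Arrow record batch: a dict that
    maps column names to equally sized lists. If `row_kinds` is None, every
    row is labelled "+I", mirroring the append-only default.
    """
    num_rows = len(next(iter(batch.values()))) if batch else 0
    if row_kinds is None:
        row_kinds = [DEFAULT_ROW_KIND] * num_rows
    if len(row_kinds) != num_rows:
        raise ValueError("row_kinds must have one entry per row")
    # Dicts preserve insertion order, so _row_kind ends up first.
    return {ROW_KIND_COLUMN: list(row_kinds), **batch}


# Append-only style: no per-row kinds available, so everything is "+I".
append_only = prepend_row_kind({"id": [1, 2], "name": ["a", "b"]})

# Primary-key style: kinds are supplied per row by the caller.
primary_key = prepend_row_kind({"id": [1, 1]}, ["-U", "+U"])
```

The input batch is left unmodified, which matches the opt-in intent: callers that never ask for the extra column see the same data as before.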
Force-pushed from e87c3bf to a1681ac.
Contributor: +1
Summary
Exposes row_kind from TableRead so that users can perform different actions depending on the type of change happening.
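As an illustration of acting on the change type, the sketch below folds a stream of changes into an in-memory view keyed by primary key. The `apply_changes` function and the `(row_kind, key, value)` tuple layout are assumptions made for this example, not part of the pypaimon API; the four row kind strings are the ones listed in this PR.

```python
from typing import Dict, Iterable, Tuple

# (row_kind, primary_key, value): a hypothetical row layout for illustration.
ChangeRow = Tuple[str, int, str]


def apply_changes(state: Dict[int, str], rows: Iterable[ChangeRow]) -> Dict[int, str]:
    """Apply changelog rows to `state` in order and return it."""
    for row_kind, key, value in rows:
        if row_kind in ("+I", "+U"):
            # Insert, or the new image of an update: upsert the value.
            state[key] = value
        elif row_kind in ("-D", "-U"):
            # Delete, or the old image of an update: drop the key if present.
            state.pop(key, None)
        else:
            raise ValueError(f"unknown row kind: {row_kind!r}")
    return state


changes = [
    ("+I", 1, "a"),
    ("+I", 2, "b"),
    ("-U", 1, "a"),  # old image of the update to key 1
    ("+U", 1, "c"),  # new image of the update to key 1
    ("-D", 2, "b"),
]
view = apply_changes({}, changes)
```

After these five changes the view holds only key 1 with its updated value.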
- Adds include_row_kind parameter to TableRead for streaming change tracking
- Prepends a _row_kind string column (+I, -D, +U, -U) to Arrow batches when enabled
- Supports both RecordBatchReader (default +I) and OffsetRow-based readers (from RowKind)

Stacked PR series
This is PR 1b part 2 in the Python streaming read series:
- AsyncStreamingTableScan, consumer management ([python] Add StreamReadBuilder and AsyncStreamingTableScan #7350)
- paimon tail ([python] Add paimon tail CLI for streaming table reads #7351)

Incremental diff (vs 1b): tub/paimon@python-streaming-1b-scanners...tub:paimon:python-streaming-1b2-row-kind
Test plan
- flake8 passes
- python -m pytest passes

LLM generated, human reviewed
Generated using Claude, tested manually and iterated upon by a human :)