[Trino] Add MoR read tests for delete markers and custom record payloads #18898

@voonhous

Follow-up from the RFC-105 Trino plugin PR #18837 (review thread on HudiTrinoReaderContext.getRecordMerger).

HudiTrinoReaderContext.getRecordMerger now dispatches on RecordMergeMode, mirroring HoodieAvroReaderContext:

  • EVENT_TIME_ORDERING -> HoodieAvroRecordMerger
  • COMMIT_TIME_ORDERING -> OverwriteWithLatestMerger
  • CUSTOM -> HoodieRecordUtils.createValidRecordMerger(...)
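For reference, the mapping above can be sketched as a standalone toy. This is not the code in HudiTrinoReaderContext: the nested `RecordMergeMode` enum is a local stand-in, the merger class names are returned as plain strings, and `customMergerName` is a hypothetical placeholder for whatever `HoodieRecordUtils.createValidRecordMerger(...)` would resolve.

```java
import java.util.Objects;

/** Toy stand-in for the merge-mode dispatch described above; not the Hudi implementation. */
public final class MergeDispatchSketch {

    /** Local mirror of the three modes named in this issue (assumption: no other modes). */
    public enum RecordMergeMode { EVENT_TIME_ORDERING, COMMIT_TIME_ORDERING, CUSTOM }

    /**
     * Returns the name of the merger each mode maps to, per the list above.
     * customMergerName is a hypothetical placeholder for the result of
     * HoodieRecordUtils.createValidRecordMerger(...); it is only read for CUSTOM.
     */
    public static String mergerFor(RecordMergeMode mode, String customMergerName) {
        Objects.requireNonNull(mode, "mode");
        switch (mode) {
            case EVENT_TIME_ORDERING:
                return "HoodieAvroRecordMerger";
            case COMMIT_TIME_ORDERING:
                return "OverwriteWithLatestMerger";
            case CUSTOM:
                return Objects.requireNonNull(customMergerName, "customMergerName");
            default:
                throw new IllegalArgumentException("Unknown merge mode: " + mode);
        }
    }

    public static void main(String[] args) {
        // Print the mapping for every mode; "MyCustomMerger" is an illustrative name.
        for (RecordMergeMode mode : RecordMergeMode.values()) {
            System.out.println(mode + " -> " + mergerFor(mode, "MyCustomMerger"));
        }
    }
}
```

The point of the sketch is only that each mode selects a different merger, so a test suite that never leaves the default path cannot tell a correct dispatch from a broken one.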

The existing MoR snapshot / _rt tests all use the default payload, for which event-time merge results are identical regardless of which merger is selected. They therefore do not exercise the paths where the merger choice actually matters:

  1. Delete markers (_hoodie_is_deleted / delete records in log files) on MoR snapshot reads - combineAndGetUpdateValue propagates deletes, preCombine does not.
  2. Custom record payloads (e.g. PartialUpdateAvroPayload, OverwriteNonDefaultsWithLatestAvroPayload, AWSDmsAvroPayload) where combineAndGetUpdateValue differs from preCombine.
  3. COMMIT_TIME_ORDERING MoR tables - verify OverwriteWithLatestMerger semantics (latest write wins).
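To make the gap concrete, here is a minimal toy model of why these cases need their own fixtures. It is a sketch under stated assumptions, not Hudi code: `Rec`, `commitTimeMerge`, `eventTimeMerge`, and `visibleValue` are hypothetical names, the tie-breaking rule in `eventTimeMerge` is an assumption, and the model ignores payload-specific logic entirely.

```java
import java.util.Optional;

/** Toy model of merge outcomes on a MoR snapshot read; all names here are hypothetical. */
public final class MergeSemanticsSketch {

    /** Simplified record: key, ordering (event-time) value, payload value, delete flag. */
    public static final class Rec {
        final String key;
        final long orderingValue;
        final String value;
        final boolean deleted;

        public Rec(String key, long orderingValue, String value, boolean deleted) {
            this.key = key;
            this.orderingValue = orderingValue;
            this.value = value;
            this.deleted = deleted;
        }
    }

    /** Commit-time ordering: the later write always wins. */
    public static Rec commitTimeMerge(Rec older, Rec newer) {
        return newer;
    }

    /** Event-time ordering: the higher ordering value wins (assumption: ties go to the later write). */
    public static Rec eventTimeMerge(Rec older, Rec newer) {
        return newer.orderingValue >= older.orderingValue ? newer : older;
    }

    /** What a snapshot query would return for the key: nothing if the winner is a delete marker. */
    public static Optional<String> visibleValue(Rec winner) {
        return winner.deleted ? Optional.empty() : Optional.of(winner.value);
    }

    public static void main(String[] args) {
        Rec base = new Rec("k1", 10L, "base", false);
        // A log record written later but carrying an older event time.
        Rec lateArrival = new Rec("k1", 5L, "late", false);
        // A delete marker written later with a newer event time.
        Rec deleteMarker = new Rec("k1", 20L, null, true);

        System.out.println(visibleValue(commitTimeMerge(base, lateArrival)));
        System.out.println(visibleValue(eventTimeMerge(base, lateArrival)));
        System.out.println(visibleValue(eventTimeMerge(base, deleteMarker)));
    }
}
```

In this model the two ordering modes only disagree when a later write carries an older ordering value, and a delete only shows up if the merge path honors the marker. Fixtures for the new tests would need at least one late-arriving update and one delete record per table to tell the mergers apart.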

Add functional tests (preferably Trino smoke-test level against registered MoR tables) covering these so the merge-mode dispatch is verified end-to-end.

Tracked inline via a TODO in HudiTrinoReaderContext.getRecordMerger.
