fix: merge_insert uses full schema path for reordered columns #5541

wjones127 · 2025-12-18T19:38:59Z

Summary

- When `merge_insert` received columns in a different order than the dataset schema, it fell back to the slower partial schema path
- Added `ignore_field_order: true` to the schema comparison so reordered columns can use the efficient full schema path
- Added test verifying both the fast path eligibility and data correctness with reordered columns
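
To make the effect of the flag concrete, here is a minimal, self-contained sketch of an order-insensitive schema comparison. It is a toy model, not Lance's implementation: `Field`, `CompareOptions`, `fields_match`, and `schemas_match` are hypothetical names introduced for illustration, data types are plain strings, and field names are assumed to be unique within a schema.

```rust
// Toy model only: these types are hypothetical stand-ins, not Lance's API.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: String,
    nullable: bool,
}

fn field(name: &str, data_type: &str, nullable: bool) -> Field {
    Field {
        name: name.to_string(),
        data_type: data_type.to_string(),
        nullable,
    }
}

#[derive(Debug, Clone, Copy)]
struct CompareOptions {
    // Accept nullable source fields for non-nullable targets.
    ignore_nullability: bool,
    // Match fields by name instead of by position.
    ignore_field_order: bool,
}

fn fields_match(a: &Field, b: &Field, opts: CompareOptions) -> bool {
    a.name == b.name
        && a.data_type == b.data_type
        && (opts.ignore_nullability || a.nullable == b.nullable)
}

// Returns true when `source` could take the "full schema" path against `target`.
// Assumes field names are unique within each schema.
fn schemas_match(source: &[Field], target: &[Field], opts: CompareOptions) -> bool {
    if source.len() != target.len() {
        return false;
    }
    if opts.ignore_field_order {
        // Every target field must have a compatible source field with the same name.
        target
            .iter()
            .all(|t| source.iter().any(|s| fields_match(s, t, opts)))
    } else {
        // Strict mode: fields must line up position by position.
        source
            .iter()
            .zip(target)
            .all(|(s, t)| fields_match(s, t, opts))
    }
}
```

In this model, a source of `(value, id)` against a target of `(id, value)` is rejected when `ignore_field_order` is `false` and accepted when it is `true`, which mirrors the fallback described in the first bullet.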

Fixes #5323

🤖 Generated with Claude Code

codecov · 2025-12-18T21:00:53Z

Codecov Report

✅ All modified and coverable lines are covered by tests.


jackye1995 · 2025-12-18T21:26:07Z

rust/lance/src/dataset/write/merge_insert.rs

                // Allow nullable source fields for non-nullable targets.
                compare_nullability: NullabilityComparison::Ignore,
+                // Allow columns to be in a different order; they will be matched by name.
+                ignore_field_order: true,


In what case would we not want this behavior?

It doesn't seem like we assert field order anywhere right now, but it seems like a useful capability to keep. The ArrowSchema PartialEq implementation is strict on field order, and there are places, such as query plans, where field order matters.
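
As a rough illustration of why order still matters to positional consumers, the sketch below computes a name-based projection and uses it to put source values back into target order. It is an assumption-laden toy, not code from this PR: `projection_by_name` and `reorder` are hypothetical helpers, and columns are modelled as plain string names.

```rust
// Hypothetical helper: for each target column name, find its index in the source.
// Returns None if any target column is missing from the source.
fn projection_by_name(source: &[&str], target: &[&str]) -> Option<Vec<usize>> {
    target
        .iter()
        .map(|t| source.iter().position(|s| s == t))
        .collect()
}

// Hypothetical helper: rearrange one row of values from source order into target order.
fn reorder<T: Clone>(row: &[T], projection: &[usize]) -> Vec<T> {
    projection.iter().map(|&i| row[i].clone()).collect()
}
```

A consumer that reads columns by index would see the wrong values if the reordering step were skipped, which is one reason a strict, order-sensitive comparison can still be worth having alongside the relaxed one.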

When `merge_insert` received columns in a different order than the dataset schema, it would fall back to the slower partial schema path. This adds `ignore_field_order: true` to the schema comparison so that reordered columns can use the more efficient full schema path.

Fixes lance-format#5323

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

github-actions bot added the bug Something isn't working label Dec 18, 2025

wjones127 force-pushed the fix/merge-insert-order branch 2 times, most recently from cbd0f9f to 9da423e Compare December 18, 2025 20:19

wjones127 marked this pull request as ready for review December 18, 2025 20:24

jackye1995 reviewed Dec 18, 2025

jackye1995 approved these changes Dec 18, 2025

wjones127 force-pushed the fix/merge-insert-order branch from 9da423e to 2c63415 Compare December 19, 2025 00:18

wjones127 merged commit 481485f into lance-format:main Dec 19, 2025
28 of 29 checks passed
