-
Notifications
You must be signed in to change notification settings - Fork 513
fix: merge_insert uses full schema path for reordered columns #5541
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cbd0f9f to
9da423e
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
| // Allow nullable source fields for non-nullable targets. | ||
| compare_nullability: NullabilityComparison::Ignore, | ||
| // Allow columns to be in a different order; they will be matched by name. | ||
| ignore_field_order: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in what case would we not want this behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't seem like we assert field order anywhere right now, but it seems like a useful capability. ArrowSchema PartialEq implementation is strict on field order. There are places like in query plans where field order matters.
When merge_insert received columns in a different order than the dataset schema, it would fall back to the slower partial schema path. This adds `ignore_field_order: true` to the schema comparison so that reordered columns can use the more efficient full schema path. Fixes lance-format#5323 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
9da423e to
2c63415
Compare
…format#5541) ## Summary - When `merge_insert` received columns in a different order than the dataset schema, it fell back to the slower partial schema path - Added `ignore_field_order: true` to the schema comparison so reordered columns can use the efficient full schema path - Added test verifying both the fast path eligibility and data correctness with reordered columns Fixes lance-format#5323 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
## Summary - When `merge_insert` received columns in a different order than the dataset schema, it fell back to the slower partial schema path - Added `ignore_field_order: true` to the schema comparison so reordered columns can use the efficient full schema path - Added test verifying both the fast path eligibility and data correctness with reordered columns Fixes #5323 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
…format#5541) ## Summary - When `merge_insert` received columns in a different order than the dataset schema, it fell back to the slower partial schema path - Added `ignore_field_order: true` to the schema comparison so reordered columns can use the efficient full schema path - Added test verifying both the fast path eligibility and data correctness with reordered columns Fixes lance-format#5323 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Summary
merge_insertreceived columns in a different order than the dataset schema, it fell back to the slower partial schema pathignore_field_order: trueto the schema comparison so reordered columns can use the efficient full schema pathFixes #5323
🤖 Generated with Claude Code