@wjones127
Contributor

Summary

  • When merge_insert received columns in a different order than the dataset schema, it fell back to the slower partial schema path
  • Added ignore_field_order: true to the schema comparison so reordered columns can use the efficient full schema path
  • Added test verifying both the fast path eligibility and data correctness with reordered columns
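To illustrate the idea behind the flag, here is a minimal sketch of an order-insensitive schema check. The `Field`, `CompareOptions`, and `schemas_match` names are hypothetical stand-ins written for this example, not Lance's actual types or API; only the `ignore_field_order` flag name comes from this PR. The sketch assumes field names are unique within a schema and reduces data types to strings.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a schema field (not Lance's real type).
#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub data_type: String,
}

impl Field {
    pub fn new(name: &str, data_type: &str) -> Self {
        Self {
            name: name.to_string(),
            data_type: data_type.to_string(),
        }
    }
}

/// Hypothetical comparison options; only the flag name mirrors this PR.
#[derive(Debug, Clone, Copy, Default)]
pub struct CompareOptions {
    pub ignore_field_order: bool,
}

/// Returns true when `source` contains exactly the fields of `target`.
/// With `ignore_field_order`, fields are matched by name instead of position.
pub fn schemas_match(source: &[Field], target: &[Field], opts: CompareOptions) -> bool {
    if source.len() != target.len() {
        return false;
    }
    if !opts.ignore_field_order {
        // Strict comparison: same fields in the same positions.
        return source == target;
    }
    // Index both sides by name, mapping each name to its data type.
    let index = |fields: &[Field]| -> HashMap<String, String> {
        fields
            .iter()
            .map(|f| (f.name.clone(), f.data_type.clone()))
            .collect()
    };
    let source_by_name = index(source);
    let target_by_name = index(target);
    // Reject duplicate names, then require identical name -> type mappings.
    source_by_name.len() == source.len()
        && target_by_name.len() == target.len()
        && source_by_name == target_by_name
}
```

Under these assumptions, a source with columns `(value, id)` fails the strict check against a target of `(id, value)` but passes once `ignore_field_order` is set, which is the condition that would let a caller choose the full schema path.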

Fixes #5323

🤖 Generated with Claude Code

@github-actions github-actions bot added the bug Something isn't working label Dec 18, 2025
@wjones127 wjones127 force-pushed the fix/merge-insert-order branch 2 times, most recently from cbd0f9f to 9da423e Compare December 18, 2025 20:19
@wjones127 wjones127 marked this pull request as ready for review December 18, 2025 20:24
@codecov

codecov bot commented Dec 18, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

// Allow nullable source fields for non-nullable targets.
compare_nullability: NullabilityComparison::Ignore,
// Allow columns to be in a different order; they will be matched by name.
ignore_field_order: true,
Contributor

In what case would we not want this behavior?

Contributor Author

It doesn't seem like we assert field order anywhere right now, but it seems like a useful capability. The ArrowSchema PartialEq implementation is strict on field order, and there are places, such as query plans, where field order matters.
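As a rough illustration of why order still matters to positional consumers, the sketch below matches columns by name and derives the index projection needed to put source columns into target order. The `projection_by_name` and `reorder` helpers are hypothetical names written for this example and are not taken from Lance or Arrow; columns are reduced to plain name strings.

```rust
use std::collections::HashMap;

/// For each target column name, find its position in `source`.
/// Returns None if the column counts differ, a name is missing,
/// or either side contains a duplicate name.
pub fn projection_by_name(source: &[&str], target: &[&str]) -> Option<Vec<usize>> {
    if source.len() != target.len() {
        return None;
    }
    let positions: HashMap<&str, usize> = source
        .iter()
        .enumerate()
        .map(|(i, name)| (*name, i))
        .collect();
    if positions.len() != source.len() {
        return None; // duplicate name in source
    }
    let mut used = vec![false; source.len()];
    let mut projection = Vec::with_capacity(target.len());
    for name in target {
        let idx = *positions.get(name)?;
        if used[idx] {
            return None; // duplicate name in target
        }
        used[idx] = true;
        projection.push(idx);
    }
    Some(projection)
}

/// Apply a projection so that the output follows the target order.
pub fn reorder<T: Clone>(columns: &[T], projection: &[usize]) -> Vec<T> {
    projection.iter().map(|&i| columns[i].clone()).collect()
}
```

A caller that relies on positions, such as a plan node that addresses columns by index, would apply the projection first; a caller that only needs to know whether the two schemas describe the same columns can skip it.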

When merge_insert received columns in a different order than the dataset
schema, it would fall back to the slower partial schema path. This adds
`ignore_field_order: true` to the schema comparison so that reordered
columns can use the more efficient full schema path.

Fixes lance-format#5323

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@wjones127 wjones127 force-pushed the fix/merge-insert-order branch from 9da423e to 2c63415 Compare December 19, 2025 00:18
@wjones127 wjones127 merged commit 481485f into lance-format:main Dec 19, 2025
28 of 29 checks passed
wjones127 added a commit to wjones127/lance that referenced this pull request Dec 19, 2025
…format#5541)

wjones127 added a commit that referenced this pull request Dec 19, 2025
wjones127 added a commit to wjones127/lance that referenced this pull request Dec 30, 2025
…format#5541)

Development

Successfully merging this pull request may close these issues.

When all columns provided out-of-order, we should use full-schema merge_insert
