fix: align reordered shared-dim coords by label in merge (#550)#758
Merged
Conversation
…550) §8 aligns by label, not position, so the same labels in a different order are the same coordinate. The constant path already reindexed a pure reorder, but the expression-merge path raised under v1 and silently misaligned under legacy. merge() now conforms shared user dims to the first operand's order before the §8/§11 checks, so a reorder reindexes (correct under both semantics) while a differing label set still raises. Aux coords ride along the reindex, so §11 conflicts are preserved. Spec: convention.md §8 now states order-independence explicitly and is retitled "Shared dimensions must carry the same labels" (was framed as xarray's order-sensitive `exact`). Supersedes #550. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
4 tasks
…pass The reorder fix added a second walk over the shared user dims (one to reindex reordered coords, one to detect a genuine mismatch), duplicating the per-dim .equals() work on every join=None merge — the hot path during model building. conform_merge_dims does both in a single pass and replaces merge_shared_user_coord_mismatch + the separate conform helper. Behaviour is unchanged (full suite 6476 passed under both semantics). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ultiIndex paths Regression guards for the §8 by-label alignment beyond the 2-operand case: multi-operand merge([a,b,c]) pairs by label, quadratic merge aligns reordered dims, and a reordered stacked MultiIndex raises (xarray cannot reindex it by tuple — left to §11; tied to the #744 MultiIndex-storage decision). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…11 gap) A reordered full MultiIndex is "the same coordinate spelled differently" and per §8/§11 should align — but reindex cannot reorder a stacked MI by tuple, so it previously fell through to a confusing §11 aux-coord raise. conform_merge_dims now permutes via positional isel using get_indexer, which works uniformly for a plain index and a MultiIndex's tuples (and get_indexer doubles as the same-set test, replacing the set() comparison — cheaper). A genuinely different label set still raises the §8 mismatch. convention.md §11 now states order-independence for the full-MI case explicitly. Only the MultiIndex case was affected; a plain dim with aux coords already reordered correctly (the aux coord rides along the permute). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A shared-dim label mismatch is the root cause, so report it as a dim conflict rather than letting the §11 aux check fire first — a different stacked MultiIndex otherwise surfaced as its level coords conflicting (the wrong message). Aux conflicts still raise once the dims agree. Adds a routing test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The reorder tests reached through the internal term storage (.coeffs.coords) for coordinates that the expression exposes directly via .indexes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Per the transitioning contract, legacy must not change: reordering coords on a shared dim was always positional in expression merges (the constant path, by contrast, has always aligned labelled operands by label — that asymmetry is genuine legacy and is what v1 unifies). conform_merge_dims now permutes only under v1; under legacy it leaves the operands positional and the caller warns with a reorder-specific message (v1 would align by label, a different result). Tests: the align cases are now @pytest.mark.v1; added legacy guards for the positional result and the full warning text. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Improving a hole in the semantic convention
Note
This PR — description and code — was generated with Claude Code.
What & why
§8 of the v1 convention aligns by label, not position — so the same labels in a different order are the same coordinate. The constant path already reindexed a pure reorder, but the expression-
merge()path raised under v1 and silently misaligned under legacy (the #550 bug).merge()now conforms reordered shared dims to the first operand's order before the §8 / §11 checks:Spec:
convention.md§8 now states order-independence explicitly and is retitled "Shared dimensions must carry the same labels" (it was framed as xarray's order-sensitiveexact).Supersedes #550 (same bug; that PR auto-reindexed only the merge path on master — this fixes both paths consistently and specs the rule).
Tests
merge()(runs under both semantics)Full suite 6476 passed / 562 skipped under both semantics; mypy + pre-commit clean.