Skip to content

Fix some issues around merge key/anchor handling #2400

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 24 commits into from
Jul 23, 2025

Conversation

stevenwdv
Copy link
Contributor

@stevenwdv stevenwdv commented Jun 13, 2025

This PR fixes multiple issues related to merge keys (!!merge <<):

  • Allow inline maps instead of just aliases (e.g. {<<: {a: 42}}, {<<: [{a: 42}]}), both for traversing and exploding (closes Merge key with inline map broken #2386)
  • Allow aliases to sequences to be used (e.g. {s: &s [{a: 42}], m: {<<: *s}}), both for traversing and exploding (closes Explode with merge aliases throw an error #2178)
  • Disallow nested sequences (e.g. {<<: [[{a: 42}]]})
  • Disallow other (inline) types (e.g. {<<: 42})
  • However: Ignore invalid merge keys when traversing maps. Previously cases like {a: &a 42, m: {<<: *a}} were reported but cases like {m: {<<: 42}} were not. I chose to ignore all errors because otherwise traversing such a map (e.g. to correct types) becomes impossible.
  • Errors are not ignored when using explode. Previously explode would just leave out invalid merge keys. Another potential solution would be to leave them as-is.
  • Fix precedence of merge anchor sequence for traverse (explode was already correct), now it correctly lets earlier keys overwrite later keys according to the spec.
  • Fix excessive exploding with merge keys, e.g. previously echo '{b: &b {a: &a 42}, r: *a, c: {<<: *b}}' | yq 'explode(.c)' would yield {b: &b {a: 42}, r: *a, c: {a: 42}}, making r: *a a dangling reference. Now it yields {b: &b {a: &a 42}, r: *a, c: {a: &a 42}}.
  • For exploding, detect merge keys using tag (!!merge) rather than value (<<).

I do still have a question: I saw in operator_traverse_path_test.go that goccy would reject at least some invalid merge types at parse time, which might make using custom tags like <<: !include file.yaml (see #2386) impossible. Am I right about that? And does that mean that this usage will not be supported in the future? If so, that would be a bit sad. See also the note above about ignoring invalid merge keys.

- Allow inline maps instead of just aliases
- Disallow nested sequences
- Disallow other types

Closes mikefarah#2386
@stevenwdv stevenwdv marked this pull request as draft June 13, 2025 19:35
@stevenwdv
Copy link
Contributor Author

Wait, this actually doesn't fix explode(.) yet. Converting to draft for now, I'll do that later.

- Allow inline maps instead of just aliases
- Allow aliased sequences
- Disallow other types
- Use tag `!!merge` instead of key `<<`
- Fix insertion index for sequence merge

Closes mikefarah#2386
@stevenwdv stevenwdv changed the title Fix merge anchor traversing Fix merge anchor handling Jun 16, 2025
@stevenwdv stevenwdv marked this pull request as ready for review June 16, 2025 12:57
@stevenwdv stevenwdv changed the title Fix merge anchor handling Fix merge key/anchor handling Jun 16, 2025
@stevenwdv stevenwdv changed the title Fix merge key/anchor handling Fix issues around merge key/anchor handling Jun 16, 2025
@stevenwdv stevenwdv changed the title Fix issues around merge key/anchor handling Fix some issues around merge key/anchor handling Jun 16, 2025
@stevenwdv
Copy link
Contributor Author

stevenwdv commented Jun 16, 2025

EDIT: nvm fixed it


Ok, the excessive exploding fix I mentioned above may actually cause problems of its own:
echo '{a: {b: &b 42}, <<: {c: *b}}' | yq 'explode(.)' now returns {a: {b: 42}, c: *b}, which also has a dangling alias...

However, I don't really know how to solve this, because it depends on if the mapping node to which the merge anchor refers is also being exploded, e.g. with {r: &r {...}, m: {<<: *r}} we don't want to explode .r, although we might make a copy and then explode it.

And apparently this issue is already present when partially exploding: echo '{a: {b: &b 42}, c: *b}' | yq 'explode(.a)' in the latest release v4.45.4 returns {a: {b: 42}, c: *b}, which also has a dangling alias. Is that a bug or intentional?

@mikefarah
Copy link
Owner

Minor comment, need more time to review this and consider if there are any side effects - thanks for the contribution though!

document: mergeDocSample,
expression: `.foobarList.thing`,
expected: []string{
"D0, P[bar thing], (!!str)::bar_thing\n",
"D0, P[foo thing], (!!str)::foo_thing\n",
Copy link
Owner

@mikefarah mikefarah Jul 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oooh nice find - this isn't to the YAML spec and it's works opposite to explode 🤦🏼 but this is a breaking change... and will cause all sorts of issues for people :/ Hmmmm

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to go through with this PR if either; this fix is not included, or it's controlled via a new flag, maybe --yaml-correct-merge-order or something, defaulting to false. Should have a warning in there if the flag is not set and the logic is executed.

Then later with enough notice given, I can update the default flag value to true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A flag may be a good idea, I'll have a look at it later. Maybe the same could be done for #2110.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah good idea 👍🏼

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it could be done under the same flag, I'm happy to work on 2110 and adding a flag, then perhaps you can add the if check into this PR?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a flag --yaml-fix-merge-anchor-to-spec available as a check in code as ConfiguredYamlPreferences.FixMergeAnchorToSpec. If you can add this for the fix you've done in traverse that'd be ace 👍🏼 You can see how I've tested true/false in the operator_anchor_aliases_test.go

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I put the fixed overriding behind the flag. You missed fixing #2110 for traversing (only exploding), but I could fix it both for local keys (#2110) & list merge keys at once anyway.

I also noticed that after 6e8cc00 it'll warn about overriding behavior with list merge anchor keys as well (e.g. {<<: [{a: 1}, {a: 2}]}), while this behavior is and was already correct (explode(.).a = 1), see the TODOs I added. Maybe you know of a good fix? Or add 'unless...' to the warning..?

Copy link
Contributor Author

@stevenwdv stevenwdv Jul 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I just remembered, in the main branch the flag currently actually breaks list merge anchors for explode... 🙃

Edit: added a testcase for this

Copy link
Contributor Author

@stevenwdv stevenwdv Jul 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized that the way I fixed #2110 for operator_traverse_path actually breaks overriding keys in regular maps... :( I added a failing test.

I really wonder what the rationale behind the ordering in the merge keys spec was..

if !ConfiguredYamlPreferences.FixMergeAnchorToSpec {
//TODO This also fires in when an earlier element in a list merge anchor overwrites a later element, which *is* to the spec
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ooh darn so this means I broke that behavior? I thought I would have had a failing test somewhere :/

@@ -122,12 +142,22 @@ var anchorOperatorScenarios = []expressionScenario{
expression: ".[4] | explode(.)",
expected: []string{"D0, P[4], (!!map)::r: 10\nx: 1\ny: 2\n"},
},
//TODO The following 2 tests warn about overwriting [3].r not being to spec while they shouldn't
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see...thanks for making this super clear! Will fix...somehow

@mikefarah
Copy link
Owner

This is really good - I think you have a clearer picture of how merge works better than I do 😅 - I'm going to go over it again later just to double check it as it's a sensitive part of the code.

@stevenwdv
Copy link
Contributor Author

Thanks for having another look! Did you see my comments in the thread above as well? Currently I've actually broken key overriding in regular maps for traversing, and I don't have a good solution yet...

@stevenwdv
Copy link
Contributor Author

stevenwdv commented Jul 17, 2025

Ah spelling is also wrong I see, didn't know this was a bri'ish codebase ;)

Edit: apparently the codebase it not entirely consistent, but anyway

@mikefarah
Copy link
Owner

mikefarah commented Jul 19, 2025

Ok I think to properly fix explode I need to take a step further and rewrite the reconstructAliasedMap - which I've done in a branch MakeExplodeGreatAgain and included your extra tests 👍🏼 I've also changed the warning to just be a generic you haven't set the flag warning, rather than trying to specify which maps are affected.

Tempted to merge it into master - it'll no doubt conflict with what you've got here though :/

@stevenwdv
Copy link
Contributor Author

Ok I think to properly fix explode I need to take a step further and rewrite the reconstructAliasedMap - which I've done in a branch MakeExplodeGreatAgain

Nice, I'll merge it in here as well

@stevenwdv
Copy link
Contributor Author

stevenwdv commented Jul 20, 2025

@mikefarah I notice that fixedReconstructAliasedMap in MakeExplodeGreatAgain is lacking a lot of logic that is present in reconstructAliasedMap. Compare:

$ echo '{e: &empty {}, b: &b 42, a: {b2: *b}, <<: *empty}' | ./yq 'explode(.)'
[WARNING] 14:18:06 --yaml-fix-merge-anchor-to-spec is false; causing merge anchors to override the existing values which isn't to the yaml spec. This flag will default to true in late 2025.
{e: {}, b: 42, a: {b2: 42}}

vs

$ echo '{e: &empty {}, b: &b 42, a: {b2: *b}, <<: *empty}' | ./yq --yaml-fix-merge-anchor-to-spec 'explode(.)'
{e: &empty {}, b: &b 42, a: {b2: *b}}

This is because it's not using overrideEntry. It's also not using Context, but I'm actually not sure what the exact purpose of that is.

This may also break aliases (*b is invalid now):

$ echo '{b: &b 42, m: {e: &empty {}, a: {b2: *b}, <<: *empty}}' | ./yq --yaml-fix-merge-anchor-to-spec 'explode(.)'
{b: 42, m: {e: &empty {}, a: {b2: *b}}}

I'm seeing if I can quickly fix it in the merge.

…e-anchor-fix

# Conflicts:
#	pkg/yqlib/doc/operators/anchor-and-alias-operators.md
#	pkg/yqlib/operator_anchors_aliases.go
#	pkg/yqlib/operator_anchors_aliases_test.go
…anchor-fix

# Conflicts:
#	pkg/yqlib/doc/operators/anchor-and-alias-operators.md
#	pkg/yqlib/operator_anchors_aliases_test.go
@stevenwdv
Copy link
Contributor Author

stevenwdv commented Jul 20, 2025

I'm seeing if I can quickly fix it in the merge.

I thought of a way to quickly solve the issue in a different way, by evaluating merge keys first, I hope you don't mind.

I'm now looking at the issue in my code I mentioned above.

@stevenwdv
Copy link
Contributor Author

stevenwdv commented Jul 20, 2025

Ok, it should be all done now, I hope! Well, maybe the tests could be cleaned up a bit / made more consistent between traverse & explode.

Btw, I thought of some interesting edge cases, where I don't know what the intended way to handle them would be, which again is probably one of the reasons merge anchors are not part of the recent spec anymore: the more I know about them, the less sense it makes to me to use regular keys for behavior like this.

(current behavior in bold)

  • {<<: {a: 42}, <<: {b: 42}}: {a: 42, b: 42}/{b: 42}?
    Arguably it should be the latter, since after de-duplication only the second merge key remains
  • {<<: {a: 42}, !!str <<: {b: 42}}: {a: 42, !!str <<: {b: 42}}/{!!str <<: {b: 42}}?
    I think this is right, since the keys have different types
  • {<<: {a: 42}, !!merge mykey: {a: 37, b: 42}}: {a: 42, b: 42}/{a: 37, b: 42}/invalid?

Not that other YAML tools handle them in a good way; yamllint.com with input {<<: {a: 42}, !!str <<: {b: 42}} gives {a: 42, b: 42} (as did yq previously; with explode, not traverse).

@mikefarah
Copy link
Owner

Ah the reason I was processing like that was to preserve key order on merging. Not a big issue though, this is looking good! Thanks for PR!

@mikefarah
Copy link
Owner

Hmm - only problem with the way you're processing explode is that they key order is mangled if merge is processed first :/ Not a massive issues I guess...though I might see if I can fix that

@stevenwdv
Copy link
Contributor Author

True, although YAML does act like merge keys come first, so it kinda makes sense in a way

@mikefarah mikefarah merged commit 70ac3d6 into mikefarah:master Jul 23, 2025
3 of 4 checks passed
@stevenwdv
Copy link
Contributor Author

stevenwdv commented Jul 25, 2025

Hey @mikefarah, I'm really confused because it seems like something went really wrong merging this; a lot of changes didn't make it and some logic from MakeExplodeGreatAgain that I changed it in this MR was merged. E.g.

  • fixedReconstructAliasedMap re-appeared
  • !!str << is broken again
  • Inline merge anchors are broken without --yaml-fix-merge-anchor-to-spec while I fixed those for both cases
  • Nested merge anchors ({<<: {<<: {a: 42}}}) are broken even with the fix
  • Probably more

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Merge key with inline map broken Explode with merge aliases throw an error
2 participants