Skip to content

fix(engine): idempotent talm apply for object-array and merge:replace fields#139

Merged
lexfrei merged 6 commits into
mainfrom
fix/object-array-dedup-on-apply
May 7, 2026
Merged

fix(engine): idempotent talm apply for object-array and merge:replace fields#139
lexfrei merged 6 commits into
mainfrom
fix/object-array-dedup-on-apply

Conversation

@lexfrei
Copy link
Copy Markdown
Contributor

@lexfrei lexfrei commented May 7, 2026

What changed

talm apply now preserves rendered entries when the user re-states or partially edits per-node config. The engine's pre-merge prune was correctly deduplicating primitive arrays at the top level, but had two gaps that landed as silent data loss across repeated applies:

  • Object-array nested primitives were appended on every round-trip. When the upstream patcher matches an object-array element by identity (interface, vlanId, name) and recurses inside, nested primitive arrays (interface addresses, vlan addresses, admission-control exemption namespaces) appended rather than replaced — every apply doubled them.
  • merge:"replace" fields were silently overwritten on partial edit. The prune subtracted rendered entries from body, reducing body to just the user's deltas; the upstream replace then wrote those deltas as the entire final value, losing rendered's other entries.

Why

Both regressions are reachable through the standard talm template -I workflow that writes the rendered template back as the per-node body.

How

  • Extended the prune to descend into object-array elements matched by their identity field (mirroring upstream's body-driven switch).
  • Added a path-based skip for merge:"replace" fields so body reaches the patcher verbatim.
  • Table-drift tests lock both lookups against the upstream surface.

Tests

Eleven new integration tests + unit tests for the identity selector and table contracts. README documents the round-trip dedup contract.

Closes #138.
Closes #77.
Closes #28.

@lexfrei lexfrei marked this pull request as ready for review May 7, 2026 15:41
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces idempotent apply logic to prevent entry duplication during repeated configuration runs. The core change involves enhancing the pruning mechanism to recursively descend into object arrays using identity-based matching (e.g., for network interfaces and VLANs) and skipping pruning for paths marked with merge:"replace" to avoid data loss on partial edits. The PR includes comprehensive tests covering these new behaviors and updates the documentation to reflect the idempotent apply process. I have no feedback to provide.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 7, 2026

Review Change Stack

Warning

Rate limit exceeded

@lexfrei has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 55 minutes and 38 seconds before requesting another review.

To continue reviewing without waiting, purchase usage credits in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7b57a20e-e2b2-4e17-bff2-c8f503865ba0

📥 Commits

Reviewing files that changed from the base of the PR and between dc4877e and 29f645c.

📒 Files selected for processing (4)
  • README.md
  • pkg/engine/engine.go
  • pkg/engine/engine_test.go
  • pkg/engine/render_test.go
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/object-array-dedup-on-apply

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

lexfrei added 6 commits May 7, 2026 18:48
Add four MergeFileAsPatch subtests covering the scenarios where Talos's
strategic merge appends primitive lists nested inside object arrays:

- machine.network.interfaces[].addresses + routes (partial edit)
- machine.network.interfaces[].vlans[].addresses (nested)
- cluster.apiServer.admissionControl[].configuration.exemptions.namespaces
- regression-safety: body adds a new interface, rendered interface's
  addresses must not duplicate

All four currently fail. The matched object-array element (by interface,
vlanId, or name) reaches configpatcher.Apply with rendered-side
primitives still present; strategic merge appends them, doubling every
rendered entry per apply round-trip.

Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>
Extend pruneIdenticalKeys to descend into object-array elements
matched by their identity field, recurse through the matched pair,
drop items reduced to nothing, and re-attach identity keys that the
inner deep-equal pass stripped so the upstream merge can still match.

Without this descent, configpatcher.Apply matches an object-array
element by its identity field upstream (interface, vlanId, name) and
recurses inside, but the inner primitive arrays nested in the matched
element append rather than replace. Every apply round-trip then
doubles every interface address, route, vlan address, and
admissionControl exemption namespace.

Identity fields are looked up by YAML path against a static table
covering the v1alpha1 List types whose custom Merge method matches
by identity AND whose elements would mishandle inner-primitive
append on partial edit:

  - machine/network/interfaces — NetworkDeviceList.Merge by interface
    or deviceSelector (body-driven switch).
  - machine/network/interfaces/vlans — VlanList.Merge by vlanId.
  - cluster/apiServer/admissionControl — AdmissionPluginConfigList.Merge
    by name.

Other object arrays (extraVolumes, kernel.modules, wireguard.peers,
authorizationConfig, ...) have no custom upstream Merge; the patcher
appends body to rendered, so re-attaching an identity key would not
avoid the append — those paths fall through to the deep-equal
fallback in matchObjectArrayItem, which drops body items that
byte-equal a rendered counterpart and covers the dominant
full-restate case.

ConfigFileList (typed ExtensionServiceConfig.configFiles, identity
mountPath) is intentionally omitted: ConfigFile carries only string
fields, so the upstream merge.Merge field-by-field plus the
deep-equal fallback already produce the right result without a
path-keyed identity match.

Object arrays without a single primary key (most notably
machine/network/interfaces[].routes) sit in the same fallback bucket
— two items are the same only if every field matches, the only
"same item" semantic the schema admits when no key is declared.

Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>
Add three MergeFileAsPatch subtests covering the v1alpha1 fields
tagged `merge:"replace"` upstream:

- cluster.network.podSubnets
- cluster.network.serviceSubnets
- cluster.apiServer.auditPolicy (Unstructured map)

For these fields the upstream patcher overwrites rendered's value
with body's verbatim — unless body is the zero value, in which case
rendered survives. The current pruneIdenticalKeysAt subtracts
rendered-side primitives from body's slice (and recursively strips
deep-equal keys from body's map), reducing body to just the user's
deltas. The upstream replace then writes those deltas as the entire
final value, and rendered's other entries / map keys silently
vanish — partial-edit on podSubnets that adds a CIDR ends up losing
the original; partial-edit on auditPolicy that adds a rule loses the
rendered apiVersion/kind/other rules. All three currently fail.

Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>
Add replaceSemanticPaths — the YAML paths in the v1alpha1 schema where
the upstream merge layer is annotated `merge:"replace"`. At those
paths the patcher overwrites rendered's value with body's verbatim
(unless body is zero-valued), so any prune that subtracts entries from
body's primitive list, recurses into body's map, or descends into body's
object array reduces body to the user's deltas only — the upstream
replace then writes those deltas as the final value and rendered's
other entries / map keys silently vanish.

Skip the partial-edit branches at those paths so body reaches the
upstream merge intact. The deep-equal short-circuit at the top of
pruneIdenticalKeysAt is still safe: when body byte-equals rendered,
deleting the body key reduces body to the zero value at that path and
the upstream replace leaves rendered untouched.

Three v1alpha1 fields are tagged today:

  - cluster.network.podSubnets
  - cluster.network.serviceSubnets
  - cluster.apiServer.auditPolicy

Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>
Repeated talm apply runs against an already-configured node do not
duplicate entries: the engine prunes from the body every primitive list
entry the rendered template already carries and every object-array
element that deep-equals a rendered counterpart, descending into pairs
that the upstream patcher merges by identity to dedupe their inner
primitive lists too.

Pin this in the per-node patches section so users running 'talm template
-I' do not have to discover from the test suite that the contract
exists.

Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>
Add coverage for behavioural contracts that are otherwise only
exercised indirectly through the existing integration tests:

- TestHasIdentityValue pins the boundary cases of the body-driven
  identity selector (empty string, empty map, empty slice, nil, zero
  int, bool) so a future change cannot collapse a body without
  identity onto the wrong rendered element.

- TestObjectArrayMergeKeysMatchesUpstreamMergerSurface and
  TestReplaceSemanticPathsMatchesUpstreamReplaceTags lock both lookup
  tables against the upstream merger / merge-tag surface so a stray
  addition or omission surfaces here at test time rather than as a
  silent idempotence regression.

- NetworkDefaultActionConfig.ingress integration test pins that the
  flat path-based replace-skip is harmless when a different typed-doc
  kind shares the bare path 'ingress' with NetworkRuleConfig: the
  collision is benign as long as the colliding field stays scalar.

- NetworkRuleConfig multi-doc pairing test pins that the documentIdentity
  tuple includes 'name', so two NetworkRuleConfig docs with different
  names are not collapsed by the prune.

- 'interface body without identity field reaches merge intact' test
  pins that hasIdentityValue rejection of empty fields prevents a
  body item with no interface or deviceSelector from being matched
  onto the wrong rendered element.

Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>
@lexfrei lexfrei force-pushed the fix/object-array-dedup-on-apply branch from 0de4081 to 29f645c Compare May 7, 2026 15:49
@lexfrei lexfrei merged commit 892c543 into main May 7, 2026
5 checks passed
@lexfrei lexfrei deleted the fix/object-array-dedup-on-apply branch May 7, 2026 15:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants