feat(charts): allow overriding of the cluster's name#148
Conversation
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (6)
📝 WalkthroughWalkthroughBoth Helm charts ( ChangesCluster Name Configurability
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request introduces the ability to override the cluster name in both the cozystack and generic Helm charts. It adds a clusterName field to values.yaml and updates the _helpers.tpl templates to use this value if provided, defaulting to the chart name. I have no feedback to provide.
47ba7e4 to
7570565
Compare
lexfrei
left a comment
There was a problem hiding this comment.
NOT LGTM — the override is wired only into the chart templates; the talm talosconfig regeneration path still reads the cluster name from Chart.yaml, and the new behaviour ships without test coverage that pins either the legacy default or the override precedence.
Business context: lets users override the cluster name without renaming the chart directory or re-running talm init --name. The previous escape hatch (--name baked into Chart.yaml at init) requires re-init; this PR makes the override mutable post-init via values.yaml.
Blockers
B1: talm talosconfig regenerates with the chart name, not the new override
File: pkg/commands/talosconfig.go:131-137 (interaction with charts/{cozystack,generic}/templates/_helpers.tpl)
When regenerateTalosconfig runs without an existing talosconfig (fresh checkout, deleted file, or first talm talosconfig after bootstrap loss), it falls back to getClusterNameFromChart(), which reads only Chart.yaml.name. The rendered Talos machine config now has cluster.clusterName: <values.clusterName>, but the regenerated talosconfig context is <Chart.Name>.
getClusterNameFromChart() at pkg/commands/talosconfig.go:194-210 parses only Chart.yaml; it never opens values.yaml. The path is reachable via the user-facing talm talosconfig command (talosconfig.go:34-82), which is the documented "regenerate after cert expiry" flow.
Impact: silent cluster-name divergence between client and server config whenever the override is in use and the regenerate path is hit.
Fix: either (a) extend getClusterNameFromChart() to also load values.yaml and prefer clusterName when non-empty, or (b) call out in the PR description and the values.yaml comment that the override is template-only and the user must keep Chart.yaml.name in sync.
B2: Missing test coverage for the new behaviour
File: pkg/engine/render_test.go
Current tests (lines 231, 292, 345, 507) only assert the clusterName: substring is present — none of them pin the rendered value. Both the legacy fallback and the new override are uncovered, and the change touches two charts (cozystack and generic) with independent template files, so a "fixed cozystack, forgot generic" regression would slip through CI. Minimum coverage expected before merge:
Backward compatibility (legacy contract — must not regress)
clusterName: ""(explicit empty in values) → rendered output containsclusterName: "<Chart.Name>". Confirms thedefaultpipeline restores the prior literal.clusterNamekey entirely absent from values → sameclusterName: "<Chart.Name>". Guards against a futurevalues.yamlcleanup dropping the key and silently breaking the fallback.
Override precedence
clusterName: "my-cluster"→ rendered output containsclusterName: "my-cluster", NOT<Chart.Name>. Confirms the override beats the default.
Edge cases that pin current behaviour
- Special characters / spaces:
clusterName: "foo bar"→ output containsclusterName: "foo bar"(quoted). Confirmsquoteactually fires. - Override equal to
Chart.Name: idempotent — same output as case 1. - YAML-ambiguous string values:
clusterName: "true"andclusterName: "123"→ output keeps them as quoted strings, not booleans/ints. Withoutquotethese would round-trip as native YAML scalars. - Whitespace-only:
clusterName: " "→ output isclusterName: " "(sprigdefaultdoes NOT treat whitespace as empty). Either pin this behaviour with a comment explaining why, or add atrimin the template — but the test must exist either way.
Both charts
Cases 1–3 must be exercised against BOTH charts/cozystack and charts/generic (the diff touches both _helpers.tpl files independently). A single shared helper executed twice with different chart roots is fine.
The existing pkg/engine/render_test.go fixtures already provide an inline-values pattern; the simplest delivery is to convert one of the existing four assertContains(t, output, "clusterName:") call sites into a table-driven subtest covering the cases above.
Non-blocking follow-ups
- The
values.yamlcomment ("Optional override for the cluster's name (defaults to Chart.Name)") understates the implication:cluster.clusterNameis baked into PKI (cert SANs) and ETCD identity at bootstrap. Changing it on a live cluster is unsafe. A one-sentence warning in the comment would prevent the typical "I'll just rename my cluster" mistake. - No DNS-1123 validation. A non-conforming string (uppercase, spaces, special chars) propagates through and fails downstream in Talos with a less obvious error. An optional Helm
failguard could surface this at template time. - README's only mention of
clusterNameis in a sample output snippet (line 154). Neither the existing--nameinit flag nor the new override are documented as user-facing knobs. Pre-existing gap, not made worse by this PR.
7570565 to
3b7c174
Compare
|
Thanks for feedback! I've addressed all comments except non-blocking 3, which seems out of scope for this work. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/commands/talosconfig.go`:
- Around line 196-204: The clusterName parsing logic is inverted: the
os.ReadFile and yaml.Unmarshal error checks are currently gating the happy path
so values.yaml is never used; update the block that reads valuesYamlPath to
first proceed only when os.ReadFile returns nil (err == nil), then call
yaml.Unmarshal on the read data and, if yaml.Unmarshal returns nil and
valuesData.ClusterName is not empty, return valuesData.ClusterName; reference
the same symbols (valuesYamlPath, valuesData struct, os.ReadFile,
yaml.Unmarshal) and ensure errors are handled/ignored only on failure so the
override behavior from values.yaml works.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 5959e22a-2547-4f2d-9934-9f20c3c2845c
📒 Files selected for processing (6)
charts/cozystack/templates/_helpers.tplcharts/cozystack/values.yamlcharts/generic/templates/_helpers.tplcharts/generic/values.yamlpkg/commands/talosconfig.gopkg/engine/render_test.go
🚧 Files skipped from review as they are similar to previous changes (4)
- charts/generic/values.yaml
- charts/cozystack/values.yaml
- charts/cozystack/templates/_helpers.tpl
- charts/generic/templates/_helpers.tpl
3b7c174 to
185d546
Compare
Signed-off-by: Colden Cullen <colden@coldencullen.com>
185d546 to
c388ee7
Compare
lexfrei
left a comment
There was a problem hiding this comment.
LGTM — both prior blockers are addressed: getClusterNameFromChart now reads values.yaml.clusterName ahead of Chart.yaml.name, and the test suite pins the override behaviour with both positive and negative cases on each chart.
Business context: lets users override the cluster name without renaming the chart directory or re-running talm init --name.
What changed since the prior review
pkg/commands/talosconfig.go:194-219—getClusterNameFromChartnow resolvesvalues.yaml.clusterNamefirst, falls back toChart.yaml.name. Closes the regenerate-path divergence flagged in the previous review.pkg/engine/render_test.go— four new tests (TestMultiDoc{Cozystack,Generic}_{Valid,Invalid}ClusterNameOverride) and four existing tests upgraded from substring-onlyclusterName:to value-pinningclusterName: "<expected>". Closes the test-coverage blocker on the chart side.charts/{cozystack,generic}/values.yaml— comment now warns that changing the value on a live cluster is dangerous (PKI / ETCD identity baked at bootstrap).charts/{cozystack,generic}/templates/_helpers.tpl— DNS-1123 regex viaregexFind ... | requiredrejects invalid names at render time.
Non-blocking follow-ups
- No unit test in
pkg/commandscovers the newvalues.yaml.clusterName > Chart.yaml.namepriority ingetClusterNameFromChart. The chart-side tests inrender_test.goexercise the template, not the regenerate-talosconfig flow. A short test that creates a tmp dir withvalues.yaml: clusterName: foo+Chart.yaml: name: barand asserts the helper returnsfoowould close the gap. - The DNS-1123 regex does not enforce the 63-character label length cap (RFC 1035). A name longer than 63 chars passes the chart validation but Talos / Kubernetes downstream may still reject it. Minor; the regex captures the practical typo cases (uppercase, underscore, leading dash, etc.) which is the high-value fraction.
…priority Add four new tests covering the values.yaml-takes-precedence behaviour added in #148. The chart-side tests in pkg/engine/render_test.go exercise the template rendering path; this test pins the regenerate-talosconfig flow which reads the cluster name through getClusterNameFromChart in pkg/commands. Coverage: - ValuesYamlOverridesChartYaml: both files exist, values.yaml has a non-empty clusterName -> values.yaml wins. The talosconfig regenerate path picks up the operator's override instead of silently using Chart.Name. - EmptyValuesClusterNameFallsBack: values.yaml present but clusterName: "" (the shipped cozystack/generic default) falls through to Chart.yaml.name. Without this short-circuit a fresh install would resolve to empty and downstream callers would substitute placeholders silently. - AbsentValuesKeyFallsBack: values.yaml has no clusterName key at all (other shape) -> falls through to Chart.yaml. yaml.Unmarshal into the typed struct yields the zero string for a missing field; pin so the resolver treats it the same as an explicit empty. - MalformedValuesFallsBack: a syntax error in values.yaml does NOT poison the lookup; the function silently moves on to Chart.yaml. Operators with a half-edited values.yaml still get a usable regenerate path. The existing ReadsTopLevelName / MissingReturnsEmpty / MalformedReturnsEmpty tests are preserved; their comments updated to clarify they exercise the Chart.yaml fallback when no values.yaml override is in play. Coverage: pkg/commands 38.0% -> 38.2%. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
…e packages (#154) * test(engine): pin cluster section contract for cozystack and generic charts Add contract_cluster_test.go documenting the user-facing semantics of every values.yaml knob that surfaces under cluster.* across the chart matrix (cozystack/generic, legacy/multidoc, controlplane/worker). The tests are intentionally repetitive across the matrix: when one combination drifts, the test name points directly at the broken cell. Each test starts with a Contract: comment describing what is pinned and why — the comment is the user-facing documentation, the body is the executable enforcement. Coverage: - Shared (both charts): clusterName defaults to Chart.Name, podSubnets default 10.244.0.0/16, serviceSubnets default 10.96.0.0/16, endpoint always quoted. - Worker invariant (both charts): no apiServer/controllerManager/ scheduler/etcd/proxy/allowSchedulingOnControlPlanes leakage. - Cozystack-only: clusterDomain=cozy.local, unconditional 127.0.0.1 in apiServer.certSANs, certSANs append behaviour, OIDC absent by default and present when issuer set, allocateNodeCIDRs default-on with cluster-cidr emission, allocateNodeCIDRs disabled drops cluster-cidr, allowSchedulingOnControlPlanes, proxy disabled, discovery disabled. - Generic-only (pin minimalism): no clusterDomain, no unconditional loopback, no cozystack-specific defaults, apiServer block emitted but empty without certSANs, certSANs appended verbatim. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> * test(engine): pin machine section contract for cozystack and generic charts Add contract_machine_test.go documenting machine.* semantics across the chart matrix. Continues the contract-as-documentation pattern from contract_cluster_test.go. Coverage: - Shared (both charts): machine.type matches template name (cp vs worker), kubelet.nodeIP.validSubnets always emitted, install.disk always emitted. - Controlplane-only: cozystack emits nodeLabels with $patch:delete on exclude-from-external-load-balancers; generic and worker never do. - Cozystack-only: kubelet extraConfig (cpuManagerPolicy:static, maxPods:512), gc_thresh sysctls trio (4096/8192/16384), vm.nr_hugepages opt-in (absent at default 0, emitted as quoted string when set), six pinned kernel modules with drbd usermode_helper=disabled parameter, machine.certSANs unconditional 127.0.0.1, two hardcoded files (containerd device-ownership patch and lvm.conf with global_filter for drbd/dm/zd), install.image default to ghcr.io/cozystack/cozystack/talos, install.image override path, registries.mirrors docker.io->mirror.gcr.io (legacy schema only — multi-doc uses RegistryMirrorConfig). - Generic-only (pin minimalism): no extraConfig, no sysctls, no kernel modules, no machine-level files, no install.image, no unconditional 127.0.0.1, no nodeLabels, no registries; certSANs appended verbatim on both machine and apiServer levels. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> * test(engine): pin schema selection contract (legacy vs multi-doc) Add contract_schema_test.go pinning how templateOptions.talosVersion controls schema selection in talos.config (_helpers.tpl). Coverage: - Empty version → legacy (default behaviour, single document with machine.network). - v1.10/v1.11/partial "1.11" → legacy. - v1.12.0+/v1.13/v2.0.0/partial "1.12" → multi-doc (HostnameConfig + ResolverConfig as separate documents joined by ---). - v1.12.0-rc/-alpha/-beta pre-releases satisfy the >=1.12.0-0 anchor and render multi-doc (cluster-bootstrap path: nodes booted off pre-release maintenance images must get the new schema). - Legacy is always exactly one document — no internal --- separator. - Multi-doc emits HostnameConfig + ResolverConfig unconditionally on controlplane regardless of discovery state. - RegistryMirrorConfig (multi-doc) emitted only on cozystack chart; generic does not ship a default mirror. - Schema selection is per-render, not cached: alternating legacy and multi-doc renders in one process produce distinct outputs. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> * test(engine): pin chart fail-path contract (error message stability) Add contract_errors_test.go pinning every `fail` and `required` directive in cozystack and generic _helpers.tpl. Error messages appear in talm template stderr and CI logs; users grep for the talm: prefix and specific substrings to debug bad inputs. Pinning the wording makes changing an error a reviewable act, not a silent break for everyone with a matching alert rule. Coverage: - endpoint required guard (cluster-wide, no auto-derive, URL example). - advertisedSubnets empty + empty discovery (names the values key, the missing default-gateway-bearing link, and operator recourse). - multi-doc + legacy machine.network.interfaces[] in running MachineConfig (lists LinkConfig/VLANConfig/BondConfig as recourse, v1.11 version pin as fallback). - multi-doc + bridge as IPv4-default link (names the link, missing BridgeConfig feature, vipLink workaround). - multi-doc + VLAN with unresolvable parent (spec.linkIndex hint). - multi-doc + VLAN with no vlanID (spec.vlan.vlanID hint). Add renderExpectingError helper that calls helm Render directly and returns the raw error so tests can assert on the message body. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> * test(engine): pin multi-doc network rendering contract Add contract_network_multidoc_test.go pinning the Talos v1.12+ multi-doc network output for both shipped charts. The renderer reconstructs network configuration from COSI discovery resources and emits one typed document per configurable link plus HostnameConfig + ResolverConfig. Coverage: - HostnameConfig: discovered hostname surfaced when 'real'; fallback to synthesized 'talos-<hash>' for placeholder names (rescue, talos, localhost, localhost.localdomain) or empty discovery. - ResolverConfig: populated from discovery dnsServers; YAML empty list fallback when resolvers unknown. - LinkConfig: single physical NIC produces one document (name + addresses + routes for gateway link). - Multi-NIC routing rule: only the gateway-bearing link's LinkConfig carries 'routes:'; non-gateway links emit addresses without routes (otherwise duplicate default routes shadow each other). - BondConfig: bondMaster fields surfaced verbatim (bondMode, xmitHashPolicy, lacpRate, miimon); slaves listed under links:. - Bond slaves: never emitted as standalone LinkConfig (configurable_link_names filters them by spec.slaveKind). - VLANConfig: emitted for VLAN with resolvable parent + vlanID; bond-as-parent supported. - Bridge non-gateway: skipped silently (no BridgeConfig emission yet). - Layer2VIPConfig discovery path: emitted on controlplane with link=<discovered-default-link>; never on worker. - Layer2VIPConfig vipLink override: emits with link=<vipLink>, suppresses discovery-derived emission, works on fresh-boot when no default link is discoverable yet. - floatingIP stripping: VIP not duplicated under LinkConfig.addresses even when discovery sees it on the link. Reuses existing lookup factories (simpleNicLookup, multiNicLookup, bondTopologyLookup, vlanOnBondTopologyLookup, bridgeLookup, freshNicLookup) — those fixtures are themselves part of the contract. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> * test(engine): pin legacy network rendering contract (machine.network) Add contract_network_legacy_test.go pinning the pre-v1.12 / Talos legacy network output for cozystack and generic charts. The legacy schema collapses everything into a single machine.network block: hostname, nameservers, and an interfaces[] list carrying addresses, routes, optional vlans[], optional bond block, and optional inline vip block. Coverage: - machine.network always emits hostname (quoted, with placeholder/ fallback parity with multi-doc) and nameservers (JSON array form). - Fresh-boot case: no discovery → no interfaces: mapping (only the '# -- Discovered interfaces:' debug comment from physical_links_info, asserted with indent-aware substring). - Plain physical NIC: emits - interface, addresses, routes (network/gateway pair). - floatingIP on controlplane → inline vip block on the discovered interface; never on worker. - VLAN: nested under parent's vlans: list with vlanId/addresses/ routes (legacy schema cannot represent a VLAN at top level). - Bond: top-level - interface: <bond>, nested bond: with interfaces (slaves), mode, and optional bondMaster fields. - vipLink override differs from default link → second interfaces[] entry with only vip:; inline vip on default suppressed (no double pinning of the same VIP). - vipLink override equal to default link → no extra entry. - Existing machine.network.interfaces[] in running MachineConfig → copied verbatim via toYaml (alphabetic key reorder is the YAML contract; chart trusts what the operator already declared). Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> * test(commands): pin CLI flag-parsing and root-detection helper contracts Add root_detection_test.go covering the pure-helper functions used during command-line argument parsing and project-root detection. These functions decide where talm finds chart files based on user- supplied -f/--file/-t/--template flags — directly observable CLI behaviour worth pinning as documentation. Coverage: - parseCommaSeparatedValues: comma-split with whitespace trim and empty-entry drop, single value, only-commas / only-whitespace edge. - parseFlagFromArgs: short and long form, space-separated and equal-sign forms, comma-separated values, absent flag, flag followed by another flag (no value), first-occurrence-wins semantics, both -f/--file and -t/--template flag pairs. - ResolveSecretsPath: empty defaults to secrets.yaml, relative paths anchored to Config.RootDir, absolute returned verbatim. - ExpandFilePaths: file passes through, directory expanded recursively to .yaml/.yml children (non-YAML siblings ignored), empty directory is an explicit error (operator typo guard), non-existent path propagated as-is so callers produce precise downstream errors. - isValidPreset: exact-match-only (case sensitive, no fuzzy match). - fileExists: thin os.Stat wrapper, true/false only. - filesDiffer: missing file differs from any content; equal content does not differ; changed content differs (drives the conditional 'do you want to overwrite?' prompt during talm init --update). Coverage rises from 16.8% to 20.8% on pkg/commands. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> * test(modeline): pin modeline parsing, file reading, and generation contract Add contract_test.go raising pkg/modeline coverage from 40.9% to 90.9%. The modeline is the first line of every per-node values file talm consumes — '# talm: nodes=[...], endpoints=[...], templates=[...]'. The format is contractual: editors highlight it, talm parses it on every apply, GenerateModeline writes it. Round-trip stability is the contract. Coverage: - ParseModeline rejects without '# talm: ' prefix (variants tested: empty, foreign comment, vim modeline, missing colon, missing space before/after talm:). - TrimSpace tolerance: indented modelines accepted (operators may indent for readability). - Malformed key-value rejected with precise errors (missing equals, non-JSON value, JSON value not an array, empty value). - Canonical separator '\, ' (comma + space) — matches GenerateModeline output, ensures generated modelines parse back. - Empty JSON arrays are valid (express 'no nodes' without dropping the key, useful for round-trip tooling). - ReadAndParseModeline: only the first line is read (subsequent lines are YAML the engine consumes); missing file error mentions path; empty file surfaces 'empty' substring; non-modeline first line returns the parse error verbatim. - GenerateModeline round-trip: all-populated, all-empty, partial, special characters in paths/URLs. - Generated modeline starts with '# talm: ' and emits keys in fixed order (nodes, endpoints, templates) for editor tooling. GenerateModeline's remaining 27% are unreachable json.Marshal error paths on []string inputs (json.Marshal cannot fail there). Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> * test(secureperm): pin atomic-write contract on Unix (extra scenarios) Add contract_unix_test.go documenting operator-observable contracts that complement existing secureperm_unix_test.go. The package writes secrets material (age private keys, secrets.yaml, talosconfig, kubeconfig) atomically with mode 0o600 — losing one of these costs a cluster PKI reissue, so the contract surface is worth pinning explicitly. Coverage: - WriteFile bytes integrity: NUL, high-bit, newlines round-trip unchanged (the helper is a byte-pipe, not a serializer). - WriteFile empty payload: zero-length file with mode 0o600. - No tmp leftover on success: the .secureperm-* sibling is renamed over the target, never left behind cluttering the project root. - No tmp leftover on failure (non-existent parent dir): cleanup contract holds even when tmp creation fails before any tmp exists. - Idempotency: three repeat writes leave one file at mode 0o600 with the expected payload (atomic overwrite path stable on existing target). - LockDown on missing file: returns os.IsNotExist error (no silent skip — silently skipping would leave secrets world-readable). - LockDown idempotency: file already at 0o600 stays at 0o600. Coverage stays at 73.1% — these are contract documentation tests, not new code paths. Remaining 27% are error returns inside WriteFile (chmod / write / sync / close / rename) that require kernel-level fault injection to exercise. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> * test(age): pin key management and YAML-value encryption contract Add contract_test.go raising pkg/age coverage from 3.5% to 62.9%. The package implements 'talm init --encrypt' / '--decrypt' flows: generate an age X25519 keypair, persist to talm.key (mode 0600), encrypt/decrypt secrets.yaml round-trip, support key rotation. Encryption is per-string-value with an 'ENC[AGE,data:<base64>]' envelope so encrypted files remain diffable in git. Coverage: - GenerateKey on empty dir produces age keygen layout: '# created:' comment, '# public key:' comment, AGE-SECRET-KEY-1 line. - GenerateKey is idempotent (load-or-create): a second call returns created=false and the same identity. - LoadKey accepts both age keygen format (with comments) and the legacy plain format (just the secret-key line) — backward-compat. - LoadKey errors precisely when the file is malformed (no AGE-SECRET-KEY marker) or missing. - GetPublicKey returns a string starting with 'age1'. - GetPublicKeyFromFile prefers the comment line and falls back to LoadKey when the comment is absent. - EncryptSecretsFile/DecryptSecretsFile round-trip integrity. - Encrypted file uses the ENC[AGE,data:...] envelope on each string value, keys remain plaintext (diffability), and the original plaintext does NOT leak into the encrypted output. - Incremental re-encryption: re-encrypting unchanged plaintext produces byte-identical ciphertext (so unchanged secrets stay byte-stable in git history). - Localized diff: changing one value re-encrypts only that key's envelope; siblings stay byte-stable. - RotateKeys preserves round-trip integrity. NOTE: the function contains a known bug (calls GenerateKey, which is load-or-create, so the on-disk key does not actually rotate). The test pins the minimum integrity guarantee and explicitly documents the bug without asserting on the broken side, to avoid locking in the bug as a contract. - EncryptYAMLFile/DecryptYAMLFile: generic file-pair version round-trips for arbitrary file names (kubeconfig, etc.). Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> * test(commands): pin project-root detection contract Add contract_root_detection_test.go pinning the rules talm uses to decide which directory is 'the project'. Walking up from a file path or CWD until both Chart.yaml AND a secrets file (secrets.yaml or secrets.encrypted.yaml) are present. Coverage: - DetectProjectRoot: direct match (Chart.yaml + secrets.yaml in same dir), encrypted variant accepted as substitute for plain secrets, walking up from a deep sub-directory, no-match returns empty string with no error (callers fall through to other strategies), Chart.yaml alone is NOT enough (rules out random helm charts on disk being mistaken for talm projects). - DetectProjectRootForFile: derives directory from file path then delegates to DetectProjectRoot. - ValidateAndDetectRootsForFiles: empty input returns ('', nil); same-root files return that root; files spanning two roots is an explicit error (talm refuses inconsistent applies); orphan file (no markers up the tree) errors with the file name. - DetectRootForTemplate: behavioral alias of DetectProjectRootForFile (pin equivalence so future divergence is intentional). Coverage rises from 20.8% to 23.4% on pkg/commands. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> * ci(workflows): add report-only coverage job Add a 'coverage' job to .github/workflows/pr.yml that runs go test -coverprofile and surfaces a per-package summary in the GitHub Actions step-summary block. Profile is also uploaded as an artifact (14-day retention) so reviewers can download and explore. The job is REPORT-ONLY — it does NOT gate merges. Rationale: a threshold gate would force every PR to also touch tests, which penalises small focused PRs (a one-line bugfix should not need a synthetic test to clear an arbitrary number). Reviewers eyeball the summary; sustained drift gets addressed in dedicated coverage commits like this branch. The summary uses an awk one-liner over 'go tool cover -func' output to collapse per-function coverage into per-package averages. The overall total comes from cover's last line. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> * test(engine): pin loadValues and mergeMaps contracts Add contract_loadvalues_test.go covering the values-aggregation pipeline that backs every 'talm template' / 'talm apply' invocation. loadValues mirrors Helm's flag precedence (-f / --set / --set-string / --set-file / --set-json / --set-literal); mergeMaps performs the deep merge. Coverage: - mergeMaps: empty overlay preserves base, top-level scalar overwrite, nested recursive merge keeping siblings, type-mismatch overwrite (map replaces scalar and vice versa, no auto-promotion), slice-replaces-not-appends contract (lists are replaced wholesale, matching Helm's --set semantics — operators relying on this for arrays of subnets/SANs must restate every element). - loadValues: -f file ordering precedence (later wins on shared keys), --set-json merges JSON object onto tree (numbers as float64), malformed JSON errors precisely, --set with dotted keys builds nested maps (strvals contract), --set-string forces string type regardless of numeric/bool appearance (required for Talos sysctls), --set-file reads file content and feeds <path>=<content> through strvals (path becomes the key when dot-free), missing --set-file path errors, missing -f file errors, malformed YAML in -f errors with file name, empty Options returns non-nil empty map, --set wins over -f, --set-string wins over --set (loadValues pipeline order). Coverage: pkg/engine 54.4% -> 59.6%. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> * test(engine): pin Render and FullConfigProcess top-level contracts Add contract_render_test.go covering the entry point that backs 'talm template' and 'talm apply'. Render glues chart load + values aggregation + helm engine + applyPatchesAndRenderConfig into a single byte stream; FullConfigProcess / InitializeConfigBundle / SerializeConfiguration are the trio that reload the bundle once machine type / cluster name / endpoint are known from patches. Coverage: - Render error surfaces: * empty TemplateFiles -> mentions --file/--template * missing template -> names the template * bad chart Root -> chart-load error * bad ValueFile -> loadValues error propagated * malformed TalosVersion -> fast-fail BEFORE chart load (pin order so a refactor that defers validation surfaces here) - Render happy path: minimal chart with one worker template -> bytes contain machine: and type: worker. - --set values reach rendered template (end-to-end values pipeline). - Render with Offline=true skips the multi-node check (the online- only FailIfMultiNodes guard). - InitializeConfigBundle: empty Options returns usable bundle with ControlPlaneCfg + WorkerCfg; bad TalosVersion errors; missing WithSecrets path errors with 'secrets bundle' substring. - SerializeConfiguration: controlplane and worker outputs differ (cluster section presence) and contain expected 'type:' lines. - FullConfigProcess: no patches uses bundle default (controlplane); patch with type:controlplane is detected; malformed patch errors via LoadPatches; bad TalosVersion errors via InitializeConfigBundle. Coverage: pkg/engine 59.6% -> 76.6%. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> * test(commands): pin Chart.yaml preset detection and .gitignore management Add contract_chart_gitignore_test.go pinning two user-facing helpers that ship with every 'talm init' invocation: readChartYamlPreset (determines the active preset from Chart.yaml dependencies) and writeGitignoreFile (manages .gitignore so a fresh 'git add .' does not commit private cluster material). Coverage: - readChartYamlPreset: * Picks first non-talm dependency as the preset (talm itself is the always-present library chart, not a preset). * Order in Chart.yaml matters — first non-talm wins. Pin so a refactor returning the LAST entry surfaces here. * No preset (talm-only deps) errors with 'preset not found'. * Missing Chart.yaml errors mentioning the file. * Malformed YAML errors (without this guard, unmarshal returns nil deps and the operator gets a misleading 'preset not found' instead of 'malformed Chart.yaml'). - writeGitignoreFile: * Creates .gitignore from scratch with the four secrets-bearing files: secrets.yaml, talosconfig, talm.key, kubeconfig. * When Config.GlobalOptions.Kubeconfig is an absolute path, only the base name lands in .gitignore (paths there are repo-relative — absolute paths would be useless). * Existing operator entries are preserved; the function appends missing required entries, never rewrites the file. * Idempotent: when all required entries are already present, no write occurs (mtime stable across repeat 'talm init' invocations). * Tolerant of annotated entries: 'secrets.yaml # never commit' and 'talosconfig#TODO' both count as present, no duplicate append. Coverage: pkg/commands 23.4% -> 26.3%. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> * test(commands): pin endpoint/kubeconfig/gitignore/file helpers contracts Add contract_helpers_test.go for the pure helpers across pkg/commands that do not require a live Talos client. Each pin captures operator-observable behaviour exposed via talm CLI flows. Coverage: - normalizeEndpoint: every variant collapses to canonical https://<host>:6443 (host plus optional protocol/port stripped, port hardcoded to 6443). Includes IPv6-in-brackets and explicit-port forms. - updateKubeconfigServer: rewrites every cluster's server field, no-op when already normalised (mtime stable), missing file errors. - addToGitignore: creates .gitignore on first call, appends new entries on top of existing operator-supplied content, idempotent on repeat with same entry, treats path-prefix matches as already- present (existing 'dist' covers a request to add 'dist'). - getClusterNameFromChart: reads top-level Chart.yaml name (NOT dependencies — distinct from readChartYamlPreset). Missing or malformed Chart.yaml returns empty string by contract (caller chains a fallback default). - updateFileWithConfirmation no-prompt happy paths: creates new file with parent directories materialised; skips silently when target exists with byte-identical content (mtime stable). The prompt path requires stdin and is tested separately. Coverage: pkg/commands 26.3% -> 29.7%. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> * test(engine): pin extractResourceData metadata and yaml-spec contract Add contract_extractresource_test.go covering the helper that projects a COSI resource (as returned by Talos's gRPC ResourceService) into the plain-data shape that helm template's lookup() function consumes — {metadata: {...}, spec: <yaml-decoded>}. Coverage: - Metadata projection: namespace, type, id, version, phase, owner surfaced as flat string-valued map. The chart helpers reference these keys verbatim — renaming any one would break every helper that does lookup().metadata.X. - Spec is yaml-decoded: scalar strings, ints, and nested maps all round-trip with their Go-typed values intact. - Empty yaml spec returns nil at .spec, no error (yaml.Unmarshal('') yields nil). Chart helpers handle missing-spec via standard 'with' / 'if .spec' patterns. - Malformed yaml surfaces a clean error so a corrupted resource on the wire fails the chart render explicitly. Test fixture uses cosiproto.Unmarshal() to construct a real *protobuf.Resource — same code path Talos's COSI client uses on the client side, so the test exercises the actual unexported-yaml-field reflection machinery that readUnexportedField wraps. Coverage: pkg/engine 76.6% -> 79.0%. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> * test(age): pin incremental encrypt edge cases Add contract_internals_test.go covering edge cases of the incremental encryption machinery (mergeAndEncryptYAMLValues + decryptYAMLValuesString) that the round-trip pin in contract_test.go does not touch. Coverage: - decryptYAMLValuesString: * envelope (ENC[AGE,data:...]) round-trip back to plaintext. * Strings without the envelope pass through verbatim (lets the merge logic feed both raw and encrypted values through one helper without a type switch at the call site). * Corrupted base64 inside an envelope errors precisely. - mergeAndEncryptYAMLValues (through EncryptSecretsFile): * New key added to plain secrets.yaml gets encrypted on the next round; existing keys keep byte-stable ciphertext. * Deeply-nested change re-encrypts only the changed leaf; siblings up and down the tree stay byte-stable. * Type change at a key (scalar -> map) falls back to full re-encrypt of that branch — the chart's incremental rule degrades gracefully when the structure shifts. * Slice length change re-encrypts the full list (per-element merge requires stable index mapping, which a length change invalidates). * Slice same-length value change re-encrypts only the changed element; siblings byte-stable. Coverage: pkg/age 62.9% -> 69.5%. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> * test(engine): clarify floatingIP-strip test contract comment Address review feedback from gemini-code-assist on pkg/engine/contract_network_multidoc_test.go:339: the previous comment claimed the test picked a VIP that did NOT match any discovery address, contradicting both the trailing comment on line 341 and the test body — which intentionally sets the VIP to the same host as the discovered address so the strip path actually fires. Comment rewritten to describe the real intent. Signed-off-by: Aleksei Sviridkin <f@lex.la> * test(engine): assert no discovery: block on worker templates Address review feedback from coderabbitai on pkg/engine/contract_cluster_test.go:227: the Contract: comment listed 'discovery' in the controlplane-only block enumeration but the test body lacked the assertion. Add assertNotContains(t, out, "discovery:") so the documented contract is also enforced executable. Signed-off-by: Aleksei Sviridkin <f@lex.la> * test(commands): preserve original Config.GlobalOptions.Kubeconfig in gitignore tests Address review feedback from coderabbitai on pkg/commands/contract_chart_gitignore_test.go:235 (and lines 246, 282): three writeGitignoreFile tests unconditionally reset Config.GlobalOptions.Kubeconfig to '' on cleanup, which would clobber any non-empty value an earlier test had set. Capture the original value into a local at test start and restore it from cleanup, matching the pattern already in TestContract_WriteGitignoreFile_CreatesWithRequiredEntries and TestContract_WriteGitignoreFile_KubeconfigBaseNameOnly. Signed-off-by: Aleksei Sviridkin <f@lex.la> * test(age): surface ReadFile errors in changed-value diff test Address review feedback from coderabbitai on pkg/age/contract_test.go:357,366: the test assigned the result of os.ReadFile to first/second with the error discarded, so a failed read would manifest as the misleading 'could not isolate a's ciphertext line' assertion later instead of the real I/O error. Add explicit error checks with t.Fatal. Signed-off-by: Aleksei Sviridkin <f@lex.la> * test(commands): document IPv6 normalizeEndpoint case as known-broken Address review feedback from coderabbitai on pkg/commands/contract_helpers_test.go:61: the IPv6 test case pinned a malformed URL (https://2001:db8::1:6443 — no brackets, not URL-parseable). Keep the case but document it as a known-bug pin via FIXME(#155); when the fix lands the expected value becomes https://[2001:db8::1]:6443 and the test starts asserting the canonical bracketed form. Signed-off-by: Aleksei Sviridkin <f@lex.la> * test(commands): pin root-detection dispatch and kubeconfig-bytes contracts Add contract_root_dispatch_test.go pinning the layer that decides which directory becomes Config.RootDir at command-startup time (file flag -> template flag -> CWD), the talosconfig-path resolution, and the in-memory kubeconfig endpoint rewriter used by rotate-CA. Coverage: - getFlagValues: non-persistent and persistent StringSlice flags, absent flag returns non-nil empty slice (so callers can range without a guard), registered-but-unset returns empty. - detectRootFromFiles / detectRootFromTemplates: empty input yields ('', nil) so callers fall through; positive case under a real project root resolves to that root's abs path. - detectRootFromCWD walks up from CWD; macOS /var/folders symlink resolution handled via filepath.EvalSymlinks fallback. - checkRootConflict: no-op when --root not explicit, no error when explicit matches detected, error names BOTH paths on conflict (guides operator to either drop --root or move files). - DetectAndSetRoot: -f flag wins over CWD; files in two different roots is an explicit error; no flags + no markers leaves Config.RootDir untouched. - DetectAndSetRootFromFiles: happy-path file under root sets Config.RootDir; empty input falls back to CWD detection; --root explicit + files in a different root errors with 'conflicting' substring. - EnsureTalosconfigPath: no-op when --talosconfig is explicit; defaults to <RootDir>/talosconfig when nothing else set; Chart.yaml-supplied relative path anchored to RootDir; absolute path preserved as-is. - updateKubeconfigEndpoint: rewrites every cluster's server: to https://<host>:6443 (drops protocol/port from input, hardcodes 6443); malformed bytes surface parse error. Each test that mutates package-level Config / GlobalArgs uses withConfigSnapshot to capture and restore prior state, so tests can run in any order without leaking. Coverage: pkg/commands 29.7% -> 35.0%. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> * test(commands): pin template render and path-resolution contracts Add contract_template_test.go pinning the talm template flow above engine.Render: resolveEngineTemplatePaths converts user-supplied paths into the forward-slash relative form helm engine indexes by; generateOutput composes the modeline + AUTOGENERATED warning banner + rendered config bytes; template(args) prints the result to stdout. Coverage: - resolveEngineTemplatePaths: * absolute path inside root is returned as forward-slash relative (helm engine map keys use forward slashes regardless of OS). * relative-from-CWD that lands inside root is returned relative. * outside-root path with matching <root>/templates/<basename> falls back to the templates/<basename> form (documented fallback so an operator running from a sibling directory still hits the canonical location). * outside-root path with no fallback returns the input normalized through forward slashes. * empty input yields a non-nil zero-length slice. - generateOutput: * happy path composes three sections in order — modeline, AUTOGENERATED warning, engine-rendered body — separated by newlines. The modeline is the first line so subsequent talm invocations against the file pick up nodes/endpoints/templates without explicit flags. * missing template surfaces 'failed to render templates' wrap. * empty templates list errors via engine.Render's 'templates are not set' guard. - template(args): prints the result of generateOutput to stdout unchanged (modeline header + warning + body); propagates the wrapped error from generateOutput without double-wrapping. Helpers: - withTemplateFlagsSnapshot saves and restores templateCmdFlags + GlobalArgs.{Nodes,Endpoints} + Config.RootDir so tests can run in any order. - makeMinimalChart writes a chart fixture with Chart.yaml, values.yaml, templates/config.yaml emitting a worker machine config patch, plus a real serialized secrets bundle so engine.Render's bundle.NewBundle path completes. - loadSharedSecretsYAML generates the secrets bundle once per test process via sync.Once — secrets.NewBundle populates a full PKI tree (~half a second), generating it per-test would dominate suite runtime. - captureStdout redirects os.Stdout to a pipe for the duration of a function, returns whatever was printed. Coverage: pkg/commands 35.0% -> 38.0%. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> * test: fix Windows-only failures in cross-OS contract tests Address Windows CI failures introduced by previous contract-test commits. None of the contracts themselves were wrong — only the test expectations baked in POSIX assumptions (forward-slash paths, LF-only multi-doc separators, and strvals.ParseInto behaviour on drive-letter paths). Fixes: - pkg/commands/root_detection_test.go: TestContract_ResolveSecretsPath builds expected values via filepath.Join so the OS-native separator matches on Windows. - pkg/commands/contract_root_dispatch_test.go: * TestContract_CheckRootConflict_ExplicitConflict: uses filepath.Abs of the path arguments for the substring assertion (Windows resolves /explicit/root to D:\explicit\root). * TestContract_EnsureTalosconfigPath_*: builds expected via filepath.Join. - pkg/engine/contract_loadvalues_test.go: TestContract_LoadValues_SetFileReadsContent skips on Windows. strvals.ParseInto interprets ':' and '\\' in Windows temp paths as separators, splitting the path into multiple keys. The contract holds; the test fixture cannot be expressed cross-platform. - pkg/engine/contract_schema_test.go: * TestContract_Schema_Versions112AndLaterRenderMultidoc matches the literal '---' token instead of '\n---\n' so CRLF-rendered output on Windows still satisfies the contract. * TestContract_Schema_LegacyIsSingleDocument scans line-by-line with TrimRight on \r so a CRLF '---' line is also caught. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> * test(commands): use drive-prefixed absolute paths for Windows CI Follow-up to the previous Windows-fix commit. The crossPlatformAbs helper in contract_root_dispatch_test.go takes care of building an absolute path that satisfies filepath.IsAbs on both POSIX and Windows: filepath.Join("\", "etc", ...) on Windows yields a backslash-rooted path that lacks a drive letter and is therefore NOT IsAbs, so ResolveSecretsPath / EnsureTalosconfigPath wrongly anchor it to RootDir. The helper prepends the volume of the current working directory (typically C: on a CI runner) so the final string is treated as absolute by the OS. Sites updated: - TestContract_CheckRootConflict_ExplicitConflict: uses crossPlatformAbs for both explicit and detected paths. - TestContract_EnsureTalosconfigPath_*: uses crossPlatformAbs for Config.RootDir and the absolute talosconfig fixture. - TestContract_ResolveSecretsPath/absolute_returned_verbatim: uses crossPlatformAbs for the absolute /etc/secrets.yaml fixture. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> * test(commands): pin getClusterNameFromChart values.yaml > Chart.yaml priority Add four new tests covering the values.yaml-takes-precedence behaviour added in #148. The chart-side tests in pkg/engine/render_test.go exercise the template rendering path; this test pins the regenerate-talosconfig flow which reads the cluster name through getClusterNameFromChart in pkg/commands. Coverage: - ValuesYamlOverridesChartYaml: both files exist, values.yaml has a non-empty clusterName -> values.yaml wins. The talosconfig regenerate path picks up the operator's override instead of silently using Chart.Name. - EmptyValuesClusterNameFallsBack: values.yaml present but clusterName: "" (the shipped cozystack/generic default) falls through to Chart.yaml.name. Without this short-circuit a fresh install would resolve to empty and downstream callers would substitute placeholders silently. - AbsentValuesKeyFallsBack: values.yaml has no clusterName key at all (other shape) -> falls through to Chart.yaml. yaml.Unmarshal into the typed struct yields the zero string for a missing field; pin so the resolver treats it the same as an explicit empty. - MalformedValuesFallsBack: a syntax error in values.yaml does NOT poison the lookup; the function silently moves on to Chart.yaml. Operators with a half-edited values.yaml still get a usable regenerate path. The existing ReadsTopLevelName / MissingReturnsEmpty / MalformedReturnsEmpty tests are preserved; their comments updated to clarify they exercise the Chart.yaml fallback when no values.yaml override is in play. Coverage: pkg/commands 38.0% -> 38.2%. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> --------- Signed-off-by: Aleksei Sviridkin <f@lex.la>
|
Thank you! |
This allows users to override the generated cluster name in the case they don't want it to match the chart name.
Summary by CodeRabbit
New Features
Configuration
Tests