feat: formalize source repository layout (spec 016, v2.0.0)#28
Conversation
Specifies the formalized CoreOps source repository layout (spec 016): payload-kind directories (quadlet/, systemd/, config/), optional service.yaml with config-root, host overlay mirroring service shape, core-ops skill install (.agents/skills/), reserved name rules, and a hard cut from the legacy quadlets/ parser. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Spec 016 source repository layout: Phase 1-3 implementation landed.
- /speckit.plan, /speckit.tasks, /speckit.analyze artifacts (plan.md,
research.md, data-model.md, quickstart.md, tasks.md, contracts/).
data-model.md was rewritten mid-implementation to preserve the existing
planner contract; see decision drawer 016-data-model-scope-correction.
- spec.md: FR-002 amended to permit recursive subdirectories under config/
(analyze finding F6).
- src/core/errors.rs: LayoutError enum (10 variants, miette::Diagnostic).
- src/core/types.rs: layout_version: Option<String> on
DesiredStateProvenance; config_root: String on ServiceDefinition.
- src/io/repo.rs: full parser rewrite (1107 -> 1393 LOC). Legacy artifact
rejection (FR-012); reserved-name validation (FR-009); ServiceManifest
deserializer (FR-006/007); HostYaml deny_unknown_fields (FR-008);
PayloadKind + read_payload_units + read_payload_dropins (FR-003/004);
read_config_files with path-traversal rejection (FR-010/002 incl.
recursive subdirs); walk_host_service_overlay rejecting base units
(FR-005, FR-018). Planner contract preserved.
- src/io/state.rs: persists layout_version as Some("1").
- changes/016-source-repository-layout.md: release_intent: major.
- Mechanical updates to integration + unit tests for new struct fields.
cargo build, cargo build --tests, and cargo clippy --all-targets
-- -D warnings all green. cargo test deferred until T118 deletes the
legacy layered_overrides fixture and its 6 dependent test files.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Authors the minimal reference source repository for spec 016 layout v1: one service (whoami) with no service.yaml, a Quadlet *.container, a single config file, and a host (example-host) selecting it. Backs US1 acceptance scenario 1 and demonstrates the default-config-root rule (absent service.yaml -> config-root equals svc-id). The example is parser-loadable as-is; T105 will exercise it via tests/integration/test_source_repo_layout.rs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reference fixture for spec 016 layout v1: a service whose identifier differs from its deployment target. service.yaml declares config-root: traefik so that services/traefik-dnschallenge/config/ deploys to /etc/traefik/, not /etc/traefik-dnschallenge/. Backs US1 acceptance scenario 2. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reference fixture for spec 016 layout v1: a single service spanning both quadlet/ and systemd/ payload-kind directories, with a base drop-in on each parent unit. Demonstrates multi-payload-kind dispatch (FR-003/FR-004) and lex-sorted drop-in order (FR-014). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reference fixture for spec 016 layout v1: a base service with one container and one config file, plus a host that contributes a drop-in addition (under <unit>.<ext>.d/) and a whole-file config/ replacement. Demonstrates the host-overlay shape (FR-005), drop-in merge order (FR-014, base lex then host lex), and the config whole-file replacement semantic. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reverts 8460d8d. The MemPalace plugin and skill have been removed in favor of plain file-based handoffs under ~/.claude/projects/<repo-slug>/memory/. The pyproject.toml / uv.lock / .python-version / .envrc venv-activation lines existed only to host the mempalace CLI inside this Rust repo's .venv and are now dead weight. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds validate_config_destination_conflicts in src/io/repo.rs, called from both load_layered_repo and load_layered_desired_state after the host overlay set is filtered. Scans ConfigFileSource entries from every selected service for target_path collisions, then scans the overlay set independently. Host overrides intentionally win over base files at the same target (override semantics) and are not treated as conflicts; cross-checking is therefore deliberately omitted. Quadlet / native unit name collisions remain caught downstream by validate_workloads via DuplicateUnitName. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
FR-016 mandates that when host.yaml selects a service whose directory does not exist, the diagnostic names both the host identifier and the missing service identifier. The previous message named only the service. Reformat validate_service_selection's error to include the host id and a pointer to the expected services/<svc>/ directory. Single consumer (validation.rs itself); no callers asserted on the old string. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds tests/integration/test_source_repo_layout.rs with 15 #[test] functions exercising the formalized layout parser: - 4 happy-path tests, one per in-tree example fixture, asserting key unit names and managed_config_paths. - FR-009 reserved-name rejection (services/_admin). - FR-010 nested-config rooting invariant. The literal `..` defense in read_config_files is unreachable through a real filesystem walk; this test documents that and exercises the integration-level invariant FR-010 protects (paths stay rooted under /etc/<root>/). - FR-011 destination conflict (two services with the same config-root colliding on a config file). - FR-012 legacy artifact rejection: top-level quadlets/ and services/<svc>/quadlet-overrides/. - FR-013 orphan drop-in (parent unit absent). - FR-014/FR-015 determinism: load twice, compare every field driving the planner (repository_ref differs by design — internal TempDir per call — so it's excluded from the diff). - FR-016 missing-service diagnostic asserts both host id and service id appear in the message. - FR-017 ×2: malformed service.yaml unknown key + parse error. - FR-018 host overlay base-unit rejection. Each example is materialised via copy_dir_recursive into a TempDir + git init/commit so load_desired_state's clone path is exercised. CORE_OPS_HOST is overridden via HostGuard under path_lock() to coordinate with sibling integration tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ts (T118)
Removes tests/fixtures/layered_overrides/ and the 3 integration tests
that fed off it. Extracts the helpers that test_source_repo_layout
introduced (HostGuard, materialize_skeleton, materialize_example,
git_init_commit, write_host_yaml, load_with_host) into a shared
tests/integration/source_repo_support.rs module — three callers
(plus T105) is past the inline-duplication threshold.
Per-file outcome:
- tests/integration/test_overlay_validation.rs: deleted. Its single
test (orphan drop-in detection) is subsumed by
test_source_repo_layout::orphan_dropin_rejected.
- tests/integration/test_host_overrides.rs: rewritten with an inline
fixture exercising base→host drop-in ordering on a container plus
socket drop-in precedence — neither covered by T105.
- tests/integration/test_service_selection.rs: rewritten with an
inline multi-host fixture exercising host-scoped service selection.
The companion test
(`derives_managed_config_roots_from_config_targets_not_service_names`)
is dropped: T105's example_02_variant_config_root_loads already
asserts that managed_config_roots derives from config target paths,
not service ids (`traefik-dnschallenge` → `/etc/traefik`).
Two files the prior session's handoff listed for migration are
deliberately untouched:
- tests/integration/test_quickstart_validation.rs only matched the
grep because it asserts on the existing
specs/003-layered-overrides/quickstart.md doc; no fixture or layout
reference.
- tests/unit/{test_dropin_order,test_evaluation_determinism}.rs
build HostOverlaySet in memory; per the
016-data-model-scope-correction decision the type is preserved.
Both pass under the new parser without edits.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Rewrites all 5 tests in tests/integration/test_repo_load.rs against services/<svc>/quadlet/ + hosts/<host>/host.yaml using the shared source_repo_support module. Replaces inline temp-dir / git plumbing with materialize_skeleton + git_init_commit + load_with_host, and adds a load_source_with_host variant in source_repo_support.rs for the file:// URL test (existing load_with_host takes &Path). Per-test changes: - loads_desired_state_from_quadlet_dir: same alpha.container shape but rooted under services/alpha/quadlet/. - loads_desired_state_from_git_url_fixture: same fixture, loaded via file:// URL through load_source_with_host. - ignores_dotfiles_and_warns_on_unknown_extensions → tolerates_dotfiles_in_service_directory: the unknown-extension half is dropped because the new parser is strict (unknown files in payload-kind dirs raise InvalidPayloadKindFile, files at service root raise LegacyArtifact). The dotfile-tolerance half is preserved with .gitkeep at the service root and .DS_Store inside quadlet/. - layered_repo_preserves_requested_repository_and_ref: identical semantics, now goes through load_with_host. - loads_mount_declarations_and_dependencies_from_quadlet_only_repo: container under quadlet/, mount under systemd/ per the payload-kind split, both owned by the same `immich` service id so the mount dependency resolves. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Rewrites plan_orders_volume_before_container_before_socket on a single `triad` service holding all three artifacts split by payload kind: alpha.container and gamma.volume under quadlet/, beta.socket under systemd/. The planner orders globally (volume → container → socket), not per-service, so a single service exercises the same code path as the prior three-file fixture. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Rewrites the local init_git_repo helper to build services/alpha/quadlet/alpha.container plus hosts/example-host/host.yaml under the formalized layout, delegating to source_repo_support::git_init_commit. The three apply/plan tests that exercise state-snapshot persistence (which run the actual CLI entry points and therefore call load_desired_state internally) gain a HostGuard + CORE_OPS_HOST=example-host setup so the parser can resolve the host. The other 9 tests in this file are doc / format / in-memory-state tests that don't load a repository and stay untouched. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Rewrites commit_quadlet and init_git_repo_with_two_commits to build services/alpha/quadlet/alpha.container plus hosts/example-host/ host.yaml under the formalized layout. host.yaml is committed once during init; the two revs differ only in the alpha.container payload. Both rollback tests gain HostGuard + CORE_OPS_HOST= example-host setup so apply_with_report and execute_rollback_with_ report can resolve the host through the parser. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Rewrites init_git_repo to build services/alpha/quadlet/alpha.container plus hosts/example-host/host.yaml under the formalized layout. The agent test gains HostGuard + CORE_OPS_HOST=example-host setup so the internal load_desired_state call can resolve the host. Switches the path_lock acquisition from .expect to .unwrap_or_else(into_inner) to match the poison-resilient pattern used in the other apply/plan integration tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Rewrites init_git_repo to take a host name parameter and build services/alpha/quadlet/alpha.container plus hosts/<host>/host.yaml under the formalized layout. Updates all 10 call sites accordingly: 9 pass "example-host", 1 (cli_plan_uses_host_override_for_scope_*) passes "kadath" to match its CORE_OPS_HOST=kadath assertion. Adds HostGuard + CORE_OPS_HOST=example-host setup to the 6 tests that previously omitted it (the parser now requires a hosts/<id>/ declaration). The CLI subprocess test (cli_plan_json_stdout_remains _machine_parseable_with_audit_dir) gets CORE_OPS_HOST passed through the .env() builder so the child binary can resolve the host. Switches every path_lock acquisition in this file from .expect to .unwrap_or_else(into_inner) so a panic in any test no longer poison-cascades the rest. plan_does_not_apply_changes previously ran without the lock at all and panicked first; it now holds the lock too. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…T119)
Adds source_repo_support::init_alpha_repo (the canonical
services/alpha/quadlet/alpha.container + hosts/<host>/host.yaml
fixture every "is the loader/planner/applier basically working?"
integration test relies on) and migrates every consumer in one
sweep. Each file's local init_git_repo collapses to a one-line
delegation, every existing path_lock acquisition is upgraded to
the poison-resilient .unwrap_or_else(into_inner) form and gains a
HostGuard + CORE_OPS_HOST=example-host setup so the parser can
resolve the host.
Files migrated:
- test_apply_report.rs (5 tests using the alpha fixture)
- test_reconcile_apply.rs (3 tests)
- test_reboot_recovery.rs (2 tests)
- test_idempotence.rs (1 test)
- test_verification.rs (1 test)
test_verification_rules.rs and test_quadlet_artifacts.rs build a
container + socket + volume mix that doesn't fit init_alpha_repo's
single-artifact shape; they get an inline "triad" service
(quadlet/{alpha.container, gamma.volume} + systemd/beta.socket)
under services/triad/ instead of the helper.
One test in test_reconcile_apply pre-stages alpha.container in the
host quadlet dir to simulate an already-converged-but-inactive
runtime; updates the pre-staged content to match init_alpha_repo's
trailing-newline form so the planner classifies it as a recovery
StartUnit, not a WriteQuadlet.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…e (T119)
Two integration suites whose fixtures don't fit init_alpha_repo's
single-artifact shape:
test_performance: rewrites init_git_repo as a `bench` service
holding `count` workloadN.container files under
services/bench/quadlet/, with hosts/example-host/host.yaml
selecting that single service. Both tests gain HostGuard +
CORE_OPS_HOST=example-host.
test_reconcile_provenance: rewrites init_git_repo to lay down
hosts/example-host/host.yaml + git init -b main + initial
commit_quadlet, and rewrites commit_quadlet to write
services/alpha/quadlet/alpha.container with parameterized
contents (used by tests producing two-revision histories).
desired_state_provenance_remains_host_scoped's count assertion
goes from 4 to 5: spec 016's T004 added an optional layout-version
field to DesiredStateProvenance, and build_state populates it with
Some("1"), so the host-scoped serialized object now carries one
more key. Adds an explicit assertion on the new key's value to
make the schema-version intent visible.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…out (T119)
test_rollback: two test setups built fixtures via repo_dir.join(
"quadlets/...") instead of the canonical let-quadlets-dir =
join("quadlets") pattern, so they slipped past the earlier sweep.
Both rewrites place alpha.container under
services/alpha/quadlet/ and add a hosts/kadath/host.yaml selecting
alpha (matching the test's existing host:kadath scope id).
test_mount_reuse: more involved migration. The fixture had files
directly under services/immich/ (legacy intermediate layout —
files at service root, no payload-kind subdir) and host overlays
under hosts/<host>/overrides/<unit>.<ext>.d/ (legacy host-overlay
layout). Both reshape to the formalized layout:
- services/immich/quadlet/immich.container
- services/immich/systemd/var-lib-immich-media.{mount,automount}
- services/immich/systemd/var-lib-immich-cache.mount
- hosts/<host>/immich/systemd/<unit>.<ext>.d/<file>.conf
(host overlays now mirror per-service shape directly under
hosts/<id>/<svc>/, no overrides/ segment). Both tests' assertions
(mount_declarations, dependency wiring, X-CoreOps field rejection)
remain unchanged — only the on-disk fixture shape changed.
Switches the local init_repo / inline init helpers to
source_repo_support::git_init_commit so every fixture commit goes
through the same authoring path. test_mount_reuse:init_repo
parameter relaxes to &Path to satisfy clippy::ptr_arg.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Every test that holds path_lock used .expect("path lock") (or
.expect("lock env") in test_mount_reconcile), which panics when
the mutex is poisoned by an upstream test's own panic. With
hundreds of integration tests sharing this lock, a single panic
poisons every downstream test, turning one failure into a
cascade of dozens of misleading PoisonError reports.
Switches every site to .unwrap_or_else(|err| err.into_inner()).
The lock's invariant — "only one test mutates CORE_OPS_HOST at
a time" — is preserved: the mutex still serializes access. Only
the panic-recovery path changes: poisoned locks are now reused
rather than re-panicking.
Sweeps source_repo_support::load_source_with_host (which
load_with_host delegates to) along with eight integration test
files that were missed in the earlier per-file migrations.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The three resolve_explain_target_* unit tests in src/cli/explain.rs all mutate the process-global STATE_FILE_ENV variable without any coordination. Running them in parallel under cargo test --lib produces a race where one test reads another test's setup; the failing test is non-deterministic (different test fails on different runs). Adds a module-local OnceLock<Mutex<()>> and acquires it at the top of each test via .unwrap_or_else(into_inner) so a panicking test does not poison the lock for the rest. Fixes intermittent cargo test failures that pre-existed but only surfaced once the broader integration test suite was made green by the spec/016 layout migration. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Backfills the Phase 3 (US1) checkboxes that should have been flipped to [X] as each task shipped. Each entry now records the verification path or the implementation outcome so the task list stands as a useful summary on its own (e.g. T114 is verified transitively, not implemented; T115 needed a parser change and points at the commit; T118's "6 dependents" was actually 3 in practice; T119 grew to a 15-test migration plus two cross-cutting fixes). Phase 3 / US1 is complete; T106 (VM scenario) remains [ ] per its Phase 6 release-gate placement. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…201, T202) T201 — specs/016-source-repository-layout/skill/SKILL.md: the canonical authoring guide for a CoreOps source repository. Twelve sections covering repository shape, identifier rules, service.yaml and host.yaml schemas, the payload-kind dispatch table, host-overlay semantics, drop-in conventions, the five validation rules (FR-009..FR-013), determinism, legacy artifacts (FR-012, with replacement guidance), four worked walk-throughs producing the in-tree example shapes, and a ten-point authoring checklist. Self-contained: every example tree is inlined verbatim so the skill stands alone after install at .agents/skills/core-ops-source-repo/SKILL.md. T202 — tests/integration/test_skill_install.rs: the six named test functions from contracts/skill-cli.md §Test contract. Tests run the binary as a subprocess via CARGO_BIN_EXE_core-ops to exercise clap parsing, file-writer side effects, and the --print stdout encoding the same way an operator would. --global tests hold path_lock() and override HOME to a tempdir through a HomeGuard helper that restores the prior value on drop. Each test fails pre-implementation (clap rejects the unknown subcommand) — confirmed before implementation begins per the spec kit's TDD discipline. Also ticks T201/T202 in tasks.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
T203: extends `Commands` in src/cli/args.rs with `Skill(SkillArgs)`. SkillArgs wraps a SkillOp subcommand whose only variant today is Install(SkillInstallArgs); SkillInstallArgs carries the mutually-exclusive --global / --print flags via clap's conflicts_with. T204: src/cli/skill.rs embeds the bundle at compile time (SKILL_BUNDLE, currently a single SKILL.md entry) via include_bytes! relative to src/cli/skill.rs. run() resolves destination from the flags — <cwd>/.agents/skills/core-ops-source-repo/, or $HOME/.agents/skills/core-ops-source-repo/ under --global, or stdout under --print — and writes (or streams) each entry byte-identically. Idempotent on byte-identical re-installs; raises a miette diagnostic naming the offending path when the existing tree has diverged content. The print stream uses the contract's "==> <relative-path> <==" header format. T205: pub mod skill in src/cli/mod.rs and dispatch added in main.rs::run for Commands::Skill. miette::Report errors print via Debug formatting and exit(1), bypassing the standard CoreError pipeline so the contract's "errors are surfaced as miette diagnostics" guarantee is preserved verbatim. T206: cargo test green (28 lib unit + 415 integration); cargo clippy --all-targets -- -D warnings green. The six contract tests in tests/integration/test_skill_install.rs all pass; the three module-local unit tests in src/cli/skill.rs (print stream header check, idempotent install, divergent-content refusal) also pass. Closes Phase 4 / US2. The bundle is installable into any cwd or under $HOME, the binary needs no runtime access to the source repository, and the path standard is agentskills.io (.agents/skills/<skill-name>/) — never .claude/skills/ — per FR-021 and the feedback_vendor_neutral_skills memory. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
T301: tests/fixtures/legacy_source_repo/ — minimal legacy-shape
fixture covering every transformation in research.md D10. Two
services (`traefik`, `traefik-dnschallenge`) and one host
(`example-host`) exercise: kind reassignment (quadlet/<unit>.socket
→ systemd/), split-overrides flattening (quadlet-overrides/...d
→ quadlet/...d), config etc-mirror flattening
(config/etc/<root>/ → config/), variant config-root requiring
synthesized service.yaml, host drop-in re-rooting under
hosts/<h>/<svc>/quadlet/, and host config override re-rooting
under hosts/<h>/<svc>/config/. expected-destinations.txt
records the post-migration plan destination set.
T302: tests/integration/test_migrate_legacy.rs — four contract
tests: (a) the migrated tree loads via the new parser; (b) the
destination set matches expected-destinations.txt; (c) the
script is idempotent (FNV-1a tree-snapshot equality before
and after the second run); (d) the variant service emerges
with service.yaml declaring config-root: traefik. Tests run
the script via std::process::Command against a TempDir copy
of the fixture so the read-only fixture survives.
T303: scripts/migrate-legacy-source-repo.sh implements the D10
mapping in two phases:
Phase 1 (per-service):
1.a quadlet/<unit>.<systemd-ext> → systemd/<unit>.<ext>
1.b quadlet-overrides/<u>.<e>.d/<f> → quadlet/<u>.<e>.d/<f>
1.c config/etc/<root>/<rel> → config/<rel>; if <root> != <svc>,
synthesize services/<svc>/service.yaml with
config-root: <root>
Phase 2 (per-host):
2.a overrides/quadlet/<u>.<e>.d/<f> →
<svc>/quadlet/<u>.<e>.d/<f>, where <svc> is resolved by
scanning every services/<s>/quadlet/<u>.<e> and
services/<s>/systemd/<u>.<e>; multi-owner units fail
loudly with manual-resolution guidance.
2.b overrides/config/etc/<root>/<rel> →
<svc>/config/<rel>, where <svc> is resolved by matching
<root> to either svc-id (no service.yaml) or
service.yaml's config-root.
File-moves only via mv; legacy scaffolding is rmdir-ed after
each move so a second run is a no-op.
T304: cargo test green (28 lib unit + 419 integration = 447);
cargo clippy --all-targets -- -D warnings green. shellcheck
isn't installed in this dev env; CI's e2e-gate.yml is the
binding shellcheck gate.
Closes Phase 5 / US3. The live ulthar repo can now be migrated
with one command.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…(T401–T405, T407)
T401 — Cargo.toml version 1.0.0 → 2.0.0; the fixture
tests/fixtures/provenance_state/valid-success.json (which the
controller_version_provenance_matches_cargo_package_version
contract pins) tracks the bump.
T402 — the release-intent fragment summary was terse ("formalize
source repository layout; remove legacy parser; add core-ops
skill install") and missed the in-tree examples and the
migration script. Per the task's own directive ("if anything
is missing, fix the fragment, not the changelog"), the summary
now enumerates all five user-visible deliverables. The
machine-managed [Unreleased] block in CHANGELOG.md is synced
to match the rendering.
T403 — core-ops-release validate --base-ref HEAD^ passes:
required-bump=minor, declared-bump=major (allowed; the public
CLI surface lost the legacy parser code paths and gained the
skill subcommand, both of which justify a major).
T404 — cargo test green (28 lib unit + 419 integration = 447
pass); cargo clippy --all-targets -- -D warnings green.
T405 — quickstart walkthrough verified through the existing
test surface plus a manual --help check on `core-ops skill
install`. The four in-tree examples + the migration test
cover authoring steps 1–5; the integration suite covers the
plan/apply/skill-install CLI surface in steps 6–7.
T407 — AGENTS.md gains a leading "Recent Changes" entry for
spec 016 noting the layout formalization, the skill subcommand,
the in-tree examples + migration script, and the version bump.
T406 (VM-backed scenario, release-blocker) remains deferred —
it requires a libvirt-managed VM via the existing
core-ops-verify run harness which is not available in this
dev environment. CI's e2e-gate.yml is the binding gate; the
release tag should not be cut until T406 passes against a
libvirt host. Documented as exempted-with-justification in
tasks.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds tests/fixtures/verification/scenarios/source-repo-variant-config-root.yaml
plus a single-revision repo fixture under
tests/fixtures/verification/repos/source-repo-variant-config-root/
source-repo-variant-config-root-v1/ — services/traefik-dnschallenge
with service.yaml declaring config-root: traefik, plus a config/
traefik.yaml payload, a placeholder Quadlet container, and a
hosts/example-host/host.yaml selecting the service.
The scenario uses the existing repository_evolution flow (the only
materialize path the harness actually wires up — every other
scenario also uses it). To pin the parser's host resolution
deterministically, the apply step passes host: example-host as a
clap --host override rather than relying on the auto-generated
VM hostname `<guest_name>-<scenario_id>`.
Four guest_command steps assert the externally observable
behavior of FR-006/FR-007:
- /etc/traefik/traefik.yaml exists (config-root drives destination)
- the file contains "level: INFO" (right file, not stale)
- /etc/traefik-dnschallenge does NOT exist (no double-deploy via svc-id)
- /etc/containers/systemd/traefik-dnschallenge.container exists
(the Quadlet itself is keyed by svc-id, independent of config-root)
The full assertion set is seven (init/apply success + apply
converged + the four filesystem checks). The scenario passes the
harness's --synthetic dry-run mode, which validates corpus
parsing and scenario structure without touching libvirt. Real
VM execution is T406.
YAML quoting note: the description, behavioral_claim, and one
failure_message field needed explicit quoting because they
embedded `: ` substrings inside backticks/double-quotes and
serde_yaml interpreted those as flow-mapping delimiters in the
unquoted form.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 53404ab1c8
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Two more Codex P1s on PR #28 53404ab — both load-bearing for any legacy repo with systemd-kind drop-ins. Phase 1.b previously routed every `services/<svc>/quadlet-overrides/ <unit>.<ext>.d/` into `services/<svc>/quadlet/<unit>.<ext>.d/`, regardless of the unit's extension. But Phase 1.a moves systemd-kind units (.socket, .mount, .timer, .target, .path, .automount) from quadlet/ into systemd/. So a legacy `quadlet-overrides/foo.socket.d/` ended up at `services/<svc>/quadlet/foo.socket.d/` while the base unit landed at `services/<svc>/systemd/foo.socket`. The new parser's cross-kind drop-in check (added to address an earlier Codex P2) then rejects the migrated tree as `InvalidPayloadKindFile`. Result: "successful" migration produces an unloadable repo for any legacy repo carrying socket/mount/timer/target/path drop-ins. Phase 2.a had the same bug for host overrides: `hosts/<h>/overrides/quadlet/foo.socket.d/` always re-rooted to `hosts/<h>/<svc>/quadlet/foo.socket.d/` regardless of kind. Both phases now classify each drop-in by its target unit's extension (using the existing `is_systemd_ext` / `is_quadlet_ext` helpers) and route to the matching subtree. Unrecognized extensions fail loudly. Tests: tests/integration/test_migrate_legacy.rs:: migration_routes_systemd_kind_dropins_to_systemd_subtree builds a fixture with both service-level and host-level drop-ins targeting both a container (Quadlet kind) and a socket (systemd kind), exercising both phases. Asserts the migrated paths land in the correct subtree AND that the resulting tree loads cleanly through the new parser end-to-end. Verification: 31 lib unit + 429 integration = 460 pass; clippy clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 923e7286b5
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Two more Codex catches on PR #28 923e728. P1 — Phase 2 only handled `hosts/<h>/overrides/quadlet/<unit>.<ext>.d/` (intermediate legacy shape) and skipped the spec-003 original shape where drop-ins live directly at `hosts/<h>/overrides/<unit>.<ext>.d/`. Repos in that bare shape had every host drop-in skipped silently; the leftover `overrides/` directory then tripped the new loader's hard-fail on legacy artifacts. "Successful" migration produced an unloadable repo. The Phase 2.a logic is extracted into a `migrate_host_dropin` bash helper (DRY: routing-by-kind logic was duplicated otherwise) and called from three loops: - bare `overrides/<unit>.<ext>.d/` - `overrides/quadlet/<unit>.<ext>.d/` - `overrides/systemd/<unit>.<ext>.d/` (some legacy variants split by kind here too; routing decision is per-unit anyway, so we treat the source subdir as advisory) The known phase-2.b directories (config/) and phase-2.a partner directories (quadlet/, systemd/) are excluded from the bare-shape loop so we don't double-process them. P2 — `grep -q "^config-root: ${config_root}\b"` interpolated user data into a regex without escaping. Identifiers may legitimately contain `.`, which `grep` treats as a wildcard: a host override under `overrides/config/etc/a.c/` would match a `service.yaml` declaring `config-root: aYc`, silently routing the override to the wrong service. The two grep call sites are replaced with a `service_config_root` helper that parses `service.yaml` via awk and emits the literal config-root value. Comparison happens with `[[ "$declared" == "$config_root" ]]` — fixed-string, no regex interpretation. Tests: tests/integration/test_migrate_legacy.rs gains: - migration_handles_bare_overrides_dropin_shape: spec-003 shape (overrides/<unit>.<ext>.d) for both a container and a socket; asserts kind-aware routing AND that overrides/ is cleaned up AND that the migrated tree loads through the new parser. - migration_config_root_match_is_literal_not_regex: service.yaml with config-root: aYc and a host override under overrides/config/etc/a.c/. Pre-fix the buggy grep would have silently matched (`.` as wildcard); post-fix the script errors with "no matching service" because the literal compare correctly sees aYc != a.c. Verification: 31 lib unit + 431 integration = 462 pass; clippy clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e0795c8e9c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…des/ Codex P2 on PR #28 e0795c8: Phase 2.a's bare-shape iteration silently skipped any directory under hosts/<h>/overrides/ that didn't end in `.d`. So a typo like overrides/web.container.dropin/ (note: `.dropin`, not `.d`) was passed over, the script reported success, and the leftover overrides/ scaffold then tripped the new loader's hard-fail on legacy artifacts. Operator sees "migration succeeded" plus an unrelated "legacy artifact: hosts/host-a/overrides" load error. Two fail-loudly checks added: 1. In the bare-shape loop, an entry under overrides/ that is NOT in {quadlet, systemd, config} and does NOT end in `.d` is now an explicit error naming the entry and the expected pattern. Also rejects regular files at this scope (e.g., a stray README.md under overrides/) which were previously silently skipped too. 2. The final `rmdir overrides/` cleanup previously suppressed failures via `2>/dev/null || true`. Now it succeeds-or-fails explicitly: if the directory is non-empty after all phase-2 handlers ran, the script lists the remaining entries and exits 65, surfacing the issue rather than hiding it. Tests: tests/integration/test_migrate_legacy.rs:: migration_rejects_typoed_host_override_directory builds the typo case Codex described (overrides/web.container.dropin/) and asserts the script exits non-zero with a diagnostic that names the offending directory and the expected `<unit>.<ext>.d/` pattern. Verification: 31 lib unit + 432 integration = 463 pass; clippy clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2cd5a49d8e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…s fixtures Codex P1 on PR #28 2cd5a49: the previous fix routed Timer/Target/Path to systemd_unit_dir on the WRITE path (target_dir_for_workload), but target_dir_for_name and unit_name_for_start_stop's extension allow- lists still only knew about service|socket|mount|automount. So a removal of a desired-removed *.timer (workload no longer in desired, file still on disk) couldn't resolve either the file location (for RemoveQuadlet) or the runtime unit name (for StopUnit), failing with MissingWorkload. Adds timer/target/path to both allow-lists. Two unit tests in src/io/apply.rs::tests guard the symmetry. E2E gate fix: every accepted-corpus fixture under tests/fixtures/verification/repos/ used the legacy layout (flat quadlets/<unit> for frontend/mount fixtures, services/<svc>/<unit> at root + config/etc/<root>/ for config-change-history). Under v2.0.0's hard-cut parser these fixtures fail to load, breaking the e2e gate against this PR's branch. Migrated all six fixtures to the formalized layout: - frontend-history (3 revs): quadlets/frontend.container → services/frontend/quadlet/... + hosts/test-vm/host.yaml selecting frontend. - mount-blocked-history, mount-reboot-history (1 rev each): quadlets/{immich.container, var-lib-immich-media.mount} → services/immich/quadlet/immich.container + services/immich/systemd/var-lib-immich-media.mount + hosts/test-vm/host.yaml selecting immich. - mount-removal-history (2 revs): v1 same as mount-blocked; v2 (post-removal) drops the .keep marker for an empty services/ tree and host.yaml with services: [] (host applies no service). - partial-verification-history (1 rev, two containers): services/frontend/quadlet/frontend.container + services/transient/quadlet/transient.container + hosts/test-vm/host.yaml selecting both. - config-change-history (2 revs): services/runner/runner.container → services/runner/quadlet/... + flatten config/etc/runner/runner.env → config/runner.env (the host.yaml selecting runner already existed). Eight scenario YAMLs that drove apply against these fixtures (accepted-mount-blocked-failure, accepted-mount-convergence-reboot, accepted-mount-removal-ordering, accepted-layered-convergence- idempotency, accepted-layered-upgrade-transition, accepted-partial- apply-verification-failure, minimal-accepted, minimal-candidate) now pass `host: test-vm` to their apply steps. Without the override, the parser would resolve the host id from the runtime hostname `<guest_name>-<scenario_id>`, look for `hosts/<that>/`, and fail with MissingHostDeclaration since the fixtures use a stable `test-vm` directory. accepted-config-change-restart already had `host: test-vm` in place; source-repo-variant-config-root has `host: example-host` (matching its fixture). tests/unit/test_verification_execution.rs::execution_plan_preserves _step_order_and_default_timeouts updated to match the new apply command form (`--host test-vm` now part of every apply command). Verification: 31 lib unit + 432 integration = 463 pass; clippy clean. e2e validation pending — re-running on the libvirt host now. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The /.claude/ directory contains Claude Code session-internal state (transient task locks, scheduled-task tracking) that should never be versioned. The previous commit accidentally tracked .claude/scheduled _tasks.lock; this gitignores the directory so it can't recur (and the next commit can keep the tree clean of session leakage). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a2f0885f55
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
The new parser hard-requires `services/` at the repo root. My mount-removal-history/demo-mount-remove-v2 (the post-removal state) only had hosts/test-vm/host.yaml with services: [], so apply on the live VM failed with "missing services dir". Adds an empty services/.keep so the directory exists for the parser; the empty host.yaml still drives the actual removal semantics. 10/11 e2e scenarios on ulthar already pass with the migration; this unblocks the last one. Re-running. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two more Codex P1s on PR #28 c5b3aed. P1 — `read_native_passive_units` only included files matching desired-set membership. After the operator removes a *.timer/.target/ .path from the source repo, the file persists on disk and the runtime unit keeps running indefinitely: the planner has no observed-only workload to pair with the absent desired entry, so it never emits StopUnit / RemoveQuadlet. The fix mirrors the existing `read_native_mount_units` stale-discovery path. `workload_from_artifact` now injects `# managed-by: core-ops` into Timer/Target/Path contents (in addition to Socket) on write via a new `normalize_native_unit_contents` helper that wraps the existing socket logic. `read_native_passive_units` then surfaces a file under either condition: it's in desired (regular reconcile pair) OR it carries the managed-by marker but has been dropped from desired (stale CoreOps leftover — surface so planner can remove it). User-authored unrelated *.timer files in /etc/systemd/system stay ignored (no marker, no desired-set membership). P1 — Migration script's bare `mv` operations silently overwrote pre-existing destination files. Phase 1.c (config etc-mirror flatten), Phase 2.b (host config override re-rooting), and three other mv sites were all unsafe. A partially-migrated tree where the operator hand-edited the post-migration path would have its work clobbered. Adds a `safe_mv` helper that errors out (exit 65, naming both src and dst) if the destination already exists. Replaces all 5 `mv --` sites with `safe_mv`. Tests: - src/io/observed.rs::tests::passive_units_with_managed_marker_ surface_when_dropped_from_desired: empty desired set + two *.timer/.target files carrying the marker + one user-authored unmarked *.timer. Asserts the marked files surface (so the planner can remove them) and the unmarked one stays ignored. - tests/integration/test_migrate_legacy.rs::migration_refuses_to_ clobber_existing_destination: a pre-staged operator `services/runner/config/runner.env` plus a legacy `services/runner/config/etc/runner/runner.env`. Asserts the script errors with "refusing to overwrite" AND that the operator's hand-edit content is untouched. Verification: 34 lib unit + 433 integration = 467 pass; clippy clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Lines 386 to 389 in 94567a6
validate_id
The loader enforces identifier rules for service IDs and config-root, but host IDs are never checked against the same regex/reserved-prefix policy before constructing HostDeclaration. As a result, repos with hosts/<id>/ names like _metadata or names containing disallowed characters are accepted even though the formalized layout contract says host identifiers must follow the same pattern; this can silently admit invalid trees instead of failing fast. Add validate_id on the resolved host directory name (and/or parsed host value) in this path.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Codex P2 on PR #28 94567a6: load_host_declaration_inner accepted any filesystem-valid directory name as a host id, while service ids and config-roots both go through validate_id (FR-009 reserved-prefix check + the [A-Za-z0-9][A-Za-z0-9._-]* identifier pattern). So hosts/_metadata/ (reserved prefix) or hosts/foo bar/ (invalid char) trees would silently load even though the formalized layout contract requires host identifiers to follow the same rules. Adds a single validate_id(host_name) call in load_host_declaration_inner — placed BEFORE the parsed.host == host_name check so the diagnostic identifies the identifier problem rather than the symptom. Tests: tests/integration/test_source_repo_layout.rs gains: - host_id_with_reserved_prefix_rejected: hosts/_metadata/ → RepoError::ReservedName(_metadata) - host_id_with_invalid_chars_rejected: hosts/host with space/ → RepoError::InvalidIdentifier("host with space") Verification: 34 lib unit + 435 integration = 469 pass; clippy clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f3dfa20926
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Codex P2 on PR #28 f3dfa20: validate_no_legacy_root_artifacts only checked for the legacy `quadlets/` directory and otherwise accepted arbitrary top-level entries. The contract (contracts/layout.md) names exactly two content directories at the repo root — services/ and hosts/ — plus anything in the reserved `_*` / `.*` namespace. So a typo like `<repo>/servcies/` (transposed) silently dropped the operator's intended services tree because the loader walked the empty `services/` and reported no workloads. Extends the validator with a directory scan that rejects any top-level directory not named `services`, `hosts`, or beginning with `_` / `.`. Files at the repo root remain tolerated per the contract — README.md, LICENSE, CHANGELOG.md, etc. don't need to be renamed to satisfy the strict layout. Tests: - typoed_top_level_directory_rejected: real layout PLUS a `<repo>/servcies/` dir; loader rejects with LegacyArtifact pointing at the typo. - top_level_files_and_reserved_dirs_are_tolerated: README.md, LICENSE (regular files) and `_local/`, `.cache/` (reserved-prefix dirs) coexist with services/ and hosts/ and must NOT trip the new check. Symmetry guard against over-strictness. Verification: 34 lib unit + 437 integration = 471 pass; clippy clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
quadlet/,systemd/,config/) at each service root, an optionalservice.yamldeclaringconfig-rootfor variant services, and host overlays that mirror the service shape directly underhosts/<host-id>/<svc-id>/. Hard cut: legacyquadlets/,quadlet-overrides/, andhosts/<h>/overrides/are rejected at load time with diagnostics pointing at the migration script.core-ops skill install [--global] [--print]writing a vendor-neutral agent-skill bundle to.agents/skills/core-ops-source-repo/(agentskills.io standard). Independent ofcore-ops init.scripts/migrate-legacy-source-repo.shfor one-pass conversion of the live legacy repository. Major version bump 1.0.0 → 2.0.0.Test plan
cargo test— 28 lib unit + 419 integration = 447 passcargo clippy --all-targets -- -D warnings— cleancargo run --bin core-ops-release -- validate --base-ref master— passed (required=minor, declared=major)source-repo-variant-config-rootagainstCORE_OPS_VERIFY_VM_HOST=ulthar— 7/7 assertions in ~60s./etc/traefik/traefik.yamldeployed;/etc/traefik-dnschallenge/correctly absent. Run IDrun-1777662397344944784-accepted-corpus.ci.yml(matrix build) green on PR heade2e-gate.ymlgreen against the protected runnerv2.0.0; CI rewritesCHANGELOG.mdand publishes the release with the[Unreleased]block as release noteschanges/016-source-repository-layout.mdper the fragment lifecycle in CLAUDE.mdscripts/migrate-legacy-source-repo.sh ~/code/ulthar/repoon a feature branch first; verifycore-ops planproduces the expected destination set before merging the migrated tree🤖 Generated with Claude Code