v0.7.1

@tlamadon tlamadon tagged this 04 Jun 00:04
Workflows can now declare which stacks they need without restating the
stack's bash by hand. The runtime expands `{stacks: [julia-1.12]}` on
any env rule into a synthetic env rule carrying that stack's `init:`
text, then continues with the rule's own set/append/init as usual.
The wiring is explicit (the workflow author names the stack); the
runtime does NOT auto-install or auto-check (operator's job, via
`scripthut stack install`).

Schema:
- New `stacks: list[str]` field on EnvRule, sibling of `include:`.
  Empty by default — every existing rule keeps working unchanged.
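
A minimal sketch of why the new field is backward compatible, assuming a dataclass-style model. The class below is a simplified, hypothetical stand-in rather than scripthut's actual `EnvRule`; the `set_vars` field name is illustrative only.

```python
from dataclasses import dataclass, field


@dataclass
class EnvRule:
    """Hypothetical, simplified stand-in for scripthut's EnvRule."""

    include: list[str] = field(default_factory=list)
    # New sibling of `include`; empty by default so old rules are unaffected.
    stacks: list[str] = field(default_factory=list)
    # Illustrative name for the rule's own `set:` mapping (assumption).
    set_vars: dict[str, str] = field(default_factory=dict)


# A rule written before the change still constructs without `stacks`.
legacy_rule = EnvRule(set_vars={"FOO": "1"})
# A rule that opts in names the stack explicitly.
stack_rule = EnvRule(stacks=["julia-1.12"])
```

Because the default is an empty list, a rule that never mentions `stacks` behaves exactly as before.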

Resolver:
- `runs/env.collect_stacks(config, doc_stacks=None)` merges server-
  config stacks with the source repo's project YAML stacks (project
  wins, mirroring `_merge_configs` / `env_groups`).
- `flatten(rules, groups, stacks)` expands `stacks:` references by
  emitting one synthetic rule per stack, each carrying the stack's
  init text and inheriting the parent rule's `if:` guard via
  `extra_guards` (so `{if: {SCRIPTHUT_BACKEND: mercury}, stacks:
  [cuda]}` only fires on the matching backend).
- An unknown stack name raises ValueError at resolve time. Deliberate
  asymmetry vs. unknown `include:` (which warns + skips): stacks are
  task ↔ environment contracts, and silently running a task without
  the env the author expected is the worst possible failure mode.
- A stack with empty `init:` is a no-op at this layer — legitimate
  (the prep may have produced files the task reads instead of env).
- `resolve_for_task` / `resolve_for_task_detailed` gain a
  `doc_stacks: dict[str, Stack] | None = None` parameter to receive
  the merge result.
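
The resolver behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: `Stack` and `Rule` are simplified hypothetical types, the `groups` argument of the real `flatten(rules, groups, stacks)` is omitted, and the guard is modelled as a plain dict copied onto each synthetic rule rather than passed through `extra_guards`.

```python
from dataclasses import dataclass, field


@dataclass
class Stack:
    init: str = ""  # shell text to run before the task (hypothetical shape)


@dataclass
class Rule:
    guard: dict[str, str] = field(default_factory=dict)  # models the `if:` guard
    stacks: list[str] = field(default_factory=list)
    init: str = ""


def collect_stacks(server_stacks, doc_stacks=None):
    """Merge server-config stacks with project stacks; project wins."""
    merged = dict(server_stacks)
    merged.update(doc_stacks or {})
    return merged


def flatten(rules, stacks):
    """Expand `stacks:` references into synthetic rules (groups omitted)."""
    flat = []
    for rule in rules:
        for name in rule.stacks:
            if name not in stacks:
                # Fail loudly: never run a task without its expected env.
                raise ValueError(f"unknown stack: {name!r}")
            stack_init = stacks[name].init
            if not stack_init:
                continue  # empty init is a legitimate no-op at this layer
            # The synthetic rule inherits the parent rule's guard.
            flat.append(Rule(guard=dict(rule.guard), init=stack_init))
        # The rule's own init still applies after the stack expansion.
        flat.append(Rule(guard=dict(rule.guard), init=rule.init))
    return flat


server = {"cuda": Stack(init="module load cuda")}
project = {"cuda": Stack(init="source ./cuda-env.sh")}
stacks = collect_stacks(server, project)

rules = [Rule(guard={"SCRIPTHUT_BACKEND": "mercury"}, stacks=["cuda"], init="export X=1")]
flat = flatten(rules, stacks)
```

In this sketch the project definition of `cuda` overrides the server one, and the synthetic rule only applies under the same `SCRIPTHUT_BACKEND: mercury` guard as its parent.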

Run plumbing:
- New `Run.doc_stacks` field (`dict[str, Stack]`), persisted via
  storage.save_run / load_run alongside `doc_env_groups`.
- `RunManager._build_run` gains `doc_stacks=` and stores it on Run.
- `RunManager.create_run_from_source` merges the source repo's
  project YAML stacks (already loaded by v0.7.0's
  `_load_source_project_config`) into doc_stacks and passes it to
  `_build_run`. The dry-run path mirrors the same merge so previews
  reflect what would actually run.
- `RunManager._resolve_environment` passes `doc_stacks=run.doc_stacks`
  to the resolver — same convention as `doc_env_groups`. The env-
  debug page at /api/v1/runs/.../tasks/.../env also threads it.

Agent prompt (the user explicitly asked for this to be clear):
- New `## Stacks — define once, install once, reference everywhere`
  section walks through:
  - What a stack is (content-hashed, prep-once, init-per-task).
  - Where to define one (server-global vs. per-repo `scripthut.yaml`,
    repo wins on name collision).
  - The full YAML shape with backends / inputs / prep / init.
  - The operator install flow: `stack check`, `stack install`,
    `stack install --source <name>` (the v0.7.0 way for repo-defined
    stacks without cloning), `--rebuild`, `stack delete`.
  - The reference syntax: `{stacks: [name]}` on any env rule, with
    both the JSON task-level form and the YAML env_group form.
  - Multiple stacks + `if:` guards composing on one rule.
  - The two critical gotchas (unknown name → ValueError, no auto-
    install) — agents must NOT suggest `stacks: [name]` without
    first running `scripthut stack check`.

Tests:
- `tests/test_env_resolver.py` adds 9 new tests on `flatten` and
  `collect_stacks`: basic expansion, `set:` still applies after stack
  expansion, unknown stack raises, empty-init no-op, multiple stacks
  in one rule, if-guard inheritance, group+stack+set on one rule
  compose, and `collect_stacks` server vs. doc_stacks override.
- `tests/test_source_project_config.py` — new
  `TestEndToEndStackReference` covers the full path: workflow env
  with `stacks: [name]` resolves to extra_init; repo doc_stacks
  override server stacks on collision; unknown stack raises.
  Plus `test_project_stacks_threaded_to_doc_stacks` confirms the
  helper-to-resolver wiring through `create_run_from_source`.
- `tests/test_agent_prompt.py` adds a new `TestStackGuidance` class
  that pins the briefing's subsections (define, install, reference,
  gotchas) so the briefing can't silently drift.
- Updated `test_env_integration.py` and the existing
  `TestRunEnvLayering` fixtures to set `doc_stacks` on the mock
  Run / capture it from `_build_run`.

Test results: 500/500 pass in the broad sweep, plus 156 in the
backend-specific suites.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>