fix: Defer env expressions to runtime, add globalArguments validation#351
Conversation
Env expressions (${{ env.* }}) were previously resolved at persist time,
which caused two problems: sensitive environment variables (API keys,
tokens) were written as plaintext to evaluated definition files on disk,
and schema validation was completely bypassed when ANY field contained
a CEL expression.
This change defers env expressions to runtime (matching vault behavior)
and adds runtime globalArguments schema validation after all expressions
are resolved.
Changes:
- Add containsEnvExpression() and containsRuntimeExpression() detection
- Replace containsVaultExpression with containsRuntimeExpression at all
persist-time skip checks (7 call sites across 3 files)
- Add resolveRuntimeExpressionsInDefinition/Data methods that handle
both vault AND env resolution in a single pass
- Keep old resolveVaultExpressionsIn* methods as deprecated aliases
- Add globalArguments schema validation in executeWorkflow() after
definition upgrade and runtime resolution
- Update integration tests to verify env expressions are left raw at
persist time and resolved at runtime
Co-authored-by: Magistr <umag@users.noreply.github.com>
There was a problem hiding this comment.
Code Review: PR #351 - Defer env expressions to runtime, add globalArguments validation
Overall Assessment: APPROVE ✓
This PR addresses two security and correctness issues (#323, #329) with a clean, well-designed solution. The changes follow domain-driven design principles and maintain good separation of concerns.
Blocking Issues
None identified.
Summary of Changes Reviewed
Security Fix (#323): Environment variable expressions (${{ env.* }}) are now deferred to runtime instead of being evaluated at persist time. This prevents sensitive environment variables from being written as plaintext to .swamp/definitions-evaluated/.
Correctness Fix (#329): Added globalArguments schema validation in executeWorkflow() after runtime expression resolution, ensuring type constraints are enforced on fully-resolved values.
Positive Observations
-
Good abstraction: The
containsRuntimeExpression()function cleanly unifies vault and env detection, replacing 7 separate checks with a single semantic concept. -
Backward compatibility: Deprecated aliases (
resolveVaultExpressionsInDefinition,resolveVaultExpressionsInData) allow existing code to continue working while signaling the migration path. -
Comprehensive test coverage:
- Unit tests for
containsEnvExpression()andcontainsRuntimeExpression() - Unit tests for
globalArgumentsvalidation (valid, invalid, and missing schema cases) - Integration tests updated to verify two-phase resolution (persist-time vs runtime)
- Unit tests for
-
DDD compliance: Changes stay within the appropriate domain boundaries:
ExpressionEvaluationServicehandles expression detection and resolution (domain service)DefaultMethodExecutionServicehandles execution-time validation (domain service)- No leaking of persistence concerns into domain logic
-
Clear code comments: The PR updates comments consistently to reflect the new "runtime expressions" terminology.
Suggestions (Non-blocking)
-
Consider a type alias for clarity (optional):
type RuntimeExpressionChecker = (celExpression: string) => boolean;
Could make the pattern more discoverable, but current implementation is clear enough.
-
Edge case coverage (minor): The
ENV_PATTERNuses word boundary (\benv\.) which correctly rejectsmyenv.FOObut would match_env.FOOsince_is a word character. This seems like the right behavior for variable names, just noting it. -
Error message enhancement (minor): The globalArguments validation error message could include which field(s) failed validation for easier debugging. Current implementation uses
formatZodErrorwhich does provide path information, so this is already reasonably good.
Files Reviewed
src/domain/expressions/expression_evaluation_service.ts- Core changessrc/domain/expressions/expression_evaluation_service_test.ts- Unit testssrc/domain/models/method_execution_service.ts- Validation additionsrc/domain/models/method_execution_service_test.ts- Validation testssrc/domain/workflows/execution_service.ts- Updated to usecontainsRuntimeExpressionsrc/cli/commands/model_method_run.ts- Updated method callsrc/cli/commands/workflow_evaluate.ts- Updated to usecontainsRuntimeExpressionintegration/cel_data_access_test.ts- Updated integration testintegration/definition_lifecycle_test.ts- Updated integration test
All changes follow the project's code style guidelines (TypeScript strict mode, named exports, no any types) and include the required AGPLv3 license headers.
…42) (systeminit#1383) ## Summary Removes the legacy `registries.<kind>.failures[]` dual failure surface and makes `aggregateState.sourceDetails[]` the single authoritative source of extension failure state in `swamp doctor extensions`. Closes the W1-W6 rearchitecture story — W7 removes the last vestige of the old failure-reporting model. ### What changed - **Shared failure recorder** (`src/domain/extensions/source_failure_recorder.ts`) — extracted from reconcile's catch block. Both reconcile and buildIndex call the same helper, eliminating the consistency gap where buildIndex bypassed the aggregate. - **buildIndex failure routing** — warm-start catch routes failures through `recordSourceFailure` → `repository.saveAll()`. Cold-start failures also routed, with `originalError` preserving `ValidationError` type for correct state-tag discrimination. - **`markCatalogValidationFailed` removed** — validation failures now throw `ValidationError`, caught by the shared helper. No more direct catalog upserts that bypass the aggregate. - **`registries.failures[]` removed** — per-registry status derived from `sourceDetails[]` failure-state tags (`ValidationFailed`, `BundleBuildFailed`, `EntryPointUnreadable`). - **`extension_load_warnings.ts` split** — kept `emitTypeExtractionFailure` for stderr regex-mismatch warnings; removed `recordLoadFailures`, `getExtensionLoadWarnings`, and the warnings array accumulation. - **`last_error` catalog column** — schema migration adds `last_error TEXT` to `bundle_types`. Error messages persist across process invocations. Clears on recovery to Indexed. - **SQL state-preservation fix** — `upsert` ON CONFLICT UPDATE now conditionally excludes `state` and `last_error` when callers don't provide them, preventing `populateCatalogFromDir` from overwriting reconcile's failure states with `Indexed`. ### JSON schema changes (doctor extensions --json) 1. **Removed**: `registries.<kind>.failures[]` — the field no longer exists on `DoctorRegistryResult` 2. **Added**: `sourceDetails[].lastError?: string` — populated for failure states (`ValidationFailed`, `BundleBuildFailed`, `EntryPointUnreadable`), absent for healthy states ### Migration note Pre-migration catalog rows in failure states get empty `lastError` until the next reconcile pass. Self-heals on first run after upgrade — not a bug, just a transient state. ### Known behavior changes (follow-ups filed) - **Type-extraction warnings** (systeminit#351): models with non-literal `type` fields (e.g. `const TYPE = "..."`) produce stderr warnings but no longer flip doctor exit code. They don't register and don't appear in `sourceDetails[]`. Previously surfaced via the removed `registries.failures[]` pipeline. - **Streaming protocol** (systeminit#352): `kind-completed` events carry preliminary `status: "pass"` during the scan phase; final status is computed from aggregate state in the `completed` event. Internal-only (renderer is the sole consumer). ## Test plan - [x] 5908 unit/integration tests pass (0 failures) - [x] `deno check` / `deno lint` / `deno fmt` clean - [x] 24-point binary verification matrix against compiled `./swamp`: - Healthy model → Indexed, exit 0 - Schema-invalid model → failure state, exit 1, lastError populated - Missing version → ValidationFailed, exit 1 - Recovery: broken → fixed → Indexed, lastError clears - Rebundle loop guard: stable broken source converges (3 runs) - Mixed healthy + broken: healthy registers, broken surfaces - JSON contract: no `failures` field, `lastError` present - Vault failure surfaces in sourceDetails with correct registry status - lastError persists across separate process invocations - lastError absent from sourceDetails after recovery to Indexed - [x] SQL regression tests: upsert without state preserves both state and last_error; upsert with explicit state overwrites both - [x] Integration test: lastError lifecycle (populates on failure, clears on recovery) 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Closes #323 and #329.
Problem: Environment variable expressions (
${{ env.* }}) were resolved at persist time, unlike vault expressions which are deferred to runtime. This caused two issues:Secret leakage (Environment variables exposed to CEL expressions via buildEnvContext() #323): Sensitive environment variables (e.g.,
AWS_SECRET_ACCESS_KEY,GITHUB_TOKEN) referenced via${{ env.* }}were resolved during expression evaluation and written as plaintext YAML to.swamp/definitions-evaluated/on disk.Schema validation bypass (Schema validation bypassed when any field contains a CEL expression #329): When ANY field in a definition contained a CEL expression, validation of ALL static fields was completely skipped. Because env expressions were resolved at persist time but the validation gap existed, there was no point where a fully-resolved definition could be schema-validated.
Fix: Defer env expressions to runtime (matching vault behavior), then add runtime
globalArgumentsschema validation after all expressions are resolved. This fixes both issues by design — at runtime, the definition is fully resolved and can be completely validated.What changes for users
.swamp/definitions-evaluated/) will now contain the raw${{ env.HOST }}template instead of the resolved value. This means sensitive env vars are never written to disk.globalArgumentsschema requiresportto be a number but the resolved value is a string, execution will now fail with a clear validation error instead of silently proceeding.Changes
containsEnvExpression()andcontainsRuntimeExpression()as a unified check for vault OR env referencescontainsVaultExpressionwithcontainsRuntimeExpressionat all 7 persist-time skip checks acrossexpression_evaluation_service.ts,execution_service.ts, andworkflow_evaluate.tsresolveRuntimeExpressionsInDefinition()andresolveRuntimeExpressionsInData()that handle both vault AND env resolution in a single pass. OldresolveVaultExpressionsIn*methods kept as deprecated aliases for backward compatibility.globalArgumentsschema validation inexecuteWorkflow()after definition upgrade and runtime expression resolutionTest plan
deno check— type checking passesdeno lint— no lint errorsdeno fmt— formatting cleandeno run test— all 1697 tests pass (0 failures)Co-authored-by: Magistr umag@users.noreply.github.com
🤖 Generated with Claude Code