feat(serialization,flow): harden the untrusted flow-file boundary (#345, #416, #400, #340) by dgenio · Pull Request #474 · dgenio/ChainWeaver

dgenio · 2026-06-25T05:44:21Z

Summary

Hardens the untrusted flow-file input boundary, the primary untrusted
surface (repos, contributor PRs validated by the GitHub Action, generated
drafts). This PR groups four tightly coupled issues that share the flow
deserialization path (serialization.py + flow.resolve_class_ref + the CI
Action + the fuzz harness) and explicitly cross-reference one another as
belonging in one PR:

- #416 (Add parse-size and structural guardrails for untrusted flow files): parse-size & structural guardrails so a hostile file cannot exhaust memory/CPU before validation.
- #345 (Add an opt-in allowlist hook for schema-ref module resolution): opt-in allowlist for schema-ref module resolution, the code-execution vector on the same load path.
- #400 (Add an adversarial regression corpus for malformed flow files): an adversarial corpus proving every malformed shape maps to a clean, typed failure (the regression net for #416/#345).
- #340 (Add a scheduled fuzz-testing job to CI): a scheduled fuzz job that exercises the same loaders continuously.
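The "before validation" ordering in the #416 item can be illustrated with a minimal sketch of a byte-size precheck that runs ahead of the parser. It is not the code from this PR: `ParseLimitError`, `load_bounded_json`, and the 1 MB ceiling are hypothetical names and values chosen for illustration.

```python
import json
from typing import Any


class ParseLimitError(ValueError):
    """Hypothetical stand-in for the library's typed serialization error."""


def load_bounded_json(text: str, max_bytes: int = 1_000_000) -> Any:
    """Reject oversized input before handing it to the JSON parser."""
    # Measure the encoded size first; the parser never sees an oversized payload.
    size = len(text.encode("utf-8"))
    if size > max_bytes:
        raise ParseLimitError(f"max_bytes exceeded: {size} > {max_bytes}")
    return json.loads(text)


if __name__ == "__main__":
    print(load_bounded_json('{"name": "demo", "steps": []}'))
    try:
        load_bounded_json("[" + "0," * 2_000_000 + "0]")
    except ParseLimitError as exc:
        print(f"rejected: {exc}")
```

Naming the violated limit in the message mirrors the PR's stated behaviour, where each violation raises an error that names the limit.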

Changes

- #416, chainweaver/serialization.py: add FlowParseLimits
  (max_bytes/max_nodes/max_depth/max_string_length/max_steps) +
  DEFAULT_PARSE_LIMITS, applied automatically in flow_from_json /
  flow_from_yaml / flow_from_dict. A byte-size precheck runs before parse; a
  bounded, iterative structural walk caps node count (also bounding YAML
  alias/anchor expansion), depth, and string length; the step count is checked.
  Each violation raises FlowSerializationError naming the limit. Overridable
  via limits=; FlowParseLimits.unlimited() opts out for trusted input.
- #345, chainweaver/flow.py + exceptions.py: a ContextVar-backed
  schema-ref policy (set_schema_ref_policy, schema_ref_policy context
  manager, SchemaRefAllowlist) consulted in resolve_class_ref before
  importlib.import_module, so a denied module is never imported. New
  SchemaRefPolicyError (CW-E051). CLI: --schema-ref-allow PREFIX on run
  and serve.
- #400, tests/corpus/flow_files/: 29 malformed files + manifest.json +
  README; tests/test_flow_corpus.py drives each through the library loaders
  and chainweaver validate (table + JSON), with generated resource-shaped
  cases (oversized, deep nesting, huge string, 10k steps, YAML alias bomb) for
  the #416 limits.
- #340, .github/workflows/fuzz.yml: weekly schedule + workflow_dispatch
  running chainweaver fuzz over a new schema-typed fixture
  (examples/fuzzable_linear.flow.yaml) against gracefully_handles_input
  (examples/fuzz_properties.py); seed = run id (echoed for local repro),
  minimized counterexample uploaded on failure. Scheduled-only (off the
  PR-blocking path).
- Docs/exports: docs/security.md (loading untrusted flow files),
  docs/agent-context/workflows.md (fuzz job + corpus), docs/reference/error-table.md
  (CW-E051), new public symbols in chainweaver/__init__.py + regenerated
  tests/fixtures/public_api.json.
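The bounded, iterative structural walk described for #416 can be sketched roughly as below. This is an illustration under stated assumptions, not the code in chainweaver/serialization.py: `Limits`, `LimitExceeded`, `check_structure`, and the default values are hypothetical, and counting mapping keys as nodes is a choice made for the sketch. It shows why a per-visit node counter also bounds alias-style expansion: a shared sub-structure is counted every time it is reached.

```python
from dataclasses import dataclass
from typing import Any


class LimitExceeded(ValueError):
    """Hypothetical stand-in for FlowSerializationError."""


@dataclass(frozen=True)
class Limits:
    # Hypothetical defaults; the PR text does not state the real values.
    max_nodes: int = 10_000
    max_depth: int = 64
    max_string_length: int = 100_000


def check_structure(data: Any, limits: Limits = Limits()) -> int:
    """Walk parsed data with an explicit stack; return the number of nodes visited."""
    stack = [(data, 1)]  # (value, depth); no recursion, so deep input cannot overflow
    nodes = 0
    while stack:
        value, depth = stack.pop()
        nodes += 1
        if nodes > limits.max_nodes:
            raise LimitExceeded(f"max_nodes exceeded ({limits.max_nodes})")
        if depth > limits.max_depth:
            raise LimitExceeded(f"max_depth exceeded ({limits.max_depth})")
        if isinstance(value, str):
            if len(value) > limits.max_string_length:
                raise LimitExceeded(
                    f"max_string_length exceeded ({limits.max_string_length})"
                )
        elif isinstance(value, dict):
            for key, child in value.items():
                stack.append((key, depth + 1))
                stack.append((child, depth + 1))
        elif isinstance(value, (list, tuple)):
            stack.extend((child, depth + 1) for child in value)
    return nodes


if __name__ == "__main__":
    # Shared references mimic what YAML aliases produce after parsing:
    # 30 doublings would expand to about 2**31 nodes if walked in full.
    bomb: Any = ["x"]
    for _ in range(30):
        bomb = [bomb, bomb]
    try:
        check_structure(bomb)
    except LimitExceeded as exc:
        print(f"rejected: {exc}")
```

Because the walk is depth-first, the stack holds at most a few entries per level, so the check itself stays cheap even on the input it rejects.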

Testing

Linting passes (ruff check chainweaver/ tests/ examples/)
Formatting check passes (ruff format --check chainweaver/ tests/ examples/)
Type checking passes (python -m mypy chainweaver/ tests/)
All existing tests pass (python -m pytest tests/ -v — 2062 passed, 1 skipped, 93% coverage)
New tests added for new functionality (test_flow_corpus.py, test_schema_ref_policy.py)

The fuzz invocation was verified locally green across 5 seeds; the CLI
--schema-ref-allow flag was verified end-to-end (CW-E051 with a non-allowlisted
ref, clean run without).

Issues closed by this PR

Closes #345
Closes #416
Closes #400
Closes #340

Related Issues

Checklist

Code follows project conventions (see AGENTS.md and docs/agent-context/)
Public API changes are documented
No secrets or credentials included

Scope notes & decisions (Mode B)

- Grouped four issues by explicit request ("one PR for all"); they share one
  implementation path and the issues themselves say to land together.
- Parse limits are auto-applied defaults + a full limits= API; I did not
  add 5×N per-limit CLI flags (surface creep), because the conservative
  defaults protect the CI Action automatically. Per-limit CLI overrides are an
  easy follow-up if wanted.
- --schema-ref-allow is on run/serve only, not validate/check:
  those commands only deserialize and never import refs, so the flag would be a
  misleading no-op there.
- #340 needed a green-by-default gate. The builtin flow_succeeds property
  intentionally flags inputs a flow rejects, and the fuzzer corrupts ~50% of
  generated inputs, so it can never be green over a strict-schema flow. I added
  a schema-typed fixture + a gracefully_handles_input invariant (success with
  output, or a typed recorded failure, never a crash), which is the genuine
  robustness contract a fuzz gate should protect.
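The difference between a "flow succeeds" property and the graceful-handling invariant can be sketched as follows. The `Outcome` shape, `gracefully_handles`, and the toy runners are hypothetical illustrations; the real property in examples/fuzz_properties.py may have a different signature.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Outcome:
    # Hypothetical result shape for one fuzzed execution.
    success: bool
    output: Optional[dict] = None
    error_code: Optional[str] = None


def gracefully_handles(run: Callable[[Any], Outcome], payload: Any) -> bool:
    """True if the run ends in success-with-output or a typed, recorded failure."""
    try:
        outcome = run(payload)
    except Exception:
        return False  # an uncaught exception is the crash the gate exists to catch
    if outcome.success:
        return outcome.output is not None
    return outcome.error_code is not None


def run_double(payload: Any) -> Outcome:
    """Toy stand-in for a strict-schema flow: rejects bad input with a typed failure."""
    n = payload.get("n") if isinstance(payload, dict) else None
    if not isinstance(n, int):
        return Outcome(success=False, error_code="INPUT_INVALID")
    return Outcome(success=True, output={"result": n * 2})


def run_crashy(payload: Any) -> Outcome:
    """Toy stand-in for a flow that lets malformed input escape as an exception."""
    return Outcome(success=True, output={"result": payload["n"] * 2})


if __name__ == "__main__":
    corrupted = {"n": "oops"}
    print(run_double(corrupted).success)              # a success-only property fails here
    print(gracefully_handles(run_double, corrupted))  # the invariant still holds
    print(gracefully_handles(run_crashy, None))       # a crash violates the invariant
```

Under this invariant a corrupted input that is cleanly rejected counts as a pass, which is what lets the scheduled job stay green unless something actually crashes.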

🤖 Generated with Claude Code

https://claude.ai/code/session_01A2nffdLkZjypkTDW6rGLeo

Bundles four tightly-coupled hardening issues that share the flow deserialization path (the issues explicitly cross-reference one another):

- #416, parse-size & structural guardrails: add FlowParseLimits (max_bytes/nodes/depth/string_length/steps) applied by default in flow_from_json/yaml/dict; a byte-size precheck plus a bounded structural walk that also caps traversal of YAML alias/anchor expansion. Overridable via limits=, with FlowParseLimits.unlimited() to opt out.
- #345, schema-ref module-resolution allowlist: a ContextVar-backed policy (set_schema_ref_policy / schema_ref_policy / SchemaRefAllowlist) consulted in resolve_class_ref BEFORE importlib.import_module, so a denied module's top-level code never runs. New SchemaRefPolicyError (CW-E051). Surfaced on the CLI as `--schema-ref-allow PREFIX` for `run` and `serve` (the commands that actually resolve refs).
- #400, adversarial corpus: tests/corpus/flow_files with 29 malformed files + a manifest, driven through the library loaders and `chainweaver validate` (table + JSON), plus generated resource-shaped cases for the #416 limits.
- #340, scheduled fuzz workflow: .github/workflows/fuzz.yml runs the existing fuzz harness weekly + on dispatch over a new schema-typed fixture (examples/fuzzable_linear.flow.yaml) against a graceful-handling invariant (examples/fuzz_properties.py); seed echoed for repro, counterexample uploaded on failure.

Docs: security.md (untrusted flow-file loading), workflows.md (fuzz job + corpus), error-table.md (CW-E051). Public API + snapshot updated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01A2nffdLkZjypkTDW6rGLeo

The lint job's actionlint/shellcheck flagged SC2153 ("SEED/RUNS may not be assigned; did you mean seed/runs?") because the run script reassigned the env-provided vars to lowercase locals. Drop the locals and reference the all-caps env vars directly: shellcheck treats all-caps names as environment variables, so neither SC2153 nor SC2154 fires. Behaviour is unchanged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01A2nffdLkZjypkTDW6rGLeo

Copilot

Pull request overview

Hardens ChainWeaver’s untrusted flow-file deserialization boundary by adding configurable parse guardrails, introducing an opt-in schema-ref module allowlist (to prevent unsafe module imports), and adding regression infrastructure (adversarial corpus + scheduled fuzz workflow) to keep these guarantees stable over time.

Changes:

Add FlowParseLimits + default limits applied by flow_from_json/flow_from_yaml/flow_from_dict to bound size/structure before validation.
Add SchemaRefAllowlist + schema_ref_policy/set_schema_ref_policy (ContextVar-backed) and wire --schema-ref-allow into chainweaver run/serve.
Add adversarial corpus tests + scheduled fuzz workflow + supporting examples/docs/error-table/public-API snapshot updates.

Reviewed changes

Copilot reviewed 48 out of 49 changed files in this pull request and generated 2 comments.

File	Description
`tests/test_schema_ref_policy.py`	Tests schema-ref allowlist matching, pre-import rejection, scoping, and CLI flag behavior.
`tests/test_flow_corpus.py`	Drives a manifest-defined invalid-flow corpus through library loaders and `chainweaver validate`, plus generated resource-shaped limit cases.
`tests/test_cli_serve.py`	Updates serve CLI test to include new `schema_ref_allow` param.
`tests/schema_ref_sentinel.py`	Sentinel module used to prove schema-ref rejection happens before import side effects.
`tests/fixtures/public_api.json`	Updates public API snapshot for new exports and loader signatures (`limits=`).
`tests/corpus/flow_files/README.md`	Documents corpus layout and how to add cases.
`tests/corpus/flow_files/manifest.json`	Manifest enumerating invalid-flow corpus inputs and expected `detail` substrings.
`tests/corpus/flow_files/invalid/top_level_list.flow.json`	Corpus case: invalid top-level shape (list).
`tests/corpus/flow_files/invalid/top_level_string.flow.json`	Corpus case: invalid top-level shape (string).
`tests/corpus/flow_files/invalid/top_level_number.flow.json`	Corpus case: invalid top-level shape (number).
`tests/corpus/flow_files/invalid/top_level_null.flow.json`	Corpus case: invalid top-level shape (null).
`tests/corpus/flow_files/invalid/top_level_list.flow.yaml`	Corpus case: invalid top-level shape (YAML list).
`tests/corpus/flow_files/invalid/top_level_scalar.flow.yaml`	Corpus case: invalid top-level shape (YAML scalar).
`tests/corpus/flow_files/invalid/trailing_comma.flow.json`	Corpus case: invalid JSON syntax.
`tests/corpus/flow_files/invalid/truncated.flow.json`	Corpus case: truncated JSON.
`tests/corpus/flow_files/invalid/bom_prefixed.flow.json`	Corpus case: UTF-8 BOM-prefixed JSON.
`tests/corpus/flow_files/invalid/unclosed_sequence.flow.yaml`	Corpus case: invalid YAML (unclosed sequence).
`tests/corpus/flow_files/invalid/tab_indentation.flow.yaml`	Corpus case: invalid YAML indentation (tabs).
`tests/corpus/flow_files/invalid/unsafe_python_tag.flow.yaml`	Corpus case: unsafe YAML tag rejected by safe loader.
`tests/corpus/flow_files/invalid/empty.flow.yaml`	Corpus case: empty YAML payload.
`tests/corpus/flow_files/invalid/missing_type.flow.json`	Corpus case: missing `type` discriminator (JSON).
`tests/corpus/flow_files/invalid/unknown_type.flow.json`	Corpus case: unknown `type` discriminator (JSON).
`tests/corpus/flow_files/invalid/missing_type.flow.yaml`	Corpus case: missing `type` discriminator (YAML).
`tests/corpus/flow_files/invalid/future_format_version.flow.json`	Corpus case: unsupported `format_version`.
`tests/corpus/flow_files/invalid/missing_required_fields.flow.json`	Corpus case: missing required fields (JSON).
`tests/corpus/flow_files/invalid/steps_not_a_list.flow.json`	Corpus case: wrong `steps` type (string).
`tests/corpus/flow_files/invalid/steps_null.flow.json`	Corpus case: `steps` is null.
`tests/corpus/flow_files/invalid/name_wrong_type.flow.json`	Corpus case: wrong `name` type.
`tests/corpus/flow_files/invalid/version_wrong_type.flow.json`	Corpus case: wrong `version` type.
`tests/corpus/flow_files/invalid/step_missing_tool_name.flow.json`	Corpus case: missing step `tool_name`.
`tests/corpus/flow_files/invalid/step_not_an_object.flow.json`	Corpus case: step element not an object.
`tests/corpus/flow_files/invalid/step_input_mapping_scalar.flow.json`	Corpus case: wrong `input_mapping` type.
`tests/corpus/flow_files/invalid/dag_missing_version.flow.json`	Corpus case: DAGFlow missing required version.
`tests/corpus/flow_files/invalid/dag_step_missing_id.flow.json`	Corpus case: DAGFlowStep missing required id.
`tests/corpus/flow_files/invalid/steps_scalar.flow.yaml`	Corpus case: `steps` scalar in YAML.
`tests/corpus/flow_files/invalid/missing_required_fields.flow.yaml`	Corpus case: missing required fields (YAML).
`examples/README.md`	Documents fuzz fixture/property used by the scheduled fuzz workflow.
`examples/fuzzable_linear.flow.yaml`	Adds a schema-typed example flow intended for fuzzing.
`examples/fuzz_properties.py`	Adds `gracefully_handles_input` property for fuzz gating.
`docs/security.md`	Documents untrusted flow-file guardrails and schema-ref allowlisting usage.
`docs/reference/error-table.md`	Adds `CW-E051` for `SchemaRefPolicyError`.
`docs/agent-context/workflows.md`	Documents the fuzz workflow and adversarial corpus testing patterns.
`chainweaver/serialization.py`	Implements `FlowParseLimits` + default parse guardrails across loaders.
`chainweaver/flow.py`	Implements ContextVar-backed schema-ref policy and pre-import enforcement in `resolve_class_ref`.
`chainweaver/exceptions.py`	Adds `SchemaRefPolicyError` and assigns stable code `CW-E051`.
`chainweaver/cli/run.py`	Wires `--schema-ref-allow` into `run` and `serve`.
`chainweaver/cli/_shared.py`	Adds shared `--schema-ref-allow` option + helper to apply the allowlist.
`chainweaver/__init__.py`	Exposes new public symbols (`FlowParseLimits`, `DEFAULT_PARSE_LIMITS`, schema-ref policy helpers, `SchemaRefPolicyError`).
`.github/workflows/fuzz.yml`	Adds a scheduled/manual fuzz workflow that runs `chainweaver fuzz` and uploads minimized counterexamples on failure.

- test_flow_corpus: assert the real >= 25 corpus cases (29 committed satisfy it) instead of >= 20, matching issue #400's acceptance criterion.
- cli: fix the --schema-ref-allow help example so the sentence-final period isn't copy-pasted as a trailing dot in the module prefix.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01A2nffdLkZjypkTDW6rGLeo

The `_assert_fast_failure` / library-loader corpus checks assert a guardrail-bounded parse fails in under `_MAX_PARSE_SECONDS`. Guardrail failures are milliseconds-scale, so the 2.0s ceiling only needs to catch an unbounded blowup or hang, not micro-time the parse. Widen it to 5.0s so it does not flake on a saturated CI runner while still proving the #416 fast-failure contract.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011E7oGoTqo75hWRJwLftdow
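A fast-failure check of the kind this commit describes might look roughly like the sketch below. It is an assumption-laden illustration, not the helper in tests/test_flow_corpus.py: the function signature and the toy loader are hypothetical, and only the 5.0s ceiling comes from the commit message.

```python
import time
from typing import Callable, Type

MAX_PARSE_SECONDS = 5.0  # generous ceiling: catches hangs, tolerates a busy CI runner


def assert_fast_failure(load: Callable[[], object], expected: Type[BaseException]) -> float:
    """Assert that load() raises `expected` within the ceiling; return elapsed seconds."""
    start = time.perf_counter()
    try:
        load()
    except expected:
        elapsed = time.perf_counter() - start
    else:
        raise AssertionError("loader accepted input that should have been rejected")
    assert elapsed < MAX_PARSE_SECONDS, f"rejection took {elapsed:.2f}s"
    return elapsed


def reject_oversized() -> None:
    """Toy loader that fails immediately, standing in for a guardrail rejection."""
    raise ValueError("max_bytes exceeded")


if __name__ == "__main__":
    elapsed = assert_fast_failure(reject_oversized, ValueError)
    print(f"rejected in {elapsed:.6f}s")
```

The ceiling is deliberately coarse: it distinguishes a bounded rejection from an unbounded blowup without asserting anything about millisecond-level timing.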

Copilot AI review requested due to automatic review settings June 25, 2026 05:44

Copilot started reviewing on behalf of dgenio June 25, 2026 05:44

Copilot AI reviewed Jun 25, 2026

Comment thread tests/test_flow_corpus.py Outdated

Comment thread chainweaver/cli/_shared.py

dgenio mentioned this pull request Jul 1, 2026

refactor: split flow models into a package #484

Draft

12 tasks

dgenio merged commit 213631d into main Jul 3, 2026
21 checks passed

dgenio deleted the claude/issue-triage-grouping-hy3zh2 branch July 3, 2026 00:06

dgenio merged 4 commits into main from claude/issue-triage-grouping-hy3zh2

3 participants
