
feat(serialization,flow): harden the untrusted flow-file boundary (#345, #416, #400, #340) #474

Merged: dgenio merged 4 commits into main from claude/issue-triage-grouping-hy3zh2 on Jul 3, 2026

@dgenio (Owner) commented Jun 25, 2026

Summary

This PR hardens the untrusted flow-file input boundary, the primary untrusted
surface (repos, contributor PRs validated by the GitHub Action, generated
drafts). It bundles four tightly coupled issues that share the flow
deserialization path (serialization.py + flow.resolve_class_ref + the CI Action
+ the fuzz harness) and explicitly cross-reference one another as belonging in
one PR.

Changes

  • #416 (Add parse-size and structural guardrails for untrusted flow files):
    chainweaver/serialization.py adds FlowParseLimits
    (max_bytes/max_nodes/max_depth/max_string_length/max_steps) and
    DEFAULT_PARSE_LIMITS, applied automatically in flow_from_json /
    flow_from_yaml / flow_from_dict. A byte-size precheck runs before parsing; a
    bounded, iterative structural walk then caps node count (which also bounds
    YAML alias/anchor expansion), depth, and string length, and the step count
    is checked. Each violation raises FlowSerializationError naming the limit.
    Limits are overridable via limits=; FlowParseLimits.unlimited() opts out for
    trusted input.
  • #345 (Add an opt-in allowlist hook for schema-ref module resolution):
    chainweaver/flow.py + exceptions.py add a ContextVar-backed
    schema-ref policy (set_schema_ref_policy, the schema_ref_policy context
    manager, SchemaRefAllowlist) consulted in resolve_class_ref before
    importlib.import_module, so a denied module is never imported. New
    SchemaRefPolicyError (CW-E051). CLI: --schema-ref-allow PREFIX on run
    and serve.
  • #400 (Add an adversarial regression corpus for malformed flow files):
    tests/corpus/flow_files/ holds 29 malformed files + manifest.json +
    README; tests/test_flow_corpus.py drives each through the library loaders
    and chainweaver validate (table + JSON), with generated resource-shaped
    cases (oversized, deep nesting, huge string, 10k steps, YAML alias bomb) for
    the #416 limits.
  • #340 (Add a scheduled fuzz-testing job to CI): .github/workflows/fuzz.yml
    adds a weekly schedule + workflow_dispatch running chainweaver fuzz over a
    new schema-typed fixture (examples/fuzzable_linear.flow.yaml) against
    gracefully_handles_input (examples/fuzz_properties.py); seed = run id
    (echoed for local repro), minimized counterexample uploaded on failure.
    Scheduled-only (off the PR-blocking path).
  • Docs/exports: docs/security.md (loading untrusted flow files),
    docs/agent-context/workflows.md (fuzz job + corpus), docs/reference/error-table.md
    (CW-E051), new public symbols in chainweaver/__init__.py + regenerated
    tests/fixtures/public_api.json.
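
The bounded, iterative structural walk described for #416 can be sketched roughly as below. This is an illustration under stated assumptions, not the code in chainweaver/serialization.py: `Limits`, `LimitExceeded`, and `check_structure` are hypothetical stand-ins for FlowParseLimits, FlowSerializationError, and the internal walk, and the default values are arbitrary.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Limits:
    """Hypothetical stand-in for FlowParseLimits; the values are arbitrary."""

    max_nodes: int = 10_000
    max_depth: int = 32
    max_string_length: int = 65_536


class LimitExceeded(ValueError):
    """Hypothetical stand-in for FlowSerializationError."""


def check_structure(data: Any, limits: Limits = Limits()) -> int:
    """Walk already-parsed data with an explicit stack; return the node count.

    Every visit is charged as a node, so a container shared through YAML
    aliases is counted each time it is reached. That is what bounds the
    traversal of an alias bomb even though the objects are small in memory.
    """
    nodes = 0
    stack = [(data, 1)]  # (value, depth); no recursion, so no RecursionError
    while stack:
        value, depth = stack.pop()
        nodes += 1
        if nodes > limits.max_nodes:
            raise LimitExceeded(f"max_nodes exceeded ({limits.max_nodes})")
        if depth > limits.max_depth:
            raise LimitExceeded(f"max_depth exceeded ({limits.max_depth})")
        if isinstance(value, str):
            if len(value) > limits.max_string_length:
                raise LimitExceeded(
                    f"max_string_length exceeded ({limits.max_string_length})"
                )
        elif isinstance(value, dict):
            for key, child in value.items():
                stack.append((key, depth + 1))
                stack.append((child, depth + 1))
        elif isinstance(value, (list, tuple)):
            stack.extend((child, depth + 1) for child in value)
    return nodes
```

In this sketch the error message names the violated limit, mirroring the behaviour described above; a byte-size precheck on the raw text would run before parsing and is not shown.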

Testing

  • Linting passes (ruff check chainweaver/ tests/ examples/)
  • Formatting check passes (ruff format --check chainweaver/ tests/ examples/)
  • Type checking passes (python -m mypy chainweaver/ tests/)
  • All existing tests pass (python -m pytest tests/ -v: 2062 passed, 1 skipped, 93% coverage)
  • New tests added for new functionality (test_flow_corpus.py, test_schema_ref_policy.py)

The fuzz invocation was verified locally green across 5 seeds; the CLI
--schema-ref-allow flag was verified end-to-end (CW-E051 with a non-allowlisted
ref, clean run without).

Issues closed by this PR

Closes #345
Closes #416
Closes #400
Closes #340

Related Issues

Checklist

  • Code follows project conventions (see AGENTS.md and docs/agent-context/)
  • Public API changes are documented
  • No secrets or credentials included

Scope notes & decisions (Mode B)

  • Grouped four issues by explicit request ("one PR for all"); they share one
    implementation path and the issues themselves say to land together.
  • Parse limits are auto-applied defaults + a full limits= API; I did not
    add 5×N per-limit CLI flags (surface creep) — the conservative defaults
    protect the CI Action automatically. Per-limit CLI overrides are an easy
    follow-up if wanted.
  • --schema-ref-allow is on run/serve only, not validate/check:
    those commands only deserialize and never import refs, so the flag would be a
    misleading no-op there.
  • #340 (Add a scheduled fuzz-testing job to CI) needed a green-by-default
    gate. The builtin flow_succeeds property intentionally flags inputs a flow
    rejects, and the fuzzer corrupts ~50% of generated inputs, so it can never
    be green over a strict-schema flow. I added a schema-typed fixture + a
    gracefully_handles_input invariant (success with output, or a typed
    recorded failure, but never a crash), which is the genuine robustness
    contract a fuzz gate should protect.
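
The difference between the two properties can be made concrete with a small sketch. It assumes a result object with a success flag, an optional output, and an optional recorded error type; `RunResult` and `handles_gracefully` are hypothetical names, and the real property in examples/fuzz_properties.py and chainweaver's result type may look different.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class RunResult:
    """Hypothetical result shape; the real chainweaver type may differ."""

    success: bool
    output: Optional[Dict[str, Any]] = None
    error_type: Optional[str] = None  # name of the typed error that was recorded


def handles_gracefully(result: RunResult) -> bool:
    """Accept success with output, or a failure that recorded a typed error.

    A crash never reaches this check: it would surface as an exception
    escaping the run, which a fuzz harness reports as a failure on its own.
    """
    if result.success:
        return result.output is not None
    return bool(result.error_type)
```

A property shaped like flow_succeeds would return `result.success` alone, which is why it fails on every corrupted input that a strict schema correctly rejects.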

🤖 Generated with Claude Code

https://claude.ai/code/session_01A2nffdLkZjypkTDW6rGLeo



Bundles four tightly-coupled hardening issues that share the flow
deserialization path (the issues explicitly cross-reference one another):

#416 — parse-size & structural guardrails: add FlowParseLimits
  (max_bytes/nodes/depth/string_length/steps) applied by default in
  flow_from_json/yaml/dict; a byte-size precheck plus a bounded structural
  walk that also caps traversal of YAML alias/anchor expansion. Overridable
  via limits=, with FlowParseLimits.unlimited() to opt out.

#345 — schema-ref module-resolution allowlist: a ContextVar-backed policy
  (set_schema_ref_policy / schema_ref_policy / SchemaRefAllowlist) consulted
  in resolve_class_ref BEFORE importlib.import_module, so a denied module's
  top-level code never runs. New SchemaRefPolicyError (CW-E051). Surfaced on
  the CLI as `--schema-ref-allow PREFIX` for `run` and `serve` (the commands
  that actually resolve refs).

#400 — adversarial corpus: tests/corpus/flow_files with 29 malformed files +
  a manifest, driven through the library loaders and `chainweaver validate`
  (table + JSON), plus generated resource-shaped cases for the #416 limits.

#340 — scheduled fuzz workflow: .github/workflows/fuzz.yml runs the existing
  fuzz harness weekly + on dispatch over a new schema-typed fixture
  (examples/fuzzable_linear.flow.yaml) against a graceful-handling invariant
  (examples/fuzz_properties.py); seed echoed for repro, counterexample
  uploaded on failure.

Docs: security.md (untrusted flow-file loading), workflows.md (fuzz job +
corpus), error-table.md (CW-E051). Public API + snapshot updated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01A2nffdLkZjypkTDW6rGLeo
Copilot AI review requested due to automatic review settings June 25, 2026 05:44
The lint job's actionlint/shellcheck flagged SC2153 ("SEED/RUNS may not be
assigned; did you mean seed/runs?") because the run script reassigned the
env-provided vars to lowercase locals. Drop the locals and reference the
all-caps env vars directly — shellcheck treats all-caps names as environment
variables, so neither SC2153 nor SC2154 fires. Behaviour is unchanged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01A2nffdLkZjypkTDW6rGLeo

Copilot AI (Contributor) left a comment

Pull request overview

Hardens ChainWeaver’s untrusted flow-file deserialization boundary by adding configurable parse guardrails, introducing an opt-in schema-ref module allowlist (to prevent unsafe module imports), and adding regression infrastructure (adversarial corpus + scheduled fuzz workflow) to keep these guarantees stable over time.

Changes:

  • Add FlowParseLimits + default limits applied by flow_from_json/flow_from_yaml/flow_from_dict to bound size/structure before validation.
  • Add SchemaRefAllowlist + schema_ref_policy/set_schema_ref_policy (ContextVar-backed) and wire --schema-ref-allow into chainweaver run/serve.
  • Add adversarial corpus tests + scheduled fuzz workflow + supporting examples/docs/error-table/public-API snapshot updates.

Reviewed changes

Copilot reviewed 48 out of 49 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `tests/test_schema_ref_policy.py` | Tests schema-ref allowlist matching, pre-import rejection, scoping, and CLI flag behavior. |
| `tests/test_flow_corpus.py` | Drives a manifest-defined invalid-flow corpus through library loaders and chainweaver validate, plus generated resource-shaped limit cases. |
| `tests/test_cli_serve.py` | Updates serve CLI test to include new schema_ref_allow param. |
| `tests/schema_ref_sentinel.py` | Sentinel module used to prove schema-ref rejection happens before import side effects. |
| `tests/fixtures/public_api.json` | Updates public API snapshot for new exports and loader signatures (limits=). |
| `tests/corpus/flow_files/README.md` | Documents corpus layout and how to add cases. |
| `tests/corpus/flow_files/manifest.json` | Manifest enumerating invalid-flow corpus inputs and expected detail substrings. |
| `tests/corpus/flow_files/invalid/top_level_list.flow.json` | Corpus case: invalid top-level shape (list). |
| `tests/corpus/flow_files/invalid/top_level_string.flow.json` | Corpus case: invalid top-level shape (string). |
| `tests/corpus/flow_files/invalid/top_level_number.flow.json` | Corpus case: invalid top-level shape (number). |
| `tests/corpus/flow_files/invalid/top_level_null.flow.json` | Corpus case: invalid top-level shape (null). |
| `tests/corpus/flow_files/invalid/top_level_list.flow.yaml` | Corpus case: invalid top-level shape (YAML list). |
| `tests/corpus/flow_files/invalid/top_level_scalar.flow.yaml` | Corpus case: invalid top-level shape (YAML scalar). |
| `tests/corpus/flow_files/invalid/trailing_comma.flow.json` | Corpus case: invalid JSON syntax. |
| `tests/corpus/flow_files/invalid/truncated.flow.json` | Corpus case: truncated JSON. |
| `tests/corpus/flow_files/invalid/bom_prefixed.flow.json` | Corpus case: UTF-8 BOM-prefixed JSON. |
| `tests/corpus/flow_files/invalid/unclosed_sequence.flow.yaml` | Corpus case: invalid YAML (unclosed sequence). |
| `tests/corpus/flow_files/invalid/tab_indentation.flow.yaml` | Corpus case: invalid YAML indentation (tabs). |
| `tests/corpus/flow_files/invalid/unsafe_python_tag.flow.yaml` | Corpus case: unsafe YAML tag rejected by safe loader. |
| `tests/corpus/flow_files/invalid/empty.flow.yaml` | Corpus case: empty YAML payload. |
| `tests/corpus/flow_files/invalid/missing_type.flow.json` | Corpus case: missing type discriminator (JSON). |
| `tests/corpus/flow_files/invalid/unknown_type.flow.json` | Corpus case: unknown type discriminator (JSON). |
| `tests/corpus/flow_files/invalid/missing_type.flow.yaml` | Corpus case: missing type discriminator (YAML). |
| `tests/corpus/flow_files/invalid/future_format_version.flow.json` | Corpus case: unsupported format_version. |
| `tests/corpus/flow_files/invalid/missing_required_fields.flow.json` | Corpus case: missing required fields (JSON). |
| `tests/corpus/flow_files/invalid/steps_not_a_list.flow.json` | Corpus case: wrong steps type (string). |
| `tests/corpus/flow_files/invalid/steps_null.flow.json` | Corpus case: steps is null. |
| `tests/corpus/flow_files/invalid/name_wrong_type.flow.json` | Corpus case: wrong name type. |
| `tests/corpus/flow_files/invalid/version_wrong_type.flow.json` | Corpus case: wrong version type. |
| `tests/corpus/flow_files/invalid/step_missing_tool_name.flow.json` | Corpus case: missing step tool_name. |
| `tests/corpus/flow_files/invalid/step_not_an_object.flow.json` | Corpus case: step element not an object. |
| `tests/corpus/flow_files/invalid/step_input_mapping_scalar.flow.json` | Corpus case: wrong input_mapping type. |
| `tests/corpus/flow_files/invalid/dag_missing_version.flow.json` | Corpus case: DAGFlow missing required version. |
| `tests/corpus/flow_files/invalid/dag_step_missing_id.flow.json` | Corpus case: DAGFlowStep missing required id. |
| `tests/corpus/flow_files/invalid/steps_scalar.flow.yaml` | Corpus case: steps scalar in YAML. |
| `tests/corpus/flow_files/invalid/missing_required_fields.flow.yaml` | Corpus case: missing required fields (YAML). |
| `examples/README.md` | Documents fuzz fixture/property used by the scheduled fuzz workflow. |
| `examples/fuzzable_linear.flow.yaml` | Adds a schema-typed example flow intended for fuzzing. |
| `examples/fuzz_properties.py` | Adds gracefully_handles_input property for fuzz gating. |
| `docs/security.md` | Documents untrusted flow-file guardrails and schema-ref allowlisting usage. |
| `docs/reference/error-table.md` | Adds CW-E051 for SchemaRefPolicyError. |
| `docs/agent-context/workflows.md` | Documents the fuzz workflow and adversarial corpus testing patterns. |
| `chainweaver/serialization.py` | Implements FlowParseLimits + default parse guardrails across loaders. |
| `chainweaver/flow.py` | Implements ContextVar-backed schema-ref policy and pre-import enforcement in resolve_class_ref. |
| `chainweaver/exceptions.py` | Adds SchemaRefPolicyError and assigns stable code CW-E051. |
| `chainweaver/cli/run.py` | Wires --schema-ref-allow into run and serve. |
| `chainweaver/cli/_shared.py` | Adds shared --schema-ref-allow option + helper to apply the allowlist. |
| `chainweaver/__init__.py` | Exposes new public symbols (FlowParseLimits, DEFAULT_PARSE_LIMITS, schema-ref policy helpers, SchemaRefPolicyError). |
| `.github/workflows/fuzz.yml` | Adds a scheduled/manual fuzz workflow that runs chainweaver fuzz and uploads minimized counterexamples on failure. |

Comment thread tests/test_flow_corpus.py Outdated
Comment thread chainweaver/cli/_shared.py
- test_flow_corpus: assert the real >= 25 corpus cases (29 committed satisfy
  it) instead of >= 20, matching issue #400's acceptance criterion.
- cli: fix the --schema-ref-allow help example so the sentence-final period
  isn't copy-pasted as a trailing dot in the module prefix.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01A2nffdLkZjypkTDW6rGLeo
The `_assert_fast_failure` / library-loader corpus checks assert a
guardrail-bounded parse fails in under `_MAX_PARSE_SECONDS`. Guardrail
failures are milliseconds-scale, so the 2.0s ceiling only needs to catch
an unbounded blowup or hang — not micro-time the parse. Widen it to 5.0s
so it does not flake on a saturated CI runner while still proving the
#416 fast-failure contract.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011E7oGoTqo75hWRJwLftdow
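
A fast-failure check of the kind this commit describes might look like the sketch below. It is not the helper in tests/test_flow_corpus.py: `assert_fast_failure` and `MAX_PARSE_SECONDS` here are hypothetical stand-ins for `_assert_fast_failure` and `_MAX_PARSE_SECONDS`, whose real signatures are not shown in this PR page.

```python
import time
from typing import Callable, Type

# Generous ceiling: meant to catch a hang or unbounded blowup, not to
# micro-time a rejection that normally takes milliseconds.
MAX_PARSE_SECONDS = 5.0


def assert_fast_failure(
    load: Callable[[], object], expected: Type[BaseException]
) -> float:
    """Assert that load() raises `expected` within the ceiling; return seconds."""
    start = time.perf_counter()
    try:
        load()
    except expected:
        elapsed = time.perf_counter() - start
    else:
        raise AssertionError("loader accepted input that should be rejected")
    assert elapsed < MAX_PARSE_SECONDS, f"rejection took {elapsed:.2f}s"
    return elapsed
```

Using a monotonic clock such as `time.perf_counter` keeps the measurement immune to wall-clock adjustments on the runner, and a wide ceiling trades precision for stability on saturated CI machines.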
@dgenio merged commit 213631d into main on Jul 3, 2026
21 checks passed
@dgenio deleted the claude/issue-triage-grouping-hy3zh2 branch on July 3, 2026 00:06