Skip to content

fix(jq): @tsv/@csv defs + short jq-style errors; cross-tool no-Debug-leak guard (TM-INF-022)#1501

Merged
chaliy merged 4 commits intomainfrom
claude/fix-jq-harness-expression-4oFW8
May 2, 2026
Merged

fix(jq): @tsv/@csv defs + short jq-style errors; cross-tool no-Debug-leak guard (TM-INF-022)#1501
chaliy merged 4 commits intomainfrom
claude/fix-jq-harness-expression-4oFW8

Conversation

@chaliy
Copy link
Copy Markdown
Contributor

@chaliy chaliy commented May 2, 2026

Summary

Fixes a real-world jq filter that LLM agents were hitting (... | @tsv returning a multi-page jaq Debug dump), then generalizes the underlying leak class — format!("{:?}", e) on a wrapped library error — into a cross-tool guard with three enforcement layers (static + dynamic + fuzz).

What

The originating bug

jq -r 'if (.data | length) == 0 then "..." else (.data[] | [...] | @tsv) end'

failed with stderr containing the entire jaq File { code: "<800 chars of prepended compat-defs source>", path: () }, [("@tsv", Filter(0))]. Two issues:

  1. @tsv/@csv aren't defined in jaq-std — bashkit needs to provide them.
  2. The error formatter used format!("{:?}", e) which dumps internal jaq struct shapes.

The fix

  • builtins/jq.rs: added @tsv and @csv to JQ_COMPAT_DEFS (using jq's def @name: syntax matching jaq's existing @json, @text).
  • format_jq_compile_errors / format_jq_load_errors: replace the {:?} formatting with domain-specific formatters that emit short jq-style messages — @tsv/0 is not defined, expected then, etc. — capped at 240 chars.

The generalization (TM-INF-022)

Realised this leak class affects every builtin wrapping a Debug-ful library. Three-layer guard, all run by cargo test:

  1. Staticbuiltins::tests::no_debug_fmt_in_builtin_source walks src/builtins/*.rs and forbids {:?} / {:#?} / {name:?} directives. Per-line opt-out via // debug-ok: <reason>.
  2. Dynamic per-tool — each tool's mod tests (jq, awk, grep, sed, json, yaml, tomlq, csv, bc, expr, semver, date, numfmt) calls bashkit::testing::assert_no_leak against curated malformed inputs. Plus a cross-tool sweep that throws --xyzzy-not-a-real-flag at every common builtin.
  3. Fuzz — every cargo-fuzz target plus 5 new proptest cases call bashkit::testing::assert_fuzz_invariants on random inputs. This also enforces:
    • TM-INF-013 regression guardfuzz_init() seeds a magic BASHKIT_FUZZ_HOST_CANARY_* value into the host OS env. If it ever appears in builtin stdout/stderr, a builtin is reading std::env instead of sandboxed ctx.env.
    • TM-INF-016 host pathsUNIVERSAL_BANNED includes /rustc/, /.cargo/registry/, target/{debug,release}/deps/, /.rustup/toolchains/.
    • Per-call stderr ≤ 1 KB — catches "one bad input floods stderr" regressions.

The shared helpers live in bashkit::testing (#[doc(hidden)] pub mod) so internal tests, integration scaffold tests, and the cargo-fuzz crate all use the same banned-substring list and canary plumbing.

Why

Stderr is the agent-facing error API. Real shell tools print short, opaque error messages — Debug output dumps internal struct shapes that confuse LLM agents and waste tokens. Documented as TM-INF-022 in specs/threat-model.md.

How

  • 14 cargo-fuzz targets had 2>/dev/null stripped (it was hiding exactly what we want to inspect) and route through bashkit::testing::fuzz_exec for the assertion.
  • builtins::debug_leak_check is now a thin re-export of crate::testing so the per-tool inline tests keep their existing imports.
  • Existing legitimate {:?} test asserts in awk.rs, grep.rs, jq.rs, ls.rs annotated with // debug-ok: assert-failure message.

Test plan

  • cargo test --all-features --lib → 2640 passed (was 2604 — +36 new tests)
  • cargo test --all-features --tests → all integration suites pass except ssh_supabase_tests::ssh_supabase_connects which is pre-existing/network-dependent and fails on origin/main too
  • cargo test --test jq_fuzz_scaffold_tests --test awk_fuzz_scaffold_tests --test proptest_security → 18 passed (5 proptest cases × 64 random inputs each = 320 random invocations exercised)
  • cargo check in crates/bashkit/fuzz/ → all 16 fuzz targets compile
  • cargo fmt --check clean
  • cargo clippy --all-targets -- -D warnings clean across the workspace
  • Smoke test: original bug-report filter via bashkit-cli -c '...' produces correct tab-separated output
  • Smoke test: echo {} | jq totally_undefined produces jq: error: totally_undefined/0 is not defined (real-jq-style, no leak)

Generated by Claude Code

chaliy added 4 commits May 2, 2026 01:08
LLM-generated filters that end with `| @tsv` (or @csv) hit a confusing
"@tsv/0 is not defined" compile error because jaq-std doesn't ship those
formatters. The error path then dumped jaq's internal Debug output
(File { code: ..., path: () }, [("@TSV", Filter(0))]) which leaked the
entire ~800 chars of prepended compat-defs source into stderr.

- Add jq-compatible @TSV and @csv definitions to JQ_COMPAT_DEFS so the
  filters compile and run.
- Format compile errors as `name/arity is not defined` and parse/lex
  errors as short `expected X` messages, matching real jq's tone.
- Cap diagnostic output at 240 chars and strip offending-token text so
  the prepended stdlib can't leak through any error path.
…Debug leaks

Two complementary defenses against the class of bug we just fixed in jq
(leaking jaq's `File { code: "...", path: () }, [("@TSV", Filter(0))]`
into stderr):

1. Static tripwire — `scripts/check-no-debug-fmt.sh` greps
   `crates/bashkit/src/builtins/` for `{:?}` / `{:#?}` / `{name:?}`.
   Fails CI on match. Per-line opt-out via `// debug-ok: <reason>`.
   Wired into `just pre-pr`.

2. Dynamic test — `tests/no_debug_leak_tests.rs` fires curated
   malformed inputs at high-risk builtins (jq, awk, grep, sed, json,
   yaml, tomlq, csv, bc, expr, semver, date, numfmt) plus a sweep of
   every common builtin with a bogus flag. For each, asserts stderr:
   - is ≤ 1 KB
   - contains no banned Debug-shape substrings (`File {`, `Token(`,
     `Vec [`, `Span {`, `Undefined::`, `Errors {`, …)
   - contains no jq-specific internals (`__bashkit_env__`, prepended
     compat-defs source, raw `Filter(N)` variant tags)

Annotated existing legitimate test-assert `{:?}` uses in awk.rs,
grep.rs, jq.rs, ls.rs with `// debug-ok: assert-failure message` so
they pass the static check.

Documented the policy in AGENTS.md.
…t as TM-INF-022

Restructure of the previous separate-suite + shell-script approach:

- The shell script `scripts/check-no-debug-fmt.sh` is replaced by a Rust
  test `builtins::tests::no_debug_fmt_in_builtin_source` that walks
  `src/builtins/*.rs` and forbids `{:?}` directives. Runs as part of
  `cargo test` — no separate justfile recipe, no shell dependency.
- The standalone `tests/no_debug_leak_tests.rs` suite is split into
  per-tool tests collocated with each tool's existing `mod tests`:
  jq.rs, awk.rs, grep.rs, sed.rs, json.rs, yaml.rs, tomlq.rs, csv.rs,
  bc.rs, expr.rs, semver.rs, date.rs, numfmt.rs.
- Shared helpers (`UNIVERSAL_BANNED`, `assert_no_leak`, `run`) live in
  `builtins::debug_leak_check` so the policy is centralized but per-
  tool tests stay with their tool.
- Cross-tool bogus-flag sweep moves to `builtins::tests`.
- New threat-model entry **TM-INF-022** documents the leak class, the
  originating jq bug, and the two-layer enforcement. AGENTS.md links to
  it instead of restating the rules.
- Reverts the `check-no-debug-fmt` justfile recipe — `cargo test`
  already runs the static check.

Net: 1 static test + 32 per-tool dynamic tests + 1 cross-tool sweep,
all under `cargo test`. 2640 lib tests pass (was 2604).
Extends the no-Debug-leak guard from per-tool unit tests into the
fuzz/proptest layer so leak regressions get caught on random inputs,
not just on hand-curated cases.

What's new:

1. **`bashkit::testing`** (crates/bashkit/src/testing.rs) — public,
   `#[doc(hidden)]` module exposing the helpers previously only
   reachable from inline `#[cfg(test)]` modules. Lets external tests
   and cargo-fuzz targets share the same banned-substring list and
   canary plumbing.

2. **Three invariants** enforced by `assert_fuzz_invariants`:
   - **A. No Debug shapes** in stderr (TM-INF-022, already in `no_leak`).
   - **B. Per-call stderr ≤ 1 KB** (`MAX_STDERR_BYTES`) — catches
     "one bad input floods stderr" regressions.
   - **C. No host paths** in stderr — extends `UNIVERSAL_BANNED` with
     `/rustc/`, `/.cargo/registry/`, `target/{debug,release}/deps/`,
     `/.rustup/toolchains/` (TM-INF-016 fuzz coverage).
   - **E. No host-env canary** in stdout/stderr — `fuzz_init()` seeds
     a magic `BASHKIT_FUZZ_HOST_CANARY_*` value into the host OS env
     once per process; if it ever appears in builtin output, a
     builtin is reading `std::env` instead of sandboxed `ctx.env`
     (TM-INF-013 regression guard).

3. **14 cargo-fuzz targets updated** to drop `2>/dev/null` (which
   discarded exactly what we want to inspect) and route through
   `bashkit::testing::fuzz_exec` instead of bare `bash.exec`.
   Skipped: `lexer_fuzz`, `parser_fuzz` (don't go through builtins).

4. **Scaffold tests** (`tests/{jq,awk}_fuzz_scaffold_tests.rs`)
   collapse to use `fuzz_exec` directly — same checks as the fuzz
   targets, runnable under `cargo test`.

5. **5 new proptest cases** in `tests/proptest_security.rs` push 64
   random strings each through jq/awk/grep/sed/json (320 random
   invocations per `cargo test` run), all asserting the same
   invariants.

6. **`builtins::debug_leak_check`** is now a thin re-export of
   `crate::testing` so existing per-tool inline tests keep working.

Threat model and AGENTS.md updated to describe the three-layer
enforcement (static / dynamic per-tool / fuzz). 2640 lib tests pass;
18 directly-affected integration tests pass.
@cloudflare-workers-and-pages
Copy link
Copy Markdown

Deploying with  Cloudflare Workers  Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status Name Latest Commit Preview URL Updated (UTC)
✅ Deployment successful!
View logs
bashkit 40a41c0 Commit Preview URL

Branch Preview URL
May 02 2026, 02:27 AM

@chaliy chaliy merged commit c0bfd32 into main May 2, 2026
34 checks passed
@chaliy chaliy deleted the claude/fix-jq-harness-expression-4oFW8 branch May 2, 2026 02:45
@codecov
Copy link
Copy Markdown

codecov Bot commented May 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant