fix: null|type, skill embed flag, doc corrections, lockstep streaming test by hakimjonas · Pull Request #10 · hakimjonas/lambe

hakimjonas · 2026-05-31T19:20:13Z

Summary

A bundle of small fixes against main that came out of an audit of the
0.11.0 binary against the published skill. Six self-contained commits,
each independently revertable.

1. `null | type` runtime bug fix (`a70b8d1`)

The Pipe evaluator unconditionally short-circuited on a null input,
so null | type returned JSON null instead of the string "null" that
type is defined to produce on any value.

Fix: introduce nullSafePipeOpNames (a Set in lib/src/shape/pipe_ops.dart)
and have _pipe consult it before short-circuiting. Currently only type
opts in. Three pinned regression tests including null | type | length
which confirms the result is a string by chaining length and getting 4.

2. Doc + idiom-hint corrections (`bde4cb3`)

SKILL.md previously claimed and/or/not would be parser-rejected.
In fact and and or are accepted as keyword aliases for && / ||
(intentional jq-compat). not is not aliased; use !.
Added a def non-goal hint: lambé has no user-defined functions,
recursion, or closures by design. The parser now redirects def
attempts with that explanation.

3. `lam --skill` flag (`cc20b68`)

Embeds .agents/skills/lambe/SKILL.md at compile time via a code-gen
step (tool/gen_skill.dart → lib/src/_skill.dart) and surfaces it as
lam --skill. The intended use is install-via-shell-out:

lam --skill > .agents/skills/lambe/SKILL.md

Three CLI integration tests: YAML-frontmatter sanity check, Markdown
round-trip via lam -f markdown, byte-for-byte parity with the on-disk
source.

4. Lockstep streaming test (`d6517a3`)

The --ndjson stdin streaming test was wall-clock-based and was being
skipped under CI=true. Reframed the property: streaming means N inputs
produce N outputs in lockstep. The new helper feeds one line, awaits its
output, then feeds the next; a buffered implementation never produces
the first output before EOF, so the per-line .timeout() fails clearly.
No wall-clock comparison, no CI skip.

5. Static analyser models the Pipe null short-circuit (`e2aae12`)

The runtime fix in #1 surfaced a parallel bug in inferShape and
_analyzeRejection: they didn't know about _pipe's null
short-circuit, so null | length inferred shape any and --explain
warned length rejects null; this will throw at runtime — a warning
that contradicts both the documented "navigation on null returns null"
contract and the actual runtime (which short-circuits and returns
null). as(...) had the same flavour of bug via a separate code path
(inferShape calls _asShape directly, skipping inferPipeOpShape),
making null | as(toml) falsely advertise TOML/HCL as writable.

Fixed in three places: inferPipeOpShape returns SNull for null
input on non-null-safe ops, _analyzeRejection suppresses the
"will throw" warning under the same condition, and the As branch
in infer.dart short-circuits before synthesis. Nine pinned tests
in shape_explain_test.dart covering length / has / keys / sum /
type / type|length / as(toml) / as(csv) plus a negative test that
non-null rejections still warn.

6. CI: generated-files-in-sync (`2f9b8fd`)

lib/src/_version.dart, lib/src/_skill.dart, and doc/lam.1 are
checked-in artifacts produced by tool/gen_version.dart,
tool/gen_skill.dart, and tool/manpage.dart. New CI job re-runs all
three and fails the build if any output differs from the committed
copy. Catches the drift class where a SKILL.md edit lands without
regenerating the embedded copy.

Test plan

dart test — 1673 tests pass (was 1664; +9 new explain tests)
dart format — clean
dart analyze — no issues
AOT compile + smoke: null | type returns "null", null | length returns null
lam -n --explain 'null | as(toml)' no longer claims TOML is writable
lam --skill | lam -f markdown '.title' returns the embedded title
Streaming test runs in ~2s on JIT, no CI skip needed
Generators produce no diff against committed artifacts

Notes

No version bump and no CHANGELOG entry — release decisions are out of scope.
nullSafePipeOpNames is intentionally a Set rather than a nullSafe
field on PipeOpInfo: avoids touching all 28 op declarations to add
one bit of metadata only type currently uses.

…uit) `_pipe` in evaluator.dart unconditionally short-circuited on null inputs: `if (input == null) return null`. That implements the "navigation on null returns null" contract (correct for `.field`, `flatten`, `length`, `map`, …) but breaks ops whose semantics are *defined* over a null context. The canonical case is `type`, whose `--explain` shape is `SString` and whose spec eval has a deliberate `null => 'null'` branch — but that branch was unreachable from any `Pipe` AST node because `_pipe` returned null before the spec was consulted. Concretely on lambé 0.11.0: $ lam -n 'null | type' null # JSON null, not the string "null" $ lam -n '(null | type) == "null"' false $ lam -n 'null | type | length' null # short-circuit cascades The fix adds `nullSafePipeOpNames`, a set of op names that opt out of the short-circuit. `_pipe` consults it; if the right-hand op is a BuiltinPipeOp whose name is in the set, the null is forwarded to its `eval`. Currently the set contains only `type`. Adding an op is a single-line change in pipe_ops.dart. `has` is a related candidate but its spec eval throws on non-map/non-list context rather than returning false. Promoting `has` to nullSafe would change the user-visible behaviour from "returns null" to "throws QueryError", which is a separate design call. Out of scope for this fix. Tests pin the bug at the Pipe AST level: the existing `query('type', null)` test passed because it bypasses `Pipe` entirely (top-level evaluation of a bare `type` expression doesn't go through `_pipe`). Three new tests in `test/to_number_type_test.dart` exercise the Pipe form (`null | type`, `.missing | type`, and a chained `null | type | length`) that would have caught this regression. Verified: 1656 existing tests still pass; the 1 failing test (`--ndjson stdin streaming`) is a pre-existing timing flake on this machine, reproduces on plain main without this change.

Two small fixes that surface together: 1. `.agents/skills/lambe/SKILL.md` previously told agents not to write `and` / `or` / `not` and claimed the parser would reject them. That is correct for `not` but wrong for `and` and `or` — both are documented keyword aliases for `&&` / `||` (lib/src/parser.dart lines 16–20). Updated the skill to accurately describe which forms are aliased and which require the punctuation form. 2. `def` was the only common jq idiom without a tailored `_jqIdiomHint`. The query `def f: . + 1; f` produced the generic "expected ..." token list with no guidance. Added a hint that names the non-goal explicitly: lambé is a bounded tree transformer; no `def`, no recursion, no closures. A pinned test in `test/parse_error_format_test.dart` exercises the `def` redirect to keep it from regressing.

`lam --skill` writes the embedded `.agents/skills/lambe/SKILL.md` to stdout and exits. Lets agent harnesses install the skill regardless of how `lam` was acquired: lam --skill > .agents/skills/lambe/SKILL.md This addresses an adoption gap: `dart pub global activate lambe` does not ship the `.agents/` directory, so pub-installed users have no way to discover the skill on disk. The release-binary path (install.sh from the GitHub release) includes the binary but not the skill file either. Embedding closes both gaps with a single command and a filesystem-free implementation that works in any execution model (AOT, JIT, future WASM CLI). Mechanics mirror `lambeVersion`: - New `tool/gen_skill.dart` reads `.agents/skills/lambe/SKILL.md` and emits `lib/src/_skill.dart` containing a `const String lambeSkill`. - The generator is committed-output: `lib/src/_skill.dart` lives in the tree so `dart compile exe` works from a fresh checkout without a generator pre-step. Re-run after editing the skill source. - Generator uses a raw triple-quoted Dart string (`r'''...'''`) and fails loudly if the skill ever contains `'''` so the embedding scheme stays simple. - `bin/lam.dart` adds an `--skill` flag, handled before any input-acquisition logic so it works without stdin or a file argument. - Three CLI integration tests pin the wiring: header shape, markdown round-trip via `lam -f markdown`, and byte-for-byte parity with the on-disk source (catches "regenerated _skill.dart was not committed" regressions). - `doc/lam.1.md` documents the flag; `doc/lam.1` regenerated from it. `lambeSkill` is also exported from `package:lambe` so library consumers can read the same string without shelling out.

…ock check The --ndjson stdin streaming test asserted that the second output line arrived ≥150 ms before EOF, then ran a parallel timer to prove the time gap. Both halves were sensitive to host CPU scheduling and stdio buffering, so the test was marked CI-skipped. Useful signal lost. The replacement reframes the property: streaming means N inputs produce N outputs in lockstep. The test now feeds one ndjson line, awaits its output line, then feeds the next, etc. A buffered implementation never produces the first output before EOF, so the per-line wait times out and the test fails with a clear streaming-broken signal. No wall-clock comparison; no CI skip. The helper avoids `package:async` by hand-rolling a completer queue front of the stdout LineSplitter.

The runtime [Pipe] evaluator short-circuits on null: a null left-hand side returns null without invoking the right-hand op (except for ops listed in [nullSafePipeOpNames] like `type`). This is the documented "navigation on null returns null" contract and is pinned by tests in null_propagation_test.dart. The static analyser did not model this, with two visible consequences: 1. `inferPipeOpShape` saw a null input and returned [SAny] (since most ops' `accepts` predicate rejects null), so `null | length` inferred to 'any' instead of 'null'. 2. `_analyzeRejection` then saw the not-accepted predicate and emitted "length rejects null; this will throw at runtime" — a warning that contradicts the documented contract and the actual runtime behaviour. Same shape of bug in `infer.dart`: `as(target)` bypasses `inferPipeOpShape` to call the synthesis table directly, so even after fixing the spec-table path, `null | as(toml)` still inferred to `map<value: null>` and falsely listed TOML/HCL as writable. Fix: thread the null short-circuit through both paths. - `inferPipeOpShape`: when input is [SNull] and the op is not in [nullSafePipeOpNames], return [SNull]. - `_analyzeRejection` in explain.dart: suppress the "will throw at runtime" warning under the same condition. - `inferShape` in infer.dart: gate the `As` branch the same way before reaching `_asShape`. Nine pinned tests in shape_explain_test.dart cover length / has / keys / sum / type / type|length / as(toml) / as(csv) and a negative test (non-null shape that the op rejects must still warn).

`lib/src/_version.dart` (from pubspec.yaml), `lib/src/_skill.dart` (from .agents/skills/lambe/SKILL.md), and `doc/lam.1` (from doc/lam.1.md) are checked-in artifacts produced by tools in tool/. Without a CI guard, a SKILL.md edit can land without regenerating the embedded copy and `lam --skill` ships stale content until the next manual release-prep run. The runtime byte-parity test in cli_integration_test.dart only catches drift after a regression has already merged. New `generated-files-in-sync` job re-runs the three generators and fails the build if any of the three files differ from the committed copies. Output includes the diff and the exact commands to run locally.

`test/to_number_type_test.dart` (added in commit a70b8d1) and `tool/gen_skill.dart` (added in commit cc20b68) were created without running `dart format` so the project-wide format CI job fails on this branch. Same content, just reflowed to dart-format conventions. Catches up the format CI before this branch is reviewed; the new `generated-files-in-sync` job verified earlier in the branch is unrelated and was already passing.

* refactor(shape): hand-build curated remediation templates, drop parser import `check.dart` previously imported the parser to validate four curated remediation source strings into [LamExpr] templates at construction time. Those four templates are constants in practice — `{items: .}`, `{value: .}`, `to_entries`, and `{value: .} | to_entries` — and hand-building them as `const LamExpr` values gives the same result without dragging the parser into the import graph. Why it matters: `pipe_ops.dart` is consumed by the parser, so the chain `pipe_ops → synthesize → check → parser → pipe_ops` was cyclic. That cycle prevented `pipe_ops.dart` from importing `synthesize.dart`, which forced the spec table's `_asSpec` entry to live as a stub while the real `as` inference was special-cased in `infer.dart`. Breaking the cycle here is the precondition for unifying `As` through the spec table. Side effect: the public `Remediation()` and `Remediation.withDisplay()` factories that took a `source` string are removed. Internal callers exclusively used `Remediation._` (the private factory). The public type and its read-only fields stay exported, so destructuring callers in `bin/lam.dart` are unaffected. Pre-1.0 breaking change; if a downstream consumer needs string-to-`Remediation` they can compose `parseQuery` and a public hand-built constructor in their own code. No behavior change. All 1673 tests pass. * refactor: unify As through the spec table; eval takes the AST node The spec table in pipe_ops.dart was already declared as the single source of truth for pipe-op metadata, runtime, and parsing — but `as` violated this on three axes: 1. `infer.dart` had an `if (expr is As)` branch that bypassed `inferPipeOpShape` to call a private `_asShape` helper. Any invariant added to `inferPipeOpShape` (e.g. the null short-circuit added in PR #10 commit 5) had to be re-implemented at the As branch or it would not apply. 2. `evaluator.dart` had a similar `As(:final target) => _as(...)` branch with its own private `_as` runtime that duplicated the bridge-resolution logic. 3. The spec's `_asSpec` entry was a stub: `infer` returned SAny, `eval` threw, both with comments saying "the real logic lives in <other file>". This commit makes `_asSpec` a real spec entry. Both `infer` and `eval` now consult the synthesis table directly via the imports unblocked by the previous commit's cycle break. The dispatch site becomes uniform: every pipe op (BuiltinPipeOp and As) flows through `evalPipeOp` and `inferPipeOpShape`, and per-op invariants apply to every op without per-AST branches. To make `_asSpec.eval` able to read its typed `As.target` argument, the `PipeOpInfo.eval` signature changes from `(ctx, args, eval)` to `(ctx, op, eval)`. Specs that need arguments destructure the AST themselves: `(op as BuiltinPipeOp).args[0]` for the generic case, `(op as As).target` for the typed case. Nine existing specs are updated mechanically; the 19 zero-arg specs use `(_, _, _)` and need no body change. `evalBuiltinPipeOp` is renamed to `evalPipeOp` and now accepts any LamExpr, dispatching via `pipeOpInfoFor`. The evaluator's switch case becomes `BuiltinPipeOp() || As() => evalPipeOp(...)`. Removed: - `_as` helper and its imports in evaluator.dart - `_asShape` helper and its imports in infer.dart - The `if (expr is As)` branch in inferShape No behavior change. All 1673 tests pass, including the nine null-short-circuit explain tests added in PR #10 commit 5 — those now pass via the unified spec path instead of the special-case As branch in infer.dart. * docs: update ast.dart and pipe_ops.dart comments to match unified dispatch The As class doc described the runtime as living in evaluator.dart's `_as` (now removed). The PipeOpParseKind.custom enum doc said the spec provides shape metadata only. Both reflected the pre-refactor world. Updated to describe the actual dispatch model: As keeps a dedicated AST class for its typed argument, but inference and runtime flow through the spec table. * test(shape): pin As through unified evalPipeOp dispatch The BuiltinPipeOp round-trip test skipped `custom` parseKind ops, so the As node's resolution through the unified spec-table dispatch was unpinned — the central invariant of this PR. Add three cases: - As(fmt) resolves through pipeOpInfoFor to the `as` spec for every OutputFormat (the missing half of the round-trip test). - evalPipeOp runs an As node end-to-end via the spec's eval field. - evalPipeOp rejects a non-pipe-op AST with the programmer-error QueryError, covering the reworded guard message.

…#12) PR #10 fixed six bugs against the 0.11.0 binary surfaced by an audit against the published skill. PR #11 (stacked) addressed the underlying smell: `as` was the lone pipe op outside the spec table, so per-op invariants like the null short-circuit had to be reimplemented at its callsite. Both PRs landed before the release tag; this commit bumps versions, regenerates embedded artefacts, and corrects the diacritic spelling of "Lambë" (the umlaut form is the Quenya original; both an acute-form typo and a bare-ASCII typo had spread across docs and prose). Test count: 1676. Breaking: PipeOpInfo.eval signature and the public Remediation source-string factories.

hakimjonas added 7 commits May 31, 2026 20:46

hakimjonas mentioned this pull request May 31, 2026

refactor: unify As through the spec table #11

Merged

8 tasks

hakimjonas merged commit 5d785bd into main Jun 1, 2026
5 checks passed

hakimjonas deleted the fix/null-safe-pipe-ops-and-skill-flag branch June 1, 2026 07:26

hakimjonas mentioned this pull request Jun 1, 2026

Lambë 0.12.0 #12

Merged

7 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: null|type, skill embed flag, doc corrections, lockstep streaming test#10

fix: null|type, skill embed flag, doc corrections, lockstep streaming test#10
hakimjonas merged 8 commits into
mainfrom
fix/null-safe-pipe-ops-and-skill-flag

hakimjonas commented May 31, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

hakimjonas commented May 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

1. null | type runtime bug fix (a70b8d1)

2. Doc + idiom-hint corrections (bde4cb3)

3. lam --skill flag (cc20b68)

4. Lockstep streaming test (d6517a3)

5. Static analyser models the Pipe null short-circuit (e2aae12)

6. CI: generated-files-in-sync (2f9b8fd)

Test plan

Notes

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

hakimjonas commented May 31, 2026 •

edited

Loading

1. `null | type` runtime bug fix (`a70b8d1`)

2. Doc + idiom-hint corrections (`bde4cb3`)

3. `lam --skill` flag (`cc20b68`)

4. Lockstep streaming test (`d6517a3`)

5. Static analyser models the Pipe null short-circuit (`e2aae12`)

6. CI: generated-files-in-sync (`2f9b8fd`)