Skip to content

fix: null|type, skill embed flag, doc corrections, lockstep streaming test#10

Merged
hakimjonas merged 8 commits into
mainfrom
fix/null-safe-pipe-ops-and-skill-flag
Jun 1, 2026
Merged

fix: null|type, skill embed flag, doc corrections, lockstep streaming test#10
hakimjonas merged 8 commits into
mainfrom
fix/null-safe-pipe-ops-and-skill-flag

Conversation

@hakimjonas
Copy link
Copy Markdown
Owner

@hakimjonas hakimjonas commented May 31, 2026

Summary

A bundle of small fixes against main that came out of an audit of the
0.11.0 binary against the published skill. Six self-contained commits,
each independently revertable.

1. null | type runtime bug fix (a70b8d1)

The Pipe evaluator unconditionally short-circuited on a null input,
so null | type returned JSON null instead of the string "null" that
type is defined to produce on any value.

Fix: introduce nullSafePipeOpNames (a Set in lib/src/shape/pipe_ops.dart)
and have _pipe consult it before short-circuiting. Currently only type
opts in. Three pinned regression tests including null | type | length
which confirms the result is a string by chaining length and getting 4.

2. Doc + idiom-hint corrections (bde4cb3)

  • SKILL.md previously claimed and/or/not would be parser-rejected.
    In fact and and or are accepted as keyword aliases for && / ||
    (intentional jq-compat). not is not aliased; use !.
  • Added a def non-goal hint: lambé has no user-defined functions,
    recursion, or closures by design. The parser now redirects def
    attempts with that explanation.

3. lam --skill flag (cc20b68)

Embeds .agents/skills/lambe/SKILL.md at compile time via a code-gen
step (tool/gen_skill.dartlib/src/_skill.dart) and surfaces it as
lam --skill. The intended use is install-via-shell-out:

lam --skill > .agents/skills/lambe/SKILL.md

Three CLI integration tests: YAML-frontmatter sanity check, Markdown
round-trip via lam -f markdown, byte-for-byte parity with the on-disk
source.

4. Lockstep streaming test (d6517a3)

The --ndjson stdin streaming test was wall-clock-based and was being
skipped under CI=true. Reframed the property: streaming means N inputs
produce N outputs in lockstep. The new helper feeds one line, awaits its
output, then feeds the next; a buffered implementation never produces
the first output before EOF, so the per-line .timeout() fails clearly.
No wall-clock comparison, no CI skip.

5. Static analyser models the Pipe null short-circuit (e2aae12)

The runtime fix in #1 surfaced a parallel bug in inferShape and
_analyzeRejection: they didn't know about _pipe's null
short-circuit, so null | length inferred shape any and --explain
warned length rejects null; this will throw at runtime — a warning
that contradicts both the documented "navigation on null returns null"
contract and the actual runtime (which short-circuits and returns
null). as(...) had the same flavour of bug via a separate code path
(inferShape calls _asShape directly, skipping inferPipeOpShape),
making null | as(toml) falsely advertise TOML/HCL as writable.

Fixed in three places: inferPipeOpShape returns SNull for null
input on non-null-safe ops, _analyzeRejection suppresses the
"will throw" warning under the same condition, and the As branch
in infer.dart short-circuits before synthesis. Nine pinned tests
in shape_explain_test.dart covering length / has / keys / sum /
type / type|length / as(toml) / as(csv) plus a negative test that
non-null rejections still warn.

6. CI: generated-files-in-sync (2f9b8fd)

lib/src/_version.dart, lib/src/_skill.dart, and doc/lam.1 are
checked-in artifacts produced by tool/gen_version.dart,
tool/gen_skill.dart, and tool/manpage.dart. New CI job re-runs all
three and fails the build if any output differs from the committed
copy. Catches the drift class where a SKILL.md edit lands without
regenerating the embedded copy.

Test plan

  • dart test — 1673 tests pass (was 1664; +9 new explain tests)
  • dart format — clean
  • dart analyze — no issues
  • AOT compile + smoke: null | type returns "null", null | length returns null
  • lam -n --explain 'null | as(toml)' no longer claims TOML is writable
  • lam --skill | lam -f markdown '.title' returns the embedded title
  • Streaming test runs in ~2s on JIT, no CI skip needed
  • Generators produce no diff against committed artifacts

Notes

  • No version bump and no CHANGELOG entry — release decisions are out of scope.
  • nullSafePipeOpNames is intentionally a Set rather than a nullSafe
    field on PipeOpInfo: avoids touching all 28 op declarations to add
    one bit of metadata only type currently uses.

…uit)

`_pipe` in evaluator.dart unconditionally short-circuited on null
inputs: `if (input == null) return null`. That implements the "navigation
on null returns null" contract (correct for `.field`, `flatten`,
`length`, `map`, …) but breaks ops whose semantics are *defined* over
a null context. The canonical case is `type`, whose `--explain` shape
is `SString` and whose spec eval has a deliberate `null => 'null'`
branch — but that branch was unreachable from any `Pipe` AST node
because `_pipe` returned null before the spec was consulted.

Concretely on lambé 0.11.0:

  $ lam -n 'null | type'
  null                              # JSON null, not the string "null"

  $ lam -n '(null | type) == "null"'
  false

  $ lam -n 'null | type | length'
  null                              # short-circuit cascades

The fix adds `nullSafePipeOpNames`, a set of op names that opt out of
the short-circuit. `_pipe` consults it; if the right-hand op is a
BuiltinPipeOp whose name is in the set, the null is forwarded to its
`eval`. Currently the set contains only `type`. Adding an op is a
single-line change in pipe_ops.dart.

`has` is a related candidate but its spec eval throws on non-map/non-list
context rather than returning false. Promoting `has` to nullSafe would
change the user-visible behaviour from "returns null" to "throws
QueryError", which is a separate design call. Out of scope for this fix.

Tests pin the bug at the Pipe AST level: the existing `query('type', null)`
test passed because it bypasses `Pipe` entirely (top-level evaluation
of a bare `type` expression doesn't go through `_pipe`). Three new
tests in `test/to_number_type_test.dart` exercise the Pipe form
(`null | type`, `.missing | type`, and a chained `null | type | length`)
that would have caught this regression.

Verified: 1656 existing tests still pass; the 1 failing test
(`--ndjson stdin streaming`) is a pre-existing timing flake on this
machine, reproduces on plain main without this change.
Two small fixes that surface together:

1. `.agents/skills/lambe/SKILL.md` previously told agents not to write
   `and` / `or` / `not` and claimed the parser would reject them.
   That is correct for `not` but wrong for `and` and `or` — both are
   documented keyword aliases for `&&` / `||` (lib/src/parser.dart
   lines 16–20). Updated the skill to accurately describe which forms
   are aliased and which require the punctuation form.

2. `def` was the only common jq idiom without a tailored
   `_jqIdiomHint`. The query `def f: . + 1; f` produced the generic
   "expected ..." token list with no guidance. Added a hint that
   names the non-goal explicitly: lambé is a bounded tree transformer;
   no `def`, no recursion, no closures.

A pinned test in `test/parse_error_format_test.dart` exercises the
`def` redirect to keep it from regressing.
`lam --skill` writes the embedded `.agents/skills/lambe/SKILL.md` to
stdout and exits. Lets agent harnesses install the skill regardless of
how `lam` was acquired:

  lam --skill > .agents/skills/lambe/SKILL.md

This addresses an adoption gap: `dart pub global activate lambe` does
not ship the `.agents/` directory, so pub-installed users have no way
to discover the skill on disk. The release-binary path (install.sh
from the GitHub release) includes the binary but not the skill file
either. Embedding closes both gaps with a single command and a
filesystem-free implementation that works in any execution model
(AOT, JIT, future WASM CLI).

Mechanics mirror `lambeVersion`:

- New `tool/gen_skill.dart` reads `.agents/skills/lambe/SKILL.md` and
  emits `lib/src/_skill.dart` containing a `const String lambeSkill`.
- The generator is committed-output: `lib/src/_skill.dart` lives in
  the tree so `dart compile exe` works from a fresh checkout without
  a generator pre-step. Re-run after editing the skill source.
- Generator uses a raw triple-quoted Dart string (`r'''...'''`) and
  fails loudly if the skill ever contains `'''` so the embedding
  scheme stays simple.
- `bin/lam.dart` adds an `--skill` flag, handled before any
  input-acquisition logic so it works without stdin or a file
  argument.
- Three CLI integration tests pin the wiring: header shape,
  markdown round-trip via `lam -f markdown`, and byte-for-byte
  parity with the on-disk source (catches "regenerated _skill.dart
  was not committed" regressions).
- `doc/lam.1.md` documents the flag; `doc/lam.1` regenerated from it.

`lambeSkill` is also exported from `package:lambe` so library consumers
can read the same string without shelling out.
…ock check

The --ndjson stdin streaming test asserted that the second output line
arrived ≥150 ms before EOF, then ran a parallel timer to prove the time
gap. Both halves were sensitive to host CPU scheduling and stdio
buffering, so the test was marked CI-skipped. Useful signal lost.

The replacement reframes the property: streaming means N inputs produce
N outputs in lockstep. The test now feeds one ndjson line, awaits its
output line, then feeds the next, etc. A buffered implementation never
produces the first output before EOF, so the per-line wait times out
and the test fails with a clear streaming-broken signal. No wall-clock
comparison; no CI skip.

The helper avoids `package:async` by hand-rolling a completer queue
front of the stdout LineSplitter.
The runtime [Pipe] evaluator short-circuits on null: a null left-hand
side returns null without invoking the right-hand op (except for ops
listed in [nullSafePipeOpNames] like `type`). This is the documented
"navigation on null returns null" contract and is pinned by tests in
null_propagation_test.dart.

The static analyser did not model this, with two visible consequences:

1. `inferPipeOpShape` saw a null input and returned [SAny] (since
   most ops' `accepts` predicate rejects null), so `null | length`
   inferred to 'any' instead of 'null'.

2. `_analyzeRejection` then saw the not-accepted predicate and
   emitted "length rejects null; this will throw at runtime" — a
   warning that contradicts the documented contract and the actual
   runtime behaviour.

Same shape of bug in `infer.dart`: `as(target)` bypasses
`inferPipeOpShape` to call the synthesis table directly, so even
after fixing the spec-table path, `null | as(toml)` still inferred
to `map<value: null>` and falsely listed TOML/HCL as writable.

Fix: thread the null short-circuit through both paths.
- `inferPipeOpShape`: when input is [SNull] and the op is not in
  [nullSafePipeOpNames], return [SNull].
- `_analyzeRejection` in explain.dart: suppress the "will throw at
  runtime" warning under the same condition.
- `inferShape` in infer.dart: gate the `As` branch the same way
  before reaching `_asShape`.

Nine pinned tests in shape_explain_test.dart cover length / has /
keys / sum / type / type|length / as(toml) / as(csv) and a negative
test (non-null shape that the op rejects must still warn).
`lib/src/_version.dart` (from pubspec.yaml), `lib/src/_skill.dart`
(from .agents/skills/lambe/SKILL.md), and `doc/lam.1` (from
doc/lam.1.md) are checked-in artifacts produced by tools in tool/.
Without a CI guard, a SKILL.md edit can land without regenerating
the embedded copy and `lam --skill` ships stale content until the
next manual release-prep run. The runtime byte-parity test in
cli_integration_test.dart only catches drift after a regression
has already merged.

New `generated-files-in-sync` job re-runs the three generators
and fails the build if any of the three files differ from the
committed copies. Output includes the diff and the exact commands
to run locally.
`test/to_number_type_test.dart` (added in commit a70b8d1) and
`tool/gen_skill.dart` (added in commit cc20b68) were created without
running `dart format` so the project-wide format CI job fails on
this branch. Same content, just reflowed to dart-format conventions.

Catches up the format CI before this branch is reviewed; the new
`generated-files-in-sync` job verified earlier in the branch is
unrelated and was already passing.
* refactor(shape): hand-build curated remediation templates, drop parser import

`check.dart` previously imported the parser to validate four curated
remediation source strings into [LamExpr] templates at construction
time. Those four templates are constants in practice — `{items: .}`,
`{value: .}`, `to_entries`, and `{value: .} | to_entries` — and
hand-building them as `const LamExpr` values gives the same result
without dragging the parser into the import graph.

Why it matters: `pipe_ops.dart` is consumed by the parser, so the
chain `pipe_ops → synthesize → check → parser → pipe_ops` was
cyclic. That cycle prevented `pipe_ops.dart` from importing
`synthesize.dart`, which forced the spec table's `_asSpec` entry
to live as a stub while the real `as` inference was special-cased
in `infer.dart`. Breaking the cycle here is the precondition for
unifying `As` through the spec table.

Side effect: the public `Remediation()` and `Remediation.withDisplay()`
factories that took a `source` string are removed. Internal callers
exclusively used `Remediation._` (the private factory). The public
type and its read-only fields stay exported, so destructuring callers
in `bin/lam.dart` are unaffected. Pre-1.0 breaking change; if a
downstream consumer needs string-to-`Remediation` they can compose
`parseQuery` and a public hand-built constructor in their own code.

No behavior change. All 1673 tests pass.

* refactor: unify As through the spec table; eval takes the AST node

The spec table in pipe_ops.dart was already declared as the single
source of truth for pipe-op metadata, runtime, and parsing — but `as`
violated this on three axes:

1. `infer.dart` had an `if (expr is As)` branch that bypassed
   `inferPipeOpShape` to call a private `_asShape` helper. Any
   invariant added to `inferPipeOpShape` (e.g. the null short-circuit
   added in PR #10 commit 5) had to be re-implemented at the As branch
   or it would not apply.

2. `evaluator.dart` had a similar `As(:final target) => _as(...)`
   branch with its own private `_as` runtime that duplicated the
   bridge-resolution logic.

3. The spec's `_asSpec` entry was a stub: `infer` returned SAny,
   `eval` threw, both with comments saying "the real logic lives in
   <other file>".

This commit makes `_asSpec` a real spec entry. Both `infer` and
`eval` now consult the synthesis table directly via the imports
unblocked by the previous commit's cycle break. The dispatch site
becomes uniform: every pipe op (BuiltinPipeOp and As) flows through
`evalPipeOp` and `inferPipeOpShape`, and per-op invariants apply
to every op without per-AST branches.

To make `_asSpec.eval` able to read its typed `As.target`
argument, the `PipeOpInfo.eval` signature changes from
`(ctx, args, eval)` to `(ctx, op, eval)`. Specs that need
arguments destructure the AST themselves: `(op as BuiltinPipeOp).args[0]`
for the generic case, `(op as As).target` for the typed case. Nine
existing specs are updated mechanically; the 19 zero-arg specs use
`(_, _, _)` and need no body change.

`evalBuiltinPipeOp` is renamed to `evalPipeOp` and now accepts
any LamExpr, dispatching via `pipeOpInfoFor`. The evaluator's
switch case becomes `BuiltinPipeOp() || As() => evalPipeOp(...)`.

Removed:
- `_as` helper and its imports in evaluator.dart
- `_asShape` helper and its imports in infer.dart
- The `if (expr is As)` branch in inferShape

No behavior change. All 1673 tests pass, including the nine
null-short-circuit explain tests added in PR #10 commit 5 — those
now pass via the unified spec path instead of the special-case As
branch in infer.dart.

* docs: update ast.dart and pipe_ops.dart comments to match unified dispatch

The As class doc described the runtime as living in evaluator.dart's
`_as` (now removed). The PipeOpParseKind.custom enum doc said the
spec provides shape metadata only. Both reflected the pre-refactor
world. Updated to describe the actual dispatch model: As keeps a
dedicated AST class for its typed argument, but inference and runtime
flow through the spec table.

* test(shape): pin As through unified evalPipeOp dispatch

The BuiltinPipeOp round-trip test skipped `custom` parseKind ops, so
the As node's resolution through the unified spec-table dispatch was
unpinned — the central invariant of this PR. Add three cases:

- As(fmt) resolves through pipeOpInfoFor to the `as` spec for every
  OutputFormat (the missing half of the round-trip test).
- evalPipeOp runs an As node end-to-end via the spec's eval field.
- evalPipeOp rejects a non-pipe-op AST with the programmer-error
  QueryError, covering the reworded guard message.
@hakimjonas hakimjonas merged commit 5d785bd into main Jun 1, 2026
5 checks passed
@hakimjonas hakimjonas deleted the fix/null-safe-pipe-ops-and-skill-flag branch June 1, 2026 07:26
@hakimjonas hakimjonas mentioned this pull request Jun 1, 2026
7 tasks
hakimjonas added a commit that referenced this pull request Jun 1, 2026
…#12)

PR #10 fixed six bugs against the 0.11.0 binary surfaced by an audit
against the published skill. PR #11 (stacked) addressed the underlying
smell: `as` was the lone pipe op outside the spec table, so per-op
invariants like the null short-circuit had to be reimplemented at its
callsite. Both PRs landed before the release tag; this commit bumps
versions, regenerates embedded artefacts, and corrects the diacritic
spelling of "Lambë" (the umlaut form is the Quenya original; both an
acute-form typo and a bare-ASCII typo had spread across docs and
prose).

Test count: 1676. Breaking: PipeOpInfo.eval signature and the public
Remediation source-string factories.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant