Feature/0.8.0#6
Merged
Merged
Conversation
Opens the 0.8.0 branch with a full handover document covering: - The concrete CSV round-trip bug driving the element-level shape fix (with reproduction, stage-by-stage trace, and the three responsible layers identified). - Option A vs. Option B design decision for element-level shape constraints (recommends A: richer ShapeRequirement subclasses). - Work items in dependency order: driving test, stricter MustBeList, defensive _toCsv assertion, consistency matrix extension, inner-expression shape validation, parser error recovery / multi-line diagnostics. - Uncommitted work in sibling repos (rem validator rewrite, arda-web lambe 0.7.1 cutover) that needs to be decided on separately. - Starting state and launch checklist for the next session. No code changes yet. Branch base: main at 49b2ad4 (0.7.1 merged).
MustBeList.accepts checks only that the root is a list; it does not
walk element shapes. A list of maps whose cells hold nested lists or
maps slips past the check, and _toCsv falls back to Dart's default
toString() for the cell — producing output like
items,"[{key: rumil, value: ^0.6.0}, ...]".
Pins: crafted list-of-maps with a list-valued cell (and map-valued
cell, and list-of-lists with a nested-list element) must raise
OutputShapeError under both csv and tsv; the end-to-end
.dependencies | as(csv) | as(toml) | as(csv) pipeline must either
succeed with scalar-only cells or raise OutputShapeError — not
silently emit Dart toString() garbage. Legal scalar-cell shapes
continue to round-trip.
7 failing, 4 passing against the current tree.
CSV and TSV project rows onto a flat grid of scalar cells. The old
MustBeList requirement only checked that the root was a list and
never walked the element shape, so a list of maps holding nested
lists or maps slipped past the check. The writer then fell back to
Dart's default toString() and silently emitted garbage like
items,"[{key: rumil, value: ^0.6.0}, ...]" for the real-world chain
.dependencies | as(csv) | as(toml) | as(csv).
MustBeFlatList walks the element shape. It accepts SList<scalar>,
SList<SList<scalar>>, and SList<SMap<k:scalar, ...>>, and defers to
SAny wherever the shape is unknown (the checker cannot prove
incompatibility). requirementFor(csv|tsv) now returns the stricter
class; MustBeList stays in the hierarchy for any future consumer
that wants the permissive form.
The writer also grows a defensive _scalarCell guard. The shape
check is the primary line of defense, but element shapes can
collapse to SAny (empty or heterogeneous lists) and bypass it; in
that case the guard throws QueryError rather than stringify a
non-scalar cell. The handover file calls for QueryError and not
OutputShapeError here because by the time we reach the writer the
shape check has already reported Writable — reaching here means
the shape language was too lossy to prove the mismatch.
MustBeFlatList is exported from the lambe barrel. One shape_check
test that asserted `required is MustBeList` now asserts the
stricter class. Full suite green, 973 tests, zero analyzer issues,
formatter clean.
Adds shape_output_consistency_test.dart, a structural complement to pipe_ops_consistency_test.dart. The pipe-op matrix pins evaluator vs. spec; this one pins writer vs. shape check. Together they prevent a whole class of drift. For every representative value and every OutputFormat, the test runs canWriteAs and formatOutput and asserts they agree: Writable must not produce OutputShapeError; NotWritable must. The writer is allowed to raise plain QueryError from the defensive _scalarCell guard (for cases where the shape language is too lossy to prove the mismatch), but silent stringification — what the original 0.7.1 bug produced — is strictly forbidden. A second group pins that invariant directly on the two offender shapes. 88 new cases. 1061 tests green, analyzer clean, formatter clean.
filter evaluates `evaluate(predicate, item) == true`. Strict equality against `true` means that any predicate whose value is not a boolean is silently rejected — the filter produces an empty list regardless of input. Without static feedback, a typo like filter(.missing) or a predicate that forgot its comparison (filter(.name) instead of filter(.name == "Alice")) looks fine at parse time and silently produces an empty result at runtime. explain() now walks each stage before inferring it and flags two patterns that can only produce an empty filter: - Missing-field paths. filter(.missing) over an element shape that is a known SMap without `missing` — at runtime the field access returns null, which never equals true. The check walks Field / Access chains segment by segment, so .users | filter(.addr.nope) resolves the chain against the element's `addr` subshape. - Non-boolean predicate shapes. If the inferred predicate shape is any concrete non-bool (SNum, SString, SList, SMap, SNull), the filter is provably empty. SBool and SAny are left alone: either might be true. The same rules apply to filter_values (predicate sees element values of the input map) and filter_keys (predicate sees keys, always strings). sort_by / map / unique_by are intentionally out of scope — they do not apply a `== true` test, so their predicates have different correctness rules. New ExplainWarning type is exported through lib/lambe.dart and rendered by renderExplain between the stage table and the writability summary. 12 new tests pin the warning on every affected op and on both warning modes, and pin the absence of warnings on valid predicates, SAny input, and non-filter ops. 1073 tests green, analyzer clean, dart doc clean.
Parse errors now report `line:column` instead of just `column`, and the source excerpt is a proper context block: the offending line prefixed with its line number, the caret under the offending column, and (for multi-line queries) one line of context on either side. The gutter width adapts to the query's line count. A 1-line query still produces the existing single-line excerpt with a gutter prefix. Previously the renderer built one big block by concatenating the caret line after the full expression, ignoring newlines. For any multi-line query that placed the caret at column N of the final visible line regardless of which line the error was actually on — unusable for heredoc-authored queries or anything coming out of an editor. `Location.line` was always available on Rumil's ParseError; the old code just didn't read it. The CLI's --explain path used to bail out with "Error: failed to parse query", skipping the rich diagnostic that --to had access to. It now goes through parseAst so both paths surface the same message. Also drops the `QueryError: ` prefix-doubling in bin/lam.dart: three callsites used `'Error: $e'` which invokes QueryError.toString and added a second `QueryError:` on top of the CLI's `Error:` prefix. Those now use `e.message`, matching the rest of the file. Test matrix pins the renderer: single-line + multi-line, error-at- line-1 / line-mid / line-last, gutter width scaling with line count, and the did-you-mean hint for mistyped pipe ops. Also pins the QueryError.message / toString contract so downstream callers know which to use. 1081 tests green, analyzer clean, dart doc clean.
First review pass on the 0.8.0 work surfaced seven issues. None of them affect behavior the tests already pin, but they affect what a reader sees when they open the public API or hit an error. - formatOutput's docstring in lib/src/output.dart still said CSV/TSV accepts "a list of maps or list of lists" with no mention of the scalar-cell constraint. Now describes the three accepted element shapes and names both error paths (OutputShapeError from the shape check, QueryError from the defensive writer guard). - _scalarCell's error message used cell.runtimeType, so a user saw "got _GrowableList" instead of "got list". Now maps List/Map to "list"/"map" up front and only falls back to runtimeType for the unreachable else branch. - _predicateWarning hard-coded "element shape" for all three ops. For filter_values the predicate sees a value, for filter_keys it sees a key. Now takes the domain name as an argument so each op reads naturally. - Two test files had stale docstrings: one referenced a runQueryAsString API that never existed (draft rot), one claimed "remediation" is offered for the non-scalar-cell case (the handover deliberately chose not to emit one). Both corrected. - Leftover inline comment in csv_element_shape_test.dart removed. - Attribution fix in shape_output_consistency_test.dart: the silent- stringification bug predates 0.7.1 by two releases, not "shipped in 0.7.1". Two new tests pin the polish: the new descriptive cell-kind message, and the value-domain wording for filter_values warnings. 1083 tests green, analyzer clean, dart doc clean, pana 160/160.
Two gaps surfaced on the second review pass. Neither indicates a real bug, but both described behavior more loosely than the tests in the rest of the file. - The renderExplain ordering test asserted that Warning and Writable lines both appear in the output but said nothing about their position. "Between stages and writability" was only the test name, not what was pinned. Now compares indexOf values so the stage row, the warning block, and the Writable line are pinned in that order. - The parse-error renderer handles \r\n line endings (regex split on \r?\n), but no test exercised that path. Added a direct Windows-line-ending case that also pins the CR character does not leak into the excerpt.
Third review pass turned up one real user-facing bug and one doc inaccuracy. Both surface-level, both pinned by existing tests now that the code is correct. bin/mcp_server.dart had three `Error: $e` callsites that invoked QueryError.toString(), producing `Error: QueryError: parse error at line ...` — the same prefix-doubling the CLI avoided. MCP callers (typically LLM agents) would see the `QueryError:` token as part of the human-readable error, which is noise for agents that already know the error came from the query tool. Now uses `e.message` consistently with bin/lam.dart. MustBeList's docstring still claimed it was "used where downstream code handles arbitrary element shapes", but after 0.8.0 there is no such downstream code — csv/tsv use the stricter MustBeFlatList, and no other consumer exists. Rewritten to describe the class honestly: the generic list-root requirement, retained for future format additions whose serialization tolerates any element shape.
The writer took headers from the first row only, so
[{a: 1}, {b: 2}] produced "a\n1\n" — the b column was silently
dropped and row 2 rendered as an empty line. Pre-existed 0.8.0 but
belongs to the same class as the non-scalar-cell bug we just
fixed: the writer producing something other than what the input
structurally carries. Leaving it unfixed would have forced a 0.8.1
within weeks of the release.
_toCsv now unions keys across all rows in first-seen insertion
order. Each row renders a cell per header, with empty string when
the row doesn't have that key. Matches pandas.DataFrame.to_csv and
Python's csv.DictWriter(extrasaction='ignore') semantics:
presentation is lossless for scalar-celled data regardless of
whether every row carries every key.
Null values and absent keys both render as empty cells. This
intentional — the CSV format has no distinct "null" token, so the
distinction can't be preserved round-trip anyway. Callers that
need the distinction should use JSON.
Two rows added to shape_output_consistency_test.dart
(disjoint-keys, overlapping-keys) so the matrix covers the new
behavior. Seven new cases in csv_element_shape_test.dart pin the
behavior directly, including a regression guard for the
homogeneous case. 1102 tests green, analyzer clean, pana 160/160.
Running \`lam '' file.json\` surfaced the parser's full list of expected tokens — every pipe-op name plus 14 punctuation entries, roughly 30 items — and buried the actual problem (the query is empty) under the noise. Pre-existed 0.8.0 but became more visible once the parse-error renderer got the rest of its jq-style treatment in this release. _formatParseErrors now short-circuits on expression.trim().isEmpty with a single line: \`parse error: expression is empty\`. No expected-list, no line/column (neither meaningful for an empty input), no caret. Three tests pin the shape for genuinely-empty, whitespace-only, and newline-only input. 1105 tests green.
- pubspec: 0.7.1 → 0.8.0 - _version.dart regenerated via tool/gen_version.dart - doc/lam.1.md front matter: "Lambë 0.7.1" → "0.8.0", "April 2026" → "May 2026"; lam.1 regenerated via tool/manpage.dart - README / doc/getting-started REPL banner bumped to v0.8.0 - manpage_test.dart: TH date check updated to "May 2026" - CHANGELOG: new 0.8.0 entry with Added / Changed / Breaking sections - ROADMAP: 0.8.0 marked shipped; 0.9.0 candidates refreshed with the new classes surfaced during the 0.8.0 review passes (runtime-failure warnings, CSV cell-flattening policy) .pubignore added to keep the pub archive focused. Previously the archive shipped a 7MB stale x86_64 AOT of lam-mcp from an April developer build, plus any locally-generated doc/api/ HTML and any benchmark artifacts lying around. None of that reaches users: `dart pub global activate lambe` compiles the MCP server from source via Dart VM snapshots, and per-platform lam-mcp binaries are built in CI on tag push (see .github/workflows/release.yml) and published as GitHub release assets referenced by SHA256 in the MCP registry. Dropping the stale binary takes the archive from 3 MB to 133 KB. 1105 tests green, analyzer clean, dart doc clean, pana 160/160, dart pub publish --dry-run clean (bar the uncommitted-changes warning, which this commit resolves).
…e helpers Removes em dashes from prose added in this release (they stay in existing section-heading style like `## 0.8.0 — shipped` to match the 0.7.x entries). Flattens editorial phrasing in CHANGELOG and ROADMAP: "This is the release's core behavior change", "every behavior change swapped a silent failure for a loud one", "is the most identity-defining candidate in the pool" — all cut. The entries describe what changed and what users need to do about it. Two private helpers picked up docstrings: `_describeCellKind` in lib/src/output.dart and `_isScalar` in lib/src/shape/check.dart. The analyzer only enforces public_member_api_docs on public members, so these weren't required, but the helpers are non-obvious (why is SAny "scalar"?) and the one-liner makes the intent explicit. No behavior change. 1105 tests, analyzer clean, dart doc clean, pana 160/160, dart pub publish --dry-run 0 warnings.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Lambë 0.8.0. Element-level shape checking for CSV/TSV output, union headers across heterogeneous-keyed rows, static warnings for provably-empty filters in
--explain, and line-aware parser diagnostics.MustBeFlatListrejects list-of-maps and list-of-lists whose cells aren't scalar. Previously these passed the shape check and the writer fell back to Dart'stoString(), emitting cells like"[{key: rumil, value: ^0.6.0}, ...]". Defensive_scalarCellguard catches theSAnycases the shape language can't prove.[{a:1}, {b:2}]used to lose thebcolumn silently; now rendersa,b\n1,\n,2\n.canWriteAsandformatOutputpinned in a 100-case consistency matrix, structural complement topipe_ops_consistency_test.dart.--explainwarns onfilter,filter_values, andfilter_keyspredicates that are provably empty: missing field on a known-closed shape, or a non-boolean predicate shape.line:columnwith a gutter-prefixed excerpt and caret; multi-line queries get one line of context on either side. Empty-expression message replaced the 30-token expected-list withparse error: expression is empty.--explainpath goes throughparseAstso both CLI modes get the same diagnostic.Three MCP server error sites fixed to use
e.messageinstead of$e(was doublingQueryError:over the CLI'sError:prefix)..pubignoreadded to keep the published archive at 133 KB (was 3 MB from shipping a stale locallam-mcpAOT; per-platform binaries are built in CI on tag push).Breaking
OutputShapeError.requirementFor(OutputFormat.csv)andrequirementFor(OutputFormat.tsv)returnMustBeFlatListinstead ofMustBeList.Full details in CHANGELOG 0.8.0 section.
Verification
dart analyze --fatal-infoscleandart format --set-exit-if-changedcleandart doc --dry-run0 warningsdart pub publish --dry-run0 warningsTest plan
lam --to csv '.dependencies | as(csv) | as(toml) | as(csv)' /tmp/data.yamlraisesOutputShapeError(previously emitted garbage)lam --explain '.users | filter(.missing)' users.jsonshows the predicate warning.github/workflows/release.ymlbuilds per-platformlam-mcpand publishes to MCP registry