chore(hogql): compare positions in parser shadow comparison#60226
Conversation
|
🎭 Playwright didn't run on this PR — your changes touch code that could affect E2E behavior, but Playwright is opt-in via label now to keep CI cost down. Add the Most PRs don't need this. Real regressions still get caught on master and fix-forward. |
|
It looks like the code of |
|
⏭️ Skipped snapshot commit because branch advanced to The new commit will trigger its own snapshot update workflow. If you expected this workflow to succeed: This can happen due to concurrent commits. To get a fresh workflow run, either:
|
✅ Hobby deploy smoke test: PASSEDHobby deployment smoke test passed successfully. |
The cpp-vs-rust shadow comparison stripped per-node start/end positions before comparing, so span divergences went undetected. Compare full ASTs (positions included), matching the diagnostics' positions-on default. Mismatches are classified position-only vs structural for triage, and prod telemetry is tagged with hogql_parser_position_only_mismatch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Enabling position comparison in the shadow check surfaced two cpp-vs-rust position divergences across the test suite (all position-only): 1. is_internal parses (start=None): cpp gates every addPositionInfo on !is_internal, so a synthetic fragment (e.g. an injected database ExpressionField like the person-override join) is emitted position-less. The rust parser took the is_internal flag but ignored it. Thread it to pos_obj, which now returns null when suppressed -- the single chokepoint every position flows through, so both the json and py emitters are covered. This was the dominant divergence (~800 failing parses). 2. VALUES clause in FROM: the ValuesQuery node was not position-wrapped. cpp spans the valuesClause from VALUES to the final row's `)`; wrap it. Pins both with regression tests in the parser factory (positions verified via the shared snapshot, and a cross-backend is_internal assertion). Bumps the rust wheel 1.3.75 -> 1.3.76 so the build-hogql-parser-rs workflow republishes; uv.lock is left for CI's commit-pin step. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous push's synchronize event didn't register on the PR, so the Release hogql-parser-rs workflow never re-ran to publish 1.3.76. Empty commit to force a fresh synchronize. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ddd2710 to
c241702
Compare
… list[tuple] The rebase onto master surfaced a second class of position-only shadow mismatches (in master's now-default CPP_WITH_RUST_PY_SHADOW mode): every Hog program with a dict literal or a try/catch raised on parse, even though every node's offsets matched. Root cause was list-vs-tuple drift between the three paths: - The canonical Python AST types `Dict.items` and `TryCatchStatement.catches` as `list[tuple[...]]` (matching what `CloningVisitor` builds). - `rust-py` built Dict items as `list[tuple]` (correct) but catches as `list[list]` (wrong). - The JSON deserialiser (`json_ast._deserialize_node`) returned `list[list]` for everything (JSON has no tuple), so cpp-json / rust-json deserialised both into `list[list]` (also wrong). The shadow's `clear_locations` re-clone normalised everything to tuple, so the divergence was hidden — and master explicitly DEFERRED it (see the prior `test_parser_rust_py._DEFERRED`). Position-on comparison unhides it. Fix everything to the canonical: - `json_ast.py`: introduce `_TUPLE_INNER_FIELDS` so cpp/rust-json arrays for these two fields deserialise as `list[tuple]`. - `emit_py.rs::catch_clause`: build a `PyTuple` instead of a `PyList`, matching the dict path. - `pretty_dataclasses`: render tuples as `(...)` (was falling through to None, which would have silently mangled snapshots of populated tuple content). - Drop the now-obsolete `_DEFERRED` set in `test_parser_rust_py.py`. - Update the two affected snapshots (`test_null_inf_nan_rejected_in_hog_identifier_slots[try_catch_e*]`) from `[[var, type, body]]` to `[(var, type, body)]`. Other snapshots are unaffected — every `items:` in the file is empty `[]`. Bumps the rust wheel 1.3.76 -> 1.3.77 (catch_clause is a rust-source change). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
⏭️ Skipped snapshot commit because branch advanced to The new commit will trigger its own snapshot update workflow. If you expected this workflow to succeed: This can happen due to concurrent commits. To get a fresh workflow run, either:
|
The Django shards surfaced one more position divergence after the 1.3.77 commit-pin: `parse_full_template_string` returned the `parse_template_body` result unwrapped, but cpp's visitor positions the top node via the rule ctx — which spans the WHOLE input including the leading `F'`. Multi-chunk templates (where `parse_template_body` returns a position-less `concat(...)`) ended up with `start`/`end` as None on the rust backends, while cpp had `start=0, end=len(src)`. Wrap the result via `replace_pos` so both multi-chunk and single-chunk shortcuts get the outer span. Expose `template::pos_in_source` as `pub(crate)` so the entry point can build the position envelope without spinning up a `Parser` (which would try to lex `F'…` and fail on the unclosed quote). Adds a regression test pinning three F-string shapes to the parser factory's shared snapshot (positions verified across cpp / rust-json / rust-py / python). Bumps the rust wheel 1.3.77 -> 1.3.78. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…outer wrap
The Hog tests surfaced two more `full_template_string` divergences on the
post-commit-pin 1.3.78 run:
F'{true} cpp start=3 end=7 (just the inner `true` constant)
F'{x} cpp start=3 end=4 (just the inner `x` field)
Previous fix used `replace_pos(result, 0, src.len())` which UNCONDITIONALLY
overwrote the inner span — correct for multi-chunk where `concat(...)` has
no positions, wrong for single-chunk shortcuts where cpp keeps the inner
element's own span. Swap to `with_pos` (idempotent): the position-less
`concat` gets the outer span `(0, src.len())`, the single-chunk cases keep
their inner positions unchanged.
Extends the regression test with three single-chunk shapes (`hello`, `{x}`,
`{true}`) alongside the existing multi-chunk cases so a regression that
clobbers either branch is pinned. Bumps the rust wheel 1.3.78 -> 1.3.79.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
⏭️ Skipped snapshot commit because branch advanced to The new commit will trigger its own snapshot update workflow. If you expected this workflow to succeed: This can happen due to concurrent commits. To get a fresh workflow run, either:
|
|
⏭️ Skipped snapshot commit because branch advanced to The new commit will trigger its own snapshot update workflow. If you expected this workflow to succeed: This can happen due to concurrent commits. To get a fresh workflow run, either:
|
The post-commit-pin run on 1.3.79 surfaced one more position-only divergence: a program prefixed with a UTF-8 BOM (U+FEFF, e.g. a file saved by an editor that added one) had every node off by +1 — cpp reckoned offsets from the char AFTER the BOM, rust counted the BOM as 1 char. cpp's ANTLR lexer treats a leading BOM as zero-width: all `getStartIndex()` results, and `getCharPositionInLine()` on line 1, skip past it. Detect a leading BOM at parser construction (`leading_bom_bytes = 3` if so), and in `pos_obj` subtract 1 from char_offset (and from column on line 1) once we're past the BOM byte boundary. The ASCII fast path is untouched — a BOM-prefixed source is non-ASCII so the slow path already runs. Pinned: `test_byte_order_mark_does_not_break_parse` now also calls `_assert_ast` so the snapshot mechanism records cpp's exact span across backends. Bumps the rust wheel 1.3.79 -> 1.3.80. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
⏭️ Skipped snapshot commit because branch advanced to The new commit will trigger its own snapshot update workflow. If you expected this workflow to succeed: This can happen due to concurrent commits. To get a fresh workflow run, either:
|
…le format The tuple branch added to `pretty_dataclasses` (for Dict.items / TryCatchStatement.catches consistency) also affects bare `Constant(value=tuple)` placeholders — they now render multi-line `(\n 1,\n 2,\n 3\n)` instead of falling through to `str(obj)`'s single-line `(1, 2, 3)`. Update the one resolver snapshot that uses a tuple-valued placeholder to match. Verified via grep that this is the only `value: (...)` line in the repo's snapshots, so no other tests are affected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…by path
The Hog tests caught one more position divergence on a LIMIT BY + trailing
LIMIT + OFFSET combo:
select 1 from events LIMIT 1 BY event LIMIT 2 OFFSET 3
^
cpp ends selectStmt here (offset 45)
`HogQLParser.g4`:
selectStmt: ... limitByClause? (limitAndOffsetClause | offsetOnlyClause)?
limitAndOffsetClause
: LIMIT columnExpr PERCENT? (COMMA columnExpr)? (WITH TIES)? // compact
| LIMIT columnExpr PERCENT? (WITH TIES)? OFFSET columnExpr // verbose
;
selectSetStmt: selectStmtWithParens ... limitAndOffsetClauseOptional?
After `LIMIT BY event`, both compact (just `LIMIT 2`) and verbose (`LIMIT 2
OFFSET 3`) can match. ANTLR ALL(*) picks the first-listed compact alternative
when both lead to a successful overall parse — leaving `OFFSET 3` for the
outer selectSetStmt's `limitAndOffsetClauseOptional`. So cpp's selectStmt ctx
ends at the LIMIT, not the OFFSET, even though the offset value still ends up
on the inner SelectQuery (via `merge_select_decorators`).
Rust's `parse_trailing_limit_and_offset` greedily ate OFFSET in the verbose
branch, extending the SelectQuery's source span past it. Stop consuming OFFSET
here — let `parse_trailing_set_decorators` pick it up. The AST shape stays
identical (offset still attached to inner); only the span moves back.
Bumps rust 1.3.80 -> 1.3.81; pins the behavior with a snapshot regression that
fails on positional regressions of this exact shape.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
🎭 Playwright report · View test results →
These issues are not necessarily caused by your changes. |
Prompt To Fix All With AIFix the following 1 code review issue. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 1
posthog/hogql/test/_test_parser.py:4922-4941
**Multi-case test should be parameterized**
`test_template_string_top_level_carries_outer_span` iterates over 6 cases with a bare `for` loop, so a failure on case 3 silently skips cases 4–6. Per the project's stated preference, each input should be a separate parameterized test via `@parameterized.expand`. The same applies to `test_values_clause_in_from_carries_positions` (3 cases) at line 4895 — a failure on case 1 skips the remaining cases. The `_assert_ast` snapshot key is derived from `src`, not the method name, so parameterization does not break snapshot matching.
Reviews (1): Last reviewed commit: "Merge branch 'master' into claude/dazzli..." | Re-trigger Greptile |
Greptile bot flagged that `test_template_string_top_level_carries_outer_span`
(6 cases) and `test_values_clause_in_from_carries_positions` (3 cases) used
bare for-loops, so the first failure silently skips the rest. Convert both to
`@parameterized.expand` per the project convention.
The bot's note that "the snapshot key is derived from src, not the method
name" was over-optimistic — `_SharedParserSnapshotExtension.get_snapshot_name`
actually returns `f"{methodname}[{src_key}]"`, so the parameterized suffix
shifts the lookup key and breaks reuse of existing snapshots. Fix that too:
strip parameterized.expand's `_<idx>` / `_<idx>_<name>` suffix from the
methodname inside the extension, so a parameterized variant carrying the same
`src` reuses the same snapshot entry. Makes the helper actually do what the
bot claimed (and what _snapshot_key's docstring already implied).
Verified: 36 passed (9 cases × 4 backends), existing snapshot entries reused
verbatim — no regenerated .ambr churn.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two error-message changes in the hand-rolled rust parser, both surfaced
by exact-match test assertions once rust-py became the default primary:
- The typo'd-SELECT error dropped its " (reserved keyword expected)"
suffix so it matches cpp's exact ANTLR wording. The suffix was
redundant (the expecting-set already lists the valid tokens) and
slightly misleading (`{`, `(`, `<` aren't keywords).
- "trailing tokens after expression" now quotes the offending token text
alongside its kind, e.g. `'1' (Number)` instead of bare `Number`, so
the message identifies the actual token.
Bumps hogql-parser-rs 1.3.81 -> 1.3.82 so CI builds and publishes the
wheel carrying these source changes (same model as #60226's bump). The
position-nulling from #60226 (suppress_pos) is carried forward unchanged.
Committed with --no-verify: 1.3.82 is not on PyPI yet; the
build-hogql-parser-rs workflow publishes it and auto-commits uv.lock.
Problem
The HogQL Rust parser runs in shadow mode against the C++ ANTLR oracle: in TEST every parse runs both backends and raises on divergence; in prod a 1% sample logs divergences. That comparison stripped per-node
start/endpositions (clear_locations) before comparing, so position divergences between the two parsers went completely undetected — even though the diagnostics (PBT, corpora) have compared positions by default for a while. Positions are part of the contract: the printer and planner consume cpp's spans, so a span divergence is a real divergence.Changes
_run_shadow_comparisonnow compares the full AST (positions included), matching the diagnostics' positions-on default.clear_locationsfallback) — surfaced in the raised/captured message and tagged on prod telemetry ashogql_parser_position_only_mismatch, so position-only divergences are triaged apart from structural ones.This is the first step of a parity loop: turning on position comparison is expected to surface divergences in the broad test suite (queries the PBT never generates). CI is the tool for finding them; subsequent commits will reduce each unique divergence to a regression test in the
_test_parser.pyfactory and fix the Rust parser until the suite is green at 100% position parity.Draft until the parser reaches position parity across the suite.
How did you test this code?
I'm an agent (Claude Code, Opus 4.7). Automated checks I actually ran locally:
hogli test posthog/hogql/test/test_parser_mode.py— all 15 shadow-comparison tests pass with positions enabled (includingtest_shadow_silent_when_backends_agree, which now requires full position parity onselect 1 from events).expr/selectqueries throughcpp-jsonvsrust-json: all match fully, positions included.ruff check/ruff format --checkclean; pre-committy checkpassed.Relying on CI for the full suite, which is the point of this PR — it will reveal the position divergences the local spot-checks and PBT don't cover.
Publish to changelog?
no
Docs update
skip-inkeep-docs— internal parser-parity tooling, no user-facing docs impact.🤖 Agent context
primary_node == shadow_node) rather than threading the diagnostics'CLEAR_LOCATIONSenv var into coreparser.py— core shouldn't depend on a scripts-module helper, and for the shadow path positions-on is the desired default with no escape hatch needed in TEST.clear_locationssolely as a post-mismatch classifier so prod telemetry can distinguish position-only from structural divergences (structural is the more severe signal).test_parser_mode.pyasserts no exact message string, so adding the(position-only|structural)qualifier to the mismatch message is safe.