tp: replace the handrolled preprocessor with syntaqlite#5472
Draft
LalitMaganti wants to merge 18 commits intomainfrom
Draft
tp: replace the handrolled preprocessor with syntaqlite#5472LalitMaganti wants to merge 18 commits intomainfrom
LalitMaganti wants to merge 18 commits intomainfrom
Conversation
3dfc93b to
8c6b320
Compare
44bcdca to
5be3e1e
Compare
8c6b320 to
d85eb80
Compare
Bump the pinned syntaqlite SHA to pull in several fixes that obsolete local workarounds: - #109: SyntaqliteTextSpan.length widened to uint32_t — drop (uint16_t) casts on select_span initializers. - #111: extract_terminals_from_y excludes base grammar terminals — removes the need for the local LP/RP filter. - #112: generated amalgamation self-suppresses compiler warnings — drop the BUILD.gn cflags_c list and the #pragma wrappers around the header include in the tokenizer. - #113: WINDOW/OVER/FILTER treated as context-sensitive keywords — drop the local FILTER fallback in perfetto_module_name.
d85eb80 to
2ddeedb
Compare
5be3e1e to
47e96e3
Compare
🎨 Perfetto UI Builds
|
Pull platform-specific prebuilt syntaqlite CLI archives from the v0.4.1
GitHub release rather than cloning the source tree and invoking
`cargo build`. This avoids needing a Rust toolchain just to regenerate
the PerfettoSQL parser.
The new `--syntaqlite` flag on tools/install-build-deps gates fetching
the archive (only contributors editing the grammar need it); normal
builds continue to consume the committed generated output.
`tools/gen_syntaqlite_parser` now directs users to run
`tools/install-build-deps --syntaqlite` when the binary is missing, and
the regenerated `syntaqlite_perfetto.{c,h}` reflect v0.4.1.
Rebuilds syntaqlite_perfetto.{c,h} from a newer syntaqlite snapshot
(adds straddle tracking and cross-layer poisoning, among other fixes)
and teaches the gen_syntaqlite_parser wrapper to pass
--macro-style rust.
Now that the parser is backed by syntaqlite, macro expansion and
statement splitting can live alongside the rest of parsing instead of
behind a separate preprocessor pass. This CL:
* Deletes src/trace_processor/perfetto_sql/preprocessor/ and its
generated grammar entirely.
* Moves macro bookkeeping (the Macro struct + registry) onto
PerfettoSqlParser and introduces intrinsic_macro_expansion.{cc,h}
for the built-in macros previously implemented by the preprocessor.
* Teaches the parser unittest to drive the new parser-resident
expansion directly.
* Updates SqlSource with a FromMacroExpansion constructor so SQLite
errors traceback back to the macro call site.
* Drops the preprocessor dependency from the engine and rewires the
parser unittests target against syntaqlite.
The base::TrimWhitespace(std::string_view) overload used by
intrinsic_macro_expansion.cc will land in a separate CL; this branch
will be unbuildable until it does.
47e96e3 to
c412a68
Compare
Member
Author
|
The So if (c.body_call_length == 0) continue;never trips for the arg-internal case. |
LalitMaganti
added a commit
to LalitMaganti/syntaqlite
that referenced
this pull request
Apr 16, 2026
The public header, internal header, and session.rs test comment all claimed the arg-internal case was signalled by `body_call_length == 0`, but the implementation sets both `body_call_offset` and `body_call_length` to `SYNTAQLITE_MACRO_BODY_CALL_ARG_INTERNAL` (UINT32_MAX) — and that is what the public sentinel macro documents at parser.h:236. Downstream consumers (e.g. google/perfetto#5472) that trusted the stale comment ended up pushing UINT32_MAX offsets into their rewriter and crashing on overflow. Pure doc fix — no behavior change.
LalitMaganti
added a commit
to LalitMaganti/syntaqlite
that referenced
this pull request
Apr 16, 2026
## Motivation The public header, internal header, and a session.rs test comment all said the arg-internal case for a nested macro call was signalled by `body_call_length == 0`. That contradicts the implementation: `parser_macros.c:530-535` sets *both* `body_call_offset` and `body_call_length` to `SYNTAQLITE_MACRO_BODY_CALL_ARG_INTERNAL` (= `UINT32_MAX`), matching the sentinel defined at `parser.h:236` and the Rust docs at `types.rs:274-277`. Downstream consumers that trusted the stale comment can crash. Concrete example: google/perfetto#5472 filtered with `if (c.body_call_length == 0) continue;`, which never trips for arg-internal calls, so a `RewriteItem{start = UINT32_MAX, end = UINT32_MAX + UINT32_MAX}` ended up in `SqlSource::Rewriter` and tripped its `start <= end` check. ## Changes Pure documentation fix — no behavior change: - `syntaqlite-syntax/include/syntaqlite/parser.h` — refer to `SYNTAQLITE_MACRO_BODY_CALL_ARG_INTERNAL` (UINT32_MAX) as the arg-internal sentinel on both fields. - `syntaqlite-syntax/csrc/parser_internal.h` — same fix on the `SynqExpansionLayer` fields. - `syntaqlite-syntax/src/parser/session.rs` — update the test comment that still said `body_call_length == 0`.
BuildForUserMacro relied on syntaqlite's body_call_offset / body_call_length to locate nested calls inside a user macro's authored body. Two problems: * The header comment claims body_call_length == 0 signals an arg-internal call, but the implementation uses SYNTAQLITE_MACRO_BODY_CALL_ARG_INTERNAL (= UINT32_MAX). The old `== 0` check never fired, so arg-internal children were pushed as rewrites with UINT32_MAX ranges, crashing ExpandsMacroInsideArg. * The per-segment shift upstream is computed as `max(0, sub_length - body_length)`, which silently drops the signed delta whenever a substitution is *shorter* than its authored placeholder (e.g. `$y` (2) → `5` (1)). body_call_offset / body_call_length are off-by-one whenever that happens, producing garbled expansions like `SELECT 5 +(10 + 1))`. Work around both in the consumer: compare against the real sentinel and recompute the body range ourselves by walking the parent's arg segments with signed deltas. A TODO points at the upstream fix so we can drop the workaround once the vendored syntaqlite is re-rolled.
LalitMaganti
added a commit
that referenced
this pull request
Apr 20, 2026
Replaces the handrolled Lemon grammar under `perfetto_sql/grammar/`
and the vendored-SQLite-based tokenizer in
`perfetto_sql/tokenizer/tokenize_internal.c` with the standalone
syntaqlite parser (pulled in via `buildtools/syntaqlite`).
`PerfettoSqlParser` now walks the syntaqlite AST for PerfettoSQL
statements (`CREATE PERFETTO {TABLE,VIEW,FUNCTION,MACRO,INDEX}`,
`INCLUDE PERFETTO MODULE`, etc.) instead of its own recursive-descent
code, and defers to SQLite for everything else.
The legacy preprocessor is retained as-is on this PR; killing it is
stacked on top in #5472.
Codegen tooling moves from `tools/update_sql_parsers.py` to
`tools/gen_syntaqlite_parser`.
v0.5.1 restructures the `dialect` subcommand into `dialect generate`,
so update tools/gen_syntaqlite_parser accordingly and regenerate the
amalgamated PerfettoSQL parser (syntaqlite_perfetto.{c,h}).
syntaqlite v0.5.1 changed every layer-0 offset it emits (node extents, spans, macro rewrite call offsets, error offsets) from document-relative to statement-relative, measured from the start of the slice returned by the new syntaqlite_parser_text API. Thread the statement's document offset through MacroRewriteBuilder, ParseCreateMacro, and the error path, adding it when slicing impl_->source so each stmt.Substr call lands at the right document position. Avoids materializing an extra SqlSource slice per Next().
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Now that the parser is backed by syntaqlite, macro expansion and
statement splitting can live alongside the rest of parsing instead of
behind a separate preprocessor pass. This CL:
src/trace_processor/perfetto_sql/preprocessor/and itsgenerated grammar entirely.
Macrostruct + registry) ontoPerfettoSqlParserand introducesintrinsic_macro_expansion.{cc,h}for the built-in macros previously implemented by the preprocessor.
expansion directly.
SqlSourcewith aFromMacroExpansionconstructor soSQLite errors traceback back to the macro call site.
parser unittests target against syntaqlite.
Also bundles the bump of vendored syntaqlite (hash +
--macro-style rust) since the new macro-lookup/rewrite API surfaceonly has a consumer once the preprocessor is gone.