Add name_spans and body_span to Node for graph↔text navigation#30
Merged
Extends Node with:
- name_spans: every occurrence of a node's name in the statement source, enabling the UI to cycle through references during graph<->text navigation.
- body_span: the parenthesized subquery body of a CTE, for separate highlighting of the definition.

Populated for table, view, and CTE nodes. Columns intentionally retain empty name_spans in this release: accurate per-occurrence column spans require alias/scope resolution, tracked separately as a follow-up to the semantic resolver epic.

Text-search strategy skips SQL string literals and comments, and for CTEs excludes matches that fall inside the CTE's own body so internal column references don't inflate the occurrence count.

Refs #20, #17.
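The text-search strategy in this commit message could be sketched roughly as follows. This is a minimal illustration, not the code in span.rs: the `Span` layout, the `find_name_spans` signature, the ASCII case-insensitive comparison, and the treatment of bytes >= 0x80 as identifier characters are all assumptions made for the example.

```rust
/// Half-open byte range into the statement source (hypothetical layout).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
}

/// Assumption: identifier bytes are ASCII alphanumerics, `_`, or any non-ASCII byte.
fn is_ident_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_' || b >= 0x80
}

/// If a string literal or comment starts at `i`, return the index just past it.
fn skip_string_or_comment(src: &[u8], i: usize) -> Option<usize> {
    match (src.get(i).copied(), src.get(i + 1).copied()) {
        (Some(b'\''), _) => {
            let mut j = i + 1;
            while j < src.len() {
                if src[j] == b'\'' {
                    // A doubled quote ('') is an escaped quote inside the literal.
                    if src.get(j + 1) == Some(&b'\'') {
                        j += 2;
                        continue;
                    }
                    return Some(j + 1);
                }
                j += 1;
            }
            Some(src.len()) // unterminated literal: skip to end of input
        }
        (Some(b'-'), Some(b'-')) => {
            // Line comment: runs up to (not including) the next newline.
            let nl = src[i..].iter().position(|&b| b == b'\n');
            Some(nl.map_or(src.len(), |p| i + p))
        }
        (Some(b'/'), Some(b'*')) => {
            // Block comment: runs through the closing `*/`.
            let close = src[i + 2..]
                .windows(2)
                .position(|w| w[0] == b'*' && w[1] == b'/');
            Some(close.map_or(src.len(), |p| i + 2 + p + 2))
        }
        _ => None,
    }
}

/// Collect whole-identifier occurrences of `name`, ignoring literals, comments,
/// and anything inside `exclude` (for a CTE, its own body span).
pub fn find_name_spans(sql: &str, name: &str, exclude: Option<Span>) -> Vec<Span> {
    let src = sql.as_bytes();
    let needle = name.as_bytes();
    let mut spans = Vec::new();
    if needle.is_empty() {
        return spans;
    }
    let mut i = 0;
    while i < src.len() {
        if let Some(next) = skip_string_or_comment(src, i) {
            i = next;
            continue;
        }
        if let Some(ex) = exclude {
            if i >= ex.start && i < ex.end {
                i = ex.end; // jump over the excluded body
                continue;
            }
        }
        let end = i + needle.len();
        let is_match = end <= src.len()
            && src[i..end].eq_ignore_ascii_case(needle)
            && (i == 0 || !is_ident_byte(src[i - 1]))
            && (end == src.len() || !is_ident_byte(src[end]));
        if is_match {
            spans.push(Span { start: i, end });
            i = end;
        } else {
            i += 1;
        }
    }
    spans
}
```

Called with the CTE's body span as `exclude`, a `WHERE active` inside `WITH active AS (...)` is not counted; called with `None`, it is.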
Refactor span.rs helpers (skip_string_or_comment, find_matching_paren, first_skip_between, find_cte_body_span) to operate on &[u8] so that non-ASCII content in SQL identifiers, comments, or string literals no longer triggers panics on byte-offset string slicing. Emit feature-gated tracing warnings when the scan range violates UTF-8 char boundaries instead of silently returning empty results. Add regression tests covering multi-byte characters in block/line comments and string literals. Also populate name_spans incrementally through add_name_span on the statement context and introduce Default for Node so that analyzer call sites can use struct-update syntax instead of enumerating every field.
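To illustrate why scanning `&[u8]` avoids the slicing panics mentioned above, here is a hedged sketch of a byte-based paren matcher. It borrows the name `find_matching_paren` from the commit message, but the signature and body are assumptions; to stay short it skips only single-quoted literals, not comments. `slice_checked` is a hypothetical helper showing how a byte range can be converted back to `&str` without panicking.

```rust
/// Return the index of the `)` that closes the `(` at `open`.
/// Works on bytes: every byte of a multi-byte UTF-8 character is >= 0x80,
/// so it can never be mistaken for an ASCII quote or parenthesis.
pub fn find_matching_paren(src: &[u8], open: usize) -> Option<usize> {
    if src.get(open) != Some(&b'(') {
        return None;
    }
    let mut depth = 0usize;
    let mut i = open;
    while i < src.len() {
        match src[i] {
            b'\'' => {
                // Skip a string literal, honoring the '' escape.
                i += 1;
                while i < src.len() {
                    if src[i] == b'\'' {
                        if src.get(i + 1) == Some(&b'\'') {
                            i += 2;
                            continue;
                        }
                        break; // `i` now sits on the closing quote
                    }
                    i += 1;
                }
            }
            b'(' => depth += 1,
            b')' => {
                depth -= 1; // depth >= 1 here because the scan starts on `(`
                if depth == 0 {
                    return Some(i);
                }
            }
            _ => {}
        }
        i += 1;
    }
    None
}

/// Convert a byte range back to text without panicking:
/// `str::get` returns `None` when an offset is not on a char boundary.
pub fn slice_checked(sql: &str, start: usize, end: usize) -> Option<&str> {
    sql.get(start..end)
}
```

Indexing a `&str` directly with a range that splits a multi-byte character panics, which is the failure mode the refactor removes; `str::get` turns that case into a `None` that a caller can log.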
Replace the two-pass locate_relation_name_span logic with a single find_relation_occurrence_spans scan that returns the full and tail spans together, so quoted identifiers with embedded dots (e.g. "my.schema"."my.table") resolve to the correct name span. Teach find_cte_body_span to skip optional column lists and [NOT] MATERIALIZED modifiers, and make find_identifier_span skip string literals and comments. Add node_index to StatementContext for O(1) add_name_span lookups, a Node::all_name_spans helper, and document the left-to-right traversal contract on locate_statement_span.
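The `node_index` idea from this commit can be sketched as a map from node id to position in the node list. Only the names `StatementContext`, `node_index`, and `add_name_span` come from the commit message; the field types, `add_node`, `node`, and the use of a string id as the key are assumptions for illustration.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
}

#[derive(Debug, Default)]
pub struct Node {
    pub id: String,
    pub name_spans: Vec<Span>,
}

/// Hypothetical shape: nodes in insertion order plus an id -> index map.
#[derive(Debug, Default)]
pub struct StatementContext {
    nodes: Vec<Node>,
    node_index: HashMap<String, usize>,
}

impl StatementContext {
    /// Insert a node once; later calls with the same id return the same index,
    /// which keeps `nodes` and `node_index` consistent with each other.
    pub fn add_node(&mut self, id: &str) -> usize {
        if let Some(&i) = self.node_index.get(id) {
            return i;
        }
        let i = self.nodes.len();
        self.nodes.push(Node { id: id.to_string(), ..Default::default() });
        self.node_index.insert(id.to_string(), i);
        i
    }

    /// Append a span via a hash lookup instead of a linear scan over `nodes`.
    /// Returns false when the id is unknown.
    pub fn add_name_span(&mut self, id: &str, span: Span) -> bool {
        match self.node_index.get(id) {
            Some(&i) => {
                self.nodes[i].name_spans.push(span);
                true
            }
            None => false,
        }
    }

    pub fn node(&self, id: &str) -> Option<&Node> {
        self.node_index.get(id).map(|&i| &self.nodes[i])
    }
}
```

Keeping the map private and mutating both collections only inside `add_node` is one way to maintain the node/index invariant; the real implementation may enforce it differently.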
Fix relation and CTE span detection for hash comments and Postgres dollar-quoted strings. Also tighten StatementContext node/index invariants, switch relation occurrence tracking to per-name cursors, and update snapshots/tests for the new span metadata.
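A rough sketch of the two extra skip cases named in this commit, again on bytes. The function names and signatures are hypothetical. Note that `#` starts a comment in MySQL but is an operator in Postgres, so this sketch assumes the caller only applies `skip_hash_comment` for dialects where that is valid.

```rust
/// If a Postgres dollar-quoted string (`$$...$$` or `$tag$...$tag$`) starts
/// at `i`, return the index just past its closing tag.
pub fn skip_dollar_quoted(src: &[u8], i: usize) -> Option<usize> {
    if src.get(i) != Some(&b'$') {
        return None;
    }
    // A tag cannot start with a digit, which keeps `$1` parameters out.
    if src.get(i + 1).map_or(false, |b| b.is_ascii_digit()) {
        return None;
    }
    let mut j = i + 1;
    while j < src.len()
        && (src[j].is_ascii_alphanumeric() || src[j] == b'_' || src[j] >= 0x80)
    {
        j += 1;
    }
    if src.get(j) != Some(&b'$') {
        return None; // no closing `$` for the opening tag
    }
    let tag = &src[i..=j]; // includes both `$` delimiters
    let body_start = j + 1;
    match src[body_start..].windows(tag.len()).position(|w| w == tag) {
        Some(p) => Some(body_start + p + tag.len()),
        None => Some(src.len()), // unterminated: skip to end of input
    }
}

/// If a `#` line comment starts at `i`, return the index of the newline that
/// ends it (or the end of input).
pub fn skip_hash_comment(src: &[u8], i: usize) -> Option<usize> {
    if src.get(i) != Some(&b'#') {
        return None;
    }
    let nl = src[i..].iter().position(|&b| b == b'\n');
    Some(nl.map_or(src.len(), |p| i + p))
}
```

A name scan would call these alongside the string and comment skippers, so that `users` inside `$fn$ ... $fn$` or after `#` is not reported as a relation occurrence.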
Capture formatting changes from the workspace checks and refresh generated schema and WASM binding artifacts.
Closes #20. First step from the #17 breakdown.
What
Extends Node with two source-location fields that future navigation features need:

- name_spans: Span[]: every occurrence of the node's name in the statement source (declaration + references). Enables the UI to cycle through references with ◀ n/total ▶ controls.
- body_span?: Span: for CTE nodes, the parenthesized subquery body after AS. Enables the UI to highlight the definition body separately from the name.

Mirrored in Rust (crates/flowscope-core/src/types/response.rs) and TypeScript (packages/core/src/types.ts, docs/api-types.md, docs/api_schema.json).

Scope
Populated for table, view, and CTE nodes.
Column nodes deliberately get empty name_spans in this PR. Accurate per-occurrence column spans require the alias/scope resolver tracked in #27: a naive text match would pick up all occurrences of id regardless of which relation's column is being referenced. Empty name_spans on columns is forward-compatible: consumers that need a source location for a column node fall back to the existing single span.

Column coverage will ship in a follow-up PR once #27 lands.
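As an illustration of the two fields and the column fallback, a consumer-side sketch might look like this. Only `Node`, `name_spans`, `body_span`, `span`, and the existence of a `Default` impl come from this PR; `NodeKind`, `label`, the `Option<Span>` type for `span`, and `navigation_targets` are hypothetical names chosen for the example.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum NodeKind {
    #[default]
    Table,
    View,
    Cte,
    Column,
}

/// Simplified stand-in for the real Node type (most fields omitted).
#[derive(Debug, Clone, Default)]
pub struct Node {
    pub kind: NodeKind,
    pub label: String,
    /// The pre-existing single source location.
    pub span: Option<Span>,
    /// Declaration plus references; empty for column nodes in this PR.
    pub name_spans: Vec<Span>,
    /// For CTE nodes, the parenthesized body after AS.
    pub body_span: Option<Span>,
}

impl Node {
    /// Locations a UI could cycle through: all name occurrences when
    /// populated, otherwise the single `span` (the column fallback).
    pub fn navigation_targets(&self) -> Vec<Span> {
        if !self.name_spans.is_empty() {
            self.name_spans.clone()
        } else {
            self.span.into_iter().collect()
        }
    }
}
```

With `Default` derived, a call site can set only the fields it cares about, for example `Node { kind: NodeKind::Column, span: Some(s), ..Default::default() }`, and a column node built that way yields its single `span` as the only navigation target.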
How
The text-search helpers in analyzer/helpers/span.rs:

- skip string literals (with '' escape handling) and line/block comments, so -- users or 'users' do not produce false positives;
- match whole identifiers only, so users_archive does not match users;
- for CTEs, exclude matches that fall inside the CTE's own body_span (a WHERE active column reference inside WITH active AS (...) is not a reference to the CTE).

Population happens in a single pass at the end of Analyzer::analyze_statement, so every table-like node gets its spans computed exactly once against the statement's source text.

Validation
- cargo test --workspace: 2705 passed, 0 failed
- cargo clippy --workspace -- -D warnings: clean
- cargo fmt --all -- --check: clean
- yarn workspaces run typecheck + test: clean
- just check-schema: clean (Rust ↔ TS schema compat verified)
- Tests in analyzer::helpers::span and tests/lineage_engine.rs covering single refs, multiple refs, CTE body, comment/literal skipping, and the empty-column-spans contract

Test plan
Follow-ups