Querying
Synaptic reads a built graph.json and answers four kinds of questions about it:
query (relevant subgraph), path (shortest route between two nodes), explain
(one node and its neighbours), and affected (reverse-impact: what depends on a
node). All four are read-only and operate on the graph produced by synaptic extract (see [Commands] and [Output-Formats]).
By default each command loads synaptic-out/graph.json. Pass --graph <path>
to point at a different file.
For structural queries (match on kind, visibility, lines of code, fan-in/out,
relationships, variable-length paths, and aggregation), use the search command
and its SYNQL query language instead. SYNQL is documented in full under
Commands; search matches on structure rather than on the
free-text relevance that query scores.
The SQL layer is queryable the same way: MATCH (t:table) WHERE t.rls_enabled = "false" RETURN t finds tables without row-level security, and (c:column) /
(i:index) / (p:policy) match the SQL objects extraction now models (tables
also expose dialect). See SQL Auditing.
synaptic query "user authentication" --max-nodes 30
query retrieves a subgraph relevant to free text. It scores every node by how
well its label tokens overlap the query, picks the best-scoring nodes as seeds,
then expands outward from those seeds — best-first, by relevance — until it has
collected --max-nodes nodes. Results come back ranked, each with a relevance
score.
How scoring works:
- Labels and the query are tokenized into lowercased word tokens, splitting on both snake_case and camelCase boundaries and dropping tokens shorter than two characters. run_analysis() becomes run, analysis; AuthService becomes auth, service.
- A node's seed score is the sum of IDF weights of the query tokens it contains, divided by the square root of the node's token count, so a long label can't out-score a tight match just by accumulating tokens. IDF is ln((N + 1) / (1 + df)) + 1, with N the node count and df the number of nodes whose label contains that token, so rarer tokens count for more.
- Nodes scoring above zero are ranked highest-first (ties broken by node id for determinism). The top 8 become the seeds.
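The scoring steps above can be sketched in a few lines of Python. This is an illustrative reconstruction from the description, not Synaptic's source: the function names tokenize and seed_scores are hypothetical, and details the text leaves open (treating digits as lowercase for camelCase splitting, counting duplicate label tokens toward the token count, deduplicating query tokens) are assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercased word tokens, split on snake_case and camelCase boundaries."""
    # Put a space at each lower-to-upper transition, then split on non-alphanumerics.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", text)
    parts = re.split(r"[^A-Za-z0-9]+", spaced)
    # Drop empty strings and tokens shorter than two characters.
    return [p.lower() for p in parts if len(p) >= 2]


def seed_scores(labels, query, top_k=8):
    """labels: dict of node id -> label. Returns up to top_k (id, score) pairs."""
    node_tokens = {nid: tokenize(label) for nid, label in labels.items()}
    n = len(node_tokens)
    # df[t] = number of nodes whose label contains token t.
    df = Counter(t for toks in node_tokens.values() for t in set(toks))
    query_tokens = set(tokenize(query))  # assumption: query tokens are deduplicated
    scores = {}
    for nid, toks in node_tokens.items():
        matched = query_tokens & set(toks)
        if not matched:
            continue
        idf_sum = sum(math.log((n + 1) / (1 + df[t])) + 1 for t in matched)
        # Normalise by sqrt of the label's token count (assumed to include duplicates).
        scores[nid] = idf_sum / math.sqrt(len(toks))
    # Highest score first; ties broken by node id for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]
```

With three labels AuthService, login_user, and UserAuthenticationManagerFactory and the query "user authentication", the long label still wins here because it matches both query tokens, one of them rare; the square-root normalisation only stops it from winning on length alone.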
Expansion uses the undirected adjacency of the graph (edge direction and
self-loops are ignored), but it is best-first, not a plain breadth-first
wave: the frontier is a priority queue keyed by relevance, so the --max-nodes
budget is spent on the most relevant neighbourhood rather than on whatever a
breadth-first sweep happened to reach first. Two refinements keep the result
clean:
- Hub penalty. A high-fan-out node (a registry, a Builder, a documentation index) is down-weighted in proportion to how far its degree exceeds the graph average, so it is expanded last and its many incidental neighbours rarely reach the budget. This stops one hub from flooding the result with noise.
- Decay. A neighbour inherits a fraction of the relevance of the node that reached it, so far-flung nodes fade while a genuinely relevant chain survives.
Every returned node keeps a final relevance score; nodes and edges are returned sorted by it (edges by the relevance of their weaker endpoint), so you can read the top of the list and ignore the low-scored tail.
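A minimal sketch of best-first expansion with a hub penalty and decay, using a heap as the priority queue. The exact penalty formula and decay fraction are not given here, so the ones below (divide by the ratio of a node's degree to the average degree, and a decay of 0.5) are placeholders, and the function name expand is hypothetical.

```python
import heapq
from collections import defaultdict


def expand(edges, seeds, max_nodes=30, decay=0.5):
    """edges: iterable of (a, b) pairs; seeds: dict of node -> seed score."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:          # self-loops are ignored
            adj[a].add(b)   # direction is ignored: store both ways
            adj[b].add(a)
    avg_degree = sum(len(n) for n in adj.values()) / max(len(adj), 1)

    def penalised(node, score):
        # Assumed penalty: shrink the score by degree / average degree, never boost.
        ratio = len(adj[node]) / avg_degree if avg_degree else 1.0
        return score / max(ratio, 1.0)

    heap, order = [], 0
    for node, score in seeds.items():
        heapq.heappush(heap, (-penalised(node, score), order, node))
        order += 1

    relevance = {}
    while heap and len(relevance) < max_nodes:
        neg_score, _, node = heapq.heappop(heap)
        if node in relevance:
            continue  # already collected through a better route
        relevance[node] = -neg_score
        for nb in sorted(adj[node]):
            if nb not in relevance:
                inherited = relevance[node] * decay  # decay along the chain
                heapq.heappush(heap, (-penalised(nb, inherited), order, nb))
                order += 1
    return relevance
```

On a small graph where a seed touches both a ten-neighbour hub and a two-node chain, a budget of three collects the seed and the chain, and the hub is only reached once the budget grows.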
Both modes expand best-first by relevance; the traversal mode only breaks score ties:
- Default (breadth-first): among equally-relevant frontier nodes, the earlier-discovered (shallower) one is taken first, giving a broad neighbourhood around the matches.
- --dfs: among equal scores, the later-discovered (deeper) one is taken first, favoring deep call chains over wide neighbourhoods.
synaptic query "request handler" --dfs --max-nodes 50
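One way to picture the tie-break is as the second component of the heap key: relevance decides first, and discovery order decides only among equals. The sketch below is an assumption about how such a key could be built, with a hypothetical helper pop_order; it is not taken from Synaptic's code.

```python
import heapq


def pop_order(items, dfs=False):
    """items: list of (relevance, label) in discovery order.

    Returns the labels in the order a best-first frontier would take them.
    """
    heap = []
    for discovered, (relevance, label) in enumerate(items):
        # Default: earlier-discovered wins ties. With dfs: later-discovered wins.
        tie = -discovered if dfs else discovered
        heapq.heappush(heap, (-relevance, tie, label))
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]
```

For a frontier discovered as a (1.0), b (2.0), c (1.0), both modes take b first; the default then takes a before c, while dfs takes c before a.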
--max-nodes (default 30) bounds the number of nodes in the returned subgraph.
It is a node count, not a token budget. Expansion stops as soon as the limit is
reached; edges are then included only when both their endpoints are in the
collected set.
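The edge rule amounts to taking the induced subgraph over the collected nodes. A small sketch, assuming edges are (source, relation, target) triples and relevance is the node-to-score mapping produced by expansion; the function name subgraph_edges is hypothetical.

```python
def subgraph_edges(edges, relevance):
    """Keep edges with both endpoints collected, ordered by the weaker endpoint."""
    kept = [e for e in edges if e[0] in relevance and e[2] in relevance]
    # An edge ranks by the lower of its two endpoint scores, highest first.
    return sorted(kept, key=lambda e: -min(relevance[e[0]], relevance[e[2]]))
```

An edge to a node that missed the budget is dropped entirely rather than shown with a dangling endpoint.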
--since <baseline> boosts nodes whose file changed on the current branch, so
in-progress code surfaces first. The baseline is a git ref (main, HEAD~10), a
date ("2 weeks ago"), or auto to detect the default branch. The changed set is
scoped to merge-base(<baseline>, HEAD)..working-tree, so it includes uncommitted
edits — what you are working on right now — and the boost is weighted by each
file's churn (lines changed).
synaptic query "collider mesh" --since main
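To inspect roughly the same changed set by hand, standard git commands are enough: git merge-base finds the fork point, and git diff --numstat against that commit reports per-file added and deleted line counts, including uncommitted edits to tracked files. The sketch below is an approximation under those assumptions, not Synaptic's resolution logic; it handles only git refs (not dates or auto), does not see untracked files, and the names parse_numstat and changed_churn are hypothetical.

```python
import subprocess


def parse_numstat(output):
    """Map each path to lines added + deleted, from `git diff --numstat` text."""
    churn = {}
    for line in output.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue
        added, deleted, path = parts
        # Binary files report "-" for both counts; treat them as zero churn.
        a = int(added) if added.isdigit() else 0
        d = int(deleted) if deleted.isdigit() else 0
        churn[path] = a + d
    return churn


def changed_churn(baseline, cwd="."):
    """Churn per file between merge-base(baseline, HEAD) and the working tree."""
    base = subprocess.run(
        ["git", "merge-base", baseline, "HEAD"],
        cwd=cwd, capture_output=True, text=True, check=True,
    ).stdout.strip()
    diff = subprocess.run(
        ["git", "diff", "--numstat", base],
        cwd=cwd, capture_output=True, text=True, check=True,
    ).stdout
    return parse_numstat(diff)
```

Calling changed_churn("main") inside a repository returns a path-to-churn mapping that can be compared against the (changed) markers in the ranked list.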
Changed nodes are marked (changed) in the ranked list and float toward the top,
while a strong query match still holds its rank — recency re-ranks within the
relevant set rather than replacing it. Add --seed-changed to also inject the
changed-file nodes as seeds, so the branch's changed surface appears even when the
query matches little ("what did this branch change"):
synaptic query "anything" --since main --seed-changed
Resolution runs git; if the directory is not a git repo, the ref does not
resolve, or nothing changed, the command prints a short note and falls back to a
plain query. The MCP query_graph tool exposes the same via its since and
recency_mode arguments — see [MCP-Server].
The command prints the matched seeds, the ranked nodes with their scores, then the
subgraph as a list of edges (a Recency: header and (changed) markers appear
when --since is used):
Seeds:
- AuthService
- login_user
Ranked nodes (12):
[6.10] AuthService
[4.80] login_user
...
Subgraph (12 nodes, 9 edges):
AuthService --calls--> login_user
AuthService --uses--> Database
...
If no node scores above zero (and no changed nodes are seeded), it prints
No matches for "...".
In a federated graph, --repo <tag> scopes the query to a single member before
running. Scoping drops nodes tagged with other repos and the cross-repo edges
that span them, so seeds and the subgraph come only from that member. See
[Workspaces-and-Federation].
synaptic query "payment" --repo billing-service
synaptic path AuthService Database
path finds the shortest undirected path between two nodes and prints it as a
chain of labels:
AuthService → SessionStore → Database
Both endpoints are resolved from your arguments: an exact node id is used
directly, otherwise the first node whose label equals the argument exactly. If
either endpoint cannot be resolved it prints Could not resolve one or both endpoints. If both resolve but no route connects them it prints No path between <from> and <to>.
The search is a breadth-first walk over undirected adjacency (edge direction is ignored), so the path returned has the fewest hops. A node has a one-element path to itself.
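The walk described above is a textbook breadth-first search. A minimal sketch, assuming edges are given as (a, b) pairs and endpoints are already resolved to node ids; the function name shortest_path is hypothetical, and neighbours are visited in sorted order only to make the result deterministic.

```python
from collections import defaultdict, deque


def shortest_path(edges, start, goal):
    """Fewest-hop path over undirected adjacency, or None if unreachable."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)  # edge direction is ignored
    if start == goal:
        return [start]  # a node has a one-element path to itself
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nb in sorted(adj[node]):
            if nb in parent:
                continue
            parent[nb] = node
            if nb == goal:
                # Walk the parent links back to the start, then reverse.
                path = [nb]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(nb)
    return None
```

Joining the result with " → " gives the chain format shown earlier.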
path also accepts --graph and --repo.
synaptic explain AuthService
explain shows one node plus every node it is directly connected to. It prints
the label and source file, the community id (if the node has one), and each
neighbour grouped by direction:
AuthService [src/auth/service.py]
community: 3
neighbours (5):
--> login_user (calls)
--> Database (uses)
<-- LoginController (calls)
...
--> is an outgoing edge (this node is the source); <-- is incoming (this node
is the target). Neighbours are sorted by direction, then relation, then id. The
node argument is resolved the same way as path (exact id, else exact label). If
nothing resolves it prints Node not found: <node>.
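The grouping and sort order can be sketched as follows, assuming edges are (source, relation, target) triples. Putting outgoing edges before incoming ones matches the sample output above but is otherwise an assumption, and the function name neighbours is hypothetical.

```python
def neighbours(edges, node):
    """Return (arrow, other_node, relation) rows for one node, sorted for display."""
    rows = []
    for src, rel, dst in edges:
        if src == node:
            rows.append(("-->", dst, rel))  # outgoing: this node is the source
        if dst == node:
            rows.append(("<--", src, rel))  # incoming: this node is the target
    # Sort by direction (outgoing first), then relation, then node id.
    return sorted(rows, key=lambda r: (r[0] != "-->", r[2], r[1]))
```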
explain also accepts --graph and --repo.
synaptic affected login_user --depth 2
affected is reverse-impact analysis: it reports the nodes that (transitively)
depend on a node, so you can see the blast radius of changing it. It walks edges
backward (from target to source) so that "X calls Y" means changing Y affects
X.
affected resolves its node argument through a conservative cascade, stopping at
the first match and never guessing on a tie:
- Exact node id.
- Unique case-insensitive exact label.
- Unique bare name: the label with a trailing () removed, matched case-insensitively (so transform matches a node labeled transform()).
- Unique case-insensitive source file path.
- Unique case-insensitive substring of a label.
If any step would match more than one node, or nothing matches at all, it prints
No unique node match for <node> and stops. (The query/path/explain
commands use a simpler resolver: exact id, then exact label.)
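The cascade for affected can be sketched as an ordered list of predicates, where the first step with any hits decides the outcome. The node shape (a dict with label and file keys) and the function name resolve are assumptions made for illustration.

```python
def resolve(nodes, arg):
    """nodes: dict of id -> {"label": str, "file": str}. Returns an id or None."""
    if arg in nodes:
        return arg  # step 1: exact node id
    needle = arg.lower()

    def bare(label):
        # Strip one trailing "()" so "transform" can match "transform()".
        return label[:-2] if label.endswith("()") else label

    steps = [
        lambda n: n["label"].lower() == needle,        # exact label
        lambda n: bare(n["label"]).lower() == needle,  # bare name
        lambda n: n.get("file", "").lower() == needle, # source file path
        lambda n: needle in n["label"].lower(),        # label substring
    ]
    for matches in steps:
        hits = [nid for nid, n in nodes.items() if matches(n)]
        if len(hits) == 1:
            return hits[0]
        if len(hits) > 1:
            return None  # ambiguous: stop rather than guess
    return None
```

A None result corresponds to the No unique node match message.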
- --depth <n> (default 2) bounds how many hops backward the walk follows. Each reported node records the relation it was first reached through and the hop count.
- --relation <name> restricts which edge relations propagate impact. It is repeatable. When omitted, a default set of structural relations is used: calls, references, imports, imports_from, re_exports, inherits, extends, implements, uses, mixes_in, embeds, depends_on, reads_from, and the cross-language relations invokes, binds_native, calls_service, and handled_by.
  The four cross-language relations mean reverse-impact crosses language boundaries: changing an HTTP/gRPC handler reaches the clients that call it, a Rust function exported through PyO3 reaches the Python that imports it, and a binary reaches the scripts that invoke it. See Cross-Language-Edges.
  Containment relations such as contains and method are intentionally not in the default set: containing something is not the same as depending on it, so they do not propagate impact.
synaptic affected Database --relation reads_from --relation depends_on --depth 3
Affected nodes for login_user
Relations: calls, references, imports, ...
Depth: 2
- LoginController [calls] src/web/login.py:L42
- AuthRouter [imports] src/web/router.py:L10
Each line is the affected node, the relation it was reached through, and its
source location. If nothing depends on the seed within the depth bound it prints
No affected nodes found.
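The reverse walk can be sketched as a depth-bounded breadth-first search over incoming edges, filtered by relation. Edges are assumed to be (source, relation, target) triples, the default relation set is copied from the list above, and the function name affected is used only for illustration.

```python
from collections import defaultdict, deque

DEFAULT_RELATIONS = frozenset({
    "calls", "references", "imports", "imports_from", "re_exports",
    "inherits", "extends", "implements", "uses", "mixes_in", "embeds",
    "depends_on", "reads_from",
    "invokes", "binds_native", "calls_service", "handled_by",
})


def affected(edges, seed, depth=2, relations=DEFAULT_RELATIONS):
    """Return {node: (relation first reached through, hop count)}."""
    incoming = defaultdict(list)
    for src, rel, dst in edges:
        if rel in relations:
            incoming[dst].append((src, rel))  # index edges by their target
    found = {}
    queue = deque([(seed, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == depth:
            continue  # depth bound reached: do not walk further back
        for src, rel in incoming[node]:
            if src != seed and src not in found:
                found[src] = (rel, hops + 1)
                queue.append((src, hops + 1))
    return found
```

Because contains is not in the default set, a module that merely contains the seed is not reported, while a caller two hops back is.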
affected accepts --graph. It does not take a --repo flag.
- [Commands] for the full command reference.
- [Output-Formats] for the JSON shape these queries operate on.
- [Analysis-and-Reports] for whole-graph structural analysis.
- [Workspaces-and-Federation] for --repo scoping.