-
-
Notifications
You must be signed in to change notification settings - Fork 0
Querying
CodeGraph reads a built graph.json and answers four kinds of questions about it:
query (relevant subgraph), path (shortest route between two nodes), explain
(one node and its neighbours), and affected (reverse-impact: what depends on a
node). All four are read-only and operate on the graph produced by codegraph extract (see [Commands] and [Output-Formats]).
By default each command loads codegraph-out/graph.json. Pass --graph <path>
to point at a different file.
codegraph query "user authentication" --max-nodes 30
query retrieves a subgraph relevant to free text. It scores every node by how
well its label tokens overlap the query, picks the best-scoring nodes as seeds,
then expands outward from those seeds until it has collected --max-nodes nodes.
How scoring works:
- Labels and the query are tokenized into lowercased word tokens, splitting on
both
snake_caseandcamelCaseboundaries and dropping tokens shorter than two characters.run_analysis()becomesrun,analysis;AuthServicebecomesauth,service. - Each node's score is the sum of IDF weights of the query tokens it contains,
where IDF is
ln((N + 1) / (1 + df)) + 1withNthe node count anddfthe number of nodes whose label contains that token. Rarer tokens count for more. - Nodes with a score above zero are ranked highest-first (ties broken by node id for determinism). The top 8 become the seeds.
Expansion from the seeds uses the undirected adjacency of the graph (edge
direction and self-loops are ignored). New neighbours are pushed onto a queue;
nodes are collected until the count reaches --max-nodes.
The expansion order is the only difference between the two traversal modes:
- Default (breadth-first): collects nodes in order of distance from the seeds, so the result is a broad neighbourhood around the matches.
-
--dfs: follows a chain as far as it goes before fanning out, favouring deep call chains over wide neighbourhoods.
codegraph query "request handler" --dfs --max-nodes 50
--max-nodes (default 30) bounds the number of nodes in the returned subgraph.
It is a node count, not a token budget. Expansion stops as soon as the limit is
reached; edges are then included only when both their endpoints are in the
collected set.
The command prints the matched seeds, then the subgraph as a list of edges:
Seeds:
- AuthService
- login_user
Subgraph (12 nodes, 9 edges):
AuthService --calls--> login_user
AuthService --uses--> Database
...
If no node scores above zero, it prints No matches for "...".
In a federated graph, --repo <tag> scopes the query to a single member before
running. Scoping drops nodes tagged with other repos and the cross-repo edges
that span them, so seeds and the subgraph come only from that member. See
[Workspaces-and-Federation].
codegraph query "payment" --repo billing-service
codegraph path AuthService Database
path finds the shortest undirected path between two nodes and prints it as a
chain of labels:
AuthService → SessionStore → Database
Both endpoints are resolved from your arguments: an exact node id is used
directly, otherwise the first node whose label equals the argument exactly. If
either endpoint cannot be resolved it prints Could not resolve one or both endpoints. If both resolve but no route connects them it prints No path between <from> and <to>.
The search is a breadth-first walk over undirected adjacency (edge direction is ignored), so the path returned has the fewest hops. A node has a one-element path to itself.
path also accepts --graph and --repo.
codegraph explain AuthService
explain shows one node plus every node it is directly connected to. It prints
the label and source file, the community id (if the node has one), and each
neighbour grouped by direction:
AuthService [src/auth/service.py]
community: 3
neighbours (5):
--> login_user (calls)
--> Database (uses)
<-- LoginController (calls)
...
--> is an outgoing edge (this node is the source); <-- is incoming (this node
is the target). Neighbours are sorted by direction, then relation, then id. The
node argument is resolved the same way as path (exact id, else exact label). If
nothing resolves it prints Node not found: <node>.
explain also accepts --graph and --repo.
codegraph affected login_user --depth 2
affected is reverse-impact analysis: it reports the nodes that (transitively)
depend on a node, so you can see the blast radius of changing it. It walks edges
backward (from target to source) so that "X calls Y" means changing Y affects
X.
affected resolves its node argument through a conservative cascade, stopping at
the first match and never guessing on a tie:
- Exact node id.
- Unique case-insensitive exact label.
- Unique bare name: the label with a trailing
()removed, matched case-insensitively (sotransformmatches a node labelledtransform()). - Unique case-insensitive source file path.
- Unique case-insensitive substring of a label.
If any step would match more than one node, or nothing matches at all, it prints
No unique node match for <node> and stops. (The query/path/explain
commands use a simpler resolver: exact id, then exact label.)
-
--depth <n>(default 2) bounds how many hops backward the walk follows. Each reported node records the relation it was first reached through and the hop count. -
--relation <name>restricts which edge relations propagate impact. It is repeatable. When omitted, a default set of structural relations is used:calls,references,imports,imports_from,re_exports,inherits,extends,implements,uses,mixes_in,embeds,depends_on,reads_from.Containment relations such as
containsandmethodare intentionally not in the default set: containing something is not the same as depending on it, so they do not propagate impact.
codegraph affected Database --relation reads_from --relation depends_on --depth 3
Affected nodes for login_user
Relations: calls, references, imports, ...
Depth: 2
- LoginController [calls] src/web/login.py:L42
- AuthRouter [imports] src/web/router.py:L10
Each line is the affected node, the relation it was reached through, and its
source location. If nothing depends on the seed within the depth bound it prints
No affected nodes found.
affected accepts --graph. It does not take a --repo flag.
- [Commands] for the full command reference.
- [Output-Formats] for the JSON shape these queries operate on.
- [Analysis-and-Reports] for whole-graph structural analysis.
- [Workspaces-and-Federation] for
--reposcoping.
Getting started
Concepts
Using CodeGraph
Integrations
Scaling
Reference