feat(extraction): instantiates + decorates graph edges #134
Two new structural edges that fill gaps in the call graph for
modern JS/TS / Java / C# / Python / Kotlin codebases.
1) `instantiates` edges from `new Foo(...)`:
The bulk-extraction and visitFunctionBody dispatchers only
recognised `call_expression`; `new_expression` (and the equivalent
`object_creation_expression` / `instance_creation_expression` in
other grammars) was silently ignored. Adds INSTANTIATION_KINDS,
extractInstantiation(), and dispatch from BOTH the top-level
visitNode and the per-function-body walker. Children are still
descended so nested calls inside constructor args (`new Foo(bar())`)
get their own `calls` refs.
Output: a `bootstrap` function that does `new UserService(); new
UserController(svc)` now produces two `instantiates` edges to those
class nodes — previously zero edges.
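The dispatch described in item 1 can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `SketchNode`, `SketchRef`, `calleeText`, and `collectRefs` are hypothetical names, the node shape is a hand-rolled stand-in for a tree-sitter node, and the sketch assumes the constructor or callee is the first child of the expression.

```typescript
// Hypothetical, simplified stand-in for a tree-sitter node (not the real binding API).
interface SketchNode {
  type: string;
  text: string;
  children: SketchNode[];
}

interface SketchRef {
  kind: "calls" | "instantiates";
  referenceName: string;
}

// The three node kinds named in the commit message.
const INSTANTIATION_KINDS = new Set([
  "new_expression",
  "object_creation_expression",
  "instance_creation_expression",
]);

// Assumption: the constructor / callee is the first child of the expression.
function calleeText(node: SketchNode): string {
  const first = node.children[0];
  return first ? first.text : node.text;
}

function collectRefs(node: SketchNode, out: SketchRef[] = []): SketchRef[] {
  if (INSTANTIATION_KINDS.has(node.type)) {
    out.push({ kind: "instantiates", referenceName: calleeText(node) });
  } else if (node.type === "call_expression") {
    out.push({ kind: "calls", referenceName: calleeText(node) });
  }
  // Always descend, so `new Foo(bar())` also yields a `calls` ref for `bar`.
  for (const child of node.children) {
    collectRefs(child, out);
  }
  return out;
}

// Hand-built tree for `new Foo(bar())`.
const leaf = (type: string, text: string): SketchNode => ({ type, text, children: [] });
const barCall: SketchNode = {
  type: "call_expression",
  text: "bar()",
  children: [leaf("identifier", "bar"), leaf("arguments", "()")],
};
const newFoo: SketchNode = {
  type: "new_expression",
  text: "new Foo(bar())",
  children: [
    leaf("identifier", "Foo"),
    { type: "arguments", text: "(bar())", children: [barCall] },
  ],
};
const sketchRefs = collectRefs(newFoo);
```

Here `sketchRefs` holds one `instantiates` ref for `Foo` followed by one `calls` ref for `bar`, which is the nested-argument behaviour the commit message describes.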
2) `decorates` edges from `@Decorator` annotations:
Tree-sitter places decorator nodes BEFORE the symbol they apply to
in the AST, so the original walk-time dispatch saw the wrong
nodeStack head (file/class instead of class/method). Replaced with
extractDecoratorsFor(declNode, decoratedId) that runs from inside
extractClass / extractFunction / extractMethod after the symbol's
node id is known.
Looks for decorator nodes in two places:
- Direct named children of the declaration (method/property style)
- Preceding siblings in the parent (TypeScript class style:
@Foo class X {} parses as parent { decorator, class_decl })
Sibling check uses startIndex comparison rather than reference
identity — tree-sitter web bindings return fresh JS wrappers from
parent/namedChild navigation, so `===` is unreliable. Took a debug
session to spot this; flagging in the comment so the next reader
doesn't re-introduce the bug.
Output: a `@Controller` class decorator + `@Get` method decorator
on a NestJS-style controller now produce two `decorates` edges
(class→Controller, method→Get) with the correct source nodes.
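The reference-identity pitfall mentioned above can be reproduced without tree-sitter. The sketch below is an assumption-laden model, not the real bindings: `RawNode`, `FreshNode`, and `wrap` are hypothetical, and `wrap` simply mimics an API that returns a new JS object on every navigation call. It shows why locating the declaration among its siblings by `===` fails while a `startIndex` comparison works.

```typescript
// Hypothetical underlying node data.
interface RawNode {
  type: string;
  text: string;
  startIndex: number;
  children: RawNode[];
  parent?: RawNode;
}

// Hypothetical wrapper shape: every navigation call builds a fresh object.
interface FreshNode {
  type: string;
  text: string;
  startIndex: number;
  parent(): FreshNode | null;
  namedChildren(): FreshNode[];
}

function wrap(raw: RawNode): FreshNode {
  return {
    type: raw.type,
    text: raw.text,
    startIndex: raw.startIndex,
    parent: () => (raw.parent ? wrap(raw.parent) : null),
    namedChildren: () => raw.children.map((c) => wrap(c)),
  };
}

// Model of `@Controller class X {}` parsed as parent { decorator, class_decl }.
const rawDecorator: RawNode = { type: "decorator", text: "@Controller", startIndex: 0, children: [] };
const rawClass: RawNode = { type: "class_declaration", text: "class X {}", startIndex: 12, children: [] };
const rawParent: RawNode = { type: "parent", text: "@Controller class X {}", startIndex: 0, children: [rawDecorator, rawClass] };
rawDecorator.parent = rawParent;
rawClass.parent = rawParent;

const declNode = wrap(rawClass);
const siblingList = declNode.parent()?.namedChildren() ?? [];

// Identity comparison never matches: each sibling is a different wrapper object.
const indexByIdentity = siblingList.findIndex((s) => s === declNode);
// Position comparison finds the declaration reliably.
const indexByStart = siblingList.findIndex((s) => s.startIndex === declNode.startIndex);
```

In this model `indexByIdentity` is `-1` and `indexByStart` is `1`, so only the position-based lookup can tell which siblings precede the declaration.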
Verified live on a synthetic NestJS-shape fixture; all 380
existing tests pass.
…ric constructors, property/field decorators, marker_annotation, tests
Five fixes from independent semantic review:
- extractDecoratorsFor sibling walk now iterates BACKWARD from the
declaration and stops at the first non-decorator/annotation
separator. Previous version walked forward up to declStart and
consumed every decorator-typed sibling — so two adjacent
decorated classes (`@A class Foo {} @B class Bar {}`) had `@A`
spuriously attributed to `Bar`.
- extractInstantiation strips the type-argument suffix from the
constructor field text. `new Map<K, V>()` was producing
referenceName 'Map<K, V>' (the constructor field is a generic_type
node) and resolution always failed.
- extractProperty and extractField now call extractDecoratorsFor
after their createNode calls. NestJS-style `@Inject() private
svc: Foo` and Java field annotations were being silently dropped.
- consider() in extractDecoratorsFor recognises 'marker_annotation'
in addition to 'decorator'/'annotation'. Java's tree-sitter grammar
emits marker_annotation for arg-less annotations like @Override
and @Deprecated; without this every Java marker annotation was
silently skipped.
- 6 new extraction tests covering: instantiates ref for new Foo(),
generic-type stripping (`new Container<string>()` -> 'Container'),
qualified-new keeps trailing identifier (`new ns.Foo()` -> 'Foo'),
decorates ref for @Foo class X {}, regression for adjacent
decorated classes (each gets its OWN decorator), decorates ref
for @Foo method().
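The name normalisation covered by the generic-stripping and qualified-new tests can be sketched on plain strings. This is only an illustration: `constructorName` is a hypothetical helper operating on the constructor field's text, and the actual extractInstantiation may well work on AST nodes instead.

```typescript
// Hypothetical helper: reduce the constructor text to the bare class name.
function constructorName(ctorText: string): string {
  // Drop a type-argument suffix: `Map<K, V>` -> `Map`.
  const angle = ctorText.indexOf("<");
  const withoutTypeArgs = angle === -1 ? ctorText : ctorText.slice(0, angle);
  // Keep only the trailing identifier of a qualified name: `ns.Foo` -> `Foo`.
  const dot = withoutTypeArgs.lastIndexOf(".");
  return withoutTypeArgs.slice(dot + 1).trim();
}

const fromGeneric = constructorName("Map<K, V>");           // "Map"
const fromContainer = constructorName("Container<string>"); // "Container"
const fromQualified = constructorName("ns.Foo");            // "Foo"
const fromPlain = constructorName("Foo");                   // "Foo"
```

Stripping the type arguments before taking the trailing identifier matters for inputs such as `ns.Box<a.B>`, where a dot inside the type arguments would otherwise be mistaken for the qualifier separator.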
Full test suite: 386 passed (was 380, +6 new extraction tests).
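The backward sibling walk from the first fix, together with the `marker_annotation` kind from the fourth, might look roughly like this. `SiblingNode`, `DECORATOR_KINDS`, and `precedingDecorators` are hypothetical names, and the sibling list is a flat array standing in for the parent's children; the real extractDecoratorsFor operates on tree-sitter nodes.

```typescript
// Hypothetical flat view of a parent's children.
interface SiblingNode {
  type: string;
  text: string;
  startIndex: number;
}

// Kinds treated as decorators, per the commit message.
const DECORATOR_KINDS = new Set(["decorator", "annotation", "marker_annotation"]);

function precedingDecorators(siblings: SiblingNode[], declStart: number): string[] {
  // Locate the declaration by position, not by object identity.
  const declIndex = siblings.findIndex((s) => s.startIndex === declStart);
  const found: string[] = [];
  // Walk BACKWARD from the declaration and stop at the first sibling that is
  // not a decorator, so decorators of an earlier declaration are never consumed.
  for (let i = declIndex - 1; i >= 0; i--) {
    const sib = siblings[i];
    if (!sib || !DECORATOR_KINDS.has(sib.type)) break;
    found.unshift(sib.text); // keep source order
  }
  return found;
}

// Model of `@A class Foo {} @B class Bar {}`.
const adjacentSiblings: SiblingNode[] = [
  { type: "decorator", text: "@A", startIndex: 0 },
  { type: "class_declaration", text: "class Foo {}", startIndex: 3 },
  { type: "decorator", text: "@B", startIndex: 16 },
  { type: "class_declaration", text: "class Bar {}", startIndex: 19 },
];
const fooDecorators = precedingDecorators(adjacentSiblings, 3);
const barDecorators = precedingDecorators(adjacentSiblings, 19);
```

With this walk `Foo` gets only `@A` and `Bar` gets only `@B`; the earlier forward walk would have collected both `@A` and `@B` for `Bar`.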
…edges # Conflicts: # __tests__/extraction.test.ts
Adds Steps K-O to walk the new PRs in dependency order:
  K: bug-fix wave (clean): colbymchenry#128, colbymchenry#129
  L: resolution + search: colbymchenry#130 (resolve), colbymchenry#131 (resolve)
  M: extraction edges: colbymchenry#134 (resolve)
  N: biomarker stack: colbymchenry#132, colbymchenry#133 (both resolve, on top of colbymchenry#125)
  O: search advanced: colbymchenry#135 (resolve, on top of colbymchenry#131)
Also flips colbymchenry#125 from merge_clean to merge_resolve - it now hits a queries.ts conflict after the Phase-4 stack lands (colbymchenry#111/colbymchenry#112/colbymchenry#123/colbymchenry#124 all extend the same QueryBuilder surface, so colbymchenry#125's biomarker columns no longer apply cleanly without a resolution).
Validated end-to-end against colbymchenry/main HEAD: script ran clean through all 43 PRs, npm run build succeeded, full test suite reports 877/877 passing (was 829 before this wave: +48 from new tests added by the new PRs plus the reviewer-driven follow-ups).
Two follow-ups to the new instantiates/decorates ref kinds, surfaced
during review:
1) name-matcher previously only had a kind bonus for `calls`
(preferring function/method). When a class and a function share a
name across modules, an `instantiates` ref would tie or pick the
wrong candidate. Adds:
- `instantiates` → +25 for class/struct/interface
- `decorates` → +25 for function/method, +15 for class
(Python class decorators, Java annotation interfaces)
2) Python (and Ruby) have no `new` keyword — `Foo()` is the standard
instantiation syntax, indistinguishable from a function call at
extraction time. Resolution can tell the difference once the
target is known: when a `calls` ref resolves to a class/struct,
promote it to `instantiates`. Mirrors the existing extends→
implements promotion in createEdges.
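The promotion in item 2 amounts to a small decision made once the target's kind is known. A minimal sketch, assuming hypothetical `RefKind` and `TargetKind` unions and a hypothetical `edgeKindFor` helper (the real logic lives in createEdges and is not shown in this PR text):

```typescript
type RefKind = "calls" | "instantiates" | "decorates";
type TargetKind = "class" | "struct" | "interface" | "function" | "method";

// Hypothetical helper: decide the final edge kind after resolution.
function edgeKindFor(refKind: RefKind, targetKind: TargetKind): RefKind {
  // Python/Ruby `Foo()` is extracted as `calls`; if it resolved to a class
  // or struct, it was really a constructor invocation.
  if (refKind === "calls" && (targetKind === "class" || targetKind === "struct")) {
    return "instantiates";
  }
  return refKind;
}

const promoted = edgeKindFor("calls", "class");     // "instantiates"
const unchanged = edgeKindFor("calls", "function"); // "calls"
```

Doing this at resolution time rather than extraction time keeps the extractor language-agnostic: it never has to guess whether `Foo` names a class.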
Verified: 386 → 389 passing (+3 tests covering the kind biases and
the Python promotion).
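The kind biases from item 1 can be illustrated with a small scorer. Only the +25 and +15 values come from the commit message; `MatchCandidate`, `baseScore`, `kindBonus`, and `pickBest` are hypothetical, and the existing `calls` bonus is omitted because its value is not stated here.

```typescript
// Hypothetical candidate shape; baseScore stands in for whatever the
// name-matcher computes before kind bonuses are applied.
interface MatchCandidate {
  id: string;
  kind: string;
  baseScore: number;
}

function kindBonus(refKind: string, candidateKind: string): number {
  if (refKind === "instantiates") {
    return ["class", "struct", "interface"].indexOf(candidateKind) !== -1 ? 25 : 0;
  }
  if (refKind === "decorates") {
    if (candidateKind === "function" || candidateKind === "method") return 25;
    if (candidateKind === "class") return 15;
  }
  return 0;
}

function pickBest(refKind: string, candidates: MatchCandidate[]): MatchCandidate | undefined {
  let best: MatchCandidate | undefined;
  let bestScore = -Infinity;
  for (const c of candidates) {
    const score = c.baseScore + kindBonus(refKind, c.kind);
    if (score > bestScore) {
      best = c;
      bestScore = score;
    }
  }
  return best;
}

// A class and a function sharing a name across modules, tied on base score.
const sameNameCandidates: MatchCandidate[] = [
  { id: "utils/Logger#function", kind: "function", baseScore: 50 },
  { id: "core/Logger#class", kind: "class", baseScore: 50 },
];
const bestForNew = pickBest("instantiates", sameNameCandidates);
const bestForDecorator = pickBest("decorates", sameNameCandidates);
```

With equal base scores, the `instantiates` ref resolves to the class and the `decorates` ref to the function, which is the tie-breaking behaviour the follow-up describes.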
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reviewed and approved. Pushed two follow-ups before merging (commit
1. Kind-aware scoring in The existing scorer only had a kind bonus for
2. Python instantiation promotion in The PR description listed Python, but Python (and Ruby) have no
Known behavior change (left as-is, intentionally): Dart's
Verified: 386 → 389 passing (+3 tests covering the kind biases and the Python promotion).
Merging.
Summary
Two new structural edges that fill significant gaps in the call graph for modern JS/TS / Java / C# / Python / Kotlin codebases.
`instantiates` edges from `new Foo(...)`
Previously the extractor only recognised `call_expression`; `new_expression` (and equivalents in other grammars) was silently ignored, so constructor invocations produced zero graph edges.
Adds an `INSTANTIATION_KINDS` set covering `new_expression`, `object_creation_expression`, and `instance_creation_expression`. Wired into both the top-level `visitNode` dispatcher AND the per-function-body `visitForCallsAndStructure` walker; `new` calls inside function bodies were being missed by the body walker even after the top-level dispatch was hooked up.
Handles three subtleties:
- `new Map<K, V>()` strips the angle-bracket suffix to produce `'Map'`, so resolution can match the class node.
- `new ns.Foo()` keeps the trailing identifier (`'Foo'`), the same shape the existing `extractCall` produces for member calls.
- `new Foo(bar())` produces both an `instantiates` ref and a `calls` ref.
`decorates` edges from `@Decorator` annotations
Tree-sitter places decorator nodes BEFORE the symbol they apply to, so a naive walk-time dispatch saw the wrong nodeStack head (file/class instead of class/method). The fix runs decorator extraction from inside `extractClass` / `extractFunction` / `extractMethod` / `extractProperty` / `extractField`, after the symbol's node id is known.
Looks for decorator nodes in two places:
- Direct named children of the declaration (method/property style).
- Preceding siblings in the parent (TypeScript class style: `@Foo class X {}` parses as `parent { decorator, class_decl }`).
Handles two subtleties:
- The sibling walk iterates backward from the declaration and stops at the first non-decorator sibling; otherwise `@A class Foo {} @B class Bar {}` would attribute `@A` to `Bar`. Caught by reviewer; covered by a regression test.
- In the tree-sitter web bindings, `parent` / `namedChild` returns fresh JS wrapper objects, so `sibling === declNode` is unreliable; uses `startIndex` comparison instead.
Also recognises Java's `marker_annotation` (no-args annotations like `@Override` / `@Deprecated`) alongside `decorator` / `annotation`.
Test plan
Verified live on a synthetic NestJS-shape fixture:
- `new Foo()` / generic-stripping / qualified-new / `@Foo class X` / two-adjacent-decorated-classes regression / `@Foo method()`
- `npx vitest run`: 386 passed (was 380, +6 new)
- `npx tsc --noEmit` clean
- `npm run build` succeeds
🤖 Generated with Claude Code