feat(extraction): instantiates + decorates graph edges #134

Merged
colbymchenry merged 3 commits into colbymchenry:main from andreinknv:feat/decorator-and-constructor-edges
May 8, 2026

@andreinknv
Contributor

Summary

Two new structural edges that fill significant gaps in the call graph for modern JS/TS / Java / C# / Python / Kotlin codebases.

instantiates edges from new Foo(...)

Previously the extractor only recognised call_expression; new_expression (and equivalents in other grammars) was silently ignored, so constructor invocations produced zero graph edges.

Adds an INSTANTIATION_KINDS set covering new_expression, object_creation_expression, and instance_creation_expression. Wired into both the top-level visitNode dispatcher AND the per-function-body visitForCallsAndStructure walker — new calls inside function bodies were being missed by the body walker even after the top-level dispatch was hooked up.
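The dispatch described above can be sketched roughly as follows. This is a minimal illustration, not the extractor's actual code: the `AstNode` and `Ref` shapes and the `collectRefs` function are hypothetical stand-ins for tree-sitter nodes and the real `visitNode` / `visitForCallsAndStructure` walkers. Only the three node kinds come from the description.

```typescript
// Hypothetical minimal node shape; the real extractor works on tree-sitter nodes.
interface AstNode {
  type: string;
  name?: string; // illustrative: callee or constructor name
  children: AstNode[];
}

interface Ref {
  kind: "calls" | "instantiates";
  name: string;
}

// Node kinds named in the PR description.
const INSTANTIATION_KINDS = new Set([
  "new_expression",
  "object_creation_expression",
  "instance_creation_expression",
]);

// Walks a subtree and emits one ref per call or constructor invocation.
// Children are always descended, so constructor arguments are visited too.
function collectRefs(node: AstNode, out: Ref[] = []): Ref[] {
  if (INSTANTIATION_KINDS.has(node.type) && node.name) {
    out.push({ kind: "instantiates", name: node.name });
  } else if (node.type === "call_expression" && node.name) {
    out.push({ kind: "calls", name: node.name });
  }
  for (const child of node.children) collectRefs(child, out);
  return out;
}

// `new Foo(bar())` modelled by hand.
const tree: AstNode = {
  type: "new_expression",
  name: "Foo",
  children: [{ type: "call_expression", name: "bar", children: [] }],
};
const refs = collectRefs(tree);
```

Because the walker keeps descending after emitting the `instantiates` ref, `refs` holds one ref for `Foo` and one for `bar`.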

Handles three subtleties:

  • Generic types: new Map<K, V>() strips the angle-bracket suffix to produce 'Map', so resolution can match the class node.
  • Qualified constructors: new ns.Foo() keeps the trailing identifier ('Foo'), the same shape the existing extractCall produces for member calls.
  • Nested calls: children are still walked, so new Foo(bar()) produces both an instantiates ref and a calls ref.
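The first two subtleties amount to normalising the constructor text before it becomes a reference name. A minimal sketch, assuming the constructor is available as plain text (`constructorRefName` is a hypothetical helper, not a function from this PR):

```typescript
// Hypothetical helper: reduces constructor text to the bare class name.
function constructorRefName(constructorText: string): string {
  // Drop a type-argument suffix first: "Map<K, V>" -> "Map".
  const lt = constructorText.indexOf("<");
  const withoutGenerics =
    lt === -1 ? constructorText : constructorText.slice(0, lt);
  // Keep only the trailing identifier of a qualified name: "ns.Foo" -> "Foo".
  const lastDot = withoutGenerics.lastIndexOf(".");
  return withoutGenerics.slice(lastDot + 1).trim();
}

const plain = constructorRefName("Foo");
const generic = constructorRefName("Map<K, V>");
const qualified = constructorRefName("ns.Foo");
```

Stripping generics before splitting on the dot means a qualified name inside the type arguments cannot be mistaken for the constructor's own qualifier.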

decorates edges from @Decorator annotations

Tree-sitter places decorator nodes BEFORE the symbol they apply to, so a naive walk-time dispatch saw the wrong nodeStack head (file/class instead of class/method). The fix runs decorator extraction from inside extractClass / extractFunction / extractMethod / extractProperty / extractField, after the symbol's node id is known.

Looks for decorator nodes in two places:

  • Direct named children of the declaration (method/property style).
  • Preceding siblings in the parent (TypeScript class style: @Foo class X {} parses as parent { decorator, class_decl }).

Handles two subtleties:

  • Sibling boundary: walks BACKWARD from the declaration and stops at the first non-decorator separator. Without that stop, @A class Foo {} @B class Bar {} would attribute @A to Bar. Caught by reviewer; covered by a regression test.
  • Tree-sitter wrapper identity: parent/namedChild returns fresh JS wrapper objects, so sibling === declNode is unreliable — uses startIndex comparison instead.
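Both subtleties can be illustrated with a small sketch. It assumes a simplified node shape (`SyntaxNodeLike`) and a hypothetical `precedingDecorators` function. The real logic lives in extractDecoratorsFor and operates on tree-sitter nodes.

```typescript
// Hypothetical minimal node shape, loosely modelled on tree-sitter nodes.
interface SyntaxNodeLike {
  type: string;
  text: string;
  startIndex: number;
}

// Node kinds treated as decorators, per the PR description.
const DECORATOR_KINDS = new Set(["decorator", "annotation", "marker_annotation"]);

// Returns the decorators that immediately precede `decl` among `siblings`.
function precedingDecorators(
  siblings: SyntaxNodeLike[],
  decl: SyntaxNodeLike,
): SyntaxNodeLike[] {
  // Locate the declaration by position, not identity: wrappers may be fresh objects.
  const declPos = siblings.findIndex((s) => s.startIndex === decl.startIndex);
  const found: SyntaxNodeLike[] = [];
  // Walk backward and stop at the first sibling that is not a decorator.
  for (let i = declPos - 1; i >= 0; i--) {
    const sibling = siblings[i];
    if (!sibling || !DECORATOR_KINDS.has(sibling.type)) break;
    found.unshift(sibling);
  }
  return found;
}

// `@A class Foo {} @B class Bar {}` as a flat sibling list.
const siblings: SyntaxNodeLike[] = [
  { type: "decorator", text: "@A", startIndex: 0 },
  { type: "class_declaration", text: "class Foo {}", startIndex: 3 },
  { type: "decorator", text: "@B", startIndex: 16 },
  { type: "class_declaration", text: "class Bar {}", startIndex: 19 },
];
// A fresh object with the same position stands in for a new wrapper.
const barDecl: SyntaxNodeLike = { type: "class_declaration", text: "class Bar {}", startIndex: 19 };
const barDecorators = precedingDecorators(siblings, barDecl).map((d) => d.text);
```

Here `barDecl` is not the same object as the entry in `siblings`, so an identity check would miss it, while the position check finds it and the backward walk stops at `class Foo {}` before reaching `@A`.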

Also recognises Java's marker_annotation (no-args annotations like @Override/@Deprecated) alongside decorator/annotation.

Test plan

Verified live on a synthetic NestJS-shape fixture:

src              kind         tgt
UserController   decorates    Controller   (class decorator)
list             decorates    Get          (method decorator)
bootstrap        instantiates UserService  (new ...())
bootstrap        instantiates UserController
  • 6 new extraction tests covering new Foo() / generic-stripping / qualified-new / @Foo class X / two-adjacent-decorated-classes regression / @Foo method()
  • npx vitest run: 386 passed (was 380, +6 new)
  • npx tsc --noEmit clean
  • npm run build succeeds

🤖 Generated with Claude Code

Two new structural edges that fill gaps in the call graph for
modern JS/TS / Java / C# / Python / Kotlin codebases.

1) `instantiates` edges from `new Foo(...)`:

The bulk-extraction and visitFunctionBody dispatchers only
recognised `call_expression`; `new_expression` (and the equivalent
`object_creation_expression` / `instance_creation_expression` in
other grammars) was silently ignored. Adds INSTANTIATION_KINDS,
extractInstantiation(), and dispatch from BOTH the top-level
visitNode and the per-function-body walker. Children are still
descended so nested calls inside constructor args (`new Foo(bar())`)
get their own `calls` refs.

Output: a `bootstrap` function that does `new UserService(); new
UserController(svc)` now produces two `instantiates` edges to those
class nodes — previously zero edges.

2) `decorates` edges from `@Decorator` annotations:

Tree-sitter places decorator nodes BEFORE the symbol they apply to
in the AST, so the original walk-time dispatch saw the wrong
nodeStack head (file/class instead of class/method). Replaced with
extractDecoratorsFor(declNode, decoratedId) that runs from inside
extractClass / extractFunction / extractMethod after the symbol's
node id is known.

Looks for decorator nodes in two places:
  - Direct named children of the declaration (method/property style)
  - Preceding siblings in the parent (TypeScript class style:
    @Foo class X {} parses as parent { decorator, class_decl })

Sibling check uses startIndex comparison rather than reference
identity — tree-sitter web bindings return fresh JS wrappers from
parent/namedChild navigation, so `===` is unreliable. Took a debug
session to spot this; flagging in the comment so the next reader
doesn't re-introduce the bug.

Output: a `@Controller` class decorator + `@Get` method decorator
on a NestJS-style controller now produce two `decorates` edges
(class→Controller, method→Get) with the correct source nodes.

Verified live on a synthetic NestJS-shape fixture; all 380
existing tests pass.
…ric constructors, property/field decorators, marker_annotation, tests

Five fixes from independent semantic review:

- extractDecoratorsFor sibling walk now iterates BACKWARD from the
  declaration and stops at the first non-decorator/annotation
  separator. Previous version walked forward up to declStart and
  consumed every decorator-typed sibling — so two adjacent
  decorated classes (`@A class Foo {} @B class Bar {}`) had `@A`
  spuriously attributed to `Bar`.

- extractInstantiation strips the type-argument suffix from the
  constructor field text. `new Map<K, V>()` was producing
  referenceName 'Map<K, V>' (the constructor field is a generic_type
  node) and resolution always failed.

- extractProperty and extractField now call extractDecoratorsFor
  after their createNode calls. NestJS-style `@Inject() private
  svc: Foo` and Java field annotations were being silently dropped.

- consider() in extractDecoratorsFor recognises 'marker_annotation'
  in addition to 'decorator'/'annotation'. Java's tree-sitter grammar
  emits marker_annotation for arg-less annotations like @Override
  and @Deprecated; without this every Java marker annotation was
  silently skipped.

- 6 new extraction tests covering: instantiates ref for new Foo(),
  generic-type stripping (`new Container<string>()` -> 'Container'),
  qualified-new keeps trailing identifier (`new ns.Foo()` -> 'Foo'),
  decorates ref for @Foo class X {}, regression for adjacent
  decorated classes (each gets its OWN decorator), decorates ref
  for @Foo method().

Full test suite: 386 passed (was 380, +6 new extraction tests).
andreinknv added a commit to andreinknv/codegraph that referenced this pull request Apr 28, 2026
…edges

# Conflicts:
#	__tests__/extraction.test.ts
andreinknv added a commit to andreinknv/codegraph that referenced this pull request Apr 29, 2026
Adds Steps K-O to walk the new PRs in dependency order:
  K: bug-fix wave (clean):    colbymchenry#128, colbymchenry#129
  L: resolution + search:     colbymchenry#130 (resolve), colbymchenry#131 (resolve)
  M: extraction edges:        colbymchenry#134 (resolve)
  N: biomarker stack:         colbymchenry#132, colbymchenry#133 (both resolve, on top of colbymchenry#125)
  O: search advanced:         colbymchenry#135 (resolve, on top of colbymchenry#131)

Also flips colbymchenry#125 from merge_clean to merge_resolve - it now hits a
queries.ts conflict after the Phase-4 stack lands (colbymchenry#111/colbymchenry#112/colbymchenry#123/colbymchenry#124
all extend the same QueryBuilder surface, so colbymchenry#125's biomarker columns
no longer apply cleanly without a resolution).

Validated end-to-end against colbymchenry/main HEAD: script ran
clean through all 43 PRs, npm run build succeeded, full test
suite reports 877/877 passing (was 829 before this wave: +48 from
new tests added by the new PRs plus the reviewer-driven follow-ups).
Two follow-ups to the new instantiates/decorates ref kinds, surfaced
during review:

1) name-matcher previously only had a kind bonus for `calls`
   (preferring function/method). When a class and a function share a
   name across modules, an `instantiates` ref would tie or pick the
   wrong candidate. Adds:
     - `instantiates` → +25 for class/struct/interface
     - `decorates`    → +25 for function/method, +15 for class
       (Python class decorators, Java annotation interfaces)

2) Python (and Ruby) have no `new` keyword — `Foo()` is the standard
   instantiation syntax, indistinguishable from a function call at
   extraction time. Resolution can tell the difference once the
   target is known: when a `calls` ref resolves to a class/struct,
   promote it to `instantiates`. Mirrors the existing extends→
   implements promotion in createEdges.

Verified: 386 → 389 passing (+3 tests covering the kind biases and
the Python promotion).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@colbymchenry
Owner

Reviewed and approved. Pushed two follow-ups before merging (commit 63a4bbe):

1. Kind-aware scoring in name-matcher.ts

The existing scorer only had a kind bonus for calls (preferring function/method). With the two new ref kinds, an instantiates ref like new Logger() could tie or pick a same-named function over the actual class when both exist across modules. Added:

  • instantiates → +25 for class/struct/interface
  • decorates → +25 for function/method, +15 for class/interface (Python class decorators, Java annotation interfaces)
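A rough sketch of how such a bonus can break a tie between same-named candidates. The bonus values for `instantiates` and `decorates` are the ones listed above. The `Candidate` shape, `baseScore`, `kindBonus`, and `pickBest` are hypothetical and do not reflect the actual name-matcher.ts API. The existing `calls` bonus is left at 0 because its value is not stated in this thread.

```typescript
type RefKind = "calls" | "instantiates" | "decorates";
type NodeKind = "function" | "method" | "class" | "struct" | "interface";

// Bonus added to a candidate's score, using the values from the list above.
function kindBonus(ref: RefKind, candidate: NodeKind): number {
  if (ref === "instantiates") {
    const isType =
      candidate === "class" || candidate === "struct" || candidate === "interface";
    return isType ? 25 : 0;
  }
  if (ref === "decorates") {
    if (candidate === "function" || candidate === "method") return 25;
    if (candidate === "class" || candidate === "interface") return 15;
    return 0;
  }
  // The real scorer has a `calls` bonus; its value is not given here, so 0 is a placeholder.
  return 0;
}

// Hypothetical candidate shape; `baseScore` stands in for all other scoring signals.
interface Candidate {
  name: string;
  kind: NodeKind;
  baseScore: number;
}

// Picks the highest-scoring candidate; the first one wins on an exact tie.
function pickBest(ref: RefKind, candidates: Candidate[]): Candidate | undefined {
  let best: Candidate | undefined;
  let bestScore = -Infinity;
  for (const c of candidates) {
    const score = c.baseScore + kindBonus(ref, c.kind);
    if (score > bestScore) {
      best = c;
      bestScore = score;
    }
  }
  return best;
}

// A function and a class that share the name `Logger` across modules.
const loggers: Candidate[] = [
  { name: "Logger", kind: "function", baseScore: 50 },
  { name: "Logger", kind: "class", baseScore: 50 },
];
const chosen = pickBest("instantiates", loggers);
```

Without the bonus the two candidates tie and the function wins by position. With it, the class is preferred for `new Logger()`.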

2. Python instantiation promotion in createEdges

The PR description listed Python, but Python (and Ruby) have no new keyword: Foo() is the standard instantiation syntax and looks identical to a function call at extraction time. Resolution can tell the difference once the target is known, so I mirrored the existing extends→implements promotion: when a calls ref resolves to a class/struct, kind is promoted to instantiates. This covers Python without any extractor changes.
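The promotion rule itself is small. A minimal sketch, assuming simplified kind types (`promoteEdgeKind` is a hypothetical name; the real change sits inside createEdges):

```typescript
type EdgeKind = "calls" | "instantiates";
type TargetKind = "function" | "method" | "class" | "struct";

// Once the target is resolved, a `calls` ref that lands on a class or struct
// is re-labelled as `instantiates`; everything else keeps its original kind.
function promoteEdgeKind(refKind: EdgeKind, targetKind: TargetKind): EdgeKind {
  if (refKind === "calls" && (targetKind === "class" || targetKind === "struct")) {
    return "instantiates";
  }
  return refKind;
}

// Python `Foo()` where Foo resolves to a class.
const promoted = promoteEdgeKind("calls", "class");
// An ordinary function call is left alone.
const unchanged = promoteEdgeKind("calls", "function");
```

Doing this at resolution time keeps the extractor language-agnostic: it never has to guess whether a bare `Foo()` is a call or a construction.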

Known behavior change (left as-is, intentionally):

Dart's new Widget() previously emitted a calls edge via extractBareCall. The new INSTANTIATION_KINDS branch wins ahead of that path in visitForCallsAndStructure, so it now emits instantiates instead. Semantically more accurate, but a breaking change for any consumer querying "what calls Widget" on Dart code. Worth a release-note line.

Verified: 386 → 389 passing (+3 tests covering the kind biases and the Python promotion). tsc --noEmit clean.

Merging.

@colbymchenry colbymchenry merged commit 8eed243 into colbymchenry:main May 8, 2026