feat: Improve Java/Python resolution, context relevance, and multi-symbol aggregation #75

Merged: colbymchenry merged 2 commits into main from fix/python-resolution-and-context-relevance on Apr 3, 2026

@colbymchenry (Owner) commented Apr 3, 2026

Summary

  • Eliminate cross-language false positives in name resolution for multi-language projects (Python+Rust, Python+Java, etc.)
  • Deprioritize test files in context building so codegraph_context returns production code first
  • Improve Java extraction — proper method invocation parsing, interface inheritance, import resolution
  • Aggregate results across all matching symbols in callers/callees/impact tools
  • Impact analysis traverses into container children so class impact includes method callers

Benchmark

Tested on a 96-file Java project and a 115-file Python+Rust project.

Java: CodeGraph produced usable results in every benchmark category: symbol search (roughly 10x richer output than grep), call graph (17 callees returned in one call, which grep cannot produce directly), impact analysis (transitive caller chains), and context building (entry points plus code in one call).

Python (before fix): 37% of all edges were false positives from Python built-in methods resolving to Rust functions. codegraph_context returned test functions instead of production code.

Python (after fix):

| Metric | Before | After |
| --- | --- | --- |
| Total edges | 4,717 | 4,640 |
| False cross-language edges | ~1,752 | 28 (mostly subprocess.run → Rust run) |
| submit_message callees | 6 (included Rust extend ❌) | 4 (all correct Python ✅) |
| QueryEnginePort impact | Mixed Python + Rust noise | Clean Python-only results |

What changed

Python resolution fixes (index-time) — src/resolution/

name-matcher.ts:

  • Added language boundary checks to all 3 matchMethodCall strategies — cross-language class/method matches are skipped
  • findBestMatch: cross-language penalty of -80 points (was 0, only had +50 same-language bonus)
  • matchByExactName: single cross-language matches get 0.5 confidence (was 0.9)
  • matchFuzzy: prefers same-language candidates; cross-language get 0.3 confidence (was 0.5)
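
The scoring change above can be sketched roughly as follows. This is an illustration only, not the code in name-matcher.ts: the `Candidate` shape and the `scoreCandidate`, `pickBest`, and `exactNameConfidence` names are assumptions. Only the +50 bonus, the -80 penalty, and the 0.9/0.5 confidence values are taken from the description.

```typescript
// Hypothetical candidate shape; the real types in src/resolution/ may differ.
interface Candidate {
  name: string;
  language: string;
  baseScore: number;
}

const SAME_LANGUAGE_BONUS = 50; // from the PR description
const CROSS_LANGUAGE_PENALTY = -80; // from the PR description (previously 0)

// Adjust a candidate's score depending on whether it shares the caller's language.
function scoreCandidate(candidate: Candidate, callerLanguage: string): number {
  const adjustment =
    candidate.language === callerLanguage
      ? SAME_LANGUAGE_BONUS
      : CROSS_LANGUAGE_PENALTY;
  return candidate.baseScore + adjustment;
}

// Pick the highest-scoring candidate, or undefined when the list is empty.
function pickBest(
  candidates: Candidate[],
  callerLanguage: string
): Candidate | undefined {
  let best: Candidate | undefined;
  let bestScore = -Infinity;
  for (const candidate of candidates) {
    const score = scoreCandidate(candidate, callerLanguage);
    if (score > bestScore) {
      best = candidate;
      bestScore = score;
    }
  }
  return best;
}

// Confidence for a single exact-name match: 0.9 same language, 0.5 cross-language.
function exactNameConfidence(
  candidate: Candidate,
  callerLanguage: string
): number {
  return candidate.language === callerLanguage ? 0.9 : 0.5;
}
```

With a 130-point swing between same-language and cross-language candidates, a Rust `extend` needs a much stronger base score than a Python `extend` to win a lookup that originates in Python code.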

index.ts (isBuiltInOrExternal):

  • Filters Python built-in type method calls (list.extend, dict.update, str.split, etc.)
  • Filters bare Python built-in method names (append, extend, pop, keys, values, join, etc.)
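
A minimal sketch of this kind of filter is shown below, assuming the callee arrives as a plain string such as `list.extend` or `append`. The function name `isPythonBuiltInCall` and both lookup tables are hypothetical and contain only the names mentioned above; the actual lists in index.ts are longer and are not reproduced here.

```typescript
// Illustrative subset only: type-qualified built-in methods named in the description.
const PYTHON_BUILTIN_TYPE_METHODS = new Map<string, Set<string>>([
  ["list", new Set(["extend"])],
  ["dict", new Set(["update"])],
  ["str", new Set(["split"])],
]);

// Illustrative subset only: bare built-in method names named in the description.
const PYTHON_BUILTIN_BARE_METHODS = new Set<string>([
  "append",
  "extend",
  "pop",
  "keys",
  "values",
  "join",
]);

// Returns true when a Python callee looks like a built-in method and should
// therefore not be resolved against symbols in the indexed project.
function isPythonBuiltInCall(callee: string, language: string): boolean {
  if (language !== "python") return false;

  const dot = callee.indexOf(".");
  if (dot === -1) {
    // Bare method name, e.g. "append".
    return PYTHON_BUILTIN_BARE_METHODS.has(callee);
  }

  // Type-qualified call, e.g. "list.extend".
  const receiver = callee.slice(0, dot);
  const method = callee.slice(dot + 1);
  const methods = PYTHON_BUILTIN_TYPE_METHODS.get(receiver);
  return methods !== undefined && methods.has(method);
}
```

Filtering at index time means these calls never become unresolved references, so they cannot later match a same-named function in another language.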

Context relevance fixes (query-time) — src/context/, src/search/

  • New isTestFile() utility — detects test files across Python, JS/TS, Go, Rust, Java conventions
  • scorePathRelevance() applies -15 penalty to test files (unless query is about tests)
  • Context builder reduces test file scores to 30% after result merging
  • Re-sorts results so production code surfaces as entry points first
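
The deprioritization steps above could look roughly like this. The path patterns are assumptions based on common naming conventions, not the patterns used by the real isTestFile(), and `looksLikeTestFile`, `applyPathPenalty`, and `deprioritizeTests` are hypothetical names. The -15 penalty, the 30% factor, and the "test"/"spec" query check come from the PR text.

```typescript
// Assumed conventions; the real isTestFile() may use different patterns.
const TEST_FILE_PATTERNS: RegExp[] = [
  /(^|\/)test_[^/]+\.py$/, // Python: test_foo.py
  /_test\.(py|go)$/, // Python / Go: foo_test.py, foo_test.go
  /\.(test|spec)\.[jt]sx?$/, // JS/TS: foo.test.ts, foo.spec.js
  /(^|\/)__tests__\//, // JS/TS: __tests__/ directory
  /(^|\/)tests?\//, // Rust / Java / generic: tests/ or test/ directory
  /Tests?\.java$/, // Java: FooTest.java, FooTests.java
];

const TEST_PATH_PENALTY = -15; // from the PR description
const TEST_SCORE_FACTOR = 0.3; // "reduces test file scores to 30%"

function looksLikeTestFile(path: string): boolean {
  const normalized = path.replace(/\\/g, "/");
  return TEST_FILE_PATTERNS.some((pattern) => pattern.test(normalized));
}

// Deprioritization is skipped when the query itself mentions "test" or "spec".
function queryIsAboutTests(query: string): boolean {
  return /test|spec/i.test(query);
}

// Search-time adjustment: subtract a fixed penalty for test files.
function applyPathPenalty(score: number, path: string, query: string): number {
  if (!queryIsAboutTests(query) && looksLikeTestFile(path)) {
    return score + TEST_PATH_PENALTY;
  }
  return score;
}

interface ScoredResult {
  path: string;
  score: number;
}

// Context-builder adjustment: scale test file scores, then re-sort descending.
function deprioritizeTests(
  results: ScoredResult[],
  query: string
): ScoredResult[] {
  const aboutTests = queryIsAboutTests(query);
  return results
    .map((result) =>
      !aboutTests && looksLikeTestFile(result.path)
        ? { ...result, score: result.score * TEST_SCORE_FACTOR }
        : result
    )
    .sort((a, b) => b.score - a.score);
}
```

The fixed penalty and the multiplicative factor act at different stages, so a test file that scores very highly in search can still be pushed below production code once results are merged.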

Java extraction improvements — src/extraction/tree-sitter.ts

  • Handle Java method_invocation AST node (receiver.method() pattern via object+name fields)
  • Support extends_interfaces and super_interfaces with type_list wrapper for Java
  • Create unresolved references for Java imports enabling cross-file resolution
  • Extract interface inheritance via extractInheritance
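
To illustrate the method_invocation handling, here is a small sketch that reads the `object` and `name` fields from such a node. `NodeLike` is a hypothetical minimal stand-in for a tree-sitter syntax node, and `extractJavaCall` and `CallRef` are made-up names; the extractor in tree-sitter.ts is not reproduced here.

```typescript
// Minimal stand-in for a tree-sitter syntax node; only what this sketch needs.
interface NodeLike {
  type: string;
  text: string;
  childForFieldName(name: string): NodeLike | null;
}

// Hypothetical result shape for an extracted call.
interface CallRef {
  receiver: string | null; // null for unqualified calls such as `save(user)`
  method: string;
}

// Extract receiver and method name from a Java method_invocation node.
function extractJavaCall(node: NodeLike): CallRef | null {
  if (node.type !== "method_invocation") return null;

  const nameNode = node.childForFieldName("name");
  if (!nameNode) return null;

  // The `object` field is absent when the call has no explicit receiver.
  const objectNode = node.childForFieldName("object");
  return {
    receiver: objectNode ? objectNode.text : null,
    method: nameNode.text,
  };
}
```

For a call such as `repo.save(user)`, this would yield receiver `repo` and method `save`, which gives the resolver enough to try an instance-method match.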

Multi-symbol aggregation — src/mcp/tools.ts

  • codegraph_callers, codegraph_callees, codegraph_impact now aggregate results across ALL matching symbols (e.g., multiple overloads, same-named methods in different classes)
  • New findAllSymbols() helper with note when results span multiple symbols
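
The aggregation behavior can be sketched as below, using an in-memory symbol list and edge list in place of the real database. All types and the `findAllSymbolsByName` and `aggregateCallers` names are assumptions for illustration; the actual findAllSymbols() signature is not shown in this PR description.

```typescript
// Hypothetical in-memory shapes standing in for the real graph storage.
interface SymbolNode {
  id: string;
  name: string;
  file: string;
}

interface CallEdge {
  from: string; // caller symbol id
  to: string; // callee symbol id
}

interface AggregatedCallers {
  callers: SymbolNode[];
  note: string | null; // set when results span more than one symbol
}

// Return every symbol with the given name instead of only the first match.
function findAllSymbolsByName(
  symbols: SymbolNode[],
  name: string
): SymbolNode[] {
  return symbols.filter((symbol) => symbol.name === name);
}

// Collect callers of all matching symbols, deduplicated by caller id.
function aggregateCallers(
  symbols: SymbolNode[],
  edges: CallEdge[],
  name: string
): AggregatedCallers {
  const matches = findAllSymbolsByName(symbols, name);
  const targetIds = new Set(matches.map((match) => match.id));
  const byId = new Map<string, SymbolNode>(
    symbols.map((symbol): [string, SymbolNode] => [symbol.id, symbol])
  );

  const seen = new Set<string>();
  const callers: SymbolNode[] = [];
  for (const edge of edges) {
    if (!targetIds.has(edge.to) || seen.has(edge.from)) continue;
    const caller = byId.get(edge.from);
    if (caller) {
      seen.add(edge.from);
      callers.push(caller);
    }
  }

  const note =
    matches.length > 1
      ? `Results aggregated across ${matches.length} symbols named "${name}"`
      : null;
  return { callers, note };
}
```

Deduplicating by caller id keeps a function that calls two overloads from appearing twice in the merged result.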

Impact traversal — src/graph/traversal.ts

  • Impact analysis traverses into container children (class → methods) so callers of contained methods appear in the impact radius of their parent class/interface
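
One way to picture the container traversal is a breadth-first walk that follows containment edges downward and call edges backward. The sketch below is an assumption-laden illustration, not the code in traversal.ts: the `Graph` shape, the `impactRadius` name, and the default depth of 3 are all hypothetical.

```typescript
// Hypothetical adjacency maps standing in for the real graph storage.
interface Graph {
  children: Map<string, string[]>; // container id -> contained symbol ids
  callers: Map<string, string[]>; // symbol id -> ids of symbols that call it
}

// Collect every symbol that transitively calls `startId` or anything it contains.
function impactRadius(
  graph: Graph,
  startId: string,
  maxDepth: number = 3 // assumed default, chosen only for this sketch
): Set<string> {
  const impacted = new Set<string>();
  const visited = new Set<string>([startId]);
  const queue: Array<{ id: string; depth: number }> = [
    { id: startId, depth: 0 },
  ];

  while (queue.length > 0) {
    const item = queue.shift();
    if (!item) break;
    const { id, depth } = item;

    // Descend into contained symbols (class -> methods) at the same depth,
    // so that their callers count toward the container's impact.
    for (const child of graph.children.get(id) ?? []) {
      if (!visited.has(child)) {
        visited.add(child);
        queue.push({ id: child, depth });
      }
    }

    if (depth >= maxDepth) continue;

    // Walk call edges backward: each caller is impacted and explored further.
    for (const caller of graph.callers.get(id) ?? []) {
      if (!visited.has(caller)) {
        visited.add(caller);
        impacted.add(caller);
        queue.push({ id: caller, depth: depth + 1 });
      }
    }
  }
  return impacted;
}
```

Contained methods are visited but not reported as impacted themselves; only their callers are, which matches the idea that a class's impact radius includes the callers of its methods.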

Other

  • deleteSpecificResolvedReferences() in src/db/queries.ts for precise cleanup after resolution
  • Added 'instance-method' to resolvedBy union type in src/resolution/types.ts
  • Version bump to 0.6.8

Test plan

  • All 395 previously passing tests still pass
  • Resolution tests: 16/16 pass
  • Context tests: 17/17 pass
  • Verified via direct DB query: no false cross-language extend/append edges after re-index
  • Java benchmark: all 5 categories (search, call graph, impact, context, files) validated
  • Re-run Python benchmark after MCP server restart to verify context relevance improvement end-to-end

🤖 Generated with Claude Code

colbymchenry and others added 2 commits April 3, 2026 12:29
Eliminate cross-language false positives in name resolution and deprioritize
test files in context building. Benchmarked on a Python+Rust codebase where
37% of edges were false positives from Python built-in methods resolving to
Rust functions (e.g., list.extend → Rust extend).

Resolution fixes (index-time):
- Filter Python built-in type method calls (list.extend, dict.update, etc.)
- Filter bare Python built-in method names (append, extend, pop, keys, etc.)
- Add language boundary checks to matchMethodCall strategies 1, 2, and 3
- Penalize cross-language matches: -80 points in findBestMatch (was 0)
- Reduce confidence for single cross-language exact matches (0.5 vs 0.9)
- Prefer same-language candidates in matchFuzzy

Context relevance fixes (query-time):
- Add isTestFile() utility detecting test files across Python/JS/TS/Go/Rust/Java
- Deprioritize test files in scorePathRelevance (-15 penalty)
- Reduce test file scores to 30% in context builder result merging
- Both skip deprioritization when query mentions "test" or "spec"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…raversal

Java extraction:
- Handle Java method_invocation AST (receiver.method pattern)
- Support Java extends_interfaces and super_interfaces with type_list
- Create unresolved references for Java imports for cross-file resolution
- Extract interface inheritance via extractInheritance

MCP tools:
- Aggregate callers/callees/impact across ALL matching symbols (e.g. multiple
  overloads or same-named methods in different classes)
- New findAllSymbols() helper for multi-symbol lookup

Graph traversal:
- Impact analysis now traverses into container children (class → methods)
  so that callers of methods appear in the impact radius of their class

Other:
- Add deleteSpecificResolvedReferences() for precise cleanup after resolution
- Add 'instance-method' to resolvedBy union type
- Version bump to 0.6.8

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@colbymchenry colbymchenry changed the title from "fix: Improve Python resolution accuracy and context relevance" to "feat: Improve Java/Python resolution, context relevance, and multi-symbol aggregation" Apr 3, 2026
@colbymchenry colbymchenry merged commit 47ed47b into main Apr 3, 2026
@colbymchenry colbymchenry deleted the fix/python-resolution-and-context-relevance branch April 3, 2026 17:31