
feat(analyzers): tree-sitter Python symbol resolver (T18 #689)#691

Draft
DvirDukhan wants to merge 1 commit into dvirdukhan/mcp-t15-base-class from dvirdukhan/mcp-t18-ts-resolver

Conversation

@DvirDukhan

Summary

Replaces jedi-based symbol resolution in second_pass with a pure tree-sitter static resolver, opt-in via CODE_GRAPH_PY_RESOLVER=tree_sitter. Default remains jedi for backwards compatibility.
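The opt-in described above could be read roughly as follows. This is a minimal sketch, not the PR's code: only the variable name `CODE_GRAPH_PY_RESOLVER` and the values `jedi` and `tree_sitter` come from the description, while the helper `selected_resolver` and its fallback for unrecognised values are assumptions.

```python
import os

# Hypothetical helper: the PR does not show how the variable is read.
DEFAULT_RESOLVER = "jedi"
KNOWN_RESOLVERS = {"jedi", "tree_sitter"}


def selected_resolver(environ=None):
    """Return the resolver name chosen via CODE_GRAPH_PY_RESOLVER."""
    env = os.environ if environ is None else environ
    value = env.get("CODE_GRAPH_PY_RESOLVER", "").strip().lower()
    # Assumed behaviour: unset or unrecognised values fall back to jedi.
    return value if value in KNOWN_RESOLVERS else DEFAULT_RESOLVER
```

Passing a plain dict instead of `os.environ` keeps the gating easy to unit-test without mutating the process environment.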

Stacked on #690 (T15 base class).

Benchmark — pytest-dev/pytest-6202 (204 files)

| Resolver | Wall | DEFINES | CALLS | EXTENDS |
|---|---|---|---|---|
| jedi | 247.1s | 4509 | 1976 | 71 |
| tree-sitter | 6.9s | 4509 | 4833 | 83 |

~36× speedup. DEFINES match exactly. CALLS rise 145% (jedi returned None ~80% of the time; the tree-sitter resolver prefers recall). EXTENDS rise 17%.
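The ratios quoted above can be re-derived from the benchmark figures. The snippet below is only an arithmetic check on the reported numbers; the dictionary layout and the helper `pct_change` are illustrative, not part of the PR.

```python
# Figures copied from the benchmark on pytest-dev/pytest-6202 (204 files).
jedi = {"wall_s": 247.1, "DEFINES": 4509, "CALLS": 1976, "EXTENDS": 71}
tree_sitter = {"wall_s": 6.9, "DEFINES": 4509, "CALLS": 4833, "EXTENDS": 83}


def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


speedup = jedi["wall_s"] / tree_sitter["wall_s"]
defines_delta = pct_change(jedi["DEFINES"], tree_sitter["DEFINES"])
calls_delta = pct_change(jedi["CALLS"], tree_sitter["CALLS"])
extends_delta = pct_change(jedi["EXTENDS"], tree_sitter["EXTENDS"])

print(f"speedup ~{speedup:.1f}x, DEFINES {defines_delta:+.0f}%, "
      f"CALLS {calls_delta:+.0f}%, EXTENDS {extends_delta:+.0f}%")
```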

Where the win lives

The headline isn't resolver throughput: AbstractAnalyzer.needs_lsp() lets source_analyzer skip LSP startup and venv setup entirely when the static resolver is active. Jedi's warm-up accounted for ~240s of the 247s baseline.
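A rough sketch of how such a hook might gate the expensive setup is shown here. Only the names `AbstractAnalyzer`, `PythonAnalyzer` and `needs_lsp()` come from the PR; the method bodies, the constructor argument and the `prepare` function with its callbacks are assumptions made for illustration.

```python
import os
from abc import ABC


class AbstractAnalyzer(ABC):
    def needs_lsp(self) -> bool:
        # Assumed default: analyzers require a language server.
        return True


class PythonAnalyzer(AbstractAnalyzer):
    def __init__(self, resolver=None):
        # Hypothetical constructor: fall back to the env var when not given.
        self.resolver = resolver or os.environ.get("CODE_GRAPH_PY_RESOLVER", "jedi")

    def needs_lsp(self) -> bool:
        # The static resolver needs neither an LSP server nor a venv.
        return self.resolver != "tree_sitter"


def prepare(analyzer, start_lsp, install_deps):
    """Run the costly setup only when the analyzer asks for it (hypothetical)."""
    if not analyzer.needs_lsp():
        return False
    install_deps()
    start_lsp()
    return True
```

With this shape, the caller never touches the venv or the language server on the static path, which is consistent with the description's claim that warm-up dominates the baseline.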

Architecture

  • TreeSitterPythonResolver (api/analyzers/python/ts_resolver.py) builds a project-wide symbol table keyed by id(files).
  • Resolution: head lookup (local module → import map → cross-project bare-name fallback) + tail walk through attributes and class methods.
  • Handles relative imports, aliased imports, package imports, Optional[T]/generic_type subscript unwrapping.
  • AbstractAnalyzer.needs_lsp() hook + PythonAnalyzer override gates LSP startup and add_dependencies venv install.
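The head-lookup and tail-walk order listed above can be illustrated with a toy symbol table. This is a simplified sketch under stated assumptions, not `TreeSitterPythonResolver` itself: `ModuleInfo`, `resolve`, the string-based symbol ids, and the choice to accept a bare-name fallback only when it is unambiguous are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ModuleInfo:
    top_level: set = field(default_factory=set)   # names defined in the module
    imports: dict = field(default_factory=dict)   # local alias -> qualified target
    methods: dict = field(default_factory=dict)   # class name -> set of method names


def _resolve_head(head, module, table):
    info = table[module]
    if head in info.top_level:                     # 1. local module
        return f"{module}.{head}"
    if head in info.imports:                       # 2. import map
        return info.imports[head]
    # 3. cross-project bare-name fallback (assumed: only if unambiguous)
    hits = sorted(f"{m}.{head}" for m, i in table.items() if head in i.top_level)
    return hits[0] if len(hits) == 1 else None


def _step(current, attr, table):
    if current in table:                           # current is a module
        return f"{current}.{attr}" if attr in table[current].top_level else None
    mod, _, name = current.rpartition(".")         # current is a class symbol
    info = table.get(mod)
    if info and attr in info.methods.get(name, set()):
        return f"{current}.{attr}"
    return None


def resolve(dotted, module, table):
    """Resolve a dotted name as seen from `module`; None if unknown."""
    head, *tail = dotted.split(".")
    current = _resolve_head(head, module, table)
    for attr in tail:
        if current is None:
            return None
        current = _step(current, attr, table)
    return current


table = {
    "pkg.shapes": ModuleInfo(top_level={"Circle"}, methods={"Circle": {"area"}}),
    "pkg.main": ModuleInfo(top_level={"run"},
                           imports={"sh": "pkg.shapes", "Circle": "pkg.shapes.Circle"}),
    "pkg.util": ModuleInfo(top_level={"helper"}),
}
print(resolve("sh.Circle.area", "pkg.main", table))
```

Here `sh` is found in the import map of `pkg.main`, then the tail walk steps through the module's top-level names and the class's methods.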

Tests

16 unit tests in tests/analyzers/test_ts_python_resolver.py covering dotted-name parsing, path-to-module mapping, relative imports, aliased imports, import-of-package, class methods, env-var gating.
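Two of the behaviours named above, path-to-module mapping and relative imports, might look like the following. This is a sketch assuming standard Python package semantics; the function names `path_to_module` and `resolve_relative` and their signatures are hypothetical and are not taken from the test file.

```python
from pathlib import PurePosixPath


def path_to_module(path, root):
    """Map a source path under `root` to a dotted module name."""
    rel = PurePosixPath(path).relative_to(root).with_suffix("")
    parts = list(rel.parts)
    if parts and parts[-1] == "__init__":
        parts.pop()                       # pkg/__init__.py is the package itself
    return ".".join(parts)


def resolve_relative(current_module, is_package, level, name=""):
    """Resolve `from <level dots><name> import ...` to an absolute module name.

    Returns None when the import would climb past the top-level package.
    """
    base = current_module.split(".")
    if not is_package:
        base = base[:-1]                  # a plain module lives in its parent package
    drop = level - 1                      # one dot means "the current package"
    if drop >= len(base):
        return None
    base = base[:len(base) - drop]
    if name:
        base += name.split(".")
    return ".".join(base)


print(path_to_module("repo/src/pkg/sub/mod.py", "repo/src"))
print(resolve_relative("pkg.sub.mod", False, 2, "util"))
```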

Known follow-ups

Closes #689.

Replace jedi-based resolution with a pure tree-sitter static resolver
behind CODE_GRAPH_PY_RESOLVER=tree_sitter. Default remains jedi for
backwards compatibility.

Benchmark on pytest-dev/pytest-6202 (204 files):
  - jedi:        247.1s wall, CALLS=1976, EXTENDS=71
  - tree-sitter:   6.9s wall, CALLS=4833, EXTENDS=83
  ~36x speedup, broader call recall (jedi returns None ~80% of the time).

Mechanism:
  - TreeSitterPythonResolver builds a project-wide symbol table
    (top-level funcs/classes/assigns, class methods, import maps)
    keyed by id(files) for lazy construction.
  - Resolution: head lookup (local module -> import map ->
    cross-project bare-name fallback) + tail walk through attributes
    and class methods.
  - Handles relative imports, aliased imports, import-of-package,
    Optional[T]/generic_type subscript unwrapping.
  - AbstractAnalyzer.needs_lsp() hook + PythonAnalyzer override let
    source_analyzer skip LSP startup and venv setup entirely when
    the static resolver is active. This is where the wall-time win
    actually lives (jedi warm-up was ~240s of the 247s baseline).

Closes #689.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@coderabbitai
Contributor

coderabbitai Bot commented May 28, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f09d13c1-da08-4e4e-aaa1-174ea7bfb1ef

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


DvirDukhan added a commit that referenced this pull request May 28, 2026
After T18 (#691) + query-cache (#692), code_graph indexing on
pytest-6202 drops from 247s to 3.7s — but only if the API server is
launched with CODE_GRAPH_PY_RESOLVER=tree_sitter. This helper bakes
in that env plus the public/permissive flags the bench harness
expects, so calibration runs hit the fast path without manual setup.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>