
perf(analyzers): memoise compiled tree-sitter queries#692

Draft
DvirDukhan wants to merge 1 commit into dvirdukhan/mcp-t18-ts-resolver from dvirdukhan/query-cache


@DvirDukhan

Summary

AbstractAnalyzer._captures was recompiling its query string on every call. cProfile on pytest-dev/pytest-6202 (204 files) showed tree_sitter.Language.query consuming 3.03s of the 6.36s first_pass — ~48% of analyzer time spent rebuilding queries that never change.
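For readers who want to reproduce this kind of measurement, here is a minimal sketch of profiling a pass that recompiles the same pattern for every file. `compile_pattern` and `first_pass` are hypothetical stand-ins written for illustration; they are not the project's functions, and the numbers they produce say nothing about the real analyzer.

```python
import cProfile
import io
import pstats


def compile_pattern(pattern: str) -> tuple:
    # Hypothetical stand-in for an expensive query compilation step.
    return tuple(sorted(pattern * 50))


def first_pass(files: int, pattern: str) -> int:
    # Recompiles the same pattern once per file, so compile cost scales with file count.
    total = 0
    for _ in range(files):
        total += len(compile_pattern(pattern))
    return total


def profile_first_pass(files: int = 204) -> str:
    # Collect a profile of one pass and return the report sorted by cumulative time.
    profiler = cProfile.Profile()
    profiler.enable()
    first_pass(files, "(function_definition) @fn")
    profiler.disable()
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(10)
    return out.getvalue()


report = profile_first_pass()
```

Printing `report` lists `first_pass` and `compile_pattern` with their call counts and cumulative times, which is the view that makes a repeated-compilation hot spot visible.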

Cache them on the analyzer instance, keyed by pattern string. Also switches from the deprecated language.query() to the Query(language, pattern) constructor.
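A minimal sketch of the caching pattern described above, assuming a per-instance dict keyed by pattern string. `CachingAnalyzer`, `get_query`, and `fake_compile` are hypothetical names used for illustration, not the code in this PR; in the real analyzer the compile callable would presumably wrap `Query(language, pattern)`.

```python
from typing import Callable, Dict, List


class CachingAnalyzer:
    """Hypothetical stand-in for an analyzer that memoises compiled queries."""

    def __init__(self, compile_query: Callable[[str], object]) -> None:
        # Assumption: the compile step is injected so the sketch runs without tree-sitter.
        self._compile_query = compile_query
        self._query_cache: Dict[str, object] = {}

    def get_query(self, pattern: str) -> object:
        # Compile on first use only; later calls reuse the stored object.
        query = self._query_cache.get(pattern)
        if query is None:
            query = self._compile_query(pattern)
            self._query_cache[pattern] = query
        return query


compile_calls: List[str] = []


def fake_compile(pattern: str) -> tuple:
    # Records each compilation so repeated work is observable.
    compile_calls.append(pattern)
    return ("compiled", pattern)


analyzer = CachingAnalyzer(fake_compile)
first = analyzer.get_query("(function_definition) @fn")
second = analyzer.get_query("(function_definition) @fn")
other = analyzer.get_query("(class_definition) @cls")
```

Keeping the cache on the instance rather than at module level means each analyzer only holds queries compiled for its own language, and the cache is released together with the analyzer.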

Benchmark — pytest-dev/pytest-6202

(with CODE_GRAPH_PY_RESOLVER=tree_sitter from #691)

| Metric | Before | After |
|---|---|---|
| wall-time | 6.9s | 3.7s |
| DEFINES | 4509 | 4509 |
| EXTENDS | 83 | 82 |
| CALLS | 4833 | 5052 |

The EXTENDS and CALLS drift is within the existing nondeterminism of the cross-project bare-name fallback: the resolver picks among candidates by dict iteration order, which the warmer query cache affects. DEFINES is bit-identical.
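To illustrate how iteration order alone can change which target a bare name resolves to, here is a small sketch. `resolve_bare_name` and the module names are hypothetical and only mimic a first-match fallback; the actual resolver logic is not shown in this PR.

```python
from typing import Dict, Optional, Set


def resolve_bare_name(name: str, candidates_by_module: Dict[str, Set[str]]) -> Optional[str]:
    # Hypothetical fallback: return the first module, in dict iteration order, that defines `name`.
    for module, names in candidates_by_module.items():
        if name in names:
            return module
    return None


# Same candidates, inserted in a different order (as could happen if indexing order shifts).
run_a = {"pkg_a.utils": {"helper"}, "pkg_b.utils": {"helper"}}
run_b = {"pkg_b.utils": {"helper"}, "pkg_a.utils": {"helper"}}

target_a = resolve_bare_name("helper", run_a)
target_b = resolve_bare_name("helper", run_b)
```

Both runs contain identical definitions, yet `target_a` and `target_b` differ, which is the kind of drift that shows up in edge counts without touching DEFINES.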

Scope

Benefits every tree-sitter analyzer (Python, JavaScript, Kotlin), not just the new Python static resolver from #691.

Stacked on #691 (T18 resolver).

AbstractAnalyzer._captures was recompiling its query string on every
call. cProfile on pytest-dev/pytest-6202 (204 files) showed
tree_sitter.Language.query consuming 3.03s of the 6.36s first_pass —
~48% of analyzer time spent rebuilding queries that never change.

Cache them on the analyzer instance, keyed by pattern string. Also
switches from the deprecated language.query() to the Query(language,
pattern) constructor.

Wall-time on pytest-6202 (CODE_GRAPH_PY_RESOLVER=tree_sitter):
  before: 6.9s
  after:  3.7s

Benefits every tree-sitter analyzer (Python, JavaScript, Kotlin), not
just the new static resolver.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@coderabbitai
Contributor

coderabbitai Bot commented May 28, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6987e41a-ddbd-48a2-9239-9408cf6ee4f0

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


DvirDukhan added a commit that referenced this pull request May 28, 2026
After T18 (#691) + query-cache (#692), code_graph indexing on
pytest-6202 drops from 247s to 3.7s — but only if the API server is
launched with CODE_GRAPH_PY_RESOLVER=tree_sitter. This helper bakes
in that env plus the public/permissive flags the bench harness
expects, so calibration runs hit the fast path without manual setup.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>