Skip to content

refactor(analyzers): extract TreeSitterAnalyzer base class (T15 #663)#690

Open
DvirDukhan wants to merge 1 commit into
stagingfrom
dvirdukhan/mcp-t15-base-class
Open

refactor(analyzers): extract TreeSitterAnalyzer base class (T15 #663)#690
DvirDukhan wants to merge 1 commit into
stagingfrom
dvirdukhan/mcp-t15-base-class

Conversation

@DvirDukhan
Copy link
Copy Markdown

@DvirDukhan DvirDukhan commented May 28, 2026

Summary

  • Non-functional refactor: extracted shared tree-sitter scaffolding into TreeSitterAnalyzer without changing analyzer graph semantics.
  • Migrated Python, JavaScript, and Kotlin analyzers to declare node/type/call resolution contracts.
  • Added regression coverage for subclass entity-node declarations and multi-language fixture graph counts.

Before/after line counts

  • api/analyzers/tree_sitter_base.py: 0 -> 107
  • api/analyzers/python/analyzer.py: 124 -> 100
  • api/analyzers/javascript/analyzer.py: 172 -> 128
  • api/analyzers/kotlin/analyzer.py: 157 -> 155

Base-class hooks

  • entity_node_types drives default entity types and labels.
  • type_definition_node_types / callable_definition_node_types drive shared parent resolution.
  • type_resolution_keys / method_resolution_keys drive resolve_symbol dispatch.
  • _extract_type_target and _extract_call_target preserve Python/JavaScript AST target normalization.
  • Kotlin keeps its custom resolve_method override for child-identifier iteration.

Closes #663

Summary by CodeRabbit

Release Notes

  • Refactor

    • Unified analyzer architecture for improved code analysis consistency across Python, JavaScript, and Kotlin.
  • Tests

    • Added comprehensive multilanguage test coverage to validate analyzer behavior across supported languages.

Review Change Stack

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 28, 2026

📝 Walkthrough

Walkthrough

This PR extracts shared tree-sitter boilerplate into a new TreeSitterAnalyzer base class and migrates three language analyzers to inherit from it. PythonAnalyzer, JavaScriptAnalyzer, and KotlinAnalyzer now declare language-specific configuration and inherit resolution logic, reducing duplication. A new multilanguage test suite validates that outputs remain unchanged across languages.

Changes

TreeSitterAnalyzer Base Class and Migration

Layer / File(s) Summary
TreeSitterAnalyzer base class and shared infrastructure
api/analyzers/tree_sitter_base.py
New base class defines configurable entity node-type mappings, resolution key sets, and shared implementations for entity labeling, symbol dispatch, and type/method resolution via LSP-backed AST traversal. Provides overridable normalization hooks for subclasses.
PythonAnalyzer migration
api/analyzers/python/analyzer.py
PythonAnalyzer extends TreeSitterAnalyzer, preserves entity metadata helpers, and adds _extract_type_target and _extract_call_target hooks to normalize AST nodes for inherited resolution methods.
JavaScriptAnalyzer migration
api/analyzers/javascript/analyzer.py
JavaScriptAnalyzer extends TreeSitterAnalyzer, declares entity/callable node-type and resolution-key mappings, and adds _extract_call_target to normalize call expressions by extracting member properties.
KotlinAnalyzer migration
api/analyzers/kotlin/analyzer.py
KotlinAnalyzer extends TreeSitterAnalyzer, declares entity and resolution-key mappings, introduces _get_delegation_types helper for base/interface extraction, and refactors add_symbols to populate structured type/call/parameter symbols.
Multilanguage test suite and fixtures
tests/analyzers/test_tree_sitter_base.py, tests/analyzers/fixtures/multilang/sample.*
New test validates that all three analyzers expose correct entity node types and that multilanguage fixture files parse with expected entity and edge counts. Fixtures (Python, JavaScript, Kotlin) each define a simple class and helper function pair.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 Three languages, once three scattered tales,
Now dance as one through tree-sitter trails.
Base class born, redundancy frays,
Python, JavaScript, Kotlin in sync—test always! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 39.02% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely describes the main change: extracting a TreeSitterAnalyzer base class from existing analyzer implementations, with reference to the issue #663.
Linked Issues check ✅ Passed The PR fully meets all linked issue #663 objectives: new TreeSitterAnalyzer base class added, Python/JavaScript/Kotlin analyzers migrated, regression tests included, and no graph schema changes.
Out of Scope Changes check ✅ Passed All changes directly support the refactoring objective. Test fixtures (sample.py/js/kt) and test file are needed for regression testing, and no unrelated modifications are present.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch dvirdukhan/mcp-t15-base-class

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
api/analyzers/tree_sitter_base.py (1)

45-107: ⚡ Quick win

Add a resolver-focused regression.

This refactor moves type/call resolution into shared code, but the new fixture coverage only asserts entity counts and DEFINES edges. A bad type_resolution_keys / _extract_* / dispatch change here would still pass, so please add at least one inheritance edge and one call edge assertion that exercise resolve_symbol.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/analyzers/tree_sitter_base.py` around lines 45 - 107, The tests only
assert entity counts and DEFINES edges but miss exercising resolve_symbol; add
assertions that exercise type and call resolution by creating at least one
inheritance edge and one call edge: update the test fixture or test case to
trigger TreeSitterBase.resolve_symbol via the existing dispatch keys (ensure
type_resolution_keys and method_resolution_keys include the test keys) and/or by
customizing _extract_type_target / _extract_call_target to return the captured
nodes; then assert an INHERITS (or similar) edge exists for the resolved type
(using resolve_type/resolve_symbol) and a CALLS (or similar) edge exists for the
resolved callable (using resolve_method/resolve_symbol) so failures in
resolve_symbol/type/callor extraction are detected.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/analyzers/kotlin/analyzer.py`:
- Around line 112-115: The current object_declaration branch discards the kind
info from _get_delegation_types and always calls
entity.add_symbol("implement_interface"), turning delegated base-class relations
into interfaces; change the loop to unpack the second value returned by
_get_delegation_types (e.g., for node, kind in types) and branch on that kind:
when kind indicates a constructor_invocation (the superclass case) call
entity.add_symbol using the superclass relation symbol (e.g., "extend_class" or
the project’s existing superclass symbol), otherwise call
entity.add_symbol("implement_interface") to preserve interface relations.

---

Nitpick comments:
In `@api/analyzers/tree_sitter_base.py`:
- Around line 45-107: The tests only assert entity counts and DEFINES edges but
miss exercising resolve_symbol; add assertions that exercise type and call
resolution by creating at least one inheritance edge and one call edge: update
the test fixture or test case to trigger TreeSitterBase.resolve_symbol via the
existing dispatch keys (ensure type_resolution_keys and method_resolution_keys
include the test keys) and/or by customizing _extract_type_target /
_extract_call_target to return the captured nodes; then assert an INHERITS (or
similar) edge exists for the resolved type (using resolve_type/resolve_symbol)
and a CALLS (or similar) edge exists for the resolved callable (using
resolve_method/resolve_symbol) so failures in resolve_symbol/type/callor
extraction are detected.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9e0be41b-01fa-4693-a9fe-0b21241282e7

📥 Commits

Reviewing files that changed from the base of the PR and between b066064 and 4841701.

📒 Files selected for processing (9)
  • api/analyzers/javascript/analyzer.py
  • api/analyzers/kotlin/analyzer.py
  • api/analyzers/python/analyzer.py
  • api/analyzers/tree_sitter_base.py
  • tests/analyzers/__init__.py
  • tests/analyzers/fixtures/multilang/sample.js
  • tests/analyzers/fixtures/multilang/sample.kt
  • tests/analyzers/fixtures/multilang/sample.py
  • tests/analyzers/test_tree_sitter_base.py

Comment on lines 112 to 115
elif entity.node.type == 'object_declaration':
types = self._get_delegation_types(entity)
for node, _ in types:
entity.add_symbol("implement_interface", node)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve superclass vs interface symbols for object declarations.

_get_delegation_types() already tells you whether a delegated type came from a constructor_invocation, but this branch drops that bit and records every entry as implement_interface. That turns object Foo : Base() into an interface relation instead of a base-class relation, which changes graph semantics in this “non-functional” refactor.

Suggested fix
         elif entity.node.type == 'object_declaration':
             types = self._get_delegation_types(entity)
-            for node, _ in types:
-                entity.add_symbol("implement_interface", node)
+            for node, is_class in types:
+                if is_class:
+                    entity.add_symbol("base_class", node)
+                else:
+                    entity.add_symbol("implement_interface", node)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/analyzers/kotlin/analyzer.py` around lines 112 - 115, The current
object_declaration branch discards the kind info from _get_delegation_types and
always calls entity.add_symbol("implement_interface"), turning delegated
base-class relations into interfaces; change the loop to unpack the second value
returned by _get_delegation_types (e.g., for node, kind in types) and branch on
that kind: when kind indicates a constructor_invocation (the superclass case)
call entity.add_symbol using the superclass relation symbol (e.g.,
"extend_class" or the project’s existing superclass symbol), otherwise call
entity.add_symbol("implement_interface") to preserve interface relations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

MCP T15 — Tree-sitter analyzer base class refactor

1 participant