feat: Add AST parsing with tree-sitter for code symbol extraction#14
Implements a new --output ast mode that extracts and displays high-level code entities (functions, classes, interfaces, etc.) using tree-sitter.
Features:
- New output format: --output ast displays parsed code symbols
- Multi-language support: TypeScript, JavaScript, Python (fully implemented)
- Extensible architecture: Ready for 7+ additional languages (Go, Rust, Java, C++, C#, Ruby, PHP)
- Symbol extraction: Functions, classes, interfaces, types, enums, variables, imports, exports
- Hierarchical display: Shows nested symbols (classes → methods → properties)
- Location tracking: Displays line numbers and ranges for each symbol
- Signature formatting: Full type signatures with parameters and return types
Technical implementation:
- AST parser with lazy-loaded grammars for performance
- Language configuration system for easy extensibility
- Integrated with existing cache and scanner infrastructure
- Type-safe implementation with comprehensive TypeScript types
- Zero additional user installation required (dependencies bundled)
Symbol types extracted:
- Functions/Methods (ƒ) - with parameters, return types, async/generator flags
- Classes (C) - with extends, implements, abstract modifiers
- Interfaces (I) - with extends and members
- Types (T) - type aliases and definitions
- Enums (E) - with member values
- Variables/Constants (v/c) - with type annotations
- Imports (←) - showing source and imported items
- Exports (→) - showing exported items
- Namespaces (N) - for languages that support them
Dependencies added:
- tree-sitter: Core parsing library
- tree-sitter-typescript: TypeScript/TSX grammar
- tree-sitter-javascript: JavaScript/JSX grammar
- tree-sitter-python: Python grammar
- tree-sitter-go: Go grammar
- tree-sitter-rust: Rust grammar
- tree-sitter-java: Java grammar
- tree-sitter-cpp: C++ grammar
- tree-sitter-c-sharp: C# grammar
- tree-sitter-ruby: Ruby grammar
- tree-sitter-php: PHP grammar
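To make the hierarchical display and the glyph legend above concrete, here is a minimal sketch of how nested symbols could be rendered as an ASCII tree. The `SketchSymbol` shape, the `Kind` union, and `renderSymbols` are hypothetical names invented for this illustration; the PR's real types live in src/types/index.ts and its real renderer in src/formatters/astFormatter.ts, and both may differ.

```typescript
// Hypothetical, simplified symbol shape (not the PR's actual types).
type Kind =
  | "function" | "class" | "interface" | "type" | "enum"
  | "variable" | "constant" | "import" | "export" | "namespace";

interface SketchSymbol {
  kind: Kind;
  label: string; // name or formatted signature
  startLine: number;
  endLine: number;
  members?: SketchSymbol[];
}

// Glyphs as listed in the PR description.
const GLYPH: Record<Kind, string> = {
  function: "ƒ", class: "C", interface: "I", type: "T", enum: "E",
  variable: "v", constant: "c", import: "←", export: "→", namespace: "N",
};

// Render symbols depth-first, one line per symbol, with tree connectors.
function renderSymbols(symbols: SketchSymbol[], indent = ""): string[] {
  const lines: string[] = [];
  symbols.forEach((s, i) => {
    const last = i === symbols.length - 1;
    const range =
      s.startLine === s.endLine ? `L${s.startLine}` : `L${s.startLine}-${s.endLine}`;
    lines.push(`${indent}${last ? "└─" : "├─"} ${GLYPH[s.kind]} ${s.label} (${range})`);
    if (s.members && s.members.length > 0) {
      lines.push(...renderSymbols(s.members, indent + (last ? "   " : "│  ")));
    }
  });
  return lines;
}

const demo: SketchSymbol[] = [
  { kind: "import", label: 'from "node:fs" { readFileSync }', startLine: 1, endLine: 1 },
  {
    kind: "class", label: "Scanner", startLine: 3, endLine: 20,
    members: [
      { kind: "function", label: "scan(path: string): Promise<void>", startLine: 5, endLine: 12 },
    ],
  },
];
console.log(renderSymbols(demo).join("\n"));
```

Carrying the connector prefix through the recursion keeps nested members aligned under their parent without a second pass over the output.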
Walkthrough
This PR introduces AST parsing and symbol extraction capabilities using tree-sitter. It adds language-specific parsers for TypeScript, JavaScript, and Python; a new ASTParser core module; comprehensive symbol type definitions; integration with the existing scanning pipeline; and AST-formatted output rendering. Cache versioning is incremented to reflect the schema changes.
Sequence Diagram
sequenceDiagram
participant User as User/CLI
participant CLI as cli.ts
participant Scanner as DirectoryScanner
participant ASTParser as ASTParser
participant LangReg as Language Registry
participant Formatter as astFormatter
User->>CLI: Request AST output
CLI->>CLI: Parse args, set enableAST=true
CLI->>Scanner: Instantiate with enableAST
Scanner->>ASTParser: Create instance
Scanner->>Scanner: Start scanning
loop For each file
Scanner->>ASTParser: parseFile(filePath)
ASTParser->>LangReg: getLanguageByExtension()
LangReg-->>ASTParser: LanguageConfig
ASTParser->>ASTParser: loadGrammar (cached)
ASTParser->>ASTParser: parse with tree-sitter
ASTParser->>ASTParser: extractSymbols()
ASTParser-->>Scanner: ASTSymbol[]
Scanner->>Scanner: Attach entities to FileNode
end
Scanner-->>CLI: ScanResult with entities
CLI->>Formatter: formatAsAST(result, options)
Formatter->>Formatter: Traverse files/symbols
Formatter->>Formatter: Build ASCII tree
Formatter-->>CLI: Formatted string
CLI-->>User: Output AST representation
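The parse loop in the diagram (look up a language by extension, load its grammar once, then extract symbols) can be sketched roughly as follows. Everything here is an illustrative stand-in: `SketchLanguage`, `registerLanguage`, and `SketchParser` are hypothetical names, the grammar is a plain string rather than a tree-sitter language object, loading is synchronous where the real loader is asynchronous, and a regex takes the place of walking a syntax tree.

```typescript
// Illustrative stand-ins only: none of these names are the PR's actual API.
interface SketchSymbol {
  name: string;
  line: number;
}

interface SketchLanguage {
  name: string;
  extensions: string[];
  // The real loader is asynchronous and returns a tree-sitter grammar;
  // a synchronous string keeps this sketch dependency-free.
  loadGrammar: () => string;
  extractSymbols: (source: string) => SketchSymbol[];
}

const registry = new Map<string, SketchLanguage>();

function registerLanguage(lang: SketchLanguage): void {
  for (const ext of lang.extensions) registry.set(ext, lang);
}

class SketchParser {
  private grammarCache = new Map<string, string>();
  grammarLoads = 0; // how many times a grammar was actually loaded

  parseText(source: string, ext: string): SketchSymbol[] {
    const lang = registry.get(ext);
    if (!lang) return []; // unsupported extension: no symbols
    if (!this.grammarCache.has(lang.name)) {
      this.grammarCache.set(lang.name, lang.loadGrammar());
      this.grammarLoads++;
    }
    return lang.extractSymbols(source);
  }
}

registerLanguage({
  name: "typescript",
  extensions: [".ts", ".tsx"],
  loadGrammar: () => "typescript-grammar",
  // Regex stand-in for walking a real syntax tree.
  extractSymbols: (source) => {
    const found: SketchSymbol[] = [];
    source.split("\n").forEach((text, i) => {
      const m = /^\s*(?:export\s+)?function\s+(\w+)/.exec(text);
      if (m) found.push({ name: m[1], line: i + 1 });
    });
    return found;
  },
});
```

Keying the cache by language name rather than by extension means .ts and .tsx files share one grammar load in this sketch; whether the PR does the same is not stated in the review.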
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~45 minutes
Rationale: This PR introduces substantial new infrastructure (ASTParser class, language registry system, symbol type definitions) alongside deep integration into existing scanning and CLI pipelines. While individual language implementations follow consistent patterns, the diverse mix of core logic, type definitions, integration points, and grammar-handling mechanics requires careful reasoning across multiple domains. Comprehensive test coverage and incremental design patterns reduce friction, but the interconnected nature and novel dependencies (tree-sitter) demand thorough review.
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 12
🧹 Nitpick comments (12)
src/core/scanner.ts (1)
162-170: Avoid triple I/O (hash + tokenize + AST).
The current flow likely reads each file up to three times (hashFile, tokenizer.countTokens, astParser.parseFile). Consider a small perf refactor:
- Read once (e.g., via Bun.file(path).text()) and feed:
  - tokenizer.countTokensFromText(text) [new helper]
  - astParser.parseText(text, ext)
  - hashFromText(text) [optional helper]
Even switching only the AST step to parseText saves one read. Also consider always initializing entities: [] to simplify consumers.
Also applies to: 172-186
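A minimal sketch of the read-once shape suggested above, assuming the three consumers can each accept already-loaded text. The helper names follow the ones proposed in the comment, but their bodies are stand-ins: the hash is FNV-1a (chosen only to avoid dependencies, not the project's real hashing), the tokenizer is a whitespace split, and the parser is a regex.

```typescript
// Stand-in implementations; the project's real hashing, tokenizer,
// and AST parser are not shown in this review.
interface FileAnalysis {
  hash: string;
  tokens: number;
  entities: string[]; // always initialized, never undefined
}

// FNV-1a (32-bit), used here only so the sketch has no dependencies.
function hashFromText(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return ("0000000" + h.toString(16)).slice(-8);
}

// Whitespace split as a placeholder for a real tokenizer.
function countTokensFromText(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Regex placeholder for tree-sitter based symbol extraction.
function parseText(text: string): string[] {
  const names: string[] = [];
  const re = /function\s+(\w+)/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) names.push(m[1]);
  return names;
}

// One in-memory string feeds all three consumers: a single read per file.
function analyzeText(text: string): FileAnalysis {
  return {
    hash: hashFromText(text),
    tokens: countTokensFromText(text),
    entities: parseText(text),
  };
}
```

Under Bun, a caller would read the file once, for example with `await Bun.file(path).text()`, and pass the result to `analyzeText`.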
src/core/languages/rust.ts (1)
1-18: Fix lints: alias Symbol type and underscore unused args.
Prevents global Symbol shadowing and unused-arg errors while keeping the stub.
Apply:
-import type Parser from 'tree-sitter';
-import type { Symbol } from '../../types/index.js';
+import type Parser from 'tree-sitter';
+import type { Symbol as CodeSymbol } from '../../types/index.js';
 export const RustConfig: LanguageConfig = {
@@
-  extractSymbols: (tree: Parser.Tree, sourceCode: string): Symbol[] => {
+  extractSymbols: (_tree: Parser.Tree, _sourceCode: string): CodeSymbol[] => {
     // TODO: Implement Rust-specific symbol extraction
     return [];
   }
 };
src/core/languages/typescript.ts (4)
1-5: Resolve lints: alias Symbol type; drop unused SymbolType import.
Prevents global Symbol shadowing and removes an unused type import.
-import type Parser from 'tree-sitter';
-import type { Symbol, SymbolType, FunctionSymbol, ClassSymbol, InterfaceSymbol, TypeSymbol, EnumSymbol, VariableSymbol, ImportSymbol, ExportSymbol, SourceLocation, Parameter } from '../../types/index.js';
+import type Parser from 'tree-sitter';
+import type { Symbol as CodeSymbol, FunctionSymbol, ClassSymbol, InterfaceSymbol, TypeSymbol, EnumSymbol, VariableSymbol, ImportSymbol, ExportSymbol, SourceLocation, Parameter } from '../../types/index.js';
 import type { LanguageConfig } from './index.js';
 import { SymbolType as ST } from '../../types/index.js';
Also update return/locals to CodeSymbol:
-  extractSymbols: (tree: Parser.Tree, sourceCode: string): Symbol[] => {
-    const symbols: Symbol[] = [];
+  extractSymbols: (tree: Parser.Tree, sourceCode: string): CodeSymbol[] => {
+    const symbols: CodeSymbol[] = [];
@@
-    const members: Symbol[] = [];
+    const members: CodeSymbol[] = [];
@@
-    const members: Symbol[] = [];
+    const members: CodeSymbol[] = [];
67-68: Generator detection likely brittle.
Checking for a child with type '*' may miss generator markers; prefer a grammar token check (e.g., c.type === 'asterisk') or drop generator until verified.
Would you like me to cross‑check the tree-sitter TypeScript grammar nodes for generator functions and adjust?
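One way to make the check less dependent on a single token is to look at the node type first and fall back to the token only for methods. The sketch below rests on assumptions that should be verified against the grammar in use: that generator functions get dedicated node types (`generator_function_declaration`, `generator_function`), and that a generator method carries an anonymous '*' token as a direct child of `method_definition`. `NodeLike` is a hypothetical minimal interface, not tree-sitter's actual node class.

```typescript
// Hypothetical minimal node shape; tree-sitter's real node type has more fields.
interface NodeLike {
  type: string;
  children: NodeLike[];
}

// ASSUMPTION: these node type names match the grammar in use.
const GENERATOR_NODE_TYPES = ["generator_function_declaration", "generator_function"];

function isGenerator(node: NodeLike): boolean {
  // Dedicated node types need no token inspection at all.
  if (GENERATOR_NODE_TYPES.indexOf(node.type) !== -1) return true;
  // ASSUMPTION: for methods, the marker is a direct '*' child; nested '*'
  // operators inside the body sit deeper and are not seen by this check.
  return node.type === "method_definition" && node.children.some((c) => c.type === "*");
}

// Small helper to build test nodes by hand.
function makeNode(type: string, children: NodeLike[] = []): NodeLike {
  return { type, children };
}
```

Inspecting the parsed tree for a sample generator function and generator method would confirm or refute both assumptions before this logic is relied on.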
321-323: Robust const detection.
Using startsWith('const') on node text can be wrong with leading modifiers (e.g., export). Prefer a token check.
-  type: node.type === 'lexical_declaration' && getNodeText(node).startsWith('const') ? ST.CONSTANT : ST.VARIABLE,
+  type: node.type === 'lexical_declaration' && node.children.some(c => c.type === 'const')
+    ? ST.CONSTANT
+    : ST.VARIABLE,
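The token check from the diff can be exercised in isolation with hand-built nodes. This is only a sketch: `DeclNode` and `declarationKind` are hypothetical names, and it assumes the declaration keyword appears as a direct child whose type equals the keyword text.

```typescript
// Hypothetical minimal node shape for illustration.
interface DeclNode {
  type: string;
  children: DeclNode[];
}

// ASSUMPTION: the 'const' keyword is a direct child token of lexical_declaration.
function declarationKind(node: DeclNode): "constant" | "variable" {
  const isConst =
    node.type === "lexical_declaration" && node.children.some((c) => c.type === "const");
  return isConst ? "constant" : "variable";
}

const constDecl: DeclNode = {
  type: "lexical_declaration",
  children: [
    { type: "const", children: [] },
    { type: "variable_declarator", children: [] },
  ],
};
const letDecl: DeclNode = {
  type: "lexical_declaration",
  children: [
    { type: "let", children: [] },
    { type: "variable_declarator", children: [] },
  ],
};
```

Because the check never looks at source text, an identifier that merely begins with "const" cannot influence the result.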
87-90: Refine class heritage parsing (extends vs implements).
Reading the whole class_heritage node may mix extends and implements. Consider extracting specific child nodes (extends_clause, implements_clause) for cleaner strings and arrays.
Also applies to: 114-123
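A sketch of the clause-by-clause extraction described above, using hand-built nodes. It assumes class_heritage has extends_clause and implements_clause children, each consisting of a keyword token followed by type nodes separated by ',' tokens; `HeritageNode`, `Heritage`, and `parseHeritage` are hypothetical names for this illustration.

```typescript
// Hypothetical minimal node shape for illustration.
interface HeritageNode {
  type: string;
  text: string;
  children: HeritageNode[];
}

interface Heritage {
  extends?: string;
  implements: string[];
}

// Tokens that carry no type information (ASSUMED token type names).
const SKIPPED_TOKENS = ["extends", "implements", ","];

function parseHeritage(heritage: HeritageNode): Heritage {
  const result: Heritage = { implements: [] };
  for (const clause of heritage.children) {
    const parts = clause.children.filter((c) => SKIPPED_TOKENS.indexOf(c.type) === -1);
    if (clause.type === "extends_clause") {
      // Join so that a base like Base<T> (identifier + type arguments) stays whole.
      result.extends = parts.map((p) => p.text).join("");
    } else if (clause.type === "implements_clause") {
      result.implements = parts.map((p) => p.text);
    }
  }
  return result;
}

// Helper to build leaf nodes by hand.
function leaf(type: string, text: string): HeritageNode {
  return { type, text, children: [] };
}

const sampleHeritage: HeritageNode = {
  type: "class_heritage",
  text: "extends Base<T> implements A, B",
  children: [
    {
      type: "extends_clause",
      text: "extends Base<T>",
      children: [leaf("extends", "extends"), leaf("identifier", "Base"), leaf("type_arguments", "<T>")],
    },
    {
      type: "implements_clause",
      text: "implements A, B",
      children: [
        leaf("implements", "implements"),
        leaf("type_identifier", "A"),
        leaf(",", ","),
        leaf("type_identifier", "B"),
      ],
    },
  ],
};
```

Returning `implements` as an array matches how the formatter in this PR joins `classSymbol.implements` with ', '.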
src/formatters/astFormatter.ts (3)
1-4: Clean imports: alias Symbol type and drop unused FolderNode.
Prevents global Symbol shadowing and removes an unused type.
-import chalk from 'chalk';
-import type { ScanResult, Node, FileNode, FolderNode, Symbol, TreeOptions, FunctionSymbol, ClassSymbol, InterfaceSymbol, EnumSymbol, ImportSymbol, ExportSymbol, NamespaceSymbol } from '../types/index.js';
+import chalk from 'chalk';
+import type { ScanResult, Node, FileNode, Symbol as CodeSymbol, TreeOptions, FunctionSymbol, ClassSymbol, InterfaceSymbol, EnumSymbol, ImportSymbol, ExportSymbol, NamespaceSymbol } from '../types/index.js';
Update function signatures/uses to CodeSymbol (examples):
-function formatSymbolSignature(symbol: Symbol): string {
+function formatSymbolSignature(symbol: CodeSymbol): string {
@@
-function formatSymbol(symbol: Symbol, indent: string, isLast: boolean, showLocation: boolean = true): string[] {
+function formatSymbol(symbol: CodeSymbol, indent: string, isLast: boolean, showLocation: boolean = true): string[] {
@@
-function getNestedSymbols(symbol: Symbol): Symbol[] {
+function getNestedSymbols(symbol: CodeSymbol): CodeSymbol[] {
@@
-function countNestedSymbols(symbol: Symbol): number {
+function countNestedSymbols(symbol: CodeSymbol): number {
47-109: Wrap case blocks to avoid lexical leakage in switch.
Adds braces to satisfy no-case-declarations and Biome’s noSwitchDeclarations.
-function formatSymbolSignature(symbol: Symbol): string {
+function formatSymbolSignature(symbol: CodeSymbol): string {
   switch (symbol.type) {
     case SymbolType.FUNCTION:
     case SymbolType.METHOD:
-      const funcSymbol = symbol as FunctionSymbol;
-      if (funcSymbol.signature) {
-        return funcSymbol.signature;
-      }
-      const params = funcSymbol.parameters.map(p => {
-        let param = p.name;
-        if (p.type) param += `: ${p.type}`;
-        if (p.optional) param += '?';
-        if (p.defaultValue) param += ` = ${p.defaultValue}`;
-        return param;
-      }).join(', ');
-      let sig = `${funcSymbol.name}(${params})`;
-      if (funcSymbol.returnType) sig += `: ${funcSymbol.returnType}`;
-      if (funcSymbol.async) sig = `async ${sig}`;
-      return sig;
+      {
+        const funcSymbol = symbol as FunctionSymbol;
+        if (funcSymbol.signature) {
+          return funcSymbol.signature;
+        }
+        const params = funcSymbol.parameters.map(p => {
+          let param = p.name;
+          if (p.type) param += `: ${p.type}`;
+          if (p.optional) param += '?';
+          if (p.defaultValue) param += ` = ${p.defaultValue}`;
+          return param;
+        }).join(', ');
+        let sig = `${funcSymbol.name}(${params})`;
+        if (funcSymbol.returnType) sig += `: ${funcSymbol.returnType}`;
+        if (funcSymbol.async) sig = `async ${sig}`;
+        return sig;
+      }
     case SymbolType.CLASS:
-      const classSymbol = symbol as ClassSymbol;
-      let classSig = classSymbol.name;
-      if (classSymbol.abstract) classSig = `abstract ${classSig}`;
-      if (classSymbol.extends) classSig += ` extends ${classSymbol.extends}`;
-      if (classSymbol.implements && classSymbol.implements.length > 0) {
-        classSig += ` implements ${classSymbol.implements.join(', ')}`;
-      }
-      return classSig;
+      {
+        const classSymbol = symbol as ClassSymbol;
+        let classSig = classSymbol.name;
+        if (classSymbol.abstract) classSig = `abstract ${classSig}`;
+        if (classSymbol.extends) classSig += ` extends ${classSymbol.extends}`;
+        if (classSymbol.implements && classSymbol.implements.length > 0) {
+          classSig += ` implements ${classSymbol.implements.join(', ')}`;
+        }
+        return classSig;
+      }
     case SymbolType.INTERFACE:
-      const ifaceSymbol = symbol as InterfaceSymbol;
-      let ifaceSig = ifaceSymbol.name;
-      if (ifaceSymbol.extends && ifaceSymbol.extends.length > 0) {
-        ifaceSig += ` extends ${ifaceSymbol.extends.join(', ')}`;
-      }
-      return ifaceSig;
+      {
+        const ifaceSymbol = symbol as InterfaceSymbol;
+        let ifaceSig = ifaceSymbol.name;
+        if (ifaceSymbol.extends && ifaceSymbol.extends.length > 0) {
+          ifaceSig += ` extends ${ifaceSymbol.extends.join(', ')}`;
+        }
+        return ifaceSig;
+      }
     case SymbolType.ENUM:
-      const enumSymbol = symbol as EnumSymbol;
-      return `${enumSymbol.name} { ${enumSymbol.members.length} members }`;
+      {
+        const enumSymbol = symbol as EnumSymbol;
+        return `${enumSymbol.name} { ${enumSymbol.members.length} members }`;
+      }
     case SymbolType.IMPORT:
-      const importSymbol = symbol as ImportSymbol;
-      let importSig = `from "${importSymbol.from}"`;
-      if (importSymbol.default) importSig = `${importSymbol.default} ${importSig}`;
-      if (importSymbol.imports.length > 0) importSig += ` { ${importSymbol.imports.join(', ')} }`;
-      if (importSymbol.namespace) importSig += ` as ${importSymbol.namespace}`;
-      return importSig;
+      {
+        const importSymbol = symbol as ImportSymbol;
+        let importSig = `from "${importSymbol.from}"`;
+        if (importSymbol.default) importSig = `${importSymbol.default} ${importSig}`;
+        if (importSymbol.imports.length > 0) importSig += ` { ${importSymbol.imports.join(', ')} }`;
+        if (importSymbol.namespace) importSig += ` as ${importSymbol.namespace}`;
+        return importSig;
+      }
     case SymbolType.EXPORT:
-      const exportSymbol = symbol as ExportSymbol;
-      if (exportSymbol.default) return `default ${exportSymbol.default}`;
-      return `{ ${exportSymbol.exports.join(', ')} }`;
+      {
+        const exportSymbol = symbol as ExportSymbol;
+        if (exportSymbol.default) return `default ${exportSymbol.default}`;
+        return `{ ${exportSymbol.exports.join(', ')} }`;
+      }
     case SymbolType.NAMESPACE:
-      const nsSymbol = symbol as NamespaceSymbol;
-      return `${nsSymbol.name} { ${nsSymbol.members.length} members }`;
+      {
+        const nsSymbol = symbol as NamespaceSymbol;
+        return `${nsSymbol.name} { ${nsSymbol.members.length} members }`;
+      }
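The effect of the braces can be seen in a reduced, self-contained form. This sketch is a variation rather than the PR's code: it uses a hypothetical discriminated union (`MiniSymbol`) instead of the `as` casts above, which lets each case reuse the same local name because every block has its own scope.

```typescript
// Hypothetical reduced symbol union; the PR's real types are richer.
type MiniSymbol =
  | { type: "enum"; name: string; members: string[] }
  | { type: "namespace"; name: string; members: string[] }
  | { type: "export"; exports: string[]; default?: string };

function miniSignature(symbol: MiniSymbol): string {
  switch (symbol.type) {
    case "enum": {
      // 'sig' is scoped to this block only.
      const sig = `${symbol.name} { ${symbol.members.length} members }`;
      return sig;
    }
    case "namespace": {
      // Reusing the name is legal because the previous 'sig' is out of scope.
      const sig = `${symbol.name} { ${symbol.members.length} members }`;
      return sig;
    }
    case "export": {
      if (symbol.default) return `default ${symbol.default}`;
      return `{ ${symbol.exports.join(", ")} }`;
    }
  }
}
```

Without the braces, both `const sig` declarations would share the switch's single scope and the second one would be a redeclaration error.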
6-9: Option not used.
ASTFormatterOptions.showTokensPerSymbol is unused. Either implement it or remove it to avoid confusion.
src/core/languages/python.ts (1)
1-5: Alias Symbol type to avoid shadowing.
Prevents global Symbol shadowing; no behavior change.
-import type Parser from 'tree-sitter';
-import type { Symbol, FunctionSymbol, ClassSymbol, ImportSymbol, SourceLocation, Parameter } from '../../types/index.js';
+import type Parser from 'tree-sitter';
+import type { Symbol as CodeSymbol, FunctionSymbol, ClassSymbol, ImportSymbol, SourceLocation, Parameter } from '../../types/index.js';
@@
-  extractSymbols: (tree: Parser.Tree, sourceCode: string): Symbol[] => {
-    const symbols: Symbol[] = [];
+  extractSymbols: (tree: Parser.Tree, sourceCode: string): CodeSymbol[] => {
+    const symbols: CodeSymbol[] = [];
Also applies to: 15-17
src/core/languages/index.ts (1)
4-9: Consider typing loadGrammar return value more specifically.
The loadGrammar function returns Promise<any>, which loses type safety. If tree-sitter provides a Language type, use it:
+import type Parser from 'tree-sitter';
 export interface LanguageConfig {
   name: string;
   extensions: string[];
-  loadGrammar: () => Promise<any>;
+  loadGrammar: () => Promise<Parser.Language>;
   extractSymbols: (tree: Parser.Tree, sourceCode: string) => Symbol[];
 }
If Parser.Language is not exported or the grammars have incompatible types, document why any is necessary with a comment.
Based on static analysis (GitHub Check: Test and Build).
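If Parser.Language turns out not to be usable, a middle ground between it and any is to accept unknown from the loader and narrow it once at the boundary. The sketch below is only that idea in miniature: `GrammarLike` is a hypothetical shape (a real grammar object may expose different fields), and `narrowGrammar` is an invented helper, not part of tree-sitter.

```typescript
// HYPOTHETICAL shape: a real tree-sitter grammar object may look different.
interface GrammarLike {
  readonly name: string;
}

// Runtime guard that lets callers hold 'unknown' instead of 'any'.
function isGrammarLike(value: unknown): value is GrammarLike {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { name?: unknown }).name === "string"
  );
}

// Narrow once where the grammar is loaded; everything downstream stays typed.
function narrowGrammar(loaded: unknown): GrammarLike {
  if (!isGrammarLike(loaded)) {
    throw new TypeError("loaded module does not look like a grammar");
  }
  return loaded;
}

// A config typed against the narrowed shape rather than 'any'.
interface TypedLanguageConfig {
  name: string;
  extensions: string[];
  loadGrammar: () => Promise<GrammarLike>;
}
```

A loader would then wrap its dynamic import, for example by returning `narrowGrammar(mod.default)` inside an async function, so the cache can be declared as `Map<string, GrammarLike>`.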
src/core/astParser.ts (1)
7-11: Type grammarCache more specifically.
The Map<string, any> loses type safety. If tree-sitter provides a Language type:
+import type Parser from 'tree-sitter';
 export class ASTParser {
   private parser: Parser | null = null;
   private initialized = false;
-  private grammarCache: Map<string, any> = new Map();
+  private grammarCache: Map<string, Parser.Language> = new Map();
Based on static analysis (GitHub Check: Test and Build).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
bun.lock is excluded by !**/*.lock
📒 Files selected for processing (17)
package.json (2 hunks)
src/cli.ts (4 hunks)
src/core/astParser.ts (1 hunks)
src/core/languages/cpp.ts (1 hunks)
src/core/languages/csharp.ts (1 hunks)
src/core/languages/go.ts (1 hunks)
src/core/languages/index.ts (1 hunks)
src/core/languages/java.ts (1 hunks)
src/core/languages/javascript.ts (1 hunks)
src/core/languages/php.ts (1 hunks)
src/core/languages/python.ts (1 hunks)
src/core/languages/ruby.ts (1 hunks)
src/core/languages/rust.ts (1 hunks)
src/core/languages/typescript.ts (1 hunks)
src/core/scanner.ts (4 hunks)
src/formatters/astFormatter.ts (1 hunks)
src/types/index.ts (3 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{ts,tsx,js,jsx}: Do not use dotenv; Bun loads .env automatically
Use Bun.serve() for HTTP/WebSocket/HTTPS routes; do not use Express
Use bun:sqlite for SQLite; do not use better-sqlite3
Use Bun.redis for Redis; do not use ioredis
Use Bun.sql for Postgres; do not use pg or postgres.js
Use built-in WebSocket; do not use ws
Prefer Bun.file over node:fs readFile/writeFile
Use Bun.$ for shelling out instead of execa
Files:
src/core/languages/csharp.ts
src/core/languages/rust.ts
src/core/languages/php.ts
src/core/languages/javascript.ts
src/core/languages/cpp.ts
src/core/languages/typescript.ts
src/core/languages/python.ts
src/core/languages/go.ts
src/core/languages/ruby.ts
src/core/scanner.ts
src/core/astParser.ts
src/formatters/astFormatter.ts
src/core/languages/java.ts
src/core/languages/index.ts
src/cli.ts
src/types/index.ts
src/core/scanner.ts
📄 CodeRabbit inference engine (CLAUDE.md)
Implement file scanning logic in src/core/scanner.ts
Files:
src/core/scanner.ts
src/formatters/**
📄 CodeRabbit inference engine (CLAUDE.md)
Place output formatters (tree, flat, json, csv) under src/formatters/
Files:
src/formatters/astFormatter.ts
src/cli.ts
📄 CodeRabbit inference engine (CLAUDE.md)
ContextCalc CLI entry point is src/cli.ts
Files:
src/cli.ts
src/types/index.ts
📄 CodeRabbit inference engine (CLAUDE.md)
Keep core type definitions in src/types/index.ts
Files:
src/types/index.ts
🧠 Learnings (1)
📚 Learning: 2025-09-12T14:25:55.847Z
Learnt from: CR
PR: agentinit/contextcalc#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-12T14:25:55.847Z
Learning: Applies to src/core/scanner.ts : Implement file scanning logic in src/core/scanner.ts
Applied to files:
src/core/scanner.ts
src/cli.ts
🧬 Code graph analysis (15)
src/core/languages/csharp.ts (1)
src/core/languages/index.ts (1)
LanguageConfig(4-9)
src/core/languages/rust.ts (1)
src/core/languages/index.ts (1)
LanguageConfig(4-9)
src/core/languages/php.ts (1)
src/core/languages/index.ts (1)
LanguageConfig(4-9)
src/core/languages/javascript.ts (2)
src/core/languages/index.ts (1)
LanguageConfig(4-9)
src/core/languages/typescript.ts (1)
TypeScriptConfig(6-341)
src/core/languages/cpp.ts (1)
src/core/languages/index.ts (1)
LanguageConfig(4-9)
src/core/languages/typescript.ts (2)
src/core/languages/index.ts (1)
LanguageConfig(4-9)
src/types/index.ts (10)
SourceLocation(129-136)
Parameter(138-143)
FunctionSymbol(153-160)
ClassSymbol(162-168)
VariableSymbol(186-190)
InterfaceSymbol(170-174)
TypeSymbol(176-179)
EnumSymbol(181-184)
ImportSymbol(192-198)
ExportSymbol(200-204)
src/core/languages/python.ts (2)
src/core/languages/index.ts (1)
LanguageConfig(4-9)
src/types/index.ts (5)
SourceLocation(129-136)
Parameter(138-143)
FunctionSymbol(153-160)
ClassSymbol(162-168)
ImportSymbol(192-198)
src/core/languages/go.ts (1)
src/core/languages/index.ts (1)
LanguageConfig(4-9)
src/core/languages/ruby.ts (1)
src/core/languages/index.ts (1)
LanguageConfig(4-9)
src/core/scanner.ts (2)
src/core/astParser.ts (1)
ASTParser(7-147)
src/types/index.ts (1)
FileNode(1-11)
src/core/astParser.ts (2)
src/core/languages/index.ts (2)
initializeLanguages(32-56)
getLanguageByExtension(24-26)
src/types/index.ts (1)
ASTOptions(234-239)
src/formatters/astFormatter.ts (2)
src/types/index.ts (11)
TreeOptions(79-92)
ScanResult(104-110)
FunctionSymbol(153-160)
ClassSymbol(162-168)
InterfaceSymbol(170-174)
EnumSymbol(181-184)
ImportSymbol(192-198)
ExportSymbol(200-204)
NamespaceSymbol(206-209)
FileNode(1-11)
Node(23-23)
src/utils/formatUtils.ts (1)
formatFileSize(1-13)
src/core/languages/java.ts (1)
src/core/languages/index.ts (1)
LanguageConfig(4-9)
src/core/languages/index.ts (10)
src/core/languages/typescript.ts (1)
TypeScriptConfig(6-341)
src/core/languages/javascript.ts (1)
JavaScriptConfig(5-20)
src/core/languages/python.ts (1)
PythonConfig(6-163)
src/core/languages/go.ts (1)
GoConfig(5-18)
src/core/languages/rust.ts (1)
RustConfig(5-18)
src/core/languages/java.ts (1)
JavaConfig(5-18)
src/core/languages/cpp.ts (1)
CppConfig(5-18)
src/core/languages/csharp.ts (1)
CSharpConfig(5-18)
src/core/languages/ruby.ts (1)
RubyConfig(5-18)
src/core/languages/php.ts (1)
PhpConfig(5-18)
src/cli.ts (3)
src/core/scanner.ts (1)
DirectoryScanner(11-258)
src/types/index.ts (1)
TreeOptions(79-92)
src/formatters/astFormatter.ts (1)
formatAsAST(11-222)
🪛 Biome (2.1.2)
src/core/languages/csharp.ts
[error] 2-2: Do not shadow the global "Symbol" property.
Consider renaming this variable. It's easy to confuse the origin of variables when they're named after a known global.
(lint/suspicious/noShadowRestrictedNames)
src/core/languages/rust.ts
[error] 2-2: Do not shadow the global "Symbol" property.
Consider renaming this variable. It's easy to confuse the origin of variables when they're named after a known global.
(lint/suspicious/noShadowRestrictedNames)
src/core/languages/php.ts
[error] 2-2: Do not shadow the global "Symbol" property.
Consider renaming this variable. It's easy to confuse the origin of variables when they're named after a known global.
(lint/suspicious/noShadowRestrictedNames)
src/core/languages/javascript.ts
[error] 2-2: Do not shadow the global "Symbol" property.
Consider renaming this variable. It's easy to confuse the origin of variables when they're named after a known global.
(lint/suspicious/noShadowRestrictedNames)
src/core/languages/cpp.ts
[error] 2-2: Do not shadow the global "Symbol" property.
Consider renaming this variable. It's easy to confuse the origin of variables when they're named after a known global.
(lint/suspicious/noShadowRestrictedNames)
src/core/languages/typescript.ts
[error] 2-2: Do not shadow the global "Symbol" property.
Consider renaming this variable. It's easy to confuse the origin of variables when they're named after a known global.
(lint/suspicious/noShadowRestrictedNames)
src/core/languages/python.ts
[error] 2-2: Do not shadow the global "Symbol" property.
Consider renaming this variable. It's easy to confuse the origin of variables when they're named after a known global.
(lint/suspicious/noShadowRestrictedNames)
src/core/languages/go.ts
[error] 2-2: Do not shadow the global "Symbol" property.
Consider renaming this variable. It's easy to confuse the origin of variables when they're named after a known global.
(lint/suspicious/noShadowRestrictedNames)
src/core/languages/ruby.ts
[error] 2-2: Do not shadow the global "Symbol" property.
Consider renaming this variable. It's easy to confuse the origin of variables when they're named after a known global.
(lint/suspicious/noShadowRestrictedNames)
src/core/astParser.ts
[error] 4-4: Do not shadow the global "Symbol" property.
Consider renaming this variable. It's easy to confuse the origin of variables when they're named after a known global.
(lint/suspicious/noShadowRestrictedNames)
src/formatters/astFormatter.ts
[error] 2-2: Do not shadow the global "Symbol" property.
Consider renaming this variable. It's easy to confuse the origin of variables when they're named after a known global.
(lint/suspicious/noShadowRestrictedNames)
[error] 51-52: Other switch clauses can erroneously access this declaration.
Wrap the declaration in a block to restrict its access to the switch clause.
The declaration is defined in this switch clause:
Safe fix: Wrap the declaration in a block.
(lint/correctness/noSwitchDeclarations)
[error] 55-62: Other switch clauses can erroneously access this declaration.
Wrap the declaration in a block to restrict its access to the switch clause.
The declaration is defined in this switch clause:
Safe fix: Wrap the declaration in a block.
(lint/correctness/noSwitchDeclarations)
[error] 62-63: Other switch clauses can erroneously access this declaration.
Wrap the declaration in a block to restrict its access to the switch clause.
The declaration is defined in this switch clause:
Safe fix: Wrap the declaration in a block.
(lint/correctness/noSwitchDeclarations)
[error] 68-69: Other switch clauses can erroneously access this declaration.
Wrap the declaration in a block to restrict its access to the switch clause.
The declaration is defined in this switch clause:
Safe fix: Wrap the declaration in a block.
(lint/correctness/noSwitchDeclarations)
[error] 69-70: Other switch clauses can erroneously access this declaration.
Wrap the declaration in a block to restrict its access to the switch clause.
The declaration is defined in this switch clause:
Safe fix: Wrap the declaration in a block.
(lint/correctness/noSwitchDeclarations)
[error] 78-79: Other switch clauses can erroneously access this declaration.
Wrap the declaration in a block to restrict its access to the switch clause.
The declaration is defined in this switch clause:
Safe fix: Wrap the declaration in a block.
(lint/correctness/noSwitchDeclarations)
[error] 79-80: Other switch clauses can erroneously access this declaration.
Wrap the declaration in a block to restrict its access to the switch clause.
The declaration is defined in this switch clause:
Safe fix: Wrap the declaration in a block.
(lint/correctness/noSwitchDeclarations)
[error] 86-87: Other switch clauses can erroneously access this declaration.
Wrap the declaration in a block to restrict its access to the switch clause.
The declaration is defined in this switch clause:
Safe fix: Wrap the declaration in a block.
(lint/correctness/noSwitchDeclarations)
[error] 90-91: Other switch clauses can erroneously access this declaration.
Wrap the declaration in a block to restrict its access to the switch clause.
The declaration is defined in this switch clause:
Safe fix: Wrap the declaration in a block.
(lint/correctness/noSwitchDeclarations)
[error] 91-92: Other switch clauses can erroneously access this declaration.
Wrap the declaration in a block to restrict its access to the switch clause.
The declaration is defined in this switch clause:
Safe fix: Wrap the declaration in a block.
(lint/correctness/noSwitchDeclarations)
[error] 98-99: Other switch clauses can erroneously access this declaration.
Wrap the declaration in a block to restrict its access to the switch clause.
The declaration is defined in this switch clause:
Safe fix: Wrap the declaration in a block.
(lint/correctness/noSwitchDeclarations)
[error] 103-104: Other switch clauses can erroneously access this declaration.
Wrap the declaration in a block to restrict its access to the switch clause.
The declaration is defined in this switch clause:
Safe fix: Wrap the declaration in a block.
(lint/correctness/noSwitchDeclarations)
src/core/languages/java.ts
[error] 2-2: Do not shadow the global "Symbol" property.
Consider renaming this variable. It's easy to confuse the origin of variables when they're named after a known global.
(lint/suspicious/noShadowRestrictedNames)
src/core/languages/index.ts
[error] 2-2: Do not shadow the global "Symbol" property.
Consider renaming this variable. It's easy to confuse the origin of variables when they're named after a known global.
(lint/suspicious/noShadowRestrictedNames)
🪛 ESLint
src/core/languages/csharp.ts
[error] 14-14: 'tree' is defined but never used. Allowed unused args must match /^_/u.
(@typescript-eslint/no-unused-vars)
[error] 14-14: 'sourceCode' is defined but never used. Allowed unused args must match /^_/u.
(@typescript-eslint/no-unused-vars)
src/core/languages/rust.ts
[error] 14-14: 'tree' is defined but never used. Allowed unused args must match /^_/u.
(@typescript-eslint/no-unused-vars)
[error] 14-14: 'sourceCode' is defined but never used. Allowed unused args must match /^_/u.
(@typescript-eslint/no-unused-vars)
src/core/languages/php.ts
[error] 14-14: 'tree' is defined but never used. Allowed unused args must match /^_/u.
(@typescript-eslint/no-unused-vars)
[error] 14-14: 'sourceCode' is defined but never used. Allowed unused args must match /^_/u.
(@typescript-eslint/no-unused-vars)
src/core/languages/javascript.ts
[error] 17-17: A require() style import is forbidden.
(@typescript-eslint/no-require-imports)
src/core/languages/cpp.ts
[error] 14-14: 'tree' is defined but never used. Allowed unused args must match /^_/u.
(@typescript-eslint/no-unused-vars)
[error] 14-14: 'sourceCode' is defined but never used. Allowed unused args must match /^_/u.
(@typescript-eslint/no-unused-vars)
src/core/languages/typescript.ts
[error] 2-2: 'SymbolType' is defined but never used. Allowed unused vars must match /^_/u.
(@typescript-eslint/no-unused-vars)
src/core/languages/go.ts
[error] 14-14: 'tree' is defined but never used. Allowed unused args must match /^_/u.
(@typescript-eslint/no-unused-vars)
[error] 14-14: 'sourceCode' is defined but never used. Allowed unused args must match /^_/u.
(@typescript-eslint/no-unused-vars)
src/core/languages/ruby.ts
[error] 14-14: 'tree' is defined but never used. Allowed unused args must match /^_/u.
(@typescript-eslint/no-unused-vars)
[error] 14-14: 'sourceCode' is defined but never used. Allowed unused args must match /^_/u.
(@typescript-eslint/no-unused-vars)
src/core/astParser.ts
[error] 20-20: 'options' is assigned a value but never used. Allowed unused args must match /^_/u.
(@typescript-eslint/no-unused-vars)
[error] 85-85: 'options' is assigned a value but never used. Allowed unused args must match /^_/u.
(@typescript-eslint/no-unused-vars)
src/formatters/astFormatter.ts
[error] 2-2: 'FolderNode' is defined but never used. Allowed unused vars must match /^_/u.
(@typescript-eslint/no-unused-vars)
[error] 51-51: Unexpected lexical declaration in case block.
(no-case-declarations)
[error] 55-61: Unexpected lexical declaration in case block.
(no-case-declarations)
[error] 62-62: Unexpected lexical declaration in case block.
(no-case-declarations)
[error] 68-68: Unexpected lexical declaration in case block.
(no-case-declarations)
[error] 69-69: Unexpected lexical declaration in case block.
(no-case-declarations)
[error] 78-78: Unexpected lexical declaration in case block.
(no-case-declarations)
[error] 79-79: Unexpected lexical declaration in case block.
(no-case-declarations)
[error] 86-86: Unexpected lexical declaration in case block.
(no-case-declarations)
[error] 90-90: Unexpected lexical declaration in case block.
(no-case-declarations)
[error] 91-91: Unexpected lexical declaration in case block.
(no-case-declarations)
[error] 98-98: Unexpected lexical declaration in case block.
(no-case-declarations)
[error] 103-103: Unexpected lexical declaration in case block.
(no-case-declarations)
src/core/languages/java.ts
[error] 14-14: 'tree' is defined but never used. Allowed unused args must match /^_/u.
(@typescript-eslint/no-unused-vars)
[error] 14-14: 'sourceCode' is defined but never used. Allowed unused args must match /^_/u.
(@typescript-eslint/no-unused-vars)
🪛 GitHub Actions: CI
src/core/astParser.ts
[warning] 10-10: ESLint: Unexpected any. Specify a different type. @typescript-eslint/no-explicit-any
[error] 20-20: ESLint: 'options' is assigned a value but never used. Allowed unused args must match /^_/ (no-unused-vars)
🪛 GitHub Check: Test and Build (18)
src/core/languages/csharp.ts
[failure] 14-14:
'sourceCode' is defined but never used. Allowed unused args must match /^_/u
[failure] 14-14:
'tree' is defined but never used. Allowed unused args must match /^_/u
src/core/languages/cpp.ts
[failure] 14-14:
'sourceCode' is defined but never used. Allowed unused args must match /^_/u
[failure] 14-14:
'tree' is defined but never used. Allowed unused args must match /^_/u
src/core/languages/go.ts
[failure] 14-14:
'sourceCode' is defined but never used. Allowed unused args must match /^_/u
[failure] 14-14:
'tree' is defined but never used. Allowed unused args must match /^_/u
src/core/astParser.ts
[failure] 85-85:
'options' is assigned a value but never used. Allowed unused args must match /^_/u
[failure] 20-20:
'options' is assigned a value but never used. Allowed unused args must match /^_/u
[warning] 10-10:
Unexpected any. Specify a different type
src/core/languages/java.ts
[failure] 14-14:
'sourceCode' is defined but never used. Allowed unused args must match /^_/u
[failure] 14-14:
'tree' is defined but never used. Allowed unused args must match /^_/u
src/core/languages/index.ts
[warning] 7-7:
Unexpected any. Specify a different type
🪛 GitHub Check: Test and Build (22)
src/core/languages/csharp.ts
[failure] 14-14:
'sourceCode' is defined but never used. Allowed unused args must match /^_/u
[failure] 14-14:
'tree' is defined but never used. Allowed unused args must match /^_/u
src/core/languages/cpp.ts
[failure] 14-14:
'sourceCode' is defined but never used. Allowed unused args must match /^_/u
[failure] 14-14:
'tree' is defined but never used. Allowed unused args must match /^_/u
src/core/languages/go.ts
[failure] 14-14:
'sourceCode' is defined but never used. Allowed unused args must match /^_/u
[failure] 14-14:
'tree' is defined but never used. Allowed unused args must match /^_/u
src/core/astParser.ts
[failure] 85-85:
'options' is assigned a value but never used. Allowed unused args must match /^_/u
[failure] 20-20:
'options' is assigned a value but never used. Allowed unused args must match /^_/u
[warning] 10-10:
Unexpected any. Specify a different type
src/core/languages/java.ts
[failure] 14-14:
'sourceCode' is defined but never used. Allowed unused args must match /^_/u
[failure] 14-14:
'tree' is defined but never used. Allowed unused args must match /^_/u
src/core/languages/index.ts
[warning] 7-7:
Unexpected any. Specify a different type
🔇 Additional comments (7)
package.json (1)
21-22: LGTM! Tree-sitter integration properly configured. The externals configuration correctly prevents bundling of tree-sitter native modules, and the dependency versions are consistently pinned across all language grammars.
Also applies to: 55-66
src/core/scanner.ts (1)
15-31: All constructor call sites verified and correct. The single DirectoryScanner instantiation at src/cli.ts:279 properly includes the new enableAST parameter, correctly derived from outputFormat === OutputFormat.AST at line 278. The constructor signature matches the call site, and enableAST lifecycle gating is consistent throughout initialization and disposal.
src/cli.ts (1)
10-10: LGTM! Clean integration of AST output support. The AST output format integration follows the existing patterns for other formatters (CSV, JSON, etc.) and correctly enables AST parsing in the scanner when needed. The CLI help text and completion message are appropriately updated.
Also applies to: 109-109, 277-279, 287-304, 371-371
src/core/languages/index.ts (1)
32-56: LGTM! Language initialization is well-structured. The lazy-loading approach with dynamic imports is efficient and the registration logic is straightforward.
src/core/astParser.ts (1)
20-83: LGTM! Robust parsing implementation with graceful error handling. The parser implementation properly handles initialization, grammar loading with caching, and parsing errors by returning empty arrays. The error handling strategy is appropriate for this use case.
Also applies to: 85-139, 141-146
src/types/index.ts (2)
112-239: LGTM! Comprehensive and well-structured AST type system. The AST symbol type definitions are thorough and properly use discriminated unions. The type hierarchy (BaseSymbol → specialized interfaces → union type) is clean and extensible.
43-49: LGTM! Clean additions to support AST functionality. The OutputFormat enum extension and ASTOptions interface are straightforward and integrate well with the existing type system.
Also applies to: 234-239
Fixes all 31 linting errors and 2 warnings identified by ESLint:
Symbol Naming Conflict:
- Renamed 'Symbol' type to 'ASTSymbol' throughout codebase to avoid shadowing JavaScript's global Symbol
- Updated all imports, type annotations, and usages across 14 files
Unused Parameters:
- Prefixed unused parameters with underscore in stub language implementations
- Fixed _options parameter in astParser.ts parseFile and parseText methods
ESLint Violations:
- Replaced require() with ESM import in javascript.ts
- Added braces to all case blocks in astFormatter.ts formatSymbolSignature
- Changed 'any' types to 'unknown' in grammarCache and loadGrammar
Removed Unused Imports:
- Removed FolderNode from astFormatter.ts imports
Test Results:
✅ All 83 tests passing
✅ TypeScript compilation successful
✅ ESLint: 0 errors, 0 warnings
✅ Functionality verified: AST parsing working correctly
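For readers unfamiliar with the no-case-declarations rule, the brace fix mentioned in this commit can be sketched roughly as below. The `SketchSymbol` type and `formatSignatureSketch` function are hypothetical stand-ins for illustration; they are not the actual `formatSymbolSignature` from astFormatter.ts.

```typescript
// Hypothetical, simplified symbol shapes (not the project's real types).
type SketchSymbol =
  | { kind: 'function'; name: string; params: string[] }
  | { kind: 'class'; name: string; base?: string };

function formatSignatureSketch(sym: SketchSymbol): string {
  switch (sym.kind) {
    // Wrapping each case body in braces gives it its own block scope,
    // so the `const` declarations are no longer visible to sibling cases.
    case 'function': {
      const params = sym.params.join(', ');
      return `${sym.name}(${params})`;
    }
    case 'class': {
      const suffix = sym.base ? ` extends ${sym.base}` : '';
      return `class ${sym.name}${suffix}`;
    }
  }
}
```

Without the braces, both `const` declarations would share the single scope of the `switch` body, which is what the lint rule flags.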
Actionable comments posted: 2
♻️ Duplicate comments (1)
src/core/languages/go.ts (1)
14-17: Unused params prefix fix: LGTM. The underscore-prefixed parameters resolve the lint violations noted earlier.
🧹 Nitpick comments (5)
src/types/index.ts (1)
10-10: Solid AST type surface and OutputFormat extension. Types are cohesive and align with the formatter and language extractors; FileNode.entities and OutputFormat.AST look correct.
Optional enhancement: in ImportSymbol, model aliasing (e.g., { original: string; alias?: string }[]) to preserve “as” names for named imports.
Also applies to: 47-49, 112-239
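A minimal sketch of the aliasing idea, assuming the `{ original; alias? }` shape suggested above. `ImportedName` and `parseNamedImports` are hypothetical names, and the real `ImportSymbol` in src/types/index.ts may be shaped differently.

```typescript
// Hypothetical shape for one named import; not the project's actual type.
interface ImportedName {
  original: string;
  alias?: string;
}

// Parse the inside of a named-import clause such as "a, b as c".
function parseNamedImports(clause: string): ImportedName[] {
  return clause
    .split(',')
    .map(part => part.trim())
    .filter(part => part.length > 0)
    .map(part => {
      const pieces = part.split(/\s+as\s+/);
      const original = pieces[0] ?? part;
      const alias = pieces[1];
      // Only attach `alias` when an "as" rename is actually present.
      return alias ? { original, alias } : { original };
    });
}
```

Keeping `alias` optional means existing consumers that only read `original` would continue to work.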
src/core/languages/go.ts (1)
9-12: Make grammar loading resilient to export shape differences. tree-sitter grammars vary (default export vs named). Guard for both to avoid runtime failures.
Apply:
- loadGrammar: async () => {
-   const GoLanguage = await import('tree-sitter-go');
-   return GoLanguage.default;
- },
+ loadGrammar: async () => {
+   const mod = await import('tree-sitter-go');
+   // Support both ESM default and named exports
+   return (mod as any).default ?? (mod as any).Go ?? (mod as any).language ?? mod;
+ },
Please run a quick smoke parse on a tiny Go file to confirm Parser.setLanguage(await loadGrammar()) succeeds.
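The same guard can be written without `any`, which would keep it in line with the no-explicit-any warnings reported earlier. This is only a sketch: `resolveGrammarExport` is a hypothetical helper, and the `default` and `language` property names are assumptions about possible export shapes, not confirmed exports of tree-sitter-go.

```typescript
// Pick a grammar object out of a dynamically imported module,
// whatever its export shape turns out to be (property names assumed).
function resolveGrammarExport(mod: unknown): unknown {
  if (mod !== null && typeof mod === 'object') {
    const record = mod as Record<string, unknown>;
    // Prefer an ESM default export, then a named `language` export,
    // and finally fall back to the module object itself.
    return record.default ?? record.language ?? mod;
  }
  return mod;
}
```

A caller would pass the result of `await import('tree-sitter-go')` through this helper before handing it to the parser.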
src/core/languages/typescript.ts (3)
85-123: Parse extends/implements into discrete names. Currently extends/implements are captured as full clauses (e.g., “implements A, B”). Normalize to arrays of identifiers.
- const extendsNode = node.children.find(c => c.type === 'class_heritage');
- const implementsNode = node.children.find(c => c.type === 'implements_clause');
+ const extendsClause = node.children.find(c => c.type === 'extends_clause')
+   ?? node.children.find(c => c.type === 'class_heritage' && c.text.startsWith('extends'));
+ const implementsClause = node.children.find(c => c.type === 'implements_clause')
+   ?? node.children.find(c => c.type === 'class_heritage' && c.text.includes('implements'));
@@
- extends: extendsNode ? getNodeText(extendsNode) : undefined,
- implements: implementsNode ? [getNodeText(implementsNode)] : undefined,
+ extends: extendsClause
+   ? getNodeText(extendsClause).replace(/^extends\s+/,'').split(',').map(s => s.trim())
+   : undefined,
+ implements: implementsClause
+   ? getNodeText(implementsClause).replace(/^implements\s+/,'').split(',').map(s => s.trim())
+   : undefined,
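One caveat with a plain `split(',')` is that it would also split inside generic arguments such as `Map<K, V>`. A possible text-level workaround, sketched here with a hypothetical `splitHeritageClause` helper, tracks angle-bracket depth; walking the clause's named children in the syntax tree would likely be more robust still.

```typescript
// Split a heritage clause like "implements A, Map<K, V>" into names,
// ignoring commas nested inside angle brackets.
function splitHeritageClause(clause: string): string[] {
  const body = clause.replace(/^(extends|implements)\s+/, '');
  const names: string[] = [];
  let depth = 0;
  let current = '';
  for (const ch of body) {
    if (ch === '<') depth++;
    else if (ch === '>') depth = Math.max(0, depth - 1);
    if (ch === ',' && depth === 0) {
      // A top-level comma ends the current name.
      if (current.trim()) names.push(current.trim());
      current = '';
    } else {
      current += ch;
    }
  }
  if (current.trim()) names.push(current.trim());
  return names;
}
```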
60-79: Derive function signature via body field (avoid split('{') heuristics). Split on “{” breaks with generics or object types. Slice to the start of the body instead.
- return {
+ const body = node.childForFieldName('body');
+ const header = body ? sourceCode.slice(node.startIndex, body.startIndex).trim() : getNodeText(node);
+ return {
    name: getNodeText(nameNode),
    type: ST.FUNCTION,
    location: getLocation(node),
    parameters,
    returnType: returnTypeNode ? getNodeText(returnTypeNode) : undefined,
    async: isAsync,
    generator: isGenerator,
-   signature: getNodeText(node).split('{')[0]?.trim() || getNodeText(node)
+   signature: header
  };
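To make the failure mode concrete, here is a small self-contained comparison that does not need tree-sitter. `NodeSpan` and `headerFromBody` are hypothetical stand-ins for the node offsets; in the real extractor the body offset would come from `childForFieldName('body')`.

```typescript
// Minimal stand-in for the offsets a tree-sitter node would expose.
interface NodeSpan {
  startIndex: number;
  endIndex: number;
  bodyStartIndex?: number; // hypothetical: start offset of the body, if any
}

// Slice from the node start up to the body start (or the node end).
function headerFromBody(source: string, node: NodeSpan): string {
  const end = node.bodyStartIndex ?? node.endIndex;
  return source.slice(node.startIndex, end).trim();
}

const sampleSource = 'function f(opts: { a: number }): void { return; }';
const sampleSpan: NodeSpan = {
  startIndex: 0,
  endIndex: sampleSource.length,
  bodyStartIndex: sampleSource.indexOf('{ return'),
};

// The heuristic stops at the first brace, which belongs to the object type.
const viaSplit = sampleSource.split('{')[0]?.trim() ?? sampleSource;
// The offset-based version keeps the full parameter list and return type.
const viaBody = headerFromBody(sampleSource, sampleSpan);
```

Here `viaSplit` is cut off at `function f(opts:`, while `viaBody` yields `function f(opts: { a: number }): void`.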
34-58: Handle rest parameters and destructuring in extractParameters. Support rest_parameter and keep names readable for patterns.
- if (formalParams) {
+ if (formalParams) {
    for (const child of formalParams.namedChildren) {
-     if (child.type === 'required_parameter' || child.type === 'optional_parameter') {
+     if (child.type === 'required_parameter' || child.type === 'optional_parameter' || child.type === 'rest_parameter') {
        const nameNode = child.childForFieldName('pattern') || child.children.find(c => c.type === 'identifier');
        const typeNode = child.childForFieldName('type');
        const defaultValue = child.children.find(c => c.type === 'initializer');
+       const isRest = child.type === 'rest_parameter';
        if (nameNode) {
          params.push({
-           name: getNodeText(nameNode),
+           name: (isRest ? '...' : '') + getNodeText(nameNode),
            type: typeNode ? getNodeText(typeNode) : undefined,
            optional: child.type === 'optional_parameter',
            defaultValue: defaultValue ? getNodeText(defaultValue) : undefined
          });
        }
      }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (14)
src/core/astParser.ts (1 hunks)
src/core/languages/cpp.ts (1 hunks)
src/core/languages/csharp.ts (1 hunks)
src/core/languages/go.ts (1 hunks)
src/core/languages/index.ts (1 hunks)
src/core/languages/java.ts (1 hunks)
src/core/languages/javascript.ts (1 hunks)
src/core/languages/php.ts (1 hunks)
src/core/languages/python.ts (1 hunks)
src/core/languages/ruby.ts (1 hunks)
src/core/languages/rust.ts (1 hunks)
src/core/languages/typescript.ts (1 hunks)
src/formatters/astFormatter.ts (1 hunks)
src/types/index.ts (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (10)
- src/core/languages/ruby.ts
- src/core/astParser.ts
- src/core/languages/csharp.ts
- src/core/languages/php.ts
- src/core/languages/java.ts
- src/core/languages/cpp.ts
- src/core/languages/index.ts
- src/core/languages/javascript.ts
- src/core/languages/rust.ts
- src/core/languages/python.ts
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{ts,tsx,js,jsx}: Do not use dotenv; Bun loads .env automatically
Use Bun.serve() for HTTP/WebSocket/HTTPS routes; do not use Express
Use bun:sqlite for SQLite; do not use better-sqlite3
Use Bun.redis for Redis; do not use ioredis
Use Bun.sql for Postgres; do not use pg or postgres.js
Use built-in WebSocket; do not use ws
Prefer Bun.file over node:fs readFile/writeFile
Use Bun.$ for shelling out instead of execa
Files:
src/core/languages/go.ts
src/core/languages/typescript.ts
src/types/index.ts
src/formatters/astFormatter.ts
src/types/index.ts
📄 CodeRabbit inference engine (CLAUDE.md)
Keep core type definitions in src/types/index.ts
Files:
src/types/index.ts
src/formatters/**
📄 CodeRabbit inference engine (CLAUDE.md)
Place output formatters (tree, flat, json, csv) under src/formatters/
Files:
src/formatters/astFormatter.ts
🧬 Code graph analysis (3)
src/core/languages/go.ts (2)
src/core/languages/index.ts (1)
LanguageConfig (4-9)
src/types/index.ts (1)
ASTSymbol (221-232)
src/core/languages/typescript.ts (2)
src/core/languages/index.ts (1)
LanguageConfig (4-9)
src/types/index.ts (11)
ASTSymbol (221-232), SourceLocation (129-136), Parameter (138-143), FunctionSymbol (153-160), ClassSymbol (162-168), VariableSymbol (186-190), InterfaceSymbol (170-174), TypeSymbol (176-179), EnumSymbol (181-184), ImportSymbol (192-198), ExportSymbol (200-204)
src/formatters/astFormatter.ts (2)
src/types/index.ts (12)
TreeOptions (79-92), ScanResult (104-110), ASTSymbol (221-232), FunctionSymbol (153-160), ClassSymbol (162-168), InterfaceSymbol (170-174), EnumSymbol (181-184), ImportSymbol (192-198), ExportSymbol (200-204), NamespaceSymbol (206-209), FileNode (1-11), Node (23-23)
src/utils/formatUtils.ts (1)
formatFileSize (1-13)
- Add top-level guard to prevent nested variables, functions, classes, etc. from being extracted as top-level symbols
- Implement showTokensPerSymbol option in AST formatter to display per-symbol token counts when enabled
Actionable comments posted: 2
♻️ Duplicate comments (2)
src/formatters/astFormatter.ts (1)
130-156: LGTM! Token-per-symbol display has been implemented. The showTokensPerSymbol option is now properly utilized (lines 135-138) and included in the symbol output (line 143). This addresses the previous review comment about the unused option.
Based on past review comment.
src/core/languages/typescript.ts (1)
289-344: Make the top-level check more robust to handle different root node types. The isTopLevel check (line 291) only tests for 'program', but tree-sitter grammars can use different root node types (e.g., 'source_file'). The previous review suggested checking both to ensure compatibility.
Apply this diff:
  function traverse(node: Parser.SyntaxNode) {
-   // Check if this node is at the top level (parent is program)
-   const isTopLevel = node.parent?.type === 'program';
+   // Check if this node is at the top level (parent is program or source_file, or is root itself)
+   const isTopLevel = !node.parent || node.parent === rootNode ||
+     node.parent.type === 'program' || node.parent.type === 'source_file';
Based on past review comment.
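A stripped-down version of the guard, using a hypothetical `NodeLike` shape instead of the real `Parser.SyntaxNode`, might look like the sketch below. Unlike the suggested diff, it does not treat the root node itself as top-level; that difference is a deliberate simplification here, not a recommendation.

```typescript
// Minimal node shape; real tree-sitter SyntaxNode objects carry much more.
interface NodeLike {
  type: string;
  parent: NodeLike | null;
}

// Root node types named in the review comment above.
const ROOT_NODE_TYPES = new Set(['program', 'source_file']);

// A node counts as top-level when its direct parent is a root node.
function isTopLevelNode(node: NodeLike): boolean {
  return node.parent !== null && ROOT_NODE_TYPES.has(node.parent.type);
}
```

Keeping the root types in a set makes it cheap to add another root node type if a further grammar needs one.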
📜 Review details
📒 Files selected for processing (2)
src/core/languages/typescript.ts (1 hunks)
src/formatters/astFormatter.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Files:
src/formatters/astFormatter.ts
src/core/languages/typescript.ts
src/formatters/**
📄 CodeRabbit inference engine (CLAUDE.md)
Place output formatters (tree, flat, json, csv) under src/formatters/
Files:
src/formatters/astFormatter.ts
🧬 Code graph analysis (2)
src/formatters/astFormatter.ts (2)
src/types/index.ts (12)
TreeOptions (79-92), ScanResult (104-110), ASTSymbol (221-232), FunctionSymbol (153-160), ClassSymbol (162-168), InterfaceSymbol (170-174), EnumSymbol (181-184), ImportSymbol (192-198), ExportSymbol (200-204), NamespaceSymbol (206-209), FileNode (1-11), Node (23-23)
src/utils/formatUtils.ts (1)
formatFileSize (1-13)
src/core/languages/typescript.ts (2)
src/core/languages/index.ts (1)
LanguageConfig (4-9)
src/types/index.ts (11)
ASTSymbol (221-232), SourceLocation (129-136), Parameter (138-143), FunctionSymbol (153-160), ClassSymbol (162-168), VariableSymbol (186-190), InterfaceSymbol (170-174), TypeSymbol (176-179), EnumSymbol (181-184), ImportSymbol (192-198), ExportSymbol (200-204)
🔇 Additional comments (1)
src/formatters/astFormatter.ts (1)
235-270: LGTM! Symbol counting correctly handles nested structures. The counting logic properly distinguishes between enum members (lines 256-258) and nested ASTSymbol members (lines 260-263), which aligns with the type definitions.
function extractClass(node: Parser.SyntaxNode): ClassSymbol | null {
  const nameNode = node.childForFieldName('name');
  if (!nameNode) return null;

  const members: ASTSymbol[] = [];
  const bodyNode = node.childForFieldName('body');
  const extendsNode = node.children.find(c => c.type === 'class_heritage');
  const implementsNode = node.children.find(c => c.type === 'implements_clause');
  const isAbstract = node.children.some(c => c.type === 'abstract');

  if (bodyNode) {
    for (const child of bodyNode.namedChildren) {
      if (child.type === 'method_definition') {
        const method = extractFunction(child);
        if (method) {
          method.type = ST.METHOD;
          members.push(method);
        }
      } else if (child.type === 'public_field_definition' || child.type === 'field_definition') {
        const propName = child.childForFieldName('name');
        const propType = child.childForFieldName('type');
        if (propName) {
          members.push({
            name: getNodeText(propName),
            type: ST.VARIABLE,
            location: getLocation(child),
            variableType: propType ? getNodeText(propType) : undefined
          } as VariableSymbol);
        }
      }
    }
  }

  return {
    name: getNodeText(nameNode),
    type: ST.CLASS,
    location: getLocation(node),
    extends: extendsNode ? getNodeText(extendsNode) : undefined,
    implements: implementsNode ? [getNodeText(implementsNode)] : undefined,
    members,
    abstract: isAbstract
  };
}
🧩 Analysis chain
Consider using field-based lookups for class metadata.
Lines 87-89 search through node.children to find heritage clauses and abstract modifiers. Tree-sitter's field-based API (childForFieldName()) is typically more reliable than searching by type. Additionally, 'abstract' is unlikely to be a node type (similar to the async/generator issue).
Verify the correct field names and node types in tree-sitter-typescript documentation:
If field names are available (e.g., heritage, decorator, modifiers), refactor to use childForFieldName() instead of children.find().
🌐 Web query:
tree-sitter-typescript class_declaration node structure and field names
💡 Result:
Summary of the tree-sitter‑typescript class_declaration node (common field names)
Typical named fields you can expect on a class_declaration node in the tree-sitter TypeScript grammar:
- name — the class identifier (identifier node). [1]
- type_parameters — generic type parameter list (type_parameters node). [1]
- superclass / superclass: (superclass ...) — the extends clause / superclass name(s). [2]
- interfaces / super_interfaces / implements — implemented interfaces (super_interfaces / type_list style). [2]
- body — the class body (class_body node). [1][2]
- decorators / modifiers (when present) — decorator or modifier tokens attached to the class (grammar-dependent). [1][2]
Notes:
- Exact field keys may vary slightly between versions (e.g., superclass vs. superclass_name or super_interfaces vs. interfaces), but the canonical node-types.json for the tree-sitter-typescript grammar exposes the name, type_parameters, superclass/super_interfaces (extends/implements), and body fields used by queries. See the grammar/node-types data for the precise names in the version you use. [1][2]
Sources:
[1] tree-sitter static node-types / tree-sitter-typescript node-types (node-types.json) documentation.
[2] Example tagging / stack-graph queries that match class_declaration fields (shows name, body, superclass/superclass_name, type_parameters, interfaces/super_interfaces).
🏁 Script executed:
#!/bin/bash
# Find tree-sitter-typescript version and extract field name details from node-types
if [ -f "package.json" ]; then
cat package.json | grep -A2 -B2 "tree-sitter"
fi
if [ -f "package-lock.json" ]; then
echo "=== tree-sitter packages in lock file ==="
grep -A5 '"tree-sitter' package-lock.json | head -30
fi
# Look for node_modules or grammar definitions
find . -name "node-types.json" -o -name "grammar.js" 2>/dev/null | head -10
Length of output: 1032
🏁 Script executed:
#!/bin/bash
# Examine the actual extractClass implementation at lines 81-123
cat src/core/languages/typescript.ts | sed -n '81,123p'
Length of output: 1678
🏁 Script executed:
#!/bin/bash
# Search for other uses of class heritage/implements in the file to see pattern
rg -n 'class_heritage|implements_clause|superclass|super_interfaces' src/core/languages/typescript.ts -C2
Length of output: 351
🏁 Script executed:
#!/bin/bash
# Search for modifier/abstract handling patterns elsewhere in the file
rg -n 'modifiers|async|generator' src/core/languages/typescript.ts -C2 | head -40
Length of output: 705
🏁 Script executed:
#!/bin/bash
# Check if there are test files or examples showing tree-sitter usage
find . -type f \( -name "*.test.ts" -o -name "*.spec.ts" -o -name "*example*" \) -exec grep -l "class_declaration\|extractClass" {} \; 2>/dev/null | head -5
Length of output: 0
Fix class metadata extraction using field-based lookups.
Lines 87-89 search for class metadata using incorrect node types (class_heritage, implements_clause, abstract). These type names don't exist in tree-sitter-typescript v0.23.0; the search will fail silently, preventing extends/implements/abstract from being detected.
Replace with field-based lookups:
- Use childForFieldName('superclass') instead of children.find(c => c.type === 'class_heritage')
- Use childForFieldName('super_interfaces') instead of children.find(c => c.type === 'implements_clause')
- Check the modifiers field (via childForFieldName('modifiers')) for the abstract modifier, not a standalone abstract node type
🤖 Prompt for AI Agents
In src/core/languages/typescript.ts around lines 81 to 123, the class extractor
currently looks for extends/implements/abstract via children.find(...) using
non-existent node types; replace those lookups with field-based calls: use
node.childForFieldName('superclass') for extends,
node.childForFieldName('super_interfaces') for implements (if present extract
the individual interfaces from that node, e.g., map its namedChildren to
getNodeText), and use node.childForFieldName('modifiers') to detect abstract
(check modifiers.namedChildren for a child with type 'abstract' or whose text
equals 'abstract'); update the returned extends/implements/abstract values
accordingly so they are populated when present.
… Ruby, PHP, Swift)
- Implement full symbol extraction for Go: functions, methods, structs, interfaces, imports, constants, variables
- Implement full symbol extraction for Rust: functions, structs, enums, traits, impl blocks, imports
- Implement full symbol extraction for Java: classes, interfaces, methods, enums, imports, fields
- Implement full symbol extraction for C++: functions, classes, structs, namespaces, enums, includes
- Implement full symbol extraction for C#: classes, interfaces, structs, methods, properties, namespaces
- Implement full symbol extraction for Ruby: classes, modules, methods, constants, requires
- Implement full symbol extraction for PHP: classes, interfaces, traits, functions, namespaces
- Add new Swift language support with full AST parsing (functions, classes, structs, enums, protocols)
- Add tree-sitter-swift@0.6.0 dependency
- Register Swift in language registry
All implementations follow the comprehensive pattern established by Python/TypeScript parsers, extracting detailed metadata about functions, classes, types, and their members for accurate codebase analysis.
…mprehensive tests
- Add file size limit to AST parser (default 10MB) to prevent OOM issues
- Add comprehensive JSDoc documentation to ASTParser class methods
- Improve error handling and logging throughout AST parsing pipeline
- Update README with AST output documentation and supported languages list
- Add symbol icons reference (ƒ functions, C classes, I interfaces, T types, E enums, v variables)
- Add CSV output format documentation to README
Tests:
- Add comprehensive AST parser test suite (astParser.test.ts)
- Add AST formatter test suite with 25+ test cases (astFormatter.test.ts)
- Add language-specific tests for Python and TypeScript parsers
- All tests passing (159 tests, 424 expectations)
Core improvements:
- Enhance cache version handling for AST symbols
- Improve scanner integration with AST extraction
- Add STRUCT and TRAIT symbol types to type definitions
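The file size limit described in this commit could be sketched roughly as follows. `parseWithSizeLimit` and `DEFAULT_MAX_AST_BYTES` are hypothetical names, and the actual ASTParser option is not shown in this thread; returning an empty array for oversized input is an assumption based on the parser's empty-array error handling noted earlier in the review.

```typescript
// 10MB default, taken from the commit message above.
const DEFAULT_MAX_AST_BYTES = 10 * 1024 * 1024;

// Skip parsing entirely when the file exceeds the limit.
// `parse` is whatever function turns source text into symbols.
function parseWithSizeLimit<T>(
  source: string,
  sizeBytes: number,
  parse: (text: string) => T[],
  maxBytes: number = DEFAULT_MAX_AST_BYTES,
): T[] {
  if (sizeBytes > maxBytes) {
    // Oversized files yield no symbols instead of risking an OOM.
    return [];
  }
  return parse(source);
}
```

Taking the size as a separate argument lets the caller reuse a size it already knows from the directory scan rather than measuring the string again.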
Actionable comments posted: 13
🧹 Nitpick comments (23)
src/core/languages/java.ts (1)
201-225: Consider enhancing import extraction for Java-specific patterns. The current implementation splits the import path and extracts the last component (e.g., "List" from "java.util.List"), which works for simple cases. However, it doesn't distinguish between:
- Regular imports: import java.util.List;
- Wildcard imports: import java.util.*;
- Static imports: import static java.lang.Math.PI;
Consider parsing these patterns separately to provide more detailed import information (e.g., marking wildcard imports or static imports with metadata).
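As a rough illustration of the three import forms, the sketch below classifies an import statement from its text alone. `classifyJavaImport` and the `JavaImportInfo` shape are hypothetical; a tree-based implementation in java.ts would inspect the syntax node's children instead of using a regex.

```typescript
type JavaImportKind = 'regular' | 'wildcard' | 'static';

// Hypothetical result shape for illustration only.
interface JavaImportInfo {
  kind: JavaImportKind;
  path: string; // full dotted path, including a trailing ".*" if present
  name: string; // last path component ("*" for wildcard imports)
}

function classifyJavaImport(statement: string): JavaImportInfo | null {
  const match = /^\s*import\s+(static\s+)?([\w.]+(?:\.\*)?)\s*;\s*$/.exec(statement);
  if (!match) return null;
  const path = match[2] ?? '';
  const isStatic = match[1] !== undefined;
  const isWildcard = path.endsWith('.*');
  const name = path.split('.').pop() ?? path;
  // Static takes precedence, so "import static a.B.*;" is reported as static.
  const kind: JavaImportKind = isStatic ? 'static' : isWildcard ? 'wildcard' : 'regular';
  return { kind, path, name };
}
```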
src/core/languages/cpp.ts (3)
44-54: Avoid calling childForFieldName twice. The parameter name extraction calls childForFieldName('declarator') twice: once in the condition (line 46) and again on line 47. Store the result in a variable to improve efficiency and readability.
Apply this diff:
  if (child.type === 'parameter_declaration') {
    const declaratorNode = child.childForFieldName('declarator');
    const typeNode = child.childForFieldName('type');
    if (declaratorNode) {
-     const name = declaratorNode.type === 'identifier' ? getNodeText(declaratorNode) :
-       declaratorNode.childForFieldName('declarator') ?
-       getNodeText(declaratorNode.childForFieldName('declarator')!) :
-       getNodeText(declaratorNode);
+     let name: string;
+     if (declaratorNode.type === 'identifier') {
+       name = getNodeText(declaratorNode);
+     } else {
+       const nestedDeclarator = declaratorNode.childForFieldName('declarator');
+       name = nestedDeclarator ? getNodeText(nestedDeclarator) : getNodeText(declaratorNode);
+     }
      params.push({
        name,
221-239: Consider more robust include parsing. The regex-based parsing handles common cases but may miss edge cases like escaped quotes, macros in include paths, or unusual whitespace. For the initial implementation, this is acceptable.
If you encounter parsing issues in real-world code, consider using tree-sitter's AST structure to extract the include path more reliably:
function extractImport(node: Parser.SyntaxNode): ImportSymbol | null {
  // tree-sitter-cpp parses the path as a string or system_lib_string child node
  const pathNode = node.childForFieldName('path');
  if (!pathNode) return null;
  const from = getNodeText(pathNode).replace(/^["<]|[">]$/g, '');
  return {
    name: from,
    type: ST.IMPORT,
    location: getLocation(node),
    from,
    imports: [from]
  };
}
111-120: Consider iterating direct children instead of descendantsOfType. Using descendantsOfType recursively searches all descendants, which can be less efficient than iterating through direct children or using childForFieldName when the structure is predictable.
For field declarations, you can use the tree-sitter field name directly:
// In extractClass (line 111-120)
} else if (child.type === 'field_declaration') {
  const declarator = child.childForFieldName('declarator');
  if (declarator) {
    const fieldName = declarator.type === 'field_identifier'
      ? getNodeText(declarator)
      : getNodeText(declarator.childForFieldName('declarator') || declarator);
    members.push({
      name: fieldName,
      type: ST.VARIABLE,
      location: getLocation(child)
    } as VariableSymbol);
  }
}
// Similar pattern for extractStruct (line 140-151)
Also applies to: 140-151
src/core/languages/csharp.ts (2)
236-261: Namespace extraction is incomplete. The extractNamespace function only extracts classes and interfaces (lines 245-251), but C# namespaces can contain other top-level declarations such as structs, enums, delegates, and nested namespaces.
Consider extending the namespace extraction to handle all declaration types:
  function extractNamespace(node: Parser.SyntaxNode): NamespaceSymbol | null {
    const nameNode = node.childForFieldName('name');
    if (!nameNode) return null;
    const members: ASTSymbol[] = [];
    const bodyNode = node.childForFieldName('body');
    if (bodyNode) {
      for (const child of bodyNode.namedChildren) {
        if (child.type === 'class_declaration') {
          const cls = extractClass(child);
          if (cls) members.push(cls);
        } else if (child.type === 'interface_declaration') {
          const iface = extractInterface(child);
          if (iface) members.push(iface);
+       } else if (child.type === 'struct_declaration') {
+         const struct = extractStruct(child);
+         if (struct) members.push(struct);
+       } else if (child.type === 'enum_declaration') {
+         const enumDecl = extractEnum(child);
+         if (enumDecl) members.push(enumDecl);
+       } else if (child.type === 'namespace_declaration') {
+         const ns = extractNamespace(child);
+         if (ns) members.push(ns);
        }
      }
    }
    return {
      name: getNodeText(nameNode),
      type: ST.NAMESPACE,
      location: getLocation(node),
      members
    };
  }
263-280: Using directive extraction is overly simplistic. C# using directives have different forms that aren't captured by the current implementation:
- Namespace import: using System.Collections.Generic;
- Static using: using static System.Math;
- Alias directive: using Alias = Some.Long.Namespace;
The current implementation only extracts the name field and treats it as both the from and a single imports entry, which doesn't accurately represent these different forms.
Consider enhancing the extraction to handle different using directive types:
function extractImport(node: Parser.SyntaxNode): ImportSymbol | null {
  const nameNode = node.childForFieldName('name');
  if (!nameNode) return null;
  const namespaceOrType = getNodeText(nameNode);
  const isStatic = node.children.some(c => c.type === 'static');
  // Check for alias (using Alias = Target;)
  const aliasNode = node.childForFieldName('alias');
  if (aliasNode) {
    return {
      name: getNodeText(aliasNode),
      type: ST.IMPORT,
      location: getLocation(node),
      from: namespaceOrType,
      imports: [getNodeText(aliasNode)]
    };
  }
  return {
    name: namespaceOrType,
    type: ST.IMPORT,
    location: getLocation(node),
    from: namespaceOrType,
    imports: isStatic ? [`static ${namespaceOrType}`] : [namespaceOrType]
  };
}
Note: Verify the tree-sitter-c-sharp grammar's field names for using directives to ensure accurate extraction.
src/core/languages/rust.ts (2)
19-32: Extract shared helper functions to reduce duplication. The getLocation and getNodeText functions are identical across multiple language configs (rust.ts, go.ts, and likely others). Consider extracting these to a shared utility module to improve maintainability and reduce code duplication.
For example, create src/core/languages/utils.ts:
import type Parser from 'tree-sitter';
import type { SourceLocation } from '../../types/index.js';

export function getLocation(node: Parser.SyntaxNode): SourceLocation {
  return {
    startLine: node.startPosition.row + 1,
    startColumn: node.startPosition.column,
    endLine: node.endPosition.row + 1,
    endColumn: node.endPosition.column,
    startByte: node.startIndex,
    endByte: node.endIndex
  };
}

export function getNodeText(node: Parser.SyntaxNode, sourceCode: string): string {
  return sourceCode.slice(node.startIndex, node.endIndex);
}
Then import and use in each language config.
34-61: Consider simplifying the parameter name extraction. The nested ternary operator at lines 45-47 is complex and reduces readability. Consider refactoring for clarity.
Apply this diff:
- if (patternNode) {
-   const name = patternNode.type === 'identifier' ? getNodeText(patternNode) :
-     patternNode.childForFieldName('name') ? getNodeText(patternNode.childForFieldName('name')!) :
-     getNodeText(patternNode);
+ if (patternNode) {
+   let name: string;
+   if (patternNode.type === 'identifier') {
+     name = getNodeText(patternNode);
+   } else {
+     const nameChild = patternNode.childForFieldName('name');
+     name = nameChild ? getNodeText(nameChild) : getNodeText(patternNode);
+   }
73-89: Method signature could include parameters and return type. The method signature at line 87 only includes the receiver and method name, but excludes parameters and return type. Consider including these for a more complete signature representation.
For example:
  return {
    name: getNodeText(nameNode),
    type: ST.METHOD,
    location: getLocation(node),
    parameters,
    returnType: resultNode ? getNodeText(resultNode) : undefined,
-   signature: receiverNode ? `func ${getNodeText(receiverNode)} ${getNodeText(nameNode)}` : undefined
+   signature: receiverNode ?
+     `func ${getNodeText(receiverNode)} ${getNodeText(nameNode)}(${parameters.map(p => `${p.name} ${p.type || ''}`).join(', ')})${resultNode ? ` ${getNodeText(resultNode)}` : ''}`
+     : undefined
  };
This would produce signatures like func (r *Receiver) methodName(param1 string, param2 int) error.
src/core/languages/ruby.ts (8)
106-124: Prefer NamespaceSymbol/ST.NAMESPACE for Ruby modules
Modules are namespaces/mixins; typing them as CLASS confuses downstream formatters. Switch to NamespaceSymbol + ST.NAMESPACE (types appear supported in src/types/index.ts ASTSymbol union).
Apply:
-import type { ASTSymbol, FunctionSymbol, ClassSymbol, ImportSymbol, VariableSymbol, SourceLocation, Parameter } from '../../types/index.js';
+import type { ASTSymbol, FunctionSymbol, ClassSymbol, NamespaceSymbol, ImportSymbol, VariableSymbol, SourceLocation, Parameter } from '../../types/index.js';
@@
- function extractModule(node: Parser.SyntaxNode): ClassSymbol | null {
+ function extractModule(node: Parser.SyntaxNode): NamespaceSymbol | null {
@@
- return {
+ return {
    name: getNodeText(nameNode),
-   type: ST.CLASS, // Use CLASS type for modules as they're similar
+   type: ST.NAMESPACE,
    location: getLocation(node),
    members
  };
  }
Confirm ST.NAMESPACE and NamespaceSymbol exist; if not, keep CLASS as a fallback and open a types follow-up.
34-56: Parameter extraction: handle method_parameters and splat/block/kw variants
Ruby exposes parameters under method.parameters: method_parameters; also supports splat/hash_splat/block params. Make parsing resilient. (tree-sitter.github.io)
Apply:
- function extractParameters(node: Parser.SyntaxNode): Parameter[] {
-   const params: Parameter[] = [];
-   const paramsNode = node.childForFieldName('parameters');
-   if (paramsNode) {
-     for (const child of paramsNode.namedChildren) {
-       if (child.type === 'identifier' || child.type === 'optional_parameter' || child.type === 'keyword_parameter') {
-         const name = child.type === 'identifier' ? getNodeText(child) :
-           child.childForFieldName('name') ? getNodeText(child.childForFieldName('name')!) :
-           getNodeText(child);
-         const defaultValue = child.childForFieldName('value');
-         params.push({
-           name,
-           defaultValue: defaultValue ? getNodeText(defaultValue) : undefined
-         });
-       }
-     }
-   }
-   return params;
- }
+ function extractParameters(node: Parser.SyntaxNode): Parameter[] {
+   const params: Parameter[] = [];
+   const paramsNode =
+     node.childForFieldName('parameters') ??
+     node.descendantsOfType('method_parameters')[0] ??
+     node.descendantsOfType('parameters')[0];
+   if (!paramsNode) return params;
+   const supported = new Set([
+     'identifier','optional_parameter','keyword_parameter',
+     'splat_parameter','hash_splat_parameter','block_parameter'
+   ]);
+   for (const p of paramsNode.namedChildren) {
+     if (!supported.has(p.type)) continue;
+     const nameNode = p.childForFieldName('name') ?? (p.type === 'identifier' ? p : p.child(0));
+     const defaultNode = p.childForFieldName('value') ?? p.childForFieldName('default') ?? null;
+     params.push({
+       name: nameNode ? getNodeText(nameNode) : '?',
+       optional: p.type.includes('optional') || p.type.includes('keyword'),
+       defaultValue: defaultNode ? getNodeText(defaultNode) : undefined,
+     });
+   }
+   return params;
+ }
85-95: Deduplicate instance variables and add variableType metadata
Instance variables can appear many times; emit once per name and tag as instance for clarity.
Apply:
-   for (const child of node.descendantsOfType('instance_variable')) {
+   const seenIvars = new Set<string>();
+   for (const child of node.descendantsOfType('instance_variable')) {
      const varName = getNodeText(child);
      if (varName) {
-       members.push({
+       if (seenIvars.has(varName)) continue;
+       seenIvars.add(varName);
+       members.push({
          name: varName,
          type: ST.VARIABLE,
-         location: getLocation(child)
+         location: getLocation(child),
+         variableType: 'instance'
        } as VariableSymbol);
      }
    }
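The first-occurrence dedupe suggested above can be sketched in isolation. This is a minimal illustration, not the PR's code: `IvarOccurrence` and `dedupeIvars` are hypothetical names standing in for the real `VariableSymbol` handling in ruby.ts.

```typescript
// Hypothetical minimal shape; the PR's real VariableSymbol lives in src/types/index.ts.
interface IvarOccurrence {
  name: string;      // e.g. "@count"
  startLine: number; // 1-based line of this occurrence
}

// Keep only the first occurrence of each instance variable, preserving source order.
function dedupeIvars(occurrences: IvarOccurrence[]): IvarOccurrence[] {
  const seen = new Set<string>();
  const unique: IvarOccurrence[] = [];
  for (const occ of occurrences) {
    if (seen.has(occ.name)) continue; // already emitted under this name
    seen.add(occ.name);
    unique.push(occ);
  }
  return unique;
}

const sampleIvars: IvarOccurrence[] = [
  { name: '@name', startLine: 3 },
  { name: '@age', startLine: 4 },
  { name: '@name', startLine: 9 }, // repeat, dropped
];
const uniqueIvars = dedupeIvars(sampleIvars);
console.log(uniqueIvars.map(v => `${v.name}:${v.startLine}`).join(', '));
// prints "@name:3, @age:4"
```

Keeping the first occurrence means the reported location points at where the variable first appears in the class body, which is usually the assignment in `initialize`.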
126-153: Require/include parsing: support argument_list, constants, and multiple args
Ruby call nodes expose method and arguments: argument_list; include often uses constants (not strings) and can take multiple modules. Parse all args and normalize text; set imports to all, from to first. (tree-sitter.github.io)
Apply:
- function extractRequire(node: Parser.SyntaxNode): ImportSymbol | null {
+ function extractRequire(node: Parser.SyntaxNode): ImportSymbol | null {
    // Look for method calls with 'require' or 'require_relative'
    const methodNode = node.childForFieldName('method');
    if (!methodNode) return null;
    const methodName = getNodeText(methodNode);
    if (methodName !== 'require' && methodName !== 'require_relative' && methodName !== 'include') {
      return null;
    }
-   const args = node.childForFieldName('arguments');
-   let from = '';
-
-   if (args) {
-     const stringNode = args.namedChildren[0];
-     if (stringNode) {
-       from = getNodeText(stringNode).replace(/['"]/g, '');
-     }
-   }
-
-   return {
-     name: from,
-     type: ST.IMPORT,
-     location: getLocation(node),
-     from,
-     imports: [from]
-   };
+   const args = node.childForFieldName('arguments');
+   const imports: string[] = [];
+   if (args) {
+     for (const a of args.namedChildren) {
+       const raw = getNodeText(a);
+       imports.push(raw.replace(/['"]/g, ''));
+     }
+   }
+   const from = imports[0] ?? methodName;
+   return {
+     name: from,
+     type: ST.IMPORT,
+     location: getLocation(node),
+     from,
+     imports: imports.length ? imports : [from],
+   };
  }
To be safe, confirm tree-sitter-ruby emits fields method and arguments: argument_list on call, as shown in docs. If your local grammar differs, adjust field names accordingly.
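The argument normalization in that diff can be exercised without a parser. The sketch below assumes the raw source text of each argument has already been collected; `normalizeImportArgs` is a hypothetical helper name, not something in the PR.

```typescript
// Hypothetical helper: rawArgs holds the source text of each call argument,
// e.g. ['"json"'] for `require "json"` or ['Comparable', 'Enumerable'] for `include`.
function normalizeImportArgs(
  methodName: string,
  rawArgs: string[],
): { from: string; imports: string[] } {
  // Strip quote characters so string and constant arguments look alike.
  const imports = rawArgs.map(raw => raw.replace(/['"]/g, ''));
  // With no arguments, fall back to the method name, as the suggested diff does.
  const from = imports[0] ?? methodName;
  return { from, imports: imports.length ? imports : [from] };
}

console.log(normalizeImportArgs('require', ['"json"']));
// from is "json", imports is ["json"]
console.log(normalizeImportArgs('include', ['Comparable', 'Enumerable']));
// from is "Comparable", imports is ["Comparable", "Enumerable"]
```

Note that the fallback makes a bare `require` with no arguments report itself as its own source, which may or may not be what the formatter expects.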
10-13: Make loadGrammar robust to ESM/CJS export shapes
Some language packages export default, others export language; guard both to avoid runtime errors in Bun/Node.
Apply:
-  loadGrammar: async () => {
-    const RubyLanguage = await import('tree-sitter-ruby');
-    return RubyLanguage.default;
-  },
+  loadGrammar: async () => {
+    const mod: any = await import('tree-sitter-ruby');
+    return (mod?.default ?? mod?.language ?? mod);
+  },
If you’ve standardized on one export shape across languages, align with that instead.
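The export-shape guard can be pulled into a small helper and checked against plain objects. This is a sketch under the assumption that a grammar module exposes either `default`, `language`, or is itself the grammar; `resolveGrammarExport` is a hypothetical name.

```typescript
// Hypothetical helper mirroring the suggested `mod?.default ?? mod?.language ?? mod` guard,
// written against `unknown` so no `any` is needed.
function resolveGrammarExport(mod: unknown): unknown {
  if (mod !== null && typeof mod === 'object') {
    const record = mod as Record<string, unknown>;
    return record['default'] ?? record['language'] ?? mod;
  }
  return mod;
}

// Stand-ins for the three module shapes; real grammars are native objects.
const fakeGrammar = { name: 'ruby' };
console.log(resolveGrammarExport({ default: fakeGrammar }) === fakeGrammar);  // true
console.log(resolveGrammarExport({ language: fakeGrammar }) === fakeGrammar); // true
console.log(resolveGrammarExport(fakeGrammar) === fakeGrammar);               // true
```

A `loadGrammar` implementation could then return `resolveGrammarExport(await import('tree-sitter-ruby'))`, keeping the interop logic in one place for every language config.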
169-201: Traversal: consider top-level-only loop to reduce work
You recurse entire tree but only handle top-level nodes. Iterate rootNode.namedChildren and drop recursion for a small speed win.
Apply:
-  function traverse(node: Parser.SyntaxNode) {
-    const isTopLevel = node.parent?.type === 'program';
-    if (isTopLevel) {
-      // existing cases...
-    }
-    for (const child of node.children) {
-      traverse(child);
-    }
-  }
-  traverse(rootNode);
+  for (const node of rootNode.namedChildren) {
+    // existing top-level cases...
+  }
34-70: Method metadata: consider capturing receiver for singleton methods and signature
Optional: if method has a receiver (def self.foo), record it (e.g., signature or name like self.foo) for clarity.
Apply:
- function extractMethod(node: Parser.SyntaxNode): FunctionSymbol | null {
+ function extractMethod(node: Parser.SyntaxNode): FunctionSymbol | null {
    const nameNode = node.childForFieldName('name');
    if (!nameNode) return null;
+   const receiver = node.childForFieldName('receiver');
@@
-   return {
+   return {
      name: getNodeText(nameNode),
      type: ST.METHOD,
      location: getLocation(node),
-     parameters
+     parameters,
+     signature: receiver ? `${getNodeText(receiver)}.${getNodeText(nameNode)}` : undefined,
    };
  }
85-95: Minor: sort/dedupe imports and ivars for stable output
After building members, consider sorting and Set-based dedupe to keep formatter output deterministic.
Also applies to: 126-153
src/core/languages/swift.ts (6)
77-118: Split class inheritance into extends (superclass) and implements (protocols).
Currently you store the entire inheritance clause in extends. Parse identifiers; first is superclass (if any), remainder are protocols. ClassSymbol supports implements?: string[] (see src/types/index.ts).
Apply this diff:
@@
-   const inheritanceNode = node.childForFieldName('inheritance');
+   const inheritanceNode = node.childForFieldName('inheritance');
+   const heritage = inheritanceNode
+     ? inheritanceNode.descendantsOfType(['type_identifier', 'scoped_identifier']).map(getNodeText)
+     : [];
@@
-     extends: inheritanceNode ? getNodeText(inheritanceNode) : undefined,
+     extends: heritage.length ? heritage[0] : undefined,
+     implements: heritage.length > 1 ? heritage.slice(1) : undefined,
      members
    };
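The first-is-superclass split can be sketched on plain strings. `splitHeritage` is a hypothetical helper, and it inherits the limitation of the suggestion itself: syntax alone cannot tell a superclass from a protocol, so a class that only conforms to protocols (for example `class Foo: Codable`) would have its first protocol reported under `extends`.

```typescript
// Hypothetical helper: names are the identifiers from the inheritance clause, in source order.
function splitHeritage(
  names: string[],
): { extends: string | undefined; implements: string[] | undefined } {
  const [first, ...rest] = names;
  return {
    extends: first,                                 // undefined when the clause is empty
    implements: rest.length > 0 ? rest : undefined, // remaining entries treated as protocols
  };
}

console.log(splitHeritage(['UIViewController', 'UITableViewDelegate', 'Codable']));
// extends is "UIViewController", implements is ["UITableViewDelegate", "Codable"]
console.log(splitHeritage([]));
// both fields are undefined
```

If the distinction matters downstream, resolving the first name against declarations seen in the same scan would be one way to tighten this, at the cost of a second pass.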
185-226: Protocol inheritance should be an array of protocols, not a single joined string.
Emit InterfaceSymbol.extends as string[] of type identifiers.
Apply this diff:
@@
-   const inheritanceNode = node.childForFieldName('inheritance');
+   const inheritanceNode = node.childForFieldName('inheritance');
@@
-   return {
+   const inherited = inheritanceNode
+     ? inheritanceNode.descendantsOfType(['type_identifier', 'scoped_identifier']).map(getNodeText)
+     : [];
+   return {
      name: getNodeText(nameNode),
      type: ST.INTERFACE,
      location: getLocation(node),
-     extends: inheritanceNode ? [getNodeText(inheritanceNode)] : undefined,
+     extends: inherited.length ? inherited : undefined,
      members
    };
297-299: Avoid traversing punctuation/anonymous nodes.
Use namedChildren to cut noise and speed up traversal.
Apply this diff:
-   for (const child of node.children) {
+   for (const child of node.namedChildren) {
      traverse(child);
    }
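To see why skipping anonymous nodes reduces work, here is a toy traversal over a hand-built tree. `MiniNode` is a hypothetical stand-in for tree-sitter's syntax node, with a `named` flag playing the role that distinguishes `namedChildren` from `children`.

```typescript
// Hypothetical minimal node; real tree-sitter nodes expose children and namedChildren directly.
interface MiniNode {
  type: string;
  named: boolean; // false for punctuation and other anonymous tokens
  children: MiniNode[];
}

// Count how many nodes a recursive walk touches, with or without anonymous nodes.
function countVisits(root: MiniNode, namedOnly: boolean): number {
  let visits = 0;
  const walk = (node: MiniNode): void => {
    visits++;
    const next = namedOnly ? node.children.filter(c => c.named) : node.children;
    for (const child of next) walk(child);
  };
  walk(root);
  return visits;
}

const leaf = (type: string, named: boolean): MiniNode => ({ type, named, children: [] });
const body: MiniNode = {
  type: 'class_body',
  named: true,
  children: [leaf('{', false), leaf('function_declaration', true), leaf('}', false)],
};
console.log(countVisits(body, false)); // 4
console.log(countVisits(body, true));  // 2
```

The saving grows with the amount of punctuation in the source, since every brace, comma, and keyword token is an anonymous node.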
228-247: Import representation duplicates module name in both from and imports.
Swift imports usually bring a module (optionally with kind: struct/class/enum). Consider:
- from: module name (e.g., Foundation)
- imports: [] (or specific symbol when using import kind syntax)
This avoids redundancy and matches ImportSymbol semantics.
Please confirm how the formatter expects ImportSymbol for single-module imports to avoid regressions.
153-183: Enum members: raw values/associated values are dropped.
If feasible, capture raw-value expressions (e.g., case a = 1) in members[].value; associated values could be reflected in value as a signature string.
I can extend this with a minimal extractor for raw values using descendantsOfType on enum_case patterns.
10-13: Grammar import interop/typing.
Dynamic import shape can vary (CJS vs ESM). Add a safe fallback and clarify return typing.
Apply this diff:
   loadGrammar: async () => {
-    const SwiftLanguage = await import('tree-sitter-swift');
-    return SwiftLanguage.default;
+    const mod: any = await import('tree-sitter-swift');
+    return (mod.default ?? mod) as unknown; // Parser.Language
   },
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
bun.lock is excluded by !**/*.lock
📒 Files selected for processing (10)
package.json (2 hunks)
src/core/languages/cpp.ts (1 hunks)
src/core/languages/csharp.ts (1 hunks)
src/core/languages/go.ts (1 hunks)
src/core/languages/index.ts (1 hunks)
src/core/languages/java.ts (1 hunks)
src/core/languages/php.ts (1 hunks)
src/core/languages/ruby.ts (1 hunks)
src/core/languages/rust.ts (1 hunks)
src/core/languages/swift.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- src/core/languages/index.ts
- package.json
- src/core/languages/php.ts
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{ts,tsx,js,jsx}: Do not use dotenv; Bun loads .env automatically
Use Bun.serve() for HTTP/WebSocket/HTTPS routes; do not use Express
Use bun:sqlite for SQLite; do not use better-sqlite3
Use Bun.redis for Redis; do not use ioredis
Use Bun.sql for Postgres; do not use pg or postgres.js
Use built-in WebSocket; do not use ws
Prefer Bun.file over node:fs readFile/writeFile
Use Bun.$ for shelling out instead of execa
Files:
src/core/languages/swift.ts
src/core/languages/go.ts
src/core/languages/rust.ts
src/core/languages/ruby.ts
src/core/languages/java.ts
src/core/languages/cpp.ts
src/core/languages/csharp.ts
🧬 Code graph analysis (7)
src/core/languages/swift.ts (2)
src/core/languages/index.ts (1)
LanguageConfig (4-9)
src/types/index.ts (10)
ASTSymbol (221-232), SourceLocation (129-136), Parameter (138-143), FunctionSymbol (153-160), ClassSymbol (162-168), VariableSymbol (186-190), StructSymbol (211-214), EnumSymbol (181-184), InterfaceSymbol (170-174), ImportSymbol (192-198)
src/core/languages/go.ts (2)
src/core/languages/index.ts (1)
LanguageConfig (4-9)
src/types/index.ts (8)
ASTSymbol (221-232), SourceLocation (129-136), Parameter (138-143), FunctionSymbol (153-160), StructSymbol (211-214), InterfaceSymbol (170-174), ImportSymbol (192-198), VariableSymbol (186-190)
src/core/languages/rust.ts (2)
src/core/languages/index.ts (1)
LanguageConfig (4-9)
src/types/index.ts (9)
ASTSymbol (221-232), SourceLocation (129-136), Parameter (138-143), FunctionSymbol (153-160), StructSymbol (211-214), EnumSymbol (181-184), TraitSymbol (216-219), ImportSymbol (192-198), VariableSymbol (186-190)
src/core/languages/ruby.ts (2)
src/core/languages/index.ts (1)
LanguageConfig (4-9)
src/types/index.ts (7)
ASTSymbol (221-232), SourceLocation (129-136), Parameter (138-143), FunctionSymbol (153-160), ClassSymbol (162-168), VariableSymbol (186-190), ImportSymbol (192-198)
src/core/languages/java.ts (2)
src/core/languages/index.ts (1)
LanguageConfig (4-9)
src/types/index.ts (9)
ASTSymbol (221-232), SourceLocation (129-136), Parameter (138-143), FunctionSymbol (153-160), ClassSymbol (162-168), VariableSymbol (186-190), InterfaceSymbol (170-174), EnumSymbol (181-184), ImportSymbol (192-198)
src/core/languages/cpp.ts (2)
src/core/languages/index.ts (1)
LanguageConfig (4-9)
src/types/index.ts (10)
ASTSymbol (221-232), SourceLocation (129-136), Parameter (138-143), FunctionSymbol (153-160), ClassSymbol (162-168), VariableSymbol (186-190), StructSymbol (211-214), EnumSymbol (181-184), NamespaceSymbol (206-209), ImportSymbol (192-198)
src/core/languages/csharp.ts (2)
src/core/languages/index.ts (1)
LanguageConfig (4-9)
src/types/index.ts (11)
ASTSymbol (221-232), SourceLocation (129-136), Parameter (138-143), FunctionSymbol (153-160), VariableSymbol (186-190), ClassSymbol (162-168), InterfaceSymbol (170-174), StructSymbol (211-214), EnumSymbol (181-184), NamespaceSymbol (206-209), ImportSymbol (192-198)
🪛 Biome (2.1.2)
src/core/languages/java.ts
[error] 99-99: Do not shadow the global "constructor" property.
Consider renaming this variable. It's easy to confuse the origin of variables when they're named after a known global.
(lint/suspicious/noShadowRestrictedNames)
src/core/languages/csharp.ts
[error] 106-106: Do not shadow the global "constructor" property.
Consider renaming this variable. It's easy to confuse the origin of variables when they're named after a known global.
(lint/suspicious/noShadowRestrictedNames)
🔇 Additional comments (21)
src/core/languages/java.ts (7)
1-13: LGTM!
The imports and configuration setup follow the established pattern. Dynamic grammar loading is correct.
15-32: LGTM!
The helper functions for location mapping and text extraction are correctly implemented.
34-65: LGTM!
Parameter extraction correctly handles both regular parameters and Java varargs (spread_parameter).
67-81: LGTM!
Method extraction is correctly implemented with proper null handling and type defaults.
173-199: LGTM!
Enum extraction correctly handles enum constants and matches the EnumSymbol interface.
227-249: LGTM!
The traversal logic correctly filters top-level declarations and recursively processes the AST.
251-253: LGTM!
The extraction execution is straightforward and correct.
src/core/languages/cpp.ts (2)
6-13: LGTM! Configuration is well-structured.
The language configuration correctly defines C++ support with comprehensive file extensions and lazy grammar loading.
15-273: Excellent implementation of C++ symbol extraction!
The extractSymbols function is comprehensive and well-organized. It correctly:
- Extracts all major C++ constructs (functions, classes, structs, enums, namespaces, includes)
- Uses helper functions for code reusability
- Follows the LanguageConfig interface contract
- Handles AST traversal with proper top-level scope checking
The minor refactoring suggestions above will enhance efficiency and robustness, but the core implementation is solid and ready for use.
src/core/languages/csharp.ts (3)
1-13: LGTM!
The imports and configuration structure are correct, following the established pattern for language configs with lazy grammar loading.
19-89: LGTM!
The helper functions correctly handle location mapping, text extraction, and parameter/method/property extraction with appropriate type information.
282-314: LGTM!
The traverse function correctly identifies top-level declarations and recursively processes the AST to extract all symbols.
src/core/languages/rust.ts (3)
1-13: LGTM!
The imports and grammar loading follow the established pattern from other language configs. The async dynamic import ensures lazy loading of the tree-sitter grammar.
63-166: LGTM!
The symbol extraction functions correctly handle Rust-specific constructs (functions, structs, enums, traits) and properly transform function types to methods within traits. The logic aligns well with Rust's AST structure.
212-261: LGTM!
The traversal logic comprehensively handles Rust's top-level declarations and correctly extracts methods from impl blocks. The recursive traversal ensures all nested structures are visited.
src/core/languages/go.ts (3)
1-13: LGTM!
The imports and grammar loading are correct and consistent with the other language configs. The previous linting issue with unused parameters has been properly addressed.
91-159: LGTM!
The struct and interface extraction logic correctly handles Go's type system and AST structure. The method spec extraction properly creates METHOD symbols with parameters and return types.
213-259: LGTM!
The traversal logic comprehensively handles Go's declaration types and correctly processes grouped constant and variable declarations. The type distinction between structs and interfaces is handled properly.
src/core/languages/ruby.ts (1)
126-153: Field names are correct per tree-sitter-ruby documentation.
The code correctly uses method and arguments fields on call nodes, matching the confirmed tree-sitter-ruby grammar. No adjustments needed.
src/core/languages/swift.ts (2)
19-28: The review comment assumes inconsistency across language configurations, but verification confirms all 10 language files (swift, typescript, ruby, rust, python, php, java, go, cpp, csharp) follow an identical, intentional convention: 1-based line indexing and 0-based column indexing. Swift.ts is already correctly aligned with the established global convention. No changes are needed.
Likely an incorrect or invalid review comment.
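The indexing convention described in the 19-28 comment (1-based lines, 0-based columns) follows from tree-sitter reporting both row and column zero-based. A minimal sketch of the mapping, using hypothetical `Point` and `MiniLocation` types rather than the PR's `SourceLocation`:

```typescript
// Zero-based position, as tree-sitter reports startPosition and endPosition.
interface Point {
  row: number;
  column: number;
}

// Hypothetical trimmed-down location; the PR's SourceLocation also carries byte offsets.
interface MiniLocation {
  startLine: number;   // 1-based, matches editor line numbers
  startColumn: number; // 0-based, passed through unchanged
  endLine: number;
  endColumn: number;
}

function toLocation(start: Point, end: Point): MiniLocation {
  return {
    startLine: start.row + 1,
    startColumn: start.column,
    endLine: end.row + 1,
    endColumn: end.column,
  };
}

// A node starting on the first line of the file, indented four columns.
console.log(toLocation({ row: 0, column: 4 }, { row: 2, column: 1 }));
// startLine 1, startColumn 4, endLine 3, endColumn 1
```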
6-9: Registry integration verified successfully.
All checks confirmed:
- SwiftConfig is properly exported and imported in src/core/languages/index.ts (line 44)
- SwiftConfig is registered via registerLanguage(SwiftConfig) (line 57)
- Grammar declaration tree-sitter-swift is present in package.json
- SwiftConfig includes complete loadGrammar and extractSymbols implementations
src/core/languages/csharp.ts
Outdated
function extractClass(node: Parser.SyntaxNode): ClassSymbol | null {
  const nameNode = node.childForFieldName('name');
  if (!nameNode) return null;

  const members: ASTSymbol[] = [];
  const bodyNode = node.childForFieldName('body');
  const basesNode = node.childForFieldName('bases');
  const isAbstract = node.children.some(c => c.type === 'abstract');

  if (bodyNode) {
    for (const child of bodyNode.namedChildren) {
      if (child.type === 'method_declaration') {
        const method = extractMethod(child);
        if (method) members.push(method);
      } else if (child.type === 'constructor_declaration') {
        const constructor = extractMethod(child);
        if (constructor) {
          constructor.name = 'constructor';
          members.push(constructor);
        }
      } else if (child.type === 'property_declaration') {
        const property = extractProperty(child);
        if (property) members.push(property);
      } else if (child.type === 'field_declaration') {
        const declarator = child.descendantsOfType('variable_declarator')[0];
        if (declarator) {
          const fieldName = declarator.childForFieldName('name');
          const typeNode = child.childForFieldName('type');
          if (fieldName) {
            members.push({
              name: getNodeText(fieldName),
              type: ST.VARIABLE,
              location: getLocation(declarator),
              variableType: typeNode ? getNodeText(typeNode) : undefined
            } as VariableSymbol);
          }
        }
      }
    }
  }

  return {
    name: getNodeText(nameNode),
    type: ST.CLASS,
    location: getLocation(node),
    extends: basesNode ? getNodeText(basesNode) : undefined,
    members,
    abstract: isAbstract
  };
}
Incorrect handling of C# class inheritance and interface implementation.
In C#, a class declaration can have both a base class and implemented interfaces (e.g., class Foo : BaseClass, IInterface1, IInterface2). The current implementation at line 136 assigns the entire bases clause as a single string to the extends field, but:
- The ClassSymbol type has separate fields: extends?: string (for the base class) and implements?: string[] (for interfaces)
- The first item in the bases list is the base class (if it exists), and subsequent items are interfaces
The bases need to be parsed and split appropriately to populate both extends and implements fields.
Consider parsing the bases clause to distinguish between the base class and interfaces:
function extractClass(node: Parser.SyntaxNode): ClassSymbol | null {
const nameNode = node.childForFieldName('name');
if (!nameNode) return null;
const members: ASTSymbol[] = [];
const bodyNode = node.childForFieldName('body');
const basesNode = node.childForFieldName('bases');
const isAbstract = node.children.some(c => c.type === 'abstract');
+
+ let extendsClass: string | undefined;
+ let implementsInterfaces: string[] | undefined;
+
+ if (basesNode) {
+ const basesList = basesNode.namedChildren.map(c => getNodeText(c));
+ if (basesList.length > 0) {
+ // In C#, if there's a base class, it comes first
+ // Heuristic: if first item doesn't start with 'I' followed by uppercase, treat as base class
+ // More robust: check tree-sitter node types or use semantic analysis
+ extendsClass = basesList[0];
+ if (basesList.length > 1) {
+ implementsInterfaces = basesList.slice(1);
+ }
+ }
+ }
if (bodyNode) {
// ... member extraction ...
}
return {
name: getNodeText(nameNode),
type: ST.CLASS,
location: getLocation(node),
- extends: basesNode ? getNodeText(basesNode) : undefined,
+ extends: extendsClass,
+ implements: implementsInterfaces,
members,
abstract: isAbstract
};
}
Note: Distinguishing between a base class and interfaces may require additional logic or tree-sitter node type analysis, as C# naming conventions (interfaces starting with 'I') are not always reliable.
🧰 Tools
🪛 Biome (2.1.2)
[error] 106-106: Do not shadow the global "constructor" property.
Consider renaming this variable. It's easy to confuse the origin of variables when they're named after a known global.
(lint/suspicious/noShadowRestrictedNames)
Actionable comments posted: 4
🧹 Nitpick comments (3)
test/languages/python.test.ts (1)
11-18: Initialize once with a simple guard; acceptable as-is, but consider beforeAll for clarity.
Current initParser works; beforeAll(async () => { ... }) would be more idiomatic and avoids repeated awaits.
test/astParser.test.ts (1)
236-240: Simplify redundant non-null assertion check.
The condition if (funcSymbol! && 'parameters' in funcSymbol!) carries redundant non-null assertions: the ! operator only affects type checking and is erased at runtime, so funcSymbol! && behaves exactly like funcSymbol &&. Inside a truthiness check the assertions add nothing.
Apply this diff to simplify:
- if (funcSymbol! && 'parameters' in funcSymbol!) {
+ if (funcSymbol && 'parameters' in funcSymbol) {
    expect(funcSymbol!.parameters.length).toBe(3);
    expect(funcSymbol.parameters[0].name).toBe('id');
    // Note: default values and optional flags may vary by tree-sitter parser version
Alternatively, if you're confident funcSymbol is defined (due to the previous assertion on line 235), you can keep the non-null assertion but remove the redundant check:
- if (funcSymbol! && 'parameters' in funcSymbol!) {
+ if ('parameters' in funcSymbol!) {
src/core/astParser.ts (1)
159-168: Avoid hardcoding the list of implemented languages.
The hardcoded list of implemented languages (['TypeScript', 'JavaScript', 'Python']) at lines 162-163 creates a maintenance burden. When new language extractors are added, developers must remember to update this list, or users will see misleading "not yet implemented" warnings for working implementations.
Consider one of these approaches:
Option 1 (Preferred): Add an isImplemented flag to the LanguageConfig interface:
In src/core/languages/index.ts:
export interface LanguageConfig {
  name: string;
  extensions: string[];
  loadGrammar: () => Promise<unknown>;
  extractSymbols: (tree: Parser.Tree, sourceCode: string) => ASTSymbol[];
  isImplemented?: boolean; // Add this flag
}
Then update this code to:
- const isStubImplementation = languageConfig.name &&
-   !['TypeScript', 'JavaScript', 'Python'].includes(languageConfig.name);
+ const isStubImplementation = languageConfig.isImplemented === false;
Option 2: Remove the warning entirely and let users discover unsupported languages through empty results, which is already the documented behavior ("returns empty array if language unsupported").
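A short sketch of how the Option 1 flag would behave. `MiniLanguageConfig` is a hypothetical trimmed-down version of `LanguageConfig`, and the config objects are made up for illustration; the point is that `=== false` treats a missing flag as "implemented", so existing configs need no change.

```typescript
// Hypothetical trimmed-down config; the real LanguageConfig also has loadGrammar and extractSymbols.
interface MiniLanguageConfig {
  name: string;
  extensions: string[];
  isImplemented?: boolean; // flag proposed in the review; absent means "implemented"
}

// Only an explicit `false` marks a stub, so configs without the flag are unaffected.
function isStub(config: MiniLanguageConfig): boolean {
  return config.isImplemented === false;
}

const pythonLike: MiniLanguageConfig = { name: 'Python', extensions: ['.py'] };
const stubLike: MiniLanguageConfig = { name: 'Example', extensions: ['.ex'], isImplemented: false };

console.log(isStub(pythonLike)); // false
console.log(isStub(stubLike));   // true
```

The trade-off is that a new stub must opt in to the warning explicitly; forgetting the flag silences it rather than raising a false alarm.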
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
README.md (2 hunks)
src/core/astParser.ts (1 hunks)
src/core/cache.ts (1 hunks)
src/core/scanner.ts (3 hunks)
src/types/index.ts (4 hunks)
test/astFormatter.test.ts (1 hunks)
test/astParser.test.ts (1 hunks)
test/languages/python.test.ts (1 hunks)
test/languages/typescript.test.ts (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- src/core/cache.ts
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{ts,tsx,js,jsx}: Do not use dotenv; Bun loads .env automatically
Use Bun.serve() for HTTP/WebSocket/HTTPS routes; do not use Express
Use bun:sqlite for SQLite; do not use better-sqlite3
Use Bun.redis for Redis; do not use ioredis
Use Bun.sql for Postgres; do not use pg or postgres.js
Use built-in WebSocket; do not use ws
Prefer Bun.file over node:fs readFile/writeFile
Use Bun.$ for shelling out instead of execa
Files:
src/core/astParser.ts
test/languages/typescript.test.ts
test/astFormatter.test.ts
test/languages/python.test.ts
src/types/index.ts
test/astParser.test.ts
src/core/scanner.ts
**/*.test.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Write tests using Bun’s test API: import from 'bun:test' and use test/expect
Files:
test/languages/typescript.test.ts
test/astFormatter.test.ts
test/languages/python.test.ts
test/astParser.test.ts
src/types/index.ts
📄 CodeRabbit inference engine (CLAUDE.md)
Keep core type definitions in src/types/index.ts
Files:
src/types/index.ts
src/core/scanner.ts
📄 CodeRabbit inference engine (CLAUDE.md)
Implement file scanning logic in src/core/scanner.ts
Files:
src/core/scanner.ts
🧠 Learnings (2)
📚 Learning: 2025-09-12T14:25:55.847Z
Learnt from: CR
PR: agentinit/contextcalc#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-12T14:25:55.847Z
Learning: Applies to **/*.test.{ts,tsx,js,jsx} : Write tests using Bun’s test API: import from 'bun:test' and use test/expect
Applied to files:
test/astParser.test.ts
📚 Learning: 2025-09-12T14:25:55.847Z
Learnt from: CR
PR: agentinit/contextcalc#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-12T14:25:55.847Z
Learning: Applies to src/core/scanner.ts : Implement file scanning logic in src/core/scanner.ts
Applied to files:
src/core/scanner.ts
🧬 Code graph analysis (6)
src/core/astParser.ts (3)
src/utils/pathUtils.ts (1)
parseFileSize (28-51)
src/core/languages/index.ts (2)
initializeLanguages (32-58), getLanguageByExtension (24-26)
src/types/index.ts (2)
ASTOptions (235-240), ASTSymbol (222-233)
test/languages/typescript.test.ts (1)
src/core/languages/typescript.ts (1)
TypeScriptConfig (6-349)
test/astFormatter.test.ts (2)
src/types/index.ts (7)
FileNode (1-11), ScanResult (105-111), FunctionSymbol (154-161), ClassSymbol (163-169), InterfaceSymbol (171-175), ImportSymbol (193-199), VariableSymbol (187-191)
src/formatters/astFormatter.ts (1)
formatAsAST (11-233)
test/languages/python.test.ts (1)
src/core/languages/python.ts (1)
PythonConfig (6-163)
test/astParser.test.ts (1)
src/core/astParser.ts (1)
ASTParser (12-187)
src/core/scanner.ts (7)
src/core/cache.ts (1)
CacheManager (9-111)
src/core/tokenizer.ts (1)
Tokenizer (6-78)
src/utils/ignoreParser.ts (1)
IgnoreManager (148-210)
src/core/astParser.ts (1)
ASTParser (12-187)
src/utils/pathUtils.ts (1)
parseFileSize (28-51)
src/types/index.ts (1)
FileNode (1-11)
src/utils/fileDetector.ts (1)
getFileTypeFromExtension (120-123)
🪛 ESLint
test/languages/typescript.test.ts
[error] 1-1: Do not use "@ts-nocheck" because it alters compilation errors.
(@typescript-eslint/ban-ts-comment)
test/astFormatter.test.ts
[error] 1-1: Do not use "@ts-nocheck" because it alters compilation errors.
(@typescript-eslint/ban-ts-comment)
test/languages/python.test.ts
[error] 1-1: Do not use "@ts-nocheck" because it alters compilation errors.
(@typescript-eslint/ban-ts-comment)
test/astParser.test.ts
[error] 1-1: Do not use "@ts-nocheck" because it alters compilation errors.
(@typescript-eslint/ban-ts-comment)
🪛 GitHub Actions: CI
test/astFormatter.test.ts
[error] 1-1: ESLint (ban-ts-comment): Do not use '@ts-nocheck' because it alters compilation errors. (error triggered by command 'eslint src/ test/')
🪛 GitHub Check: Test and Build (18)
test/languages/typescript.test.ts
[failure] 1-1:
Do not use "@ts-nocheck" because it alters compilation errors
test/astFormatter.test.ts
[failure] 1-1:
Do not use "@ts-nocheck" because it alters compilation errors
test/languages/python.test.ts
[failure] 1-1:
Do not use "@ts-nocheck" because it alters compilation errors
test/astParser.test.ts
[failure] 1-1:
Do not use "@ts-nocheck" because it alters compilation errors
🪛 markdownlint-cli2 (0.18.1)
README.md
186-186: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (9)
README.md (2)
181-201: AST output docs look clear and consistent with CLI and formatter.
Section reads well and matches the new --output ast mode.
240-240: Options table correctly adds ast.
Accurate default and values. No further action.
test/languages/typescript.test.ts (2)
117-133: Notes on expectations.
Comments acknowledge parser-specific nuances (implements/abstract). Tests are reasonable and won’t be flaky.
Also applies to: 135-149, 150-163
1-1: Replace @ts-nocheck with proper types (fixes ESLint error).
Align with Python test: type the grammar as Language.
-// @ts-nocheck
-import Parser from 'tree-sitter';
+import Parser, { type Language } from 'tree-sitter';
...
-  let parser: Parser;
-  let grammar: unknown;
+  let parser: Parser;
+  let grammar: Language;
Also applies to: 4-4, 8-16
src/core/scanner.ts (4)
15-32: AST parser lifecycle is correctly gated and initialized.
Lazy creation and initialize-on-demand are appropriate; disposal in finally is good hygiene. Based on learnings.
Also applies to: 34-45
156-163: Good: AST parse is size-guarded and failure-tolerant.
AST is parsed before caching, skipped on error, and attached to FileNode. This prevents rework and keeps scans resilient.
Also applies to: 175-187
238-258: Percentage calculation remains consistent; no AST coupling issues.
Method continues to operate on tokens only; safe with optional entities.
135-172: CacheEntry type is properly updated with entities field; schema versioning is in place.
Verification confirms the implementation is correct:
- CacheEntry interface (src/types/index.ts) includes entities?: ASTSymbol[]
- CACHE_VERSION ('1.1') is defined in src/core/cache.ts and validated on cache load
- Schema versioning mechanism prevents stale reads
src/types/index.ts (1)
112-240: Excellent AST type system design.
The comprehensive AST type definitions are well-structured with:
- Clean discriminated union via SymbolType enum
- Proper inheritance hierarchy with BaseSymbol
- Rich metadata support (locations, parameters, members, doc comments)
- Good coverage of language constructs across multiple programming languages
The type system provides a solid foundation for the tree-sitter AST parsing feature.
…type safety Rust imports: - Replace broad descendantsOfType scan with targeted childForFieldName queries - Add recursive processUseTree to handle simple, grouped, renamed, and wildcard imports - Extract only actual imported items, not path components Swift functions & properties: - Fix return type field: change 'result' to 'return_type' - Rewrite extractParameters to properly navigate function_value_parameters structure - Read default_value from parent function_value_parameter node - Add let/var classification for class and protocol properties (ST.CONSTANT vs ST.VARIABLE) - Fix multi-binding support by refactoring extractVariable to extractVariableFromBinding Tests: - Remove @ts-nocheck from test files to enable type checking - Add proper type guards to fix all TypeScript compilation errors - All 159 tests passing with clean linting
- Change LanguageConfig.loadGrammar return type from Promise<unknown> to Promise<any> to match tree-sitter API
- Update grammar variable type from unknown to any in Python and TypeScript test files
- Replace 'any' type with proper TreeOptions interface in astFormatter tests
- Fix MetricSettings to use showPercentages instead of non-existent primary field
- Add language tag 'text' to README code fence to satisfy markdownlint MD040 rule
All tests passing, typecheck clean.
Actionable comments posted: 0
♻️ Duplicate comments (1)
README.md (1)
186-186: Add language identifier to fenced code block.
The fenced code block at line 186 lacks a language identifier, causing markdownlint MD040 violations.
-```
+```text
🧹 Nitpick comments (2)
src/core/languages/index.ts (1)
18-22: Consider warning on extension conflicts.
If two language configurations register the same extension, the second registration silently overwrites the first. This could lead to unexpected behavior.
Consider adding a check:
 export function registerLanguage(config: LanguageConfig): void {
   for (const ext of config.extensions) {
+    if (languages[ext]) {
+      console.warn(`Extension ${ext} already registered for ${languages[ext].name}, overwriting with ${config.name}`);
+    }
     languages[ext] = config;
   }
 }
test/languages/python.test.ts (1)
8-8: Consider typing `grammar` as `unknown` for better type safety.
While the current `any` type matches the return type of `PythonConfig.loadGrammar()`, using `unknown` would provide slightly better type safety without functional changes.
- let grammar: any;
+ let grammar: unknown;
This is consistent with the broader effort to reduce `any` usage, though it's acceptable as-is given the upstream typing.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- README.md (2 hunks)
- src/core/languages/index.ts (1 hunks)
- src/core/languages/rust.ts (1 hunks)
- src/core/languages/swift.ts (1 hunks)
- test/astFormatter.test.ts (1 hunks)
- test/astParser.test.ts (1 hunks)
- test/languages/python.test.ts (1 hunks)
- test/languages/typescript.test.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- src/core/languages/rust.ts
- src/core/languages/swift.ts
- test/languages/typescript.test.ts
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{ts,tsx,js,jsx}: Do not use dotenv; Bun loads .env automatically
Use `Bun.serve()` for HTTP/WebSocket/HTTPS routes; do not use Express
Use `bun:sqlite` for SQLite; do not use `better-sqlite3`
Use `Bun.redis` for Redis; do not use `ioredis`
Use `Bun.sql` for Postgres; do not use `pg` or `postgres.js`
Use built-in `WebSocket`; do not use `ws`
Prefer `Bun.file` over `node:fs` readFile/writeFile
Use `Bun.$` for shelling out instead of `execa`
Files:
- test/languages/python.test.ts
- test/astParser.test.ts
- test/astFormatter.test.ts
- src/core/languages/index.ts
**/*.test.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Write tests using Bun’s test API: import from 'bun:test' and use test/expect
Files:
- test/languages/python.test.ts
- test/astParser.test.ts
- test/astFormatter.test.ts
🧠 Learnings (1)
📚 Learning: 2025-09-12T14:25:55.847Z
Learnt from: CR
PR: agentinit/contextcalc#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-12T14:25:55.847Z
Learning: Applies to **/*.test.{ts,tsx,js,jsx} : Write tests using Bun’s test API: import from 'bun:test' and use test/expect
Applied to files:
test/astParser.test.ts
🧬 Code graph analysis (4)
test/languages/python.test.ts (1)
src/core/languages/python.ts (1)
PythonConfig(6-163)
test/astParser.test.ts (1)
src/core/astParser.ts (1)
ASTParser(12-187)
test/astFormatter.test.ts (2)
src/types/index.ts (8)
FileNode(1-11)
ScanResult(105-111)
TreeOptions(80-93)
FunctionSymbol(154-161)
ClassSymbol(163-169)
InterfaceSymbol(171-175)
ImportSymbol(193-199)
VariableSymbol(187-191)
src/formatters/astFormatter.ts (1)
formatAsAST(11-233)
src/core/languages/index.ts (3)
src/types/index.ts (1)
ASTSymbol(222-233)
src/core/languages/typescript.ts (1)
TypeScriptConfig(6-349)
src/core/languages/javascript.ts (1)
JavaScriptConfig(6-19)
🪛 GitHub Check: Test and Build (18)
src/core/languages/index.ts
[warning] 7-7:
Unexpected any. Specify a different type
🪛 GitHub Check: Test and Build (20)
src/core/languages/index.ts
[warning] 7-7:
Unexpected any. Specify a different type
🪛 GitHub Check: Test and Build (22)
src/core/languages/index.ts
[warning] 7-7:
Unexpected any. Specify a different type
🔇 Additional comments (11)
src/core/languages/index.ts (3)
1-2: LGTM!
The imports are correct, and the ASTSymbol type naming properly avoids shadowing JavaScript's built-in `Symbol`.
32-58: LGTM!
The lazy initialization pattern with dynamic imports is well-implemented. The function loads and registers all language configurations efficiently, enabling code splitting and deferred grammar loading.
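One way such lazy initialization can look, as a sketch under stated assumptions. The `loaders` array stands in for dynamic `import()` calls so the example runs on its own; `initializeLanguages`, `initPromise`, and the trimmed-down `LanguageConfig` are illustrative names rather than the module's confirmed API.

```typescript
// Trimmed-down config shape for illustration only.
interface LanguageConfig { name: string; extensions: string[] }

const languages: Record<string, LanguageConfig> = {};
let initPromise: Promise<void> | null = null;

// Stand-ins for dynamic imports such as import("./typescript.js").
const loaders: Array<() => Promise<LanguageConfig>> = [
  async () => ({ name: "TypeScript", extensions: [".ts", ".tsx"] }),
  async () => ({ name: "Python", extensions: [".py"] }),
];

// Loads every configuration once; later calls reuse the same promise,
// so nothing is loaded until the first caller actually needs a language.
function initializeLanguages(): Promise<void> {
  if (initPromise === null) {
    initPromise = Promise.all(loaders.map((load) => load())).then((configs) => {
      for (const config of configs) {
        for (const ext of config.extensions) {
          languages[ext] = config;
        }
      }
    });
  }
  return initPromise;
}
```

Caching the promise rather than a boolean flag means concurrent callers all wait on the same load instead of racing to start their own.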
7-7: Remove suggestion to import `Language` type; keep recommendation to use `Promise<unknown>`.
The codebase uses the native node-tree-sitter binding (v0.21.1), which does not export a standalone `Language` type. The `Language` type exists only in the web-tree-sitter WebAssembly binding, which is a different package.
Using `Promise<unknown>` is a valid improvement here since grammar objects are dynamically loaded from language-specific packages and return opaque types that cannot be unified. This provides better type safety than `any` without requiring unification of package-specific types.
Likely an incorrect or invalid review comment.
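To illustrate what `Promise<unknown>` buys over `Promise<any>`: the compiler forces a runtime check before the value is used. The sketch below assumes a simplified `LanguageConfig`; `isGrammarObject` and `loadCheckedGrammar` are hypothetical helpers, and the check deliberately assumes nothing about what a grammar object contains.

```typescript
// Simplified config shape; only loadGrammar's return type matters here.
interface LanguageConfig {
  name: string;
  extensions: string[];
  loadGrammar: () => Promise<unknown>;
}

// Minimal type guard: accepts any non-null object.
function isGrammarObject(value: unknown): value is object {
  return typeof value === "object" && value !== null;
}

// With `unknown`, the grammar cannot be passed along until it is narrowed;
// with `any`, a bad value would flow through unchecked.
async function loadCheckedGrammar(config: LanguageConfig): Promise<object> {
  const grammar = await config.loadGrammar();
  if (!isGrammarObject(grammar)) {
    throw new Error(`Grammar for ${config.name} did not load as an object`);
  }
  return grammar;
}
```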
test/astFormatter.test.ts (2)
1-31: LGTM!
The test setup correctly uses Bun's test API per the coding guidelines. The mock helpers are well-structured and properly typed without requiring type suppression directives.
Based on coding guidelines.
476-633: Excellent test coverage for summary and location formatting.
The test suite comprehensively validates symbol counting (including nested members), file counting, and location range formatting. The tests correctly verify both single-line and multi-line symbol location displays.
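The counting behaviour those tests exercise might look roughly like this. The `SymbolNode` and `TreeNode` shapes are simplified assumptions, and although a helper named `countFilesWithSymbols` is mentioned later in this thread, its body here is a guess rather than the formatter's actual code.

```typescript
// Simplified shapes for illustration; the real types carry more metadata.
interface SymbolNode { name: string; members?: SymbolNode[] }
interface TreeNode { path: string; entities?: SymbolNode[]; children?: TreeNode[] }

// Counts each symbol plus all of its nested members, recursively.
function countSymbols(symbols: SymbolNode[]): number {
  return symbols.reduce(
    (total, symbol) => total + 1 + countSymbols(symbol.members ?? []),
    0,
  );
}

// Counts nodes that carry at least one symbol, regardless of whether the
// symbols came from a fresh parse or from the cache.
function countFilesWithSymbols(node: TreeNode): number {
  const self = node.entities !== undefined && node.entities.length > 0 ? 1 : 0;
  return (node.children ?? []).reduce(
    (total, child) => total + countFilesWithSymbols(child),
    self,
  );
}
```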
test/languages/python.test.ts (2)
19-127: Excellent Python-specific test coverage.
The function extraction tests thoroughly validate Python-specific features, including type hints, async functions, default parameters, and the critical `self`/`cls` parameter filtering for methods. This is exactly the kind of language-specific testing needed.
262-304: Critical test coverage for extraction scope.
These tests correctly verify that only top-level declarations are extracted, while nested functions are appropriately excluded. This prevents symbol pollution and ensures the AST output remains clean and manageable. The distinction between nested functions (excluded) and class methods (included) is particularly well-tested.
test/astParser.test.ts (4)
1-26: LGTM!
The test setup correctly uses Bun's test API and implements proper test isolation with temporary directories. The lifecycle hooks ensure clean setup and teardown for each test.
Based on coding guidelines.
48-130: Comprehensive parseFile test coverage.
The tests thoroughly validate file parsing across multiple languages, symbol types, error conditions, and size limits. The size limit test at lines 116-129 is particularly valuable for preventing OOM issues in production.
132-184: Excellent parseText API coverage.
The tests validate text parsing with strong emphasis on error handling and edge cases. The language identifier normalization test (lines 156-163) is particularly important for API usability, ensuring users can specify languages with or without the leading dot.
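The normalization that test covers can be sketched in a few lines. `normalizeLanguageId` is a hypothetical name, and both the dot-prefixed canonical form and the lower-casing are assumptions about how the registry keys its extensions.

```typescript
// Accepts "ts" or ".ts" and returns a dot-prefixed, lower-case key.
// Trimming and lower-casing are assumed conveniences, not confirmed behaviour.
function normalizeLanguageId(id: string): string {
  const cleaned = id.trim().toLowerCase();
  return cleaned.startsWith(".") ? cleaned : `.${cleaned}`;
}
```

With this shape, a caller passing either `"py"` or `".py"` ends up looking up the same registry key.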
223-280: Strong integration testing for symbol extraction.
These tests validate the end-to-end symbol extraction pipeline, covering functions with complex parameters, class members, and imports. The comment at line 241 appropriately documents expected variations across tree-sitter parser versions, making the tests more maintainable.
…uages
- TypeScript: Fix async/generator detection and class extends/implements/abstract extraction
- C#: Rename constructor variable to avoid shadowing, extract individual interface names
- Go: Handle empty imports to avoid creating symbols with empty names
- Java: Extract individual interface names instead of wrapping entire text
All fixes improve type safety and accuracy of AST symbol extraction.
- Replace any with unknown in loadGrammar type signature
- Fix TypeScript parser to handle inline export declarations
- Bump cache version from 1.1 to 1.2 to invalidate old AST data
The TypeScript AST parser now properly extracts exported interfaces, classes, types, and enums instead of creating empty export symbols. This fixes the issue where AST output showed only 5 symbols instead of 500+ across the codebase.
- Fix single file AST output support (was ignoring --output ast flag)
- Fix TypeScript class extraction to properly parse extends/implements/abstract
- Remove 8 untested language parsers (Go, Rust, Java, C++, C#, Ruby, PHP, Swift)
- Clean up error handling (remove noisy warnings, silent failure for expected cases)
- Update README to reflect only supported languages (TS/JS/Python)
- Remove unused tree-sitter dependencies from package.json
All tests passing (159/159). TypeScript type checking passes.
…for AST output
- Add comprehensive language support matrix to README showing feature coverage
- Integrate AST parser with DEBUG=1 flag for detailed parsing diagnostics
- Add file statistics showing processed/skipped files with categorized reasons
- Track AST parsing stats (files processed, skipped, skip reasons)
- Display grouped skip reasons (unsupported extensions, file size, errors)
- Improve user feedback with clear summary of parsing results
Example output:
Found 28 symbols across 2 files
Skipped 2 files
- Unsupported extensions: .json (1), .md (1)
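A possible shape for the grouped skip summary, written as a sketch. The `SkipReason` union, `formatSkipSummary`, and the wording of the size and error lines are assumptions; only the "Skipped N files" and "Unsupported extensions" lines mirror the example output in the commit message.

```typescript
// Hypothetical categories matching the three groups named in the commit.
type SkipReason =
  | { kind: "unsupported"; extension: string }
  | { kind: "too-large" }
  | { kind: "error" };

// Groups skipped files by reason and renders one summary line per group.
function formatSkipSummary(skips: SkipReason[]): string[] {
  if (skips.length === 0) return [];
  const byExtension = new Map<string, number>();
  let tooLarge = 0;
  let errors = 0;
  for (const skip of skips) {
    if (skip.kind === "unsupported") {
      byExtension.set(skip.extension, (byExtension.get(skip.extension) ?? 0) + 1);
    } else if (skip.kind === "too-large") {
      tooLarge += 1;
    } else {
      errors += 1;
    }
  }
  const lines = [`Skipped ${skips.length} file${skips.length === 1 ? "" : "s"}`];
  if (byExtension.size > 0) {
    const parts = Array.from(byExtension.entries())
      .sort(([a], [b]) => a.localeCompare(b))
      .map(([extension, count]) => `${extension} (${count})`);
    lines.push(`- Unsupported extensions: ${parts.join(", ")}`);
  }
  if (tooLarge > 0) lines.push(`- File size: ${tooLarge}`); // assumed wording
  if (errors > 0) lines.push(`- Errors: ${errors}`); // assumed wording
  return lines;
}
```

Sorting the extensions keeps the summary stable between runs, which makes the output easier to compare.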
# [1.4.0](v1.3.6...v1.4.0) (2025-10-23)

### Features

* Add AST parsing with tree-sitter for code symbol extraction ([#14](#14)) ([41053c9](41053c9))
🎉 This PR is included in version 1.4.0 🎉
The release is available on:
Your semantic-release bot 📦🚀
Fixed critical bug where AST parsing was failing due to incorrect grammar loading when using ES module imports on CommonJS tree-sitter modules.
Changes:
- Fix grammar loading in TypeScript, JavaScript, and Python language configs to handle both ESM and CJS module formats
- Add cache invalidation when switching to AST output mode
- Fix AST formatter to count files with symbols instead of only freshly parsed files (which excluded cache hits)
- Add countFilesWithSymbols() helper to accurately report files in summary
This fixes the issue where `contextcalc . --output ast` would only show 1 file instead of all parseable code files in the project.
Fixes #14
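The ESM/CJS mismatch described in this commit typically comes down to whether the grammar sits on the module namespace's `default` property. A minimal sketch of an interop helper follows; `unwrapModule` is a hypothetical name, and the commit does not show how the real language configs resolve this.

```typescript
// When a CommonJS module is loaded through a dynamic import(), its exports
// often arrive under `default`; a native ES module may export the value
// directly. This helper accepts either shape.
function unwrapModule(mod: unknown): unknown {
  if (typeof mod === "object" && mod !== null && "default" in mod) {
    return (mod as { default: unknown }).default;
  }
  return mod;
}
```

A caller could then write something like `unwrapModule(await import(packageName))`, where `packageName` is whichever grammar package is being loaded. A module that legitimately exposes its own `default` field alongside other exports would need extra handling.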
🎉 This PR is included in version 1.4.3 🎉
The release is available on:
Your semantic-release bot 📦🚀
New Feature: AST Parsing with Tree-Sitter
Adds a new `--output ast` mode that extracts and displays high-level code entities (functions, classes, interfaces, etc.) using tree-sitter AST parsing.
📊 What's New
New CLI Option
Example Output
🚀 Features
Extracted Symbol Types
Language Support
Technical Highlights
📦 Dependencies
Added tree-sitter and language grammars to dependencies. No user action required - these are automatically installed with `npm install` and include prebuilt binaries for common platforms.
🧪 Testing
📝 Implementation Details
New Files
- `src/core/astParser.ts` - Main AST parser
- `src/core/languages/` - Language configurations
  - `typescript.ts` - Full TS/TSX symbol extraction
  - `javascript.ts` - Full JS/JSX symbol extraction
  - `python.ts` - Full Python symbol extraction
  - `go.ts`, `rust.ts`, `java.ts`, etc. - Stubs for expansion
- `src/formatters/astFormatter.ts` - AST output formatter
Modified Files
- `package.json` - Added tree-sitter dependencies
- `src/types/index.ts` - Added AST symbol types
- `src/cli.ts` - Added --output ast support
- `src/core/scanner.ts` - Integrated AST parsing
🔄 Backward Compatibility
✅ Fully backward compatible - all existing functionality unchanged. AST parsing is opt-in via `--output ast`.
📚 Use Cases
🎯 Next Steps (Future)
Installation note: All dependencies are bundled with the package. Users don't need to install tree-sitter separately - it's handled automatically by npm/bun.
Summary by CodeRabbit
Release Notes
New Features
Documentation