feat(query): polyglot from_directory with auto-detection #13
Merged
tob-scott-a merged 1 commit into main on Apr 23, 2026
QueryEngine.from_directory previously accepted one language at a time, but real repositories mix languages (Python + Solidity contracts, TypeScript + Rust, etc.). Callers had to build two engines and figure out how to combine them, or give up on multi-language analysis.

The `language` argument now accepts:

- `"auto"`: walks the tree, detects every supported language with at least one matching file, parses each, and merges into a single graph. Skips common vendor directories (node_modules, .venv, target, etc.).
- `"python,rust"`: explicit comma-separated list for when auto-detect would pull in too much or miss something.
- `"python"`: single language (unchanged; the single-language path is preserved byte-for-byte when exactly one language is specified).

Entrypoint detection runs on the merged graph, so a repo with a Python main() and a Solidity external function surfaces both in attack_surface() from a single analyze call.

Also exposes `detect_languages(path)` as a public helper for callers that want the list without building a graph.

12 new tests: detection on single/multi/empty/vendor-heavy directories, auto-merge behavior, explicit list handling, error paths, and a regression guard that single-language behavior is unchanged.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
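The detection and argument handling described above can be sketched roughly as follows. This is not the code from this PR: the function names `detect_languages_sketch` and `resolve_languages_sketch`, the extension table, and the `ValueError` on unknown languages are all assumptions made for illustration. Only the three vendor directory names and the three accepted forms of `language` come from the description.

```python
import os
from pathlib import Path

# Hypothetical extension-to-language table; the real mapping is not shown in this PR.
EXTENSION_TO_LANGUAGE = {
    ".py": "python",
    ".rs": "rust",
    ".sol": "solidity",
    ".ts": "typescript",
}

# Vendor directories named in the description; its "etc." is not expanded here.
VENDOR_DIRS = {"node_modules", ".venv", "target"}


def detect_languages_sketch(path):
    """Return a sorted list of languages with at least one matching file under path."""
    found = set()
    for _root, dirs, files in os.walk(path):
        # Prune vendor directories in place so os.walk never descends into them.
        dirs[:] = [d for d in dirs if d not in VENDOR_DIRS]
        for name in files:
            language = EXTENSION_TO_LANGUAGE.get(Path(name).suffix.lower())
            if language is not None:
                found.add(language)
    return sorted(found)


def resolve_languages_sketch(language, path):
    """Turn "auto", "a,b", or "a" into a concrete list of language names."""
    if language == "auto":
        return detect_languages_sketch(path)
    requested = [part.strip() for part in language.split(",") if part.strip()]
    supported = set(EXTENSION_TO_LANGUAGE.values())
    unknown = [name for name in requested if name not in supported]
    # Assumed error behavior: reject empty or unsupported specs with ValueError.
    if not requested or unknown:
        raise ValueError(f"unsupported language spec: {language!r}")
    return requested
```

In this sketch a single-language spec such as `"python"` resolves to a one-element list, so a caller could keep the existing single-language path whenever the resolved list has length one and only build a merged graph otherwise.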