feat: exhaustive language schemas — all 18 tree-sitter grammars #128

Merged
jamestexas merged 2 commits into main from
feat/exhaustive-language-schemas
Mar 24, 2026
Conversation

@jamestexas
Contributor

Summary

Every compiled-in tree-sitter grammar now has a curated preset schema for language-aware auto-detection.

12 new schemas:

  • JavaScript (imports/classes/exports)
  • TypeScript (+interfaces/enums/type aliases)
  • Java (classes/interfaces/enums/annotations)
  • C (includes/functions/structs/enums/typedefs/macros)
  • C++ (+classes/namespaces/templates)
  • Ruby (require/classes/modules/methods)
  • PHP (use/classes/interfaces/traits)
  • Kotlin (classes/objects/interfaces)
  • Swift (classes/structs/protocols/enums)
  • Scala (classes/objects/traits/vals)
  • Elixir (defmodule/def/defp/defmacro)
  • YAML (top-level mappings)

Improved: Rust (added use imports), Terraform (added terraform{} and moved{} blocks)

Wiring: sourceCodePresets expanded from 5 to 18 languages; presetSchemas from 9 to 21 entries. Auto-detection now produces language-aware projections for all supported languages instead of falling back to generic FCA inference.
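
The preset-first behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in cmd/infer.go: the map's type and contents, the `pickSchema` helper, and the `"inferred"` sentinel are all assumptions made for the example.

```go
package main

import "fmt"

// sourceCodePresets borrows the name used in this PR, but its shape here
// (detected language -> preset schema name) is an assumption, and the
// entries are a small illustrative subset.
var sourceCodePresets = map[string]string{
	"go":         "go",
	"rust":       "rust",
	"typescript": "typescript",
	"terraform":  "terraform",
}

// pickSchema is a hypothetical helper: it returns the preset for a detected
// language, or signals that the caller should fall back to generic inference.
func pickSchema(lang string) (name string, usedPreset bool) {
	if preset, ok := sourceCodePresets[lang]; ok {
		return preset, true
	}
	return "inferred", false
}

func main() {
	for _, lang := range []string{"typescript", "cobol"} {
		name, ok := pickSchema(lang)
		fmt.Printf("%s -> %s (preset=%t)\n", lang, name, ok)
	}
}
```

With every compiled-in grammar present in the map, the fallback branch would only be reached for languages that have no grammar at all.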

Also filed 2 ley-line beads for downstream work: tree-sitter query validation endpoint, language-aware embedding metadata.

Test plan

  • task build succeeds
  • task test — all tests pass
  • TestResolveSchema_AllPresets — all 21 presets load and parse
  • task test-go-schema — self-hosting smoke test passes
  • Manual: mache serve ~/some-ts-repo → verify TS constructs projected
  • Manual: mache serve ~/some-rust-repo → verify use imports appear
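
A check in the spirit of TestResolveSchema_AllPresets might look like the sketch below. It assumes only that each preset is a JSON document. The in-memory `presets` map, the placeholder JSON bodies, and the `checkPresets` helper are hypothetical and do not reflect the real schema format or test.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// presets stands in for the embedded schema files. The JSON bodies are
// placeholders, and "broken" is deliberately truncated.
var presets = map[string]string{
	"yaml":   `{"name": "yaml", "nodes": []}`,
	"rust":   `{"name": "rust", "nodes": []}`,
	"broken": `{"name": "broken", `,
}

// checkPresets parses every preset and returns the sorted names of those
// that fail to parse as a JSON object.
func checkPresets(all map[string]string) []string {
	var failed []string
	for name, body := range all {
		var doc map[string]any
		if err := json.Unmarshal([]byte(body), &doc); err != nil {
			failed = append(failed, name)
		}
	}
	sort.Strings(failed)
	return failed
}

func main() {
	fmt.Println("presets that failed to parse:", checkPresets(presets))
}
```

Iterating over the whole registry rather than naming presets one by one means a newly registered schema is covered without touching the test.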

🤖 Generated with Claude Code

…reset schemas

Add curated preset schemas for every compiled-in tree-sitter grammar so
auto-detection produces language-aware projections instead of falling
back to generic FCA inference.

New schemas: javascript, typescript, java, c, cpp, ruby, php, kotlin,
swift, scala, elixir, yaml (12 new).

Improved: rust (added use imports), terraform (added terraform{} and
moved{} blocks).

Wired: sourceCodePresets now maps all 18 detected languages to their
preset schemas. presetSchemas registry updated (21 total presets).

Closes: mache-6bb5e7

Copilot AI left a comment

Pull request overview

Adds curated preset schemas for every compiled-in tree-sitter grammar so auto-detected projects get language-aware projections (instead of falling back to generic FCA inference).

Changes:

  • Added 12 new preset schema JSON files (JS/TS/Java/C/C++/Ruby/PHP/Kotlin/Swift/Scala/Elixir/YAML).
  • Expanded presetSchemas and sourceCodePresets wiring so all supported languages are eligible for auto-detected preset schemas.
  • Enhanced existing presets (Rust use imports; Terraform terraform{} and moved{} blocks).

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| cmd/schemas/yaml.json | New YAML preset schema (mapping projections). |
| cmd/schemas/typescript.json | New TypeScript preset schema (imports/functions/classes/interfaces/type aliases/enums/exports/vars). |
| cmd/schemas/terraform.json | Adds terraform{} and moved{} block projections to Terraform preset. |
| cmd/schemas/swift.json | New Swift preset schema (imports/types/functions). |
| cmd/schemas/scala.json | New Scala preset schema (imports/classes/objects/traits/vals/types). |
| cmd/schemas/rust.json | Adds Rust use import projections to existing preset. |
| cmd/schemas/ruby.json | New Ruby preset schema (require/classes/modules/methods). |
| cmd/schemas/php.json | New PHP preset schema (use/classes/interfaces/traits/functions/enums). |
| cmd/schemas/kotlin.json | New Kotlin preset schema (imports/classes/objects/interfaces/functions). |
| cmd/schemas/javascript.json | New JavaScript preset schema (imports/functions/classes/exports/vars). |
| cmd/schemas/java.json | New Java preset schema (imports/classes/interfaces/enums/annotations). |
| cmd/schemas/elixir.json | New Elixir preset schema (defmodule/def/defp/defmacro). |
| cmd/schemas/cpp.json | New C++ preset schema (includes/functions/structs/enums/typedefs/macros/classes/namespaces/templates). |
| cmd/schemas/c.json | New C preset schema (includes/functions/structs/enums/typedefs/macros). |
| cmd/schemas.go | Registers new preset schema files in presetSchemas for resolution/loading. |
| cmd/infer.go | Expands sourceCodePresets so inference prefers presets for all supported languages. |

Comment thread cmd/infer.go
Comment thread cmd/schemas/terraform.json Outdated
Comment thread cmd/schemas/yaml.json
Comment thread cmd/infer.go
…YAML depth

- Align langForExt to return "terraform" (not "hcl") matching
  DetectLanguageFromExt, so multi-language namespace filtering works.
  Also updates GetLanguage, RegisterAddressRefQuery, and tests.
- Remove moved{} block from terraform schema (no unique name → collision).
  Keep terraform{} which is typically singleton per module.
- YAML: anchor query to document root via stream>document>block_node
  path so only top-level mapping pairs are captured.
- Filed mache-a21b69 for selector compilation test (follow-up).
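
The collision behind the moved{} removal can be illustrated with a small sketch. It assumes projections are keyed by block type plus labels. The `block` type and the `projectionKey` and `collisions` helpers are hypothetical and are not taken from the mache codebase.

```go
package main

import "fmt"

// block is a simplified stand-in for a parsed Terraform block:
// a type ("resource", "moved", ...) plus zero or more labels.
type block struct {
	kind   string
	labels []string
}

// projectionKey builds the path a block would be projected to.
// A block without labels has only its type to distinguish it.
func projectionKey(b block) string {
	key := b.kind
	for _, l := range b.labels {
		key += "/" + l
	}
	return key
}

// collisions returns every key that more than one block maps to,
// together with the number of blocks sharing it.
func collisions(blocks []block) map[string]int {
	seen := map[string]int{}
	for _, b := range blocks {
		seen[projectionKey(b)]++
	}
	dup := map[string]int{}
	for k, n := range seen {
		if n > 1 {
			dup[k] = n
		}
	}
	return dup
}

func main() {
	blocks := []block{
		{kind: "resource", labels: []string{"aws_s3_bucket", "logs"}},
		{kind: "terraform"},
		{kind: "moved"},
		{kind: "moved"},
	}
	fmt.Println(collisions(blocks))
}
```

Under this keying, labelled blocks stay unique and a single terraform{} block is safe, while a module with several moved{} blocks maps them all to the same key.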
jamestexas merged commit be71d90 into main on Mar 24, 2026
14 checks passed
jamestexas deleted the feat/exhaustive-language-schemas branch on March 24, 2026 02:33