Problem
Current situation: Nvim ships a few parsers (C, Lua, vimdoc, vimscript) in its runtime. Users who want more parsers must build each parser and put it on their 'runtimepath', or use a project like https://github.com/nvim-treesitter which tries to automatically build parsers on the user's machine.
Nvim can't ship hundreds of parsers in its runtime because:
- the total file size approaches gigabytes (GB)
- partially because of a known tree-sitter issue for some parsers: the generated parser.c gets too large (about 83MB), see tree-sitter/tree-sitter#1799
- partially because of the sheer number of parsers (hundreds)
- undue burden on package maintainers
- updating parsers should not require updating Nvim itself?
Ideal case
Ideally, tree-sitter upstream would solve some problems for all tree-sitter consumers by:
- providing makefiles
- providing an introspectable parser version that is set through tree-sitter generate
- having parser authors maintain their own queries and bump said version every time a parser update requires changes to them
Potential Solutions
Do nothing
- Do nothing, except "Guidance": Users should install per-language plugins. Plugin authors should build them.
- Continue to outsource the problem to https://github.com/nvim-treesitter
- Similar to https://github.com/neovim/nvim-lspconfig
- Problem: maintenance burden, doesn't scale?
Distribute queries
The main problem is lack of query and parser versioning.
- Ship queries, but not parsers. Queries are relatively tiny text files.
- Problem: parsers and queries are tightly coupled, so a new parser version could break an existing query.
- tree-sitter upstream does not provide tools that could help us, such as parser or query introspection (version or metadata).
- Enforce versioned parser names
- Problem: how?
- Right now, we only have the commit hash, so we can't reason about version ranges.
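To make the versioning gap concrete, here is a minimal Python sketch. It assumes a hypothetical scheme in which each parser reports a dotted version string and each query declares the half-open range of parser versions it was written for. The names `QueryRequirement`, `parse_version`, and `is_compatible` are illustrative only; neither tree-sitter nor Nvim provides them today.

```python
from dataclasses import dataclass
from typing import Tuple

Version = Tuple[int, int, int]


@dataclass(frozen=True)
class QueryRequirement:
    """Hypothetical metadata a query file could declare (assumption)."""
    min_version: Version  # inclusive lower bound
    max_version: Version  # exclusive upper bound


def parse_version(text: str) -> Version:
    """Parse a 'major.minor.patch' string into a comparable tuple."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def is_compatible(parser_version: str, req: QueryRequirement) -> bool:
    """True if the parser version falls inside the range the query supports."""
    return req.min_version <= parse_version(parser_version) < req.max_version


# Example: a query written against the (made-up) 0.20.x series of a parser.
req = QueryRequirement(min_version=(0, 20, 0), max_version=(0, 21, 0))
```

Tuples of integers are ordered, so a range check is a single comparison. Commit hashes have no such ordering, which is why they only allow exact pinning and not range reasoning.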
Distribute parsers (.so/.dll)
- Develop CI that builds .so/.dll files for every OS. Then Nvim can fetch those on-demand.
- Benefit: useful for all text editors, not just Nvim.
- Problem: Where to put (200 * 3) build artifacts? Could use GitHub Packages, like Homebrew does?
- Problem: similar maintenance burden as nvim-treesitter.
- Mitigation: strictly refuse to support parsers that don't easily build.
- Users can nudge the parser maintainer to "fix" their build steps.
- Develop CI that builds "universal" libs via Cosmopolitan Libc
- Problem: "fat" libraries are costly: TS .so files are 90%+ data and 10% actual code (just the scanner part). Converting that 10% to WASM is less invasive.
- Integrate nvim-treesitter's logic for "build the parser locally and put it into rtp"
- Benefit: gives us a "happy path" answer for users to avoid needing nvim-treesitter.
- Problem: Nvim becomes a package manager, which is a slippery slope.
- Mitigation: strictly refuse to support anything but the happy path.
- Don't try to find compilers in weird places.
- Don't support configuration.
- Problem: maintenance burden: many parsers have quirky build steps, and some may require a C++ compiler.
- Mitigation: strictly refuse to support parsers that don't easily build.
- Users can nudge the parser maintainer to "fix" their build steps.
- Alternative: distribute a Zig binary and use it as the compiler toolchain to build on the user's machine.
- Outsource the problem to installers like mason.
- ✅ Wait for upstream to support WASM parsers
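As an illustration of the "happy path only" local build option listed above, here is a minimal Python sketch of the decision logic. It assumes the common parser layout of `src/parser.c` plus an optional `src/scanner.c`, and a GCC/Clang-style `cc` on `PATH`. The function `plan_build`, its parameters, and the output layout are hypothetical and are not part of Nvim or nvim-treesitter.

```python
import shutil
from pathlib import Path
from typing import List, Optional


def compiler_available() -> bool:
    """Only look for `cc` on PATH; no searching in unusual locations."""
    return shutil.which("cc") is not None


def plan_build(lang: str, src_dir: Path, out_dir: Path,
               is_windows: bool = False) -> Optional[List[str]]:
    """Return a compiler command for the happy path, or None to refuse."""
    parser_c = src_dir / "parser.c"
    if not parser_c.is_file():
        return None  # nothing to build

    # Refuse parsers whose external scanner needs a C++ compiler.
    if (src_dir / "scanner.cc").exists() or (src_dir / "scanner.cpp").exists():
        return None

    sources = [str(parser_c)]
    scanner_c = src_dir / "scanner.c"
    if scanner_c.is_file():
        sources.append(str(scanner_c))

    # The platform only changes the file extension in this sketch; the flags
    # below assume a GCC/Clang-style driver.
    ext = ".dll" if is_windows else ".so"
    output = out_dir / "parser" / f"{lang}{ext}"

    # No user configuration: a single fixed set of flags.
    return ["cc", "-shared", "-fPIC", "-Os", f"-I{src_dir}",
            *sources, "-o", str(output)]
```

A caller would check `compiler_available()` first, then pass the returned list to `subprocess.run` only when it is not `None`. Every refusal is a plain `None`, which keeps unsupported cases out of scope instead of growing special handling for them.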