treesitter distribution strategy (tree-sitter) #22313
That's not entirely true. Shipping all ~150 parsers of nvim-treesitter (nvim-treesitter/nvim-treesitter#3688) requires 9 to 15 MB, depending on the OS. Helix ships all of its parsers in its release (https://github.com/helix-editor/helix/releases); the Linux release is 10.6 MB. But parsers inflate to >100 MiB when decompressed (the binaries have a very repetitive structure). I have long had the idea of keeping parsers compressed on disk and decompressing them only when needed, so that nvim could transparently load compressed parsers (Since
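To make the "compressed on disk, decompress when needed" idea concrete, here is a minimal sketch in Python (not Nvim's actual loader). It assumes gzip as the compression format and a `<lang>.so.gz` naming scheme; the function name `ensure_decompressed` and the cache layout are hypothetical.

```python
import gzip
import shutil
from pathlib import Path


def ensure_decompressed(compressed: Path, cache_dir: Path) -> Path:
    """Return the path of the inflated parser, decompressing it on first use."""
    # "lua.so.gz" -> "lua.so"; this naming scheme is an assumption of the sketch.
    target = cache_dir / compressed.name.removesuffix(".gz")

    # Reuse the cached copy if it is not older than the compressed file.
    if target.exists() and target.stat().st_mtime >= compressed.stat().st_mtime:
        return target

    cache_dir.mkdir(parents=True, exist_ok=True)
    tmp = target.with_name(target.name + ".tmp")
    with gzip.open(compressed, "rb") as src, open(tmp, "wb") as dst:
        shutil.copyfileobj(src, dst)

    # Rename into place so a half-written file is never picked up by a loader.
    tmp.replace(target)
    return target
```

An editor would call something like this right before loading the shared library, so only the parsers a user actually opens take up their full size on disk.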
Making parsers distributable has long been on tree-sitter's 1.0 list (tree-sitter/tree-sitter#930). It would be great if most of the challenges you're mentioning could be solved by https://github.com/tree-sitter providing infrastructure to parser repos so that editors can consume them. Offering release workflows for parser repos was one of the ideas (releases could contain parsers only, or parsers plus queries). Parser repos could offer dedicated editor-specific queries. I discussed with @clason moving more of the query maintenance out of nvim-treesitter and into parser repos: nvim-treesitter/nvim-treesitter#4279 (comment). Of course, this only works for repositories whose maintainers care about Neovim support. As a first step, I was thinking of at least having the built-in parsers vim/lua/help
It seems that at the moment, GH packages only supports container images, Ruby gems, pip packages, and cargo crates. I suspect homebrew might use Ruby gems. I didn't find a way to store versioned binary blobs without the need for a package manager (though I might be missing something). Since tree-sitter is associated with GH, they might extend this to support tree-sitter parsers, or plain binary blobs with versions and metadata. Installation via curl would be my favorite. If the tree-sitter organization could standardize parser packages somehow with a central registry, then Neovim could provide an API function that curls a parser given its name, version tag, and the current OS/arch combination. Contributors to Neovim, Helix, and Emacs with good ideas will probably need to get active and contribute to a solution, to avoid having too much complexity in editor repos or on end users' machines. In the long run, the installer logic in nvim-treesitter should become obsolete. EDIT: URLs like https://pkg-containers.githubusercontent.com/ghcr1/blobs/sha256:ae5d8e9148068e001b5ca7bbc2aa8663aa13b9995245f7655772725add67454c?se=2023-02-19T14%3A20%3A00Z&sig= suggest that homebrew is using the container registry storage to store binary blobs.
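As an illustration of the "curl a parser by name, version tag and OS/arch" idea, the lookup could be sketched as below. This is a Python sketch under stated assumptions, not an existing Neovim or tree-sitter API: the registry host `parsers.example.org`, its URL layout, and the function names are all hypothetical, since no such central registry exists yet.

```python
import platform
from typing import List, Optional

# Hypothetical central registry; placeholder host and layout.
REGISTRY = "https://parsers.example.org"

# Normalize the architecture names reported by different platforms.
_ARCH_ALIASES = {"amd64": "x86_64", "arm64": "aarch64"}


def parser_url(name: str, version: str,
               system: Optional[str] = None,
               machine: Optional[str] = None) -> str:
    """Build the download URL for one parser on one OS/arch combination."""
    system = (system or platform.system()).lower()
    machine = (machine or platform.machine()).lower()
    machine = _ARCH_ALIASES.get(machine, machine)
    # Assumption: .dll on Windows, .so everywhere else.
    ext = "dll" if system == "windows" else "so"
    return f"{REGISTRY}/{name}/{version}/{system}-{machine}/{name}.{ext}"


def curl_command(url: str, dest: str) -> List[str]:
    """Argument list for curl: fail on HTTP errors, follow redirects, write to dest."""
    return ["curl", "-fsSL", "-o", dest, url]
```

With a layout like this, the editor only needs the three inputs named above and never has to invoke a compiler on the user's machine.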
Just to make this obvious:
Plan
Notes on treesitter plan ("migration from legacy vim syntax") from chat with @clason:
Short-term (around 0.10):
Medium-term (around 0.11):
Long-term (1-2 years, not 3+ years...):
Problem
Current situation: Nvim ships a few parsers (C, Lua, vimdoc, vimscript) in its runtime. If a user wants more parsers, they must build the parser and put it on their 'runtimepath', or use a project like https://github.com/nvim-treesitter, which tries to automatically build parsers on the user's machine.
Nvim can't ship hundreds of parsers in its runtime because
Ideal case
Ideally, tree-sitter upstream would solve some problems for all tree-sitter consumers by:
tree-sitter generate
Potential Solutions
Do nothing
Distribute queries
The main problem is lack of query and parser versioning.
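To show what versioning would enable, here is a small sketch of a compatibility check between a query bundle and a parser. It assumes parsers carry dotted version tags and that each query bundle declares the parser range it was written against; neither exists in a standardized form today, and the function names are hypothetical.

```python
from typing import Tuple


def parse_version(tag: str) -> Tuple[int, ...]:
    """Turn 'v0.20.3' or '0.20.3' into (0, 20, 3) so versions compare numerically."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def queries_compatible(parser_version: str,
                       min_version: str,
                       max_version_exclusive: str) -> bool:
    """True if the parser falls inside the range the queries were written for."""
    return (parse_version(min_version)
            <= parse_version(parser_version)
            < parse_version(max_version_exclusive))
```

A loader could refuse, or at least warn, when this check fails instead of erroring out on a node name that the installed parser no longer produces.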
Distribute parsers (.so/.dll)
.so files are 90%+ data and 10% actual code (just the scanner part). Converting that 10% to WASM is less invasive.