Integrate tree-sitter parsers into the toplevel build system #29042
Comments
This is basically impossible without nasty hacks. An attempt at this was made in #22054 and the problem is that With that background out of the way: I don't quite understand why this is a problem. Distros are usually expected to package everything themselves and should be able to just run the main build since all dependencies are already available. I don't quite get what blocks you from just using the main file (assuming you've already packaged all dependencies). |
Thank you for linking the pr, this is what I was looking for. Good to see this was already approached.
This can probably be worked around by predicting the exact paths of installed files by the external project and putting dependencies on the target created by
This is the case for MacPorts, except for the tree-sitter parsers, which are expected to be installed in a neovim-specific directory. In that sense:
|
Thank you for asking these questions. Yes, these parsers should be considered part of Neovim, just like the other runtime files, and you cannot rely on "semantic version" stability (the exact same version should work, but there are no guarantees beyond that). There is also a virtual guarantee that no two consumers (Helix, Emacs, Neovim) will depend on the same version. |
We don't use a find module for these because they are runtime dependencies. It is not cmake that locates these but neovim itself. Neovim relies on these being in the correct location during runtime.
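A minimal sketch of what "neovim itself locates these at runtime" implies, assuming a simplified model in which each runtime directory is checked in order for a file named parser/<lang>.* and the first match wins. The function name find_parser and the directory list are hypothetical and only for illustration; this is not Neovim's actual implementation.

```python
from pathlib import Path
from typing import Iterable, Optional


def find_parser(runtime_dirs: Iterable[str], lang: str) -> Optional[Path]:
    """Return the first parser/<lang>.* file found, or None if there is none.

    Simplified model (an assumption): directories are searched in the
    order given, and the first directory containing a match wins.
    """
    for directory in runtime_dirs:
        parser_dir = Path(directory, "parser")
        if not parser_dir.is_dir():
            continue  # this runtime directory ships no parsers at all
        matches = sorted(parser_dir.glob(f"{lang}.*"))
        if matches:
            return matches[0]
    return None
```

The point for packagers is that nothing at build time verifies this lookup will succeed: if the file is not in one of the searched directories, the failure only shows up when the editor runs.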
I don't know if this is a philosophical question or a practical one. They are dependencies because we don't maintain them. If you mean that we should treat them as if they were part of neovim (like with runtime files) then that is a different question. It's probably not a popular suggestion as other distros will throw a hissy fit, although I don't remember exactly why. They don't like this though.
Exact |
I'll need to think about this. Normally this would be out of the question for dependencies that have other dependencies relying on them (such as luajit), but it might be possible for dependencies that don't have other dependencies relying on them (such as parsers IIRC). I will need to experiment first to say for sure. |
I still don't understand why packaging the parsers is not an option. Is it too much work to package the parsers or what's the problem? |
The latter. It seems more logical to me that in the specific case of the parsers, which need to be an exact version and installed in an nvim-specific location, they are part of the core nvim build itself.
They usually want a library to be installed in a single location by a single package, so it can be patched and updated for bug and security fixes without having to dig through all downstream packages that use it and where it might be bundled somehow with an outdated version.
It is an option, but it is not perfect. Basically there are two possibilities:
|
I think it's possible to move the parser download/installation to the main build. The only question is whether it makes sense to do in general and whether other distributions have a problem with this (there are others). @neovim/automation thoughts? |
Speaking with my Debian hat on, I'm packaging the parsers independently. This is in the (possibly misguided) hope that the tree-sitter story around parser distribution / versioning / interface stability improves in the long run, so there doesn't need to be such strict association between parsers and users of the parsers.
The parser packages in Debian aren't shipping pre-built parsers, partly for the reasons you mention here. The "build" of the parser packages re-generates the Then neovim (and eventually other packages) have build dependencies on, e.g., tree-sitter-c-src, and use the respective helper Makefile to build the version of the parser they need. They then install the resulting shared object where needed (in It's all structured so that, if needed, multiple versions of a parser can be available and each dependent package can build the one it needs, although neovim is currently the only consumer so I haven't actually started providing multiple versions. Yes, this was a lot of up front work, but it ensures Debian can regenerate all parsers, if needed, and provides a standard interface (which I can simplify going forward) that any package in Debian can use. Adding new parsers is pretty simple. Updating the tree-sitter tooling is much more involved. All that being said, I also understand that given the current state of the tree-sitter ecosystem, it's much more attractive to just make Neovim build everything itself. At a minimum, that would require vendoring the parsers, since many distributions don't allow network access during the build. |
We can't vendor parsers atm as they'd be system dependent. If vendoring parsers is the approach we wanna take then we need wasmtime up and running first. |
If I understand the initial request right, this would be about vendoring the parser sources, not the binaries. Only to avoid the download. |
Have you seen the size of some of these parsers? We do not want them in our source tree and blowing up the repo. |
Absolutely. This is why I suggested to put them into release tarballs only, not into the repo. If you have the tarball and want to build it, you would then need to download them sooner or later anyway so there it makes no difference anymore. |
Our tarballs are (intentionally!) identical to the source tree, though. (Insert xz reference here.) |
Then that is not vendoring in neovim(?) lingo; vendoring is including external source code into the neovim repo. Your suggestion is also not any different than just moving the parser installation from the deps cmake build to the main cmake build, which is basically what I proposed. This approach would help macports, but not tryhard distros like debian and fedora as they can't use an internet connection during the build. @jamessan's suggestion of vendoring parsers would overall help more distros (including macports), but is also more ambitious. That is probably a decent long term goal/workaround. |
There are two suggestions here. We should not mix them:
|
Completely out of question. Don't waste your time trying to argue about that. |
Alright, then I think this entire issue now boils down to just the question whether moving the tree-sitter parser handling from cmake.deps to the main cmake build is possible and sensible. |
As a PoC, macports now uses this patch to integrate the parsers into the toplevel cmake: https://github.com/macports/macports-ports/blob/77f8d99816f5e9ac3ef7869a772f79f1d26f8f24/editors/neovim/files/embed-parsers-build.diff |
Problem
During packaging for MacPorts (macports/macports-ports#24120 (comment)), the question came up why the cmake build system (under cmake.deps/) is strictly separate from the toplevel CMakeLists.txt, requiring either multiple manual steps or using the additional wrapper Makefile. This has become especially cumbersome since as of v0.10.0, neovim expects multiple builtin tree-sitter parsers to be available at <prefix>/lib/nvim/parser or otherwise will throw runtime errors on common tasks like opening :help.
Even worse is the fact that the toplevel cmake silently does or does not use the parsers in .deps depending on whether they were built manually before, so forgetting to build them can easily go through unnoticed, which is what initially happened in MacPorts. Using the wrapper Makefile is not a good option there.
CMake has extensive builtin functionality for embedding other cmake roots, from e.g. a simple add_subdirectory to ExternalProject/FetchContent. Having cmake.deps/ as a standalone-buildable project makes sense because you can point multiple neovim builds to it. But I was not able to find reasons in the codebase or past issues why there had to be a wrapper-Makefile around the toplevel cmake for tasks that cmake itself should be able to handle just fine.
So my question is: Was this a deliberate design decision and what was the reasoning behind it?
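Until the build itself fails loudly, a packager could guard against the silent omission described above with a post-install check. The following is a rough sketch, not part of Neovim or MacPorts: the function name missing_parsers and the EXPECTED_PARSERS tuple are hypothetical, and the parser names in it are only examples. The one detail taken from this report is the <prefix>/lib/nvim/parser location.

```python
from pathlib import Path
from typing import Iterable, List

# Illustrative names only (an assumption); substitute the parsers that the
# packaged Neovim version actually bundles.
EXPECTED_PARSERS = ("c", "lua", "vimdoc")


def missing_parsers(prefix: str,
                    names: Iterable[str] = EXPECTED_PARSERS) -> List[str]:
    """Return the parser names with no file under <prefix>/lib/nvim/parser."""
    parser_dir = Path(prefix, "lib", "nvim", "parser")
    if not parser_dir.is_dir():
        # No parser directory at all: everything is missing.
        return list(names)
    # "<name>.*" keeps the check independent of the shared-library suffix.
    return [name for name in names if not any(parser_dir.glob(f"{name}.*"))]
```

A packaging script could call missing_parsers on the staging prefix and abort if the returned list is non-empty, which turns the unnoticed runtime error into a packaging-time failure.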
Expected behavior
I could imagine a build system like this:
cmake.deps/ stays mostly as-is: A standalone project for bundled dependencies
.deps folder, told to not use bundled dependencies at all, or it will embed cmake.deps/ itself using something like ExternalProject and build and manage a specified subset of dependencies in its own build dir. It should fail the build if building any of the dependencies fails.