Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Treesitter: 'query_gc, tree_gc, treecursor_gc, etc' are never called (aka memory leaks) #14216

Open
chentoast opened this issue Mar 26, 2021 · 9 comments
Labels
bug issues reporting wrong behavior treesitter

Comments

@chentoast
Copy link
Contributor

  • nvim --version: latest master

Steps to reproduce using nvim -u NORC

In src/nvim/lua/treesitter.c, add logging messages to each of the various __gc methods for treesitter userdata: tree_gc, query_gc, parser_gc, treecursor_gc, querycursor_gc.
Then, run

NVIM_LOG_FILE=log build/bin/nvim src/nvim/eval.c
:bw
:lua collectgarbage('collect')
:e log

Actual behaviour

None of the logging messages show up, indicating that none of the treesitter machinery has been garbage collected.

I believe that garbage collection is not being called because there are some dangling references to these items from C code - mostly from nvim_buf_attach or set_decoration_provider. For example, tree_gc and parser_gc were not being called because we did not free the nvim_buf_attach functions entirely: making the following modification to buffer_updates.c fixes this issue for those two functions:

--- a/src/nvim/buffer_updates.c
+++ b/src/nvim/buffer_updates.c
@@ -410,4 +410,7 @@ static void free_update_callbacks(BufUpdateCallbacks cb)
 {
   api_free_luaref(cb.on_lines);
   api_free_luaref(cb.on_changedtick);
+  api_free_luaref(cb.on_bytes);
+  api_free_luaref(cb.on_detach);
+  api_free_luaref(cb.on_reload);
 }

However, for some reason, I can't seem to ever get query_gc, treeecursor_gc, querycursor_gc to be called at all, meaning that a ton of memory allocated for lua userdata is never freed. Reloading eval.c several times already brings up neovim memory usage to around 10% of my system's memory.

I'll need to investigate this issue further, but I would appreciate some help in this - I've been trying to fix this for a while now and I'm still stuck. Alternatively, it's possible that this isn't actually a problem at all and the leaks are coming from elsewhere.

@bfredl

@chentoast chentoast added the bug issues reporting wrong behavior label Mar 26, 2021
@bfredl
Copy link
Member

bfredl commented Mar 26, 2021

Hmm at least tree_gc should get called frequently. We have many tricky references to the parser itself, but the parser should only hold on to the very last tree. It is recommended that plugins only keep references to the parser, and use parser:tree() instead of keeping a non-local reference to the tree (or individual nodes, which is equivalent as you can "reach" the root from any node).

@bfredl
Copy link
Member

bfredl commented Mar 26, 2021

regarding lua references from c code, we can probably ad a simple counter for lua_ref / lua_unref and check it in free_all_mem in the ASAN/UBSAN build (and failure exit if it is non-zero).

@chentoast
Copy link
Contributor Author

chentoast commented Mar 27, 2021

Looks like the querycursors are not freed because of some functions set using nvim_set_decoration_provider; freeing all of the luarefs in decor_providers->items seems to garbage collect most (but not all) querycursors. However, ASAN still says that there are some that remain allocated.

This is slightly problematic though; the decoration providers are meant to be active for pretty much the lifetime of Neovim, meaning that there will be a ton of querycursors that end up never being freed unless we explicitly make a call to api_free_luaref on buffer unload. Maybe we should rewrite the the treesitter module to use weak rather than strong references?

@bfredl
Copy link
Member

bfredl commented Mar 27, 2021

the decoration_provider callbacks doesn't "own" anything by upvalues or otherwise, they are just global entry points that look up the actual state per bufnr. TSHighlighter:destroy() should be called when the specific buffer gets unloaded/deleted, but perhaps this doesn't happen?

@bfredl
Copy link
Member

bfredl commented Mar 27, 2021

in addition, a _on_end callback could be added to set self.iter to nil which will free the cursor for a hidden but not unloaded buffer.

@chentoast
Copy link
Contributor Author

Yeah I had also realized I was wrong re: the decor providers. I will have to think about this some more

@bfredl
Copy link
Member

bfredl commented Mar 27, 2021

BTW in #14226 I will try to add check for missing api_free_luaref to the ASAN build (with informative tracebacks)

@chentoast
Copy link
Contributor Author

It seems (ofc, I may be wrong again) that the source of the leaks is from the closure returned from Query:iter_captures. For some reason, this function doesn't ever seem to go out of scope, even when the highlighter instance is destroyed, although I can't for the life of me figure out why.

Can be verified by adding the following lines to Query:iter_captures:

@@ -383,9 +385,12 @@ function Query:iter_captures(node, source, start, stop)
   start, stop = value_or_node_range(start, stop, node)
 
   local raw_iter = node:_rawquery(self.query, true, start, stop)
+  local tmp = newproxy(true) -- create a blank userdata
+  getmetatable(tmp).__gc = function() print("HELP MEEEE") end
   local function iter()
     local capture, captured_node, match = raw_iter()
     local metadata = new_match_metadata()
+    local a = tmp
 
     if match ~= nil then

Opening the buffer then wiping it, and running the GC doesn't print anything, even though it should.

@bfredl
Copy link
Member

bfredl commented Apr 1, 2021

Is this with fixing the above known problems ( i e like in #14226)? I will try to look more into it but if we cannot root case it we can always add a few extra local_var = nil with a TODO to debug GC issues more after 0.5 release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug issues reporting wrong behavior treesitter
Projects
None yet
Development

No branches or pull requests

3 participants