-
-
Notifications
You must be signed in to change notification settings - Fork 6.1k
perf(treesitter): partial injection and incremental invalidation #26827
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
bffa2b6
to
53c9cc7
Compare
-- match each pattern separately. | ||
if not self._has_combined_injection and range ~= true then | ||
start_line = math.max(start_line, range_start_line) | ||
end_line = math.min(end_line, range_end_line) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are range_start_line
, range_end_line
all non-nil? (range = true
) possibly need a nil check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this branch range
must have type Range
, so range_start_line
, range_end_line
must've been set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is the case then I think range
should not be defined to allow a boolean
type at all? Rather it should be typed only as Range
, no?
It is confusing to see it span a boolean type as well, and especially since passing in true
leads to nil access in the math.max
if | ||
vim.tbl_contains(preds, function(pred) | ||
return vim.deep_equal(pred, { 'set!', 'injection.combined' }) | ||
end, { predicate = true }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: Can injection.combined
be extracted from metadata
(see LanguageTree:_get_injection
) or do we have to search all the predicates/directives by a for loop? If so, can you elaborate more on why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's too late at that point. _get_injection
is invoked after iter_matches
, but we need to check the existence of combined injection before calling iter_matches
to compute the start_line
/end_line
arguments.
53c9cc7
to
9650455
Compare
e0ffa59
to
433257f
Compare
This comment was marked as resolved.
This comment was marked as resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good, but I'm not really convinced with the _batch_*
method here. It makes the languagetree significantly more complicated and is only optimized for the highlighter. Any plugins that need to parse specific buffers with a range aren't going to see any of the benefits.
What I suggest we do is to look at what Helix has done with incremental injections, and see if we can borrow anything from them. https://github.com/helix-editor/helix/blob/0c81ef73e17a3d45cd6240fd5933ad99b3a60d01/helix-core/src/syntax.rs#L1012
Once we've solved all our invalidation bugs, then maybe we can rethink the parse()
API to better handle multiple ranges due to multiple windows, if adding/removing the languages really is a significant cost. I suspect it isn't common enough for a buffer to be visible in multiple windows at once and thus this isn't something we should be optimizing for.
Another route (and potentially better) would be to make changes to the decoration provider API to better accommodate batch updates, i.e. by calling parse()
once after all the on_wins
have been invoked so we can collect the ranges.
I don't think it's that uncommon -- having the the definition of a (local) function and its calling site visible at the same time seems like something people do regularly (at least I do). I'd expect this to be more common the longer the file (and thus the more expensive the parsing), too. Whether that warrants specific optimization (you'll probably only editing one site) is another question, of course. |
---@param exclude_children boolean|nil whether to ignore the validity of children (default `false`). | ||
--- `is_valid(false)` is sound but not complete, i.e., | ||
--- * if it returns `true`, this LanguageTree is actually valid; but | ||
--- * even if this LanguageTree is actually valid, it may return `false`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This documenting bugs and defining it has a part of the API.
is_valid(false)
should always return true if the languagetree is valid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
making it complete is possible but that would incur quite a bit of overhead (see note below), and i thought that kinda defeats the purpose of this function (quickly and soundly check necessity of parse(true)
). i will implement it when i return
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed by running full injection in is_valid(false)
3d1560b#diff-70b0f3fa7d6baf9e605400a9f0e10431183df821ec4096e8097eaa5f4d5531baR310-R314
Sure, I do this all the time. However, at the very most this is in a split with the buffer in at most 2 windows. Each additional window is exponentially less likely. |
iiuc helix doesn't do paritial injection: https://github.com/helix-editor/helix/blob/0c81ef73e17a3d45cd6240fd5933ad99b3a60d01/helix-core/src/syntax.rs#L1097 the batching is primarily for correcteness of highlights. without it the highlights will ficker when the buffer is shown in multiple windows as the regions created by on_win of a window are discarded by on_win of another window. i agree that better decoration api for batching is a more natural approach |
That shouldn't be the root cause of the flicker. Once all the lines for a window have been drawn during a redraw cycle, subsequent windows shouldn't cause flicker in previous windows unless something else is invoking another redraw. But also, the regions being discarded is just a bug in our highlighter we need to fix. We shouldn't be layering on API methods to work around this.
Ah true, it's only the parsing that is incremental, but we already do that. However I think their invalidation logic is less buggy than ours, so that's probably what we should fix first. |
hmm in fact, I can't no longer reproduce this flicker. So the only valid role of batching is optimization: prevention of discarding regions and adding/removing children.
In this PR,
So there should be some mechanism to discard stale regions. This PR eagerly removes stale region, which necessitates batching. But it can be avoided if we use less eager method such as assigning some lifespan to each region.
IIUC Helix is not incremental in the sense that it doesn't care about what's visible on the screen. Also I think their invalidation logic isn't much different from this PR's. Here's a summary of their approach. Their
A notable difference from nvim is that |
eb3e69c
to
dd129fe
Compare
21b6df5
to
3af56c9
Compare
3af56c9
to
9831d29
Compare
9831d29
to
1cc6f4d
Compare
Yeah I'm not arguing the performance aspect. This was my understanding:
Let me know if any of that is incorrect. But if not I am just saying that we should not worry about that special case because we are doing more work to support a current aspect of injection functionality which is a bug (by treesitter's definition) |
Just to give an example to clarify the differences: Neovim's injection.combined (for comment content(!) only, not comment markers): -- print([[ some
-- text
-- here]]) in Neovim the comment markers before in (If it was not one singular tree, e.g. |
I forgot neovim still has that bug, but I think that is beside the point? Even if we correctly excluded the comment markers from the combined region, we would still want to disable incremental injections for combined injections because we need to scan the whole document to get the full combined region. Whether or not we've collected the ranges correctly is a different matter. |
Gotcha, that makes sense. I had thought the "partial" aspect of this PR only applied to the query iteration + highlight applications. But yeah you're right if it also includes the actual ranges of what to parse then it would have to stay |
dcf7fdb
to
717d647
Compare
return vim.deep_equal(directive, { 'set!', 'injection.combined' }) | ||
end, { predicate = true }) | ||
then | ||
self.has_combined_injection = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this logic should be removed, and all special cases regarding combined injections here should be removed also. Per my previous comment. As Lewis pointed out, things are different when it comes to parsing, but querying should only run on the injection ranges, even for combined injections. Combined injections only apply to the tree document content, not to the ranges for which highlights should be applied
EDIT: Sorry, I see now that this function is actually used to set the ranges for parsing... that is unfortunate. I wonder if it would be acceptable to only parse injected ranges as combined if they are all currently in the window. That is, only combine visible regions which did fit within the original range parameter. That way we still get partial querying gains on queries with at least one combined injection
How does this currently work with non-combined injections that partially clip the window? E.g.
// multi
// line <-- window first line here
// injection
// here
because if it includes the whole range, to me seems like we could just do the same with partially visible combined regions (include the full range), combine only those which we have deemed visible, and call it a day
I'd even say this is preferred because it will largely eliminate the confusing behavior of injection parsing being negatively affected by combined injections far away in the buffer (not really meant to be combined in other parts), hence the desire in some cases for "scoped injections".
-- If the query doesn't have combined injection, run the query on the given range. Combined | ||
-- injection must be run on the full range. Currently there is no simply way to selectively | ||
-- match each pattern separately. | ||
if range ~= true and not self._injection_query.has_combined_injection then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly here, I think this check should be removed. And the other one earlier in this file
-- match each pattern separately. | ||
if not self._has_combined_injection and range ~= true then | ||
start_line = math.max(start_line, range_start_line) | ||
end_line = math.min(end_line, range_end_line) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is the case then I think range
should not be defined to allow a boolean
type at all? Rather it should be typed only as Range
, no?
It is confusing to see it span a boolean type as well, and especially since passing in true
leads to nil access in the math.max
@@ -439,11 +448,6 @@ end | |||
--- only the root tree without injections). | |||
--- @return table<integer, TSTree> | |||
function LanguageTree:parse(range) | |||
if self:is_valid() then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this removal cause unnecessary reparsing? Is it possible to maybe pass in a range to is_valid()
, which will check if the languagetree is valid for that range? And maybe then we could also remove the
self:_add_injections(true)
self._injections_processed = true
side effect in is_valid()
(not sure if that would be possible after this change though)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my reply to your comment on is_valid()
documentation.
@@ -1551,7 +1551,8 @@ LanguageTree:invalidate({reload}) *LanguageTree:invalidate()* | |||
LanguageTree:is_valid({exclude_children}) *LanguageTree:is_valid()* | |||
Returns whether this LanguageTree is valid, i.e., |LanguageTree:trees()| | |||
reflects the latest state of the source. If invalid, user should call | |||
|LanguageTree:parse()|. | |||
|LanguageTree:parse()|. `is_valid(false)` can be slow because it runs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be really good to avoid this side effect in is_valid()
, if possible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR's region management strategy only retains the regions that intersect the requested parsing range. But computing is_valid(false)
requires the knowledge of all possible regions, including the ones that are not currently tracked. This necessitates running injection.
Actually, I would say is_valid(false)
is kind of unnecessary. The thing you would mostly likely to do after seeing is_valid(false) == false
is parse(true)
. The is_valid(false) == false
guard for parse(true)
would be useful if it avoids the cost of parse(true)
in the is_valid(false) == true
case. But if you run parse(true)
in the is_valid(false) == true
case, parse(true)
will be no-op. So the is_valid(false) == false
guard isn't really meaningful. This is why I removed is_valid()
check in _parse()
.
local buf_ranges = {} ---@type table<integer, (Range)[]> | ||
for _, win in ipairs(api.nvim_tabpage_list_wins(0)) do | ||
local buf = api.nvim_win_get_buf(win) | ||
if TSHighlighter.active[buf] then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is false, will there be a nil
access a few lines below, where the parse is called?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The loop below loops over buf_ranges
which contains buffers for which TSHighlighter.active[buf]
is non-nill.
717d647
to
9774bd3
Compare
Problem: After an edit that changes the number of injection regions, the LanguageTree drops all the existing trees. This inefficient because the injections should be parsed from scratch. Solution: When setting included regions, match them with the existing regions so that they can be reparsed incrementally. This uses a table that maps region values to their indices. Regions are matched by "similarity", because some changes of regions cannot be precisely tracked by `_edit()`. Breaking change: The indices of `parser:trees()` behave now differently because existing regions are reused. So `parser:parse(true)` does not ensure that the tree table is list-like. Also, when new regions are added manually, they are first added and then the stale regions are discarded. So the existing uses of `trees[1]` may break. Use `next(trees())` instead.
Problem: Executing injection query on the full source is slow. Solution: Execute injection query only on the given range. Notes * This is not applicable to languages with combined injection. * `is_valid(false)` should run full injection to determine if the current set of children parsers and their regions are complete. Since this can be slow, `parse()` no longer checks this at the beginning. * Children parsers and regions outside the given range are discarded.
Problem: Partial injection invalidates regions and children parsers outside the visible range (passed to `parse`). Invalidating non-matching regions for each `parse()` is not efficient if multiple windows display different ranges of the same buffer. Solution: Let `parse()` take set of ranges, and invoke `parse()` for all visible ranges at `on_start`.
|LanguageTree:trees()| no longer guarantees that the returned table is list-like even after a full parse. NOTE: Actually, they should work fine without modification in these specific cases. * dev.lua: For the root tree, the returned table is always `{ tree }` unless the private method `LanguageTree:set_included_regions` is used. * gen_help_html.lua: `trees()` is initially list-like in the very first parse. So `ipairs` should be fine if source is not modified. However, it seems better to "fix" them for consistency.
9774bd3
to
3c7cfe5
Compare
I believe I have a commit that implements the major perf gain of this patch with a small fraction of the diffstat- would you mind taking a look? 6454f81 Note that some of the implementation might be a bit hacky, like with |
I think this is a reasonable fragment. But for the record, it has the problems mentioned in the following two commits of the current PR, though the impact won't be too bad.
I think it's clearer to use |
That makes sense. Maybe these can be saved for a follow up PR if the impact isn't too bad? I have also noticed like you say that the incremental validation for injection trees isn't too bad since there usually is a very small amount of code parsed by them
That sounds much better 👍 |
Yes I agree. |
Would you prefer to edit this PR, or would you like me to make one? Feel free to use the patch I posted if you like; I also don't mind making a separate change |
Please make a new PR |
Am I correct in thinking that this PR is now superseded, at least in its current form, and that a fresh PR for the additional changes on top of master would be preferable? |
Yes. |
The two main bottlenecks in editing big files with many injections are (1) parsing and (2) processing the injection queries. This PR resolves (2) by applying injection query to only visible regions.
Benchmark:
cpp
(.h
file is recognized ascpp
filetype by default):let g:__ts_debug = 1
:e dcn_3_2_0_sh_mask.h
,:lua vim.treesitter.start()
G?//<CR>
), then append a space (A<Space><Esc>
)j
) and append 0 (A0<Esc>
)~/.local/state/nvim/treesitter.log
before
after
Evaluation