Treesitter performance (tracking issue) #22426

lewis6991 · 2023-02-27T09:54:16Z

Problem

Our treesitter implementation could be more efficient, which would address many known performance issues, particularly with large files.

Injection languages are not incremental
See: Treesitter: LanguageTree is not incremental #18107
Resolved with: perf(treesitter): smarter languagetree invalidation #22309
Rendering long lines is inefficient
See: treesitter incredible slow with single line big json due to hotspot in nvim_buf_set_extmark #14756
Resolved with: perf(treesitter): use columns to filter matches in TSHighlighter #15405
Parsing blocks user input
This is especially bad for the initial parse since this takes the longest.

Either:
(a): We run parsing on a separate thread (difficult).
(b): Use tree-sitter's parsing timeout feature to segment the parse over several event-loop iterations (less difficult).

See: Treesitter hangs neovim when trying to open a big log file with JSON lines #20765
Resolved with: (b) feat(treesitter): async parsing #22420
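
Option (b) can be illustrated with a small, self-contained sketch. This is not the code from #22420: `ResumableParser`, `parse_in_slices` and `demo` are hypothetical names, and a per-call step budget stands in for tree-sitter's time-based timeout so that the example is deterministic. The idea it shows is only the general one: each call does a bounded amount of work, returns nothing if unfinished, and the driver yields to the event loop before resuming.

```python
import asyncio
from typing import List, Optional, Tuple


class ResumableParser:
    """Hypothetical stand-in for a parser with a per-call work budget.

    tree-sitter's timeout is time-based; a step budget is assumed here
    so the behaviour is reproducible.
    """

    def __init__(self, source: str) -> None:
        self.lines = source.splitlines()
        self.pos = 0                      # how far earlier calls got
        self.tokens: List[str] = []

    def parse(self, budget: int) -> Optional[List[str]]:
        """Consume up to `budget` lines; return None if not finished yet."""
        end = min(self.pos + budget, len(self.lines))
        self.tokens.extend(line.strip() for line in self.lines[self.pos:end])
        self.pos = end
        return self.tokens if self.pos == len(self.lines) else None


async def parse_in_slices(parser: ResumableParser, budget: int) -> Tuple[List[str], int]:
    """Drive the parser to completion, yielding between slices."""
    slices = 0
    while True:
        slices += 1
        result = parser.parse(budget)
        if result is not None:
            return result, slices
        # Let other tasks (for example input handling) run before resuming.
        await asyncio.sleep(0)


async def demo() -> Tuple[List[str], int, int]:
    ticks = 0

    async def input_handler() -> None:
        # Stands in for user input being processed between parse slices.
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0)

    handler = asyncio.create_task(input_handler())
    source = "\n".join(f"line {i}" for i in range(10))
    result, slices = await parse_in_slices(ResumableParser(source), budget=3)
    handler.cancel()
    return result, slices, ticks
```

Running `asyncio.run(demo())` parses the ten lines in several slices while the stand-in input handler gets a turn between them. A smaller budget improves responsiveness at the cost of more resumptions.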
Injection queries are run too often
Currently we run the full injection query on every edit to a buffer.
- Incrementally run the query on tree changes (perf(treesitter): only run injection query on tree changes #23070).
  - ... on tree changes that intersect the injection regions.
- Partially run the query for rendered lines.
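
A minimal sketch of the "changes that intersect the injection regions" refinement, assuming ranges are half-open row intervals. The names `intersects` and `dirty_regions` are hypothetical and not part of the LanguageTree API. The sketch only decides which existing regions are touched by an edit; changed ranges outside every region would still have to be queried to discover new injections.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # half-open row range [start, end)


def intersects(a: Range, b: Range) -> bool:
    """True if the two half-open ranges share at least one row."""
    return a[0] < b[1] and b[0] < a[1]


def dirty_regions(changed: List[Range], regions: List[Range]) -> List[int]:
    """Indices of injection regions touched by any changed range."""
    return [
        i
        for i, region in enumerate(regions)
        if any(intersects(region, c) for c in changed)
    ]


regions = [(0, 5), (10, 20), (30, 40)]
# Edits on rows 4-5 and row 35 touch the first and third regions only.
touched = dirty_regions([(4, 6), (35, 36)], regions)
```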
Injections are not highlighted incrementally
Currently every injection is parsed, even if it never gets displayed.
Resolved with: feat(treesitter)!: incremental injection parsing #24647

Other

Add better benchmarking to the testsuite (perf(treesitter): only run injection query on tree changes #23070)

Files for benchmarking:

clason · 2023-02-27T10:35:27Z

Another issue is incremental querying, which may help with expensive injection predicates as in nvim-treesitter/nvim-treesitter#2996?

lewis6991 · 2023-02-27T10:46:50Z

This will be partially resolved by just running the queries less in #22394.

clason · 2023-03-08T14:29:08Z

"Bad" files for performance testing: https://raw.githubusercontent.com/torvalds/linux/master/drivers/gpu/drm/amd/include/asic_reg/bif/bif_5_1_sh_mask.h
https://raw.githubusercontent.com/torvalds/linux/master/drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_3_2_0_sh_mask.h

(cpp header files consisting of, respectively, 33k and 220k injections of cpp into preprocessor macros)

lewis6991 · 2023-08-12T15:24:10Z

Idea from chat: Highlighter should not parse off-screen injections (should help with those files).

#24647 is now merged which should improve the performance of large files with lots of injections.

tomtomjhj · 2023-12-25T07:07:03Z

> Partially run the query for rendered lines.

This would require separate queries for non-combined and combined injections (or some other significant API change). A combined injection query should be run on the entire source, because the invisible part of the region affects the visible part, just as with the root language. So partial query execution (using the start and stop params of iter_matches) should be applied only to non-combined injections. However, there is currently no way to iterate over only the matches that have some specific metadata. The simplest solution I can think of is to have two separate query files, one for combined and one for non-combined injections.
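
The split described above can be sketched as follows. Everything here is an illustrative assumption: `InjectionPattern`, `split_patterns` and `query_range` are hypothetical names, not Neovim APIs, and ranges are half-open row intervals.

```python
from dataclasses import dataclass
from typing import List, Tuple

Range = Tuple[int, int]  # half-open row range [start, end)


@dataclass(frozen=True)
class InjectionPattern:
    """Hypothetical description of one injection pattern."""
    language: str
    combined: bool  # True if all matches form a single combined region


def split_patterns(
    patterns: List[InjectionPattern],
) -> Tuple[List[InjectionPattern], List[InjectionPattern]]:
    """Partition patterns, mirroring the two-query-file idea."""
    combined = [p for p in patterns if p.combined]
    separate = [p for p in patterns if not p.combined]
    return combined, separate


def query_range(pattern: InjectionPattern, visible: Range, whole: Range) -> Range:
    """Rows over which the pattern has to be run."""
    # Combined regions depend on off-screen text, so they need the whole source.
    return whole if pattern.combined else visible


patterns = [InjectionPattern("comment", False), InjectionPattern("php", True)]
combined, separate = split_patterns(patterns)
```

With a visible window of rows 100 to 150 in a 10000-row buffer, `query_range` would return the window for the `comment` pattern and the full buffer for the `php` pattern.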

lewis6991 · 2023-12-25T15:02:57Z

Non-combined is the norm, so if combined is detected such optimisations should just be disabled.

I think adding additional states could overcomplicate things for not a lot of return, and running the query twice would just be worse overall.

tomtomjhj · 2023-12-25T16:33:52Z

> Partially run the query for rendered lines.

Another issue with this is that it's difficult to make injected region parsing incremental. This requires some sort of persistent ID for each region and its corresponding tree, so that the old tree can be reused when parsing that region again. The current implementation of LanguageTree simply uses the index of the region in the whole list of regions captured by the last execution of the injection query, which is frequently invalidated. Unfortunately, treesitter doesn't seem to provide such functionality. Using equality of the region value could be an option, but that looks inefficient.

Even if there were some ID mechanism that allowed incremental parsing of injected regions, I think it would still be inefficient, because all those regions would have to be adjusted after each edit. So the parser would need to discard trees (e.g., with an LRU cache) to balance the cost of maintaining the existing trees against parsing new regions from scratch.
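
As a rough illustration of the trade-off, here is a sketch of an LRU cache of trees keyed by region value. It assumes a region can be represented as a hashable tuple of (start, end) ranges; `TreeCache` and `parse_region` are hypothetical names, and the `parse` callback stands in for a parser that accepts an optional old tree. Keying by value also inherits the limitation noted above: an edit that shifts a region changes its key unless cached regions are adjusted as well.

```python
from collections import OrderedDict
from typing import Callable, Optional, Tuple

Region = Tuple[Tuple[int, int], ...]  # hashable tuple of (start, end) ranges


class TreeCache:
    """Least-recently-used cache of parsed trees, keyed by region value."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._trees: "OrderedDict[Region, object]" = OrderedDict()

    def get(self, region: Region) -> Optional[object]:
        tree = self._trees.get(region)
        if tree is not None:
            self._trees.move_to_end(region)  # mark as most recently used
        return tree

    def put(self, region: Region, tree: object) -> None:
        self._trees[region] = tree
        self._trees.move_to_end(region)
        while len(self._trees) > self.capacity:
            self._trees.popitem(last=False)  # evict least recently used

    def __len__(self) -> int:
        return len(self._trees)


def parse_region(
    cache: TreeCache,
    region: Region,
    parse: Callable[[Region, Optional[object]], object],
) -> object:
    """Parse a region, handing the cached old tree (if any) to the parser."""
    old_tree = cache.get(region)
    new_tree = parse(region, old_tree)
    cache.put(region, new_tree)
    return new_tree
```

The capacity bounds how many trees have to be kept and adjusted; regions that fall out of the cache are simply reparsed from scratch the next time they are needed.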

Update: see #26827

lewis6991 · 2023-12-25T17:37:06Z

Helix apparently already does this so I suggest you look at their implementation.

lewis6991 added the bug, performance and treesitter labels and removed the bug label Feb 27, 2023


lewis6991 removed the bug label Mar 17, 2023

stsewd mentioned this issue Mar 28, 2023

Heavy performance regression for block-comments stsewd/tree-sitter-comment#17

Closed

aMOPel mentioned this issue May 28, 2023

[Feature Request] language injection alaviss/tree-sitter-nim#24

Closed

lewis6991 mentioned this issue May 31, 2023

Editing of large javascript file is very slow in current nvim-treesitter version nvim-treesitter/nvim-treesitter#2996

Closed


neovim deleted a comment from MKrbm Sep 18, 2023

justinmk mentioned this issue Sep 18, 2023

Tree-sitter based highlight may be inefficient #18108

Closed

wookayin mentioned this issue Sep 25, 2023

Treesitter performance: parsing query files are much slow #25356

Closed

MKrbm mentioned this issue Oct 9, 2023

Slow highlight for large cpp files nvim-treesitter/nvim-treesitter#5503

Closed

This was referenced Oct 18, 2023

better format string injection aMOPel/nvim-treesitter-nim#1

Open

feat(nim): added nim parser and queries nvim-treesitter/nvim-treesitter#5556

Merged

clason mentioned this issue Jan 24, 2024

LanguageTree:register_cbs { on_changedtree } reports a lot of false changes for injected languages #27155

Closed

wookayin mentioned this issue Jan 24, 2024

Treesitter performance in cpp: editing with treesitter highlightning is noticably slow #27184

Closed

vanaigr mentioned this issue Apr 12, 2024

Disable injections for large files nvim-treesitter/nvim-treesitter#6435

Closed
