Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Background tree-sitter incremental parsing to improve TextArea responsiveness. #5645

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

paul-ollis
Copy link
Contributor

@paul-ollis paul-ollis commented Mar 13, 2025

Tree-sitter's incremental parsing is not always fast enough to be
executed for every editing keystroke without providing a very poor user
experience. For example:

"""Docstring with closing quotes not yet added <cursor here>

import textual
...

Typing into the above docstring can become very slow. For a 25,000 line
Python file on my laptop, each change causes a reparse time of about
0.2--0.3 seconds: editing is painful.

This change decouples incremental parsing from TextArea edits, using
tree-sitter's timeout mechanism and a task to effectivley run parsing in
the background, on a snapshot of the TextAreas's contents. While parsing
is in progress, editing of the TextArea text continues in a responsive
manner.

Edits to the tree-siiter parse tree are buffered until the background
parser is able to process them, while edits to the displayed text are
applied as they occur.

Please review the following checklist.

  • [x ] Docstrings on all new or modified functions / classes
  • [ n/a] Updated documentation
  • [ x] Updated CHANGELOG.md (where appropriate)

The TextArea widget code queries the entires syntax tree for each edit,
using the tree-sitter Query.captures method. This has potential scaling
issues, but such issues are exacerbated by the fact the Query.captures
method scales very poorly with the number of line it is asked to
generate captures for. It appears to be quadratic or something similar I
think - I strongly suspect a bug in tree-sitter or its python bindings.

On my laptop, this makes editing a 25,000 line Python file painfully
unresponsive. A 25,000 lines Python file is large, but not entirely
unreasonable. I actually became aware of this behaviour developing a
LaTeX editor, which showed symptoms after about 1,000 lines.

This change replaces the plain TextArea._highlights dictionary with a
dictonary-like class that lazily performs item access to build
hilghlight information for small blocks of (50) lines at a time. As well
as keeping the TextArea very much more responsive, it will reduce the
average memory requirements for larger documents.

During regression testing, I discovered that the per-line highlights are
not necessarily in the correct (best) order for syntax highlighting. For
example, highlighting within string expressions can lost. So I added
suitable sorting. This required that the snapshots for some tests to be
updated.
Tree-sitter's incremental parsing is not always fast enough to be
executed for every editing keystroke without providing a very poor user
experience. For example:

```python

"""Docstring with closing quotes not yet added <cursor here>

import textual
...
```

Typing into the above docstring can become very slow. For a 25,000 line
Python file on my laptop, each change causes a reparse time of about 0.2
- 0.3 seconds: editing is painful.

This change decouples incremental parsing from TextArea edits, using
tree-sitter's timeout mechanism and a task to effectivley run parsing in
the background, on a snapshot of the TextAreas's contents. While parsing
is in progress, editing of the TextArea text continues in a responsive
manner.

Edits to the tree-siiter parse tree are buffered until the background
parser is able to process them, while edits to the displayed text are
applied as they occur.
@paul-ollis
Copy link
Contributor Author

This incorporates changes from #5642 and will need rebasing if and when that PR
is merged or rejected.

I can imaging that this might inspire ideas for possible different approaches
and am more than happy to discuss and help update this PR with any you might have.

Things I am not entirely happy about.

  • In TextArea._handle_syntax_tree_update() forces a refresh of the entire
    visible area, which feels heavy handed.

    I think issue Investigate reducing TextArea updates #4086 might be relevant here.

  • A small number of the BackgroundSyntaxParser.trigger_syntax_tree_update tasks
    do not get cleaned up when the tests are executed, causing a variable number
    of 'Task was destroyed but it is pending!' messages to be printed.

    I have not yet convinced myself that this will not occur for normal
    applications.

@paul-ollis paul-ollis changed the title Bacground tree-sitter increment parsing to improve TextArea responsiveness. Bacground tree-sitter incremental parsing to improve TextArea responsiveness. Mar 13, 2025
@paul-ollis paul-ollis changed the title Bacground tree-sitter incremental parsing to improve TextArea responsiveness. Background tree-sitter incremental parsing to improve TextArea responsiveness. Mar 13, 2025
A first attempt at using changed_ranges to reduce the amount of
redrawing triggered by completion of a tree-sitter incremental parse.
@paul-ollis
Copy link
Contributor Author

I have updated this to make some use of Tree.updated_ranges().

This only reduces the additional redrawing that can (perhaps always) occur as the result of parsing completing in the background.

So as it stands, I think this PR exacerbates the problem that #4086 is trying to solve. I will perform some measurements and report them here - probably tomorrow.

@paul-ollis
Copy link
Contributor Author

I ran some tests (with instrumented code) and this PRs TextArea does not seem to obviously call render_line more often than the 'main' version.

BUT Then I realised that the code is almost certainly broken for a TextArea using soft wrapping. I think more automatic tests are needed. So, as it stands, this PR is not in a fit state to merge.

I will spend some more time over the next week looking at this. My LaTeX editor really needs and I would prefer to be able to use an unmodified version of textual.

@darrenburns
Copy link
Member

The "wrapped_document" attribute in TextArea lets you translate from a position in the unwrapped document to the corresponding position in the wrapped view of the document - hopefully that can be useful in your investigation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants