Parsing seems slow

It looks like the parser is one of the slower parts of rust-analyzer:

```
6.01%  <rowan::green::GreenNode as core::hash::Hash>::hash
4.49%  _int_malloc
4.34%  ra_parser::parser::Parser::nth
3.66%  ra_syntax::syntax_node::SyntaxTreeBuilder::finish_node
3.64%  <ra_syntax::parsing::text_token_source::TextTokenSource as ra_parser::TokenSource>::lookahead_nth
3.51%  _int_free
3.04%  malloc
2.82%  ra_parser::event::process
2.71%  alloc::sync::Arc<T>::drop_slow
2.67%  ra_mbe::subtree_source::SubtreeTokenSource::get
2.43%  <smol_str::SmolStr as core::hash::Hash>::hash
1.98%  __memmove_avx_unaligned_erms
1.72%  <rowan::cursor::SyntaxNode as core::ops::drop::Drop>::drop
1.68%  unlink_chunk.isra.0
1.54%  <alloc::vec::Vec<T> as alloc::vec::SpecExtend<T,I>>::from_iter
1.45%  malloc_consolidate
1.43%  ra_syntax::parsing::text_tree_sink::TextTreeSink::do_token
1.42%  ra_rustc_lexer::<impl ra_rustc_lexer::cursor::Cursor>::advance_token
1.25%  rowan::cursor::FreeList::try_push
1.21%  <rowan::cursor::SyntaxNodeChildren as core::iter::traits::iterator::Iterator>::next
1.19%  realloc
0.99%  <core::iter::adapters::Map<I,F> as core::iter::traits::iterator::Iterator>::next
0.93%  <ra_mbe::subtree_source::SubtreeTokenSource as ra_parser::TokenSource>::lookahead_nth
```

Besides the green node hashing for which there is a TODO in `rowan` and the memory allocation, `Parser::nth` takes the most time while running `ra_cli analysis-stats`, especially once you factor its children from `TextTokenSource` and `SubtreeTokenSource`.

Note that `nth` is called very often, via `current{,2,3}` and `at`. The `n` ranges from 0 to 2, with an exception of 3 during parameter parsing. Looking at `nth`, it seems that:

 - it calls `token_source.lookahead_nth(i)` to get the current token kind
 - it passes the kind to `is_composite`
 - `is_composite` calls `token_source.lookahead_nth` for `i`, `i + 1`, `i + 2`
 - the things above happen in a loop from 0 to `n`
 - `nth` itself is often called by the parser in a row with 0..3
 - there's also `current3`, `current2`, `current`, which call `nth` three, two and one time
 - these functions are themselves called in a row:

```rs
    if let Some(t) = p.current3() {
        // ...
    }
    if let Some(t) = p.current2() {
        // ...
    }
    match p.current() {
        // ...
    }
```

These cause `token_source.lookahead_nth` to be called a large number of times, and that function in itself is not necessarily inexpensive.

While there is some low-hanging fruit like avoiding the `first` lookahead in `is_composite` (the caller already knows it) and only computing `second` and `third` if they're needed, it might be better to keep track of the current and next three tokens in `bump` or somewhere similar.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Parsing seems slow #1643

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Parsing seems slow #1643

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions