-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Description
It looks like the parser is one of the slower parts of rust-analyzer:
6.01% <rowan::green::GreenNode as core::hash::Hash>::hash
4.49% _int_malloc
4.34% ra_parser::parser::Parser::nth
3.66% ra_syntax::syntax_node::SyntaxTreeBuilder::finish_node
3.64% <ra_syntax::parsing::text_token_source::TextTokenSource as ra_parser::TokenSource>::lookahead_nth
3.51% _int_free
3.04% malloc
2.82% ra_parser::event::process
2.71% alloc::sync::Arc<T>::drop_slow
2.67% ra_mbe::subtree_source::SubtreeTokenSource::get
2.43% <smol_str::SmolStr as core::hash::Hash>::hash
1.98% __memmove_avx_unaligned_erms
1.72% <rowan::cursor::SyntaxNode as core::ops::drop::Drop>::drop
1.68% unlink_chunk.isra.0
1.54% <alloc::vec::Vec<T> as alloc::vec::SpecExtend<T,I>>::from_iter
1.45% malloc_consolidate
1.43% ra_syntax::parsing::text_tree_sink::TextTreeSink::do_token
1.42% ra_rustc_lexer::<impl ra_rustc_lexer::cursor::Cursor>::advance_token
1.25% rowan::cursor::FreeList::try_push
1.21% <rowan::cursor::SyntaxNodeChildren as core::iter::traits::iterator::Iterator>::next
1.19% realloc
0.99% <core::iter::adapters::Map<I,F> as core::iter::traits::iterator::Iterator>::next
0.93% <ra_mbe::subtree_source::SubtreeTokenSource as ra_parser::TokenSource>::lookahead_nth
Besides the green node hashing for which there is a TODO in rowan and the memory allocation, Parser::nth takes the most time while running ra_cli analysis-stats, especially once you factor its children from TextTokenSource and SubtreeTokenSource.
Note that nth is called very often, via current{,2,3} and at. The n ranges from 0 to 2, with an exception of 3 during parameter parsing. Looking at nth, it seems that:
- it calls
token_source.lookahead_nth(i)to get the current token kind - it passes the kind to
is_composite is_compositecallstoken_source.lookahead_nthfori,i + 1,i + 2- the things above happen in a loop from 0 to
n nthitself is often called by the parser in a row with 0..3- there's also
current3,current2,current, which callnththree, two and one time - these functions are themselves called in a row:
if let Some(t) = p.current3() {
// ...
}
if let Some(t) = p.current2() {
// ...
}
match p.current() {
// ...
}These cause token_source.lookahead_nth to be called a large number of times, and that function in itself is not necessarily inexpensive.
While there is some low-hanging fruit like avoiding the first lookahead in is_composite (the caller already knows it) and only computing second and third if they're needed, it might be better to keep track of the current and next three tokens in bump or somewhere similar.