Skip to content

Repo to profile pathological compile times with knuffel / chumsky

Notifications You must be signed in to change notification settings

zesterer/netherquote

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

netherquote

Showcase for pathological compile times when using knuffel / chumsky / VERY LARGE types.

How to reproduce

The rust toolchain version is pinned to 1.59.0 stable already, .cargo/config.toml defaults to lld but it probably doesn't make a big difference here.

cargo run finishes in reasonable time, but cargo run --release spends a long time in "netherquote (bin)".

cargo timings don't show much, rustc self-profile shows a bunch of time spent in thin-LTO, I don't know how to go much deeper.

$ summarize summarize netherquote-867123.mm_profdata | less
+-------------------------------------------------+-----------+-----------------+----------+------------+
| Item                                            | Self time | % of total time | Time     | Item count |
+-------------------------------------------------+-----------+-----------------+----------+------------+
| LLVM_passes                                     | 12.66s    | 24.018          | 12.69s   | 1          |
+-------------------------------------------------+-----------+-----------------+----------+------------+
| finish_ongoing_codegen                          | 11.80s    | 22.376          | 11.80s   | 1          |
+-------------------------------------------------+-----------+-----------------+----------+------------+
| LLVM_module_optimize                            | 8.90s     | 16.879          | 8.90s    | 17         |
+-------------------------------------------------+-----------+-----------------+----------+------------+
| LLVM_thin_lto_import                            | 4.89s     | 9.274           | 4.89s    | 16         |
+-------------------------------------------------+-----------+-----------------+----------+------------+
| LLVM_module_codegen_emit_obj                    | 4.71s     | 8.942           | 4.71s    | 17         |
+-------------------------------------------------+-----------+-----------------+----------+------------+
| LLVM_lto_optimize                               | 4.55s     | 8.629           | 4.55s    | 16         |
+-------------------------------------------------+-----------+-----------------+----------+------------+
| codegen_module_perform_lto                      | 1.40s     | 2.658           | 15.67s   | 16         |
+-------------------------------------------------+-----------+-----------------+----------+------------+
| codegen_module                                  | 1.12s     | 2.125           | 1.38s    | 16         |
+-------------------------------------------------+-----------+-----------------+----------+------------+
| codegen_module_optimize                         | 1.01s     | 1.921           | 9.91s    | 17         |
+-------------------------------------------------+-----------+-----------------+----------+------------+
| run_linker                                      | 232.54ms  | 0.441           | 232.54ms | 1          |
+-------------------------------------------------+-----------+-----------------+----------+------------+
| codegen_fulfill_obligation                      | 209.15ms  | 0.397           | 345.49ms | 2308       |
+-------------------------------------------------+-----------+-----------------+----------+------------+
| normalize_projection_ty                         | 201.78ms  | 0.383           | 206.45ms | 823        |
+-------------------------------------------------+-----------+-----------------+----------+------------+

cargo llvm-lines show some multiple-pages-long types: these are generated by one of knuffel's derive macro, which uses chumsky under the hood.

These generate a frightening amount of LLVM IR lines, considering config.rs is barely over a hundred lines:

$ cargo llvm-lines | less
  Lines          Copies       Function name
  -----          ------       -------------
  322815 (100%)  7117 (100%)  (TOTAL)
   38985 (12.1%)  107 (1.5%)  <chumsky::combinator::Then<A,B> as chumsky::Parser<I,(O,U)>>::parse_inner
   21584 (6.7%)   331 (4.7%)  core::result::Result<T,E>::map
   21132 (6.5%)    36 (0.5%)  <chumsky::combinator::Or<A,B> as chumsky::Parser<I,O>>::parse_inner
   14220 (4.4%)   158 (2.2%)  <chumsky::combinator::Map<A,F,O> as chumsky::Parser<I,U>>::parse_inner
   14148 (4.4%)   108 (1.5%)  <chumsky::combinator::Or<A,B> as chumsky::Parser<I,O>>::parse_inner::zip_with
   12336 (3.8%)    48 (0.7%)  <chumsky::combinator::Repeated<A> as chumsky::Parser<I,alloc::vec::Vec<O>>>::parse_inner::{{closure}}
   10154 (3.1%)   158 (2.2%)  <chumsky::combinator::Map<A,F,O> as chumsky::Parser<I,U>>::parse_inner::{{closure}}
    6944 (2.2%)    14 (0.2%)  <chumsky::primitive::Choice<(X_,Y_,Z_),E> as chumsky::Parser<I,O>>::parse_inner
    6664 (2.1%)   180 (2.5%)  <chumsky::combinator::Or<A,B> as chumsky::Parser<I,O>>::parse_inner::{{closure}}
    5634 (1.7%)   112 (1.6%)  chumsky::stream::Stream<I,S>::attempt
    4680 (1.4%)    78 (1.1%)  chumsky::stream::Stream<I,S>::try_parse::{{closure}}
    4440 (1.4%)    24 (0.3%)  <chumsky::combinator::Repeated<A> as chumsky::Parser<I,alloc::vec::Vec<O>>>::parse_inner
    3807 (1.2%)   253 (3.6%)  <chumsky::debug::Silent as chumsky::debug::Debugger>::invoke
    3777 (1.2%)   251 (3.5%)  <chumsky::debug::Verbose as chumsky::debug::Debugger>::invoke
    3272 (1.0%)    14 (0.2%)  <chumsky::primitive::Filter<F,E> as chumsky::Parser<I,I>>::parse_inner

This might have more to do with codegen + LLVM and less with "making rustc data structures / algorithms" more efficient, you be the judge!

About

Repo to profile pathological compile times with knuffel / chumsky

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Rust 100.0%