Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve parser performance a bit #5812

Merged
merged 4 commits into from
Jan 17, 2022
Merged

improve parser performance a bit #5812

merged 4 commits into from
Jan 17, 2022

Conversation

pennae
Copy link
Contributor

@pennae pennae commented Dec 21, 2021

the current parser is not as efficient as it could be. this is an attempt at optimizing it at least a little, and we get around 4% improvement on nixos system builds out of it. pure parsing (such as loading the massive hackage registry) benefits much more, around 13%.

src/libexpr/parser.y Outdated Show resolved Hide resolved
src/libexpr/parser.y Outdated Show resolved Hide resolved
@pennae pennae requested a review from edolstra January 3, 2022 17:00
@pennae
Copy link
Contributor Author

pennae commented Jan 12, 2022

@edolstra @thufschmitt may we ask this PR be checked? :) pretty much everything else we've not opened a PR for yet depends on this in one way or other.

Copy link
Member

@thufschmitt thufschmitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one small comment, but LGTM otherwise

(note that I’m not really familiar with the parser code, so I might have missed something)

src/libexpr/parser.y Show resolved Hide resolved
every stringy token the lexer returns is turned into a Symbol and not
used further, so we don't have to strdup. using a string_view is
sufficient, but due to limitations of the current parser we have to use
a POD type that holds the same information.

gives ~2% on system build, 6% on search, 8% on parsing alone

 # before

Benchmark 1: nix search --offline nixpkgs hello
  Time (mean ± σ):     610.6 ms ±   2.4 ms    [User: 602.5 ms, System: 7.8 ms]
  Range (min … max):   606.6 ms … 617.3 ms    50 runs

Benchmark 2: nix eval -f hackage-packages.nix
  Time (mean ± σ):     430.1 ms ±   1.4 ms    [User: 393.1 ms, System: 36.7 ms]
  Range (min … max):   428.2 ms … 434.2 ms    50 runs

Benchmark 3: nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system'
  Time (mean ± σ):      3.032 s ±  0.005 s    [User: 2.808 s, System: 0.223 s]
  Range (min … max):    3.023 s …  3.041 s    50 runs

 # after

Benchmark 1: nix search --offline nixpkgs hello
  Time (mean ± σ):     574.7 ms ±   2.8 ms    [User: 566.3 ms, System: 8.0 ms]
  Range (min … max):   569.2 ms … 580.7 ms    50 runs

Benchmark 2: nix eval -f hackage-packages.nix
  Time (mean ± σ):     394.4 ms ±   0.8 ms    [User: 361.8 ms, System: 32.3 ms]
  Range (min … max):   392.7 ms … 395.7 ms    50 runs

Benchmark 3: nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system'
  Time (mean ± σ):      2.976 s ±  0.005 s    [User: 2.757 s, System: 0.218 s]
  Range (min … max):    2.966 s …  2.990 s    50 runs
speeds up parsing by ~3%, system builds by a bit more than 1%

 # before

Benchmark 1: nix search --offline nixpkgs hello
  Time (mean ± σ):     574.7 ms ±   2.8 ms    [User: 566.3 ms, System: 8.0 ms]
  Range (min … max):   569.2 ms … 580.7 ms    50 runs

Benchmark 2: nix eval -f ../nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix
  Time (mean ± σ):     394.4 ms ±   0.8 ms    [User: 361.8 ms, System: 32.3 ms]
  Range (min … max):   392.7 ms … 395.7 ms    50 runs

Benchmark 3: nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system'
  Time (mean ± σ):      2.976 s ±  0.005 s    [User: 2.757 s, System: 0.218 s]
  Range (min … max):    2.966 s …  2.990 s    50 runs

 # after

Benchmark 1: nix search --offline nixpkgs hello
  Time (mean ± σ):     572.4 ms ±   2.3 ms    [User: 563.4 ms, System: 8.6 ms]
  Range (min … max):   566.9 ms … 579.1 ms    50 runs

Benchmark 2: nix eval -f ../nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix
  Time (mean ± σ):     381.7 ms ±   1.0 ms    [User: 348.3 ms, System: 33.1 ms]
  Range (min … max):   380.2 ms … 387.7 ms    50 runs

Benchmark 3: nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system'
  Time (mean ± σ):      2.936 s ±  0.005 s    [User: 2.715 s, System: 0.221 s]
  Range (min … max):    2.923 s …  2.946 s    50 runs
when given a string yacc will copy the entire input to a newly allocated
location so that it can add a second terminating NUL byte. since the
parser is a very internal thing to EvalState we can ensure that having
two terminating NUL bytes is always possible without copying, and have
the parser itself merely check that the expected NULs are present.

 # before

Benchmark 1: nix search --offline nixpkgs hello
  Time (mean ± σ):     572.4 ms ±   2.3 ms    [User: 563.4 ms, System: 8.6 ms]
  Range (min … max):   566.9 ms … 579.1 ms    50 runs

Benchmark 2: nix eval -f ../nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix
  Time (mean ± σ):     381.7 ms ±   1.0 ms    [User: 348.3 ms, System: 33.1 ms]
  Range (min … max):   380.2 ms … 387.7 ms    50 runs

Benchmark 3: nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system'
  Time (mean ± σ):      2.936 s ±  0.005 s    [User: 2.715 s, System: 0.221 s]
  Range (min … max):    2.923 s …  2.946 s    50 runs

 # after

Benchmark 1: nix search --offline nixpkgs hello
  Time (mean ± σ):     571.7 ms ±   2.4 ms    [User: 563.3 ms, System: 8.0 ms]
  Range (min … max):   566.7 ms … 579.7 ms    50 runs

Benchmark 2: nix eval -f ../nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix
  Time (mean ± σ):     376.6 ms ±   1.0 ms    [User: 345.8 ms, System: 30.5 ms]
  Range (min … max):   374.5 ms … 379.1 ms    50 runs

Benchmark 3: nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system'
  Time (mean ± σ):      2.922 s ±  0.006 s    [User: 2.707 s, System: 0.215 s]
  Range (min … max):    2.906 s …  2.934 s    50 runs
mainly to avoid an allocation and a copy of a string that can be
modified in place (ever since EvalState holds on to the buffer, not the
generated parser itself).

 # before

Benchmark 1: nix search --offline nixpkgs hello
  Time (mean ± σ):     571.7 ms ±   2.4 ms    [User: 563.3 ms, System: 8.0 ms]
  Range (min … max):   566.7 ms … 579.7 ms    50 runs

Benchmark 2: nix eval -f ../nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix
  Time (mean ± σ):     376.6 ms ±   1.0 ms    [User: 345.8 ms, System: 30.5 ms]
  Range (min … max):   374.5 ms … 379.1 ms    50 runs

Benchmark 3: nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system'
  Time (mean ± σ):      2.922 s ±  0.006 s    [User: 2.707 s, System: 0.215 s]
  Range (min … max):    2.906 s …  2.934 s    50 runs

 # after

Benchmark 1: nix search --offline nixpkgs hello
  Time (mean ± σ):     570.4 ms ±   2.8 ms    [User: 561.3 ms, System: 8.6 ms]
  Range (min … max):   564.6 ms … 578.1 ms    50 runs

Benchmark 2: nix eval -f ../nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix
  Time (mean ± σ):     375.4 ms ±   1.3 ms    [User: 343.2 ms, System: 31.7 ms]
  Range (min … max):   373.4 ms … 378.2 ms    50 runs

Benchmark 3: nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system'
  Time (mean ± σ):      2.925 s ±  0.006 s    [User: 2.704 s, System: 0.219 s]
  Range (min … max):    2.910 s …  2.942 s    50 runs
@edolstra
Copy link
Member

I still don't understand why the null terminators are necessary. That seems very hacky and infects other parts of the code (like drainFD).

@pennae
Copy link
Contributor Author

pennae commented Jan 17, 2022

the generated parser demands them, no idea why (though we'd guess it's to do with lookahead in the absence of a separate tokenizer). we can either add an extra terminator ourselves or have the generated parser do it at the cost of an extra alloc+copy. and while we do agree that it's kinda terrible and it does infect anything it touches, adding that extra byte feels well worth it for avoiding potentially hundreds of megabytes of string copies that only happen to add that one extra byte 🙁

@edolstra edolstra merged commit fc2443a into NixOS:master Jan 17, 2022
@edolstra
Copy link
Member

Thanks for the explanation!

@pennae pennae deleted the small-perf-improvements branch January 19, 2022 16:13
@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/tweag-nix-dev-update-24/17230/1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants