Finish migrating gitoxide from nom 7 to winnow 0.5 #956

epage · 2023-08-01T22:03:52Z

Follow up to #948 and #951

gix-actor, gix-object, gix-ref, and gix-testtools are interrelated, so this does it all at once

Things left out

Removing winnow from the API of the above packages
Switching from_bytes functions to using parse instead of (the implied) parse_next

BREAKING CHANGE: nom, and now winnow, is exposed in the public API of some of these packages

epage · 2023-08-01T22:04:39Z

Since I'm seeing local test failures, I'm rely on CI for the cases I'm not easily hitting due to other errors

epage · 2023-08-02T01:17:28Z

Huh, mac-fast is failing in CI differently than local (in gix-worktree) but all other test jobs passed.

Byron · 2023-08-02T07:30:07Z

So good to see this happening, and it's basically done already and working in the first stage :)!

Since I'm seeing local test failures, I'm rely on CI for the cases I'm not easily hitting due to other errors

I'd love to treat this as a bug, knowing that there were problems with git-lfs initially, but maybe the root cause is another one. One could try to checkout a fresh worktree and try just test there, just to be sure it's not some strange state that accumulated locally. If the issue is reproducible, I invite you to open an issue with the details so this can be fixed.

Huh, mac-fast is failing in CI differently than local (in gix-worktree) but all other test jobs passed.

It looks like everything passed now except for lint, whose failure is expected I presume due to the early stage of the PR.

gix-object/src/commit/decode.rs

gix-ref/src/store/file/log/line.rs

gix-object/src/lib.rs

tests/tools/src/lib.rs

Byron · 2023-08-02T19:21:32Z

Just a quick note before I forget or it becomes a surprise later: d3f65d8 will be conflicting, and I recommend rebasing onto main and keep only the test. It should be easy from there to properly implement a fix for it. My TODO's are hopefully worth a chuckle, given how winnow should solve all of them.

epage · 2023-08-02T19:30:12Z

Already rebased and working on cleaning it up right now

gix-actor/src/signature/decode.rs

gix-object/src/commit/decode.rs

gix-object/src/commit/message/decode.rs

Byron · 2023-08-03T07:19:39Z

I am cheering you on from the side-lines, the goal is already in sight :)!

gix-ref/src/store/file/log/line.rs

This is intended to make the winnow 0.5 transition easier

This is prep for switching from `VerboseError` to `ContextError`

epage · 2023-08-17T20:59:41Z

With winnow-rs/winnow#316 and winnow-rs/winnow#317 merged into 0.5 and backported to 0.4 and 0.3

TagRefIter(sig)
510192e (main): 165.88 ns
ee75de1: 159.62 ns
86d7fd1 (w1): 157.66 ns
ac0e81c (w2): 140.30 ns
b37a909 (w3): 147.65 ns
fee441d (w4): 148.87 ns
b3f0418 (w5): 157.37 ns
86ea47f (winnow04): 157.88 ns
9ed7df0 (w6): 157.88 ns
f0cbf81 (w7): 158.08 ns
2fb4a54: 158.66 ns
12f03db (w8): 175.76 ns
3f8c91f (w9): 155.50 ns
a07590c: 155.83 ns
266864f: 155.77 ns
df226dd: 159.27 ns
93fc441: 158.16 ns
62f7bc6: 160.28 ns
be1f4b5 (winnow05): 155.82 ns
80d7991: 155.48 ns
b7ee9a1 (winnow): 162.52 ns

And for across-the-board

$ cargo bench -p gix-object@0.34.0 --bench decode-objects
     Running benches/decode_objects.rs (target/release/deps/decode_objects-2030aacefe52a773)
Gnuplot not found, using plotters backend
CommitRef(sig)          time:   [536.77 ns 537.97 ns 539.26 ns]
                        change: [-34.148% -33.943% -33.752%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  7 (7.00%) high mild

CommitRefIter(sig)      time:   [648.26 ns 662.31 ns 684.63 ns]
                        change: [-30.898% -29.723% -28.166%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) high mild
  8 (8.00%) high severe

TagRef(sig)             time:   [170.18 ns 170.50 ns 170.86 ns]
                        change: [+4.3542% +4.7672% +5.1841%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe

TagRefIter(sig)         time:   [174.34 ns 174.55 ns 174.77 ns]
                        change: [+1.5350% +2.1648% +2.7832%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild
  1 (1.00%) high severe

TreeRef(sig)            time:   [101.54 ns 101.70 ns 101.88 ns]
                        change: [+7.2389% +7.5092% +7.7863%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

TreeRefIter(sig)        time:   [46.948 ns 46.997 ns 47.054 ns]
                        change: [-6.4018% -6.2109% -5.9962%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
  8 (8.00%) high mild
  3 (3.00%) high severe

Seeing some of those lows, I feel like we could do better

Byron · 2023-08-18T06:33:39Z

It's strange that these benchmark improvements seem to have no bearing on the real-world test.

❯ /Users/byron/dev/github.com/Byron/gitoxide/target/release/ein t h
 08:25:29 traverse commit graph done 1.2M commits in 10.35s (113.0k commits/s)
 08:25:29        estimate-hours Extracted and organized data from 1169974 commits in 15.651417ms (74751952 commits/s)
total hours: 1119736.25
total 8h days: 139967.03
total commits = 1169974
total authors: 31645
total unique authors: 24103 (23.83% duplication)

linux ( master) +369 -819 [!] took 10s
❯ ein t h
 08:25:54 traverse commit graph done 1.2M commits in 8.50s (137.6k commits/s)
 08:25:54        estimate-hours Extracted and organized data from 1169974 commits in 15.961291ms (73300712 commits/s)
total hours: 1119736.25
total 8h days: 139967.03
total commits = 1169974
total authors: 31645
total unique authors: 24103 (23.83% duplication)

(The cache was hot for both runs and I made sure the the …/target/release/ein binary is truly the one from b7ee9a1)

I am absolutely puzzled by that. Is this something you can reproduce?

epage · 2023-08-18T13:50:41Z

$ git switch main
$ cd ..
$ cp -r gitoxide gitoxide1
$ cd gitoxide1
$ git switch winnow
$ cd ../linux
$
$ cargo run --release --manifest-path ../gitoxide/Cargo.toml --bin ein -- t h
$ cargo run --release --manifest-path ../gitoxide/Cargo.toml --bin ein -- t h
$ cargo run --release --manifest-path ../gitoxide/Cargo.toml --bin ein -- t h
$
$ cargo run --release --manifest-path ../gitoxide1/Cargo.toml --bin ein -- t h
$ cargo run --release --manifest-path ../gitoxide1/Cargo.toml --bin ein -- t h
$ cargo run --release --manifest-path ../gitoxide1/Cargo.toml --bin ein -- t h

I found that both main and winnow had the same range of completion times, from ~13.5ms to 15ms.

Byron · 2023-08-18T14:20:43Z

Thank you, this probably means that it's an issue with the way it is compiled. On x86 there seem to be optimizations that don't kick in anymore on ARM64.

What's more surprising is that after a cargo clean, the main/nom version at 510192e delivers just 112.6k commits/s, whereas the winnow version at b7ee9a1 gets up to 115k commits/s on a fresh build as well.

But here it comes. The version I produced with cargo install that is from 9ef69e1 delivers 139.1k commits/s. After cargo installing the nom version at 510192e I am getting the same ~139 commit/s.

Now, after cargo installing the winnow version at b7ee9a1 I also see numbers around 139k commits/s. Running ./target/release/ein after installing it directly also yields the good values. However, rebuilding it with cargo build --release --bin ein shows the bad values once again.

The binary sizes are similar, so there must be something about cargo install that makes all the difference - maybe it builds something like target-cpu=native?

epage · 2023-08-18T14:33:00Z

The binary sizes are similar, so there must be something about cargo install that makes all the difference - maybe it builds something like target-cpu=native?

Looking at the code, I'm not seeing --target-cpu being used anywhere. The only difference I see with cargo build is that it hard codes --release. Might be interesting to look at verbose output in case I missed anything.

Byron · 2023-08-18T15:39:40Z

I didn't see it that with -v either, but tried to build with -C target-cpu=native - the resulting binary was just as fast as the bad case. In desperation, I wanted to see if ~/.cargo/bin is a magic place, and fortunately after copying the 'bad' binary onto the 'good' one it was still slow when executing it through the PATH.

But then it struck me: cargo install runs on the latest available dependencies, whereas cargo build runs on Cargo.lock. After running cargo update it actually got as fast as it should be even after using cargo build --release.

I think this case is solved now 🎉.

Byron · 2023-08-18T15:43:14Z

Performance-wise, it seems like the benchmarks were a great indicator after all and by the looks of it (comparing winnow to nom), all values are better or have regressed by less than 5% (TagRef).

Gnuplot not found, using plotters backend
CommitRef(sig)          time:   [1.1752 µs 1.1888 µs 1.2056 µs]
                        change: [+24.581% +25.593% +26.850%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
  6 (6.00%) high severe

CommitRefIter(sig)      time:   [1.2681 µs 1.2690 µs 1.2699 µs]
                        change: [+20.260% +20.443% +20.635%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
  7 (7.00%) high mild
  1 (1.00%) high severe

TagRef(sig)             time:   [245.54 ns 245.73 ns 245.95 ns]
                        change: [-5.2706% -4.9124% -4.6490%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  2 (2.00%) low mild
  7 (7.00%) high mild
  1 (1.00%) high severe

TagRefIter(sig)         time:   [243.32 ns 243.75 ns 244.42 ns]
                        change: [-6.7406% -6.4258% -6.0412%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  1 (1.00%) low mild
  4 (4.00%) high mild
  3 (3.00%) high severe

TreeRef(sig)            time:   [138.67 ns 139.40 ns 140.11 ns]
                        change: [+4.5696% +5.1061% +5.6137%] (p = 0.00 < 0.05)
                        Performance has regressed.

TreeRefIter(sig)        time:   [61.047 ns 61.104 ns 61.186 ns]
                        change: [-0.9602% -0.8084% -0.6524%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low mild
  5 (5.00%) high mild
  1 (1.00%) high severe

Thus I think performance won't be preventing a merge anymore, quite the opposite :).

epage · 2023-08-18T16:50:34Z

I leave it to you. For me, there are more things I want to look at but that can be done independent of this being merged.

epage · 2023-08-18T19:04:51Z

I dropped b7ee9a1 because it just seemed to slow things down (oddly enough)

Byron · 2023-08-18T19:28:32Z

I leave it to you. For me, there are more things I want to look at but that can be done independent of this being merged.

Great, then I will work on merging this PR now as it seems ready in every way, while looking forward to seeing those ideas of yours to come to fruition.

Before I go: I have a feeling your CPU might be twice as fast as mine. With that said, what's the commits/s you get when running ein t h on the linux kernel? The best number I have seen is ~139k commits/s.

epage · 2023-08-18T19:46:25Z

$ cargo run --release --manifest-path ../gitoxide/Cargo.toml --bin ein -- t h
    Finished release [optimized] target(s) in 0.09s
     Running `/home/epage/src/personal/gitoxide/target/release/ein t h`
 14:45:32 traverse commit graph done 1.1M commits in 8.30s (131.6k commits/s)
 14:45:32        estimate-hours Extracted and organized data from 1092424 commits in 13.815333ms (79073304 commits/s)
total hours: 1050880.75
total 8h days: 131360.09
total commits = 1092424
total authors: 29848
total unique authors: 22742 (23.81% duplication)

Byron · 2023-08-18T20:05:13Z

Aw, snap, I hoped to see something in the 200k range. Of course, the workload is bound by zlib performance and parsing isn't holding anything back here, so improvements are probably out of range unless object decompression gets faster over time. Oh well, I keep an eye on that number with a new baseline of ~140k commits/s expecting it to go up over time.

I guess the real benchmark is against git shortlog:

✦ ❯ /usr/bin/time -lp git shortlog >/dev/null
real 11.27
user 10.95
sys 0.31
          2335965184  maximum resident set size
                   0  average shared memory size
                   0  average unshared data size
                   0  average unshared stack size
              143036  page reclaims
                   7  page faults
                   0  swaps
                   0  block input operations
                   0  block output operations
                   0  messages sent
                   0  messages received
                   0  signals received
                   0  voluntary context switches
                 229  involuntary context switches
         75705099109  instructions retired
         36068289312  cycles elapsed
          1516510016  peak memory footprint

linux ( master) +369 -819 [!] took 11s
✦ ❯ /usr/bin/time -lp ein t h
 22:00:38 traverse commit graph done 1.2M commits in 8.49s (137.8k commits/s)
 22:00:38        estimate-hours Extracted and organized data from 1169974 commits in 14.844667ms (78814432 commits/s)
total hours: 1119736.25
total 8h days: 139967.03
total commits = 1169974
total authors: 31645
total unique authors: 24103 (23.83% duplication)
real 8.84
user 10.03
sys 0.56
          1024720896  maximum resident set size
                   0  average shared memory size
                   0  average unshared data size
                   0  average unshared stack size
               66234  page reclaims
                 240  page faults
                   0  swaps
                   0  block input operations
                   0  block output operations
                   0  messages sent
                   0  messages received
                   0  signals received
                   5  voluntary context switches
              228344  involuntary context switches
         67037989003  instructions retired
         32757758823  cycles elapsed
           209817216  peak memory footprint

Interestingly, besides memory usage, ein is just a little bit faster if it was single-threaded, but can truly get ahead thanks to trivial multi-threading (trivial because Rust makes it so).

Byron

Thanks a million for working on this topic! It's a great outcome and I couldn't have wished for more.

epage marked this pull request as draft August 1, 2023 22:03

epage changed the title ~~Finish migrating gitoxide from nom 7 to winnow 0.5~~ WIP: Finish migrating gitoxide from nom 7 to winnow 0.5 Aug 1, 2023

epage mentioned this pull request Aug 2, 2023

Tests fail to run locally #960

Closed

1 task

epage force-pushed the winnow branch from eb1c2ba to 5aa61db Compare August 2, 2023 15:51