Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use serde instead of hand rolled serialization #18

Closed
wants to merge 1 commit into from

Conversation

piscisaureus
Copy link
Contributor

No description provided.

@evmar
Copy link
Owner

evmar commented Mar 30, 2022

This is interesting! What does the ultimate serialization end up as? (I read it in a hex editor to debug it sometimes.) What is the performance impact? (I don't have a good way of measuring it, hmmmm.)

@piscisaureus
Copy link
Contributor Author

piscisaureus commented Mar 30, 2022

This is interesting! What does the ultimate serialization end up as?

cbor.

The format was fairly arbitrarily chosen, the only constraint being that it needs to store its own serialized length so we can serialize multiple entries back-to-back . You can pick any other listed on https://serde.rs/ if you wish.

(I read it in a hex editor to debug it sometimes.)

If that's a concern, you could pick yaml or json.

What is the performance impact? (I don't have a good way of measuring it, hmmmm.)

I haven't measured it but I'm certain it's negligible (esp. when using a binary format like cbor).

Disassembled it for fun: https://gist.github.com/piscisaureus/d09b68b083d680295fa5dfa00ab95dbd
(Your hand-rolled impl: https://gist.github.com/piscisaureus/0f5edd534d1d1e6d8fb379a38c53812f)

@evmar
Copy link
Owner

evmar commented Mar 30, 2022

I'm mildly positive on this. I had considered serde initially but I was reluctant to take on dependencies I don't well understand, and in particular here I don't understand what the actual serialization ends up being, but I can read about cbor I guess.

I think JSON/YAML aren't acceptable for perf reasons. Reading this is in the startup critical path and in my recollection the comparable file in Ninja ends up being megabytes.

I will look into a way to evaluating the perf of this and get back to you.

@evmar
Copy link
Owner

evmar commented Mar 30, 2022

I tried this:

$ (cd llvm-project/build; cmake -GNinja -DLLVM_ENABLE_PROJECTS=clang ../llvm)
$ rm llvm-project/build/.n2_db
$ hyperfine -w1 './target/release/n2 -C ~/llvm-project/build clang-format'

This measures the no-op time to check that clang-format is up to date, which involves reading this db.

With the in-tree impl I got
Time (mean ± σ): 125.6 ms ± 2.1 ms [User: 108.2 ms, System: 15.0 ms]

With this patch (and main merged, to make the comparison fair) I got
Time (mean ± σ): 292.5 ms ± 1.9 ms [User: 158.1 ms, System: 132.2 ms]

Some other notes: the binary size grew by like 200kb which is like 15% which is no problem.
The database size went from 950kb to 835kb, I think because cbor handles variable-length integers perhaps.

Based on my understanding of the serde docs, it could be used to implement a serialization more similar to the current serialization (which was chosen for performance reasons). In other words I think we could still use serde with some more custom logic instead of cbor and maybe get most of the benefit? Not sure, I'm not too familiar with serde.

@piscisaureus
Copy link
Contributor Author

Based on my understanding of the serde docs, it could be used to implement a serialization more similar to the current serialization (which was chosen for performance reasons). In other words I think we could still use serde with some more custom logic instead of cbor and maybe get most of the benefit? Not sure, I'm not too familiar with serde.

That's possible, albeit of somewhat debatable benefit. We could also try bincode.

With the in-tree impl I got Time (mean ± σ): 125.6 ms ± 2.1 ms [User: 108.2 ms, System: 15.0 ms]

With this patch (and main merged, to make the comparison fair) I got Time (mean ± σ): 292.5 ms ± 1.9 ms [User: 158.1 ms, System: 132.2 ms]

I suspect that it is not the serialization itself that causes the slowdown, but the fact that it "flushes to disk" on every write. That's expected because:

serde_cbor::ser::to_writer(&mut self.w, &entry)?;   // <-- self.w is a `File`.

I think if we put a buffered writer there it might get a little better.

@piscisaureus
Copy link
Contributor Author

I've updated the PR to use a BufReader/BufWriter, hopefully that will reduce time spent in the kernel.

@piscisaureus
Copy link
Contributor Author

piscisaureus commented Mar 30, 2022

OK I'm convinced now that (lack of) buffering was indeed the issue.
(serde is still ~10% slower but I would consider that ok.)

Hand rolled:

$ hyperfine -w1 './target/release/n2 -C ~/d/rusty_v8/target/release/gn_out rusty_v8'
Benchmark 1: ./target/release/n2 -C ~/d/rusty_v8/target/release/gn_out rusty_v8
  Time (mean ± σ):      49.2 ms ±   0.4 ms    [User: 36.8 ms, System: 11.3 ms]
  Range (min … max):    48.7 ms …  51.3 ms    58 runs

Serde without buffering:

$ hyperfine -w1 './target/release/n2 -C ~/d/rusty_v8/target/release/gn_out rusty_v8'
Benchmark 1: ./target/release/n2 -C ~/d/rusty_v8/target/release/gn_out rusty_v8
  Time (mean ± σ):     432.2 ms ±   2.2 ms    [User: 149.8 ms, System: 281.4 ms]
  Range (min … max):   428.9 ms … 435.1 ms    10 runs

Serde with BufReader/BufWriter:

$ hyperfine -w1 './target/release/n2 -C ~/d/rusty_v8/target/release/gn_out rusty_v8'
Benchmark 1: ./target/release/n2 -C ~/d/rusty_v8/target/release/gn_out rusty_v8
  Time (mean ± σ):      53.9 ms ±   0.3 ms    [User: 42.0 ms, System: 11.0 ms]
  Range (min … max):    53.6 ms …  54.5 ms    53 runs

@evmar
Copy link
Owner

evmar commented Apr 2, 2022

I think to resummarize this one, it adds a dependency, grows the binary by a bit, makes the code a bit slower, and saves ~140 lines of serialization code. I'm not sure it's really worth it. I wonder if using bincode directly would be better? Its encoding looks pretty similar to what I wrote.

@evmar evmar force-pushed the main branch 2 times, most recently from 03de3da to afe500f Compare December 21, 2022 19:27
GrigorenkoPV added a commit to GrigorenkoPV/n2 that referenced this pull request Jun 13, 2023
GrigorenkoPV added a commit to GrigorenkoPV/n2 that referenced this pull request Jun 13, 2023
@GrigorenkoPV
Copy link
Contributor

GrigorenkoPV commented Jun 13, 2023

I tried porting this on top of #58: https://github.com/GrigorenkoPV/n2/tree/serde

Comparison between 05d11b9 (baseline) & f827571 (serde):

Benchmark 1: ~/n2/baseline clang-format
  Time (mean ± σ):     241.1 ms ±  10.1 ms    [User: 208.9 ms, System: 31.1 ms]
  Range (min … max):   226.9 ms … 265.5 ms    50 runs
  
Benchmark 2: ~/n2/serde clang-format
  Time (mean ± σ):     230.6 ms ±   9.2 ms    [User: 201.1 ms, System: 28.9 ms]
  Range (min … max):   214.4 ms … 257.3 ms    50 runs

Surprisingly, it is ~4% faster.

The binary did, however, get bigger. Stripped versions, size in bytes:

1241272 baseline
1379896 serde

Another thing worth mentioning is that this allows to get rid of some dubious unsafe code like

unsafe { MaybeUninit::uninit().assume_init() },

from here.

Should I create a PR? I feel like I should mention @piscisaureus's authorship in my comment, because it is mostly his code, but I don't actually know how to do it. Probably there's some git command for that.

@evmar
Copy link
Owner

evmar commented Jun 14, 2023

Closing in favor of #59

@evmar evmar closed this Jun 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants