use serde instead of hand rolled serialization #18

piscisaureus · 2022-03-29T22:14:55Z

No description provided.

evmar · 2022-03-30T04:47:46Z

This is interesting! What does the ultimate serialization end up as? (I read it in a hex editor to debug it sometimes.) What is the performance impact? (I don't have a good way of measuring it, hmmmm.)

piscisaureus · 2022-03-30T05:57:09Z

This is interesting! What does the ultimate serialization end up as?

cbor.

The format was fairly arbitrarily chosen, the only constraint being that it needs to store its own serialized length so we can serialize multiple entries back-to-back . You can pick any other listed on https://serde.rs/ if you wish.

(I read it in a hex editor to debug it sometimes.)

If that's a concern, you could pick yaml or json.

What is the performance impact? (I don't have a good way of measuring it, hmmmm.)

I haven't measured it but I'm certain it's negligible (esp. when using a binary format like cbor).

Disassembled it for fun: https://gist.github.com/piscisaureus/d09b68b083d680295fa5dfa00ab95dbd
(Your hand-rolled impl: https://gist.github.com/piscisaureus/0f5edd534d1d1e6d8fb379a38c53812f)

evmar · 2022-03-30T18:27:21Z

I'm mildly positive on this. I had considered serde initially but I was reluctant to take on dependencies I don't well understand, and in particular here I don't understand what the actual serialization ends up being, but I can read about cbor I guess.

I think JSON/YAML aren't acceptable for perf reasons. Reading this is in the startup critical path and in my recollection the comparable file in Ninja ends up being megabytes.

I will look into a way to evaluating the perf of this and get back to you.

evmar · 2022-03-30T19:12:53Z

I tried this:

$ (cd llvm-project/build; cmake -GNinja -DLLVM_ENABLE_PROJECTS=clang ../llvm)
$ rm llvm-project/build/.n2_db
$ hyperfine -w1 './target/release/n2 -C ~/llvm-project/build clang-format'

This measures the no-op time to check that clang-format is up to date, which involves reading this db.

With the in-tree impl I got
Time (mean ± σ): 125.6 ms ± 2.1 ms [User: 108.2 ms, System: 15.0 ms]

With this patch (and main merged, to make the comparison fair) I got
Time (mean ± σ): 292.5 ms ± 1.9 ms [User: 158.1 ms, System: 132.2 ms]

Some other notes: the binary size grew by like 200kb which is like 15% which is no problem.
The database size went from 950kb to 835kb, I think because cbor handles variable-length integers perhaps.

Based on my understanding of the serde docs, it could be used to implement a serialization more similar to the current serialization (which was chosen for performance reasons). In other words I think we could still use serde with some more custom logic instead of cbor and maybe get most of the benefit? Not sure, I'm not too familiar with serde.

piscisaureus · 2022-03-30T20:39:58Z

Based on my understanding of the serde docs, it could be used to implement a serialization more similar to the current serialization (which was chosen for performance reasons). In other words I think we could still use serde with some more custom logic instead of cbor and maybe get most of the benefit? Not sure, I'm not too familiar with serde.

That's possible, albeit of somewhat debatable benefit. We could also try bincode.

With the in-tree impl I got Time (mean ± σ): 125.6 ms ± 2.1 ms [User: 108.2 ms, System: 15.0 ms]

With this patch (and main merged, to make the comparison fair) I got Time (mean ± σ): 292.5 ms ± 1.9 ms [User: 158.1 ms, System: 132.2 ms]

I suspect that it is not the serialization itself that causes the slowdown, but the fact that it "flushes to disk" on every write. That's expected because:

serde_cbor::ser::to_writer(&mut self.w, &entry)?;   // <-- self.w is a `File`.

I think if we put a buffered writer there it might get a little better.

piscisaureus · 2022-03-30T21:18:31Z

I've updated the PR to use a BufReader/BufWriter, hopefully that will reduce time spent in the kernel.

piscisaureus · 2022-03-30T21:52:17Z

OK I'm convinced now that (lack of) buffering was indeed the issue.
(serde is still ~10% slower but I would consider that ok.)

Hand rolled:

$ hyperfine -w1 './target/release/n2 -C ~/d/rusty_v8/target/release/gn_out rusty_v8'
Benchmark 1: ./target/release/n2 -C ~/d/rusty_v8/target/release/gn_out rusty_v8
  Time (mean ± σ):      49.2 ms ±   0.4 ms    [User: 36.8 ms, System: 11.3 ms]
  Range (min … max):    48.7 ms …  51.3 ms    58 runs

Serde without buffering:

$ hyperfine -w1 './target/release/n2 -C ~/d/rusty_v8/target/release/gn_out rusty_v8'
Benchmark 1: ./target/release/n2 -C ~/d/rusty_v8/target/release/gn_out rusty_v8
  Time (mean ± σ):     432.2 ms ±   2.2 ms    [User: 149.8 ms, System: 281.4 ms]
  Range (min … max):   428.9 ms … 435.1 ms    10 runs

Serde with BufReader/BufWriter:

$ hyperfine -w1 './target/release/n2 -C ~/d/rusty_v8/target/release/gn_out rusty_v8'
Benchmark 1: ./target/release/n2 -C ~/d/rusty_v8/target/release/gn_out rusty_v8
  Time (mean ± σ):      53.9 ms ±   0.3 ms    [User: 42.0 ms, System: 11.0 ms]
  Range (min … max):    53.6 ms …  54.5 ms    53 runs

evmar · 2022-04-02T21:31:45Z

I think to resummarize this one, it adds a dependency, grows the binary by a bit, makes the code a bit slower, and saves ~140 lines of serialization code. I'm not sure it's really worth it. I wonder if using bincode directly would be better? Its encoding looks pretty similar to what I wrote.

Port of evmar#18

GrigorenkoPV · 2023-06-13T19:24:28Z

I tried porting this on top of #58: https://github.com/GrigorenkoPV/n2/tree/serde

Comparison between 05d11b9 (baseline) & f827571 (serde):

Benchmark 1: ~/n2/baseline clang-format
  Time (mean ± σ):     241.1 ms ±  10.1 ms    [User: 208.9 ms, System: 31.1 ms]
  Range (min … max):   226.9 ms … 265.5 ms    50 runs
  
Benchmark 2: ~/n2/serde clang-format
  Time (mean ± σ):     230.6 ms ±   9.2 ms    [User: 201.1 ms, System: 28.9 ms]
  Range (min … max):   214.4 ms … 257.3 ms    50 runs

Surprisingly, it is ~4% faster.

The binary did, however, get bigger. Stripped versions, size in bytes:

1241272 baseline
1379896 serde

Another thing worth mentioning is that this allows to get rid of some dubious unsafe code like

unsafe { MaybeUninit::uninit().assume_init() },

from here.

Should I create a PR? I feel like I should mention @piscisaureus's authorship in my comment, because it is mostly his code, but I don't actually know how to do it. Probably there's some git command for that.

evmar · 2023-06-14T16:42:18Z

Closing in favor of #59

piscisaureus mentioned this pull request Mar 29, 2022

Hardcoded buffer size is too small for some builds #17

Closed

piscisaureus force-pushed the nih branch from 6562c20 to e15efe6 Compare March 29, 2022 22:18

emartinfigma force-pushed the main branch from f098b77 to 7a7d10f Compare March 30, 2022 19:02

piscisaureus force-pushed the nih branch from e15efe6 to 20eeb70 Compare March 30, 2022 21:16

piscisaureus force-pushed the nih branch 2 times, most recently from 804ac03 to d3756e0 Compare March 31, 2022 05:23

use serde instead of hand rolled serialization

eaa8c60

piscisaureus force-pushed the nih branch from c2e9197 to eaa8c60 Compare April 2, 2022 18:33

evmar force-pushed the main branch 2 times, most recently from 03de3da to afe500f Compare December 21, 2022 19:27

GrigorenkoPV added a commit to GrigorenkoPV/n2 that referenced this pull request Jun 13, 2023

Use serde instead of hand-rolled serialization

7a3e271

Port of evmar#18

GrigorenkoPV added a commit to GrigorenkoPV/n2 that referenced this pull request Jun 13, 2023

Use serde instead of hand-rolled serialization

f827571

Port of evmar#18

GrigorenkoPV mentioned this pull request Jun 13, 2023

Use serde instead of hand-rolled serialization, take 2 #59

Open

evmar closed this Jun 14, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

use serde instead of hand rolled serialization #18

use serde instead of hand rolled serialization #18

piscisaureus commented Mar 29, 2022

evmar commented Mar 30, 2022

piscisaureus commented Mar 30, 2022 •

edited

Loading

evmar commented Mar 30, 2022

evmar commented Mar 30, 2022

piscisaureus commented Mar 30, 2022

piscisaureus commented Mar 30, 2022

piscisaureus commented Mar 30, 2022 •

edited

Loading

evmar commented Apr 2, 2022

GrigorenkoPV commented Jun 13, 2023 •

edited

Loading

evmar commented Jun 14, 2023

use serde instead of hand rolled serialization #18

use serde instead of hand rolled serialization #18

Conversation

piscisaureus commented Mar 29, 2022

evmar commented Mar 30, 2022

piscisaureus commented Mar 30, 2022 • edited Loading

evmar commented Mar 30, 2022

evmar commented Mar 30, 2022

piscisaureus commented Mar 30, 2022

piscisaureus commented Mar 30, 2022

piscisaureus commented Mar 30, 2022 • edited Loading

Hand rolled:

Serde without buffering:

Serde with BufReader/BufWriter:

evmar commented Apr 2, 2022

GrigorenkoPV commented Jun 13, 2023 • edited Loading

evmar commented Jun 14, 2023

piscisaureus commented Mar 30, 2022 •

edited

Loading

piscisaureus commented Mar 30, 2022 •

edited

Loading

GrigorenkoPV commented Jun 13, 2023 •

edited

Loading