Skip to content

A benchmark of Rust/serde deserializers on configuration files

Notifications You must be signed in to change notification settings

Canop/bench-config-deserializers

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Purpose and methodology

This program compares the time some serde deserializers take to deserialize some string into a configuration-like struct deriving Deserialize.

The benchmarker also checks the correct round-trip by checking equality of the deserialized config with the source struct (this involves enabling the float_roundtrip feature for serde_json).

A configuration file needs comments, and needs to be convenient enough to be written by humans. For those reasons, JSON isn't suitable, so this benchmark is really dedicated to Hjson, JSON5, YAML, and TOML. For a deeper discussion regarding the choice of a configuration format, read this blog post about configuration formats.

The struct used in this bench is bigger than usual configuration files but otherwise should be quite alike usual configurations. It is generated 10 times with different random seeds, and stored in memory to avoid disk IO perturbing the measurement.

JSON

The serde-json, deser_hjson, sonic-rs, and json5 deserializers are measured with the same JSON file built by serde_json with to_string_pretty.

JSON is a subset of both Hjson and JSON5, that's why a JSON file can be used to benchmark their parsers.

serde_json and sonic_rs are advantaged here, because they don't need to test for meany things you'd normally find in configurations: comments, multi-line texts, alternate ways to write data. They're still interesting as reference points for other deserializers as long as you remember they're not exactly doing the same work.

In this benchmark, the JSON5 deserializer appears slower than other ones. It's very probable it doesn't matter for you: deserializing a standard configuration is still done in less than 10 ms.

TOML

The toml and basic-toml deserializers are tested with the same struct, but encoded in a TOML string.

YAML

The serde_yaml deserializer is tested with the same struct, but encoded in a YAML string.

Results

Here are the results I get on my computer:

Fastest deserializer: serde_json
┌───────────┬─────────────┬─────────────────┬──────────┐
│   crate   │sum durations│diff with fastest│throughput│
├───────────┼─────────────┼─────────────────┼──────────┤
│serde_json │   40.79965ms│              +0%│  506 Mb/s│
│ sonic-rs  │  43.144377ms│              +6%│  478 Mb/s│
│deser-hjson│  91.190384ms│            +124%│  226 Mb/s│
│serde_yaml │ 341.829977ms│            +738%│   46 Mb/s│
│basic-toml │ 361.738851ms│            +787%│   39 Mb/s│
│   toml    │ 466.519594ms│           +1043%│   31 Mb/s│
│   json5   │ 854.551969ms│           +1995%│   24 Mb/s│
└───────────┴─────────────┴─────────────────┴──────────┘

A smaller "diff with fastest" is better, it's based on the sum of the durations of 10 random strings, with a size varying between 1 and 2 MB.

The througput is a little less relevant as some formats are more compact. In the specific serialization we do here, the TOML string is smaller than the JSON string (but depending on how you write it yourself, you may get different results).

To test the benchmark yourself with your hardware, use

cargo run +nightly --release

The +nightly is required by sonic_rs.

If you think some common or tricky patterns aren't well tested, that a config deserializer is missing, that I made an error, etc. please create an issue or contact me on Miaou.

About

A benchmark of Rust/serde deserializers on configuration files

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published