Avro Deserialize

Code for Deserializing Millions Of Messages Per Second Per Core article.

The code shows the evolution of deserialization pipeline for Avro messages and is roughly separated into 4 parts:

Baseline implementation using Apache Avro deserializer (tsdb.benchmark_01);
A manually written parser which parses message by a fixed schema using byte operations on source byte array (tsdb.benchmark_02);
An improvement of manually written parses which does string interning - instead of parsing string fields it matches string payload to unique id such that same payloads has same id and different strings has different ids. It has two sub versions - one stores strings in a HashMap, while another uses array (tsdb.benchmark_03);
A concurrent version of the parser, which uses concurrent string interner to have a synchronized view on string ids (tsdb.benchmark_04).

The parser (with slight changes) is powering deserialization pipeline in Agoda's internal distributed high-load time-series database. tsdb.deserialize.InterningAvroDeserializerArrayTags is the final version of the deserializer with tsdb.interner.ConcurrentStringInterner used for string interning.

How to run benchmark

JDK11+ is required. Install SBT to run benchmarks without IDE, or install Scala plugin and import as SBT project.

Following command will run all benchmarks for 10 forks, 4 threads, 10 warmup iteration, 30 measure iterations, "Fail on error" enabled and 1 second per warmup/measure iteration:

sbt 'jmh:run -f 10 -t 4 -wi 10 -i 30 -w 1 -r 1 -foe true'

With these settings and given 5 benchmarks the whole run will take ~35 minutes. It is advised to not run anything else besides benchmark on the machine and provide sufficient cooling, otherwise performance numbers might drift or be too noisy.

Name	Name	Last commit message	Last commit date
Latest commit svshb More info for readme, added `.size` for StringInterner and a bit of t… Nov 8, 2021 ab858a0 · Nov 8, 2021 History 5 Commits
project	project	init	Jun 23, 2020
src	src	More info for readme, added `.size` for StringInterner and a bit of t…	Nov 8, 2021
.gitignore	.gitignore	init	Jun 23, 2020
.scalafmt.conf	.scalafmt.conf	init	Jun 23, 2020
LICENCE	LICENCE	init	Jun 23, 2020
README.MD	README.MD	More info for readme, added `.size` for StringInterner and a bit of t…	Nov 8, 2021
build.sbt	build.sbt	init	Jun 23, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Avro Deserialize

How to run benchmark

About

Releases

Packages

Languages

License

svshb/AvroDeserialize

Folders and files

Latest commit

History

Repository files navigation

Avro Deserialize

How to run benchmark

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages