Cut parse time almost in half #40
dtolnay
added some commits
Sep 4, 2018
Serde and serde_json author here by the way.
This is awesome -- thank you! For whatever it's worth, here is the exact input file I used in my demo (~30MB compressed). As I mentioned there, I really haven't done any analysis on this, so it's not a surprise that there is some low-hanging fruit. (And I definitely missed serde_json's ability to handle concatenated JSON!) That said, I would love to know the methodology here: if you don't mind me asking, what tools do you/did you use for profiling? Thanks again for both serde and serde_json (truly awesome work) -- and for your interest in my little goober!
Just callgrind in this case.
$ valgrind --tool=callgrind ./target/release/statemap sample.out >rust.svg
$ callgrind_annotate --auto=yes callgrind.out.*
Here is the top of the callgrind output before this PR. (Again isolated to parsing only.)
3,532,171,764 PROGRAM TOTALS
1,093,993,901 statemap::statemap::Statemap::json_end
486,996,711 <serde_json::read::StrRead<'a> as serde_json::Read<'a>>::parse_str
458,993,115 <&'a mut serde_json::de::Deserializer<R> as serde::de::Deserializer<'de>>::deserialize_struct
138,999,100 statemap::statemap::Statemap::ingest
111,998,799 <&'a mut serde_json::de::Deserializer<R> as serde::de::Deserializer<'de>>::deserialize_string
90,998,635 core::num::<impl core::str::FromStr for u64>::from_str
90,002,430 jemalloc.c:mallocx
72,001,944 jemalloc.c:sdallocx
65,896,974 jemalloc.c:isfree
60,158,832 tcache.h:isfree
60,071,035 tcache.h:mallocx
56,999,145 serde_json::de::from_str
51,999,220 <core::marker::PhantomData<T> as serde::de::DeserializeSeed<'de>>::deserialize
49,999,328 statemap::statemap::Statemap::json_start
From this I know:
Then I confirm that what callgrind suggests is likely to help does actually help (it doesn't always). The old-fashioned way:
masklinn
referenced this pull request
Sep 9, 2018
Open
serde(untagged) has significantly worse performances than hand-rolling the match #1381
Just wanted to follow up over here to let you know that I haven't dropped this! I've been looking at the performance of your suggestions (both of which improve performance, but I'm trying to quantify them separately and in a lab environment), and taking apart why Rust is performing so much better than C. ;) Anyway, more to come -- and thank you again for your time and help!
gnzlbg
commented
Sep 19, 2018
You are probably already aware of this but @bcantrill might not be. There is a neat tool for benchmarking cli applications called |
dtolnay commented Sep 4, 2018
I used the following program to generate a 1.6 gigabyte sample input that resembles what you show in your presentation. Hopefully this is sufficiently representative to perform useful optimizations as far as parsing. In the presentation it looked like these StatemapInputDatum values make up the bulk of the data.
First commit
Remove homegrown JSON lexing code
Much of the ingest time is being spent in your json_start and json_end functions, not in serde_json. I replaced those with serde_json::StreamDeserializer to read one JSON element at a time without computing length in advance.
On my sample input this commit drops parse time from 10.1 seconds to 6.8 seconds. I isolated parse time by inserting an early return just below the match statement that contains the "illegal time value" error.
Second commit
Avoid allocating String for datum time
A huge amount of the parse time is spent allocating Strings. The string associated with the time of each datum is never used as a string but just parsed to a u64. This commit avoids allocating those as strings in the first place.
On my sample input this commit drops parse time from 6.8 seconds to 5.7 seconds. I isolated parse time by inserting an early return just below the line that contains let nstates: u32.