Add docs and benchmark for JSON flattening parser #1921
<version>0.1.0</version>
</dependency>
<dependency>
<groupId>com.jayway.jsonpath</groupId>
You shouldn't have to add this; it should get pulled in by java-util now.
@gianm removed JsonPath dependency from benchmarks
The benchmarks were unnecessarily creating a StringInputRowParser; I've revised them to use only the JSONParser/JSONPathParser. New benchmark numbers:
Benchmark Mode Cnt Score Error Units
baseline is the old JSONParser
After discussing some profiling results with @gianm today, I added a 4th benchmark that extracts root-level fields with JsonPath (i.e., it specifies a "nested" field but with a root expression). This was done to get an idea of the overhead added by JsonPath; normally the new JSONPathParser would read root fields directly from the Jackson-provided Map. With the same flattened input as baseline and preflatten:
It looks like:
Given the results, I am +1 for deprecating the old JsonParser and just using the new one.
👍
Before merging, this needs a new druid-api version that includes druid-io/druid-api@e8c3533
Updated pom.xml to use druid-api 3.14. Is this good to merge?
@gianm @himanshug Do you have any more feedback for this?
👍
👍
Add docs and benchmark for JSON flattening parser
Part of a set of 3 related pull requests addressing Druid issue #1839:
metamx/java-util#34 -- new JSON parser
druid-io/druid-api#65 -- ingestion spec modifications
#1921 -- docs and benchmark
Adds documentation and a JMH benchmark for comparing the new JsonPath-based parser with the existing parser.
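For readers unfamiliar with the idea, here is a rough sketch of what "flattening" means for the two kinds of field the benchmark exercises. It is not the JSONPathParser implementation and does not use the JsonPath library: `FlattenSketch`, `readRoot`, `readNested`, and the dotted-path syntax are all hypothetical names chosen for illustration, and the event is assumed to be already parsed into nested Maps. Root-level fields are a single Map lookup, while nested fields have to be walked one path segment at a time.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: hypothetical names, plain java.util, no JsonPath or Druid classes.
public class FlattenSketch {

    // Root-level field: a single map lookup, no path evaluation.
    static Object readRoot(Map<String, Object> event, String field) {
        return event.get(field);
    }

    // Nested field: walk a dotted path such as "foo.bar" one segment at a time.
    // Returns null when a segment is missing or an intermediate value is not a map.
    static Object readNested(Map<String, Object> event, String dottedPath) {
        Object current = event;
        for (String segment : dottedPath.split("\\.")) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(segment);
        }
        return current;
    }

    // Builds a flat row from (output name -> path) specs. Paths without a dot
    // are treated as root fields; everything else goes through the path walker.
    static Map<String, Object> flatten(Map<String, Object> event, Map<String, String> fieldSpecs) {
        Map<String, Object> row = new LinkedHashMap<>();
        for (Map.Entry<String, String> spec : fieldSpecs.entrySet()) {
            String path = spec.getValue();
            Object value = path.contains(".") ? readNested(event, path) : readRoot(event, path);
            row.put(spec.getKey(), value);
        }
        return row;
    }

    public static void main(String[] args) {
        Map<String, Object> inner = new LinkedHashMap<>();
        inner.put("bar", 42);
        Map<String, Object> event = new LinkedHashMap<>();
        event.put("ts", "2015-01-01");
        event.put("foo", inner);

        Map<String, String> specs = new LinkedHashMap<>();
        specs.put("ts", "ts");           // root field
        specs.put("foo_bar", "foo.bar"); // nested field

        System.out.println(flatten(event, specs));
    }
}
```

The extra work in the nested branch is the kind of cost the benchmark is meant to quantify; the real parser's cost profile will differ from this toy walker.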
Example benchmark result:
java -jar target/benchmarks.jar FlattenJSONBenchmark -wi 1 -i 250 -f 1 -v EXTRA
Result "baseline":
19.366 ±(99.9%) 0.268 us/op [Average]
(min, avg, max) = (17.979, 19.366, 28.522), stdev = 1.274
CI (99.9%): [19.098, 19.634] (assumes normal distribution)
Result "flatten":
25.113 ±(99.9%) 0.434 us/op [Average]
(min, avg, max) = (22.311, 25.113, 37.351), stdev = 2.061
CI (99.9%): [24.679, 25.547] (assumes normal distribution)
Run complete. Total time: 00:10:46
Benchmark                      Mode  Cnt   Score   Error  Units
FlattenJSONBenchmark.baseline  avgt  250  19.366 ± 0.268  us/op
FlattenJSONBenchmark.flatten   avgt  250  25.113 ± 0.434  us/op
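As a back-of-the-envelope reading of this example run, the flatten case is about 5.7 us/op slower than baseline, or roughly 30% in relative terms. The snippet below just reproduces that arithmetic from the two reported averages; `OverheadSketch` and `relativeOverhead` are made-up names and are not part of the benchmark code.

```java
// Illustrative arithmetic only; not part of FlattenJSONBenchmark.
public class OverheadSketch {

    // Relative slowdown of the candidate versus the baseline, as a fraction
    // (0.25 means the candidate takes 25% longer per operation).
    static double relativeOverhead(double baselineUsPerOp, double candidateUsPerOp) {
        return (candidateUsPerOp - baselineUsPerOp) / baselineUsPerOp;
    }

    public static void main(String[] args) {
        double baseline = 19.366; // FlattenJSONBenchmark.baseline average, us/op
        double flatten = 25.113;  // FlattenJSONBenchmark.flatten average, us/op

        System.out.printf("absolute: %.3f us/op, relative: %.1f%%%n",
                flatten - baseline, 100 * relativeOverhead(baseline, flatten));
    }
}
```

These figures come from a single fork with one warmup iteration, so they are best read as indicative.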