GH-14951: [C++][Parquet] Add benchmarks for DELTA_BINARY_PACKED encoding #15140
      state, [](size_t number) { return std::vector<int32_t>(number, 64); });
}

static void BM_DeltaBitPackingEncode_Int64_Equals(benchmark::State& state) {
Do we expect different performance on this than on narrow data? Otherwise we might as well not bother.
In fact, I don't think encoding performs any differently at the moment. It may have an impact once compression is introduced. I will paste my test results later.
On my PC, the CPU is:
with RelWithDebInfo, benchmark result:
For comparison, the PLAIN result is:
and the dictionary result is:
|
(DeltaBitPackingDecode_Int32 runs much faster on my Mac with |
Much faster than what?
Than running on my PC :) Hold on a few minutes, I'll take a bath and come back to upload some flamegraphs I ran with
Output is:
And the PLAIN is:
|
That's indeed quite impressive... |
After enable
|
Well, you should never run benchmarks in Debug mode. |
It seems our tests are really unstable:
|
Yes, the S3 tests are unfortunately a bit flaky, especially under Windows. |
+1, will merge if CI passes
Thanks. I'm sleepy now; it seems the performance differences between
Benchmark runs are scheduled for baseline = ceec795 and contender = 8ed4513. 8ed4513 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
['Python', 'R'] benchmarks have high level of regressions. |
… encoding (apache#15140) This patch support benchmark for DELTA_BINARY_PACKED. Different from PLAIN, it should considering the cases that data can or cannot be well compressed * Closes: apache#14951 Lead-authored-by: mwish <anmmscs_maple@qq.com> Co-authored-by: mwish <maplewish117@gmail.com> Co-authored-by: Antoine Pitrou <antoine@python.org> Signed-off-by: Antoine Pitrou <antoine@python.org>
This patch adds benchmarks for DELTA_BINARY_PACKED. Unlike PLAIN, this encoding needs benchmarks that cover both cases: data that compresses well and data that does not.
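To illustrate why the input data matters for this encoding, here is a minimal sketch of how many bits each delta would need for a given input. It is not Arrow code: `DeltaBitWidth` is a hypothetical helper, and it treats the whole input as a single block, whereas the real encoder works on blocks and miniblocks.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of Arrow or Parquet.
// Returns the number of bits needed to store each delta of `values` after
// subtracting the minimum delta, treating the input as one block.
int DeltaBitWidth(const std::vector<int64_t>& values) {
  if (values.size() < 2) return 0;

  // Compute consecutive deltas using unsigned arithmetic so that
  // wraparound is well defined.
  std::vector<int64_t> deltas;
  deltas.reserve(values.size() - 1);
  for (std::size_t i = 1; i < values.size(); ++i) {
    deltas.push_back(static_cast<int64_t>(static_cast<uint64_t>(values[i]) -
                                          static_cast<uint64_t>(values[i - 1])));
  }

  // Deltas are stored relative to the smallest delta in the block.
  const int64_t min_delta = *std::min_element(deltas.begin(), deltas.end());
  uint64_t max_adjusted = 0;
  for (int64_t d : deltas) {
    max_adjusted = std::max(
        max_adjusted, static_cast<uint64_t>(d) - static_cast<uint64_t>(min_delta));
  }

  // Count the bits needed to represent the largest adjusted delta.
  int width = 0;
  while (max_adjusted != 0) {
    ++width;
    max_adjusted >>= 1;
  }
  return width;
}
```

Under these assumptions, an all-equal input such as `std::vector<int64_t>(number, 64)` needs zero bits per delta, while an input that alternates between distant values needs a wide bit width, so the two cases exercise quite different amounts of bit-packing work.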