[C++][Parquet] Add benchmarks for DELTA_BINARY_PACKED #14951
Comments
Should it just follow |
That seems like the right place @mapleFU. I think you can go ahead and open a PR! :)
By the way, should we generate the input randomly? It seems that if we use incrementing or fixed numbers, the data of |
Use random data, but with two different magnitudes (narrow or wide), so as to exercise both the easily compressible and the poorly compressible cases.
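A minimal sketch of that suggestion, not the code that ended up in the benchmark. `MakeDeltaData` and `DeltaBitWidth` are hypothetical helpers, not Arrow APIs. The first builds values whose consecutive differences are random within a chosen magnitude. The second estimates how many bits each delta would need after subtracting the minimum delta, computed over the whole input rather than per miniblock as the encoding does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper (not an Arrow API): returns `n` values whose consecutive
// differences are drawn uniformly from [0, max_delta]. The caller is assumed
// to pick `n` and `max_delta` so that the running sum does not overflow.
std::vector<int64_t> MakeDeltaData(std::size_t n, int64_t max_delta, uint32_t seed) {
  assert(max_delta >= 0);
  std::mt19937_64 rng(seed);
  std::uniform_int_distribution<int64_t> dist(0, max_delta);
  std::vector<int64_t> values;
  values.reserve(n);
  int64_t current = 0;
  for (std::size_t i = 0; i < n; ++i) {
    current += dist(rng);
    values.push_back(current);
  }
  return values;
}

// Hypothetical helper: bits needed to store (delta - min_delta) for every
// delta in `values`. This is a whole-input approximation of the bit width
// that a delta encoder would choose.
int DeltaBitWidth(const std::vector<int64_t>& values) {
  if (values.size() < 2) return 0;
  int64_t min_delta = values[1] - values[0];
  int64_t max_delta = min_delta;
  for (std::size_t i = 2; i < values.size(); ++i) {
    const int64_t delta = values[i] - values[i - 1];
    if (delta < min_delta) min_delta = delta;
    if (delta > max_delta) max_delta = delta;
  }
  // Unsigned subtraction avoids signed overflow for extreme delta ranges.
  uint64_t range = static_cast<uint64_t>(max_delta) - static_cast<uint64_t>(min_delta);
  int bits = 0;
  while (range != 0) {
    ++bits;
    range >>= 1;
  }
  return bits;
}
```

For example, `MakeDeltaData(1024, 7, 42)` gives a narrow input whose deltas fit in at most 3 bits, while `MakeDeltaData(1024, (int64_t{1} << 40) - 1, 42)` gives a wide input that can need up to 40 bits per delta. Running the same benchmark over both inputs would cover the two cases described above.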
By the way, what are the differences between |
I'm not convinced it's useful to add compression here. We're not writing the compression routines ourselves.
Yes, you're right. But I think that if compression is not tested, we may only have benchmarks for part of the Parquet read path. For example, a benchmark might only show how much slower the decoder is than PlainDecoder when CPU prefetching works well.
…ing (#15140)
This patch adds benchmarks for DELTA_BINARY_PACKED. Unlike PLAIN, it should consider the cases where the data can or cannot be compressed well.
* Closes: #14951
Lead-authored-by: mwish <anmmscs_maple@qq.com>
Co-authored-by: mwish <maplewish117@gmail.com>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Signed-off-by: Antoine Pitrou <antoine@python.org>
Describe the enhancement requested
Now that we support the DELTA_BINARY_PACKED encoding for both reading and writing, we can add some benchmarks for it.
Component(s)
C++, Parquet