[C++][Parquet] Thrift: generate template method to accelerate reading thrift #41702

Closed
mapleFU opened this issue May 17, 2024 · 2 comments

Comments

@mapleFU
Member

mapleFU commented May 17, 2024

Describe the enhancement requested

Enable the Thrift C++ compiler's option to generate template methods for the Thrift definitions.

Pro: This avoids many virtual function calls during deserialization.
Cons: More generated methods.

Component(s)

C++, Parquet

@mapleFU
Member Author

mapleFU commented May 17, 2024

After:

BM_ReadOffsetIndex/num_pages:8                          669 ns          657 ns      1065579 bytes_per_second=135.025M/s items_per_second=12.1793M/s
BM_ReadOffsetIndex/num_pages:64                        2898 ns         2821 ns       248112 bytes_per_second=258.966M/s items_per_second=22.6879M/s
BM_ReadOffsetIndex/num_pages:512                      19916 ns        19726 ns        35852 bytes_per_second=316.668M/s items_per_second=25.9557M/s
BM_ReadOffsetIndex/num_pages:1024                     38858 ns        38746 ns        17122 bytes_per_second=325.143M/s items_per_second=26.4285M/s
BM_ReadColumnIndex<Int64Type>/num_pages:8              1053 ns         1035 ns       682301 bytes_per_second=157.502M/s items_per_second=7.72643M/s
BM_ReadColumnIndex<Int64Type>/num_pages:64             3746 ns         3729 ns       185681 bytes_per_second=331.2M/s items_per_second=17.1633M/s
BM_ReadColumnIndex<Int64Type>/num_pages:512           25504 ns        23972 ns        29741 bytes_per_second=408.125M/s items_per_second=21.3579M/s
BM_ReadColumnIndex<Int64Type>/num_pages:1024          46253 ns        46073 ns        15157 bytes_per_second=424.313M/s items_per_second=22.2256M/s
BM_ReadColumnIndex<DoubleType>/num_pages:8             1048 ns         1037 ns       676113 bytes_per_second=157.265M/s items_per_second=7.71483M/s
BM_ReadColumnIndex<DoubleType>/num_pages:64            4462 ns         3900 ns       177688 bytes_per_second=316.707M/s items_per_second=16.4122M/s
BM_ReadColumnIndex<DoubleType>/num_pages:512          24274 ns        23614 ns        29629 bytes_per_second=414.318M/s items_per_second=21.682M/s
BM_ReadColumnIndex<DoubleType>/num_pages:1024         46635 ns        46293 ns        15047 bytes_per_second=422.297M/s items_per_second=22.12M/s
BM_ReadColumnIndex<FLBAType>/num_pages:8               1060 ns         1055 ns       658359 bytes_per_second=154.538M/s items_per_second=7.58105M/s
BM_ReadColumnIndex<FLBAType>/num_pages:64              4200 ns         3860 ns       182470 bytes_per_second=319.921M/s items_per_second=16.5788M/s
BM_ReadColumnIndex<FLBAType>/num_pages:512            23811 ns        23545 ns        28779 bytes_per_second=415.526M/s items_per_second=21.7452M/s
BM_ReadColumnIndex<FLBAType>/num_pages:1024           45753 ns        45554 ns        15271 bytes_per_second=429.147M/s items_per_second=22.4788M/s
BM_ReadColumnIndex<ByteArrayType>/num_pages:8           983 ns          976 ns       708366 bytes_per_second=167.154M/s items_per_second=8.19994M/s
BM_ReadColumnIndex<ByteArrayType>/num_pages:64         3389 ns         3299 ns       214331 bytes_per_second=374.309M/s items_per_second=19.3972M/s
BM_ReadColumnIndex<ByteArrayType>/num_pages:512       25683 ns        21360 ns        33753 bytes_per_second=458.051M/s items_per_second=23.9706M/s
BM_ReadColumnIndex<ByteArrayType>/num_pages:1024      40311 ns        39316 ns        17610 bytes_per_second=497.237M/s items_per_second=26.0454M/s

Before:

BM_ReadOffsetIndex/num_pages:8                          980 ns          836 ns       820749 bytes_per_second=106.102M/s items_per_second=9.57045M/s
BM_ReadOffsetIndex/num_pages:64                        3735 ns         3546 ns       198467 bytes_per_second=206.025M/s items_per_second=18.0497M/s
BM_ReadOffsetIndex/num_pages:512                      31427 ns        26145 ns        28486 bytes_per_second=238.919M/s items_per_second=19.583M/s
BM_ReadOffsetIndex/num_pages:1024                     48456 ns        47966 ns        14038 bytes_per_second=262.643M/s items_per_second=21.3483M/s
BM_ReadColumnIndex<Int64Type>/num_pages:8              1224 ns         1173 ns       625894 bytes_per_second=139.003M/s items_per_second=6.81895M/s
BM_ReadColumnIndex<Int64Type>/num_pages:64             3920 ns         3892 ns       176412 bytes_per_second=317.285M/s items_per_second=16.4422M/s
BM_ReadColumnIndex<Int64Type>/num_pages:512           25308 ns        24824 ns        28486 bytes_per_second=394.119M/s items_per_second=20.6249M/s
BM_ReadColumnIndex<Int64Type>/num_pages:1024          49556 ns        47995 ns        14693 bytes_per_second=407.321M/s items_per_second=21.3356M/s
BM_ReadColumnIndex<DoubleType>/num_pages:8             1234 ns         1160 ns       650715 bytes_per_second=140.642M/s items_per_second=6.89934M/s
BM_ReadColumnIndex<DoubleType>/num_pages:64            4716 ns         4138 ns       178391 bytes_per_second=298.467M/s items_per_second=15.467M/s
BM_ReadColumnIndex<DoubleType>/num_pages:512          28596 ns        25773 ns        25998 bytes_per_second=379.605M/s items_per_second=19.8654M/s
BM_ReadColumnIndex<DoubleType>/num_pages:1024         48784 ns        47841 ns        14408 bytes_per_second=408.636M/s items_per_second=21.4045M/s
BM_ReadColumnIndex<FLBAType>/num_pages:8               1169 ns         1156 ns       598352 bytes_per_second=141.084M/s items_per_second=6.92104M/s
BM_ReadColumnIndex<FLBAType>/num_pages:64              4052 ns         3924 ns       176630 bytes_per_second=314.697M/s items_per_second=16.3081M/s
BM_ReadColumnIndex<FLBAType>/num_pages:512            27219 ns        25592 ns        29032 bytes_per_second=382.301M/s items_per_second=20.0064M/s
BM_ReadColumnIndex<FLBAType>/num_pages:1024           46800 ns        46665 ns        14965 bytes_per_second=418.933M/s items_per_second=21.9438M/s
BM_ReadColumnIndex<ByteArrayType>/num_pages:8          1084 ns         1074 ns       642573 bytes_per_second=151.872M/s items_per_second=7.45026M/s
BM_ReadColumnIndex<ByteArrayType>/num_pages:64         3464 ns         3420 ns       201861 bytes_per_second=361.132M/s items_per_second=18.7144M/s
BM_ReadColumnIndex<ByteArrayType>/num_pages:512       21252 ns        20953 ns        33105 bytes_per_second=466.948M/s items_per_second=24.4362M/s
BM_ReadColumnIndex<ByteArrayType>/num_pages:1024      41540 ns        41089 ns        17490 bytes_per_second=475.776M/s items_per_second=24.9213M/s

It's about 20% faster for BM_ReadOffsetIndex here. The BM_ReadColumnIndex gains are smaller (up to about 12% in CPU time).

pitrou pushed a commit that referenced this issue May 22, 2024
…te reading thrift (#41703)

### Rationale for this change

By default, the Thrift serializer and deserializer call many virtual functions. However, the Thrift C++ compiler has an option to generate template methods that does away with the cost of calling virtual functions. It seems to make the metadata read/write benchmarks around 10% faster.

### What changes are included in this PR?

1. `cpp/build-support/update-thrift.sh`: enable the `templates` option in the Thrift C++ compiler arguments
2. `cpp/src/parquet/thrift_internal.h`: use generated code
3. `cpp/src/generated`: update generated files.

### Are these changes tested?

Covered by existing tests.

### Are there any user-facing changes?

No.

* GitHub Issue: #41702

Authored-by: mwish <maplewish117@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
@pitrou
Member

pitrou commented May 22, 2024

Issue resolved by pull request 41703
#41703

@pitrou pitrou added this to the 17.0.0 milestone May 22, 2024
@pitrou pitrou closed this as completed May 22, 2024
vibhatha pushed a commit to vibhatha/arrow that referenced this issue May 25, 2024
…celerate reading thrift (apache#41703)
