[C++][Parquet] Prebuffer: Avoid calculating prebuffer column bitmap multiple times #36773

Closed
mapleFU opened this issue Jul 19, 2023 · 0 comments · Fixed by #36774

mapleFU commented Jul 19, 2023

Describe the enhancement requested

Following #36192 and #36649, RowGroupReader uses a bitmap to control prebuffering at the column level.

However, when all columns are selected, building that bitmap multiple times adds heavy overhead.

Component(s)

C++, Parquet

pitrou pushed a commit that referenced this issue Jul 24, 2023
…ltiple times (#36774)

### Rationale for this change

Following #36192 and #36649, RowGroupReader uses a bitmap to control prebuffering at the column level.

However, when all columns are selected, building that bitmap multiple times adds heavy overhead.

### What changes are included in this PR?

Build the `Prebuffer` bitmap once and reuse that vector instead of rebuilding it each time.
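
The idea of building the bitmap once and sharing it can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `ColumnBitmap`, `BuildColumnBitmap`, and `RecordPrebuffered` are hypothetical names, and the sketch assumes the bitmap depends only on the selected column indices, so one instance can be shared by every row group.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <vector>

// Hypothetical names throughout; this is not the Arrow/Parquet API.
using ColumnBitmap = std::vector<bool>;

// Marks the selected column indices in a bitmap with num_columns entries.
std::shared_ptr<const ColumnBitmap> BuildColumnBitmap(
    std::size_t num_columns, const std::vector<int>& column_indices) {
  auto bitmap = std::make_shared<ColumnBitmap>(num_columns, false);
  for (int idx : column_indices) {
    assert(idx >= 0 && static_cast<std::size_t>(idx) < num_columns);
    (*bitmap)[static_cast<std::size_t>(idx)] = true;
  }
  return bitmap;
}

// Maps each requested row group to the same shared bitmap, so the bitmap
// is built once per call rather than once per row group.
std::map<int, std::shared_ptr<const ColumnBitmap>> RecordPrebuffered(
    const std::vector<int>& row_groups, std::size_t num_columns,
    const std::vector<int>& column_indices) {
  auto shared_bitmap = BuildColumnBitmap(num_columns, column_indices);
  std::map<int, std::shared_ptr<const ColumnBitmap>> prebuffered;
  for (int rg : row_groups) {
    prebuffered[rg] = shared_bitmap;  // copies the pointer, not the bitmap
  }
  return prebuffered;
}
```

With this layout the cost of building the bitmap is paid once per prebuffer request, and each row group entry only holds a pointer to it.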

### Are these changes tested?

no

### Are there any user-facing changes?

no

* Closes: #36773

Authored-by: mwish <maplewish117@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
@pitrou pitrou added this to the 14.0.0 milestone Jul 24, 2023
R-JunmingChen pushed a commit to R-JunmingChen/arrow that referenced this issue Aug 20, 2023
…map multiple times (apache#36774)

### Rationale for this change

According to apache#36192 and apache#36649 . RowGroupReader using a bitmap to control a column-level prebuffer.

However, if all columns are selected, this will be a heavy overhead for building a bitmap multiple times.

### What changes are included in this PR?

Build `Prebuffer` Bitmap once, and reuse that vector.

### Are these changes tested?

no

### Are there any user-facing changes?

no

* Closes: apache#36773

Authored-by: mwish <maplewish117@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
loicalleyne pushed a commit to loicalleyne/arrow that referenced this issue Nov 13, 2023
…map multiple times (apache#36774)

### Rationale for this change

According to apache#36192 and apache#36649 . RowGroupReader using a bitmap to control a column-level prebuffer.

However, if all columns are selected, this will be a heavy overhead for building a bitmap multiple times.

### What changes are included in this PR?

Build `Prebuffer` Bitmap once, and reuse that vector.

### Are these changes tested?

no

### Are there any user-facing changes?

no

* Closes: apache#36773

Authored-by: mwish <maplewish117@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>