[Ruby][Parquet] Add Parquet::ArrowFileReader#each_row_group #36008

Closed
kou opened this issue Jun 9, 2023 · 3 comments · Fixed by #36022

Comments

@kou
Member

kou commented Jun 9, 2023

Describe the enhancement requested

It would be convenient to have a method that yields each row group in turn:

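class Parquet::ArrowFileReader
  # Yields each row group in order. Without a block, returns an Enumerator
  # whose size is computed lazily from n_row_groups.
  def each_row_group
    return to_enum(__method__) {n_row_groups} unless block_given?

    n_row_groups.times do |i|
      yield(read_row_group(i))
    end
  end
end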
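
To illustrate how the proposed method behaves in both its block and enumerator forms, here is a minimal sketch that runs without the Parquet bindings. `FakeReader` is a hypothetical stand-in (not part of red-parquet) that only mimics `n_row_groups` and `read_row_group`; plain arrays play the role of row groups.

```ruby
# Hypothetical stand-in for Parquet::ArrowFileReader. It only mimics the two
# methods the proposal relies on: n_row_groups and read_row_group(i).
class FakeReader
  def initialize(row_groups)
    @row_groups = row_groups
  end

  def n_row_groups
    @row_groups.size
  end

  def read_row_group(i)
    @row_groups[i]
  end

  # Same body as the proposed each_row_group.
  def each_row_group
    return to_enum(__method__) { n_row_groups } unless block_given?

    n_row_groups.times do |i|
      yield(read_row_group(i))
    end
  end
end

# Arrays stand in for row groups here (an assumption for illustration).
reader = FakeReader.new([[1, 2], [3], [4, 5, 6]])

# Block form: row groups are yielded one at a time.
total_rows = 0
reader.each_row_group { |group| total_rows += group.size }

# Enumerator form: the size block reports the number of row groups
# without reading any of them.
enum = reader.each_row_group
group_sizes = enum.map(&:size)
```

Passing a block to `to_enum` lets `enum.size` answer from `n_row_groups` instead of iterating, which matters when each `read_row_group` call is expensive.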

Component(s)

Parquet, Ruby

@kou
Member Author

kou commented Jun 9, 2023

@heronshoes @otegami Do you want to work on this?

@otegami
Contributor

otegami commented Jun 9, 2023

@kou
Yeah, I really want to try it if you don't mind.

@kou
Member Author

kou commented Jun 9, 2023

Go ahead!

kou pushed a commit that referenced this issue Jun 11, 2023
…#36022)

### Rationale for this change
This change lets you read a large Parquet file one row group at a time.
- ref: #36001

### What changes are included in this PR?
- Add Parquet::ArrowFileReader#each_row_group
- Add a test for it

### Are these changes tested?
Yes
- I'm not confident in the test. Could you review it?

### Are there any user-facing changes?
No

* Closes: #36008

Authored-by: otegami <a.s.takuya1026@gmail.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
kou added this to the 13.0.0 milestone Jun 11, 2023