[Ruby] Add Arrow::RecordBatch#each_raw_record #33749

Closed
otegami opened this issue Jan 18, 2023 · 2 comments · Fixed by #37137

Comments

@otegami
Contributor

otegami commented Jan 18, 2023

Target method

Arrow::RecordBatch#raw_records

Proposed feature

Add Arrow::RecordBatch#each_raw_record and Arrow::Table#each_raw_record, iterator counterparts of raw_records that yield one raw record at a time.
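
The difference between the two styles can be sketched in plain Ruby. This is only an illustration of the iterator pattern, not red-arrow's implementation: `ToyRecordBatch` and its hash-of-arrays storage are hypothetical stand-ins, and returning an Enumerator when no block is given is an assumption borrowed from common Ruby convention rather than something this issue specifies.

```ruby
# Hypothetical stand-in for a column-oriented batch. NOT red-arrow's code;
# it only illustrates raw_records (all rows at once) versus
# each_raw_record (one row at a time).
class ToyRecordBatch
  def initialize(columns)
    @columns = columns # e.g. {"id" => [1, 2], "name" => ["a", "b"]}
    @n_rows = columns.values.first&.size || 0
  end

  # Materializes every row into one Array of Arrays.
  def raw_records
    each_raw_record.to_a
  end

  # Yields each row as an Array of plain Ruby values.
  # Assumption: without a block, return an Enumerator (usual Ruby idiom).
  def each_raw_record
    return to_enum(__method__) { @n_rows } unless block_given?

    @n_rows.times do |i|
      yield @columns.values.map { |column| column[i] }
    end
  end
end

batch = ToyRecordBatch.new({"id" => [1, 2, 3], "name" => ["a", "b", "c"]})

# Only the first two rows are built; the third is never materialized.
first_two = batch.each_raw_record.first(2)
```

With the iterator form, a caller that stops early or processes rows one by one never needs to hold the full Array of rows in memory.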

Impact of this request

It makes it possible to iterate over huge datasets, such as those stored in the Apache Parquet format, one record at a time.

Component(s)

Ruby

@kou
Member

kou commented Jan 18, 2023

Arrow::Table#each_raw_record is also needed.

@otegami
Contributor Author

otegami commented Jan 18, 2023

Thank you for reviewing it. I've just fixed it.

kou closed this as completed in #37137 Sep 5, 2023
kou added a commit that referenced this issue Sep 5, 2023
### Rationale for this change

This change allows for efficient iteration over large datasets, particularly those utilizing the Apache Parquet format.

### What changes are included in this PR?

- Add the following methods to make the raw_records method iterable.
  - Arrow::RecordBatch#each_raw_record
  - Arrow::Table#each_raw_record
- Add related test

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.

This PR is related to #33749
* Closes: #33749

Lead-authored-by: otegami <a.s.takuya1026@gmail.com>
Co-authored-by: takuya kodama <a.s.takuya1026@gmail.com>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
kou added this to the 14.0.0 milestone Sep 5, 2023
loicalleyne pushed a commit to loicalleyne/arrow that referenced this issue Nov 13, 2023
…#37137)

dgreiss pushed a commit to dgreiss/arrow that referenced this issue Feb 19, 2024
…#37137)
