[Python][Docs] PyArrow Documentation bug dataset.to_batches() #37560

Closed
aru-trackunit opened this issue Sep 5, 2023 · 2 comments · Fixed by #37605

@aru-trackunit
Contributor

aru-trackunit commented Sep 5, 2023

Describe the bug, including details regarding any error messages, version, and platform.

https://github.com/apache/arrow/blob/main/python/pyarrow/_dataset.pyx#L3439

The documented default value is misleading: "128 Ki" suggests that the user should define a chunk size rather than a number of rows.

I would use 2**17 or 128_000 rather than 128 Ki
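
For readers comparing the candidate spellings, here is a minimal sketch of the arithmetic. "Ki" is the binary prefix kibi (1024), so 128 Ki equals 2**17 but not 128_000. The helper `batch_row_counts` is hypothetical and not part of PyArrow; it only illustrates a limit counted in rows, assuming `batch_size` acts as a maximum row count per batch. Real scans may yield smaller batches depending on how the files are laid out.

```python
# "Ki" is the binary prefix kibi (1024), so 128 Ki is not 128,000.
KIBI = 1024
rows_128_ki = 128 * KIBI

print(rows_128_ki)             # 131072
print(rows_128_ki == 2**17)    # True
print(rows_128_ki == 128_000)  # False


def batch_row_counts(total_rows: int, batch_size: int = rows_128_ki) -> list[int]:
    """Hypothetical helper (not part of PyArrow).

    Returns the row count of each batch when total_rows rows are split
    into batches of at most batch_size rows.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    full, rest = divmod(total_rows, batch_size)
    return [batch_size] * full + ([rest] if rest else [])


print(batch_row_counts(300_000))  # [131072, 131072, 37856]
```

The unit here is rows, not bytes, which is the distinction the "128 Ki" wording obscures.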

Component(s)

Documentation, Python

@AlenkaF
Member

AlenkaF commented Sep 5, 2023

I have to agree, it is very technical and can add some confusion. I would suggest simply using 128,000 rows.
Would you be willing to contribute the change?

kou changed the title from "PyArrow Documentation bug dataset.to_batches()" to "[Python][Docs] PyArrow Documentation bug dataset.to_batches()" on Sep 7, 2023
@aru-trackunit
Contributor Author

@AlenkaF thanks for sharing your thoughts. The PR is ready for review.

AlenkaF added this to the 14.0.0 milestone on Sep 12, 2023
AlenkaF pushed a commit that referenced this issue Sep 12, 2023
…128Ki to 128_000 (#37605)

### Rationale for this change

#37560

### Are these changes tested? -> No

### Are there any user-facing changes? -> Documentation
* Closes: #37560

Authored-by: Arkadiusz Rudny <aru@trackunit.com>
Signed-off-by: AlenkaF <frim.alenka@gmail.com>