Open
Description
Describe the enhancement requested
Currently, we have the following related issues:
- [Python][Dataset][Parquet] Enable Pre-Buffering by default for Parquet s3 datasets #36765
- [Python] Read table stuck and hangs forever #37139
When reading Parquet from S3, it is important to prefetch some files or row groups. However, a larger prefetch depth can increase memory usage. Worse, interfaces such as to_table and the other read paths may behave differently with respect to prefetching. We should therefore document this behavior better.
Component(s)
Documentation, Parquet, Python