Instead of creating one chunk per RowGroup, we should read columns, at least those of primitive type, into a single pre-allocated Array. This needs some new functionality in the Record reader classes and thus should be done after apache/parquet-cpp#462 is merged.
Wes McKinney / @wesm:
The main use case would be pandas (where data needs to be contiguous), but memory generally has to be copied anyway when calling pyarrow.Table.to_pandas, so the benefit of this optimization would be minimal, if any. Producing large contiguous arrays could even be more expensive than the current behavior of creating chunked arrays.
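To make the two strategies under discussion concrete, here is a minimal sketch in plain Python. It does not use the Parquet or Arrow readers at all: `row_groups`, `read_chunked`, and `read_contiguous` are hypothetical names invented for illustration, and each inner list merely stands in for one row group of a single int64 column.

```python
from array import array

# Hypothetical stand-in for a file: one inner list per row group (int64 column).
row_groups = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]


def read_chunked(groups):
    """Chunk-per-row-group strategy: one independent buffer per row group."""
    return [array("q", g) for g in groups]


def read_contiguous(groups):
    """Pre-allocation strategy: size the buffer once, then fill it in place."""
    # Assumption: the total row count is known up front (e.g. from metadata).
    total = sum(len(g) for g in groups)
    out = array("q", [0]) * total  # single allocation of `total` int64 slots
    offset = 0
    for g in groups:
        # Same-length slice assignment overwrites in place without resizing.
        out[offset:offset + len(g)] = array("q", g)
        offset += len(g)
    return out


chunks = read_chunked(row_groups)
flat = read_contiguous(row_groups)
```

In this toy model the contiguous result needs no later concatenation, whereas the chunked result would need one extra copy to become contiguous. Whether that saves anything in practice depends on whether the consumer copies the data anyway, which is the concern raised above.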
Reporter: Uwe Korn / @xhochy
Assignee: Wes McKinney / @wesm
Related issues:
Note: This issue was originally created as ARROW-3774. Please see the migration documentation for further details.