from_arrow is not zero-cost
#17409
Labels: bug (Something isn't working), needs triage (Awaiting prioritization by a maintainer), python (Related to Python Polars)
Checks
Reproducible example
Log output
No response
Issue description
I have an Arrow table fetched from a cloud service. The table is 4 GB. After the download, the max resident memory usage is 6 GB (the process needs some extra memory on top of the table). I then convert the table with from_arrow and the max resident memory usage spikes to 8 GB, implying that the data was copied. The data types in the table are 64-bit floats and UTF-8 strings. It has 80M entries and 7 columns.
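The measurement described above can be sketched roughly as follows. This is a minimal illustration, not the code used for the report: the helper names `max_rss_bytes` and `peak_rss_growth` are hypothetical, the `resource` module is Unix-only, and the stand-in workload simply allocates a buffer instead of converting a real Arrow table.

```python
import resource
import sys


def max_rss_bytes() -> int:
    """Return the peak resident set size of this process so far, in bytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    return peak if sys.platform == "darwin" else peak * 1024


def peak_rss_growth(convert):
    """Run convert() and return (result, growth of peak RSS in bytes).

    Hypothetical helper: `convert` is any zero-argument callable.
    """
    before = max_rss_bytes()
    result = convert()
    after = max_rss_bytes()
    return result, after - before


if __name__ == "__main__":
    size = 200 * 1024 * 1024
    # Stand-in workload: a filled 200 MiB buffer, so its pages are actually written.
    buf, growth = peak_rss_growth(lambda: b"\x01" * size)
    print(f"peak RSS grew by {growth / 2**20:.0f} MiB")
```

With Polars and pyarrow installed, the same helper could presumably wrap the conversion, for example `peak_rss_growth(lambda: pl.from_arrow(table))`, where `table` is the downloaded `pyarrow.Table`. Because peak RSS is a high-water mark, the reported growth only reflects memory needed beyond the previous peak, so a growth near zero would be consistent with a zero-copy conversion, while a growth of several GB would suggest a copy.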
Expected behavior
I expect the conversion to be zero-copy.
Installed versions