Excessive Memory Consumption During Rolling Operations on Large DataFrames #16052
Labels
bug
needs triage
python
Checks
Reproducible example
Log output
Issue description
During rolling operations over a large window, memory consumption grows to between 3 and 10 times the initial memory footprint. The DataFrame holds approximately 2000 categories spanning a 22-year period, with an initial memory footprint of around 1 to 2 gigabytes on the user's machine. When the rolling and aggregation functions run, memory use escalates dramatically and can exhaust all available system memory. Notably, on Windows, memory remains at the elevated level even after the operation completes.
Steps to Reproduce:
1. Instantiate a DataFrame with approximately 2000 categories representing data spanning a 22-year period.
2. Execute rolling and aggregation operations on the DataFrame.
3. Monitor memory consumption during the operation.
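For the monitoring step, a minimal sketch of how the peak memory could be recorded around the operation, using only the Python standard library. The helper names `peak_rss_mib` and `run_and_report` are hypothetical and not part of Polars. The `resource` module is Unix-only, so this does not cover the Windows case, and because `ru_maxrss` is a high-water mark it cannot show whether memory is released afterwards; for that, the current RSS would have to be sampled instead (for example with the third-party `psutil` package).

```python
import resource
import sys
from typing import Any, Callable, Tuple


def peak_rss_mib() -> float:
    """Return the peak resident set size of this process so far, in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def run_and_report(operation: Callable[[], Any]) -> Tuple[Any, float, float]:
    """Run `operation` and return (result, peak MiB before, peak MiB after).

    `operation` is a zero-argument callable wrapping the workload under
    test, e.g. the rolling and aggregation call on the DataFrame.
    """
    before = peak_rss_mib()
    result = operation()
    after = peak_rss_mib()
    return result, before, after
```

Wrapping the rolling and aggregation call in a zero-argument function and passing it to `run_and_report` gives a before/after peak that can be compared against the DataFrame's own size.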
Expected behavior
Memory consumption should remain within reasonable bounds relative to the initial size of the DataFrame, even during intensive rolling and aggregation operations.
Installed versions