There appears to be a memory leak in DataFrame.dropna(inplace=True). Please see the IPython session below:
In [1]:
import ipython_memory_usage.ipython_memory_usage as imu
imu.start_watching_memory()
import pandas as pd
import numpy as np
pd.__version__
Out[1]: '0.16.2'
In [1] used 23.6875 MiB RAM in 0.70s, peaked 0.00 MiB above current, total RAM usage 59.76 MiB
In [2]:
df = pd.DataFrame(np.ones(1e7))
df.info(memory_usage=True)
<class 'pandas.core.frame.DataFrame'>
Int64Index: 10000000 entries, 0 to 9999999
Data columns (total 1 columns):
0 float64
dtypes: float64(1)
memory usage: 152.6 MB
In [2] used 153.3477 MiB RAM in 0.14s, peaked 0.00 MiB above current, total RAM usage 213.11 MiB
In [3]:
df.dropna(inplace=True)
df.head(2)
Out[3]:
0
0 1
1 1
In [3] used 0.1758 MiB RAM in 0.60s, peaked 385.92 MiB above current, total RAM usage 213.29 MiB
In [4]:
df.dropna(inplace=True)
df.head(2)
Out[4]:
0
0 1
1 1
In [4] used 152.9375 MiB RAM in 0.61s, peaked 182.86 MiB above current, total RAM usage 366.22 MiB
In [5]:
df.dropna(inplace=True)
df.head(2)
Out[5]:
0
0 1
1 1
In [5] used 152.9297 MiB RAM in 0.58s, peaked 272.79 MiB above current, total RAM usage 519.15 MiB
Re-running cells 3, 4 and 5 adds around 150 MB to the total memory usage each time. The growth only shows up when df.head() is run after the in-place dropna.
The same growth can also be seen in a cell that does not drop NaNs in place:
df = df.dropna()
df.head(2)
but this only happens if df.dropna(inplace=True) has already been run within the same cell.
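To help narrow this down, here is a minimal sketch of how the same steps could be run outside IPython. It rests on an untested hypothesis that is not confirmed anywhere in this report: that the growth comes from something holding on to each df.head(2) result (as IPython's output history might), rather than from dropna itself. The helper name traced_growth, its parameters, and the kept list are all made up for illustration, and tracemalloc will not report the same numbers as ipython_memory_usage.

```python
import gc
import tracemalloc

import numpy as np
import pandas as pd


def traced_growth(n_rows=1_000_000, n_iter=3, keep_heads=True):
    """Return the traced allocation size (MiB) after each dropna/head round.

    Hypothetical helper: `kept` stands in for whatever might be holding
    references to the head() results in an interactive session.
    """
    kept = []
    tracemalloc.start()
    df = pd.DataFrame(np.ones(n_rows))
    sizes = []
    for _ in range(n_iter):
        # Same two statements as cells 3 to 5 in the session above.
        df.dropna(inplace=True)
        head = df.head(2)
        if keep_heads:
            kept.append(head)  # assumption: mimics a retained Out[n] value
        del head
        gc.collect()
        current, _peak = tracemalloc.get_traced_memory()
        sizes.append(current / 2**20)
    tracemalloc.stop()
    return sizes


if __name__ == "__main__":
    print("heads kept:   ", traced_growth(keep_heads=True))
    print("heads dropped:", traced_growth(keep_heads=False))
```

If the traced size only climbs when keep_heads=True, that would point towards retained references rather than a leak inside dropna. If it climbs in both cases, the problem is more likely in dropna itself. Results may well differ between pandas versions.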