
Rerunning a cell with a DataFrame causes a memory leak #1391

Open
@kelarMai

Description


Environment

Using VS Code Remote-SSH to connect to a Debian server and running Jupyter.

debian=12
vscode=1.99.0
python=3.10
ipykernel=6.29.5
pandas=2.2.3

Reproduce

Create a big DataFrame:

import pandas as pd
import numpy as np

np.random.seed(0)
num_rows = 10_000_000
num_cols = 10
# create random integer data
data = np.random.randint(0, 100, size=(num_rows, num_cols))
df = pd.DataFrame(data, columns=[f'col_{i}' for i in range(num_cols)])
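
For reference, the raw data is num_rows * num_cols * 8 bytes (int64) = 800,000,000 bytes, roughly 763 MiB, which matches the per-rerun growth shown below. A quick sanity check with pandas' built-in accounting:

print(df.memory_usage(deep=True).sum() / 2**20)  # ~763 MiB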

Monitor the memory usage:

%load_ext ipython_memory_usage
%imu_start

When the cell below is rerun, the memory used keeps increasing:

df_temp = df.copy(deep=True)
df_temp.head()

On my machine, the results look like:

[Out] In [4] used 763.6 MiB RAM in 0.55s (system mean cpu 40%, single max cpu 100%), peaked 0.0 MiB above final usage, current RAM usage now 1632.0 MiB

[Out] In [5] used 763.0 MiB RAM in 0.55s (system mean cpu 19%, single max cpu 100%), peaked 0.0 MiB above final usage, current RAM usage now 2394.9 MiB

[Out] In [6] used 763.1 MiB RAM in 0.55s (system mean cpu 25%, single max cpu 100%), peaked 0.0 MiB above final usage, current RAM usage now 3158.0 MiB
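
Each rerun grows usage by almost exactly one full copy of the data (~763 MiB). That is consistent with IPython's output cache: the last expression of the cell, df_temp.head(), is stored in Out (and _, __, ___), and as long as that cached result still references the previous copy's data, rebinding df_temp cannot free it. A quick check in the same session (a sketch based on IPython's standard Out history dict, not part of the original report):

print(len(Out))              # grows by one after each rerun of the cell above
print(type(Out[max(Out)]))   # pandas DataFrame: the cached head() result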

Attempts to solve

After many tries, I found that the trigger is df_temp.head(). If only df_temp = df.copy(deep=True) is rerun, the memory used does not increase.

If the code is changed to

df_temp = df.copy(deep=True)
df_temp.head().copy(deep=True)

then rerunning it does not increase memory either.
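
This pattern would be explained by head() returning a view: df_temp.head() slices the first five rows without copying, so the small cached result still references df_temp's full data block, while the extra .copy(deep=True) gives the output cache a tiny independent object instead. A sketch of how to verify the sharing (np.shares_memory is a standard NumPy helper; the view behavior is my assumption for this pandas version without copy-on-write enabled):

import numpy as np

h = df_temp.head()
# True means the 5-row result still references df_temp's full block,
# so caching h keeps the whole ~763 MiB alive:
print(np.shares_memory(h['col_0'].values, df_temp['col_0'].values))

h2 = df_temp.head().copy(deep=True)
# False: the deep copy owns its own (tiny) data
print(np.shares_memory(h2['col_0'].values, df_temp['col_0'].values))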

I have tried

import gc
gc.collect()  # force a full garbage-collection pass

or

from IPython.display import clear_output
clear_output()

but neither frees the memory.
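
If the output-cache explanation above is right, gc.collect() and clear_output() cannot help, because the cached results are still strongly referenced by Out. A possible workaround (a hedged suggestion, not from the original report; both are standard IPython magics for exactly this situation):

%reset -f out    # clear the output cache (Out, _, __, ___)
%xdel df_temp    # delete a variable from everywhere IPython references it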

Another similar issue
