[BUG] Memory leak in TimeSeriesDataSet #648

- PyTorch-Forecasting version: 0.8.5
- PyTorch version: 1.9.0
- PyTorch Lightning version: 1.4.0
- Python version: 3.8
- Operating System: MacOS 11.4

### Expected behavior

When a `TimeSeriesDataSet` instance is no longer being used, I'd expect the memory it uses to be released. 

### Actual behavior

Instead, memory seems to accumulate when creating multiple instances of a `TimeSeriesDataSet`, which is what happens under the hood when calling e.g. the `predict` method on the `TemporalFusionTransformer` class with a pandas `DataFrame`. This causes my deployment that serves predictions using a `TemporalFusionTransformer` to get `OOMKilled` after some time.

### Code to reproduce the problem

A minimal example can be found below. Run this script locally and monitor your machine's memory usage: it grows with each iteration of the loop. Uncommenting the last lines, where some of the attributes are explicitly set to `None`, reduces the growth but does not eliminate it.

```python
import numpy as np
import pandas as pd
from pytorch_forecasting import TimeSeriesDataSet

test_data = pd.DataFrame(
    {
        "value": np.random.rand(3000000) - 0.5,
        "group": np.repeat(np.arange(3), 1000000),
        "time_idx": np.tile(np.arange(1000000), 3),
    }
)

# Memory accumulates when creating `TimeSeriesDataSet`s. Seems like not everything is
# being garbage collected after a `TimeSeriesDataSet` instance is no longer used.
for i in range(100):
    print("Creating dataset ", i)
    dataset = TimeSeriesDataSet(
        test_data,
        group_ids=["group"],
        target="value",
        time_idx="time_idx",
        min_encoder_length=5,
        max_encoder_length=5,
        min_prediction_length=2,
        max_prediction_length=2,
        time_varying_unknown_reals=["value"],
        predict_mode=False,
    )

    # Uncommenting the following lines helps to reduce the memory leak, but does not
    # completely solve it. Some memory is still not released.
    # dataset.index = None
    # dataset.data = None
```
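To narrow down where the memory goes, a small helper along these lines might be useful. It is only a sketch using the standard library: `track_allocations` is a hypothetical name, not part of PyTorch Forecasting, and it assumes the object returned by the factory supports weak references (ordinary Python class instances do). For each iteration it records the bytes currently traced by `tracemalloc` and whether the object is still alive after `del` and `gc.collect()`.

```python
import gc
import tracemalloc
import weakref


def track_allocations(factory, iterations=5):
    """Call ``factory`` repeatedly, drop each result, and record memory.

    Hypothetical diagnostic helper. Returns one ``(traced_bytes, still_alive)``
    tuple per iteration, where ``still_alive`` is True if something other than
    our local variable is still holding a reference to the created object.
    """
    tracemalloc.start()
    samples = []
    try:
        for _ in range(iterations):
            obj = factory()
            ref = weakref.ref(obj)  # a weak reference does not keep obj alive
            del obj
            gc.collect()  # collect reference cycles before measuring
            current, _peak = tracemalloc.get_traced_memory()
            samples.append((current, ref() is not None))
    finally:
        tracemalloc.stop()
    return samples
```

Passing a `lambda` that wraps the `TimeSeriesDataSet(...)` call from the script above as `factory` should show whether the instances themselves survive (`still_alive` is `True`) or whether they are collected while traced memory keeps growing. Note that `tracemalloc` only sees allocations reported through Python's allocator hooks, so memory allocated directly by native code may not appear; process-level memory usage remains the final check.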

