GPVAREstimator - AssertionError: #3066

ArianKhorasani · 2023-11-28T18:13:10Z

Dear @lostella, or maybe @jaheba et al., I need your help!

I'm training a GPVAREstimator on my multivariate time series dataset, which has been converted to a ListDataset, but I get the following AssertionError during training:

```
AssertionError                            Traceback (most recent call last)
Cell In[38], line 1
----> 1 predictor = estimator.train(
      2     training_data = train_ds_residuals,
      3     shuffle_buffer_length = 100,
      4     cache_data = True,
      5 )

File ~/Project/pytorch-transformer-ts/myenv/lib/python3.8/site-packages/gluonts/mx/model/estimator.py:237, in GluonEstimator.train(self, training_data, validation_data, shuffle_buffer_length, cache_data, **kwargs)
    229 def train(
    230     self,
    231     training_data: Dataset,
   (...)
    235     **kwargs,
    236 ) -> Predictor:
--> 237     return self.train_model(
    238         training_data=training_data,
    239         validation_data=validation_data,
    240         shuffle_buffer_length=shuffle_buffer_length,
    241         cache_data=cache_data,
    242     ).predictor

File ~/Project/pytorch-transformer-ts/myenv/lib/python3.8/site-packages/gluonts/mx/model/estimator.py:205, in GluonEstimator.train_model(self, training_data, validation_data, from_predictor, shuffle_buffer_length, cache_data)
    197             transformed_validation_data = Cached(
...
     35         input_dim=self.target_dim,
     36         output_dim=4 * self.distr_output.rank,
     37     )

AssertionError:
```

Please note that target_dim = 7, prediction_length = 1, and context_length = 5. Here is the full GPVAREstimator code I'm using:

```
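# Imports assumed from the GluonTS MXNet API (not shown in the original post)
from gluonts.mx.distribution import MultivariateGaussianOutput
from gluonts.mx.model.gpvar import GPVAREstimator
from gluonts.mx.trainer import Trainer

estimator = GPVAREstimator(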
    prediction_length = 1,
    target_dim = 7,
    freq = '1H',
    context_length = 5,
    num_layers = 4,
    num_cells = 32,
    distr_output = MultivariateGaussianOutput(dim=7),
    trainer = Trainer(ctx = "cpu", epochs = 50, weight_decay = 1e-8, num_batches_per_epoch = 100)
)
predictor = estimator.train(
    training_data = train_ds_residuals,
    shuffle_buffer_length = 100,
    cache_data = True,
)
```

I'd appreciate it if you could help me with this. Thank you!

lostella · 2023-11-28T20:51:37Z

@ArianKhorasani could you provide the entire error trace? It’s not clear which assertion is failing
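
For anyone in a similar situation, here is a minimal sketch of one way to capture a complete trace as plain text, so it can be pasted without notebook truncation. The helper name `run_and_capture` and the stand-in `failing_step` are hypothetical and not part of GluonTS; only the standard-library `traceback` module is used.

```python
import traceback


def run_and_capture(fn, *args, **kwargs):
    """Call fn and return (result, None), or (None, full_traceback_text) on failure."""
    try:
        return fn(*args, **kwargs), None
    except Exception:
        # format_exc() renders the complete traceback of the exception being handled
        return None, traceback.format_exc()


def failing_step():
    # Hypothetical stand-in for a call such as estimator.train(...);
    # raises a bare AssertionError with no message
    assert False


result, trace_text = run_and_capture(failing_step)
print(trace_text)
```

With the code from this issue, the call would presumably look like `run_and_capture(estimator.train, training_data=train_ds_residuals)`, and the returned text could then be pasted in full.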

ArianKhorasani · 2023-11-28T21:02:58Z

@lostella - the error trace that I provided is the entire error that I get. Please check the screenshot below too:

ArianKhorasani · 2023-11-28T21:05:25Z

Dear @lostella - I have already checked the dimensions of my multivariate time series as well. My full dataset code is below:

```
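# Imports assumed (not shown in the original post)
import pandas as pd
from gluonts.dataset.common import ListDataset
from gluonts.dataset.field_names import FieldName

variables = ['DBP', 'SBP', 'Resp', 'Temp', 'HR', 'O2Sat', 'MAP']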
df_actual = pd.read_csv('merged_test.csv')
static_features = df_actual[['patient_id', 'Age', 'Gender', 'HospAdmTime']].drop_duplicates().reset_index(drop=True)
df_residuals = pd.DataFrame()

for variable in variables:
    # First, let's load forecasted values 
    df_forecast = pd.read_csv(f'forecasts_{variable}.csv')

    # Ensure that the data are ordered in the same way
    df_actual = df_actual.sort_values(by=['patient_id', 'ICULOS'])
    df_forecast = df_forecast.sort_values(by=['patient_id', 'ICULOS'])

    # Calculate residuals 
    residuals = df_actual[variable] - df_forecast[f'{variable}_forecast']

    # Add residual to df_residuals
    df_residuals[variable] = residuals
    df_residuals['patient_id'] = df_actual['patient_id']
    df_residuals['ICULOS'] = df_actual['ICULOS']

# Convert df_residuals to ListDataset
data_residuals = []
for patient_id, group in df_residuals.groupby('patient_id'):
    target = group[variables].values  # Use the residuals as target
    start = pd.Timestamp("1970-01-01 00:00") + pd.Timedelta(hours=group['ICULOS'].iloc[0])
    entry = {
        FieldName.TARGET: target,
        FieldName.START: start,  # Start timestamp derived from the first ICULOS hour
        FieldName.FEAT_STATIC_CAT: static_features[static_features['patient_id'] == patient_id][['Age', 'Gender']].values[0]
    }
    data_residuals.append(entry)

dataset_residuals = ListDataset(data_residuals, freq='1H', one_dim_target=False)
```
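
One thing that may be worth checking here, as an assumption rather than a confirmed cause of the error: GluonTS lays out multivariate targets as `(target_dim, time_length)`, while `group[variables].values` yields `(time_length, target_dim)`. Below is a minimal sketch of a layout check. `describe_target_layout` is a hypothetical helper, not a GluonTS function, and the 48-row example shape is made up for illustration.

```python
def describe_target_layout(shape, target_dim):
    """Classify a 2-D target shape against the expected (target_dim, time_length) layout."""
    if len(shape) != 2:
        return "not 2-D"
    rows, cols = shape
    if rows == target_dim:
        return "ok"
    if cols == target_dim:
        return "transposed"
    return "mismatch"


# Hypothetical patient with 48 hourly rows and 7 variables,
# shaped the way group[variables].values would be
print(describe_target_layout((48, 7), target_dim=7))  # prints "transposed"

# The same data after transposing, e.g. group[variables].values.T
print(describe_target_layout((7, 48), target_dim=7))  # prints "ok"
```

If the check reports `transposed` for `entry[FieldName.TARGET].shape`, passing `group[variables].values.T` as the target would give the expected layout. Whether that resolves the AssertionError is not established in this thread.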

ArianKhorasani added the "bug" label (Something isn't working) on Nov 28, 2023
