# Quick Start: Running the Feed Forward model on the gift-eval benchmark

This notebook shows how to run the Feed Forward model on the gift-eval benchmark.

Make sure you have downloaded the gift-eval benchmark and set the `GIFT_EVAL` environment variable to its location before running this notebook.
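Before going further, it can help to confirm that the variable is actually visible to the Python process. The following is a minimal sketch and not part of the gift-eval package: the helper name `check_benchmark_path` is hypothetical, and it assumes the variable is spelled `GIFT_EVAL` (with an underscore, since most shells do not accept hyphens in variable names) and points to the directory holding the downloaded benchmark.

```python
import os
from pathlib import Path


def check_benchmark_path(var_name="GIFT_EVAL", environ=None):
    """Return the benchmark directory named by `var_name`, or raise a helpful error.

    `check_benchmark_path` is a hypothetical helper for illustration only.
    `environ` defaults to the real process environment.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(var_name)
    if not value:
        raise RuntimeError(f"Environment variable {var_name} is not set")
    path = Path(value)
    if not path.is_dir():
        raise RuntimeError(f"{var_name}={value!r} is not an existing directory")
    return path
```

Calling `check_benchmark_path()` in the first cell fails early with a readable message rather than with a file-not-found error later on, when the data is loaded.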

We will use the `Dataset` class to load the data and run the model. If you have not already done so, please check out the [dataset.ipynb](./dataset.ipynb) notebook to learn more about the `Dataset` class. For brevity, we run the model on only two datasets, but feel free to run it on any dataset by changing the `short_datasets` and `med_long_datasets` variables below.
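To make the role of these two variables concrete, here is a small sketch of how they determine which (dataset, term) pairs get evaluated: every listed dataset runs on the short term, and only those listed in `med_long_datasets` also run on the medium and long terms. The function name `iter_runs` is hypothetical and introduced only for illustration.

```python
def iter_runs(short_datasets, med_long_datasets):
    """Yield (dataset_name, term) pairs in evaluation order.

    `iter_runs` is a hypothetical helper; both arguments are
    whitespace-separated strings of dataset names.
    """
    med_long = set(med_long_datasets.split())
    for ds_name in short_datasets.split() + med_long_datasets.split():
        for term in ("short", "medium", "long"):
            # Medium and long terms only apply to the med/long datasets
            if term != "short" and ds_name not in med_long:
                continue
            yield ds_name, term


runs = list(iter_runs("m4_weekly", "bizitobs_l2c/H"))
for ds_name, term in runs:
    print(ds_name, term)
```

With the two datasets selected in this notebook, this yields four runs: one for `m4_weekly` and three for `bizitobs_l2c/H`.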

We will use the `SimpleFeedForwardEstimator` class from gluonts to run the Feed Forward model. This notebook demonstrates how to use the `gluonts` estimator interface to train and evaluate a deep learning model.

In [1]:
import json
# short_datasets = "m4_yearly m4_quarterly m4_monthly m4_weekly m4_daily m4_hourly electricity/15T electricity/H electricity/D electricity/W solar/10T solar/H solar/D solar/W hospital covid_deaths us_births/D us_births/M us_births/W saugeenday/D saugeenday/M saugeenday/W temperature_rain_with_missing kdd_cup_2018_with_missing/H kdd_cup_2018_with_missing/D car_parts_with_missing restaurant hierarchical_sales/D hierarchical_sales/W LOOP_SEATTLE/5T LOOP_SEATTLE/H LOOP_SEATTLE/D SZ_TAXI/15T SZ_TAXI/H M_DENSE/H M_DENSE/D ett1/15T ett1/H ett1/D ett1/W ett2/15T ett2/H ett2/D ett2/W jena_weather/10T jena_weather/H jena_weather/D bitbrains_fast_storage/5T bitbrains_fast_storage/H bitbrains_rnd/5T bitbrains_rnd/H bizitobs_application bizitobs_service bizitobs_l2c/5T bizitobs_l2c/H"
short_datasets = "m4_weekly"

# med_long_datasets = "electricity/15T electricity/H solar/10T solar/H kdd_cup_2018_with_missing/H LOOP_SEATTLE/5T LOOP_SEATTLE/H SZ_TAXI/15T M_DENSE/H ett1/15T ett1/H ett2/15T ett2/H jena_weather/10T jena_weather/H bitbrains_fast_storage/5T bitbrains_rnd/5T bizitobs_application bizitobs_service bizitobs_l2c/5T bizitobs_l2c/H"
med_long_datasets = "bizitobs_l2c/H"

all_datasets = short_datasets.split() + med_long_datasets.split()

# Load per-dataset properties (frequency, domain, num_variates)
with open('dataset_properties.json') as f:
    dataset_properties_map = json.load(f)


In [2]:
# import pandas as pd
# import re
# import json

# # Read the dataset properties from dataset_properties.csv
# dataset_properties = pd.read_csv('dataset_properties.csv')
# # Reformat the first element of each row after the header following these rules:
# # 1. Make all characters lowercase
# dataset_properties['dataset'] = dataset_properties['dataset'].apply(lambda x: x.lower())
# # 2. Replace all spaces with underscores
# dataset_properties['dataset'] = dataset_properties['dataset'].apply(lambda x: x.replace(' ', '_'))
# # 3. Replace all dashes with underscores
# dataset_properties['dataset'] = dataset_properties['dataset'].apply(lambda x: x.replace('-', '_'))
# # 4. Replace consecutive underscores with a single underscore. There may be more than 2 consecutive underscores
# dataset_properties['dataset'] = dataset_properties['dataset'].apply(lambda x: re.sub('_+', '_', x))
# # 5. Remove all leading and trailing underscores
# dataset_properties['dataset'] = dataset_properties['dataset'].apply(lambda x: x.strip('_'))

# # dataset_properties['dataset'] = dataset_properties['dataset'].apply(lambda x: x.lower().replace(' ', '_'))
# # Convert it to a dictionary keyed by dataset name. Each value is an inner dictionary that maps column names to that dataset's values.
# dataset_properties_dict = dataset_properties.set_index('dataset').T.to_dict('dict')
# dataset_properties_dict.keys()
# dataset_properties_dict["m4_weekly"]

# # Write the dataset_properties_dict to a json file
# with open('dataset_properties.json', 'w') as f:
#     json.dump(dataset_properties_dict, f)


In [3]:
from gluonts.ev.metrics import (
    MSE,
    MAE,
    MASE,
    MAPE,
    SMAPE,
    MSIS,
    RMSE,
    NRMSE,
    ND,
    MeanWeightedSumQuantileLoss,
)

# Instantiate the metrics
metrics = [
    MSE(forecast_type="mean"),
    MSE(forecast_type=0.5),
    MAE(),
    MASE(),
    MAPE(),
    SMAPE(),
    MSIS(),
    RMSE(),
    NRMSE(),
    ND(),
    MeanWeightedSumQuantileLoss(quantile_levels=[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]),
]

## Training and Evaluation

We will use the `evaluate_model` helper function from gluonts to evaluate the model on the test data. It returns the results as a pandas `DataFrame` with one column per metric. Following the naming conventions explained in the [README](../README.md), we store the results in a CSV file called `all_results.csv` under the `results/feedforward` folder.

The first column in the CSV file is the dataset config name, which combines the dataset name, frequency, and term:

```python
f"{dataset_name}/{freq}/{term}"
```
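Some dataset names already carry their frequency (for example `bizitobs_l2c/H`), while others (for example `m4_weekly`) need it looked up. The sketch below illustrates one way to build the config name in both cases. The helper name `make_ds_config` and the small `frequency_by_dataset` lookup table are assumptions made for illustration; in this notebook the frequency comes from `dataset_properties.json`.

```python
def make_ds_config(ds_name, term, frequency_by_dataset):
    """Build the config name "<dataset_name>/<freq>/<term>".

    `make_ds_config` is a hypothetical helper. If `ds_name` already
    contains the frequency (e.g. "bizitobs_l2c/H"), it is used as-is;
    otherwise the frequency is taken from `frequency_by_dataset`.
    """
    if "/" in ds_name:
        return f"{ds_name}/{term}"
    freq = frequency_by_dataset[ds_name]
    return f"{ds_name}/{freq}/{term}"


# Assumed lookup table, standing in for dataset_properties.json
frequency_by_dataset = {"m4_weekly": "W"}

print(make_ds_config("m4_weekly", "short", frequency_by_dataset))        # m4_weekly/W/short
print(make_ds_config("bizitobs_l2c/H", "medium", frequency_by_dataset))  # bizitobs_l2c/H/medium
```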



In [4]:
import csv
import os

from gluonts.model import evaluate_model
from gluonts.time_feature import get_seasonality
from gluonts.torch.model.simple_feedforward import SimpleFeedForwardEstimator

from gift_eval.data import Dataset

output_dir = "../results/feedforward"
# Ensure the output directory exists
os.makedirs(output_dir, exist_ok=True)

# Define the path for the CSV file
csv_file_path = os.path.join(output_dir, 'all_results.csv')

with open(csv_file_path, 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    
    # Write the header
    writer.writerow(['dataset', 'model', 'eval_metrics/MSE[mean]', 'eval_metrics/MSE[0.5]', 'eval_metrics/MAE[0.5]', 'eval_metrics/MASE[0.5]', 'eval_metrics/MAPE[0.5]', 'eval_metrics/sMAPE[0.5]', 'eval_metrics/MSIS', 'eval_metrics/RMSE[mean]', 'eval_metrics/NRMSE[mean]', 'eval_metrics/ND[0.5]', 'eval_metrics/mean_weighted_sum_quantile_loss', 'domain', 'num_variates'])
    
# Iterate over all selected datasets
for ds_name in all_datasets:
    ds_key = ds_name.split("/")[0]
    print(f"Processing dataset: {ds_name}")
    terms = ["short", "medium", "long"]
    for term in terms:
        if (term == "medium" or term == "long") and ds_name not in med_long_datasets.split():
            continue

        # Initialize the dataset; multivariate datasets are converted to univariate series
        to_univariate = Dataset(name=ds_name, term=term, to_univariate=False).target_dim != 1
        dataset = Dataset(name=ds_name, term=term, to_univariate=to_univariate)
        season_length = get_seasonality(dataset.freq)

        # Build the config name; look up the frequency when ds_name does not already include it
        ds_config = f'{ds_name}/{term}' if '/' in ds_name else f'{ds_name}/{dataset_properties_map[ds_key]["frequency"]}/{term}'

        estimator = SimpleFeedForwardEstimator(
            prediction_length=dataset.prediction_length,
            context_length=dataset.prediction_length,
            trainer_kwargs=dict(
                max_epochs=1,
            )
        )
        predictor = estimator.train(dataset.validation_dataset)

        # Evaluate the trained predictor on the test data
        res = evaluate_model(
            predictor,
            test_data=dataset.test_data,
            metrics=metrics,
            batch_size=512,
            axis=None,
            mask_invalid_label=True,
            allow_nan_forecast=False,
            seasonality=season_length,
        )

        # Append the results to the CSV file
        with open(csv_file_path, 'a', newline='') as csvfile:
            writer = csv.writer(csvfile)
            writer.writerow([
                ds_config, 'feedforward',
                res['MSE[mean]'][0], res['MSE[0.5]'][0], res['MAE[0.5]'][0],
                res['MASE[0.5]'][0], res['MAPE[0.5]'][0], res['sMAPE[0.5]'][0],
                res['MSIS'][0], res['RMSE[mean]'][0], res['NRMSE[mean]'][0],
                res['ND[0.5]'][0], res['mean_weighted_sum_quantile_loss'][0],
                dataset_properties_map[ds_key]["domain"],
                dataset_properties_map[ds_key]["num_variates"],
            ])

        print(f"Results for {ds_name} have been written to {csv_file_path}")


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
/export/exp-series/miniconda3/envs/bench_oss/lib/python3.10/site-packages/lightning/pytorch/trainer/configuration_validator.py:70: You defined a `validation_step` but have no `val_dataloader`. Skipping val loop.
You are using a CUDA device ('NVIDIA A100-SXM4-40GB') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision


Processing dataset: m4_weekly


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name  | Type                   | Params | Mode 
---------------------------------------------------------
0 | model | SimpleFeedForwardModel | 5.8 K  | train
---------------------------------------------------------
5.8 K     Trainable params
0         Non-trainable params
5.8 K     Total params
0.023     Total estimated model params size (MB)
11        Modules in train mode
0         Modules in eval mode


Epoch 0: |          | 50/? [00:01<00:00, 30.79it/s, v_num=7, train_loss=9.370]

Epoch 0, global step 50: 'train_loss' reached 9.36584 (best 9.36584), saving model to '/export/exp-series/Projects/oss/gift-eval/notebooks/lightning_logs/version_7/checkpoints/epoch=0-step=50.ckpt' as top 1
`Trainer.fit` stopped: `max_epochs=1` reached.


Epoch 0: |          | 50/? [00:01<00:00, 30.20it/s, v_num=7, train_loss=9.370]


359it [00:16, 21.45it/s]
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name  | Type                   | Params | Mode 
---------------------------------------------------------
0 | model | SimpleFeedForwardModel | 21.2 K | train
---------------------------------------------------------
21.2 K    Trainable params
0         Non-trainable params
21.2 K    Total params
0.085     Total estimated model params size (MB)
11        Modules in train mode
0         Modules in eval mode


Results for m4_weekly have been written to ../results/feedforward/all_results.csv
Processing dataset: bizitobs_l2c/H
Epoch 0: |          | 50/? [00:00<00:00, 60.90it/s, v_num=8, train_loss=5.180]

Epoch 0, global step 50: 'train_loss' reached 5.18463 (best 5.18463), saving model to '/export/exp-series/Projects/oss/gift-eval/notebooks/lightning_logs/version_8/checkpoints/epoch=0-step=50.ckpt' as top 1
`Trainer.fit` stopped: `max_epochs=1` reached.


Epoch 0: |          | 50/? [00:00<00:00, 58.16it/s, v_num=8, train_loss=5.180]


42it [00:02, 19.63it/s]
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name  | Type                   | Params | Mode 
---------------------------------------------------------
0 | model | SimpleFeedForwardModel | 211 K  | train
---------------------------------------------------------
211 K     Trainable params
0         Non-trainable params
211 K     Total params
0.845     Total estimated model params size (MB)
11        Modules in train mode
0         Modules in eval mode


Results for bizitobs_l2c/H have been written to ../results/feedforward/all_results.csv
Epoch 0: |          | 50/? [00:00<00:00, 56.54it/s, v_num=9, train_loss=4.310]

Epoch 0, global step 50: 'train_loss' reached 4.30921 (best 4.30921), saving model to '/export/exp-series/Projects/oss/gift-eval/notebooks/lightning_logs/version_9/checkpoints/epoch=0-step=50.ckpt' as top 1
`Trainer.fit` stopped: `max_epochs=1` reached.


Epoch 0: |          | 50/? [00:00<00:00, 51.47it/s, v_num=9, train_loss=4.310]


7it [00:00, 11.78it/s]
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name  | Type                   | Params | Mode 
---------------------------------------------------------
0 | model | SimpleFeedForwardModel | 316 K  | train
---------------------------------------------------------
316 K     Trainable params
0         Non-trainable params
316 K     Total params
1.268     Total estimated model params size (MB)
11        Modules in train mode
0         Modules in eval mode


Results for bizitobs_l2c/H have been written to ../results/feedforward/all_results.csv
Epoch 0: |          | 50/? [00:00<00:00, 53.89it/s, v_num=10, train_loss=4.580]

Epoch 0, global step 50: 'train_loss' reached 4.58319 (best 4.58319), saving model to '/export/exp-series/Projects/oss/gift-eval/notebooks/lightning_logs/version_10/checkpoints/epoch=0-step=50.ckpt' as top 1
`Trainer.fit` stopped: `max_epochs=1` reached.


Epoch 0: |          | 50/? [00:01<00:00, 48.86it/s, v_num=10, train_loss=4.580]


7it [00:00,  9.80it/s]

Results for bizitobs_l2c/H have been written to ../results/feedforward/all_results.csv





## Results

Running the cell above generates a CSV file called `all_results.csv` under the `results/feedforward` folder, containing the results of the Feed Forward model on the selected gift-eval datasets. The file will look like this:

```csv
dataset,model,eval_metrics/MSE[mean],eval_metrics/MSE[0.5],eval_metrics/MAE[0.5],eval_metrics/MASE[0.5],eval_metrics/MAPE[0.5],eval_metrics/sMAPE[0.5],eval_metrics/MSIS,eval_metrics/RMSE[mean],eval_metrics/NRMSE[mean],eval_metrics/ND[0.5],eval_metrics/mean_weighted_sum_quantile_loss,domain,num_variates
m4_weekly/W/short,feedforward,1448910.3029783587,1448910.3029783587,764.3635633169059,12.345394600259056,0.1595788812238858,0.15331997247127438,314.35590445280945,1203.706900777078,0.21929674306756633,0.13925519563500452,0.15997800028234835,Econ/Fin,1
bizitobs_l2c/H/short,feedforward,241.18829055059524,241.18829055059524,12.045664953807044,1.1593927223085088,1.0907100979372752,1.0100947788783483,14.501148138388263,15.5302379424977,0.8371195987273939,0.6492921904913098,0.5449973232364873,Web/CloudOps,7
bizitobs_l2c/H/medium,feedforward,81.86692708333334,81.86692708333334,6.16808093843006,0.6313417553056844,0.7273494340050158,0.8551737467447916,4.898049781903129,9.048034432037344,0.5478722280349564,0.37348666959888,0.3012130717159007,Web/CloudOps,7
bizitobs_l2c/H/long,feedforward,87.75125248015873,87.75125248015873,6.178149026537699,0.6622176358901023,0.7759000447908542,0.7934493776351687,5.6998924604382974,9.367563849804213,0.5721978364002469,0.3773791737770806,0.31008587824774975,Web/CloudOps,7
```
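To inspect these results programmatically, the file can be read back with Python's standard `csv` module. The sketch below is one possible way to do this, not part of the benchmark tooling: the helper name `load_results` is hypothetical, and it assumes the column layout shown above, where `dataset`, `model`, and `domain` are text, `num_variates` is an integer, and all remaining columns are numeric metrics.

```python
import csv

TEXT_COLUMNS = {"dataset", "model", "domain"}


def load_results(path):
    """Read a results CSV into a list of dicts with typed values.

    `load_results` is a hypothetical helper. Text columns are kept as
    strings, `num_variates` becomes an int, and every other column is
    parsed as a float metric.
    """
    rows = []
    with open(path, newline="") as f:
        for raw in csv.DictReader(f):
            row = {}
            for key, value in raw.items():
                if key in TEXT_COLUMNS:
                    row[key] = value
                elif key == "num_variates":
                    row[key] = int(value)
                else:
                    row[key] = float(value)
            rows.append(row)
    return rows


# Example usage (path as written by this notebook):
# for row in load_results("../results/feedforward/all_results.csv"):
#     print(row["dataset"], row["eval_metrics/MASE[0.5]"])
```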
 