<a href="https://colab.research.google.com/github/florisrc/ESRNN-GPU/blob/master/es_rnn_colab_nb_example.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# ES-RNN Colab NB Example

A GPU-enabled version of the hybrid ES-RNN model by Slawek Smyl, which won the M4 time-series forecasting competition by a large margin, implemented here in a Google Colab environment. Our implementation and results are discussed in detail in this [paper](https://arxiv.org/abs/1907.03329).



## Get data and code


In [0]:
# get data
%cd /content
!mkdir /content/m4_data 
%cd /content/m4_data
!wget https://www.m4.unic.ac.cy/wp-content/uploads/2017/12/M4DataSet.zip
!wget https://www.m4.unic.ac.cy/wp-content/uploads/2018/07/M-test-set.zip
!wget https://github.com/M4Competition/M4-methods/raw/master/Dataset/M4-info.csv
!mkdir ./Train && cd ./Train && unzip ../M4DataSet.zip && cd ..
!mkdir ./Test && cd ./Test && unzip ../M-test-set.zip && cd ..

# clone git repo
%cd /content
!git clone https://github.com/damitkwr/ESRNN-GPU.git

# copy data to repo
%cd /content/ESRNN-GPU/
!mkdir ./data
%cd data/
!mkdir ./Train && cp /content/m4_data/Train/* ./Train/
!mkdir ./Test && cp /content/m4_data/Test/* ./Test/
!cp /content/m4_data/M4-info.csv ./info.csv
%cd /content

/content
/content/m4_data
--2020-04-20 10:14:51--  https://www.m4.unic.ac.cy/wp-content/uploads/2017/12/M4DataSet.zip
Resolving www.m4.unic.ac.cy (www.m4.unic.ac.cy)... 35.177.142.35, 35.176.90.68
Connecting to www.m4.unic.ac.cy (www.m4.unic.ac.cy)|35.177.142.35|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 66613994 (64M) [application/zip]
Saving to: ‘M4DataSet.zip’


2020-04-20 10:14:55 (17.9 MB/s) - ‘M4DataSet.zip’ saved [66613994/66613994]

--2020-04-20 10:14:56--  https://www.m4.unic.ac.cy/wp-content/uploads/2018/07/M-test-set.zip
Resolving www.m4.unic.ac.cy (www.m4.unic.ac.cy)... 35.177.142.35, 35.176.90.68
Connecting to www.m4.unic.ac.cy (www.m4.unic.ac.cy)|35.177.142.35|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3723045 (3.5M) [application/zip]
Saving to: ‘M-test-set.zip’


2020-04-20 10:14:57 (4.20 MB/s) - ‘M-test-set.zip’ saved [3723045/3723045]

--2020-04-20 10:14:58--  https://github.com/M4Competition/M4-methods/raw/
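
Before moving on, it may be worth confirming that the files the later cells read are actually in place. The sketch below is a hypothetical helper (`missing_data_files` is not part of the ESRNN-GPU repo). It assumes the layout created by the cell above: `info.csv` plus `Train/<Interval>-train.csv` and `Test/<Interval>-test.csv` under the data directory.

```python
from pathlib import Path


# Hypothetical helper, not part of the ESRNN-GPU repo.
def missing_data_files(data_dir, intervals=("Monthly",)):
    """Return the expected data files that are absent from data_dir."""
    root = Path(data_dir)
    expected = [root / "info.csv"]
    for interval in intervals:
        # File names follow the pattern used by the training cell.
        expected.append(root / "Train" / f"{interval}-train.csv")
        expected.append(root / "Test" / f"{interval}-test.csv")
    return [str(p) for p in expected if not p.is_file()]
```

Calling `missing_data_files('/content/ESRNN-GPU/data')` should return an empty list if the download and copy steps succeeded. A non-empty list names the files to re-fetch.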

## Create colab environment with correct library versions

In [0]:
# uninstall torch  
!pip uninstall torch
!pip uninstall torch # run twice (recommendation pytorch forums)

# and re-install as 0.4.1
from os import path
from wheel.pep425tags import get_abbr_impl, get_impl_ver, get_abi_tag
platform = '{}{}-{}'.format(get_abbr_impl(), get_impl_ver(), get_abi_tag())

accelerator = 'cu80' if path.exists('/opt/bin/nvidia-smi') else 'cpu'

!pip install -q http://download.pytorch.org/whl/{accelerator}/torch-0.4.1-{platform}-linux_x86_64.whl torchvision

# tensorflow version 1 
%tensorflow_version 1.x

import torch
import tensorflow as tf 
print(f'Torch version: {torch.__version__}')
print(f'Tensorflow version: {tf.__version__}')
print(f'Torch.cuda.is_available: {torch.cuda.is_available()}')

Uninstalling torch-0.4.1:
  Would remove:
    /usr/local/lib/python3.6/dist-packages/torch-0.4.1.dist-info/*
    /usr/local/lib/python3.6/dist-packages/torch/*
Proceed (y/n)? y
  Successfully uninstalled torch-0.4.1
     |████████████████████████████████| 483.0MB 1.2MB/s 
ERROR: torchvision 0.5.0 has requirement torch==1.4.0, but you'll have torch 0.4.1 which is incompatible.
ERROR: fastai 1.0.60 has requirement torch>=1.0.0, but you'll have torch 0.4.1 which is incompatible.
TensorFlow 1.x selected.
Torch version: 0.4.1
Tensorflow version: 1.15.2
Torch.cuda.is_available: True
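
Rather than reading the printed versions by eye, the check can be made to fail fast. The following is a minimal sketch, assuming the pins used in this notebook (torch 0.4.1 and TensorFlow 1.x). `version_matches` is a hypothetical helper, not a function from torch or TensorFlow.

```python
# Hypothetical helper, not provided by torch or tensorflow.
def version_matches(installed, required):
    """True if `installed` begins with every dot-separated part of `required`.

    A local build suffix such as '+cu80' is ignored, and parts are compared
    whole, so '0.4.10' does not match a requirement of '0.4.1'.
    """
    installed_parts = installed.split("+")[0].split(".")
    required_parts = required.split(".")
    return installed_parts[:len(required_parts)] == required_parts
```

In the cell above this could be used as `assert version_matches(torch.__version__, '0.4.1')` and `assert version_matches(tf.__version__, '1')`, so that a mismatched environment stops the notebook before training starts.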


## Check model configurations (optional)


In [0]:
# move to project working directory
%cd /content/ESRNN-GPU/

# Check configuration
import pprint
from es_rnn.config import get_config

config = get_config('Monthly')    # one of 'Quarterly', 'Monthly', 'Daily' or 'Yearly' (case-sensitive)
pprint.pprint(config)

/content/ESRNN-GPU
{'add_nl_layer': True,
 'batch_size': 1024,
 'c_state_penalty': 0,
 'chop_val': 72,
 'device': 'cuda',
 'dilations': ((1, 3), (6, 12)),
 'gradient_clipping': 20,
 'input_size': 12,
 'input_size_i': 12,
 'learning_rate': 0.001,
 'learning_rates': (10, 0.0001),
 'level_variability_penalty': 50,
 'lr_anneal_rate': 0.5,
 'lr_anneal_step': 5,
 'lr_ratio': 3.1622776601683795,
 'lr_tolerance_multip': 1.005,
 'min_epochs_before_changing_lrate': 2,
 'min_learning_rate': 0.0001,
 'num_of_categories': 6,
 'num_of_train_epochs': 15,
 'output_size': 18,
 'output_size_i': 18,
 'percentile': 50,
 'print_output_stats': 3,
 'print_train_batch_every': 5,
 'prod': True,
 'rnn_cell_type': 'LSTM',
 'seasonality': 12,
 'state_hsize': 50,
 'tau': 0.5,
 'training_percentile': 45,
 'training_tau': 0.45,
 'variable': 'Monthly'}
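
Since `get_config` returns a plain dictionary, one alternative to editing `config.py` is to adjust a copy of the dictionary in memory. The sketch below is a hypothetical helper (`override_config` is not part of the repo). It assumes that `tau`, `training_tau`, `input_size_i` and `output_size_i` are derived from the other keys, as the printed values suggest, and recomputes them after applying the overrides. Whether the trainer reads the derived keys or the raw ones is not checked here.

```python
# Hypothetical helper, not part of the ESRNN-GPU repo.
def override_config(config, **overrides):
    """Return a copy of `config` with `overrides` applied and derived keys refreshed."""
    unknown = set(overrides) - set(config)
    if unknown:
        # Guard against typos silently adding new, unused keys.
        raise KeyError(f"unknown config keys: {sorted(unknown)}")
    new = dict(config)
    new.update(overrides)
    # Assumed derivations, mirroring the values printed above.
    new["input_size_i"] = new["input_size"]
    new["output_size_i"] = new["output_size"]
    new["tau"] = new["percentile"] / 100
    new["training_tau"] = new["training_percentile"] / 100
    return new
```

For example, `override_config(config, num_of_train_epochs=5)` would give a shorter run without touching the file on disk, and the original `config` is left unchanged.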


## Edit model configurations (optional) 

In [0]:
# print config.py and copy code to clipboard 
!cat /content/ESRNN-GPU/es_rnn/config.py


In [0]:
%%writefile /content/ESRNN-GPU/es_rnn/config.py

from math import sqrt

import torch


def get_config(interval):
    config = {
        'prod': True,
        'device': ("cuda" if torch.cuda.is_available() else "cpu"),
        'percentile': 50,
        'training_percentile': 45,
        'add_nl_layer': True,
        'rnn_cell_type': 'LSTM',
        'learning_rate': 1e-3,
        'learning_rates': ((10, 1e-4)),
        'num_of_train_epochs': 5,
        'num_of_categories': 6,  # in data provided
        'batch_size': 1024,
        'gradient_clipping': 20,
        'c_state_penalty': 0,
        'min_learning_rate': 0.0001,
        'lr_ratio': sqrt(10),
        'lr_tolerance_multip': 1.005,
        'min_epochs_before_changing_lrate': 2,
        'print_train_batch_every': 5,
        'print_output_stats': 3,
        'lr_anneal_rate': 0.5,
        'lr_anneal_step': 5
    }

    if interval == 'Quarterly':
        config.update({
            'chop_val': 72,
            'variable': "Quarterly",
            'dilations': ((1, 2), (4, 8)),
            'state_hsize': 40,
            'seasonality': 4,
            'input_size': 4,
            'output_size': 8,
            'level_variability_penalty': 80
        })
    elif interval == 'Monthly':
        config.update({
            #     RUNTIME PARAMETERS
            'chop_val': 72,
            'variable': "Monthly",
            'dilations': ((1, 3), (6, 12)),
            'state_hsize': 50,
            'seasonality': 12,
            'input_size': 12,
            'output_size': 18,
            'level_variability_penalty': 50
        })
    elif interval == 'Daily':
        config.update({
            #     RUNTIME PARAMETERS
            'chop_val': 200,
            'variable': "Daily",
            'dilations': ((1, 7), (14, 28)),
            'state_hsize': 50,
            'seasonality': 7,
            'input_size': 7,
            'output_size': 14,
            'level_variability_penalty': 50
        })
    elif interval == 'Yearly':
        config.update({
            #     RUNTIME PARAMETERS
            'chop_val': 25,
            'variable': "Yearly",
            'dilations': ((1, 2), (2, 6)),
            'state_hsize': 30,
            'seasonality': 1,
            'input_size': 4,
            'output_size': 6,
            'level_variability_penalty': 0
        })
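    else:
        # Fail early instead of hitting a KeyError on 'input_size' below.
        raise ValueError("No config for interval %r; expected 'Quarterly', 'Monthly', 'Daily' or 'Yearly'." % interval)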

    config['input_size_i'] = config['input_size']
    config['output_size_i'] = config['output_size']
    config['tau'] = config['percentile'] / 100
    config['training_tau'] = config['training_percentile'] / 100

    if not config['prod']:
        config['batch_size'] = 10
        config['num_of_train_epochs'] = 15

    return config

Overwriting /content/ESRNN-GPU/es_rnn/config.py
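
One caveat when overwriting a module from inside a running notebook: Python caches imported modules, so a later `from es_rnn.config import get_config` in the same session may keep using the version loaded earlier. The training log further down reports `Epoch [1/15]` even though the rewritten file sets `num_of_train_epochs` to 5, which is consistent with that behavior, although this has not been verified against the repo. The sketch below illustrates the effect with a throwaway module (`demo_config` and `demo_reload` are made up for this example).

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo_reload():
    """Show that re-importing a changed module is stale until it is reloaded."""
    with tempfile.TemporaryDirectory() as tmp:
        module_path = Path(tmp) / "demo_config.py"  # throwaway module
        module_path.write_text("EPOCHS = 15\n")
        sys.path.insert(0, tmp)
        try:
            import demo_config
            before = demo_config.EPOCHS

            # Overwrite the file, as %%writefile does for config.py.
            module_path.write_text("EPOCHS = 5\n")

            import demo_config  # no effect: the module is already cached
            stale = demo_config.EPOCHS

            importlib.reload(demo_config)  # re-executes the file on disk
            fresh = demo_config.EPOCHS
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("demo_config", None)
    return before, stale, fresh


print(demo_reload())  # (15, 15, 5)
```

In this notebook the equivalent would be to call `importlib.reload` on the `es_rnn.config` module before calling `get_config` again, or to restart the runtime after writing the file.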


## Train the model

In [0]:
# move to project working directory
%cd /content/ESRNN-GPU/

import pandas as pd
from torch.utils.data import DataLoader
from es_rnn.data_loading import create_datasets, SeriesDataset
from es_rnn.config import get_config
from es_rnn.trainer import ESRNNTrainer
from es_rnn.model import ESRNN
import time

print('loading config')
config = get_config('Monthly')

print('loading data')
info = pd.read_csv('/content/ESRNN-GPU/data/info.csv')

train_path = '/content/ESRNN-GPU/data/Train/%s-train.csv' % (config['variable'])
test_path = '/content/ESRNN-GPU/data/Test/%s-test.csv' % (config['variable'])

train, val, test = create_datasets(train_path, test_path, config['output_size'])

dataset = SeriesDataset(train, val, test, info, config['variable'], config['chop_val'], config['device'])
dataloader = DataLoader(dataset, batch_size=config['batch_size'], shuffle=True)

run_id = str(int(time.time()))
model = ESRNN(num_series=len(dataset), config=config)
tr = ESRNNTrainer(model, dataloader, run_id, config, ohe_headers=dataset.dataInfoCatHeaders)
tr.train_epochs() 

/content/ESRNN-GPU
loading config
loading data

Train_batch: 1

Train_batch: 2
Train_batch: 3
Train_batch: 4
Train_batch: 5
Train_batch: 6
Train_batch: 7
Train_batch: 8
Train_batch: 9
Train_batch: 10
Train_batch: 11
Train_batch: 12
Train_batch: 13
Train_batch: 14
Train_batch: 15
Train_batch: 16
Train_batch: 17
Train_batch: 18
Train_batch: 19
Train_batch: 20
Train_batch: 21
Train_batch: 22
Train_batch: 23
Train_batch: 24
Train_batch: 25
Train_batch: 26
Train_batch: 27
Train_batch: 28
Train_batch: 29
Train_batch: 30
Train_batch: 31
Train_batch: 32
Train_batch: 33
Train_batch: 34
Train_batch: 35
[TRAIN]  Epoch [1/15]   Loss: 23.5622

Loss decreased, saving model!
{'Demographic': 9.967270851135254, 'Finance': 14.129257202148438, 'Industry': 14.062990188598633, 'Macro': 14.504400253295898, 'Micro': 12.656795501708984, 'Other': 13.857166290283203, 'Overall': 13.255921363830566, 'hold_out_loss': 9.428352355957031}
Train_batch: 1
Train_batch: 2
Train_batch: 3
Train_batch: 4
Train_batch: 5
Trai