### Multivariate time series prediction using MLP with Hyperparameter optimization



In this notebook, we use **Optuna** to search for good values of the model's hyperparameters. Optuna is a Python package for automated hyperparameter optimization.
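To make the idea concrete, here is a minimal sketch of what a hyperparameter search does: propose candidate values, score each with an objective, and keep the best. It uses plain random search from the standard library rather than Optuna's sampler. The function `toy_objective` and the search range are illustrative assumptions, not part of this notebook's model.

```python
import math
import random

def sample_log_uniform(rng, low, high):
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))

def toy_objective(lr):
    """Hypothetical stand-in for a validation loss; smallest at lr = 1e-3."""
    return (math.log10(lr) + 3.0) ** 2

def random_search(n_trials, seed=0):
    """Try n_trials random learning rates and return the best (lr, value)."""
    rng = random.Random(seed)
    best_lr, best_value = None, float("inf")
    for _ in range(n_trials):
        lr = sample_log_uniform(rng, 1e-4, 1e-1)
        value = toy_objective(lr)
        if value < best_value:
            best_lr, best_value = lr, value
    return best_lr, best_value

best_lr, best_value = random_search(n_trials=50)
print(f"best lr: {best_lr:.5f}, objective: {best_value:.5f}")
```

Sampling on a log scale spreads the trials evenly across orders of magnitude, which is why it suits parameters such as the learning rate.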

In [1]:
import torch
import numpy as np
import optuna

import pandas as pd
from sklearn.preprocessing import MinMaxScaler



In [2]:
path = '../dataset/final_data.csv'

In [3]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f'Using device: {device}')

# For reproducibility, fix the random seeds so that the model parameters are initialized identically across experiments.
torch.manual_seed(42)

# If you are using CUDA, also set the seed for it
if torch.cuda.is_available():
    torch.cuda.manual_seed(42)
    torch.cuda.manual_seed_all(42)

# Set the seed for NumPy
np.random.seed(42)

Using device: cpu
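The effect of fixing a seed can be illustrated with Python's built-in `random` module, used here as a stand-in for the PyTorch and NumPy generators seeded above. Re-seeding with the same value reproduces the same sequence of draws. The helper `draw` is hypothetical.

```python
import random

def draw(seed, n=3):
    """Seed a fresh generator and return n pseudo-random numbers."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

first = draw(42)
second = draw(42)
other = draw(7)

print(first == second)  # compares two runs with the same seed
print(first == other)   # compares runs with different seeds
```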


Here we define **RiverData**, a custom Dataset class for loading our data. It extends the PyTorch Dataset class.  
- \_\_init__() stores the data and settings, and can optionally preprocess the data.
- \_\_len__() returns the number of samples in the dataset.
- \_\_getitem__() returns one (feature, label) tuple that can be used for model training.
  For our time series data, the feature is the window of past values fed to the model and the label is the future values to be predicted.
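The indexing behind \_\_len__() and \_\_getitem__() can be sketched on a plain Python list. This is an illustration only: the values and the small `seq_len` and `pred_len` are made up, and a single series stands in for the multi-column DataFrame.

```python
def window(series, idx, seq_len, pred_len):
    """Return (past values, future values) for the window starting at idx."""
    feature = series[idx : idx + seq_len]
    label = series[idx + seq_len : idx + seq_len + pred_len]
    return feature, label

series = [10, 11, 12, 13, 14, 15, 16, 17]
seq_len, pred_len = 3, 1

# Same length rule as RiverData.__len__
n_samples = len(series) - seq_len - pred_len

for idx in range(n_samples):
    feature, label = window(series, idx, seq_len, pred_len)
    print(idx, feature, label)
```

Each sample starts one step later than the previous one, so consecutive windows overlap in all but one value.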

In [4]:
class RiverData(torch.utils.data.Dataset):
    
    def __init__(self, df, target, datecol, seq_len, pred_len):
        self.df = df
        self.datecol = datecol
        self.target = target
        self.seq_len = seq_len
        self.pred_len = pred_len
        self.setIndex()
        

    def setIndex(self):
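        # Assign the result instead of using inplace=True, so the DataFrame slice passed in by the caller is not modified
        self.df = self.df.set_index(self.datecol)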
    

    def __len__(self):
        return len(self.df) - self.seq_len - self.pred_len


    def __getitem__(self, idx):
        if len(self.df) <= (idx + self.seq_len+self.pred_len):
            raise IndexError(f"Index {idx} is out of bounds for dataset of size {len(self.df)}")
        df_piece = self.df[idx:idx+self.seq_len].values
        feature = torch.tensor(df_piece, dtype=torch.float32)
        label_piece = self.df[self.target][idx + self.seq_len:  idx+self.seq_len+self.pred_len].values
        label = torch.tensor(label_piece, dtype=torch.float32)
        return (feature.T, label) 

### Normalize the data

In [5]:
df = pd.read_csv(path)
df = df[df['DATE'] > '2014']
raw_df = df.drop('DATE', axis=1, inplace=False)
scaler = MinMaxScaler()

# Apply the transformations
df_scaled = scaler.fit_transform(raw_df)

df_scaled = pd.DataFrame(df_scaled, columns=raw_df.columns)
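# Use .values so the dates are matched by position; df keeps its filtered index while df_scaled starts at 0
df_scaled['DATE'] = df['DATE'].values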
df = df_scaled
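As a rough illustration of what `MinMaxScaler` does with its default settings, each column is mapped to the range [0, 1] via (x - min) / (max - min). The sketch below applies that formula to one made-up column in plain Python. It is not a replacement for the scikit-learn implementation.

```python
def min_max_scale(values):
    """Scale a list of numbers to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]

gauge_height = [2.0, 4.0, 6.0, 10.0]  # hypothetical readings
scaled = min_max_scale(gauge_height)
print(scaled)
```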

Some advanced Python syntax is used here. \
`*common_args`: unpacks a Python list into positional arguments when calling a function \
`**common_args`: unpacks a Python dictionary into keyword arguments when calling a function
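A short, self-contained illustration of both forms follows. The function `describe` and its arguments are hypothetical and exist only for this example.

```python
def describe(target, datecol, seq_len=1, pred_len=1):
    """Return the arguments as a tuple so the unpacking is easy to inspect."""
    return (target, datecol, seq_len, pred_len)

positional = ['gauge_height', 'DATE', 13, 1]
keyword = {'seq_len': 13, 'pred_len': 1}

# *positional expands the list into positional arguments
a = describe(*positional)

# **keyword expands the dictionary into keyword arguments
b = describe('gauge_height', 'DATE', **keyword)

print(a == b)
```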

In [6]:

train_size = int(0.7 * len(df))
test_size = int(0.2 * len(df))
val_size = len(df) - train_size - test_size

seq_len = 13
pred_len = 1
num_features = 7

common_args = ['gauge_height', 'DATE', seq_len, pred_len]
train_dataset = RiverData(df[:train_size], *common_args)
val_dataset = RiverData(df[train_size: train_size+val_size], *common_args)
test_dataset = RiverData(df[train_size+val_size : len(df)], *common_args)
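The slicing above keeps the rows in time order: the first 70% for training, the next block for validation, and the final 20% for testing. A small sketch with a hypothetical row count shows the resulting boundaries. The helper `split_bounds` is not part of the notebook.

```python
def split_bounds(n_rows, train_frac=0.7, test_frac=0.2):
    """Return (start, stop) row ranges for train, validation and test."""
    train_size = int(train_frac * n_rows)
    test_size = int(test_frac * n_rows)
    val_size = n_rows - train_size - test_size
    train = (0, train_size)
    val = (train_size, train_size + val_size)
    test = (train_size + val_size, n_rows)
    return train, val, test

train, val, test = split_bounds(1000)  # 1000 is an illustrative row count
print(train, val, test)
```

Because the ranges are contiguous and ordered, the validation and test sets always lie in the future relative to the training data.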


In [7]:
# Important parameters

BATCH_SIZE = 512 # keep as big as can be handled by GPU and memory
SHUFFLE = False # we don't shuffle the time series data
DATA_LOAD_WORKERS = 1 # it depends on amount of data you need to load


In [8]:
from torch.utils.data import DataLoader

common_args = {'batch_size': BATCH_SIZE, 'shuffle': SHUFFLE}
train_loader = DataLoader(train_dataset, **common_args)
val_loader = DataLoader(val_dataset, **common_args)
test_loader = DataLoader(test_dataset, **common_args)

### Here we define our PyTorch model.

BasicMLPNetwork is the model class. It extends the Module class provided by PyTorch.
- We define the \_\_init__() function, which sets up the layers and defines the model parameters.
- We also define the forward() function, which defines how the forward pass is computed.

In [9]:
# Here we are adding dropout layers.

class BasicMLPNetwork(torch.nn.Module):
    
    def __init__(self, seq_len, pred_len, num_features, dropout):
        # call the constructor of the base class
        super().__init__()
        self.seq_len = seq_len
        self.pred_len = pred_len
        self.num_features = num_features
        hidden_size_time = 512
        hidden_size_feat = 128
        # define layers for combining across time series
        self.fc1 = torch.nn.Linear(self.seq_len, hidden_size_time)
        self.relu = torch.nn.ReLU()
        self.dropout1 = torch.nn.Dropout(p=dropout)
        self.fc2 = torch.nn.Linear(hidden_size_time, self.pred_len)
        self.dropout2 = torch.nn.Dropout(p=dropout)

        # define layers for combining across the features
        self.fc3 = torch.nn.Linear(self.num_features, hidden_size_feat)
        self.fc4 = torch.nn.Linear(hidden_size_feat, 1)

    def forward(self, x):

        # computation over time
        out = self.fc1(x)
        out = self.relu(out)
        out = self.dropout1(out)
        out = self.fc2(out)
        out = self.relu(out) # shape: (batch, num_features, pred_len)
        out = self.dropout2(out)
        # computation over features
        out = out.transpose(1,2) # shape: (batch, pred_len, num_features)
        out = self.fc3(out) # shape: (batch, pred_len, hidden_size_feat)
        out = self.relu(out)
        out = self.fc4(out) # shape: (batch, pred_len, 1)

        out = out.squeeze(-1) # shape: (batch, pred_len)
        
        return out

# Note that the gradients are stored inside the FC layer objects (on their parameters)
# After each optimizer step we need to reset these gradients so they do not accumulate
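To check the tensor shapes through forward(), the sketch below traces them with plain tuples instead of running PyTorch. The helper functions are hypothetical: they only mimic how a Linear layer acts on the last dimension and how transpose and squeeze rearrange dimensions. The sizes used are the ones set in this notebook's code (batch size 512, 7 features, `seq_len = 13`, `pred_len = 1`).

```python
def linear(shape, in_features, out_features):
    """Shape after a Linear layer, which maps only the last dimension."""
    assert shape[-1] == in_features, "last dimension must match in_features"
    return shape[:-1] + (out_features,)

def transpose(shape, d0, d1):
    """Shape after swapping dimensions d0 and d1."""
    dims = list(shape)
    dims[d0], dims[d1] = dims[d1], dims[d0]
    return tuple(dims)

def squeeze_last(shape):
    """Shape after squeeze(-1): drop the last dimension if its size is 1."""
    return shape[:-1] if shape[-1] == 1 else shape

batch, num_features, seq_len, pred_len = 512, 7, 13, 1

shape = (batch, num_features, seq_len)
shape = linear(shape, seq_len, 512)       # fc1 -> (512, 7, 512)
shape = linear(shape, 512, pred_len)      # fc2 -> (512, 7, 1)
shape = transpose(shape, 1, 2)            # -> (512, 1, 7)
shape = linear(shape, num_features, 128)  # fc3 -> (512, 1, 128)
shape = linear(shape, 128, 1)             # fc4 -> (512, 1, 1)
shape = squeeze_last(shape)               # -> (512, 1)
print(shape)
```

The final shape matches the label shape of one batch, which is what the loss function needs.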

In [10]:
loss = torch.nn.MSELoss()

In [11]:
for i, (f,l) in enumerate(train_loader):
    print('features shape: ', f.shape)
    print('labels shape: ', l.shape)
    break

features shape:  torch.Size([512, 7, 8])
labels shape:  torch.Size([512, 1])


In [12]:
# define metrics
import numpy as np
epsilon = np.finfo(float).eps

def Wape(y, y_pred):
    """Weighted Average Percentage Error metric in the interval [0; 100]"""
    y = np.array(y)
    y_pred = np.array(y_pred)
    numerator = np.sum(np.abs(np.subtract(y, y_pred)))
    denominator = np.add(np.sum(np.abs(y)), epsilon)
    wape = np.divide(numerator, denominator) * 100.0
    return wape

def nse(y, y_pred):
    y = np.array(y)
    y_pred = np.array(y_pred)
    return (1-(np.sum((y_pred-y)**2)/np.sum((y-np.mean(y))**2)))


def evaluate_model(model, data_loader):
    # The following line puts the model in evaluation mode, which disables dropout (and makes batch normalization,
    # if present, use its running statistics). Our model contains dropout layers, so this is needed for a consistent evaluation.

    model.eval()
    all_inputs = torch.empty((0, num_features, seq_len))
    all_labels = torch.empty(0, pred_len)
    for inputs, labels in data_loader:
        all_inputs = torch.vstack((all_inputs, inputs))
        all_labels = torch.vstack((all_labels, labels))
    
    with torch.no_grad():
        all_inputs = all_inputs.to(device)
        outputs = model(all_inputs).detach().cpu()
        avg_val_loss = loss(outputs, all_labels)
        nsee = nse(all_labels.numpy(), outputs.numpy())
        wapee = Wape(all_labels.numpy(), outputs.numpy())
        
    print(f'NSE : {nsee}', end=' ')
    print(f'WAPE : {wapee}', end=' ')
    print(f'Validation Loss: {avg_val_loss}')
    model.train()
    return avg_val_loss.item()  # return a plain Python float rather than a tensor
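A small worked example may help when reading the logged values. The sketch re-implements both metrics with the standard library on made-up numbers (the `epsilon` guard is omitted for brevity). A model that always predicts the mean of `y` gets an NSE of 0, so a negative NSE means the predictions are worse than that baseline.

```python
def wape(y, y_pred):
    """Weighted Average Percentage Error, in percent."""
    abs_error = sum(abs(a - b) for a, b in zip(y, y_pred))
    return 100.0 * abs_error / sum(abs(a) for a in y)

def nse(y, y_pred):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches the mean."""
    mean_y = sum(y) / len(y)
    squared_error = sum((b - a) ** 2 for a, b in zip(y, y_pred))
    variance_sum = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - squared_error / variance_sum

y = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]

print(wape(y, y_pred))      # 100 * 1.0 / 10.0
print(nse(y, y_pred))       # 1 - 0.5 / 5.0
print(nse(y, [2.5] * 4))    # constant prediction equal to the mean of y
```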


In [13]:
def objective(trial):
    # Here we define the search space of the hyperparameters. By default, Optuna samples them with TPE
    # (Tree-structured Parzen Estimator), a Bayesian optimization method.
    # suggest_float replaces the deprecated suggest_loguniform / suggest_uniform; log=True samples on a log scale.
    learning_rate = trial.suggest_float('lr', 1e-4, 1e-1, log=True)
    weight_decay = trial.suggest_float('weight_decay', 1e-5, 1e-2, log=True)
    dropout_p = trial.suggest_float('dropout_p', 0.0, 0.5)
    
    model = BasicMLPNetwork(seq_len, pred_len, num_features, dropout_p)
    model = model.to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr = learning_rate, weight_decay=weight_decay)
    
    num_epochs = 5
    best_val_loss = float('inf')
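    patience = 2
    epochs_no_improve = 0  # consecutive epochs without improvement; initialized so the else branch below is always safe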
    
    for epoch in range(num_epochs):
        model.train()
        epoch_loss = []
        for batch_idx, (inputs, labels) in enumerate(train_loader):
            inputs = inputs.to(device)
            labels = labels.to(device)
            outputs = model(inputs)
            loss_val = loss(outputs, labels)
    
            # calculate gradients for back propagation
            loss_val.backward()
    
            # update the weights based on the gradients
            optimizer.step()
    
            # reset the gradients, avoid gradient accumulation
            optimizer.zero_grad()
            epoch_loss.append(loss_val.item())
    
        avg_train_loss = sum(epoch_loss)/len(epoch_loss)
        print(f'Epoch {epoch+1}: Training Loss: {avg_train_loss}', end=' ')
        avg_val_loss = evaluate_model(model, val_loader)
    
        # Check for improvement
        if avg_val_loss < best_val_loss:
            best_val_loss = avg_val_loss
            epochs_no_improve = 0
            # Save the best model
            torch.save(model.state_dict(), 'best_model_trial.pth')
        else:
            epochs_no_improve += 1
            if epochs_no_improve == patience:
                print('Early stopping!')
                # Load the best model before stopping
                model.load_state_dict(torch.load('best_model_trial.pth'))
                break

        # Report intermediate objective value
        trial.report(best_val_loss, epoch)

        # Handle pruning based on the intermediate value
        if trial.should_prune():
            raise optuna.exceptions.TrialPruned()

    return best_val_loss

study = optuna.create_study(direction='minimize')

study.optimize(objective, n_trials=20)

print('Number of finished trials:', len(study.trials))
print('Best trial:')
trial = study.best_trial

print('  Value (Best Validation Loss):', trial.value)
print('  Params:')
for key, value in trial.params.items():
    print(f'    {key}: {value}')


[I 2024-11-18 16:53:07,970] A new study created in memory with name: no-name-8950f954-95a8-4f38-b603-574277c5572a


Epoch 1: Traning Loss: 0.02138136008195579 NSE : -0.004740238189697266 WAPE : 49.405791019343084 Validation Loss: 0.011892914772033691
Epoch 2: Traning Loss: 0.018876492513343692 NSE : 0.012080132961273193 WAPE : 47.82370479826412 Validation Loss: 0.01169381570070982
Epoch 3: Traning Loss: 0.01840300301462412 NSE : -0.016654253005981445 WAPE : 46.842339085660775 Validation Loss: 0.012033938430249691
Epoch 4: Traning Loss: 0.018120203169062734 

[I 2024-11-18 16:53:48,192] Trial 0 finished with value: 0.01169381570070982 and parameters: {'lr': 0.00016194304342083342, 'weight_decay': 6.841424555602147e-05, 'dropout_p': 0.40027493799568015}. Best is trial 0 with value: 0.01169381570070982.


NSE : -0.011871337890625 WAPE : 46.606716848438644 Validation Loss: 0.01197732426226139
Early stopping!




Epoch 1: Traning Loss: 0.01877187854703516 NSE : -0.009142398834228516 WAPE : 49.07704940717934 Validation Loss: 0.011945025064051151
Epoch 2: Traning Loss: 0.01854699834343046 NSE : -0.008787989616394043 WAPE : 49.04997337290652 Validation Loss: 0.01194082759320736
Epoch 3: Traning Loss: 0.018543006581254302 NSE : -0.008099794387817383 WAPE : 48.99615329419173 Validation Loss: 0.011932680383324623
Epoch 4: Traning Loss: 0.018541007621213794 NSE : -0.007363557815551758 WAPE : 48.93673088708135 Validation Loss: 0.011923967860639095
Epoch 5: Traning Loss: 0.018537821112200616 

[I 2024-11-18 16:54:38,595] Trial 1 finished with value: 0.011915133334696293 and parameters: {'lr': 0.0002988266883817752, 'weight_decay': 5.5655669735639195e-05, 'dropout_p': 0.381211877661159}. Best is trial 0 with value: 0.01169381570070982.


NSE : -0.006617188453674316 WAPE : 48.87427907276447 Validation Loss: 0.011915133334696293
Epoch 1: Traning Loss: 0.019335698308888823 NSE : -0.03278648853302002 WAPE : 50.46701537671263 Validation Loss: 0.01222489308565855
Epoch 2: Traning Loss: 0.018862698198761792 NSE : -0.039726853370666504 WAPE : 50.8017931676191 Validation Loss: 0.012307043187320232
Epoch 3: Traning Loss: 0.018813551472034305 

[I 2024-11-18 16:55:09,539] Trial 2 finished with value: 0.01222489308565855 and parameters: {'lr': 0.0007573436945323, 'weight_decay': 0.0011534209960972512, 'dropout_p': 0.32493002350507727}. Best is trial 0 with value: 0.01169381570070982.


NSE : -0.044745564460754395 WAPE : 51.03313349054561 Validation Loss: 0.01236644946038723
Early stopping!
Epoch 1: Traning Loss: 0.03006215603835881 NSE : 0.011563658714294434 WAPE : 47.97291159156076 Validation Loss: 0.011699928902089596
Epoch 2: Traning Loss: 0.019834825143218042 NSE : 0.1303076148033142 WAPE : 45.07158832536633 Validation Loss: 0.01029437966644764
Epoch 3: Traning Loss: 0.017032656334340573 NSE : 0.24961018562316895 WAPE : 41.55711002248628 Validation Loss: 0.008882218040525913
Epoch 4: Traning Loss: 0.014663853550329804 NSE : 0.36523133516311646 WAPE : 37.88078290445529 Validation Loss: 0.007513634394854307
Epoch 5: Traning Loss: 0.012781767981126905 

[I 2024-11-18 16:56:00,335] Trial 3 finished with value: 0.006337152794003487 and parameters: {'lr': 0.00010062111955309728, 'weight_decay': 0.0018651433403429928, 'dropout_p': 0.30544139646709195}. Best is trial 3 with value: 0.006337152794003487.


NSE : 0.46462303400039673 WAPE : 34.25243586220551 Validation Loss: 0.006337152794003487
Epoch 1: Traning Loss: 0.01916814944264479 NSE : -0.033257365226745605 WAPE : 50.490394251505535 Validation Loss: 0.012230467051267624
Epoch 2: Traning Loss: 0.018855348623357714 NSE : -0.04117751121520996 WAPE : 50.869500230053866 Validation Loss: 0.012324215844273567
Epoch 3: Traning Loss: 0.01878470928175375 

[I 2024-11-18 16:56:30,749] Trial 4 finished with value: 0.012230467051267624 and parameters: {'lr': 0.0007647318895243781, 'weight_decay': 0.0011153948667736694, 'dropout_p': 0.14728290448677217}. Best is trial 3 with value: 0.006337152794003487.


NSE : -0.044689297676086426 WAPE : 51.03057943241509 Validation Loss: 0.012365785427391529
Early stopping!
Epoch 1: Traning Loss: 0.019598420839291066 

[I 2024-11-18 16:56:40,912] Trial 5 pruned. 


NSE : -0.016392827033996582 WAPE : 49.56996753592168 Validation Loss: 0.012030845507979393
Epoch 1: Traning Loss: 0.023479830966796726 

[I 2024-11-18 16:56:51,148] Trial 6 pruned. 


NSE : -0.25265049934387207 WAPE : 58.01619627480247 Validation Loss: 0.014827379956841469
Epoch 1: Traning Loss: 0.018309891231358053 NSE : -0.0007570981979370117 WAPE : 47.57050596306553 Validation Loss: 0.011845767498016357
Epoch 2: Traning Loss: 0.018452448347583413 NSE : -0.0014967918395996094 WAPE : 47.459315778012886 Validation Loss: 0.011854524724185467
Epoch 3: Traning Loss: 0.018425532422959804 

[I 2024-11-18 16:57:21,784] Trial 7 finished with value: 0.011845767498016357 and parameters: {'lr': 0.00023888792655853055, 'weight_decay': 0.0046018639029721635, 'dropout_p': 0.232572141827935}. Best is trial 3 with value: 0.006337152794003487.


NSE : -0.002379179000854492 WAPE : 47.36429877937903 Validation Loss: 0.01186496764421463
Early stopping!
Epoch 1: Traning Loss: 0.022090568671002984 

[I 2024-11-18 16:57:31,969] Trial 8 pruned. 


NSE : -0.00888669490814209 WAPE : 49.078347185609026 Validation Loss: 0.011941997334361076
Epoch 1: Traning Loss: 0.040203986243577676 

[I 2024-11-18 16:57:42,228] Trial 9 pruned. 


NSE : -0.024647116661071777 WAPE : 50.04466772269613 Validation Loss: 0.012128547765314579
Epoch 1: Traning Loss: 0.08001132544735447 

[I 2024-11-18 16:57:52,447] Trial 10 pruned. 


NSE : -0.022893905639648438 WAPE : 49.94850385010116 Validation Loss: 0.012107795104384422
Epoch 1: Traning Loss: 0.03482726379856467 

[I 2024-11-18 16:58:02,940] Trial 11 pruned. 


NSE : -0.030905604362487793 WAPE : 46.61697080707784 Validation Loss: 0.012202630750834942
Epoch 1: Traning Loss: 0.024911257274448873 

[I 2024-11-18 16:58:13,727] Trial 12 pruned. 


NSE : -0.18161511421203613 WAPE : 47.54232455356045 Validation Loss: 0.013986548408865929
Epoch 1: Traning Loss: 0.022149462811183185 

[I 2024-11-18 16:58:24,628] Trial 13 pruned. 


NSE : -0.18958961963653564 WAPE : 56.172294573369676 Validation Loss: 0.01408094260841608
Epoch 1: Traning Loss: 0.019665456728544087 NSE : 0.26971375942230225 WAPE : 41.95841404836096 Validation Loss: 0.008644256740808487
Epoch 2: Traning Loss: 0.012408047121018171 NSE : 0.5593275129795074 WAPE : 30.41528985202157 Validation Loss: 0.0052161552011966705
Epoch 3: Traning Loss: 0.008443294410593808 NSE : 0.7279270887374878 WAPE : 24.767520348335804 Validation Loss: 0.003220474347472191
Epoch 4: Traning Loss: 0.0068508101562038065 NSE : 0.7711854428052902 WAPE : 22.865214844955467 Validation Loss: 0.0027084338944405317
Epoch 5: Traning Loss: 0.006477498605847358 

[I 2024-11-18 16:59:17,522] Trial 14 finished with value: 0.0024257085751742125 and parameters: {'lr': 0.00029409586609282086, 'weight_decay': 0.0005606064055967145, 'dropout_p': 0.18936945301450742}. Best is trial 14 with value: 0.0024257085751742125.


NSE : 0.7950707077980042 WAPE : 21.549820204870926 Validation Loss: 0.0024257085751742125
Epoch 1: Traning Loss: 0.022872959461528806 NSE : 0.08119946718215942 WAPE : 45.287276082290624 Validation Loss: 0.010875665582716465
Epoch 2: Traning Loss: 0.01680846110265702 NSE : 0.2730065584182739 WAPE : 40.79196400505892 Validation Loss: 0.008605279959738255
Epoch 3: Traning Loss: 0.01248659360781312 NSE : 0.54670250415802 WAPE : 31.47593322718713 Validation Loss: 0.00536559522151947
Epoch 4: Traning Loss: 0.008650019470602274 NSE : 0.6838905811309814 WAPE : 26.94885953193357 Validation Loss: 0.0037417258135974407
Epoch 5: Traning Loss: 0.007276059971190989 

[I 2024-11-18 17:00:11,951] Trial 15 finished with value: 0.00324849970638752 and parameters: {'lr': 0.0003635986405603939, 'weight_decay': 0.000724155154832045, 'dropout_p': 0.180785869026715}. Best is trial 14 with value: 0.0024257085751742125.


NSE : 0.7255594432353973 WAPE : 25.030316707735317 Validation Loss: 0.00324849970638752
Epoch 1: Traning Loss: 0.02264887248980813 

[I 2024-11-18 17:00:22,522] Trial 16 pruned. 


NSE : -0.11314988136291504 WAPE : 53.704949923106625 Validation Loss: 0.013176139444112778
Epoch 1: Traning Loss: 0.020169242594391106 

[I 2024-11-18 17:00:33,372] Trial 17 pruned. 


NSE : -0.03541207313537598 WAPE : 50.59593683682202 Validation Loss: 0.012255973182618618
Epoch 1: Traning Loss: 0.02353827174473554 NSE : 0.26146310567855835 WAPE : 39.0458974432105 Validation Loss: 0.00874191801995039
Epoch 2: Traning Loss: 0.008782417987007648 NSE : 0.5659763514995575 WAPE : 28.75150017904815 Validation Loss: 0.005137453321367502
Epoch 3: Traning Loss: 0.009085985296405853 NSE : 0.11166989803314209 WAPE : 49.086499799523416 Validation Loss: 0.01051499042659998
Epoch 4: Traning Loss: 0.0080467936352361 NSE : 0.7697699069976807 WAPE : 21.860551368711075 Validation Loss: 0.0027251888532191515
Epoch 5: Traning Loss: 0.007442744900006801 

[I 2024-11-18 17:01:27,859] Trial 18 finished with value: 0.0027251888532191515 and parameters: {'lr': 0.002244854964181319, 'weight_decay': 0.00015666065923624156, 'dropout_p': 0.16416494504439288}. Best is trial 14 with value: 0.0024257085751742125.


NSE : 0.7243680059909821 WAPE : 24.569701566795825 Validation Loss: 0.003262602724134922
Epoch 1: Traning Loss: 0.02056050811242312 

[I 2024-11-18 17:01:39,100] Trial 19 pruned. 


NSE : -0.15606296062469482 WAPE : 55.13053311983369 Validation Loss: 0.01368409302085638
Number of finished trials: 20
Best trial:
  Value (Best Validation Loss): 0.0024257085751742125
  Params:
    lr: 0.00029409586609282086
    weight_decay: 0.0005606064055967145
    dropout_p: 0.18936945301450742
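The early-stopping rule inside `objective` can be isolated as a small function. The sketch below is a simplified, framework-free version (the function name is hypothetical) and replays the four validation losses logged for trial 0 with `patience = 2`.

```python
def early_stopping(val_losses, patience):
    """Return (best loss, number of epochs run) under a patience rule."""
    best = float('inf')
    epochs_no_improve = 0
    epochs_run = 0
    for loss in val_losses:
        epochs_run += 1
        if loss < best:
            best = loss
            epochs_no_improve = 0
        else:
            epochs_no_improve += 1
            if epochs_no_improve == patience:
                break
    return best, epochs_run

# Validation losses of trial 0, copied from the log above
trial_0 = [0.011892914772033691, 0.01169381570070982,
           0.012033938430249691, 0.01197732426226139]

best, epochs_run = early_stopping(trial_0, patience=2)
print(best, epochs_run)
```

The best loss is reached in epoch 2, and the two following epochs do not improve on it, so training stops after epoch 4. This agrees with the value reported for trial 0 in the log.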


In [14]:
import optuna.visualization as vis

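# Optimization history
# Call .show() explicitly; otherwise a notebook cell only displays the value of its last expression
vis.plot_optimization_history(study).show()

# Hyperparameter importance
vis.plot_param_importances(study).show()

# Note: these plots require the plotly package. Install it before running this cell.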

ImportError: Tried to import 'plotly' but failed. Please make sure that the package is installed correctly to use this feature. Actual error: No module named 'plotly'.