## Multivariate time series prediction using Long Short-Term Memory (LSTM)

### Theory of Recurrent Neural Network (RNN)

The most important difference between an RNN and a fully connected network (FCN) is that an RNN processes its input sequentially and maintains a hidden state.
The hidden state is updated for each element of the input sequence. When we train the model, it learns how to update the hidden state so that it can be used to predict the next value in the sequence. Basic example:

### Let's consider a network that learns to count the number of 1's in a binary sequence.

In an FCN, we feed the entire binary sequence at once, e.g. 1, 0, 1, 1 is presented as a single input vector, which is multiplied by a weight matrix (for example, a 4x4 matrix), and the computation flows forward.

But in an RNN, the model
- first takes 1 and an initial hidden state h (normally initialized to 0) and updates the hidden state h,
- then takes 0 and h and updates the hidden state h,
- then takes 1 and h and updates the hidden state h,
- and finally takes 1 and h and updates the hidden state h.

  This final h value is used to predict the number of 1's in the binary sequence. Normally this h is passed through a fully connected layer to get the required final output.
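The step-by-step update can be sketched in a few lines of plain Python. This is a deliberately simplified cell: the weights `w_xh = 1` and `w_hh = 1` are set by hand rather than learned, there is no bias and no activation function, and the function name `count_ones` is made up for illustration. With these choices the hidden state is simply a running count of the 1's seen so far.

```python
# Hand-set weights (an assumption for illustration, not learned values):
# with w_xh = 1 and w_hh = 1, the hidden state is a running count of 1's.
def count_ones(sequence, w_xh=1.0, w_hh=1.0, h0=0.0):
    """Process the sequence one element at a time, carrying the hidden state h."""
    h = h0
    history = []
    for x_t in sequence:
        # new hidden state = contribution of the input + contribution of the old state
        h = w_xh * x_t + w_hh * h
        history.append(h)
    return h, history

final_h, history = count_ones([1, 0, 1, 1])
print(history)  # [1.0, 1.0, 2.0, 3.0]
print(final_h)  # 3.0
```

A trained RNN has to discover suitable weights on its own, and its hidden state is usually a vector rather than a single number.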

Mathematically,

$$ h_t = \tanh(W_{xh} x_t + b_{xh} + W_{hh} h_{t-1} + b_{hh}) $$

The final output is calculated as
$$ y_t = W_{hy} h_t + b_y $$

The PyTorch implementation later in this notebook (`BasicLSTMNetwork`) is an LSTM neural network with 1 LSTM layer and 1 fully connected layer.


### LSTM

An LSTM is an RNN in which the hidden state is calculated in a more elaborate way: it uses gates and a separate cell state, defined by several equations. In practice, LSTMs are used in many time series modeling applications. The PyTorch implementation of LSTM uses the following equations in each layer.

We initialize the starting values of $h_{t-1}$ and $c_{t-1}$ (that is, $h_0$ and $c_0$) to zeros in every forward pass.
$$\begin{array}{ll}
         \text{Input Gate:} & i_t = \sigma(W_{ii} x_t + b_{ii} + W_{hi} h_{t-1} + b_{hi}) \\
         \text{Forget Gate:} & f_t = \sigma(W_{if} x_t + b_{if} + W_{hf} h_{t-1} + b_{hf}) \\
         \text{Candidate Cell State:} & g_t = \tanh(W_{ig} x_t + b_{ig} + W_{hg} h_{t-1} + b_{hg}) \\
         \text{Output Gate:} & o_t = \sigma(W_{io} x_t + b_{io} + W_{ho} h_{t-1} + b_{ho}) \\
         \text{Cell State Update:} & c_t = f_t \odot c_{t-1} + i_t \odot g_t \\
         \text{Hidden State:} & h_t = o_t \odot \tanh(c_t) \\
        \end{array}$$
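To make the gate equations concrete, here is a minimal sketch of one LSTM step for a single input feature and a single hidden unit, written with the standard library only. The parameter values (every weight 0.5, every bias 0.0) and the names `lstm_step` and `params` are hypothetical and chosen for readability. A real layer uses weight matrices and vectors, which `torch.nn.LSTM` handles internally.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step for a scalar input and a single hidden unit.

    p maps names such as "W_ii" or "b_hi" to scalar values (hypothetical here).
    With one unit, the element-wise product is an ordinary multiplication.
    """
    i_t = sigmoid(p["W_ii"] * x_t + p["b_ii"] + p["W_hi"] * h_prev + p["b_hi"])    # input gate
    f_t = sigmoid(p["W_if"] * x_t + p["b_if"] + p["W_hf"] * h_prev + p["b_hf"])    # forget gate
    g_t = math.tanh(p["W_ig"] * x_t + p["b_ig"] + p["W_hg"] * h_prev + p["b_hg"])  # candidate cell state
    o_t = sigmoid(p["W_io"] * x_t + p["b_io"] + p["W_ho"] * h_prev + p["b_ho"])    # output gate
    c_t = f_t * c_prev + i_t * g_t   # cell state update
    h_t = o_t * math.tanh(c_t)       # hidden state
    return h_t, c_t

# Hypothetical parameters: every weight 0.5, every bias 0.0
params = {}
for gate in "ifgo":
    params[f"W_i{gate}"] = 0.5
    params[f"W_h{gate}"] = 0.5
    params[f"b_i{gate}"] = 0.0
    params[f"b_h{gate}"] = 0.0

h, c = 0.0, 0.0  # zero initial hidden and cell states
for x_t in [1.0, 0.0, 1.0, 1.0]:
    h, c = lstm_step(x_t, h, c, params)
    print(f"x={x_t}  h={h:.4f}  c={c:.4f}")
```

Because the output gate lies in (0, 1) and tanh lies in (-1, 1), the hidden state always stays strictly between -1 and 1, while the cell state is free to grow beyond that range.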



In [1]:
import torch
import numpy as np

import pandas as pd
from sklearn.preprocessing import MinMaxScaler
import optuna



In [2]:
path = 'final_data.csv'

In [3]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f'Using device: {device}')

# For reproducibility, fix the random seed so that the parameters are initialized identically across all experiments.
torch.manual_seed(42)

# If you are using CUDA, also set the seed for it
if torch.cuda.is_available():
    torch.cuda.manual_seed(42)
    torch.cuda.manual_seed_all(42)

# Set the seed for NumPy
np.random.seed(42)

Using device: cuda


Here we define **RiverData**, a custom Dataset class for loading our dataset. It extends the PyTorch Dataset class.  
- We need to define the \_\_init__() function, which can be used to load data from the file and, optionally, to preprocess it.
- Next we define the \_\_len__() function, which returns the length of the dataset.
- Then we define the \_\_getitem__() function, which returns one (feature, label) tuple that can be used for model training.
  For our time series data, the feature is the window of past values fed to the model and the label is the future values to be predicted.
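The windowing performed by `__getitem__` can be sketched on a plain Python list. The helper name `make_windows` and the sample values are made up for illustration, and the sketch uses a single column, whereas `RiverData` slices every feature column for the input and only the target column for the label.

```python
def make_windows(values, seq_len, pred_len):
    """Return (feature, label) pairs: seq_len past values and the pred_len values that follow."""
    pairs = []
    # Same sample count as RiverData.__len__: len(values) - seq_len - pred_len
    for idx in range(len(values) - seq_len - pred_len):
        feature = values[idx: idx + seq_len]
        label = values[idx + seq_len: idx + seq_len + pred_len]
        pairs.append((feature, label))
    return pairs

series = [10, 11, 12, 13, 14, 15, 16]  # hypothetical readings
for feature, label in make_windows(series, seq_len=3, pred_len=1):
    print(feature, "->", label)
# [10, 11, 12] -> [13]
# [11, 12, 13] -> [14]
# [12, 13, 14] -> [15]
```

Consecutive samples overlap heavily: each window is the previous one shifted forward by a single time step.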

In [4]:
class RiverData(torch.utils.data.Dataset):
    
    def __init__(self, df, target, datecol, seq_len, pred_len):
        self.df = df
        self.datecol = datecol
        self.target = target
        self.seq_len = seq_len
        self.pred_len = pred_len
        self.setIndex()
        

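    def setIndex(self):
        # Assign the result instead of using inplace=True: self.df may be a slice of a
        # larger DataFrame, and modifying a slice in place can raise SettingWithCopyWarning.
        self.df = self.df.set_index(self.datecol)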
    

    def __len__(self):
        return len(self.df) - self.seq_len - self.pred_len


    def __getitem__(self, idx):
        if len(self.df) <= (idx + self.seq_len+self.pred_len):
            raise IndexError(f"Index {idx} is out of bounds for dataset of size {len(self.df)}")
        # .iloc makes the positional (row-number) slicing explicit
        df_piece = self.df.iloc[idx:idx+self.seq_len].values
        feature = torch.tensor(df_piece, dtype=torch.float32)
        label_piece = self.df[self.target].iloc[idx + self.seq_len: idx + self.seq_len + self.pred_len].values
        label = torch.tensor(label_piece, dtype=torch.float32)
        return (feature, label) 

### Normalize the data

In [5]:
df = pd.read_csv(path)
raw_df = df.drop('DATE', axis=1, inplace=False)
scaler = MinMaxScaler()

# Apply the transformations
df_scaled = scaler.fit_transform(raw_df)

df_scaled = pd.DataFrame(df_scaled, columns=raw_df.columns)
df_scaled['DATE'] = df['DATE']
df = df_scaled
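Min-max scaling maps each column to the range [0, 1] using that column's minimum and maximum. The sketch below reproduces the formula in plain Python for one column. The function names and sample values are hypothetical, and this is not how `MinMaxScaler` is implemented internally. It also shows the inverse mapping, which is needed to express a prediction made on scaled data in the original units.

```python
def fit_min_max(column):
    """Return the (min, max) bounds that define the scaling."""
    return min(column), max(column)

def transform(column, lo, hi):
    """Scale values to [0, 1]: (x - min) / (max - min)."""
    span = hi - lo
    # A constant column has span 0; map it to 0.0 to avoid dividing by zero
    return [(v - lo) / span if span else 0.0 for v in column]

def inverse_transform(scaled, lo, hi):
    """Map scaled values back to the original units."""
    return [s * (hi - lo) + lo for s in scaled]

gauge_height = [2.0, 4.0, 6.0, 10.0]   # hypothetical readings
lo, hi = fit_min_max(gauge_height)
scaled = transform(gauge_height, lo, hi)
print(scaled)                             # [0.0, 0.25, 0.5, 1.0]
print(inverse_transform(scaled, lo, hi))  # [2.0, 4.0, 6.0, 10.0]
```

With scikit-learn, the corresponding call on a fitted scaler is `scaler.inverse_transform`, which expects an array with the same columns that the scaler was fitted on.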

The next cells use Python's argument-unpacking syntax:
- `*common_args` passes the items of `common_args`, a Python list, to a function as positional arguments.
- `**common_args` passes the items of `common_args`, a Python dictionary, to a function as keyword arguments.
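A small example of both forms of unpacking follows. The function `describe` and its default values are hypothetical; only the unpacking syntax matters.

```python
def describe(target, datecol, seq_len=13, pred_len=1):
    return f"{target} indexed by {datecol}: {seq_len} steps in, {pred_len} out"

positional = ['gauge_height', 'DATE']      # unpacked with *
keyword = {'seq_len': 24, 'pred_len': 2}   # unpacked with **

# Equivalent to describe('gauge_height', 'DATE')
print(describe(*positional))
# gauge_height indexed by DATE: 13 steps in, 1 out

# Equivalent to describe('gauge_height', 'DATE', seq_len=24, pred_len=2)
print(describe(*positional, **keyword))
# gauge_height indexed by DATE: 24 steps in, 2 out
```

Keeping the shared arguments in one list or dictionary means that a change to them applies to the train, validation and test objects at once.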

In [6]:

train_size = int(0.7 * len(df))
test_size = int(0.2 * len(df))
val_size = len(df) - train_size - test_size

seq_len = 13
pred_len = 1
num_features = 7

common_args = ['gauge_height', 'DATE', seq_len, pred_len]
train_dataset = RiverData(df[:train_size], *common_args)
val_dataset = RiverData(df[train_size: train_size+val_size], *common_args)
test_dataset = RiverData(df[train_size+val_size : len(df)], *common_args)


In [7]:
# Important parameters

BATCH_SIZE = 512 # keep as big as can be handled by GPU and memory
SHUFFLE = False # we don't shuffle the time series data
DATA_LOAD_WORKERS = 1 # it depends on the amount of data you need to load
learning_rate = 1e-3


In [8]:
from torch.utils.data import DataLoader

common_args = {'batch_size': BATCH_SIZE, 'shuffle': SHUFFLE}
train_loader = DataLoader(train_dataset, **common_args)
val_loader = DataLoader(val_dataset, **common_args)
test_loader = DataLoader(test_dataset, **common_args)

### Here we define our PyTorch model.

`BasicLSTMNetwork` is the model class. It extends the `Module` class provided by PyTorch.
- We define the \_\_init__() function, which sets up the layers and defines the model parameters.
- We also define the forward() function, which defines how the forward pass is computed.

In [9]:
class BasicLSTMNetwork(torch.nn.Module):
    
    def __init__(self, seq_len, pred_len):
        # call the constructor of the base class
        super().__init__()
        self.seq_len = seq_len
        self.pred_len = pred_len
        self.num_features = num_features
        self.n_layers = 1
        
        self.n_hidden = 128
        
        # define layers for combining across time series
        self.lstm1 = torch.nn.LSTM(input_size = self.num_features, hidden_size = self.n_hidden, num_layers=self.n_layers, batch_first = True)
        self.relu = torch.nn.ReLU()
        self.fc1 = torch.nn.Linear(self.n_hidden * self.seq_len, self.pred_len)


    def init_hidden(self, batchsize):
        device = next(self.parameters()).device
        hidden_state = torch.zeros(self.n_layers, batchsize, self.n_hidden, device=device)
        cell_state = torch.zeros(self.n_layers, batchsize, self.n_hidden, device=device)
        return hidden_state, cell_state

    
    def forward(self, x):
        batchsize, seqlen, featlen = x.size()
        self.hidden_states = self.init_hidden(batchsize)
        lstm_out, self.hidden_states = self.lstm1(x, self.hidden_states)
        lstm_out = lstm_out.contiguous().view(batchsize, -1)
        lstm_out = self.relu(lstm_out)
        lstm_out = self.fc1(lstm_out)
        return lstm_out
# Note that gradients are stored in the .grad attribute of each parameter (in both the LSTM and FC layers).
# They accumulate across backward passes, so we need to reset them after every batch.
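As a sanity check on the architecture, the number of trainable parameters can be worked out by hand. The sketch below assumes the layout PyTorch uses for a single-layer, unidirectional LSTM (four gates, each with an input weight matrix, a hidden weight matrix and two bias vectors) and plugs in the values from this notebook: 7 features, 128 hidden units, a sequence length of 13 and a prediction length of 1. The helper names are made up for illustration.

```python
def lstm_param_count(input_size, hidden_size):
    # 4 gates, each with input weights, hidden weights and two bias vectors
    return 4 * (input_size * hidden_size + hidden_size * hidden_size + 2 * hidden_size)

def linear_param_count(in_features, out_features):
    # weight matrix plus one bias per output
    return in_features * out_features + out_features

num_features, n_hidden, seq_len, pred_len = 7, 128, 13, 1

lstm_params = lstm_param_count(num_features, n_hidden)
# The FC layer sees the hidden states of all time steps flattened together
fc_params = linear_param_count(n_hidden * seq_len, pred_len)

print(lstm_params)              # 70144
print(fc_params)                # 1665
print(lstm_params + fc_params)  # 71809
```

In a live session the total can be compared against `sum(p.numel() for p in model.parameters())`.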

In [10]:
loss = torch.nn.MSELoss()


In [11]:
for i, (f,l) in enumerate(train_loader):
    print('features shape: ', f.shape)
    print('labels shape: ', l.shape)
    break

features shape:  torch.Size([512, 13, 7])
labels shape:  torch.Size([512, 1])


In [16]:
# define metrics
import numpy as np
epsilon = np.finfo(float).eps

def wape_function(y, y_pred):
    """Weighted Average Percentage Error, in percent (0 is a perfect fit; values can exceed 100)."""
    y = np.array(y)
    y_pred = np.array(y_pred)
    numerator = np.sum(np.abs(np.subtract(y, y_pred)))
    # epsilon guards against division by zero when all targets are 0
    denominator = np.add(np.sum(np.abs(y)), epsilon)
    wape = np.divide(numerator, denominator) * 100.0
    return wape

def nse_function(y, y_pred):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches always predicting the mean of y."""
    y = np.array(y)
    y_pred = np.array(y_pred)
    return (1-(np.sum((y_pred-y)**2)/np.sum((y-np.mean(y))**2)))


def evaluate_model(model, data_loader):
    # The following line puts the model in evaluation mode. It disables dropout and makes batch normalization
    # use its running statistics, if those layers are part of the model. For our simple model it's not
    # necessary. Still, I'm going to use it.

    model.eval()
    all_inputs = torch.empty((0, seq_len, num_features))
    all_labels = torch.empty(0, pred_len)
    for inputs, labels in data_loader:
        all_inputs = torch.vstack((all_inputs, inputs))
        all_labels = torch.vstack((all_labels, labels))
    
    with torch.no_grad():
        all_inputs = all_inputs.to(device)
        outputs = model(all_inputs).detach().cpu()
        avg_val_loss = loss(outputs, all_labels)
        nse = nse_function(all_labels.numpy(), outputs.numpy())
        wape = wape_function(all_labels.numpy(), outputs.numpy())
        
    print(f'NSE : {nse}', end=' ')
    print(f'WAPE : {wape}', end=' ')
    print(f'Validation Loss: {avg_val_loss}')
    model.train()
    return avg_val_loss
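Both metrics can be checked by hand on a tiny example. The sketch below re-implements them in plain Python under the names `wape` and `nse`, leaves out the `epsilon` guard for brevity, and uses made-up observations.

```python
def wape(y, y_pred):
    """Sum of absolute errors divided by the sum of absolute targets, in percent."""
    return 100.0 * sum(abs(a - b) for a, b in zip(y, y_pred)) / sum(abs(a) for a in y)

def nse(y, y_pred):
    """1 minus (squared error of the model / squared error of predicting the mean)."""
    mean_y = sum(y) / len(y)
    residual = sum((b - a) ** 2 for a, b in zip(y, y_pred))
    variance = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - residual / variance

observed = [1.0, 2.0, 3.0, 4.0]     # hypothetical values
mean_forecast = [2.5] * 4           # always predict the mean of the observations

print(nse(observed, observed), wape(observed, observed))            # 1.0 0.0
print(nse(observed, mean_forecast), wape(observed, mean_forecast))  # 0.0 40.0
```

A negative NSE therefore means that the model does worse than a constant forecast equal to the mean of the observed values.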


In [17]:
def objective(trial):
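    # Here we define the search space of the hyperparameters. By default, Optuna uses a Bayesian optimization
    # method (the TPE sampler) to propose promising values of the hyperparameters.
    # suggest_loguniform is deprecated; suggest_float(..., log=True) is the equivalent call.
    learning_rate = trial.suggest_float('lr', 1e-4, 1e-2, log=True)
    weight_decay = trial.suggest_float('weight_decay', 1e-5, 1e-2, log=True)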

    model = BasicLSTMNetwork(seq_len, pred_len)
    model = model.to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr = learning_rate, weight_decay=weight_decay)

    # Kept low for demonstration purposes. In a real-world scenario it should be higher, around 50.
    num_epochs = 10
    
    best_val_loss = float('inf')
    epochs_no_improve = 0  # initialized here so that the early-stopping counter always exists

    # Also kept low for demonstration purposes. In a real-world scenario it should be higher, around 5 to 10.
    patience = 1
    
    for epoch in range(num_epochs):
        model.train()
        epoch_loss = []
        for batch_idx, (inputs, labels) in enumerate(train_loader):
            inputs = inputs.to(device)
            labels = labels.to(device)
            outputs = model(inputs)
            loss_val = loss(outputs, labels)
    
            # calculate gradients for back propagation
            loss_val.backward()
    
            # update the weights based on the gradients
            optimizer.step()
    
            # reset the gradients, avoid gradient accumulation
            optimizer.zero_grad()
            epoch_loss.append(loss_val.item())
    
        avg_train_loss = sum(epoch_loss)/len(epoch_loss)
        print(f'Epoch {epoch+1}: Traning Loss: {avg_train_loss}', end=' ')
        avg_val_loss = evaluate_model(model, val_loader)
    
        # Check for improvement
        if avg_val_loss < best_val_loss:
            best_val_loss = avg_val_loss
            epochs_no_improve = 0
            # Save the best model
            torch.save(model.state_dict(), 'best_model_trial.pth')
        else:
            epochs_no_improve += 1
            if epochs_no_improve == patience:
                print('Early stopping!')
                # Load the best model before stopping
                model.load_state_dict(torch.load('best_model_trial.pth'))
                break

        # Report intermediate objective value
        trial.report(best_val_loss, epoch)

        # Handle pruning based on the intermediate value
        if trial.should_prune():
            raise optuna.exceptions.TrialPruned()

    return best_val_loss

study = optuna.create_study(direction='minimize')

study.optimize(objective, n_trials=20)

print('Number of finished trials:', len(study.trials))
print('Best trial:')
trial = study.best_trial

print('  Value (Best Validation Loss):', trial.value)
print('  Params:')
for key, value in trial.params.items():
    print(f'    {key}: {value}')




[I 2024-11-19 16:04:24,044] A new study created in memory with name: no-name-757d3e92-2f76-4796-9da6-0c87a6cc172f


Epoch 1: Traning Loss: 0.014093939006564369 NSE : -0.6168206930160522 WAPE : 77.44650288799536 Validation Loss: 0.027603207156062126
Epoch 2: Traning Loss: 0.01087123653708885 NSE : 0.0015901923179626465 WAPE : 60.48224610044913 Validation Loss: 0.01704537309706211
Epoch 3: Traning Loss: 0.007364156993320519 NSE : 0.5486051440238953 WAPE : 39.16127082360514 Validation Loss: 0.007706447504460812
Epoch 4: Traning Loss: 0.004622035719534657 NSE : 0.7580589205026627 WAPE : 27.21856395390195 Validation Loss: 0.004130543675273657
Epoch 5: Traning Loss: 0.0033482389509362324 NSE : 0.797127902507782 WAPE : 24.704020972741354 Validation Loss: 0.0034635381307452917
Epoch 6: Traning Loss: 0.002861688583516067 NSE : 0.8124763369560242 WAPE : 23.735947389957904 Validation Loss: 0.003201501676812768
Epoch 7: Traning Loss: 0.0026119323752789173 NSE : 0.8206541240215302 WAPE : 23.199395989556592 Validation Loss: 0.003061886178329587
Epoch 8: Traning Loss: 0.0024772764033703537 NSE : 0.8248678296804428

[I 2024-11-19 16:05:43,663] Trial 0 finished with value: 0.002907732268795371 and parameters: {'lr': 0.0007561625800237071, 'weight_decay': 0.004708536115006036}. Best is trial 0 with value: 0.002907732268795371.


NSE : 0.8296834826469421 WAPE : 22.595315115826185 Validation Loss: 0.002907732268795371
Epoch 1: Traning Loss: 0.006959353710053758 NSE : 0.7103018164634705 WAPE : 28.061775580116954 Validation Loss: 0.004945878405123949
Epoch 2: Traning Loss: 0.002870339009707538 NSE : 0.7805216610431671 WAPE : 21.699795769673123 Validation Loss: 0.003747048554942012
Epoch 3: Traning Loss: 0.00239318782798807 NSE : 0.872596874833107 WAPE : 18.43789683084679 Validation Loss: 0.0021750922314822674
Epoch 4: Traning Loss: 0.0021330302906097656 

[I 2024-11-19 16:06:15,384] Trial 1 finished with value: 0.0021750922314822674 and parameters: {'lr': 0.00739110623107165, 'weight_decay': 0.0032251005173378213}. Best is trial 1 with value: 0.0021750922314822674.


NSE : 0.6995921432971954 WAPE : 26.81292491493778 Validation Loss: 0.005128719378262758
Early stopping!
Epoch 1: Traning Loss: 0.012103991124456627 NSE : -0.2770369052886963 WAPE : 68.79279962643757 Validation Loss: 0.02180224098265171
Epoch 2: Traning Loss: 0.006492827798468581 NSE : 0.8003312200307846 WAPE : 23.711842612176135 Validation Loss: 0.003408849472180009
Epoch 3: Traning Loss: 0.0025353606544768316 NSE : 0.8427001237869263 WAPE : 20.512998620269986 Validation Loss: 0.002685505198314786
Epoch 4: Traning Loss: 0.0016571320549036532 NSE : 0.8844922110438347 WAPE : 17.93587169496909 Validation Loss: 0.0019720089621841908
Epoch 5: Traning Loss: 0.001184242763399202 NSE : 0.8966521397233009 WAPE : 17.123673620956684 Validation Loss: 0.0017644085455685854
Epoch 6: Traning Loss: 0.0010054516390339485 NSE : 0.9079659432172775 WAPE : 16.04966843165054 Validation Loss: 0.0015712531749159098
Epoch 7: Traning Loss: 0.0009132620324436836 NSE : 0.9171340689063072 WAPE : 15.18050433216912 

[I 2024-11-19 16:07:34,868] Trial 2 finished with value: 0.0009632355067878962 and parameters: {'lr': 0.0003934442587122108, 'weight_decay': 0.0001942244840788907}. Best is trial 2 with value: 0.0009632355067878962.


NSE : 0.9435797743499279 WAPE : 12.194063885936684 Validation Loss: 0.0009632355067878962
Epoch 1: Traning Loss: 0.011557344935667763 NSE : 0.198744535446167 WAPE : 53.30042212413626 Validation Loss: 0.013679451309144497
Epoch 2: Traning Loss: 0.007646651394005752 NSE : 0.7858569920063019 WAPE : 24.85425545289162 Validation Loss: 0.0036559610161930323
Epoch 3: Traning Loss: 0.0021842981974989043 NSE : 0.8158048689365387 WAPE : 22.786797554377493 Validation Loss: 0.0031446751672774553
Epoch 4: Traning Loss: 0.0013912819028015929 NSE : 0.9160984382033348 WAPE : 14.819676585721098 Validation Loss: 0.001432411139830947
Epoch 5: Traning Loss: 0.0010805342988043712 NSE : 0.936137244105339 WAPE : 12.598536422688989 Validation Loss: 0.0010902982903644443
Epoch 6: Traning Loss: 0.0009661771159722212 NSE : 0.947092991322279 WAPE : 11.26966590672349 Validation Loss: 0.000903255888260901
Epoch 7: Traning Loss: 0.0008649240202624422 NSE : 0.9538114666938782 WAPE : 10.436657873440938 Validation Loss

[I 2024-11-19 16:08:54,357] Trial 3 finished with value: 0.0006196001777425408 and parameters: {'lr': 0.0010292516182143097, 'weight_decay': 0.0002503903940662566}. Best is trial 3 with value: 0.0006196001777425408.


NSE : 0.9637077525258064 WAPE : 9.170560032869831 Validation Loss: 0.0006196001777425408
Epoch 1: Traning Loss: 0.013700634541819084 NSE : -0.755825400352478 WAPE : 80.86985673768156 Validation Loss: 0.029976362362504005
Epoch 2: Traning Loss: 0.01198190504000735 NSE : -0.213647723197937 WAPE : 66.98103240992138 Validation Loss: 0.020720025524497032
Epoch 3: Traning Loss: 0.009417215886060148 NSE : 0.3084389567375183 WAPE : 49.795928141350835 Validation Loss: 0.011806690134108067
Epoch 4: Traning Loss: 0.006673192336128878 NSE : 0.5082704424858093 WAPE : 41.2629072369269 Validation Loss: 0.008395062759518623
Epoch 5: Traning Loss: 0.005022590549211821 NSE : 0.6248090863227844 WAPE : 35.63385177093524 Validation Loss: 0.006405454128980637
Epoch 6: Traning Loss: 0.004451752275105585 NSE : 0.6763741970062256 WAPE : 32.95126424189153 Validation Loss: 0.005525107961148024
Epoch 7: Traning Loss: 0.003996230542425156 NSE : 0.7014378607273102 WAPE : 31.57643133783101 Validation Loss: 0.0050972

[I 2024-11-19 16:10:13,743] Trial 4 finished with value: 0.004047873895615339 and parameters: {'lr': 0.0010561717817610934, 'weight_decay': 0.007798537139907157}. Best is trial 3 with value: 0.0006196001777425408.


NSE : 0.7629012167453766 WAPE : 27.578377063520403 Validation Loss: 0.004047873895615339
Epoch 1: Traning Loss: 0.011552640947224904 NSE : -0.062052369117736816 WAPE : 62.57483661926002 Validation Loss: 0.018131909891963005
Epoch 2: Traning Loss: 0.0061433927150417195 NSE : 0.8206175118684769 WAPE : 21.771738142879176 Validation Loss: 0.0030625115614384413
Epoch 3: Traning Loss: 0.0020877651327028445 NSE : 0.8546998798847198 WAPE : 19.985264215832313 Validation Loss: 0.0024806393776088953
Epoch 4: Traning Loss: 0.0013990596782263642 NSE : 0.8979388698935509 WAPE : 16.84246515283732 Validation Loss: 0.0017424407415091991
Epoch 5: Traning Loss: 0.0010515293974380912 NSE : 0.9222323149442673 WAPE : 14.46097825604662 Validation Loss: 0.0013276904355734587
Epoch 6: Traning Loss: 0.0009202024969818935 NSE : 0.9339597225189209 WAPE : 13.223451466112804 Validation Loss: 0.0011274740099906921
Epoch 7: Traning Loss: 0.0008194794926599145 NSE : 0.9437585547566414 WAPE : 12.053527881251659 Validat

[I 2024-11-19 16:11:32,406] Trial 5 finished with value: 0.0006474709371104836 and parameters: {'lr': 0.0005584104884190766, 'weight_decay': 0.00016854828502381434}. Best is trial 3 with value: 0.0006196001777425408.


NSE : 0.9620752595365047 WAPE : 9.28503626809443 Validation Loss: 0.0006474709371104836
Epoch 1: Traning Loss: 0.012081803504964407 

[I 2024-11-19 16:11:40,249] Trial 6 pruned. 


NSE : -0.3618673086166382 WAPE : 71.07080237040293 Validation Loss: 0.023250507190823555
Epoch 1: Traning Loss: 0.004886623936159262 NSE : 0.8436838239431381 WAPE : 21.317251273556302 Validation Loss: 0.002668710658326745
Epoch 2: Traning Loss: 0.0016762010899708053 NSE : 0.9178655594587326 WAPE : 15.099139702417801 Validation Loss: 0.0014022418763488531
Epoch 3: Traning Loss: 0.0012021332059624643 NSE : 0.9306038469076157 WAPE : 13.692442579919048 Validation Loss: 0.001184767228551209
Epoch 4: Traning Loss: 0.0010236001697859105 

[I 2024-11-19 16:12:11,750] Trial 7 finished with value: 0.001184767228551209 and parameters: {'lr': 0.005057059582258928, 'weight_decay': 0.0010239036710077514}. Best is trial 3 with value: 0.0006196001777425408.


NSE : 0.9285470694303513 WAPE : 13.362277334536282 Validation Loss: 0.0012198816984891891
Early stopping!
Epoch 1: Traning Loss: 0.013603427543849708 NSE : 0.08792859315872192 WAPE : 56.866956417012304 Validation Loss: 0.015571357682347298
Epoch 2: Traning Loss: 0.010729435714302103 

[I 2024-11-19 16:12:27,536] Trial 8 pruned. 


NSE : 0.21875756978988647 WAPE : 53.104159227463875 Validation Loss: 0.013337777927517891
Epoch 1: Traning Loss: 0.012761181334168342 

[I 2024-11-19 16:12:35,444] Trial 9 pruned. 


NSE : -0.4499669075012207 WAPE : 73.46090817787601 Validation Loss: 0.024754589423537254
Epoch 1: Traning Loss: 0.010124910320199644 NSE : 0.2732051610946655 WAPE : 49.88540986573868 Validation Loss: 0.012408220209181309
Epoch 2: Traning Loss: 0.0036947255500820986 NSE : 0.8669676333665848 WAPE : 18.554874993221084 Validation Loss: 0.0022711975034326315
Epoch 3: Traning Loss: 0.001481130956927714 NSE : 0.939994465559721 WAPE : 12.148024464676729 Validation Loss: 0.0010244457516819239
Epoch 4: Traning Loss: 0.0010678783823058794 

[I 2024-11-19 16:13:06,860] Trial 10 finished with value: 0.0010244457516819239 and parameters: {'lr': 0.0018044440036282699, 'weight_decay': 3.2119745657494695e-05}. Best is trial 3 with value: 0.0006196001777425408.


NSE : 0.9396714940667152 WAPE : 12.900838772314968 Validation Loss: 0.001029959530569613
Early stopping!
Epoch 1: Traning Loss: 0.01332901565264674 NSE : 0.24784469604492188 WAPE : 48.4659043288589 Validation Loss: 0.012841186486184597
Epoch 2: Traning Loss: 0.010516184145173408 

[I 2024-11-19 16:13:22,631] Trial 11 pruned. 


NSE : 0.3513110876083374 WAPE : 47.97707387366068 Validation Loss: 0.011074754409492016
Epoch 1: Traning Loss: 0.007109248048187408 NSE : 0.7098385691642761 WAPE : 27.23824042517377 Validation Loss: 0.004953786730766296
Epoch 2: Traning Loss: 0.0036807611907021553 

[I 2024-11-19 16:13:38,332] Trial 12 finished with value: 0.004953786730766296 and parameters: {'lr': 0.002420484822582271, 'weight_decay': 8.303816072011545e-05}. Best is trial 3 with value: 0.0006196001777425408.


NSE : 0.6432669162750244 WAPE : 31.595275850921546 Validation Loss: 0.006090333219617605
Early stopping!
Epoch 1: Traning Loss: 0.012176520158483547 

[I 2024-11-19 16:13:46,198] Trial 13 pruned. 


NSE : 0.12205725908279419 WAPE : 56.64490721861502 Validation Loss: 0.014988696202635765
Epoch 1: Traning Loss: 0.009830789359258062 

[I 2024-11-19 16:13:54,060] Trial 14 pruned. 


NSE : -0.02518165111541748 WAPE : 47.28447088868276 Validation Loss: 0.01750243455171585
Epoch 1: Traning Loss: 0.011876849364289874 

[I 2024-11-19 16:14:01,927] Trial 15 pruned. 


NSE : -0.10741662979125977 WAPE : 63.80023618808558 Validation Loss: 0.01890639401972294
Epoch 1: Traning Loss: 0.013842971368780274 

[I 2024-11-19 16:14:09,811] Trial 16 pruned. 


NSE : -0.16296052932739258 WAPE : 65.15491501525169 Validation Loss: 0.01985466480255127
Epoch 1: Traning Loss: 0.00887467448028197 NSE : 0.24350106716156006 WAPE : 49.93456142686198 Validation Loss: 0.012915343046188354
Epoch 2: Traning Loss: 0.004444658466399681 

[I 2024-11-19 16:14:25,499] Trial 17 pruned. 


NSE : 0.5608006417751312 WAPE : 35.934107239321534 Validation Loss: 0.007498240098357201
Epoch 1: Traning Loss: 0.011952241111799678 

[I 2024-11-19 16:14:33,331] Trial 18 pruned. 


NSE : -0.2190399169921875 WAPE : 67.2187743644896 Validation Loss: 0.020812084898352623
Epoch 1: Traning Loss: 0.010832582341213379 

[I 2024-11-19 16:14:41,063] Trial 19 pruned. 


NSE : -0.021014690399169922 WAPE : 61.451041594035395 Validation Loss: 0.017431294545531273
Number of finished trials: 20
Best trial:
  Value (Best Validation Loss): 0.0006196001777425408
  Params:
    lr: 0.0010292516182143097
    weight_decay: 0.0002503903940662566
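The early-stopping rule inside `objective` can be isolated and replayed on a list of validation losses. In the sketch below, the function name is made up, the first four losses are rounded from trial 1 in the log above, and the fifth is a hypothetical value added to show what a larger `patience` would allow.

```python
def run_with_early_stopping(val_losses, patience):
    """Replay per-epoch validation losses and return (best_loss, epochs_run)."""
    best = float('inf')
    epochs_no_improve = 0
    for epoch, val_loss in enumerate(val_losses, start=1):
        if val_loss < best:
            best = val_loss
            epochs_no_improve = 0      # improvement: reset the counter
        else:
            epochs_no_improve += 1     # no improvement this epoch
            if epochs_no_improve >= patience:
                return best, epoch     # stop early
    return best, len(val_losses)

# First four values rounded from trial 1; the last one is hypothetical
losses = [0.0049, 0.0037, 0.0022, 0.0051, 0.0019]

print(run_with_early_stopping(losses, patience=1))  # (0.0022, 4)
print(run_with_early_stopping(losses, patience=2))  # (0.0019, 5)
```

With `patience=1`, a single noisy epoch ends the trial, which is why several trials in the log stop after only a few epochs. A larger patience tolerates temporary setbacks at the cost of longer trials.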


In [18]:
import optuna.visualization as vis

# Optimization history
fig1 = vis.plot_optimization_history(study)
fig1.write_html("optimization_history_lstm.html")