# Generating sequential data


To train neural networks on sequential data, we first need to pre-process it. We'll chunk the data into input-target pairs, where the inputs are a fixed number of consecutive data points and the target is the data point that immediately follows them.

In [None]:
import numpy as np

def create_sequences(df, seq_length):
    data = df['consumption'].values
    X, y = [], []
    for i in range(len(data) - seq_length):
        X.append(data[i:i+seq_length])
        y.append(data[i+seq_length])
    return np.array(X), np.array(y)

# Example usage:
# X, y = create_sequences(df, seq_length=24)
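
To see what the windowing produces, here is a small sketch on a toy DataFrame. The toy values and the window length of 3 are made up for illustration; only the `consumption` column name and the function itself come from the cell above.

```python
import numpy as np
import pandas as pd

def create_sequences(df, seq_length):
    # Same logic as above, repeated so the example runs on its own
    data = df['consumption'].values
    X, y = [], []
    for i in range(len(data) - seq_length):
        X.append(data[i:i+seq_length])
        y.append(data[i+seq_length])
    return np.array(X), np.array(y)

# Hypothetical toy series: 0, 1, 2, 3, 4
toy_df = pd.DataFrame({'consumption': [0, 1, 2, 3, 4]})
X_toy, y_toy = create_sequences(toy_df, seq_length=3)

print(X_toy)  # [[0 1 2]
              #  [1 2 3]]
print(y_toy)  # [3 4]
```

Each row of `X_toy` is one window and the matching entry of `y_toy` is the value that follows it, so a series of N points yields N - seq_length pairs.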

### Sequential Dataset

Let's create a training Dataset, which can later be wrapped in a DataLoader. To build a sequential Dataset, call `create_sequences()` to get the NumPy arrays with inputs and targets, and inspect their shapes. Next, pass them to a `TensorDataset` to create a proper torch Dataset, and inspect its length.

In [2]:
import torch
from torch.utils.data import TensorDataset
import pandas as pd

# Load the data
train_data = pd.read_csv('./electricity_consump/electricity_train.csv')

# Check if 'consumption' column exists
assert 'consumption' in train_data.columns, "Column 'consumption' not found in DataFrame"

# Create sequences
seq_length = 24 * 4  # 4 days of hourly data, adjust as needed
X_train, y_train = create_sequences(train_data, seq_length)

print("X_train shape:", X_train.shape)
print("y_train shape:", y_train.shape)

# Ensure shapes are compatible for TensorDataset
assert len(X_train) == len(y_train), "Input and target lengths do not match"

dataset_train = TensorDataset(
    torch.tensor(X_train).float(),
    torch.tensor(y_train).float()
)
print("Number of samples in dataset_train:", len(dataset_train))

X_train shape: (105119, 96)
y_train shape: (105119,)
Number of samples in dataset_train: 105119
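
A Dataset like this is typically wrapped in a `DataLoader` for batching. The following is a minimal sketch on random stand-in data with the same window length of 96; the batch size of 32, `shuffle=True`, and all `*_demo` names are assumptions for illustration, not values from this notebook.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Stand-in data: 1000 windows of 96 values each (random, for illustration only)
X_demo = torch.randn(1000, 96)
y_demo = torch.randn(1000)
dataset_demo = TensorDataset(X_demo, y_demo)

# batch_size and shuffle are illustrative choices
dataloader_demo = DataLoader(dataset_demo, batch_size=32, shuffle=True)

seqs, labels = next(iter(dataloader_demo))
print(seqs.shape)    # torch.Size([32, 96])
print(labels.shape)  # torch.Size([32])

# Recurrent layers with input_size=1 and batch_first=True expect
# (batch, seq_length, 1), so add a trailing feature dimension
seqs = seqs.unsqueeze(-1)
print(seqs.shape)    # torch.Size([32, 96, 1])
```

The `unsqueeze(-1)` step matters because each window is stored as a flat vector of 96 values, while the recurrent layers defined with `input_size=1` expect one feature per time step.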


### Building a forecasting RNN

It's time to build our first recurrent network! It will be a sequence-to-vector model consisting of a two-layer RNN with a `hidden_size` of 32. After the RNN, a simple linear layer will map the output of the last time step to the single value to be predicted.

In [3]:
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # Define RNN layer
        self.rnn = nn.RNN(
            input_size = 1,
            hidden_size = 32,
            num_layers = 2,
            batch_first = True,
        )
        self.fc = nn.Linear(32, 1)

    
    def forward(self, x):
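        # Initial hidden state: (num_layers, batch_size, hidden_size),
        # created on the same device as the input
        h0 = torch.zeros(2, x.size(0), 32, device=x.device)
        # x is expected to have shape (batch_size, seq_length, 1)
        out, _ = self.rnn(x, h0)
        # Use only the output at the last time step for the prediction
        out = self.fc(out[:, -1, :])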
        return out
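
To make the shapes in `forward()` concrete, this sketch runs a random batch through an `nn.RNN` configured as above. The batch size of 8 and the random input are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=1, hidden_size=32, num_layers=2, batch_first=True)

x = torch.randn(8, 96, 1)   # (batch, seq_length, features)
h0 = torch.zeros(2, 8, 32)  # (num_layers, batch, hidden_size)

out, h_n = rnn(x, h0)
print(out.shape)  # torch.Size([8, 96, 32]): top-layer output at every time step
print(h_n.shape)  # torch.Size([2, 8, 32]): final hidden state of each layer

# Sequence-to-vector: keep only the last time step of the top layer
last_step = out[:, -1, :]
print(last_step.shape)                     # torch.Size([8, 32])
print(torch.allclose(last_step, h_n[-1]))  # True
```

For a unidirectional RNN, the last time step of `out` coincides with the final hidden state of the top layer, which is why indexing `out[:, -1, :]` is enough to feed the linear layer.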

#### LSTM network

As we already know, plain RNN cells are rarely used in practice. A more common alternative that handles long sequences much better is the Long Short-Term Memory cell, or LSTM.

The most important implementation difference from the RNN we built previously is that an LSTM carries two states rather than one: a hidden state and a cell state. This means we need to initialize the additional cell state and pass both states to the LSTM layer.

In [4]:
class Net(nn.Module):
    def __init__(self):
        super().__init__()

        self.lstm = nn.LSTM(
            input_size=1,
            hidden_size=32,
            num_layers=2,
            batch_first=True
        )

        self.fc = nn.Linear(32, 1)

    def forward(self, x):

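        # Initial hidden and cell states: (num_layers, batch_size, hidden_size),
        # created on the same device as the input
        h0 = torch.zeros(2, x.size(0), 32, device=x.device)
        c0 = torch.zeros(2, x.size(0), 32, device=x.device)
        # The LSTM takes both states as a tuple
        out, _ = self.lstm(x, (h0, c0))
        # Use only the output at the last time step for the prediction
        out = self.fc(out[:, -1, :])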
        return out
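
As a quick check of the two-state interface, the sketch below calls an `nn.LSTM` configured as above on a random batch. The batch size of 8 is an arbitrary choice for illustration.

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=1, hidden_size=32, num_layers=2, batch_first=True)

x = torch.randn(8, 96, 1)   # (batch, seq_length, features)
h0 = torch.zeros(2, 8, 32)  # initial hidden state
c0 = torch.zeros(2, 8, 32)  # initial cell state

# The states go in as a tuple and come back as a tuple
out, (h_n, c_n) = lstm(x, (h0, c0))
print(out.shape)  # torch.Size([8, 96, 32])
print(h_n.shape)  # torch.Size([2, 8, 32])
print(c_n.shape)  # torch.Size([2, 8, 32])

# Omitting the initial states makes PyTorch default them to zeros
out_default, _ = lstm(x)
print(torch.allclose(out, out_default))  # True
```

Because zero initial states are the default, passing `h0` and `c0` explicitly mainly serves to make the two states visible in the code.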
    

#### GRU network

Besides LSTMs, another popular recurrent neural network variant is the Gated Recurrent Unit, or GRU. Its appeal is in its simplicity: GRU cells require less computation than LSTM cells while often matching them in performance.

In [None]:
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        
        self.gru = nn.GRU(
            input_size=1,
            hidden_size=32,
            num_layers=2,
            batch_first=True,
        )
        self.fc = nn.Linear(32, 1)

    def forward(self, x):
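        # Initial hidden state: (num_layers, batch_size, hidden_size),
        # created on the same device as the input
        h0 = torch.zeros(2, x.size(0), 32, device=x.device)
        # Like the plain RNN, the GRU has a single hidden state
        out, _ = self.gru(x, h0)
        # Use only the output at the last time step for the prediction
        out = self.fc(out[:, -1, :])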
        return out
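
One rough way to compare the cost of the three recurrent layers is to count their trainable parameters. The sketch below does this for the configuration used in this notebook; the `count_params` helper is hypothetical, and parameter count is only a proxy for computation, not a benchmark.

```python
import torch.nn as nn

def count_params(module):
    # Total number of elements across all parameters of the module
    return sum(p.numel() for p in module.parameters())

# Same configuration as the networks above
kwargs = dict(input_size=1, hidden_size=32, num_layers=2, batch_first=True)

counts = {
    "RNN": count_params(nn.RNN(**kwargs)),
    "GRU": count_params(nn.GRU(**kwargs)),
    "LSTM": count_params(nn.LSTM(**kwargs)),
}
print(counts)  # {'RNN': 3232, 'GRU': 9696, 'LSTM': 12928}
```

The GRU layer holds three weight sets per layer and the LSTM four, compared with one for the plain RNN, so the GRU sits between the two in size.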