# Electricity Forecasting - RNN
This notebook is a *re-practice* of the course [DataCamp: Intermediate Deep Learning with PyTorch](https://app.datacamp.com/learn/courses/intermediate-deep-learning-with-pytorch), specifically the chapter **Sequences & Recurrent Neural Networks**.

The task is to predict electricity consumption from past patterns.

The data is the same as the data used in the course, a subset of [ElectricityLoadDiagrams20112014](https://doi.org/10.24432/C58C86).

In [1]:
%%bash
head ../data/electricity_consump/electricity_train.csv

timestamp,consumption
2011-01-01 00:15:00,-0.7043185184993116
2011-01-01 00:30:00,-0.7043185184993116
2011-01-01 00:45:00,-0.6789826341438515
2011-01-01 01:00:00,-0.6536467497883897
2011-01-01 01:15:00,-0.7043185184993116
2011-01-01 01:30:00,-0.7043185184993116
2011-01-01 01:45:00,-0.7299077616983259
2011-01-01 02:00:00,-0.7043185184993116
2011-01-01 02:15:00,-0.7043185184993116


The data contains *consumption* in kilowatts (kW), recorded every 15 minutes over several years. The data is *split* by time:
- **Train**: 2011-2013
- **Test**: 2014
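
A chronological split of this kind can be sketched with pandas as below. The small frame is synthetic, and the cutoff date `2014-01-01` and the names `train_part` and `test_part` are assumptions made for illustration; the course files already arrive split.

```python
import pandas as pd

# Synthetic stand-in for the real data: one reading per day around the year boundary.
df = pd.DataFrame({
    "timestamp": pd.date_range("2013-12-30", periods=4, freq="D"),
    "consumption": [0.1, 0.2, 0.3, 0.4],
})

# Assumed cutoff: rows before 2014 go to training, the rest go to testing.
cutoff = pd.Timestamp("2014-01-01")
train_part = df[df["timestamp"] < cutoff]
test_part = df[df["timestamp"] >= cutoff]

print(len(train_part), len(test_part))  # 2 2
```

Splitting by time rather than at random keeps every test reading strictly later than the training readings, so the model is never evaluated on a period it has already seen.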

In [2]:
import numpy as np
import pandas as pd

The _**sequence length**_ is the number of data points in one *training example*. We predict from the previous 24 hours of data, and each hour contains 4 records (data points), so the *sequence length* is:
$$24 \times 4 = 96$$
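
The same arithmetic can be derived from the sampling interval instead of being hardcoded. This is a small sketch; the variable names are illustrative and not part of the course code.

```python
from datetime import timedelta

# Values taken from the text: 15-minute sampling, 24-hour look-back window.
sampling_interval = timedelta(minutes=15)
lookback_window = timedelta(hours=24)

# Number of readings that fit in the look-back window.
points_per_sequence = lookback_window // sampling_interval
print(points_per_sequence)  # 96
```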

In [3]:
seq_length = 96

In [4]:
train_df = pd.read_csv("../data/electricity_consump/electricity_train.csv")
train_df.head()

,timestamp,consumption
0,2011-01-01 00:15:00,-0.704319
1,2011-01-01 00:30:00,-0.704319
2,2011-01-01 00:45:00,-0.678983
3,2011-01-01 01:00:00,-0.653647
4,2011-01-01 01:15:00,-0.704319


In [5]:
train_df.shape

(105215, 2)

In [6]:
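def create_sequences(df, seq_length):
    """Build (inputs, target) pairs with a sliding window over column 1 (consumption)."""
    xs, ys = [], []
    for i in range(len(df) - seq_length):
        # Inputs: seq_length consecutive readings starting at row i
        x = df.iloc[i:(i+seq_length), 1]
        # Target: the reading that immediately follows the window
        y = df.iloc[i+seq_length, 1]
        xs.append(x)
        ys.append(y)
    return np.array(xs), np.array(ys)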

In [7]:
X_train, y_train = create_sequences(train_df, seq_length)
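
Looping over more than 100,000 rows with `iloc` can be slow. A possible vectorized alternative, sketched here under the assumption that NumPy 1.20 or newer is available, uses `numpy.lib.stride_tricks.sliding_window_view`. The function name `create_sequences_fast` is hypothetical and not part of the course code.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def create_sequences_fast(values, seq_length):
    """Vectorized sliding window over a 1-D array of readings."""
    values = np.asarray(values)
    # Each row holds seq_length + 1 points: the inputs followed by the target.
    windows = sliding_window_view(values, seq_length + 1)
    xs = windows[:, :-1].copy()  # first seq_length points of each window
    ys = windows[:, -1].copy()   # the point that follows them
    return xs, ys

series = np.arange(6, dtype=np.float32)  # toy series: 0, 1, 2, 3, 4, 5
xs, ys = create_sequences_fast(series, seq_length=3)
print(xs.shape, ys.shape)  # (3, 3) (3,)
```

On the real data it would be called as `create_sequences_fast(train_df["consumption"].to_numpy(), seq_length)` and should yield the same number of examples as the loop version, `len(df) - seq_length`.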

## TensorDataset

In [8]:
import torch
from torch.utils.data import TensorDataset, DataLoader

In [9]:
dataset_train = TensorDataset(
    torch.from_numpy(X_train).float(),
    torch.from_numpy(y_train).float()
)

In [10]:
dataset_train

<torch.utils.data.dataset.TensorDataset at 0x769be836bed0>

In [11]:
dataset_train[0]

(tensor([-0.7043, -0.7043, -0.6790, -0.6536, -0.7043, -0.7043, -0.7299, -0.7043,
         -0.7043, -0.6790, -0.6790, -0.6283, -0.6790, -0.7299, -0.7299, -0.7299,
         -0.7043, -0.6790, -0.7043, -0.7552, -0.6790, -0.6790, -0.6790, -0.6536,
         -0.7299, -0.7043, -0.6790, -0.6790, -0.7043, -0.7299, -0.7043, -0.7299,
         -0.6790, -0.7043, -0.7552, -0.9073, -1.0089, -0.9579, -0.9326, -0.9833,
         -0.9833, -1.0089, -0.9833, -1.0089, -0.9833, -0.9579, -0.9579, -0.9579,
         -0.9579, -1.0089, -0.9833, -0.9579, -0.9833, -0.9833, -1.0089, -0.9579,
         -0.9326, -1.0089, -0.9833, -0.9326, -0.9833, -1.0342, -0.9833, -0.9833,
         -0.8566, -0.6790, -0.7299, -0.7299, -0.7043, -0.7043, -0.7299, -0.7043,
         -0.7043, -0.6790, -0.6790, -0.6536, -0.6790, -0.6536, -0.6790, -0.7043,
         -0.7043, -0.7299, -0.7299, -0.6790, -0.6536, -0.7299, -0.7043, -0.7299,
         -0.7299, -0.6283, -0.6536, -0.7043, -0.7043, -0.6536, -0.7299, -0.7043]),
 tensor(-0.7043))

In [12]:
dataset_train[-1]

(tensor([ 0.8483,  0.7720,  0.7213,  0.6957,  0.7720,  0.7467,  0.7467,  0.7467,
          0.7720,  0.7213,  0.6957,  0.7213, -0.1973,  0.6957,  0.6957, -0.3494,
         -0.9073, -0.8819, -0.8566, -0.8819, -0.8566, -0.8566, -0.8313, -0.8313,
         -0.8819, -0.8566, -0.8819, -0.8819, -0.8566, -0.8313, -0.8566, -0.8313,
          0.5941,  0.7467,  0.5941,  0.4416,  0.3144,  0.2128,  0.3144,  0.3398,
          0.3398,  0.5432,  0.6451,  0.1873,  0.4416, -0.1720, -0.1464, -0.0958,
         -0.1464, -0.1211, -0.0958, -0.1211, -0.1211, -0.2227, -0.2227, -0.2227,
         -0.3240, -0.3494, -0.4254, -0.2734, -0.3240,  0.4923,  0.6195,  0.5941,
          0.5688,  0.6704,  0.8483,  0.9755,  1.1026,  1.0008,  0.9501,  0.8992,
          0.4670, -0.3747, -0.4510, -0.4763, -0.5016, -0.5270, -0.5523, -0.6030,
         -0.6536, -0.6536, -0.7299, -0.7299, -0.7299, -0.7806, -0.8313, -0.8313,
         -0.8819, -0.9073, -0.9073, -0.9326, -0.9326, -0.9073, -0.9326, -0.9326]),
 tensor(-0.9326))

In [13]:
len(dataset_train)

105119

In [14]:
dataloader_train = DataLoader(dataset_train, batch_size=16)
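
The dataset holds 105119 examples, which is not a multiple of 16, so the last batch is smaller than the others (15 examples). The sketch below shows the same effect on a synthetic dataset of 10 examples; the tensors are made up for illustration.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Synthetic stand-in: 10 sequences of length 4 with scalar targets.
features = torch.randn(10, 4)
targets = torch.randn(10)
loader = DataLoader(TensorDataset(features, targets), batch_size=4)

# Collect the size of every batch the loader yields.
batch_sizes = [seqs.size(0) for seqs, _ in loader]
print(batch_sizes)  # [4, 4, 2]
```

Code that consumes the batches should therefore not hardcode the batch size. Alternatively, `DataLoader` accepts `drop_last=True` to skip the incomplete batch, and `shuffle=True` to reorder examples every epoch.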

## RNN in PyTorch

In [15]:
import torch.nn as nn

In [16]:
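class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # Two stacked RNN layers; each time step carries a single feature
        self.rnn = nn.RNN(
            input_size=1,
            hidden_size=32,
            num_layers=2,
            batch_first=True
        )
        # Map the last hidden output to one predicted value
        self.fc = nn.Linear(32, 1)

    def forward(self, x):
        # Initial hidden state: (num_layers, batch_size, hidden_size), on the input's device
        h0 = torch.zeros(2, x.size(0), 32, device=x.device)
        out, _ = self.rnn(x, h0)
        # Use only the output at the last time step
        out = self.fc(out[:, -1, :])
        return out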

In [17]:
net = Net()

### Training

In [18]:
import torch.optim as optim

In [19]:
criterion = nn.MSELoss()
optimizer = optim.Adam(
    net.parameters(), lr=0.001
)

With `batch_first=True`, a recurrent layer expects input of shape `(batch_size, seq_length, num_features)`.
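
A minimal sketch of that shape requirement, using random data in place of the real batches: the dataset yields sequences of shape `(batch_size, seq_length)`, so a trailing feature dimension has to be added before the batch reaches `nn.RNN`.

```python
import torch
import torch.nn as nn

batch_size, seq_length, num_features = 16, 96, 1

# A batch as the dataset yields it: one scalar per time step, no feature axis.
seqs = torch.randn(batch_size, seq_length)

# Add the trailing feature dimension: (16, 96) -> (16, 96, 1).
seqs_3d = seqs.unsqueeze(-1)

rnn = nn.RNN(input_size=num_features, hidden_size=32, num_layers=2, batch_first=True)
out, h_n = rnn(seqs_3d)

print(seqs_3d.shape)  # torch.Size([16, 96, 1])
print(out.shape)      # torch.Size([16, 96, 32]), last layer's output at every step
print(h_n.shape)      # torch.Size([2, 16, 32]), final hidden state of each layer
```

Note that `h_n` keeps the layout `(num_layers, batch_size, hidden_size)` even when `batch_first=True`; only the input and `out` are batch-first.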

In [20]:
num_epochs = 100

In [21]:
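# for epoch in range(num_epochs):
#     for seqs, labels in dataloader_train:
#         # Add the feature dimension; -1 keeps the smaller final batch working
#         seqs = seqs.view(-1, seq_length, 1)
#         outputs = net(seqs).squeeze(-1)  # (batch, 1) -> (batch,) to match labels
#         loss = criterion(outputs, labels)
#         optimizer.zero_grad()
#         loss.backward()
#         optimizer.step()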
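
The model returns outputs of shape `(batch_size, 1)`, while the targets from the dataset have shape `(batch_size,)`. The sketch below, on made-up values, illustrates why the two shapes should be aligned before calling `nn.MSELoss`: mismatched shapes are broadcast against each other, and the loss then compares every prediction with every target.

```python
import warnings

import torch
import torch.nn as nn

criterion = nn.MSELoss()
outputs = torch.tensor([[1.0], [2.0], [3.0]])  # shape (3, 1), like the model output
labels = torch.tensor([1.0, 2.0, 3.0])         # shape (3,), like the dataset targets

# Shapes aligned: predictions equal targets, so the loss is zero.
loss_matched = criterion(outputs.squeeze(-1), labels)

# Shapes differ: (3, 1) against (3,) broadcasts to (3, 3).
# PyTorch warns about this; the warning is silenced here to keep the output short.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    loss_broadcast = criterion(outputs, labels)

print(loss_matched.item())    # 0.0
print(loss_broadcast.item())  # about 1.3333, although every prediction is exact
```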