# LSTM Time Series Data Tutorial

Given the past W time steps, each holding 7 inputs and 3 outputs, estimate the 3 outputs at a later time step.

W × [7 inputs + 3 outputs] -> 3 outputs
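
To make the shapes concrete, here is a small NumPy sketch. The array `data`, the window size `W = 5`, and the choice of taking the target from the row right after the window are assumptions made only for illustration.

```python
import numpy as np

W = 5                            # window size (assumed for illustration)
data = np.random.rand(144, 10)   # 144 time steps, 7 inputs + 3 outputs per step

x = data[0:W]                    # past W rows, all 10 columns
y = data[W, 7:]                  # the 3 output columns of the row after the window

print(x.shape)  # (5, 10)
print(y.shape)  # (3,)
```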

## 1. Import libraries

In [1]:
import pandas as pd  # series data management
import numpy as np  # data array calculations
import torch  # machine learning framework
from datetime import datetime, timedelta  # datetime

## 2. Prepare data
This example builds a time series from purely random values.

Alternatively, you can use your own data in a similar format.

In [2]:
df = pd.DataFrame()  # make empty dataframe

In [3]:
# make time series index, 10 minutes gap for example
start_time = datetime.strptime("00:00:00", "%H:%M:%S")
end_time = datetime.strptime("23:50:00", "%H:%M:%S")
time_diff = timedelta(minutes=10)
time_indexs = []
while start_time <= end_time:
    time_indexs.append(start_time.strftime("%H:%M:%S"))
    start_time += time_diff

num_index = len(time_indexs)
df["Hours"] = np.array(time_indexs)
df.set_index("Hours", inplace=True)

# make random inputs and outputs
input_cols = ["input1", "input2", "input3", "input4", "input5", "input6", "input7"]
output_cols = ["output1", "output2", "output3"]
for col in input_cols + output_cols:
    df[col] = np.random.rand(num_index)

In [4]:
df.head()

| Hours | input1 | input2 | input3 | input4 | input5 | input6 | input7 | output1 | output2 | output3 |
|---|---|---|---|---|---|---|---|---|---|---|
| 00:00:00 | 0.272217 | 0.543551 | 0.508452 | 0.896121 | 0.028985 | 0.001714 | 0.026478 | 0.753832 | 0.747477 | 0.035091 |
| 00:10:00 | 0.463637 | 0.795516 | 0.957192 | 0.540517 | 0.248124 | 0.613166 | 0.297787 | 0.356912 | 0.666159 | 0.789203 |
| 00:20:00 | 0.983764 | 0.999542 | 0.704283 | 0.0988 | 0.732428 | 0.027293 | 0.967092 | 0.819743 | 0.598115 | 0.140531 |
| 00:30:00 | 0.163205 | 0.887882 | 0.99326 | 0.357354 | 0.027471 | 0.393128 | 0.419646 | 0.630511 | 0.378567 | 0.630852 |
| 00:40:00 | 0.160713 | 0.400776 | 0.724516 | 0.037273 | 0.738431 | 0.147715 | 0.540566 | 0.960408 | 0.327427 | 0.415917 |
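
As an alternative sketch, the same 10-minute index can probably be built more compactly with `pd.date_range`. The date used below is arbitrary and the variable names are illustrative; only the time-of-day strings are kept.

```python
import pandas as pd

# one day of timestamps at 10-minute intervals (the date itself is arbitrary)
times = pd.date_range("2000-01-01 00:00:00", "2000-01-01 23:50:00", freq="10min")
time_index = times.strftime("%H:%M:%S")  # keep only the time of day, as strings

print(len(time_index))          # 144
print(time_index[:3].tolist())  # ['00:00:00', '00:10:00', '00:20:00']
```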


And here is an example of loading your own data.

In [5]:
df = pd.read_csv('sample.csv', index_col="Hours")
df.head()

| Hours | input1 | input2 | input3 | input4 | input5 | input6 | input7 | output1 | output2 | output3 |
|---|---|---|---|---|---|---|---|---|---|---|
| 00:00:00 | 0.01595 | 0.325941 | 0.243507 | 0.924663 | 0.263466 | 0.309855 | 0.258402 | 0.342601 | 0.348267 | 0.588811 |
| 00:10:00 | 0.929199 | 0.889737 | 0.937425 | 0.597543 | 0.827709 | 0.837048 | 0.66766 | 0.829207 | 0.68184 | 0.276235 |
| 00:20:00 | 0.449561 | 0.99902 | 0.69437 | 0.955522 | 0.940954 | 0.919795 | 0.187271 | 0.372497 | 0.487677 | 0.953071 |
| 00:30:00 | 0.112158 | 0.159191 | 0.29991 | 0.29778 | 0.286478 | 0.570143 | 0.614302 | 0.121487 | 0.024395 | 0.599209 |
| 00:40:00 | 0.043976 | 0.740073 | 0.81562 | 0.995026 | 0.437561 | 0.279105 | 0.169911 | 0.278675 | 0.344108 | 0.434303 |
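
If you do not have a CSV file in this format yet, one way to produce one is to write a DataFrame with an `Hours` index to disk and read it back. This is a minimal sketch: the frame `demo` is shortened to 3 rows and 2 columns, and the file goes to a temporary directory rather than the tutorial's `sample.csv`.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# small illustrative frame (the tutorial uses 144 rows and 10 columns)
demo = pd.DataFrame(
    {"input1": np.random.rand(3), "output1": np.random.rand(3)},
    index=pd.Index(["00:00:00", "00:10:00", "00:20:00"], name="Hours"),
)

path = os.path.join(tempfile.mkdtemp(), "sample.csv")  # temporary location
demo.to_csv(path)                                      # the index is written as the "Hours" column

loaded = pd.read_csv(path, index_col="Hours")
print(loaded.shape)          # (3, 2)
print(list(loaded.columns))  # ['input1', 'output1']
```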


## 3. Make PyTorch training code
PyTorch needs 3 essential elements for training.
- A. Dataset: supplies the training samples.
- B. Model: the network to be trained.
- C. Objective function (a.k.a. loss, criterion, ...): measures how good the prediction is.


And these are optional:
- Optimizer: **Optimize the weight update** for enhanced training
- Scheduler: **Schedule the learning rate** for enhanced training
- Dataloader: **Data supply tool** for batch training, multi-threading, etc.

We will use the Optimizer and the Dataloader in this tutorial.
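
The scheduler is not used in this tutorial, but as a rough sketch of where it would fit: it wraps the optimizer and is stepped after it. The stand-in `layer`, the dummy loss, and the `StepLR` settings below are assumptions chosen only for illustration.

```python
import torch
import torch.nn as nn
import torch.optim as optim

layer = nn.Linear(10, 3)  # stand-in model, for illustration only
optimizer = optim.Adam(layer.parameters(), lr=0.01)
# halve the learning rate every 2 epochs (arbitrary settings)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=2, gamma=0.5)

for epoch in range(4):
    loss = layer(torch.rand(4, 10)).pow(2).mean()  # dummy loss on random data
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()   # update the weights first
    scheduler.step()   # then advance the learning-rate schedule
    print(epoch, scheduler.get_last_lr())
```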

### a. Dataset
In a PyTorch Dataset, we need to override three special methods:
- \_\_init\_\_: constructor
- \_\_len\_\_: returns the dataset length for len()
- \_\_getitem\_\_: returns an element by index, as with a Python list (e.g. arr[i])

In [6]:
from torch.utils.data import Dataset


class MyDataset(Dataset):
    def __init__(self, df, window_size: int):
        self.df = df
        self.window_size = window_size  # number of past rows used for each prediction

    def __len__(self):
        return len(self.df) - (self.window_size + 1)

    def __getitem__(self, idx):
        x = self.df.iloc[idx:idx+self.window_size].values  # window_size past rows, all columns
        # target: the 3 outputs of row idx + window_size + 1 (the row right after the window is skipped)
        y = self.df.iloc[idx+self.window_size+1][['output1', 'output2', 'output3']].values
        return torch.tensor(x).float(), torch.tensor(y).float()


window_size = 5
dataset = MyDataset(df, window_size)
print("length of dataset:", len(dataset))
# get first element
x, y = dataset[0]
print("\nexample of x(input):\n", x)  # [7 inputs + 3 outputs] * window_size
print("\nexample of y(output)\n:", y)  # 3 outputs

length of dataset: 138

example of x(input):
 tensor([[0.0159, 0.3259, 0.2435, 0.9247, 0.2635, 0.3099, 0.2584, 0.3426, 0.3483,
         0.5888],
        [0.9292, 0.8897, 0.9374, 0.5975, 0.8277, 0.8370, 0.6677, 0.8292, 0.6818,
         0.2762],
        [0.4496, 0.9990, 0.6944, 0.9555, 0.9410, 0.9198, 0.1873, 0.3725, 0.4877,
         0.9531],
        [0.1122, 0.1592, 0.2999, 0.2978, 0.2865, 0.5701, 0.6143, 0.1215, 0.0244,
         0.5992],
        [0.0440, 0.7401, 0.8156, 0.9950, 0.4376, 0.2791, 0.1699, 0.2787, 0.3441,
         0.4343]])

example of y(output)
: tensor([0.4136, 0.9490, 0.3915])
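
The per-sample shapes above gain a leading batch dimension once a DataLoader is involved. A minimal sketch, using random stand-in tensors (`xs` and `ys` are illustrative, not the tutorial's data) with the same per-sample shapes and dataset length as printed above:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# stand-in data: 138 samples, each x is (5, 10) and each y is (3,)
xs = torch.rand(138, 5, 10)
ys = torch.rand(138, 3)
loader = DataLoader(TensorDataset(xs, ys), batch_size=4, shuffle=True)

xb, yb = next(iter(loader))
print(xb.shape)     # torch.Size([4, 5, 10]) -> (batch, window, features)
print(yb.shape)     # torch.Size([4, 3])
print(len(loader))  # 35 batches: 34 full ones and a final batch of 2
```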


### b. Simple LSTM Model
In a PyTorch model, we need to override 2 methods:
- \_\_init\_\_: constructor
- forward: the forward pass (inference). It is invoked by calling model(x)

In [7]:
import torch.nn as nn
import torch.optim as optim


class MyLSTM(nn.Module):
    def __init__(self, n_input: int, n_hidden: int, n_output: int):
        super().__init__()
        # batch_first=True: input is (batch, window, features), as produced by the DataLoader
        self.lstm = nn.LSTM(input_size=n_input, hidden_size=n_hidden, batch_first=True)  # LSTM layer
        self.linear = nn.Linear(n_hidden, n_output)  # fully connected layer for the final prediction

    def forward(self, x):
        lstm_out, hidden = self.lstm(x)
        # hidden state of the last time step; works for batched and unbatched input
        out = self.linear(lstm_out[..., -1, :])
        return out


n_input = 10  # because we use 7 inputs + 3 outputs = 10
n_hidden = 5  # hidden size used by the LSTM. You can change this value
n_output = 3  # 3 outputs

model = MyLSTM(n_input, n_hidden, n_output)
print(model)

MyLSTM(
  (lstm): LSTM(10, 5, batch_first=True)
  (linear): Linear(in_features=5, out_features=3, bias=True)
)
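
A quick shape check can help when wiring an LSTM to batched data. This sketch assumes `batch_first=True`, so the input is laid out as (batch, window, features); without that flag `nn.LSTM` expects (window, batch, features) instead. The tensor `x_batch` is random stand-in data.

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=10, hidden_size=5, batch_first=True)
x_batch = torch.rand(4, 5, 10)  # (batch, window, features)

out, (h_n, c_n) = lstm(x_batch)
print(out.shape)  # torch.Size([4, 5, 5]) -> hidden state at every time step
print(h_n.shape)  # torch.Size([1, 4, 5]) -> final hidden state per sample

last = out[:, -1]  # hidden state at the last time step of each sample
print(torch.allclose(last, h_n[0]))  # True for a single-layer, unidirectional LSTM
```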


### Sample usage with x

In [8]:
pred = model(x)
print("predicted value for input x:\n", pred.detach().numpy())

predicted value for input x:
 [-0.18180266 -0.04074691 -0.39606753]


### c. Objective function
We will simply use torch.nn.MSELoss() here.

In [9]:
criterion = nn.MSELoss()

### Sample loss calculation

In [10]:
loss = criterion(pred, y)
print("predicted value for input x:\n", pred.detach().numpy())
print("target:\n", y.detach().numpy())
print("\nMSE loss:", loss.item())

predicted value for input x:
 [-0.18180266 -0.04074691 -0.39606753]
target:
 [0.41364914 0.94902205 0.3915306 ]

MSE loss: 0.6515054702758789
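
As a sanity check, the mean squared error can be recomputed by hand from the values printed above. This sketch uses NumPy and copies the rounded printed numbers, so it matches the reported loss only up to rounding.

```python
import numpy as np

# values copied from the printed output above
pred = np.array([-0.18180266, -0.04074691, -0.39606753])
target = np.array([0.41364914, 0.94902205, 0.3915306])

mse = np.mean((pred - target) ** 2)  # mean of the squared differences
print(round(float(mse), 4))  # 0.6515
```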


## 4. Training
Now we simply assemble the ingredients prepared above.

As mentioned above, we will additionally use an Optimizer and a Dataloader.

In [11]:
from torch.utils.data import DataLoader


### hyperparameters
num_window = 5
num_batch = 4
num_input = 10
num_hidden = 5
num_output = 3
num_epoch = 500
###################


# prepare dataset & loader
dataset = MyDataset(df, num_window)
dataloader = DataLoader(dataset, batch_size=num_batch, shuffle=True)

# prepare model
model = MyLSTM(num_input, num_hidden, num_output)

# prepare objective function
criterion = nn.MSELoss()

# prepare optimizer
optimizer = optim.Adam(model.parameters(), lr=0.01)


# gpu utilization if it's available
device = 'cpu'
if torch.cuda.is_available():
    device = 'cuda:0'
model.to(device)
criterion.to(device)

# train
for epoch in range(num_epoch):
    avg_loss = 0.
    for step, (x, y) in enumerate(dataloader):
        optimizer.zero_grad()  # reset optimizer
        x = x.to(device)  # gpu utilization if it's available
        y = y.to(device)  # gpu utilization if it's available

        output = model(x)  # prediction
        loss = criterion(output, y)  # calcuate the loss
        loss.backward()  # backpropagation and update the model weights

        avg_loss += loss / len(dataloader)
        optimizer.step()

    if (epoch + 1) % 50 == 0:  # log per 50 epochs
        print(f"Epoch: {str(epoch + 1).rjust(4, ' ')}, ", "average loss =", "{:.6f}".format(loss))

Epoch:   50,  average loss = 0.060346
Epoch:  100,  average loss = 0.083287
Epoch:  150,  average loss = 0.036458
Epoch:  200,  average loss = 0.018195
Epoch:  250,  average loss = 0.060213
Epoch:  300,  average loss = 0.026442
Epoch:  350,  average loss = 0.019931
Epoch:  400,  average loss = 0.068454
Epoch:  450,  average loss = 0.017393
Epoch:  500,  average loss = 0.098631


The loss does not decrease dramatically, because we are training on meaningless random data.

You can improve the result with these common methods:
- use a more complex model
- data augmentation
- hyperparameter tuning
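
As one possible take on the first bullet, the LSTM can be stacked and widened. The class name `DeeperLSTM`, the layer count, the dropout rate, and the hidden size below are hypothetical choices for illustration, not tuned values, and the sketch assumes input laid out as (batch, window, features).

```python
import torch
import torch.nn as nn


class DeeperLSTM(nn.Module):  # hypothetical name, not part of the tutorial code
    def __init__(self, n_input: int, n_hidden: int, n_output: int, n_layers: int = 2):
        super().__init__()
        # stacked LSTM; dropout is applied between the stacked layers
        self.lstm = nn.LSTM(n_input, n_hidden, num_layers=n_layers,
                            dropout=0.1, batch_first=True)
        self.linear = nn.Linear(n_hidden, n_output)

    def forward(self, x):
        out, _ = self.lstm(x)           # (batch, window, n_hidden)
        return self.linear(out[:, -1])  # use the last time step


deeper = DeeperLSTM(n_input=10, n_hidden=32, n_output=3)
pred = deeper(torch.rand(4, 5, 10))  # random stand-in batch
print(pred.shape)  # torch.Size([4, 3])
```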