### Batches and epochs
A training dataset can be divided into one or more batches. The batch size determines the variant of gradient descent:

  - Batch Gradient Descent. Batch Size = Size of Training Set
  - Stochastic Gradient Descent. Batch Size = 1
  - Mini-Batch Gradient Descent. 1 < Batch Size < Size of Training Set
  
The number of epochs defines the number of times that the learning algorithm will work through the entire training dataset.
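
As a rough illustration of how batch size and epochs relate, the sketch below counts the batches in one epoch and the weight updates over a full training run. The helper name `batches_per_epoch` and the sample counts are illustrative assumptions. It also assumes that a final partial batch is kept rather than dropped.

```python
import math


def batches_per_epoch(n_samples, batch_size):
    """Number of batches needed to cover the dataset once (last batch may be smaller)."""
    return math.ceil(n_samples / batch_size)


n_samples = 10  # hypothetical training set size

# One batch size per gradient descent variant listed above
for name, batch_size in [("batch", n_samples), ("stochastic", 1), ("mini-batch", 4)]:
    print(name, batches_per_epoch(n_samples, batch_size))

# With one weight update per batch, the total number of updates is epochs * batches
n_epochs = 5
total_updates = n_epochs * batches_per_epoch(n_samples, 4)
print(total_updates)
```

With one weight update per batch, batch gradient descent updates once per epoch, while stochastic gradient descent updates once per sample.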

In [None]:
import pandas as pd
import numpy as np

label_df = pd.DataFrame(data=np.arange(0, 30).reshape(10, 3), columns=['env', 'dit', 'dah'])
label_df

Let's say the 'env' column is the time series (1 dimension) and the 'dit' and 'dah' columns are the labels (2 dimensions).

In [None]:
ts_df = label_df.drop(columns=['dit','dah'])
print(type(ts_df))
ts_df

In [None]:
label_df.drop(columns=['env'], inplace=True)
label_df

Float tensors can be built directly from the values of a Pandas Series or DataFrame.

In [None]:
import torch

t_X = torch.FloatTensor(ts_df.values)
print(t_X.shape, len(t_X))
t_y = torch.FloatTensor(label_df.values)
print(t_y.shape)

Each item returned by the time series dataset is a pair of X and the corresponding y label values. X is a window on the time series of length `seq_len` starting at `index`. y is the label at `index + seq_len`. In other words, for a window of length `seq_len` ending at `t`, y holds the values to predict at `t+1`.
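
To make the indexing concrete, here is a minimal pure-Python sketch of the same windowing, using the toy values from the frame built earlier. The helper name `window_pair` is a hypothetical name chosen for illustration, and plain lists stand in for tensors.

```python
# Same values as np.arange(0, 30).reshape(10, 3): row i is [3i, 3i+1, 3i+2]
rows = [[3 * i, 3 * i + 1, 3 * i + 2] for i in range(10)]
env = [row[0] for row in rows]      # time series column
labels = [row[1:] for row in rows]  # 'dit' and 'dah' columns
seq_len = 3


def window_pair(index):
    """Return the window starting at index and the label right after it."""
    return env[index:index + seq_len], labels[index + seq_len]


# The last seq_len rows cannot start a window that still has a label after it
n_pairs = len(env) - seq_len
for index in range(n_pairs):
    x, y = window_pair(index)
    print(index, x, y)
```

The first pair is the window `[0, 3, 6]` with label `[10, 11]`, and 10 rows with `seq_len = 3` yield 7 pairs.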

In [None]:
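class TimeSeriesDataset(torch.utils.data.Dataset):
    """Sliding windows over X, each paired with the label that follows the window."""

    def __init__(self, X, y, seq_len=1):
        self.X = torch.FloatTensor(X.values)  # X is a Pandas DataFrame
        self.y = torch.FloatTensor(y.values)  # y is a Pandas DataFrame
        self.seq_len = seq_len

    def __len__(self):
        # The last seq_len rows cannot start a window that still has a label after it
        return len(self.X) - self.seq_len

    def __getitem__(self, index):
        # Window of seq_len steps, and the label right after the window
        return (self.X[index:index + self.seq_len], self.y[index + self.seq_len])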

This creates the data loader from the dataset. We verify that X has shape `(batch, sequence, input)`, ready to be consumed by an LSTM with `batch_first=True`, and that y has shape `(batch, labels)`.

In [None]:
train_dataset = TimeSeriesDataset(ts_df, label_df, 3)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=2, shuffle=False)
print(len(train_loader))
X_item, y_item = next(iter(train_loader))
print(X_item.shape)
print(X_item)
print(y_item.shape)
print(y_item)
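
As a hedged sketch of what "ready to be consumed by an LSTM" could look like, the cell below passes a tensor with the same `(batch, sequence, input)` shape as `X_item` through `torch.nn.LSTM` with `batch_first=True`. The hidden size of 8 and the linear head that maps to the two labels are assumptions made for illustration. They are not part of the notebook's model.

```python
import torch

batch, seq_len, n_inputs, n_labels = 2, 3, 1, 2
hidden_size = 8  # arbitrary illustrative value

# Stand-in for X_item: same shape as one batch from the loader above
X_batch = torch.zeros(batch, seq_len, n_inputs)

lstm = torch.nn.LSTM(input_size=n_inputs, hidden_size=hidden_size, batch_first=True)
head = torch.nn.Linear(hidden_size, n_labels)  # hypothetical output layer

output, (h_n, c_n) = lstm(X_batch)   # output: (batch, seq_len, hidden_size)
prediction = head(output[:, -1, :])  # last time step only: (batch, n_labels)
print(output.shape, prediction.shape)
```

The prediction has the same `(batch, labels)` shape as `y_item`, so the two can be compared directly by a loss function.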