# Week 4C: Train Keypoints Classifier, i.e. Pose Classifier

In this notebook we are going to train our own pose classifier in PyTorch, using the dataset we built with the Python script `00` and the notebook `01`.

**Before you go any further** make sure you have already created and saved your csv file in `class-datasets/my-pose-classification-dataset`.

Code adapted from this [repo](https://github.com/Alimustoofaa/YoloV8-Pose-Keypoint-Classification/tree/master).

### Setting up your Python environment

Before you work through this notebook, please follow the instructions in [Setup-and-test-conda-environment.ipynb](Setup-and-test-conda-environment.ipynb)

Once you have done that you will need to make sure that the environment selected to run this notebook and all the other notebooks used in this unit is called `aim`. 

To do this click the **Select kernel** button in the top right corner of this notebook, and then select `aim`.

To check that the environment is configured properly, hit the run cell button (▶) on the cell below:

In [None]:
import os
print(os.environ['CONDA_DEFAULT_ENV'])

Does it output the text `aim`?

If it does not output the text `aim`, please revisit and follow the instructions in [Setup-and-test-conda-environment.ipynb](Setup-and-test-conda-environment.ipynb).

If you still cannot get it working, please raise this with the course instructor. 

Now you can import the libraries you need to train the classifier:

In [27]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import LabelEncoder, MinMaxScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader
import torch.nn.functional as F

##### Hyperparameters

Now let's define our hyperparameters:

In [28]:
device = 'cpu'
num_epochs = 200
num_classes = 3
num_keypoints = 34
test_size = 0.3
batch_size = 128
learn_rate = 0.001
data_path = 'class-datasets/my-pose-classification-dataset/poses_keypoints.csv'

##### Read Dataset

Here we load the dataset, drop the `image_name` column, and display the first 5 rows.

In [None]:
df = pd.read_csv(data_path)
df = df.drop('image_name', axis=1)
df.head()

##### Count and plot our data per class

In the following two cells, we count and plot the number of samples per class.

In [None]:
df.label.value_counts()

In [None]:
df.label.value_counts().plot(kind="bar")
plt.xticks(rotation=45)
plt.show()

##### Define the 1st column as our labels `y` and the following 34 columns as our keypoints input dataset `X`

In [None]:
# Use the label encoder to turn each label into an index number
encoder = LabelEncoder()
y_label = df['label']
y = encoder.fit_transform(y_label)
y

In [None]:
# Get the keypoint columns (everything after the label column)
X = df.iloc[:, 1:]  # use 11: to skip the face keypoints (then set num_keypoints = 24)
X

##### Train Test Split

Perform a train-test split with `test_size=0.3` (defined in our hyperparameters). Setting `random_state` makes the random split reproducible, and `stratify=y` keeps the class proportions the same in the training and test sets.

Stratified sampling is a method of sampling that involves dividing a population into homogeneous subgroups known as strata, and then sampling from each stratum.
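
As a small illustration of what `stratify` does, the sketch below splits a toy label array and counts the classes on each side. The names `toy_X` and `toy_y` and the 60/30/10 class sizes are made up for this example and are not your pose dataset. The class ratio should be preserved in both the training and the test split.

```python
from collections import Counter

import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (hypothetical): 100 samples, imbalanced across three classes
toy_X = np.arange(100).reshape(-1, 1)
toy_y = np.array(["a"] * 60 + ["b"] * 30 + ["c"] * 10)

# Same arguments as the split used in this notebook, applied to the toy data
_, _, toy_y_train, toy_y_test = train_test_split(
    toy_X, toy_y, test_size=0.3, random_state=42, stratify=toy_y
)

# Count how many samples of each class ended up in each split
train_counts = Counter(toy_y_train.tolist())
test_counts = Counter(toy_y_test.tolist())
print("train:", dict(sorted(train_counts.items())))
print("test: ", dict(sorted(test_counts.items())))
```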

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=test_size, random_state=42, stratify=y)

print("Number of Training keypoints: ", len(X_train))
print("Number of Testing keypoints: ", len(X_test))

In [None]:
# A glimpse into the test data in a table format
X_test

##### MinMax scaling to scale each feature into a given range

For more information, look [here](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html).
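
With its default settings, `MinMaxScaler` maps each feature column `x` to `(x - min) / (max - min)`, where `min` and `max` come from the data the scaler was fitted on. The sketch below reproduces that formula by hand with NumPy on made-up numbers (`toy_train` and `toy_test` are hypothetical). It also shows why a test value outside the training range can land outside `[0, 1]`.

```python
import numpy as np

# Made-up values for illustration: 3 training rows, 2 features
toy_train = np.array([[0.0, 10.0],
                      [5.0, 20.0],
                      [10.0, 30.0]])
toy_test = np.array([[2.5, 40.0]])

# "fit": learn the per-column min and max from the training data only
col_min = toy_train.min(axis=0)
col_max = toy_train.max(axis=0)

# "transform": apply the same formula to any data
scaled_train = (toy_train - col_min) / (col_max - col_min)
scaled_test = (toy_test - col_min) / (col_max - col_min)

print(scaled_train)
print(scaled_test)  # the second feature exceeds 1 because 40 > training max of 30
```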

In [None]:
# Fit the scaler on the training data only, apply it to both sets,
# then take a glimpse at the scaled test data (now a NumPy array)
scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
X_test

##### Data Loader

The data are currently NumPy arrays and need to be converted into torch tensors before they can be passed to the dataloaders.
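
To see the conversion and batching in isolation, here is a minimal sketch on random made-up data. It uses PyTorch's built-in `TensorDataset` in place of the custom dataset class defined in this notebook, and the sizes (10 samples, batch size 4) are chosen only for illustration.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

# Made-up data: 10 samples with 34 keypoint values each, 3 classes
toy_X = np.random.rand(10, 34)
toy_y = np.random.randint(0, 3, size=10)

# NumPy arrays -> tensors: float32 for features, int64 for class indices
features = torch.from_numpy(toy_X.astype(np.float32))
targets = torch.from_numpy(toy_y.astype(np.int64))

toy_loader = DataLoader(TensorDataset(features, targets), batch_size=4, shuffle=False)

# Each iteration yields one batch; the last batch holds the leftover samples
batch_shapes = [(tuple(xb.shape), tuple(yb.shape)) for xb, yb in toy_loader]
print(batch_shapes)
```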

In [37]:
class DataKeypointClassification(Dataset):
    def __init__(self, X, y):
        self.x = torch.from_numpy(X.astype(np.float32))
        self.y = torch.from_numpy(y.astype(np.int64))
        self.n_samples = X.shape[0]
    
    def __getitem__(self, index):
        return self.x[index], self.y[index]
    
    def __len__(self):
        return self.n_samples

In [38]:
train_dataset = DataKeypointClassification(X_train, y_train)
test_dataset = DataKeypointClassification(X_test, y_test)

In [39]:
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=batch_size,  shuffle=False)

##### Define our simple feed forward network

In [40]:
class PoseClassificationMLP(nn.Module):
    def __init__(self):
        super(PoseClassificationMLP, self).__init__()
        self.fc1 = nn.Linear(num_keypoints, 256)
        self.fc2 = nn.Linear(256, num_classes)

    def forward(self, x):
        x = self.fc1(x)
        x = F.relu(x)
        # Return raw scores (logits): nn.CrossEntropyLoss applies log-softmax itself,
        # and torch.max picks the same class with or without a softmax
        x = self.fc2(x)
        return x

##### Setup core objects

Here we set up our core objects: the model, the loss function and the optimiser.
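
A note on the loss function: `nn.CrossEntropyLoss` applies a log-softmax internally and then the negative log-likelihood loss, so it is designed to receive raw, unnormalised scores (logits). The sketch below checks this equivalence on made-up numbers (`toy_logits` and `toy_targets` are hypothetical).

```python
import torch
from torch import nn
import torch.nn.functional as F

# Made-up logits for 2 samples and 3 classes, with their true class indices
toy_logits = torch.tensor([[2.0, 0.5, -1.0],
                           [0.1, 0.2, 3.0]])
toy_targets = torch.tensor([0, 2])

# Cross entropy computed directly on the logits
loss_ce = nn.CrossEntropyLoss()(toy_logits, toy_targets)

# The same value computed in two explicit steps
loss_manual = F.nll_loss(F.log_softmax(toy_logits, dim=-1), toy_targets)

print(loss_ce.item(), loss_manual.item())
```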

In [41]:
model = PoseClassificationMLP()
model.to(device)

# Cross entropy loss for training classification
criterion = nn.CrossEntropyLoss()

# Adam optimiser
optimizer = torch.optim.Adam(model.parameters(), lr=learn_rate)

##### Training loop

Here is our training loop for our data.

In [None]:
train_losses = []
best_loss = 100000
for epoch in range(num_epochs):
    train_loss = 0.0
    
    # Training loop
    for i, data in enumerate(train_loader, 0):
        # Get data
        inputs, labels = data
        inputs = inputs.to(device)
        labels = labels.to(device)
        
        # Process data
        outputs = model(inputs)
        
        # Calculate loss
        loss = criterion(outputs, labels)
        
        # Update model weights
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        train_loss += loss.item()
    
    train_loss = train_loss / len(train_loader)
    
    # Store the average loss for this epoch for later display
    train_losses.append(train_loss)
    
    print(f'Epoch {epoch + 1}, train loss: {train_loss:.3f}')

##### Plot training loss

In [None]:
plt.figure(figsize=(10,5))
plt.title("Train loss")
plt.plot(train_losses,label="train")
plt.xlabel("epochs")
plt.ylabel("mean loss per batch")
plt.legend()
plt.show()

##### Test our model

Here we use the model to predict the label on unseen data, our test data.

The predictions are the predicted classes (in the encoded format 0-1-2) for each item in the test dataset.
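
As a minimal sketch of how class predictions and accuracy are obtained from model outputs, the example below uses made-up scores for four samples (`toy_outputs` and `toy_truth` are hypothetical, not taken from your test set).

```python
import torch

# Made-up model outputs for 4 samples and 3 classes
toy_outputs = torch.tensor([[0.7, 0.2, 0.1],
                            [0.1, 0.1, 0.8],
                            [0.3, 0.6, 0.1],
                            [0.5, 0.4, 0.1]])
toy_truth = torch.tensor([0, 2, 1, 1])

# torch.max along dim 1 returns (largest value, index of largest value) per row;
# the index is the predicted class
_, toy_predictions = torch.max(toy_outputs, 1)

# Accuracy = fraction of predictions that match the ground truth
accuracy = (toy_predictions == toy_truth).float().mean().item()

print(toy_predictions.tolist(), accuracy)
```

The same accuracy calculation could be applied to `predictions` and `test_labels` once they are both tensors or both arrays.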

In [None]:
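test_features = torch.from_numpy(X_test.astype(np.float32)).to(device)
test_labels = y_test
model.eval()  # switch the model to inference mode
with torch.no_grad():
    outputs = model(test_features)
    # torch.max returns (max values, indices); the indices are the predicted classes
    _, predictions = torch.max(outputs, 1)
predictions = predictions.cpu()  # back on the CPU for scikit-learn
predictions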

##### Confusion Matrix

A confusion matrix is a really good way to visualise the number of true positives, false negatives, false positives, and true negatives.

The header row corresponds to the predicted labels while the first column corresponds to the ground truth.
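
To make the layout concrete, here is a small sketch on made-up encoded labels (`toy_truth` and `toy_pred` are hypothetical). Each row is a ground-truth class and each column a predicted class, so the diagonal holds the correct predictions.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Made-up encoded labels for 6 test samples
toy_truth = [0, 0, 1, 1, 2, 2]
toy_pred = [0, 1, 1, 1, 2, 0]

toy_cm = confusion_matrix(toy_truth, toy_pred)
print(toy_cm)

# Diagonal = correct predictions; row sums = number of true samples per class
per_class_recall = np.diag(toy_cm) / toy_cm.sum(axis=1)
print(per_class_recall)
```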

In [None]:
cm = confusion_matrix(test_labels, predictions)
df_cm = pd.DataFrame(
    cm, 
    index = encoder.classes_,
    columns = encoder.classes_
)
df_cm

##### Visualising the confusion matrix with a seaborn heatmap

In [None]:
def show_confusion_matrix(cm_df):
    # Heatmap of the confusion matrix: rows are ground truth, columns are predictions
    sns.heatmap(cm_df, annot=True, fmt="d", cmap="Blues")
    plt.ylabel("Ground truth pose")
    plt.xlabel("Predicted pose")
    plt.show()

show_confusion_matrix(df_cm)

##### Save model


In [47]:
PATH_SAVE = 'pose_classifier.pt'
torch.save(model.state_dict(), PATH_SAVE)

##### Load Inference Model

In [None]:
model_inference = PoseClassificationMLP()
model_inference.load_state_dict(torch.load(PATH_SAVE, map_location=device))
model_inference.eval()  # put the model into inference mode

In [None]:
feature, label = test_dataset[12]  # test out different item numbers

with torch.no_grad():
    out = model_inference(feature)
_, predict = torch.max(out, -1)
print(f'prediction label : {encoder.classes_[predict.item()]}')
print(f'ground truth label : {encoder.classes_[label.item()]}')
print(encoder.classes_)

### Training tasks:

**Task A:** Run all the cells in this code to train your own pose classifier.

> There are some bonus tasks here if you want to further develop skills in training models. Feel free to come back to these after completing the tasks for building the interactive application.
> 
> **Bonus task A:** Use scikit-learn's [train-test split function](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html) to split this dataset into a train and test/validation dataset. Every 10 epochs, assess the performance of the model against the test set (like in [week-2b](Week-2b-build-and-train-dog-rating-network.ipynb)) and use early stopping to save and evaluate the best-performing model.
> 
> **Bonus task B:** Test out the training with different hyper-parameters, or make modifications to the network architecture to try and improve classification accuracy on the test set.
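
If early stopping is new to you, the sketch below shows one common form of the logic in plain Python, without any training. The validation losses and the `patience` value are made up for illustration. In real training, each loss would come from evaluating the model on held-out data, and you would save the model weights whenever a new best loss is found.

```python
# Hypothetical validation losses, one per evaluation round (made up for illustration)
val_losses = [0.90, 0.70, 0.65, 0.66, 0.64, 0.67, 0.68, 0.69, 0.60]

patience = 3  # evaluations to wait without improvement before stopping
best_loss = float("inf")
best_round = None
rounds_without_improvement = 0

for round_idx, val_loss in enumerate(val_losses):
    if val_loss < best_loss:
        # New best: in real training you would save the weights here,
        # e.g. with torch.save(model.state_dict(), ...)
        best_loss = val_loss
        best_round = round_idx
        rounds_without_improvement = 0
    else:
        rounds_without_improvement += 1
        if rounds_without_improvement >= patience:
            print(f"Stopping early at evaluation {round_idx}")
            break

print(f"Best validation loss {best_loss} at evaluation {best_round}")
```

Note the trade-off in choosing `patience`: a small value stops sooner but can miss a later improvement, as happens with the final value in this toy list.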

### Interactive app tasks 

Complete the tasks below in [week-4d-pose-classification-dorothy-app.py](week-4d-pose-classification-dorothy-app.py).

**Task 1:** Copy and paste [the model definition](#define-our-simple-feed-forward-network) from this notebook into [week-4d-pose-classification-dorothy-app.py](week-4d-pose-classification-dorothy-app.py). Then instantiate a copy of that class, load the weights of the model saved into `pose_classifier.pt` using [the torch load functionality](https://pytorch.org/tutorials/beginner/saving_loading_models.html#saving-loading-model-for-inference) and then put the model into eval mode.

**Task 2:** Run your pose classifier model in inference in the draw loop (in the designated part of the code) on the variable `keypoint_data`. Then use the `torch.max` function to get the predicted pose of the user.

**Task 3:** Use the predicted pose to alter the behaviour or trigger an action in the dorothy sketch. Each pose should have a different effect. For instance you could: 
- Draw text or geometric objects
- Manipulate animation of graphics on the screen
- Trigger the playback of audio
- Create a game where the user has to match a pose on command, there could be a time limit or points could be scored for how quickly the user responds to the instructed pose from the game.

If you did STEM for Creatives last term, can you adapt one of the sketches you created and use this pose classifier as a controller to interact with that sketch in some way?

For inspiration, check out the [gallery of examples given in the dorothy-cci library](https://github.com/Louismac/dorothy/tree/main/examples).