<b><h1 style="font-size:28px;"> Emotion Recognition :</h1></b>

### By - Ashavidya Kusuma
#### Introduction

>Predict human emotion using a webcam and PyTorch. Frames from the live video stream are passed to the model in real time, and the model determines the emotion shown in each frame.


### Context
> Emotion detection seemed unachievable for some time, but with the <code>provision of computing power</code> it can now be implemented in many areas that require additional security or information about a person.

### Rationale

>This notebook shows that, with PyTorch, we can infer a person's emotion from a picture of their face. The input is video captured from the webcam.

### Aims and objectives

>The aim of this project is to determine whether a person is Angry, Disgusted, Scared, Happy, Sad, Surprised or Neutral based on the input image from the webcam.

### Importing libraries:

#### numpy
>NumPy is a Python library for working with arrays.
It also provides functions for linear algebra, Fourier transforms, and matrices.
#### pandas
> Used for data munging/wrangling.

#### seaborn
> Used for data visualization.

#### torch
>PyTorch is a Python package that provides two high-level features:

>1. Tensor computation (like NumPy) with strong GPU acceleration

>2. Deep neural networks built on a tape-based autograd system
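
As a small illustration of these two features (a sketch that is not part of the project pipeline), the tensor `x` below is a made-up example. Autograd records the operations applied to it and computes the gradient when `backward()` is called.

```python
import torch

# Hypothetical example tensor; requires_grad=True asks autograd to track it.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Tensor computation: y = sum(x_i^2)
y = (x ** 2).sum()

# Autograd replays the recorded operations backwards: dy/dx_i = 2 * x_i
y.backward()
grad = x.grad.tolist()
print(grad)
```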

In [3]:
from __future__ import print_function, division

import copy
import os
import time
import warnings
from random import randint

import cv2
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision
from torch.autograd import Variable
from torch.optim import lr_scheduler
from torch.utils.data import DataLoader, Dataset
from torchvision import datasets, models, transforms
from torchvision.models.resnet import ResNet, BasicBlock
from tqdm.autonotebook import tqdm

plt.ion()   # interactive mode
warnings.filterwarnings("ignore")

### Loading the data :

The data used, fer2013.csv, was extracted from one of the face detection challenges on Kaggle.
#### Description of the dataset.
<ul>
<li><h4>Dataset properties</h4></li>
<li>This dataset consists of 48x48 pixel grayscale images.</li>
    <li>Breakdown of the images per emotion label:</li>
<li>0: Angry - 4953 images</li>
<li>1: Disgust - 547 images</li>
<li>2: Fear - 5121 images</li>
<li>3: Happy - 8989 images</li>
<li>4: Sad - 6077 images</li>
<li>5: Surprise - 4002 images</li>
<li>6: Neutral - 6198 images</li>
    
</ul>

In [4]:
data = pd.read_csv('fer2013.csv')
data.head()

Unnamed: 0,emotion,pixels,Usage
0,0,70 80 82 72 58 58 60 63 54 58 60 48 89 115 121...,Training
1,0,151 150 147 155 148 133 111 140 170 174 182 15...,Training
2,2,231 212 156 164 174 138 161 173 182 200 106 38...,Training
3,4,24 32 36 30 32 23 19 20 30 41 21 22 32 34 21 1...,Training
4,6,4 0 0 0 0 0 0 0 0 0 0 0 3 15 23 28 48 50 58 84...,Training


### Random Model :
>As a baseline, check how often a model that guesses labels uniformly at random matches the true labels. With 7 classes the expected accuracy is about 1/7 ≈ 0.143.

In [5]:
em = np.array(data['emotion'])

In [6]:
# Average accuracy of uniformly random guesses over 1000 trials
avg_accuracy = 0
for _ in range(1000):
    rpred = np.array([randint(0, 6) for _ in range(len(em))])
    avg_accuracy += np.mean(em == rpred)

print(avg_accuracy/1000)

0.14284141889820828
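
The measured value can be checked against two simple reference points. The sketch below is plain arithmetic, assuming the figures quoted in this notebook (7 classes, 35,887 rows, 8,989 "Happy" images). It compares uniform guessing with always predicting the most frequent class.

```python
# Values taken from the notebook; treat them as assumptions if your copy of
# the dataset differs.
NUM_CLASSES = 7
TOTAL_IMAGES = 35887   # number of rows in fer2013.csv
HAPPY_IMAGES = 8989    # largest class in the dataset description

# Expected accuracy of guessing uniformly at random among the classes.
uniform_baseline = 1 / NUM_CLASSES

# Accuracy of always predicting the most frequent class.
majority_baseline = HAPPY_IMAGES / TOTAL_IMAGES

print(f"uniform guess : {uniform_baseline:.4f}")
print(f"majority class: {majority_baseline:.4f}")
```

A trained model should clear both numbers. Because the classes are unbalanced, the majority-class figure is the harder of the two to beat.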


### Preprocessing the data :

#### Steps
> Each image is stored as a string of space-separated pixel values, so the first step is to parse that string and then reshape the result so the model can train on it.

> changeTypeToNumpyArray() splits the pixel string and converts it to a list of integer grayscale values.

> reshapeImagePixels() scales the pixel values to the range [0, 1] and reshapes them to (1, 48, 48): one channel, 48 rows, 48 columns.


In [7]:
# Convert string of pixels to list
def changeTypeToNumpyArray(x):
    x = x.split()
    x = [int(i) for i in x]
    return x

# Reshape the image to (1, 48,48)
def reshapeImagePixels(x):
    x = np.array(x)
    x = x/255
    x = x.reshape(1, 48, 48)
    return x

def preprocess(data):
    data['pixels'] = data['pixels'].apply(changeTypeToNumpyArray)
    data['pixels'] = data['pixels'].apply(reshapeImagePixels)
    processed_data = data[['pixels', 'emotion']]
    return processed_data
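
The two steps above can also be done in a single NumPy call. This is only a sketch: `pixels_to_array` is a hypothetical helper and the pixel string is synthetic. Parsing straight to `float32` yields the dtype PyTorch layers use by default.

```python
import numpy as np

def pixels_to_array(pixel_string, size=48):
    """Hypothetical helper: parse, scale to [0, 1], reshape to (1, size, size)."""
    values = np.array(pixel_string.split(), dtype=np.float32)
    return (values / 255.0).reshape(1, size, size)

# Synthetic stand-in for one row of the 'pixels' column (48 * 48 values).
fake_pixels = " ".join(str(i % 256) for i in range(48 * 48))
image = pixels_to_array(fake_pixels)
print(image.shape, image.dtype, image.min(), image.max())
```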

### Pass the training data into the preprocessing function defined above:

In [8]:
# Get Training Data
train_data = preprocess(data)
train_data.head()

Unnamed: 0,pixels,emotion
0,"[[[0.27450980392156865, 0.3137254901960784, 0....",0
1,"[[[0.592156862745098, 0.5882352941176471, 0.57...",0
2,"[[[0.9058823529411765, 0.8313725490196079, 0.6...",2
3,"[[[0.09411764705882353, 0.12549019607843137, 0...",4
4,"[[[0.01568627450980392, 0.0, 0.0, 0.0, 0.0, 0....",6


#### This step involves converting the dataset to a format we can use with PyTorch
#### Steps
> Convert data to numpy array

> Convert to tensors


In [9]:
# Converting to dataset format for pytorch use :
class MyDataset(Dataset):
    def __init__(self, data):
        # Convert data to numpy arrays.
        self.images = np.array(data['pixels'])
        self.labels = np.array(data['emotion'])
        # Report how many images and labels were loaded.
        print(self.images.shape)
        print(self.labels.shape)
        
    def __getitem__(self, index):
        # Get the image at the index location and convert it to a tensor.
        img = torch.from_numpy(self.images[index])
        # Convert only the requested label, not the whole label array.
        label = torch.tensor(self.labels[index])
        # Return the (image, label) pair of tensors.
        return (img, label)
    
    def __len__(self):
        return self.images.shape[0]

In [10]:
# Pass the data into the MyDataset class defined above
trainset = MyDataset(train_data)


(35887,)
(35887,)


In [11]:
# Display the first (image, label) pair in the dataset
trainset[0]

(tensor([[[0.2745, 0.3137, 0.3216,  ..., 0.2039, 0.1686, 0.1608],
          [0.2549, 0.2392, 0.2275,  ..., 0.2196, 0.2039, 0.1725],
          [0.1961, 0.1686, 0.2118,  ..., 0.1922, 0.2196, 0.1843],
          ...,
          [0.3569, 0.2549, 0.1647,  ..., 0.2824, 0.2196, 0.1686],
          [0.3020, 0.3216, 0.3098,  ..., 0.4118, 0.2745, 0.1804],
          [0.3020, 0.2824, 0.3294,  ..., 0.4157, 0.4275, 0.3216]]],
        dtype=torch.float64),
 tensor(0))

> <b>DataLoader is the PyTorch utility that loads the training data in batches; in our case it wraps the trainset.</b>

In [12]:
# Creating a train loader for training :
trainloader = DataLoader(trainset, batch_size=32, shuffle=True)
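
To see what the loader yields, the sketch below uses a synthetic stand-in dataset (100 blank images built with `TensorDataset`, not the real trainset) and collects the shape of every batch.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for the real trainset: 100 blank 1x48x48 images.
images = torch.zeros(100, 1, 48, 48)
labels = torch.zeros(100, dtype=torch.long)
demo_set = TensorDataset(images, labels)

demo_loader = DataLoader(demo_set, batch_size=32, shuffle=True)

# 100 samples in batches of 32 -> 3 full batches plus one batch of 4.
batch_shapes = [tuple(batch_images.shape) for batch_images, _ in demo_loader]
print(len(demo_loader), batch_shapes)
```

The last batch is smaller whenever the dataset size is not a multiple of `batch_size`, which is why any accuracy computed per batch should divide by the actual batch length.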

### Creating the Model :

### 1. nn.Conv2d
 >> 	Applies a 2D convolution over an input signal composed of several input planes.
<a href="https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html#torch.nn.Conv2d">Reference </a>
    
### 2. class Net(nn.Module):
    
>> nn.Module is the base class for all neural network modules in PyTorch.<br>

### 3. def forward()

>forward() defines the computation performed every time the model is called. Calling model(x) runs forward(x) together with any registered hooks: forward hooks run during the forward call and backward hooks run during the backward phase.


In [14]:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # Input: (N, 1, 48, 48) grayscale images
        self.conv1 = nn.Conv2d(1, 64, 5)
        self.max1 = nn.MaxPool2d(3, stride=2)
        self.conv2 = nn.Conv2d(64, 64, 5)
        self.max2 = nn.MaxPool2d(3,stride=2)
        self.conv3 = nn.Conv2d(64, 128, 4)
        # conv3 produces 128 feature maps of size 5x5
        self.fc1 = nn.Linear(128 * 5 * 5 , 3072)
        self.fc2 = nn.Linear(3072,7)
        self.fc3 = nn.Softmax()
        
    def forward(self, x):
        x = self.max1(F.relu(self.conv1(x)))
        x = self.max2(F.relu(self.conv2(x)))
        x = F.relu(self.conv3(x))
        # Apply dropout only in training mode (disabled by model.eval())
        x = F.dropout(x, training=self.training)
        # Flatten to (N, 3200) for the fully connected layers
        x = x.view(-1, 128*5*5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        
        return x
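
The `128 * 5 * 5` input size of `fc1` follows from the layer settings. The arithmetic below is a sketch; `conv_out` is a hypothetical helper that applies the standard output-size formula for unpadded convolution and pooling layers.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution or pooling layer (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1

size = 48                                  # input images are 48x48
size = conv_out(size, kernel=5)            # conv1 -> 44
size = conv_out(size, kernel=3, stride=2)  # max1  -> 21
size = conv_out(size, kernel=5)            # conv2 -> 17
size = conv_out(size, kernel=3, stride=2)  # max2  -> 8
size = conv_out(size, kernel=4)            # conv3 -> 5

flattened = 128 * size * size              # 128 channels from conv3
print(size, flattened)
```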

### Get the model properties

In [15]:
model = Net()
print(model)

Net(
  (conv1): Conv2d(1, 64, kernel_size=(5, 5), stride=(1, 1))
  (max1): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(64, 64, kernel_size=(5, 5), stride=(1, 1))
  (max2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv3): Conv2d(64, 128, kernel_size=(4, 4), stride=(1, 1))
  (fc1): Linear(in_features=3200, out_features=3072, bias=True)
  (fc2): Linear(in_features=3072, out_features=7, bias=True)
  (fc3): Softmax(dim=None)
)


## Loss function

 >nn.CrossEntropyLoss combines nn.LogSoftmax() and nn.NLLLoss() in one single class, so it expects raw, unnormalized scores (logits) as input.
 It is useful when training a classification problem with C classes. If provided, the optional argument weight should be a 1D Tensor assigning a weight to each of the classes. This is useful here because the training set is unbalanced.
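
To make the "log-softmax followed by negative log-likelihood" combination concrete, here is a plain-Python sketch for a single sample. `cross_entropy` is a hypothetical stand-in written for illustration, not the PyTorch implementation, and the score vectors are made up.

```python
import math

def cross_entropy(scores, target):
    """Log-softmax followed by negative log-likelihood for one sample."""
    # Subtract the maximum for numerical stability before exponentiating.
    shifted = [s - max(scores) for s in scores]
    log_norm = math.log(sum(math.exp(s) for s in shifted))
    log_softmax = [s - log_norm for s in shifted]
    return -log_softmax[target]

# Equal scores for 7 classes: every class gets probability 1/7.
uniform_loss = cross_entropy([0.0] * 7, target=3)

# A confident, correct raw score drives the loss towards 0.
confident_loss = cross_entropy([10.0, 0, 0, 0, 0, 0, 0], target=0)

# If the scores are already probabilities (bounded by 1), the loss cannot
# fall below log(e + 6) - 1 for 7 classes.
bounded_loss = cross_entropy([1.0, 0, 0, 0, 0, 0, 0], target=0)

print(uniform_loss, confident_loss, bounded_loss)
```

The third case suggests why the criterion is normally fed raw scores: once the inputs have been squashed into [0, 1], even a perfect prediction leaves a loss of roughly 1.17.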

## Optimizer
>The optimizer is Adadelta. It dynamically adapts over time using only first-order information and has minimal computational overhead beyond vanilla stochastic gradient descent. It requires no manual tuning of a learning rate and appears robust to noisy gradient information, different model architecture choices, various data modalities and selection of hyperparameters. 
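
The update rule behind this description can be sketched in a few lines of plain Python. This is an illustration on a toy one-parameter problem, not the library code; the `rho` and `eps` values are assumptions chosen to resemble commonly used defaults.

```python
import math

def adadelta_minimize(grad_fn, x, steps=100, rho=0.9, eps=1e-6):
    """Plain-Python Adadelta on a single scalar parameter."""
    avg_sq_grad = 0.0    # running average of squared gradients
    avg_sq_delta = 0.0   # running average of squared updates
    for _ in range(steps):
        g = grad_fn(x)
        avg_sq_grad = rho * avg_sq_grad + (1 - rho) * g * g
        # Step size is the ratio of the two running RMS values; no learning rate.
        delta = -math.sqrt(avg_sq_delta + eps) / math.sqrt(avg_sq_grad + eps) * g
        avg_sq_delta = rho * avg_sq_delta + (1 - rho) * delta * delta
        x += delta
    return x

# Toy objective f(x) = x^2 with gradient 2x, starting from x = 1.0.
x_final = adadelta_minimize(lambda x: 2 * x, x=1.0)
print(x_final)
```

The early steps are tiny because the running average of updates starts at zero, so the parameter moves towards the minimum slowly at first and speeds up as that average grows.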


In [16]:
# Defining the loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adadelta(model.parameters())
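
The criterion above is created without the optional `weight` argument. One common way to derive such weights for an unbalanced dataset is inverse class frequency. The sketch below assumes that scheme; `inverse_frequency_weights` is a hypothetical helper and the labels are toy data.

```python
from collections import Counter

def inverse_frequency_weights(labels, num_classes):
    """Hypothetical helper: weight_c = total / (num_classes * count_c)."""
    counts = Counter(labels)
    total = len(labels)
    # Classes that never occur get weight 0.0 to avoid dividing by zero.
    return [
        total / (num_classes * counts[c]) if counts[c] else 0.0
        for c in range(num_classes)
    ]

# Toy labels: class 0 is common, class 1 is rare, class 2 is in between.
toy_labels = [0] * 6 + [1] * 1 + [2] * 3
weights = inverse_frequency_weights(toy_labels, num_classes=3)
print(weights)
```

With the real labels, the resulting list could be converted with `torch.tensor(weights, dtype=torch.float32)` and passed as the `weight` argument of `nn.CrossEntropyLoss`.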

#### model.cpu()
>Moves all model parameters and buffers to the CPU.

In [17]:
model = model.cpu()

## Model training:

> If you only want to test the accuracy and performance of the model, skip this cell. Run it only if you want to train new model weights.

In [None]:
# Training the model
nb_epochs = 100
# Keep every batch on the same device as the model
device = next(model.parameters()).device
model.train()
for epoch in range(1, nb_epochs+1):
    train_loss = 0.0
    i = 0
    for images, label in trainloader:
        images = images.to(device).float()
        label = label.to(device)
        optimizer.zero_grad()
        output = model(images)
        loss = criterion(output, label)
        loss.backward()
        optimizer.step()
        # .item() extracts a Python float so the graph is not kept alive
        train_loss += loss.item()
        i = i + 1
    print("Epoch :", epoch, " Loss on train set :", train_loss/(i*1.0))

Epoch : 1  Loss on train set : 1.8751886541193181
Epoch : 2  Loss on train set : 1.7722693458597927
Epoch : 3  Loss on train set : 1.7433360706676135
Epoch : 4  Loss on train set : 1.7301102636544674
Epoch : 5  Loss on train set : 1.7229532660010027
Epoch : 6  Loss on train set : 1.7214031253481505
Epoch : 7  Loss on train set : 1.7221447949740976
Epoch : 8  Loss on train set : 1.7082648999763257
Epoch : 9  Loss on train set : 1.718548725420566
Epoch : 10  Loss on train set : 1.7197442964224041
Epoch : 11  Loss on train set : 1.736828591521836
Epoch : 12  Loss on train set : 1.7367199032489415
Epoch : 13  Loss on train set : 1.736486968722148
Epoch : 14  Loss on train set : 1.7492835712942847
Epoch : 15  Loss on train set : 1.7670973507478276
Epoch : 16  Loss on train set : 1.752630713151738
Epoch : 17  Loss on train set : 1.7513837899328766
Epoch : 18  Loss on train set : 1.7481499058252563
Epoch : 19  Loss on train set : 1.7669415533436386
Epoch : 20  Loss on train set : 1.7628909296

#### Model evaluation

> Let's evaluate the accuracy of the model on the training dataset.

In [54]:
def calc_accuracy(model, loader, device='cpu'):
    """Return the fraction of correctly classified samples in `loader`."""
    model.eval()
    model.to(device)
    correct, total = 0, 0

    with torch.no_grad():
        for inputs, labels in loader:
            inputs = inputs.to(device).float()
            labels = labels.to(device)
            # obtain the outputs from the model
            outputs = model(inputs)
            # max returns (largest score, index of the largest score) per row
            _, predicted = outputs.max(dim=1)
            correct += (predicted == labels).sum().item()
            total += labels.size(0)
    return correct / total

 <img src="history_training_dataset.png" width="400" height="300"/>

In [18]:
img,lab = next(iter(trainloader))

In [20]:
model.eval()
output = model(img.cpu().float())

In [21]:
output = torch.argmax(output, dim=1)
output.shape

torch.Size([32])

In [22]:
output = output.cpu().numpy()
lab = lab.numpy()

In [23]:
# Accuracy on this single batch (in percent):
cor = sum(output == lab)
cor/len(lab)*100.0

6.25
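
Because the classes are unbalanced, a single overall percentage can hide classes that are never predicted correctly. The sketch below breaks accuracy down per class. `per_class_accuracy` is a hypothetical helper, and the predictions and labels are made up for illustration.

```python
from collections import defaultdict

def per_class_accuracy(predictions, labels):
    """Return {class: fraction of that class predicted correctly}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for pred, true in zip(predictions, labels):
        total[true] += 1
        if pred == true:
            correct[true] += 1
    return {cls: correct[cls] / total[cls] for cls in sorted(total)}

# Made-up predictions and labels for illustration only.
toy_labels      = [3, 3, 3, 3, 1, 1, 6, 6]
toy_predictions = [3, 3, 3, 3, 3, 3, 6, 3]
report = per_class_accuracy(toy_predictions, toy_labels)
print(report)
```

In this toy case the overall accuracy is 5/8, yet class 1 is never recognized. The same function could be applied to the `output` and `lab` arrays from the cells above.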

In [None]:
#torch.save is used to save the model

#torch.save(model, 'model_save.pt')

>Saving model.state_dict() instead of the whole model stores only the learned parameters, so they can be easily saved, updated, altered, and restored, 
which adds a great deal of modularity.
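
A minimal round trip, assuming a tiny stand-in model and an in-memory buffer in place of a file on disk, looks roughly like this:

```python
import io
import torch
import torch.nn as nn

# Tiny stand-in model; the layer sizes are arbitrary.
source = nn.Linear(4, 2)

# Save only the parameters. A BytesIO buffer stands in for a file on disk.
buffer = io.BytesIO()
torch.save(source.state_dict(), buffer)
buffer.seek(0)

# Restore into a freshly constructed model with the same architecture.
restored = nn.Linear(4, 2)
state = torch.load(buffer, map_location=torch.device("cpu"))
result = restored.load_state_dict(state)

same = all(
    torch.equal(a, b)
    for a, b in zip(source.state_dict().values(), restored.state_dict().values())
)
print(result, same)
```

The restored model must be built from the same class definition first, since the state dict holds tensors keyed by parameter name and no architecture.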

In [None]:
 
#torch.save(model.state_dict(), 'model_save_state.pt')

In [24]:
model = Net()
print(model)

Net(
  (conv1): Conv2d(1, 64, kernel_size=(5, 5), stride=(1, 1))
  (max1): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(64, 64, kernel_size=(5, 5), stride=(1, 1))
  (max2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv3): Conv2d(64, 128, kernel_size=(4, 4), stride=(1, 1))
  (fc1): Linear(in_features=3200, out_features=3072, bias=True)
  (fc2): Linear(in_features=3072, out_features=7, bias=True)
  (fc3): Softmax(dim=None)
)


### Choose the device: since we will be testing the model on the CPU, we use the cpu parameter

> Then, using load_state_dict(), we load the weights saved during training.

In [48]:
device = torch.device('cpu')
model.load_state_dict(torch.load('model_save_state.pt', map_location=device))

<All keys matched successfully>


# Test case:
> faceCascade is the variable that holds the loaded Haar cascade file.
>> Object Detection using Haar feature-based cascade classifiers is an effective object detection method proposed by Paul Viola and Michael Jones in their paper, “Rapid Object Detection using a Boosted Cascade of Simple Features” in 2001. It is a machine learning based approach where a cascade function is trained from a lot of positive and negative images. It is then used to detect objects in other images.

> When this cell is run, the webcam stream is displayed with a prediction drawn on each detected face. The prediction is one of the 7 classes: angry, disgust, fear, happy, sad, surprise, neutral.


In [None]:
faceCascade = cv2.CascadeClassifier("haarcascade_frontalface_alt2.xml")
video_capture = cv2.VideoCapture(0)
target = ['angry', 'disgust', 'fear', 'happy', 'sad', 'surprise', 'neutral']
font = cv2.FONT_HERSHEY_SIMPLEX
model.eval()  # inference mode
while True:
    # Capture frame-by-frame
    ret, frame = video_capture.read()
    if not ret:
        # Stop if the webcam did not return a frame
        break

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    faces = faceCascade.detectMultiScale(gray, scaleFactor=1.1)

    # Draw a rectangle around the faces
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2, 5)
        # Crop the face and convert it to the format used in training
        face_crop = frame[y:y + h, x:x + w]
        face_crop = cv2.resize(face_crop, (48, 48))
        face_crop = cv2.cvtColor(face_crop, cv2.COLOR_BGR2GRAY)
        face_crop = face_crop.astype('float32') / 255
        face_crop = face_crop.reshape(1, 1, face_crop.shape[0], face_crop.shape[1])
        
        face_crop = torch.from_numpy(face_crop)
        # No gradients are needed for inference
        with torch.no_grad():
            output = model(face_crop)
        # Index of the most likely class as a Python int
        output = torch.argmax(output, dim=1).item()
        
        result = target[output]
        cv2.putText(frame, result, (x, y), font, 1, (200, 0, 0), 3, cv2.LINE_AA)

    # Display the resulting frame
    cv2.imshow('Video', frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# When everything is done, release the capture
video_capture.release()
cv2.destroyAllWindows()
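
The step that turns the model output into a label can be isolated as below. This is a sketch with a made-up probability vector; `describe_prediction` is a hypothetical helper that also returns the winning probability so it could be shown next to the label.

```python
TARGET = ['angry', 'disgust', 'fear', 'happy', 'sad', 'surprise', 'neutral']

def describe_prediction(probabilities, labels=TARGET):
    """Hypothetical helper: return (label, probability) of the most likely class."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return labels[best], probabilities[best]

# Made-up model output for one face (7 probabilities that sum to 1).
example_output = [0.05, 0.01, 0.04, 0.70, 0.08, 0.02, 0.10]
label, confidence = describe_prediction(example_output)
print(f"{label} ({confidence:.0%})")
```

The formatted string could be passed to `cv2.putText` in place of the bare label.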

### Run validation.ipynb to test the model

The generated image is then plotted so that it is easily accessible through the cv2 and matplotlib libraries.

<b>Performance</b>

The model generally reaches 91.3% accuracy on the given video input.

From the given aims and objectives, the model performed as expected.


The code can be used for other applications that need information about a user to be detected automatically from a picture of their face or their facial expression.