# Advanced Certification in AIML
## A Program by IIIT-H and TalentSprint


Automated facial expression recognition provides an objective assessment of emotions. Human-based assessment of emotions has many limitations and biases, and automated facial expression technology has been found to deliver a better level of insight into behavior patterns. Emotion detection from facial expressions using AI is useful for automatically measuring consumers' engagement with content and brands, audience engagement with advertisements, customer satisfaction in the retail sector, psychological analysis, law enforcement, etc.

In [None]:
#@title Explanation Video
from IPython.display import HTML

HTML("""<video width="500" height="300" controls>
  <source src="https://cdn.iiith.talentsprint.com/aiml/Experiment_related_data/Hackathon3b_expression_recognition.mp4" type="video/mp4">
</video>
""")

**Objectives:** 

**Stage 4 (15 Marks):** Train a CNN Model and perform Expression Recognition in the EFR Mobile App.

**Stage 5 (5 Marks):** Test for Anti-Face Spoofing on the EFR Mobile App.

## **Stage 4 (15 Marks)**

**(i) Train a CNN Model for Expression Recognition on given Expression data  
(ii) Deploy the Model and Perform Expression Recognition on Team Data through the EFR Mobile App**


---


* Define and train a CNN for expression recognition on the data in the folder "Expression_data", which is organised by expression.
* Collect your team data using the EFR application, test your model on it, and optimize the CNN architecture so that it predicts the correct labels for the images.
* Save and download the trained expression model and upload it to the FTP server (refer to the [Filezilla Installation and Configuration document](https://drive.google.com/file/d/1hnMXcwpCwAz94ljAhtsJdQSfe-JaOD5K/view?usp=sharing)).

* Update the **“exp_recognition.py”** file in the server. Open the files in the terminal (Command prompt) and provide the code for predicting the expression on the face (Note: To define the architecture of your trained model, you'll need to define it in the file **"exp_recognition_model.py"**). 

* Test your model on the mobile app for Expression Recognition and Sequence Expression. Your team can also see your results in your terminal.


* Grading Scheme:
> * Expression Recognition (12M): If the functionality is returning expression class correctly for the face using the mobile app’s “Expression Recognition” functionality
> * Sequence Expression (3M): Get three consecutive correct Expressions using the mobile app’s “Sequence Expressions” functionality
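
The Sequence Expression criterion requires three correct predictions in a row. The rule can be written as a short check; this is only an illustration of the criterion, with `has_consecutive_correct` as a hypothetical helper name, and it does not describe how the mobile app is implemented.

```python
def has_consecutive_correct(expected, predicted, run_length=3):
    """True if `predicted` matches `expected` at `run_length` consecutive positions."""
    streak = 0
    for want, got in zip(expected, predicted):
        # Extend the streak on a match, reset it on a miss
        streak = streak + 1 if want == got else 0
        if streak >= run_length:
            return True
    return False

demanded = ["ANGER", "FEAR", "SURPRISE"]
print(has_consecutive_correct(demanded, ["ANGER", "FEAR", "SURPRISE"]))     # True
print(has_consecutive_correct(demanded, ["ANGER", "NEUTRAL", "SURPRISE"]))  # False
```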

**Download the dataset**

In [None]:
#@title Run this cell to download the dataset

from IPython import get_ipython
ipython = get_ipython()
  
notebook="M3_Hackathon" #name of the notebook

def setup():
    # ipython.magic("sx pip3 install torch")
    ipython.magic("sx wget https://cdn.talentsprint.com/aiml/Experiment_related_data/Expression_data.zip")
    
    ipython.magic("sx unzip Expression_data.zip")
    
    ipython.magic("sx pip install torch==1.0.1 -f https://download.pytorch.org/whl/cu100/stable")
    ipython.magic("sx pip install torchvision==0.2.1")
    ipython.magic("sx pip install opencv-python")
    print ("Setup completed successfully")
    return
setup()

Setup completed successfully


**Dataset attributes:**

During the setup you have downloaded the Expression data:

* **Expression_data**: In this folder, the images are segregated by expression
> * Expressions available: ANGER, DISGUST, FEAR, HAPPINESS, NEUTRAL, SADNESS, SURPRISE
> * Each class is organised as one folder
> * There are ~18000 total images in the training data and ~4500 total images in the testing data
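
As a quick sanity check on these numbers, the per-class image counts can be tallied from the folder layout. The sketch below is a minimal example, assuming the class folders sit directly under a split directory such as `Expression_data/Facial_expression_train`; the helper name `count_images_per_class` and the list of file extensions are assumptions made for illustration.

```python
import os

def count_images_per_class(split_dir, extensions=(".jpg", ".jpeg", ".png")):
    """Return {class_name: number_of_image_files} for one dataset split."""
    counts = {}
    for class_name in sorted(os.listdir(split_dir)):
        class_dir = os.path.join(split_dir, class_name)
        if not os.path.isdir(class_dir):
            continue  # skip stray files that are not class folders
        counts[class_name] = sum(
            1 for name in os.listdir(class_dir)
            if name.lower().endswith(extensions)
        )
    return counts

if __name__ == "__main__":
    split_dir = "Expression_data/Facial_expression_train"  # assumed location
    if os.path.isdir(split_dir):
        for class_name, n in count_images_per_class(split_dir).items():
            print(class_name, n)
```

Strongly unequal counts would suggest checking accuracy per class rather than relying on overall accuracy alone.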

In [None]:
%ls

[0m[01;34mExpression_data[0m/  Expression_data.zip  [01;34m__MACOSX[0m/  [01;34msample_data[0m/


**Imports: All the imports are defined here**

We are installing the following specific package versions -> torch 1.0.1, torchvision 0.2.1 and PIL 5.3.0 to maintain compatibility with the server.

* First, uninstall the current PIL (Pillow) version and install the older one. When you run the install cell, a "Restart Runtime" button appears below its output.
* Click on it and select 'Yes' to restart the runtime and reset the PIL package.
* **DO NOT** go to the notebook's **RUNTIME  -> RESTART RUNTIME**. This will restart all packages and you will need to repeat all the steps from the beginning.


* Simply continue with the next code cell

Pillow, the maintained fork of PIL (the Python Imaging Library), is used to crop and resize images and to do simple image manipulation.


In [None]:
!pip uninstall -y Pillow

Uninstalling Pillow-7.0.0:
  Successfully uninstalled Pillow-7.0.0


In [None]:
# IGNORE ERROR. Click on the Restart Runtime button and select 'Yes' if prompted. Then proceed with the next code cell.
!pip install Pillow==5.3.0

Collecting Pillow==5.3.0
  Downloading https://files.pythonhosted.org/packages/62/94/5430ebaa83f91cc7a9f687ff5238e26164a779cca2ef9903232268b0a318/Pillow-5.3.0-cp36-cp36m-manylinux1_x86_64.whl (2.0MB)
     |████████████████████████████████| 2.0MB 16.5MB/s 
ERROR: albumentations 0.1.12 has requirement imgaug<0.2.7,>=0.2.5, but you'll have imgaug 0.2.9 which is incompatible.
Installing collected packages: Pillow
Successfully installed Pillow-5.3.0


In [None]:
# When you run this, it should give you pil version = 5.3.0
import PIL
print(PIL.__version__)

5.3.0


In [None]:
import torchvision
import torchvision.datasets as dset
import torchvision.transforms as transforms
from torch.utils.data import DataLoader,Dataset
import matplotlib.pyplot as plt
import torchvision.utils
import numpy as np
import random
from PIL import Image
import torch
from torch.autograd import Variable
import PIL.ImageOps    
import torch.nn as nn
from torch import optim
import torch.nn.functional as F
import os
import warnings
from time import sleep
import sys
warnings.filterwarnings('ignore')

For the following step, to obtain hints on building a CNN model for face expression, you may refer to this [article](https://drive.google.com/open?id=1P2rpaWW3tOtGGnw4dvtdZ4hjoc8iDNst)

**Define and train a CNN model for expression recognition**

In [None]:
# YOUR CODE HERE to define and train CNN model for Expression_Data.
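def rgb2gray(image):
    # Convert a PIL image to single-channel grayscale ('L' mode)
    return image.convert('L')

# Preprocessing: grayscale -> resize to 48x48 -> tensor with values in [0, 1]
loader = torchvision.transforms.Compose([rgb2gray, torchvision.transforms.Resize((48,48)), torchvision.transforms.ToTensor()])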



In [None]:
# YOUR CODE HERE for the DataLoader
base_dir = '/content/Expression_data'

train_dir = os.path.join(base_dir, 'Facial_expression_train')
validation_dir = os.path.join(base_dir, 'Facial_expression_test')


trainset = dset.ImageFolder(train_dir, transform = loader)
validationset = dset.ImageFolder(validation_dir, transform = loader)




# Load the data. DataLoader wraps a dataset and yields shuffled batches (one image per batch here)
train_loader = torch.utils.data.DataLoader(trainset, shuffle=True, batch_size=1)
validation_loader = torch.utils.data.DataLoader(validationset, shuffle=True, batch_size=1)



In [None]:
len(trainset)

18178

In [None]:
trainset.class_to_idx

{'ANGER': 0,
 'DISGUST': 1,
 'FEAR': 2,
 'HAPPINESS': 3,
 'NEUTRAL': 4,
 'SADNESS': 5,
 'SURPRISE': 6}
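
The model predicts a class index, so showing an expression name (for example when writing the prediction code for `exp_recognition.py`) needs the inverse of this mapping. A minimal sketch, with the dictionary copied from the output above; `idx_to_class` and `label_for` are illustrative names, not part of the provided files.

```python
# Mapping copied from trainset.class_to_idx above
class_to_idx = {'ANGER': 0, 'DISGUST': 1, 'FEAR': 2, 'HAPPINESS': 3,
                'NEUTRAL': 4, 'SADNESS': 5, 'SURPRISE': 6}

# Invert it so a predicted index can be turned back into a label
idx_to_class = {idx: name for name, idx in class_to_idx.items()}

def label_for(predicted_index):
    """Return the expression name for a predicted class index."""
    return idx_to_class[int(predicted_index)]

print(label_for(4))  # NEUTRAL
```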

In [None]:
train_loader.dataset

Dataset ImageFolder
    Number of datapoints: 18178
    Root Location: /content/Expression_data/Facial_expression_train
    Transforms (if any): Compose(
                             <function rgb2gray at 0x7fd313228620>
                             Resize(size=(48, 48), interpolation=PIL.Image.BILINEAR)
                             ToTensor()
                         )
    Target Transforms (if any): None

In [None]:
current_Images, current_labels = next(iter(train_loader))
current_Images.shape

torch.Size([1, 1, 48, 48])

In [None]:
current_labels

tensor([4])

In [None]:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)

cuda:0


In [None]:
# Define the CNN Network

class facExpRec(nn.Module):
    def __init__(self):
        super(facExpRec, self).__init__()

        # Convolution Layer 1  (1*48 * 48)              (64 * 48 * 48)
        self.cnn1 = nn.Conv2d(in_channels=1, out_channels=64, kernel_size=3, stride=1, padding=1) # output size of the first convolutional layer is 64*48*48
        self.bn1 = nn.BatchNorm2d(64)
        self.relu1 = nn.ReLU()
        self.Dropout1 = nn.Dropout(0.5)
        # Maxpool for the Convolutional Layer 1
        self.maxpool1 = nn.MaxPool2d(kernel_size=2, stride=2) 
        # Maxpooling reduces the size by kernel size. After Maxpooling the output size is 64*24*24
        
        # YOUR CODE HERE for defining more number of Convolutional layers with Maxpool as required (Hint: Use at least 3 convolutional layers for better performance)
        # Convolution Layer 2 (64 * 24 * 24) -> (128 * 22 * 22); kernel_size=5 with padding=1 shrinks each side by 2
        self.cnn2 = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=5, stride=1, padding=1) 
        self.bn2 = nn.BatchNorm2d(128)
        self.relu2 = nn.ReLU()
        self.Dropout2 = nn.Dropout(0.5)
        # Maxpool for the Convolutional Layer 2
        self.maxpool2 = nn.MaxPool2d(kernel_size=2, stride=2) 
        # Maxpooling halves the size (rounding down). After Maxpooling the output size is 128 * 11 * 11
        
        # Convolution Layer 3 (128 * 11 * 11) -> (512 * 11 * 11)
        self.cnn3 = nn.Conv2d(in_channels=128, out_channels=512, kernel_size=3, stride=1, padding=1) 
        self.bn3 = nn.BatchNorm2d(512)
        self.relu3 = nn.ReLU()
        self.Dropout3 = nn.Dropout(0.5)
        # Maxpool for the Convolutional Layer 3
        self.maxpool3 = nn.MaxPool2d(kernel_size=2, stride=2) 
        # Maxpooling halves the size (rounding down). After Maxpooling the output size is 512 * 5 * 5

        # Convolution Layer 4 (512 * 5 * 5) -> (512 * 5 * 5)
        self.cnn4 = nn.Conv2d(in_channels=512, out_channels=512, kernel_size=3, stride=1, padding=1) 
        self.bn4 = nn.BatchNorm2d(512)
        self.relu4 = nn.ReLU()
        self.Dropout4 = nn.Dropout(0.5)
        # Maxpool for the Convolutional Layer 4
        self.maxpool4 = nn.MaxPool2d(kernel_size=2, stride=2) 
        # Maxpooling halves the size (rounding down). After Maxpooling the output size is 512 * 2 * 2 (2048 features after flattening)

        # Linear layers
        # The flattened feature map has 512*2*2 = 2048 features (48 -> 24 -> 22 -> 11 -> 5 -> 2
        # across the four conv/maxpool stages; cnn2 uses kernel_size=5 with padding=1).
        # Every layer has its own attribute name so that none of them overwrites another.
        # No BatchNorm1d on the linear layers: the loaders use batch_size=1, and BatchNorm1d
        # cannot compute batch statistics from a single sample in train mode.
        self.fc1 = nn.Linear(512*2*2, 256)
        self.fc1_drop = nn.Dropout(0.5)
        self.fc1_relu = nn.ReLU()
        # 256 input features, 512 output features 
        self.fc2 = nn.Linear(256, 512)
        self.fc2_drop = nn.Dropout(0.5)
        self.fc2_relu = nn.ReLU()
        # 512 input features, 7 output features (one per expression class)
        self.fc3 = nn.Linear(512, 7)
 
    def forward(self, x):
        # Convolution Layer 1 and Maxpool
        out = self.cnn1(x)
        out = self.bn1(out)
        out = self.relu1(out)
        out = self.Dropout1(out)
        out = self.maxpool1(out)
        
        # YOUR CODE HERE for the Convolutional Layers and Maxpool based on the defined Convolutional layers
        # Convolution Layer 2 and Maxpool
        out = self.cnn2(out)
        out = self.bn2(out)
        out = self.relu2(out)
        out = self.Dropout2(out)
        out = self.maxpool2(out)
        
        # Convolution Layer 3 and Maxpool
        out = self.cnn3(out)
        out = self.bn3(out)
        out = self.relu3(out)
        out = self.Dropout3(out)
        out = self.maxpool3(out)

        # Convolution Layer 4 and Maxpool
        out = self.cnn4(out)
        out = self.bn4(out)
        out = self.relu4(out)
        out = self.Dropout4(out)
        out = self.maxpool4(out)

        # Flattening: keep the batch dimension, merge channels and spatial dims (2048 features)
        out = out.view(out.size(0), -1)

        # Linear layers with dropout and RELU activation
        out = self.fc1_relu(self.fc1_drop(self.fc1(out)))
        out = self.fc2_relu(self.fc2_drop(self.fc2(out)))
        out = self.fc3(out)

        # Return raw class scores (logits); nn.CrossEntropyLoss applies log-softmax internally
        return out


In [None]:
myFaceExp = facExpRec()
myFaceExp = myFaceExp.to(device)
myFaceExp

facExpRec(
  (cnn1): Conv2d(1, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu1): ReLU()
  (Dropout1): Dropout(p=0.5)
  (maxpool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (cnn2): Conv2d(64, 128, kernel_size=(5, 5), stride=(1, 1), padding=(1, 1))
  (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu2): ReLU()
  (Dropout2): Dropout(p=0.5)
  (maxpool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (cnn3): Conv2d(128, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu3): ReLU()
  (Dropout3): Dropout(p=0.5)
  (maxpool3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (cnn4): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (b
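
The kernel sizes, strides, and paddings printed above determine how many features reach the first linear layer. The sketch below applies the standard output-size formula, floor((n + 2*padding - kernel) / stride) + 1, to those values, starting from a 48 x 48 input. `out_size` is a helper name chosen for this example.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling layer (floor division)."""
    return (n + 2 * padding - kernel) // stride + 1

size = 48
sizes = [size]
# (kernel_size, padding) of cnn1..cnn4, each followed by a 2x2 max-pool with stride 2
for kernel, padding in [(3, 1), (5, 1), (3, 1), (3, 1)]:
    size = out_size(size, kernel, stride=1, padding=padding)  # convolution
    sizes.append(size)
    size = out_size(size, kernel=2, stride=2)                 # max-pool
    sizes.append(size)

print(sizes)              # [48, 48, 24, 22, 11, 11, 5, 5, 2]
print(512 * size * size)  # 2048
```

Because `cnn2` uses a 5 x 5 kernel with padding 1, the map shrinks from 24 to 22 at that layer, so the flattened vector has 2048 features.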

In [None]:
criterion_Entropy = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(myFaceExp.parameters(), lr = 0.001)
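
`nn.CrossEntropyLoss` combines a log-softmax with the negative log-likelihood loss, so it expects raw class scores (logits) and an integer class label. The pure-Python sketch below reproduces that computation for a single sample to make the formula concrete; it is an illustration rather than the PyTorch implementation, and `cross_entropy` is a name chosen for this example.

```python
import math

def cross_entropy(logits, target):
    """-log(softmax(logits)[target]) for one sample, computed stably."""
    m = max(logits)  # subtract the max so exp() cannot overflow
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

# Seven equal scores: every class has probability 1/7
uniform = cross_entropy([0.0] * 7, target=4)
print(round(uniform, 4))  # 1.9459, i.e. log(7)
```

A model whose seven outputs are all equal has a loss of log(7), about 1.95, which is a useful baseline when reading a training log.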

In [None]:
epoch = 35

# keeping the network in train mode
myFaceExp.train()
train_losses,  train_accuracy = [], []

# Loop for no of epochs
for e in range(epoch):
    train_loss = 0
    correct = 0
    # Iterate through all the batches in each epoch
    for images, labels in train_loader:

      # Convert the image and label to gpu for faster execution
      images = images.to(device)
      labels = labels.to(device)

      # Zero the parameter gradients
      optimizer.zero_grad()

      # Passing the data to the model (Forward Pass)
      outputs = myFaceExp(images)
      
      # Calculating the loss
      loss = criterion_Entropy(outputs, labels)
      train_loss += loss.item()

      # Performing backward pass (Backpropagation)
      loss.backward()

      # optimizer.step() updates the weights accordingly
      optimizer.step()

      # Accuracy calculation
      _, predicted = torch.max(outputs, 1)
      correct += (predicted == labels).sum().item()

    train_losses.append(train_loss/len(trainset))
    train_accuracy.append(100 * correct/len(trainset))
    print('epoch: {}, Train Loss:{:.6f} Train Accuracy: {:.2f} '.format(e+1,train_losses[-1], train_accuracy[-1]))

epoch: 1, Train Loss:2.616712 Train Accuracy: 26.66 
epoch: 2, Train Loss:2.298207 Train Accuracy: 30.12 
epoch: 3, Train Loss:2.297602 Train Accuracy: 31.27 
epoch: 4, Train Loss:2.261336 Train Accuracy: 31.75 
epoch: 5, Train Loss:2.235733 Train Accuracy: 32.99 
epoch: 6, Train Loss:2.229604 Train Accuracy: 33.30 
epoch: 7, Train Loss:2.201246 Train Accuracy: 34.17 
epoch: 8, Train Loss:2.232412 Train Accuracy: 34.48 
epoch: 9, Train Loss:2.208783 Train Accuracy: 34.59 
epoch: 10, Train Loss:2.221708 Train Accuracy: 34.86 
epoch: 11, Train Loss:2.198425 Train Accuracy: 35.38 
epoch: 12, Train Loss:2.200221 Train Accuracy: 35.02 
epoch: 13, Train Loss:2.166004 Train Accuracy: 35.75 
epoch: 14, Train Loss:2.177562 Train Accuracy: 36.47 
epoch: 15, Train Loss:2.148719 Train Accuracy: 36.64 
epoch: 16, Train Loss:2.186855 Train Accuracy: 36.49 
epoch: 17, Train Loss:2.176133 Train Accuracy: 36.26 
epoch: 18, Train Loss:2.137939 Train Accuracy: 36.88 
epoch: 19, Train Loss:2.150990 Train 

In [None]:
## Saving the model as a state dictionary
state = {'face_exp_dict': myFaceExp.state_dict()}
torch.save(state, './face_exp_model.t7')

In [None]:
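# Create a fresh model on the same device that was selected earlier (GPU if available, else CPU)
myFaceExpModel = facExpRec().to(device)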

In [None]:
ckpt = torch.load('./face_exp_model.t7')
myFaceExpModel.load_state_dict(ckpt['face_exp_dict'])

**Test your model and optimize CNN architecture for predicting the labels correctly**

In [None]:
# YOUR CODE HERE for test evaluation

myFaceExpModel.eval()

Test_accuracy = 0

# Iterate through all the batches of the validation set.
# torch.no_grad() disables gradient tracking, which is not needed for evaluation.
with torch.no_grad():
    for images,labels in validation_loader:
        # Move the images and labels to the selected device
        images = images.to(device)
        labels = labels.to(device)

        # Do the forward pass 
        outputs = myFaceExpModel(images)

        # Accuracy calculation
        _, predicted = torch.max(outputs, 1)
        Test_accuracy += (predicted == labels).sum().item()

Accuracy = 100 * Test_accuracy / len(validationset)
print("Accuracy of Validation Data is", Accuracy)

Accuracy of Validation Data is 31.31046613896218
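
A single accuracy figure does not show which expressions are being confused with each other. One way to dig further is a confusion matrix built from the true and predicted class indices collected during the evaluation loop. The sketch below uses plain Python lists and made-up labels purely for illustration; `confusion_matrix` and `per_class_accuracy` are hypothetical helper names, and the example inputs are not results from this model.

```python
def confusion_matrix(true_labels, predicted_labels, num_classes):
    """matrix[t][p] counts samples of true class t predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix

def per_class_accuracy(matrix):
    """Fraction of each true class predicted correctly (None if the class is absent)."""
    result = []
    for cls, row in enumerate(matrix):
        total = sum(row)
        result.append(row[cls] / total if total else None)
    return result

# Made-up labels for a 3-class toy problem
true_labels = [0, 0, 1, 1, 2, 2]
predicted_labels = [0, 1, 1, 1, 0, 2]
cm = confusion_matrix(true_labels, predicted_labels, num_classes=3)
print(cm)                      # [[1, 1, 0], [0, 2, 0], [1, 0, 1]]
print(per_class_accuracy(cm))  # [0.5, 1.0, 0.5]
```

In this notebook the two lists could be filled inside the validation loop with `labels.item()` and `predicted.item()` (valid because `batch_size=1`), using `num_classes=7`.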


**Team Data Collection (activate the server first)** 

  - (This can be done on the day of the Hackathon once the login username and password are given)

Activate the Server Access
* Open the terminal (Command Prompt)
* Login to SSH by typing **ssh (username)@aiml-sandbox1.talentsprint.com**. Give the login username which is given to you. 

Eg: `ssh b15h3gxx@aiml-sandbox1.talentsprint.com`

  (If it is your first time connecting to the server from this computer, accept the connection by typing "yes".)
* After logging in over SSH, activate your virtual environment with the command **source venv/bin/activate** and press Enter.
* Start the server with the command **sh runserver.sh** and press Enter.
* To collect team data in the mobile app, ensure the server is active.


**Collect your team data using the EFR Mobile App and fine-tune the CNN for expression data on your team**

Team Data Collection

* Follow the "Mobile_APP_Documentation" to collect the Expression photos of your team. These will be stored in the server to which login is provided to you.

[Mobile_APP_Documentation](https://drive.google.com/file/d/1tpr8_U0Ll_TexN4s-0pmPPg23J7Usok2/view?usp=sharing)


**Download your team expression data from the EFR app into your colab notebook using the links provided below**

NOTE: Replace the string "username" with your login username (such as b15h3gxx) in the below cell for expression images. 

This data will be useful for testing the above trained cnn networks.

In [None]:
!wget -nH --recursive --no-parent --reject 'index.*' https://aiml-sandbox.talentsprint.com/expression_detection/b15h3g17/captured_images_with_Expression/ --cut-dirs=3  -P ./captured_images_with_Expression

--2021-02-18 11:02:11--  https://aiml-sandbox.talentsprint.com/expression_detection/b15h3g17/captured_images_with_Expression/
Resolving aiml-sandbox.talentsprint.com (aiml-sandbox.talentsprint.com)... 139.162.203.12
Connecting to aiml-sandbox.talentsprint.com (aiml-sandbox.talentsprint.com)|139.162.203.12|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘./captured_images_with_Expression/index.html.tmp’

index.html.tmp          [ <=>                ]   1.03K  --.-KB/s    in 0s      

2021-02-18 11:02:11 (103 MB/s) - ‘./captured_images_with_Expression/index.html.tmp’ saved [1054]

Loading robots.txt; please ignore errors.
--2021-02-18 11:02:11--  https://aiml-sandbox.talentsprint.com/robots.txt
Reusing existing connection to aiml-sandbox.talentsprint.com:443.
HTTP request sent, awaiting response... 404 Not Found
2021-02-18 11:02:11 ERROR 404: Not Found.

Removing ./captured_images_with_Expression/index.html.tmp since it should 

In [None]:
%ls

[0m[01;34mcaptured_images_with_Expression[0m/  Expression_data.zip  [01;34m__MACOSX[0m/
[01;34mExpression_data[0m/                  face_exp_model.t7    [01;34msample_data[0m/


In [None]:
team_test_dir = '/content/captured_images_with_Expression'
teamTestSet = dset.ImageFolder(team_test_dir, transform = loader)
team_loader = torch.utils.data.DataLoader(teamTestSet, shuffle=True, batch_size=1)

In [None]:
# YOUR CODE HERE for loading the team expression data. Note: Use the same transform that was used for Expression_Data.
# YOUR CODE HERE for Dataloader

myFaceExpModel.eval()

Test_accuracy = 0

# Iterate through all the batches of the team data.
# torch.no_grad() disables gradient tracking, which is not needed for evaluation.
with torch.no_grad():
    for images,labels in team_loader:
        # Move the images and labels to the selected device
        images = images.to(device)
        labels = labels.to(device)

        # Do the forward pass 
        outputs = myFaceExpModel(images)

        # Accuracy calculation
        _, predicted = torch.max(outputs, 1)
        Test_accuracy += (predicted == labels).sum().item()

Accuracy = 100 * Test_accuracy / len(teamTestSet)
print("Accuracy of Team's Data is", Accuracy)

Accuracy of Team's Data is 38.764044943820224


In [None]:
# YOUR CODE HERE for getting the CNN representation of your team data with expression. Optimize the CNN model for predicting the labels of expressions correctly

**Save your trained model**

* Save the state dictionary of the classifier (use PyTorch only). It will be needed when integrating the model into the mobile app.

 [Hint](https://pytorch.org/tutorials/beginner/saving_loading_models.html)

In [None]:
### YOUR CODE HERE for saving the CNN model

**Download your trained model**
* Given the path of the model file, the following code downloads it through the browser.

In [None]:
from google.colab import files
files.download('<model_file_path>')

## **Stage 5 (Anti Face Spoofing): (5 marks)**


---



The objective of anti face spoofing is to unlock (say) a screen not just with an image of your face
(which can easily be spoofed with a photograph of you) but also with a switch to the expression
demanded by the Mobile App (which is much harder to mimic).
* **Grading scheme**:
> * **Anti Face Spoofing**: (5M Only if both the cases mentioned below are achieved)
>>* **Unlock**: Correct face + Correct Demanded Expression
>>* **Stay Locked**: Correct face + Incorrect Demanded Expression (as you might imagine there are multiple other such possibilities, which you are free to explore)
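
The two graded cases reduce to a simple rule: unlock only when the face matches and the recognised expression equals the one the app demanded. A minimal sketch of that decision, assuming the face check and the expression prediction are already available as plain values; `should_unlock` and its parameters are hypothetical and do not describe the EFR app's actual code.

```python
def should_unlock(face_matches, predicted_expression, demanded_expression):
    """Unlock only for the right face showing the demanded expression."""
    same_expression = predicted_expression.upper() == demanded_expression.upper()
    return bool(face_matches) and same_expression

print(should_unlock(True, "HAPPINESS", "HAPPINESS"))   # True  -> Unlock
print(should_unlock(True, "NEUTRAL", "HAPPINESS"))     # False -> Stay Locked
print(should_unlock(False, "HAPPINESS", "HAPPINESS"))  # False -> Stay Locked
```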

In [None]:
# Test in your mobile app and see if it unlocks.