# Intro to the Dataset and Preparation
- Data Set Name: Facial Expression Recognition (FER) Challenge
- Data Set Owner: NowYSM
- Size: 287MB
- Link to Data set: https://www.kaggle.com/ashishpatel26/facial-expression-recognitionferchallenge
## After going through the dataset
The dataset contains:

| Training Set | Public Test Set | Private Test Set |
| :----------: | :-------------: | :--------------: |
| 1 - 28710    | 28711 - 32299   | 32300 - 35888    |

## Our goal is to create a dataset of images and labels
> This will make the dataset easier for others to work with, too.

### This is how our dataset will look after preparation:
**train**
> anger<br>
> disgust<br>
> fear<br>
> happiness<br> 
> sadness<br>
> surprise<br>
> neutral<br>

**test**
> anger<br>
> disgust<br>
> fear<br>
> happiness<br>
> sadness<br>
> surprise<br>
> neutral

**labels.txt**
> anger<br>
> disgust<br>
> fear<br>
> happiness<br>
> sadness<br>
> surprise<br>
> neutral

The `train` and `test` directories contain only the images, grouped into one sub-directory per emotion.<br>
The `labels.txt` file lists all `seven` labels.
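
The target layout described above can be sketched in a few lines. This is a minimal sketch, not the notebook's own code: the function name `make_layout` and its `root` argument are hypothetical, and the class order simply follows the listing above.

```python
import os

# Class names in the order of the listing above (labels 0 to 6).
EMOTIONS = ['anger', 'disgust', 'fear', 'happiness',
            'sadness', 'surprise', 'neutral']


def make_layout(root):
    """Create train/<emotion> and test/<emotion> folders plus labels.txt under root."""
    for split in ('train', 'test'):
        for emotion in EMOTIONS:
            # exist_ok=True makes the call safe to repeat
            os.makedirs(os.path.join(root, split, emotion), exist_ok=True)
    labels_path = os.path.join(root, 'labels.txt')
    with open(labels_path, 'w') as f:
        # One label per line
        f.write('\n'.join(EMOTIONS) + '\n')
    return labels_path
```

Calling `make_layout('.')` would create the empty folders in the current directory; the images themselves are written later in the notebook.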

# **Imports**

In [1]:
import os
import math
import pandas as pd
import numpy as np
from random import randint
import PIL
from PIL import Image
import cv2

Let us define a project name, which will help us in committing our file to Jovian.

In [2]:
project_name='convert_pixels_to_Images_in_one_go'

## Preparing the Dataset

In [3]:
df = pd.read_csv('fer2013.csv')

In [4]:
# Number of rows and columns in the dataset
print('No. of Rows: ',len(df))
print('No. of Columns: ',df.shape[1])

No. of Rows:  35887
No. of Columns:  3


In [5]:
df.head()

|   | emotion | pixels | Usage |
| :-: | :-: | :-- | :-: |
| 0 | 0 | 70 80 82 72 58 58 60 63 54 58 60 48 89 115 121... | Training |
| 1 | 0 | 151 150 147 155 148 133 111 140 170 174 182 15... | Training |
| 2 | 2 | 231 212 156 164 174 138 161 173 182 200 106 38... | Training |
| 3 | 4 | 24 32 36 30 32 23 19 20 30 41 21 22 32 34 21 1... | Training |
| 4 | 6 | 4 0 0 0 0 0 0 0 0 0 0 0 3 15 23 28 48 50 58 84... | Training |


In [6]:
df.tail()

|   | emotion | pixels | Usage |
| :-: | :-: | :-- | :-: |
| 35882 | 6 | 50 36 17 22 23 29 33 39 34 37 37 37 39 43 48 5... | PrivateTest |
| 35883 | 3 | 178 174 172 173 181 188 191 194 196 199 200 20... | PrivateTest |
| 35884 | 0 | 17 17 16 23 28 22 19 17 25 26 20 24 31 19 27 9... | PrivateTest |
| 35885 | 3 | 30 28 28 29 31 30 42 68 79 81 77 67 67 71 63 6... | PrivateTest |
| 35886 | 2 | 19 13 14 12 13 16 21 33 50 57 71 84 97 108 122... | PrivateTest |


In [7]:
df.Usage.value_counts()

Training       28709
PrivateTest     3589
PublicTest      3589
Name: Usage, dtype: int64

So, our dataset contains: <br>`28709`: Training samples,<br> `3589`: Public Test samples,<br> `3589`: Private Test samples

Let us see how many unique emotions our dataset has:

In [8]:
df.emotion.unique()

array([0, 2, 4, 6, 3, 5, 1])

We have 7 different classes, so let us represent them in textual format

In [9]:
emotion_label_to_text = {
    0: 'anger',
    1: 'disgust',
    2: 'fear',
    3: 'happiness',
    4: 'sadness',
    5: 'surprise',
    6: 'neutral',
}
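
As a small illustration of how this dictionary can be used, the sketch below maps the integer labels of a toy DataFrame to their names with `Series.map`. The toy frame and the `emotion_text` column name are assumptions made for the example; the notebook itself does not add such a column.

```python
import pandas as pd

emotion_label_to_text = {0: 'anger', 1: 'disgust', 2: 'fear', 3: 'happiness',
                         4: 'sadness', 5: 'surprise', 6: 'neutral'}

# Toy frame standing in for df; only the `emotion` column matters here.
toy = pd.DataFrame({'emotion': [0, 2, 4, 6, 3, 5, 1]})

# Series.map looks up each integer label in the dictionary.
toy['emotion_text'] = toy['emotion'].map(emotion_label_to_text)
print(toy)
```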

In [10]:
df.emotion.value_counts()

3    8989
6    6198
4    6077
2    5121
0    4953
5    4002
1     547
Name: emotion, dtype: int64

Therefore, we can see that `3: 'happiness'` has the most images (8989) and `1: 'disgust'` has the fewest (547).
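
To put the imbalance into numbers, the sketch below recomputes each class share from the counts printed above. The counts are copied from that output; the variable names are only illustrative.

```python
# Counts copied from the value_counts() output above.
counts = {3: 8989, 6: 6198, 4: 6077, 2: 5121, 0: 4953, 5: 4002, 1: 547}
names = {0: 'anger', 1: 'disgust', 2: 'fear', 3: 'happiness',
         4: 'sadness', 5: 'surprise', 6: 'neutral'}

total = sum(counts.values())
largest = max(counts, key=counts.get)   # label with the most images
smallest = min(counts, key=counts.get)  # label with the fewest images

for label, n in sorted(counts.items()):
    print(f"{label} {names[label]:<10} {n:>5} {n / total:6.1%}")

print('largest :', names[largest])
print('smallest:', names[smallest])
print('ratio   :', round(counts[largest] / counts[smallest], 1))
```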

### Now let us define the height and width of our images

In [11]:
# The images are square, so the side length is the square root of the number of pixel values
height = int(math.sqrt(len(df.pixels[0].split())))
width = height

In [12]:
height, width

(48, 48)
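
A 48 x 48 image corresponds to 2304 pixel values per row. The sketch below shows how one such space-separated string can be turned into a 2-D array. The helper name `pixels_to_array` and the synthetic `demo` string are assumptions for the example; they are not part of the dataset or the notebook.

```python
import math

import numpy as np


def pixels_to_array(pixel_string):
    """Turn a space-separated string of grayscale values into a square uint8 array."""
    values = np.array([int(v) for v in pixel_string.split()], dtype=np.uint8)
    side = math.isqrt(values.size)
    if side * side != values.size:
        raise ValueError('number of pixel values is not a perfect square')
    return values.reshape(side, side)


# Synthetic string with 48 * 48 values in the range 0-255.
demo = ' '.join(str(v % 256) for v in range(48 * 48))
img = pixels_to_array(demo)
print(img.shape, img.dtype)
```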

Let's create two DataFrames, `train_df` and `test_df`: `train_df` will contain the `Training` and `PublicTest` rows, and `test_df` will contain the `PrivateTest` rows.

## Preparation of the `train` Dataset

## train_df
> Will contain all the images of all classes from the `Training` and `PublicTest` rows of the original dataset.

In [13]:
train_df1 = df[df.Usage == 'Training']
train_df2 = df[df.Usage == 'PublicTest']

# DataFrame.append was removed in pandas 2.0; pd.concat gives the same result
train_df = pd.concat([train_df1, train_df2], ignore_index=True)
train_df.head()

|   | emotion | pixels | Usage |
| :-: | :-: | :-- | :-: |
| 0 | 0 | 70 80 82 72 58 58 60 63 54 58 60 48 89 115 121... | Training |
| 1 | 0 | 151 150 147 155 148 133 111 140 170 174 182 15... | Training |
| 2 | 2 | 231 212 156 164 174 138 161 173 182 200 106 38... | Training |
| 3 | 4 | 24 32 36 30 32 23 19 20 30 41 21 22 32 34 21 1... | Training |
| 4 | 6 | 4 0 0 0 0 0 0 0 0 0 0 0 3 15 23 28 48 50 58 84... | Training |


In [14]:
train_df.Usage.value_counts()

Training      28709
PublicTest     3589
Name: Usage, dtype: int64
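
The same split can also be expressed with a single boolean mask. The sketch below uses a toy frame in place of `df`, with `Series.isin` for the combined training rows; the names `toy`, `train_like` and `test_like` are hypothetical. Unlike concatenating two filtered frames, `isin` keeps the rows in their original file order.

```python
import pandas as pd

# Toy frame standing in for df; the real one also has a `pixels` column.
toy = pd.DataFrame({
    'emotion': [0, 3, 5, 1],
    'Usage': ['Training', 'PublicTest', 'PrivateTest', 'Training'],
})

# Rows whose Usage is either Training or PublicTest, with a fresh 0..n-1 index.
train_like = toy[toy['Usage'].isin(['Training', 'PublicTest'])].reset_index(drop=True)

# Everything marked PrivateTest.
test_like = toy[toy['Usage'] == 'PrivateTest'].reset_index(drop=True)

print(len(train_like), len(test_like))
```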

## Separation of the images to their respective classes

So let's target the `disgust` emotion first, as it has the fewest images,<br>
> 1: 'disgust'

Our first step will be to assign all the rows with the `disgust` emotion to a single variable.

In [17]:
disgust = train_df[train_df.emotion == 1] # Now disgust variable contains all the rows with emotion == disgust

Now the **disgust** DataFrame contains all the rows of the **training** DataFrame with `emotion == 1`.

In [31]:
disgust.columns

Index(['emotion', 'pixels', 'Usage'], dtype='object')

So now let us separate the `pixels` column and store it in its own variable.

In [18]:
disgust_images_pixels = disgust['pixels']

Now the `disgust_images_pixels` contains only the pixels.

In [19]:
disgust_images_pixels

299      126 126 129 120 110 168 174 172 173 174 170 15...
388      89 55 24 40 43 48 53 55 59 41 33 31 22 32 42 4...
416      204 195 181 131 50 50 57 56 66 98 138 161 173 ...
473      14 11 13 12 41 95 113 112 111 122 132 137 142 ...
533      18 25 49 75 89 97 100 100 101 103 105 107 107 ...
                               ...                        
32086    57 61 50 20 12 35 65 78 146 214 210 212 217 20...
32120    255 249 229 231 129 184 156 106 86 95 142 171 ...
32178    84 76 65 65 33 19 100 140 152 162 166 168 173 ...
32260    255 251 97 82 75 51 52 86 125 160 182 192 198 ...
32289    136 83 25 126 135 134 134 133 134 65 91 106 11...
Name: pixels, Length: 492, dtype: object

Let us have a look at the number of images it contains,<br>
as this will help us verify the number of images in the directory after it is created.

In [20]:
len(disgust_images_pixels)

492

**Finally**, we can create a directory named `disgust`, convert the pixels into images, and store them there.

In [22]:
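access_rights = 0o755
os.mkdir('disgust', access_rights)
os.chdir('disgust/')  # the working directory stays here after the cell has run
for n, pixels in enumerate(disgust_images_pixels):
    # Parse the space-separated pixel string into a 48x48 grayscale array
    a = list(map(int, pixels.split(' ')))[:48 * 48]
    i = np.array(a).reshape((48, 48)).astype('uint8')
    # Name files by loop counter; random names could collide and overwrite images
    img_name = str(n) + ".png"
    cv2.imwrite(img_name, i)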
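
To check the result against the row count noted earlier, the files in the new directory can be counted. The helper below is a minimal sketch; the name `count_images` is hypothetical. A count lower than the number of rows would mean that some images were not written, or that two images ended up under the same file name.

```python
import os


def count_images(directory, extension='.png'):
    """Count the files in `directory` whose name ends with `extension`."""
    return sum(1 for name in os.listdir(directory)
               if name.lower().endswith(extension))
```

For example, `count_images('.')` run from inside the `disgust` directory should match `len(disgust_images_pixels)`.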

### Now let us repeat the process for `surprise`

> 5: 'surprise'

In [33]:
surprise = train_df[train_df.emotion == 5]

In [34]:
surprise_images_pixels = surprise['pixels']

In [35]:
len(surprise_images_pixels)

3586

In [69]:
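access_rights = 0o755
os.mkdir('surprise', access_rights)
os.chdir('surprise/')  # the working directory stays here after the cell has run
for n, pixels in enumerate(surprise_images_pixels):
    # Parse the space-separated pixel string into a 48x48 grayscale array
    a = list(map(int, pixels.split(' ')))[:48 * 48]
    i = np.array(a).reshape((48, 48)).astype('uint8')
    # Name files by loop counter; random names could collide and overwrite images
    img_name = str(n) + ".png"
    cv2.imwrite(img_name, i)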

### Now let us repeat the process for `anger`

> 0: 'anger',

In [70]:
anger = train_df[train_df.emotion == 0]

In [71]:
anger_images_pixels = anger['pixels']

In [72]:
len(anger_images_pixels)

4462

In [76]:
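access_rights = 0o755
os.mkdir('anger', access_rights)
os.chdir('anger/')  # the working directory stays here after the cell has run
for n, pixels in enumerate(anger_images_pixels):
    # Parse the space-separated pixel string into a 48x48 grayscale array
    a = list(map(int, pixels.split(' ')))[:48 * 48]
    i = np.array(a).reshape((48, 48)).astype('uint8')
    # Name files by loop counter; random names could collide and overwrite images
    img_name = str(n) + ".png"
    cv2.imwrite(img_name, i)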

### The steps above can be reduced to a single call with the help of a function
Let's define a function:

In [89]:
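def convert_pixels_to_image(emotion_label):
    # Select the pixel strings of one emotion class from train_df
    var = train_df[train_df.emotion == emotion_label]
    var_images_pixels = var['pixels']
    print('Total No. Images: ',len(var_images_pixels))
    access_rights = 0o755
    folder_name = emotion_label_to_text[emotion_label]
    # Create the class folder and move into it (the working directory stays there afterwards)
    os.mkdir(folder_name, access_rights)
    os.chdir(folder_name)
    for n, pixels in enumerate(var_images_pixels):
        # Parse the space-separated pixel string into a 48x48 grayscale array
        a = list(map(int, pixels.split(' ')))[:48 * 48]
        i = np.array(a).reshape((48, 48)).astype('uint8')
        # Name files by loop counter; random names could collide and overwrite images
        img_name = str(n) + ".png"
        cv2.imwrite(img_name, i)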
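
A variant of this function that takes the DataFrame and the output folder as arguments could be sketched as follows. This is an illustration under stated assumptions, not the notebook's method: the name `save_class_images` and its parameters are hypothetical, it writes files with Pillow's `Image.fromarray(...).save(...)` instead of `cv2.imwrite`, and it assumes the frame's index is unique so that the index can serve as the file name.

```python
import os

import numpy as np
from PIL import Image


def save_class_images(frame, emotion_label, label_to_text, out_root, side=48):
    """Write every image of one emotion class to out_root/<emotion name>/."""
    rows = frame[frame['emotion'] == emotion_label]
    folder = os.path.join(out_root, label_to_text[emotion_label])
    os.makedirs(folder, exist_ok=True)
    for row_index, pixels in rows['pixels'].items():
        values = [int(v) for v in pixels.split()]
        array = np.array(values, dtype=np.uint8).reshape(side, side)
        # The row index is unique within the frame, so file names cannot collide.
        Image.fromarray(array).save(os.path.join(folder, f'{row_index}.png'))
    return len(rows)
```

With such a signature, a call like `save_class_images(train_df, 1, emotion_label_to_text, 'train')` would not change the working directory, and the same function could be reused for any other DataFrame.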

### Now let us repeat the process for `fear`
> 2: 'fear'

We have defined a function to do all of this, **but** we have to keep one thing in mind: our present working directory must be the directory in which we want to store our `train` dataset. The function changes into the emotion folder it creates, so we move back up one level before each call.
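
One way to avoid tracking the working directory by hand is a small context manager that restores it automatically. The sketch below is an assumption-based illustration; the name `working_directory` is hypothetical and the notebook itself uses `cd ..` instead.

```python
import os
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Temporarily change the working directory, then restore the previous one."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Runs even if the body raises, so the notebook never gets stuck in a sub-folder.
        os.chdir(previous)
```

Any code placed inside `with working_directory('some_folder'):` runs in that folder, and the previous directory is restored when the block ends.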

In [101]:
cd ..

'/home/manish/Pytorch_Basics_Jovian/Final Course Project'

In [102]:
convert_pixels_to_image(2)

Total No. Images:  4593


### Now let us repeat the process for `sadness`
> 4: 'sadness'

In [104]:
cd ..

/home/manish/Pytorch_Basics_Jovian/Final Course Project


In [105]:
convert_pixels_to_image(4)

Total No. Images:  5483


### Now let us repeat the process for `neutral`
> 6: 'neutral'

In [107]:
cd ..

/home/manish/Pytorch_Basics_Jovian/Final Course Project


In [108]:
convert_pixels_to_image(6)

Total No. Images:  5572


### Now let us repeat the process for `happiness`
> 3: 'happiness'

In [111]:
cd ..

/home/manish/Pytorch_Basics_Jovian/Final Course Project


In [112]:
convert_pixels_to_image(3)

Total No. Images:  8110


# -------Finally we are done with creation of `train` set--------

## Preparation of the Test Dataset

### test_df
> Will contain the images of all classes from the `PrivateTest` rows of the original dataset.

In [113]:
test_df =  df[df.Usage == 'PrivateTest']
test_df.head()

|   | emotion | pixels | Usage |
| :-: | :-: | :-- | :-: |
| 32298 | 0 | 170 118 101 88 88 75 78 82 66 74 68 59 63 64 6... | PrivateTest |
| 32299 | 5 | 7 5 8 6 7 3 2 6 5 4 4 5 7 5 5 5 6 7 7 7 10 10 ... | PrivateTest |
| 32300 | 6 | 232 240 241 239 237 235 246 117 24 24 22 13 12... | PrivateTest |
| 32301 | 4 | 200 197 149 139 156 89 111 58 62 95 113 117 11... | PrivateTest |
| 32302 | 2 | 40 28 33 56 45 33 31 78 152 194 200 186 196 20... | PrivateTest |


In [114]:
test_df.Usage.value_counts()

PrivateTest    3589
Name: Usage, dtype: int64

We need to make a small change in our function to perform the whole task in one go:<br>
we need to change `train_df` to `test_df` in the second line of the function.

In [120]:
test_df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 3589 entries, 32298 to 35886
Data columns (total 3 columns):
 #   Column   Non-Null Count  Dtype 
---  ------   --------------  ----- 
 0   emotion  3589 non-null   int64 
 1   pixels   3589 non-null   object
 2   Usage    3589 non-null   object
dtypes: int64(1), object(2)
memory usage: 112.2+ KB


In [121]:
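def convert_pixels_to_image(emotion_label):
    # Select the pixel strings of one emotion class from test_df
    var = test_df[test_df.emotion == emotion_label]
    var_images_pixels = var['pixels']
    print('Total No. Images: ',len(var_images_pixels))
    access_rights = 0o755
    folder_name = emotion_label_to_text[emotion_label]
    # Create the class folder and move into it (the working directory stays there afterwards)
    os.mkdir(folder_name, access_rights)
    os.chdir(folder_name)
    for n, pixels in enumerate(var_images_pixels):
        # Parse the space-separated pixel string into a 48x48 grayscale array
        a = list(map(int, pixels.split(' ')))[:48 * 48]
        i = np.array(a).reshape((48, 48)).astype('uint8')
        # Name files by loop counter; random names could collide and overwrite images
        img_name = str(n) + ".png"
        cv2.imwrite(img_name, i)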

In [122]:
pwd

'/home/manish/Pytorch_Basics_Jovian/Final Course Project'

Let's go through this set in order, from `0` to `6`.

## Let us start with `anger`
> 0: 'anger',

In [123]:
convert_pixels_to_image(0)

Total No. Images:  491


## Then `disgust`
> 1: 'disgust', 

In [125]:
cd ..

/home/manish/Pytorch_Basics_Jovian/Final Course Project


In [126]:
convert_pixels_to_image(1)

Total No. Images:  55


## Then `fear`
> 2: 'fear', 

In [128]:
cd ..

/home/manish/Pytorch_Basics_Jovian/Final Course Project


In [129]:
convert_pixels_to_image(2)

Total No. Images:  528


## Then `happiness`
> 3: 'happiness',

In [130]:
cd ..

/home/manish/Pytorch_Basics_Jovian/Final Course Project


In [131]:
convert_pixels_to_image(3)

Total No. Images:  879


## Then `sadness`
> 4: 'sadness',

In [132]:
cd ..

/home/manish/Pytorch_Basics_Jovian/Final Course Project


In [133]:
convert_pixels_to_image(4)

Total No. Images:  594


## Then `surprise`
> 5: 'surprise', 

In [134]:
cd ..

/home/manish/Pytorch_Basics_Jovian/Final Course Project


In [135]:
convert_pixels_to_image(5)

Total No. Images:  416


## Then `neutral`
> 6: 'neutral'   

In [136]:
cd ..

/home/manish/Pytorch_Basics_Jovian/Final Course Project


In [137]:
convert_pixels_to_image(6)

Total No. Images:  626
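
As a final sanity check, the per-class counts printed in this notebook can be compared with the totals from `df.emotion.value_counts()` and `df.Usage.value_counts()`. The numbers below are copied from those outputs; the dictionary names are only illustrative.

```python
# Counts printed while creating the train folders (Training + PublicTest rows).
train_counts = {'anger': 4462, 'disgust': 492, 'fear': 4593, 'happiness': 8110,
                'sadness': 5483, 'surprise': 3586, 'neutral': 5572}

# Counts printed while creating the test folders (PrivateTest rows).
test_counts = {'anger': 491, 'disgust': 55, 'fear': 528, 'happiness': 879,
               'sadness': 594, 'surprise': 416, 'neutral': 626}

# Totals per class from df.emotion.value_counts().
full_counts = {'anger': 4953, 'disgust': 547, 'fear': 5121, 'happiness': 8989,
               'sadness': 6077, 'surprise': 4002, 'neutral': 6198}

# Split sizes from df.Usage.value_counts().
assert sum(train_counts.values()) == 28709 + 3589
assert sum(test_counts.values()) == 3589

# Every class must add up to its total in the full dataset.
for name, total in full_counts.items():
    assert train_counts[name] + test_counts[name] == total

print('all counts are consistent')
```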


# -------------------------THE END---------------------------