<a href="https://colab.research.google.com/github/akutayaydin/Magnimind-5.1-DeepLearning/blob/main/7_GenderIDTransfLearn_v1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### `list_attr_celeba` Dataset
Face analysis is a popular area of computer vision and deep learning, with applications ranging from unlocking your phone with your face to searching surveillance images for a particular suspect. This dataset is well suited to training and testing face detection models, and in particular to recognising facial attributes such as brown hair, smiling, or wearing glasses. The images cover large pose variations and background clutter and show a diverse range of people, backed by a large number of images and rich annotations. The data was originally collected by researchers at MMLAB, The Chinese University of Hong Kong (specific reference in the Acknowledgment section).



- 202,599 face images of various celebrities
- 10,177 unique identities, but names of identities are not given
- 40 binary attribute annotations per image

You can obtain the dataset from https://www.kaggle.com/jessicali9530/celeba-dataset

In [None]:
import os
import random

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Import everything from tensorflow.keras so the standalone `keras` package
# and the copy bundled with TensorFlow are never mixed in one session.
from tensorflow import keras
from tensorflow.keras import optimizers
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.layers import Dense, Dropout, GlobalAveragePooling2D, BatchNormalization
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.preprocessing import image
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.utils import load_img, to_categorical

In [None]:
from google.colab import drive
drive.mount('/content/gdrive')

Mounted at /content/gdrive


In [None]:
mypath = '/content/gdrive/My Drive/Google Colab Folder/celeb_small'
print(os.listdir(mypath))

df = pd.read_csv(os.path.join(mypath, 'list_attr_celeba.csv'))

df.head()
df.columns.values

['man.033915.jpg', 'female.014001.jpg', 'female.030808.jpg', 'female.033290.jpg', 'female.017920.jpg', 'man.044800.jpg', 'female.032945.jpg', 'man.029808.jpg', 'man.012082.jpg', 'man.011788.jpg', 'man.020192.jpg', 'female.018612.jpg', 'female.018966.jpg', 'man.030946.jpg', 'female.017722.jpg', 'female.032824.jpg', 'female.012872.jpg', 'female.014609.jpg', 'man.042068.jpg', 'man.029391.jpg', 'female.019135.jpg', 'female.013592.jpg', 'man.003372.jpg', 'female.033247.jpg', 'man.029001.jpg', 'man.013014.jpg', 'man.003587.jpg', 'man.044831.jpg', 'man.011367.jpg', 'man.007285.jpg', 'man.032385.jpg', 'man.034075.jpg', 'man.004029.jpg', 'female.012666.jpg', 'man.007497.jpg', 'female.034111.jpg', 'man.013145.jpg', 'female.019901.jpg', 'female.021114.jpg', 'man.030801.jpg', 'man.047652.jpg', 'female.016777.jpg', 'man.001799.jpg', 'man.009267.jpg', 'man.046226.jpg', 'female.018186.jpg', 'man.036304.jpg', 'man.026869.jpg', 'man.010076.jpg', 'female.021539.jpg', 'man.014165.jpg', 'man.046623.jpg', 

array(['image_id', '5_o_Clock_Shadow', 'Arched_Eyebrows', 'Attractive',
       'Bags_Under_Eyes', 'Bald', 'Bangs', 'Big_Lips', 'Big_Nose',
       'Black_Hair', 'Blond_Hair', 'Blurry', 'Brown_Hair',
       'Bushy_Eyebrows', 'Chubby', 'Double_Chin', 'Eyeglasses', 'Goatee',
       'Gray_Hair', 'Heavy_Makeup', 'High_Cheekbones', 'Male',
       'Mouth_Slightly_Open', 'Mustache', 'Narrow_Eyes', 'No_Beard',
       'Oval_Face', 'Pale_Skin', 'Pointy_Nose', 'Receding_Hairline',
       'Rosy_Cheeks', 'Sideburns', 'Smiling', 'Straight_Hair',
       'Wavy_Hair', 'Wearing_Earrings', 'Wearing_Hat', 'Wearing_Lipstick',
       'Wearing_Necklace', 'Wearing_Necktie', 'Young'], dtype=object)
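
Of the 40 attribute columns, `Male` is the one relevant to gender identification. The sketch below counts images per class from that column. It assumes the CSV encodes each attribute as `1` (present) or `-1` (absent); the helper name `gender_counts` and the label strings are illustrative, not part of the original notebook.

```python
import pandas as pd


def gender_counts(df):
    """Count images per gender using the `Male` attribute column.

    Assumption: attributes are encoded as 1 (present) and -1 (absent).
    """
    labels = df["Male"].map({1: "man", -1: "female"})
    return labels.value_counts()


# In the notebook: gender_counts(df)
```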

#### See sample image
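
One possible way to look at a sample is to pick a random `.jpg` from the image folder and display it with matplotlib. This is only a sketch: the helper names `pick_sample` and `show_sample` are hypothetical, and `load_img`, which the notebook imports, would work equally well for reading the file.

```python
import os
import random

import matplotlib.pyplot as plt


def pick_sample(folder, seed=None):
    """Return the path of one randomly chosen .jpg file in `folder`."""
    jpgs = sorted(f for f in os.listdir(folder) if f.lower().endswith(".jpg"))
    return os.path.join(folder, random.Random(seed).choice(jpgs))


def show_sample(path):
    """Display the image at `path`, using its filename as the title."""
    img = plt.imread(path)
    plt.imshow(img)
    plt.title(os.path.basename(path))
    plt.axis("off")
    plt.show()


# In the notebook: show_sample(pick_sample(mypath))
```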

## 4. Build Model

- First, load VGG16 without its dense (top) layers, using the `imagenet` weights. Set the input shape to `(178,218,3)`.
- Freeze all layers except the last two, then print each layer to check whether it is trainable.
- Build your sequential model (you are free to use the functional API as a further exercise). Include the VGG16 layers, frozen as above, in your model. Add a Dense layer with 128 units and `relu` activation. Add a batch normalization layer, then a dense layer as the output layer.
- Create an early stopping criterion that monitors the validation loss. Stop training if the validation loss does not improve for two consecutive epochs.
- Compile the model.
- Save the best model automatically based on its performance on the validation set.
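
The steps above can be sketched as follows. Several details are assumptions rather than part of the exercise text: the `GlobalAveragePooling2D` layer between VGG16 and the Dense layer (chosen because the notebook imports it), a single sigmoid output unit with binary cross-entropy, the Adam learning rate, and the checkpoint file name. The function names `build_model` and `make_callbacks` are hypothetical.

```python
from tensorflow.keras import Input, optimizers
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from tensorflow.keras.layers import BatchNormalization, Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Sequential


def build_model(input_shape=(178, 218, 3), weights="imagenet"):
    """VGG16 base (all but the last two layers frozen) plus a small head."""
    base = VGG16(include_top=False, weights=weights, input_shape=input_shape)

    # Freeze every layer except the last two, then report the result.
    for layer in base.layers[:-2]:
        layer.trainable = False
    for layer in base.layers:
        print(f"{layer.name:<15} trainable={layer.trainable}")

    model = Sequential([
        Input(shape=input_shape),
        base,
        GlobalAveragePooling2D(),        # assumption: pooling before the head
        Dense(128, activation="relu"),
        BatchNormalization(),
        Dense(1, activation="sigmoid"),  # assumption: one unit, binary output
    ])
    model.compile(
        optimizer=optimizers.Adam(learning_rate=1e-4),  # assumed learning rate
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model


def make_callbacks(checkpoint_path="best_model.keras"):
    """Early stopping on val_loss (patience 2) and best-model checkpointing."""
    early_stop = EarlyStopping(monitor="val_loss", patience=2)
    checkpoint = ModelCheckpoint(
        checkpoint_path, monitor="val_loss", save_best_only=True
    )
    return [early_stop, checkpoint]


# In the notebook:
# model = build_model()
# callbacks = make_callbacks()
```

Note that the last two VGG16 layers are a convolution and a pooling layer, and the pooling layer has no weights, so in practice only the final convolution is fine-tuned.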

## 5. Data Preparation

- Create a validation set with 20% of the data. Check the number of data points per class in both the train and validation sets.
- Set your batch size to 20.
- Create the data generator and set the `preprocessing_function` to `preprocess_input` of VGG16.
- Create train and validation data generators (batches will be picked up from the dataframe). Set the target size to `(178,218)` (you can try another size, but you must then make the corresponding change to the model's input shape).
- Set your validation and epoch step sizes (`validation_steps` and `steps_per_epoch`).
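
A minimal sketch of these steps, under a few stated assumptions: each image's class is taken from its filename prefix (`man.` or `female.`, as in the directory listing earlier), the dataframe columns are named `filename` and `category`, the split is stratified, and the generators use `class_mode="binary"`. All helper names are hypothetical.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from tensorflow.keras.applications.vgg16 import preprocess_input
from tensorflow.keras.preprocessing.image import ImageDataGenerator

BATCH_SIZE = 20


def build_dataframe(filenames):
    """Label each .jpg by its prefix, e.g. 'man.033915.jpg' -> 'man'."""
    jpgs = [f for f in filenames if f.lower().endswith(".jpg")]
    return pd.DataFrame(
        {"filename": jpgs, "category": [f.split(".")[0] for f in jpgs]}
    )


def split_train_val(df, val_fraction=0.2, seed=42):
    """Stratified split so both sets keep the same class balance."""
    train_df, val_df = train_test_split(
        df, test_size=val_fraction, stratify=df["category"], random_state=seed
    )
    return train_df.reset_index(drop=True), val_df.reset_index(drop=True)


def make_generators(train_df, val_df, image_dir,
                    batch_size=BATCH_SIZE, target_size=(178, 218)):
    """Create train/validation generators that read batches from dataframes."""
    datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
    common = dict(
        directory=image_dir, x_col="filename", y_col="category",
        target_size=target_size, class_mode="binary", batch_size=batch_size,
    )
    train_gen = datagen.flow_from_dataframe(train_df, shuffle=True, **common)
    val_gen = datagen.flow_from_dataframe(val_df, shuffle=False, **common)
    return train_gen, val_gen


# In the notebook:
# df = build_dataframe(os.listdir(mypath))
# train_df, val_df = split_train_val(df)
# print(train_df["category"].value_counts())
# print(val_df["category"].value_counts())
# train_gen, val_gen = make_generators(train_df, val_df, mypath)
# steps_per_epoch = len(train_df) // BATCH_SIZE
# validation_steps = len(val_df) // BATCH_SIZE
```

Integer division drops the final partial batch from each epoch; rounding up instead is an equally reasonable choice.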

## 6. Train the Model

- Fit the model.
- Save the model.
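
The two steps can be sketched as one small helper, assuming the model is already compiled and that the train and validation generators and their step counts already exist. The function name `train_and_save`, the default of 10 epochs, and the file name `gender_vgg16.keras` are hypothetical.

```python
def train_and_save(model, train_data, val_data, steps_per_epoch,
                   validation_steps, callbacks=None, epochs=10,
                   save_path="gender_vgg16.keras"):
    """Fit a compiled Keras model, save it, and return the training history."""
    history = model.fit(
        train_data,
        steps_per_epoch=steps_per_epoch,
        validation_data=val_data,
        validation_steps=validation_steps,
        epochs=epochs,
        callbacks=callbacks or [],
    )
    # Saves the weights as they are when training ends; with early stopping
    # these may differ from the best checkpoint written during training.
    model.save(save_path)
    return history
```

The returned `history.history` dictionary holds the per-epoch loss and metric values, which is convenient for plotting training against validation loss.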