# Introduction
## Image Processing

An image is composed of pixels. In a grayscale image, every pixel is assigned a value between 0 and 255, and the pixels are arranged in rows and columns, so the image is easy to represent as a matrix, as shown below.

![img](https://i.imgur.com/L5F39c4.jpg?1)

### Color Image
A color image is a matrix that specifies the color of various pixels in terms of the amount of red, green and blue components.

A set of one dot of each color forms a pixel.

![Imgur2](https://visualled.com/wp-content/uploads/2018/08/pitch.jpg)
Every pixel is assigned an RGB value, with each component being a value between 0 and 255.

Example: Red is rgb(255,0,0), yellow is rgb(241,252,23) and white is rgb(255,255,255).
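The RGB values above can be stored directly in a NumPy array. The following is a minimal sketch, assuming a tiny hypothetical 1 x 3 image (the name `pixels` is illustrative and not used elsewhere in this notebook), to show how rows, columns, and color components map to array axes.

```python
import numpy as np

# A hypothetical 1 x 3 RGB image: red, yellow-ish, and white pixels.
pixels = np.array(
    [[[255, 0, 0], [241, 252, 23], [255, 255, 255]]],
    dtype=np.uint8,
)

height, width, channels = pixels.shape  # rows, columns, color components
red_component = pixels[:, :, 0]         # the red value of every pixel

print(pixels.shape)
print(red_component)
```

The last axis holds the color components, so indexing it with 0, 1, or 2 selects the red, green, or blue plane of this array.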



![Imgur1](https://api.intechopen.com/media/chapter/51312/media/fig3.png)

Next, we will see an example of reading an image and storing it in an ndarray.

## OpenCV-Python

* When reading a color image file, OpenCV's imread() returns a NumPy ndarray of shape row (height) x column (width) x color (3). The order of the colors is BGR (blue, green, red).
* Matplotlib's imshow(), on the other hand, assumes that the order of the colors is RGB (red, green, blue).
* To convert between the two, you can use the OpenCV function cvtColor() or simply reverse the order of the last axis of the ndarray.
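As a small illustration of the channel-order point above, the sketch below reverses the last axis of a hypothetical 1 x 2 BGR array (the names `bgr` and `rgb` are illustrative). It uses only NumPy so that it runs without an image file; with OpenCV installed, `cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)` should give the same values.

```python
import numpy as np

# Hypothetical 1 x 2 image in BGR order: a pure blue pixel and a pure red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps the blue and red components (BGR -> RGB).
rgb = bgr[:, :, ::-1]

print(rgb[0, 0])  # the blue pixel expressed in RGB order
print(rgb[0, 1])  # the red pixel expressed in RGB order
```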


**The image below is 768 x 1024 pixels with blue, green, and red components.**

In [None]:
import matplotlib.pyplot as plt
import cv2
img = cv2.imread('../input/image-processing/pichai2.jpg',1)
plt.imshow(img) # imread() returns BGR but imshow() expects RGB, so the colors look swapped here

print("The Dimension of the matrix is ", img.shape)
# print("Matrix value of the blue color \n",img[:,:,0], "\n Y,X Pixels:",img[:,:,0].shape) ## 0 refer blue, 1 refer green and 2 refer red


Now we will convert the BGR components to RGB to see the actual image.

In [None]:
fig=plt.figure(figsize=(20, 20))
fig.add_subplot(1, 4, 1)
#####################################################################
### Converting BGR to RGB
plt.imshow(img[:,:,[2,1,0]])
## im = cv2.cvtColor(img,cv2.COLOR_BGR2RGB) # using cvtcolor function also we can change the color
## plt.imshow(im)

fig.add_subplot(1, 4, 2)
## Select the pixels of the face
plt.imshow(img[40:275,500:750,[2,1,0]])


fig.add_subplot(1, 4, 3)
## Select the pixels of the eyes
plt.imshow(img[120:160,550:600,[2,1,0]])

fig.add_subplot(1, 4, 4)
## Select the pixels of the nose
plt.imshow(img[140:190,600:650,[2,1,0]])
plt.show()

In [None]:
color = ('b','g','r')  # img is in BGR order, so channel 0 is blue and channel 2 is red

for i,col in enumerate(color):
    histogram2 = cv2.calcHist([img],[i],None,[256],[0,256])
    plt.plot(histogram2,color = col)
    plt.xlim([0,256])
plt.show()    

In [None]:
# import tensorflow as tf
# tpu = tf.distribute.cluster_resolver.TPUClusterResolver()
# tf.config.experimental_connect_to_cluster(tpu)
# tf.tpu.experimental.initialize_tpu_system(tpu)

# # instantiate a distribution strategy
# tpu_strategy = tf.distribute.experimental.TPUStrategy(tpu)

In [None]:
main_features = ['left_eye_center_x', 'left_eye_center_y',
            'right_eye_center_x','right_eye_center_y',
            'nose_tip_x', 'nose_tip_y',
            'mouth_center_bottom_lip_x',
            'mouth_center_bottom_lip_y', 'Image']
# from IPython.display import clear_output
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
training = pd.read_csv('../input/facial-keypoints-detection/training.zip')
test = pd.read_csv('../input/facial-keypoints-detection/test.zip')
lookid_data = pd.read_csv('../input/facial-keypoints-detection/IdLookupTable.csv')

# Convolutions

In deep learning you don't directly choose the numbers that go into your convolutions. Instead, the training process determines from the data which convolutions will be useful. We'll come back to how the model does that soon.

![Imgur](https://i.imgur.com/op9Maqr.png)

But looking closely at convolutions and how they are applied to your image will improve your intuition for these models, how they work, and how to debug them when they don't work.

![img4](https://pathmind.com/images/wiki/convgaus.gif)
![img5](https://cdn-media-1.freecodecamp.org/images/gb08-2i83P5wPzs3SL-vosNb6Iur5kb5ZH43)

### Feature Maps
Feature maps are the results we get after applying the filters.<br>

The shape of the feature map is influenced by:<br><br>
    1) Filters/Kernels<br>
    2) Padding<br>
    3) Striding<br>
    
### Filters/Kernels:
![filters](https://miro.medium.com/max/875/1*cfO8kMdUGG-X33TZa2h-vQ.gif)
- The filters are the ___neurons___ of the convolutional layers. 
- They are used for ___feature detection___. 
- They are usually represented as a square matrix. 

The initial weights of the filters are assigned randomly, and during the training phase these weights are updated through backpropagation.<br>
<br>

Examples of image after we apply a filter:<br>
![img5](https://i.imgur.com/6KrzpuC.jpg)
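To make the effect of a filter concrete, here is a minimal sketch that slides a 3 x 3 vertical-edge filter over a toy image by hand. The helper `apply_filter` and the toy arrays are hypothetical names introduced for this illustration; like CNN layers, the helper does not flip the kernel.

```python
import numpy as np

def apply_filter(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    k_rows, k_cols = kernel.shape
    out_rows = image.shape[0] - k_rows + 1
    out_cols = image.shape[1] - k_cols + 1
    out = np.zeros((out_rows, out_cols))
    for i in range(out_rows):
        for j in range(out_cols):
            out[i, j] = np.sum(image[i:i + k_rows, j:j + k_cols] * kernel)
    return out

# Toy 5 x 5 image: bright on the left (1), dark on the right (0).
toy_image = np.array([[1, 1, 1, 0, 0]] * 5, dtype=float)
vertical_kernel = np.array([[1, 0, -1], [1, 0, -1], [1, 0, -1]], dtype=float)

feature_map = apply_filter(toy_image, vertical_kernel)
print(feature_map)  # each row is [0. 3. 3.]
```

The response is zero where the window covers a flat region and non-zero where it overlaps the bright-to-dark edge, which is what "feature detection" means for this filter.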


    
#### Striding
   The stride is the amount by which the filter moves between successive applications to the input image. It is almost always the same in the height and width dimensions.<br>
   The default stride in two dimensions is (1,1), meaning the filter moves one pixel at a time in both height and width. This default works well in most cases.<br>
<br>
The stride can be changed, which has an effect both on how the filter is applied to the image and, in turn, the size of the resulting feature map.
![imgstr](https://miro.medium.com/max/588/1*BMngs93_rm2_BpJFH2mS0Q.gif)

#### Padding:


   Without padding, the pixels on the edge of the input are only ever exposed to the edge of the filter. Starting the filter outside the frame of the image gives the border pixels more opportunities to interact with the filter, and so more opportunities for their features to be detected. With enough padding, the output feature map has the same shape as the input image. This process of adding extra rows and columns around the image is known as padding.<br>
   Padding is extremely useful when the input dimensions are small and when we don't want to lose information at the borders.<br>

![padding](https://miro.medium.com/max/430/1*KGrCz7aav02KoGuO6znO0w.gif)
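The combined effect of filter size, padding, and stride on the feature-map size can be summarized with the standard formula floor((n + 2p - k) / s) + 1 for each dimension. The helper below is a small sketch of that formula; `feature_map_size` is a hypothetical name and it assumes the same padding on both sides.

```python
def feature_map_size(n, k, padding=0, stride=1):
    """Output length along one dimension for input length n and kernel length k."""
    return (n + 2 * padding - k) // stride + 1

# A 96-pixel side with a 3 x 3 filter and a padding of 1 on every border.
for stride in (1, 2, 3):
    print(stride, feature_map_size(96, 3, padding=1, stride=stride))
```

With a stride of 1 the padded output keeps the 96-pixel side, while strides of 2 and 3 shrink it to 48 and 32.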

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import cv2
import sys, requests, shutil, os
from urllib import request, error


def fetch_image(image_url,name):
    img_data = requests.get(image_url).content
    with open(f'../input/{name}', 'wb') as handler:
        handler.write(img_data)

def plot_image(image):
    plt.imshow(image,cmap='gray')
    plt.xticks([])
    plt.yticks([])

    
def pooling(image, kernel_shape=3):
    #showing max_pooling
    print ("shape before pooling",image.shape)
    y, x = image.shape
    new_image = []
    for i in range(0,y,kernel_shape):
        temp = []
        for j in range(0,x,kernel_shape):
            temp.append(np.max(image[i:i+kernel_shape, j:j+kernel_shape]))
        new_image.append(temp)
    new_image = np.array(new_image)
    print ("shape after pooling",new_image.shape)
    return (new_image)

def padding(image,top=1,bottom=1,left=1,right=1,values=0):
  # Create new rows/columns in the matrix and fill those with some values
  #return cv2.copyMakeBorder(image,top,bottom,left,right,cv2.BORDER_CONSTANT,value=values)
    
    x,y = image.shape
    #print (image.shape)
    arr = np.full((x+top+bottom,y+left+right),values,dtype=float)
    #print(image[0])
    #print (arr.shape)
    #print (top,x-bottom)
    #print (y,y-bottom)
    arr[top:x+top,left:y+left] = image
    #print(arr[top])
    return arr

def convolution2d(image, kernel, bias=0,strid=1,pad_val=()):
    # Padding, striding and convolution in one function.
    # Like CNN layers, the kernel is slid over the image without being flipped.
    print ("shape before padding/striding",image.shape)
    # pad_val = (top, bottom, left, right, fill value); an empty tuple uses padding()'s defaults
    image = padding(image,*pad_val)
    m, n = kernel.shape  # kernel rows, kernel columns
    y, x = image.shape
    y = y - m + 1  # number of valid vertical positions
    x = x - n + 1  # number of valid horizontal positions
    new_image = []
    for i in range(0,y,strid):
        temp = []
        for j in range(0,x,strid):
            temp.append(np.sum(image[i:i+m, j:j+n]*kernel) + bias)
        new_image.append(temp)
    new_image = np.array(new_image)
    print ("shape after padding/striding",new_image.shape)
    return (new_image)

### Converting pixel value into image

Each image is stored as a string of space-separated pixel values. Below, we reshape those values into a 96 x 96 matrix and display it.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import sys
np.set_printoptions(threshold=sys.maxsize)
img_txt_input = training['Image'][0]
n_pixels = len(img_txt_input.split(' '))
side = int(np.sqrt(n_pixels))  # the image is square, so rows == columns
print("No of pixel values is :",n_pixels,"Converting the pixel values into rows and column:",side,"*",side,"\n")
fn_reshape = lambda a: np.fromstring(a, dtype=int, sep=' ').reshape(96,96)
img = fn_reshape(img_txt_input)
print("Below is the pixel value converted into an image")
plt.imshow(img,cmap='gray')
plt.show()

### Applying Filters on image

In [None]:
samp_imag = img.copy()
samp_imag = samp_imag/255.
h_kernal = np.array([[1,1,1],[0,0,0],[-1,-1,-1]])
v_kernal = np.array([[1,0,-1],[1,0,-1],[1,0,-1]])
h_image = cv2.filter2D(samp_imag,-1, h_kernal)
v_image = cv2.filter2D(samp_imag,-1, v_kernal)
#Laplacian filter
lap_filter = np.array([[0,1,0],[1,-4,1],[0,1,0]])
lap_image = cv2.filter2D(samp_imag,-1, lap_filter)

print ("Shapes before applying the filter:{} ".format(samp_imag.shape))
print ("Shapes after applying the filter:{} ".format(lap_image.shape))
plt.figure(figsize=(10,10))
plt.suptitle("Image and its transformations after applying filters")
plt.subplot(221)
plt.title("Actual Gray Scale Image")
plot_image(samp_imag)
plt.subplot(222)
plt.title("Horizontal Filter applied")
plot_image(h_image)
plt.subplot(223)
plt.title("Vertical Filter applied")
plot_image(v_image)
plt.subplot(224)
plt.title("Laplacian Filter applied")
plot_image(lap_image)
plt.show()

### Applying padding on image

In [None]:
# print ("shape of actual image: {}".format(samp_imag.shape))
padded_image_5 = padding(samp_imag,*(5,5,5,5,1))
padded_image_10 = padding(samp_imag,*(10,10,10,10,0))
padded_bimage_10 = padding(samp_imag,*(10,10,10,10,1))
# # print ("shape of padded image: {}".format(padded_image.shape))

plt.figure(figsize=(10,10))
plt.suptitle("Padding")
plt.subplot(2,2,1)
plt.imshow(samp_imag,cmap='gray')
plt.title("Actual Image")
plt.subplot(2,2,2)
plt.imshow(padded_image_5,cmap='gray')
plt.title("Padding 5 border with white")
plt.subplot(2,2,3)
plt.imshow(padded_image_10,cmap='gray')
plt.title("Padding 10 border with black")
plt.subplot(2,2,4)
plt.imshow(padded_bimage_10,cmap='gray')
plt.title("Padding 10 border with white")
plt.show()

### Applying Striding on image

In [None]:
print ("Padding used is 1 for all the borders\nVertical filter is used")
fig, ax = plt.subplots(2, 2,figsize=(10,10))
plt.suptitle("Effect of Stride")
ax[0,0].set_title("Actual Image")
ax[0,0].imshow(samp_imag,cmap='gray')
rave = ax.ravel()
for i in range(1,4):
    print ("")
    print(f"striding value = {i}")
    custom_conv = convolution2d(samp_imag,v_kernal,strid=i,pad_val=(1,1,1,1,0))
    rave[i].set_title(f"striding value = {i}")
    rave[i].imshow(custom_conv,cmap='gray')
#print (custom_conv.shape)
#plt.imshow(custom_conv,cmap='gray')

## Pooling Layer
The pooling layer progressively reduces the spatial size of the representation, which reduces the number of parameters and the amount of computation in the network.<br>
The pooling layer operates on each feature map independently.<br>
The most common approach used in pooling is max pooling.<br>

![pooling](https://miro.medium.com/max/875/1*jU_Mp73fXzh9_ffvtnbrDQ.png)
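As a worked example of max pooling, the sketch below reduces a hypothetical 4 x 4 feature map to 2 x 2 by taking the maximum of each non-overlapping 2 x 2 block. The array values and the reshape-based approach are illustrative choices, not the `pooling` helper defined earlier in this notebook.

```python
import numpy as np

# Hypothetical 4 x 4 feature map.
feature_map = np.array([
    [1, 3, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
])

# Split into 2 x 2 blocks: axes are (block row, row in block, block col, col in block),
# then take the maximum inside every block.
pooled = feature_map.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)
# [[6 8]
#  [3 4]]
```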

### Applying MaxPooling on image

In [None]:
print ("Pooling example")
fig, ax = plt.subplots(2, 2,figsize=(10,10))
plt.suptitle("Effect of Pooling")
ax[0,0].set_title("Actual Image")
ax[0,0].imshow(samp_imag,cmap='gray')
rave = ax.ravel()
for i in range(2,5):
    print ("")
    print(f"Pooling and striding value = {i}")
    custom_conv = pooling(samp_imag,i)
    rave[i-1].set_title(f"pooling size = {i}")
    rave[i-1].imshow(custom_conv,cmap='gray')
#print (custom_conv.shape)
#plt.imshow(custom_conv,cmap='gray')

# Facial Keypoints Detection competition
The goal of the competition is to locate specific keypoints on face images: build a model that, given an image of a face, automatically predicts where these keypoints are.

## Given
1. training.csv - It contains the (x,y) coordinates of 15 facial keypoints (30 columns, covering both left and right features) and the pixel values of the images.
2. test.csv - It contains pixel values of images
3. IdLookupTable.csv - It contains required Feature Names along with ImageId for submission.

**Important points:** In total, we have 7049 rows, each one with 31 columns. The first 30 columns are keypoint locations, which pandas correctly identified as numbers. The last one is a string representation of the image, identified as a string.

In [None]:
train_columns = training.columns[:-1].values
training.head().T

In [None]:
test.head()

## Exploring Data

In [None]:
training[training.columns[:-1]].describe(percentiles = [0.05,0.1,.25, .5, .75,0.9,0.95]).T

## Missing Data and outliers

In [None]:
whisker_width = 1.5
total_rows = training.shape[0]
missing_col = 0
for col in training[training.columns[:-1]]:
    count = training[col].count()
    q1 = training[col].quantile(0.25)
    q3 = training[col].quantile(0.75)
    iqr = q3 - q1
    outliers = training[(training[col] < q1 - whisker_width*iqr)
                       | (training[col] > q3 + whisker_width*iqr)][col].count()
    print (f"dv:{col}, dv_rows:{count}, missing_pct:{round(100.*(1-count/total_rows),2)}%, outliers:{outliers}, outlier_pct:{round(100.*outliers/count,2)}%")
    if (100.*(1-count/total_rows)>65):
        missing_col+=1

print(f"DVs containing more than 65% of data missing : {missing_col} out of {len(training.columns[:-1])}")

From the above analysis we can see that around 2-5% of the values in each column are flagged as outliers.
For the detailed DVs (dependent variables) such as mouth_right_corner, left_eyebrow_outer_end_y etc., more than 65% of the values are missing.
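The outlier count above relies on the 1.5 x IQR whisker rule. The following is a minimal sketch of that rule on a small made-up sample (the `values` array is hypothetical and not taken from the competition data).

```python
import numpy as np

# Hypothetical sample with one clearly extreme value.
values = np.array([10., 11., 12., 13., 14., 15., 16., 17., 18., 60.])

q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1
lower = q1 - 1.5 * iqr  # values below this are flagged
upper = q3 + 1.5 * iqr  # values above this are flagged

outliers = values[(values < lower) | (values > upper)]
print(lower, upper, outliers)
```

Only the value 60 falls outside the whiskers here; every other point lies between the lower and upper bounds.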

In [None]:
def plot_loss(hist,name,plt,RMSE_TF=False):
    '''
    RMSE_TF: if True, then RMSE is plotted with original scale 
    '''
    loss = hist['loss']
    val_loss = hist['val_loss']
    if RMSE_TF:
        loss = np.sqrt(np.array(loss))*48 
        val_loss = np.sqrt(np.array(val_loss))*48 
        
    plt.plot(loss,"--",linewidth=3,label="train:"+name)
    plt.plot(val_loss,linewidth=3,label="val:"+name)

def plot_sample_val(X,y,axs,pred):
    '''
    kaggle picture is 96 by 96
    y is rescaled to range between -1 and 1
    '''
    
    axs.imshow(X.reshape(96,96),cmap="gray")
    axs.scatter(48*y[0::2]+ 48,48*y[1::2]+ 48, label='Actual')
    axs.scatter(48*pred[0::2]+ 48,48*pred[1::2]+ 48, label='Prediction')

def plot_sample(X,y,axs):
    '''
    kaggle picture is 96 by 96
    y is rescaled to range between -1 and 1
    '''
    
    axs.imshow(X.reshape(96,96),cmap="gray")
    axs.scatter(48*y[0::2]+ 48,48*y[1::2]+ 48)

### Get only the data with keypoints

### Manually splitting the training and validation data

In [None]:
def data_loader(data_frame):
    
    # Load dataset file
   
    data_frame['Image'] = data_frame['Image'].apply(lambda i: np.fromstring(i, sep=' '))
    data_frame = data_frame.dropna()  # Get only the data with 15 keypoints
   
    # Extract image pixel values and scale them to the range (0, 1)
    imgs_array = np.vstack(data_frame['Image'].values)/ 255.0
    imgs_array = imgs_array.astype(np.float32)
    imgs_array = imgs_array.reshape(-1,96, 96, 1)
        
    # Extract labels (keypoint coordinates)
    labels_array = data_frame[data_frame.columns[:-1]].values
    labels_array = (labels_array - 48) / 48    # Normalize target coordinates to (-1, 1)
    labels_array = labels_array.astype(np.float32) 
    
    # shuffle the train data
#     imgs_array, labels_array = shuffle(imgs_array, labels_array, random_state=9)  
    
    return imgs_array, labels_array

def data_loader_test(data_frame):
    
    # Load dataset file
   
    data_frame['Image'] = data_frame['Image'].apply(lambda i: np.fromstring(i, sep=' '))
  
    # Extract image pixel values and scale them to the range (0, 1)
    imgs_array = np.vstack(data_frame['Image'].values)/ 255.0
    imgs_array = imgs_array.astype(np.float32)
    imgs_array = imgs_array.reshape(-1, 96, 96, 1)
    
    return imgs_array


X,Y = data_loader(training)
X_test = data_loader_test(test)

In [None]:
X_train, X_val, y_train, y_val = train_test_split(X, Y, test_size=0.2, random_state=42)
print("Train sample:",X_train.shape,"Val sample:",X_val.shape)

# Data augmentation

Image data augmentation is a technique that can be used to artificially expand the size of a training dataset by creating modified versions of images in the dataset.

Training deep learning neural network models on more data can result in more skillful models, and the augmentation techniques can create variations of the images that can improve the ability of the fit models to generalize what they have learned to new images.

The Keras deep learning neural network library provides the capability to fit models using image data augmentation via the ImageDataGenerator class.

### Horizontal and Vertical Flip Augmentation

An image flip means reversing the rows or columns of pixels in the case of a vertical or horizontal flip respectively.
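For keypoint detection, a horizontal flip also has to be applied to the labels. The sketch below assumes keypoint coordinates normalized to (-1, 1) around the image center, with a hypothetical label vector holding only two eye centers. The x coordinates change sign, and the left and right labels trade places.

```python
import numpy as np

# Hypothetical 96 x 96 x 1 image and labels: [left_eye_x, left_eye_y, right_eye_x, right_eye_y].
image = np.arange(96 * 96, dtype=np.float32).reshape(96, 96, 1)
keypoints = np.array([0.4, -0.2, -0.3, -0.1], dtype=np.float32)

# Horizontal flip: reverse the column axis of the image.
flipped_image = image[:, ::-1, :]

# Mirror the x coordinates (even positions) around the center.
flipped_keypoints = keypoints.copy()
flipped_keypoints[0::2] *= -1

# After mirroring, the old left eye is the new right eye and vice versa.
flipped_keypoints = flipped_keypoints[[2, 3, 0, 1]]
print(flipped_keypoints)
```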

### Horizontal and Vertical Shift Augmentation

A shift to an image means moving all pixels of the image in one direction, such as horizontally or vertically, while keeping the image dimensions the same.

Flipping can double the number of pictures. If we also allow the pictures to shift by a few pixels within the frame, the number of pictures can increase substantially!
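A shift must move the keypoints by the same amount as the pixels. The sketch below shifts a hypothetical image five columns to the right and updates a keypoint that is normalized to (-1, 1), where one pixel corresponds to 1/48. For simplicity it fills the vacated columns with zeros, which is an assumption of this example rather than a requirement of the technique.

```python
import numpy as np

# Hypothetical image with a single bright pixel at row 40, column 50.
image = np.zeros((96, 96, 1), dtype=np.float32)
image[40, 50, 0] = 1.0
keypoint = np.array([(50 - 48) / 48, (40 - 48) / 48])  # normalized (x, y)

shift_cols = 5  # shift to the right by 5 columns
shifted = np.zeros_like(image)
shifted[:, shift_cols:, :] = image[:, :-shift_cols, :]

# The x coordinate moves by shift_cols pixels, i.e. shift_cols / 48 in normalized units.
shifted_keypoint = keypoint + np.array([shift_cols / 48, 0.0])

print(np.argwhere(shifted[:, :, 0] == 1.0))  # new pixel position (row, column)
print(shifted_keypoint * 48 + 48)            # keypoint back in pixel units
```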

In [None]:
class DataModifier(object):
    def fit(self,X_,y_):
        raise NotImplementedError
    
class FlipPic(DataModifier):
    def __init__(self,flip_indices=None):
        if flip_indices is None:
            flip_indices = [
                (0, 2), (1, 3),
                (4, 8), (5, 9), (6, 10), (7, 11),
                (12, 16), (13, 17), (14, 18), (15, 19),
                (22, 24), (23, 25)
                ]
        
        self.flip_indices = flip_indices
        
    def fit(self,X_batch,y_batch):

        batch_size = X_batch.shape[0]
        indices = np.random.choice(batch_size, batch_size//2, replace=False)

        X_batch[indices] = X_batch[indices, :, ::-1,:]
        y_batch[indices, ::2] = y_batch[indices, ::2] * -1

        # flip left eye to right eye, left mouth to right mouth and so on .. 
        for a, b in self.flip_indices:
            y_batch[indices, a], y_batch[indices, b] = (
                    y_batch[indices, b], y_batch[indices, a]
                )
        return X_batch, y_batch

class FlipPic_8(DataModifier):
    def __init__(self,flip_indices=None):
        if flip_indices is None:
            flip_indices = [
                (0, 2), (1, 3)
                ]
        
        self.flip_indices = flip_indices

    def fit(self,X_batch,y_batch):

        batch_size = X_batch.shape[0]
        indices = np.random.choice(batch_size, batch_size//2, replace=False)

        X_batch[indices] = X_batch[indices, :, ::-1,:]
        y_batch[indices, ::2] = y_batch[indices, ::2] * -1

        # flip left eye to right eye, left mouth to right mouth and so on .. 
        for a, b in self.flip_indices:
            y_batch[indices, a], y_batch[indices, b] = (
                    y_batch[indices, b], y_batch[indices, a]
                )
        return X_batch, y_batch

In [None]:
class ShiftFlipPic(FlipPic):
    def __init__(self,flip_indices=None,prop=0.1):
        super(ShiftFlipPic,self).__init__(flip_indices)
        self.prop = prop
        
    def fit(self,X,y):
        X, y = super(ShiftFlipPic,self).fit(X,y)
        X, y = self.shift_image(X,y,prop=self.prop)
        return(X,y)
    def random_shift(self,shift_range,n=96):
        '''
        :param shift_range: 
        The maximum number of columns/rows to shift
        :return: 
        keep(0):   minimum row/column index to keep
        keep(1):   maximum row/column index to keep
        assign(0): minimum row/column index to assign
        assign(1): maximum row/column index to assign
        shift:     amount to shift the landmark

        assign(1) - assign(0) == keep(1) - keep(0)
        '''
        shift = np.random.randint(-shift_range,
                                  shift_range)
        def shift_left(n,shift):
            shift = np.abs(shift)
            return(0,n - shift)
        def shift_right(n,shift):
            shift = np.abs(shift)
            return(shift,n)

        if shift < 0:
            keep = shift_left(n,shift) 
            assign = shift_right(n,shift)
        else:
            assign = shift_left(n,shift) ## less than 96
            keep = shift_right(n,shift)

        return((keep,  assign, shift))

    def shift_single_image(self,x_,y_,prop=0.1):
        '''
        :param x_: a single picture array (96, 96, 1)
        :param y_: 15 landmark locations 
                   [0::2] contains x axis values
                   [1::2] contains y axis values 
        :param prop: proportion of random horizontal and vertical shift
                     relative to the number of columns
                     e.g. prop = 0.1 then the picture is moved by at most 
                     int(0.1*96) = 9 columns/rows
        :return: 
        x_, y_
        '''
        w_shift_max = int(x_.shape[0] * prop)
        h_shift_max = int(x_.shape[1] * prop)

        w_keep,w_assign,w_shift = self.random_shift(w_shift_max)
        h_keep,h_assign,h_shift = self.random_shift(h_shift_max)

        x_[w_assign[0]:w_assign[1],
           h_assign[0]:h_assign[1],:] = x_[w_keep[0]:w_keep[1],
                                           h_keep[0]:h_keep[1],:]

        y_[0::2] = y_[0::2] - h_shift/float(x_.shape[0]/2.)
        y_[1::2] = y_[1::2] - w_shift/float(x_.shape[1]/2.)
        return(x_,y_)

    def shift_image(self,X,y,prop=0.1):
            ## This function may be modified to be more efficient e.g. get rid of loop?
            for irow in range(X.shape[0]):
                x_ = X[irow]
                y_ = y[irow]
                X[irow],y[irow] = self.shift_single_image(x_,y_,prop=prop)
            return(X,y)

### Flipping pictures

In [None]:
from keras.preprocessing.image import ImageDataGenerator

generator = ImageDataGenerator()
modifier = FlipPic_8()
fig = plt.figure(figsize=(20,20))
count = 1
for batch in generator.flow(X[:4],Y[:4]):
    X_batch, y_batch = modifier.fit(*batch)
    ax = fig.add_subplot(5,4, count,xticks=[],yticks=[])  
    plot_sample(X_batch[0],y_batch[0],ax)
    count += 1
    if count == 10:
        break
plt.show()

### Shifting pictures

In [None]:
from keras.preprocessing.image import ImageDataGenerator
generator = ImageDataGenerator()
shiftFlipPic = ShiftFlipPic(prop=0.1)

fig = plt.figure(figsize=(20,20))

count = 1
for batch in generator.flow(X[:4],Y[:4]):
    X_batch, y_batch = shiftFlipPic.fit(*batch)

    ax = fig.add_subplot(5,4, count,xticks=[],yticks=[])  
    plot_sample(X_batch[0],y_batch[0],ax)
    count += 1
    if count == 10:
        break
plt.show()

In [None]:
from math import sin, cos, pi

def rotate_augmentation(images, keypoints):
    rotated_images = []
    rotated_keypoints = []
#     print("Augmenting for angles (in degrees): ")
    for angle in rotation_angles:    # Rotation augmentation for a list of angle values
        for angle in [angle,-angle]:
#             print(f'{angle}', end='  ')
            M = cv2.getRotationMatrix2D((48,48), angle, 1.0)
            angle_rad = -angle*pi/180.     # Obtain angle in radians from angle in degrees (notice negative sign for change in clockwise vs anti-clockwise directions from conventional rotation to cv2's image rotation)
            # For train_images
            for image in images:
                rotated_image = cv2.warpAffine(image, M, (96,96), flags=cv2.INTER_CUBIC)
                rotated_images.append(rotated_image)
            # For train_keypoints
            for keypoint in keypoints:
                rotated_keypoint = keypoint.copy()    # Keypoints are already centered: (0, 0) is the middle of the image
                for idx in range(0,len(rotated_keypoint),2):
                    # https://in.mathworks.com/matlabcentral/answers/93554-how-can-i-rotate-a-set-of-points-in-a-plane-by-a-certain-angle-about-an-arbitrary-point
                    # Read both coordinates first so that the new y is computed from the original x
                    x_old, y_old = rotated_keypoint[idx], rotated_keypoint[idx+1]
                    rotated_keypoint[idx] = x_old*cos(angle_rad)-y_old*sin(angle_rad)
                    rotated_keypoint[idx+1] = x_old*sin(angle_rad)+y_old*cos(angle_rad)
                rotated_keypoints.append(rotated_keypoint)

    return np.reshape(rotated_images,(-1,96,96,1)), rotated_keypoints

rotation_angles=[6,12] # Rotation angle in degrees (includes both clockwise & anti-clockwise rotations)



def alter_brightness(images, keypoints):
    altered_brightness_images = []
    inc_brightness_images = np.clip(images*2, 0.0, 1.0)    # Increase brightness by a factor of 2 & clip any values outside the range [0, 1]
    dec_brightness_images = np.clip(images*0.1, 0.0, 1.0)    # Reduce brightness to 0.1 times the original & clip any values outside the range [0, 1]
    altered_brightness_images.extend(inc_brightness_images)
    altered_brightness_images.extend(dec_brightness_images)
    return altered_brightness_images, np.concatenate((keypoints, keypoints))


def add_blur(images,keypoints,blur_val=3):
    noisy_images = []
    for image in images:
#         kernel = np.ones((5,5),np.float32)/25
        image = cv2.blur(image, (blur_val, blur_val), borderType=cv2.BORDER_DEFAULT).reshape(96,96,1)
        noisy_images.append(image)
    return noisy_images, keypoints

def add_noise(images,keypoints,noise_val):
    noisy_images = []
    for image in images:
#         kernel = np.ones((5,5),np.float32)/25
        # Zero-mean Gaussian noise; noise_val is assumed to be the standard deviation (images are scaled to [0, 1])
        gauss = np.random.normal(0,noise_val,image.shape).astype(np.float32)
        # Add the Gaussian noise to the current image and keep the values in [0, 1]
        img_gauss = np.clip(image + gauss, 0.0, 1.0)
        noisy_images.append(img_gauss.reshape(96,96,1))
    return noisy_images, keypoints

### Blurring pictures

In [None]:
from keras.preprocessing.image import ImageDataGenerator
generator = ImageDataGenerator()

fig = plt.figure(figsize=(20,20))

count = 1
for batch in generator.flow(X[:4],Y[:4]):
    X_batch, y_batch = add_blur(*batch,3)
    ax = fig.add_subplot(5,4, count,xticks=[],yticks=[])  
    plot_sample(X_batch[0],y_batch[0],ax)
    count += 1
    if count == 10:
        break
plt.show()

### Image Augmentation With ImageDataGenerator

In [None]:
modifier = FlipPic()
generator = ImageDataGenerator()
shiftFlipPic_1 = ShiftFlipPic(prop=0.02)
shiftFlipPic_2 = ShiftFlipPic(prop=0.03)
shiftFlipPic_3 = ShiftFlipPic(prop=0.04)
shiftFlipPic_4 = ShiftFlipPic(prop=0.05)
shiftFlipPic_5 = ShiftFlipPic(prop=0.06)
shiftFlipPic_6 = ShiftFlipPic(prop=0.07)
shiftFlipPic_7 = ShiftFlipPic(prop=0.1)
batches = 0
# Augmentations applied to every batch, in order (repeats are intentional and kept as-is)
augmentations = (
    [lambda X_, y_, k=k: add_blur(X_, y_, k) for k in (5, 3, 2)]
    + [rotate_augmentation] * 4
    + [alter_brightness] * 4
    + [modifier.fit] * 4
    + [shiftFlipPic_1.fit, shiftFlipPic_2.fit, shiftFlipPic_3.fit, shiftFlipPic_4.fit]
    # disabled: shiftFlipPic_5.fit, shiftFlipPic_6.fit, shiftFlipPic_7.fit
)
for batch in generator.flow(X_train,y_train):
    for augment in augmentations:
        X_batch, y_batch = augment(*batch)
        y_train = np.concatenate((y_train,y_batch))
        X_train = np.concatenate((X_train,X_batch))
    batches += 1
    if batches >= 6:
#         len(X_train) / 32
        # we need to break the loop by hand because
        # the generator loops indefinitely
        break  

print("Train sample:",X_train.shape,"Val sample:",X_val.shape)

### Custom Metrics in Keras

Kaggle submissions are scored on the root mean squared error (RMSE). RMSE is very common and is a suitable general-purpose error metric. Compared to the mean absolute error (MAE), RMSE punishes large errors more heavily:

![rmse_img](https://www.includehelp.com/ml-ai/Images/rmse-1.jpg)
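To see how RMSE weights large errors more than MAE, here is a small NumPy sketch on made-up predictions (the arrays are hypothetical). Three errors of size 1 and one error of size 5 give an MAE of 2, while the RMSE is the square root of 7.

```python
import numpy as np

# Hypothetical targets and predictions: three small errors and one large error.
y_true = np.array([0.0, 0.0, 0.0, 0.0])
y_pred = np.array([1.0, 1.0, 1.0, 5.0])

errors = y_pred - y_true
mae = np.mean(np.abs(errors))            # (1 + 1 + 1 + 5) / 4 = 2.0
rmse_value = np.sqrt(np.mean(errors ** 2))  # sqrt((1 + 1 + 1 + 25) / 4) = sqrt(7)

print(mae, rmse_value)
```

The single large error pulls the RMSE well above the MAE, which is why a model scored on RMSE is penalized more for occasional badly misplaced keypoints.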

In [None]:
from keras import backend
 
def rmse(y_true, y_pred):
    return backend.sqrt(backend.mean(backend.square(y_pred - y_true), axis=-1))

In [None]:

from keras.models import Sequential, Model
from keras.layers import Activation, TimeDistributed,Convolution2D, MaxPooling2D, BatchNormalization, AvgPool2D, Flatten, Dense, Dropout, Conv2D,MaxPool2D, ZeroPadding2D,GRU, LSTM
from keras.optimizers import Adam, SGD
from keras import regularizers
import keras.backend as K
from keras.initializers import GlorotNormal
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau, EarlyStopping
import tensorflow as tf
import numpy as np
from keras.layers.advanced_activations import LeakyReLU


### Define simple CNN function using Keras

In [None]:
def cnn(n_out):
    model = Sequential()
    model.add(TimeDistributed(Convolution2D(32, (3,3), padding='same', use_bias=False, kernel_initializer=GlorotNormal, activation='relu',input_shape=(96,96,1))))
    model.add(TimeDistributed(LeakyReLU(alpha = 0.1)))
    model.add(TimeDistributed(BatchNormalization()))
    # Input dimensions: (None, 96, 96, 32)
    model.add(TimeDistributed(Convolution2D(32, (3,3), padding='same', use_bias=False, kernel_initializer=GlorotNormal)))
    model.add(TimeDistributed(LeakyReLU(alpha = 0.1)))
    model.add(TimeDistributed(BatchNormalization()))
#     model.add(TimeDistributed(MaxPool2D(pool_size=(2, 2))))

    # Input dimensions: (None, 96, 96, 32)
    model.add(TimeDistributed(Convolution2D(64, (3,1), padding='same', use_bias=False, kernel_initializer=GlorotNormal)))
    model.add(TimeDistributed(LeakyReLU(alpha = 0.1)))
    model.add(TimeDistributed(BatchNormalization()))
    # Input dimensions: (None, 96, 96, 64)
    model.add(TimeDistributed(Convolution2D(64, (1,3), padding='same', use_bias=False, kernel_initializer=GlorotNormal)))
    model.add(TimeDistributed(LeakyReLU(alpha = 0.1)))
    model.add(TimeDistributed(BatchNormalization()))
#     model.add(TimeDistributed(MaxPool2D(pool_size=(2, 2))))

    # Input dimensions: (None, 96, 96, 64)
    model.add(TimeDistributed(Convolution2D(96, (3,1), padding='same', use_bias=False, kernel_initializer=GlorotNormal)))
    model.add(TimeDistributed(LeakyReLU(alpha = 0.1)))
    model.add(TimeDistributed(BatchNormalization()))
    # Input dimensions: (None, 96, 96, 96)
    model.add(TimeDistributed(Convolution2D(96, (1,3), padding='same', use_bias=False, kernel_initializer=GlorotNormal)))
    model.add(TimeDistributed(LeakyReLU(alpha = 0.1)))
    model.add(TimeDistributed(BatchNormalization()))
#     model.add(TimeDistributed(MaxPool2D(pool_size=(2, 2))))

    # Input dimensions: (None, 96, 96, 96)
    model.add(TimeDistributed(Convolution2D(128, (3,1),padding='same', use_bias=False, kernel_initializer=GlorotNormal)))
    model.add(TimeDistributed(LeakyReLU(alpha = 0.1)))
    model.add(TimeDistributed(BatchNormalization()))
    # Input dimensions: (None, 96, 96, 128)
    model.add(TimeDistributed(Convolution2D(128, (1,3),padding='same', use_bias=False, kernel_initializer=GlorotNormal)))
    model.add(TimeDistributed(LeakyReLU(alpha = 0.1)))
    model.add(TimeDistributed(BatchNormalization()))
    model.add(TimeDistributed(MaxPool2D(pool_size=(2, 2))))

    # Input dimensions: (None, 48, 48, 128)
    model.add(TimeDistributed(Convolution2D(256, (3,1),padding='same',use_bias=False, kernel_initializer=GlorotNormal)))
    model.add(TimeDistributed(LeakyReLU(alpha = 0.1)))
    model.add(TimeDistributed(BatchNormalization()))
    # Input dimensions: (None, 48, 48, 256)
    model.add(TimeDistributed(Convolution2D(256, (1,3),padding='same',use_bias=False, kernel_initializer=GlorotNormal)))
    model.add(TimeDistributed(LeakyReLU(alpha = 0.1)))
    model.add(TimeDistributed(BatchNormalization()))
    model.add(TimeDistributed(MaxPool2D(pool_size=(2, 2))))

    # Input dimensions: (None, 24, 24, 256)
    model.add(TimeDistributed(Convolution2D(512, (3,1), padding='same', use_bias=False, kernel_initializer=GlorotNormal)))
    model.add(TimeDistributed(LeakyReLU(alpha = 0.1)))
    model.add(TimeDistributed(BatchNormalization()))
    # Input dimensions: (None, 24, 24, 512)
    model.add(TimeDistributed(Convolution2D(512, (1,3), padding='same', use_bias=False, kernel_initializer=GlorotNormal)))
    model.add(TimeDistributed(LeakyReLU(alpha = 0.1)))
    model.add(TimeDistributed(BatchNormalization()))
    model.add(TimeDistributed(MaxPool2D(pool_size=(2, 2))))

    # Input dimensions: (None, 12, 12, 512)
    model.add(TimeDistributed(Convolution2D(64, (3,1), padding='same', use_bias=False, kernel_initializer=GlorotNormal)))
    model.add(TimeDistributed(LeakyReLU(alpha = 0.1)))
    model.add(TimeDistributed(BatchNormalization()))
    # Input dimensions: (None, 12, 12, 64)
    model.add(TimeDistributed(Convolution2D(64, (1,3), padding='same', use_bias=False, kernel_initializer=GlorotNormal)))
    model.add(TimeDistributed(LeakyReLU(alpha = 0.1)))
    model.add(TimeDistributed(BatchNormalization()))
#     model.add(TimeDistributed(MaxPool2D(pool_size=(2, 2))))
    
    # Input dimensions: (None, 12, 12, 64)
    model.add(TimeDistributed(Convolution2D(32, (3,1), padding='same', use_bias=False, kernel_initializer=GlorotNormal)))
    model.add(TimeDistributed(LeakyReLU(alpha = 0.1)))
    model.add(TimeDistributed(BatchNormalization()))
    # Input dimensions: (None, 12, 12, 32)
    model.add(TimeDistributed(Convolution2D(32, (1,3), padding='same', use_bias=False, kernel_initializer=GlorotNormal)))
    model.add(TimeDistributed(LeakyReLU(alpha = 0.1)))
    model.add(TimeDistributed(BatchNormalization()))

    
    model.add(TimeDistributed(Flatten()))

    model.add(LSTM(1024, activation='relu', return_sequences=False))
    model.add(Dense(1024,activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(n_out))
    return model
# model.compile(A(lr=0.0001),loss='mean_squared_error',  metrics=[rmse,mae,own_loss])

# filepath="weights-improvement_1.hdf5"
# checkpoint = ModelCheckpoint(filepath, monitor='val_accuracy', verbose=1, save_best_only=True, mode='min')
# callbacks_list = [checkpoint]
# Fit the model
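
The per-frame feature-map size through the network can be checked without building the model. The sketch below assumes what the code above uses: `padding='same'` convolutions with stride 1, which keep height and width, and 2x2 max pooling, which halves them. Only the three pooling layers that are not commented out are included. BatchNormalization and LeakyReLU are left out because they do not change the shape. `trace_shapes` is a hypothetical helper, not part of Keras.

```python
def trace_shapes(height, width, layers):
    """Return the per-frame (height, width, channels) after each layer."""
    shapes = []
    channels = None
    for kind, value in layers:
        if kind == "conv":    # 'same' padding, stride 1: only the channel count changes
            channels = value
        elif kind == "pool":  # 2x2 max pooling: halve height and width
            height, width = height // value, width // value
        shapes.append((height, width, channels))
    return shapes

# Layer sequence of cnn(), keeping only the active pooling layers
layers = (
    [("conv", 32)] * 2 + [("conv", 64)] * 2 + [("conv", 96)] * 2
    + [("conv", 128)] * 2 + [("pool", 2)]
    + [("conv", 256)] * 2 + [("pool", 2)]
    + [("conv", 512)] * 2 + [("pool", 2)]
    + [("conv", 64)] * 2 + [("conv", 32)] * 2
)

shapes = trace_shapes(96, 96, layers)
final_h, final_w, final_c = shapes[-1]
print(shapes[-1])                   # (12, 12, 32)
print(final_h * final_w * final_c)  # 4608 features per frame after Flatten
```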

In [None]:
model_30 = cnn(30)
# model_30.compile(optimizer=SGD(lr=0.01,momentum = 0.9,decay=0.00006, nesterov=True), loss=rmse,metrics=[rmse,'mse', 'mae'])
model_30.compile(optimizer=Adam(learning_rate=0.001),loss=rmse, metrics=[rmse,'mse', 'mae'])
# print(model_30.summary())
# LR_callback_30 = ReduceLROnPlateau(monitor='val_loss',  verbose=10, factor=.4, min_lr=.00001)
EarlyStop_callback_30 = EarlyStopping(restore_best_weights=True,mode='min')
# print(model_30.summary())

In [None]:
# model_30.fit(X_train,y_train,validation_data=(X_val, y_val), epochs = 120) 

In [None]:
from keras.preprocessing.sequence import TimeseriesGenerator
train_sequences = TimeseriesGenerator(X_train, y_train, length=5, batch_size=20)
test_sequences = TimeseriesGenerator(X_val, y_val, length=5, batch_size=20)

# TimeseriesGenerator is a keras.utils.Sequence, so it can be passed to fit() directly
# (fit_generator is deprecated)
model_30.fit(train_sequences,validation_data = test_sequences,epochs=120)
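
To see what the model receives from `TimeseriesGenerator`, the windowing can be reproduced in plain NumPy. As far as the Keras documentation describes it, sample `i` is the slice `data[i - length:i]` and is paired with `targets[i]`, so `N` inputs yield `N - length` windows. `make_windows` and the toy arrays below are illustrative stand-ins, not part of the notebook.

```python
import numpy as np

def make_windows(data, targets, length):
    """Pair each window data[i-length:i] with targets[i], for i = length .. len(data)-1."""
    xs = [data[i - length:i] for i in range(length, len(data))]
    ys = [targets[i] for i in range(length, len(data))]
    return np.stack(xs), np.stack(ys)

# Toy stand-ins for X_train / y_train: 8 "images" of 2x2x1 and 8 label rows of size 3
toy_images = np.arange(8 * 2 * 2).reshape(8, 2, 2, 1).astype("float32")
toy_labels = np.arange(8 * 3).reshape(8, 3).astype("float32")

windows, window_labels = make_windows(toy_images, toy_labels, length=5)
print(windows.shape)        # (3, 5, 2, 2, 1)
print(window_labels.shape)  # (3, 3)
```

With the real data and `length=5`, each sample has shape `(5, 96, 96, 1)`. The extra leading axis is the time dimension that the `TimeDistributed` layers in `cnn()` iterate over.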

### CNN Model fitting

In [None]:

# # instantiating the model in the strategy scope creates the model on the TPU
# with tpu_strategy.scope():
#     model_30 = cnn(30)
# # model_30.compile(optimizer=SGD(lr=0.01,momentum = 0.9,decay=0.00006, nesterov=True), loss=rmse,metrics=[rmse,'mse', 'mae'])
#     model_30.compile(optimizer=Adam(lr=0.001),loss=rmse, metrics=[rmse,'mse', 'mae'])

In [None]:
hist = model_30.fit(X_train,y_train,validation_data=(X_val, y_val), epochs = 120, batch_size=20) 

### Evaluating model results on train and val sample

In [None]:
scores = model_30.evaluate(train_sequences, verbose=0)
print(" Train %s: %.2f%% %s: %.2f%% %s: %.2f%%" % (model_30.metrics_names[1], scores[1]*100,model_30.metrics_names[2], scores[2]*100,model_30.metrics_names[3], scores[3]*100))
scores = model_30.evaluate(test_sequences, verbose=0)
print(" Val %s: %.2f%% %s: %.2f%% %s: %.2f%%" % (model_30.metrics_names[1], scores[1]*100,model_30.metrics_names[2], scores[2]*100,model_30.metrics_names[3], scores[3]*100))
# test1_sequences = TimeseriesGenerator(X_test, length=5, batch_size=20)
# y_hat_30 = model_30.predict_generator(X_test,350) 

In [None]:
1783/5

### Plot val and train rmse on each epoch

In [None]:
def plot_loss(hist,name,plt,RMSE_TF=False):
    '''
    RMSE_TF: if True, then RMSE is plotted with original scale 
    '''
    loss = hist['rmse']
    val_loss = hist['val_rmse']
    if RMSE_TF:
        # the metric is already an RMSE, so only undo the 48-pixel scaling
        loss = np.array(loss)*48 
        val_loss = np.array(val_loss)*48 
        
    plt.plot(loss,"--",linewidth=3,label="train:"+name)
    plt.plot(val_loss,linewidth=3,label="val:"+name)

plot_loss(hist.history,"model 1",plt)
plt.legend()
plt.grid()
plt.xlabel("epoch")
plt.ylabel("RMSE")
plt.show()

### Plot actual and prediction keypoints on val sample

In [None]:
  
pred = model_30.predict(X_val)

fig = plt.figure(figsize=(7, 7))
fig.subplots_adjust(hspace=0.13,wspace=0.0001,
                    left=0,right=1,bottom=0, top=1)
Npicture = 9
count = 1
for irow in range(Npicture):
    ipic = np.random.choice(X_val.shape[0])
    ax = fig.add_subplot(Npicture//3 , 3, count,xticks=[],yticks=[])        
    plot_sample_val(X_val[ipic],y_val[ipic], ax,pred[ipic])
    ax.legend( ncol = 1)
    ax.set_title("picture "+ str(ipic))
    count += 1
plt.show()

In [None]:
pred = model_30.predict(X_test)
label_points = (np.squeeze(pred)*48)+48

feature_names = list(lookid_data['FeatureName'])
image_ids = list(lookid_data['ImageId']-1)
row_ids = list(lookid_data['RowId'])

feature_list = []
for feature in feature_names:
    feature_list.append(feature_names.index(feature))
    
predictions = []
for x,y in zip(image_ids, feature_list):
    predictions.append(label_points[x][y])
    
row_ids = pd.Series(row_ids, name = 'RowId')
locations = pd.Series(predictions, name = 'Location')
locations = locations.clip(0.0,96.0)
submission_result = pd.concat([row_ids,locations],axis = 1)
submission_result.to_csv('face_key_detection_submission_12.csv',index = False)
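
The `*48 + 48` step above maps predictions back to pixel coordinates. This assumes the targets were normalized earlier as `(y - 48) / 48`, so that the range [-1, 1] corresponds to [0, 96] on a 96x96 image. A minimal sketch of that inverse transform plus the clipping, using a hypothetical helper `to_pixels`:

```python
import numpy as np

def to_pixels(normalized, half_size=48.0, image_size=96.0):
    # undo the assumed (y - 48) / 48 normalization, then clip to the image bounds
    pixels = np.asarray(normalized, dtype=float) * half_size + half_size
    return np.clip(pixels, 0.0, image_size)

print(to_pixels([-1.0, 0.0, 0.5, 1.2]))  # [ 0. 48. 72. 96.]
```

The last value would be 105.6 before clipping. Predictions slightly outside [-1, 1] are expected from an unconstrained `Dense` output, which is why the clip is useful.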

In [None]:
## Model with fewer features

In [None]:
import pandas as pd
training_8 = pd.read_csv('../input/facial-keypoints-detection/training.zip',usecols = main_features).dropna()


In [None]:
# NOTE: each nose_tip clause asks for a value that is both below the 10th and above the
# 90th percentile, so the combined mask is always empty and no rows are dropped.
# Combine out-of-range tests with | if outlier removal is intended.
training_8.drop(training_8[
(training_8.nose_tip_x < training_8.nose_tip_x.quantile(0.10)) & 
(training_8.nose_tip_x > training_8.nose_tip_x.quantile(0.90)) & 
(training_8.nose_tip_y < training_8.nose_tip_y.quantile(0.10)) & 
(training_8.nose_tip_y > training_8.nose_tip_y.quantile(0.90)) & 
(training_8.mouth_center_bottom_lip_x > training_8.mouth_center_bottom_lip_x.quantile(0.10)) & 
(training_8.mouth_center_bottom_lip_x < training_8.mouth_center_bottom_lip_x.quantile(0.90)) & 
(training_8.mouth_center_bottom_lip_y > training_8.mouth_center_bottom_lip_y.quantile(0.05)) & 
(training_8.mouth_center_bottom_lip_y < training_8.mouth_center_bottom_lip_y.quantile(0.95))].index, axis=0, inplace=True)
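
A quantile-based filter of this kind is easier to reason about on a toy column. The sketch below uses made-up `nose_tip_x` values and keeps the rows between the 10th and 90th percentile with `Series.between`. It also shows that a mask requiring a value to be below the low quantile and above the high quantile at the same time never matches anything.

```python
import pandas as pd

# Made-up values: two obvious outliers (10 and 90) around a tight cluster
df = pd.DataFrame({"nose_tip_x": [10.0, 45.0, 47.0, 48.0, 49.0,
                                  50.0, 51.0, 52.0, 53.0, 90.0]})

low, high = df["nose_tip_x"].quantile([0.10, 0.90])

# Rows inside the [10th, 90th] percentile band
inliers = df[df["nose_tip_x"].between(low, high)]

# "Below low AND above high" cannot be true for any row
impossible = df[(df["nose_tip_x"] < low) & (df["nose_tip_x"] > high)]

print(len(inliers))     # 8
print(len(impossible))  # 0
```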

In [None]:
X_8,Y_8 = data_loader(training_8)

In [None]:
X_train_8, X_val_8, y_train_8, y_val_8 = train_test_split(X_8, Y_8, test_size=0.2, random_state=42)
print("Train sample:",X_train_8.shape,"Val sample:",X_val_8.shape, y_train_8.shape)

In [None]:
modifier = FlipPic_8()
generator = ImageDataGenerator()
shiftFlipPic_1 = ShiftFlipPic(prop=0.02)
shiftFlipPic_2 = ShiftFlipPic(prop=0.03)
shiftFlipPic_3 = ShiftFlipPic(prop=0.04)
shiftFlipPic_4 = ShiftFlipPic(prop=0.05)
shiftFlipPic_5 = ShiftFlipPic(prop=0.06)
shiftFlipPic_6 = ShiftFlipPic(prop=0.07)
shiftFlipPic_7 = ShiftFlipPic(prop=0.1)
batches = 0
for batch in generator.flow(X_train_8,y_train_8):
    X_batch_8, y_batch_8 = add_blur(*batch,5)
    y_train_8 = np.concatenate((y_train_8,y_batch_8))
    X_train_8 = np.concatenate((X_train_8,X_batch_8))
    X_batch_8, y_batch_8 = add_blur(*batch,3)
    y_train_8 = np.concatenate((y_train_8,y_batch_8))
    X_train_8 = np.concatenate((X_train_8,X_batch_8))
    X_batch_8, y_batch_8 = add_blur(*batch,2)
    y_train_8 = np.concatenate((y_train_8,y_batch_8))
    X_train_8 = np.concatenate((X_train_8,X_batch_8))
    X_batch_8, y_batch_8 = rotate_augmentation(*batch)
    y_train_8 = np.concatenate((y_train_8,y_batch_8))
    X_train_8 = np.concatenate((X_train_8,X_batch_8))
    X_batch_8, y_batch_8 = rotate_augmentation(*batch)
    y_train_8 = np.concatenate((y_train_8,y_batch_8))
    X_train_8 = np.concatenate((X_train_8,X_batch_8))
    X_batch_8, y_batch_8 = rotate_augmentation(*batch)
    y_train_8 = np.concatenate((y_train_8,y_batch_8))
    X_train_8 = np.concatenate((X_train_8,X_batch_8))
    X_batch_8, y_batch_8 = rotate_augmentation(*batch)
    y_train_8 = np.concatenate((y_train_8,y_batch_8))
    X_train_8 = np.concatenate((X_train_8,X_batch_8))
    X_batch_8, y_batch_8 = alter_brightness(*batch)
    y_train_8 = np.concatenate((y_train_8,y_batch_8))
    X_train_8 = np.concatenate((X_train_8,X_batch_8))
    X_batch_8, y_batch_8 = modifier.fit(*batch)
    y_train_8 = np.concatenate((y_train_8,y_batch_8))
    X_train_8 = np.concatenate((X_train_8,X_batch_8))
    X_batch_8, y_batch_8 = modifier.fit(*batch)
    y_train_8 = np.concatenate((y_train_8,y_batch_8))
    X_train_8 = np.concatenate((X_train_8,X_batch_8))
    X_batch_8, y_batch_8 = modifier.fit(*batch)
    y_train_8 = np.concatenate((y_train_8,y_batch_8))
    X_train_8 = np.concatenate((X_train_8,X_batch_8))
    X_batch_8, y_batch_8 = alter_brightness(*batch)
    y_train_8 = np.concatenate((y_train_8,y_batch_8))
    X_train_8 = np.concatenate((X_train_8,X_batch_8))
    X_batch_8, y_batch_8 = alter_brightness(*batch)
    y_train_8 = np.concatenate((y_train_8,y_batch_8))
    X_train_8 = np.concatenate((X_train_8,X_batch_8))
    X_batch_8, y_batch_8 = alter_brightness(*batch)
    y_train_8 = np.concatenate((y_train_8,y_batch_8))
    X_train_8 = np.concatenate((X_train_8,X_batch_8))
    batches += 1
    if batches >= 6:
#         len(X_train_8) / 32
        # we need to break the loop by hand because
        # the generator loops indefinitely
        break  

print("Train sample:",X_train_8.shape,"Val sample:",X_val_8.shape)

In [None]:
# model_8 = cnn(8)
# model_8.compile(optimizer=Adam(lr=0.001),loss=rmse, metrics=[rmse,'mse', 'mae'])
# # model_8.compile(optimizer=SGD(lr=0.01,momentum = 0.9,decay=0.000125, nesterov=True), loss=rmse,metrics=[rmse,'mse', 'mae'])
# print(model_8.summary())

# instantiating the model in the strategy scope creates the model on the TPU
with tpu_strategy.scope():
    model_8 = cnn(8)
# model_30.compile(optimizer=SGD(lr=0.01,momentum = 0.9,decay=0.00006, nesterov=True), loss=rmse,metrics=[rmse,'mse', 'mae'])
    model_8.compile(optimizer=Adam(learning_rate=0.001),loss=rmse, metrics=[rmse,'mse', 'mae'])

LR_callback_8 = ReduceLROnPlateau(monitor='val_loss', patience=4, verbose=10, factor=.4, min_lr=.00001)
EarlyStop_callback_8 = EarlyStopping(patience=15, restore_best_weights=True)

In [None]:
hist1 = model_8.fit(X_train_8,y_train_8,validation_data=(X_val_8, y_val_8), epochs = 120, callbacks=[LR_callback_8, EarlyStop_callback_8]) 

In [None]:
boxplot_features = ['nose_tip_x', 'mouth_center_bottom_lip_x',
                    'nose_tip_y',
            'mouth_center_bottom_lip_y']

remove_outlier.boxplot(column=boxplot_features )


In [None]:
training_8.boxplot(column=boxplot_features )

In [None]:
scores_8 = model_8.evaluate(X_train_8, y_train_8, verbose=0)
print(" Train %s: %.2f%% %s: %.2f%% %s: %.2f%%" % (model_8.metrics_names[1], scores_8[1]*100,model_8.metrics_names[2], scores_8[2]*100,model_8.metrics_names[3], scores_8[3]*100))
scores_8 = model_8.evaluate(X_val_8, y_val_8, verbose=0)
print(" Val %s: %.2f%% %s: %.2f%% %s: %.2f%%" % (model_8.metrics_names[1], scores_8[1]*100,model_8.metrics_names[2], scores_8[2]*100,model_8.metrics_names[3], scores_8[3]*100))

In [None]:
def plot_loss(hist,name,plt,RMSE_TF=False):
    '''
    RMSE_TF: if True, then RMSE is plotted with original scale 
    '''
    loss = hist['rmse']
    val_loss = hist['val_rmse']
    if RMSE_TF:
        # the metric is already an RMSE, so only undo the 48-pixel scaling
        loss = np.array(loss)*48 
        val_loss = np.array(val_loss)*48 
        
    plt.plot(loss,"--",linewidth=3,label="train:"+name)
    plt.plot(val_loss,linewidth=3,label="val:"+name)

plot_loss(hist1.history,"model 1",plt)
plt.legend()
plt.grid()
plt.xlabel("epoch")
plt.ylabel("RMSE")
plt.show()

In [None]:
  
pred = model_8.predict(X_val_8)

fig = plt.figure(figsize=(7, 7))
fig.subplots_adjust(hspace=0.13,wspace=0.0001,
                    left=0,right=1,bottom=0, top=1)
Npicture = 9
count = 1
for irow in range(Npicture):
    ipic = np.random.choice(X_val_8.shape[0])
    ax = fig.add_subplot(Npicture//3 , 3, count,xticks=[],yticks=[])        
    plot_sample_val(X_val_8[ipic],y_val_8[ipic], ax,pred[ipic])
    ax.legend( ncol = 1)
    ax.set_title("picture "+ str(ipic))
    count += 1
plt.show()

### Create .csv files to submit to the Kaggle competition

In [None]:
y_hat_8 = model_8.predict(X_test)
y_hat_30 = model_30.predict(X_test)  # predictions of the 30-keypoint model, needed for the merge

In [None]:
y_hat_8

In [None]:

print('Predictions shape', y_hat_30.shape)
print('Predictions shape', y_hat_8.shape)
feature_8_ind = [0, 1, 2, 3, 20, 21, 28, 29]
# Blend the two predictions: for the 8 shared keypoint columns, take 0.8 * y_hat_8 + 0.2 * y_hat_30.
for i in range(8):
    print('Blend "{}" feature column from y_hat_8 into y_hat_30'.format(main_features[i]))
    y_hat_30[:,feature_8_ind[i]] = (y_hat_8[:,i]*0.8+y_hat_30[:,feature_8_ind[i]]*0.2)
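
The same weighted average can be written without a loop by indexing all eight columns at once. The sketch below uses small made-up arrays in place of `y_hat_30` and `y_hat_8`. The column indices and the 0.8 / 0.2 weights are the ones used above.

```python
import numpy as np

feature_8_ind = [0, 1, 2, 3, 20, 21, 28, 29]

# Made-up predictions: 2 test images, all zeros from the 30-keypoint model
# and all ones from the 8-keypoint model
toy_y_hat_30 = np.zeros((2, 30))
toy_y_hat_8 = np.ones((2, 8))

blended = toy_y_hat_30.copy()
blended[:, feature_8_ind] = 0.8 * toy_y_hat_8 + 0.2 * toy_y_hat_30[:, feature_8_ind]

print(blended[0, feature_8_ind])  # eight values of 0.8
print(blended.sum())              # 2 rows * 8 columns * 0.8, i.e. about 12.8
```

Working on a copy keeps the original 30-keypoint predictions available, so the cell can be re-run without blending twice.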

In [None]:
label_points_30 = (np.squeeze(y_hat_30)*48)+48

feature_names = list(lookid_data['FeatureName'])
image_ids = list(lookid_data['ImageId']-1)
row_ids = list(lookid_data['RowId'])

feature_list = []
for feature in feature_names:
    feature_list.append(feature_names.index(feature))
    
predictions = []
for x,y in zip(image_ids, feature_list):
    predictions.append(label_points_30[x][y])
    
row_ids = pd.Series(row_ids, name = 'RowId')
locations = pd.Series(predictions, name = 'Location')
locations = locations.clip(0.0,96.0)
submission_result = pd.concat([row_ids,locations],axis = 1)
submission_result.to_csv('face_key_detection_submission_combine_16.csv',index = False)

In [None]:
from keras.models import model_from_json

def save_model(model,name):
    '''
    save model architecture and model weights
    '''
    json_string = model.to_json()
    with open(name+'_architecture.json', 'w') as f:
        f.write(json_string)
    model.save_weights(name+'_weights.h5')
    
def load_model(name):
    '''
    rebuild the model from the saved architecture and load its weights
    '''
    with open(name+'_architecture.json') as f:
        model = model_from_json(f.read())
    model.load_weights(name + '_weights.h5')
    return model

save_model(model_30,"model30")
model = load_model("model30")