# Virtual Concierge 

## Face Recognition Project with MXNet

***
Copyright [2017]-[2018] Amazon.com, Inc. or its affiliates. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file except in compliance with the License. A copy of the License is located at

http://aws.amazon.com/apache2.0/

or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
***

### Prerequisites:

#### Python package dependencies

The following packages need to be installed before proceeding:

* Boto3 - `pip install boto3`
* MXNet - `pip install mxnet`
* numpy - `pip install numpy`
* OpenCV - `pip install opencv-python`
* Graphviz - `pip install graphviz`
* Matplotlib - `pip install matplotlib`
* Seaborn - `pip install seaborn`
* SciPy - `pip install scipy`

### Import dependencies

Verify that all dependencies are installed by running the cell below. Continue if no errors are raised; warnings can be ignored.

In [None]:
from __future__ import print_function

import boto3
import cv2
import sys
import numpy as np
import mxnet as mx
import os
import json
from matplotlib import pyplot as plt
from scipy import stats
import seaborn as sns 

%matplotlib inline

In [None]:
mx.__version__

### Load pretrained model

`get_model()` : Loads the MXNet symbol and params files for a checkpoint, truncates the network at the requested layer, and binds the parameters to a module ready for inference.

In [None]:
def get_model(ctx, image_size, model_str, layer):
    model_parts = model_str.split(',')
    assert len(model_parts)==2
    prefix = model_parts[0]
    epoch = int(model_parts[1])
    print('loading',prefix, epoch)
    sym, arg_params, aux_params = mx.model.load_checkpoint(prefix, epoch)
    all_layers = sym.get_internals()
    sym = all_layers[layer+'_output']
    model = mx.mod.Module(symbol=sym, context=ctx, label_names = None)
    model.bind(data_shapes=[('data', (1, 3, image_size[0], image_size[1]))])
    model.set_params(arg_params, aux_params)
    return model, sym
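
The `model_str` argument packs a checkpoint prefix and an epoch number into one comma-separated string. As a minimal sketch, the helper below (a hypothetical name, not part of the notebook) shows which two files `mx.model.load_checkpoint` looks for given that string, assuming MXNet's usual `prefix-symbol.json` and `prefix-NNNN.params` naming. It can help diagnose a missing-file error before loading the model.

```python
import os

def checkpoint_files(model_str):
    """Split a 'prefix,epoch' string and return the expected checkpoint paths."""
    prefix, epoch = model_str.split(',')
    epoch = int(epoch)
    symbol_file = '{}-symbol.json'.format(prefix)
    # The epoch is zero-padded to four digits in the params file name
    params_file = '{}-{:04d}.params'.format(prefix, epoch)
    return symbol_file, params_file

symbol_file, params_file = checkpoint_files('./models/mobilefacenet/mobilenet1,0')
for path in (symbol_file, params_file):
    print(path, 'found' if os.path.exists(path) else 'missing')
```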

### Preprocess images

To feed only face pixels into the network, each input image is first passed through a pretrained face detection and alignment model (see the Face Alignment section below). That model outputs landmark points and a bounding box for each face in the image. Using this output, the image is cropped, rotated with an affine transform, and resized to produce the aligned face images that are input to the network. The functions that perform these steps are defined below.

`get_input()` : Crops the image to the bounding box plus margin, applies the [rotation](https://docs.opencv.org/3.0-beta/doc/py_tutorials/py_imgproc/py_geometric_transformations/py_geometric_transformations.html), resizes to `image_size`, and returns the aligned face as a channel-first (C, H, W) RGB array

`show_input()` : Shows an aligned image after transposing it back to (H, W, C)

In [None]:
def get_input(img, image_size, bbox=None, rotate=0, margin=0):
    if bbox is None:
        det = np.zeros(4, dtype=np.int32)
        det[0] = int(img.shape[1]*0.0625)
        det[1] = int(img.shape[0]*0.0625)
        det[2] = img.shape[1] - det[0]
        det[3] = img.shape[0] - det[1]
    else:
        det = bbox
    # Crop
    bb = np.zeros(4, dtype=np.int32)
    bb[0] = np.maximum(det[0]-margin/2, 0)
    bb[1] = np.maximum(det[1]-margin/2, 0)
    bb[2] = np.minimum(det[2]+margin/2, img.shape[1])
    bb[3] = np.minimum(det[3]+margin/2, img.shape[0])
    img = img[bb[1]:bb[3],bb[0]:bb[2],:]
    # Rotate if required
    if 0 < rotate and rotate < 360:
        rows,cols,_ = img.shape
        M = cv2.getRotationMatrix2D((cols/2,rows/2),360-rotate,1)
        img = cv2.warpAffine(img,M,(cols,rows))
    # Resize and transform
    img = cv2.resize(img, (image_size[1], image_size[0]))
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    aligned = np.transpose(img, (2,0,1))
    return aligned

def show_input(aligned):
    plt.imshow(np.transpose(aligned,(1,2,0)))
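
The crop step in `get_input()` expands the bounding box by half the margin on each side and clips the result to the image. The sketch below isolates that arithmetic on a blank test image; `clamp_box` is a hypothetical helper and the box values are made up for illustration.

```python
import numpy as np

def clamp_box(det, margin, width, height):
    """Expand (x1, y1, x2, y2) by margin/2 per side, clipped to the image."""
    x1 = int(max(det[0] - margin / 2, 0))
    y1 = int(max(det[1] - margin / 2, 0))
    x2 = int(min(det[2] + margin / 2, width))
    y2 = int(min(det[3] + margin / 2, height))
    return x1, y1, x2, y2

# Blank image with height=100 and width=200 (OpenCV images are H x W x C)
img = np.zeros((100, 200, 3), dtype=np.uint8)

# A box that spills past the right edge once the margin is added
x1, y1, x2, y2 = clamp_box((150, 10, 210, 60), margin=40, width=200, height=100)
crop = img[y1:y2, x1:x2, :]           # rows are y, columns are x
chw = np.transpose(crop, (2, 0, 1))   # channel-first layout expected by the network
print((x1, y1, x2, y2), crop.shape, chw.shape)
```

Note that the slice uses `y` for rows and `x` for columns, which is why the crop shape is (height, width, channels).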

### Get Features

`l2_normalize()`: Scales each row of the matrix to unit L2 norm

`get_feature()` : Runs a forward pass of the model on an aligned face and returns the normalized embedding vector

In [None]:
def l2_normalize(X):
    norms = np.sqrt((X * X).sum(axis=1))
    X /= norms[:, np.newaxis]
    return X

def get_feature(model, aligned):
    input_blob = np.expand_dims(aligned, axis=0)
    data = mx.nd.array(input_blob)
    db = mx.io.DataBatch(data=(data,))
    model.forward(db, is_train=False)
    embedding = model.get_outputs()[0].asnumpy()
    embedding = l2_normalize(embedding).flatten()
    return embedding

### Visualize Model

Load the pre-trained MobileNet model, setting the context to CPU, and visualize the architecture.

In [None]:
%%time

image_size = (112,112)
model_name = './models/mobilefacenet/mobilenet1,0'
model, sym = get_model(mx.cpu(), image_size, model_name, 'fc1')

In [None]:
mx.viz.plot_network(sym)

### Face Alignment

We can use the MTCNN algorithm to detect face bounding boxes.

The InsightFace MXNet version is adapted from the original Caffe implementation of [Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Neural Networks](https://github.com/kpzhang93/MTCNN_face_detection_alignment).

In [None]:
from mtcnn_detector import MtcnnDetector

model_folder = './models/mtcnn'
detector =  MtcnnDetector(model_folder, ctx=mx.cpu(), num_worker=1, accurate_landmark=True)

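def detect_bboxes(img):
    ret = detector.detect_face(img)
    # Return an empty result when no face is found so callers can still unpack it
    if ret is None:
        return [], 0
    bbox, points = ret
    rotate = 0  # MTCNN does not report an orientation correction
    return bbox.astype(int).tolist(), rotate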

Alternatively, we can use the Amazon Rekognition service to detect faces.

In [None]:
rekognition = boto3.client('rekognition')

def get_bboxes(img, margin=0):
    # Detect faces
    ret, buf = cv2.imencode('.jpg', img)
    ret = rekognition.detect_faces(
        Image={
            'Bytes': buf.tobytes()
        },
        Attributes=['ALL'], # require OrientationCorrection
    )
    # Get the rotation
    # 'ROTATE_90' -> 90; replace() removes the prefix, whereas strip() removes a character set
    rotate = int(ret.get('OrientationCorrection', 'ROTATE_0').replace('ROTATE_', ''))
    # Return the bounding boxes for each face
    height, width, _ = img.shape
    bboxes = []
    for face in ret['FaceDetails']:
        box = face['BoundingBox']
        x1 = int(box['Left'] * width)
        y1 = int(box['Top'] * height)
        x2 = int(box['Left'] * width + box['Width'] * width)
        y2 = int(box['Top'] * height + box['Height']  * height)
        bboxes.append((x1, y1, x2, y2))
    return bboxes, rotate
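
Rekognition reports each `BoundingBox` as ratios of the image width and height rather than pixels. The following sketch shows the conversion on a hand-written sample box, without calling the service. `to_pixel_box` is a hypothetical helper, and the clamping to the image bounds is an added precaution (an assumption, not something the cell above does) for boxes that extend past the frame.

```python
def to_pixel_box(box, width, height):
    """Convert a ratio-based box dict into clamped (x1, y1, x2, y2) pixels."""
    x1 = int(box['Left'] * width)
    y1 = int(box['Top'] * height)
    x2 = int(box['Left'] * width + box['Width'] * width)
    y2 = int(box['Top'] * height + box['Height'] * height)
    # Assumption: clamp so that slicing the image never uses out-of-range values
    x1, x2 = max(x1, 0), min(x2, width)
    y1, y2 = max(y1, 0), min(y2, height)
    return x1, y1, x2, y2

# Made-up sample values in the same shape as a 'BoundingBox' entry
sample_box = {'Left': 0.25, 'Top': 0.125, 'Width': 0.5, 'Height': 0.5}
pixel_box = to_pixel_box(sample_box, width=640, height=480)
print(pixel_box)
```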

### Evaluate

Download a sample image and extract the face coordinates.

In [None]:
!aws s3 cp s3://aiml-lab-sagemaker/politicians/politicians1.jpg tmp/image

In [None]:
!ls container/local_test/Tom_Hanks_54745.png

In [None]:
%%time

# Load the image, and get bboxes
img = cv2.imread('tmp/image')
boxes, rotate = detect_bboxes(img) #get_bboxes(img)
print(boxes)

For each set of coordinates, get the aligned image and draw the bounding rectangle.

In [None]:
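# grey, blue, orange, green (tuples are in BGR order, as used by OpenCV)
colors = ((220,220,220),(242,168,73),(76,182,252),(52,194,123))

img_aligned = []
for col, bbox in enumerate(boxes): 
    img_aligned.append(get_input(img, image_size, bbox, rotate))
    # Cycle through the palette so more than four faces do not raise an IndexError
    cv2.rectangle(img, (bbox[0], bbox[1]), (bbox[2], bbox[3]), colors[col % len(colors)], 3)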
    
# Plot the figure in its original rotation
plt.figure(figsize=(10,10))
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(img)

In [None]:
# output the aligned image
fig = plt.figure(figsize=(10,10))
for i, aligned in enumerate(img_aligned):
    a = fig.add_subplot(1, len(img_aligned), i+1)
    a.set_title('Image {}'.format(i))
    show_input(aligned)
plt.show()

### Generate embedding

Pass each face through the network sequentially to generate one embedding vector per face.

In [None]:
img_vecs = np.array([get_feature(model, aligned) for aligned in img_aligned])
print(img_vecs.shape)
img_vecs[0]

### Calculate similarity

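Calculate the cosine similarity between the embedding vectors to see how similar they are to each other. Because the embeddings are L2-normalized, the dot product of two vectors equals their cosine similarity.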

Similarity values in [-1,1].

In [None]:
sims = np.dot(img_vecs, img_vecs.T)
np.fill_diagonal(sims, 0)
sns.heatmap(sims, annot=True, fmt=".03f")
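
To see why a plain dot product is enough here, the sketch below normalizes three toy 2-D vectors (made-up values, not real embeddings) and checks the resulting similarity matrix. `l2_normalize_rows` is a non-mutating variant written for this illustration.

```python
import numpy as np

def l2_normalize_rows(X):
    """Return a copy of X with every row scaled to unit L2 norm."""
    X = np.asarray(X, dtype=np.float64)
    return X / np.linalg.norm(X, axis=1, keepdims=True)

raw = np.array([[3.0, 4.0],     # reference direction
                [4.0, 3.0],     # close to the reference
                [-3.0, -4.0]])  # exactly opposite to the reference
unit = l2_normalize_rows(raw)

# For unit-length rows, the pairwise dot products are the cosine similarities
toy_sims = np.dot(unit, unit.T)
print(np.round(toy_sims, 2))
```

Each vector has similarity 1 with itself, which is why the notebook zeroes the diagonal before plotting the heatmap.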

### Vectorize Dataset

Download the image dataset and vectorize the images.

In [None]:
!mkdir -p tmp/images
!aws s3 sync s3://aiml-lab-sagemaker/actors/ tmp/images

In [None]:
%%time

image_dir = 'tmp/images'
names = []
vecs = []

for file in os.listdir(image_dir):
    name = file.split('.')[0]
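    img = cv2.imread(os.path.join(image_dir, file))
    if img is None:
        continue  # skip files that OpenCV cannot read as images
    bboxes, rotate = get_bboxes(img)
    if not bboxes:
        continue  # skip images in which no face was detected
    bbox = bboxes[0]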
    print(name, bbox, rotate)
    aligned = get_input(img, image_size, bbox, rotate)
    vec = get_feature(model, aligned)   
    names.append(name)
    vecs.append(vec)
    
vecs = np.array(vecs)

Save the vectors back to a file with the names.

In [None]:
np.savez('models/people.npz', names=names, vecs=vecs)
vecs.shape
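
The saved archive can later be reloaded to match a new embedding against the stored ones. The sketch below round-trips a toy dataset through a temporary `.npz` file and looks up the closest name; the names and 2-D vectors are placeholders, not the notebook's data.

```python
import os
import tempfile
import numpy as np

# Placeholder names and unit-length vectors standing in for real embeddings
toy_names = ['alice', 'bob', 'carol']
toy_vecs = np.array([[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]])

path = os.path.join(tempfile.mkdtemp(), 'people.npz')
np.savez(path, names=toy_names, vecs=toy_vecs)

# Arrays are read by the keyword names used in np.savez
with np.load(path) as data:
    loaded_names = data['names']
    loaded_vecs = data['vecs']

# The closest match is the row with the highest dot product against the query
query = np.array([0.0, 1.0])
best = int(np.dot(loaded_vecs, query).argmax())
print(loaded_names[best], loaded_vecs.shape)
```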

### Plot Distribution

Compare the vectors of all the people in the dataset to the input image, then plot the distribution of z-scores; a match shows up as an outlier.

In [None]:
img = img_vecs[2]

# calculate cosine similarity and relative zscores
sims = np.dot(vecs, img)
zscores = stats.zscore(sims)

# plot series and print score and name
sns.set(color_codes=True)
plt.figure(figsize=(10,6))
ax = sns.distplot(zscores, bins=50, kde=False, rug=True)
ax.set(xlabel='zscore', ylabel='number of people')
plt.title('zscore distribution')
plt.show()

Output the name of the person in the dataset with the highest similarity to the input image.

In [None]:
from math import erf, sqrt
def phi(x):
    """Cumulative distribution function for the standard normal distribution."""
    return (1.0 + erf(x / sqrt(2.0))) / 2.0

idx = sims.argmax()
print('sim: {}, zscore: {}, prob: {}, name: {}'.format(sims[idx], zscores[idx], phi(zscores[idx]), names[idx]))
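
As a worked example of the scoring, the sketch below computes z-scores for a handful of made-up similarity values in which one entry clearly stands out. It uses the population standard deviation, which is also the default of `scipy.stats.zscore`. Reading the CDF value as a probability assumes the similarities are roughly normally distributed, which is an assumption rather than a guarantee.

```python
from math import erf, sqrt
import numpy as np

def zscore(values):
    """Standardize values using the population mean and standard deviation."""
    values = np.asarray(values, dtype=np.float64)
    return (values - values.mean()) / values.std()

def normal_cdf(x):
    """Cumulative distribution function for the standard normal distribution."""
    return (1.0 + erf(x / sqrt(2.0))) / 2.0

# Made-up similarities: five weak matches and one strong match
toy_sims = np.array([0.10, 0.12, 0.08, 0.11, 0.09, 0.70])
toy_z = zscore(toy_sims)

best = int(toy_sims.argmax())
print('index: {}, zscore: {:.3f}, prob: {:.3f}'.format(
    best, toy_z[best], normal_cdf(toy_z[best])))
```

With only a few reference vectors the largest possible z-score is small, so a fixed z-score threshold is more meaningful on larger datasets.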

## Deploy custom Container

We can bring our own [pre-trained MXNet model](https://aws.amazon.com/blogs/machine-learning/bring-your-own-pre-trained-mxnet-or-tensorflow-models-into-amazon-sagemaker/) so that we can perform inference in SageMaker.

### Create model file

The first step is to create a gzipped tar archive of the model and people artifacts.

In [None]:
import os
import tarfile

def flatten(tarinfo):
    tarinfo.name = os.path.basename(tarinfo.name)
    return tarinfo
    
tar = tarfile.open("model.tar.gz", "w:gz")
tar.add("./models/") #, filter=flatten)
tar.close()
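
Before uploading, it can be worth listing the archive members to confirm the layout the container will see once the archive is extracted. The sketch below builds a small archive from placeholder files in a temporary directory and prints its contents; the file names are hypothetical stand-ins for the real model files.

```python
import os
import tarfile
import tempfile

# Build a throwaway 'models' tree with placeholder files
work_dir = tempfile.mkdtemp()
models_dir = os.path.join(work_dir, 'models')
os.makedirs(os.path.join(models_dir, 'mtcnn'))
for rel in ('people.npz', os.path.join('mtcnn', 'placeholder.txt')):
    with open(os.path.join(models_dir, rel), 'wb') as f:
        f.write(b'placeholder')

# Context managers close the archive even if adding a file fails
archive = os.path.join(work_dir, 'model.tar.gz')
with tarfile.open(archive, 'w:gz') as tar:
    tar.add(models_dir, arcname='models')

with tarfile.open(archive, 'r:gz') as tar:
    members = sorted(tar.getnames())
print(members)
```

Because the directory is added without flattening, every member keeps its `models/` prefix, so the inference code has to look for the files under that subdirectory.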

Set up the SageMaker environment and get the execution role.

In [None]:
import sagemaker
role = sagemaker.get_execution_role()

### Upload model

Before hosting a model, the model artifacts have to be moved to an Amazon S3 bucket. After the artifacts are in the S3 bucket, the model can be deployed as an endpoint on Amazon SageMaker. To perform these tasks, we can make use of the Python SDK that accompanies Amazon SageMaker, which conveniently provides an API to upload our models into S3.

In [None]:
sagemaker_session = sagemaker.Session()
model_data = sagemaker_session.upload_data(path='model.tar.gz', key_prefix='model')
model_data

In [None]:
!aws s3 ls $model_data

### Deploy endpoint

1) Build the Docker container, similar to the BYO [scikit-learn](https://github.com/awslabs/amazon-sagemaker-examples/tree/master/advanced_functionality/scikit_bring_your_own/container) container.

2) Then use the [Python SDK](https://docs.aws.amazon.com/sagemaker/latest/dg/ex1-deploy-model.html) to create the model.

In [None]:
import boto3
# Note: this rebinds the name `sagemaker` from the SDK module imported earlier to a boto3 client
sagemaker = boto3.client('sagemaker')

model_name = 'sagemaker-virtual-concierge'
primary_container = {
    'Image': '423079281568.dkr.ecr.ap-southeast-2.amazonaws.com/sagemaker-virtual-concierge',
    'ModelDataUrl': model_data
}
create_model_response = sagemaker.create_model(
    ModelName = model_name,
    ExecutionRoleArn = role,
    PrimaryContainer = primary_container)

print(create_model_response['ModelArn'])

Then create the endpoint configuration

In [None]:
from time import gmtime, strftime

endpoint_config_name = 'sagemaker-virtual-concierge-config-' + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
print(endpoint_config_name)
create_endpoint_config_response = sagemaker.create_endpoint_config(
    EndpointConfigName = endpoint_config_name,
    ProductionVariants=[{
        'InstanceType':'ml.m4.large',
        'InitialInstanceCount':1,
        'ModelName': model_name,
        'VariantName':'AllTraffic'}])

print("Endpoint Config Arn: " + create_endpoint_config_response['EndpointConfigArn'])

Then create the endpoint and wait until it is in service.

In [None]:
import time

endpoint_name = 'sagemaker-virtual-concierge-' + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
print(endpoint_name)
create_endpoint_response = sagemaker.create_endpoint(
    EndpointName=endpoint_name,
    EndpointConfigName=endpoint_config_name)
print(create_endpoint_response['EndpointArn'])

resp = sagemaker.describe_endpoint(EndpointName=endpoint_name)
status = resp['EndpointStatus']
print("Status: " + status)

try:
    sagemaker.get_waiter('endpoint_in_service').wait(EndpointName=endpoint_name)
finally:
    resp = sagemaker.describe_endpoint(EndpointName=endpoint_name)
    status = resp['EndpointStatus']
    print("Arn: " + resp['EndpointArn'])
    print("Create endpoint ended with status: " + status)
    if status != 'InService':
        # Reuse the response fetched above; FailureReason is only present for failed endpoints
        message = resp.get('FailureReason', 'unknown')
        print('Create endpoint failed with the following error: {}'.format(message))
        raise Exception('Endpoint creation did not succeed')
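
The waiter hides the polling loop. As a rough sketch of the same idea, the hypothetical `wait_for_status` below polls any callable that returns a dict with an `EndpointStatus` key, so it can be exercised here with canned responses instead of a live endpoint. The default delay and attempt count are arbitrary choices for this illustration.

```python
import time

def wait_for_status(describe, target='InService', failed=('Failed',),
                    delay=30, max_attempts=60):
    """Poll describe() until the target status, a failed status, or a timeout."""
    for _ in range(max_attempts):
        status = describe()['EndpointStatus']
        if status == target:
            return status
        if status in failed:
            raise RuntimeError('Endpoint ended with status: {}'.format(status))
        time.sleep(delay)
    raise RuntimeError('Timed out waiting for status {}'.format(target))

# Canned responses standing in for successive describe_endpoint calls
responses = iter([{'EndpointStatus': 'Creating'},
                  {'EndpointStatus': 'Creating'},
                  {'EndpointStatus': 'InService'}])
final_status = wait_for_status(lambda: next(responses), delay=0)
print(final_status)
```

Against a real endpoint, the callable would wrap the client call, for example `lambda: sagemaker.describe_endpoint(EndpointName=endpoint_name)`.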

### Perform inference

Get an image, and [send the bytes](https://medium.com/@julsimon/using-chalice-to-serve-sagemaker-predictions-a2015c02b033) to the endpoint

In [None]:
import json

runtime = boto3.Session().client('sagemaker-runtime')

payload = open('container/local_test/Tom_Hanks_54745.png', 'rb').read()

response = runtime.invoke_endpoint(EndpointName=endpoint_name, 
                                   ContentType='text/csv', 
                                   Body=payload)

result = json.loads(response['Body'].read().decode())
print(result)

In [None]:
import requests

# Post image data directly
data = open('container/local_test/Tom_Hanks_54745.png', 'rb').read()
response = requests.post('http://localhost:8080/invocations', data=data, headers={'Content-Type': 'application/image-x'})
response.json()

In [None]:
import base64

# Wrap the image in base64-encoded JSON so that a bounding box can be passed alongside it
payload = { 'data': base64.b64encode(data).decode("utf-8", "ignore") }
json_data = json.dumps(payload)
response = requests.post('http://localhost:5000/invocations', data=json_data, headers={'Content-Type': 'application/json'})
response.json()
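
The JSON wrapper makes it possible to send extra fields next to the image. The sketch below shows the encode and decode halves of that round trip on fake bytes. Only the `data` key comes from the cell above; the `bbox` field name and both helper functions are assumptions made for this illustration and may not match what the container expects.

```python
import base64
import json

def encode_payload(image_bytes, bbox=None):
    """Client side: wrap raw image bytes (and an optional box) in JSON."""
    payload = {'data': base64.b64encode(image_bytes).decode('utf-8')}
    if bbox is not None:
        payload['bbox'] = list(bbox)  # hypothetical field name
    return json.dumps(payload)

def decode_payload(body):
    """Server side: recover the raw image bytes and the optional box."""
    payload = json.loads(body)
    return base64.b64decode(payload['data']), payload.get('bbox')

# Fake bytes standing in for the contents of an image file
raw = b'\x89PNG\r\n\x1a\n fake image bytes'
body = encode_payload(raw, bbox=(10, 20, 110, 120))
decoded, bbox = decode_payload(body)
print(decoded == raw, bbox)
```

Base64 inflates the payload by roughly a third, so posting raw bytes remains the cheaper option when no extra fields are needed.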

## Lambda Layers

Turn this container into a Lambda layer.