# Final project: Finding the suspect

<a href="https://en.wikipedia.org/wiki/Facial_composite">Facial composites</a> are widely used in forensics to generate images of suspects. Since a victim or witness usually isn't good at drawing, computer-aided generation is used to reconstruct the attacker's face. One of the most commonly used techniques is an evolutionary system that composes the final face from many predefined parts.

In this project, we will try to implement an app for creating a facial composite that will be able to construct desired faces without explicitly providing databases of templates. We will apply Variational Autoencoders and Gaussian processes for this task.

The final project is designed so that you can apply the techniques you have learned to a real project yourself. We will include the main guidelines and hints, but a large part of the project will rely on your creativity and experience from previous assignments.

### Setup
Load auxiliary files and then install and import the necessary libraries.

In [None]:
import os

def download_file(url, file_path):
    print(url, file_path)
    if os.path.exists(file_path):
        os.remove(file_path)
    template = "wget '{}' -O '{}'"
    os.system(template.format(url, file_path))

def load_data_final_project():
    download_file(
        "https://github.com/hse-aml/bayesian-methods-for-ml/"
        "releases/download/v0.1/CelebA_VAE_small_8.h5",
        "CelebA_VAE_small_8.h5"
    )
    
load_data_final_project()

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import clear_output
import tensorflow as tf
import GPy
import GPyOpt
import keras
from keras.layers import Input, Dense, Lambda, InputLayer, concatenate, Activation, Flatten, Reshape
from keras.layers.normalization import BatchNormalization
from keras.layers.convolutional import Conv2D, Deconv2D
from keras.losses import MSE
from keras.models import Model, Sequential
from keras import backend as K
from keras import metrics
from keras.datasets import mnist
from keras.utils import np_utils
from tensorflow.python.framework import ops
from tensorflow.python.framework import dtypes
import utils
import os
%matplotlib inline

## Model description
We will first train a variational autoencoder (VAE) on face images to compress them into a low-dimensional representation. One important feature of a VAE is that the latent space it constructs is dense. That means that we can traverse the latent space and reconstruct any point along our path into a valid face.
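
To make the idea of traversing the latent space concrete, here is a minimal sketch of linear interpolation between two latent points. The names `interpolate_latents` and `toy_decoder` are assumptions made for illustration: `toy_decoder` is a fixed random linear map followed by a sigmoid, not the pretrained VAE decoder, so the frames it produces are not faces. In the notebook you would pass the interpolated points to the real decoder instead.

```python
import numpy as np

def interpolate_latents(z_start, z_end, n_steps=8):
    """Return n_steps latent vectors evenly spaced on the segment z_start -> z_end."""
    alphas = np.linspace(0.0, 1.0, n_steps)[:, None]
    return (1.0 - alphas) * z_start[None, :] + alphas * z_end[None, :]

def toy_decoder(z, image_shape=(64, 64, 3), seed=0):
    """Hypothetical stand-in for a decoder: fixed random linear map + sigmoid."""
    rng = np.random.RandomState(seed)
    weights = rng.normal(size=(z.shape[-1], int(np.prod(image_shape))))
    logits = z @ weights
    return (1.0 / (1.0 + np.exp(-logits))).reshape((-1,) + image_shape)

rng = np.random.RandomState(73)
z_a, z_b = rng.normal(size=(2, 8))   # two points in an 8-dimensional latent space
path = interpolate_latents(z_a, z_b, n_steps=8)
frames = toy_decoder(path)           # one "image" per point on the path
print(path.shape, frames.shape)
```

With a real VAE decoder, plotting `frames` side by side should show one face gradually morphing into another.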

Given this continuous latent space, we can use Bayesian optimization to maximize a similarity function between the face in the victim's or witness's memory and the face reconstructed from the current point in the latent space. Bayesian optimization is an appropriate choice here because people start to forget details about the attacker after being shown many similar photos. Because of this, we want to reconstruct the photo in as few trials as possible.
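
The following is a rough NumPy-only sketch of such a loop, not the GPyOpt implementation used later in this notebook. It fits a Gaussian process with an RBF kernel to the scores seen so far and picks the next point by expected improvement over a random candidate pool. The `similarity` function, the 2-dimensional search space, and all helper names are assumptions made for illustration; in the real task the score comes from a human.

```python
import math
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between the rows of a and b."""
    sq_dist = (np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :]
               - 2.0 * a @ b.T)
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)

def gp_posterior(x_train, y_train, x_test, noise=1e-2):
    """Posterior mean and standard deviation of a zero-mean GP at x_test."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_ts = rbf_kernel(x_train, x_test)
    solved = np.linalg.solve(k_tt, k_ts)
    mean = solved.T @ y_train
    var = 1.0 - np.sum(k_ts * solved, axis=0)  # RBF prior variance is 1
    return mean, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mean, std, best, jitter=0.01):
    """Expected improvement for maximization."""
    z = (mean - best - jitter) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
    return (mean - best - jitter) * cdf + std * pdf

def similarity(z, target):
    """Hypothetical score: negative squared distance to a hidden target."""
    return -np.sum((z - target)**2, axis=-1)

rng = np.random.RandomState(0)
target = np.array([1.0, -0.5])
x = rng.uniform(-2, 2, size=(5, 2))           # 5 initial random trials
y = similarity(x, target)
initial_best = y.max()

for _ in range(15):                           # small, fixed budget of trials
    y_norm = (y - y.mean()) / (y.std() + 1e-9)
    candidates = rng.uniform(-2, 2, size=(500, 2))
    mean, std = gp_posterior(x, y_norm, candidates)
    ei = expected_improvement(mean, std, y_norm.max())
    x_next = candidates[np.argmax(ei)]
    x = np.vstack([x, x_next])
    y = np.append(y, similarity(x_next, target))

print("best initial score:", initial_best, "best final score:", y.max())
```

Each iteration spends exactly one evaluation of the score, which is the quantity we want to keep small when the evaluator is a person.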

## Generating faces

For this task, you will need a database of face images. There are multiple datasets available on the web that you can use: for example, <a href="http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html">CelebA</a> or <a href="http://vis-www.cs.umass.edu/lfw/">Labeled Faces in the Wild</a>. We used the Aligned & Cropped version of CelebA, which you can find <a href="https://www.dropbox.com/sh/8oqt9vytwxb3s4r/AADSNUu0bseoCKuxuI5ZeTl1a/Img?dl=0&preview=img_align_celeba.zip">here</a>, to pretrain a VAE model for you. See the optional part of the final project if you wish to train a VAE on your own.

<b>Task 1:</b> Train a VAE on a faces dataset and draw some samples from it. (You can use code from previous assignments. You may also want to use convolutional encoders and decoders and tune the hyperparameters.)

In [None]:
sess = tf.InteractiveSession()
K.set_session(sess)

In [None]:
latent_size = 8

In [None]:
vae, encoder, decoder = utils.create_vae(batch_size=128, latent=latent_size)
sess.run(tf.global_variables_initializer())
vae.load_weights('CelebA_VAE_small_8.h5')

In [None]:
K.set_learning_phase(False)

In [None]:
latent_placeholder = tf.placeholder(tf.float32, (1, latent_size))
decode = decoder(latent_placeholder)

In [None]:
np.random.seed(73)

#### GRADED 1 (3 points): Draw 25 samples from trained VAE model
As the first part of the assignment, you need to become familiar with the trained model. For all tasks, you will only need the decoder to reconstruct samples from the latent space.

To decode a latent variable, you need to run the ```decode``` operation defined above with random samples from a standard normal distribution.

In [None]:
latent_vec = np.random.normal(size=(1, latent_size))
image = sess.run(decode, feed_dict={latent_placeholder : latent_vec})[0]

In [None]:
### TODO: Draw 25 samples from VAE here
plt.figure(figsize=(10, 10))
for i in range(25):
    plt.subplot(5, 5, i+1)
    latent_vec = np.random.normal(size=(1, latent_size))
    image = sess.run(decode, feed_dict={latent_placeholder : latent_vec})[0] ### YOUR CODE HERE
    plt.imshow(np.clip(image, 0, 1))
    plt.axis('off')

## Search procedure

Now that we have a way to reconstruct images, we need to set up an optimization procedure to find the person most similar to the one we have in mind. To do so, we need a scoring utility. Imagine that you want to generate an image of Brad Pitt. You start with a small number of random samples, say 5, and rank them according to their similarity to your vision of Brad Pitt: 1 for the worst, 5 for the best. You then rate new images one at a time as they are proposed by GPyOpt, which works in the latent space of the VAE. Each new image must be assigned a real number that reflects how good it is. A simple idea is to ask the user to compare the new image with previous images (shown along with their scores). The user then enters a score for the current image.
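
The initial ranking step described above can be sketched as follows. The helper name `ranking_to_scores` is an assumption made for illustration, and the ordering is hard-coded here in place of real user input.

```python
import numpy as np

def ranking_to_scores(order_worst_to_best):
    """Convert an ordering of image indices (worst first) into scores 1..n."""
    order = np.asarray(order_worst_to_best)
    n = len(order)
    if sorted(order.tolist()) != list(range(n)):
        raise ValueError("ordering must be a permutation of 0..n-1")
    scores = np.empty(n, dtype=float)
    scores[order] = np.arange(1, n + 1)  # worst image gets 1, best gets n
    return scores

# Suppose the user says image 3 is the worst and image 1 is the best.
scores = ranking_to_scores([3, 0, 4, 2, 1])
print(scores)
```

The resulting array is indexed by image, so `scores[i]` is the score of the `i`-th sample and can be paired directly with its latent vector.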

The proposed scoring has a lot of drawbacks, and you should feel free to come up with alternatives: e.g., showing the user 9 different images and asking which one looks the "best".

Note that the goal of this task is for you to implement a new algorithm by yourself. You may try different techniques for your task and select the one that works best.
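
One possible way to turn such a "pick the best of nine" answer into numeric feedback is sketched below. This is only one of many options, and the helper `scores_from_choice` and its parameters are assumptions made for illustration; the user's choice is hard-coded.

```python
import numpy as np

def scores_from_choice(n_shown, chosen_index, win_score=1.0, lose_score=0.0):
    """Give win_score to the image the user picked and lose_score to the rest."""
    if not 0 <= chosen_index < n_shown:
        raise ValueError("chosen_index must be in [0, n_shown)")
    scores = np.full(n_shown, lose_score, dtype=float)
    scores[chosen_index] = win_score
    return scores

rng = np.random.RandomState(0)
latents = rng.normal(size=(9, 8))   # nine candidate latent vectors
chosen = 4                          # index the user picked (hard-coded here)
scores = scores_from_choice(len(latents), chosen)
print(scores)
```

The `(latents, scores)` pairs could then be passed to the optimizer as observations. Binary scores carry less information per image than a 0 to 9 rating, but a single choice may be easier for the user to make consistently.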

<b>Task 2:</b> Implement person search using Bayesian optimization. (You can use code from the assignment on Gaussian Processes)

Note: try varying `acquisition_type` and `acquisition_par` parameters.
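
To build intuition for what an acquisition parameter does, here is a generic sketch of the exploration/exploitation trade-off using an upper-confidence-bound rule. It is not GPyOpt's internal code: the function name, the `weight` parameter, and the posterior values are assumptions made for illustration. Check the GPyOpt documentation for how `acquisition_type` and `acquisition_par` are interpreted in your installed version.

```python
import numpy as np

def upper_confidence_bound(mean, std, weight):
    """Score candidates by predicted value plus a bonus for uncertainty."""
    return mean + weight * std

# Made-up posterior over three candidates:
# index 0 looks good and is well explored, index 2 is mostly unknown.
mean = np.array([0.9, 0.7, 0.2])
std = np.array([0.05, 0.4, 0.6])

picks = []
for weight in (0.0, 1.0, 3.0):
    pick = int(np.argmax(upper_confidence_bound(mean, std, weight)))
    picks.append(pick)
    print("weight =", weight, "-> candidate", pick)
```

A small weight keeps proposing faces close to the best one seen so far, while a large weight proposes faces the model knows little about. With a human in the loop, too much exploration can waste the limited number of ratings.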

In [None]:
class FacialComposit:
    def __init__(self, decoder, latent_size):
        self.latent_size = latent_size
        self.latent_placeholder = tf.placeholder(tf.float32, (1, latent_size))
        self.decode = decoder(self.latent_placeholder)
        self.samples = None
        self.images = None
        self.rating = None

    def _get_image(self, latent):
        img = sess.run(self.decode, 
                       feed_dict={self.latent_placeholder: latent[None, :]})[0]
        img = np.clip(img, 0, 1)
        return img

    @staticmethod
    def _show_images(images, titles):
        assert len(images) == len(titles)
        clear_output()
        plt.figure(figsize=(3*len(images), 3))
        n = len(titles)
        for i in range(n):
            plt.subplot(1, n, i+1)
            plt.imshow(images[i])
            plt.title(str(titles[i]))
            plt.axis('off')
        plt.show()

    @staticmethod
    def _draw_border(image, w=2):
        bordred_image = image.copy()
        bordred_image[:, :w] = [1, 0, 0]
        bordred_image[:, -w:] = [1, 0, 0]
        bordred_image[:w, :] = [1, 0, 0]
        bordred_image[-w:, :] = [1, 0, 0]
        return bordred_image

    def test_dev(self, n_start=5, select_top=None):
        return self.rating, np.quantile(self.rating, 0.5, interpolation='nearest')

    def query_initial(self, n_start=5, select_top=None):
        '''
        Creates initial points for Bayesian optimization.
        Generates *n_start* random images and asks the user to rate them.
        Keeps the *select_top* highest-rated images, sorted from best to worst.
        :param n_start: number of images to rate initially
        :param select_top: number of top-rated images to keep
        '''

        samples = np.random.normal(size=(n_start,self.latent_size))
        images = np.array([self._get_image(samples[i]) for i in range(n_start)])

        if select_top is None:
            select_top = n_start

        self._show_images(images, range(1,n_start+1))
        print('Initialization for Bayesian Optimization')
        print('Please provide a rating for the following images, between 0 and 9')

        rating = []
        for i in range(n_start):
            rating_x = int(input('Rating for IMAGE ' + str(i + 1)+ '  :  '))
            rating.append(rating_x)

        select_top_idx = np.argsort(rating)[::-1][:select_top]

        self.samples = samples[select_top_idx]  ### YOUR CODE HERE (size: select_top x latent_size)
        self.images = images[select_top_idx]  ### YOUR CODE HERE (size: select_top x 64 x 64 x 3)
        self.rating = np.array(rating)[select_top_idx] ### YOUR CODE HERE (size: select_top)

        # Check that tensor sizes are correct
        np.testing.assert_equal(self.rating.shape, [select_top])
        np.testing.assert_equal(self.images.shape, [select_top, 64, 64, 3])
        np.testing.assert_equal(self.samples.shape, [select_top, self.latent_size])

    def evaluate(self, candidate):
        '''
        Queries candidate vs known image set.
        Adds candidate into images pool.
        :param candidate: latent vector of size 1xlatent_size
        '''
        initial_size = len(self.images)
        
        ### YOUR CODE HERE
        ## Show user an image and ask to assign score to it.
        ## You may want to show some images to user along with their scores
        ## You should also save candidate, corresponding image and rating
        candidate_image = self._get_image(candidate[0]).reshape(1,64,64,3)

        def find_nearest(array, value):
            array = np.asarray(array)
            idx = (np.abs(array - value)).argmin()
            return idx
        
        avg_rating_idx = find_nearest(self.rating, np.average(self.rating))
        image_idx = np.array([0, avg_rating_idx, (initial_size - 1) // 2, initial_size - 1])
        images = self.images[image_idx]
        ratings = self.rating[image_idx]

        comp = np.append(images, candidate_image, axis=0)

        self._show_images(comp, [f'Best Image (r:{ratings[0]})',
                                 f'Avg Image (r:{ratings[1]})',
                                 f'Middle Image (r:{ratings[2]})',
                                 f'Worst Image (r:{ratings[3]})', 'Candidate'])

        print("Bayesian optimization loop.")
        print("Please provide a rating for the new candidate image from 0 to 9")
        candidate_rating = int(input("Rating for candidate image : "))
        rating = np.append(self.rating, candidate_rating)

        sorted_rating_idx = np.argsort(rating)[::-1]

        self.images = np.append(self.images, candidate_image, axis=0)[sorted_rating_idx]
        self.rating = rating[sorted_rating_idx]
        self.samples = np.append(self.samples, candidate, axis=0)[sorted_rating_idx]

        assert len(self.images) == initial_size + 1
        assert len(self.rating) == initial_size + 1
        assert len(self.samples) == initial_size + 1
        return candidate_rating

    def optimize(self, n_iter=10, w=4, acquisition_type='EI', acquisition_par=0.3):
        select_top = 5
        if self.samples is None:
            self.query_initial(n_start=20, select_top=select_top)

        bounds = [{'name': 'z_{0:03d}'.format(i),
                   'type': 'continuous',
                   'domain': (-w, w)} 
                  for i in range(self.latent_size)]
        optimizer = GPyOpt.methods.BayesianOptimization(f=self.evaluate, domain=bounds,
                                                        acquisition_type = acquisition_type,
                                                        acquisition_par = acquisition_par,
                                                        exact_eval=False, # Since we are not sure
                                                        model_type='GP',
                                                        X=self.samples,
                                                        Y=self.rating[:, None],
                                                        maximize=True)
        optimizer.run_optimization(max_iter=n_iter, eps=-1)

    def get_best(self):
        index_best = np.argmax(self.rating)
        return self.images[index_best]

    def draw_best(self, title=''):
        index_best = np.argmax(self.rating)
        image = self.images[index_best]
        plt.imshow(image)
        plt.title(title)
        plt.axis('off')
        plt.show()

Describe your approach below: How do you assign a score to a new image? How do you select reference images to help user assign a new score? What are the limitations of your approach?

> ### How do you assign a score to a new image?
We start by drawing 20 random latent samples and decoding them into images with the VAE decoder. We ask the user to score these 20 images on a scale from 0 to 9 according to the current objective (e.g., lecturer, darkest hair). Once the images are scored, we keep the top 5 as the initialization for Bayesian optimization. The optimizer then generates a candidate, which is displayed to the user for scoring. This is repeated for 10 iterations by default. At each iteration, the optimizer takes the new rating into account when generating the next candidate. At the end, the best-rated image is selected.

> ### How do you select reference images to help user assign a new score?
At each iteration, four reference images are selected based on the previous ratings: the current best, the one whose rating is closest to the average, the median, and the worst. They are shown next to the candidate so that the user can rate the newly generated image relative to the previous ratings.

> ### What are the limitations of your approach?
Faces are complicated to describe. For complex queries, such as finding the lecturer, the optimizer has a hard time identifying which features to optimize, because a single score does not tell it which feature the user is responding to. When we want to reconstruct faces with more detail than a simple hair color or smile, the current feedback mechanism may be too coarse to express those features.

## Testing your algorithm

In these sections, we will apply the implemented app to search for different people. Each task will ask you to generate images that will have some property like "dark hair" or "mustache". You will need to run your search algorithm and provide the best discovered image.

#### Task 3.1. Finding person with darkest hair

In [None]:
composit = FacialComposit(decoder, 8)
composit.optimize()

In [None]:
composit.draw_best('Darkest hair')

#### Task 3.2. Finding person with the widest smile

In [None]:
composit = FacialComposit(decoder, 8)
composit.optimize()

In [None]:
composit.draw_best('Widest smile')

#### Task 3.3. Finding Daniil Polykovskiy or Alexander Novikov — lecturers of this course

Note: this task depends heavily on the quality of the VAE and the search algorithm. You may need to restart your search algorithm a few times and start with a larger initial set.

In [None]:
composit = FacialComposit(decoder, 8)
composit.optimize(n_iter=25, acquisition_par=0.5)

In [None]:
composit.draw_best('Lecturer')

#### <small>Don't forget to post the resulting image of the lecturers on the forum ;)</small>

#### Task 3.4. Finding specific person (optional, but very cool)

Now that you have a good sense of what your algorithm can do, here is an optional assignment for you. Think of a famous person and take a look at his/her picture for a minute. Then use your app to create an image of the person you thought of. You can post it in the forum <a href="https://www.coursera.org/learn/bayesian-methods-in-machine-learning/discussions/forums/SE06u3rLEeeh0gq4yYKIVA">Final project: guess who!</a>


In [None]:
### Your code here