## Using Pre-trained ResNet50 Network

This is the baseline model used in the [AWS visual search blog post](https://aws.amazon.com/blogs/machine-learning/building-a-visual-search-application-with-amazon-sagemaker-and-amazon-es/).

We load the LFW dataset through the sklearn package, which directly provides the "interesting" part (the face area) of each image.

In [1]:
from sklearn.datasets import fetch_lfw_people
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input
import numpy as np

lfw = fetch_lfw_people(color=True, resize=1)
images = preprocess_input(lfw['images'])
labels = lfw['target']

model = ResNet50(weights='imagenet', include_top=False, input_shape=(125, 94, 3), pooling='avg')
vectors = model.predict(images)

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5


## Using KNN for Vector Search

We use the NearestNeighbors class from sklearn to find the 20 nearest neighbours of each query image.

In [2]:
from sklearn.neighbors import NearestNeighbors

knn = NearestNeighbors(n_jobs=-1)
knn.fit(vectors)
neigh = knn.kneighbors(vectors, 20, return_distance=False)
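
To make the shape of `neigh` concrete, here is a minimal brute-force sketch of a k-nearest-neighbour search in plain Python. It is not how sklearn implements `NearestNeighbors` internally; it only mirrors the result under the default Euclidean metric. The names `brute_force_neighbours`, `toy_vectors` and `toy_neigh` are hypothetical and do not appear in the notebook.

```python
import math

def brute_force_neighbours(vectors, k):
    """Return, for every vector, the indices of its k closest vectors (Euclidean).

    Illustration only: hypothetical helper, not sklearn's implementation.
    """
    result = []
    for query in vectors:
        # Sort all indices by distance to the query; ties keep index order.
        order = sorted(range(len(vectors)),
                       key=lambda j: math.dist(query, vectors[j]))
        result.append(order[:k])
    return result

# Two tight clusters of two points each.
toy_vectors = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.0, 5.2)]
toy_neigh = brute_force_neighbours(toy_vectors, k=2)
print(toy_neigh)
```

Because the queries are the same vectors that were indexed, each row starts with the query's own index (distance zero), followed by its real neighbours. The evaluation code relies on this when it treats `n[0]` as the query image.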

## Evaluate Search Results

This evaluation is not required by the project. However, we devised a simple evaluation algorithm because we want to compare the performance of our improved model against this baseline. The algorithm is introduced here and again in the corresponding part of the improved model's notebook.

The idea behind our evaluation algorithm is to retrieve the other images of the same person from the dataset. Suppose the dataset contains three images of Elon Musk. If we use one of them as the query image and both of the other two appear among the neighbours returned by our model, we say the model achieves 100% accuracy for that query. If only one of the other two appears, it achieves 50%. Because we retrieve only 20 neighbours, a person with more than 20 images is treated as having 20. People with only one image in the dataset are skipped as queries, but their images stay in the dataset and can still appear as neighbours when we compute accuracy for other people. The final score is the average over all queries that were not skipped.

The accuracy of the baseline model is poor (about 9.8% by this measure): it retrieves few images of the same person.
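
The scoring rule described above can be sketched on toy data in plain Python, so that the arithmetic can be checked by hand without downloading the dataset. This is an illustration under stated assumptions rather than the notebook's own code: `mean_same_person_score`, `toy_labels` and `toy_neigh` are hypothetical names, and each neighbour row is assumed to start with the query itself.

```python
from collections import Counter

def mean_same_person_score(labels, neigh):
    """Average per-query score; labels[i] is the person id of image i.

    Hypothetical helper. Each row of neigh lists neighbour indices and is
    assumed to start with the query itself.
    """
    faces_per_person = Counter(labels)
    scores = []
    for row in neigh:
        query_label = labels[row[0]]
        n_faces = faces_per_person[query_label]
        if n_faces == 1:
            continue  # nothing to retrieve, so this query is skipped
        # At most len(row) - 1 other images can appear in the row.
        reachable = min(n_faces, len(row)) - 1
        hits = sum(1 for j in row[1:] if labels[j] == query_label)
        scores.append(hits / reachable)
    return sum(scores) / len(scores)

toy_labels = [0, 0, 0, 1, 1, 2]
toy_neigh = [
    [0, 1, 2],  # both other images of person 0 found: 2/2
    [1, 0, 3],  # one of two found: 1/2
    [2, 5, 3],  # none found: 0/2
    [3, 4, 0],  # the only other image of person 1 found: 1/1
    [4, 5, 2],  # not found: 0/1
    [5, 4, 3],  # person 2 has a single image: skipped
]
toy_score = mean_same_person_score(toy_labels, toy_neigh)
print(toy_score)  # (1 + 0.5 + 0 + 1 + 0) / 5 = 0.5
```

Image 5 never counts as a query, yet it still shows up as a distractor in the rows for images 2 and 4, which is the "noise" role the single-image people play in the full evaluation.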

In [3]:
# 5749 people in total; count how many images each person has
faces_per_person = np.bincount(labels)

acc = 0
for n in neigh:
  faces_cur_person = faces_per_person[labels[n[0]]]

  # people with only 1 image are not used as queries,
  # but their images still act as noise for other queries
  if faces_cur_person == 1:
    continue

  # we calculate 20 nearest neighbours, so 20 as maximum
  if faces_cur_person > 20:
    faces_cur_person = 20
  
  cnt = 0
  for i in range(20):
    if labels[n[i]] == labels[n[0]]:
      cnt += 1

  acc += (cnt-1)/(faces_cur_person-1)

# average over the queries that were not skipped
# (9164 images of 1680 people with 2 or more images)
print(acc / np.sum(faces_per_person[labels[neigh[:, 0]]] > 1))

0.0979594586947156
