# Visual Search System Report

## System Design
Revisiting the original requirements below confirms that the system meets each of them.
1. The system should be able to identify millions of known personnel
    * This was achieved by making the system scalable, which includes using an indexing strategy that organizes the data so that searches run faster. In this system, we used a KD-Tree. The gallery embeddings are precomputed when the system first starts and saved to files for reuse, which keeps later searches quick. Analysis in [design_considerations.ipynb](./design_considerations.ipynb) showed that each probe image could be processed in a couple of hundredths of a second.
2. The system should be able to detect non-employees
    * This was achieved by using a representation learning strategy, which involved precomputing embeddings for images from a gallery of known employees. To detect non-employees, the system finds the nearest neighbors of the probe image. If the distance between the probe and the identified nearest neighbors is very large (that is, the similarity is very low), the probe does not sufficiently match any gallery image, which indicates that the person in the probe image is an intruder.
3. The system should be able to maintain a high performance despite different lighting conditions
    * This was achieved by implementing a preprocessing module in the extraction service, which scales and resizes the input images so that the inputs provided to the model are consistent. Additional transformations, such as brightness and color adjustments, could be added to this preprocessing module if necessary to further accommodate different lighting conditions.
4. The system should be able to adjust access permissions (add new hires / remove recent departures)
    * This was achieved by implementing the `add_identity` and `remove_identity` endpoints, which can be used to add new identities and images to the gallery of known employees or to remove existing images from it.
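
The nearest-neighbor check described in requirements 1 and 2 can be sketched as follows. This is a minimal, standard-library-only illustration and not the system's actual implementation: `build_kdtree`, `nearest`, `identify`, the `max_distance` threshold, and the toy two-dimensional embeddings are all assumptions made for the example. A real deployment would more likely use a library KD-Tree and a threshold tuned on validation data.

```python
import math
from typing import List, Optional, Sequence, Tuple

Embedding = Sequence[float]


class KDNode:
    """One KD-Tree node: a gallery embedding plus its identity label."""

    def __init__(self, point, label, axis, left, right):
        self.point, self.label, self.axis = point, label, axis
        self.left, self.right = left, right


def build_kdtree(items: List[Tuple[Embedding, str]], depth: int = 0) -> Optional[KDNode]:
    """Recursively split the gallery on the median of one axis per level."""
    if not items:
        return None
    axis = depth % len(items[0][0])
    ordered = sorted(items, key=lambda item: item[0][axis])
    mid = len(ordered) // 2
    point, label = ordered[mid]
    return KDNode(point, label, axis,
                  build_kdtree(ordered[:mid], depth + 1),
                  build_kdtree(ordered[mid + 1:], depth + 1))


def nearest(node: Optional[KDNode], query: Embedding,
            best: Optional[Tuple[float, str]] = None) -> Optional[Tuple[float, str]]:
    """Return (distance, label) of the closest gallery embedding, or None if empty."""
    if node is None:
        return best
    dist = math.dist(node.point, query)
    if best is None or dist < best[0]:
        best = (dist, node.label)
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, query, best)
    # Only search the far side if the splitting plane is closer than the best match.
    if abs(diff) < best[0]:
        best = nearest(far, query, best)
    return best


def identify(tree: Optional[KDNode], probe: Embedding, max_distance: float) -> Optional[str]:
    """Return the matched identity, or None when the probe is too far from the gallery."""
    match = nearest(tree, probe)
    if match is None or match[0] > max_distance:
        return None  # treated as a non-employee in this sketch
    return match[1]


# Hypothetical gallery of (embedding, identity) pairs.
gallery = [((0.0, 0.0), "alice"), ((1.0, 0.0), "bob"), ((0.0, 1.0), "carol")]
tree = build_kdtree(gallery)
known = identify(tree, (0.9, 0.1), max_distance=0.5)
unknown = identify(tree, (5.0, 5.0), max_distance=0.5)
```

Here `known` resolves to the closest gallery identity because the probe lies within the threshold, while `unknown` is `None` because no gallery embedding is close enough. The choice of Euclidean distance and of a single global threshold is part of the sketch, not a statement about the deployed configuration.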

A diagram of the entire system, from the Module 8 lecture slides, can be seen below. There are three main services: the extraction service, the retrieval service, and the interface service. The extraction service processes images and extracts an embedding vector from each one, saving the embeddings to numpy files. It runs on both the gallery images and the probe images. For the gallery images, the extraction service also applies a KD-Tree indexing strategy, which stores the gallery embeddings in an organized way that makes the search process more efficient. The retrieval service performs a nearest neighbors search to find the images most similar to a given probe image, using the KD-Tree index that is built once the gallery embeddings have been computed at system startup. Finally, the interface service is how the outside world interacts with the system, either by submitting authentication requests and images or by making API calls to retrieve information such as access logs or image files.

![diagram](../assets/images/diagram.png)
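
The "compute once at startup, reuse afterwards" behavior of the extraction service can be sketched as a small cache around the embedding step. This is an illustrative assumption rather than the service's actual code: the function name `load_or_compute_embeddings`, the `embed_fn` callback (a stand-in for the real model), and the cache file layout are all hypothetical.

```python
from pathlib import Path
from typing import Callable, Sequence

import numpy as np


def load_or_compute_embeddings(
    image_paths: Sequence[str],
    embed_fn: Callable[[str], Sequence[float]],
    cache_file: Path,
) -> np.ndarray:
    """Return one embedding row per image, reusing a saved .npy file when present.

    `cache_file` should end in ".npy"; otherwise np.save appends that extension
    and the existence check below would never find the file.
    """
    cache_file = Path(cache_file)
    if cache_file.exists():
        # Later startups: skip the model entirely and load the saved array.
        return np.load(cache_file)

    # First startup: run the (hypothetical) embedding function on every image.
    embeddings = np.stack(
        [np.asarray(embed_fn(path), dtype=np.float32) for path in image_paths]
    )
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    np.save(cache_file, embeddings)
    return embeddings
```

Note that this sketch never invalidates the cache. A gallery that changes through `add_identity` or `remove_identity` would need the saved file and the index refreshed, and how the real system handles that is not shown here.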

## Data, Data Pipelines, and Model

### Data

### Data Pipelines

### Model

## Metrics Definition

### Offline Metrics

### Online Metrics

## Analysis of System Parameters and Configurations

### Design Considerations

### Extraction Service

### Search Service

## Post-Deployment Policies
### Monitoring and Maintenance Plan

### Fault Mitigation Strategies