# AI System Project Paper
---

**Author**: James Wells \
**JHU ID**: jwells52 \
**Class**: Creating AI Enabled Systems \
**Date**: 08-20-2023

## Introduction
---

<!-- This AI system automates the identification process of humpback whales using Few Shot Learning [1]. 

Identification of individual animals is a manual process conducted by scientists and it is a time consuming because scientist are required to manually annotate hundreds of data points. Additionally, each data point is scrutinized heavily because the discriminative features for humpback whales are subtle, making it difficult to accurately discriminate individuals. Lastly, traditional deep learning methodologies are driven by data hungry algorithms that require an extensive amount of data and training steps to obtain high performing results. In the case of animal identification, available data on individuals is limited and insufficient for fully training a deep learning network.

This AI system attempts to enhance the identification process of humpback whales by using Few Shot Learning, specifically Prototypical Networks. These networks are simple, yet effective algorithm for use-cases that have very limited data. Furthermore, this system is a web application that allows users to upload images of whales for automatic identification. -->


The purpose of this AI system is to automatically identify individual humpback whales from images of their flukes (i.e., tails).

Identification of individual humpback whales is a manual process that is time-consuming and mentally exhausting. Deep learning can automate this process; however, it is difficult to collect a dataset large enough to train a traditional deep learning model such as a Convolutional Neural Network (CNN) [1].

Fortunately, an area of AI/ML called Few Shot Learning [2] is tailored to use cases where limited data is available. Using this approach, a deep learning model can be trained on the limited amount of data available for individual humpback whales and deployed in a system for automated identification.

The system proposed in this paper is a web application with three components: a Web User Interface, a REST API for model predictions, and a data store for uploaded images and prediction metadata.

![Figure 1](diagrams/systems-project-architecture-final.png "System Architecture") \
*Figure 1: Architecture of Systems Project.*



## Decomposition
---

Identification of individual animals is a manual process that is extremely time-consuming. When images are collected, scientists must manually review and annotate each one. Accurately identifying a humpback whale in an image is also difficult because the discriminative features of these animals are subtle. Finally, the process is highly repetitive, which causes mental fatigue and ultimately leads to a higher rate of incorrect identifications.


This system improves the process by giving scientists a tool that shows the most probable identity of the humpback whales in a given set of images. Automating this step is valuable because the time saved lets scientists focus on the analysis they are conducting on the humpback whales they are observing and tracking.

## Domain Expertise
---

Since the AI model that is predicting the identity of humpback whales is a Few Shot Learning algorithm, the user needs to provide images to the system in the form of support and query sets. 

The support set is required to contain *M* individual whales and for each individual whale *K* images are required. To minimize the chance of incorrect identifications, the suggested values for *M* and *K* are listed below:
* M $<5$
* K $>3$

It is important to note that the identities of all *M* whales must be provided in the support set.
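The suggested support set sizes can be checked before a request is sent. The snippet below is a minimal sketch, not part of the described system: the function name `support_set_warnings` and the dictionary layout (whale identity mapped to a list of images) are assumptions made for illustration, and the thresholds are taken from the suggested values above.

```python
def support_set_warnings(support_set, max_whales=5, min_images=3):
    """Return human-readable warnings; an empty list means the suggestions are met.

    support_set maps a whale identity to its list of images (hypothetical layout).
    The suggested values are M < max_whales and K > min_images.
    """
    warnings = []
    if len(support_set) >= max_whales:
        warnings.append(
            f"{len(support_set)} whales in support set; suggested M < {max_whales}"
        )
    for label, images in support_set.items():
        if len(images) <= min_images:
            warnings.append(
                f"{label}: {len(images)} images; suggested K > {min_images}"
            )
    return warnings


# Two whales with four images each satisfy both suggestions.
ok = support_set_warnings(
    {"oscar": ["a", "b", "c", "d"], "benny": ["e", "f", "g", "h"]}
)

# A whale with only three images falls short of the suggested K.
thin = support_set_warnings({"oscar": ["a", "b", "c"]})
```

Returning warnings instead of raising an error reflects that the values are suggestions for reducing misidentifications, not hard requirements.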

In contrast, the query set contains the images of humpback whales that need to be identified. The query set can be any size the user desires. For each image in the query set, the predicted identity is the identity of the whale in the support set that is *most similar* to it. See *Figure 2* for a visualization of a Few Shot Learning model making a prediction on given images.


![Figure 2](diagrams/fsl.png "Model Inference") \
*Figure 2: Example of model inference when M and K are both set to 3.*
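The "most similar" rule can be sketched for a Prototypical Network, the model type named in the Design section. This is a simplified illustration under stated assumptions: the embeddings are hand-written 2-D toy vectors standing in for the output of a trained network, Euclidean distance is assumed as the similarity measure (the paper does not state the metric), and the function names are hypothetical.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equal-length embedding vectors (the class prototype)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(support, queries):
    """Assign each query embedding the label of its nearest prototype.

    support: dict mapping a whale identity to its K support embeddings.
    queries: list of query embeddings.
    """
    prototypes = {label: mean_vector(embs) for label, embs in support.items()}
    return [
        min(prototypes, key=lambda label: euclidean(query, prototypes[label]))
        for query in queries
    ]


# Toy embeddings with M = 3 whales and K = 3 images each (made-up values).
support = {
    "oscar": [[0.0, 0.0], [0.2, 0.0], [0.0, 0.2]],
    "benny": [[5.0, 5.0], [5.2, 5.0], [5.0, 5.2]],
    "louis": [[0.0, 5.0], [0.2, 5.0], [0.0, 5.2]],
}
queries = [[0.1, 0.1], [0.3, 0.2], [0.1, 4.9]]
predicted = classify(support, queries)
```

Because a prediction is always one of the support set identities, a query image of a whale that is absent from the support set will still be assigned to the closest known whale.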

## Data
---

Data is collected with a camera mounted on a drone. While collecting images, the drone is positioned so that the camera looks perpendicular to the surface of the water (i.e., a bird's-eye view). Once a whale's fluke breaks the surface and is exposed, the drone captures images of it. A standard procedure for collecting images of humpback whale flukes is essential for maximizing the performance of this system: if the system is given images that deviate from the norm, the model is more likely to misidentify a humpback whale.

Once images are collected, the stakeholder must manually annotate 3-5 images of every individual humpback whale captured during the collection phase. The stakeholder **can** annotate images through the web interface. Because of this human feedback step right after collection, the system is not *100%* automated, but it drastically reduces the time spent in the data annotation phase by requiring only a subset of images to be manually annotated. Images of humpback whales, along with their identifiers, are stored in an AWS S3 bucket. Each individual whale has its own folder containing the images of that whale, and each image is named with a randomly generated UUID.

```
oscar/
  - 0a0c1df99.jpg
  - 12ffcs341.jpg
  - mas023n12.jpg
  ...
benny/
  - 1asd942nb.jpg
  - 0093ldnas.jpg
  - 91asdnjks.jpg
  ...
louis/
  - kj231ndsa.jpg
  - yt1sdb2sd.jpg
  - 0l0dw2123.jpg
```
*Example 1: AWS S3 Bucket directory structure for images manually annotated by stakeholders.*
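The folder-per-whale layout in Example 1 amounts to an object key of the form `<identity>/<uuid>.<extension>`. The sketch below shows one way such keys could be generated; the function name `build_object_key` is hypothetical, and it uses a full 32-character hexadecimal UUID, whereas the file names in Example 1 are shortened placeholders.

```python
import uuid
from pathlib import PurePosixPath


def build_object_key(whale_id: str, filename: str) -> str:
    """Build an S3 object key such as 'oscar/<uuid>.jpg' for an annotated image.

    The original file name is discarded except for its extension, which is
    lower-cased; '.jpg' is assumed when the upload has no extension.
    """
    suffix = PurePosixPath(filename).suffix.lower() or ".jpg"
    return f"{whale_id}/{uuid.uuid4().hex}{suffix}"


# Hypothetical upload named IMG_0042.JPG that was annotated as "oscar".
key = build_object_key("oscar", "IMG_0042.JPG")
```

The resulting key could then be passed to an S3 client when uploading, for example as the `Key` argument of boto3's `upload_file`; the bucket name and credentials are outside the scope of this sketch.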


When stakeholders want to automatically assign identities to the humpback whales in newly collected images, they upload the images via the web interface. Once uploaded, the images are preprocessed by resizing them to a resolution of 256x512 pixels. Resizing is necessary because the model cannot accept images that vary in size.


![Figure 3](images/image-preprocessing.png "Image Preprocessing") \
*Figure 3: Image preprocessing.*
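The resizing step can be sketched with the Pillow library. Two things here are assumptions rather than details from the paper: that Pillow is the library used, and that "256x512" means 256 pixels high by 512 pixels wide (flukes are typically wider than they are tall). The function name `preprocess` is also hypothetical.

```python
from PIL import Image

# Assumption: "256x512" is read as height x width.
# Pillow expects sizes as (width, height).
TARGET_SIZE = (512, 256)


def preprocess(image: Image.Image) -> Image.Image:
    """Convert to RGB and resize to the fixed resolution the model expects."""
    return image.convert("RGB").resize(TARGET_SIZE)


# A blank stand-in for an uploaded drone photo.
original = Image.new("RGB", (1920, 1080))
resized = preprocess(original)
```

Note that a plain resize does not preserve the aspect ratio, so images with very different proportions will be stretched or squashed.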

After preprocessing the uploaded images, the web interface converts them from `jpg/png` files to base64-encoded strings. These strings are then sent to the Few Shot Learning model in a REST POST request. The predicted identifier for each image is returned to the web interface, where the stakeholder can review it and change it if needed. Once the stakeholder approves the predicted identities, the images can be added to the AWS S3 bucket. As in the manual annotation process, each image is added to the folder named after the identity it has been given.
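The base64 conversion can be sketched with Python's standard library. The `data:<mime type>;base64,` prefix follows the format shown in Example 2; the function names `to_data_uri` and `from_data_uri` are hypothetical, and the sample bytes are only the 8-byte PNG signature, not a real image.

```python
import base64


def to_data_uri(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URI string."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def from_data_uri(data_uri: str) -> bytes:
    """Decode a base64 data URI back into raw bytes (e.g. on the API side)."""
    header, _, payload = data_uri.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64 data URI")
    return base64.b64decode(payload)


# The PNG file signature, used here as a tiny stand-in for image content.
sample = b"\x89PNG\r\n\x1a\n"
uri = to_data_uri(sample)
restored = from_data_uri(uri)
```

Base64 encoding increases the payload size by roughly one third, which is worth keeping in mind when a request carries a full support set and many query images.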

## Design
---

As stated in the introduction, the proposed solution is a web application that allows stakeholders to manually annotate images of humpback whales, and then use those manual annotations to automatically annotate the remaining unlabeled images. The solution has three components: Web Interface, REST Model API, and AWS S3 Bucket.

The [web interface](./app.py) is a Plotly Dash web application that allows users to upload newly collected images, manually annotate them (if needed), and automatically annotate them. The results of both manual and automatic annotation are uploaded to an AWS S3 Bucket.

The [REST Model API](./api.py) serves a Prototypical Network that takes a support set and a query set as input for model inference. The output of the model is the predicted identifier for each image in the query set. See *Example 2* for the format of the POST request body sent to the REST Model API and *Example 3* for the response.

The AWS S3 Bucket contains the uploaded images in a standardized directory structure. Each identified whale has its own folder in the AWS S3 Bucket, and all images containing that humpback whale exist within that folder, whether they were manually or automatically annotated.


```json
{
    "support_set_labels": ["oscar", "benny", "louis"],
    "support_set_images": {
        "oscar": [
            "data:image/png;base64,/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgc...",
            "data:image/png;base64,/9j/4AAIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL...",
            "data:image/png;base64,/9j/4AAQwBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLD..."
        ],
        "benny": [
            "data:image/png;base64,/9j/4GGSAQSADAJRgAASDJHASIOJS2wBDAAgGBgc...",
            "data:image/png;base64,/9j/4AKLJASDGUISDNFLKAMSDFimsdfladfazNDL...",
            "data:image/png;base64,/9j/4OOKASFAKSDJLAKNSGAOISFMADFl/23fsdsf..."
        ],
        "louis": [
            "data:image/png;base64,/9j/8ASKJDANFJKAkjsdfaskjdfnNJKAAAgGBgc...",
            "data:image/png;base64,/9j/9MASKFNGKLASASNDklndfasdlfaLAKSNDLK...",
            "data:image/png;base64,/9j/0LMJKNDFKJNSDGkjnkJKASNDNKJASDNJKnk..."
        ]
    },
    "query_set_images": [
        "data:image/png;base64,/9j/6HJKAbajfbjkaoeirgjaenrgkIUOAHSDABgc...",
        "data:image/png;base64,/9j/9KLASFNKLASlkdnsadgasdfsgdfsdfSDFs...",
        "data:image/png;base64,/9j/5NNMNAlkmlqtyqwetyYTCVYytsdvfyDsFVYd..."
    ]
}
```
*Example 2: Model REST API POST request body.*
```json
{
    "predicted_labels": [
        "oscar",
        "oscar",
        "louis"
    ]
}
```
*Example 3: Response body from the Model REST API.*
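The request and response shapes in Examples 2 and 3 can be assembled and checked on the client side as sketched below. The field names come from the examples; the helper names `build_request_body` and `read_predictions`, the consistency checks, and the shortened placeholder strings are assumptions for illustration and are not taken from `app.py` or `api.py`.

```python
import json


def build_request_body(support_set, query_images):
    """Assemble a request body shaped like Example 2.

    support_set: dict mapping a whale identity to a list of data URI strings.
    query_images: list of data URI strings to identify.
    """
    return {
        "support_set_labels": list(support_set),
        "support_set_images": {
            label: list(images) for label, images in support_set.items()
        },
        "query_set_images": list(query_images),
    }


def read_predictions(request_body, response_body):
    """Return the predicted labels after basic consistency checks."""
    labels = response_body["predicted_labels"]
    if len(labels) != len(request_body["query_set_images"]):
        raise ValueError("expected one predicted label per query image")
    unknown = set(labels) - set(request_body["support_set_labels"])
    if unknown:
        raise ValueError(f"labels not in support set: {sorted(unknown)}")
    return labels


# Placeholder strings stand in for real base64-encoded images.
support_set = {
    "oscar": ["data:image/png;base64,AAAA", "data:image/png;base64,AAAB"],
    "benny": ["data:image/png;base64,AAAC", "data:image/png;base64,AAAD"],
    "louis": ["data:image/png;base64,AAAE", "data:image/png;base64,AAAF"],
}
query_images = [
    "data:image/png;base64,AAAG",
    "data:image/png;base64,AAAH",
    "data:image/png;base64,AAAI",
]

body = build_request_body(support_set, query_images)
payload = json.dumps(body)  # JSON text to send as the POST body

# A response shaped like Example 3.
response = json.loads('{"predicted_labels": ["oscar", "oscar", "louis"]}')
labels = read_predictions(body, response)
```

The predicted labels are positional: the first label belongs to the first query image, and so on, which is why the length check matters before results are shown for review.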

## Diagnosis
---

Classification accuracy on the predicted identifiers, i.e., the fraction of query images whose predicted identifier matches the true identifier, is the metric used for evaluating the performance of the Few Shot Learning model.
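Classification accuracy is the number of correct predictions divided by the total number of predictions. A minimal sketch follows; the function name `accuracy` and the sample labels are illustrative and are not results from the system.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the true identifiers."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Made-up labels: two of the three predictions match the true identifiers.
score = accuracy(["oscar", "oscar", "louis"], ["oscar", "benny", "louis"])
```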

## Deployment
---

## References

[1] Keiron O'Shea, Ryan Nash. "An Introduction to Convolutional Neural Networks." 2015. arXiv:1511.08458. https://arxiv.org/pdf/1511.08458.pdf \
[2] Archit Parnami, Minwoo Lee. "Learning From Few Examples: A Summary of Approaches to Few Shot Learning." 2022. arXiv:2203.04291v1 [cs.LG]