## Material Anomaly Detection

This project aims to leverage pre-trained deep convolutional neural networks (CNNs) to solve the problem of anomaly detection in material-based settings. This is a common problem in the field that we work in, and is in essence a binary classification problem.

Although this tends to be very easy when two distinct labelled classes are available with ample examples of each, it can be very challenging and open-ended if no examples of the negative (failure) class are available. This is common in manufacturing settings, where high reliability means there are thousands of normal examples for each abnormal example.

Key Tools/Libraries and Concepts:
- PyTorch and torchvision models (pretrained models); you can use another framework if you find that easier
- Transfer learning to train classifier layers on top of pretrained backbones
- Residual networks explanation (resnets) https://www.youtube.com/watch?v=o_3mboe1jYI&ab_channel=rupertai

Throughout this project, you are expected to use a pretrained CNN and apply transfer learning to train it to perform the binary classification task. Notably, there are other ways to perform this task that are generally better approaches, and exploring them is left as an optional exercise.

You can use any resources at your disposal, including YouTube, blog posts, ChatGPT, etc. to complete the project. This is meant to be an easy project that we can discuss during a later interview. The open-ended part of the problem, however, is significantly more challenging.





## Step 1: Dataset Exploration

Download the Concrete Crack Dataset from the provided link: https://data.mendeley.com/datasets/5y9wdsg2zt/2
Extract the dataset and explore its structure.
Identify the folders containing the positive and negative samples.
Visualize a few samples from each class to understand the data.

The dataset is composed of 20,000 negative examples and 20,000 positive examples, so class imbalance is a non-issue.
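
As a starting point for the exploration, here is a minimal sketch that counts the image files in each class folder. It assumes the archive extracts into one sub-folder per class; the path `data/concrete_cracks` and the extension list are placeholders, so adjust them to match what you actually find on disk.

```python
from pathlib import Path

# Assumed set of image extensions; extend it if the dataset uses others.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def count_images_per_class(root):
    """Return {sub-folder name: number of image files} for each class folder."""
    root = Path(root)
    counts = {}
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts


# Hypothetical extraction path; only runs if the folder exists.
dataset_root = Path("data/concrete_cracks")
if dataset_root.exists():
    print(count_images_per_class(dataset_root))
```

If the counts do not match the figures quoted above, that is worth investigating before any training starts.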

Since this is a classification problem, you may use pretrained backbones

In [1]:
import cv2
import torch



## Step 2: Preprocessing/Building Data Loaders

It is often helpful to preprocess the data. Normalization, resizing, and other types of augmentation can all be applied.

During training, data augmentation can be done on the fly to artificially increase the size of the dataset. You can use PyTorch's torchvision transforms to do this very easily. Here's the documentation: https://pytorch.org/vision/stable/generated/torchvision.transforms.Compose.html

There is a long list of transforms that can be applied. Choose some to try out, and reason about why you might want to choose those.

Afterwards, you can build a DataLoader object which helps with the training process to load the data in set batches:
https://pytorch.org/tutorials/beginner/basics/data_tutorial.html

PyTorch Lightning works with objects called "DataModules", which you should implement to load the train/val/test datasets.
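
The splitting and batching mechanics can be sketched with plain PyTorch before wrapping them in a DataModule. The example below uses random tensors in place of real images, and the 70/15/15 split and batch size of 16 are arbitrary assumptions. With the real images, something like `torchvision.datasets.ImageFolder` could take the place of `TensorDataset`.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Synthetic stand-in for the image dataset: 100 RGB "images" with binary labels.
images = torch.randn(100, 3, 32, 32)
labels = torch.randint(0, 2, (100,))
dataset = TensorDataset(images, labels)

# Fixed seed so the split is reproducible across runs.
generator = torch.Generator().manual_seed(0)
train_set, val_set, test_set = random_split(dataset, [70, 15, 15], generator=generator)

# Only the training loader shuffles; validation and test order stays fixed.
train_loader = DataLoader(train_set, batch_size=16, shuffle=True)
val_loader = DataLoader(val_set, batch_size=16)
test_loader = DataLoader(test_set, batch_size=16)

xb, yb = next(iter(train_loader))
print(xb.shape, yb.shape)
```

In a DataModule, the split would typically live in `setup` and each loader would be returned from `train_dataloader`, `val_dataloader`, and `test_dataloader`.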

In [None]:
import torch
import torchvision.transforms as tf
import pytorch_lightning as pl

## Step 3: Model Creation

The basic idea here is to do a standard transfer learning process whereby you change the last layer of a classifier CNN model and train it yourself.

As a hint, you can use ResNet18 and simply change the last layer, called "fc", to begin building the model. This should not require a lot of code.





In [14]:
import torchvision.models
import torch
import torch.nn.functional as F
import torch.optim as optim
import pytorch_lightning as pl

class NeuralNet(pl.LightningModule):

    def __init__(self):
        super().__init__()
        pass

    def forward(self,x):
        pass
    
    def training_step(self,batch,batch_idx):
        pass        
    
    def validation_step(self,batch,batch_idx):
        pass
    
    def configure_optimizers(self):
        pass


## Step 4: Training

You will need to work with PyTorch Lightning DataModules, LightningModules, and Trainers to begin training. Once you have those objects, training is a simple call to the trainer. You will want to figure out how to monitor metrics if you want to experiment with different training strategies. Weights & Biases (wandb) is a good library that we use internally to track progress; integrating it is an optional assignment (it is not hard to implement).



## Step 5: Review Results

Looking at the results, what do you think are the key metrics that tell you whether this model is good or bad?

What other testing would you try to implement to ensure the model is accurate in real life?
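
As one way to start answering these questions, the sketch below computes a few common binary classification metrics from predicted and true labels. Which of them matters most is left for you to argue; the helper name `binary_metrics` and the toy labels are made up for illustration, with 1 taken to mean "cracked".

```python
def binary_metrics(y_true, y_pred):
    """Confusion-matrix counts and derived metrics for labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
        "precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy,
    }


# Toy example, not real model output.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
print(binary_metrics(y_true, y_pred))
```

Recall on the cracked class is the fraction of real cracks the model catches, which is often the number a safety-oriented application cares about most.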



## Step 6: Deploy a server endpoint
Using FastAPI, you can easily deploy a server endpoint to allow your model to run. Note that this will require you to organize your inference code so that it can be used in handling requests. This is a practical step of the machine learning deployment process, since other applications may want to interface with your prediction engine.

The following is an example of the code that would exist on a server (which runs the machine learning code). Running this inside a ".py" file will start a server in the terminal.


In [None]:
import base64
import uvicorn
from fastapi import FastAPI, Request
from PIL import Image
from io import BytesIO
app = FastAPI()

def res2dict(res):
    '''
    Convert the output of the model into a serializable format for request
    '''
    # Start code here
    pass 

@app.get("/health")
async def health(request: Request):
    output_dict = {'status': 'server is running'}
    return output_dict

@app.post("/inference")
async def inference(request: Request):

    # Get the request and convert to PIL image
    data = await request.json()  # Get the json
    img_str = data.get("img")  # Get the image as base64 encoded string
    img_bytes = base64.b64decode(img_str)
    img = Image.open(BytesIO(img_bytes))  # Read as if it were a file

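    # Run the image through the model.
    # NOTE: `model` is not defined in this template; load your trained model once at startup.
    res = model.predict(img)  # Your model prediction functions would be here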
    output_dict = res2dict(res)
    return output_dict


if __name__ == "__main__":
    uvicorn.run(
        "server:app",  # import string (assumes this file is server.py); uvicorn requires it when reload=True
        port=8000,
        host="0.0.0.0",
        reload=True,
        reload_dirs=["src"],
    )
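
The `res2dict` stub above is left for you to fill in. One possible shape is sketched below, assuming the model output has already been converted to a plain list of two logits. The class names, the `label` and `confidence` fields, and the use of a `detections` key (chosen only because the client code in the next section reads that key) are all assumptions.

```python
import json
import math

# Hypothetical class names, indexed by the model's output position.
CLASS_NAMES = ["negative", "positive"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def res2dict(logits):
    """Convert raw logits into a JSON-serializable prediction dict."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {
        "detections": [
            {"label": CLASS_NAMES[best], "confidence": round(probs[best], 4)}
        ]
    }


payload = res2dict([0.2, 2.3])  # made-up logits
print(json.dumps(payload))      # succeeds only if every value is serializable
```

Returning built-in Python types matters here because tensors and NumPy scalars are not JSON-serializable by default.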

## Send a request to the server
Assuming the server runs, you can now send images to the endpoint and get predictions back in the form of JSON. Below is an example of that client code. You can use pretty much anything to send the request, but this is what it looks like in Python.


In [None]:
import base64
import requests
from PIL import Image
from io import BytesIO

def img2str(img: Image.Image) -> str:
    ''' 
    Encoding function to get base64 image string
    '''
    buffered = BytesIO()
    img.save(buffered, format="PNG")  # Save into the bytes object

    encoded_img = base64.b64encode(buffered.getvalue())  # Save the encoding as bytes
    encoded_str = encoded_img.decode("utf-8")  # Decode into a string
    return encoded_str


img = Image.open("data/city-scape.png")
img_str = img2str(img) 
payload = {"img": img_str}

# Endpoint of the server
url = "http://127.0.0.1:8000/inference"

# Send the request and read the response
response = requests.post(url, json=payload)
print(response.json().get("detections")) # Get the relevant key that contains detections

## Optional: Few shot learning

Suppose now that instead of a balanced 50/50 dataset of good and bad examples, we have 20,000 good examples and 50 bad (cracked) examples.

How would this change your strategy around model building/training?

What other approaches might be more suitable?

Any attempts to address this problem, even ones that are purely conceptual, will be considered extra credit. If you decide to develop an algorithm, do not feel constrained to the standard "classification" transfer learning paradigm.
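
To give a feel for how severe this imbalance is, the sketch below computes inverse-frequency class weights for the 20,000 / 50 split. This is only one simple mitigation within the classification paradigm, not a suggested answer; the helper name and class labels are illustrative. Weights like these could be passed to a weighted loss, for example the `weight` argument of `torch.nn.CrossEntropyLoss`.

```python
def inverse_frequency_weights(class_counts):
    """Weight each class by total / (num_classes * count), so rare classes weigh more."""
    total = sum(class_counts.values())
    num_classes = len(class_counts)
    return {
        name: total / (num_classes * count)
        for name, count in class_counts.items()
    }


weights = inverse_frequency_weights({"good": 20000, "cracked": 50})
print(weights)
```

With these counts each cracked example is weighted 400 times more heavily than a good one, which hints at why reweighting alone may be fragile with only 50 examples.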

## Optional: Docker Deployment

You may want to use Docker to containerize your server in production. The reasons for this are many, but include:

- replicate errors for debug
- make the software more portable
- allow for scaling on cloud
- development inside containers (allows you to develop/debug in the container itself)

You will need to download and install Docker Desktop to do this. Below is a template Dockerfile that you can use to build a server. You may have to change which paths are copied into the container on build.

Building a container looks something like this:
```
docker build . -t container_name
```

Afterwards, you should be able to run the container with:

```
docker run -p 8000:8000 container_name
```

Now your server is online on port 8000, allowing you to perform model inference inside a container. Anyone else will now be able to run your container and do inference with your models!
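
A quick way to confirm the container is reachable is to call the `/health` route defined in the server code. The sketch below uses only the standard library; the helper name `check_health` is hypothetical, and the default URL assumes the port mapping from the `docker run` command above.

```python
import json
import urllib.request


def check_health(base_url="http://127.0.0.1:8000", timeout=5.0):
    """GET <base_url>/health and return the decoded JSON body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    try:
        print(check_health())
    except OSError as exc:  # urllib's URLError is a subclass of OSError
        print(f"Server not reachable: {exc}")
```

If the call fails while the container is running, the port mapping and the `--host 0.0.0.0` setting are the first things to check.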


In [None]:
FROM python:3.8

RUN apt-get update && \
    apt-get install -y --no-install-recommends \
    libgl1 libgl1-mesa-glx \
    libglib2.0-0 

WORKDIR /app

# Dependencies
RUN python -m pip install -U pip

# Copy requirements
COPY requirements.txt requirements.txt
# Install requirements
RUN --mount=type=ssh pip install -r requirements.txt

# Copy the server code
COPY ./server.py ./server.py 
# Copy model weights
COPY ./yolov5s.pt ./yolov5s.pt

CMD ["uvicorn", "server:app", "--host", "0.0.0.0", "--port", "8000"]