# Introduction to Deployment

## Installing our required dependencies

In [None]:
! pip install --quiet gradio
! pip install -Uqq fastai

- fastai --> the library that gives us deep learning models and helper functions, so we do not have to implement them from scratch ourselves
- duckduckgo_search --> this library lets us automate searches on the DuckDuckGo search engine, in this case to pull images from the internet
- time --> this library allows us to measure elapsed time, pause the program for a set period, and use various other time-related functions
- pathlib --> this library allows us to manipulate file paths on our system
- gradio --> this library is used to quickly build a simple web application for machine (deep) learning models

In [None]:
from fastai.vision.all import *
from duckduckgo_search import ddg_images
from fastcore.all import *
from fastdownload import download_url
from time import sleep
from pathlib import Path
import gradio as gr

## Making our dataset

In this example, we will be making our own image classification dataset. Run the code below to enter the categories you would like the model to be trained on.

In [None]:
search_terms = []
while True:
    term = input("Enter a category that you would like as part of the image dataset. Enter \"done\" to quit.")
    if term.lower() == "done":
        break
    search_terms.append(term)
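
The loop above stores exactly what is typed, so stray spaces, blank entries, or repeated categories would each become a search term and a folder. Below is a minimal sketch of a cleanup step. The helper name `clean_terms` is hypothetical and not part of the original notebook.

```python
def clean_terms(raw_terms):
    """Strip whitespace, drop blanks, and remove case-insensitive duplicates."""
    seen = set()
    cleaned = []
    for term in raw_terms:
        term = term.strip()
        key = term.lower()
        # Skip empty entries and anything already seen (ignoring case)
        if not term or key in seen:
            continue
        seen.add(key)
        cleaned.append(term)
    return cleaned


print(clean_terms([" cat ", "Dog", "cat", "", "dog "]))  # ['cat', 'Dog']
```

One possible use is `search_terms = clean_terms(search_terms)` right after the loop finishes.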

We need to set the path for our data folder and make the folder itself if it does not already exist.

In [None]:
output_dir = Path("data")
output_dir.mkdir(exist_ok=True, parents=True)

Run the code below to enter how many images you want each category to contain.

Note that the number you enter might not be how many images each category actually ends up with. Some of the images we get from the internet may fail to download or be corrupt, and those are removed later.

In [None]:
max_images = int(input("Enter the maximum number of images you would like to have for each category."))
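
Calling `int()` directly on the input raises a `ValueError` if something other than a whole number is typed. Here is a small sketch of a more forgiving parser. The helper name `parse_max_images` and the fallback value of 30 are illustrative assumptions, not values from the original notebook.

```python
def parse_max_images(text, default=30):
    """Return a positive integer parsed from text, or default if that fails."""
    try:
        value = int(text.strip())
    except ValueError:
        # Not a whole number, so fall back to the default
        return default
    # Zero or negative counts make no sense for a dataset size
    return value if value > 0 else default


print(parse_max_images("50"))   # 50
print(parse_max_images("abc"))  # 30
```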

This code block uses duckduckgo_search to find images on the internet and then downloads them into the data folder. A separate folder is created for each category, and each image goes into the folder of its respective category.

The folder structure will look something like this:

data:
- Category 1:
    - Image 1
    - Image 2
- Category 2:
    - Image 1
    - Image 2


In [None]:
for term in search_terms:
    # Create a folder for the search term
    dest = (output_dir/term)
    dest.mkdir(exist_ok=True, parents=True)

    # Search for images and collect their URLs
    urls = L(ddg_images(term, max_results=max_images)).itemgot("image")

    # Try to download each image; skip any URL that fails
    for url in urls:
        try:
            download_url(url, dest, show_progress=False)
        except Exception:
            continue

    # Shrink the downloaded images in place so that no side exceeds 400 pixels
    resize_images(dest, max_size=400, dest=dest)
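
Because some downloads fail, it can help to check how many files actually landed in each category folder. The following is a rough sketch using only the standard library. The helper `count_images` and the list of file suffixes are assumptions made for illustration, and the demo runs on a temporary folder rather than the real `data` folder.

```python
from pathlib import Path
import tempfile

# Assumed set of image file extensions; adjust as needed
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp"}


def count_images(data_dir):
    """Map each category folder name to the number of image files inside it."""
    counts = {}
    for folder in sorted(Path(data_dir).iterdir()):
        if folder.is_dir():
            counts[folder.name] = sum(
                1 for f in folder.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


# Demo on a throwaway folder that mimics the category layout
with tempfile.TemporaryDirectory() as tmp:
    for category, n in [("cat", 3), ("dog", 2)]:
        folder = Path(tmp) / category
        folder.mkdir()
        for i in range(n):
            (folder / f"image_{i}.jpg").touch()
    demo_counts = count_images(tmp)

print(demo_counts)  # {'cat': 3, 'dog': 2}
```

In the notebook itself, `count_images(output_dir)` would report the counts for the real dataset.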

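Since we are taking the images from the internet, some of the downloaded files may be corrupt or may not be images at all. The code below finds the files that cannot be opened as images, deletes them, and shows how many were removed.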

In [None]:
failed = verify_images(get_image_files(output_dir))
failed.map(Path.unlink)
len(failed)
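
To make the idea of "verify, then delete" concrete, here is a simplified standard-library sketch. It is not how `verify_images` works internally: fastai tries to open each file as an image, while this stand-in only checks the first few bytes of the file against known image signatures. The helper names are hypothetical.

```python
from pathlib import Path
import tempfile

# File signatures ("magic bytes") for a few common image formats
SIGNATURES = (
    b"\xff\xd8\xff",       # JPEG
    b"\x89PNG\r\n\x1a\n",  # PNG
    b"GIF87a",             # GIF (1987 version)
    b"GIF89a",             # GIF (1989 version)
)


def looks_like_image(path):
    """Return True if the file starts with a known image signature."""
    with open(path, "rb") as f:
        header = f.read(8)
    return header.startswith(SIGNATURES)


def remove_bad_files(paths):
    """Delete files that fail the signature check and return the deleted paths."""
    bad = [p for p in paths if not looks_like_image(p)]
    for p in bad:
        Path(p).unlink()
    return bad


# Demo: one file with a PNG signature, one HTML error page saved as .jpg
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good.png"
    good.write_bytes(b"\x89PNG\r\n\x1a\n" + b"\x00" * 16)
    bad = Path(tmp) / "bad.jpg"
    bad.write_bytes(b"<html>404 Not Found</html>")
    removed = remove_bad_files([good, bad])
    removed_names = [p.name for p in removed]
    good_still_exists = good.exists()

print(removed_names, good_still_exists)  # ['bad.jpg'] True
```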

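This code block describes how to turn our image folder into data the model can train on. Each item is an image whose category label is taken from the name of its parent folder, 20% of the images are held out at random for validation, and every image is resized (squished) to 128x128 pixels. The result is stored in one object, `dls`, which serves the images in batches of 20.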

In [None]:
dls = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.2),
    get_y=parent_label,
    item_tfms=[Resize(128, method="squish")],
).dataloaders(output_dir, bs=20)

dls.show_batch(max_n=6)
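
Two of the `DataBlock` arguments are easy to picture in plain Python: labelling an image by its parent folder and holding out a random share for validation. The sketch below mimics those two ideas with the standard library only. It is not fastai's implementation, and the function names and the seed are assumptions chosen for the example.

```python
import random
from pathlib import Path


def parent_folder_label(path):
    """Label an item with the name of the folder that contains it."""
    return Path(path).parent.name


def random_split(items, valid_pct=0.2, seed=42):
    """Shuffle the items and hold out valid_pct of them for validation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_pct)
    # The first n_valid shuffled items are for validation, the rest for training
    return shuffled[n_valid:], shuffled[:n_valid]


files = [f"data/cat/{i}.jpg" for i in range(5)] + [f"data/dog/{i}.jpg" for i in range(5)]
train, valid = random_split(files)

print(len(train), len(valid))         # 8 2
print(parent_folder_label(files[0]))  # cat
```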

## Training, Evaluating, and Saving our Model

Now we get to create and train our model.

Note that in this case, we are "fine-tuning" our model. This means that we take a model that was already trained on a large dataset (a pretrained model) and train it further on our smaller dataset. This is helpful because we do not have to train a model from scratch; we only need to adapt the pretrained model to our specific use case.

In [None]:
learn = vision_learner(dls, resnet18, metrics=[accuracy, error_rate])
learn.fine_tune(4)

To get a better understanding of where the model misclassified images, we can plot a confusion matrix and look at the image the model was most wrong about.

In [None]:
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix()

In [None]:
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_top_losses(1, nrows=2)
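
A confusion matrix is a count of how often each actual category was predicted as each category. The sketch below builds those counts by hand with the standard library to show what the plot summarizes. The labels are made-up sample data and the helper names are hypothetical.

```python
from collections import Counter


def confusion_counts(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))


def most_confused_pairs(actual, predicted):
    """Return misclassified (actual, predicted) pairs, most frequent first."""
    counts = confusion_counts(actual, predicted)
    errors = {pair: n for pair, n in counts.items() if pair[0] != pair[1]}
    return sorted(errors.items(), key=lambda item: item[1], reverse=True)


# Made-up labels for six validation images
actual = ["cat", "cat", "cat", "dog", "dog", "dog"]
predicted = ["cat", "dog", "dog", "dog", "dog", "cat"]

print(confusion_counts(actual, predicted)[("cat", "dog")])  # 2
print(most_confused_pairs(actual, predicted))
# [(('cat', 'dog'), 2), (('dog', 'cat'), 1)]
```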

If our model performs well, then it is always good practice to save/export it.

In [None]:
learn.export('model.pkl')

## Deployment

This code is the function that our web server will call when we want an inference made.

Inference is basically when we want a model to predict something. In this case, that is classifying an image.

In [None]:
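# The probabilities follow the order of the model's vocabulary,
# which is not necessarily the order in which the terms were typed
categories = list(learn.dls.vocab)
def classify_image(image):
    pred, i, prob = learn.predict(image)
    return dict(zip(categories, map(float, prob)))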
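
The function returns a dictionary that maps each category name to a probability, which is the shape a label output expects. The pairing is only correct if the names and probabilities are in the same order; in fastai the probabilities follow the order of `learn.dls.vocab`. The sketch below shows the pairing step in isolation with made-up numbers. The helper names are hypothetical.

```python
def to_label_probs(vocab, probs):
    """Pair each label with its probability; both must be in the same order."""
    if len(vocab) != len(probs):
        raise ValueError("vocab and probs must have the same length")
    return {label: float(p) for label, p in zip(vocab, probs)}


def top_label(label_probs):
    """Return the label with the highest probability."""
    return max(label_probs, key=label_probs.get)


vocab = ["cat", "dog", "horse"]  # illustrative labels
probs = [0.1, 0.7, 0.2]          # illustrative probabilities

result = to_label_probs(vocab, probs)
print(top_label(result))  # dog
```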

This code starts up a local server that hosts our model. We can then run inference through a clean web UI without writing any more code.

In [None]:
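# gr.inputs and gr.outputs were removed in newer Gradio releases,
# so the components are created directly (without the old shape argument)
image = gr.Image()
label = gr.Label()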

interface = gr.Interface(fn=classify_image, inputs=image, outputs=label)
interface.launch(inline=False)