<a href="https://colab.research.google.com/github/thesteve0/impatient-computer-vision/blob/main/2_classify_embed.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Classification and Embedding

We are going to start with our housekeeping steps, which take a little while to run. While they are running, we will go back to the slides and I will introduce the topics.

### Housekeeping
Before we do anything else, we need to change our machine type to one that has a GPU. Doing computer vision tasks on a CPU, except for some specific models, is extremely slow. One of the reasons we are using Colab is that you can get free access to a GPU for the workshop.

Please:
1. Go up to the top right of the browser
2. Select "Connect"
3. Then "Change Runtime Type"
![change_runtime](assets/2_pick_GPU1.png)

4. Pick T4 GPU
5. Click Save
![pick GPU](assets/2_pick_GPU2.png)

6. When the runtime connects, it should look like this:
![running GPU](assets/2_pick_GPU3.png)
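
If you want to confirm from code that the runtime really has a GPU attached, the following is a minimal sketch. It assumes the `nvidia-smi` command-line tool is on the `PATH`, which is normally the case on Colab GPU runtimes; the helper name `gpu_visible` is made up for this example.

```python
import shutil
import subprocess


def gpu_visible() -> bool:
    """Return True if nvidia-smi is installed and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # No NVIDIA tooling found, so we are almost certainly on a CPU runtime
        return False
    try:
        # "nvidia-smi -L" prints one line per GPU, e.g. "GPU 0: ..."
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU visible:", gpu_visible())
```

If this prints `False`, go back through the runtime steps above before running the long-running cells.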


Time to do our long-running tasks:
1. Load the dependencies
2. Map the drive
3. Load the data

In [None]:
!pip install fiftyone==1.4.1 torch torchvision umap-learn
from google.colab import drive
drive.mount('/content/drive')

import fiftyone as fo

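# Name under which FiftyOne will register the dataset
name = "our-photos"
# Folder on the mounted Google Drive that holds the exported dataset
# (named dataset_dir to avoid shadowing Python's built-in dir())
dataset_dir = "/content/drive/MyDrive/impatient-cv/flickr-labeled"

# Load the dataset from its FiftyOne export format
dataset = fo.Dataset.from_dir(
    dataset_dir=dataset_dir,
    dataset_type=fo.types.FiftyOneDataset,
    name=name
)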

print(dataset)

## Classification

As we discussed in the slides, classification is the computer vision task where you assign an image to a single class out of a list of classes. We are going to use a classification model that is the foundation for many other models and is still quite powerful: ResNet. We are going to use the simplest version, ResNet18, because:

1. It doesn't require many GPU resources
2. It is fast to compute

There are many variations of ResNet, each with a number appended to its name. This number usually represents the number of layers in the neural network.
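
To see where those numbers come from, here is a small arithmetic sketch. It uses the per-stage block counts from the original ResNet paper and counts one stem convolution, the convolutions inside each residual block, and one final fully connected layer. The names `RESNET_CONFIGS` and `count_layers` are made up for this example, and the shortcut (downsampling) convolutions are not included in the count.

```python
# (residual blocks per stage, weighted layers per block) for common ResNet variants
RESNET_CONFIGS = {
    "ResNet18": ([2, 2, 2, 2], 2),    # basic blocks: 2 convolutions each
    "ResNet34": ([3, 4, 6, 3], 2),
    "ResNet50": ([3, 4, 6, 3], 3),    # bottleneck blocks: 3 convolutions each
    "ResNet101": ([3, 4, 23, 3], 3),
    "ResNet152": ([3, 8, 36, 3], 3),
}


def count_layers(blocks_per_stage, convs_per_block):
    # 1 stem convolution + convolutions in all residual blocks + 1 fully connected layer
    return 1 + convs_per_block * sum(blocks_per_stage) + 1


layer_counts = {name: count_layers(*cfg) for name, cfg in RESNET_CONFIGS.items()}
for name, n_layers in layer_counts.items():
    print(name, n_layers)
```

For ResNet18 this works out to 1 + 2 × 8 + 1 = 18, which matches the number in the name.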

### Training data

While ResNet18 has a specific architecture, the model needs to be trained on data before it can make predictions. There are many foundational datasets in computer vision, but a particularly common one is [ImageNet](https://www.image-net.org/index.php). This dataset has 1,000 classes and millions of annotated images. FiftyOne has a [dataset zoo](https://docs.voxel51.com/dataset_zoo/datasets.html) where many important computer vision datasets have been converted into FiftyOne format and are easy to download and view.

Let's download and view a small subset of the ImageNet data, the [ImageNet Sample Data](https://docs.voxel51.com/dataset_zoo/datasets.html#imagenet-sample).

In [None]:
import fiftyone.zoo as foz

imagenet_samples = foz.load_zoo_dataset("imagenet-sample")

session = fo.launch_app(imagenet_samples, auto=False)

session.url


### FiftyOne Model Zoo

The computer vision platform we have been using, FiftyOne, also has a set of models already converted into a format that works with the rest of the FiftyOne platform. Typically, to run a computer vision model you would have to use library-specific code, such as PyTorch, along with other code to specify the architecture. With FiftyOne, we can load the model in one line of code and then run it for classification (inference) with another line of code. Two lines of code and you are in business.

#### ResNet18 in the model zoo

We are going to load the PyTorch version of the [ResNet18 model](https://docs.voxel51.com/model_zoo/models.html#resnet18-imagenet-torch) that was trained on ImageNet.

In [None]:
resnet18_imagenet_model = foz.load_zoo_model("resnet18-imagenet-torch")


### Predictions of our Photos

We have loaded our Flickr dataset and our classification model. Time to have the model predict the classifications for our images.
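
Conceptually, predicting a classification means turning one raw score per class into a single label plus a confidence. The sketch below illustrates that idea in plain Python with a softmax followed by picking the highest probability. It is not FiftyOne's or PyTorch's actual code; the class names and scores are made up for illustration, and a real ImageNet model has 1,000 classes.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores, class_names):
    """Return the single most likely class and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


# Hypothetical class list and made-up model scores for one image
class_names = ["tabby cat", "golden retriever", "sports car"]
scores = [1.2, 3.4, 0.3]

label, confidence = classify(scores, class_names)
print(label, round(confidence, 3))
```

The label and confidence are the two values to look for on each image once the model has run.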

In [None]:
dataset.apply_model(resnet18_imagenet_model, label_field="rn18_in_predictions", progress_bar=True)

# Now let's look at the results
session.dataset = dataset