<img src="../img/saturn_logo.png" width="300" />

# Baseline Inference

This project performs inference: given an image, the model assigns it the most likely label. We're using the [Stanford Dogs Dataset](http://vision.stanford.edu/aditya86/ImageNetDogs/), so we're asking ResNet50 to give us the correct breed label.

Before we parallelize this task, let's run a quick single-threaded version. Then, in [Notebook 4](04-parallel-inference.ipynb), we'll convert it to a parallelized task.

### Set up file store

Connect to the S3 bucket where the images are held. The bucket is public, so `anon=True` lets us connect without credentials.

In [None]:
import s3fs
s3 = s3fs.S3FileSystem(anon=True)

### Download model and labels for ResNet

From the S3 data store we load the 1000-item ImageNet label list, which lets us turn the model's numeric predictions into human-interpretable strings. The sample image comes from the same bucket and is loaded in the next section.

PyTorch has a companion library, torchvision, which gives us access to a number of handy tools, including pretrained copies of popular models like ResNet. You can learn more about the available models in [the torchvision documentation](https://pytorch.org/docs/stable/torchvision/models.html).

In [None]:
from torchvision import datasets, transforms, models

resnet = models.resnet50(pretrained=True)

# Open in text mode ('r') so each label is a str rather than bytes
with s3.open('s3://saturn-public-data/dogs/imagenet1000_clsidx_to_labels.txt', 'r') as f:
    classes = [line.strip() for line in f.readlines()]
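
Depending on how the label file is laid out, the stripped lines may still carry dictionary punctuation. The sketch below assumes the file is a single Python dict literal mapping class index to label. That layout is an assumption, not something this notebook confirms, and `parse_label_file` is a hypothetical helper rather than part of any library.

```python
import ast

def parse_label_file(text):
    """Turn a dict-literal label file into a list ordered by class index."""
    # literal_eval safely evaluates the text as a Python literal (no code execution)
    index_to_label = ast.literal_eval(text)
    return [index_to_label[i] for i in range(len(index_to_label))]

# Tiny stand-in for the real file contents (assumed layout)
sample = "{0: 'tench, Tinca tinca',\n 1: 'goldfish, Carassius auratus',\n 2: 'great white shark'}"
sample_classes = parse_label_file(sample)
print(sample_classes[1])
```

If the real file does follow this layout, passing `f.read()` to the helper would give clean label strings that can be indexed by class number.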

### Load image and design transform steps

In [None]:
from PIL import Image

with s3.open("s3://saturn-public-data/dogs/2-dog.jpg", 'rb') as f:
    img = Image.open(f).convert("RGB")
    
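transform = transforms.Compose([
    transforms.Resize(256),      # scale the shorter side to 256 pixels
    transforms.CenterCrop(250),  # keep the central 250x250 region
    transforms.ToTensor()])      # PIL image -> float tensor with values in [0, 1]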
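
To make the geometry of the first two steps concrete, here is a minimal pure-Python sketch of the arithmetic behind resizing the shorter side and then cropping the center. The helper names `resized_size` and `center_crop_box` are hypothetical, and torchvision's exact rounding may differ by a pixel, so treat this as an illustration rather than a reimplementation.

```python
def resized_size(width, height, target_short=256):
    """Size after scaling the shorter side to target_short, keeping aspect ratio."""
    if width <= height:
        return target_short, int(target_short * height / width)
    return int(target_short * width / height), target_short

def center_crop_box(width, height, crop=250):
    """(left, top, right, bottom) of a centered crop x crop window."""
    left = (width - crop) // 2
    top = (height - crop) // 2
    return left, top, left + crop, top + crop

w, h = resized_size(500, 375)  # a landscape photo, 500 wide by 375 tall
box = center_crop_box(w, h)
print((w, h), box)
```

For a landscape photo the crop trims mostly from the left and right edges, so a subject far from the center of the frame may be partly cut off.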

### Set up inference function

In [None]:
import torch
to_pil = transforms.ToPILImage()

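def classify_img(transform, img, model):
    # Preprocess the image, then add a batch dimension: (3, H, W) -> (1, 3, H, W)
    img_t = transform(img)
    batch_t = torch.unsqueeze(img_t, 0)

    # Use the model passed in (not the global), in evaluation mode,
    # and skip gradient tracking, which inference does not need
    model.eval()
    with torch.no_grad():
        out = model(batch_t)

    # Rank the classes by score and keep the top five as (label, percentage) pairs
    _, indices = torch.sort(out, descending=True)
    percentage = torch.nn.functional.softmax(out, dim=1)[0] * 100
    labelset = [(classes[idx], percentage[idx].item()) for idx in indices[0][:5]]
    return to_pil(img_t), labelset


Key aspects of the function to pay attention to include:

* `img_t = transform(img)` : We must run the transformation we defined above on every image before we try to classify it.
* `batch_t = torch.unsqueeze(img_t, 0)` : This adds a batch dimension at position 0, turning the `(3, H, W)` image tensor into a `(1, 3, H, W)` batch, which is the shape the model expects.
* `model.eval()` : A PyTorch model starts out in training mode. We switch it to evaluation mode so that layers such as batch normalization behave the way they should at inference time.
* `with torch.no_grad():` : Turns off gradient tracking, which is only needed for training, so the forward pass uses less memory.
* `out = model(batch_t)` : This step actually runs the model on the batch. Our batch holds a single image here, but the same call can classify many images at once.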

### Results Processing
* `_, indices = torch.sort(out, descending=True)` : Sorts the results, high score to low (gives us the most likely labels at the top).
* `percentage = torch.nn.functional.softmax(out, dim=1)[0] * 100` : Converts the model's raw scores to probabilities with softmax, then multiplies by 100 to express them as percentages.
* `labelset = [(classes[idx], percentage[idx].item()) for idx in indices[0][:5]]` : Pairs each of the top five labels with its percentage, in human-readable form.
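
The three results-processing steps can be sketched without PyTorch to show the arithmetic involved. This is a simplified illustration on a made-up list of four scores. The function name `top_k_percentages`, the scores, and the labels are all hypothetical.

```python
import math

def top_k_percentages(scores, labels, k=5):
    """Return the k highest-scoring labels with their softmax percentages."""
    # Subtract the max score before exponentiating for numerical stability
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    percentages = [100 * e / total for e in exps]
    # Sort class indices from highest to lowest score
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(labels[i], percentages[i]) for i in order[:k]]

toy_scores = [2.0, 0.5, 3.0, -1.0]
toy_labels = ["beagle", "pug", "collie", "tabby"]
top2 = top_k_percentages(toy_scores, toy_labels, k=2)
print(top2)
```

Because softmax preserves the ordering of the scores, sorting by raw score or by percentage gives the same ranking. The percentages across all classes sum to 100.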

In [None]:
%%time

dogpic, labels = classify_img(transform, img, resnet)

In [None]:
dogpic

In [None]:
labels

Great job, we have proved the basic task works!

<img src="https://media.giphy.com/media/Qw75aRmhdpEuntisgj/giphy.gif" alt="success" style="width: 300px;"/>


***

## Moving to Parallel

Our job with one image runs quite fast! However, if we want to classify all 20,000+ images in the [Stanford Dogs Dataset](http://vision.stanford.edu/aditya86/ImageNetDogs/), that's going to add up to real time. So, in [Notebook 4](04-parallel-inference.ipynb), let's look at how to classify the images not one at a time, but in a highly parallel way.