<a href="https://colab.research.google.com/github/yhatpub/yhatpub/blob/main/notebooks/fastai/lesson6_multicat.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Fastai Lesson 6 Multicat on YHat.pub

This notebook picks up from [Fastai Fastbook 6 multicat](https://github.com/fastai/fastbook/blob/master/06_multicat.ipynb) and takes the trained model to [YHat.pub](https://yhat.pub).

To save your model, you'll need to save just the weights and biases of the model, that is, the `pth` file for your learner. A clear, easy-to-follow tutorial on `pth` files is [inference-with-fastai](https://benjaminwarner.dev/2021/10/01/inference-with-fastai).

This is because `load_learner` from lesson 6 relies on the serialized `get_x` and `get_y` functions, which, when deserialized, need to exist in the `__main__` module. If that doesn't make sense, don't worry about it. Just follow the steps below and you'll be fine.
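
To make the `__main__` point a little more concrete: Python's `pickle` stores a function as a reference to its module and name rather than its code, so the same name must be findable wherever the pickle is loaded. The sketch below uses only the standard library, with `json.dumps` standing in for `get_x`/`get_y`. It illustrates general pickle behaviour and is not fastai's own code.

```python
import json
import pickle

# Pickling a function records where to find it (module + name), not its body.
payload = pickle.dumps(json.dumps)

stores_module = b"json" in payload   # the module name is in the payload
stores_name = b"dumps" in payload    # so is the function name

# Unpickling re-imports the module and looks the name up again.
restored = pickle.loads(payload)
same_object = restored is json.dumps

print(stores_module, stores_name, same_object)
```

A `get_x` defined in a notebook cell lives in `__main__`, so unpickling it only works if the loading process also has a `get_x` in its own `__main__`. Saving just the weights sidesteps that requirement.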


In your lesson 6 notebook, after fine-tuning your learner, run the following to save and download your `pth` file.
```
learn.save('lesson_6_multi_saved_model', with_opt=False)
from google.colab import files
files.download('models/lesson_6_multi_saved_model.pth') 
```

Then run the following to save your labels to a file and download it.
```
df = pd.DataFrame(dls.vocab)
df.to_csv('lesson_6_multi_saved_labels.csv', index=False, header=False)
files.download('lesson_6_multi_saved_labels.csv') 
```
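
As a quick sanity check on the file format: with `index=False, header=False`, a single-column frame is written as one label per line. The sketch below mimics that layout using only the standard library and a made-up three-label vocabulary (an assumption for illustration), then reads it back by stripping the trailing newline and splitting on `\n`.

```python
import os
import tempfile

# Hypothetical vocabulary; in the notebook this comes from dls.vocab.
vocab = ["cat", "dog", "person"]

path = os.path.join(tempfile.gettempdir(), "labels_roundtrip_example.csv")

# One label per line, no header row and no index column.
with open(path, "w") as f:
    f.write("\n".join(vocab) + "\n")

# Read the labels back, dropping the trailing newline before splitting.
with open(path) as f:
    restored = f.read().rstrip().split("\n")

os.remove(path)
print(restored)
```

If the restored list matches the vocabulary, the order of the labels is preserved, which matters because the model's outputs are indexed by position.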

### Installs
The following cell installs fastai (which pulls in PyTorch as a dependency) and yhat_params, which is used to decorate your `predict` function.

In [11]:
!pip install -q --upgrade --no-cache-dir fastai
!pip install -q --no-cache-dir git+https://github.com/yhatpub/yhat_params.git@main

  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done


Add the following, since matplotlib needs to know where to write its temp files.

In [12]:
import os
import tempfile
os.environ["MPLCONFIGDIR"] = tempfile.gettempdir()

### Imports
**Warning** Don't place `pip install` commands and imports in the same cell; the imports might not work correctly if you do.

In [13]:
from fastai.vision.all import *
from yhat_params.yhat_tools import FieldType, inference_predict

### Download Model
Google Drive does not allow direct downloads of files over 100MB, so you'll need the snippet below to build the download URL.

In [14]:
#cleanup from previous download
!rm uc*

#file copied from google drive
google_drive_url = "https://drive.google.com/file/d/1lzqJyV1bf7RE3C2Ix_sFDr0bFAM62QJt/view?usp=sharing"
import os
os.environ['GOOGLE_FILE_ID'] = google_drive_url.split('/')[5]
os.environ['GDRIVE_URL'] = f'https://docs.google.com/uc?export=download&id={os.environ["GOOGLE_FILE_ID"]}'
!echo "This is the Google drive download url $GDRIVE_URL"

rm: cannot remove 'uc*': No such file or directory
This is the Google drive download url https://docs.google.com/uc?export=download&id=1lzqJyV1bf7RE3C2Ix_sFDr0bFAM62QJt
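
The URL manipulation in the cell above can also be written as a small pure-Python helper, which is easier to test outside a notebook. This is a minimal sketch; the function name `drive_download_url` is made up for illustration, and it assumes the share link has the same `/file/d/<id>/view` shape as the one used here.

```python
def drive_download_url(share_url: str) -> str:
    """Turn a Google Drive share link into a direct-download URL."""
    # Share links look like https://drive.google.com/file/d/<FILE_ID>/view?...
    # Splitting on '/' puts the file ID at index 5.
    file_id = share_url.split("/")[5]
    return f"https://docs.google.com/uc?export=download&id={file_id}"


share_url = "https://drive.google.com/file/d/1lzqJyV1bf7RE3C2Ix_sFDr0bFAM62QJt/view?usp=sharing"
download_url = drive_download_url(share_url)
print(download_url)
```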


`wget` it from Google Drive. This script places the model in a `models` folder.

In [15]:
!wget -q --no-check-certificate $GDRIVE_URL -r -A 'uc*' -e robots=off -nd
!mkdir -p models
!mv $(ls -S uc* | head -1) ./models/export.pth

Now let's do the same for the labels CSV.

In [16]:
#cleanup from previous download
!rm uc*
#file copied from google drive
google_drive_url = "https://drive.google.com/file/d/1p6gRb0v8jaBiDSGRKsYPpdnEi4IaJrcw/view?usp=sharing"
import os
os.environ['GOOGLE_FILE_ID'] = google_drive_url.split('/')[5]
os.environ['GDRIVE_URL'] = f'https://docs.google.com/uc?export=download&id={os.environ["GOOGLE_FILE_ID"]}'
!echo "This is the Google drive download url $GDRIVE_URL"

This is the Google drive download url https://docs.google.com/uc?export=download&id=1p6gRb0v8jaBiDSGRKsYPpdnEi4IaJrcw


In [17]:
!wget -q --no-check-certificate $GDRIVE_URL -r -A 'uc*' -e robots=off -nd
!mkdir -p models
!mv $(ls -S uc* | head -1) ./models/vocab.csv

Verify the model exists. **Warning** YHat is pretty finicky about where you place your models. Make sure you create a `models` directory and download your model(s) there.

In [18]:
!ls -l models

total 100448
-rw-r--r-- 1 root root 102854125 Nov  1 19:14 export.pth
-rw-r--r-- 1 root root       135 Nov  1 19:14 vocab.csv


### Recreate dataloader and learner

Let's start by creating a dummy image and setting up our labels for multicategory classification. Both are used to build our dataloader. Note that our labels are a list of numpy arrays, since a single prediction can carry multiple classifications.

In [19]:
from PIL import Image
import os

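# Create a 1x1 placeholder image; the dataloader only needs a valid file to read.
# Always (re)write it so a pre-existing empty `data` folder doesn't break the next cell.
os.makedirs('data', exist_ok=True)
img = Image.new('RGB', (1, 1))
img.save('data/dummyimage.jpg')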

with open("models/vocab.csv") as f:
    lines = f.read().rstrip()
    labels = lines.split('\n')
labels = [np.array([label]) for label in labels]

And now we can make a lightweight `DataBlock`, passing in the single image and the labels. We repeat the image path and the label list (both columns must end up the same length) to oversample, so that the dataloader sees every possible class.

In [20]:
dblock = DataBlock(blocks=(ImageBlock, MultiCategoryBlock),
                   get_x=ColReader('images'), 
                   get_y=ColReader('labels'),
                   item_tfms=Resize(192),
                   batch_tfms=Normalize.from_stats(*imagenet_stats))

df = pd.DataFrame(
     {
        'images': [
                  'data/dummyimage.jpg', 
                  ]*100, 
        'labels': labels*5, 
        'valid': [True] *100
     },
    )
dls = dblock.dataloaders(df, bs=64, num_workers=1)
learn_inf = cnn_learner(dls, resnet50, metrics=partial(accuracy_multi, thresh=0.2), pretrained=False)
learn_inf.load('export')
learn_inf.model.eval();

  elif with_opt: warn("Saved filed doesn't contain an optimizer state.")
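
The `DataFrame` above needs its `images` and `labels` columns to be the same length (pandas rejects columns of unequal length), so `labels*5` only lines up with 100 image rows when the vocabulary has exactly 20 entries. For a different number of classes, the repeat factor has to change. The sketch below shows one way to compute it; the helper `repeat_to_length` and the 20 placeholder labels are assumptions for illustration, not part of the original notebook.

```python
def repeat_to_length(items, target_len):
    """Repeat `items` a whole number of times to reach exactly `target_len`."""
    if not items or target_len % len(items) != 0:
        raise ValueError("target_len must be a multiple of len(items)")
    return items * (target_len // len(items))


# Hypothetical vocabulary of 20 labels (the count is an assumption).
labels = [f"class_{i}" for i in range(20)]

rows = repeat_to_length(labels, 100)
images = ["data/dummyimage.jpg"] * 100

# Both columns are the same length, and every label appears at least once.
print(len(rows), len(images), len(set(rows)))
```

Failing early with a `ValueError` makes a mismatched vocabulary size obvious before the `DataFrame` is built.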


### Load your learner
The `learn_inf.load('export')` call above is the equivalent of PyTorch's `torch.load` or TensorFlow's `model.load_weights`.

Now write your predict function. Note that you will need to decorate your function with <a href="https://github.com/yhatpub/yhat_params">inference_predict</a>, which takes 2 parameters: a `dict` for input and a `dict` for output.

**Info** These parameters are how YHat.pub maps your predict function's input and output to the web interface. Each `dict` key is how you access the variable, and the value is its type. You can use autocomplete to see all the input/output types, and more documentation on `inference_predict` is available at the link.

In [21]:
input = {"image": FieldType.PIL}
output = {"text": FieldType.Text}

@inference_predict(input=input, output=output)
def predict(params):
    img = PILImage.create(np.array(params["image"].convert("RGB")))
    result = learn_inf.predict(img)
    return {"text": str(result[0])}
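
To illustrate the idea of declaring inputs and outputs as dictionaries, here is a rough, self-contained sketch of a decorator that checks the declared keys. It is a hypothetical stand-in written for this explanation: `checked_predict` and `fake_predict` are made-up names, the type values are plain strings, and none of this reflects how `inference_predict` is actually implemented.

```python
import functools


def checked_predict(input, output):
    """Hypothetical stand-in for a schema decorator; NOT the yhat_params implementation."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(params):
            # Every declared input key must be supplied by the caller.
            missing = set(input) - set(params)
            if missing:
                raise KeyError(f"missing input fields: {sorted(missing)}")
            result = func(params)
            # Every declared output key must be present in the return value.
            absent = set(output) - set(result)
            if absent:
                raise KeyError(f"missing output fields: {sorted(absent)}")
            return result
        return wrapper
    return decorator


@checked_predict(input={"image": "PIL"}, output={"text": "Text"})
def fake_predict(params):
    # Placeholder logic standing in for a real model call.
    return {"text": f"got {params['image']}"}


result = fake_predict({"image": "pixels"})
print(result)
```

The point of the pattern is that the keys you declare (`"image"`, `"text"`) are the same keys your function reads from `params` and writes into its return value.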

### Test
First, import `in_colab`, since you only want to run this test in Colab. YHat will expose this notebook as a callable API, so you don't want your test to run every time `predict` is called. Next, import `inference_test`, a function that checks your `predict` will run with YHat.

Now, inside an `if in_colab():` block, first get whatever test data you need, in this case an image. Then call your predict function through `inference_test`, passing in the same params you defined above. If something is missing, you should see an informative error. Otherwise, you'll see something like
`Please take a look and verify the results`

In [22]:
from yhat_params.yhat_tools import in_colab, inference_test

if in_colab():
    import urllib.request
    from PIL import Image
    urllib.request.urlretrieve("https://s3.amazonaws.com/cdn-origin-etr.akc.org/wp-content/uploads/2017/11/11234019/Bulldog-standing-in-the-grass.jpg", "input_image.jpg")
    img = Image.open("input_image.jpg")
    inference_test(predict_func=predict, params={'image': img})

Wrote results to result.json duration: 0.243556 seconds
Please take a look and verify the results
{
    "text": "['dog']"
}
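
For readers curious how an environment guard like this can work, one common approach is to check whether the `google.colab` package has been loaded into the running interpreter. The sketch below is an assumption about the general technique; `running_in_colab` is a made-up name and `in_colab` from yhat_params may be implemented differently.

```python
import sys


def running_in_colab() -> bool:
    """Hypothetical check; yhat_params' in_colab may be implemented differently."""
    # Assumes the Colab runtime has already imported the google.colab package.
    return "google.colab" in sys.modules


if running_in_colab():
    print("Running test code (Colab only)")
else:
    print("Skipping test code outside Colab")
```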


### That's it

If you run into errors, feel free to hop into Discord.

Otherwise, you'll now want to clear your outputs and save the notebook to a public repo on GitHub.