<a href="https://www.kaggle.com/code/davidphummel/computer-vision-is-it-a-dog-or-a-cat?scriptVersionId=193777559" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

## Computer Vision - Is it a dog or a cat?

This is the first Jupyter notebook I created while reading chapter 1 of the fast.ai course called [Practical Deep Learning](https://course.fast.ai). Practical Deep Learning is a free course designed for people who have some coding experience and want to learn how to apply deep learning and machine learning to practical problems.

This Jupyter notebook is a copy of Jeremy Howard's notebook called [Is it a bird? Creating a model from your own data](https://www.kaggle.com/code/jhoward/is-it-a-bird-creating-a-model-from-your-own-data). I modified Jeremy's notebook to determine whether an image shows a dog or a cat.


In this lesson you’re going to hit the ground running – in the first five minutes you’ll see a complete end to end example of training and using a model that’s so advanced it was considered at the cutting edge of research capabilities in 2015.

In [None]:
# It's a good idea to ensure you're running the latest version of any libraries you need.
# `!pip install -Uqq <libraries>` upgrades to the latest version of <libraries>
# You can safely ignore any warnings or errors pip spits out about running as root or incompatibilities
import os
iskaggle = os.environ.get('KAGGLE_KERNEL_RUN_TYPE', '')

if iskaggle:
    !pip install -Uqq fastai fastbook

The basic steps we'll take are:

1. Use DuckDuckGo to search for images of dogs and cats
2. Fine-tune a pretrained neural network to recognise these two groups.
3. Try running this model on a set of pictures and see if it works.

## Step 1: Download images of cats and dogs

In [None]:
from fastcore.all import *
from fastbook import search_images_ddg

def search_images(term, max=30):
  print(f"Searching for '{term}'")
  # search_images_ddg comes from fastbook: https://github.com/fastai/fastbook/blob/master/utils.py#L45
  return search_images_ddg(term, max_images=max)

Let's start by searching for a dog photo and seeing what kind of result we get. We'll get URLs from a search, download one of them, and take a look at it:

In [None]:
from fastdownload import download_url
from fastai.vision.all import *

# Download a dog image
urls = search_images('dog photos', max=4)
dest = 'dog.jpg'
download_url(urls[3], dest, show_progress=False)
im = Image.open(dest)
im.to_thumb(256,256)

In [None]:
# Download a cat image
dest = 'cat.jpg'
urls = search_images('cat photos', max=1)
download_url(urls[0], dest, show_progress=False)
im = Image.open(dest)
im.to_thumb(256,256)

In [None]:
# Download a building image
dest = 'building.jpg'
urls = search_images('building photos', max=1)
download_url(urls[0], dest, show_progress=False)
im = Image.open(dest)
im.to_thumb(256,256)

Our searches seem to be giving reasonable results, so let's grab a few example dog and cat photos and save each group of photos to a different folder (I'm also trying to grab a range of lighting conditions here):

In [None]:
searches = 'dog','cat'
path = Path('dog_or_not')
from time import sleep

for o in searches:
    dest = (path/o)
    dest.mkdir(exist_ok=True, parents=True)
    download_images(dest, urls=search_images(f'{o} photo'))
    sleep(10)  # Pause between searches to avoid over-loading server
    download_images(dest, urls=search_images(f'{o} sun photo'))
    sleep(10)
    download_images(dest, urls=search_images(f'{o} shade photo'))
    sleep(10)
    resize_images(path/o, max_size=400, dest=path/o)

## Step 2: Train our model

Some photos might not download correctly, which could cause our model training to fail, so we'll remove them:

In [None]:
failed = verify_images(get_image_files(path))
failed.map(Path.unlink)
len(failed)

To train a model, we'll need `DataLoaders`, which is an object that contains a *training set* (the images used to create a model) and a *validation set* (the images used to check the accuracy of a model -- not used during training). In `fastai` we can create that easily using a `DataBlock`, and view sample images from it:

In [None]:
dls = DataBlock(
    blocks=(ImageBlock, CategoryBlock), 
    get_items=get_image_files, 
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label,
    item_tfms=[Resize(192, method='squish')]
).dataloaders(path, bs=32)

dls.show_batch(max_n=6)

Here's what each of the `DataBlock` parameters means:

    blocks=(ImageBlock, CategoryBlock),

The inputs to our model are images, and the outputs are categories (in this case, "dog" or "cat").

    get_items=get_image_files, 

To find all the inputs to our model, run the `get_image_files` function (which returns a list of all image files in a path).

    splitter=RandomSplitter(valid_pct=0.2, seed=42),

Split the data into training and validation sets randomly, using 20% of the data for the validation set.

    get_y=parent_label,

The labels (`y` values) are the names of the `parent` of each file (i.e. the name of the folder they're in, which will be *dog* or *cat*).

    item_tfms=[Resize(192, method='squish')]

Before training, resize each image to 192x192 pixels by "squishing" it (as opposed to cropping it).
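
To make two of these parameters more concrete, here is a minimal plain-Python sketch of what a seeded random split and a parent-folder label do. This is not how fastai implements `RandomSplitter` or `parent_label` internally; the helper names `random_split` and `label_from_parent` and the file names are made up for illustration.

```python
import random
from pathlib import Path


def random_split(items, valid_pct=0.2, seed=42):
    """Shuffle indices with a fixed seed and hold out `valid_pct` of them for validation."""
    idxs = list(range(len(items)))
    random.Random(seed).shuffle(idxs)   # same seed -> same split on every run
    cut = int(valid_pct * len(items))
    valid = [items[i] for i in idxs[:cut]]
    train = [items[i] for i in idxs[cut:]]
    return train, valid


def label_from_parent(file_path):
    """Use the name of the folder that contains the file as its label."""
    return Path(file_path).parent.name


# Hypothetical file names, laid out like the folders created in Step 1
files = [f"dog_or_not/{kind}/{i:03d}.jpg" for kind in ("dog", "cat") for i in range(5)]
train_files, valid_files = random_split(files)

print(len(train_files), len(valid_files))  # 8 2
print(label_from_parent(files[0]))         # dog
```

Fixing the seed is what makes the validation set the same each time the notebook is re-run, so results from different runs can be compared.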

Now we're ready to train our model. The fastest widely used computer vision model is `resnet18`. You can train this in a few minutes, even on a CPU! (On a GPU, it generally takes under 10 seconds...)

`fastai` comes with a helpful `fine_tune()` method which automatically uses best practices for fine tuning a pre-trained model, so we'll use that.

In [None]:
learn = vision_learner(dls, resnet18, metrics=error_rate)
learn.fine_tune(3)

Generally when I run this I see 100% accuracy on the validation set (although it might vary a bit from run to run).

"Fine-tuning" a model means that we're starting with a model someone else has trained using some other dataset (called the *pretrained model*), and adjusting the weights a little bit so that the model learns to recognise your particular dataset. In this case, the pretrained model was trained to recognise photos in *imagenet*, a widely used computer vision dataset with images covering 1000 categories. For details on fine-tuning and why it's important, check out the [free fast.ai course](https://course.fast.ai/).
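
The `error_rate` metric passed to `vision_learner` above is the fraction of validation images the model gets wrong, i.e. one minus the accuracy. A rough plain-Python sketch of that calculation follows; `simple_error_rate` and the labels are made up for illustration, and this is not fastai's own implementation.

```python
def simple_error_rate(predictions, targets):
    """Fraction of predictions that do not match the true labels."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        raise ValueError("need at least one example")
    wrong = sum(p != t for p, t in zip(predictions, targets))
    return wrong / len(targets)


# Made-up labels for five validation images
targets = ["dog", "cat", "dog", "cat", "dog"]
predictions = ["dog", "cat", "cat", "cat", "dog"]

err = simple_error_rate(predictions, targets)
accuracy = 1 - err
print(f"error rate: {err:.2f}, accuracy: {accuracy:.2f}")  # error rate: 0.20, accuracy: 0.80
```

So an error rate of 0 in the training table corresponds to 100% accuracy on the validation set.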

## Step 3: Use our model

Let's see what our model thinks about the dog, cat, and building images we downloaded at the start:

In [None]:
# Test the model using a dog image
file = 'dog.jpg'
result,_,prob = learn.predict(PILImage.create(file))
print("Here is the image:")
im = Image.open(file)
display(im.to_thumb(256,256))  # to_thumb returns a resized copy, so display that copy
print(f"The model thinks this is a {result}.")
print(f"Probability it's a dog: {prob[1]:.4f}")
print(f"Probability it's a cat: {prob[0]:.4f}")

# Test the model using a cat image
file = 'cat.jpg'
result,_,prob = learn.predict(PILImage.create(file))
print("Here is the image:")
im = Image.open(file)
display(im.to_thumb(256,256))  # to_thumb returns a resized copy, so display that copy
print(f"The model thinks this is a {result}.")
print(f"Probability it's a dog: {prob[1]:.4f}")
print(f"Probability it's a cat: {prob[0]:.4f}")

# Test the model using a building image
file = 'building.jpg'
result,_,prob = learn.predict(PILImage.create(file))
print("Here is the image:")
im = Image.open(file)
display(im.to_thumb(256,256))  # to_thumb returns a resized copy, so display that copy
print(f"The model thinks this is a {result}.")
print(f"Probability it's a dog: {prob[1]:.4f}")
print(f"Probability it's a cat: {prob[0]:.4f}")
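
The cells above hard-code `prob[0]` as the cat probability and `prob[1]` as the dog probability. That works because fastai sorts the category labels, so "cat" comes before "dog". A small sketch of pairing labels with probabilities by name instead of by position is shown below; `label_probabilities` is a hypothetical helper and the numbers are invented.

```python
def label_probabilities(vocab, probs):
    """Pair each class label with its predicted probability."""
    if len(vocab) != len(probs):
        raise ValueError("vocab and probs must have the same length")
    return {label: float(p) for label, p in zip(vocab, probs)}


# Illustrative values only. In the notebook you could try (assumption):
#   label_probabilities(learn.dls.vocab, prob)
vocab = sorted(["dog", "cat"])   # ['cat', 'dog']
probs = [0.03, 0.97]

by_label = label_probabilities(vocab, probs)
best = max(by_label, key=by_label.get)
print(best, by_label)
```

Note that the model only knows two classes, so the building image will still be labelled as either a dog or a cat.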

In [None]:
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix()
interp.plot_top_losses(5,nrows=1)

Good job, resnet18. :)

So, as you see, in the space of a few years, creating computer vision classification models has gone from "so hard it's a joke" to "trivially easy and free"!

It's not just in computer vision. Thanks to deep learning, computers can now do many things which seemed impossible just a few years ago, including [creating amazing artworks](https://openai.com/dall-e-2/), and [explaining jokes](https://www.datanami.com/2022/04/22/googles-massive-new-language-model-can-explain-jokes/). It's moving so fast that even experts in the field have trouble predicting how it's going to impact society in the coming years.

One thing is clear -- it's important that we all do our best to understand this technology, because otherwise we'll get left behind!

Now it's your turn. Click "Copy & Edit" and try creating your own image classifier using your own image searches!

If you enjoyed this, please consider clicking the "upvote" button in the top-right -- it's very encouraging to us notebook authors to know when people appreciate our work.