<a href="https://colab.research.google.com/github/JorgeEncinas/fast.ai_book/blob/main/Is_it_SonicTheHedgehog.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Is it Sonic the Hedgehog?
As we all know, this fellow is often confused in-universe with other anthropomorphic creatures. Let's see how a binary classifier fares!

In [None]:
!rm -rf /kaggle/working/sonic_or_not/

# Setup
Let's follow the setup used by fast.ai.
We'll check the internet connection and install the packages needed.

In [None]:
import socket,warnings
try:
    socket.setdefaulttimeout(1)
    socket.socket(socket.AF_INET, socket.SOCK_STREAM).connect(('1.1.1.1', 53))
except socket.error as ex: raise Exception("STOP: No internet. Click '>|' in top right and set 'Internet' switch to on")

In [None]:
#hide_output
# It's a good idea to ensure you're running the latest version of any libraries you need.
# `!pip install -Uqq <libraries>` upgrades to the latest version of <libraries>
# NB: You can safely ignore any warnings or errors pip spits out about running as root or incompatibilities
!pip install -Uqq fastai duckduckgo_search
!pip install scipy --upgrade --quiet

# Searching for images through DuckDuckGo
Note that fastai's notebook used some calls that are now deprecated, so I've noted in the code comments what I changed and where I got each fix from.

Here's the [most recent forum post I found on this](https://forums.fast.ai/t/duckduckgo-search-not-working/105738/70).

In [None]:
from duckduckgo_search import DDGS
from fastcore.all import *

def search_images(term, max_images=30):
    print(f"Searching for '{term}'")
    #Current form was deprecated. Switching to: https://stackoverflow.com/a/76711510
    #return L(DDGS().images(term, max_results=max_images)).itemgot('image')
    #Also from: https://stackoverflow.com/a/76700197
    with DDGS() as ddgs:
        search_results = ddgs.images(keywords=term,max_results=max_images)
        print("Search result successful")
        #images_list = [next(iter(search_results)).get("image") for _ in range(max_images)]
        #return L(images_list)
        return L(search_results).itemgot('image')

urls = search_images("sonic the hedgehog pose", 1)
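
The search returns one dictionary per hit, and `itemgot('image')` simply pulls the `image` entry out of each one. Here is a minimal plain-Python sketch of that step, using made-up result dictionaries (the `title` key and all the URLs are placeholders, not real DuckDuckGo output):

```python
# Hypothetical search results: one dict per image hit.
# Only the 'image' key matters for this notebook; the rest is illustrative.
fake_results = [
    {"title": "Sonic pose 1", "image": "https://example.com/sonic1.jpg"},
    {"title": "Sonic pose 2", "image": "https://example.com/sonic2.jpg"},
]

def image_urls(results):
    """Plain-Python stand-in for L(results).itemgot('image')."""
    return [r["image"] for r in results]

urls_demo = image_urls(fake_results)
print(urls_demo)
```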

# Test downloading an image, and printing it.
Here we download an image with the "download_url" function from the fastdownload library.
It's simple: you give it the URL to download and the filename to save the image under.

Then we display it in the cell output using ".to_thumb()", which takes the maximum height and width of the thumbnail and keeps the aspect ratio.

In [None]:

from fastdownload import download_url
dest = 'sonic.jpg'
download_url(urls[0], dest, show_progress=False)

from fastai.vision.all import *
im = Image.open(dest)
im.to_thumb(256,256)

## Downloading another image
Now I download a Shadow the Hedgehog image.

In [None]:
download_url(search_images('shadow the hedgehog pose', max_images=1)[0], 'shadow.jpg', show_progress=False)
Image.open('shadow.jpg').to_thumb(256,256)

# Generating the dataset
Here we iteratively download images using search queries "sonic the hedgehog pose", and "shadow the hedgehog pose". I added "pose" because DuckDuckGo tends to add other Sonic characters if you only include Sonic's name, or Shadow's name.
Asking for a pose seems to more consistently get correct images.

For each name in "searches", we run a search and download images from the resulting URLs. Lastly, we shrink them with "resize_images" so that no side is longer than 400 pixels. This isn't the fixed input size the Neural Network needs (that comes later, from "Resize(192)" in the DataBlock); it just keeps the files small so they're faster to load and process.
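
To make the `max_size=400` step concrete, here is a small sketch of the size arithmetic as I understand it: images whose longer side exceeds the limit are scaled down with the aspect ratio preserved, and smaller images are left alone. `shrink_to_fit` is a made-up helper name, and the exact rounding fastai uses may differ.

```python
def shrink_to_fit(width, height, max_size=400):
    """Return the (width, height) an image would have after capping its longer side."""
    longest = max(width, height)
    if longest <= max_size:
        return width, height  # already small enough, leave untouched
    ratio = max_size / longest
    return int(width * ratio), int(height * ratio)

print(shrink_to_fit(1600, 1200))  # a large landscape image gets scaled down
print(shrink_to_fit(300, 200))    # a small image is returned unchanged
```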

In [None]:
searches = 'sonic the hedgehog pose','shadow the hedgehog pose'
path = Path('sonic_or_not')
from time import sleep

for o in searches:
    dest = (path/o)
    dest.mkdir(exist_ok=True, parents=True)
    urls = search_images(f'{o}', max_images=100)
    print(urls)
    download_images(dest, urls=urls)
    sleep(10)  # Pause between searches to avoid over-loading server
    #download_images(dest, urls=search_images(f'{o} sun photo'))
    #sleep(10)
    #download_images(dest, urls=search_images(f'{o} shade photo'))
    #sleep(10)
    resize_images(path/o, max_size=400, dest=path/o)

# Deleting failed images
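The "verify_images()" function tries to open each file and returns the ones that fail, which happens when a download was corrupted or wasn't really an image. We delete those so they don't break training.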

In [None]:
failed = verify_images(get_image_files(path))
failed.map(Path.unlink)
len(failed)
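
As a rough illustration of the same find-then-delete pattern without any image library, the sketch below only checks the first bytes of each file against the JPEG and PNG signatures. That is a much weaker test than actually decoding the image, and `looks_like_image` and `find_broken` are hypothetical helpers, not fastai functions.

```python
import tempfile
from pathlib import Path

# File signatures ("magic bytes") for two common image formats.
SIGNATURES = (b"\xff\xd8\xff", b"\x89PNG\r\n\x1a\n")

def looks_like_image(path):
    """True if the file starts with a JPEG or PNG signature."""
    with open(path, "rb") as f:
        head = f.read(8)
    return head.startswith(SIGNATURES)

def find_broken(folder):
    """Return the files under `folder` that fail the signature check."""
    return [p for p in sorted(Path(folder).rglob("*"))
            if p.is_file() and not looks_like_image(p)]

# Demo on a throwaway folder: one fake PNG header, one HTML error page saved as .jpg.
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "ok.png").write_bytes(b"\x89PNG\r\n\x1a\n" + b"\x00" * 16)
(demo_dir / "bad.jpg").write_bytes(b"<html>404 Not Found</html>")

broken = find_broken(demo_dir)
for p in broken:
    p.unlink()  # same idea as failed.map(Path.unlink)
print([p.name for p in broken])
```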

# Creating the DataBlock
The DataBlock isn't the NN's architecture; it describes how to get from files on disk to the data and labels the model trains on.
First, you set the "blocks", which declare the type of the inputs and of the targets: "ImageBlock" says the inputs are images, and "CategoryBlock" says the targets are categories. In this case, we have one category for Sonic, and another one for Shadow.

Then get_items asks for a function that returns the items, here the image files.
The splitter decides how the items are divided. RandomSplitter is an easy way to set up a random split between Training and Dev (validation) sets; with valid_pct=0.2, 20% of the images are held out. There's no third, "Test" dataset.
get_y asks for a function that returns each item's label. We've supplied parent_label, which uses the name of the folder each image sits in. Hence the printed labels show "shadow the hedgehog pose" and "sonic the hedgehog pose".
Lastly, item_tfms lists transforms applied to every item individually. Here, Resize(192, method='squish') resizes each image to 192x192 pixels by squishing it rather than cropping, so all images in a batch have the same size.
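
To make the labelling and splitting less abstract, here is a plain-Python sketch of what `parent_label` and a random 80/20 split do. The file paths are invented, and fastai's `RandomSplitter` uses its own random permutation, so the indices it picks for a given seed will not match this sketch.

```python
import random
from pathlib import Path

# Invented file list standing in for get_image_files(path).
files = [Path(f"sonic_or_not/sonic the hedgehog pose/{i}.jpg") for i in range(5)] + \
        [Path(f"sonic_or_not/shadow the hedgehog pose/{i}.jpg") for i in range(5)]

def folder_label(p):
    """Label an item by the name of its parent folder (the idea behind parent_label)."""
    return Path(p).parent.name

def random_split(n_items, valid_pct=0.2, seed=42):
    """Shuffle the indices 0..n_items-1 and hold out the first valid_pct of them."""
    idxs = list(range(n_items))
    random.Random(seed).shuffle(idxs)
    cut = int(valid_pct * n_items)
    return idxs[cut:], idxs[:cut]  # (train indices, validation indices)

train_idx, valid_idx = random_split(len(files))
print(folder_label(files[0]), len(train_idx), len(valid_idx))
```

Fixing the seed means the same images end up in the validation set on every run, which keeps results comparable between runs.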

In [None]:
dls = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label,
    item_tfms=[Resize(192, method='squish')]
).dataloaders(path, bs=32)

dls.show_batch(max_n=6)


# Determining which category's which
I had trouble telling which category was Sonic, and which was Shadow. In fact, I didn't realize they could be mixed up. It struck me as more than coincidence that the results always showed probabilities almost opposite to the ground truth.

So I wrote this block to ascertain, by printing one batch, what each category corresponds to.
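
As far as I can tell, the reason for the mix-up is that fastai builds the category list by sorting the distinct labels, so index 0 is whichever folder name comes first alphabetically, not whichever was searched first. A minimal sketch of that mapping in plain Python (not fastai's actual implementation):

```python
# Labels in the order the folders were created by the search loop.
labels_seen = ["sonic the hedgehog pose", "shadow the hedgehog pose"]

# Sorting the unique labels gives the index -> name table (the "vocab").
vocab = sorted(set(labels_seen))
label_to_index = {name: i for i, name in enumerate(vocab)}

for i, name in enumerate(vocab):
    print(i, name)
```

Under that assumption, "shadow the hedgehog pose" gets index 0 and "sonic the hedgehog pose" gets index 1, so the first entry of the probability tensor would refer to Shadow.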

In [None]:
# While I've built this kind of image grid at work,
# here I didn't want to linger too much on this part, as it's mostly for debugging purposes,
# so I got some AI help to get the grid printing going.

images, labels = dls.train.one_batch()
num_images = len(images)
num_cols = 3
num_rows = (num_images + num_cols - 1) // num_cols

fig, axes = plt.subplots(num_rows, num_cols, figsize=(15, 5*num_rows))

for i, (image,label) in enumerate(zip(images, labels)):
    category_name = dls.vocab[label]
    # .cpu() first: .numpy() raises an error on a tensor that lives on the GPU
    image_np = image.permute(1, 2, 0).cpu().numpy()
    ax = axes[i // num_cols, i % num_cols]
    ax.imshow(image_np)
    ax.set_title(f"{label} - {category_name}")
    ax.axis('off')

plt.tight_layout()
plt.show()

# Running the model
Now we get to actually training a model. It's a pre-trained ResNet18 model, so the weights are already pretty good, just not for our purposes.
To adapt the model, "fine_tune(3)" works in two phases: by default it first trains for one epoch with the pre-trained layers frozen, so only the newly added top layers (the head) have their weights updated; then it unfreezes everything and trains all the layers for the 3 epochs we asked for.

We're essentially using Transfer Learning, as we're working with a CNN that has already learned many lower-level features. Now we just need it to adapt to new, higher-level abstractions to detect Sonic and Shadow.
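
A toy sketch of the schedule that `fine_tune` follows by default, as I understand it: some epochs with the pre-trained body frozen, then the requested epochs with everything trainable. This only prints the plan; it trains nothing, and the `body` and `head` group names are labels chosen for the illustration.

```python
def fine_tune_plan(epochs, freeze_epochs=1):
    """List (phase, epoch number, trainable parts) for a freeze-then-unfreeze schedule."""
    plan = []
    for e in range(freeze_epochs):
        plan.append(("frozen", e, ["head"]))            # only the new head learns
    for e in range(epochs):
        plan.append(("unfrozen", e, ["body", "head"]))  # all layers are updated
    return plan

for phase, epoch, trainable in fine_tune_plan(3):
    print(f"{phase:9s} epoch {epoch}: training {', '.join(trainable)}")
```

This is why the training output shows two tables: a short one for the frozen phase, then one with the 3 requested epochs.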

In [None]:
learn = vision_learner(dls, resnet18, metrics=error_rate)
learn.fine_tune(3)

In [None]:
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix()
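
For reference, a confusion matrix and the `error_rate` metric are both simple counts over (actual, predicted) pairs. The sketch below computes them by hand on made-up validation results; the numbers are illustrative only, not output from this model.

```python
# Made-up validation results: 0 = shadow, 1 = sonic.
actual    = [0, 0, 0, 1, 1, 1, 1, 0]
predicted = [0, 0, 1, 1, 1, 0, 1, 0]

def confusion_matrix(actual, predicted, n_classes=2):
    """matrix[a][p] counts items whose true class is a and predicted class is p."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix

def error_rate(actual, predicted):
    """Fraction of items where the prediction differs from the true class."""
    wrong = sum(a != p for a, p in zip(actual, predicted))
    return wrong / len(actual)

cm = confusion_matrix(actual, predicted)
err = error_rate(actual, predicted)
print(cm, err)
```

The diagonal holds the correct predictions; everything off the diagonal is a Sonic mistaken for Shadow or the other way around.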

# Testing the model: Download another image
Let's download a new image, and see how well the model generalizes to new, unseen examples.

In [None]:
#Download another image
img_urls = search_images("sonic the hedgehog dynamic pose", 1)
dest = 'sonic3.jpg'
download_url(img_urls[0], dest, show_progress=False)
im = Image.open(dest)
display(im.to_thumb(256,256))

img_urls_2 = search_images("shadow the hedgehog movie pose", 1)
dest = "shadow3.jpg"
download_url(img_urls_2[0], dest, show_progress=False)
im = Image.open(dest)
im.to_thumb(256, 256)

# The actual test!
See if it thinks it's Sonic or not!

In [None]:
is_category, prediction_index, probs = learn.predict(PILImage.create('sonic3.jpg'))
predicted_category = dls.vocab[prediction_index]
print(f"probs: {probs}")
print(f"Is this {predicted_category}? - {is_category}.")
print(f"Probability it's {predicted_category}: {probs[prediction_index]:.4f}")
display(Image.open('sonic3.jpg').to_thumb(256,256))

is_category_2, prediction_index_2, probs_2 = learn.predict(PILImage.create('shadow3.jpg'))  # predict on the newly downloaded image, the same one displayed below
predicted_category_2 = dls.vocab[prediction_index_2]
print(f"probs: {probs_2}")
print(f"Is this {predicted_category_2}? - {is_category_2}.")
print(f"Probability it's {predicted_category_2}: {probs_2[prediction_index_2]:.4f}")
Image.open('shadow3.jpg').to_thumb(256,256)
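
`learn.predict` returns three things: the decoded label, its index in the vocab, and one probability per category. Here is a rough sketch of how those relate, assuming the probabilities come from a softmax over the model's raw scores; the scores below are invented, and the model's real outputs will differ.

```python
import math

vocab_demo = ["shadow the hedgehog pose", "sonic the hedgehog pose"]
raw_scores = [-1.2, 2.3]  # invented model outputs (logits), one per category

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)                           # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs_demo = softmax(raw_scores)
index_demo = max(range(len(probs_demo)), key=probs_demo.__getitem__)  # argmax
label_demo = vocab_demo[index_demo]
print(label_demo, index_demo, [round(p, 4) for p in probs_demo])
```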


After some testing, it seems the model is predicting correctly (at least on these two examples).

Now we can prove that this Neural Network is better than the Sonic Adventure 2 dudes at telling Sonic and Shadow apart. Wow!

In [None]:
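#I just wanted to look at the validation set after the fact.
#Colors may look off here: by this point vision_learner may have added ImageNet normalization to the batches.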
images, labels = dls.valid.one_batch()
num_images = len(images)
num_cols = 3
num_rows = (num_images + num_cols - 1) // num_cols

fig, axes = plt.subplots(num_rows, num_cols, figsize=(15, 5*num_rows))

for i, (image,label) in enumerate(zip(images, labels)):
    category_name = dls.vocab[label]
    # .cpu() first: .numpy() raises an error on a tensor that lives on the GPU
    image_np = image.permute(1, 2, 0).cpu().numpy()
    ax = axes[i // num_cols, i % num_cols]
    ax.imshow(image_np)
    ax.set_title(f"{label} - {category_name}")
    ax.axis('off')

plt.tight_layout()
plt.show()

# Exporting the model

In [None]:
learn.export('sonicModel.pkl') #here "learn" is the Learner object, which wraps the trained model together with its preprocessing steps.