![Callysto.ca Banner](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-top.jpg?raw=true)

<a href="https://hub.callysto.ca/jupyter/hub/user-redirect/git-pull?repo=https%3A%2F%2Fgithub.com%2Fcallysto%2Fdata-science-and-artificial-intelligence&branch=main&subPath=11-training-ai.ipynb&depth=1" target="_parent"><img src="https://raw.githubusercontent.com/callysto/curriculum-notebooks/master/open-in-callysto-button.svg?sanitize=true" width="123" height="24" alt="Open in Callysto"/></a>

# Training AI

We are going to train an AI system to recognize whether an image contains a cat or a dog. You can also watch this [video introduction to supervised machine learning](https://www.youtube.com/embed/wpYmbeyCmlQ).

## Training Data

We will use images that are [public domain](https://en.wikipedia.org/wiki/Public_domain) or [Creative Commons](https://creativecommons.org/) because we are allowed to use them without purchasing a license.

In general, the more examples you provide, and the more varied they are, the better the AI will be at telling cats and dogs apart.

1. Create two new folders on your computer, one called `cats` and one called `dogs`.
1. Find and download at least 10 images of cats from [Pexels](https://www.pexels.com/search/cat/) or [Pixabay](https://pixabay.com/images/search/cat/). Put them in your `cats` folder.
1. Find and download at least 10 images of dogs from [Pexels](https://www.pexels.com/search/dog/) or [Pixabay](https://pixabay.com/images/search/dog/). Put them in your `dogs` folder.
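
If you have Python installed on your own computer, a short script can confirm that each folder has enough images before you move on. This is only a sketch: the `count_images` helper and the list of extensions are assumptions for illustration, and it expects to be run from the folder that contains `cats` and `dogs`.

```python
from pathlib import Path

# Extensions treated as images; an assumption, adjust to match your downloads.
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.gif', '.webp'}

def count_images(folder):
    """Return how many image files are directly inside `folder` (0 if it does not exist)."""
    path = Path(folder)
    if not path.is_dir():
        return 0
    return sum(1 for f in path.iterdir()
               if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS)

for folder in ['cats', 'dogs']:
    n = count_images(folder)
    status = 'ok' if n >= 10 else 'needs more images'
    print(f'{folder}: {n} images ({status})')
```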

## Teaching the Machine

1. Open [Teachable Machine image training](https://teachablemachine.withgoogle.com/train/image).
1. Rename `Class 1` as `cat`, and `Class 2` as `dog` by clicking on the pencil icons.
1. Upload your cat images to the `cat` class and your dog images to the `dog` class.
1. Click the `Train model` button.
1. After the training has finished, click the `Export Model` button, click the `Tensorflow Lite` tab on the right, then click the `Download my model` button. The button will change to `Converting model...` and the conversion will take a few minutes. Don't click away from that browser tab while it is converting.
1. Your model should then download automatically as `converted_tflite.zip`.
1. Upload your `converted_tflite.zip` file to [the folder that this notebook is in](.) on Callysto Hub or wherever you are running this notebook.
    * You can open the folder that this notebook is in by clicking on the logo at the top left of the page and then clicking on the `data-science-and-artificial-intelligence` folder.
    * If you're running in Jupyter Lab the file browser is already on the left.

After you have completed all of those steps, run the following cell to set up the image classifier.

In [None]:
# zipfile is part of Python's standard library, so it does not need to be installed
%pip install -q tflite-runtime pillow
import tflite_runtime.interpreter as tflite
from zipfile import ZipFile
from PIL import Image
import numpy as np
import requests, os
import pandas as pd
from IPython.display import clear_output, display
clear_output()

try:
    with ZipFile('converted_tflite.zip', 'r') as zip_object:
        zip_object.extractall()
except FileNotFoundError:
    # No uploaded model was found, so fall back to a pre-trained one
    print('Unable to find your converted_tflite.zip file, using online version')
    r = requests.get('https://raw.githubusercontent.com/callysto/data-files/main/data-science-and-artificial-intelligence/converted_tflite.zip')
    r.raise_for_status()  # stop here if the download failed
    with open('converted_tflite.zip', 'wb') as f:
        f.write(r.content)
    with ZipFile('converted_tflite.zip', 'r') as zip_object:
        zip_object.extractall()
interpreter = tflite.Interpreter('model_unquant.tflite')
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
input_shape = input_details[0]['shape']

# Read the class labels and close the file before it is deleted below
with open('labels.txt', 'r') as f:
    class_names = f.readlines()
os.remove('model_unquant.tflite')
os.remove('labels.txt')

def classify_image(image_url, show_image=False):
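    # Download the image to a temporary local file named after the last part of the URL
    filename = image_url.split('/')[-1]
    r = requests.get(image_url)
    r.raise_for_status()  # stop here if the download failed
    with open(filename, 'wb') as f:
        f.write(r.content)
    image = Image.open(filename).convert('RGB')
    # The model input shape is (batch, height, width, channels); PIL's resize expects (width, height)
    image = image.resize((input_shape[2], input_shape[1]))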
    if show_image:
        display(image)
    os.remove(filename)
    input_data = (np.expand_dims(np.array(image), axis=0) / 255.0).astype(np.float32)
    interpreter.set_tensor(input_details[0]['index'], input_data)
    interpreter.invoke()
    output_data = interpreter.get_tensor(output_details[0]['index'])
    predicted_class = np.argmax(output_data)
    # Each line of labels.txt looks like "0 cat"; drop the index and keep the name
    predicted_class_name = class_names[predicted_class].strip().split(' ', 1)[-1]
    confidence_level = output_data[0][predicted_class]
    return predicted_class_name, confidence_level, image

clear_output()
print('Model imported and classify_image(image_url) function defined')

The function returns a classification, a confidence level, and a resized version of the image.
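
Inside `classify_image()`, the resized image is turned into an array of numbers before the model sees it. The sketch below repeats that one step on a tiny made-up image so you can see what changes; the `pixels` array is hypothetical and stands in for a real photo.

```python
import numpy as np

# Hypothetical 2x2 RGB image with 8-bit pixel values (0 to 255).
pixels = np.array([[[0, 128, 255], [255, 255, 255]],
                   [[64, 64, 64], [0, 0, 0]]], dtype=np.uint8)

# Add a batch dimension, scale values to the 0-1 range, and convert to float32.
input_data = (np.expand_dims(pixels, axis=0) / 255.0).astype(np.float32)

print(input_data.shape)  # (1, 2, 2, 3): one image, 2 rows, 2 columns, 3 colour channels
print(input_data.dtype)  # float32
```

The extra first dimension is there because the model expects a batch of images, even when the batch contains only one.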

Now that we have set up the `classify_image()` function, we can load an image from a link and get its classification according to our trained AI.

Change the string in the `image_url` variable to be a direct link to an online image.

**Make sure you have copied the `image address` and that it is not a link to a webpage. The URL should end with something like `.jpg`, `.gif`, or `.png`.**
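
One rough way to check a link before using it is to look at the file extension in its path. The `looks_like_image_url` helper and the `example.com` links below are hypothetical, and the check is only a heuristic: some real image links have no extension at all.

```python
from urllib.parse import urlparse

# Common image extensions; an assumption, extend the tuple if you need other formats.
IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.png', '.gif')

def looks_like_image_url(url):
    """Return True if the path part of `url` ends with a common image extension."""
    path = urlparse(url).path.lower()  # ignores anything after a "?" in the link
    return path.endswith(IMAGE_EXTENSIONS)

print(looks_like_image_url('https://example.com/photos/cat.jpg'))   # True
print(looks_like_image_url('https://example.com/photos/cat-page'))  # False
```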

In [None]:
image_url = 'https://upload.wikimedia.org/wikipedia/commons/thumb/b/b6/Felis_catus-cat_on_snow.jpg/1024px-Felis_catus-cat_on_snow.jpg'

results = classify_image(image_url)
results

The first value returned is the classification, in our case `cat` or `dog`.

The second is the "confidence score", which is how sure the AI is of that classification. A score of `1` means `100%` confident.

The third value is the downloaded and resized image.

In [None]:
results[2]
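
The classification and confidence score come from the model's raw output, which holds one score per class. The sketch below shows how the highest score is picked, using made-up scores and assuming that `labels.txt` lines look like `0 cat` and `1 dog`.

```python
# Hypothetical raw model output for one image: one score per class.
output_data = [[0.92, 0.08]]
# Lines in the assumed "index name" format of labels.txt.
class_names = ['0 cat\n', '1 dog\n']

scores = output_data[0]
# Position of the highest score (the same idea as np.argmax).
predicted_class = max(range(len(scores)), key=lambda i: scores[i])
# Drop the leading index and keep only the class name.
predicted_class_name = class_names[predicted_class].strip().split(' ', 1)[-1]
confidence_level = scores[predicted_class]

print(predicted_class_name, confidence_level)  # cat 0.92
```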

We can even use `classify_image()` to categorize a list of online images. We'll try it with some art rather than photos and see how it performs.

In [None]:
urls = [
    'https://www.artic.edu/iiif/2/e8e67721-bbb1-d007-82bd-c430ea73db70/full/843,/0/default.jpg',
    'https://www.artic.edu/iiif/2/08b38ff2-2659-2c0b-dba8-221e173f4fc3/full/843,/0/default.jpg',
    'https://www.artic.edu/iiif/2/c7a1688c-8a21-8eab-086d-3537b1506705/full/843,/0/default.jpg',
    'https://www.artic.edu/iiif/2/86706756-2cf8-6a7c-58cc-90efaa4db124/full/843,/0/default.jpg',
    'https://www.artic.edu/iiif/2/2a1a5c49-9249-ad65-f1ec-cacf3078619d/full/843,/0/default.jpg',
    'https://www.artic.edu/iiif/2/4f6137e2-b96e-2815-f698-e3ad45840178/full/843,/0/default.jpg',
]

data = pd.DataFrame(urls, columns=['url'])
labels = []
confidences = []
images = []

for url in urls:
    results = classify_image(url, True)
    labels.append(results[0])
    confidences.append(results[1])
    images.append(results[2])

data['label'] = labels
data['confidence'] = confidences
data['image'] = images
data

To display the images and their classifications, we need to loop through the dataframe.

In [None]:
for index, row in data.iterrows():
    display(row['image'])
    print(row['label'], row['confidence'])
    print('-----------------------------')

If the model is not accurately identifying cats and dogs, go back to the start of this notebook and train it with more images.
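
To put a number on how accurate the model is, we can compare its labels with the answers we expect. This is a minimal sketch: the `accuracy` function is a hypothetical helper, and both lists are made-up values that you would replace with `data['label']` and your own answers.

```python
def accuracy(predicted, expected):
    """Return the fraction of predictions that match the expected labels."""
    if len(predicted) != len(expected):
        raise ValueError('predicted and expected must have the same length')
    if not expected:
        return 0.0
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)

# Hypothetical labels for illustration only.
predicted = ['cat', 'dog', 'dog', 'cat', 'cat', 'dog']
expected = ['cat', 'dog', 'cat', 'cat', 'dog', 'dog']
print(f'Accuracy: {accuracy(predicted, expected):.0%}')  # Accuracy: 67%
```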

Of course, we can also use this same process to train an AI model to categorize images of other things, for example identifying whether something is a soup, salad, or sandwich.

---

<span style="color:#663399">Your **assignment** is to write a paragraph about what you learned about training and implementing an AI system.</span>

<span style="color:#FF6633">An **optional advanced challenge** is to write a few paragraphs about potential applications of AI that you could train.</span>

---

The [next notebook](12-getting-data.ipynb) will introduce some other ways to get data.

[![Callysto.ca License](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-bottom.jpg?raw=true)](https://github.com/callysto/curriculum-notebooks/blob/master/LICENSE.md)