<a href="https://colab.research.google.com/github/Peter-Apps/coding-camp/blob/main/Machine_Vision_Exploration.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Machine Vision using Colab



AI image recognition is a rapidly advancing field with applications in a wide variety of areas. From reading license plates for automatic tolling to organizing your photos, this technology is becoming an integral part of our lives.

In this notebook we're going to use a pre-built model to try to recognize images we create. The goal here isn't to fully understand how the model "knows" what is in the image, but rather to see some of the limitations of the model and its implications for us.

For this activity we're using a pre-trained model provided through [Keras](https://keras.io/about/). Keras is free to use and offers models for a variety of machine learning tasks, including vision and natural language (think ChatGPT) workflows.

# Using the Model

First, import some libraries that you'll need:

*   `tensorflow` does the actual machine learning and image recognition work
*   `os` accesses the operating system, to work with the stored image file
*   `matplotlib.pyplot` displays the image in the Colab notebook
*   `numpy` is used to help convert the image into numbers for the model to read



In [None]:
import tensorflow as tf
import os
import matplotlib.pyplot as plt
import numpy as np

Load the model that will identify your images. In this case we're using the VGG16 model. You can find a list of all the pre-trained models in Keras Applications, along with their performance, [here](https://keras.io/api/applications/).

In [None]:
model = tf.keras.applications.VGG16()

## Creating helper functions

The `get_image_from_url` function below takes a URL, fetches the image at it, and stores it in a file.
It then returns the path to the file. This lets you take an image from the internet and turn it into a local file you can give to your model.

It will fail if the image at the URL cannot be read by a machine. Wikipedia images are a good way to avoid this, as is uploading images to your Google Drive.

In [None]:
def get_image_from_url(image_url):
  # If the temporary test_image.jpg file already exists,
  # delete it so a new one can be made.
  if os.path.exists('/root/.keras/datasets/test_image.jpg'):
    os.remove('/root/.keras/datasets/test_image.jpg')

  image_path = tf.keras.utils.get_file('test_image.jpg', origin=image_url)
  return image_path
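
If a download seems to succeed but the image still won't load, one possible cause is that the server returned something other than an image, such as an HTML error page. The sketch below is one way to check this, assuming JPEG or PNG files; `looks_like_image` and `write_temp_file` are hypothetical helpers that are not part of this notebook or of Keras.

```python
import os
import tempfile

# Standard byte sequences that JPEG and PNG files begin with.
JPEG_SIGNATURE = b'\xff\xd8\xff'
PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'

def looks_like_image(path):
    """Return True if the file starts with a JPEG or PNG signature."""
    with open(path, 'rb') as f:
        header = f.read(8)
    return header.startswith(JPEG_SIGNATURE) or header.startswith(PNG_SIGNATURE)

def write_temp_file(contents):
    """Write bytes to a new temporary file and return its path (demo only)."""
    handle, path = tempfile.mkstemp()
    with os.fdopen(handle, 'wb') as f:
        f.write(contents)
    return path

# Made-up file contents for illustration: an error page and a JPEG-like file.
html_path = write_temp_file(b'<html>Access denied</html>')
jpeg_path = write_temp_file(JPEG_SIGNATURE + b'\xe0 rest of a made-up file')

print(looks_like_image(html_path))
print(looks_like_image(jpeg_path))

os.remove(html_path)
os.remove(jpeg_path)
```

In the notebook, you could call `looks_like_image(image_path)` on the path returned by `get_image_from_url` before trying to load it.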

The `load_image` function takes a path to an image (either a local path or an image path from `get_image_from_url`) and loads it into memory. The image is displayed on the screen and returned so that it can be passed to the `classify_image` function.

In [None]:
def load_image(image_path):
  image = tf.keras.preprocessing.image.load_img(image_path, target_size=(224, 224))
  plt.figure()
  plt.imshow(image)
  return image

The `print_classifications` function takes a list of predictions, looks up the labels for them,
then prints the labels and the weight the model has given to that label. It will save you some work later.

In [None]:
def print_classifications(classifications):
  # classifications[0] holds the predictions for the first (and only) image in the batch.
  for number, classification in enumerate(classifications[0], start=1):
    print('{}. {} {:.2f}%'.format(number, classification[1], classification[2]*100))
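
To make the shape of the input clearer, here is a small sketch using made-up data. Keras's `decode_predictions` returns one list per image, each containing `(class_id, label, score)` tuples; the IDs, labels, and scores below are invented for illustration, and `format_classifications` is a hypothetical variant that returns the lines instead of printing them.

```python
# Made-up predictions in the same shape that decode_predictions returns:
# a list with one entry per image, each a list of (class_id, label, score).
sample_classifications = [[
    ('n00000001', 'example_label_a', 0.50),
    ('n00000002', 'example_label_b', 0.25),
    ('n00000003', 'example_label_c', 0.125),
]]

def format_classifications(classifications):
    """Return one formatted line per prediction for the first image."""
    lines = []
    for number, (class_id, label, score) in enumerate(classifications[0], start=1):
        # Scores are fractions between 0 and 1, so multiply by 100 for a percentage.
        lines.append('{}. {} {:.2f}%'.format(number, label, score * 100))
    return lines

for line in format_classifications(sample_classifications):
    print(line)
```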

Putting it all together, we have the `classify_image` function which runs the model on an image and prints out the classifications.

In [None]:
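def classify_image(image):
  image = tf.keras.preprocessing.image.img_to_array(image)
  # The model expects a batch of images, so add a leading batch dimension.
  image = np.expand_dims(image, axis=0)
  # VGG16 expects inputs prepared by its own preprocess_input function.
  image = tf.keras.applications.vgg16.preprocess_input(image)
  classification_result = model.predict(image, batch_size=1)
  classifications = tf.keras.applications.imagenet_utils.decode_predictions(classification_result, top=15)
  print_classifications(classifications)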
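
The NumPy-only sketch below illustrates two preparation steps on a made-up image: adding the batch dimension with `np.expand_dims`, and the kind of preprocessing that Keras documents for VGG16 via `tf.keras.applications.vgg16.preprocess_input`. The `caffe_style_preprocess` function is a hypothetical re-implementation for illustration only; the channel means are the values Keras is understood to use, so treat them as an assumption and call the library function in real code.

```python
import numpy as np

# A stand-in for a 224x224 RGB image with pixel values from 0 to 255.
rng = np.random.default_rng(0)
fake_image = rng.integers(0, 256, size=(224, 224, 3)).astype('float32')

# The model works on batches, so a single image becomes a batch of one.
batch = np.expand_dims(fake_image, axis=0)
print(fake_image.shape, '->', batch.shape)

def caffe_style_preprocess(rgb_batch):
    """Illustrative only: reorder RGB to BGR, then subtract per-channel means."""
    bgr = rgb_batch[..., ::-1].copy()
    # Assumed ImageNet channel means, in BGR order.
    imagenet_bgr_means = np.array([103.939, 116.779, 123.68], dtype='float32')
    return bgr - imagenet_bgr_means

prepared = caffe_style_preprocess(batch)
print(prepared.shape)
```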


## Try it with an online image:

In [None]:
image_path = get_image_from_url('https://dojo.soy/predict-dog')
image = load_image(image_path)
classify_image(image)

Nice! We now have the model's predictions for the image, along with the confidence (as a percentage) the model has in each one.
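
It may help to know where those percentages come from. By default, the last layer of the Keras VGG16 model applies a softmax, which turns raw scores into values that are all positive and add up to 1 across the labels the model knows. The sketch below shows the idea with three made-up scores; it is not the model's actual code.

```python
import numpy as np

def softmax(scores):
    """Convert raw scores into positive values that sum to 1."""
    shifted = scores - np.max(scores)  # subtracting the max avoids overflow
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum()

# Made-up raw scores for three imaginary labels.
raw_scores = np.array([2.0, 1.0, 0.1])
probabilities = softmax(raw_scores)

for label_index, probability in enumerate(probabilities):
    print('label {}: {:.2f}%'.format(label_index, probability * 100))
```

Because the values always sum to 1, the model spreads its confidence across the labels it was trained on, even when the image shows something it has never seen.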

Questions to discuss:
1. Do you notice anything strange about these predictions?
2. Why do you think that could be?

## Classify your own image

To try this with your own images, find a link to an image file and replace the value of the `image_url` variable below with the link (remember to keep it in quotes).

[Wikipedia's Featured Images](https://en.wikipedia.org/wiki/Wikipedia:Featured_pictures) is a great place to find items to test. Simply right-click on an image and choose "Copy Image Address" to get the link.

In [None]:
image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/9/95/Olympus_E-M1_Mark_III_Zuiko_12-100mm.jpg/1920px-Olympus_E-M1_Mark_III_Zuiko_12-100mm.jpg"
image_path = get_image_from_url(image_url)
image = load_image(image_path)
classify_image(image)


Questions to discuss:
1. How did the model work?
2. Were its predictions close to the actual image? If yes, can you find an image that breaks its predictions?
3. If not, why do you think that was?

## Test the model with your webcam

The code block below creates a window to capture an image from the webcam. These images are only stored in this runtime and will be automatically deleted when this session ends.

In [None]:
from IPython.display import HTML, display
from google.colab.output import eval_js
from base64 import b64decode
import numpy as np
import io
from PIL import Image

VIDEO_HTML = """
<div class="video_container">
  <video autoplay
   width=%d height=%d></video>
  <div style='position: absolute;top: 40px; left: 40px; font-size: 40px; color: green;'>Click to save!</div>
</div>
<script>
var video = document.querySelector('video')
navigator.mediaDevices.getUserMedia({ video: true })
  .then(stream=> video.srcObject = stream)

var data = new Promise(resolve=>{
  video.onclick = ()=>{
    var canvas = document.createElement('canvas')
    var [w,h] = [video.offsetWidth, video.offsetHeight]
    canvas.width = w
    canvas.height = h
    canvas.getContext('2d')
          .drawImage(video, 0, 0, w, h)
    video.srcObject.getVideoTracks()[0].stop()
    video.replaceWith(canvas)
    resolve(canvas.toDataURL('image/jpeg', %f))
  }
})
</script>
"""

def take_photo(filename=None, quality=0.8, size=(800,600)):
  handle = display(HTML(VIDEO_HTML % (size[0],size[1],quality)), display_id='videoHTML')
  data = eval_js("data")
  binary = b64decode(data.split(',')[1])

  if filename:
    f = io.BytesIO(binary)
    Image.open(f).save(filename)
  else:
    f = io.BytesIO(binary)
    return np.asarray(Image.open(f))

def classifyWebcamPhoto():
  take_photo("Test.jpeg")
  path = "/content/Test.jpeg"
  image = load_image(path)
  classify_image(image)
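
The `take_photo` function receives the photo from the browser as a data URL: a short header, a comma, and then the image bytes encoded as base64 text. The sketch below shows that decoding step on its own, using made-up bytes in place of a real JPEG; `decode_data_url` is a hypothetical helper, not part of the notebook.

```python
from base64 import b64decode, b64encode

# Made-up bytes standing in for real JPEG data.
fake_jpeg_bytes = b'not-a-real-jpeg'

# Build a data URL in the same form the browser's canvas.toDataURL produces.
data_url = 'data:image/jpeg;base64,' + b64encode(fake_jpeg_bytes).decode('ascii')

def decode_data_url(url):
    """Split a data URL into its header and the decoded binary content."""
    header, encoded = url.split(',', 1)
    return header, b64decode(encoded)

header, binary = decode_data_url(data_url)
print(header)
print(binary == fake_jpeg_bytes)
```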

Run the code block below to take a photo. For best results, try to keep the background clear and hold the object as close to the camera as possible.

In [None]:
classifyWebcamPhoto()

Questions to discuss:
1. How well did the model predict the image?
2. Was it better, worse, or about the same as the online images?

Based on your results, answer the following questions:
3. What types of images do you think this model was trained on?
4. In what use cases would you feel comfortable using this model to identify objects?
5. Are there any areas where you wouldn't trust this model's predictions? Why or why not?


# Credits:
This notebook builds upon the example [Testing your computer's vision](https://projects.raspberrypi.org/en/projects/testing-your-computers-vision/0) activity published by the Raspberry Pi Foundation.

That activity and this notebook are released under a [CC BY-SA 4.0 license](https://creativecommons.org/licenses/by-sa/4.0/).