# Introduction to Computer Vision

*Computer Vision* is a branch of artificial intelligence (AI) that explores the development of AI systems that can "see" the world, either in real time through a camera or by analyzing images and video. This is made possible by the fact that digital images are essentially just arrays of numeric pixel values, and we can use those pixel values as *features* to train machine learning models that can classify images, detect discrete objects in an image, and even generate text-based summaries of photographs.

<p style='text-align:center'><img src='./images/computer_vision.jpg' alt='A robot with glasses'/></p>
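
To make the idea of pixel values as features more concrete, the following minimal sketch treats a tiny, made-up grayscale image as a grid of numbers and flattens it into a feature vector. The image values are hypothetical, and nested lists are used only to keep the example dependency-free; in practice a library such as NumPy would normally hold the pixel array.

```python
# A hypothetical 2x3 grayscale image: each number is one pixel's brightness (0-255).
image = [
    [0, 128, 255],
    [64, 192, 32],
]

height = len(image)
width = len(image[0])

# Flatten the rows into a single feature vector and scale each value to the
# 0-1 range, a common preprocessing step before training a model.
features = [pixel / 255 for row in image for pixel in row]

print("Image size: {}x{} pixels".format(width, height))
print("Feature vector length:", len(features))
```

A color image works the same way, except that each pixel carries three values (red, green, and blue) instead of one.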

## Use the Computer Vision Cognitive Service

Microsoft Azure includes a number of *cognitive services* that encapsulate common AI functions, including some that can help you build computer vision solutions.

The *Computer Vision* cognitive service provides an obvious starting point for our exploration of computer vision in Azure. It uses pre-trained machine learning models to analyze images and extract information about them.

For example, suppose Adventure Works Cycles has set up a number of cameras around the city to track the cycles that have been rented. By using the Computer Vision service, the images taken by the cameras can be analyzed to provide meaningful descriptions of what they depict.

> **Citation**: The images used in this lab are from the [PASCAL Visual Object Classes (VOC) challenge dataset](http://host.robots.ox.ac.uk/pascal/VOC/voc2007/#citation).
>
> @misc{pascal-voc-2007,
	author = "Everingham, M. and Van~Gool, L. and Williams, C. K. I. and Winn, J. and Zisserman, A.",
	title = "The {PASCAL} {V}isual {O}bject {C}lasses {C}hallenge 2007 {(VOC2007)} {R}esults",
	howpublished = "http://www.pascal-network.org/challenges/VOC/voc2007/workshop/index.html"}

Let's start by creating a **Cognitive Services** resource in your Azure subscription:

1. In another browser tab, open the Azure portal at https://portal.azure.com, signing in with your Microsoft account.
2. Click the **&#65291;Create a resource** button, search for *Cognitive Services*, and create a **Cognitive Services** resource with the following settings:
    - **Name**: *Enter a unique name*.
    - **Subscription**: *Your Azure subscription*.
    - **Location**: *Choose any available region*.
    - **Pricing tier**: S0
    - **Resource group**: *Create a resource group with a unique name*.
3. Wait for deployment to complete. Then go to your cognitive services resource, and on the **Quick start** page, note the keys and endpoint. You will need these to connect to your cognitive services resource from client applications.
4. Copy the **Key1** for your resource and paste it in the code below, replacing **YOUR_COG_KEY**.
5. Copy the **endpoint** for your resource and paste it in the code below, replacing **YOUR_COG_ENDPOINT**.
6. Run the code in the cell below by clicking its green <span style="color:green">&#9655;</span> button (at the top left of the cell).

In [None]:
cog_key = 'YOUR_COG_KEY'
cog_endpoint = 'YOUR_COG_ENDPOINT'

print('Ready to use cognitive services at {} using key {}'.format(cog_endpoint, cog_key))
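
Pasting the key directly into a notebook is convenient for a lab, but it also means the key is printed and saved with the notebook. As an optional alternative, the sketch below reads the values from environment variables and masks the key before printing it. The variable names `COG_KEY` and `COG_ENDPOINT`, the helper functions, and the example values are all assumptions made for illustration; they are not part of the lab setup.

```python
import os

def load_cognitive_settings(env=os.environ):
    """Read the key and endpoint from (hypothetical) environment variables."""
    key = env.get('COG_KEY', 'YOUR_COG_KEY')
    endpoint = env.get('COG_ENDPOINT', 'YOUR_COG_ENDPOINT')
    # Fail early if the placeholders were never replaced
    if key.startswith('YOUR_') or endpoint.startswith('YOUR_'):
        raise ValueError('Set COG_KEY and COG_ENDPOINT before running the notebook.')
    return key, endpoint

def mask_key(key, visible=4):
    """Hide all but the last few characters so the key can be printed."""
    return '*' * max(len(key) - visible, 0) + key[-visible:]

# Example with made-up values passed in as a dictionary instead of os.environ
example_env = {
    'COG_KEY': '0123456789abcdef',
    'COG_ENDPOINT': 'https://example.cognitiveservices.azure.com/',
}
key, endpoint = load_cognitive_settings(example_env)
print('Ready to use cognitive services at {} using key {}'.format(endpoint, mask_key(key)))
```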

Now that you've set up the key and endpoint, you can use the Computer Vision service to analyze an image.

Run the following cell to get a description for an image in the */data/voc/2009_004642.jpg* file.

In [None]:
from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from msrest.authentication import CognitiveServicesCredentials
import matplotlib.pyplot as plt
from PIL import Image
import os
%matplotlib inline

# Get a client for the computer vision service
computervision_client = ComputerVisionClient(cog_endpoint, CognitiveServicesCredentials(cog_key))

# Open an image and display it
image_path = os.path.join('data', 'voc', '2009_004642.jpg')
img = Image.open(image_path)
plt.axis('off')
plt.imshow(img)

# Get a description from the computer vision service
# (the with block closes the image file once the request has been sent)
with open(image_path, "rb") as image_stream:
    description_results = computervision_client.describe_image_in_stream(image_stream)
if len(description_results.captions) == 0:
    print("No description detected.")
else:
    for caption in description_results.captions:
        print("'{}' with confidence {:.2f}%".format(caption.text, caption.confidence * 100))



The description provided seems to be pretty accurate.

The Computer Vision cognitive service offers a lot more functionality than generating image descriptions, including:

- Suggesting "tags" for images, which can be useful if you want to index a lot of images for searching.
- Identifying celebrities or well-known landmarks in images.
- Detecting brand logos in an image.
- Performing optical character recognition (OCR) to read text in an image.
- Detecting adult content in an image.
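
As an illustration of the first item in the list above, here is a minimal sketch of how tag suggestions might be retrieved. It assumes the `tag_image_in_stream` method of the same `ComputerVisionClient` used earlier, and that each returned tag exposes `name` and `confidence` attributes; the helper name `suggest_tags` and the `min_confidence` threshold are hypothetical additions.

```python
def suggest_tags(client, image_path, min_confidence=0.5):
    """Return (name, confidence) pairs for tags at or above min_confidence."""
    with open(image_path, "rb") as image_stream:
        # Assumed SDK call: returns a result whose .tags list holds the suggestions
        result = client.tag_image_in_stream(image_stream)
    return [(tag.name, tag.confidence) for tag in result.tags
            if tag.confidence >= min_confidence]

# Example use with the client created earlier in this notebook (requires the service):
# for name, confidence in suggest_tags(computervision_client, image_path):
#     print("'{}' with confidence {:.2f}%".format(name, confidence * 100))
```

Filtering on confidence keeps low-certainty tags out of a search index; the right threshold depends on the images and would need to be tuned.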

## Use the Custom Vision Cognitive Service

The Computer Vision cognitive service provides useful pre-built models for working with images, but you'll often need to train your own model for computer vision. For example, suppose Adventure Works Cycles wants to use the cameras around the city to analyze traffic by identifying images of cars, buses, and cyclists. To do this, you'll need to train a *classification* model that can categorize images into these classes of road user.

### Train an Image Classification Model

You can use the *Custom Vision* cognitive service to train an image classification model based on existing images.

1. Download and extract the training images from https://github.com/GraemeMalcolm/ai-fundamentals/raw/master/data/voc/training_images.zip.
2. In another browser tab, open the Custom Vision portal (https://www.customvision.ai/projects). If prompted, sign in using the Microsoft account associated with your Azure subscription.
3. In the Custom Vision portal, create a new project with the following settings:
    - **Name**: Traffic Classification
    - **Description**: Image classification for traffic.
    - **Resource**: *The Cognitive Services resource you created previously*. If this is not listed, create and select a new resource with the following settings:
        - **Name**: *A unique name*
        - **Subscription**: *Your Azure subscription*
        - **Resource Group**: *Select an existing resource group or create and select a new one*.
        - **Kind**: Cognitive Services
        - **Location**: *Any available location*
        - **Pricing Tier**: S0
    - **Project Types**: Classification
    - **Classification Types**: Multiclass (single tag per image)
    - **Domains**: General
4. Click **\[+\] Add images**, and select all of the files in the **bus** folder you extracted previously. Then upload the image files, specifying the tag *bus*.
5. Repeat the previous step to upload the images in the **car** folder with the tag *car*, and the images in the **cyclist** folder with the tag *cyclist*.
6. Explore the images you have uploaded in the Custom Vision project - there should be 40 images of each class.
7. In the Custom Vision project, click **Train** to train a classification model using the tagged images. Select the **Quick Training** option.
8. Wait for training to complete, and then review the *Precision*, *Recall*, and *AP* performance metrics - these measure the prediction accuracy of the classification model, and should all be high.
9. Click **&#128504; Publish** to publish the trained model with the following settings:
    - **Model name**: traffic
    - **Prediction Resource**: *Your cognitive services resource*.
10. After publishing, click the **&#9881;** icon at the top right to view the project settings. Under **General** (on the left), note the **Project Id**; and under **Resources** (on the right), note the **Key** and **Endpoint** values. Copy these values and paste them in the code cell below, replacing **YOUR_PROJECT_ID**, **YOUR_KEY** and **YOUR_ENDPOINT**.
11. Run the code cell below.

In [None]:
project_id = 'YOUR_PROJECT_ID'
cog_key = 'YOUR_KEY'
cog_endpoint = 'YOUR_ENDPOINT'
model_name = 'traffic' # this must match the model name you set when publishing your model iteration exactly (including case)!
print('Ready to predict using model {} in project {}'.format(model_name, project_id))
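
The *Precision* and *Recall* metrics reviewed after training can be illustrated with a small worked example. Precision is the fraction of images predicted as a class that really belong to it, and recall is the fraction of images that belong to a class that the model found. The counts below are hypothetical and are not taken from the lab's model.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Compute precision and recall for one class from prediction counts."""
    predicted = true_positives + false_positives   # everything the model labeled as the class
    actual = true_positives + false_negatives      # everything that really is the class
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall

# Hypothetical counts for the "bus" tag: 9 buses found, 1 non-bus labeled as a bus, 3 buses missed
precision, recall = precision_recall(true_positives=9, false_positives=1, false_negatives=3)
print("Precision: {:.2f}, Recall: {:.2f}".format(precision, recall))
# Prints: Precision: 0.90, Recall: 0.75
```

*AP* (average precision) summarizes how precision holds up across different recall levels, so it is not computed from a single set of counts like these.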

Now you're ready to use your custom vision classification model.

Run the following code cell, which uses your model to classify a selection of test images.

In [None]:
from azure.cognitiveservices.vision.customvision.prediction import CustomVisionPredictionClient
import matplotlib.pyplot as plt
from PIL import Image
import os
%matplotlib inline

# Get the test images from the data/voc/test folder
test_folder = os.path.join('data', 'voc', 'test')
test_images = os.listdir(test_folder)

# Create an instance of the prediction service
custom_vision_client = CustomVisionPredictionClient(cog_key, endpoint=cog_endpoint)

# Create a figure to display the results
fig = plt.figure(figsize=(16, 8))
# add_subplot needs an integer row count, so round up to fit three images per row
num_rows = (len(test_images) + 2) // 3

# Get the images and show the predicted classes
for idx in range(len(test_images)):
    # Open the image, and use the custom vision model to classify it
    image_path = os.path.join(test_folder, test_images[idx])
    with open(image_path, "rb") as image_contents:
        classification = custom_vision_client.classify_image(project_id, model_name, image_contents.read())
    # The results include a prediction for each tag, in descending order of probability - get the first one
    prediction = classification.predictions[0].tag_name
    # Display the image with its predicted class
    img = Image.open(image_path)
    a = fig.add_subplot(num_rows, 3, idx + 1)
    a.axis('off')
    a.imshow(img)
    a.set_title(prediction)
plt.show()

Hopefully, your image classification model has correctly identified the vehicles in the images.
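
The code above always takes the first prediction, even when the model is not very sure. As a minimal sketch, the helper below returns the most likely tag only if its probability clears a threshold. It assumes each prediction exposes `tag_name` and `probability` attributes, as the Custom Vision results do; the function name, the threshold, and the stand-in sample data are hypothetical.

```python
from types import SimpleNamespace

def top_prediction(predictions, min_probability=0.5):
    """Return (tag_name, probability) for the most likely tag, or None if it is too uncertain."""
    if not predictions:
        return None
    best = max(predictions, key=lambda p: p.probability)
    if best.probability < min_probability:
        return None
    return best.tag_name, best.probability

# Stand-in objects that mimic the shape of classification.predictions
sample = [
    SimpleNamespace(tag_name='car', probability=0.83),
    SimpleNamespace(tag_name='bus', probability=0.15),
    SimpleNamespace(tag_name='cyclist', probability=0.02),
]
print(top_prediction(sample))
# Prints: ('car', 0.83)
```

With real results, `top_prediction(classification.predictions)` could replace the direct index into the list, leaving images without a confident match unlabeled.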

You can also use the Custom Vision service to create *object detection* models, which not only classify objects in images, but also identify *bounding boxes* that show the location of the object in the image.
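
To give a feel for what a bounding box looks like in practice, the sketch below converts a box expressed as fractions of the image size into pixel coordinates suitable for drawing. It assumes the detection results describe each box with normalized `left`, `top`, `width`, and `height` values between 0 and 1; the function name and the sample numbers are hypothetical.

```python
def to_pixel_box(left, top, width, height, image_width, image_height):
    """Convert a normalized (0-1) bounding box to pixel corner coordinates."""
    x0 = round(left * image_width)
    y0 = round(top * image_height)
    x1 = round((left + width) * image_width)
    y1 = round((top + height) * image_height)
    # Top-left and bottom-right corners, the form ImageDraw.rectangle accepts
    return (x0, y0), (x1, y1)

# A made-up box covering the middle of a 400x200 pixel image
print(to_pixel_box(0.25, 0.5, 0.5, 0.25, image_width=400, image_height=200))
# Prints: ((100, 100), (300, 150))
```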

## Use the Face Cognitive Service

You may have noticed that the Custom Vision model you trained actually identifies *cycles* rather than *cyclists*. It might be useful to extend the traffic analysis application to analyze images that are classified as *cyclist* to determine if they contain any human faces; and if so count the number of faces detected and highlight them in the image.

To accomplish this, you'll use a third cognitive service that provides face detection and facial recognition capabilities.

The code below performs the same image classification as previously, but now when a *cyclist* image is found, the code uses the **Face** cognitive service to detect faces in the image.

Run the code cell below to see the results of this enhancement to the application.

In [None]:
from azure.cognitiveservices.vision.customvision.prediction import CustomVisionPredictionClient
from azure.cognitiveservices.vision.face import FaceClient
from msrest.authentication import CognitiveServicesCredentials
import matplotlib.pyplot as plt
from PIL import Image, ImageDraw
import os
%matplotlib inline

# Get the test images from the data/voc/test folder
test_folder = os.path.join('data', 'voc', 'test')
test_images = os.listdir(test_folder)

# Create a prediction client
custom_vision_client = CustomVisionPredictionClient(cog_key, endpoint=cog_endpoint)

# Create a face detection client.
face_client = FaceClient(cog_endpoint, CognitiveServicesCredentials(cog_key))

# Create a figure to display the results
fig = plt.figure(figsize=(16, 8))
# add_subplot needs an integer row count, so round up to fit three images per row
num_rows = (len(test_images) + 2) // 3

# Get the images and show the predicted classes
for idx in range(len(test_images)):
    # Open the image, and use the custom vision model to classify it
    image_path = os.path.join(test_folder, test_images[idx])
    with open(image_path, "rb") as image_contents:
        classification = custom_vision_client.classify_image(project_id, model_name, image_contents.read())
    # The results include a prediction for each tag, in descending order of probability - get the first one
    prediction = classification.predictions[0].tag_name
    # Open the image so we can add it to the figure
    img = Image.open(image_path)

    # If the image is a cyclist, detect faces
    if prediction == 'cyclist':
        with open(image_path, "rb") as image_stream:
            detected_faces = face_client.face.detect_with_stream(image=image_stream)
        if detected_faces:
            # If there are faces, how many?
            num_faces = len(detected_faces)
            prediction = '{} ({} face{} detected)'.format(prediction, num_faces, '' if num_faces == 1 else 's')
            # Draw a rectangle around each detected face
            draw = ImageDraw.Draw(img)
            for face in detected_faces:
                r = face.face_rectangle
                bounding_box = ((r.left, r.top), (r.left + r.width, r.top + r.height))
                draw.rectangle(bounding_box, outline='magenta', width=5)

    # Display the image with its predicted class and detected faces
    a = fig.add_subplot(num_rows, 3, idx + 1)
    a.axis('off')
    a.imshow(img)
    a.set_title(prediction)
plt.show()

The Face cognitive service can do much more than simply detect faces. It can also analyze facial features and expressions to suggest gender, age, and emotional state; and it can compare faces for similarity and be trained to recognize individual faces.

## Learn More

- To learn more about the Computer Vision cognitive service, see the [Computer Vision documentation](https://docs.microsoft.com/azure/cognitive-services/computer-vision/).
- To learn more about the Custom Vision cognitive service, see the [Custom Vision documentation](https://docs.microsoft.com/azure/cognitive-services/custom-vision-service/home).
- To learn more about the Face cognitive service, see the [Face documentation](https://docs.microsoft.com/azure/cognitive-services/face/).
