# Women in Tech - Cognitive services workshop

This workshop introduces you to a small subset of Microsoft Cognitive Services. These services make it easy to use AI in your software solutions. You can find more information on <a href="https://azure.microsoft.com/en-us/services/cognitive-services/">the official webpage</a>.

The examples in this workshop come from this Microsoft GitHub repository: https://github.com/microsoft/cognitive-services-notebooks

We are going to use Azure Notebooks for this. With Azure Notebooks you can run Jupyter notebooks in the cloud instead of on your desktop. Read more on Azure Notebooks <a href="https://notebooks.azure.com/">here</a>.

## Prerequisites
* Know what Python is
* Laptop or other device
* Working internet connection


## Contents

For this workshop we are going to focus on some of the Computer Vision Cognitive Services. We are going to accomplish the following tasks:
* [Analyze an image](#AnalyzeImage)
* [Use a domain-specific Model](#DomainSpecificModel)
* [Intelligently generate a thumbnail](#GetThumbnail)
* [Detect and extract printed text from an image](#OCR)
* [Detect and extract handwritten text from an image](#RecognizeText)

These tasks are pretty straightforward for most humans, but are very hard to accomplish by writing code. By using the Cognitive Services in Azure, we can accomplish these tasks by just calling an API. The only thing you need is a subscription key, which you can get from the Azure portal: add a new 'Cognitive Services' resource and copy the subscription key provided to you.

To save time and allow everyone to participate, I have already created this resource and provided the key in this notebook. This key will, however, be deleted shortly after the workshop.

If you want to keep using the Cognitive Services after this workshop, just follow <a href="https://docs.microsoft.com/nl-nl/azure/cognitive-services/cognitive-services-apis-create-account">these instructions</a>.

## Setting up our notebook with a Cognitive Services subscription key

We assign the Cognitive Services subscription key to a local variable:

In [None]:
subscription_key = "ff7942ef54fe4146bf49094999093c01"
assert subscription_key

Then we assign the base API URL for the Vision API to a local variable as well:

In [None]:
vision_base_url = "https://westeurope.api.cognitive.microsoft.com/vision/v2.0/"

Since the Cognitive Services resource associated with that subscription key is located in the West Europe region, the endpoint has to point to that region.
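
If your own resource lives in a different region, only the host name changes. The following is a minimal sketch of that idea; `build_vision_base_url` is a hypothetical helper introduced here for illustration, not part of any Microsoft library, and it assumes the URL pattern used above.

```python
def build_vision_base_url(region, version="v2.0"):
    """Return the Computer Vision base URL for an Azure region.

    Hypothetical helper: it only fills the region and API version
    into the URL pattern used in this notebook.
    """
    return "https://{0}.api.cognitive.microsoft.com/vision/{1}/".format(region, version)


# The region used in this workshop
print(build_vision_base_url("westeurope"))
```

Calling it with `"westeurope"` should reproduce the value assigned to `vision_base_url` above.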

We need to import some libraries to make HTTP calls and to display and manipulate images:

In [None]:
import requests
from PIL import Image 
from IPython.display import Image as Img
import matplotlib.pyplot as plt
from io import BytesIO

## Analyze an image with Computer Vision API using Python <a name="AnalyzeImage"> </a>

With the [Analyze Image method](https://westcentralus.dev.cognitive.microsoft.com/docs/services/5adf991815e1060e6355ad44/operations/56f91f2e778daf14a499e1fa), you can extract visual features based on image content. You can upload an image or specify an image URL and choose which features to return, including:
* A detailed list of tags related to the image content.
* A description of image content in a complete sentence.
* The coordinates, gender, and age of any faces contained in the image.
* The ImageType (clip art or a line drawing).
* The dominant color, the accent color, or whether an image is black & white.
* The category defined in this [taxonomy](https://docs.microsoft.com/azure/cognitive-services/computer-vision/category-taxonomy).
* Whether the image contains adult or sexually suggestive content.

### Analyze an image 

The image analysis URL looks like the following (see REST API docs [here](https://westcentralus.dev.cognitive.microsoft.com/docs/services/5adf991815e1060e6355ad44/operations/56f91f2e778daf14a499e1fa)):
<code>
https://[location].api.cognitive.microsoft.com/vision/v2.0/<b>analyze</b>[?visualFeatures][&details][&language]
</code>

In [None]:
vision_analyze_url = vision_base_url + "analyze"
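
To make the optional query parameters in the URL template more concrete, here is a small sketch that assembles a full analyze URL by hand with the standard library. Later in this notebook the `requests` library does this for us when we pass a `params` dictionary, so this is purely illustrative. The `language` value `en` and the variable names are assumptions chosen for the example.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Same base URL as used in this notebook
example_base_url = "https://westeurope.api.cognitive.microsoft.com/vision/v2.0/"

# Example query parameters (values chosen for illustration)
example_query = {
    "visualFeatures": "Categories,Description,Color",
    "language": "en",
}

# urlencode percent-encodes the values and joins them with '&'
example_analyze_url = example_base_url + "analyze?" + urlencode(example_query)
print(example_analyze_url)

# Decoding the query string again gives back the original values
decoded_query = parse_qs(urlsplit(example_analyze_url).query)
print(decoded_query)
```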

To begin analyzing an image, set `image_url` to the URL of any image that you want to analyze. I have included a few; feel free to substitute any other image URL, as long as it is appropriate for this workshop.

In [None]:
image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/1/12/Broadway_and_Times_Square_by_night.jpg/450px-Broadway_and_Times_Square_by_night.jpg"
# image_url = "https://images.pexels.com/photos/2755165/pexels-photo-2755165.jpeg?auto=compress&cs=tinysrgb&dpr=2&h=750&w=1260"
# image_url = "https://images.pexels.com/photos/2986509/pexels-photo-2986509.jpeg?auto=compress&cs=tinysrgb&dpr=2&h=750&w=1260"
# image_url = "https://images.pexels.com/photos/2765872/pexels-photo-2765872.jpeg?auto=compress&cs=tinysrgb&dpr=2&h=750&w=1260"
# image_url = "https://images.pexels.com/photos/3183132/pexels-photo-3183132.jpeg?auto=compress&cs=tinysrgb&dpr=2&h=750&w=1260"
# image_url = "https://images.pexels.com/photos/2986374/pexels-photo-2986374.jpeg?auto=compress&cs=tinysrgb&dpr=2&h=750&w=1260"
# image_url = "https://images.pexels.com/photos/3027216/pexels-photo-3027216.jpeg?auto=compress&cs=tinysrgb&dpr=2&h=750&w=1260"

To show a sample of the image, just run the code block below.

In [None]:
Img(url=image_url, width=250)

The following block uses the `requests` library in Python to call out to the Computer Vision `analyze` API and return the results as a JSON object. The API key is passed in via the `headers` dictionary and the types of features to recognize via the `params` dictionary. To see the full list of options that can be used, refer to the [REST API](https://westcentralus.dev.cognitive.microsoft.com/docs/services/5adf991815e1060e6355ad44/operations/56f91f2e778daf14a499e1fa) documentation for image analysis.

In [None]:
headers  = {'Ocp-Apim-Subscription-Key': subscription_key }
params   = {'visualFeatures': 'Categories,Description,Color'}
data     = {'url': image_url}
response = requests.post(vision_analyze_url, headers=headers, params=params, json=data)
response.raise_for_status()
analysis = response.json()

The `analysis` object contains various fields that describe the image. The most relevant caption for the image can be obtained from the `description` property. If the service returns no caption, the lookup below raises an `IndexError`, which this notebook does not handle.

In [None]:
image_caption = analysis["description"]["captions"][0]["text"].capitalize()
print(image_caption)
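
If you prefer not to crash when no caption comes back, a more defensive lookup could look like the sketch below. `best_caption` is a hypothetical helper, and the sample dictionaries are made up for illustration; they only mimic the `description` / `captions` structure used above and assume each caption carries a `confidence` value.

```python
def best_caption(analysis):
    """Return (text, confidence) for the most confident caption, or None."""
    captions = analysis.get("description", {}).get("captions", [])
    if not captions:
        return None
    # Pick the caption with the highest confidence (0.0 if the field is missing)
    top = max(captions, key=lambda caption: caption.get("confidence", 0.0))
    return top["text"].capitalize(), top.get("confidence")


# Made-up response fragments, not real service output
sample_with_caption = {
    "description": {"captions": [{"text": "a city street at night", "confidence": 0.9}]}
}
sample_without_caption = {"description": {"captions": []}}

print(best_caption(sample_with_caption))
print(best_caption(sample_without_caption))
```

In the notebook you would call `best_caption(analysis)` and check for `None` before printing.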

## Use a domain-specific model <a name="DomainSpecificModel"> </a>

A [domain-specific model](https://westcentralus.dev.cognitive.microsoft.com/docs/services/5adf991815e1060e6355ad44/operations/56f91f2e778daf14a499e1fd)  is a model trained to identify a specific set of objects in an image.  The two domain-specific models that are currently available are _celebrities_ and _landmarks_. 

To view the list of domain-specific models supported, you can make the following request against the service.

In [None]:
model_url = vision_base_url + "models"
headers   = {'Ocp-Apim-Subscription-Key': subscription_key}
models    = requests.get(model_url, headers=headers).json()
[model["name"] for model in models["models"]]

### Landmark identification
To begin using the domain-specific model for landmarks, set `image_url` to point to an image to be analyzed.

In [None]:
image_url = "https://upload.wikimedia.org/wikipedia/commons/f/f6/Bunker_Hill_Monument_2005.jpg"
# image_url = "https://images.pexels.com/photos/1461974/pexels-photo-1461974.jpeg?auto=compress&cs=tinysrgb&dpr=2&h=750&w=1260"
# image_url = "https://images.pexels.com/photos/1141853/pexels-photo-1141853.jpeg?auto=compress&cs=tinysrgb&dpr=2&h=750&w=1260"
# image_url = "https://images.pexels.com/photos/1448136/pexels-photo-1448136.jpeg?auto=compress&cs=tinysrgb&dpr=2&h=750&w=1260"
# image_url = "https://images.pexels.com/photos/753339/pexels-photo-753339.jpeg?auto=compress&cs=tinysrgb&dpr=2&h=750&w=1260"
# image_url = "https://images.pexels.com/photos/64271/queen-of-liberty-statue-of-liberty-new-york-liberty-statue-64271.jpeg?auto=compress&cs=tinysrgb&dpr=2&h=750&w=1260"

In [None]:
Img(url=image_url, width=250)

The service endpoint to analyze images for landmarks can be constructed as follows:

In [None]:
landmark_analyze_url = vision_base_url + "models/landmarks/analyze"
print(landmark_analyze_url)

The image in `image_url` can now be analyzed for any landmarks. The identified landmark is stored in `landmark_name`.

In [None]:
headers  = {'Ocp-Apim-Subscription-Key': subscription_key}
params   = {'model': 'landmarks'}
data     = {'url': image_url}
response = requests.post(landmark_analyze_url, headers=headers, params=params, json=data)
response.raise_for_status()

analysis      = response.json()
# Stop here if the service found no landmarks in the image
assert analysis["result"]["landmarks"], "No landmarks detected"

landmark_name = analysis["result"]["landmarks"][0]["name"].capitalize()
print(landmark_name)

### Celebrity identification
Along the same lines, the domain-specific model for identifying celebrities can be invoked as shown next. First set `image_url` to point to the image of a celebrity.

In [None]:
image_url = "https://upload.wikimedia.org/wikipedia/commons/d/d9/Bill_gates_portrait.jpg"
#image_url = "https://www.cheatsheet.com/wp-content/uploads/2019/11/Brad-Pitt.jpg"
#image_url = "https://www.thesun.co.uk/wp-content/uploads/2018/07/NINTCHDBPICT0004161277131.jpg"
#image_url = "https://pmcvariety.files.wordpress.com/2019/10/shutterstock_editorial_10435445et.jpg?w=1000&h=563&crop=1"
#image_url = "https://www.grammy.com/sites/com/files/styles/image_landscape_hero/public/edsheeran-hero-478072284.jpg?itok=WFSpwjin"
#image_url = "https://www.oxy.edu/sites/default/files/styles/banner_image/public/top-level-news/obama-scholars-news.jpg?itok=eOKS79MY"

The service endpoint for detecting celebrity images can be constructed as follows:

In [None]:
celebrity_analyze_url = vision_base_url + "models/celebrities/analyze"
print(celebrity_analyze_url)

Next, the image in `image_url` can be analyzed for celebrities:

In [None]:
headers  = {'Ocp-Apim-Subscription-Key': subscription_key}
params   = {'model': 'celebrities'}
data     = {'url': image_url}
response = requests.post(celebrity_analyze_url, headers=headers, params=params, json=data)
response.raise_for_status()

analysis = response.json()

In [None]:
print(analysis)

The following lines of code extract the name and bounding box for one of the celebrities found:

In [None]:
# Stop here if the service found no celebrities in the image
assert analysis["result"]["celebrities"], "No celebrities detected"
celebrity_info = analysis["result"]["celebrities"][0]
celebrity_name = celebrity_info["name"]
celebrity_face = celebrity_info["faceRectangle"]
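
The cell above only looks at the first match. When an image shows several people, you may want all of them. The sketch below walks over the whole `celebrities` list; `list_celebrities` is a hypothetical helper, the sample data is invented, and the `confidence` field is assumed to be present in the response (the code tolerates it being absent).

```python
def list_celebrities(analysis):
    """Return a list of (name, confidence, (left, top, width, height)) tuples."""
    found = []
    for item in analysis.get("result", {}).get("celebrities", []):
        face = item["faceRectangle"]
        box = (face["left"], face["top"], face["width"], face["height"])
        found.append((item["name"], item.get("confidence"), box))
    return found


# Invented sample in the same shape as the response used above
sample_celebrity_analysis = {"result": {"celebrities": [
    {"name": "Person A", "confidence": 0.99,
     "faceRectangle": {"left": 10, "top": 20, "width": 30, "height": 40}},
    {"name": "Person B", "confidence": 0.75,
     "faceRectangle": {"left": 100, "top": 25, "width": 35, "height": 35}},
]}}

for name, confidence, box in list_celebrities(sample_celebrity_analysis):
    print(name, confidence, box)
```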

Next, this information can be overlaid on top of the original image using the following lines of code:

In [None]:
from matplotlib.patches import Rectangle

plt.figure(figsize=(5,5))

image  = Image.open(BytesIO(requests.get(image_url).content))
ax     = plt.imshow(image, alpha=0.6)
origin = (celebrity_face["left"], celebrity_face["top"])
p      = Rectangle(origin, celebrity_face["width"], celebrity_face["height"], 
                   fill=False, linewidth=2, color='b')
ax.axes.add_patch(p)
plt.text(origin[0], origin[1], celebrity_name, fontsize=20, weight="bold", va="bottom")
_ = plt.axis("off")

## Get a thumbnail with Computer Vision API <a name="GetThumbnail"> </a>

Use the [Get Thumbnail method](https://westcentralus.dev.cognitive.microsoft.com/docs/services/5adf991815e1060e6355ad44/operations/56f91f2e778daf14a499e1fb) to crop an image based on its region of interest (ROI) to the height and width you desire. The aspect ratio you set for the thumbnail can be different from the aspect ratio of the input image.

To generate the thumbnail for an image, first set `image_url` to point to its location. 

In [None]:
image_url = "https://upload.wikimedia.org/wikipedia/commons/9/94/Bloodhound_Puppy.jpg"
Img(url=image_url, width=250)

The service endpoint to generate the thumbnail can be constructed as follows:

In [None]:
thumbnail_url = vision_base_url + "generateThumbnail"
print(thumbnail_url)

Next, a 50-by-50 pixel thumbnail for the image can be generated by calling this service endpoint.

In [None]:
headers  = {'Ocp-Apim-Subscription-Key': subscription_key}
params   = {'width': '50', 'height': '50','smartCropping': 'true'}
data     = {'url': image_url}
response = requests.post(thumbnail_url, headers=headers, params=params, json=data)
response.raise_for_status()

You can verify that the thumbnail is indeed 50-by-50 pixels using the Python Imaging Library (PIL).

In [None]:
from PIL import Image as PImage
thumbnail = PImage.open(BytesIO(response.content))
print("Thumbnail is {0}-by-{1}".format(*thumbnail.size))
thumbnail

## Optical character recognition (OCR) with Computer Vision API <a name="OCR"> </a>

Use the [Optical Character Recognition (OCR) method](https://westcentralus.dev.cognitive.microsoft.com/docs/services/5adf991815e1060e6355ad44/operations/56f91f2e778daf14a499e1fc) to synchronously detect printed text in an image and extract recognized characters into a machine-usable character stream.

To illustrate the OCR API, set `image_url` to point to the text to be recognized.

In [None]:
image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/a/af/Atomist_quote_from_Democritus.png/338px-Atomist_quote_from_Democritus.png"
Img(url=image_url, width=250)

The service endpoint for OCR for your region can be constructed as follows:

In [None]:
ocr_url = vision_base_url + "ocr"
print(ocr_url)

Next, you can call into the OCR service to get the text that was recognized along with bounding boxes. In the parameters shown, `"language": "unk"` automatically detects the language in the text and `"detectOrientation": "true"` automatically aligns the image. For more information, see the [REST API documentation](https://westcentralus.dev.cognitive.microsoft.com/docs/services/5adf991815e1060e6355ad44/operations/56f91f2e778daf14a499e1fc).

In [None]:
headers  = {'Ocp-Apim-Subscription-Key': subscription_key}
params   = {'language': 'unk', 'detectOrientation': 'true'}
data     = {'url': image_url}
response = requests.post(ocr_url, headers=headers, params=params, json=data)
response.raise_for_status()

analysis = response.json()

The word bounding boxes and text from the results of analysis can be extracted using the following lines of code:

In [None]:
line_infos = [region["lines"] for region in analysis["regions"]]
word_infos = []
for line in line_infos:
    for word_metadata in line:
        for word_info in word_metadata["words"]:
            word_infos.append(word_info)
word_infos
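
Besides the individual words, it is often handy to get the text back line by line. The sketch below assumes the same `regions` / `lines` / `words` structure used in the cell above, with each `boundingBox` given as a `"left,top,width,height"` string. The helper names and the sample data are made up for illustration.

```python
def ocr_text_lines(analysis):
    """Join the words of every recognized line into one string per line."""
    lines = []
    for region in analysis.get("regions", []):
        for line in region.get("lines", []):
            lines.append(" ".join(word["text"] for word in line.get("words", [])))
    return lines


def parse_bounding_box(bounding_box):
    """Turn 'left,top,width,height' into a tuple of four integers."""
    left, top, width, height = (int(num) for num in bounding_box.split(","))
    return left, top, width, height


# Invented sample in the same shape as the OCR response
sample_ocr_analysis = {"regions": [{"lines": [
    {"words": [{"boundingBox": "28,16,60,41", "text": "Hello"},
               {"boundingBox": "95,16,70,41", "text": "workshop"}]},
    {"words": [{"boundingBox": "28,66,50,41", "text": "Bye"}]},
]}]}

print(ocr_text_lines(sample_ocr_analysis))
print(parse_bounding_box("28,16,60,41"))
```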

Finally, the recognized text can be overlaid on top of the original image using the `matplotlib` library.

In [None]:
from matplotlib.patches import Rectangle

plt.figure(figsize=(5,5))

image  = Image.open(BytesIO(requests.get(image_url).content))
ax     = plt.imshow(image, alpha=0.5)
for word in word_infos:
    bbox = [int(num) for num in word["boundingBox"].split(",")]
    text = word["text"]
    origin = (bbox[0], bbox[1])
    patch  = Rectangle(origin, bbox[2], bbox[3], fill=False, linewidth=2, color='y')
    ax.axes.add_patch(patch)
    plt.text(origin[0], origin[1], text, fontsize=20, weight="bold", va="top")
_ = plt.axis("off")

## Text recognition with Computer Vision API <a name="RecognizeText"> </a>

Use the [Recognize Text method](https://westcentralus.dev.cognitive.microsoft.com/docs/services/5adf991815e1060e6355ad44/operations/587f2c6a154055056008f200) to asynchronously detect handwritten or printed text in an image and extract recognized characters into a machine-usable character stream.

Set `image_url` to point to the image to be recognized.

In [None]:
image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Cursive_Writing_on_Notebook_paper.jpg/800px-Cursive_Writing_on_Notebook_paper.jpg"
Img(url=image_url, width=350)

The service endpoint for the text recognition service can be constructed as follows:

In [None]:
text_recognition_url = vision_base_url + "recognizeText"
print(text_recognition_url)

The handwritten text recognition service can be used to recognize the text in the image. In the `params` dictionary, you can set `mode` to `Printed` to recognize only printed text.

In [None]:
headers  = {'Ocp-Apim-Subscription-Key': subscription_key}
params   = {'mode' : 'Handwritten'}
data     = {'url': image_url}
response = requests.post(text_recognition_url, headers=headers, params=params, json=data)
response.raise_for_status()

The text recognition service does not return the recognized text in its initial response. Instead, it returns immediately with an `Operation-Location` URL in the response headers that must be polled to get the result of the operation.

In [None]:
operation_url = response.headers["Operation-Location"]

After obtaining the `operation_url`, you can query it for the analyzed text. The following lines of code implement a polling loop in order to wait for the operation to complete. Notice that the polling is done via an HTTP `GET` method instead of `POST`.

In [None]:
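import time

analysis = {}
# Poll the operation URL once per second until the result is available
while "recognitionResult" not in analysis:
    response_final = requests.get(operation_url, headers=headers)
    response_final.raise_for_status()
    analysis       = response_final.json()
    # Avoid looping forever if the service reports a failed operation
    if analysis.get("status") == "Failed":
        raise RuntimeError("Text recognition failed")
    time.sleep(1)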
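
A polling loop like the one above can wait forever if the result never arrives. The following sketch adds a timeout. It is not the official way to call the service: `poll_until_done` is a hypothetical helper, the `"Failed"` status value is an assumption about the response format, and the HTTP call is replaced by a stand-in function so the example runs without network access.

```python
import time


def poll_until_done(fetch, timeout_seconds=30.0, interval_seconds=1.0):
    """Call fetch() until the result is ready, the service fails, or time runs out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        analysis = fetch()
        if "recognitionResult" in analysis:
            return analysis
        # Assumption: a failed operation is reported via a "status" field
        if analysis.get("status") == "Failed":
            raise RuntimeError("Text recognition failed")
        if time.monotonic() >= deadline:
            raise TimeoutError("Gave up waiting for the text recognition result")
        time.sleep(interval_seconds)


# Stand-in for the real HTTP call: two "Running" answers, then a result
_fake_responses = iter([
    {"status": "Running"},
    {"status": "Running"},
    {"status": "Succeeded", "recognitionResult": {"lines": []}},
])
polled = poll_until_done(lambda: next(_fake_responses), interval_seconds=0)
print(polled["status"])
```

In the notebook, the stand-in could be replaced by something like `lambda: requests.get(operation_url, headers=headers).json()`.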

Next, the recognized text along with the bounding boxes can be extracted as shown in the following line of code. An important point to note is that the handwritten text recognition API returns bounding boxes as **polygons** instead of **rectangles**. Each polygon _p_ is defined by its vertices, specified using the following convention:

<i>p</i> = [<i>x</i><sub>1</sub>, <i>y</i><sub>1</sub>, <i>x</i><sub>2</sub>, <i>y</i><sub>2</sub>, ..., <i>x</i><sub>N</sub>, <i>y</i><sub>N</sub>]

In [None]:
polygons = [(line["boundingBox"], line["text"]) for line in analysis["recognitionResult"]["lines"]]

Finally, the recognized text can be overlaid on top of the original image using the extracted polygon information. Notice that `matplotlib` requires the vertices to be specified as a list of tuples of the form:

<i>p</i> = [(<i>x</i><sub>1</sub>, <i>y</i><sub>1</sub>), (<i>x</i><sub>2</sub>, <i>y</i><sub>2</sub>), ..., (<i>x</i><sub>N</sub>, <i>y</i><sub>N</sub>)]

and the post-processing code transforms the polygon data returned by the service into the form required by `matplotlib`.

In [None]:
from matplotlib.patches import Polygon

plt.figure(figsize=(15,15))

image  = Image.open(BytesIO(requests.get(image_url).content))
ax     = plt.imshow(image)
for polygon in polygons:
    vertices = [(polygon[0][i], polygon[0][i+1]) for i in range(0,len(polygon[0]),2)]
    text     = polygon[1]
    patch    = Polygon(vertices, closed=True,fill=False, linewidth=2, color='y')
    ax.axes.add_patch(patch)
    plt.text(vertices[0][0], vertices[0][1], text, fontsize=20, va="top")
_ = plt.axis("off")
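
The conversion from the flat coordinate list to a list of vertex tuples, done inline in the cell above, can also be written as a small standalone helper. This is just one possible way to express it; `to_vertices` is a name made up for this sketch.

```python
def to_vertices(flat):
    """Turn [x1, y1, x2, y2, ...] into [(x1, y1), (x2, y2), ...]."""
    if len(flat) % 2 != 0:
        raise ValueError("expected an even number of coordinates")
    # Pair every even-indexed value (x) with the odd-indexed value after it (y)
    return list(zip(flat[0::2], flat[1::2]))


print(to_vertices([2, 1, 10, 1, 10, 6, 2, 6]))
```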

## Wrapping it up
Yes! You did it! Great job exploring some of the Azure Cognitive Services you can use in all your ongoing software endeavours!
![Celebration](img/celebration.gif)

