# Digitize your notes with Azure Computer Vision OCR

The Computer Vision service provides pre-built, advanced algorithms that process and analyze images and extract text from photos and documents (Optical Character Recognition, OCR).

In this exercise, we will explore the pre-trained models of Azure Computer Vision service for optical character recognition. We will build a simple Python notebook that turns your handwritten documents into digital notes. 

You will learn how to:
* Provision a Computer Vision resource.
* Use a Computer Vision resource to extract text from photos.

## Create a Computer Vision Resource

1. Sign in to [Azure Portal](https://portal.azure.com/) and select **Create a resource**.
2. Search for **Computer Vision** and then click **Create**.
3. Create a Computer Vision resource with the following settings:
    * **Subscription**: Your Azure subscription.
    * **Resource group**: Select an existing resource group or create a new one.
    * **Region**: Choose any available region, for example **North Europe**.
    * **Name**: Enter a unique name. It becomes the custom domain name in your endpoint.
    * **Pricing tier**: You can use the free pricing tier (**F0**) to try the service, and upgrade later to a paid tier.
4. Select **Review + Create** and wait for deployment to complete.
5. Once the deployment is complete, select **Go to resource**. On the **Overview** tab, click **Manage keys**. Save **Key 1** and the **Endpoint**. You will need the key and the endpoint to connect to your Computer Vision resource from client applications.


## Import libraries

In [None]:
from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from msrest.authentication import CognitiveServicesCredentials
from azure.cognitiveservices.vision.computervision.models import OperationStatusCodes
from PIL import Image
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import time
import numpy as np

## Create variables for your key and endpoint

In [None]:
key = 'YOUR_KEY'
endpoint = 'YOUR_ENDPOINT'
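
Hard-coding the key in a notebook makes it easy to leak when the notebook is shared. Here is a minimal sketch of one alternative, assuming you have stored the two values in environment variables. The names `VISION_KEY` and `VISION_ENDPOINT` and the helper `load_vision_settings` are hypothetical choices for this example, not names required by the service.

```python
import os


def load_vision_settings(env=os.environ):
    """Return (key, endpoint) read from environment variables.

    VISION_KEY and VISION_ENDPOINT are hypothetical names chosen for
    this sketch; use whatever names you set in your own environment.
    """
    key = env.get("VISION_KEY")
    endpoint = env.get("VISION_ENDPOINT")
    if not key or not endpoint:
        # Fail early with a clear message instead of a later authentication error
        raise RuntimeError("Set VISION_KEY and VISION_ENDPOINT before running the notebook")
    return key, endpoint
```

You could then write `key, endpoint = load_vision_settings()` in place of the two assignments.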

## Authenticate the client

In [None]:
computervision_client = ComputerVisionClient(endpoint, CognitiveServicesCredentials(key))

## Extract handwritten text

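First download the images used in the following examples from my [GitHub repository](https://github.com/sfoteini/azure-computer-vision) and place them in an `images/` folder next to the notebook, because the function below reads each file from that folder.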

In [None]:
def read_handwritten_text(image_name):
    # Path of the local image file
    image_path = "images/" + image_name

    # Load the image so it can be displayed later
    img = Image.open(image_path)

    # Call the API; the with block closes the file once the upload is done
    with open(image_path, "rb") as image:
        read_response = computervision_client.read_in_stream(image, raw=True)

    # Get the operation location (URL with an ID at the end)
    read_operation_location = read_response.headers["Operation-Location"]

    # Grab the ID from the URL
    operation_id = read_operation_location.split("/")[-1]

    # Retrieve the results 
    while True:
        read_result = computervision_client.get_read_result(operation_id)
        if read_result.status not in ['notStarted', 'running']:
            break
        time.sleep(1)

    # Create figure and axes
    fig, ax = plt.subplots()

    # Display the image
    ax.imshow(img)

    # Print the detected text and bounding boxes
    if read_result.status == OperationStatusCodes.succeeded:
        for text_result in read_result.analyze_result.read_results:
            for line in text_result.lines:
                # Print line
                print(line.text)

                # line.bounding_box is a flat list of 8 numbers: 4 pairs of (x, y) coordinates
                # Reshape it into a 4x2 array, one row per corner
                box_coordinates = np.array(line.bounding_box).reshape(4, 2)
                
                # Create a Polygon patch from the four corners
                box = patches.Polygon(box_coordinates, closed=True, linewidth=2, edgecolor='r', facecolor='none')

                # Add the patch to the Axes
                ax.add_patch(box)

                # Print words in line with confidence score
                for word in line.words:
                    print(f"   * {word.text}: {word.confidence * 100:.2f}%")
                    
                    # Remove the triple quotes around the following lines to display the bounding box of each word
                    '''
                    xy1 = [word.bounding_box[0], word.bounding_box[1]]
                    xy2 = [word.bounding_box[2], word.bounding_box[3]]
                    xy3 = [word.bounding_box[4], word.bounding_box[5]]
                    xy4 = [word.bounding_box[6], word.bounding_box[7]]
                    box_coordinates = np.array([xy1, xy2, xy3, xy4])
                
                    # Create a Polygon patch from the four corners
                    box = patches.Polygon(box_coordinates, closed=True, linewidth=1, edgecolor='c', facecolor='none')

                    # Add the patch to the Axes
                    ax.add_patch(box)
                    '''
    plt.show()
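
The polling loop in the function above keeps waiting for as long as the service reports `notStarted` or `running`. Here is a minimal sketch of a bounded variant, written against a generic callable so it can be tried without an Azure resource. The helpers `parse_operation_id` and `wait_for_result`, and the default timeout and interval, are hypothetical choices for this example and not part of the SDK.

```python
import time


def parse_operation_id(operation_location):
    """Return the last path segment of the Operation-Location URL."""
    return operation_location.rstrip("/").split("/")[-1]


def wait_for_result(get_result, timeout=30.0, interval=1.0,
                    pending=("notStarted", "running")):
    """Call get_result() until its status leaves the pending states.

    get_result is any zero-argument callable that returns an object
    with a .status attribute. Raises TimeoutError if the status is
    still pending once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = get_result()
        if result.status not in pending:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Read operation still pending after {timeout} seconds")
        time.sleep(interval)
```

Inside `read_handwritten_text`, this could replace the `while True` loop with `read_result = wait_for_result(lambda: computervision_client.get_read_result(operation_id))`.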

In [None]:
read_handwritten_text("notes1.jpg")

In [None]:
read_handwritten_text("notes2.jpg")

In [None]:
read_handwritten_text("notes3.jpg")

In [None]:
read_handwritten_text("notes4.jpg")
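
The function above only prints the recognized text. To keep it as a digital note, you could collect the lines and write them to a file. Here is a minimal sketch, assuming the same `read_results`, `lines` and `text` attributes used above. The helpers `collect_text` and `save_note` and the output file name are hypothetical.

```python
from pathlib import Path


def collect_text(read_results):
    """Join the text of every recognized line, one line per row."""
    return "\n".join(line.text for page in read_results for line in page.lines)


def save_note(text, path):
    """Write the recognized text to a UTF-8 file and return its path."""
    out = Path(path)
    out.write_text(text + "\n", encoding="utf-8")
    return out
```

For example, after the status check inside `read_handwritten_text`, you could call `save_note(collect_text(read_result.analyze_result.read_results), "notes1.txt")`.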