Quickstart: Extract printed text (OCR) using the Computer Vision REST API and Python

In this quickstart, you will extract printed text with optical character recognition (OCR) from an image using the Computer Vision REST API. With the OCR method, you can detect printed text in an image and extract recognized characters into a machine-usable character stream.

You can run this quickstart in a step-by-step fashion using a Jupyter Notebook.

Prerequisites

An Azure subscription - Create one for free
You must have Python installed if you want to run the sample locally.
Once you have your Azure subscription, create a Computer Vision resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
- You will need the key and endpoint from the resource you create to connect your application to the Computer Vision service. You'll paste your key and endpoint into the code below later in the quickstart.
- You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.
Create environment variables for the key and endpoint URL, named COMPUTER_VISION_SUBSCRIPTION_KEY and COMPUTER_VISION_ENDPOINT, respectively.
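
As a minimal sketch of how the two environment variables above could be consumed, the snippet below reads them and builds the Read request URL used later in this quickstart. The helper name load_vision_settings and the example values are hypothetical and not part of the original sample, which uses pasted placeholders instead.

```python
import os


def load_vision_settings(env=os.environ):
    """Return (key, read_url) built from the two environment variables.

    Hypothetical helper for illustration; raises KeyError if a variable is missing.
    """
    key = env["COMPUTER_VISION_SUBSCRIPTION_KEY"]
    endpoint = env["COMPUTER_VISION_ENDPOINT"]
    # Strip any trailing slash so the joined URL has exactly one separator.
    read_url = endpoint.rstrip("/") + "/vision/v3.2/read/analyze"
    return key, read_url


# Illustrative values only; in practice, call load_vision_settings() with no
# arguments so that the real environment is read.
example_env = {
    "COMPUTER_VISION_SUBSCRIPTION_KEY": "<subscription-key>",
    "COMPUTER_VISION_ENDPOINT": "https://example-resource.cognitiveservices.azure.com/",
}
key, read_url = load_vision_settings(example_env)
print(read_url)
```

Reading the values from the environment keeps the key out of the saved script, which matters if the file is shared or committed to a repository.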

Create and run the sample

To create and run the sample, follow these steps:

Copy the following code into a text editor.
Replace the value of target_image_path with the local path of the image from which you want to extract printed text, and replace <name-of-resource> and <subscription-key> with your resource name and key.
Save the code as a file with a .py extension. For example, get-printed-text.py.
Open a command prompt window.
At the prompt, use the python command to run the sample. For example, python get-printed-text.py.

# Standard library: time is used to pause between polling requests.
import time

# Third-party packages: OpenCV and NumPy handle the image, requests calls the REST API.
import cv2
import numpy as np
import requests

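target_image_path = "<local-path-to-image>"

# Read the image file as raw bytes and close the file handle afterwards.
with open(target_image_path, 'rb') as image_file:
    data = image_file.read()

# Decode the image, then re-encode it as JPEG bytes for upload.
img = cv2.imdecode(np.frombuffer(data, dtype=np.uint8), cv2.IMREAD_COLOR)
crop_bytes = bytes(cv2.imencode('.jpg', img)[1])

# Make a call to the text_recognition_url.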
text_recognition_url = "https://<name-of-resource>.cognitiveservices.azure.com/vision/v3.2/read/analyze"
response = requests.post(
    url=text_recognition_url, 
    data=crop_bytes, 
    headers={
        'Ocp-Apim-Subscription-Key': "<subscription-key>", 
        'Content-Type': 'application/octet-stream'})

response.raise_for_status()
operation_url = response.headers["Operation-Location"]

# The recognized text isn't immediately available, so poll until the operation finishes.
analysis = {}
while True:
    response_final = requests.get(
        operation_url,
        headers={'Ocp-Apim-Subscription-Key': "<subscription-key>"})
    response_final.raise_for_status()
    analysis = response_final.json()
    # The v3.2 Read API reports lowercase status values such as 'succeeded' and 'failed'.
    if analysis.get('status') in ('succeeded', 'failed'):
        break
    time.sleep(1)

if analysis['status'] == 'failed':
    raise RuntimeError("The Read operation failed.")

# Print each recognized line of text from the first page.
result = analysis['analyzeResult']['readResults'][0]['lines']

for line in result:
    print(line['text'])
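
The sample above reads only the first entry of readResults, which covers a single image. As a rough sketch for multi-page input, the hypothetical helper collect_lines below gathers the text of every line on every page. The sample dictionary is hand-written to mirror the analyzeResult, readResults, and lines keys used above; its values are illustrative, not real service output.

```python
def collect_lines(analysis):
    """Return the text of every recognized line, across all pages, in order."""
    pages = analysis.get("analyzeResult", {}).get("readResults", [])
    return [line["text"] for page in pages for line in page.get("lines", [])]


# Hand-written sample shaped like a Read result; values are illustrative only.
sample = {
    "status": "succeeded",
    "analyzeResult": {
        "readResults": [
            {"page": 1, "lines": [{"text": "Hello"}, {"text": "world"}]},
            {"page": 2, "lines": [{"text": "Second page"}]},
        ]
    },
}

print("\n".join(collect_lines(sample)))
```

Using .get with empty defaults means a response without results yields an empty list rather than a KeyError, so the caller can decide how to report that case.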

Next steps

Next, explore a Python application that uses Computer Vision to perform optical character recognition (OCR); create smart-cropped thumbnails; and detect, categorize, tag, and describe visual features in images.

Computer Vision API Python Tutorial
To rapidly experiment with the Computer Vision API, try the Open API testing console.
