-
Notifications
You must be signed in to change notification settings - Fork 517
/
ComputerVisionQuickstart.py
135 lines (118 loc) · 4.97 KB
/
ComputerVisionQuickstart.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
'''
Computer Vision Quickstart for Microsoft Azure Cognitive Services.
Uses local and remote images in each example.
Prerequisites:
- Install the Computer Vision SDK:
pip install --upgrade azure-cognitiveservices-vision-computervision
- Install PIL:
pip install --upgrade pillow
- Create folder and collect images:
Create a folder called "images" in the same folder as this script.
Go to this website to download images:
https://github.com/Azure-Samples/cognitive-services-sample-data-files/tree/master/ComputerVision/Images
Add the following 7 images (or use your own) to your "images" folder:
faces.jpg, gray-shirt-logo.jpg, handwritten_text.jpg, landmark.jpg,
objects.jpg, printed_text.jpg and type-image.jpg
Run the entire file to demonstrate the following examples:
- OCR: Read File using the Read API, extract text - remote
- OCR: Read File using the Read API, extract text - local
References:
- SDK: https://docs.microsoft.com/en-us/python/api/azure-cognitiveservices-vision-computervision/azure.cognitiveservices.vision.computervision?view=azure-python
- Documentaion: https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/index
- API: https://westus.dev.cognitive.microsoft.com/docs/services/computer-vision-v3-2/operations/5d986960601faab4bf452005
'''
# <snippet_imports_and_vars>
# <snippet_imports>
from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from azure.cognitiveservices.vision.computervision.models import OperationStatusCodes
from azure.cognitiveservices.vision.computervision.models import VisualFeatureTypes
from msrest.authentication import CognitiveServicesCredentials
from array import array
import os
from PIL import Image
import sys
import time
# </snippet_imports>
'''
Authenticate
Authenticates your credentials and creates a client.
'''
# <snippet_vars>
subscription_key = "PASTE_YOUR_COMPUTER_VISION_KEY_HERE"
endpoint = "PASTE_YOUR_COMPUTER_VISION_ENDPOINT_HERE"
# </snippet_vars>
# </snippet_imports_and_vars>
# <snippet_client>
computervision_client = ComputerVisionClient(endpoint, CognitiveServicesCredentials(subscription_key))
# </snippet_client>
'''
END - Authenticate
'''
# <snippet_read_call>
'''
OCR: Read File using the Read API, extract text - remote
This example will extract text in an image, then print results, line by line.
This API call can also extract handwriting style text (not shown).
'''
print("===== Read File - remote =====")
# Get an image with text
read_image_url = "https://learn.microsoft.com/azure/ai-services/computer-vision/media/quickstarts/presentation.png"
# Call API with URL and raw response (allows you to get the operation location)
read_response = computervision_client.read(read_image_url, raw=True)
# </snippet_read_call>
# <snippet_read_response>
# Get the operation location (URL with an ID at the end) from the response
read_operation_location = read_response.headers["Operation-Location"]
# Grab the ID from the URL
operation_id = read_operation_location.split("/")[-1]
# Call the "GET" API and wait for it to retrieve the results
while True:
read_result = computervision_client.get_read_result(operation_id)
if read_result.status not in ['notStarted', 'running']:
break
time.sleep(1)
# Print the detected text, line by line
if read_result.status == OperationStatusCodes.succeeded:
for text_result in read_result.analyze_result.read_results:
for line in text_result.lines:
print(line.text)
print(line.bounding_box)
print()
# </snippet_read_response>
'''
END - Read File - remote
'''
'''
OCR: Read File using the Read API, extract text - local
This example extracts text from a local image, then prints results.
This API call can also recognize remote image text (shown in next example, Read File - remote).
'''
print("===== Read File - local =====")
# Get image path
read_image_path = os.path.join (images_folder, "printed_text.jpg")
# Open the image
read_image = open(read_image_path, "rb")
# Call API with image and raw response (allows you to get the operation location)
read_response = computervision_client.read_in_stream(read_image, raw=True)
# Get the operation location (URL with ID as last appendage)
read_operation_location = read_response.headers["Operation-Location"]
# Take the ID off and use to get results
operation_id = read_operation_location.split("/")[-1]
# Call the "GET" API and wait for the retrieval of the results
while True:
read_result = computervision_client.get_read_result(operation_id)
if read_result.status.lower () not in ['notstarted', 'running']:
break
print ('Waiting for result...')
time.sleep(10)
# Print results, line by line
if read_result.status == OperationStatusCodes.succeeded:
for text_result in read_result.analyze_result.read_results:
for line in text_result.lines:
print(line.text)
print(line.bounding_box)
print()
'''
END - Read File - local
'''
print("End of Computer Vision quickstart.")