### What is computer vision?
Computer vision (CV) is a field of artificial intelligence concerned with the analysis of images and videos. It comprises a set of methods that give a computer the ability to "see" and to extract information from what it sees.

<img src='https://miro.medium.com/max/875/1*uAeANQIOQPqWZnnuH-VEyw.jpeg' />
<img src='https://miro.medium.com/max/625/1*GcI7G-JLAQiEoCON7xFbhg.gif' />
<img src=https://miro.medium.com/max/495/1*uoWYsCV5vBU8SHFPAPao-w.gif width=600>
<img src='https://miro.medium.com/max/1250/1*_vGloND6yyxFeFH5UyCDVg.png'>

### **What Azure Computer Vision offers**
**Text extraction (OCR):** Extract printed and handwritten text from images and documents with mixed languages and writing styles.

**Image understanding:** Pull from a rich ontology of more than 10,000 concepts and objects to generate value from your visual assets.

**Spatial analysis:** Analyze how people move in a space in real time for occupancy count, social distancing, and face mask detection.

**Flexible deployment:** Run Computer Vision in the cloud or on the edge, in containers.

### Calling the API using the Python SDK

#### Install the SDK and import

In [None]:
!pip install --upgrade azure-cognitiveservices-vision-computervision
!pip install pillow

In [None]:
# Endpoint URL and subscription key of your Computer Vision resource
endpoint = ''
subscription_key = ''
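
Pasting the key directly into a notebook makes it easy to leak when the notebook is shared. A minimal sketch of one alternative, assuming the values are stored in environment variables; the names `VISION_ENDPOINT` and `VISION_KEY` and the helper `load_vision_settings` are hypothetical and are not required by the SDK:

```python
import os


def load_vision_settings(env=None):
    """Return (endpoint, subscription_key) read from environment variables.

    VISION_ENDPOINT and VISION_KEY are hypothetical names chosen for this
    sketch; use whatever names your own setup defines.
    """
    env = os.environ if env is None else env
    endpoint = env.get("VISION_ENDPOINT", "").strip()
    subscription_key = env.get("VISION_KEY", "").strip()
    if not endpoint or not subscription_key:
        raise RuntimeError("Set VISION_ENDPOINT and VISION_KEY before running the notebook.")
    return endpoint, subscription_key
```

With both variables set, `endpoint, subscription_key = load_vision_settings()` could replace the two empty strings above.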

In [None]:
from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from azure.cognitiveservices.vision.computervision.models import OperationStatusCodes
from azure.cognitiveservices.vision.computervision.models import VisualFeatureTypes
from msrest.authentication import CognitiveServicesCredentials

from array import array
import os
from PIL import Image
import sys
import time


#### Authorization

In [None]:
# Authenticate with the subscription key defined above
computervision_client = ComputerVisionClient(endpoint, CognitiveServicesCredentials(subscription_key))

In [None]:
# Test images
people_in_helmets = 'https://www.spletnik.ru/img/__post/f5/f53cd4ebe97605812b89db588bbe46b3_963.jpg'
couple = 'https://st2.depositphotos.com/1075946/7273/i/600/depositphotos_72738747-stock-photo-couple-relaxing-on-sofa.jpg'
street = 'https://media-cdn.tripadvisor.com/media/photo-s/16/1f/60/57/the-street.jpg'
remote_image_handw_text_url = "https://raw.githubusercontent.com/MicrosoftDocs/azure-docs/master/articles/cognitive-services/Computer-vision/Images/readsample.jpg"

#### **Image description**
The goal of image description is to characterize the image as a whole. To do this, the service identifies the key regions of the image and classifies them, then returns captions, each with a confidence score.

In [None]:
description_results = computervision_client.describe_image(people_in_helmets)

# print("Description of remote image: ")
if (len(description_results.captions) == 0):
    print("No description detected.")
else:
    for caption in description_results.captions:
        print("'{}' with confidence {:.2f}%".format(caption.text, caption.confidence * 100))

'a crowd of people' with confidence 50.20%


#### Categorical analysis

In [None]:
remote_image_features = ["categories"]
# Call API with URL and features
categorize_results_remote = computervision_client.analyze_image(people_in_helmets , remote_image_features)

# Print results with confidence score
print("Categories from remote image: ")
if (len(categorize_results_remote.categories) == 0):
    print("No categories detected.")
else:
    for category in categorize_results_remote.categories:
        print("'{}' with confidence {:.2f}%".format(category.name, category.score * 100))

Categories from remote image: 
'others_' with confidence 4.69%
'outdoor_' with confidence 1.56%


#### Search for tags by image

In [None]:
tags_result_remote = computervision_client.tag_image(people_in_helmets)
# Print results with confidence score
print("Tags in the remote image: ")
if (len(tags_result_remote.tags) == 0):
    print("No tags detected.")
else:
    for tag in tags_result_remote.tags:
        print("'{}' with confidence {:.2f}%".format(tag.name, tag.confidence * 100))

Tags in the remote image: 
'helmet' with confidence 97.58%
'person' with confidence 90.37%
'personal protective equipment' with confidence 77.68%
'goggles' with confidence 69.20%
'clothing' with confidence 59.90%
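
The tag list includes entries with fairly low confidence, so in practice it is often filtered before use. A minimal sketch, assuming a cutoff of 0.75 (an arbitrary value chosen for illustration); `Tag` is a stand-in for the SDK's tag objects and `confident_tags` is a hypothetical helper:

```python
from collections import namedtuple

# Stand-in for the SDK's tag objects; only the two attributes used above are modeled.
Tag = namedtuple("Tag", ["name", "confidence"])


def confident_tags(tags, min_confidence=0.75):
    """Return the names of tags with confidence >= min_confidence, best first."""
    kept = [t for t in tags if t.confidence >= min_confidence]
    kept.sort(key=lambda t: t.confidence, reverse=True)
    return [t.name for t in kept]


# Values copied from the output above.
sample = [
    Tag("helmet", 0.9758),
    Tag("person", 0.9037),
    Tag("personal protective equipment", 0.7768),
    Tag("goggles", 0.6920),
    Tag("clothing", 0.5990),
]
print(confident_tags(sample))
```

Because the helper only reads `name` and `confidence`, it should also accept `tags_result_remote.tags` directly.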


#### **Search for objects in the image**
The task is to locate a set of objects in the image. Object detection has not been solved in the general case, so the algorithm cannot classify arbitrary objects. It can, however, recognize a predefined set of object classes with fairly high accuracy.

In [None]:
detect_objects_results_remote = computervision_client.detect_objects(people_in_helmets)

# Print detected objects results with bounding boxes
print("Detecting objects in remote image:")
if len(detect_objects_results_remote.objects) == 0:
    print("No objects detected.")
else:
    for obj in detect_objects_results_remote.objects:
        # Printed in the order x_min, x_max, y_min, y_max
        print("object at location {}, {}, {}, {}".format(
            obj.rectangle.x, obj.rectangle.x + obj.rectangle.w,
            obj.rectangle.y, obj.rectangle.y + obj.rectangle.h))

Detecting objects in remote image:
object at location 556, 1035, 51, 591
object at location 282, 1088, 33, 1080
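
Note that the cell prints each box as x, x + w, y, y + h, while drawing and cropping tools usually expect (left, top, right, bottom); Pillow's `Image.crop`, for example, takes its box in that order. A minimal sketch of the conversion, where `Rect` is a stand-in for the SDK's rectangle object and `to_corners` is a hypothetical helper:

```python
from collections import namedtuple

# Stand-in for the SDK's rectangle; it models only the x, y, w, h attributes used above.
Rect = namedtuple("Rect", ["x", "y", "w", "h"])


def to_corners(rect):
    """Convert an (x, y, w, h) rectangle to (left, top, right, bottom)."""
    return (rect.x, rect.y, rect.x + rect.w, rect.y + rect.h)


# First object from the output above: x=556, x+w=1035, y=51, y+h=591.
first = Rect(x=556, y=51, w=1035 - 556, h=591 - 51)
print(to_corners(first))  # (556, 51, 1035, 591)
```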


#### Search for faces in an image

In [None]:
remote_image_features = ["faces"]
detect_faces_results_remote = computervision_client.analyze_image(couple, remote_image_features)


print("Faces in the remote image: ")
if (len(detect_faces_results_remote.faces) == 0):
    print("No faces detected.")
else:
    for face in detect_faces_results_remote.faces:
        print("'{}' of age {} at location {}, {}, {}, {}".format(face.gender, face.age, \
        face.face_rectangle.left, face.face_rectangle.top, \
        face.face_rectangle.left + face.face_rectangle.width, \
        face.face_rectangle.top + face.face_rectangle.height))

Faces in the remote image: 
'Male' of age 50 at location 230, 122, 330, 222
'Female' of age 32 at location 352, 166, 442, 256


#### Image type detection

In [None]:
remote_image_features = [VisualFeatureTypes.image_type]
# Call API with URL and features
detect_type_results_remote = computervision_client.analyze_image(street, remote_image_features)

# Print the clip-art classification (clip_art_type ranges from 0 to 3)
print("Type of remote image:")
if detect_type_results_remote.image_type.clip_art_type == 0:
    print("Image is not clip art.")
elif detect_type_results_remote.image_type.clip_art_type == 1:
    print("Image is ambiguously clip art.")
elif detect_type_results_remote.image_type.clip_art_type == 2:
    print("Image is normal clip art.")
else:
    print("Image is good clip art.")

if detect_type_results_remote.image_type.line_drawing_type == 0:
    print("Image is not a line drawing.")
else:
    print("Image is a line drawing")

Type of remote image:
Image is not clip art.
Image is not a line drawing.
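
The chain of branches above maps integer codes to messages, which can also be written as a lookup table. A minimal sketch, assuming the clip-art code runs from 0 to 3 as the four messages in the cell imply; `CLIP_ART_LABELS` and `describe_image_type` are hypothetical names introduced here:

```python
# Labels follow the messages printed in the cell above.
CLIP_ART_LABELS = {
    0: "not clip art",
    1: "ambiguously clip art",
    2: "normal clip art",
    3: "good clip art",
}


def describe_image_type(clip_art_type, line_drawing_type):
    """Return two human-readable sentences for the image-type codes."""
    clip_art = CLIP_ART_LABELS.get(
        clip_art_type, "an unknown clip-art code ({})".format(clip_art_type)
    )
    line_drawing = "not a line drawing" if line_drawing_type == 0 else "a line drawing"
    return "Image is {}.".format(clip_art), "Image is {}.".format(line_drawing)


for sentence in describe_image_type(0, 0):
    print(sentence)
```

Unlike a trailing `else`, the table does not silently report an unexpected code as "good clip art".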


#### **Text recognition**
Text recognition works in two stages. First, detection algorithms locate the regions of the image that contain text. Then the text in those regions is recognized, for example using segmentation algorithms.

In [None]:
# read() is asynchronous: it only starts the operation, and the result is fetched later by its ID
recognize_handw_results = computervision_client.read(remote_image_handw_text_url, raw=True)

In [None]:
operation_location_remote = recognize_handw_results.headers["Operation-Location"]
operation_id = operation_location_remote.split("/")[-1]

while True:
    get_handw_text_results = computervision_client.get_read_result(operation_id)
    if get_handw_text_results.status not in ['notStarted', 'running']:
        break
    time.sleep(1)

if get_handw_text_results.status == OperationStatusCodes.succeeded:
    for text_result in get_handw_text_results.analyze_result.read_results:
        for line in text_result.lines:
            print(line.text)
            print(line.bounding_box)
print()

The quick brown fox jumps
[38.0, 650.0, 2572.0, 699.0, 2570.0, 854.0, 37.0, 815.0]
over
[184.0, 1053.0, 508.0, 1044.0, 510.0, 1123.0, 184.0, 1128.0]
the lazy dog!
[639.0, 1011.0, 1976.0, 1026.0, 1974.0, 1158.0, 637.0, 1141.0]
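
Each `bounding_box` above is a flat list of eight numbers. A minimal sketch of how to work with it, assuming the list holds the four corners of a quadrilateral as consecutive (x, y) pairs; `to_points` and `enclosing_rect` are hypothetical helpers:

```python
def to_points(bounding_box):
    """Group a flat [x1, y1, ..., x4, y4] list into four (x, y) corner points."""
    if len(bounding_box) != 8:
        raise ValueError("expected 8 numbers, got {}".format(len(bounding_box)))
    return list(zip(bounding_box[0::2], bounding_box[1::2]))


def enclosing_rect(bounding_box):
    """Return the axis-aligned (left, top, right, bottom) box around the quadrilateral."""
    points = to_points(bounding_box)
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


# Bounding box of the line "over" from the output above.
over_box = [184.0, 1053.0, 508.0, 1044.0, 510.0, 1123.0, 184.0, 1128.0]
print(to_points(over_box))
print(enclosing_rect(over_box))  # (184.0, 1044.0, 510.0, 1128.0)
```

The enclosing rectangle is convenient for cropping a line out of the image, while the four points preserve the slant of handwritten text.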



**Microsoft Azure** lets you use computer vision in your projects without specialized knowledge of the field. As the examples above show, image description, tagging, object and face detection, and text recognition each take only a few lines of code with the Python SDK.