# Azure Cognitive Services

Azure Cognitive Services are APIs, SDKs, and services that help developers build intelligent applications without direct AI or data science skills or knowledge. They let developers add cognitive features to their applications so that those applications can see, hear, speak, understand, and even begin to reason. The catalog of services within Azure Cognitive Services is organized into five main pillars: Vision, Speech, Language, Web Search, and Decision.

In this module, we will look at how to use the Computer Vision APIs of Azure Cognitive Services to develop your own application with minimal effort.

## Azure Cognitive Resource

To interact with the Cognitive Services APIs, we need a cognitive resource. The resource provides the keys and endpoint that we use to authenticate our API requests. Let us see how to create a cognitive resource using the web interface.

![](images/cognitive/step_1.png)

![](images/cognitive/step_2.png)

![](images/cognitive/step_3.png)

![](images/cognitive/step_4.png)

![](images/cognitive/step_5.png)

![](images/cognitive/step_6.png)

![](images/cognitive/step_7.png)

![](images/cognitive/step_8.png)

Now, we have our cognitive resource ready to use.
<br>

![](images/cognitive/step_9.png)

The "Quick Start" guide gives a good overview of what we can do with the cognitive resource and the kinds of applications that can be built with Cognitive Services.
<br>

![](images/cognitive/step_10.png)

Before we start using these APIs, we need the "Key" and "Endpoint" of our cognitive resource so that our API calls can point to it. Both can be found as shown in the image below. Each cognitive resource has a unique key and endpoint, and you should take extra precautions to keep them safe. You can also use "Regeneratekey1" and "Regeneratekey2" to reset your keys.

<br>

![](images/cognitive/step_11.png)
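One small precaution for keeping the key safe is to avoid printing it in full in notebook output or logs. The sketch below is a minimal illustration of that idea; `mask_key` is a hypothetical helper (not part of any Azure library), and the key shown is a dummy placeholder.

```python
def mask_key(key, visible=4):
    """Return the key with all but the last `visible` characters replaced by '*'."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]


# Dummy placeholder, not a real subscription key.
print(mask_key("0123456789abcdef"))
```

Printing only the tail of the key is usually enough to tell which of the two keys is in use without exposing it.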

### Install the Azure Cognitive Services Computer Vision library

Azure Cognitive Services provides support for programming languages such as Java, C#, Go, JavaScript, and Python. We will use Python for the illustrations, so we first install the Python [Computer Vision package](https://pypi.org/project/azure-cognitiveservices-vision-computervision/).

We can install the dependency by running the following command: `pip install azure-cognitiveservices-vision-computervision==0.5.0`. This example was tested with version 0.5.0 at the time of writing.

In [4]:
!pip install azure-cognitiveservices-vision-computervision==0.5.0



### Import the required dependencies

Now that we have installed the required package, we can start including the dependent Azure modules.

In [5]:
from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from azure.cognitiveservices.vision.computervision.models import TextOperationStatusCodes
from azure.cognitiveservices.vision.computervision.models import TextRecognitionMode
from azure.cognitiveservices.vision.computervision.models import VisualFeatureTypes
from msrest.authentication import CognitiveServicesCredentials

#from array import array
import os
from PIL import Image
import time

In [6]:
# Add your Computer Vision subscription key to your environment variables.
if 'COMPUTER_VISION_SUBSCRIPTION_KEY' in os.environ:
    subscription_key = os.environ['COMPUTER_VISION_SUBSCRIPTION_KEY']
else:
    print("ERROR:Set the COMPUTER_VISION_SUBSCRIPTION_KEY environment variable.")

# Add your Computer Vision endpoint to your environment variables.
if 'COMPUTER_VISION_ENDPOINT' in os.environ:
    endpoint = os.environ['COMPUTER_VISION_ENDPOINT']
else:
    print("ERROR:Set the COMPUTER_VISION_ENDPOINT environment variable.")

ERROR:Set the COMPUTER_VISION_SUBSCRIPTION_KEY environment variable.
ERROR:Set the COMPUTER_VISION_ENDPOINT environment variable.


### Export the environment variables

To use Cognitive Services from our code, we link to our cognitive resource by exporting environment variables. For the Computer Vision service, we need to set the following two environment variables.

* COMPUTER_VISION_SUBSCRIPTION_KEY
* COMPUTER_VISION_ENDPOINT

Recall that we noted the "Key" and "Endpoint" in the final step of creating the cognitive resource; we use those values here. Since we are working in a notebook environment, we will use the Jupyter notebook way of exporting variables, as shown below.

* `%env COMPUTER_VISION_SUBSCRIPTION_KEY Your_Key_value`
* `%env COMPUTER_VISION_ENDPOINT your_end_point`

In [13]:
%env COMPUTER_VISION_SUBSCRIPTION_KEY 5f906a706cbf49bca5de9775901e6744

env: COMPUTER_VISION_SUBSCRIPTION_KEY=5f906a706cbf49bca5de9775901e6744


In [14]:
%env COMPUTER_VISION_ENDPOINT https://westeurope.api.cognitive.microsoft.com/

env: COMPUTER_VISION_ENDPOINT=https://westeurope.api.cognitive.microsoft.com/


In [15]:
# Add your Computer Vision subscription key to your environment variables.
if 'COMPUTER_VISION_SUBSCRIPTION_KEY' in os.environ:
    subscription_key = os.environ['COMPUTER_VISION_SUBSCRIPTION_KEY']
else:
    print("ERROR:Set the COMPUTER_VISION_SUBSCRIPTION_KEY environment variable.")

# Add your Computer Vision endpoint to your environment variables.
if 'COMPUTER_VISION_ENDPOINT' in os.environ:
    endpoint = os.environ['COMPUTER_VISION_ENDPOINT']
else:
    print("ERROR:Set the COMPUTER_VISION_ENDPOINT environment variable.")

Now that the environment variables are set, the same cell runs without printing any errors.
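Note that the check above only prints a message when a variable is missing, so `subscription_key` or `endpoint` would stay undefined and a later cell would fail with a less obvious error. A minimal sketch of a fail-fast alternative follows; `require_env` is a hypothetical helper name, not part of the Azure SDK.

```python
import os


def require_env(name):
    """Return the value of environment variable `name`, or raise if it is unset or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Set the {name} environment variable.")
    return value
```

With this helper, the two lookups could be written as `subscription_key = require_env("COMPUTER_VISION_SUBSCRIPTION_KEY")` and `endpoint = require_env("COMPUTER_VISION_ENDPOINT")`.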

### Create a client to interact with the Cognitive Services server

We need to create a client that will interact with the Cognitive Services server. We pass in the key and endpoint that we read from the environment variables.

Exporting the environment variables as in the previous step is not mandatory for this example; we could pass the key and endpoint directly. However, when you deploy a complete solution, it is important to secure your key and endpoint information. A common practice is to keep them in environment variables rather than in the code.

In [16]:
computervision_client = ComputerVisionClient(endpoint, CognitiveServicesCredentials(subscription_key))

## Image Description API

Let us use the APIs provided by Cognitive Services to get a description of a given image. That is, we provide an image URL from the internet as input, and the service responds with a description of the image together with a confidence level. Let us see it in action.

In [17]:
#remote_image_url = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-sample-data-files/master/ComputerVision/Images/landmark.jpg"
#remote_image_url = "https://cdn.getyourguide.com/img/tour_img-1543839-148.jpg"
remote_image_url = "https://i.pinimg.com/originals/2b/87/ff/2b87ffcbf89d763a01b4b05e2e447dd8.jpg"

Here is our input image for testing purposes. 
<br>

![](https://i.pinimg.com/originals/2b/87/ff/2b87ffcbf89d763a01b4b05e2e447dd8.jpg)

In [18]:
# Call API
description_results = computervision_client.describe_image(remote_image_url)

# Get the captions (descriptions) from the response, with confidence level
print("Description of remote image: ")
if not description_results.captions:
    print("No description detected.")
else:
    for caption in description_results.captions:
        print("'{}' with confidence {:.2f}%".format(caption.text, caption.confidence * 100))

Description of remote image: 
'a group of people standing in front of a crowd posing for the camera' with confidence 94.21%


#### Output of the Image descriptor

**'a group of people standing in front of a crowd posing for the camera' with confidence 94.21%**

Indeed! House of Stark, waiting for the arrival of King "Robert Baratheon" and posing for the camera, of course! :)

This is quite fun. Without knowing anything about image processing, computer vision, or artificial intelligence, we were able to build an application using Azure Cognitive Services.
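In an application, we would typically act only on captions the service is reasonably sure about. The sketch below shows one way to pick the most confident caption above a threshold. It assumes only that each caption exposes `text` and `confidence`, as in the cell above; the `Caption` dataclass is a stand-in for the SDK's caption objects, `best_caption` and the 0.5 threshold are hypothetical, and the second sample caption is made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Caption:
    """Stand-in for a caption object: only `text` and `confidence` are assumed."""
    text: str
    confidence: float


def best_caption(captions, min_confidence=0.5):
    """Return the highest-confidence caption at or above the threshold, else None."""
    candidates = [c for c in captions if c.confidence >= min_confidence]
    return max(candidates, key=lambda c: c.confidence, default=None)


sample = [
    # Caption and confidence taken from the output above.
    Caption("a group of people standing in front of a crowd posing for the camera", 0.9421),
    # Made-up low-confidence caption, for illustration only.
    Caption("a group of people", 0.41),
]
top = best_caption(sample)
print(top.text if top else "No confident description.")
```

With the real client, `best_caption(description_results.captions)` should work the same way, provided the caption objects carry those two attributes.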

## Printed text recognizer

Let us try out one more cognitive service API. The printed text recognizer recognizes printed text in a given image. 
This example will extract printed text in an image, then print results, line by line.
This API call can also recognize handwriting.

In [19]:
# Get an image with printed text
remote_image_printed_text_url = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-sample-data-files/master/ComputerVision/Images/printed_text.jpg"

Let us have a look at the sample image we are giving as input.
<br>

![](https://raw.githubusercontent.com/Azure-Samples/cognitive-services-sample-data-files/master/ComputerVision/Images/printed_text.jpg)

Extracting the printed text from an image is a two-step process.
* Step 1: Submit an input image with printed text. The API responds with an operation location, a URL that ends in the operation ID under which the extraction result is stored.
* Step 2: Take the operation ID obtained in Step 1 and use it to retrieve the extracted text from the service.
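The two steps above form a submit-then-poll pattern: the result is not ready immediately, so the client has to ask repeatedly until the operation finishes. The sketch below illustrates the polling half with a timeout, so that a stuck operation cannot loop forever. It uses a stand-in status source rather than the real service; `poll_until_done` is a hypothetical helper, and the `"Succeeded"` status string is an assumption.

```python
import time


def poll_until_done(fetch_status, interval=1.0, timeout=30.0,
                    pending=("NotStarted", "Running")):
    """Call fetch_status() until it returns a non-pending status or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status not in pending:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Operation still {status!r} after {timeout} seconds")
        time.sleep(interval)


# Stand-in for the service: reports two pending states, then a final one.
_responses = iter(["NotStarted", "Running", "Succeeded"])
final_status = poll_until_done(lambda: next(_responses), interval=0.01)
print(final_status)
```

With the real client, `fetch_status` could be a small function that calls the "GET" API for the operation ID and returns the `status` field of the result.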

#### Step 1:

In [20]:
# Call API with URL and raw response (allows you to get the operation location)
recognize_printed_results = computervision_client.batch_read_file(remote_image_printed_text_url, raw=True)

In [21]:
recognize_printed_results.headers

{'Operation-Location': 'https://westeurope.api.cognitive.microsoft.com/vision/v2.1/read/operations/40704246-91e4-452e-aebf-3ee5ed501a44'}

#### Step 2:

In [22]:
# Get the operation location (URL with an ID at the end) from the response
operation_location_remote = recognize_printed_results.headers["Operation-Location"]
# Grab the ID from the URL
operation_id = operation_location_remote.split("/")[-1]

# Call the "GET" API and wait for it to retrieve the results 
while True:
    get_printed_text_results = computervision_client.get_read_operation_result(operation_id)
    if get_printed_text_results.status not in ['NotStarted', 'Running']:
        break
    time.sleep(1)

# Print the detected text, line by line
if get_printed_text_results.status == TextOperationStatusCodes.succeeded:
    for text_result in get_printed_text_results.recognition_results:
        for line in text_result.lines:
            print(line.text)
            print(line.bounding_box)
print()

Nutrition Facts Amount Per Serving
[144.0, 0.0, 1238.0, 211.0, 1224.0, 280.0, 130.0, 57.0]
Serving size: 1 bar (40g)
[110.0, 58.0, 598.0, 157.0, 587.0, 206.0, 100.0, 108.0]
Serving Per Package: 4
[83.0, 108.0, 548.0, 206.0, 538.0, 256.0, 72.0, 157.0]
Total Fat 13g
[683.0, 213.0, 1000.0, 286.0, 989.0, 332.0, 672.0, 260.0]
Saturated Fat 1.5g
[695.0, 295.0, 1120.0, 394.0, 1108.0, 447.0, 683.0, 347.0]
Amount Per Serving
[29.0, 207.0, 491.0, 309.0, 478.0, 367.0, 16.0, 265.0]
Trans Fat 0g
[668.0, 363.0, 954.0, 435.0, 940.0, 488.0, 655.0, 416.0]
alories 190
[8.0, 293.0, 265.0, 350.0, 254.0, 396.0, 0.0, 339.0]
Cholesterol Omg
[593.0, 424.0, 1007.0, 526.0, 993.0, 580.0, 579.0, 479.0]
ories from Fat 110
[9.0, 377.0, 398.0, 464.0, 388.0, 509.0, 0.0, 421.0]
Sodium 20mg
[561.0, 497.0, 913.0, 588.0, 899.0, 643.0, 547.0, 552.0]
nt Daily Values are based on
[7.0, 476.0, 521.0, 598.0, 511.0, 640.0, 0.0, 518.0]
Vitamin A 50%
[525.0, 597.0, 776.0, 657.0, 766.0, 699.0, 514.0, 640.0]
calorie diet.
[12.0, 5
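Each line of text comes with a bounding box of eight numbers. Assuming these are the (x, y) coordinates of the four corners of the text region, the sketch below converts such a box into an axis-aligned rectangle, which is convenient for cropping or highlighting. `to_rectangle` is a hypothetical helper.

```python
def to_rectangle(bounding_box):
    """Convert [x1, y1, ..., x4, y4] corner coordinates to (left, top, right, bottom)."""
    if len(bounding_box) != 8:
        raise ValueError("expected 8 numbers: four (x, y) corner pairs")
    xs = bounding_box[0::2]  # every even index is an x coordinate
    ys = bounding_box[1::2]  # every odd index is a y coordinate
    return min(xs), min(ys), max(xs), max(ys)


# First bounding box from the output above ("Nutrition Facts Amount Per Serving").
box = [144.0, 0.0, 1238.0, 211.0, 1224.0, 280.0, 130.0, 57.0]
left, top, right, bottom = to_rectangle(box)
print(left, top, right, bottom)
```

The resulting tuple has the same (left, upper, right, lower) order that Pillow's `Image.crop` expects, so it could be used to cut the text region out of a locally loaded copy of the image.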

As we can observe, we were able to extract the printed text from an image. Go ahead and build your own nutrition monitor app! Just scan food packages and monitor the amount of carbohydrates, fat, and sugar you consume.
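As a starting point for such an app, the sketch below turns recognized lines like `Total Fat 13g` into structured values. It is a rough illustration under simple assumptions: a nutrient line is a name followed by a number and a `g` or `mg` unit, and `parse_nutrients` and the pattern are hypothetical. The sample lines are copied from the output above.

```python
import re

# Matches lines such as "Total Fat 13g" or "Sodium 20mg": a name, a number, a unit.
NUTRIENT_PATTERN = re.compile(
    r"^(?P<name>[A-Za-z ]+?)\s+(?P<amount>\d+(?:\.\d+)?)\s*(?P<unit>mg|g)$"
)


def parse_nutrients(lines):
    """Return {nutrient name: (amount, unit)} for the lines that match the pattern."""
    nutrients = {}
    for line in lines:
        match = NUTRIENT_PATTERN.match(line.strip())
        if match:
            nutrients[match["name"]] = (float(match["amount"]), match["unit"])
    return nutrients


ocr_lines = ["Total Fat 13g", "Saturated Fat 1.5g", "Trans Fat 0g",
             "Cholesterol Omg", "Sodium 20mg", "Vitamin A 50%"]
print(parse_nutrients(ocr_lines))
```

The `Cholesterol Omg` line is skipped because the recognizer returned the letter "O" rather than the digit "0"; a real app would need some cleanup of such misreads before parsing.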

We can also pipe the output of our text scanner to Azure Text Analytics or a text translator to build a complete text translation application for different languages!

As we can see, Azure Cognitive Services helps you develop a complete solution without having to worry too much about the internals of AI or ML.