![nyp.jpg](attachment:nyp.jpg)

## Google Cloud Vision API

In this practical, we will explore the Google Cloud Vision API. Sample image files are provided in the data directory, but feel free to try your own images at any time.


Let's start by exploring the [Cloud Vision Demo](https://cloud.google.com/vision).

Go to the link. Under the demo section, upload an image. If you upload <code>nyp_cafe.jpg</code>, you will see the following.

![detection_face_s.png](attachment:detection_face_s.png)

### Todo

> Upload any image and observe the results of the various detections.

### Face Detection

Cloud Vision API supports many types of detection. Let's start with face detection.

We will now connect to the Cloud Vision API to make a request and get a response.

In [1]:
import requests 
import base64
import json
from PIL import Image, ImageDraw
from io import BytesIO
import matplotlib.pyplot as plt
%matplotlib inline


These parameters are required to complete the request.

In [2]:
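# API key used to authenticate the request; replace with your own key
# and avoid sharing notebooks that contain a real key
googleAPIKey = "AIzaSyCsKkffvZCz-crLLcgT7MnMpyxKE1shwbI"
# REST endpoint for image annotation, with the key passed as a query parameter
googleurl = "https://vision.googleapis.com/v1/images:annotate?key=" + googleAPIKey
# the request body is sent as JSON
req_headers = {'Content-Type': 'application/json'}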


In [3]:
# helper function
def get_base64(image_filename):
    with open(image_filename, "rb") as image_file:
        encoded_string = base64.b64encode(image_file.read())
    return encoded_string
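
The helper returns `bytes`, which is why the request cell later calls `.decode('utf-8')` on its output before placing it in the JSON body. The sketch below illustrates this with made-up bytes standing in for the contents of an image file; the variable names are for illustration only.

```python
import base64
import json

# Stand-in for image_file.read(); these bytes are made up for illustration.
raw_bytes = b"\xff\xd8\xff\xe0 not a real image"

encoded = base64.b64encode(raw_bytes)   # bytes, not JSON-serializable
encoded_text = encoded.decode("utf-8")  # str, safe to place in a JSON body

# Building the JSON payload works once the content is a str.
payload = json.dumps({"image": {"content": encoded_text}})

# Decoding the text recovers the original bytes exactly.
round_trip = base64.b64decode(encoded_text)
print(type(encoded).__name__, type(encoded_text).__name__, round_trip == raw_bytes)
```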

This is the image we will work with.

In [6]:
# raw string (r'...') so that backslashes in the Windows path are not treated as escape sequences
img_filename = r'C:\Users\User\Downloads\AIA pics\nyp_cafe.jpg'

plt.imshow(plt.imread(img_filename))
plt.axis('off');

Make an API request. Ensure the request parameters are filled in correctly as required under [Cloud Vision Face Detection](https://cloud.google.com/vision/docs/detecting-faces#vision_face_detection-drest).

In [None]:

data = {
    'requests': 
    [
        {
            'image': { 'content': get_base64(img_filename).decode('utf-8') },
            'features': [{ 'type': 'FACE_DETECTION' },
                         #{ 'type': 'LANDMARK_DETECTION' }
                        ]
        }
    ]
}

# Send the image data to Google for face detection
r = requests.post(url=googleurl, headers=req_headers, json=data)

# Check and display the results
if r.status_code == 200:
    result = r.json()

    print(result)

    # loop through the response to get the parameters needed

else:
    print('Error with status', r.status_code)
    print(r.content)
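
Since the later sections reuse the same request shape with different feature types, it may help to wrap the body in a small helper. The sketch below is one possible way to do this; `build_annotate_request` and its parameters are hypothetical names introduced here, and the base64 string is a placeholder rather than a real image.

```python
import json


def build_annotate_request(image_b64, feature_types, max_results=None):
    """Build a JSON body for images:annotate (hypothetical helper for this practical)."""
    features = []
    for feature_type in feature_types:
        feature = {"type": feature_type}
        # maxResults is optional; only include it when the caller asks for it
        if max_results is not None:
            feature["maxResults"] = max_results
        features.append(feature)
    return {"requests": [{"image": {"content": image_b64}, "features": features}]}


# Placeholder content; in the notebook this would be get_base64(...).decode('utf-8')
body = build_annotate_request(
    "PGJhc2U2ND4=", ["FACE_DETECTION", "LANDMARK_DETECTION"], max_results=5
)
print(json.dumps(body, indent=2))
```

The resulting dictionary can be passed as `json=body` to `requests.post`, in the same way as `data` in the cell above.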
    


**Brief Analysis**

Using the result from the API, we can print out the content to understand it better and dive into the specific parameters. Do check that a key exists before accessing it in the dictionary, to prevent your app from crashing.
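
One way to guard against missing keys is `dict.get` with a default. The sketch below uses made-up response fragments (real responses carry many more fields) and a hypothetical helper, `get_annotations`; it assumes an annotation key may simply be absent when nothing is detected.

```python
# Made-up response fragments for illustration only.
response_with_face = {"responses": [{"faceAnnotations": [{"detectionConfidence": 0.98}]}]}
response_without_face = {"responses": [{}]}


def get_annotations(result, key):
    """Return result['responses'][0][key], or an empty list if anything is missing."""
    # Fall back to a single empty response if 'responses' is missing or empty
    responses = result.get("responses") or [{}]
    return responses[0].get(key, [])


print(len(get_annotations(response_with_face, "faceAnnotations")))
print(len(get_annotations(response_without_face, "faceAnnotations")))
print(len(get_annotations({}, "faceAnnotations")))
```

Returning an empty list keeps later `for` loops working without a separate `if` check.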

In [None]:
# Pretty print JSON response
print(json.dumps(result, indent=4))

Let's analyse the annotations in the response.

In [None]:
# number of faces detected (default to an empty list if the key is missing)
annotations = result['responses'][0].get('faceAnnotations', [])
len(annotations)

In [None]:
for annotation in annotations:
    print(json.dumps(annotation, indent=4))

Print out the confidence, likelihood, and landmark values for each face.

In [None]:
for annotation in annotations:
    print('\n\nEach face')
    
    print('\n*** Confidence ***')
    print('Detection Confidence: %.2f' % (annotation['detectionConfidence'] * 100))
    print('Landmarking Confidence: %.2f' % (annotation['landmarkingConfidence'] * 100))
    
    print('\n*** Likelihood ***')
    print('Joy: ' + annotation['joyLikelihood'])
    print('Sorrow: ' + annotation['sorrowLikelihood'])
    
    print('\n*** Features ***')
    
    # check if key is present before processing
    if 'landmarks' in annotation:
        for features in annotation['landmarks']:
            print(features)
            
            # process each individual feature and print its type and position
            print('\tType: ' + features['type'])
            coordinates = features['position']
            for key, value in coordinates.items():
                print('\t', key, value)
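
The likelihood fields are strings rather than numbers, so comparing them needs an explicit ordering. The sketch below is one possible approach using made-up annotations; `LIKELIHOOD_ORDER` and `is_at_least` are hypothetical names, and ranking `UNKNOWN` lowest is an assumption of this sketch.

```python
# Likelihood values in increasing order; placing UNKNOWN lowest is an assumption here.
LIKELIHOOD_ORDER = [
    "UNKNOWN", "VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY",
]


def is_at_least(likelihood, threshold):
    """Return True if likelihood ranks at or above threshold."""
    return LIKELIHOOD_ORDER.index(likelihood) >= LIKELIHOOD_ORDER.index(threshold)


# Made-up annotations for illustration; real ones contain many more fields.
sample_annotations = [
    {"joyLikelihood": "VERY_LIKELY", "sorrowLikelihood": "VERY_UNLIKELY"},
    {"joyLikelihood": "UNLIKELY", "sorrowLikelihood": "POSSIBLE"},
]

# Indices of faces whose joy likelihood is LIKELY or higher
joyful = [
    i for i, a in enumerate(sample_annotations)
    if is_at_least(a.get("joyLikelihood", "UNKNOWN"), "LIKELY")
]
print(joyful)
```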
            
             
    

Let's plot a bounding polygon of the first face in the image. Note the definitions:
- `boundingPoly`: The bounding polygon around the face.
- `fdBoundingPoly`: This bounding polygon is tighter than the `boundingPoly`, and encloses only the skin part of the face.

These are the vertices of the first face (hence index 0).

In [None]:
annotations[0]['fdBoundingPoly']

Let's draw the bounding box of the face on the image.

In [None]:
# helper function
def drawbox(image, left, top, right, bottom, text):
    draw = ImageDraw.Draw(image)
    draw.rectangle([left, top, right, bottom], outline=(255,255,0,255)) # yellow line
    draw.rectangle([left, top, right, top + 12], fill=(255,255,0,255), outline=(255,255,0,255))
    draw.text((left, top), text, fill=(0,0,0,255)) # black
    

In [None]:
vertices = annotations[0]['fdBoundingPoly']['vertices']
    
image = Image.open(img_filename)

drawbox(image, 
        vertices[0]['x'], vertices[0]['y'], 
        vertices[2]['x'], vertices[2]['y'], 'Face 0')

plt.imshow(image)
plt.axis('off');


Details of the vertices are shown below. You only need the top-left (index 0) and bottom-right (index 2) vertices to plot a bounding box.

![vertices.JPG](attachment:vertices.JPG)
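
As an alternative to relying on the vertex order, the box can be derived from the minimum and maximum coordinates of all four vertices. The sketch below uses made-up vertices; `vertices_to_box` is a hypothetical helper, and treating a missing `x` or `y` as 0 is an assumption of this sketch.

```python
def vertices_to_box(vertices):
    """Return (left, top, right, bottom) from a list of vertex dictionaries."""
    # Assumption: a vertex without 'x' or 'y' is treated as having that coordinate at 0
    xs = [v.get("x", 0) for v in vertices]
    ys = [v.get("y", 0) for v in vertices]
    return min(xs), min(ys), max(xs), max(ys)


# Made-up vertices in the order top-left, top-right, bottom-right, bottom-left
sample_vertices = [
    {"x": 120, "y": 40}, {"x": 260, "y": 40},
    {"x": 260, "y": 200}, {"x": 120, "y": 200},
]
print(vertices_to_box(sample_vertices))

# Made-up vertices where two entries have no 'x' key
edge_vertices = [{"y": 40}, {"x": 260, "y": 40}, {"x": 260, "y": 200}, {"y": 200}]
print(vertices_to_box(edge_vertices))
```

The returned tuple matches the `left, top, right, bottom` arguments of `drawbox`.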

### Landmark Detection

Read up on [Landmark detection](https://cloud.google.com/vision/docs/detecting-landmarks).

Using the same technique above, perform landmark detection with place.jpg (<span class="attribution">"<a target="_blank" rel="noopener noreferrer" href="https://www.flickr.com/photos/57785759@N06/5552134623">Marina Bay Sands</a>" by <a target="_blank" rel="noopener noreferrer" href="https://www.flickr.com/photos/57785759@N06">alantankenghoe</a> is licensed under <a target="_blank" rel="noopener noreferrer" href="https://creativecommons.org/licenses/by/2.0/?ref=openverse">CC BY 2.0</a>.</span>)

Plot the bounding box with the description as shown below.

<div><img src=attachment:place_result.JPG align="left" width="256" /></div>

> You might encounter the following error if you call the drawbox function:  
> `KeyError: 'x'`  
> 
> Why? Analyse the vertices. How do you resolve the problem?


### Handwriting Detection

Read up on [Handwriting detection](https://cloud.google.com/vision/docs/handwriting). 

Perform handwriting detection using note.jpg, then plot the bounding boxes with the descriptions as shown below.

<div><img src=attachment:note_result.JPG align="left" width="512" /></div>

### Miscellaneous

Try out these detections using the Cloud Vision API:
- [Text detection in images](https://cloud.google.com/vision/docs/ocr)
- [Image properties detection](https://cloud.google.com/vision/docs/detecting-properties)
- [Label detection](https://cloud.google.com/vision/docs/labels)
- [Logo detection](https://cloud.google.com/vision/docs/detecting-logos)

For each API, using an appropriate image, show that you can:
- send the correct request
- receive the intended response
- draw bounding boxes if the annotations in the response contain coordinates


> What applications can you build using these image detection capabilities?