![nyp.jpg](attachment:nyp.jpg)

# Client Library

Previously, we performed Vision detections using just an API key. In this exercise, we will use the Cloud Client Library for Python with a service key to perform Vision and Video predictions.

Before we do that, we need to do some setup. Complete the following steps on your **PC/laptop**.

### Set up PC

1. Download the [Google Cloud CLI installer](https://dl.google.com/dl/cloudsdk/channels/rapid/GoogleCloudSDKInstaller.exe).
2. Double-click the downloaded file to start installing the SDK.
3. Click "Next".
4. Click "I Agree".
5. Use the default installation path.
6. Click "Next" when the installation is done.
7. Check all the boxes and click "Finish".
8. A terminal window will launch <code>gcloud init</code> automatically.
9. Press <code>Y</code>.
10. Your browser will launch. Sign in to your personal Google account and agree to the terms of service.
11. Once done, press the Win key, search for <code>Google Cloud SDK Shell</code> and launch it.
12. Type <code>gcloud auth list</code> to see your active account; it should be the account you have just signed in with.
13. Type <code>gcloud config list</code> to see the current configuration.
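If you prefer to confirm the installation from Python instead of the SDK shell, the sketch below may help. It assumes the installer added <code>gcloud</code> to your PATH; <code>find_cli</code> and <code>run_gcloud</code> are hypothetical helper names, not part of any Google library.

```python
import shutil
import subprocess


def find_cli(name="gcloud"):
    """Return the full path of a command-line tool, or None if it is not on PATH."""
    return shutil.which(name)


def run_gcloud(*args):
    """Run a gcloud subcommand and return its standard output as text.

    Raises FileNotFoundError if gcloud cannot be found on PATH.
    """
    exe = find_cli("gcloud")
    if exe is None:
        raise FileNotFoundError("gcloud not found on PATH; reopen the shell or reinstall the SDK")
    result = subprocess.run([exe, *args], capture_output=True, text=True, check=True)
    return result.stdout


# Example (not run here): print(run_gcloud("auth", "list"))
```

If <code>find_cli()</code> returns <code>None</code>, try closing and reopening your terminal or Jupyter session so that the updated PATH is picked up.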

### Required files

Before you proceed, download the following data from Brightspace into the current directory:
- <code>it3386-2024-s2.json</code>: service key
- images from week 1 practical
- <code>oh_2021_short.mp4</code>

You should also have completed the Anaconda environment setup so that you can import the Vision and Video Intelligence libraries. See the instructions on Brightspace.
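A quick way to confirm that the environment is ready is to try importing both libraries. This is a minimal sketch: <code>is_importable</code> is a hypothetical helper, and the module names assume the standard <code>google-cloud-vision</code> and <code>google-cloud-videointelligence</code> packages.

```python
import importlib


def is_importable(module_name):
    """Return True if the module can be imported in the current environment."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


# Report the status of the two libraries used in this exercise.
for name in ("google.cloud.vision", "google.cloud.videointelligence"):
    status = "OK" if is_importable(name) else "MISSING"
    print(f"{name}: {status}")
```

If either line prints <code>MISSING</code>, check that the notebook is running in the Anaconda environment you set up.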

Try performing a Cloud Vision detection using the following code.

In [1]:
print("hello")

hello


In [2]:
%conda env list


Note: you may need to restart the kernel to use updated packages.


In [None]:
from google.cloud import vision
import io

In [None]:
# refer to https://cloud.google.com/vision/docs/detecting-faces

def detect_faces(path):
    """Detects faces in an image."""
    from google.cloud import vision
    import io
    client = vision.ImageAnnotatorClient()

    with io.open(path, 'rb') as image_file:
        content = image_file.read()

    image = vision.Image(content=content)

    response = client.face_detection(image=image)
    faces = response.face_annotations

    # Likelihood names, indexed by the integer values of vision.Likelihood
    likelihood_name = ('UNKNOWN', 'VERY_UNLIKELY', 'UNLIKELY', 'POSSIBLE',
                       'LIKELY', 'VERY_LIKELY')
    print('Faces:')

    for face in faces:
        print('anger: {}'.format(likelihood_name[face.anger_likelihood]))
        print('joy: {}'.format(likelihood_name[face.joy_likelihood]))
        print('surprise: {}'.format(likelihood_name[face.surprise_likelihood]))

        vertices = (['({},{})'.format(vertex.x, vertex.y)
                    for vertex in face.bounding_poly.vertices])

        print('face bounds: {}'.format(','.join(vertices)))

    if response.error.message:
        raise Exception(
            '{}\nFor more info on error messages, check: '
            'https://cloud.google.com/apis/design/errors'.format(response.error.message))
    
    return response

In [3]:
# ensure the image is present in the current folder
response = detect_faces('nyp_cafe.jpg')

--------------------------------------------------
DefaultCredentialsError                   Traceback (most recent call last)
Cell In[3], line 2
      1 # ensure the image is present in the current folder
----> 2 response = detect_faces('nyp_cafe.jpg')

Cell In[2], line 7, in detect_faces(path)
      5 from google.cloud import vision
      6 import io
----> 7 client = vision.ImageAnnotatorClient()
      9 with io.open(path, 'rb') as

> Are you able to get a response from the detection? What is the error?

### Set up environment

The service key that you downloaded gives you the rights to run the detections. Place it in the same directory as this notebook.

We will set the <code>GOOGLE_APPLICATION_CREDENTIALS</code> environment variable to point to the service key.

In [None]:
# you should see the service key (json) in your current folder
!dir *.json

In [None]:
# env should be empty 
%env GOOGLE_APPLICATION_CREDENTIALS

In [4]:
# set the service key
%env GOOGLE_APPLICATION_CREDENTIALS=it3386-2024-s2.json

env: GOOGLE_APPLICATION_CREDENTIALS=it3386-2024-s2.json


In [5]:
%env GOOGLE_APPLICATION_CREDENTIALS




'it3386-2024-s2.json'
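The <code>%env</code> magic above only works inside a notebook. In a plain Python script, the same variable can be set through <code>os.environ</code>. The sketch below also runs two basic checks first; <code>use_service_key</code> is a hypothetical helper, and the check on the <code>type</code> field assumes the usual layout of a service account key file.

```python
import json
import os


def use_service_key(path):
    """Point GOOGLE_APPLICATION_CREDENTIALS at a service key after basic checks."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"service key not found: {path}")
    with open(path, "r", encoding="utf-8") as key_file:
        key = json.load(key_file)
    # Service account key files carry a "type" field set to "service_account".
    if key.get("type") != "service_account":
        raise ValueError(f"{path} does not look like a service account key")
    # Store an absolute path so the setting survives a change of working directory.
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = os.path.abspath(path)
    return os.environ["GOOGLE_APPLICATION_CREDENTIALS"]


# Example (not run here): use_service_key("it3386-2024-s2.json")
```

Set the variable before creating any client, because the library looks up the credentials when the client is constructed.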

### Using Client Library for Vision

In [6]:
response = detect_faces('../WEEK_1/data/nyp_cafe.jpg')

Faces:
anger: VERY_UNLIKELY
joy: VERY_UNLIKELY
surprise: VERY_UNLIKELY
face bounds: (195,256),(244,256),(244,314),(195,314)
anger: VERY_UNLIKELY
joy: VERY_LIKELY
surprise: VERY_UNLIKELY
face bounds: (658,170),(752,170),(752,279),(658,279)
anger: VERY_UNLIKELY
joy: LIKELY
surprise: VERY_UNLIKELY
face bounds: (458,87),(560,87),(560,205),(458,205)
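Each <code>face bounds</code> line lists the four corners of a box around a face. As a small illustration, the sketch below turns such corners into a width and height in pixels; <code>box_size</code> is a hypothetical helper, and the sample points are copied from the first face in the output above.

```python
def box_size(vertices):
    """Return (width, height) in pixels of a box given its (x, y) corner points."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return max(xs) - min(xs), max(ys) - min(ys)


# Corner points copied from the first face in the output above.
first_face = [(195, 256), (244, 256), (244, 314), (195, 314)]
print(box_size(first_face))  # (49, 58)
```

With a real response, the corner list can be built as <code>[(v.x, v.y) for v in face.bounding_poly.vertices]</code>.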


In [7]:
print(response)

face_annotations {
  bounding_poly {
    vertices {
      x: 195
      y: 256
    }
    vertices {
      x: 244
      y: 256
    }
    vertices {
      x: 244
      y: 314
    }
    vertices {
      x: 195
      y: 314
    }
  }
  fd_bounding_poly {
    vertices {
      x: 198
      y: 265
    }
    vertices {
      x: 243
      y: 265
    }
    vertices {
      x: 243
      y: 311
    }
    vertices {
      x: 198
      y: 311
    }
  }
  landmarks {
    type_: LEFT_EYE
    position {
      x: 232.8819122314453
      y: 285.9496154785156
      z: -0.0016236305236816406
    }
  }
  landmarks {
    type_: RIGHT_EYE
    position {
      x: 221.1948699951172
      y: 284.4725341796875
      z: 8.73569393157959
    }
  }
  landmarks {
    type_: LEFT_OF_LEFT_EYEBROW
    position {
      x: 234.22482299804688
      y: 280.69549560546875
      z: -4.279675483703613
    }
  }
  landmarks {
    type_: RIGHT_OF_LEFT_EYEBROW
    position {
      x: 233.62600708007812
      y: 281.9647216796875
 

In [None]:
def detect_landmarks(path):
    """Detects landmarks in the file."""
    from google.cloud import vision
    import io
    client = vision.ImageAnnotatorClient()

    with io.open(path, 'rb') as image_file:
        content = image_file.read()

    image = vision.Image(content=content)

    response = client.landmark_detection(image=image)
    landmarks = response.landmark_annotations
    print('Landmarks:')

    for landmark in landmarks:
        print(landmark.description)
        for location in landmark.locations:
            lat_lng = location.lat_lng
            print('Latitude {}'.format(lat_lng.latitude))
            print('Longitude {}'.format(lat_lng.longitude))

    if response.error.message:
        raise Exception(
            '{}\nFor more info on error messages, check: '
            'https://cloud.google.com/apis/design/errors'.format(response.error.message))
        
    return response

In [8]:
response = detect_landmarks('../WEEK_1/data/place.jpg')

Landmarks:
Merlion Park
Latitude 1.2867449000000002
Longitude 103.8543872
Artscience Museum
Latitude 1.2862738
Longitude 103.8592663
Jubilee Bridge
Latitude 1.2879361000000003
Longitude 103.8543861
Esplanade - Theatres On The Bay, Singapore
Latitude 1.2897934
Longitude 103.8558166
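Each landmark annotation also carries a <code>score</code>, which the function above does not print. The sketch below collects the description, score and coordinates into tuples sorted by score. <code>summarise_landmarks</code> is a hypothetical helper, and the <code>SimpleNamespace</code> objects are stand-ins for a real response so that the example runs without calling the API.

```python
from types import SimpleNamespace


def summarise_landmarks(response):
    """Return (description, score, latitude, longitude) tuples, best score first."""
    rows = []
    for landmark in response.landmark_annotations:
        for location in landmark.locations:
            rows.append((landmark.description, landmark.score,
                         location.lat_lng.latitude, location.lat_lng.longitude))
    return sorted(rows, key=lambda row: row[1], reverse=True)


def _stand_in(description, score, latitude, longitude):
    """Build a stand-in object shaped like one landmark annotation."""
    lat_lng = SimpleNamespace(latitude=latitude, longitude=longitude)
    return SimpleNamespace(description=description, score=score,
                           locations=[SimpleNamespace(lat_lng=lat_lng)])


# Values copied from the notebook output for place.jpg.
demo = SimpleNamespace(landmark_annotations=[
    _stand_in("Artscience Museum", 0.5705945491790771, 1.2862738, 103.8592663),
    _stand_in("Merlion Park", 0.6081422567367554, 1.2867449000000002, 103.8543872),
])

for row in summarise_landmarks(demo):
    print(row)
```

With a real result, pass the <code>response</code> returned by <code>detect_landmarks</code> instead of <code>demo</code>.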


In [9]:
print(response)

landmark_annotations {
  mid: "/g/11bwmwgb2l"
  description: "Merlion Park"
  score: 0.6081422567367554
  bounding_poly {
    vertices {
    }
    vertices {
      x: 800
    }
    vertices {
      x: 800
      y: 800
    }
    vertices {
      y: 800
    }
  }
  locations {
    lat_lng {
      latitude: 1.2867449000000002
      longitude: 103.8543872
    }
  }
}
landmark_annotations {
  mid: "/m/0gff2yr"
  description: "Artscience Museum"
  score: 0.5705945491790771
  bounding_poly {
    vertices {
    }
    vertices {
      x: 800
    }
    vertices {
      x: 800
      y: 800
    }
    vertices {
      y: 800
    }
  }
  locations {
    lat_lng {
      latitude: 1.2862738
      longitude: 103.8592663
    }
  }
}
landmark_annotations {
  mid: "/g/11fkf0scbc"
  description: "Jubilee Bridge"
  score: 0.5641544461250305
  bounding_poly {
    vertices {
    }
    vertices {
      x: 800
    }
    vertices {
      x: 800
      y: 800
    }
    vertices {
      y: 800
    }
  }
  locations

In [None]:
def detect_document(path):
    """Detects document features in an image."""
    from google.cloud import vision
    import io
    client = vision.ImageAnnotatorClient()

    with io.open(path, 'rb') as image_file:
        content = image_file.read()

    image = vision.Image(content=content)

    response = client.document_text_detection(image=image)

    for page in response.full_text_annotation.pages:
        for block in page.blocks:
            print('\nBlock confidence: {}\n'.format(block.confidence))

            for paragraph in block.paragraphs:
                print('Paragraph confidence: {}'.format(
                    paragraph.confidence))

                for word in paragraph.words:
                    word_text = ''.join([
                        symbol.text for symbol in word.symbols
                    ])
                    print('Word text: {} (confidence: {})'.format(
                        word_text, word.confidence))

                    for symbol in word.symbols:
                        print('\tSymbol: {} (confidence: {})'.format(
                            symbol.text, symbol.confidence))

    if response.error.message:
        raise Exception(
            '{}\nFor more info on error messages, check: '
            'https://cloud.google.com/apis/design/errors'.format(response.error.message))
    
    return response

In [10]:
response = detect_document('../WEEK_1/data/note.jpg')


Block confidence: 0.9424569010734558

Paragraph confidence: 0.9424569010734558
Word text: # (confidence: 0.8970488905906677)
	Symbol: # (confidence: 0.8970488905906677)
Word text: 1 (confidence: 0.9878649115562439)
	Symbol: 1 (confidence: 0.9878649115562439)

Block confidence: 0.9475244283676147

Paragraph confidence: 0.9475244283676147
Word text: What (confidence: 0.945880651473999)
	Symbol: W (confidence: 0.8481331467628479)
	Symbol: h (confidence: 0.9859771728515625)
	Symbol: a (confidence: 0.9818291664123535)
	Symbol: t (confidence: 0.9675832390785217)
Word text: is (confidence: 0.956630289554596)
	Symbol: i (confidence: 0.95005202293396)
	Symbol: s (confidence: 0.9632085561752319)
Word text: Mindfulness (confidence: 0.9464665055274963)
	Symbol: M (confidence: 0.6991844177246094)
	Symbol: i (confidence: 0.9543477296829224)
	Symbol: n (confidence: 0.9411085247993469)
	Symbol: d (confidence: 0.9776654243469238)
	Symbol: f (confidence: 0.9693295955657959)
	Symbol: u (confidence: 0.93
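The per-symbol output is long, so it can help to list only the symbols the model was less sure about. The sketch below walks the same page, block, paragraph and word hierarchy and keeps symbols below a threshold. <code>low_confidence_symbols</code> and the default threshold of 0.9 are assumptions for illustration, and the <code>SimpleNamespace</code> objects are stand-ins for a real response.

```python
from types import SimpleNamespace


def low_confidence_symbols(response, threshold=0.9):
    """Return (word, symbol, confidence) for every symbol scored below threshold."""
    flagged = []
    for page in response.full_text_annotation.pages:
        for block in page.blocks:
            for paragraph in block.paragraphs:
                for word in paragraph.words:
                    word_text = "".join(s.text for s in word.symbols)
                    for symbol in word.symbols:
                        if symbol.confidence < threshold:
                            flagged.append((word_text, symbol.text, symbol.confidence))
    return flagged


# Stand-in for the word "What", with confidences copied from the output above.
symbols = [SimpleNamespace(text=t, confidence=c) for t, c in [
    ("W", 0.8481331467628479), ("h", 0.9859771728515625),
    ("a", 0.9818291664123535), ("t", 0.9675832390785217)]]
word = SimpleNamespace(symbols=symbols)
paragraph = SimpleNamespace(words=[word])
block = SimpleNamespace(paragraphs=[paragraph])
page = SimpleNamespace(blocks=[block])
demo = SimpleNamespace(full_text_annotation=SimpleNamespace(pages=[page]))

print(low_confidence_symbols(demo))
```

With a real result, pass the <code>response</code> returned by <code>detect_document</code> and adjust the threshold to taste.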

In [11]:
print(response)

text_annotations {
  locale: "en"
  description: "#1\nWhat is Mindfulness\n#2\nHabits of Mind\n#3\nUnderstanding our Thoughts & Emotions\n#4\nCultivating Love - Kindness\n#5\nWillingness to be with things as they are\n#6\nIndining mind towards Joy."
  bounding_poly {
    vertices {
      x: 207
      y: 217
    }
    vertices {
      x: 1402
      y: 217
    }
    vertices {
      x: 1402
      y: 973
    }
    vertices {
      x: 207
      y: 973
    }
  }
}
text_annotations {
  description: "#"
  bounding_poly {
    vertices {
      x: 210
      y: 218
    }
    vertices {
      x: 241
      y: 218
    }
    vertices {
      x: 241
      y: 260
    }
    vertices {
      x: 210
      y: 260
    }
  }
}
text_annotations {
  description: "1"
  bounding_poly {
    vertices {
      x: 256
      y: 217
    }
    vertices {
      x: 282
      y: 217
    }
    vertices {
      x: 282
      y: 259
    }
    vertices {
      x: 256
      y: 259
    }
  }
}
text_annotations {
  description: "W

Todo

> How about detecting labels?

In [None]:
# see https://cloud.google.com/vision/docs/labels

def detect_labels(path):
    """Detects labels in an image file."""
    # TODO: write your implementation here
    pass

In [None]:
detect_labels('nyp_cafe.jpg')

*Sample output:*

<pre>
Labels:
Leisure
Customer
T-shirt
Eyewear
Event
Fun
Water bottle
Belt
Job
Room
</pre>
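If you get stuck, here is one possible sketch, modelled on the functions earlier in this notebook. It assumes the client's <code>label_detection</code> method and the <code>label_annotations</code> field described in the labels guide linked above; <code>label_descriptions</code> is a hypothetical helper. Try the exercise yourself first.

```python
def label_descriptions(response):
    """Return the description of every label annotation in a response."""
    return [label.description for label in response.label_annotations]


def detect_labels(path):
    """Detects labels in an image file and prints their descriptions."""
    # Imported inside the function, as in the earlier cells of this notebook.
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()

    with open(path, 'rb') as image_file:
        content = image_file.read()

    image = vision.Image(content=content)
    response = client.label_detection(image=image)

    # Check for an API error before using the annotations.
    if response.error.message:
        raise Exception(
            '{}\nFor more info on error messages, check: '
            'https://cloud.google.com/apis/design/errors'.format(response.error.message))

    print('Labels:')
    for description in label_descriptions(response):
        print(description)

    return response
```

Calling <code>detect_labels('nyp_cafe.jpg')</code> needs the service key to be set up as described earlier; the labels you get may differ from the sample output.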