In [1]:
pip install google-cloud-vision

Collecting google-cloud-vision
  Downloading https://files.pythonhosted.org/packages/0d/7f/e10d602c2dc3f749f1b78377a3357790f1da71b28e7da9e5bc20b3a9bd40/google_cloud_vision-1.0.0-py2.py3-none-any.whl (435kB)
Collecting google-api-core[grpc]<2.0.0dev,>=1.14.0
  Downloading https://files.pythonhosted.org/packages/63/7e/a523169b0cc9ce62d56e07571db927286a94b1a5f51ac220bd97db825c77/google_api_core-1.16.0-py2.py3-none-any.whl (70kB)
Collecting protobuf>=3.4.0
  Downloading https://files.pythonhosted.org/packages/92/30/1b7ccde09bf0c535d11f18a574ed7d7572c729a8f754fd568b297be08b61/protobuf-3.11.3-cp37-cp37m-win_amd64.whl (1.0MB)
Collecting googleapis-common-protos<2.0dev,>=1.6.0
  Downloading https://files.pythonhosted.org/packages/05/46/168fd780f594a4d61122f7f3dc0561686084319ad73b4febbf02ae8b32cf/googleapis-common-protos-1.51.0.tar.gz
Collecting google-auth<2.0dev,>=0.4.0
  Downloading https://files.pythonhosted.org/packages/5a/8d/e2ebbd0502627ed0d8a408162020e1c0792f088b49fddeedaaeebc206ed7/goo

In [1]:
from google.cloud import vision
from google.cloud.vision import types


With the imports in place, you need an instance of a client. This walkthrough starts with the text recognition feature.

If you don’t store your credentials in an environment variable, you can pass them directly to the client at this stage.

By credentials, I mean the Google Cloud service account key, a JSON file. You can download it after you create a free account and activate one of the many Google Cloud APIs.

In [2]:
client = vision.ImageAnnotatorClient.from_service_account_file('My Project 77824-9ac23395f34c.json')
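
As an alternative to passing the key file to the client, the environment-variable route mentioned earlier can be sketched as follows. The path is a hypothetical placeholder, and the commented-out line assumes the `vision` import from the first cell.

```python
import os

# Hypothetical path: replace it with the location of your own key file.
key_path = "path/to/service-account-key.json"

# Google client libraries read this variable when no credentials are passed explicitly.
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = key_path

# With the variable set, the client can be created without arguments:
# client = vision.ImageAnnotatorClient()
print(os.environ["GOOGLE_APPLICATION_CREDENTIALS"])
```

Setting the variable in the shell before starting Jupyter has the same effect and keeps the path out of the notebook.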


Assuming that you store the images to be processed in an ‘images’ folder inside your project directory, let’s open one of them.


In [5]:
image_to_open = 'images/receipt.jpg'

with open(image_to_open, 'rb') as image_file:
    content = image_file.read()


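The next step is to create a Vision `Image` object, which lets you send a text recognition request. (This notebook uses google-cloud-vision 1.0.0; releases from 2.0 onward drop `vision.types` and expose the class as `vision.Image`.)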

In [6]:
image = vision.types.Image(content=content)

text_response = client.text_detection(image=image)
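
Before reading the result, it can be worth checking whether the API reported a problem for this image. The response carries an `error` field whose `message` is empty on success. The helper below is a minimal sketch: `raise_on_api_error` is a hypothetical name, and the `SimpleNamespace` objects are stand-ins that mimic the response shape so the example runs without network access.

```python
from types import SimpleNamespace


def raise_on_api_error(response):
    """Raise RuntimeError if the response carries a non-empty error message."""
    message = response.error.message
    if message:
        raise RuntimeError(f"Vision API error: {message}")
    return response


# Stand-in objects (not real API responses) so the sketch runs offline.
ok = SimpleNamespace(error=SimpleNamespace(message=""))
failed = SimpleNamespace(error=SimpleNamespace(message="Bad image data."))

raise_on_api_error(ok)  # passes silently
try:
    raise_on_api_error(failed)
except RuntimeError as exc:
    print(exc)
```

In the notebook, the call would be `raise_on_api_error(text_response)`.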

The response consists of the detected words, stored under the `description` key, together with their location in the image and a language prediction.

But what did we actually upload?

![title](images/receipt.jpg)

As you can see, to keep only the text, you need the `description` of every element. Luckily, Python’s list comprehensions make this easy.

In [7]:
texts = [text.description for text in text_response.text_annotations]

texts

['SHOPPING STORE\nREG 12-21\n03:22 PM\nCLERK 2\n618\n1 MISC.\n1 STUFF\nSUBTOTAL\n$0.49\n$7.99\n$8.48\n$0.74\n$9.22\n$10.00\n$0.78\nTAX\nTOTAL\nCASH\nCHANGE\nNO REFUNDS\nNO EXCHANGES\nNO RETURNS\n',
 'SHOPPING',
 'STORE',
 'REG',
 '12-21',
 '03:22',
 'PM',
 'CLERK',
 '2',
 '618',
 '1',
 'MISC.',
 '1',
 'STUFF',
 'SUBTOTAL',
 '$0.49',
 '$7.99',
 '$8.48',
 '$0.74',
 '$9.22',
 '$10.00',
 '$0.78',
 'TAX',
 'TOTAL',
 'CASH',
 'CHANGE',
 'NO',
 'REFUNDS',
 'NO',
 'EXCHANGES',
 'NO',
 'RETURNS']

If you look carefully, you’ll notice that the first element of the list contains all the text detected in the image as a single string, while the remaining elements are the individual words.

Let’s print the first element.

In [8]:
print(texts[0])

SHOPPING STORE
REG 12-21
03:22 PM
CLERK 2
618
1 MISC.
1 STUFF
SUBTOTAL
$0.49
$7.99
$8.48
$0.74
$9.22
$10.00
$0.78
TAX
TOTAL
CASH
CHANGE
NO REFUNDS
NO EXCHANGES
NO RETURNS
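
Once the full text is available as one string, plain Python is enough to pull structured values out of it. The sketch below copies the string from the output above and extracts the dollar amounts with a regular expression; the pattern is an assumption that fits this receipt, not a general-purpose price parser.

```python
import re

# The first annotation's text, copied from the output above.
full_text = (
    "SHOPPING STORE\nREG 12-21\n03:22 PM\nCLERK 2\n618\n1 MISC.\n1 STUFF\n"
    "SUBTOTAL\n$0.49\n$7.99\n$8.48\n$0.74\n$9.22\n$10.00\n$0.78\nTAX\nTOTAL\n"
    "CASH\nCHANGE\nNO REFUNDS\nNO EXCHANGES\nNO RETURNS\n"
)

# One entry per line of text on the receipt.
lines = full_text.splitlines()

# Every "$" followed by digits, a dot, and two decimals.
amounts = [float(match) for match in re.findall(r"\$(\d+\.\d{2})", full_text)]

print(len(lines))
print(amounts)
```

Note that the amounts come back in reading order, separated from their labels (SUBTOTAL, TAX, and so on), so pairing them up would need the position data from the response.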



As mentioned earlier in this presentation, Google Cloud Vision is not only about recognizing text: it also lets you discover faces, landmarks, image properties, and web connections.

With that in mind, let’s find out what it can tell you about the web associations of the image.

In [9]:
web_response = client.web_detection(image=image)

Okay Google, do you actually know what is shown in the image you received?

In [10]:
web_content = web_response.web_detection
web_content.best_guess_labels

[label: "receipt definition"
language_code: "en"
]

Good job, Google! It’s a receipt indeed. 

But let’s give you a bit more exercise: can you see anything else?

How about more predictions, with scores expressed as percentages?

In [11]:
predictions = [(entity.description, '{:.2%}'.format(entity.score)) for entity in web_content.web_entities]

predictions

[('Receipt', '88.39%'),
 ('Printer', '56.62%'),
 ('Payment', '45.57%'),
 ('Invoice', '42.03%'),
 ('Cash register', '36.72%'),
 ('Shopping', '35.46%'),
 ('Return receipt', '34.30%'),
 ('', '27.56%'),
 ('', '25.65%'),
 ('', '25.62%')]
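
The last three entries above have an empty description, so they add little to a report. A small variation of the comprehension can skip them. The entities below are stand-ins built with `SimpleNamespace`, using rounded scores from the output above, so the sketch runs without calling the API.

```python
from types import SimpleNamespace

# Stand-ins for a few items of web_content.web_entities (scores rounded from the output above).
web_entities = [
    SimpleNamespace(description="Receipt", score=0.8839),
    SimpleNamespace(description="Printer", score=0.5662),
    SimpleNamespace(description="", score=0.2756),
]

# Keep only the entities that actually have a description.
named = [
    (entity.description, "{:.2%}".format(entity.score))
    for entity in web_entities
    if entity.description
]
print(named)
```

In the notebook, the same `if entity.description` filter can be added to the comprehension over `web_content.web_entities`.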

So Google has given us a lot of valuable insights. Well done, my almighty friend!

Can you also find out where the image comes from and whether it has any copies?

In [12]:
web_content.full_matching_images

[url: "https://cloud.netlifyusercontent.com/assets/344dbf88-fdf9-42bb-adb4-46f01eedd629/7669b141-9ff9-47e6-b56b-2bb1ff6e0d72/receipt-example-processed-by-google-cloud-vision.png"
, url: "https://www.collinsdictionary.com/images/thumb/receipt_573065707_250.jpg"
, url: "https://media.gettyimages.com/photos/shopping-receipt-picture-id901964616?b=1&k=6&m=901964616&s=170x170&h=DVyqX2Q6sgDuMCQ7oVW4n4S4X5lGT7ylOEUQ-mL0Rg0="
, url: "https://thumbs.dreamstime.com/t/recibo-da-compra-85651861.jpg"
]
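
To see at a glance which sites host the matching images, the URLs can be reduced to their host names with the standard library. This is a sketch with stand-in objects; two of the URLs from the output above are reused so that it runs offline.

```python
from types import SimpleNamespace
from urllib.parse import urlparse

# Stand-ins for items of web_content.full_matching_images.
matches = [
    SimpleNamespace(url="https://www.collinsdictionary.com/images/thumb/receipt_573065707_250.jpg"),
    SimpleNamespace(url="https://thumbs.dreamstime.com/t/recibo-da-compra-85651861.jpg"),
]

# netloc is the host part of each URL.
hosts = [urlparse(image.url).netloc for image in matches]
print(hosts)
```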

I’m impressed. Thanks, Google! 

But one is not enough. Can you please give me a few more examples of similar images?

In [22]:
web_content.visually_similar_images[:7]

[url: "https://images-na.ssl-images-amazon.com/images/I/51X5KPbHPOL._SX500_.jpg",
 url: "https://ctl.s6img.com/society6/img/m1gmW5rFvbVc8_cIby7hNYO3qTA/w_1500/bath-towels/small/front/~artwork,fw_7400,fh_3700,iw_7400,ih_3700/s6-original-art-uploads/society6/uploads/misc/916b12db7d714e649c4d6364d7eb3e14/~~/psychology-psychology-gifts-psychology-definition-funny-definition-funny-quotes-dictionary-art-bath-towels.jpg",
 url: "https://ctl.s6img.com/society6/img/l67X0bB5HCDHt7qYyJaY5HgSmvo/w_700/bath-towels/small/front/~artwork,fw_7400,fh_3700,iw_7400,ih_3700/s6-original-art-uploads/society6/uploads/misc/ca6190d488ee4975b40d7aae67242d9c/~~/laundry-definition-dictionary-word-laundry-print-instant-download-printable-quote-dictiona-bath-towels.jpg",
 url: "https://ctl.s6img.com/society6/img/kvtUAnUq3IIgs0tjClA4M371jXA/w_700/bath-towels/small/front/~artwork,fw_7400,fh_3700,iw_7400,ih_3700/s6-original-art-uploads/society6/uploads/misc/f8c09fca79d6419ca6977c36b572b3ce/~~/create-definition-create-q

The similar images found by Google Cloud Vision are:

![title](images/gcvpred2.jpg)

![title](images/gcpred1.jpg)

## Now let's challenge ourselves

Let's have a look at what the Google Cloud Vision API can tell us about this photo of our legendary Professor Venkatesh.

![title](images/face.jpg)

In [3]:
image_to_open = 'images/face.jpg'

with open(image_to_open, 'rb') as image_file:
    content = image_file.read()
image = vision.types.Image(content=content)

face_response = client.face_detection(image=image)
face_content = face_response.face_annotations

face_content[0].detection_confidence

ServiceUnavailable: 503 DNS resolution failed

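The run captured above failed with a network error (`503 DNS resolution failed`), so the confidence value is not displayed. On a successful run the result was not too bad: the algorithm was more than 71% sure that there is a face in the picture.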
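
A `ServiceUnavailable: 503` error like the one shown above usually points to a connectivity problem rather than to the image, so retrying after a short pause is a common remedy. The helper below is a generic sketch: `call_with_retry` and `flaky` are hypothetical names, and the demonstration uses a function that fails twice on purpose instead of a real API call.

```python
import time


def call_with_retry(func, retry_on=(Exception,), attempts=3, base_delay=1.0):
    """Call func(); on a listed exception, wait and retry, doubling the delay each time."""
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            time.sleep(base_delay * 2 ** attempt)


# Offline demonstration with a function that fails twice, then succeeds.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = call_with_retry(flaky, retry_on=(ConnectionError,), base_delay=0.01)
print(result, calls["count"])
```

With the real client, one option is to pass `lambda: client.face_detection(image=image)` as `func` and `(google.api_core.exceptions.ServiceUnavailable,)` as `retry_on`.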

But can we learn anything about the emotions behind it?

In [4]:
face_content[0]

bounding_poly {
  vertices {
    x: 357
    y: 23
  }
  vertices {
    x: 692
    y: 23
  }
  vertices {
    x: 692
    y: 413
  }
  vertices {
    x: 357
    y: 413
  }
}
fd_bounding_poly {
  vertices {
    x: 381
    y: 99
  }
  vertices {
    x: 666
    y: 99
  }
  vertices {
    x: 666
    y: 379
  }
  vertices {
    x: 381
    y: 379
  }
}
landmarks {
  type: LEFT_EYE
  position {
    x: 477.2466735839844
    y: 209.75875854492188
    z: -0.000423431396484375
  }
}
landmarks {
  type: RIGHT_EYE
  position {
    x: 576.2607421875
    y: 214.8441162109375
    z: -3.6672720909118652
  }
}
landmarks {
  type: LEFT_OF_LEFT_EYEBROW
  position {
    x: 445.35150146484375
    y: 187.74644470214844
    z: 8.20673942565918
  }
}
landmarks {
  type: RIGHT_OF_LEFT_EYEBROW
  position {
    x: 503.37744140625
    y: 189.62448120117188
    z: -22.31841468811035
  }
}
landmarks {
  type: LEFT_OF_RIGHT_EYEBROW
  position {
    x: 550.7940673828125
    y: 192.64932250976562
    z: -24.0630970001220
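
The printed annotation above is cut off before the emotion fields. A face annotation also carries `joy_likelihood`, `sorrow_likelihood`, `anger_likelihood`, and `surprise_likelihood`, each a value of the API's `Likelihood` enum. The sketch below turns those values into readable names. `summarize_emotions` is a hypothetical helper, and `sample_face` is a made-up stand-in, not the values returned for the photo above.

```python
from types import SimpleNamespace

# Likelihood enum names, indexed by their integer value in the Vision API.
LIKELIHOOD_NAMES = (
    "UNKNOWN", "VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY",
)
EMOTION_FIELDS = (
    "joy_likelihood", "sorrow_likelihood", "anger_likelihood", "surprise_likelihood",
)


def summarize_emotions(face):
    """Map each emotion field of a face annotation to its likelihood name."""
    return {
        field.replace("_likelihood", ""): LIKELIHOOD_NAMES[getattr(face, field)]
        for field in EMOTION_FIELDS
    }


# Made-up stand-in values for illustration only.
sample_face = SimpleNamespace(
    joy_likelihood=4, sorrow_likelihood=1, anger_likelihood=1, surprise_likelihood=2,
)
print(summarize_emotions(sample_face))
```

In the notebook, the call would be `summarize_emotions(face_content[0])`.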

The range of possible applications for the Google Cloud Vision service is practically endless. With the Python library available, you can use it in any Python project, whether it’s a web application or a scientific project. It can also help spark a deeper interest in machine learning technologies.

The Google documentation offers some great ideas on how to apply the Vision API features in practice, and it is a good place to learn more about machine learning.