


# Computer Vision Cognitive Services
## Detecting Coral Reefs and Scuba Diving Imagery

### Overview

Here we'll run a few examples of the Computer Vision cognitive service, which tries to recognize objects, environments, people, and activities in an image. This proof of concept focuses on detecting corals, scuba diving equipment, and other reef-related imagery. We will also explore a similar example of skydiving, which may challenge the service because the colors and gear are visually similar to scuba diving, and both the sky and the sea are blue. Each service call returns JSON containing tags with the service's guesses, each accompanied by a confidence level.

**IMPORTANT**: Remember to enter your *API key* (secret) and *uri_base* that correspond to your own cognitive service license in the code below so your service calls will succeed.

### - Computer Vision cognitive service
### - Detect objects and activity in underwater setting
### - Calls are assigned Tags and Confidences
<br><br><br>

## First Image: Coral Reef 

Our first image is clearly a photo of a coral reef with fish and other wildlife.

<br><br>
![](https://timeincsecure-a.akamaihd.net/rtmp_uds/293884104/201703/2681/293884104_5360456295001_5360434467001-vs.jpg?pubId=293884104&videoId=5360434467001)


<br><br><br>
### Assign license key
Go to: https://azure.microsoft.com/en-us/try/cognitive-services/ and select Vision APIs and click "Get Key" next to the Computer Vision API:
<img width="1126" alt="Screen Shot 2019-04-24 at 5 01 51 AM" src="https://user-images.githubusercontent.com/1314285/56658077-836dbc80-664e-11e9-956d-b03dea764f46.png">

Make sure you click on the 7-Day Free Trial:
<img width="891" alt="Screen Shot 2019-04-24 at 5 01 59 AM" src="https://user-images.githubusercontent.com/1314285/56658081-8799da00-664e-11e9-88c1-83ce4fb5843a.png">

Agree to the terms and sign in with your Microsoft Account. You should then see your endpoint URL and subscription key, just as you did for the spell check service:
<img width="1306" alt="Screen Shot 2019-04-24 at 5 15 27 AM" src="https://user-images.githubusercontent.com/1314285/56658879-96818c00-6650-11e9-9068-7fe88488e61b.png">

Make sure the key and endpoint below match:

In [1]:
# Define the Subscription key for making API calls.
secret = '4012d49a01a0486c9a3949a65b96162f'
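
Hard-coding the key in a notebook makes it easy to leak when the notebook is shared. As one possible alternative, the sketch below reads the key from an environment variable. The variable name `COMPUTER_VISION_KEY` and the helper `load_secret` are assumptions made for illustration; they are not part of the service.

```python
import os

def load_secret(env_var='COMPUTER_VISION_KEY'):
    """Return the subscription key stored in an environment variable.

    The variable name is a hypothetical choice for this notebook.
    """
    key = os.environ.get(env_var, '').strip()
    if not key:
        raise RuntimeError(
            'Set the %s environment variable to your subscription key.' % env_var)
    return key

# Usage, after exporting the variable in your shell:
# secret = load_secret()
```

Failing loudly when the variable is missing tends to be easier to debug than sending an empty key and reading the service's error response.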

<br><br><br>
### Import required libraries
### Replace uri_base with your region
### Choose parameters for API call

In [2]:
# Import required libraries for request headers and json extraction.
import http.client, urllib.request, urllib.parse, urllib.error, base64, json

# Replace the subscription_key string value with your valid subscription key.
subscription_key = secret

# Replace to match your region.
uri_base = 'westus.api.cognitive.microsoft.com'

headers = {
    # Request headers.
    'Content-Type': 'application/json',
    'Ocp-Apim-Subscription-Key': subscription_key,
}

params = urllib.parse.urlencode({
    # Request parameters. All of them are optional.
    'visualFeatures': 'Description,Color,Tags',
    'language': 'en',
})
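
To see what actually gets appended to the request path, it can help to print the encoded query string. This small check uses the same parameters as the cell above; note that `urlencode` percent-encodes the commas in `visualFeatures`.

```python
import urllib.parse

query = urllib.parse.urlencode({
    'visualFeatures': 'Description,Color,Tags',
    'language': 'en',
})
print(query)
# visualFeatures=Description%2CColor%2CTags&language=en

# parse_qs reverses the encoding, which is a quick way to verify the values.
decoded = urllib.parse.parse_qs(query)
print(decoded['visualFeatures'])
# ['Description,Color,Tags']
```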

<br><br><br>
### Set image to "body"
### Call cognitive service API
### Return JSON response

In [3]:
# Wrap the image URL in a valid JSON request body.
body = json.dumps({'url': 'https://timeincsecure-a.akamaihd.net/rtmp_uds/293884104/201703/2681/293884104_5360456295001_5360434467001-vs.jpg?pubId=293884104&videoId=5360434467001'})

try:
    # Execute the REST API call against the endpoint set in uri_base and get the response.
    conn = http.client.HTTPSConnection(uri_base)
    conn.request("POST", "/vision/v1.0/analyze?%s" % params, body, headers)
    response = conn.getresponse()
    data = response.read()

    # 'data' contains the JSON data. The following formats the JSON data for display.
    parsed = json.loads(data.decode())
    print("Response:")
    print(json.dumps(parsed, sort_keys=True, indent=2))
    conn.close()

except Exception as e:
    print('Error:')
    print(e)

##################################

Response:
{
  "color": {
    "accentColor": "048AC7",
    "dominantColorBackground": "Brown",
    "dominantColorForeground": "Brown",
    "dominantColors": [],
    "isBWImg": false,
    "isBwImg": false
  },
  "description": {
    "captions": [
      {
        "confidence": 0.26692594483388654,
        "text": "a group of colorful underwater"
      }
    ],
    "tags": [
      "nature",
      "covered",
      "colorful",
      "orange",
      "reef",
      "lot",
      "colored",
      "many",
      "sitting",
      "table",
      "surrounded",
      "painted",
      "colors",
      "fire",
      "old",
      "field",
      "water",
      "large",
      "group",
      "underwater",
      "blue",
      "display",
      "room",
      "hydrant",
      "umbrella",
      "people",
      "standing",
      "street",
      "white"
    ]
  },
  "metadata": {
    "format": "Jpeg",
    "height": 720,
    "width": 1280
  },
  "requestId": "fee785ed-3d27-4241-ad00-3791a29e082d",
  "tags": [
    {
 

<br><br><br>
The cognitive service recognizes this as a colorful, orange reef that may involve diving and fish, and describes it as "a group of colorful underwater", although the caption's own confidence is only about 27%. Look at the confidence values for each tag in the JSON.

<br><br>

### - Keywords and confidence values
### - PROBABLY a colorful, orange reef
### - MAYBE diving and fish
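
The caption and its confidence sit a few levels deep in the response. A minimal sketch of pulling them out is shown below; `best_caption` is a hypothetical helper name, and the sample dictionary is just the caption fragment of the coral reef response printed above.

```python
def best_caption(parsed):
    """Return (text, confidence) of the highest-confidence caption, or None."""
    captions = parsed.get('description', {}).get('captions', [])
    if not captions:
        return None
    top = max(captions, key=lambda c: c['confidence'])
    return top['text'], top['confidence']

# Caption fragment of the coral reef response shown above.
reef_response = {
    'description': {
        'captions': [
            {'confidence': 0.26692594483388654,
             'text': 'a group of colorful underwater'}
        ]
    }
}

text, confidence = best_caption(reef_response)
print('%s (%.1f%%)' % (text, confidence * 100))
# a group of colorful underwater (26.7%)
```

In the notebook itself, the same helper could be called on the `parsed` variable produced by the API call.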
<br><br><br><br><br><br><br>

## Second Image: Scuba Diver

### A second underwater image, of a diver this time
<br><br>
![](https://www.deeperblue.com/wp-content/uploads/2016/03/AdobeStock_62701813.jpeg)
<br><br><br>

### Set new image of diver
### Call cognitive service API
### Return JSON response

In [4]:
# Wrap the image URL in a valid JSON request body.
body = json.dumps({'url': 'https://www.deeperblue.com/wp-content/uploads/2016/03/AdobeStock_62701813.jpeg'})

try:
    # Execute the REST API call against the endpoint set in uri_base and get the response.
    conn = http.client.HTTPSConnection(uri_base)
    conn.request("POST", "/vision/v1.0/analyze?%s" % params, body, headers)
    response = conn.getresponse()
    data = response.read()

    # 'data' contains the JSON data. The following formats the JSON data for display.
    parsed = json.loads(data.decode())
    print("Response:")
    print(json.dumps(parsed, sort_keys=True, indent=2))
    conn.close()

except Exception as e:
    print('Error:')
    print(e)

####################################

Response:
{
  "color": {
    "accentColor": "016BCA",
    "dominantColorBackground": "Blue",
    "dominantColorForeground": "Blue",
    "dominantColors": [
      "Blue"
    ],
    "isBWImg": false,
    "isBwImg": false
  },
  "description": {
    "captions": [
      {
        "confidence": 0.36200742539912,
        "text": "a statue of a person in a swimming pool"
      }
    ],
    "tags": [
      "sport",
      "swimming",
      "statue",
      "water",
      "small",
      "sitting",
      "air",
      "man",
      "holding",
      "flying",
      "yellow",
      "riding",
      "boat",
      "blue"
    ]
  },
  "metadata": {
    "format": "Jpeg",
    "height": 3456,
    "width": 5184
  },
  "requestId": "051a3c75-c778-4faa-b212-b7deba67fe74",
  "tags": [
    {
      "confidence": 0.9971852898597717,
      "name": "sky"
    },
    {
      "confidence": 0.9857053160667419,
      "name": "reef"
    },
    {
      "confidence": 0.9742870926856995,
      "name": "sport"
    },
    {
   

<br><br><br>
The cognitive service has more difficulty with this image, describing it as "a statue of a person in a swimming pool". It recognizes that swimming and a sport are involved, but ocean floor, diving, and scuba all received low confidence when they should have been high, so the API did not guess this one very well. Look at the confidence values for each tag in the JSON.
<br><br>
### - This API call returned a less accurate result
### - "a statue of a person in a swimming pool"
### - Ocean floor, diving, and scuba tags all less than 2% confidence!
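
Scanning the raw JSON for confidence values gets tedious, so one option is to flatten the top-level `tags` list into sorted pairs. The helper name `tags_by_confidence` and the 0.5 cutoff are assumptions for illustration; the sample data contains only the first three tags from the diver response above, since the rest of that output is truncated.

```python
def tags_by_confidence(parsed, minimum=0.0):
    """Return (name, confidence) pairs at or above `minimum`, highest first."""
    pairs = [(t['name'], t['confidence'])
             for t in parsed.get('tags', [])
             if t['confidence'] >= minimum]
    return sorted(pairs, key=lambda p: p[1], reverse=True)

# First three tags from the diver response shown above.
diver_response = {
    'tags': [
        {'confidence': 0.9971852898597717, 'name': 'sky'},
        {'confidence': 0.9857053160667419, 'name': 'reef'},
        {'confidence': 0.9742870926856995, 'name': 'sport'},
    ]
}

# The 0.5 cutoff is an arbitrary choice, not a value recommended by the service.
for name, confidence in tags_by_confidence(diver_response, minimum=0.5):
    print('%-8s %.3f' % (name, confidence))
```

Lowering `minimum` to 0.0 on the full response would also list the tags that scored under 2%.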

<br><br><br><br><br><br>

## Third Image: Scuba Diving

### Another scuba diving image
<br><br>
![](http://www2.padi.com/blog/wp-content/uploads/2016/10/scuba-diving-reef-e1476482719834.jpg)
<br><br><br>

### Set next image of diver
### Call cognitive service API
### Return JSON response

In [None]:
# Wrap the image URL in a valid JSON request body.
body = json.dumps({'url': 'http://www2.padi.com/blog/wp-content/uploads/2016/10/scuba-diving-reef-e1476482719834.jpg'})

try:
    # Execute the REST API call against the endpoint set in uri_base and get the response.
    conn = http.client.HTTPSConnection(uri_base)
    conn.request("POST", "/vision/v1.0/analyze?%s" % params, body, headers)
    response = conn.getresponse()
    data = response.read()

    # 'data' contains the JSON data. The following formats the JSON data for display.
    parsed = json.loads(data.decode())
    print("Response:")
    print(json.dumps(parsed, sort_keys=True, indent=2))
    conn.close()

except Exception as e:
    print('Error:')
    print(e)

####################################

<br><br><br>
The results are similar to the previous image, but less accurate. There is a slightly higher confidence on 'underwater', but lower confidence on 'swimming' and 'ocean floor'. The description is "a group of people in a swimming pool". Even a service built by one of the largest software companies in the world struggles with images like these, which suggests that cognitive services are hard to build.
<br><br>
### - Even less accurate
### - "a group of people in a swimming pool"
### - Conclusion: Cognitive services are hard
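
Each image so far repeats the same request code with a different URL. A possible refactoring is sketched below: one function builds the request pieces and another sends them. The function names are hypothetical, the path and headers are the ones already used in this notebook, and the host is passed in so it can match your own endpoint.

```python
import http.client
import json
import urllib.parse

def build_analyze_request(image_url, subscription_key,
                          visual_features='Description,Color,Tags'):
    """Return (path, body, headers) for a POST to the analyze endpoint."""
    params = urllib.parse.urlencode({
        'visualFeatures': visual_features,
        'language': 'en',
    })
    path = '/vision/v1.0/analyze?%s' % params
    body = json.dumps({'url': image_url})
    headers = {
        'Content-Type': 'application/json',
        'Ocp-Apim-Subscription-Key': subscription_key,
    }
    return path, body, headers

def analyze_image(host, image_url, subscription_key):
    """Send the request to `host` and return the parsed JSON response."""
    path, body, headers = build_analyze_request(image_url, subscription_key)
    conn = http.client.HTTPSConnection(host)
    try:
        conn.request('POST', path, body, headers)
        return json.loads(conn.getresponse().read().decode())
    finally:
        # Close the connection even if the request or JSON parsing fails.
        conn.close()

# Usage with the notebook's own variables:
# parsed = analyze_image(
#     uri_base,
#     'http://www2.padi.com/blog/wp-content/uploads/2016/10/scuba-diving-reef-e1476482719834.jpg',
#     secret)
```

Keeping the request construction separate from the network call means the former can be checked without a valid key or an internet connection.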

<br><br><br><br><br><br>

## Fourth Image: Skydabbing

### Skydiving image

Skydiving is visually similar to scuba diving, and both the sky and the sea are blue, which could challenge the service's recognition. Let's see how the computer vision service does in the sky.
<br><br>

![](https://pbs.twimg.com/media/ClQSCsgUgAA_i3B.jpg)

<br><br><br>

### Set image of skydiver
### Call cognitive service API
### Return JSON response

In [5]:
# Wrap the image URL in a valid JSON request body.
body = json.dumps({'url': 'https://pbs.twimg.com/media/ClQSCsgUgAA_i3B.jpg'})

try:
    # Execute the REST API call against the endpoint set in uri_base and get the response.
    conn = http.client.HTTPSConnection(uri_base)
    conn.request("POST", "/vision/v1.0/analyze?%s" % params, body, headers)
    response = conn.getresponse()
    data = response.read()

    # 'data' contains the JSON data. The following formats the JSON data for display.
    parsed = json.loads(data.decode())
    print("Response:")
    print(json.dumps(parsed, sort_keys=True, indent=2))
    conn.close()

except Exception as e:
    print('Error:')
    print(e)

####################################

Response:
{
  "color": {
    "accentColor": "2C497D",
    "dominantColorBackground": "Grey",
    "dominantColorForeground": "Blue",
    "dominantColors": [
      "Grey",
      "Blue"
    ],
    "isBWImg": false,
    "isBwImg": false
  },
  "description": {
    "captions": [
      {
        "confidence": 0.931517660727122,
        "text": "a man flying through the air on a mountain"
      }
    ],
    "tags": [
      "outdoor",
      "air",
      "mountain",
      "flying",
      "man",
      "view",
      "jumping",
      "high",
      "snow",
      "hill",
      "plane",
      "doing",
      "airplane",
      "water",
      "riding",
      "skiing",
      "trick",
      "board",
      "city",
      "ramp"
    ]
  },
  "metadata": {
    "format": "Jpeg",
    "height": 900,
    "width": 1200
  },
  "requestId": "f39d89f5-0818-45cd-a208-cd9b440ece74",
  "tags": [
    {
      "confidence": 0.9998109936714172,
      "name": "sky"
    },
    {
      "confidence": 0.9982977509498596,
      "

<br><br><br>
The service nails the location with a high confidence on 'air', 'sky', and 'outdoor'. The addition of 'mountain' is likely due to the slanted horizon, a tough call unless you know that this is skydiving, while 'skydiving' and 'jump' are down around 10-15%. The description is "a man flying through the air on a mountain", so the service detects this as an aerial photo with high confidence. It is not versed in hip-hop dance moves, though: the dab blocks the diver's face, making it difficult to determine whether this is a man or a woman.
<br><br>
### - Service understands that the image is in the sky
### - Mistakes horizon for 'mountain'
### - 'a man flying through the air on a mountain'
### - The diver's dance moves obscure her face
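
To put the caption results side by side, the sketch below collects the caption text and confidence from the three responses printed in this notebook and labels each one against a cutoff. The 0.5 threshold and the `flag_captions` helper are assumptions for illustration only. Note that a high confidence does not make a caption correct: the skydiving caption scores highest yet still includes a mountain.

```python
# Caption text and confidence copied from the three responses shown in this notebook.
captions = [
    ('coral reef', 'a group of colorful underwater', 0.26692594483388654),
    ('scuba diver', 'a statue of a person in a swimming pool', 0.36200742539912),
    ('skydiver', 'a man flying through the air on a mountain', 0.931517660727122),
]

# Hypothetical cutoff below which a caption is treated as a weak guess.
THRESHOLD = 0.5

def flag_captions(rows, threshold=THRESHOLD):
    """Label each caption 'confident' or 'weak' based on the threshold."""
    return [(image, text, conf, 'confident' if conf >= threshold else 'weak')
            for image, text, conf in rows]

for image, text, conf, label in flag_captions(captions):
    print('%-12s %5.1f%%  %-9s %s' % (image, conf * 100, label, text))
```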

<br><br>