### Some work with Image APIs

So far, we have spent time looking at the ways in which "unstructured data" like text might be broken into a series of descriptors -- from "likes" to words in a post to the sentiment or some other characterization of a sentence. We now move up one level in complexity and consider images. 
<br><br>

<img src="https://github.com/computationaljournalism/columbia2018/raw/master/images/abc.jpg" style="width: 65%; border: #000000 1px outset;">
<br><br>

Recall that we can display images in the Jupyter notebook using a class `Image` from the `IPython` package. Here we look at three pictures of, well, my dog.

In [None]:
from IPython.display import Image

In [None]:
url1 = "https://github.com/computationaljournalism/columbia2018/raw/master/images/IMG_0254.jpg"
url2 = "https://github.com/computationaljournalism/columbia2018/raw/master/images/IMG_4627.jpg"
url3 = "https://github.com/computationaljournalism/columbia2018/raw/master/images/IMG_3781.jpg"

In [None]:
Image(url=url1,width=300)

What do you see when you look at this image? A sign that, for a moment at least, spring had arrived in NYC? The stoic look of my dog? The fact that her leash leads back to me, the person forcing her to hold still for the camera? Maybe you see the dry leaves or the barren tree limbs -- soul-crushing reminders of a winter that won't let go... but I digress.

Let's try out some tools for pulling information from images. We will use a commodity API from [Google called Cloud Vision](https://cloud.google.com/vision). To use it, we need to update an obscure package I won't spend time on and then install the `google-cloud-vision` package.
<br><br>

<img src="https://github.com/computationaljournalism/columbia2018/raw/master/images/cv.jpg" style="width: 75%; border: #000000 1px outset;">
<br><br>


In [None]:
%%sh
pip install protobuf --upgrade

In [None]:
%%sh
pip install --upgrade google-cloud-vision

You will need to sign up for an API key. For this application, the key is a JSON file. You will create a login for Google's Cloud Services (again, they want a credit card but promise they won't charge you without asking -- you get 1,000 requests for free). You download the credentials JSON file and store it in the same folder as this notebook.

We then create an "environment variable" that points to the location of this file.

In [None]:
from os import environ
environ["GOOGLE_APPLICATION_CREDENTIALS"]="My Project-f9004cae62d3.json"
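
A misnamed or misplaced credentials file tends to show up only later, as an authentication error. As a minimal sketch, the helper below checks that the file exists before setting the variable. The name `point_to_credentials()` is hypothetical and is not part of any Google package.

```python
from os import environ
from pathlib import Path

def point_to_credentials(json_path):
    """Set GOOGLE_APPLICATION_CREDENTIALS after checking the file exists.

    Hypothetical helper for illustration; not part of any Google package.
    """
    path = Path(json_path)
    if not path.is_file():
        raise FileNotFoundError(f"No credentials file at {path}")
    # Store the absolute path so the setting survives a change of directory
    full_path = path.resolve()
    environ["GOOGLE_APPLICATION_CREDENTIALS"] = str(full_path)
    return full_path
```

With the file from the cell above, the call would be `point_to_credentials("My Project-f9004cae62d3.json")`.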

To use the API, we will need to import two objects. One will let us make calls to the API and the other will structure image data in a way that can be analyzed by Google's algorithms. 

In [None]:
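# Imports the Google Cloud client library
# (the `types` module exists in google-cloud-vision releases before 2.0;
# later releases expose the same class directly as `vision.Image`)
from google.cloud import vision
from google.cloud.vision import types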

Now we create our API client...

In [None]:
client = vision.ImageAnnotatorClient()

So, now let's use the `requests` package to `get()` our image url, use the data to create an `Image` type (this time Google's `Image` object) and then apply the `label_detection()` algorithm to annotate the image.

Let's have a look at the labels it finds.

In [None]:
from requests import get

response = get(url1)
image = types.Image(content=response.content)

# Performs label detection on the image file
from_google = client.label_detection(image=image)
labels = from_google.label_annotations

In [None]:
labels
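
Each label annotation carries, among other fields, a `description` and a `score`. The sketch below reduces a list of annotations to sorted `(description, score)` pairs. The function name `summarize_labels()` and the stand-in data are made up for illustration, so the example runs without calling the API.

```python
# Standard-library module; unrelated to google.cloud.vision.types
from types import SimpleNamespace

def summarize_labels(labels, min_score=0.0):
    """Return (description, score) pairs, most confident first."""
    pairs = [
        (label.description, round(label.score, 2))
        for label in labels
        if label.score >= min_score
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)

# Made-up stand-ins that mimic the two attributes we read
fake_labels = [
    SimpleNamespace(description="Dog", score=0.97),
    SimpleNamespace(description="Leash", score=0.62),
    SimpleNamespace(description="Tree", score=0.81),
]

for description, score in summarize_labels(fake_labels, min_score=0.7):
    print(description, score)
```

In the notebook you would pass the real `labels` object in place of `fake_labels`.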

Here is the same process for a file we have downloaded and placed in the same folder as this notebook. If it's not in the same folder, we expand `file_name` to include the full path. We read the file's content and then proceed exactly as we did with `requests`.

In [None]:
# Download an image and put it in the same folder as this notebook
file_name = "IMG_0254.jpg"

# Read the raw bytes; the with block closes the file when we are done
with open(file_name,"rb") as image_file:
    image_content = image_file.read()

image = types.Image(content=image_content)

# Performs label detection on the image file
from_google = client.label_detection(image=image)
labels = from_google.label_annotations

In [None]:
labels

**A story on Twitter**

Miguel Diaz-Canel is trending on Twitter (why?). Let's have a look at the images that are associated with him.

In [None]:
CONSUMER_KEY = ""
CONSUMER_SECRET = ""
ACCESS_TOKEN = ""
ACCESS_TOKEN_SECRET = ""

from tweepy import OAuthHandler, API

# setup the authentication
auth = OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)

# create an object we will use to communicate with the Twitter API
api = API(auth)

Now search Twitter for his name and store any URLs pointing to an image in a list called `image_urls`.

In [None]:
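# The standard search endpoint returns at most 100 tweets per request,
# so asking for more than that has no effect
tweets = api.search("Miguel Diaz-Canel",count=100)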

image_urls = []

for tweet in tweets:
    if "media" in tweet.entities:
        for m in tweet.entities["media"]:
            image_urls.append(m["media_url"])

In [None]:
image_urls
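
Retweets often carry the same picture, so `image_urls` probably contains repeats, and each repeat would cost an API request later. A minimal sketch for dropping them while keeping the original order follows. The helper name `unique_in_order()` and the example URLs are made up.

```python
def unique_in_order(urls):
    """Drop repeated URLs while keeping the first occurrence of each."""
    # dict keys are unique and remember insertion order
    return list(dict.fromkeys(urls))

# Made-up URLs for illustration
example_urls = [
    "http://example.com/a.jpg",
    "http://example.com/b.jpg",
    "http://example.com/a.jpg",
]

print(unique_in_order(example_urls))
```

Applied to the list above, this would be `image_urls = unique_in_order(image_urls)`.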

Here we look at the label annotations for each of these images...

In [None]:
from requests import get

# Imports the Google Cloud client library
from google.cloud import vision
from google.cloud.vision import types

from os import environ
environ["GOOGLE_APPLICATION_CREDENTIALS"]="My Project-f9004cae62d3.json"

client = vision.ImageAnnotatorClient()

for url in image_urls:

    response = get(url)
    image = types.Image(content=response.content)

    # Performs label detection on the image file
    from_google = client.label_detection(image=image)
    labels = from_google.label_annotations
    
    print(url)
    print(labels)
    print("--"*10)
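
Printing every annotation gets long quickly. One way to see what the images have in common is to count how many images each label appears in. This is a sketch under assumptions: `tally_labels()` is a hypothetical helper, and the label lists are invented rather than real API output.

```python
from collections import Counter

def tally_labels(descriptions_per_image):
    """Count the number of images in which each label description appears."""
    counts = Counter()
    for descriptions in descriptions_per_image:
        # set() makes a label count once per image, even if it is repeated
        counts.update(set(descriptions))
    return counts

# Invented label descriptions for three images
example = [
    ["Suit", "Event"],
    ["Suit", "Flag"],
    ["Crowd", "Event", "Suit"],
]

print(tally_labels(example).most_common(2))
```

Inside the loop above, you could collect `[label.description for label in labels]` for each URL and pass the resulting list of lists to `tally_labels()`.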

... or better yet, have the web lend us a hand: identify where this image has occurred, name who it's of and so on.

In [None]:
for url in image_urls:

    response = get(url)
    image = types.Image(content=response.content)

    # Performs web detection on the image file
    from_google = client.web_detection(image=image)
    web = from_google.web_detection
    
    print(url)
    print(web)
    print("--"*10)
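
The web detection result is a nested object. Among its fields are `web_entities` (each with a `description` and a `score`) and `pages_with_matching_images` (each with a `url`). The sketch below pulls out just those two pieces. The helper `summarize_web_detection()` and the stand-in data are hypothetical, so it runs without the API.

```python
# Standard-library module; unrelated to google.cloud.vision.types
from types import SimpleNamespace

def summarize_web_detection(web, top_n=3):
    """Return the top entity descriptions and the pages showing the image."""
    entities = sorted(web.web_entities, key=lambda e: e.score, reverse=True)
    return {
        # Some entities come back without a description; skip those
        "entities": [e.description for e in entities[:top_n] if e.description],
        "pages": [page.url for page in web.pages_with_matching_images],
    }

# Made-up stand-in that mimics the fields we read
fake_web = SimpleNamespace(
    web_entities=[
        SimpleNamespace(description="Press conference", score=0.4),
        SimpleNamespace(description="Cuba", score=0.9),
        SimpleNamespace(description="", score=0.7),
    ],
    pages_with_matching_images=[
        SimpleNamespace(url="http://example.com/story.html"),
    ],
)

print(summarize_web_detection(fake_web))
```

In the loop above, the object to pass in is `from_google.web_detection`.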