In [1]:
import os
import io

from IPython.display import Video
from google.cloud import videointelligence_v1 as videointelligence

In [2]:
os.environ["PROJECT_ID"] = "text-analysis-323506"

### Create a new Google Cloud service account to access the APIs programmatically.

__Don't run the following cell if you have already set up a cloud service account and have its associated key__.

The following commands create a Google Cloud service account, which is required to access the APIs. They should be run only once. On the first run, they create a service account whose details can be found in the IAM & Admin section of the GCP console.

After that, they create a key for the newly created account. The key is stored in the /home/jupyter directory. Download this key and upload it the next time this notebook is run. __The following cell should not be run more than once, unless you want to create a new cloud service account__.

In [None]:
%%bash

# Run these only once. They set up a cloud service account and create a key for calling the APIs
gcloud iam service-accounts create my-api-sa --display-name "api account"
gcloud iam service-accounts keys create ~/key.json --iam-account my-api-sa@${PROJECT_ID}.iam.gserviceaccount.com

The following cell stores the path to the JSON key file in an environment variable. This is required to access the API endpoints from Python.

In [3]:
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = os.path.abspath("/home/jupyter/key.json")
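
If the key file is missing or is not a service account key, the failure only shows up later when the client is created. The sketch below is one possible way to check the file before setting the variable. The helper name `set_credentials` is hypothetical and not part of any Google library; it only assumes that the key JSON contains `"type": "service_account"`.

```python
import json
import os


def set_credentials(key_path):
    """Validate a service account key file, then export its path.

    Hypothetical helper for this notebook, not a Google Cloud API.
    """
    key_path = os.path.abspath(os.path.expanduser(key_path))
    if not os.path.isfile(key_path):
        raise FileNotFoundError(f"Service account key not found: {key_path}")

    with open(key_path) as f:
        key = json.load(f)

    # Assumption: a service account key JSON carries "type": "service_account".
    if key.get("type") != "service_account":
        raise ValueError(f"{key_path} does not look like a service account key")

    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = key_path
    return key_path
```

In this notebook it would be called as `set_credentials("/home/jupyter/key.json")` in place of the direct assignment.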

#### Create a Google Cloud Video Intelligence API client object

In [4]:
video_client = videointelligence.VideoIntelligenceServiceClient()

### Important: 
#### Enable Cloud Video Intelligence API and Cloud Text-to-Speech API for this project in Google Cloud Platform console.

## Label Detection

Detects the entities it can recognize in the input video and returns a label for each. For each label, the response also reports which part of the video it was detected in.

For this demo I'm using a 15-second animated video sample I found on the internet. It can be found in the 'data' folder in the same directory as this notebook.

In [15]:
video_file_path = './data/animated_sample.mp4'

In [16]:
Video(video_file_path)

In [17]:
with io.open(video_file_path, "rb") as f:
    input_content = f.read()

In [18]:
# create request
operation = video_client.annotate_video(
    request={
        "features": [videointelligence.Feature.LABEL_DETECTION],
        "input_content": input_content
    }
)

In [19]:
response = operation.result(timeout=300)

#### Major Labels returned by API

In [20]:
for annotations in response.annotation_results:
    for shot in annotations.shot_label_annotations:
        print(shot.entity.description)

rabbit
animation
nature
wildlife
animal
grass
hare
sky


#### Dumping response to a text file

In [21]:
with open('./results/label_detection.txt', 'w') as fstream:
    fstream.write(str(response))

The Label Detection endpoint returns far more information than what I printed on screen. Find it in label_detection.txt under the 'results' folder.
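
As a rough illustration of that extra information, the sketch below collects the time segment and confidence for each shot label. It assumes the response shape of the current Python client, where each label has `segments` whose `segment.start_time_offset` and `segment.end_time_offset` are `datetime.timedelta` values. The helper `label_rows` and the `demo` object are hypothetical: `demo` is a stand-in with made-up values so the code runs without calling the API.

```python
from datetime import timedelta
from types import SimpleNamespace


def label_rows(response):
    """Return (label, start_s, end_s, confidence) for every shot label segment."""
    rows = []
    for result in response.annotation_results:
        for label in result.shot_label_annotations:
            for seg in label.segments:
                rows.append((
                    label.entity.description,
                    seg.segment.start_time_offset.total_seconds(),
                    seg.segment.end_time_offset.total_seconds(),
                    seg.confidence,
                ))
    return rows


# Stand-in response with made-up values, only to show the expected shape.
demo = SimpleNamespace(annotation_results=[SimpleNamespace(
    shot_label_annotations=[SimpleNamespace(
        entity=SimpleNamespace(description="rabbit"),
        segments=[SimpleNamespace(
            segment=SimpleNamespace(
                start_time_offset=timedelta(seconds=0),
                end_time_offset=timedelta(seconds=4, milliseconds=500),
            ),
            confidence=0.9,
        )],
    )],
)])

for description, start, end, confidence in label_rows(demo):
    print(f"{description}: {start:.1f}s to {end:.1f}s (confidence {confidence:.2f})")
```

With the real `response` from the cell above, `label_rows(response)` should work the same way if the assumed shape holds.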

## Logo Detection

Detects logos in the input video.

For this demo I'm using an advertisement video by Pepsi.

In [53]:
video_file_path = './data/pepsi_advertisement.mp4'

In [54]:
Video(video_file_path)

In [55]:
with io.open(video_file_path, "rb") as f:
    input_content = f.read()

In [56]:
# create request
operation = video_client.annotate_video(
    request={
        "features": [videointelligence.Feature.LOGO_RECOGNITION],
        "input_content": input_content
    }
)

In [57]:
response = operation.result(timeout=300)

#### Logos returned by API

In [60]:
for annotations in response.annotation_results:
    for logo in annotations.logo_recognition_annotations:
        print(f"Logo: {logo.entity.description}, Confidence: {logo.tracks[0].confidence}")

Logo: Pepsi, Confidence: 0.9548713564872742
Logo: UEFA, Confidence: 0.9070633053779602
Logo: Renault Samsung Motors, Confidence: 0.8907700777053833
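
The loop above reads only `tracks[0]`, which raises an `IndexError` if a logo has no tracks and ignores any later tracks. A minimal sketch of an alternative is shown below: it keeps the highest track confidence per logo. The helper `logo_confidences` is hypothetical, and `demo` is a stand-in with made-up names and values so the code runs without the API.

```python
from types import SimpleNamespace


def logo_confidences(response):
    """Map each logo description to its highest track confidence."""
    summary = {}
    for result in response.annotation_results:
        for logo in result.logo_recognition_annotations:
            if not logo.tracks:
                # Nothing to score for this logo, so skip it.
                continue
            best = max(track.confidence for track in logo.tracks)
            name = logo.entity.description
            summary[name] = max(best, summary.get(name, 0.0))
    return summary


# Stand-in response with made-up values, only to show the expected shape.
demo = SimpleNamespace(annotation_results=[SimpleNamespace(
    logo_recognition_annotations=[
        SimpleNamespace(
            entity=SimpleNamespace(description="Example Brand"),
            tracks=[SimpleNamespace(confidence=0.6), SimpleNamespace(confidence=0.9)],
        ),
        SimpleNamespace(
            entity=SimpleNamespace(description="Untracked Brand"),
            tracks=[],
        ),
    ],
)])

for name, confidence in logo_confidences(demo).items():
    print(f"Logo: {name}, Confidence: {confidence}")
```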


#### Dumping response to a text file

In [61]:
with open('./results/logo_detection.txt', 'w') as fstream:
    fstream.write(str(response))

## Face Detection

Detects faces in the input video.

For this demo I'm using the same video I used for logo detection (the Pepsi advertisement video).

In [62]:
video_file_path = './data/pepsi_advertisement.mp4'

In [63]:
Video(video_file_path)

In [64]:
with io.open(video_file_path, "rb") as f:
    input_content = f.read()

In [66]:
# Configure the request
config = videointelligence.FaceDetectionConfig(
    include_bounding_boxes=True, include_attributes=True
)

In [67]:
context = videointelligence.VideoContext(face_detection_config=config)

In [68]:
# create request
operation = video_client.annotate_video(
    request={
        "features": [videointelligence.Feature.FACE_DETECTION],
        "input_content": input_content,
        "video_context": context,
    }
)

In [69]:
response = operation.result(timeout=300)

#### Dumping response to a text file

In [71]:
with open('./results/face_detection.txt', 'w') as fstream:
    fstream.write(str(response))

The response is too big to be displayed. Find it in results/face_detection.txt. The API returns a bounding box for each face found, along with additional information such as the emotion of the detected face and whether the face is looking at the camera.
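
Rather than reading the whole dump, a few fields can be pulled out per face. The sketch below assumes the response shape of the current Python client: `face_detection_annotations` with `tracks`, each track holding `timestamped_objects` that carry a `normalized_bounding_box` (`left`, `top`, `right`, `bottom`) and a list of `attributes` (`name`, `confidence`). The helper `face_rows` is hypothetical, and `demo` is a stand-in with made-up values, including the attribute name.

```python
from datetime import timedelta
from types import SimpleNamespace


def face_rows(response):
    """Summarize the first timestamped object of every face track."""
    rows = []
    for result in response.annotation_results:
        for face in result.face_detection_annotations:
            for track in face.tracks:
                if not track.timestamped_objects:
                    continue
                first = track.timestamped_objects[0]
                box = first.normalized_bounding_box
                rows.append({
                    "start": track.segment.start_time_offset.total_seconds(),
                    "box": (box.left, box.top, box.right, box.bottom),
                    "attributes": {a.name: a.confidence for a in first.attributes},
                })
    return rows


# Stand-in response with made-up values, only to show the expected shape.
demo = SimpleNamespace(annotation_results=[SimpleNamespace(
    face_detection_annotations=[SimpleNamespace(tracks=[SimpleNamespace(
        segment=SimpleNamespace(start_time_offset=timedelta(seconds=1, milliseconds=500)),
        timestamped_objects=[SimpleNamespace(
            normalized_bounding_box=SimpleNamespace(left=0.1, top=0.2, right=0.3, bottom=0.4),
            attributes=[SimpleNamespace(name="looking_at_camera", confidence=0.8)],
        )],
    )])],
)])

for row in face_rows(demo):
    print(row)
```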