# Object Detection of YouTube Thumbnails
This notebook downloads the thumbnail image of every video and runs a YOLOv3 object detection model on each one to detect whether humans are present and how many there are.

In [1]:
# Import packages
import os
import cv2
import numpy as np
import pandas as pd
import urllib.request

In [2]:
# Change to your own directory
try: 
    os.chdir("C:/Users/Jiayi/Documents/GitHub/youtube_analysis")
    print("Directory changed")
except OSError:
    print("Can't change the Current Working Directory")      

Directory changed


## Download Thumbnail Images (Locally)
All 400k images were downloaded to my local computer, which took roughly 20 hours.

In [3]:
videos_df = pd.read_csv('data/videos_df.csv')

In [4]:
videos_df.shape

(408690, 10)

In [5]:
videos_df.head()

Unnamed: 0,channelId,description,publishedAt,videoId,thumbnails,videoTitle,commentCount,dislikeCount,likeCount,viewCount
0,UCjOl2AUblVmg2rA_cRgZkFg,"In this week's Top Gear, Flintoff gets his han...",2020-10-09 09:35:54,NXX338WY_Lw,https://i.ytimg.com/vi/NXX338WY_Lw/default.jpg,PREVIEW: Attempting 200mph in the Jaguar XJ220...,528.0,257.0,5819.0,184447.0
1,UCjOl2AUblVmg2rA_cRgZkFg,"From the humble new Volkswagen GTI, right down...",2020-10-09 11:21:23,dtHcdU2c71Y,https://i.ytimg.com/vi/dtHcdU2c71Y/default.jpg,Which car will win Top Gear Speed Week 2020? (...,568.0,273.0,7136.0,217619.0
2,UCjOl2AUblVmg2rA_cRgZkFg,Here's Chris Harris' take on the rocket-disgui...,2020-10-07 07:40:22,vnrtWe-RAzg,https://i.ytimg.com/vi/vnrtWe-RAzg/default.jpg,Chris Harris on... the Ferrari SF90 Stradale |...,1091.0,408.0,10189.0,437777.0
3,UCjOl2AUblVmg2rA_cRgZkFg,"16 contenders, 8,553bhp and a festival to reme...",2020-10-06 13:59:38,Ra1F0TsOCPs,https://i.ytimg.com/vi/Ra1F0TsOCPs/default.jpg,Chris Harris vs 2020’s Best Performance Cars |...,579.0,202.0,7126.0,191070.0
4,UCjOl2AUblVmg2rA_cRgZkFg,"The 986bhp Ferrari SF90 is, unsurprisingly, no...",2020-10-06 07:36:13,fXysipmTxcQ,https://i.ytimg.com/vi/fXysipmTxcQ/default.jpg,FASTEST TOP GEAR LAP? Ferrari SF90 Stiglap | T...,888.0,168.0,9697.0,572569.0


In [6]:
# Checking if there are duplicate videoId
print(videos_df['videoId'].duplicated().any())

False


In [None]:
# Downloading thumbnail images 
videocount = len(videos_df)
errors_videoId = []

if not os.path.exists('images'):
    print("Creating images folder")
    os.makedirs('images')

for index, row in videos_df.iterrows():
    url = row['thumbnails']
    filename = 'images/' + row['videoId'] + '.jpg'
    
    if os.path.exists(filename):
        #print("[" + str(index+1) + "/" + str(videocount) + "] " + filename + " already exists")
        continue
    else:
        try:
            urllib.request.urlretrieve(url, filename)
        except Exception as e:
            #print("[" + str(index+1) + "/" + str(videocount) + "] " + str(e))
            errors_videoId.append(row['videoId'])
            continue
        print("[" + str(index+1) + "/" + str(videocount) + "] " + filename + " saved")

In [None]:
errors_videoId

In [None]:
len(errors_videoId)
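
Most of the 20 hours in the sequential loop above is presumably spent waiting on the network, so the downloads could probably be overlapped with a thread pool. The following is a minimal sketch, not the method used for this notebook: `download_all`, its `fetch` parameter and `max_workers=16` are hypothetical choices, and `fetch` defaults to the same `urllib.request.urlretrieve` call used above.

```python
import os
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def download_all(jobs, fetch=urllib.request.urlretrieve, max_workers=16):
    """Download (video_id, url, filename) jobs concurrently.

    Returns the list of video ids whose download raised an exception.
    `fetch` is any callable taking (url, filename); it is a parameter
    only so the sketch can be tried without network access.
    """
    def worker(job):
        video_id, url, filename = job
        if os.path.exists(filename):  # Same skip rule as the loop above
            return None
        try:
            fetch(url, filename)
            return None
        except Exception:
            return video_id  # Report the failure instead of stopping

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(worker, jobs))
    return [video_id for video_id in results if video_id is not None]
```

With the DataFrame from this notebook, the jobs could be built as `[(r.videoId, r.thumbnails, 'images/' + r.videoId + '.jpg') for r in videos_df.itertuples()]`.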

## Running Object Detection Model (Locally)
Running object detection locally is only recommended on a machine with high computational specs. The full run was estimated to take at least 8 days.

In [None]:
#Load YOLO
net = cv2.dnn.readNet("models/object_detection/yolov3.weights","models/object_detection/yolov3.cfg")
classes = []
with open("models/object_detection/coco.names","r") as f:
    classes = [line.strip() for line in f.readlines()]

In [None]:
layer_names = net.getLayerNames()
# Flatten: older OpenCV returns an Nx1 array here, newer releases a flat one
outputlayers = [layer_names[i - 1] for i in np.array(net.getUnconnectedOutLayers()).flatten()]

In [None]:
imagecount = 0
rows = []

for filename in os.listdir('images'):
    if filename.endswith(".jpg"): # Omit non-images files
        imagecount += 1
        itemcount = 0
        itemdict = {}
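        img = cv2.imread("images/" + filename)
        print("Image [" + str(imagecount) + "]: " + filename)
        if img is None: # Skip files that OpenCV could not decode
            continue
        
        # Scale pixels by 1/255, resize to 416x416, and swap BGR to RGB
        blob = cv2.dnn.blobFromImage(img,0.00392,(416,416),(0,0,0),True,crop=False)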
        net.setInput(blob)

        # Detecting Objects
        outs = net.forward(outputlayers)
        # Saving item detected
        for out in outs:
            for detection in out:
                scores = detection[5:] # Class scores follow the 4 box values and the objectness score
                class_id = int(np.argmax(scores)) # Plain int keeps itemDict readable once saved to CSV
                confidence = scores[class_id]
                if confidence > 0.5:
                    itemcount += 1
                    
                    itemdict[class_id] = itemdict.get(class_id, 0) + 1
                    
        rows.append([filename[:-4], itemcount, itemdict])
        print("Item Count: " + str(itemcount) + ", Item Dict: " + str(itemdict))
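
Note that the loop above counts every raw prediction whose class score exceeds 0.5. YOLOv3 typically emits several overlapping boxes for the same object, so `itemCount` may overstate the number of objects. A common remedy is non-maximum suppression (NMS), which the loop above does not apply. A minimal pure-Python sketch follows; `iou`, `nms_count` and the 0.4 overlap threshold are hypothetical, and boxes are assumed to be `(cx, cy, w, h)` tuples as in the first four values of each detection.

```python
def iou(a, b):
    """Intersection over union of two (cx, cy, w, h) boxes."""
    ax1, ax2 = a[0] - a[2] / 2, a[0] + a[2] / 2
    ay1, ay2 = a[1] - a[3] / 2, a[1] + a[3] / 2
    bx1, bx2 = b[0] - b[2] / 2, b[0] + b[2] / 2
    by1, by2 = b[1] - b[3] / 2, b[1] + b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def nms_count(boxes, confidences, iou_threshold=0.4):
    """Count the boxes left after greedy non-maximum suppression."""
    # Visit boxes from most to least confident
    order = sorted(range(len(boxes)), key=lambda i: confidences[i], reverse=True)
    kept = []
    for i in order:
        # Keep a box only if it does not overlap a box that was already kept
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in kept):
            kept.append(i)
    return len(kept)
```

In the loop above, this would mean collecting `detection[:4]` and `confidence` per class and calling `nms_count` once for each class. OpenCV also provides `cv2.dnn.NMSBoxes` for the same purpose.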

In [None]:
object_detection_df = pd.DataFrame(rows, columns=["videoId", "itemCount", "itemDict"])

In [None]:
object_detection_df.to_csv('data/object_detection_df.csv', index=False)
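
The stated aim is to count humans, which in this local pipeline is held inside `itemDict` rather than in its own column. In the standard `coco.names` file the first entry is `person`, so its class id should be 0. A small sketch under that assumption (the names `PERSON_CLASS_ID`, `human_count` and `human_presence` are hypothetical):

```python
PERSON_CLASS_ID = 0  # Assumption: "person" is the first line of coco.names


def human_count(itemdict, person_id=PERSON_CLASS_ID):
    """Return how many detections in one image were classified as a person."""
    return itemdict.get(person_id, 0)


def human_presence(itemdict, person_id=PERSON_CLASS_ID):
    """Return 1 if at least one person was detected, else 0."""
    return 1 if human_count(itemdict, person_id) > 0 else 0
```

Applying `object_detection_df['itemDict'].apply(human_count)` before the DataFrame is written to CSV would give a per-video count; after a round trip through CSV the dictionaries come back as strings.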

## Upload Thumbnail Images (AWS)

This step could have been simplified by writing a script that copies the thumbnails from their URLs directly into AWS S3.

In this case, to upload the locally stored thumbnail images to AWS S3, first install the AWS CLI on your computer and confirm the installation by running `aws --version` in your command prompt. Enter your AWS credentials by running `aws configure`, then check that they have been saved correctly by running `aws configure list`.

Once that is set up, navigate to the folder of images. In my case, the command is `cd [path]/youtube_analysis/images`. I created an access point on AWS S3, which lets me upload all images in the folder by running `aws s3 cp . s3://[amazon resource name (ARN)]:accesspoint/[accesspoint name]/[bucket folder] --recursive`.

The upload took about 6-8 hours. Faster methods such as s3-parallel-put could be explored as an alternative.
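
The direct URL-to-S3 transfer mentioned at the start of this section might look roughly like the sketch below. It streams each thumbnail through memory instead of saving it locally first. `copy_thumbnail_to_s3` and its `opener` parameter are hypothetical; `upload_fileobj` is the boto3 S3 client method, and the client is passed in so that credentials and region stay with the caller.

```python
import urllib.request
from io import BytesIO


def copy_thumbnail_to_s3(s3_client, url, bucket, key, opener=urllib.request.urlopen):
    """Fetch one image over HTTP and upload it to s3://bucket/key."""
    with opener(url) as response:
        # Thumbnails are small, so buffering the whole file in memory is fine
        payload = BytesIO(response.read())
    s3_client.upload_fileobj(Fileobj=payload, Bucket=bucket, Key=key)
    return key
```

With `s3 = boto3.client('s3')`, calling this once per row of `videos_df` (passing `row['thumbnails']` as the URL and a key built from `row['videoId']`) would replace both the local download and the CLI upload. The bucket name and key prefix are left to the caller.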

## Running Object Detection Model (AWS)
On an ml.m5.24xlarge notebook instance, the run took roughly 36 hours.

In [None]:
# Import Packages
import json
import math
import os
import shutil
import subprocess as sb
import tarfile
from io import BytesIO
import csv

import boto3
import gluoncv # !pip install gluoncv
from gluoncv import model_zoo, data, utils
from matplotlib import pyplot as plt
import mxnet as mx
from mxnet import gluon, image, nd
import pandas as pd # Needed by the resume cell further down

import sagemaker
from sagemaker import get_execution_role
from sagemaker.mxnet.model import MXNetModel

In [None]:
# Define Bucket
s3_bucket = 'sagemaker-us-east-2-281536989307'
print('Using bucket: ' + s3_bucket)

In [None]:
# Download yolov3 model for use
model_name = 'yolo3_darknet53_coco'
net = model_zoo.get_model(model_name, pretrained=True)

In [None]:
# Reset the detector to the "person" class only
# By default the model will have 80 classes from coco.names in net.classes but they are not needed
classes = ['person']
net.reset_class(classes=classes, reuse_weights=classes)
print('New classes: ', net.classes)

In [None]:
net.hybridize()  # Hybridize to optimize computation

In [None]:
# Initiate Paginator to iterate through files in S3 Folder
image_folder = 'A0185610J/images'

s3 = boto3.client('s3')
paginator = s3.get_paginator('list_objects_v2')
pages = paginator.paginate(Bucket=s3_bucket, Prefix=image_folder)

errors = [] # To note if errors occur

# Adding humanCount Column
with open('data/object_detection_df.csv', "w", newline='\n') as csv_file:
    writer = csv.writer(csv_file, delimiter=',')
    writer.writerow(['videoId', 'humanCount'])

    for page in pages:
        for obj in page.get('Contents', []): # 'Contents' is absent when a page has no objects
            image_path = obj['Key']
            if image_path.endswith('.jpg'): # Omit non-image files and the folder key itself
                s3_key = image_path
                videoId = str(image_path.split('/')[-1].split('.')[0])

                try:
                    with BytesIO() as f:
                        # Reuse the client created above instead of building a new one per image
                        s3.download_fileobj(Bucket=s3_bucket, Key=s3_key, Fileobj=f)
                        f.seek(0)
                        img = plt.imread(f, format='jpg')
                        #plt.imshow(img) #to show the image

                        im_array_mx = mx.ndarray.array(img)
                        x, orig_img = data.transforms.presets.yolo.transform_test(im_array_mx)

                        box_ids, scores, bboxes = net(x)

                        # Convert to a numpy array and remove undetected rows (default -1)
                        scores = scores[0].asnumpy()
                        scores = scores[(scores != -1)]

                        humanCount = str(len([e for e in scores if e > 0.2])) # Threshold of 0.2 probability

                        writer.writerow([videoId, humanCount]) # Write one CSV row per image
                except Exception: # Record the failure and move on to the next image
                    errors.append(videoId)
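
The counting step can be tried in isolation. As the comment in the cell above notes, undetected rows carry a score of -1, so a single threshold comparison already excludes them. The sketch below uses a made-up score array in place of `scores[0].asnumpy()`; `count_humans` is a hypothetical helper.

```python
import numpy as np


def count_humans(scores, threshold=0.2):
    """Count detections whose score exceeds the threshold."""
    scores = np.asarray(scores).reshape(-1)  # Flatten the (N, 1) score column
    return int((scores > threshold).sum())


# Made-up scores shaped like one image's output: two confident hits,
# one weak hit, and two empty slots marked with -1
example = np.array([[0.93], [0.41], [0.05], [-1.0], [-1.0]])
print(count_humans(example))  # prints 2
```

Returning an `int` rather than a string also keeps the value usable for numeric comparisons before it is written out.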

In [None]:
# If the session disconnects, uncomment and run this cell instead of the one above to resume where the run stopped

# # Initiate Paginator to iterate through files in S3 Folder
# image_folder = 'A0185610J/images'

# s3 = boto3.client('s3')
# paginator = s3.get_paginator('list_objects_v2')
# pages = paginator.paginate(Bucket=s3_bucket, Prefix=image_folder)

# errors = [] # To note if errors occur
# df = pd.read_csv('data/object_detection_df.csv') # continuation
# savedVideoId = df['videoId'].values # continuation

# with open('data/object_detection_df.csv', "a", newline='\n') as csv_file: # continuation
#     writer = csv.writer(csv_file, delimiter=',')
#     #writer.writerow(['videoId', 'humanCount']) # continuation

#     for page in pages:
#         for obj in page['Contents']:
#             image_path = obj['Key']
#             if image_path.endswith('.jpg'): # Omit non-images files and the main file name
#                 videoId = str(image_path.split('/')[-1].split('.')[0])
                
#                 if videoId in savedVideoId: # skip processed videos, continuation
#                     continue
                
#                 try:
#                     with BytesIO() as f:
#                         boto3.client("s3").download_fileobj(Bucket=s3_bucket, Key=image_path, Fileobj=f)
#                         f.seek(0)
#                         img = plt.imread(f, format='jpg')
#                         #plt.imshow(img) #to show the image

#                         im_array_mx = mx.ndarray.array(img)
#                         x, orig_img = data.transforms.presets.yolo.transform_test(im_array_mx)

#                         box_ids, scores, bboxes = net(x)

#                         # Convert to numpy array and removing undetected rows (default -1)
#                         scores = scores[0].asnumpy()
#                         scores = scores[(scores != -1)]

#                         humanCount = str(len([e for e in scores if e > 0.2])) #Threshold of 0.2 probability

#                         writer.writerow([videoId, humanCount]) # Input in csv file format
#                         f.close()
#                 except:
#                     errors.append(videoId)
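
The resume cell checks `videoId in savedVideoId` against a NumPy array, which scans the whole array on every lookup. With roughly 400k images, loading the processed ids into a set is likely to be noticeably faster. A sketch using only the `csv` module, where `load_processed_ids` is a hypothetical helper:

```python
import csv
import os


def load_processed_ids(csv_path):
    """Return the set of videoIds already written to the results CSV."""
    if not os.path.exists(csv_path):
        return set()  # Nothing has been processed yet
    with open(csv_path, newline='') as csv_file:
        reader = csv.DictReader(csv_file)  # Uses the header row as field names
        return {row['videoId'] for row in reader}
```

In the resume cell, `savedVideoId = load_processed_ids('data/object_detection_df.csv')` could stand in for the two pandas lines, leaving the `if videoId in savedVideoId` check unchanged.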

In [7]:
# Adding humanPresence Column
object_detection_df = pd.read_csv('data/object_detection_df.csv')
object_detection_df['humanPresence'] = np.where(object_detection_df['humanCount'] > 0, 1, 0)

In [8]:
object_detection_df.head()

Unnamed: 0,videoId,humanCount,humanPresence
0,---E9WQc-78,2,1
1,---M5RE8nJo,1,1
2,--0xL_wWq_0,2,1
3,--19fKTapfY,2,1
4,--2dRWL5rjE,0,0


In [9]:
object_detection_df.to_csv('data/object_detection_df.csv', index=False)