<a href="https://colab.research.google.com/github/lakhanrajpatlolla/aiml-learning/blob/master/U4W20_70_Keras_Video_Processing_Screen_Time_C.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# Advanced Certification in AIML
## A Program by IIIT-H and TalentSprint


## Learning Objective

At the end of the experiment, you will be able to:

* calculate the screen time of a character from a given video using deep learning

In [None]:
#@title Experiment Walkthrough Video

from IPython.display import HTML

HTML("""<video width="800" height="400" controls>
  <source src="https://cdn.talentsprint.com/talentsprint1/archives/sc/aiml/keras_video_processing_screen_time.mp4" type="video/mp4">
</video>
""")

## Dataset

### History

The screen time of an actor or character in a movie or an episode matters: many actors are paid according to their total screen time, and viewers often want to know how long a favorite character appears on screen. So how can you calculate the total screen time of an actor? One practical approach is to use deep learning.

### Description

We will use a video clip from the **`Tom and Jerry`** cartoon series to train the model. The downloaded data is a video, which is simply a sequence of images. These images are called frames and can be combined to reproduce the original video. A problem involving video data is therefore not very different from an image classification or object detection problem; there is just one extra step of extracting frames from the video.

The model will be evaluated (tested) on another **`Tom and Jerry`** video.

### Setup Steps:

In [1]:
#@title Please enter your registration id to start: { run: "auto", display-mode: "form" }
Id = "2418775" #@param {type:"string"}

In [2]:
#@title Please enter your password (normally your phone number) to continue: { run: "auto", display-mode: "form" }
password = "9959000490" #@param {type:"string"}

In [3]:
#@title Run this cell to complete the setup for this Notebook
from IPython import get_ipython
import re
ipython = get_ipython()

notebook= "U4W20_70_Keras_Video_Processing_Screen_Time_C" #name of the notebook

def setup():
#  ipython.magic("sx pip3 install torch")
    from IPython.display import HTML, display
    ipython.magic("sx wget -qq https://cdn.iiith.talentsprint.com/aiml/Experiment_related_data/Video_Processing/Train_data.zip")
    ipython.magic("sx unzip --q Train_data.zip")
    ipython.magic("sx wget -qq https://cdn.iiith.talentsprint.com/aiml/Experiment_related_data/Video_Processing/Test_data.zip")
    ipython.magic("sx unzip --q Test_data.zip")
    display(HTML('<script src="https://dashboard.talentsprint.com/aiml/record_ip.html?traineeId={0}&recordId={1}"></script>'.format(getId(),submission_id)))
    print("Setup completed successfully")
    return

def submit_notebook():
    ipython.magic("notebook -e "+ notebook + ".ipynb")

    import requests, json, base64, datetime

    url = "https://dashboard.talentsprint.com/xp/app/save_notebook_attempts"
    if not submission_id:
      data = {"id" : getId(), "notebook" : notebook, "mobile" : getPassword()}
      r = requests.post(url, data = data)
      r = json.loads(r.text)

      if r["status"] == "Success":
          return r["record_id"]
      elif "err" in r:
        print(r["err"])
        return None
      else:
        print ("Something is wrong, the notebook will not be submitted for grading")
        return None

    elif getAnswer() and getComplexity() and getAdditional() and getConcepts() and getWalkthrough() and getComments() and getMentorSupport():
      f = open(notebook + ".ipynb", "rb")
      file_hash = base64.b64encode(f.read())

      data = {"complexity" : Complexity, "additional" :Additional,
              "concepts" : Concepts, "record_id" : submission_id,
              "answer" : Answer, "id" : Id, "file_hash" : file_hash,
              "notebook" : notebook, "feedback_walkthrough":Walkthrough ,
              "feedback_experiments_input" : Comments,
              "feedback_inclass_mentor": Mentor_support}

      r = requests.post(url, data = data)
      r = json.loads(r.text)
      if "err" in r:
        print(r["err"])
        return None
      else:
        print("Your submission is successful.")
        print("Ref Id:", submission_id)
        print("Date of submission: ", r["date"])
        print("Time of submission: ", r["time"])
        print("View your submissions: https://learn-iiith.talentsprint.com/notebook_submissions")
        #print("For any queries/discrepancies, please connect with mentors through the chat icon in LMS dashboard.")
        return submission_id
    else: return submission_id


def getAdditional():
  try:
    if not Additional:
      raise NameError
    else:
      return Additional
  except NameError:
    print ("Please answer Additional Question")
    return None

def getComplexity():
  try:
    if not Complexity:
      raise NameError
    else:
      return Complexity
  except NameError:
    print ("Please answer Complexity Question")
    return None

def getConcepts():
  try:
    if not Concepts:
      raise NameError
    else:
      return Concepts
  except NameError:
    print ("Please answer Concepts Question")
    return None


def getWalkthrough():
  try:
    if not Walkthrough:
      raise NameError
    else:
      return Walkthrough
  except NameError:
    print ("Please answer Walkthrough Question")
    return None

def getComments():
  try:
    if not Comments:
      raise NameError
    else:
      return Comments
  except NameError:
    print ("Please answer Comments Question")
    return None


def getMentorSupport():
  try:
    if not Mentor_support:
      raise NameError
    else:
      return Mentor_support
  except NameError:
    print ("Please answer Mentor support Question")
    return None

def getAnswer():
  try:
    if not Answer:
      raise NameError
    else:
      return Answer
  except NameError:
    print ("Please answer Question")
    return None


def getId():
  try:
    return Id if Id else None
  except NameError:
    return None

def getPassword():
  try:
    return password if password else None
  except NameError:
    return None

submission_id = None
### Setup
if getPassword() and getId():
  submission_id = submit_notebook()
  if submission_id:
    setup()
else:
  print ("Please complete Id and Password cells before running setup")



Setup completed successfully


### Importing required packages

In [None]:
import os
import cv2
import math
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt


import keras
from keras.models import Sequential
from keras.applications.vgg16 import VGG16
from keras.applications.vgg16 import preprocess_input
from sklearn.model_selection import train_test_split
from keras.layers import Dense, InputLayer, Dropout

### Loading the video

In [None]:
train_videoFile = "/content/Train_data/Tom_and_jerry_train.mp4"

# Create a directory to store all the frames (no error if it already exists)
all_images = "/content/all_images"
os.makedirs(all_images, exist_ok=True)

### Read the video, extract frames from it and save them as images

For this task, use OpenCV as shown below to extract one frame for every second of video.

In [None]:
def getVideoFrames(videopath, imagespath):

  # Open the video file from the given path using cv2.VideoCapture()
  cap = cv2.VideoCapture(videopath)

  # Frame rate of the video in frames per second
  # (cv2.CAP_PROP_FPS is the named constant for property id 5)
  frameRate = cap.get(cv2.CAP_PROP_FPS)

  # Keep one frame out of every `step` frames, i.e. roughly one per second;
  # max() guards against a reported frame rate below 1
  step = max(1, math.floor(frameRate))
  i = 0

  while True:
      # Index of the frame about to be read
      # (cv2.CAP_PROP_POS_FRAMES is the named constant for property id 1)
      frameId = cap.get(cv2.CAP_PROP_POS_FRAMES)

      # Read the next frame; ret is False once the video is exhausted
      ret, frame = cap.read()

      if not ret:
          break

      # Keep the first frame of each second
      if frameId % step == 0:

          # Save the frame as a JPEG image using cv2.imwrite()
          cv2.imwrite(os.path.join(imagespath, 'frame' + str(i) + '.jpg'), frame)
          i += 1

  # After the loop, release the VideoCapture and destroy all windows
  cap.release()
  cv2.destroyAllWindows()
  return "Successfully extracted the images!!"

In [None]:
getVideoFrames("/content/Train_data/Tom_and_jerry_train.mp4", all_images)

### Let us visualize an image (frame)

We will first read the image using the imread() function of matplotlib, and then plot it using the imshow() function.

In [None]:
img = plt.imread('/content/all_images/frame0.jpg')   # Reading the image by its name
plt.imshow(img)

Since the duration of the video is 4:58 minutes (298 seconds), we now have 298 images in total.
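
The image count follows from the sampling rule in `getVideoFrames`: a frame is kept whenever its index is a multiple of the floored frame rate. The sketch below reproduces only that arithmetic. The helper name `expected_frame_count` and the 25 fps figure are assumptions for illustration; the actual frame rate of the training video is not stated in this notebook.

```python
import math

def expected_frame_count(total_frames, fps):
    """Count the frame indices i in [0, total_frames) with i % floor(fps) == 0."""
    step = math.floor(fps)
    if step < 1:
        raise ValueError("frame rate must be at least 1 fps")
    # Indices 0, step, 2*step, ... below total_frames are kept.
    return math.ceil(total_frames / step)

# Hypothetical 298-second clip recorded at exactly 25 fps
print(expected_frame_count(298 * 25, 25.0))  # 298
```

Because the frame rate is floored, a video with a non-integer rate such as 29.97 fps would be sampled every 29 frames, so the number of saved images can be slightly higher than the duration in seconds.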

This problem has three classes, listed below, so it is a multi-class classification problem:



```
0 – The frame has neither JERRY nor TOM
1 – JERRY is in the frame
2 – TOM is in the frame
```

The `train_labels.csv` contains the respective labels for each extracted frame.


### Label images for training the model

In [None]:
train_labels = '/content/Train_data/train_labels.csv'

In [None]:
df_train = pd.read_csv(train_labels)

classes = df_train['Class'].unique().astype('str')
print("Classes:", classes)

df_train['Image_ID'] = df_train['Image_ID'].apply(lambda x: all_images+'/'+x)

labels = {0:'None', 1: 'Jerry', 2: 'Tom'}
df_train['Labels'] = [labels[each] for each in df_train['Class']]

df_train.head()

### Visualize a few Images

In [None]:
eachClass = df_train.groupby('Class').first()
eachClass

In [None]:
for index, row in eachClass.iterrows():
    img = plt.imread(row['Image_ID'])   # Reading the image by its name
    plt.title(row['Labels'])
    plt.imshow(img)
    plt.show()

### Input Data and Preprocessing

To prepare the extracted images as input to our neural network, follow these preprocessing steps:

* Read all images one by one
* Resize each image to (224, 224, 3) for the input to the model

In [None]:
def resizeFeatures(image_filenames):
  res_img = []
  for each_img in image_filenames:
    img = plt.imread(each_img)
    resized_img = cv2.resize(img, (224,224)).astype(int)
    res_img.append(resized_img)

  features = np.array(res_img)
  return features

In [None]:
features = resizeFeatures(df_train['Image_ID'])
features.shape

Since there are three classes, we will one-hot encode them using the `to_categorical()` function of `keras.utils`.

In [None]:
from tensorflow.keras.utils import to_categorical

y = df_train["Class"]

# one hot encoding Classes
one_hot_y = to_categorical(y)
one_hot_y.shape
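
To make the encoding concrete, here is a minimal NumPy sketch of one-hot encoding for integer class ids. The helper `one_hot` is a hypothetical stand-in written for illustration, not the Keras implementation; it only shows the shape and content of the result.

```python
import numpy as np

def one_hot(class_ids, num_classes):
    """Return a (n_samples, num_classes) array with a single 1.0 per row."""
    class_ids = np.asarray(class_ids, dtype=int)
    # Row k of the identity matrix is the one-hot vector for class k.
    return np.eye(num_classes, dtype="float32")[class_ids]

example = one_hot([0, 2, 1], num_classes=3)
print(example)
```

Each row has a 1 in the column of its class, so class `2` (Tom) becomes `[0, 0, 1]`.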

## Transfer Learning

With only 298 images, it would be difficult to train a neural network from scratch. This is where transfer learning helps.

With transfer learning, we can reuse the features learned by a model trained on a large dataset in our own model. Here we will use the VGG16 model trained on the ImageNet dataset, through Keras, the high-level API of TensorFlow. With Keras, you can import the VGG16 model directly, as shown in the code below.

In [None]:
# include_top=False drops the fully connected layers at the top of the network
vgg_model = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

The VGG16 model trained on ImageNet predicts 1,000 classes, but this problem has only three: `"Tom"`, `"Jerry"`, or `"None"`.

That is why we set `include_top = False` above, which means the fully connected layers of the VGG16 model are not included.

Before passing any input to the model, it is important to preprocess it as the model requires. Use the `preprocess_input()` function of `keras.applications.vgg16` to perform this step.

In [None]:
# Preprocessing the input data
X = preprocess_input(features)
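
For intuition, the sketch below shows what this preprocessing is generally understood to do for VGG16: reorder the channels from RGB to BGR and subtract a fixed per-channel ImageNet mean, without scaling to the 0 to 1 range. The function `caffe_style_preprocess` is a hypothetical re-implementation, and the mean values are the commonly quoted ones, stated here as an assumption rather than read from this notebook.

```python
import numpy as np

# Commonly quoted ImageNet channel means in BGR order (assumed values)
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype="float32")

def caffe_style_preprocess(rgb_batch):
    """Flip RGB to BGR, then subtract the per-channel ImageNet mean."""
    bgr = np.asarray(rgb_batch, dtype="float32")[..., ::-1]
    return bgr - IMAGENET_BGR_MEAN

# A dummy batch with one uniformly grey 224x224 image
batch = np.full((1, 224, 224, 3), 128, dtype="uint8")
out = caffe_style_preprocess(batch)
print(out.shape, out[0, 0, 0])
```

The output keeps the input shape, which is why `X` can be passed straight to `vgg_model`.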

Generate a validation set using the `train_test_split()` function of scikit-learn to check the performance of the model on unseen images.

In [None]:
# Preparing the validation set
X_train, X_valid, y_train, y_valid = train_test_split(X, one_hot_y, test_size=0.3, random_state=42)

print("Training Features:", X_train.shape)
print("Training Labels:", y_train.shape)
print("Validation Features:", X_valid.shape)
print("Validation Labels:", y_valid.shape)
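
The shapes printed above can be predicted by hand. The sketch below assumes the validation count is the fraction rounded up and the training set receives the remainder, which is how scikit-learn is understood to treat a fractional `test_size`; the helper `split_sizes` is hypothetical.

```python
import math

def split_sizes(n_samples, test_size):
    """Return (n_train, n_valid) for a fractional test_size, rounding the validation count up."""
    n_valid = math.ceil(test_size * n_samples)
    return n_samples - n_valid, n_valid

print(split_sizes(298, 0.3))  # (208, 90)
```

With 298 frames and `test_size=0.3`, this gives 208 training and 90 validation samples.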

Pass the preprocessed `X_train` and `X_valid` images as **input to the pre-trained `vgg_model`** to extract features, then use those features to train our own classifier.

In [None]:
X_train_predicted = vgg_model.predict(X_train)
X_valid_predicted = vgg_model.predict(X_valid)

print("Training Features:", X_train_predicted.shape)
print("Validation Features:", X_valid_predicted.shape)

### VGG16
![picture](https://miro.medium.com/max/788/1*_Lg1i7wv1pLpzp2F4MLrvw.png)

Notice that the output features from the VGG16 model have shape `7*7*512`.
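
The `7*7*512` shape can be derived from the architecture: VGG16 has five max-pooling stages, each halving the height and width, and its last convolutional block has 512 channels. A minimal sketch of that arithmetic, using a hypothetical helper `vgg16_feature_shape`:

```python
def vgg16_feature_shape(height, width, num_pools=5, channels=512):
    """Halve the spatial size once per pooling stage; convolutions keep it unchanged."""
    for _ in range(num_pools):
        height //= 2
        width //= 2
    return height, width, channels

print(vgg16_feature_shape(224, 224))  # (7, 7, 512)
```

So a 224x224 input shrinks as 224, 112, 56, 28, 14, 7, giving 7*7*512 = 25088 values per image.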

Since we are not including the fully connected layers of the VGG16 model, we need to create a model with some fully connected layers and an output layer with 3 classes: `"Tom"`, `"Jerry"`, or `"None"`.

To pass the features extracted from `X_train` and `X_valid` to our neural network, we have to flatten each one into a 1-D vector, which sets the input shape of our model.

In [None]:
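# Flatten each 7*7*512 feature map into a 1-D vector
# (use the actual sample counts instead of hard-coded 208 and 90)
X_train_reshaped = X_train_predicted.reshape(X_train_predicted.shape[0], 7*7*512)
X_valid_reshaped = X_valid_predicted.reshape(X_valid_predicted.shape[0], 7*7*512)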

print("Training Features:", X_train_reshaped.shape)
print("Validation Features:", X_valid_reshaped.shape)
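
The flattening step can be tried in isolation. This sketch uses a dummy array of zeros in place of real VGG16 outputs (an assumption for illustration) and lets NumPy infer the flattened length with `-1`:

```python
import numpy as np

# Stand-in for VGG16 outputs: 4 samples of shape 7x7x512
fake_features = np.zeros((4, 7, 7, 512), dtype="float32")

# Keep the sample axis and collapse the remaining axes into one
flat = fake_features.reshape(fake_features.shape[0], -1)
print(flat.shape)  # (4, 25088)
```

Using `-1` avoids hard-coding both the sample count and the feature length.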

Now, normalize the features by dividing them by the maximum value of the training features, which helps the model converge faster.

In [None]:
# Normalize the data: scale both sets by the training maximum so they share one scale
X_train_centered = X_train_reshaped/X_train_reshaped.max()
X_valid_centered = X_valid_reshaped/X_train_reshaped.max()

print("Training Features:", X_train_centered.shape)
print("Validation Features:", X_valid_centered.shape)
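
The point of dividing both sets by the training maximum is that every later input is put on the same scale the model saw during training. A small sketch with made-up numbers and a hypothetical helper `scale_by_train_max`:

```python
import numpy as np

def scale_by_train_max(train, other):
    """Scale both arrays by the maximum of the training array."""
    train_max = train.max()
    if train_max == 0:
        raise ValueError("training features are all zero")
    return train / train_max, other / train_max

train = np.array([[0.0, 2.0], [4.0, 8.0]])   # toy training features
valid = np.array([[1.0, 10.0]])              # toy validation features
train_scaled, valid_scaled = scale_by_train_max(train, valid)
print(train_scaled.max(), valid_scaled)
```

The training features end up in the range 0 to 1, while validation values may exceed 1 if they are larger than anything seen in training; that is expected with this scheme.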

### Building the model

In [None]:
model = Sequential()

model.add(InputLayer(shape=(7*7*512,))) # Input layer

model.add(Dense(1024, activation='sigmoid')) # Hidden layer

model.add(Dropout(0.5)) # Dropout layer

model.add(Dense(512, activation='sigmoid')) # Hidden layer

model.add(Dropout(0.5)) # Dropout layer

model.add(Dense(256, activation='sigmoid')) # Hidden layer

model.add(Dropout(0.5)) # Dropout layer

model.add(Dense(3, activation='softmax')) # Output layer

In [None]:
# model.summary() prints the table itself; wrapping it in print() adds a stray "None"
model.summary()
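
The parameter counts in the summary can be checked by hand: a dense layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases, and dropout layers add none. A minimal sketch using the layer sizes defined above (the helper `dense_params` is hypothetical):

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

layer_sizes = [7 * 7 * 512, 1024, 512, 256, 3]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total_params = sum(per_layer)
print(per_layer)      # [25691136, 524800, 131328, 771]
print(total_params)   # 26348035
```

Almost all parameters sit in the first dense layer, because it connects the 25088 flattened VGG16 features to 1024 units.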

### Compiling the model

In [None]:
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
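
For a single sample, categorical cross-entropy reduces to the negative log of the probability the model assigns to the true class. The sketch below is a plain-Python illustration of that formula, not the Keras implementation; the clipping constant `eps` is an assumption added to avoid `log(0)`.

```python
import math

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot target and predicted class probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

# True class is Jerry (index 1); the model gives it probability 0.8
loss = categorical_crossentropy([0, 1, 0], [0.1, 0.8, 0.1])
print(round(loss, 4))  # 0.2231
```

A confident correct prediction gives a loss near 0, while a low probability for the true class gives a large loss.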

### Training the model

In [None]:
history = model.fit(X_train_centered, y_train, epochs=100, validation_data=(X_valid_centered, y_valid))
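
After training, `history.history` holds one list of values per metric, and it is often useful to find the epoch with the best validation accuracy. The sketch below assumes the key is named `val_accuracy` (this can differ between Keras versions) and uses a toy dictionary with made-up numbers, not results from this experiment.

```python
def best_epoch(history_dict, key="val_accuracy"):
    """Return (1-based epoch, value) at which the chosen metric is highest."""
    values = history_dict[key]
    best_index = max(range(len(values)), key=values.__getitem__)
    return best_index + 1, values[best_index]

toy_history = {"val_accuracy": [0.50, 0.70, 0.65]}  # made-up numbers
print(best_epoch(toy_history))  # (2, 0.7)
```

With the real run, the call would be `best_epoch(history.history)`.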

### Evaluate the model

Calculating the screen time on Test Data

In [None]:
# Create a directory to store all the frames (no error if it already exists)
test_images = "test_images"
os.makedirs(test_images, exist_ok=True)

In [None]:
getVideoFrames("/content/Test_data/Tom_and_Jerry_test.mp4", test_images)

### Load the Test Data

Iterate over the **`Test Images`** directory to collect the paths of all the test images.

In [None]:
def getIDs(directory):
  # Collect the paths of all .jpg files, sorted for a reproducible order
  ids = []
  for filename in sorted(os.listdir(directory)):
      if filename.endswith(".jpg"):
          ids.append(os.path.join(directory, filename))
  return ids

In [None]:
test_ids = getIDs(test_images)
len(test_ids)

Since the duration of the video is 3:06 minutes (186 seconds), we now have 186 images in total.


### Input Data and Preprocessing

In [None]:
test_features = resizeFeatures(test_ids)

In [None]:
# preprocessing the images
preprocessed_features = preprocess_input(test_features)

# extracting features from the images using pretrained model
output_features = vgg_model.predict(preprocessed_features)

# flattening each feature map to 1-D form
reshaped_features = output_features.reshape(test_features.shape[0], 7*7*512)

# Normalize with the training maximum so test features use the same scale as training
zero_centered_features = reshaped_features/X_train_reshaped.max()

### Make predictions on the test images

In [None]:
pred = model.predict(zero_centered_features)
# Pick the most probable class for each frame (one frame = one second)
predictions = np.argmax(pred, axis=1)

print("The screen time of JERRY is", predictions[predictions==1].shape[0], "seconds")
print("The screen time of TOM is", predictions[predictions==2].shape[0], "seconds")
print("The screen time of neither JERRY nor TOM is", predictions[predictions==0].shape[0], "seconds")
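
Since one frame is extracted per second, screen time is just a count of frames per predicted class. The sketch below does the counting in one pass with `np.bincount` and formats the result as minutes and seconds. The helper `screen_time` and the toy predictions are assumptions for illustration, not output from this experiment.

```python
import numpy as np

LABELS = {0: "None", 1: "Jerry", 2: "Tom"}

def screen_time(predictions, seconds_per_frame=1):
    """Map each class name to its screen time formatted as m:ss."""
    counts = np.bincount(np.asarray(predictions, dtype=int), minlength=len(LABELS))
    result = {}
    for class_id, name in LABELS.items():
        seconds = int(counts[class_id]) * seconds_per_frame
        result[name] = f"{seconds // 60}:{seconds % 60:02d}"
    return result

toy_predictions = [1, 1, 2, 0, 2, 2]  # made-up class ids, one per frame
print(screen_time(toy_predictions))
# {'None': '0:01', 'Jerry': '0:02', 'Tom': '0:03'}
```

With the real run, the call would be `screen_time(predictions)`. Each frame is assigned exactly one class, so a frame showing both characters contributes to only one of them.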

### Please answer the questions below to complete the experiment:




In [4]:
#@title State True or False: In VGG16 model, the parameter 'include_top = False' specifies to include the fully-connected layer at the top of the network { run: "auto", form-width: "500px", display-mode: "form" }
Answer = "FALSE" #@param ["","TRUE", "FALSE"]

In [5]:
#@title How was the experiment? { run: "auto", form-width: "500px", display-mode: "form" }
Complexity = "Good and Challenging for me" #@param ["","Too Simple, I am wasting time", "Good, But Not Challenging for me", "Good and Challenging for me", "Was Tough, but I did it", "Too Difficult for me"]


In [6]:
#@title If it was too easy, what more would you have liked to be added? If it was very difficult, what would you have liked to have been removed? { run: "auto", display-mode: "form" }
Additional = "good" #@param {type:"string"}


In [7]:
#@title Can you identify the concepts from the lecture which this experiment covered? { run: "auto", vertical-output: true, display-mode: "form" }
Concepts = "Yes" #@param ["","Yes", "No"]


In [8]:
#@title  Experiment walkthrough video? { run: "auto", vertical-output: true, display-mode: "form" }
Walkthrough = "Somewhat Useful" #@param ["","Very Useful", "Somewhat Useful", "Not Useful", "Didn't use"]


In [9]:
#@title  Text and image description/explanation and code comments within the experiment: { run: "auto", vertical-output: true, display-mode: "form" }
Comments = "Very Useful" #@param ["","Very Useful", "Somewhat Useful", "Not Useful", "Didn't use"]


In [10]:
#@title Mentor Support: { run: "auto", vertical-output: true, display-mode: "form" }
Mentor_support = "Very Useful" #@param ["","Very Useful", "Somewhat Useful", "Not Useful", "Didn't use"]


In [11]:
#@title Run this cell to submit your notebook for grading { vertical-output: true }
try:
  if submission_id:
      return_id = submit_notebook()
      if return_id : submission_id = return_id
  else:
      print("Please complete the setup first.")
except NameError:
  print ("Please complete the setup first.")

Your submission is successful.
Ref Id: 2450
Date of submission:  05 Apr 2025
Time of submission:  15:26:02
View your submissions: https://learn-iiith.talentsprint.com/notebook_submissions
