<a href="https://colab.research.google.com/github/lisaong/mldds-courseware/blob/master/04_SpeechTimeSeries/examples/video_classification_CNN_RNN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Video Classification with RNN

This notebook demonstrates video classification of action videos using a time-distributed CNN + RNN.

The CNN will use the InceptionV3 model (transfer learning) to perform feature extraction.

To run this notebook:

1. Open it in Colab.
2. Select Runtime -> Change runtime type and set the hardware accelerator to GPU.

We will be using a subset of techniques demonstrated in the more complex example here: https://github.com/harvitronix/five-video-classification-methods

We will also be using a small dataset to demonstrate how this works. Only 5 classes were selected from the UCF101 video classification dataset.

Incidentally, the full UCF101 dataset is also available, already preprocessed, through TensorFlow Datasets: https://www.tensorflow.org/datasets/catalog/ucf101. We process the videos manually here to demonstrate how to prepare your own video dataset.


In [1]:
%tensorflow_version 2.x

TensorFlow 2.x selected.


In [2]:
# Ensure that ffmpeg is installed as we will be using it to extract frames
!apt install ffmpeg

Reading package lists... Done
Building dependency tree       
Reading state information... Done
ffmpeg is already the newest version (7:3.4.6-0ubuntu0.18.04.1).
The following package was automatically installed and is no longer required:
  libnvidia-common-430
Use 'apt autoremove' to remove it.
0 upgraded, 0 newly installed, 0 to remove and 7 not upgraded.


## Download video data

We will download the data from GitHub and create a folder structure like this:

```
train/
  Archery/
  Basketball/
  ...
test/
  Archery/
  Basketball/
  ...
```

In [3]:
# Instead of 101 classes, we will start small, just 5 classes.

classes = [
  'Archery',
  'Basketball',
  'CricketBowling',
  'Diving',
  'Haircut'
]

classes

['Archery', 'Basketball', 'CricketBowling', 'Diving', 'Haircut']

In [4]:
!pip install wget

Collecting wget
  Downloading https://files.pythonhosted.org/packages/47/6a/62e288da7bcda82b935ff0c6cfe542970f04e29c756b0e147251b2fb251f/wget-3.2.zip
Building wheels for collected packages: wget
  Building wheel for wget (setup.py) ... done
  Created wheel for wget: filename=wget-3.2-cp36-none-any.whl size=9681 sha256=89025d17a408b62f5a405f00e7aa2ca6ed957bb10ac95b081d482a56ddc38a2b
  Stored in directory: /root/.cache/pip/wheels/40/15/30/7d8f7cea2902b4db79e3fea550d7d7b85ecb27ef992b618f3f
Successfully built wget
Installing collected packages: wget
Successfully installed wget-3.2


In [0]:
# Download subset of the dataset for the classes we have.
import wget
import zipfile
import os

def download_and_extract(label, folder):
  url = f'https://github.com/lisaong/mldds-courseware/raw/master/data/ucf101-5classes/{folder}/{label}.zip'

  # prepare folders (no error if the folder already exists)
  os.makedirs(folder, exist_ok=True)
  filename = f'{label}.zip'

  # download the archive, then report it once the download has finished
  wget.download(url, filename)
  print(f'Downloaded {url}')

  # extract within folder
  with zipfile.ZipFile(filename) as f:
    f.extractall(folder)
  print(f'Extracted {os.path.join(folder, label)}')
  
  # delete zip archive
  os.remove(filename)

In [16]:
for label in classes:
  download_and_extract(label, 'train')
  download_and_extract(label, 'test')

Downloaded https://github.com/lisaong/mldds-courseware/raw/master/data/ucf101-5classes/train/Archery.zip
Extracted train/Archery
Downloaded https://github.com/lisaong/mldds-courseware/raw/master/data/ucf101-5classes/test/Archery.zip
Extracted test/Archery
Downloaded https://github.com/lisaong/mldds-courseware/raw/master/data/ucf101-5classes/train/Basketball.zip
Extracted train/Basketball
Downloaded https://github.com/lisaong/mldds-courseware/raw/master/data/ucf101-5classes/test/Basketball.zip
Extracted test/Basketball
Downloaded https://github.com/lisaong/mldds-courseware/raw/master/data/ucf101-5classes/train/CricketBowling.zip
Extracted train/CricketBowling
Downloaded https://github.com/lisaong/mldds-courseware/raw/master/data/ucf101-5classes/test/CricketBowling.zip
Extracted test/CricketBowling
Downloaded https://github.com/lisaong/mldds-courseware/raw/master/data/ucf101-5classes/train/Diving.zip
Extracted train/Diving
Downloaded https://github.com/lisaong/mldds-courseware/raw/master

In [0]:
# Files are downloaded, press 'Refresh' in the Files tab to see them
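
If the Files tab is slow to refresh, the layout can also be checked from code. The following is a minimal sketch, assuming the `train/<class>/*.avi` and `test/<class>/*.avi` layout described above; the helper name `count_videos` is hypothetical and not part of the original notebook.

```python
from pathlib import Path

def count_videos(root='.', splits=('train', 'test'), pattern='*.avi'):
    """Return {split: {class_name: number_of_videos}} for the folder layout.

    Hypothetical helper: assumes <root>/<split>/<class>/<video>.avi.
    """
    counts = {}
    for split in splits:
        split_dir = Path(root) / split
        if not split_dir.is_dir():
            continue  # this split has not been downloaded yet
        counts[split] = {
            class_dir.name: len(list(class_dir.glob(pattern)))
            for class_dir in sorted(split_dir.iterdir())
            if class_dir.is_dir()
        }
    return counts

# Print one line per class folder found under the current directory.
for split, per_class in count_videos().items():
    for class_name, n in per_class.items():
        print(f'{split}/{class_name}: {n} videos')
```

A class that reports 0 videos usually means its archive failed to download or extract.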

## Extracting frames from videos

In this step, we use ffmpeg to extract every frame of each video and save it as a JPEG image next to the video file.

This also generates a CSV file (`data_file.csv`) that records the split (train or test), class, video filename (without extension), and number of frames for each video.

This can take a while to run.
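
To make the frame naming concrete, here is a small sketch of the ffmpeg command that gets built for each video and the filename the first frame should receive. The helper names `build_ffmpeg_command` and `frame_path` and the example video path are illustrative assumptions; the `-%04d.jpg` pattern and the `-0001` first frame follow the extraction code in this notebook.

```python
import os
import shutil

def build_ffmpeg_command(video_path):
    """Return the argument list that writes one JPEG per frame next to the video."""
    stem, _ = os.path.splitext(video_path)
    # ffmpeg expands %04d to a zero-padded frame counter: 0001, 0002, ...
    return ['ffmpeg', '-i', video_path, stem + '-%04d.jpg']

def frame_path(video_path, index):
    """Path expected for frame number `index` (numbering starts at 1)."""
    stem, _ = os.path.splitext(video_path)
    # Python's % formatting pads the number the same way ffmpeg's pattern does.
    return '%s-%04d.jpg' % (stem, index)

# Illustrative path only; substitute one of your own videos.
example = os.path.join('train', 'Archery', 'v_Archery_g01_c01.avi')
print(build_ffmpeg_command(example))
print(frame_path(example, 1))

# shutil.which returns None when ffmpeg is not on the PATH.
print('ffmpeg found at:', shutil.which('ffmpeg'))
```

Passing the command as a list, rather than as one shell string, avoids quoting problems when a path contains spaces.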

In [0]:
# https://github.com/harvitronix/five-video-classification-methods/blob/master/data/2_extract_files.py
import glob
import csv
from subprocess import call

def get_nb_frames_for_video(video_parts):
    """Given video parts of an (assumed) already extracted video, return
    the number of frames that were extracted."""
    train_or_test, classname, filename_no_ext, _ = video_parts
    generated_files = glob.glob(os.path.join(train_or_test, classname,
                                filename_no_ext + '*.jpg'))
    return len(generated_files)

def get_video_path_parts(video_path):
    """Given a full path to a video, return its parts."""
    parts = video_path.split(os.path.sep)
    filename = parts[2]
    filename_no_ext = filename.split('.')[0]
    classname = parts[1]
    train_or_test = parts[0]

    return train_or_test, classname, filename_no_ext, filename

def check_already_extracted(video_parts):
    """Check to see if we created the -0001 frame of this file."""
    train_or_test, classname, filename_no_ext, _ = video_parts
    return bool(os.path.exists(os.path.join(train_or_test, classname,
                               filename_no_ext + '-0001.jpg')))

def extract_files():
    """After we have all of our videos split between train and test, and
    all nested within folders representing their classes, we need to
    make a data file that we can reference when training our RNN(s).
    
    This will let us keep track of image sequences and other parts
    of the training process.
    
    We'll first need to extract images from each of the videos. We'll
    need to record the following data in the file:
    [train|test], class, filename, nb frames
    
    Extracting can be done with ffmpeg:
    `ffmpeg -i video.mpg image-%04d.jpg`
    """

    data_file = []
    folders = ['train', 'test']
    for folder in folders:
      class_folders = glob.glob(os.path.join(folder, '*'))
      
      for vid_class in class_folders:
        class_files = glob.glob(os.path.join(vid_class, '*.avi'))
        
        for video_path in class_files:
          # Get the parts of the file path
          video_parts  = get_video_path_parts(video_path)
          train_or_test, classname, filename_no_ext, filename = video_parts          

          # Only extract if we haven't done it yet. Otherwise, just get
          # the info.
          if not check_already_extracted(video_parts):
            # Now extract it.
            src = os.path.join(train_or_test, classname, filename)
            dest = os.path.join(train_or_test, classname,
                                filename_no_ext + '-%04d.jpg')
            call(["ffmpeg", "-i", src, dest])

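          # Count the frames whether they were extracted just now or on
          # an earlier run, so that re-running still records every video.
          nb_frames = get_nb_frames_for_video(video_parts)

          data_file.append([train_or_test, classname, filename_no_ext, nb_frames])

          print("Generated %d frames for %s" % (nb_frames, filename_no_ext))

    # newline='' stops the csv module from writing blank rows on Windows
    with open('data_file.csv', 'w', newline='') as fout:
        writer = csv.writer(fout)
        writer.writerows(data_file)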

    print("Extracted and wrote %d video files." % (len(data_file)))

In [0]:
# If you have many more files, we recommend running this directly from a
# terminal (outside of Jupyter or Colab), because Colab can hang.
extract_files()

In [0]:
# To see what gets generated (if Files Refresh does not show anything)
!ls test/Archery

In [0]:
# Contents of the data file
# [train|test], class, filename, nb frames
!cat data_file.csv
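
For later steps it is often handier to load the data file into Python than to read it with `cat`. The following is a minimal sketch, assuming the four-column layout `[train|test], class, filename, nb frames` written above; the function names `load_data_file` and `videos_per_class` are hypothetical.

```python
import csv
import os
from collections import Counter

def load_data_file(path='data_file.csv'):
    """Read the data file into a list of dicts, one per video."""
    rows = []
    with open(path, newline='') as fin:
        for row in csv.reader(fin):
            if not row:
                continue  # tolerate blank lines
            split, class_name, video, nb_frames = row
            rows.append({
                'split': split,
                'class': class_name,
                'video': video,
                'nb_frames': int(nb_frames),  # the CSV stores it as text
            })
    return rows

def videos_per_class(rows, split):
    """Count the videos of each class within one split ('train' or 'test')."""
    return Counter(r['class'] for r in rows if r['split'] == split)

# Only runs once extract_files() has written the data file.
if os.path.exists('data_file.csv'):
    rows = load_data_file()
    print(videos_per_class(rows, 'train'))
    print(videos_per_class(rows, 'test'))
```

Comparing the per-class counts across the two splits is a quick way to spot a class that is missing from one of them.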