# Explore and Run Shot Detection

Running facial recognition and embedding on every single frame is expensive. Querying a database, or even just storing information, for every frame would also produce an unwieldy amount of data.

Can we use shot detection/boundaries to determine when sequential frames show significant variation? If yes, we can run facial detection for key frames from these shots, which will *hopefully* contain all individuals appearing in a given shot.

## Process MP4 File 

To perform shot/scene detection, we need to attach some information to each frame that helps us quantify its content.

Using these metrics, we can calculate when a significant shift occurs between frames and mark it as a shot boundary.
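
The feature extraction itself lives in `utils.helpers` and is not shown in this notebook. As a rough illustration of what a histogram-based metric such as `bhattacharyya_distance` measures, here is a minimal NumPy-only sketch that compares the intensity histograms of two grayscale frames. The function name `hist_bhattacharyya`, the 32-bin choice, and the toy frames are all assumptions for illustration. It uses the Hellinger-style form (square root of one minus the Bhattacharyya coefficient), which stays in [0, 1]; the actual helper may compute it differently.

```python
import numpy as np

def hist_bhattacharyya(frame_a, frame_b, bins=32):
    """Hellinger-style Bhattacharyya distance between two grayscale frames."""
    # Intensity histograms over the 8-bit range, normalized to sum to 1
    hist_a, _ = np.histogram(frame_a, bins=bins, range=(0, 256))
    hist_b, _ = np.histogram(frame_b, bins=bins, range=(0, 256))
    p = hist_a / hist_a.sum()
    q = hist_b / hist_b.sum()
    # Bhattacharyya coefficient: 1.0 for identical distributions, 0.0 for disjoint ones
    bc = np.sum(np.sqrt(p * q))
    # Clip guards against tiny negative values caused by floating-point rounding
    return float(np.sqrt(np.clip(1.0 - bc, 0.0, 1.0)))

# Toy frames (hypothetical data): a dark frame, a slightly noisy copy, and a bright frame
rng = np.random.default_rng(0)
dark = rng.integers(0, 64, size=(48, 64))
dark_noisy = np.clip(dark + rng.integers(-2, 3, size=dark.shape), 0, 255)
bright = rng.integers(192, 256, size=(48, 64))

same_shot = hist_bhattacharyya(dark, dark_noisy)  # small: histograms mostly overlap
shot_cut = hist_bhattacharyya(dark, bright)       # large: histograms do not overlap
```

Frames within one shot tend to have similar histograms and so a small distance, while a hard cut to a visually different scene pushes the distance toward 1.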

In [1]:
import sys 

sys.path.append("/Users/srmarshall/Desktop/code/pbs/pbs-passthrough/")

In [2]:
# set mp4 filepath - in this case, a path pointing to a Grantchester episode
mp4_filepath = "/Users/srmarshall/Desktop/data-dump/dynamic_recaps/video_assets/full_length/30d5ccbd-f2ce-4fd7-99d9-ee28236d9af9.mp4"

In [3]:
from utils.helpers import video_procesisng_pipeline

# run video processing pipeline
df = video_procesisng_pipeline(mp4_filepath=mp4_filepath, load_previous=True)

Processing MP4 Frames: 100%|██████████| 95025/95025 [01:42<00:00, 924.36frames/s]
Extracting Features:   0%|          | 16/95025 [00:00<10:19, 153.24 frames/s]

Error extracting features: cannot access local variable 'edge' where it is not associated with a value


Extracting Features: 95026 frames [10:15, 154.34 frames/s]                      


In [4]:
## write df to csv to avoid reprocessing
# df.to_csv("../assets/grantchester_sample_features.csv", index=False) ## SM: only needs to be done once

In [5]:
df.head()

Unnamed: 0,frame_number,timestamp,edges,pixel_diffs,bhattacharyya_distance
0,1,0.0,0,0.0,0.0
1,2,33.366667,0,0.0,0.0
2,3,66.733333,0,0.0,0.0
3,4,100.1,0,0.0,0.0
4,5,133.466667,0,0.0,0.0


In [6]:
df.tail()

Unnamed: 0,frame_number,timestamp,edges,pixel_diffs,bhattacharyya_distance
95020,95021,3170501.0,0,0.0,0.0
95021,95022,3170534.0,0,0.0,0.0
95022,95023,3170567.0,0,0.0,0.0
95023,95024,3170601.0,0,0.0,0.0
95024,95025,3170634.0,0,,


## Detect Key Frames

Use the metrics available to us to try to determine when a frame marks a shot boundary.

For now, let's use `bhattacharyya_distance`.

In [11]:
# summary stats
df.describe()

Unnamed: 0,frame_number,timestamp,edges,pixel_diffs,bhattacharyya_distance
count,95025.0,95025.0,95025.0,95024.0,93983.0
mean,47513.0,1585317.0,12918.538374,18438.00222,0.021828
std,27431.499002,915297.7,16596.805285,56571.436779,0.040937
min,1.0,0.0,0.0,0.0,0.0
25%,23757.0,792658.5,4045.0,3.0,0.009606
50%,47513.0,1585317.0,7504.0,1522.0,0.013762
75%,71269.0,2377976.0,14369.0,12630.25,0.020771
max,95025.0,3170634.0,143585.0,830705.0,0.994448


In [12]:
# pull mean and std 
mean = df["bhattacharyya_distance"].mean()
std = df["bhattacharyya_distance"].std()

# set threshold to 2 std above the mean
threshold = mean + 2*std

In [13]:
import numpy as np 

# mark shot boundaries using the threshold 
df["is_keyframe"] = np.where(df["bhattacharyya_distance"] > threshold, 1, 0)
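
One detail worth noting: the summary stats show 93,983 non-null `bhattacharyya_distance` values out of 95,025 rows, so some frames have no distance. The sketch below, on a hypothetical toy frame (`toy` and its values are made up for illustration), shows how the same rule treats such rows: pandas skips NaN when computing the mean and standard deviation, and a comparison against NaN is False, so those frames end up marked 0.

```python
import numpy as np
import pandas as pd

# Toy distances (hypothetical values); the NaN stands in for a frame with a missing distance
toy = pd.DataFrame(
    {"bhattacharyya_distance": [0.01, 0.02, 0.01, 0.90, np.nan, 0.02, 0.01, 0.015]}
)

# pandas skips NaN by default when computing mean and std
mean = toy["bhattacharyya_distance"].mean()
std = toy["bhattacharyya_distance"].std()
threshold = mean + 2 * std

# NaN > threshold evaluates to False, so the frame with a missing distance is marked 0
toy["is_keyframe"] = np.where(toy["bhattacharyya_distance"] > threshold, 1, 0)
flags = toy["is_keyframe"].tolist()
```

If frames with a missing distance should be treated differently (for example, flagged for review), they would need to be handled explicitly before thresholding.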

In [14]:
# how many frames were flagged as shot boundaries?
df.value_counts("is_keyframe")

is_keyframe
0    93307
1     1718
Name: count, dtype: int64
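
The 1,718 figure counts flagged frames, which is not necessarily the number of shots: if several consecutive frames exceed the threshold (as could happen during a gradual transition), each of them is counted separately. A minimal sketch, on hypothetical toy data, of collapsing each run of consecutive flagged frames into a single boundary; the names `toy`, `run_start`, and `boundary_frames` are illustrative and not part of the pipeline.

```python
import pandas as pd

# Toy stand-in for df (hypothetical values): frames 3-4 and frame 8 are flagged
toy = pd.DataFrame({
    "frame_number": range(1, 11),
    "is_keyframe": [0, 0, 1, 1, 0, 0, 0, 1, 0, 0],
})

# A run starts where a flagged frame follows an unflagged one (or opens the video)
flagged = toy["is_keyframe"].eq(1)
run_start = flagged & ~flagged.shift(fill_value=False)

# Keep only the first frame of each run as that boundary's representative
boundary_frames = toy.loc[run_start, "frame_number"].tolist()
n_boundaries = int(run_start.sum())
```

Here three flagged frames collapse into two boundaries, at frames 3 and 8. Whether the first frame of a run is the best representative is a design choice; the middle or last frame of the run may work better for transitions.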

In [15]:
# extract keyframes for analysis
keyframe_df = df[df["is_keyframe"] == 1]

## Embed Key Frames