# Shot Boundary Detection (SBD) - Exercises

_Authors: Mikołaj Leszczuk_

[http://qoe.agh.edu.pl](http://qoe.agh.edu.pl)

## Purpose of the Exercise

The purpose of this exercise is to gain practical experience with video content analysis. One example of video content analysis is automatic Shot Boundary Detection (SBD), which is commonly used when creating video summaries.

## Needed Knowledge

Before starting the exercise, you should be familiar with the following topics:
* SBD basics (why it is used) 
* SBD methods (general information) 
* Applications for SBD used during the exercise 

## Work Environment

We will use the following solution for SBD: [PySceneDetect](http://scenedetect.com/en/latest/) ([the PySceneDetect Python API (the *scenedetect* module)](https://scenedetect.com/projects/Manual/en/latest/)).

In [1]:
pip install pip --upgrade

Note: you may need to restart the kernel to use updated packages.


In [2]:
pip install scenedetect --upgrade

Collecting scenedetect
  Obtaining dependency information for scenedetect from https://files.pythonhosted.org/packages/f1/5f/4e7ed7ca64d9f8f7330c7c1fc107340f2d712e199d71e62ce123f3d8668a/scenedetect-0.6.2-py3-none-any.whl.metadata
  Downloading scenedetect-0.6.2-py3-none-any.whl.metadata (4.1 kB)
Downloading scenedetect-0.6.2-py3-none-any.whl (117 kB)
Installing collected packages: scenedetect
  Attempting uninstall: scenedetect
    Found existing installation: scenedetect 0.6.1
    Uninstalling scenedetect-0.6.1:
      Successfully uninstalled scenedetect-0.6.1
Successfully installed scenedetect-0.6.2
Note: you may need to restart the kernel to use updated packages.


## Execution of the Exercise

The first task is to prepare a video that will be used for testing SBD systems. You may use the provided video files (for example, [UGS05.mpg](UGS05.mpg)); alternatively, you can find video files with numerous, easily distinguishable shots on the Internet. The provided files are preferred because they are accompanied by manually created reference shot positions (`ref_*.csv` files, column: `manual_past_f_num`). Be aware that not all video formats and codecs are handled by the programs used in the exercise. The videos should not be too long, as the exercise duration is limited; for long videos, it is acceptable to analyze only a part. The audio track is not used, so it is not needed.

If the reference shot positions are not available, you should watch the video yourself to determine the real shot boundaries. Note that different programs may number frames starting from `0` or from `1`; if the conventions differ, apply the appropriate offset.
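Translating between the two numbering conventions can be sketched as below. This is only a minimal illustration: the function name `shift_frame_numbers` and its parameters are hypothetical and are not part of PySceneDetect, and the frame numbers are made up.

```python
def shift_frame_numbers(frames, source_base, target_base):
    """Convert frame numbers from one numbering base (0 or 1) to another."""
    offset = target_base - source_base
    return [frame + offset for frame in frames]


# Boundaries reported by a hypothetical tool that counts frames from 1 ...
one_based = [61, 781, 1387]
# ... expressed in the 0-based convention.
zero_based = shift_frame_numbers(one_based, source_base=1, target_base=0)
print(zero_based)
```

Applying the shift once, right after loading the data, keeps all later comparisons in a single convention.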

The next steps are to run automatic SBD on the video and then to determine its accuracy.

To get started, the **`scenedetect.detect()`** function takes a path to a video and a [scene detector object](https://scenedetect.com/projects/Manual/en/latest/api/detectors.html#scenedetect-detectors), and returns a list of start/end timecodes. For detecting fast cuts (shot changes), we use the **`ContentDetector`**:

In [3]:
from scenedetect import detect, ContentDetector
shot_list = detect(
    video_path='UGS05.mpg',
    detector=ContentDetector(),
    stats_file_path='stats_file.txt',
    show_progress=True,
)

Detected: 24 | Progress: 100%|█████████████████████████████████████| 40098/40098 [00:56<00:00, 712.33frames/s]


Note that when calling **`detect`** we set `stats_file_path='stats_file.txt'` to save per-frame metrics to [stats_file.txt](stats_file.txt), and `show_progress=True` to display a progress bar with the estimated time remaining.
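The per-frame metrics can be inspected with the standard `csv` module, as in the rough sketch below. It rests on several assumptions that should be checked against the actual file: that the stats file is a CSV with a header row, that the column names are `Frame Number` and `content_val`, and that `27.0` is a sensible threshold. The helper `frames_above_threshold` is hypothetical, and the sample rows are invented rather than taken from UGS05.mpg.

```python
import csv
import io


def frames_above_threshold(stats_file, threshold=27.0,
                           frame_column='Frame Number',
                           score_column='content_val'):
    """Return frame numbers whose per-frame score reaches the threshold.

    The column names and the default threshold are assumptions; adjust
    them to match the header of the real stats file.
    """
    frames = []
    for row in csv.DictReader(stats_file):
        score = row.get(score_column)
        if not score:  # skip rows with no score recorded
            continue
        if float(score) >= threshold:
            frames.append(int(row[frame_column]))
    return frames


# Tiny invented sample in the assumed layout (not real detector output).
sample = io.StringIO(
    'Frame Number,Timecode,content_val\n'
    '1,00:00:00.000,0.0\n'
    '2,00:00:00.033,1.5\n'
    '3,00:00:00.067,42.0\n'
)
print(frames_above_threshold(sample))
```

For the real file, pass the object returned by `open('stats_file.txt', newline='')` instead of the sample. The result only approximates what the detector reports, since a detector may apply further rules (such as a minimum shot length) on top of the raw score.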

`shot_list` is now a list of **`FrameTimecode`** pairs representing the start/end of each shot (try printing `shot_list`). 

In [4]:
shot_list

[(00:00:00.000 [frame=0, fps=29.970], 00:00:02.002 [frame=60, fps=29.970]),
 (00:00:02.002 [frame=60, fps=29.970], 00:00:26.026 [frame=780, fps=29.970]),
 (00:00:26.026 [frame=780, fps=29.970], 00:00:46.246 [frame=1386, fps=29.970]),
 (00:00:46.246 [frame=1386, fps=29.970],
  00:00:55.455 [frame=1662, fps=29.970]),
 (00:00:55.455 [frame=1662, fps=29.970],
  00:01:04.698 [frame=1939, fps=29.970]),
 (00:01:04.698 [frame=1939, fps=29.970],
  00:01:12.906 [frame=2185, fps=29.970]),
 (00:01:12.906 [frame=2185, fps=29.970],
  00:01:22.115 [frame=2461, fps=29.970]),
 (00:01:22.115 [frame=2461, fps=29.970],
  00:02:17.471 [frame=4120, fps=29.970]),
 (00:02:17.471 [frame=4120, fps=29.970],
  00:07:13.733 [frame=12999, fps=29.970]),
 (00:07:13.733 [frame=12999, fps=29.970],
  00:07:31.918 [frame=13544, fps=29.970]),
 (00:07:31.918 [frame=13544, fps=29.970],
  00:07:52.439 [frame=14159, fps=29.970]),
 (00:07:52.439 [frame=14159, fps=29.970],
  00:09:12.252 [frame=16551, fps=29.970]),
 (00:09:12.2

Next, let’s print the shot list in a more readable format by iterating over it:

In [5]:
for i, shot in enumerate(shot_list):
    print('Shot %2d: Start %s / Frame %d, End %s / Frame %d' % (
        i+1,
        shot[0].get_timecode(), shot[0].get_frames(),
        shot[1].get_timecode(), shot[1].get_frames(),))

Shot  1: Start 00:00:00.000 / Frame 0, End 00:00:02.002 / Frame 60
Shot  2: Start 00:00:02.002 / Frame 60, End 00:00:26.026 / Frame 780
Shot  3: Start 00:00:26.026 / Frame 780, End 00:00:46.246 / Frame 1386
Shot  4: Start 00:00:46.246 / Frame 1386, End 00:00:55.455 / Frame 1662
Shot  5: Start 00:00:55.455 / Frame 1662, End 00:01:04.698 / Frame 1939
Shot  6: Start 00:01:04.698 / Frame 1939, End 00:01:12.906 / Frame 2185
Shot  7: Start 00:01:12.906 / Frame 2185, End 00:01:22.115 / Frame 2461
Shot  8: Start 00:01:22.115 / Frame 2461, End 00:02:17.471 / Frame 4120
Shot  9: Start 00:02:17.471 / Frame 4120, End 00:07:13.733 / Frame 12999
Shot 10: Start 00:07:13.733 / Frame 12999, End 00:07:31.918 / Frame 13544
Shot 11: Start 00:07:31.918 / Frame 13544, End 00:07:52.439 / Frame 14159
Shot 12: Start 00:07:52.439 / Frame 14159, End 00:09:12.252 / Frame 16551
Shot 13: Start 00:09:12.252 / Frame 16551, End 00:11:00.527 / Frame 19796
Shot 14: Start 00:11:00.527 / Frame 19796, End 00:11:40.633 / Fr

## Next Steps

Let us use the methods presented in the lecture to determine the accuracy of SBD, in particular the Precision and Recall metrics.
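For reference, the metrics computed in the cells that follow use the standard definitions, where $t_p$, $f_p$, $f_n$ and $t_n$ are the numbers of true positives, false positives, false negatives and true negatives. The symbols $P$, $R$ and $A$ are introduced here only for illustration:

$$P = \frac{t_p}{t_p + f_p}, \qquad R = \frac{t_p}{t_p + f_n}, \qquad A = \frac{t_p + t_n}{t_p + t_n + f_p + f_n}$$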

In [6]:
# End frame of each detected shot; the last value is the end of the video, not a cut
positives = [shot[1].get_frames() for shot in shot_list]
print(positives)

[60, 780, 1386, 1662, 1939, 2185, 2461, 4120, 12999, 13544, 14159, 16551, 19796, 20998, 22079, 25046, 25618, 30956, 34542, 34999, 35305, 35611, 35973, 40058, 40098]


In [7]:
with open('ref_UGS05.csv') as file_object:
    # Skip the header row, then parse one frame number per line
    ground_truth = [int(line) for line in file_object.readlines()[1:]]
print(ground_truth)

[60, 780, 1386, 1662, 1939, 2185, 2461, 4120, 12999, 13544, 14159, 16551, 19796, 20998, 22079, 23041, 24163, 25046, 27080, 29383, 30956, 31320, 34542, 34999, 35305, 35611, 35973, 36575, 36877, 37959, 38861, 40058]
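If a reference file contains more than one column, the frame numbers can be selected by column name instead of by position. The sketch below uses the standard `csv` module and the `manual_past_f_num` column named earlier; the helper `read_reference_frames`, the extra `shot_id` column and the sample rows are hypothetical.

```python
import csv
import io


def read_reference_frames(csv_file, column='manual_past_f_num'):
    """Read reference shot-boundary frame numbers from the named CSV column."""
    return [int(row[column]) for row in csv.DictReader(csv_file) if row[column]]


# Invented two-column sample; the real file may have a different layout.
sample = io.StringIO('shot_id,manual_past_f_num\n1,60\n2,780\n3,1386\n')
print(read_reference_frames(sample))
```

With a real file, pass the object returned by `open('ref_UGS05.csv', newline='')`. Rows with an empty value in the chosen column are skipped.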


Obtaining lists of true positives and false positives:

In [8]:
true_positives = []
false_positives = []
for positive in positives:
    if positive in ground_truth:
        true_positives.append(positive)
    else:
        false_positives.append(positive)

In [9]:
print(true_positives)

[60, 780, 1386, 1662, 1939, 2185, 2461, 4120, 12999, 13544, 14159, 16551, 19796, 20998, 22079, 25046, 30956, 34542, 34999, 35305, 35611, 35973, 40058]


In [10]:
print(false_positives)

[25618, 40098]


Obtaining values of $t_p$ and $f_p$:

In [11]:
tp = len(true_positives)
print(tp)

23


In [12]:
fp = len(false_positives)
print(fp)

2


Obtaining list of false negatives:

In [13]:
false_negatives = [
    false_negative for false_negative in ground_truth if false_negative not in positives
]
print(false_negatives)

[23041, 24163, 27080, 29383, 31320, 36575, 36877, 37959, 38861]
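The exact-match comparison above counts a detection as correct only if its frame number is identical to a reference one. When two tools differ by a frame or so (for example because of the `0`/`1` numbering issue), a tolerance can help. The sketch below is one possible approach, not the method prescribed by the lecture: the function `match_boundaries` and its `tolerance` parameter are hypothetical. With `tolerance=0` it reproduces the lists obtained above.

```python
def match_boundaries(detected, reference, tolerance=0):
    """Greedily pair each detection with the nearest unused reference boundary.

    Returns (true_positives, false_positives, false_negatives).
    """
    unused = sorted(reference)
    true_positives, false_positives = [], []
    for frame in sorted(detected):
        candidates = [ref for ref in unused if abs(ref - frame) <= tolerance]
        if candidates:
            nearest = min(candidates, key=lambda ref: abs(ref - frame))
            unused.remove(nearest)  # each reference boundary is matched at most once
            true_positives.append(frame)
        else:
            false_positives.append(frame)
    return true_positives, false_positives, unused


# Values copied from the outputs above
positives = [60, 780, 1386, 1662, 1939, 2185, 2461, 4120, 12999, 13544,
             14159, 16551, 19796, 20998, 22079, 25046, 25618, 30956, 34542,
             34999, 35305, 35611, 35973, 40058, 40098]
ground_truth = [60, 780, 1386, 1662, 1939, 2185, 2461, 4120, 12999, 13544,
                14159, 16551, 19796, 20998, 22079, 23041, 24163, 25046,
                27080, 29383, 30956, 31320, 34542, 34999, 35305, 35611,
                35973, 36575, 36877, 37959, 38861, 40058]

tp_list, fp_list, fn_list = match_boundaries(positives, ground_truth, tolerance=0)
print(len(tp_list), len(fp_list), len(fn_list))
```

Because each reference boundary is consumed once it is matched, two detections close to the same cut are not both counted as true positives.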


Obtaining value of $f_n$:

In [14]:
fn = len(false_negatives)
print(fn)

9


Obtaining value of $t_n$:

In [15]:
with open('stats_file.txt') as file_object:
    # The number of lines in the stats file is used as the total frame count
    tn = len(file_object.readlines()) - tp - fp - fn
print(tn)

40064


Obtaining values of Precision, Recall and Accuracy:

In [16]:
p = tp / (tp + fp)
print(p)

0.92


In [17]:
r = tp / (tp + fn)
print(r)

0.71875


In [18]:
a = (tp + tn) / (tp + tn + fp + fn)
print(a)

0.9997256721033468
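Accuracy is close to 1 here mainly because true negatives dominate: 40064 of the 40098 counted items are frames where no boundary exists and none was reported. A measure that ignores $t_n$ can therefore be more informative. One common choice, which may or may not have been covered in the lecture, is the F1 score, the harmonic mean of Precision and Recall. The helper `f1_score` below is a hypothetical sketch:

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, computed from raw counts."""
    denominator = 2 * tp + fp + fn
    # Return 0.0 when there are no detections and no reference boundaries
    return 2 * tp / denominator if denominator else 0.0


# Counts obtained above for UGS05.mpg
print(f1_score(tp=23, fp=2, fn=9))
```

Writing the score as $2 t_p / (2 t_p + f_p + f_n)$ is algebraically the same as $2PR / (P + R)$ and avoids a division by zero when either Precision or Recall is undefined.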


If time permits, after testing SBD on video content with easily detectable shot boundaries, please try downloading a video (or videos) in which the shot boundaries are less visible, and repeat the tests for these videos as well.

## Report

In the report (if one is required; please check), please discuss the methods presented in the lecture for determining the accuracy of SBD.