In [None]:
from IPython.display import display, HTML

display(HTML("<style>.jp-Cell { max-width: 1200px !important; }</style>"))

# VRD Notebook 

Beta version 0.1

# Introduction

This Jupyter Notebook is the home of the Video Reuse Detector (VRD) – a toolbox for identifying visual similarities in video archives. If you are not familiar with the basic functions of Jupyter Notebook, we suggest you begin by having a look at [this tutorial](https://jupyter-notebook.readthedocs.io/en/stable/notebook.html).

In what follows, you will access an interactive computing environment and programming interface that allows you to run the VRD. You will also be guided through the VRD’s computational workflow and receive information about the basic logic and ideas behind the software.

The VRD is intended to help archivists and humanistic scholars study video reuse and find patterns of similarity across audiovisual databases. You can use it to explore how one or several selected video clips are reused within a larger video database, or match and compare all videos within a given database against each other. 

The VRD has been developed at [Humlab](https://www.umu.se/en/humlab/), [Umeå University](https://www.umu.se) within the research project [European History Reloaded](https://www.cadeah.eu/) and you can find an overarching description of how the toolkit works [here](https://videoreusedetector.github.io/). We strongly recommend you read this documentation before you proceed.

The open source code for all elements of the VRD can be found on [GitHub](https://github.com/humlab/vrd). Note that the current beta version of the VRD only processes images and not sound.

## Video requirements

 
The VRD can process video encodings supported by the [FFmpeg library](https://ffmpeg.org/). If the video files you want to work with are saved in another format, you need to convert them. Note that file extensions are not guaranteed to match the underlying video file, especially if the file was not created by you. We therefore suggest you double-check that video encodings and file formats match.
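
One way to double-check that a file extension matches the underlying encoding is to ask `ffprobe` (shipped with FFmpeg) for the codec of the first video stream. The following is a minimal sketch, not part of the VRD: the helper names `build_ffprobe_command` and `probe_video_codec` are hypothetical, and the sketch assumes `ffprobe` is installed and available on your `PATH`.

```python
import shutil
import subprocess


def build_ffprobe_command(video_path):
    """Build an ffprobe call that prints the codec of the first video stream."""
    return [
        "ffprobe",
        "-v", "error",                        # only report errors, no banner
        "-select_streams", "v:0",             # look at the first video stream
        "-show_entries", "stream=codec_name",  # print the codec name only
        "-of", "default=noprint_wrappers=1:nokey=1",
        str(video_path),
    ]


def probe_video_codec(video_path):
    """Return the codec name reported by ffprobe, or None if it cannot be read."""
    if shutil.which("ffprobe") is None:
        return None  # ffprobe is not installed or not on PATH
    result = subprocess.run(
        build_ffprobe_command(video_path),
        capture_output=True,
        text=True,
        check=False,  # unreadable files simply yield an empty result
    )
    return result.stdout.strip() or None


# Hypothetical file name: replace it with one of your own videos.
print(probe_video_codec("example_video.mov"))
```

If the reported codec is not what you expect from the file extension, converting the file with FFmpeg before adding it to your video folder is the safer option.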

Our tests have shown that the VRD can detect video reuse even if the original content has been visually modified and distorted (such as when the color or composition of the image has been changed) as long as sufficient contrast in the footage is maintained. 

However, a few video types are less well suited to the current beta version of the VRD. This primarily includes videos with many textual or symbolic overlays, such as subtitles, news show banners, and/or TV channel logos, since these can distort the toolkit’s analysis of visual similarities and generate false positives. Such overlays should not hinder the overall comparison, however.

For further discussion of the strengths and weaknesses of the VRD’s capacity to analyze different types of content, click [here](https://videoreusedetector.github.io/).

## Using subfolders 

To narrow down and customize your matching results it is good to prepare by sorting and categorizing your dataset. One preparation involves sorting videos into subfolders that help filter the search results and enable you to compare different categories of videos (such as producer, year of publication etc.).

If video files 1 and 2 are placed in subfolder A, while video file 3 is placed in subfolder B, for example, you will later be able to filter your search results to only obtain matches between subfolder A and B and remove comparisons between video 1 and 2 stored in subfolder A. This allows for targeted similarity searches. Note, however, that only one level is considered when subfolders are matched, and sub-subdirectories will not be accounted for.
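
The "one level only" rule can be illustrated with a short sketch. This is not the VRD's own code: the function name `top_level_subfolder` and the example paths are hypothetical, and the sketch only shows how a video nested in a sub-subfolder would still be attributed to its first-level subfolder.

```python
from pathlib import Path


def top_level_subfolder(video_path, main_folder):
    """Return the first-level subfolder of a video, or "" if it lies directly in main_folder."""
    relative = Path(video_path).relative_to(main_folder)
    # relative.parts is e.g. ("A", "video1.mp4") or ("B", "nested", "video4.mp4")
    return relative.parts[0] if len(relative.parts) > 1 else ""


videos = [
    "videos/XYZ/A/video1.mp4",
    "videos/XYZ/A/video2.mp4",
    "videos/XYZ/B/video3.mp4",
    "videos/XYZ/B/nested/video4.mp4",  # deeper level, still grouped under "B"
]
for video in videos:
    print(video, "->", top_level_subfolder(video, "videos/XYZ"))
```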

For instance, you might want to place one or several videos – say, footage of the moon landing – in subfolder A and all BBC documentary films produced between 1969 and 2000 in subfolder B. Using this technique, you will later be able to find instances where footage of the moon landing appears in late 20th century BBC documentaries.  

Alternatively, you may choose to place footage of the moon landing in subfolder A, and BBC documentaries produced in the 1970s in subfolder B, the 1980s in subfolder C, and the 1990s in subfolder D. If so, you can later narrow down the matching results even more and retrace how footage of the moon landing was depicted and contextualized during three decades.

Note that regardless of subfolder structure, the VRD will always, by design, first match all videos against all others, which means that using subfolders will not speed up processing. It will, however, allow for more flexible ways of presenting and filtering your search results.

If you want to work with subfolders, all you have to do at this stage is to create a suitable folder system in your local file directory. You will be guided through the process of instructing the VRD to locate and match these files in section [Filter neighbours from the same subfolder](#Filter-neighbours-from-the-same-subfolder).

## Brief notebook user guide

Once again, we strongly recommend you study a Jupyter Notebook tutorial closely before you start using the VRD, unless you are already familiar with how notebooks work.

In short, a notebook is split into three types of cells: markdown, code, and output cells. Markdown cells are used for informative text that has no direct impact on the code being written, while code cells specify what is to be calculated/computed. In output cells, the results of the executed code are displayed, for example in the form of graphs and tables.

In general, the values in code cells will only change when the cell has been run, and not when the text changes. The run feature can be found in the upper menu of the notebook and looks like a play button (see example below).

![run_example](img/run_notebook_example.png)

Aside from clicking run, you can, for example, stop an ongoing session of running code by clicking on the stop symbol, or run all cells in an entire notebook at the same time by clicking on the fast-forward symbol. As an alternative, you may want to consider using [shortcuts](https://towardsdatascience.com/jypyter-notebook-shortcuts-bf0101a98330).

If cells are run out of order (such as when you jump back in the notebook, make a change in a code cell, and then click run), it is important to remember to also run all subsequent code cells. Otherwise, your adjustments will not be fully applied.

If you want to restart a project after a major change, keep in mind that the VRD does not include a feature for ensuring that previous settings are kept. We therefore recommend that you delete the relevant project folder and recreate all content by running the notebook cells again whenever major settings in the VRD have been changed. Major settings include, for example, changing the neural network, adding or removing video files, or changing your folder structure.
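
Deleting a project folder can be done in your file browser, or from Python as in the sketch below. The helper `reset_project` is hypothetical and not part of the VRD. The deletion is irreversible, so the demonstration deliberately runs on a throwaway temporary folder rather than on a real project path.

```python
import shutil
import tempfile
from pathlib import Path


def reset_project(project_root_folder):
    """Delete a project folder and everything in it; return True if it existed."""
    folder = Path(project_root_folder)
    if not folder.is_dir():
        return False
    shutil.rmtree(folder)  # irreversible: removes frames, database file and outputs
    return True


# Demonstration on a temporary folder standing in for a project directory.
demo_root = Path(tempfile.mkdtemp())
(demo_root / "frames").mkdir()
print(reset_project(demo_root))
```

After resetting a real project, rerun the notebook from the configuration cell onwards so that frames and frame features are regenerated with the new settings.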

Remember that once you start working on a project with the VRD notebook, you will always be able to add (and remove) features, widgets, filtering options, visualization modes etc. from the interface and underlying code. We warmly welcome users to tweak and extend the VRD’s features and hope that you will share your improvements with us on [GitHub](https://github.com/humlab/vrd).

## Transparency and the VRD

 
As previously mentioned, the VRD contains a mix of explanatory markdown, code, and output cells (including tables, graphs, and visual examples), yet it is important to point out that much of the VRD’s code remains hidden from the immediate notebook interface. This is primarily to shorten the length of the notebook and facilitate a cleaner and more straightforward workflow.
 
When deciding what code and settings to address in the notebook, we have focused on features that are either necessary for the VRD to run, or that have a great impact on the final matching results and performance of the tool. We have also focused on discussing features that we have found to be practical and useful when using the toolkit. 
 
We would like to acknowledge, however, that there are many more tweaks and adjustments that can be made to the overall function of the VRD. Furthermore, it is not possible to grasp the full workings and operations of the VRD by only reading this notebook. If you are interested in studying the VRD’s source code in full detail, you need to visit the project’s [GitHub page](https://github.com/humlab/vrd).

# Setup notebook

## Initialize notebook environment

To begin using the VRD, you first need to import libraries required for the VRD to function. There is normally no need for a user to modify these values - simply run the code cell below.


In [None]:
%load_ext autoreload
# Necessary imports
import ipywidgets as widgets
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import os

from tqdm.notebook import tqdm

from vrd import neural_networks
from vrd.neural_networks import NeuralNetworks
from vrd.faiss_helper import calculate_distance_list
from vrd.notebook_helper import get_grid, execute_filter


video_source_base_folder = "/home/jovyan/videos/"

# Adjust configurations

Next, you need to adjust some initial configurations before the VRD can run. After this markdown cell, you can find the code cell where the configurations described in upcoming sections can be adjusted according to your preferences.

We suggest you 1) start by having a quick look at this code cell to get an initial overview of which settings can be changed, 2) carefully read through all sections in the chapter Adjust configurations to understand the implication of each setting, and 3) come back to this code cell to make your adjustments.

Don’t forget to run the code cell when you are done modifying the code, otherwise your changes will not be applied (place your cursor in the code cell and click on the run symbol in the top menu, see section [Brief notebook user guide](#Brief-notebook-user-guide)).

In [None]:
# User adjustable configurations
project_name = 'add_title_here'
video_source_folder = 'main_folder_name'

additional_video_extensions = ".mov"
number_of_neighbours_considered = 250
dl_network = neural_networks.get_network(NeuralNetworks.resnet50)

# dl_network.default_layer = 100

# No need to change these configurations
project_root_folder = f"/home/jovyan/notebooks/projects/{project_name}/"
database_file = f"{project_root_folder}/database_file"
frame_location = f"{project_root_folder}/frames/"
full_video_source_directory = os.path.join(
    video_source_base_folder, video_source_folder
)

### Select project title

 
Start by selecting a title for your project by adjusting the following line in the code cell in [Adjust configurations](#Adjust-configurations): 

<code>project_name = '<span style="color:red">add_title_here</span>'</code>

If you want your project to be called “Test”, for example, then change the line to say:

<code>project_name = '<span style="color:red">Test</span>'</code>

This will mainly impact the output directory of the current project (that is, where newly produced files will be stored). 

### Select video folder

Next, you need to instruct the VRD where to find the videos you want to work with. 

To begin with, you need to place your video files in the correct VRD folder. When you downloaded and installed the VRD docker container, a zip-file named VRD was installed on your local computer. During the installation process, you should have opened and un-zipped this file, which contains two folders entitled “mounts” and “vrd”. 

Place the video files you want to work with in mounts > videos. Your files should be saved and collected in one main folder, which you can name according to your preferences. If you want to work with subfolders, these should be stored inside the main folder (again, note that only one level of subfolders will be considered). 

When your video files have been placed in the right spot, you can instruct the VRD to find them by adjusting the following prompt in the code cell above (section [Adjust configurations](#Adjust-configurations)):

<code>video_source_folder = '<span style="color:red">main_folder_name</span>'</code>

If your files are stored in a folder called “XYZ” in the mounted videos folder on your local computer, for example, then input:

<code>video_source_folder = '<span style="color:red">XYZ</span>'</code>

### Add non-standard file extensions

The default video extensions considered by the VRD include <code>.mp4</code>, <code>.flv</code> and <code>.avi</code>. By default, you are also asked to add the .mov video extension to the VRD in the following prompt:

<code>additional_video_extensions = '<span style="color:red">.mov</span>'</code>

If the video files you want to work with are saved using other file extensions, you need to add these manually. If you are working with .ogg, .mov and .wmv videos, for example, then change the following line to say:

<code>additional_video_extensions = ['<span style="color:red">.mov', '.ogg', '.wmv</span>']</code>

If you are already working with standard file extensions, leave this line as it is.
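
The effect of adding extensions can be pictured with the following sketch. It is not the VRD's implementation: `accepted_extensions` and `is_accepted` are hypothetical helpers, and the case-insensitive comparison is an assumption made for this illustration only.

```python
from pathlib import Path

DEFAULT_EXTENSIONS = (".mp4", ".flv", ".avi")  # the defaults named in the text


def accepted_extensions(additional=()):
    """Combine the default extensions with one extra extension or a list of them."""
    if isinstance(additional, str):
        additional = [additional]  # a single string such as ".mov"
    return {ext.lower() for ext in (*DEFAULT_EXTENSIONS, *additional)}


def is_accepted(file_name, additional=()):
    """Check whether a file name ends in one of the accepted extensions."""
    return Path(file_name).suffix.lower() in accepted_extensions(additional)


print(is_accepted("interview.ogg"))                           # defaults only
print(is_accepted("interview.ogg", [".mov", ".ogg", ".wmv"]))  # with additions
```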

### Adjust number of similarity neighbours considered  

In the upcoming section [Match frames](#Match-frames), you will be guided through the process of performing the actual similarity comparison between videos. An important step in this procedure involves deciding how many similarity neighbors the VRD should consider for every analyzed video frame.

The VRD’s default setting for neighboring matches is a maximum of 250 matches per analyzed video frame. This can be seen on the following line in the code cell above (section [Adjust configurations](#Adjust-configurations)):

<code>number_of_neighbours_considered = <span style="color:red">250</span></code>

For now, we suggest you leave this default setting as it is. You will be informed about how to adjust this number at a later stage in the notebook if/when necessary.

### Select neural network 

The current version of the VRD is based on [TensorFlow](https://www.tensorflow.org/), which includes the Python API Keras. The VRD supports four of the pre-trained convolutional neural networks that Keras makes available: [ResNet50](https://keras.io/api/applications/resnet/), [InceptionV3](https://keras.io/api/applications/inceptionv3/), [VGG16](https://keras.io/api/applications/vgg/), and [MobileNet](https://keras.io/api/applications/mobilenet/). The VRD uses these neural nets to process and identify visual similarities in videos (read more about how this works [here](https://videoreusedetector.github.io/)).
 
We recommend you start with the neural net called ResNet50, which is set as the default option in the VRD. This is shown by the following line in the code cell in section [Adjust configurations](#Adjust-configurations):

<code>dl_network = neural_networks.get_network(NeuralNetworks.<span style="color:red">resnet50</span>)</code>

If you want to try one of the other networks, delete the red text in the code cell and change it to one of the following: <code>mobilenet</code>, <code>inception_v3</code>, or <code>vgg16</code>. 

### Override default neural network layer

Convolutional neural networks operate according to multiple layers of analysis and abstraction where each layer processes an input and produces an output, which is passed on to the next layer. Somewhere before the final analytical layer is reached, a convolutional neural net will produce a compressed interpretation of the key visual characteristics of images. 

We call these interpretations “frame features” and the VRD is designed to use them to find patterns of similarity across videos, while disregarding the analysis done in the remaining layers of the neural net (read more about how this works [here](https://videoreusedetector.github.io/)). 

For the neural net ResNet50, for example, we have set the default layer to conv5_block3_1_conv. This default layer was selected to offer a reasonable abstraction of the image while keeping the layer size small to minimize computation time.

Advanced users who are familiar with the layered design and workings of ResNet50 may find it desirable to override this default layer, however, since this will impact the level of abstraction of the matches and can lead to improved performance depending on the dataset being analyzed. To change this setting, remove the # in front of the following line in the code cell above (section [Adjust configurations](#Adjust-configurations)) and change to the desired value:

<code># dl_network.default_layer = <span style="color:red">100</span></code>

If you are using a CNN other than ResNet50, the default layer will also have to be adjusted.

# Extract frames and generate frame features

## Introduction
In this section, you will be guided through the process of:

- Extracting frames from the selected videos you want to work with
- Removing frames with monochrome color
- Generating and exporting frame features with the help of your selected neural net
- Importing the frame features to the Faiss index for later comparison

Instead of being presented with one code cell in the beginning of the section, each subsection will be immediately followed by the relevant code cell. 

## Extract video frames
 
The VRD performs its analysis by breaking videos down into still frames and instructing a selected neural net to interpret the key features of each frame. Frame features are then exported, matched, and compared when the VRD searches for examples of video reuse (see also section [Override default neural network layer](#Override-default-neural-network-layer)).
 
To initiate the process of generating and comparing frame features, it is necessary to first extract still frames from the videos you want to work with. By default, the VRD is instructed to extract one frame per second of video. 
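
With one frame per second, the number of extracted frames is roughly the total length of your videos in seconds. The sketch below is a back-of-the-envelope estimate only: `estimated_frame_count` is a hypothetical helper, the durations are made up, and the exact rounding at the end of a video is an assumption.

```python
def estimated_frame_count(duration_seconds, frames_per_second=1.0):
    """Rough number of frames extracted from a video of the given length."""
    return int(duration_seconds * frames_per_second)


# Hypothetical video lengths in seconds.
durations = {"video1.mp4": 95.0, "video2.mp4": 3600.0}
total = sum(estimated_frame_count(seconds) for seconds in durations.values())
print(f"Expect roughly {total} frames")
```

Comparing such an estimate with the total reported after extraction is a quick way to spot videos that were skipped.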

Your extracted frames will be stored in the folder <code>/notebooks/projects/project_name/frames</code>, where <i>project_name</i> is the selected name of the project as described in the section [Select project title](#Select-project-title).

To initiate the extraction of frames, run the code cell below. You do not have to make any changes in the code. After the code cell, you can see the results of the extraction. You can for example use this information to double-check that all your desired video files have been processed, and to get an initial overview of how many frames were extracted from your content in total.


In [None]:
%%time
from vrd import frame_extractor as fe

frames = fe.FrameExtractor(
    full_video_source_directory,
    frame_location,
    additional_video_extensions=additional_video_extensions,
)
print(f"Total number of extracted frames found: {len(frames.all_images)}")

## Generate and save frame features
In this step, we instruct the previously selected neural net to process the extracted frames and save its interpretations of frame features in a separate database.
 
From this point onwards, the VRD only analyzes this database and its content.

In more detail, the code will do the following: 

- Remove black borders around frames since these may distort the analysis
- Resize frames according to the requirements of the respective neural net
- Use the frames as input to the neural network and calculate all layers in the CNN
- Extract and save the previously specified layer (see section [Override default neural network layer](#Override-default-neural-network-layer)) as a frame feature to the database
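
To give a sense of what the first step in the list above involves, here is a minimal sketch of black-border removal on a tiny grayscale frame. It is not the VRD's implementation: the function `crop_black_borders`, the list-of-rows frame representation, and the brightness threshold of 10 are all assumptions chosen to keep the example self-contained.

```python
def crop_black_borders(frame, threshold=10):
    """Crop rows and columns whose pixels are all at or below threshold.

    frame is a list of rows, each row a list of grayscale values (0-255).
    """
    rows = [i for i, row in enumerate(frame) if any(p > threshold for p in row)]
    if not rows:
        return []  # the whole frame is dark, nothing remains
    cols = [
        j for j in range(len(frame[0]))
        if any(row[j] > threshold for row in frame)
    ]
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in frame[top:bottom + 1]]


letterboxed = [
    [0, 0, 0, 0],
    [0, 120, 200, 0],
    [0, 90, 30, 0],
    [0, 0, 0, 0],
]
print(crop_black_borders(letterboxed))
```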

This will result in a database containing the features for each of the frames in the dataset. All required settings have already been set in the code cell below, and there is no reason to change anything in this cell.

Note that this step may take some time. You can follow the process in the progress bar below. 

In [None]:
%%time
from vrd import keras_layer_helper as klh

klh.add_layer_activations_to_database(dl_network, database_file, frames)

## Import frame features to Faiss index
 
To speed up the comparison of extracted frames, the VRD uses the [Faiss index](https://github.com/facebookresearch/faiss) – a library that specializes in large-scale similarity searches – to calculate video matches. If you want to learn more about how Faiss works, you can find an overview of its workings [here](https://towardsdatascience.com/facebook-ai-similarity-search-7c564daee9eb), visit the Faiss GitHub page via [this link](https://github.com/facebookresearch/faiss), or have a look at the [article](https://arxiv.org/abs/1702.08734) where the Faiss index was first introduced.

Before frames can be compared, the previously generated frame features need to be imported to the Faiss index.
 
Run the code cell below to initiate this process. You do not have to make any changes in the code.

In [None]:
%%time
from vrd import faiss_helper

faiss_index = faiss_helper.get_faiss_index(database_file, dl_network, frames)

# Match frames

## Introduction
In this step, we perform the frame matching that is central to the VRD’s ability to identify video reuse. This involves using the Faiss library to:
 
- Adjust the number of neighboring matches 
- Index the pre-processed frames
- Compute distance metrics 
- Output a list of similarity neighbors

## Adjust neighbouring matches  

In the section [Adjust number of similarity neighbours considered](#Adjust-number-of-similarity-neighbours-considered), you were introduced to the setting that controls how many similarity neighbors the VRD considers, which defaults to 250 nearest neighbors per analyzed frame.

This setting will greatly impact the performance and processing speed of the VRD and in this section, we explain what it implies in more detail.

The VRD will always compare all extracted frames against each other and ascribe a so-called distance metric to each matched pair of frames. Distance metrics constitute a core element of the VRD’s evaluation of visual similarity. A low distance metric value (or short distance) will indicate high visual similarity and a high distance metric value (or long distance) will indicate low visual similarity.

The distance metric 0.0 represents the absolute shortest and closest similarity the VRD can ascribe to two compared frames, and essentially corresponds to the distance that a frame would have to itself (i.e. an absolute match). The upper limit for what would represent a “correct” distance metric is impossible to determine in a general sense, however, as it is dynamic and for example changes with the neural network used, the quality of the source material, and the number of images/frames in the index.
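
The idea of a distance between two frames can be made concrete with a small sketch. It assumes that frame features are plain lists of numbers and uses the squared Euclidean distance, which is what a flat L2 index in Faiss reports. Which metric the VRD actually configures is not stated here, so treat the choice as an assumption, and note that the function name `squared_l2_distance` and the toy vectors are hypothetical.

```python
def squared_l2_distance(features_a, features_b):
    """Sum of squared differences between two equally long feature vectors."""
    if len(features_a) != len(features_b):
        raise ValueError("feature vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(features_a, features_b))


frame_1 = [0.0, 3.0, 4.0]
frame_2 = [0.0, 0.0, 0.0]
print(squared_l2_distance(frame_1, frame_1))  # a frame compared with itself
print(squared_l2_distance(frame_1, frame_2))  # two different frames
```

Because real frame features contain many more values than this toy example, distances in practice are far larger, which is why there is no universal upper limit for a "correct" match.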
 
Importantly, if the user requests it, the VRD will assign a distance metric to any matched pairs of frames, even if the frames are identified as having a very low (and even close to non-existent) visual similarity. Doing so will often take excessive amounts of time and result in irrelevant matching results, however.

To avoid being shown an extensive amount of poor matches, we suggest that a) a limit is set for how many neighboring matches the VRD should output for each frame, and b) that the VRD is instructed to reveal the nearest neighbors first.

In general, it is desirable to keep the number of neighboring matches as low as possible to improve processing speed. However, extracting too few neighbors can result in similar images being filtered out. 
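
The trade-off can be illustrated with a brute-force sketch that returns only the `k` nearest neighbours of a frame, nearest first. Faiss performs this search far more efficiently. The function `nearest_neighbours`, the two-dimensional toy features, and the use of squared Euclidean distance are assumptions made for this illustration.

```python
import heapq


def nearest_neighbours(query_index, features, k):
    """Return the k closest other frames as (distance, frame_index), nearest first."""
    query = features[query_index]
    distances = []
    for index, candidate in enumerate(features):
        if index == query_index:
            continue  # skip the frame's zero-distance match with itself
        distance = sum((q - c) ** 2 for q, c in zip(query, candidate))
        distances.append((distance, index))
    return heapq.nsmallest(k, distances)


features = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [5.0, 5.0]]
print(nearest_neighbours(0, features, k=2))
```

With `k=2`, the distant fourth frame is never reported for the first frame. Had it been a genuine but heavily modified reuse, a too-small `k` would have hidden it.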

As stated in [Adjust number of similarity neighbours considered](#Adjust-number-of-similarity-neighbours-considered), we recommend you leave the setting at the default of 250 nearest neighbors per frame. If 250 neighbors per frame turns out to be too few (e.g. if all neighbors provided by the VRD later turn out to be correct matches), we suggest you go back and raise the limit by adjusting the relevant line in the code cell.

If it turns out that too few of the returned similarity neighbors can be considered “correct” matches, we suggest you first adjust the distance metric threshold (see section [Introduce distance metric threshold](#Introduce-distance-metric-threshold)) before experimenting with lowering the limit of nearest neighboring matches. Remember that if you go back and change/rerun the prompt in the [code cell](#Adjust-configurations), you also have to run all subsequent cells to keep them updated.

## Compute neighbour distances

In this step, we use the Faiss index created in section [Import frame features to Faiss index](#Import-frame-features-to-Faiss-index) to find the closest similarity neighbours for each frame. These identified similarity neighbours will constitute the source for the final matching results. 

Note that the computation of distance metrics may take some time. You can monitor the progress in the progress bar.

To initiate the process, run the code cell below. You do not have to make any adjustments in the code.

In [None]:
%%time
neighbours = calculate_distance_list(
    frames, database_file, faiss_index, neighbour_num=number_of_neighbours_considered
)

# Filter matching results

## Introduction
In this section, you are given the option to filter the matching results, depending on your aims and goals. The filtering options that are embedded in the VRD include:

- Removing identified nearest neighbors from the same video
- Introducing a distance metric threshold
- Filtering matching results based on subfolders
- Removing frames with monochrome color

After each filtering step, control checks will help you monitor what effect the filtering had on the matching output as a whole.

To begin with, you will be shown a table that illustrates how the filtering operation affected your dataset in real numbers. As an example, such a table may look like this:

![filter_table_example](img/filter_before_after_table.jpg)

The row “Average no. of neighbours” refers to the average number of remaining neighbours per frame. Likewise, the row “Average distance metrics” refers to the remaining average distance metric per frame. Note that after the first filtering operation, the cells Lost frames (number) and Lost frames (percentage) will always be blank.
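
To clarify what such table rows measure, the sketch below computes comparable figures for a toy set of matches. It is not the VRD's code: the `{frame: [(neighbour, distance), ...]}` structure, the helper names, and the reading of "lost frames" as frames that had neighbours before a filter but none afterwards are all assumptions.

```python
def summarise(neighbours):
    """Summary figures for a {frame: [(neighbour_frame, distance), ...]} mapping."""
    with_neighbours = {f: n for f, n in neighbours.items() if n}
    distances = [d for matches in with_neighbours.values() for _, d in matches]
    return {
        "frames_with_neighbours": len(with_neighbours),
        "average_neighbours": len(distances) / len(with_neighbours) if with_neighbours else 0.0,
        "average_distance": sum(distances) / len(distances) if distances else 0.0,
    }


def lost_frames(before, after):
    """Number and percentage of frames that had neighbours before but none after."""
    had = {f for f, n in before.items() if n}
    lost = {f for f in had if not after.get(f)}
    percentage = 100.0 * len(lost) / len(had) if had else 0.0
    return len(lost), percentage


before = {
    "a_001": [("b_004", 100.0), ("a_002", 50.0)],
    "b_004": [("a_001", 100.0)],
    "c_009": [("c_010", 20.0)],
}
after = {"a_001": [("b_004", 100.0)], "b_004": [("a_001", 100.0)], "c_009": []}
print(summarise(before))
print(summarise(after))
print(lost_frames(before, after))
```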

In addition, you will be shown a series of histograms that illustrate the effects of the filtering. As an example, the histograms may look like this:

![filter_histogram_example](img/filter_histogram_example.jpg)

The top two histograms will show what the total number of remaining neighbours and distance metric distribution in your dataset looked like before your latest filtering operation, while the two bottom histograms will show the same figures after your latest filtering operation. In all of these cases, the figures refer to the results for individual frames.

The histograms are meant to help you get a (visual) sense of the effects of your most recent filtering operation and can for example guide you in the process of adjusting the filtering settings. 

## Filter neighbors from the same video

In the previous sections [Adjust neighboring matches](#Adjust-neighbouring-matches) and [Compute neighbour distances](#Compute-neighbour-distances), the VRD was instructed to index all pre-processed frames and compute a distance metric for a maximum of 250 nearest neighbors for each analyzed frame. In this setup, however, the identified similarity neighbors for analyzed frames can come from any original video – including one and the same video clip. 

If the scenery in a video is similar or identical several seconds in a row, for example, the VRD will likely identify a continuous series of frames from the same video as highly alike. While this is _technically correct_ (since the frames might indeed be very similar), it runs the risk of distorting the calculation of distance metrics and can result in other more interesting or relevant video matches being pushed to the side – sometimes resulting in zero remaining neighbors.

To deal with this problem, it is possible to remove and filter out the display of any identified similarity neighbors that come from the same video. If you happen to be interested in studying video reuse within one and the same video clip, we instead suggest you initiate a new project where only one video is added to the main video folder to begin with.
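
Conceptually, this filter drops every neighbour that belongs to the same video as the frame it was matched to. The sketch below shows the idea on toy data. The function name, the `{frame: [(neighbour, distance), ...]}` structure, and the `video_of` lookup are hypothetical and do not reflect how the VRD stores its matches internally.

```python
def drop_same_video(neighbours, video_of):
    """Drop neighbours that come from the same video as the frame they match."""
    return {
        frame: [
            (other, distance)
            for other, distance in matches
            if video_of[other] != video_of[frame]
        ]
        for frame, matches in neighbours.items()
    }


video_of = {"a_001": "a.mp4", "a_002": "a.mp4", "b_004": "b.mp4"}
neighbours = {
    "a_001": [("a_002", 50.0), ("b_004", 100.0)],
    "a_002": [("a_001", 50.0)],
    "b_004": [("a_001", 100.0)],
}
print(drop_same_video(neighbours, video_of))
```

Note how frame `a_002` is left without any neighbours, which is the "zero remaining neighbors" situation described above.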
 
To filter out the display of neighbors from the same video, run the code cell below. You do not have to make any changes in the code. Below the code cell, you can see the result of the filtering operation. You may want to use this information to get a sense of how many identified frame matches originated from the same video clips, and how many matches were removed as a result of the filtering.

In [None]:
%%time
execute_filter(neighbours, neighbours.filter_same_video);

## Introduce distance metric threshold
 
As previously mentioned, the VRD will compare extracted frames against each other and ascribe a so-called distance metric to each pair of frames. In this step, you have the option to select a threshold for which distance metrics should be shown in the section [Output final matching results](#Output-final-matching-results).
 
For instance, you may instruct the VRD to only show frame pairs with an assigned distance metric of less than 30 000. If this threshold is accurate, it should greatly reduce the number of shown non-matching frames. If the distance metric threshold is set too low, however, it may result in correct similarity matches being lost. If you wish to apply a distance metric threshold, we therefore suggest you start high, study the results, and possibly lower or raise the threshold later.
 
Our experiments have shown that a decent starting point for a distance metric threshold lies somewhere around 30 000-40 000 when using the neural network ResNet50. In the settings below, the default threshold is therefore set to 40 000. You can change this value by editing the corresponding value in the code cell below:
 
<code>max_distance=<span style="color:red">40000</span></code>

Note, however, that a different neural net will likely require a different threshold value. It may also be the case that even if you are using ResNet50, your video material and the size of your dataset make a distance metric threshold of 40 000 inappropriate. In other words, use the distance metric threshold with caution and always double-check the result of the filtering operation. You can always go back and adjust the distance metric threshold later.

As a tip, we suggest you study the previous histograms to get a sense of how much data is lost when you introduce a specific distance metric threshold.  
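
A numeric complement to reading the histograms is to count how many matches would survive at a few candidate thresholds before committing to one. The sketch below does this on toy data. The helper names, the data structure, and the strict "less than" comparison at the boundary are assumptions, not the VRD's own behaviour.

```python
def keep_below_distance(neighbours, max_distance):
    """Keep only neighbours whose distance is below max_distance."""
    return {
        frame: [(other, d) for other, d in matches if d < max_distance]
        for frame, matches in neighbours.items()
    }


def surviving_matches(neighbours, thresholds):
    """Count how many neighbour pairs remain at each candidate threshold."""
    return {
        t: sum(len(m) for m in keep_below_distance(neighbours, t).values())
        for t in thresholds
    }


neighbours = {
    "a_001": [("b_004", 12000.0), ("b_009", 35000.0)],
    "b_004": [("a_001", 12000.0), ("c_002", 52000.0)],
}
print(surviving_matches(neighbours, [30000, 40000, 60000]))
```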

Below the code cell, you can see the result of the filtering operation. You may want to use this information to get a sense of how many frames were lost as a result of the filtering, or to get an idea of how the average number of nearest neighbours changed.

If you do not wish to add a distance metric threshold, simply skip this step and move on to the next section.

In [None]:
execute_filter(neighbours, neighbours.filter_maximum_distance, max_distance=40000);

## Filter neighbours from the same subfolder

In this step, you have the option to filter matches from the same subfolder. This is useful if you only want to see matches between specific sources – be it authors, production companies, or similar (see also [Using subfolders](#Using-subfolders)).

If you have sorted your dataset into three subfolders containing video content from three different decades – say, the 1960s, 1970s, and 1980s, for example – you can use this filtering option to make sure that you are not shown any matches between videos within the same decade/subfolder.
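
The decade example can be sketched as two steps: first drop matches within the same subfolder, then drop frames that are left with too few neighbours. Both helpers below are hypothetical stand-ins written for this illustration. They assume a `{frame: [(neighbour, distance), ...]}` structure and do not claim to reproduce the VRD's own filter functions.

```python
def drop_same_subfolder(neighbours, subfolder_of):
    """Drop neighbours stored in the same subfolder as the frame they match."""
    return {
        frame: [
            (other, d) for other, d in matches
            if subfolder_of[other] != subfolder_of[frame]
        ]
        for frame, matches in neighbours.items()
    }


def drop_frames_with_few_neighbours(neighbours, min_neighbours=1):
    """Remove frames that have fewer than min_neighbours remaining matches."""
    return {f: m for f, m in neighbours.items() if len(m) >= min_neighbours}


subfolder_of = {"f60_01": "1960s", "f60_02": "1960s", "f70_01": "1970s"}
neighbours = {
    "f60_01": [("f60_02", 80.0), ("f70_01", 120.0)],
    "f60_02": [("f60_01", 80.0)],
    "f70_01": [("f60_01", 120.0)],
}
across_decades = drop_same_subfolder(neighbours, subfolder_of)
print(drop_frames_with_few_neighbours(across_decades, min_neighbours=1))
```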

To initiate this filter – which conceptually works similarly to the [Filter neighbors from the same video](#Filter-neighbors-from-the-same-video) filter – simply arrange your files into subfolders in the correct video directory (see section [Select video folder](#Select-video-folder)) and run the code cell below. You do not have to make any changes in the code. If you do not want to filter neighbours from the same subfolder, simply skip this step and proceed to the next section.


In [None]:
execute_filter(neighbours, neighbours.filter_neighbours_in_same_subfolder)
neighbours.filter_few_neighbours(min_neighbours=1);

## Filter all neighbours not related to a specific subfolder

This filter allows you to filter out any matches that are not related to one or several subfolders. This is useful, for example, if you are only interested in exploring where/how a specific “reference” film - or set of films - reappears within an archive.

To return to a previously discussed example: if you are interested in studying how reference footage from the moon landing has been reused in BBC documentary films from three decades (the 1970s, 1980s, and 1990s) and have placed the BBC documentaries in three separate subfolders, you may want to exclude matches between the BBC subfolders from your search results, and only ask the VRD to present results that concern the moon landing subfolder. This will significantly narrow down and sharpen your search results (unless you are also interested in studying how video footage has been reused in BBC documentaries in general).

The filter supports either one-way or two-way filtering. With one-way filtering, only matches FROM the reference subfolder are kept. With two-way filtering, matches both TO and FROM the reference subfolder are kept.
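
The difference between the two modes can be sketched as follows. This is illustrative logic only, not the VRD's implementation: the function `keep_match` and the example folder names are hypothetical, and the sketch assumes that a match "FROM" a subfolder means that the analyzed frame belongs to a video in that subfolder.

```python
def keep_match(source_folder, neighbour_folder, reference_folders, keep_reverse_matches):
    """Decide whether a match survives the subfolder filter (illustrative logic only)."""
    if source_folder in reference_folders:
        return True  # match FROM a reference subfolder: always kept
    # match TO a reference subfolder: kept only with two-way filtering
    return keep_reverse_matches and neighbour_folder in reference_folders


# Invented example matches: (subfolder of analyzed frame, subfolder of neighbour)
matches = [
    ("moon_landing", "bbc_1970s"),
    ("bbc_1980s", "moon_landing"),
    ("bbc_1970s", "bbc_1980s"),
]
reference = {"moon_landing"}

one_way = [m for m in matches if keep_match(*m, reference, keep_reverse_matches=False)]
two_way = [m for m in matches if keep_match(*m, reference, keep_reverse_matches=True)]
print(one_way)
print(two_way)
```

In both modes, the match between the two BBC subfolders is dropped, since neither side belongs to the reference subfolder.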

To apply the filter, enter the name of the reference subfolder you wish to study in the following code in the cell below:

<code>execute_filter(neighbours, neighbours.filter_neighbours_not_in_subfolder, neighbours, \['<span style="color:red">moon_landing</span>'\], keep_reverse_matches=True)</code>
 
If you wish to add several reference folders, enter the name of all folders in a row, like this:

<code>execute_filter(neighbours, neighbours.filter_neighbours_not_in_subfolder, neighbours, \['<span style="color:red">Subfolder_1</span>', '<span style="color:red">Subfolder_2</span>', '<span style="color:red">Subfolder_3</span>'\], keep_reverse_matches=True)</code>
 
If you do not wish to allow reverse or two-way filtering, change the following value to “False”:

<code>keep_reverse_matches=<span style="color:red">True</span></code>
 
When you are done adjusting the code, remove the # in the prompt below, run the code cell, and proceed to the next section. After the code cell, you can see the results of the filtering operation.

In [None]:
%%time

# execute_filter(neighbours, neighbours.filter_neighbours_not_in_subfolder, neighbours, ['moon_landing'], keep_reverse_matches=True)

## Filter monochrome frames

To improve frame matching performance, we suggest you remove any frames that are monochrome, i.e. of a single color.

Monochrome frames will match other monochrome images with a very high accuracy (i.e. low distance metric), but we assume that these matches will mostly be irrelevant for users looking to study video reuse – unless they are interested in seeing lots of matched, pitch-black frames, for example.

To determine whether or not a frame is monochrome, the VRD finds the extremes in the red-green-blue (RGB) values in each frame and checks if the difference between the smallest and largest RGB value is below a designated threshold. By default, this threshold value is set to 40 and specified by the following argument in the code cell below:

<code>allowed_difference=<span style="color:red">40</span></code>

If the differences for all colors are below the designated threshold, the frame is considered monochrome and is filtered/removed from future matching results.
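
As a minimal sketch of this kind of check, the function below compares the smallest and largest value in each color channel against the threshold. It is written with plain NumPy under the assumption that the comparison is made per channel; it is not the code behind `get_monochrome_frames`, and the name `is_monochrome` is hypothetical.

```python
import numpy as np


def is_monochrome(frame, allowed_difference=40):
    """Return True if every colour channel varies by less than allowed_difference.

    frame: array of shape (height, width, 3) with RGB values in 0-255.
    """
    pixels = frame.reshape(-1, 3).astype(int)  # avoid uint8 wrap-around when subtracting
    spread = pixels.max(axis=0) - pixels.min(axis=0)  # per-channel max minus min
    return bool(np.all(spread < allowed_difference))


black = np.zeros((4, 4, 3), dtype=np.uint8)
noisy_black = black.copy()
noisy_black[0, 0] = (200, 10, 10)  # a single bright red pixel

print(is_monochrome(black))
print(is_monochrome(noisy_black))
```

Because the check relies on extremes, a single outlying pixel is enough to make an otherwise uniform frame count as non-monochrome in this sketch.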

Note that if the source material is of low image resolution, there is a risk that this approach will fail, as compression artifacts and/or other noise of a very different color may appear in otherwise monochrome images. This can cause the VRD to incorrectly assume that the frame is not monochrome.

Also note that this filtering option may take some time.


After the code cell, you can see the result of the operation. If you are not interested in removing monochrome frames from the search results, simply skip this step and move on to the next section.

In [None]:
%%time

is_same_color = neighbours.get_monochrome_frames(allowed_difference=40)
execute_filter(neighbours, neighbours.remove_indexes_from_distance_list, is_same_color)

# Output final matching results 

## Introduction
In this section, you will be shown the final matching results in two different sections: 1) based on each individual analyzed frame, and 2) based on so-called sequential frame matches – that is, instances where two or more sequential frames (i.e., frames that were extracted one after the other from the original files) from two videos have been given a distance metric below a specified value. 

In each section, you will first be shown a statistical overview, where the matching results are presented as text and tables. Next, you have the option of viewing the final matching results as image previews, i.e. miniature images of the analyzed frames that remain after your previously applied filtering settings. 

We suggest you use the statistical overview to get an initial sense of what videos might be interesting to study in more detail, and then use the visual frame comparisons to drill deeper into the matching results. We also suggest you use the visual frame comparisons to double-check if the distance metric thresholds and filtering options applied in the previous chapters seem adequate. 

If you find that your matching results contain a lot of inadequate matches (i.e. if the identified neighbours for your analyzed frames do not live up to your idea of what constitutes a “correct match”), or if you find that almost every matching result is 100% accurate, you might want to go back and adjust your filtering settings. If all matching results appear to be accurate, for example, you might have been too harsh in your previous filtering. This can result in removing correct matches from the final matching results. Likewise, a large amount of inadequate matches may be a sign that you could apply harsher filtering, to avoid dealing with excessive data.

Again, there is no definitive threshold that specifies what constitutes a correct/incorrect match when using the VRD, since the toolkit’s matching results represent an estimation of patterns of similarity across video files. This means that its final matching outputs should not be interpreted as definitive measurements of similarity. Rather, the statistics and frame comparisons are meant to function as a guide that can point users towards videos that might be interesting to study in more detail – primarily by actually opening the original video files and viewing the moving images. 

In other words, we strongly advise against exporting and using these statistics as an absolute proof of video reuse, and instead encourage users to approach them as an assistance tool in navigating large video datasets.

## Matching results for individual frames
This section displays matching results for individual frames. In the first three sections you can find the textual matching results. In the final two sections you can find visual previews.


### Videos with the most remaining similarity neighbours

To see what videos have the most remaining similarity neighbours (given previous filtering settings and distance metric thresholds), run the code cell below.

The matching results will show both the name of the original video (displayed in the column “Video name”) and how many similarity neighbours remain after the previously applied filters (displayed in the column “Remaining no. of neighbours”). The average number of neighbours is also displayed.

We suggest you change the sorting of the table by clicking on the column "Remaining no. of neighbours".

Note that each frame can be counted more than once, since it may match several different frames. This makes it relevant to also study the average number of neighbours.

To look at the videos with the most frequently reused content in greater detail, jump to [Image preview (manual)](#Image-preview-(manual)).


In [None]:
get_grid(neighbours.get_video_match_statistics(distance_threshold=-1).round(2))

### Frames with the most remaining neighbours

To see which individual frames have the most remaining neighbours (given previous filtering settings and distance metric thresholds), run the code cell below. 

By default, you will be shown the top 200 frames with the highest number of remaining neighbours. If you wish to decrease or increase this number, adjust the following value in the code cell below:

<code>number_to_return = <span style="color:red">200</span></code>

The column headings in the matching results will show the following:

- Video name: The name of the original video
- Frame (S): In which second into the original video the analyzed frame can be located
- Frame (HH:MM:SS): In which hour, minute, and second into the video the analyzed frame can be located
- Remaining neighbours: How many similarity neighbours remain for the analyzed frame 

As a tip, you might want to consider using the values shown in column “Frame (S)” to customize your match results in the section [Image preview (manual)](#Image-preview-(manual)).  In addition, the values shown in column “Frame (HH:MM:SS)” can be used to quickly navigate to the designated spot in the original video clip, using an external video player.
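
If you only have a value in seconds at hand and need the corresponding timestamp for a video player, the conversion is straightforward. The helper below is a small standalone sketch; the name `seconds_to_hhmmss` is hypothetical and not part of the VRD.

```python
def seconds_to_hhmmss(seconds):
    """Convert a number of seconds into a zero-padded HH:MM:SS string."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


print(seconds_to_hhmmss(3725))
```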



In [None]:
get_grid(neighbours.get_top_frames(number_to_return=200))

### Frames with the closest distance metric
To see what frame pairs (i.e. comparisons between two different frames stemming from two different videos) received the lowest distance metrics, run the code cell below. As always, the results are based on your previous filtering settings and distance metric thresholds. 

By default, you will be shown the top 200 matching results. If you wish to decrease or increase this number, adjust the following value in the code cell below:

<code>number_to_return = <span style="color:red">200</span></code>

The column headings in the matching results show the following:

- V.1 Name: The name of the first analyzed video
- V.1 Frame (S): In which second into the first video the analyzed frame can be located
- V.1 Frame (HH:MM:SS): In which hour, minute, and second into the first video the analyzed frame can be located
- V.2 Name: The name of the second analyzed video
- V.2 Frame (S): In which second into the second video the analyzed frame can be located
- V.2 Frame (HH:MM:SS): In which hour, minute, and second into the second video the analyzed frame can be located
- Distance: The distance metric that the comparison between Video 1 and 2 generated

The frame pair with the lowest distance metric will be shown on top. 

Again, you may want to use this information to further study the visual content itself - either with the help of image previews in the upcoming sections, or by opening and looking at the original video files. 


In [None]:
df = neighbours.get_frames_with_closest_distance(number_to_return=200)
get_grid(df)

### Image preview (manual)

To view the final matching results for individual frames as visual frame comparisons (i.e. miniature images of the analyzed frames) according to a manual selection, select a video in the drop down menu at the bottom of this section. Also select what frame number you want to see. The frame number should be equivalent to the value displayed in previous “Frame (S)”-columns. 

By default, you will be shown up to five neighbours for each selected frame, if available. If you wish, you can adjust this default setting in the following row in the code cell below:

<code>number_of_neighbours_shown=<span style="color:red">5</span></code>

Note that being shown fewer than five neighbours is a sign that frame matches have been removed as a result of your previous filtering settings. If all neighbours for the selected frame have been removed, you will instead be shown the text message 'Frame has been filtered'.

As an example, your search results may look like this:

![manual_image_preview](img/image_preview_by_manual_selection.png)

In the table, you are first shown a list of the requested matching results in textual form. The top row will display the reference frame for this particular search session – that is, the frame whose matching results you chose to study in the drop down menu. In the following rows, the frames that received the lowest distance metric in relation to the reference frame will be shown.

Below the table, visual previews of the same frames are displayed. The reference frame is placed furthest to the left. In the top left corner of each image, the distance metric for the relevant frame vis-a-vis the reference frame is presented.

You may want to study these matching results to explore if your distance metric threshold has been set too high – or too low. You may also want to use the matching results to get a sense of the success with which monochrome frames were removed.

In general, it is quite common that nearly monochrome frames appear in the search results. This is generally because the [Filter monochrome frames](#Filter-monochrome-frames) filter is set to be fairly restrictive in its removal of frames that almost have the same color.

Depending on what your video dataset looks like, you may also discover that many search results consist of frames with a lot of textual or symbolic overlays (such as subtitles, symbols, or news show banners).

While it is possible to go back and tweak your filtering settings, we suggest you have some forbearance with “incorrect” matches and simply skim through the uninteresting frame matches. This is because raising your filtering thresholds may result in interesting matches being lost.
The presence of monochrome frames and/or frames with heavy textual or symbolic overlays in the search results is also a good reminder of why the VRD’s matching results should not be interpreted as absolute measurements of similarity.

If you wish to view the original reference frame, as well as the pre-processed frame that was inserted into the neural network, select any video and frame (there is no requirement that the frame has any remaining neighbours) and click the "Show debug image" button. The two images will be shown below the neighbour images.

In [None]:
import vrd.widgets.manual_image_preview as mip

mip.ManualImagePreview(neighbours, dl_network.target_size, number_of_neighbours_shown=5);

### Image preview (per video)

This section shows the three frames per video that have the lowest distance to their first similarity neighbour. Up to five neighbours per frame are shown, if available. As always, the results are based on your previous filtering settings and distance metric thresholds.

Similar to the previous section, your search results may look something like this:

![per_video_preview](img/per_video_preview.png)

To change how many frames per video are shown, change the following value:

<code>frames_to_show_per_video=<span style="color:red">3</span></code>

To change how many neighbours to show per frame, change the following value:

<code>neighbours_to_show_per_frame=<span style="color:red">5</span></code>

The frame that has the closest distance to its most similar neighbour will be shown on top.

By default, you will only be shown the first 30 matching results. If you wish to decrease or increase this number, adjust the following value in the code cell below:

<code>max_number_to_return=<span style="color:red">30</span></code>

In [None]:
import vrd.notebook_helper as nbh

nbh.show_best_results_per_video(
    neighbours,
    frames_to_show_per_video=3,
    neighbours_to_show_per_frame=5,
    max_number_to_return=30
)

## Sequential matching results

As a final step, you have the option to view and examine sequential matches, based on your previous filtering settings and distance metric thresholds.

We define a sequential match as an instance where two or more sequential frames (i.e. frames that were extracted one after the other from the original files) from two videos have been given a distance metric below a designated threshold.

If frames 1-6 in Video X and frames 11-16 in Video Y are each given a distance metric below the threshold 20 000, for example, this may be defined as a sequential match.

We can use sequential filtering to identify instances when longer chunks of moving images have been reused. This is desirable, for example, if we a) think that single frame matches are less interesting than the discovery of longer examples of video reuse, and b) want to increase the accuracy of the VRD’s matching results. In short, we can assume that if not just one but several frames in a row have been assigned a low distance metric, this is often an indication that a correct video match has, indeed, been found.

By default, the VRD is set to identify sequences that are 3 seconds long – or more. To adjust the minimum number of sequential frames identified (from 3 sequential frames to 5 sequential frames, for example), change the following value in the code cell below:

<code>minimum_sequence_length=<span style="color:red">3</span></code>

By default, the VRD will apply a distance metric threshold of 30 000 when searching for sequential matches. To change this, adjust the following value: 

<code>max_sequence_distance=<span style="color:red">30000</span></code>

Note that previously filtered matches will not reappear.

Finally, it is possible to allow for gaps in a sequence, for instance to join two sequences that are separated by frames with a distance metric higher than the designated threshold. To do so, specify how many gaps/non-matching frames you wish to allow by adjusting the following value:

<code>allowed_sequence_gap=<span style="color:red">0</span></code>
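
To make the interplay between these three settings more concrete, here is a simplified sketch of how such sequences could be detected. It is not the `OverlapCalculator` implementation: the function `find_sequences` and its input format (a list of `(frame_x, frame_y, distance)` tuples for one pair of videos) are assumptions made for illustration, as is the choice to treat the threshold as inclusive.

```python
def find_sequences(matches, max_sequence_distance=30000,
                   minimum_sequence_length=3, allowed_sequence_gap=0):
    """Find runs of frame pairs that advance in step in both videos.

    matches: iterable of (frame_x, frame_y, distance) tuples.
    Returns a list of (start_x, start_y, end_x, end_y) tuples.
    """
    # Keep only pairs within the threshold, grouped by offset (frame_y - frame_x):
    # pairs with the same offset lie on the same diagonal and can form a sequence.
    by_offset = {}
    for x, y, dist in matches:
        if dist <= max_sequence_distance:
            by_offset.setdefault(y - x, []).append(x)

    sequences = []
    for offset, xs in by_offset.items():
        xs = sorted(set(xs))
        start = prev = xs[0]
        for x in xs[1:] + [None]:  # the trailing None flushes the final run
            if x is not None and x - prev - 1 <= allowed_sequence_gap:
                prev = x  # still within the same run (gap small enough)
                continue
            # Run ended: keep it if it is long enough (gap frames count towards length)
            if prev - start + 1 >= minimum_sequence_length:
                sequences.append((start, start + offset, prev, prev + offset))
            start = prev = x
    return sequences


# Frames 1-6 in Video X match frames 11-16 in Video Y, except that frame 4 is missing
matches = [(i, i + 10, 15000) for i in (1, 2, 3, 5, 6)] + [(20, 3, 50000)]

print(find_sequences(matches, max_sequence_distance=20000))
print(find_sequences(matches, max_sequence_distance=20000, allowed_sequence_gap=1))
```

Without gaps, only frames 1-3 form a long enough sequence. Allowing one non-matching frame joins the two runs into a single sequence covering frames 1-6.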

In [None]:
%%time
from vrd.overlap_calculator import OverlapCalculator

max_sequence_distance = 30000
oc = OverlapCalculator(
    neighbours, max_sequence_distance, minimum_sequence_length=3, allowed_sequence_gap=0
)

### Overview of sequential matches

This section will display the found sequential matches in the form of a histogram. This histogram can function as a guide when you later explore the sequential matches in your dataset, and can help you get a rough sense of how the length of sequential matches is distributed across the material.

To see the histogram, simply run the code cell below. You do not have to make any changes in the code.

In [None]:
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from matplotlib.ticker import MaxNLocator, MultipleLocator
from IPython.display import Image, DisplayHandle, HTML, clear_output


def show_discrete_histplot(df, column, title=None):
    """Plot a histogram with one bar per integer value of `column`."""
    oc_hist = sns.histplot(data=df, x=column, discrete=True)
    oc_hist.set_ylabel("No. of sequential matches")
    oc_hist.set_xlabel("Duration in seconds")
    if title is not None:
        oc_hist.set_title(title)
    oc_hist.get_figure().set_size_inches((10, 5))
    # Integer major ticks, with a minor tick at every whole second
    oc_hist.xaxis.set_major_locator(MaxNLocator(integer=True))
    oc_hist.xaxis.set_minor_locator(MultipleLocator(1))
    plt.xticks(rotation=90)
    return oc_hist


all_sequences = oc.get_all_sequences()
display(get_grid(all_sequences.sort_values(by="Duration", ascending=False)))
discrete_histplot = show_discrete_histplot(
    all_sequences, "Duration", "Histogram of found sequences"
)

### Videos with the most sequence matches
In this section, you can output a list of the videos that contained the most identified sequential matches within your dataset. 

By default, the VRD will show a list of the top ten videos with the most identified sequential matches. To adjust this number, change the following text in the code cell below:

<code>results_to_show=<span style="color:red">10</span></code>


In the table, each sequential match will be counted as 1, regardless of its length (as long as the sequence is equal to or longer than the minimum sequence length chosen in section [Sequential matching results](#Sequential-matching-results)).

To see the list/table, simply run the code cell below. 

In [None]:
results_to_show = 10

columns = ["Video name", "No. of sequential matches"]
result_list = []
for vid in oc.found_two_way_sequences.keys():
    # Count every sequence found for this video, skipping empty (None) entries
    count = sum(
        len(v) for k, v in oc.found_two_way_sequences[vid].items() if v is not None
    )
    result_list.append(dict(zip(columns, [vid, count])))

sequence_count_df = pd.DataFrame(result_list)

sequence_count_handle = DisplayHandle()
sequence_count_handle.display(
    HTML(
        sequence_count_df.sort_values(by="No. of sequential matches", ascending=False)
        .head(results_to_show)
        .to_html(index=False)
    )
)

### Sequence matches with the lowest distance metrics (from a specific video)

In this section, you can output a visual preview and statistics of the sequential matches with the lowest distance metrics for specifically selected videos.

For instance, you may want to return to the section [Videos with the most sequence matches](#Videos-with-the-most-sequence-matches) to see which video received the most identified sequential matches, and start by scrutinizing these search results more closely.

It is possible to sort the statistical search results (which are shown in a table) according to duration (i.e., the length of sequences in seconds), or by something called “score.” The score corresponds to the median value for identified distance metrics in the relevant sequence.
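
As a small illustration of what the score represents, the snippet below computes the median for two invented sequences using Python's standard library. The distance values and sequence names are made up, and the snippet does not reproduce how the VRD stores its sequences.

```python
from statistics import median

# Invented per-frame distance metrics for two hypothetical sequences
sequences = {
    "sequence_a": [12000, 15000, 29000],
    "sequence_b": [9000, 11000, 13000, 14000],
}

# The score of a sequence is the median of its distance metrics
scores = {name: median(values) for name, values in sequences.items()}

# A lower score indicates a closer match, so sort in ascending order
for name in sorted(scores, key=scores.get):
    print(name, scores[name])
```

Since the median is used, a single frame with a high distance metric (such as the 29000 in the first sequence) has little influence on the score.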

To display the sequences, simply choose a video in the drop down menu below and press the "Show" button.

In [None]:
import vrd.widgets.sequences_w_lowest_distances as seq_w_lowest

seq_w_lowest.SequencesWithLowestDistances(oc);

### Sequence matches with the lowest distance metrics (by duration)

This section provides a visual overview of sequential matches, based on a manual selection of sequence durations. For instance, you can choose to view all identified sequences that are 4 seconds long, 5 seconds long, and so on. 

We suggest you return to the histogram displayed in the section [Overview of sequential matches](#Overview-of-sequential-matches) to get a sense of how the identified sequential matches in your dataset are distributed with regard to duration.

Choose which duration of sequential matches to display in the drop down menu below, and click the "Show" button to display them.

The sequential match with the lowest score (i.e. lowest identified median distance metric within the identified sequential range) will be shown on top.


In [None]:
import vrd.widgets.sequences_w_lowest_distance_by_duration as dur_sequence

dur_sequence.SequencesWithLowestDistancesByDuration(oc);

# End of notebook

Thank you for using the Video Reuse Detector. 

Follow our project on [GitHub](https://github.com/humlab/vrd) and the [European History Reloaded website](https://www.cadeah.eu/).

Don’t hesitate to reach out to tomas.skotare@umu.se or maria.c.eriksson@umu.se if you have any questions or comments.
