# Notebook 01: Introduction to Distant Viewing Toolkit (DVT)

In this notebook we illustrate the usage of the core DVT functionality. We will use
a short video file taken from the film *All the President's Men*. You can generate 
the same output using your own video files with the template at the end of the file.

## Prebuilt Pipelines

The Distant Viewing Toolkit includes several methods for extracting and
visualizing metadata from moving images that require minimal setup. For
example, we can provide a filename and output directory to the
`VideoCsvPipeline` class to run a pre-defined sequence of algorithms over
our dataset. Here is the code to run the pipeline over our sample video.

In [None]:
from dvt.pipeline.csv import VideoCsvPipeline
from os.path import join

In [None]:
VideoCsvPipeline(finput=join("videos", "all-presidents-men-sample.mp4"), dirout="dvt-output-csv").run()

You may see some warnings from the underlying deep learning algorithms, but these
can usually be ignored. When the pipeline finishes, you should see a number of CSV
files within the "dvt-output-csv" directory. The general idea is that a first pass
over the video file detects shot breaks. Then, a second pass applies a number of
algorithms to the middle frame of each shot. For example, it reports the detected
objects, faces, and colors. This pipeline is well suited to users who want to read
the dataset into another software program, such as R, Excel, or SPSS.
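
The output can also be read back into Python. As a minimal sketch, the snippet below
lists the CSV files in an output directory and loads each one with the standard
library's `csv` module. The helper name `load_csv_outputs` is hypothetical and not
part of DVT, and the sketch assumes nothing about which files or columns the pipeline
writes beyond their being ordinary CSV files with a header row.

```python
import csv
from pathlib import Path


def load_csv_outputs(dirout):
    """Read every CSV file in `dirout` into a list of row dictionaries.

    Hypothetical helper for illustration; it is not part of DVT.
    Returns a dict mapping each file's base name to its rows.
    """
    directory = Path(dirout)
    if not directory.is_dir():
        # Nothing to read yet (for example, the pipeline has not been run).
        return {}

    tables = {}
    for path in sorted(directory.glob("*.csv")):
        with path.open(newline="") as handle:
            tables[path.stem] = list(csv.DictReader(handle))
    return tables


tables = load_csv_outputs("dvt-output-csv")
for name, rows in tables.items():
    print(name, len(rows), "rows")
```

Every value comes back as a string, so numeric columns need converting before
any arithmetic.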

Another pipeline extracts a similar set of data, but writes output that can be
visualized in a web browser. The same syntax as above can be used by replacing the
CSV pipeline object with `VideoVizPipeline`.

In [None]:
from dvt.pipeline.viz import VideoVizPipeline

In [None]:
VideoVizPipeline(finput=join("videos", "all-presidents-men-sample.mp4")).run()

When this script finishes running, you should see a new directory called "dvt-output-data".
Inside are several JavaScript, CSS, and HTML files. Under the subdirectories, there are
also still images extracted from the input as well as JSON files capturing similar 
data to the CSV files in the first pipeline.

In order to see the visualization in your own browser, you will need to start a
local server in the "dvt-output-data" directory. If you have access to a terminal,
you can run `python3 -m http.server` within the directory. Otherwise, you may also
start a Jupyter notebook server from within the "dvt-output-data" directory
and open the file "index.html". (For longer workshops, we push these to GitHub Pages,
which handles the server part for us but requires more time and setup.)
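
If neither option is convenient, a local server can also be started from Python
itself. The sketch below uses only the standard library's `http.server` module and
runs the server in a background thread so that a notebook kernel is not blocked.
The helper name `serve_in_background` and the default port of 8000 are choices made
for this illustration.

```python
import threading
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def serve_in_background(directory, port=8000):
    """Serve the files in `directory` over HTTP from a background thread.

    Hypothetical helper for illustration. Returns the server object so
    that it can be stopped later with `server.shutdown()`.
    """
    # Bind the handler to the chosen directory instead of the current one.
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    # Listen on the local machine only.
    server = ThreadingHTTPServer(("127.0.0.1", port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

Calling `server = serve_in_background("dvt-output-data")` and then opening
`http://127.0.0.1:8000/index.html` should show the visualization. Calling
`server.shutdown()` followed by `server.server_close()` stops the server.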

## Applying Annotators to a Video File

The pre-built pipelines are a convenient starting place. It is also possible
to make use of the internal mechanisms provided by DVT to create a custom 
set of annotations on your data. We will show an example of this here.

To start, we need to create a `FrameInput` object that points to our video
file. This object handles grabbing frames from the input dataset. 

In [None]:
from dvt.core import FrameInput
fi = FrameInput(input_path=join("videos", "all-presidents-men-sample.mp4"))

Next, we construct a data extraction object by passing it our input object.
There are various options that are possible here, such as providing the location of
an audio file, but for now we will just use our input video.

In [None]:
from dvt.core import DataExtraction
dextraction = DataExtraction(fi)

With the input and extraction objects created, we can now pass a list of 
annotator objects and run them over the input. Here, we will use an 
annotator called `DiffAnnotator`. The algorithm will compute
differences between successive frames; by setting `quantiles=[50]`,
we indicate that we want to output the median difference between these
frames.

In [None]:
from dvt.annotate.diff import DiffAnnotator

dextraction.run_annotators([
    DiffAnnotator(quantiles=[50])
])

Once the annotator has been run, the results are stored inside of the data
extraction object. To retrieve them, we use the `get_data` method. Note that
the algorithm has produced our desired annotations (called "diff") as well
as a special set of annotations called "meta". 

In [None]:
dt = dextraction.get_data()
dt.keys()

As we input only a single video file, the metadata record contains only
a single line. Printing it out, we see that it gives some basic information
about the video file:

In [None]:
dt['meta']

The difference annotator gives information about each frame in the
input, such as the average value (how bright the image is) as well
as the median difference between each frame and the next.

In [None]:
dt['diff']

The column "q50" gives the median pixel difference between each frame and
the next, and "h50" gives the histogram difference between the colors of the
two frames. Used together, these columns can help predict when there is a
shot break in the source video.
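
To make the idea of a median pixel difference concrete, here is a minimal sketch
on tiny grayscale frames stored as nested lists. It illustrates the general
technique only: the function name `median_abs_diff` and the toy frames are
invented for this example, and DVT's own computation may differ in detail.

```python
from statistics import median


def median_abs_diff(frame_a, frame_b):
    """Median of the absolute pixel differences between two same-sized frames.

    Hypothetical helper for illustration; not DVT's implementation.
    """
    diffs = [
        abs(a - b)
        for row_a, row_b in zip(frame_a, frame_b)
        for a, b in zip(row_a, row_b)
    ]
    return median(diffs)


# Three toy 2x2 grayscale frames: the first two are nearly identical,
# while the third is very different, as it might be after a cut.
frames = [
    [[10, 12], [11, 13]],
    [[11, 12], [10, 14]],
    [[200, 190], [180, 210]],
]

# One score per pair of successive frames.
scores = [median_abs_diff(f, g) for f, g in zip(frames, frames[1:])]
print(scores)
```

The first score is small and the second is large, which is the kind of jump
that suggests a shot break.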

## Applying Aggregator for Cut Detection

As mentioned, one use of the difference annotations is to detect the boundaries
between shots. The Distant Viewing Toolkit includes an aggregator (an algorithm
that processes other annotations, rather than working from the original image
data) called `CutAggregator` that performs this calculation. To use it, we
specify the cutoff score used for detected differences (here, a "q50" score of
3 or greater) and the minimum length of a new shot (here, `min_len=10`). Then,
the aggregator is passed to the data extraction object's `run_aggregator` method.

In [None]:
from dvt.aggregate.cut import CutAggregator
dextraction.run_aggregator(CutAggregator(cut_vals={'q50': 3}, min_len=10))
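
The general idea of combining a cutoff with a minimum shot length can be sketched
in a few lines of plain Python. This is an illustration under stated assumptions,
not the logic of `CutAggregator` itself: the function name `find_shots`, the
inclusive frame ranges, and the rule for ignoring breaks that come too soon are
all choices made for this example.

```python
def find_shots(scores, cutoff, min_len):
    """Split frames into shots wherever a difference score reaches `cutoff`.

    Hypothetical helper for illustration; not DVT's implementation.
    `scores[i]` is the difference between frame i and frame i + 1. A break
    is accepted only if the shot it closes has at least `min_len` frames.
    Returns a list of inclusive (frame_start, frame_end) tuples.
    """
    shots = []
    start = 0
    for i, score in enumerate(scores):
        shot_length = i - start + 1
        if score >= cutoff and shot_length >= min_len:
            shots.append((start, i))
            start = i + 1
    # Whatever remains forms the final shot, ending at the last frame.
    shots.append((start, len(scores)))
    return shots


# Toy scores: a large jump after frame 11, and another after frame 15
# that arrives too soon after the previous break to count as a new shot.
scores = [0.5] * 11 + [8.0] + [0.4] * 3 + [9.0] + [0.6] * 12
print(find_shots(scores, cutoff=3, min_len=10))
```

Here the second jump is ignored because the shot it would close is only four
frames long, which shows how a minimum length suppresses spurious breaks.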

As with the annotations, the aggregated data is stored inside the `dextraction`
object, and we can retrieve it with the `get_data` method. Here, notice that we
now have a new data type called "cut":

In [None]:
dt = dextraction.get_data()
dt.keys()

Printing out this data shows the estimated shots; the aggregator finds 12 of
them:

In [None]:
dt['cut']

As an example of how to put this all together for an analysis, here we
will use the frame rate from the metadata to compute the length of each
shot in seconds:

In [None]:
dt['cut']['length_sec'] = ((dt['cut']['frame_end'] - dt['cut']['frame_start']) / dt['meta']['fps'].values)
dt['cut']

In the following notebooks we will apply the full Distant Viewing approach to address
research questions with these kinds of extracted values.

## Your own data

If you would like to run DVT on your own data, one good way to start
is to apply the `VideoVizPipeline`. To do this, it is easiest to copy your
video file into the "videos" folder inside the workshop folder. Then, enter
the file name below and run the code.

In [None]:
from dvt.pipeline.viz import VideoVizPipeline
from os.path import join

input_path = ""  # name of your video file inside the "videos" folder
VideoVizPipeline(finput=join("videos", input_path)).run()

If your input file is large, the code above could take a while to finish running.
For a file longer than a couple of minutes, consider starting with the following,
where you manually set how often (in frames) the DVT annotators are applied.

In [None]:
from dvt.pipeline.viz import VideoVizPipeline
from os.path import join

input_path = ""
VideoVizPipeline(finput=join("videos", input_path), frequency=1000).run()
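
As a rough illustration of what a fixed sampling frequency means, the sketch below
lists the frame indices that would be visited if annotators ran on every
`frequency`-th frame, starting at frame 0. The starting point, the helper name
`sampled_frames`, and the example clip length and frame rate are assumptions made
for this illustration; how DVT chooses frames internally may differ.

```python
def sampled_frames(n_frames, frequency):
    """Return the frame indices visited when sampling every `frequency` frames.

    Hypothetical helper for illustration; not part of DVT.
    """
    if frequency <= 0:
        raise ValueError("frequency must be a positive number of frames")
    return list(range(0, n_frames, frequency))


# Assumed example: a 3-minute clip at 24 frames per second has 4320 frames.
n_frames = 3 * 60 * 24
frames = sampled_frames(n_frames, frequency=1000)
print(len(frames), "frames sampled:", frames)
```

Under these assumptions only five frames are processed, which is why a large
frequency gives a quick first look at a long file.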