# Distant Viewing Toolkit: Example Usage

This notebook introduces the Distant Viewing Toolkit, which generates structured metadata from digitized moving images. The notebook runs in Google's Colab environment, which simplifies the installation process. Instructions for setting up the software on your own machine can be found on the project's [GitHub page](https://github.com/distant-viewing/dvt).

To run the code on this page, just click on a block of code and either hit the run button on the left of the code or type ⌘/Ctrl+Enter. You may be prompted to log into a Google account before executing the code.

**Note**: This notebook can be run using a free GPU instance, which significantly speeds up the processing of the example data. To do this, select *Runtime > Change runtime type* using the menu above and select GPU as the Hardware accelerator. This must be done before running any of the code below.

### Setup

This notebook is running on a default installation of Python 3.7, but does not yet have the Distant Viewing Toolkit or all of its dependencies installed. To install them, run the following block of code (this may take a minute or two).

In [None]:
%tensorflow_version 2.x
%pip install --upgrade -q git+https://github.com/distant-viewing/dvt.git@14715eb228aabf6561ddb876bbd1b339c29d71bc
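
If you switched the runtime to a GPU instance, it can be worth confirming that the GPU is actually visible before starting a long run. The sketch below is one way to do that, assuming TensorFlow 2.1 or later (where `tf.config.list_physical_devices` is available); the helper name `visible_gpus` is ours and is not part of the toolkit.

```python
def visible_gpus():
    """Return the names of the GPUs TensorFlow can see (empty list if none)."""
    try:
        # Imported lazily so the helper also works where TensorFlow is absent.
        import tensorflow as tf
    except ImportError:
        return []
    return [device.name for device in tf.config.list_physical_devices("GPU")]


# An empty list means the code below will run on the CPU.
print(visible_gpus())
```

An empty list usually means the runtime type was not changed, or was changed after the session had already started.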

Next, we need to grab a video file to apply the toolkit to. Run the following code to download a short clip from the film *All the President's Men* into the notebook's working directory:

In [2]:
!wget -q https://github.com/distant-viewing/dvt-tutorial/raw/master/videos/all-presidents-men-sample.mp4

Instructions for uploading your own files are included below.
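
Because `wget -q` suppresses its output, a failed download can go unnoticed until the pipeline complains. The following is a minimal sketch of a sanity check; the helper name `check_download` and the `min_bytes` threshold are illustrative assumptions, not part of the toolkit.

```python
from pathlib import Path


def check_download(path, min_bytes=1):
    """Return the file size in bytes, or raise if the file is missing or too small."""
    file = Path(path)
    if not file.is_file():
        raise FileNotFoundError(f"{path} was not found; re-run the download cell")
    size = file.stat().st_size
    if size < min_bytes:
        raise ValueError(f"{path} is only {size} bytes; the download may have failed")
    return size
```

In this notebook you would call `check_download("all-presidents-men-sample.mp4")` right after the download cell.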


### Running the Toolkit

Now we are ready to load and run the Distant Viewing Toolkit over the video file. To start, we will load the functions and modules we need by running the following block of code:

In [14]:
from dvt.pipeline.csv import VideoCsvPipeline
from os.path import join
import pandas as pd
from IPython.display import Image

Next, we will run a default sequence of computer vision algorithms over the video file by running the following line of code. Note that this could take several minutes to complete; you may also see one or two warnings, though these can be safely ignored.

In [None]:
VideoCsvPipeline(finput="all-presidents-men-sample.mp4", dirout="dvt-output-csv", include_images=True).run()

You will know that the code has finished processing when the spinning icon on the left of the code stops moving and turns into a number in square brackets.

### Viewing the Output

We can now take a look at the output of the pipeline, which consists of several CSV files and extracted frames. Here is a list of the available files: 

In [None]:
!ls -l dvt-output-csv/all-presidents-men-sample/data
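
The same listing can be produced from Python, which also makes it easy to see how many rows each file holds. This is a rough sketch using only the standard library; the helper name `summarize_csv_dir` is ours, and it assumes every CSV file starts with a single header row.

```python
import csv
from pathlib import Path


def summarize_csv_dir(data_dir):
    """Map each CSV file name in data_dir to its number of data rows (header excluded)."""
    summary = {}
    for csv_path in sorted(Path(data_dir).glob("*.csv")):
        with open(csv_path, newline="") as handle:
            rows = list(csv.reader(handle))
        # Subtract the header row; an empty file counts as zero rows.
        summary[csv_path.name] = max(len(rows) - 1, 0)
    return summary
```

Calling `summarize_csv_dir("dvt-output-csv/all-presidents-men-sample/data")` returns a dictionary keyed by file name.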

To illustrate, we will read a few of these into Python using the **pandas** module. Here, for example, are all of the faces in the detected cuts:

In [None]:
pd.read_csv("dvt-output-csv/all-presidents-men-sample/data/face.csv")

And here are the detected objects:

In [None]:
pd.read_csv("dvt-output-csv/all-presidents-men-sample/data/obj.csv")
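
Tables like these are often easier to read once they are tallied. The sketch below counts how often each value occurs in one column of a CSV file. The helper name `count_column_values` is ours, and the column name is left as a parameter because it depends on the toolkit's output; check the header of the table printed above and pass the real name.

```python
import csv
from collections import Counter


def count_column_values(csv_path, column):
    """Count how often each value appears in the given column of a CSV file."""
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        if column not in (reader.fieldnames or []):
            raise KeyError(f"column {column!r} not found in {csv_path}")
        return Counter(row[column] for row in reader)
```

With pandas already loaded, `pd.read_csv(path)[column].value_counts()` gives an equivalent tally.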

The pipeline has also extracted the median frame from each detected shot, which we can use to confirm the extracted CSV data:

In [None]:
!ls -l dvt-output-csv/all-presidents-men-sample/img

For example, the first cut does show one person along with a wine glass (or at least something very similar to a wine glass filled with water):

In [None]:
Image("dvt-output-csv/all-presidents-men-sample/img/frame-000074.png")

And the second cut shows another person with a similar wine glass as well as a tie:

In [None]:
Image("dvt-output-csv/all-presidents-men-sample/img/frame-000197.png")

You can change the code above to load the other cuts and to view the other CSV files.
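
To step through every extracted frame rather than typing each file name, the image folder can be scanned for files that follow the `frame-NNNNNN.png` pattern seen above. This is a minimal sketch; the helper name `list_frames` is ours, and it assumes all extracted frames follow that naming pattern.

```python
import re
from pathlib import Path

# Matches names such as "frame-000074.png" and captures the frame number.
FRAME_PATTERN = re.compile(r"frame-(\d+)\.png$")


def list_frames(img_dir):
    """Return (frame_number, path) pairs for each frame image, in frame order."""
    frames = []
    for path in Path(img_dir).iterdir():
        match = FRAME_PATTERN.match(path.name)
        if match:
            frames.append((int(match.group(1)), path))
    return sorted(frames, key=lambda pair: pair[0])


# In the notebook, each frame could then be shown in turn:
# from IPython.display import Image, display
# for number, path in list_frames("dvt-output-csv/all-presidents-men-sample/img"):
#     display(Image(str(path)))
```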

### Downloading the data

You may eventually want to download the extracted data to your local machine. This can be done by clicking *View > Table of contents* in the menu above, selecting the folder icon (third one down on the left-hand side), clicking on the three vertical dots icon to the right of the "dvt-output-csv" folder, and then selecting "Download".

In theory this can also be done with a short line of code, but we have experienced issues when using different browsers and find the manual method for downloading the file more reliable.
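
For reference, one possible version of the code route is sketched below: bundle the output folder into a single zip archive with the standard library, then hand the archive to Colab's download helper. The function name `zip_output` is ours; the download call is left as a comment because it only works inside Colab and is subject to the browser issues mentioned above.

```python
import shutil


def zip_output(dirout="dvt-output-csv", archive_name="dvt-output-csv"):
    """Bundle the contents of dirout into <archive_name>.zip and return the archive path."""
    return shutil.make_archive(archive_name, "zip", root_dir=dirout)


# Inside Colab, the archive could then be offered as a browser download:
# from google.colab import files
# files.download(zip_output())
```

A single archive is also convenient for the manual method, since only one file needs to be downloaded from the file browser.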


## Using your own data

Hopefully the demo above has encouraged you to explore using the Distant Viewing Toolkit with your own data. You can try this out within this Google Colab notebook by first uploading your file to Google Drive. Then, run the following block of code to attach your Google Drive account to this notebook (it may prompt you to open another window and paste an authorization code back into the notebook):

In [22]:
from google.colab import drive
drive.mount('drive')

Mounted at drive


Once that is complete, you should have access to your Google Drive in the following location:

In [None]:
%ls drive/'My Drive'

You should then be able to run the toolkit over your video file just as before by supplying the path to the video file of interest. If your video is called "dvt-demo.mp4", for example, you would run the following:

In [None]:
VideoCsvPipeline(finput="drive/My Drive/dvt-demo.mp4", dirout="dvt-output-csv", include_images=True).run()

The GPUs provided by Google Colab are not particularly fast; it may take a while to process long videos, particularly if they were recorded in high-definition.

As before, the output files will be in the "dvt-output-csv" directory, and can be downloaded with the same instructions as above:


In [None]:
!ls -l dvt-output-csv/
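
In the sample run above, the results were written to `dvt-output-csv/all-presidents-men-sample/data` and `.../img`, that is, to a folder named after the video file without its extension. Assuming the same layout holds for other inputs (an inference from that one example, not something confirmed in the documentation), the following sketch predicts where to look for the results of your own video. The helper name `expected_output_dirs` is ours.

```python
from pathlib import Path


def expected_output_dirs(finput, dirout="dvt-output-csv"):
    """Guess the data and image folders for a video, based on the sample run's layout."""
    # Assumption: the toolkit names the output folder after the video's stem.
    base = Path(dirout) / Path(finput).stem
    return {"data": base / "data", "img": base / "img"}


dirs = expected_output_dirs("drive/My Drive/dvt-demo.mp4")
print(dirs["data"])
print(dirs["img"])
```

If the folders are not where this predicts, the listing from the cell above shows the actual layout.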

There is an array of different annotators and aggregation algorithms included in the toolkit, and many tuning parameters that can be adjusted. For more information, see the documentation on the project's GitHub page.