# 2. Download and preprocess the video
In this notebook, we'll download and preprocess the video that we will apply style transfer to. The outputs of this notebook are the audio file extracted from the video, which will be reused when stitching the video back together, and the video split into individual frames.

The video used in this tutorial is also of orangutans (just like the provided sample content images). It is stored in a public blob that we will download it from. For this section of the tutorial, however, you can swap in a video of your own choice. Likewise, feel free to replace the provided style image, a Renoir painting, with one of your own.

```md
├── images/
│   ├── orangutan/ [<-- this folder will contain all individual frames from the video]
│   ├── sample_content_images/
│   ├── sample_output_images/
│   └── style_images/
├── video/ [<-- create this new folder to put video content in]
│   ├── orangutan.mp4 [<-- this is the downloaded video]
│   └── orangutan.mp3 [<-- this is the extracted audio file from the video]
└── style_transfer_script.py
```

---

Import the utilities we need to display HTML embeds and to load environment variables:

In [None]:
from IPython.display import HTML
import os
%load_ext dotenv
%dotenv

First, create the video folder to store your video contents in.

In [None]:
%%bash
mkdir -p pytorch/video

Download the video from public blob storage, where it is located at https://happypathspublic.blob.core.windows.net/videos/orangutan.mp4

In [None]:
%%bash 
cd pytorch/video && 
    wget https://happypathspublic.blob.core.windows.net/videos/orangutan.mp4
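
If `wget` is not available on your machine, the same download can be sketched with Python's standard library. The helper name `download_video` is hypothetical and not part of the tutorial. It assumes the notebook runs from the directory that contains `pytorch/`, and it skips the request when the file already exists:

```python
import os
import urllib.request

VIDEO_URL = "https://happypathspublic.blob.core.windows.net/videos/orangutan.mp4"


def download_video(url, dest_dir):
    """Download url into dest_dir and return the local path.

    Hypothetical helper: the request is skipped if the file is already there,
    so re-running the cell does not download the video twice.
    """
    os.makedirs(dest_dir, exist_ok=True)
    dest_path = os.path.join(dest_dir, os.path.basename(url))
    if not os.path.exists(dest_path):
        urllib.request.urlretrieve(url, dest_path)
    return dest_path
```

Calling `download_video(VIDEO_URL, "pytorch/video")` should leave `orangutan.mp4` in the video folder, just like the `wget` cell above.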

Set the environment variable __VIDEO_NAME__ to the name of the video, as this will be used throughout the tutorial for convenience.

In [None]:
%%bash
dotenv set VIDEO_NAME orangutan

Let's check out the video so we know what it looks like beforehand:

In [None]:
%dotenv
HTML('\
    <video width="360" height="360" controls> \
         <source src="pytorch/video/{0}.mp4" type="video/mp4"> \
    </video>'\
    .format(os.getenv('VIDEO_NAME'))
)

Next, use __ffmpeg__ to extract the audio file and save it as orangutan.mp3 under the video directory.

In [None]:
%%bash 
cd pytorch/video &&
    ffmpeg -i ${VIDEO_NAME}.mp4 ${VIDEO_NAME}.mp3
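
For readers who prefer to stay in Python, the audio extraction can be sketched with `subprocess`. The function names `build_audio_command` and `extract_audio` are hypothetical. The sketch assumes `ffmpeg` is on your `PATH` and that paths are relative to the directory containing `pytorch/`; it passes the same arguments as the cell above:

```python
import os
import subprocess


def build_audio_command(video_name, video_dir="pytorch/video"):
    """Return the ffmpeg argument list that turns <video_name>.mp4 into <video_name>.mp3."""
    src = os.path.join(video_dir, video_name + ".mp4")
    dst = os.path.join(video_dir, video_name + ".mp3")
    return ["ffmpeg", "-i", src, dst]


def extract_audio(video_name, video_dir="pytorch/video"):
    """Run ffmpeg and raise CalledProcessError if it exits with a non-zero status."""
    subprocess.run(build_audio_command(video_name, video_dir), check=True)
```

Passing the arguments as a list avoids shell quoting problems if you substitute a video whose name contains spaces. A call such as `extract_audio(os.getenv("VIDEO_NAME", "orangutan"))` would mirror the bash cell.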

Finally, split the video into individual frames, one image per frame. The images will be saved inside a new folder called __/orangutan__ under the __/images__ directory.

In [None]:
%%bash
cd pytorch/images/ &&
    mkdir -p ${VIDEO_NAME} && cd ${VIDEO_NAME} &&
    ffmpeg -hide_banner -i ../../video/${VIDEO_NAME}.mp4 %05d_${VIDEO_NAME}.jpg
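
The `%05d` in the output pattern is a printf-style placeholder that ffmpeg fills with a zero-padded, five-digit frame counter, which starts at 1 by default. Python's `%` operator uses the same syntax, so the resulting file names can be previewed with a small sketch (the helper name `frame_filename` is hypothetical):

```python
def frame_filename(index, video_name="orangutan"):
    """Mimic the %05d_<name>.jpg output pattern for a 1-based frame index."""
    return "%05d_%s.jpg" % (index, video_name)


print(frame_filename(1))    # 00001_orangutan.jpg
print(frame_filename(823))  # 00823_orangutan.jpg
```

The zero padding means that sorting the file names alphabetically also sorts them in frame order, which is convenient when the frames are stitched back together.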

To make sure that the frames were successfully extracted, print out the number of images under __pytorch/images/orangutan__. For the orangutan video, the count should be 823:

In [None]:
!cd pytorch/images/${VIDEO_NAME} && ls -1 | wc -l
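
The same check can be sketched in Python, which also lets you count only the `.jpg` files in case the folder contains anything else. The helper name `count_frames` is hypothetical, and the path assumes the folder layout shown at the top of this notebook:

```python
import os
from pathlib import Path


def count_frames(frames_dir, pattern="*.jpg"):
    """Count the files directly inside frames_dir whose names match pattern."""
    return sum(1 for p in Path(frames_dir).glob(pattern) if p.is_file())


frames_dir = Path("pytorch/images") / os.getenv("VIDEO_NAME", "orangutan")
if frames_dir.is_dir():
    print(count_frames(frames_dir))
```

If the number differs from what you expect, re-run the frame extraction cell and check the ffmpeg output for errors.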

---

## Conclusion
In this notebook, we downloaded the video that we will apply neural style transfer to, and processed it so that we have the individual frames and the audio track as separate entities. In other scenarios, this can be thought of as preprocessing the data so that it is ready to be scored.

Next, we will use the style transfer script from the previous notebook to batch apply style transfer to all extracted frames using Batch AI in Azure.