# Pix2PixHD

This notebook was created by Doug Rosman, and uses code from [Doug's forked pix2pixHD repository](https://github.com/dougrosman/pix2pixHD). For a video tutorial showing how to use this notebook, visit this link here: [https://dougrosman.github.io/cvml-sp21/resources/pix2pixHD/](https://dougrosman.github.io/cvml-sp21/resources/pix2pixHD/)

The notes in this notebook try to be as comprehensive as possible.



## 1. Connect to a GPU Instance (required)

**Executing this cell will connect you to a GPU, and your 8-10 hours of free GPU time will begin.**

This will show you what GPU you've been randomly given for this instance. With a Google Colab Pro account ($9.99/mo, really worth it!), you're almost always guaranteed a **P100**, with a chance at getting a **V100**.
* **V100:** Best, (not available for free accounts)
* **P100:** Great
* **T4:** (untested) this might *not* work for training.
* **K80:** (untested) this might work, but it will likely be very slow.

If you get a T4 or K80, I encourage you to terminate your session, wait 5-10 minutes, then try connecting to a GPU again. To terminate your session, at the top of your screen go to **Runtime** --> **Manage Sessions** --> **Terminate**


In [None]:
!nvidia-smi -L
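
If you'd rather not look the card up in the list above each time, here's a minimal sketch that maps one line of `nvidia-smi -L` output to the ratings from this section. The function name `rate_gpu` and the sample line are made up for illustration; the ratings are just the ones listed above.

```python
def rate_gpu(listing_line):
    """Map one line of `nvidia-smi -L` output to the rating used in this notebook."""
    ratings = [
        ("V100", "best"),
        ("P100", "great"),
        ("T4", "untested, might not work for training"),
        ("K80", "untested, likely very slow"),
    ]
    for model, rating in ratings:
        if model in listing_line:
            return rating
    return "unknown"

# Hypothetical sample line; your actual output will show a different UUID.
print(rate_gpu("GPU 0: Tesla P100-PCIE-16GB (UUID: GPU-00000000)"))  # prints: great
```

In Colab you can capture the real output with `lines = !nvidia-smi -L` and pass `lines[0]` to the function.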

## 2. Mount your Google Drive (required)

**Executing this cell will prompt you to mount your Google Drive.**

After executing, a link will show up. Click the link and follow the directions. Select the Google Drive you wish to use (use your SAIC account since it has unlimited storage). Copy and paste the authorization key into the box below, and press 'Enter' or 'Return' on your keyboard.

In [None]:
from google.colab import drive
drive.mount('/content/drive')

## 3. Install pix2pixHD repository OR change directory into repo (required)

**Executing this cell will either install the pix2pixHD repo into your Google Drive, or move you into the pix2pixHD repo if it already exists. This cell also installs a python dependency called _dominate_.**

**Note: _If you already have a colab-pix2pixHD folder in your Google Drive, make sure to either rename or remove the folder, or else you won't clone the correct repo in this step._**

**Case 1: Installing the repo**
If this is your first time using this notebook (or if you deleted a previously installed version of the pix2pixHD repo in your Google Drive), this cell will clone Doug Rosman's forked pix2pixHD repo into your Google Drive into a folder called 'colab-pix2pixHD'. After cloning, it will move you into the pix2pixHD folder.

**Case 2: Moving into the repo**
If this repo already exists in your Google Drive, this cell will move you into the pix2pixHD folder so that you can execute the other cells in this notebook.


In [None]:
import os
if os.path.isdir("/content/drive/MyDrive/colab-pix2pixHD"):
    %cd "/content/drive/MyDrive/colab-pix2pixHD/pix2pixHD"
    !pip install dominate
    !pip install -r util/requirements.txt
elif os.path.isdir("/content/drive/"):
    #install script
    %cd "/content/drive/MyDrive/"
    !mkdir colab-pix2pixHD
    %cd colab-pix2pixHD
    !git clone https://github.com/dougrosman/pix2pixHD
    %cd pix2pixHD
    !mkdir generated_videos
    !pip install dominate
    #install python requirements for Derrick Schultz' dataset-tools.py
    #more info: https://github.com/dvschultz/dataset-tools
    !pip install -r util/requirements.txt
else:
    !git clone https://github.com/dougrosman/pix2pixHD
    %cd pix2pixHD
    !mkdir generated_videos
    !pip install dominate
    !pip install -r util/requirements.txt


## 4. Data Processing

This notebook includes the following commands to help you create your data set:

1. Create the required folders for organizing your training data
1. Extract frames from a video file using FFMPEG
1. Create Canny edge versions of your images for your input (train_A) images


### 4a. Create necessary folders and upload dataset files

1. Inside your datasets folder, create a folder and name it based on the content of your dataset. Create the following two folders inside that folder (**note:** your folder names must match these exactly):
   1. **_train_A_**, for your input images. Place your input images inside that folder.
   2. **_train_B_**, for your output images. Place your output images inside that folder.

**Note:** the images in train_A MUST BE IN THE EXACT SAME ORDER as the images in train_B, and the number of images in each should be EXACTLY THE SAME. This will likely happen automatically, but if your training outputs appear mismatched, it's likely because your train_A and train_B folders don't line up.

The following command creates the folders for you. Just change 'dataset_name' to something indicative of your dataset.

In [None]:
# create a folder for your dataset (change dataset_name to whatever your dataset is)
!mkdir ./datasets/dataset_name

# creates three folders inside your datasets folder, one for your input (train_A),
# one for your output (train_B), and one for your test images (test_A)
!mkdir ./datasets/dataset_name/train_A
!mkdir ./datasets/dataset_name/train_B
!mkdir ./datasets/dataset_name/test_A

Before proceeding to the next step, upload the video that you will be using to extract images to your dataset folder.

If your images are already prepared, upload your input images to the train_A folder, and your output images to the train_B folder.
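
Once both folders have images in them, a quick check like the one below can catch a mismatch before you spend hours training. This is only a sketch: the helper name `check_pairing` is made up, and it assumes each train_A image has the same filename as its train_B partner, which may not be true for your dataset.

```python
import os

def check_pairing(train_a_dir, train_b_dir, extension=".png"):
    """Return a list of problems; an empty list means the folders line up."""
    a_files = sorted(f for f in os.listdir(train_a_dir) if f.endswith(extension))
    b_files = sorted(f for f in os.listdir(train_b_dir) if f.endswith(extension))
    problems = []
    # The two folders should hold exactly the same number of images.
    if len(a_files) != len(b_files):
        problems.append(f"train_A has {len(a_files)} images, train_B has {len(b_files)}")
    # Assumption: paired images share a filename, so sorted names should match.
    for a_name, b_name in zip(a_files, b_files):
        if a_name != b_name:
            problems.append(f"first name mismatch: {a_name} vs {b_name}")
            break
    return problems
```

For example, `check_pairing('./datasets/dataset_name/train_A', './datasets/dataset_name/train_B')` returns `[]` when everything lines up.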

### 4b. Extract frames from video – create train_B (output) images
If your images are already prepared and don't need to be extracted from a video file, you don't need to do this step. Your extracted frames will end up in the train_B folder.

In [None]:
# change dataset_name to the dataset_name you created above
# change input.mp4 file to the name of your video file
# change scale to your desired resolution to resize your images. (1280:-1 scales
  ## images to 1280 for the width; the height is scaled to maintain the aspect ratio)
  ## change the width to whatever makes sense for your images (I haven't tested
  ## anything larger than 1280x720, but you could try higher resolutions)
# change the fps (number of frames per second to extract);
  ## higher fps = more images to extract, 4-12 is a good range

!ffmpeg \
 -i ./datasets/dataset_name/input.mp4 \
 -vf scale=1280:-1,fps=8 \
 ./datasets/dataset_name/train_B/output%5d.png
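
To get a feel for how the fps value affects the size of your dataset, here's a small arithmetic sketch. The function name and the 60-second clip length are made up for illustration, and the real count from ffmpeg may differ by a frame or so.

```python
def estimate_frame_count(duration_seconds, fps):
    """Approximate number of images extracted: video length times frames per second."""
    return round(duration_seconds * fps)

# Hypothetical 60-second video at the low, middle, and high end of the suggested range.
for fps in (4, 8, 12):
    print(fps, "fps ->", estimate_frame_count(60, fps), "images")
```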

### 4c. Apply Canny Edge Detection - create train_A (input) images
If your images are already prepared and don't need to be converted to Canny edges, you don't need to do this step. This step uses [Canny Edge Detection](https://en.wikipedia.org/wiki/Canny_edge_detector) to find edges in your input images. The outlined images are rendered and stored in the train_A folder.


In [None]:
# change --input_folder to the path to your train_B folder
# change --output_folder to the path to your train_A folder

# change blur amount if there are too many lines in your resulting Canny Edge
# images (odd numbers only, 3 or 5 are good values, but try 1 if you're hardly getting any lines)

# change max_size to be the max dimension of your input images (e.g., if your
# input images are 1280x720, set max_size to '1280')

!python util/dataset-tools.py \
--input_folder ./datasets/dataset_name/train_B/ \
--output_folder ./datasets/dataset_name/train_A/ \
--process_type canny \
--blur_type gaussian \
--blur_amount 3 \
--max_size 1280 \
--verbose

## 5. Training

### **Some notes on training:** 

* **To stop your training manually**, click the stop button on the cell that's running your training.
* **You should only train using a P100 or a V100** (step 1 in the notebook tells you which card you have).
* **There's no set time for how much training your model needs to get the results you want**, but at least 60 epochs is ideal (more is likely better).
* **Watch your results folder as you train.** If it looks like your training is getting __*worse*__, then stop your training.
* **When you start your training, stick around for the first 10-15 minutes.** Google Colab checks to see if you're a robot around that time, so make sure you're there to confirm your humanity.
* **Don't close this tab!** You can do other things on your computer, and browse other tabs, but just don't close the tab!
* **Don't close your laptop!**
* **Don't let your computer fall asleep.** Go into your system settings to make sure your computer won't fall asleep.
* **On a free account, you'll get roughly 7-10 hours of continuous training.**
* **On a pro account, you'll get roughly 18-24 hours of continuous training.**
* **For free accounts, if you train for 40 hours or so in a single week, Google may "shadowban" you for a bit**, meaning you might not be able to connect to a GPU until after waiting a few hours (or sometimes an entire day). If you're running into these issues, I recommend Google Colab Pro (it's only $9.99 for the month, and totally worth it).
* **You can't do anything else in this notebook while training.** If you want to generate images while training, I recommend opening up a second Colab notebook in another Google account. Note, Google might be on to you if it finds you're using like, 5 Colab notebooks simultaneously. Proceed with this at your own risk. Just make sure you don't mount the same Drive folder in step 2 from multiple Colab notebooks.

### 5a. Training a new model from scratch (required for both training from scratch AND resuming training)

Set the following variables, whether you're training from scratch or resuming training. After changing the variables, click the play button in this cell to save your values. If you're resuming a training, set your variables, then skip to step 5b.

In [None]:
##### REQUIRED: edit these each time you train a new model from scratch!
name = 'training_name'    # can be whatever you want; name this based on your dataset (e.g. wave_pool)
dataroot = 'datasets/dataset_name'  # must match the name of your dataset folder
loadSize = 1280     # The desired width of your outputs (note: images will be cropped to this) Default=1024 
fineSize = 720      # The desired height of your outputs
which_epoch = 'latest' # The epoch you wish to resume training from. Keep this set to 'latest' if you want to 
                       ## pick up from where you left off. Otherwise, put the number of the .pth file you want 
                       ## to resume from (e.g. 10, 20, 30, etc.) 

##### OPTIONAL: change these if needed
resize_or_crop = 'scale_width'  # keeping this unchanged will automatically resize
                                  ## your images to the loadSize and fineSize, then crop to
                                  ## those dimensions. Set to 'none' if your images are
                                  ## already the correct dimensions

display_freq = 200    # frequency of showing training results on screen
print_freq = 100      # frequency of showing training results on console
save_latest_freq = 1000     # frequency of saving the latest results
                              ##(lower = more frequent saving, 1000 ~ saves every 10 minutes)
save_epoch_freq = 10    # frequency of saving checkpoints at the end of epochs
                        # (1 epoch is completed after going through every image in your data set 1 time)
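
To get a rough idea of how long a training run might take, you can work backwards from the note above that 1000 iterations take about 10 minutes. This is a back-of-the-envelope sketch only: it assumes one iteration per image (a batch size of 1), and the real speed depends on your GPU and image size. The function name and the 480-image dataset are made up for illustration.

```python
def estimate_training_hours(num_images, epochs, minutes_per_1000_iters=10.0):
    """Rough training time, assuming one iteration per image per epoch."""
    iterations = num_images * epochs
    return iterations / 1000 * minutes_per_1000_iters / 60

# Hypothetical dataset of 480 images trained for 60 epochs.
print(round(estimate_training_hours(480, 60), 1))  # prints: 4.8
```

If the estimate is longer than one Colab session, plan on resuming your training in step 5b.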

In [None]:
# ONLY RUN THIS IF YOU ARE STARTING A NEW TRAINING
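!python train.py --name=$name --dataroot=$dataroot --checkpoints_dir checkpoints --no_instance --label_nc 0 --loadSize=$loadSize --fineSize=$fineSize --resize_or_crop=$resize_or_crop --display_freq=$display_freq --print_freq=$print_freq --save_latest_freq=$save_latest_freq --save_epoch_freq=$save_epoch_freq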

### 5b. Resuming your training (required if resuming training from a partially-trained model)

Run this command if you are resuming training. You should still set your variables above before running this command.

In [None]:
# ONLY RUN THIS IF YOU ARE RESUMING A TRAINING
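!python train.py --name=$name --dataroot=$dataroot --checkpoints_dir checkpoints --no_instance --label_nc 0 --continue_train --which_epoch=$which_epoch --loadSize=$loadSize --fineSize=$fineSize --resize_or_crop=$resize_or_crop --display_freq=$display_freq --print_freq=$print_freq --save_latest_freq=$save_latest_freq --save_epoch_freq=$save_epoch_freq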
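
If you're not sure which epochs you can resume from, a sketch like this lists the saved generator checkpoints in a folder. It assumes the checkpoint files are named like `20_net_G.pth` and `latest_net_G.pth`, and that they're saved in `checkpoints/` followed by your training name; check your own Drive folder to confirm. The function name is made up for illustration.

```python
import os
import re

def available_epochs(checkpoint_dir):
    """List epoch labels that have a saved generator, e.g. ['10', '20', 'latest']."""
    labels = []
    for filename in os.listdir(checkpoint_dir):
        match = re.fullmatch(r"(\d+|latest)_net_G\.pth", filename)
        if match:
            labels.append(match.group(1))
    # Numbered epochs first, in numeric order, then 'latest'.
    return sorted(labels, key=lambda s: (s == "latest", int(s) if s.isdigit() else 0))
```

For example, `available_epochs('checkpoints/training_name')` gives you the values you can use for `which_epoch`.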

## 6. Generating Images
In order to generate images with pix2pixHD, you need to feed the model with "test" images. Your test images should look like your input (train_A) images. For example, if your train_A images used Canny Edge images, then your test images should also be Canny Edge images.

**pix2pixHD takes the images from your test_A folder and feeds them into your trained model. Any time you want to test new input images with your model, you'll need to replace the images in the test_A folder with your new images.** 

Inevitably, pix2pixHD will have you working with a large number of images spread across a number of different folders, and your file organization can get out of hand if you don't plan ahead a bit.

The following code cells provide tools to prepare your test images. None are **required**, so read the description before each cell to see if that's something you want to do.

### 6a. Preparing your test images

#### **Option 1:** Generating images from your original training data (the images from your train_A folder)
This cell:
1. Removes any images from your test_A folder (if there are any), and
2. Copies all the images from your train_A folder to your test_A folder.

Testing your trained model with the original data set can be useful to see how accurately the model can recreate the training data.

In [None]:
# remove any images currently in test_A
!rm -v ./datasets/dataset_name/test_A/*.png

# copy images from train_A to test_A
!cp -v ./datasets/dataset_name/train_A/*.png ./datasets/dataset_name/test_A

#### **Option 2: Testing with Canny Edge images from a new input source.**
This is where it starts to get important to stay organized. This command creates a folder inside of "input_test_images" (which lives in your datasets folder). **Do this for each new set of test images you create.**

This notebook refers to new sets of test images as "experiments".

In [None]:
# create a folder + relevant subfolders for your new experiment
# -p also creates input_test_images if it doesn't exist yet
!mkdir -p ./datasets/input_test_images/experiment_name
!mkdir -p ./datasets/input_test_images/experiment_name/extracted_frames
!mkdir -p ./datasets/input_test_images/experiment_name/canny_edges

#### **Upload your source video**
After running the above cell, upload the video to the [experiment_name] folder you created above.

#### **Extract frames from source video**

In [None]:
# change experiment_name to the experiment_name you created above
# change input.mp4 file to the name of your video file
# change scale to your desired resolution to resize your images. (1280:-1 scales
  ## images to 1280 for the width; the height is scaled to maintain the aspect ratio)
  ## change the width to whatever makes sense for your images (I haven't tested
  ## anything larger than 1280, but you might be able to go up to 1440 or 1600)
# change the fps (number of frames per second to extract);
  ## this time, you probably want all the frames from the video, so set
  ## the fps to the fps of your input video.
# change output%5d.png to include a reference to your experiment_name

!ffmpeg \
 -i ./datasets/input_test_images/experiment_name/input.mp4 \
 -vf scale=1280:-1,fps=30 \
 ./datasets/input_test_images/experiment_name/extracted_frames/output%5d.png
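
If you don't know the fps of your input video, ffprobe (which is installed alongside ffmpeg) can report it, usually as a fraction like `30000/1001`. Here's a small sketch for turning that text into a number you can use for the fps setting above. The function name is made up for illustration.

```python
from fractions import Fraction

# One way to get the rate text, run in its own cell:
# !ffprobe -v error -select_streams v:0 -show_entries stream=r_frame_rate -of default=noprint_wrappers=1:nokey=1 ./datasets/input_test_images/experiment_name/input.mp4

def parse_frame_rate(rate_text):
    """Convert a rate such as '30000/1001' or '30/1' to a float."""
    return float(Fraction(rate_text.strip()))

print(parse_frame_rate("30/1"))  # prints: 30.0
```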

#### **Apply Canny Edge Detection - create test_A images**
If your images are already prepared and don't need to be converted to Canny edges, you don't need to do this step. This step uses [Canny Edge Detection](https://en.wikipedia.org/wiki/Canny_edge_detector) to find edges in your input images. The outlined images are rendered and stored in the canny_edges folder.


In [None]:
# change --input_folder (change experiment_name)
# change --output_folder (change experiment_name)

# change blur amount if there are too many lines in your resulting Canny Edge
# images (odd numbers only, 3 or 5 are good values, but try 1 if you're hardly getting any lines)

# change max_size to be the max dimension of your input images (e.g., if your
# input images are 1280x720, set max_size to '1280')

!python util/dataset-tools.py \
--input_folder ./datasets/input_test_images/experiment_name/extracted_frames/ \
--output_folder ./datasets/input_test_images/experiment_name/canny_edges \
--process_type canny \
--blur_type gaussian \
--blur_amount 3 \
--max_size 1280 \
--verbose

#### **Put your test images in the correct folders**
Run this cell to remove any images currently in test_A, then copy your new Canny Edge Images to the test_A folder.

In [None]:
# remove any images currently in test_A (change dataset_name to your dataset_name)
!rm -v ./datasets/dataset_name/test_A/*.png

# copy your canny edge images into test_A (change experiment_name and dataset_name)
!cp -v ./datasets/input_test_images/experiment_name/canny_edges/*.png ./datasets/dataset_name/test_A
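
If you switch between experiments a lot, the remove-then-copy step can be wrapped in a small Python function. This is only a sketch of the same two steps as the cell above; the function name is made up, and it only touches files matching the pattern (`*.png` by default).

```python
import glob
import os
import shutil

def refresh_test_folder(source_dir, test_dir, pattern="*.png"):
    """Delete matching images in test_dir, then copy matching images from source_dir."""
    os.makedirs(test_dir, exist_ok=True)
    # Step 1: clear out the previous test images.
    for old_path in glob.glob(os.path.join(test_dir, pattern)):
        os.remove(old_path)
    # Step 2: copy in the new ones and count them.
    copied = 0
    for new_path in sorted(glob.glob(os.path.join(source_dir, pattern))):
        shutil.copy(new_path, test_dir)
        copied += 1
    return copied
```

The return value is the number of images copied, which is handy for setting `how_many` in step 6b.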

### 6b. Running the generate command

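Change your variables as needed below, then run the cell to save the changes. The generate command in the cell after that also uses the loadSize and fineSize variables from step 5a, so run that cell first if you've started a new session.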

In [None]:
##### REQUIRED: edit these each time you generate images

name = 'dataset_name'        # change dataset_name; must match the name you set in step 5a when you trained your model
dataroot = 'datasets/dataset_name'  # change dataset_name; must match the name of your dataset folder
results_dir = './results/experiment_name' # replace experiment_name with something descriptive of the image sequence you're generating

##### OPTIONAL: change these if needed
how_many = 1000    # The number of images you wish to generate. Make sure this is greater than or equal to the number of images in your test_A folder
which_epoch = 'latest' # The epoch you wish to generate images from; use the number of the saved checkpoint (e.g. '20' for 20_net_G.pth) (Default: latest)


In [None]:
# Run this cell to generate your images. This may take a few minutes.
!python test.py --name=$name --dataroot=$dataroot --checkpoints_dir checkpoints --results_dir=$results_dir --which_epoch=$which_epoch --how_many=$how_many --no_instance --label_nc 0 --loadSize=$loadSize --fineSize=$fineSize
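
If your generated sequence comes out shorter than expected, one thing to check is whether `how_many` covers every image in test_A. Here's a minimal sketch of that check; both function names are made up for illustration.

```python
import os

def count_test_images(test_dir, extension=".png"):
    """Count the images in test_dir with the given extension."""
    return sum(1 for f in os.listdir(test_dir) if f.lower().endswith(extension))

def how_many_is_enough(test_dir, how_many):
    """True when how_many covers every image in test_dir."""
    return how_many >= count_test_images(test_dir)
```

For example, `how_many_is_enough('./datasets/dataset_name/test_A', how_many)` should be `True` before you run the generate command.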

## 7. Creating videos from your generated images

### 7a. Create video of your test input images
_Thread your test_A images together into a video. If your input images used Canny Edge, then this will be a video of your Canny Edge inputs._

1. **-i**: the input images _(the filepath to your test input images)_
1. **-r**: the framerate _(any value between 1-60)_
1. **-crf**: the compression quality of the output video _(lower is better, 17-25 is a good range)_
1. **_the output filename_**: the last argument; make sure to set this each time you create a new video

In [None]:
# change experiment_name, dataset_name, and 'input_images_sequence.mp4'
# -pattern_type and -r must come before -i so they apply to the image sequence
!ffmpeg \
   -pattern_type glob \
   -r 30 \
   -i "./results/experiment_name/dataset_name/test_latest/images/*_input_label.png" \
   -vcodec libx264 \
   -crf 23 \
   -pix_fmt yuv420p \
   ./generated_videos/input_images_sequence.mp4

### 7b. Create video of your synthesized images
_Thread your synthesized images together into a video._

1. **-i**: the input images _(the filepath to your synthesized images)_
1. **-r**: the framerate _(any value between 1-60)_
1. **-crf**: the compression quality of the output video _(lower is better, 17-25 is a good range)_
1. **_the output filename_**: the last argument; make sure to set this each time you create a new video

In [None]:
# change experiment_name, dataset_name, and 'synthesized_output.mp4'
# -r comes before -i so the images are read at that framerate
!ffmpeg \
   -pattern_type glob \
   -r 30 \
   -i "./results/experiment_name/dataset_name/test_latest/images/*synthesized_image.png" \
   -vcodec libx264 \
   -crf 23 \
   -pix_fmt yuv420p \
   ./generated_videos/synthesized_output.mp4
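
As a sanity check on the framerate, the length of your video should be about the number of images divided by the framerate. Here's a minimal sketch of that arithmetic; the function name is made up, and the default suffix matches the synthesized image filenames used in the command above.

```python
import os

def expected_video_seconds(images_dir, framerate, suffix="synthesized_image.png"):
    """Number of matching frames divided by the framerate."""
    frame_count = sum(1 for f in os.listdir(images_dir) if f.endswith(suffix))
    return frame_count / framerate
```

For example, 900 synthesized images at a framerate of 30 should give a video of about 30 seconds. If your video is noticeably longer or shorter than that, the framerate settings are worth a second look.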