# Training Remotely on Google Colab

Make sure the runtime is connected to an NVIDIA GPU (Runtime > Change runtime type). Then clone the project repository and navigate to its root directory:

In [None]:
!git clone https://github.com/starboi-63/ArTEMIS.git

%cd ArTEMIS

Install Python dependencies from `requirements.txt`:

In [None]:
!pip install -r requirements.txt

Check the CUDA toolkit version installed on the current compute instance (`nvcc` reports the compiler version, not the driver version):

In [None]:
!nvcc --version

Verify that CUDA is working as expected:

In [None]:
import torch

print("PyTorch version: ", torch.__version__)
print("PyTorch CUDA version: ", torch.version.cuda)
print("CUDA available: ", torch.cuda.is_available())

## Downloading the Dataset

Download the dataset archive from Google Drive:

In [None]:
!pip install gdown

!gdown "1-92uD26anIHLDIGVFuxmT7AlN7idVMkE" # Vimeo-90k dataset full size (recent gdown releases take the file ID positionally; `--id` is deprecated)

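Unzip the dataset into `/content`. The working directory is still `ArTEMIS`, so extracting without `-d` would not produce the `/content/vimeo_septuplet` path used as `data_dir` below (this assumes the archive has a top-level `vimeo_septuplet` folder):

In [None]:
!unzip -q vimeo_septuplet.zip -d /content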
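
Before training, it can help to confirm that the archive extracted into the expected layout. The sketch below assumes the standard Vimeo-90K septuplet layout (a `sequences` folder plus `sep_trainlist.txt` and `sep_testlist.txt`). `missing_dataset_entries` is a hypothetical helper and is not part of the ArTEMIS repository.

```python
import os

# Assumed layout of the extracted Vimeo-90K septuplet archive.
EXPECTED_ENTRIES = ("sequences", "sep_trainlist.txt", "sep_testlist.txt")


def missing_dataset_entries(root, expected=EXPECTED_ENTRIES):
    """Return the expected entries that are not present under `root`."""
    return [name for name in expected
            if not os.path.exists(os.path.join(root, name))]


missing = missing_dataset_entries("/content/vimeo_septuplet")
if missing:
    print("Missing from dataset directory:", ", ".join(missing))
else:
    print("Dataset layout looks complete.")
```

If anything is reported missing, the archive was probably extracted somewhere else or the download was incomplete.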

## Mounting Google Drive

Mount Google Drive to save logs and checkpoints in real time:

In [None]:
from google.colab import drive

drive.mount('/content/drive')

Add `/content/ArTEMIS` to the Python path to import local modules:

In [None]:
import sys
sys.path.append('/content/ArTEMIS')

Create directories to save logs and model training checkpoints:

In [None]:
import os

logging_dir = '/content/drive/My Drive/Deep Learning/ArTEMIS/training/tensorboard_logs'
checkpoint_dir = '/content/drive/My Drive/Deep Learning/ArTEMIS/training/checkpoints'
data_dir = '/content/vimeo_septuplet'

# Create each output directory on Drive if it does not already exist.
for directory in (logging_dir, checkpoint_dir):
    if not os.path.exists(directory):
        os.makedirs(directory)
        print(f"Created directory at {directory}.")
    else:
        print(f"Directory {directory} already exists.")

## TensorBoard

Run the following cells to launch TensorBoard, which visualizes training progress from the logs written to `logging_dir`.

In [None]:
%load_ext tensorboard

In [None]:
%tensorboard --logdir "{logging_dir}"

## Train!

### Command-Line Arguments

#### Key Arguments
- `--model`: Model to train. Default: `ArTEMIS`.
- `--dataset`: Dataset to train on. Default: `vimeo90K_septuplet`.
- `--data_root`: Path to the dataset. Default: `/content/drive/My Drive/Deep Learning/ArTEMIS/vimeo_septuplet`.
- `--checkpoint_dir`: Directory in which intermediate model states are saved. Default: `/content/drive/My Drive/Deep Learning/ArTEMIS/training/`.
- `--log_dir`: Directory in which TensorBoard logs are saved. Default: `/content/drive/My Drive/Deep Learning/ArTEMIS/training/logs`.

#### Model Parameters
- `--nbr_frame`: Number of input frames to consider. Default: `4`.
- `--joinType`: Type of join operation. Default: `concat`. Choices: `concat`, `add`, `none`.
- `--kernel_size`: Kernel size for the convolutional layers. Default: `5`.
- `--dilation`: Dilation factor for the convolutional layers. Default: `1`.
- `--num_outputs`: Number of interpolated output frames. Default: `3`.

#### Learning Parameters
- `--loss`: Loss function to use. Default: `1*L1`.
- `--lr`: Learning rate. Default: `2e-4`.
- `--beta1`: Beta1 for Adam optimizer. Default: `0.9`.
- `--beta2`: Beta2 for Adam optimizer. Default: `0.999`.
- `--batch_size`: Batch size. Default: `4`.
- `--test_batch_size`: Test batch size. Default: `12`.
- `--start_epoch`: Start epoch. Default: `0`.
- `--max_epoch`: Maximum number of epochs. Default: `100`.
- `--resume`: Resume training. Default: `False`.
- `--resume_exp`: Resume experiment. Default: `None`.
- `--load_from`: Load from a checkpoint. Default: `checkpoints/ArTEMIS/model_best.pth`.
- `--pretrained`: Load from a pretrained model. Default: `None`.

#### Miscellaneous
- `--exp_name`: Experiment name. Default: `exp`.
- `--log_iter`: Log iteration. Default: `100`.
- `--num_gpu`: Number of GPUs. Default: `1`.
- `--random_seed`: Random seed. Default: `103`.
- `--num_workers`: Number of workers. Default: `1`.
- `--val_freq`: Validation frequency. Default: `1`.
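
To illustrate how flags and defaults of this kind behave, here is a minimal `argparse` sketch that mirrors a handful of the arguments listed above. It is not the project's actual configuration code; `build_parser` is a hypothetical name, and only the flag names and defaults are taken from the lists.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring a few of the documented flags."""
    parser = argparse.ArgumentParser(description="Sketch of a training CLI")
    parser.add_argument("--model", type=str, default="ArTEMIS")
    parser.add_argument("--joinType", choices=["concat", "add", "none"],
                        default="concat")
    parser.add_argument("--lr", type=float, default=2e-4)
    parser.add_argument("--batch_size", type=int, default=4)
    parser.add_argument("--max_epoch", type=int, default=100)
    return parser


# Flags that are omitted fall back to their defaults.
args = build_parser().parse_args(["--batch_size", "8"])
print(args.batch_size, args.lr, args.joinType)
```

In this sketch `parse_args` rejects unknown flags. A script that uses `parse_known_args` instead would ignore them silently, so a misspelled flag could quietly leave the default in place.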

In [None]:
!python main.py --model ArTEMIS --dataset vimeo90K_septuplet --data_root "{data_dir}" --log_dir "{logging_dir}" --checkpoint_dir "{checkpoint_dir}" --batch_size 4
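
Because the Drive paths contain spaces (`My Drive`, `Deep Learning`), the quoting in the command matters. As an alternative sketch, the same command can be assembled as an argument list in Python, which avoids shell quoting altogether. `build_train_command` is a hypothetical helper; the flags are the ones documented above.

```python
import shlex


def build_train_command(data_dir, logging_dir, checkpoint_dir, batch_size=4):
    """Return the training command as an argument list."""
    return [
        "python", "main.py",
        "--model", "ArTEMIS",
        "--dataset", "vimeo90K_septuplet",
        "--data_root", data_dir,
        "--log_dir", logging_dir,
        "--checkpoint_dir", checkpoint_dir,
        "--batch_size", str(batch_size),
    ]


cmd = build_train_command(
    "/content/vimeo_septuplet",
    "/content/drive/My Drive/Deep Learning/ArTEMIS/training/tensorboard_logs",
    "/content/drive/My Drive/Deep Learning/ArTEMIS/training/checkpoints",
)
# Print the shell-quoted form for inspection.
print(shlex.join(cmd))
```

Passing `cmd` to `subprocess.run(cmd, check=True)` from the repository root would launch training with each path treated as a single argument.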