# 1) Mount drive, unzip data, clone repo, install packages

## 1.1) Mount Drive and define paths
Run the Colab-provided code to mount Google Drive, then define the dataset paths relative to the mount point.

In [None]:
!rm -rf /content/sample_data
!rm -rf /content/*.jpg
!rm -rf /content/*.png
!rm -rf /content/*.json

In [None]:
# noinspection PyUnresolvedReferences,PyPackageRequirements
from google.colab import drive
mount_root_abs = '/content/drive'
drive.mount(mount_root_abs)
drive_root = f'{mount_root_abs}/MyDrive'

In [None]:
import os
# DeepFashion In-shop Clothes Retrieval Benchmark (ICRB)
df_root_drive = f'{drive_root}/Datasets/DeepFashion'
assert os.path.exists(df_root_drive)
df_icrb_root_drive = f'{df_root_drive}/In-shop Clothes Retrieval Benchmark'
assert os.path.exists(df_icrb_root_drive)
df_icrb_img_zip_abs_drive = f'{df_icrb_root_drive}/Img.zip'

# If Img.zip is not present, we need to unzip .../Img/img_iuv.zip
# from drive root and then run ICRBScraper.run() from /src/dataset/deep_fashion.
# For this notebook we skip this, since it would take very long to complete
# on a mounted Google Drive.
assert os.path.exists(df_icrb_img_zip_abs_drive), \
  'Please upload a processed zip (processing img.zip in Colab would take' + \
  f' AGES).\nTried: {df_icrb_img_zip_abs_drive}'

## 1.2) Unzip Img directory in Colab
By unzipping `Img.zip` onto Colab's local disk before running our model, we gain significant disk-read speedups.
So, the first step is to unzip the images directory and make it available locally before proceeding.
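
For reference, the same copy-then-extract idea can be sketched in plain Python. This is a minimal sketch, not the code this notebook runs: `stage_and_unzip` is a hypothetical helper, and the demo uses temporary directories as stand-ins for Google Drive and `/content/data`.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def stage_and_unzip(zip_on_drive: Path, local_root: Path) -> Path:
    """Copy a zip archive to local disk, extract it there, then delete the local copy."""
    local_root.mkdir(parents=True, exist_ok=True)
    local_zip = local_root / zip_on_drive.name
    shutil.copy2(zip_on_drive, local_zip)   # one large sequential read from the mount
    with zipfile.ZipFile(local_zip) as zf:
        zf.extractall(local_root)           # the many small files land on local disk
    local_zip.unlink()                      # free the space taken by the archive
    return local_root


# Demo with temporary directories standing in for Drive and /content/data
with tempfile.TemporaryDirectory() as tmp:
    drive_dir = Path(tmp) / 'drive'
    drive_dir.mkdir()
    with zipfile.ZipFile(drive_dir / 'Img.zip', 'w') as zf:
        zf.writestr('Img/example.txt', 'placeholder')
    out_root = stage_and_unzip(drive_dir / 'Img.zip', Path(tmp) / 'data')
    extracted_ok = (out_root / 'Img' / 'example.txt').is_file()
    zip_removed = not (out_root / 'Img.zip').exists()
    print(extracted_ok, zip_removed)
```

Copying the archive in one piece and extracting locally means the mounted drive is read once, sequentially, instead of once per image.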

In [None]:
df_icrb_root = df_icrb_root_drive.replace(drive_root, '/content/data')
df_icrb_img_root = f'{df_icrb_root}/Img'
if not os.path.exists(df_icrb_img_root):
    # Clear any previous attempts
    # ATTENTION: This will remove /content/data/*. So, before running, please make
    # sure no usable files will be deleted.
    !mkdir -p /content/data
    !rm -rf /content/data

    # Create output directory
    !mkdir -p "$df_icrb_root"

    # Transfer Img.zip from Google Drive to Colab
    df_icrb_img_zip_abs = f'{df_icrb_root}/{os.path.basename(df_icrb_img_zip_abs_drive)}'
    if not os.path.exists(df_icrb_img_zip_abs):
        !cp "$df_icrb_img_zip_abs_drive" "$df_icrb_root"
    # Unzip it in Colab
    !unzip -q "$df_icrb_img_zip_abs" -d "$df_icrb_root"
    # Handle newly-created image directory
    assert os.path.exists(df_icrb_img_root), f'df_icrb_img_root: {df_icrb_img_root}'
    assert not os.path.exists(f'{df_icrb_img_root}/Img')
    assert not os.path.exists(f'{df_icrb_img_root}/img')
    !rm -f "$df_icrb_img_zip_abs"
    assert not os.path.exists(df_icrb_img_zip_abs)

    # Create a symbolic link back to drive (we need this to fool GDriveDataset
    # into thinking that it did the unzipping)
    if os.path.exists(f'{df_icrb_root_drive}/Img'):
        !rm "$df_icrb_root_drive"/Img
    !ln -s "$df_icrb_img_root" "$df_icrb_root_drive"

## 1.3) Clone github repo
Clone the achariso/gans-thesis repo into /content/code using git clone.
For more info see: https://medium.com/@purba0101/how-to-clone-private-github-repo-in-google-colab-using-ssh-77384cfef18f
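
Cloning should only happen when a usable checkout is not already in place. A minimal sketch of such a guard follows, assuming that the presence of `requirements.txt` is a good enough marker of a complete clone. `needs_clone` is a hypothetical helper and is not part of the repo.

```python
import os
import tempfile


def needs_clone(repo_root: str, marker: str = 'requirements.txt') -> bool:
    """Return True when repo_root is missing or does not contain the marker file."""
    return not os.path.isfile(os.path.join(repo_root, marker))


with tempfile.TemporaryDirectory() as tmp:
    repo = os.path.join(tmp, 'gans-thesis')
    missing = needs_clone(repo)        # directory does not exist yet
    os.makedirs(repo)
    half_cloned = needs_clone(repo)    # directory exists, marker file missing
    open(os.path.join(repo, 'requirements.txt'), 'w').close()
    complete = needs_clone(repo)       # marker file present
    print(missing, half_cloned, complete)
```

Checking for a file inside the checkout, rather than for the directory alone, also catches a clone that was interrupted after the directory was created.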

In [None]:
repo_root = '/content/code/gans-thesis'
# Clone when the repo is missing OR incomplete (e.g. a previous clone attempt failed)
if not os.path.exists(repo_root) or not os.path.exists(f'{repo_root}/requirements.txt'):
    # Check that ssh keys exist
    assert os.path.exists(f'{drive_root}/GitHub Keys')
    id_rsa_abs_drive = f'{drive_root}/GitHub Keys/id_rsa'
    id_rsa_pub_abs_drive = f'{id_rsa_abs_drive}.pub'
    assert os.path.exists(id_rsa_abs_drive)
    assert os.path.exists(id_rsa_pub_abs_drive)
    # On first run: Add ssh key in repo
    if not os.path.exists('/root/.ssh'):
        # Transfer config file
        ssh_config_abs_drive = f'{drive_root}/GitHub Keys/config'
        assert os.path.exists(ssh_config_abs_drive)
        !mkdir -p ~/.ssh
        !cp -f "$ssh_config_abs_drive" ~/.ssh/
        # Add github.com to known hosts
        !ssh-keyscan -t rsa github.com >> ~/.ssh/known_hosts
        # Test: !ssh -T git@github.com

    # Remove any previous attempts
    !rm -rf "$repo_root"
    !mkdir -p "$repo_root"
    # Clone repo
    !git clone git@github.com:achariso/gans-thesis.git "$repo_root"
    src_root = f'{repo_root}/src'
    !rm -rf "$repo_root"/report

## 1.4) Install pip packages
All required packages are listed in a requirements.txt file at the repository's root.
Run `pip install -r requirements.txt` from inside that directory to install them.

In [None]:
%cd "$repo_root"
!pip install -r requirements.txt

In [None]:
import torch
assert torch.cuda.is_available()

## 1.5) Add code/, */src/ to path
This is necessary in order to be able to run the modules.
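
Note that the `PYTHONPATH` environment variable only affects Python processes started afterwards, while `sys.path` controls imports in the running kernel. A minimal sketch that updates both is shown here; `add_to_python_path` is a hypothetical helper, and the two directories are the ones used in this notebook.

```python
import os
import sys


def add_to_python_path(*dirs: str) -> None:
    """Make dirs importable in this process and in child processes."""
    # Current interpreter: prepend to sys.path, keeping the given order
    for d in reversed(dirs):
        if d not in sys.path:
            sys.path.insert(0, d)
    # Child processes: prepend to PYTHONPATH without duplicating entries
    existing = [p for p in os.environ.get('PYTHONPATH', '').split(os.pathsep) if p]
    merged = list(dirs) + [p for p in existing if p not in dirs]
    os.environ['PYTHONPATH'] = os.pathsep.join(merged)


add_to_python_path('/content/code/gans-thesis', '/content/code/gans-thesis/src')
print(os.environ['PYTHONPATH'])
```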

In [None]:
content_root_abs = repo_root
src_root_abs = f'{repo_root}/src'
# No quotes around the value: a stray quote would end up inside the path itself
%env PYTHONPATH=/env/python:$content_root_abs:$src_root_abs

# 2) Train PGPG model on DeepFashion
In this section we run the actual training loop for the PGPG network. PGPG consists of a 2-stage generator, where each stage is a U-Net-like model, and, in our version, a PatchGAN discriminator.
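
A PatchGAN discriminator scores overlapping image patches as real or fake instead of scoring the whole image at once. The patch size equals the receptive field of a single output unit. The sketch below computes it, assuming the commonly used pix2pix-style stack of five 4x4 convolutions with strides 2, 2, 2, 1, 1. The layer configuration used in this repo may differ.

```python
def receptive_field(layers):
    """Receptive field of one output unit.

    layers: list of (kernel_size, stride) tuples, ordered from input to output.
    """
    rf = 1
    # Walk backwards from the output: each layer widens the field it sees
    for kernel, stride in reversed(layers):
        rf = (rf - 1) * stride + kernel
    return rf


# Assumed configuration (pix2pix-style), not read from this repo
pix2pix_style = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]
patch = receptive_field(pix2pix_style)
print(patch)  # 70
```

With these assumed layers, each discriminator output judges a 70x70 pixel patch of the input.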

### Colab Bug Workaround
Bug: matplotlib cache not rebuilding.
Solution: Run the following code and then restart the kernel.


In [None]:
# The workaround now lives inside train_pgpg.py, so there is nothing to run here.

### Actual Run
Finally, run the code!

In [None]:
chkpt_step = 'latest'   # supported: 'latest', <int>, None
log_level = 'debug'     # supported: 'debug', 'info', 'warning', 'error', 'critical', 'fatal'

# From epoch=37, lambda_recon in G2's loss went from 1 --> 5
# From epoch=66, lambda_recon in G2's loss went from 5 --> 10

# Running with -i enables us to get variables defined inside the script (the script runs inline)
%run -i src/train_pgpg.py

# 3) Evaluate PGPG
In this section we evaluate the generation performance of our trained network using the SOTA GAN evaluation metrics.

## 3.1) Get the metrics evolution plots
We plot how the metrics evolved during training. The GAN is **not** trained to minimize those metrics (they are
calculated using `torch.no_grad()`) and thus this evolution merely depends on the network and showcases the correlation
between the GAN evaluation metrics, and the losses (e.g. adversarial & reconstruction) used to optimize the network.

In [None]:
# Since the PGPG implements utils.ifaces.Visualizable, we can
# directly call visualize_metrics() on the model instance.
_ = pgpg.visualize_metrics(upload=True, preview=True)

## 3.2) Evaluate Generated Samples
To evaluate the generated samples and compare the model with other GAN architectures trained on the same dataset, we re-calculate the evaluation metrics stated above, but with a much larger number of samples. This makes the metrics more trustworthy and comparable with the corresponding metrics in the original paper.
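
The intuition behind using more samples can be sketched with a toy example. For a metric that is a plain average of per-sample scores, the standard error of the estimate shrinks roughly as 1/sqrt(n). The scores below are synthetic random numbers, not outputs of the model, and metrics such as FID are not plain averages, so this only illustrates the general trend.

```python
import math
import random
import statistics


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


rng = random.Random(0)
# Synthetic stand-in for per-sample metric scores (NOT real model outputs)
scores = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

se_small = standard_error(scores[:100])   # estimate from 100 samples
se_large = standard_error(scores)         # estimate from 10,000 samples
print(se_small > se_large)
```

Going from 100 to 10,000 samples reduces the uncertainty of such an average by about a factor of ten.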


In [None]:
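# Initialize a new evaluator instance
# (used to run GAN evaluation metrics: FID, IS, PRECISION, RECALL, F1 and SSIM).
# GanEvaluator, models_groot, dataset, exec_device, metrics_batch_size, f1_k and pgpg
# are expected to be in scope from the `%run -i src/train_pgpg.py` cell above.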
evaluator = GanEvaluator(model_fs_folder_or_root=models_groot, gen_dataset=dataset, target_index=1, device=exec_device,
                         condition_indices=(0, 2), n_samples=10000, batch_size=metrics_batch_size,
                         f1_k=f1_k)
# Run the evaluator
metrics_dict = evaluator.evaluate(gen=pgpg.gen, metric_name='all', show_progress=True)

# Print results
import json
print(json.dumps(metrics_dict, indent=4))