<a href="https://colab.research.google.com/github/billsioros/thesis/blob/master/Nanorough_surface_Super_resolution.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# ✔️ Prerequisites

First of all we need to take care of a few **prerequisites**, most notably:

- Install the various pip modules that we will be using.
- Install some Linux-specific dependencies of our [content loss](#content-loss).
- Initialize the Random Number Generator(s), so that our experiments can be replicated.
- Determine:
  - The current working directory, as it is used to reference various files such as the dataset, our model checkpoints, etc.
  - The available hardware backend. GPU utilization is preferable, as it results in shorter completion times.
- `(Optionally)` Mount Google Drive, from which we can load our dataset.

## Determining the Current Working Directory

In [None]:
from pathlib import Path

BASE_DIR = Path.cwd()

## Mounting Google Drive

In [None]:
GDRIVE_DIR = BASE_DIR / "drive"

try:
    from google.colab import drive

    drive.mount(f"{GDRIVE_DIR}")
except ImportError:
    pass

In [None]:
SECRETS_DIR = GDRIVE_DIR / "MyDrive" / "Secrets"

if GDRIVE_DIR.is_dir():
    THESIS_DIR = GDRIVE_DIR / "MyDrive" / "Thesis"
else:
    THESIS_DIR = BASE_DIR

OUTPUT_DIR = THESIS_DIR / "Output"

if THESIS_DIR.is_dir():
    DATASET_DIR = THESIS_DIR / "Datasets"
else:
    DATASET_DIR = BASE_DIR / "Datasets"

## Configuring our Loggers

In [None]:
import os

LOGGING_LEVEL = os.environ.get("LOGGING_LEVEL", "CRITICAL").upper()

LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "standard": {"format": "[%(asctime)s] %(levelname)s:%(name)s: %(message)s"}
    },
    "handlers": {
        "default": {
            "level": LOGGING_LEVEL,
            "formatter": "standard",
            "class": "logging.StreamHandler",
        },
        "file": {
            "level": LOGGING_LEVEL,
            "formatter": "standard",
            "class": "logging.FileHandler",
        },
    },
    "loggers": {"": {"handlers": ["default", "file"], "level": LOGGING_LEVEL}},
}
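
The `file` handler above does not specify a `filename`, which `logging.FileHandler` requires, so one has to be supplied before the configuration is applied. The following is a minimal sketch of that step using only the standard library. It assumes the configuration is eventually handed to `logging.config.dictConfig`; how `roughml` actually consumes it is not shown in this notebook. The `demo` logger name and the temporary directory are placeholders chosen for illustration.

```python
import logging
import logging.config
import tempfile
from pathlib import Path

# A trimmed-down copy of the configuration above; only the "file" handler is kept.
config = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "standard": {"format": "[%(asctime)s] %(levelname)s:%(name)s: %(message)s"}
    },
    "handlers": {
        "file": {
            "level": "CRITICAL",
            "formatter": "standard",
            "class": "logging.FileHandler",
        },
    },
    # "demo" is a placeholder logger name used only in this sketch.
    "loggers": {"demo": {"handlers": ["file"], "level": "CRITICAL"}},
}

# Supply the missing filename; the temporary directory is only for this demo.
log_path = Path(tempfile.mkdtemp()) / "critical.log"
config["handlers"]["file"]["filename"] = str(log_path)

logging.config.dictConfig(config)

logger = logging.getLogger("demo")
logger.critical("something went wrong")  # written to the log file
logger.info("this message is filtered out")  # below the CRITICAL threshold

log_contents = log_path.read_text()
```

With the default `LOGGING_LEVEL` of `CRITICAL`, only the first message ends up in the file.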

## Installing [graphviz](https://graphviz.org/) & [libgraphviz-dev](https://packages.debian.org/jessie/libgraphviz-dev)

The aforementioned packages are required by [PyINSECT](https://github.com/billsioros/PyINSECT/tree/implementing-HPGs) and, more specifically, by its graph plotting methods.

In [None]:
!sudo apt-get install -y graphviz libgraphviz-dev 1> /dev/null

## Installing the required `pip` modules

- [torch](https://pytorch.org/) is our machine learning framework of choice.
- [numpy](https://numpy.org/), [sympy](https://www.sympy.org/en/index.html) and [scipy](https://www.scipy.org/) are used in the context of nanorough surface generation.
- [plotly](https://plotly.com/) (which requires [pandas](https://pandas.pydata.org/)) as well as [matplotlib](https://matplotlib.org/) are used in order to plot various graphs.

In [None]:
WHEEL_VERSION = "3.0.1"
WHEEL_FILE = "roughml-%s-py3-none-any.whl" % (WHEEL_VERSION,)
WHEEL_PATH = THESIS_DIR / "Binaries" / WHEEL_FILE

In [None]:
import subprocess
import sys

pip_freeze_output = subprocess.check_output(
    [sys.executable, "-m", "pip", "freeze"]
).decode()

if "roughml" not in pip_freeze_output:
    if WHEEL_PATH.is_file():
        subprocess.check_call([sys.executable, "-m", "pip", "install", WHEEL_PATH])
    else:
        raise FileNotFoundError(WHEEL_PATH)
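
The substring test `"roughml" not in pip_freeze_output` also matches any installed distribution whose name merely contains `roughml`. As a possible alternative, the sketch below queries the installed metadata for an exact distribution name through the standard library's `importlib.metadata` (Python 3.8+). The helper name `is_installed` is an illustrative assumption and is not part of the notebook.

```python
from importlib import metadata


def is_installed(distribution_name):
    """Return True if a distribution with this exact name is installed."""
    try:
        metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return False
    return True


# A name that is very unlikely to exist, used only to show the negative case.
missing = is_installed("surely-not-installed-package-xyz123")
```

In the cell above, `if not is_installed("roughml"):` could stand in for the `pip freeze` check without spawning a subprocess.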

## Initializing (a.k.a. `Seeding`) the Random Number Generator(s)

We seed the various random number generators, so that our experiments can be replicated at a later date.

In [None]:
SEED = 1234

import os
import random

import numpy as np

In [None]:
import torch

if SEED is not None:
    np.random.seed(SEED)
    random.seed(SEED)
    torch.manual_seed(SEED)
    torch.cuda.manual_seed(SEED)
    torch.backends.cudnn.deterministic = True
    os.environ["PYTHONHASHSEED"] = str(SEED)
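
To illustrate why seeding matters, here is a small standard-library sketch: reseeding with the same value reproduces the same sequence of draws, while a different seed gives a different one. The helper `draw` is hypothetical and only exercises Python's `random` module; `numpy` and `torch` keep their own generators, which is why the cell above seeds each of them separately.

```python
import random


def draw(seed, n=3):
    """Seed the global generator and return the next n random floats."""
    random.seed(seed)
    return [random.random() for _ in range(n)]


first = draw(1234)
second = draw(1234)  # same seed, same sequence
other = draw(4321)   # different seed, different sequence
```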

## Determining available backend

We use the GPU backend whenever one is available and fall back to the CPU otherwise.

In [None]:
device = "cpu"
if torch.cuda.is_available():
    device = "cuda:0"

In [None]:
device = torch.device(device)

## Setting up our callbacks

### End-of-training callback

In [None]:
from datetime import datetime

from roughml.shared.notifiers import EndOfTrainingNotifier

training_callback = None
if SECRETS_DIR.is_dir():
    notifier = EndOfTrainingNotifier.from_json(SECRETS_DIR / "credentials.json")

    timestamp = datetime.now().strftime("%Y/%m/%d %H:%M:%S.%f")

    def training_callback(log_file=None, **context):
        notifier(
            ("Vasilis Sioros", "billsioros97@gmail.com"),
            log_file=log_file,
            dataset=context["dataset"],
            generator=context["generator"],
            discriminator=context["discriminator"],
            elapsed_time=context["elapsed_time"],
            succeeded=context["succeeded"],
            identifier=timestamp,
        )

### Logging initialization callback

In [None]:
def logging_callback(config, logging_dir):
    level = config.handlers.file.level.lower()

    config.handlers.file.filename = logging_dir / f"{level}.log"

    return config
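
The callback above reads and writes the configuration through attribute access (`config.handlers.file.level`), so it presumably receives an attribute-style wrapper rather than the plain `LOGGING_CONFIG` dictionary. The sketch below shows the effect of the callback under that assumption, with nested `types.SimpleNamespace` objects as a hypothetical stand-in for whatever wrapper `roughml` passes in. The `logs` directory is a placeholder.

```python
from pathlib import Path
from types import SimpleNamespace


def logging_callback(config, logging_dir):
    # Name the log file after the configured level, e.g. "critical.log".
    level = config.handlers.file.level.lower()

    config.handlers.file.filename = logging_dir / f"{level}.log"

    return config


# Hypothetical attribute-style configuration holding only the relevant field.
config = SimpleNamespace(
    handlers=SimpleNamespace(file=SimpleNamespace(level="CRITICAL"))
)

result = logging_callback(config, Path("logs"))
log_file = result.handlers.file.filename
```

With the default `CRITICAL` level, `log_file` points at `critical.log` inside the given logging directory.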

# 🙃 A naive approach

## Defining the **Generator** and the **Discriminator** instantiation callbacks

In [None]:
from roughml.models import PerceptronGenerator


def get_generator():
    return PerceptronGenerator.from_device(device)

In [None]:
from roughml.models import PerceptronDiscriminator


def get_discriminator(generator):
    return PerceptronDiscriminator.from_generator(generator)

## Training

In [None]:
from torch.nn import BCELoss

criterion = BCELoss().to(device)

In [None]:
import functools

from roughml.content.loss import NGramGraphContentLoss
from roughml.data.loaders import load_multiple_datasets_from_pt
from roughml.data.transforms import To, View
from roughml.training.flow import TrainingFlow
from roughml.training.manager import per_epoch

training_flow = TrainingFlow(
    output_dir=OUTPUT_DIR,
    logging={"config": LOGGING_CONFIG, "callback": logging_callback},
    training={
        "manager": {
            "benchmark": True,
            # Uncomment if you want to enable checkpointing
            # "checkpoint": {"multiple": True},
            "train_epoch": per_epoch,
            "log_every_n": 10,
            "criterion": {"instance": criterion},
            "n_epochs": 10,
            "train_ratio": 0.8,
            "optimizer": {"lr": 0.0005, "weight_decay": 0},
            "dataloader": {
                "batch_size": 256,
                "shuffle": True,
                "num_workers": 0,
            },
        },
        "callbacks": [
            training_callback,
        ],
    },
    content_loss={
        "type": NGramGraphContentLoss,
        # Uncomment if you want to enable checkpointing
        # "cache": "n_gram_graph_content_loss.pkl",
    },
    data={
        "loader": functools.partial(
            load_multiple_datasets_from_pt,
            DATASET_DIR,
            transforms=[To(device), View(1, 128, 128)],
            limit=(4, 10),
        )
    },
    animation={
        "indices": [
            0,
        ],
        "save_path": "perceptron_per_epoch_animation.mp4",
    },
    plot={
        "grayscale": {"limit": 10, "save_path_fmt": "grayscale/%s_%02d.png"},
        "surface": {"limit": 10, "save_path_fmt": "surface/%s_%02d.png"},
        "against": {"save_path_fmt": "against_%s.png"},
    },
    suppress_exceptions=False,
)

In [None]:
training_flow(get_generator, get_discriminator)

# 😎 A CNN-based approach

## Instantiating the **Generator** and the **Discriminator** Networks

In [None]:
from roughml.models import CNNGenerator


def get_generator():
    return CNNGenerator.from_device(device)

In [None]:
from roughml.models import CNNDiscriminator


def get_discriminator(_):
    return CNNDiscriminator.from_device(device)

## Training

In [None]:
from torch.nn import BCELoss

criterion = BCELoss().to(device)

In [None]:
import functools

from roughml.content.loss import ArrayGraph2DContentLoss
from roughml.data.loaders import load_multiple_datasets_from_pt
from roughml.data.transforms import To, View
from roughml.training.flow import TrainingFlow
from roughml.training.manager import per_epoch

training_flow = TrainingFlow(
    output_dir=OUTPUT_DIR,
    logging={"config": LOGGING_CONFIG, "callback": logging_callback},
    training={
        "manager": {
            "benchmark": True,
            # Uncomment if you want to enable checkpointing
            # "checkpoint": {"multiple": True},
            "train_epoch": per_epoch,
            "log_every_n": 10,
            "criterion": {"instance": criterion},
            "n_epochs": 10,
            "train_ratio": 0.8,
            "optimizer": {"lr": 0.0002, "betas": (0.5, 0.999)},
            "dataloader": {
                "batch_size": 256,
                "shuffle": True,
                "num_workers": 0,
            },
        },
        "callbacks": [
            training_callback,
        ],
    },
    content_loss={
        "type": ArrayGraph2DContentLoss,
        # Uncomment if you want to enable checkpointing
        # "cache": "array_graph2d_content_loss.pkl",
    },
    data={
        "loader": functools.partial(
            load_multiple_datasets_from_pt,
            DATASET_DIR,
            transforms=[To(device), View(1, 128, 128)],
            limit=(4, 10),
        )
    },
    animation={
        "indices": [
            0,
        ],
        "save_path": "cnn_per_epoch_animation.mp4",
    },
    plot={
        "grayscale": {"limit": 10, "save_path_fmt": "grayscale/%s_%02d.png"},
        "surface": {"limit": 10, "save_path_fmt": "surface/%s_%02d.png"},
        "against": {"save_path_fmt": "against_%s.png"},
    },
    suppress_exceptions=False,
)

In [None]:
training_flow(get_generator, get_discriminator)

# 👋 Unmounting Google Drive and persisting any changes made

In [None]:
try:
    from google.colab import drive

    drive.flush_and_unmount()
except ImportError:
    pass