<a href="https://colab.research.google.com/github/Cat6498/painterseye/blob/main/NeuralRendererExperiments2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Experimenting with a neural renderer - part 2

After trying out the approach by Nakano and LibreAI as an introduction, it's time to go deeper. In this Colab notebook I'll try the approach from [Stylized Neural Painter](https://jiupinjia.github.io/neuralpainter/): creating a dual-architecture (rasterization + shading) network and training it on brushstrokes that the repository's renderer generates on the fly.

## Setup

First, clone the git repository into Colab's /content folder and cd into it.

In [None]:
!git clone https://github.com/jiupinjia/stylized-neural-painting.git 

Cloning into 'stylized-neural-painting'...
remote: Enumerating objects: 198, done.
remote: Counting objects: 100% (55/55), done.
remote: Compressing objects: 100% (45/45), done.
remote: Total 198 (delta 29), reused 26 (delta 10), pack-reused 143
Receiving objects: 100% (198/198), 3.63 MiB | 6.43 MiB/s, done.
Resolving deltas: 100% (100/100), done.


In [None]:
%cd stylized-neural-painting

/content/stylized-neural-painting


Install torchviz, then import the packages and files we'll need.

In [None]:
%pip install -U git+https://github.com/szagoruyko/pytorchviz.git@master

Collecting git+https://github.com/szagoruyko/pytorchviz.git@master
  Cloning https://github.com/szagoruyko/pytorchviz.git (to revision master) to /tmp/pip-req-build-urzhx7o7
  Running command git clone -q https://github.com/szagoruyko/pytorchviz.git /tmp/pip-req-build-urzhx7o7
Building wheels for collected packages: torchviz
  Building wheel for torchviz (setup.py) ... done
  Created wheel for torchviz: filename=torchviz-0.0.2-py3-none-any.whl size=4991 sha256=370875f341f08226f8930d7a298336c8fd78539520c2cfa487416d7f2a5b0081
  Stored in directory: /tmp/pip-ephem-wheel-cache-oygz0c98/wheels/69/06/fd/652908d49c931cdcca96be3c727fb11ed777a3a62402210396
Successfully built torchviz
Installing collected packages: torchviz
Successfully installed torchviz-0.0.2


In [None]:
# General imports
import numpy as np
import matplotlib.pyplot as plt
import cv2
import os
import argparse

# Torch imports
import torch
import torch.optim as optim
import torch.nn as nn
# For visualisation purposes
from torchviz import make_dot

# Imports from Stylized Neural Renderer
import utils
import loss
from networks import *
import renderer

Run on GPU if available, else on CPU

In [None]:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)

cuda:0


## The PainterTrainer

The PainterTrainer class wraps our neural renderer (net_G) together with everything needed to train it.

It is composed of several parts.

<br />

#### Dataloaders
Passed as an argument when initialising the PainterTrainer, they're created with
    
    > utils.get_renderer_loaders(args)
    
where get_renderer_loaders() builds two datasets, one for training and one for validation.

Both datasets are instances of the StrokeDataset class, which generates strokes through the Renderer's random_stroke_params() and draw_stroke() methods.

get_renderer_loaders() then returns a dictionary dataloaders = {'train': DataLoader(train), 'val': DataLoader(val)}.
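
To make the "generated on the fly" idea concrete, here is a minimal sketch of a dataset that creates a new random sample every time it is indexed. `ToyStrokeDataset`, `get_toy_loaders`, the vector length `d` and the image size are all hypothetical stand-ins, and the random tensors only take the place of the real rendered strokes; the dictionary keys 'A', 'B' and 'ALPHA' are the ones the trainer below reads from each batch.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ToyStrokeDataset(Dataset):
    """Hypothetical stand-in for StrokeDataset: every item is generated on demand."""

    def __init__(self, n_items, d=12, size=32):
        self.n_items = n_items  # nominal epoch length; no samples are stored
        self.d = d              # length of the stroke parameter vector (assumed)
        self.size = size        # side of the rendered image in pixels (assumed)

    def __len__(self):
        return self.n_items

    def __getitem__(self, idx):
        params = torch.rand(self.d)                       # random stroke parameters
        foreground = torch.rand(3, self.size, self.size)  # placeholder for the rendered stroke
        alpha = torch.rand(1, self.size, self.size)       # placeholder for its alpha map
        return {'A': params, 'B': foreground, 'ALPHA': alpha}


def get_toy_loaders(batch_size=8):
    """Build the same {'train': ..., 'val': ...} dictionary layout described above."""
    train = ToyStrokeDataset(n_items=64)
    val = ToyStrokeDataset(n_items=16)
    return {
        'train': DataLoader(train, batch_size=batch_size, shuffle=True),
        'val': DataLoader(val, batch_size=batch_size, shuffle=False),
    }


toy_loaders = get_toy_loaders()
batch = next(iter(toy_loaders['train']))
print(batch['A'].shape, batch['B'].shape, batch['ALPHA'].shape)
```

Because nothing is stored, the dataset length only decides how many batches make up an epoch, and every epoch sees fresh strokes.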

<br />

---
#### Renderer

The renderer parameter selects which type of brush gets created (oilpaintbrush (the default), markerpen, watercolor or rectangle) and determines the shape of the action vectors. For oilpaintbrush, it also uses greyscale PNG images to give brushstrokes a "rougher" texture.

It has methods to create a canvas and update it with the latest brushstroke, to create random stroke parameters (ground truth), and to generate stroke parameters according to the renderer (generated brushstrokes).

It also has a draw_stroke method that calls private methods according to the renderer type. These create the brushstroke through a series of transformations on the vectors, and they all end with a normalised foreground and stroke_alpha_mat, which are used to update the canvas.
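
As an illustration of how a foreground and an alpha map can update a canvas, here is a minimal sketch that assumes standard alpha blending. The function name `update_canvas` and the array sizes are hypothetical, and the repository's own update may differ in detail.

```python
import numpy as np


def update_canvas(canvas, foreground, alpha):
    """Alpha-blend one stroke onto the canvas; all arrays hold floats in [0, 1]."""
    return foreground * alpha + canvas * (1.0 - alpha)


canvas = np.zeros((4, 4, 3), dtype=np.float32)         # black canvas
foreground = np.full((4, 4, 3), 0.8, dtype=np.float32)  # flat light-grey stroke colour
alpha = np.zeros((4, 4, 1), dtype=np.float32)
alpha[1:3, 1:3] = 1.0                                   # the stroke covers the 2x2 centre

canvas = update_canvas(canvas, foreground, alpha)
print(canvas[:, :, 0])
```

Where alpha is 1 the stroke colour replaces the canvas, where it is 0 the canvas is left untouched, and values in between mix the two.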

<br />

---
#### Network

define_G decides which class of generator to instantiate based on the net_G parameter.

The suggested one is zou-fusion-net, which is made of two parts:
    
* a DCGAN made of convolutional-transpose layers, batch norm layers, and ReLU activations (from Radford et al.), used to get the color of the brushstroke
    
* a PixelShuffleNet made of fully connected (linear) layers and convolutional layers, used to get the mask of the brushstroke (so this is the rasterisation network?)

PixelShuffle, from the PyTorch docs: rearranges elements in a tensor of shape $(*, C \times r^2, H, W)$ to a tensor of shape $(*, C, H \times r, W \times r)$, where $r$ is an upscale factor. This is useful for implementing efficient sub-pixel convolution with a stride of $1/r$.
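
The shape change is easy to check with `torch.nn.PixelShuffle` on a tiny tensor. This is only a demonstration of the PyTorch layer itself, not of the repository's PixelShuffleNet; the tensor sizes are chosen for illustration.

```python
import torch
import torch.nn as nn

r = 2                                        # upscale factor
shuffle = nn.PixelShuffle(upscale_factor=r)

# Input of shape (N, C * r^2, H, W) with N=1, C=1, H=W=2
x = torch.arange(16, dtype=torch.float32).reshape(1, 4, 2, 2)
y = shuffle(x)                               # output of shape (N, C, H * r, W * r)

print(tuple(x.shape), '->', tuple(y.shape))
print(y[0, 0])
```

The number of elements stays the same: the four input channels are interleaved into one channel with twice the height and width.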

<br />

---
> Why do we need to set the gradient to 0?

[From StackOverflow](https://stackoverflow.com/questions/48001598/why-do-we-need-to-call-zero-grad-in-pytorch)

In PyTorch, for every mini-batch during the training phase, we need to explicitly set the gradients to zero before starting backpropagation (i.e., before computing the gradients used to update the weights and biases), because PyTorch accumulates the gradients on subsequent backward passes. This is convenient while training RNNs. So, the default action has been set to accumulate (i.e. sum) the gradients on every loss.backward() call.

Because of this, when you start your training loop, ideally you should zero out the gradients so that you do the parameter update correctly. Else the gradient would point in some other direction than the intended direction towards the minimum (or maximum, in case of maximization objectives).
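
The accumulation is easy to see on a single scalar parameter. This is a small sketch with a made-up loss `3 * w`, whose gradient with respect to `w` is always 3.

```python
import torch

w = torch.tensor(2.0, requires_grad=True)

loss = 3.0 * w
loss.backward()
first = w.grad.item()        # gradient after one backward pass

loss = 3.0 * w
loss.backward()
accumulated = w.grad.item()  # second backward pass is added to the first

w.grad.zero_()               # what zeroing the gradients does for this parameter
loss = 3.0 * w
loss.backward()
after_reset = w.grad.item()  # back to the gradient of a single pass

print(first, accumulated, after_reset)
```

Calling `zero_grad()` on an optimizer clears the gradients of every parameter it manages in the same spirit, which is why the training loop calls it before each backward pass.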

In [None]:
class PainterTrainer():

  def __init__(self, args, dataloaders):
    
    # Define the dataloaders
    self.dataloaders = dataloaders

    # Create the Renderer
    self.renderer = renderer.Renderer(renderer=args.renderer)

    # Define the network structure and load it to the device
    self.net_G = define_G(rdrr=self.renderer, netG=args.net_G).to(device)

    # define learning rate
    self.lr = args.lr

    # define the Adam optimizer - extension of SGD, efficient also with noisy or sparse gradients 
    self.optimizer_G = optim.Adam(self.net_G.parameters(), lr=self.lr, betas=(0.9, 0.999))

    # define the learning rate scheduler - from pytorch docs: decays the learning rate of each parameter group by gamma every step_size epochs
    self.lr_scheduler_G = optim.lr_scheduler.StepLR(self.optimizer_G, step_size=100, gamma=0.1)

    # define the loss function - PixelLoss averages the pixel-wise error with torch.mean (p=2 gives a squared error)
    self._pxl_loss = loss.PixelLoss(p=2)

    # define some other vars to record the training states
    self.running_acc = []
    self.epoch_acc = 0
    self.best_val_acc = 0.0
    self.best_epoch_id = 0
    self.epoch_to_start = 0
    self.max_num_epochs = args.max_num_epochs
    self.G_pred_foreground = None
    self.G_pred_alpha = None
    self.batch = None
    self.G_loss = None
    self.is_training = False
    self.batch_id = 0
    self.epoch_id = 0
    self.checkpoint_dir = args.checkpoint_dir
    self.vis_dir = args.vis_dir

    # history of per-epoch accuracies, restored from disk when resuming a previous run
    self.VAL_ACC = np.array([], np.float32)
    if os.path.exists(os.path.join(self.checkpoint_dir, 'val_acc.npy')):
        self.VAL_ACC = np.load(os.path.join(self.checkpoint_dir, 'val_acc.npy'))

    # create the checkpoint and visualisation directories if they do not exist
    os.makedirs(self.checkpoint_dir, exist_ok=True)
    os.makedirs(self.vis_dir, exist_ok=True)

    # visualize model
    if args.print_models:
        self._visualize_models()


  # visualize the model in graph form 
  def _visualize_models(self):
      
    data = next(iter(self.dataloaders['train']))
    y = self.net_G(data['A'].to(device))
    dot = make_dot(y[0].mean(), params=dict(self.net_G.named_parameters()), show_attrs=True, show_saved=True)
    dot.render('G')


  # This decreases learning rate
  def _update_lr_schedulers(self):
    self.lr_scheduler_G.step()


  # Computes accuracy of predictions 
  def _compute_acc(self):
    
    # ground truths
    target_foreground = self.gt_foreground.to(device).detach()
    target_alpha_map = self.gt_alpha.to(device).detach()

    # predictions
    foreground = self.G_pred_foreground.detach()
    alpha_map = self.G_pred_alpha.detach()

    # Average peak signal-to-noise ratio 
    psnr1 = utils.cpt_batch_psnr(foreground, target_foreground, PIXEL_MAX=1.0)
    psnr2 = utils.cpt_batch_psnr(alpha_map, target_alpha_map, PIXEL_MAX=1.0)
    return (psnr1 + psnr2)/2.0


  # Called at the start of each epoch - running_acc collects the per-batch accuracies of the current epoch
  def _clear_cache(self):
    self.running_acc = []


  # From the batch, get the action vector, and feed it to the network to get prediction of foreground and alpha
  def _forward_pass(self, batch):
    self.batch = batch
    z_in = batch['A'].to(device)
    self.G_pred_foreground, self.G_pred_alpha = self.net_G(z_in)


  # Get the ground truth for alpha and foreground and compute pixel loss of both of them 
  def _backward_G(self):

    self.gt_foreground = self.batch['B'].to(device)
    self.gt_alpha = self.batch['ALPHA'].to(device)

    _, _, h, w = self.G_pred_alpha.shape
    self.gt_foreground = torch.nn.functional.interpolate(self.gt_foreground, (h, w), mode='area')
    self.gt_alpha = torch.nn.functional.interpolate(self.gt_alpha, (h, w), mode='area')

    pixel_loss1 = self._pxl_loss(self.G_pred_foreground, self.gt_foreground)
    pixel_loss2 = self._pxl_loss(self.G_pred_alpha, self.gt_alpha)
    self.G_loss = 100 * (pixel_loss1 + pixel_loss2) / 2.0
    self.G_loss.backward()



  """ Checkpoint methods """

  def _load_checkpoint(self):

    if os.path.exists(os.path.join(self.checkpoint_dir, 'last_ckpt.pt')):
        print('loading last checkpoint...')
        # load the entire checkpoint
        checkpoint = torch.load(os.path.join(self.checkpoint_dir, 'last_ckpt.pt'), map_location=device)

        # update net_G states
        self.net_G.load_state_dict(checkpoint['model_G_state_dict'])
        self.optimizer_G.load_state_dict(checkpoint['optimizer_G_state_dict'])
        self.lr_scheduler_G.load_state_dict(
            checkpoint['exp_lr_scheduler_G_state_dict'])
        self.net_G.to(device)

        # update some other states
        self.epoch_to_start = checkpoint['epoch_id'] + 1
        self.best_val_acc = checkpoint['best_val_acc']
        self.best_epoch_id = checkpoint['best_epoch_id']

        print('Epoch_to_start = %d, Historical_best_acc = %.4f (at epoch %d)' %
              (self.epoch_to_start, self.best_val_acc, self.best_epoch_id))
        print()

    else:
        print('training from scratch...')


  def _save_checkpoint(self, ckpt_name):
    torch.save({
      'epoch_id': self.epoch_id,
      'best_val_acc': self.best_val_acc,
      'best_epoch_id': self.best_epoch_id,
      'model_G_state_dict': self.net_G.state_dict(),
      'optimizer_G_state_dict': self.optimizer_G.state_dict(),
      'exp_lr_scheduler_G_state_dict': self.lr_scheduler_G.state_dict()
    }, os.path.join(self.checkpoint_dir, ckpt_name))

  
  def _update_checkpoints(self):

    # save current model
    self._save_checkpoint(ckpt_name='last_ckpt.pt')
    print('Lastest model updated. Epoch_acc=%.4f, Historical_best_acc=%.4f (at epoch %d)'
          % (self.epoch_acc, self.best_val_acc, self.best_epoch_id))
    print()

    self.VAL_ACC = np.append(self.VAL_ACC, [self.epoch_acc])
    np.save(os.path.join(self.checkpoint_dir, 'val_acc.npy'), self.VAL_ACC)

    # update the best model (based on eval acc)
    if self.epoch_acc > self.best_val_acc:
        self.best_val_acc = self.epoch_acc
        self.best_epoch_id = self.epoch_id
        self._save_checkpoint(ckpt_name='best_ckpt.pt')
        print('*' * 10 + 'Best model updated!')
        print()



  """ Batch analysis section """

  def _collect_running_batch_states(self):
    self.running_acc.append(self._compute_acc().item())

    m = len(self.dataloaders['train'])
    if self.is_training is False:
        m = len(self.dataloaders['val'])

    # Every 100 batches print the state of training
    if np.mod(self.batch_id, 100) == 1:
        print('Is_training: %s. [%d,%d][%d,%d], G_loss: %.5f, running_acc: %.5f'
              % (self.is_training, self.epoch_id, self.max_num_epochs-1, self.batch_id, m,
                 self.G_loss.item(), np.mean(self.running_acc)))

    # Every 1000 batches save a picture of the brushstroke in the visualisation directory
    if np.mod(self.batch_id, 1000) == 1:
        vis_pred_foreground = utils.make_numpy_grid(self.G_pred_foreground)
        vis_gt_foreground = utils.make_numpy_grid(self.gt_foreground)
        vis_pred_alpha = utils.make_numpy_grid(self.G_pred_alpha)
        vis_gt_alpha = utils.make_numpy_grid(self.gt_alpha)

        vis = np.concatenate([vis_pred_foreground, vis_gt_foreground,
                              vis_pred_alpha, vis_gt_alpha], axis=0)
        vis = np.clip(vis, a_min=0.0, a_max=1.0)
        file_name = os.path.join(
            self.vis_dir, 'istrain_'+str(self.is_training)+'_'+
                          str(self.epoch_id)+'_'+str(self.batch_id)+'.jpg')
        plt.imsave(file_name, vis)


  def _collect_epoch_states(self):
    # Get accuracy of epoch as mean of the various batches accuracies
    self.epoch_acc = np.mean(self.running_acc)
    print('Is_training: %s. Epoch %d / %d, epoch_acc= %.5f' %
          (self.is_training, self.epoch_id, self.max_num_epochs-1, self.epoch_acc))
    print()

  """ Training section """
  def train_models(self):

    self._load_checkpoint()

    # loop over the epochs
    for self.epoch_id in range(self.epoch_to_start, self.max_num_epochs):

        self._clear_cache() # Reset the accuracy for the epoch
        self.is_training = True
        self.net_G.train()  # Set model to training mode
        # Iterate over data
        for self.batch_id, batch in enumerate(self.dataloaders['train'], 0):
            # Forward pass (predict foreground and alpha from the stroke parameters), zero the gradients, backward pass (pixel loss against the ground truth, then backpropagation), and optimizer step
            self._forward_pass(batch)
            self.optimizer_G.zero_grad()
            self._backward_G()
            self.optimizer_G.step()
        
            self._collect_running_batch_states() # Get batch accuracy and print state
        self._collect_epoch_states() # Get epoch accuracy and print the state
        self._update_lr_schedulers() 
        self._update_checkpoints()
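
A note on the "accuracy" tracked above: `_compute_acc` returns an average peak signal-to-noise ratio (PSNR) in decibels, not a fraction between 0 and 1, so higher values mean predictions closer to the ground truth. Here is a minimal NumPy sketch, assuming `utils.cpt_batch_psnr` follows the usual PSNR definition; the function name `psnr` and the test images are hypothetical.

```python
import numpy as np


def psnr(pred, target, pixel_max=1.0):
    """Peak signal-to-noise ratio in dB between two arrays with values in [0, pixel_max]."""
    mse = np.mean((pred - target) ** 2)
    return 20.0 * np.log10(pixel_max / np.sqrt(mse))


target = np.zeros((8, 8))
coarse = np.full((8, 8), 0.1)   # every pixel is off by 0.1
fine = np.full((8, 8), 0.01)    # every pixel is off by 0.01

print(psnr(coarse, target))     # about 20 dB
print(psnr(fine, target))       # about 40 dB
```

Each tenfold reduction of the per-pixel error adds 20 dB, which helps when reading the running_acc values printed during training.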

    

In [None]:
parser = argparse.ArgumentParser(description='STYLIZED NEURAL PAINTING EXPERIMENT')
parser.add_argument('-f')
parser.add_argument('--renderer', type=str, default='oilpaintbrush', metavar='str',
                    help='renderer: [watercolor, markerpen, oilpaintbrush, rectangle, '
                         'bezier, circle, square] (default ...)')
parser.add_argument('--batch_size', type=int, default=64, metavar='N',
                    help='input batch size for training (default: 64)')
parser.add_argument('--print_models', action='store_true', default=True,
                    help='visualize and print networks')
parser.add_argument('--net_G', type=str, default='zou-fusion-net', metavar='str',
                    help='net_G: plain-dcgan or plain-unet or huang-net, '
                         'zou-fusion-net, or zou-fusion-net-light')
parser.add_argument('--checkpoint_dir', type=str, default=r'./checkpoints_G', metavar='str',
                    help='dir to save checkpoints (default: ...)')
parser.add_argument('--vis_dir', type=str, default=r'./val_out_G', metavar='str',
                    help='dir to save results during training (default: ./val_out_G)')
parser.add_argument('--lr', type=float, default=2e-4,
                    help='learning rate (default: 0.0002)')
parser.add_argument('--max_num_epochs', type=int, default=400, metavar='N',
                    help='max number of training epochs (default 400)')
args = parser.parse_args()

In [None]:
# set parameters here
args.renderer = 'oilpaintbrush'
args.batch_size = 64
args.print_models = True
args.net_G = 'zou-fusion-net'
args.checkpoint_dir = './checkpoints_G' 
args.vis_dir = './val_out_G'
args.max_num_epochs = 100 
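
One consequence of these settings is worth checking: the trainer's StepLR uses step_size=100 and is stepped once per epoch, so with max_num_epochs = 100 every epoch of this run should use the initial learning rate. The sketch below replays that schedule on a dummy parameter (the names `param`, `optimizer`, `scheduler` and `lrs` are made up for the example).

```python
import torch
import torch.optim as optim

param = torch.nn.Parameter(torch.zeros(1))   # dummy parameter, only needed to build the optimizer
optimizer = optim.Adam([param], lr=2e-4, betas=(0.9, 0.999))
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=100, gamma=0.1)

lrs = []
for epoch in range(201):
    lrs.append(optimizer.param_groups[0]['lr'])  # rate in effect during this epoch
    optimizer.step()    # stands in for a full epoch of training
    scheduler.step()    # stepped once per epoch, as in train_models()

print(lrs[0], lrs[99], lrs[100], lrs[200])
```

Epochs 0 to 99 all run at 2e-4 and the first decay only takes effect at epoch 100, so a longer run (or a smaller step_size) would be needed to see the schedule do anything.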

In [None]:
dataloaders = utils.get_renderer_loaders(args)
neurend = PainterTrainer(args=args, dataloaders=dataloaders)

""" uncomment to get a visualisation of the generated ground truth
# check if the data is loading correctly
for i in range(10):
    data = next(iter(dataloaders['train']))
    vis_A = data['A']
    vis_B = utils.make_numpy_grid(data['B'])
    print(data['A'].cpu().numpy().shape[1])
    print(data['B'].shape)
    plt.imshow(vis_B)
    plt.show()
"""

neurend.train_models()

initialize network with normal
training from scratch...

Is_training: True. [0,99][1,782], G_loss: 7.07903, running_acc: 12.54427
Is_training: True. [0,99][101,782], G_loss: 5.11373, running_acc: 13.42278
Is_training: True. [0,99][201,782], G_loss: 3.36970, running_acc: 14.10678
Is_training: True. [0,99][301,782], G_loss: 2.35786, running_acc: 14.88400
Is_training: True. [0,99][401,782], G_loss: 2.22355, running_acc: 15.44160
Is_training: True. [0,99][501,782], G_loss: 1.87427, running_acc: 15.88997
Is_training: True. [0,99][601,782], G_loss: 1.98127, running_acc: 16.19803
Is_training: True. [0,99][701,782], G_loss: 1.90482, running_acc: 16.45129
Is_training: True. Epoch 0 / 99, epoch_acc= 16.60447

Lastest model updated. Epoch_acc=16.6045, Historical_best_acc=0.0000 (at epoch 0)

**********Best model updated!

Is_training: True. [1,99][1,782], G_loss: 1.89106, running_acc: 18.02141
Is_training: True. [1,99][101,782], G_loss: 1.91276, running_acc: 18.17191
Is_training: True. [1,99][201,782], G_loss: 1.84537, running_acc: 18.21649
Is_training

KeyboardInterrupt: ignored