

---


Experiments with different variants of GANs built from residual blocks. The notebook includes helper methods for building discriminators and generators composed almost entirely of residual blocks, which are also tested in combination with hybrid architectures.


---



In [None]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline

import os
import sys
from fastai.vision import *
from fastai.vision.gan import *
import torch.nn as nn
import torch.nn.functional as F
from PIL import Image

Set the following option to True if the notebook is running independently, i.e. it isn't located inside a clone of the git repo it belongs to (where the required Python modules are available).

In [None]:
run_as_standalone_nb = False

In [None]:
# This cell needs to be executed before importing local project modules, like import core.gan
if run_as_standalone_nb:
    root_lib_path = os.path.abspath('generative-lab')
    if not os.path.exists(root_lib_path):
        !git clone https://github.com/davidleonfdez/generative-lab.git
    if root_lib_path not in sys.path:
        sys.path.insert(0, root_lib_path)
else:
    import local_lib_import

In [None]:
# Local project modules. Must be imported after local_lib_import or cloning git repo.
from core.gan import CustomGANLearner, GANLossArgs, gan_loss_from_func_std, load_gan_learner, save_gan_learner
from core.layers import AvgFlatten, MergeResampleLayer, res_block_std, res_downsample_block, res_upsample_block, res_resample_block
from core.nb_utils import mount_gdrive
from core.net_builders import (deep_res_critic, deep_res_generator, pseudo_res_critic, 
                               pseudo_res_generator, simple_res_critic, simple_res_generator)

`models_root` is used as the base path to save models. The next cell assumes the notebook is being executed from Google Colab and that you have an "ML" dir in Google Drive. Alternatively, you could set it manually to something like './' to save and load models to/from the current directory.

In [None]:
# Optional, allows saving parameters in gdrive
root_gdrive = mount_gdrive()
models_root = root_gdrive + 'ML/'

In [None]:
img_size = 64
img_n_channels = 3
batch_size = 128
use_cuda = torch.cuda.is_available()

# DATA

In [None]:
ds_url = "http://vis-www.cs.umass.edu/lfw/lfw"

In [None]:
realImagesPath = untar_data(ds_url)
realImagesPath

In [None]:
sampleImg1Path = realImagesPath/'Aaron_Eckhart/Aaron_Eckhart_0001.jpg'

In [None]:
im = Image.open(sampleImg1Path)
im.size

In [None]:
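# Import under an alias so that PIL's Image (used in the previous cell) is not shadowed
from IPython.display import Image as IPyImage
IPyImage(filename=str(sampleImg1Path))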

In [None]:
def get_data(path, bs, size):
    return (GANItemList.from_folder(path, noise_sz=100)
               .split_none()
               .label_from_func(noop)
               .transform(tfms=[[crop_pad(size=size, row_pct=(0,1), col_pct=(0,1))], []], size=size, tfm_y=True)
               .databunch(bs=bs)
               .normalize(stats = [torch.tensor([0.5,0.5,0.5]), torch.tensor([0.5,0.5,0.5])], do_x=False, do_y=True))
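The `normalize` call above uses a mean and standard deviation of 0.5 for every channel, applied to the targets only (`do_y=True`), since in a GAN the real images are the targets. A minimal sketch of what those stats do to a pixel value in [0, 1]; the helper names are hypothetical and this is plain arithmetic, not the fastai implementation:

```python
def normalize(x, mean=0.5, std=0.5):
    """Map a pixel value from [0, 1] to [-1, 1] with the stats used above."""
    return (x - mean) / std

def denormalize(y, mean=0.5, std=0.5):
    """Inverse mapping, from [-1, 1] back to [0, 1]."""
    return y * std + mean

for pixel in (0.0, 0.25, 0.5, 1.0):
    print(pixel, "->", normalize(pixel))
```

A [-1, 1] range is the usual choice when the generator ends in a tanh activation; whether the builders in this notebook do so is an assumption here.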

In [None]:
data = get_data(realImagesPath, batch_size, img_size)
data.show_batch()

# GENERATORS

Generator with residual blocks after each upsampling block. It's similar to a DCGAN generator with residual blocks in between.

Input is bs x noise_sz x 1 x 1.<br>
**n_features** is the number of feature maps (i.e. kernels) produced by the penultimate layer when n_extra_layers = 0 (the last layer, of course, outputs n_channels). The first layer produces n_features * 2^(n_intermediate_convtrans_blocks) feature maps, and each subsequent upsampling layer halves that number.

```
pseudo_res_generator(in_size:int, n_channels:int, noise_sz:int=100,
                     n_features:int=64, n_extra_layers:int=0, dense:bool=False, 
                     **conv_kwargs) -> nn.Module
```
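The `n_features` rule can be checked with a few lines of arithmetic. The sketch below assumes the generator starts from a 4x4 feature map and doubles the spatial size in every intermediate block until it reaches half the target size; `generator_feature_schedule` and `start_size` are hypothetical names used for illustration, not part of the project API:

```python
def generator_feature_schedule(in_size, n_features=64, start_size=4):
    """List (spatial_size, n_feature_maps) for every layer before the output layer."""
    # Count the intermediate upsampling blocks needed to reach in_size // 2
    n_up_blocks, size = 0, start_size
    while size < in_size // 2:
        size *= 2
        n_up_blocks += 1
    # The first layer starts at n_features * 2^n_up_blocks feature maps
    ftrs = n_features * 2 ** n_up_blocks
    schedule = [(start_size, ftrs)]
    size = start_size
    for _ in range(n_up_blocks):
        size *= 2      # each block doubles the spatial size...
        ftrs //= 2     # ...and halves the number of feature maps
        schedule.append((size, ftrs))
    return schedule

print(generator_feature_schedule(64))
```

For 64x64 images and `n_features=64` this gives 512 feature maps at 4x4 and ends with 64 feature maps at 32x32, just before the output layer.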

Test pseudo_res_generator():

In [None]:
generator = pseudo_res_generator(img_size, img_n_channels)
generator(torch.rand(2, 100, 1, 1)).size() == torch.empty(2, img_n_channels, img_size, img_size).size()

Generator with only residual blocks, acting as upsampling blocks, in the intermediate layers of the network. Of course, it means we are not dealing with purely residual blocks, as the input needs to be upsampled too in the shortcut path to allow the addition to the output of the convolutions; that's done with a stride 2 1x1 transpose convolution.

With the default parameters, any block has two convolutions; the first one performs the upsampling. This order can be reversed passing **upsample_first**=False.

**n_features** is the number of feature maps (i.e. kernels) produced by the penultimate layer (the last layer outputs n_channels). The number of feature maps is halved by each upsampling block on the way there.

```
simple_res_generator(in_size:int, n_channels:int, noise_sz:int=100,
                     n_features:int=64, n_extra_layers:int=0,
                    n_extra_convs_by_block:int=1, upsample_first:bool=True, 
                    **conv_kwargs) -> nn.Module
```
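For the addition inside an upsampling residual block to work, the main path and the shortcut must produce the same spatial size. The sketch below applies the standard output-size formula for a transposed convolution (dilation 1) to both paths. The 4x4, stride 2, padding 1 setting for the main path is the one used by the block definitions later in this notebook; the use of `output_padding=1` on the 1x1 shortcut is an assumption, since how `MergeResampleLayer` obtains the matching size is not shown here:

```python
def conv_transpose_out(size, kernel, stride, padding=0, output_padding=0):
    """Output size of a transposed convolution with dilation 1."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding

size = 8
main_path = conv_transpose_out(size, kernel=4, stride=2, padding=1)
shortcut_plain = conv_transpose_out(size, kernel=1, stride=2)
shortcut_padded = conv_transpose_out(size, kernel=1, stride=2, output_padding=1)

print(main_path, shortcut_plain, shortcut_padded)
```

A plain 1x1 stride 2 transposed convolution falls one pixel short of doubling the size, so some form of output padding (or an equivalent adjustment) is needed for the shortcut to match the main path.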

In [None]:
# Test simple_res_generator
simple_generator = simple_res_generator(img_size, img_n_channels)
simple_generator(torch.rand(2, 100, 1, 1)).size() == torch.empty(2, img_n_channels, img_size, img_size).size()

Generator with only residual blocks:

* Standard residual blocks composed of two 3x3 convolutions.
* Upsampling residual blocks, parametrized by:
  * **`n_extra_convs_by_upblock`**: number of 3x3 convolutions (padding 1, don't alter size) performed inside any upsampling block.
  * **`upsample_first_in_block`**: indicates if the transpose convolution must come first (True) or last (False) inside any upsampling block.

The number of standard residual blocks between a pair of upsampling residual blocks is defined by **`n_blocks_between_upblocks`**.

```
deep_res_generator(in_size:int, n_channels:int, noise_sz:int=100,
                   n_features:int=64, n_extra_blocks_begin:int=0, 
                   n_extra_blocks_end:int=0, n_blocks_between_upblocks:int=0, 
                   n_extra_convs_by_upblock:int=1,
                   upsample_first_in_block:bool=True, dense:bool=False,
                   use_final_activ_res_blocks:bool=False,
                   use_final_bn:bool=False, use_shortcut_activ:bool=False, 
                   use_shortcut_bn:bool=True,
                   norm_type_inner:Optional[NormType]=NormType.Batch, 
                   **conv_kwargs) -> nn.Module
```
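One possible reading of how the block-count parameters combine is sketched below. The function and the `n_up_blocks` argument are hypothetical, and the exact placement of the extra blocks at the beginning and end is an assumption rather than something confirmed by the builder's source:

```python
def deep_res_layout(n_up_blocks, n_extra_blocks_begin=0, n_extra_blocks_end=0,
                    n_blocks_between_upblocks=0):
    """Return the assumed order of standard ('res') and upsampling ('up') blocks."""
    layout = ["res"] * n_extra_blocks_begin
    for i in range(n_up_blocks):
        layout.append("up")
        # Standard blocks only go between a pair of upsampling blocks
        if i < n_up_blocks - 1:
            layout += ["res"] * n_blocks_between_upblocks
    layout += ["res"] * n_extra_blocks_end
    return layout

print(deep_res_layout(3, n_extra_blocks_begin=1, n_extra_blocks_end=1,
                      n_blocks_between_upblocks=2))
```

With all the defaults left at 0, the layout reduces to a plain chain of upsampling residual blocks.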

# CRITIC

Critic with residual blocks after each downsampling block. It's similar to a DCGAN discriminator with residual blocks in between.

**n_features** is the number of feature maps (so kernels) generated after first layer (from the n_channels of the input). This number will be doubled in any subsequent layer.

```
pseudo_res_critic(in_size:int, n_channels:int, n_features:int=64,
                  n_extra_layers:int=0, dense:bool=False,
                  conv_before_res:bool=True, **conv_kwargs) -> nn.Module
```

---
Critic with only residual blocks, acting as downsampling blocks, in the intermediate layers of the network. These are not purely residual blocks, because the input must also be downsampled in the shortcut path before it can be added to the output of the convolutions; that's done with a stride 2 1x1 conv. Unfortunately, according to some sources (e.g. https://papers.nips.cc/paper/7356-fishnet-a-versatile-backbone-for-image-region-and-pixel-level-prediction.pdf), this technique makes gradient propagation harder.

With the default parameters, any block has two convolutions; the first one performs the downsampling. This order can be reversed passing **downsample_first**=False.

**n_features** is the number of feature maps (so kernels) generated after first layer (from the n_channels of the input). This number will be doubled in any subsequent layer.

```
simple_res_critic(in_size:int, n_channels:int, n_features:int=64, 
                  n_extra_layers:int=0, n_extra_convs_by_block:int=1, 
                  downsample_first:bool=True, **conv_kwargs) -> nn.Module
```
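The way sizes and feature maps evolve through this critic can be traced with the standard convolution output-size formula. The sketch assumes the layout used by the notebook's own redefinition of `simple_res_critic` further down: a 4x4 stride 2 first conv, downsampling blocks until the map is 4x4, and a final 4x4 conv without padding. `conv_out` and `critic_shape_trace` are hypothetical helper names:

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output size of a regular convolution with dilation 1."""
    return (size + 2 * padding - kernel) // stride + 1

def critic_shape_trace(in_size=64, n_features=64):
    """List (layer, spatial_size, n_feature_maps) through the critic."""
    size, ftrs = conv_out(in_size, 4, 2, 1), n_features
    trace = [("first conv", size, ftrs)]
    while size > 4:
        main = conv_out(size, 4, 2, 1)   # 4x4, stride 2, padding 1
        shortcut = conv_out(size, 1, 2)  # 1x1, stride 2 shortcut
        assert main == shortcut          # sizes must match for the addition
        size, ftrs = main, ftrs * 2
        trace.append(("res downsample block", size, ftrs))
    trace.append(("final 4x4 conv", conv_out(size, 4), 1))
    return trace

for row in critic_shape_trace():
    print(row)
```

For even input sizes the 1x1 stride 2 shortcut halves the map exactly like the main path, so no extra padding is needed on the downsampling side.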

In [None]:
critic = simple_res_critic(img_size, img_n_channels)

Test the critic:

In [None]:
critic(torch.rand(2, 3, 64, 64)).size()

---
Critic with only residual blocks:

* Standard residual blocks composed of two 3x3 convolutions.
* Downsampling residual blocks, parametrized by:
  * **`n_extra_convs_by_downblock`**: number of 3x3 convolutions (padding 1, don't alter size) performed inside any downsampling block.
  * **`downsample_first_in_block`**: indicates if the strided (downsampling) convolution must come first (True) or last (False) inside any downsampling block.

The number of standard residual blocks between a pair of downsampling residual blocks is defined by **`n_blocks_between_downblocks`**.

```
deep_res_critic(in_size:int, n_channels:int, n_features:int=64, 
                n_extra_blocks_begin:int=0, n_extra_blocks_end:int=0, 
                n_blocks_between_downblocks:int=0, 
                n_extra_convs_by_downblock:int=1, 
                downsample_first_in_block:bool=True, dense:bool=False, 
                use_final_activ_res_blocks:bool=False, use_final_bn:bool=False, 
                use_shortcut_activ:bool=False, use_shortcut_bn:bool=True, 
                norm_type_inner:Optional[NormType]=NormType.Batch, 
                **conv_kwargs) -> nn.Module
```

# GAN LEARNER

In [None]:
def gen_loss_func(*args): return 0
crit_loss_func = nn.BCEWithLogitsLoss()

losses = gan_loss_from_func_std(gen_loss_func, crit_loss_func)

learner = CustomGANLearner(data, generator, critic, GANLossArgs(*losses))

# TRAINING

* The parameters of a trained model can be saved with `save_gan_learner`.
* A training run can be resumed (using weights saved during a previous session) with `load_gan_learner`. For example:
        load_gan_learner(learner, models_root + 'resGANStrictTr1_60ep.pth')
    This must be executed after instantiating the learner and BEFORE running `learner.fit()`.

* An alternative for launching a long training run is the method `save_checkpoint_gan`, which automatically saves the weights every `n_epochs_save_split` epochs.
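The idea behind periodic checkpointing can be sketched as splitting a long run into chunks and saving after each one. This is only an illustration of the pattern: the actual signature and behavior of `save_checkpoint_gan` are not shown in this notebook, and `split_epochs`, `fit_with_checkpoints`, `fit` and `save` are hypothetical names:

```python
def split_epochs(n_epochs, n_epochs_save_split):
    """Split a run of n_epochs into chunks of at most n_epochs_save_split."""
    full, rest = divmod(n_epochs, n_epochs_save_split)
    return [n_epochs_save_split] * full + ([rest] if rest else [])

def fit_with_checkpoints(fit, save, n_epochs, n_epochs_save_split):
    """Train chunk by chunk, saving after each chunk with the epoch count so far."""
    done = 0
    for chunk in split_epochs(n_epochs, n_epochs_save_split):
        fit(chunk)
        done += chunk
        save(done)

# Stand-ins for learner.fit and save_gan_learner, just recording the calls
calls = []
fit_with_checkpoints(lambda n: calls.append(("fit", n)),
                     lambda ep: calls.append(("save", ep)),
                     n_epochs=70, n_epochs_save_split=30)
print(calls)
```

The manual pattern used in the runs below (fit 30 epochs, save, fit 30 more, save) corresponds to a split of 30 over 60 epochs.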

## TR 1: Simple_res_critic, pseudo_res_generator

### TR 1A: *Simple_res_critic, pseudo_res_generator*, Downsample_first=True, n_extra_layers=(critic 1, generator 0), n_extra_convs_by_block=1, Adam(betas=(0,0.99)), wd=0, lr=2e-4

In [None]:
lr = 2e-4
data = get_data(realImagesPath, 128, img_size)
generator = pseudo_res_generator(img_size, img_n_channels)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
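All the runs in this notebook use Adam with `betas=(0, 0.99)`. Setting the first beta to 0 disables momentum: the first-moment estimate is just the current gradient, while the second beta still controls the running average of squared gradients. A minimal scalar sketch of the standard Adam update (no weight decay; `adam_step` is a hypothetical helper, not the PyTorch implementation):

```python
import math

def adam_step(param, grad, state, lr=2e-4, betas=(0.0, 0.99), eps=1e-8):
    """One Adam update for a single scalar parameter."""
    b1, b2 = betas
    state["t"] += 1
    state["m"] = b1 * state["m"] + (1 - b1) * grad       # first moment
    state["v"] = b2 * state["v"] + (1 - b2) * grad ** 2  # second moment
    m_hat = state["m"] / (1 - b1 ** state["t"])          # bias correction
    v_hat = state["v"] / (1 - b2 ** state["t"])
    return param - lr * m_hat / (math.sqrt(v_hat) + eps)

state = {"t": 0, "m": 0.0, "v": 0.0}
p1 = adam_step(1.0, 2.0, state)   # first step moves by roughly lr
p2 = adam_step(p1, -1.0, state)   # with beta1 = 0, m is just the last gradient
print(p1, p2, state["m"])
```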

In [None]:
learner.fit(30, lr)

In [None]:
save_gan_learner(learner, models_root + 'resGANStrictTr1_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr1_60ep.pth')

### TR 1B (1 extra layer in generator too): *Simple_res_critic, pseudo_res_generator,* Downsample_first=True, n_extra_layers=1, n_extra_convs_by_block=1, Adam(betas=(0,0.99)), wd=0, lr=2e-4

In [None]:
lr = 2e-4
data = get_data(realImagesPath, 128, img_size)
generator = pseudo_res_generator(img_size, img_n_channels, n_extra_layers=1)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr1b_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr1b_60ep.pth')

### TR 1C (bigger lr): Simple_res_critic, pseudo_res_generator, Downsample_first=True, n_extra_layers=1, n_extra_convs_by_block=1, Adam(betas=(0,0.99)), wd=0, lr=5e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = pseudo_res_generator(img_size, img_n_channels, n_extra_layers=1)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr=5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr1c_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr1c_60ep.pth')

In [None]:
learner.show_results(ds_type=DatasetType.Train)

### TR 1D (lr=1e-4): *Simple_res_critic, pseudo_res_generator*, Downsample_first=True, n_extra_layers=1, n_extra_convs_by_block=1, Adam(betas=(0,0.99)), wd=0, lr=1e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = pseudo_res_generator(img_size, img_n_channels, n_extra_layers=1)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr=1e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr1d_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr1d_60ep.pth')

### TR 1E (not downsample first): *Simple_res_critic, pseudo_res_generator*, Downsample_first=False, n_extra_layers=1, n_extra_convs_by_block=1, Adam(betas=(0,0.99)), wd=0, lr=2e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = pseudo_res_generator(img_size, img_n_channels, n_extra_layers=1)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1, downsample_first=False)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr=2e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr1e_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr1e_60ep.pth')

### TR 1F (lr=5e-4): *Simple_res_critic, pseudo_res_generator*, Downsample_first=False, n_extra_layers=1, n_extra_convs_by_block=1, Adam(betas=(0,0.99)), wd=0, lr=5e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = pseudo_res_generator(img_size, img_n_channels, n_extra_layers=1)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1, downsample_first=False)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr=5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr1f_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr1f_60ep.pth')

### TR 1G (0 extra convs by block): *Simple_res_critic, pseudo_res_generator*, Downsample_first=irrelevant, n_extra_layers=1, n_extra_convs_by_block=0, Adam(betas=(0,0.99)), wd=0, lr=5e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = pseudo_res_generator(img_size, img_n_channels, n_extra_layers=1)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=0)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr=5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr1g_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr1g_60ep.pth')

It should converge faster with fewer layers, but that doesn't mean the final image quality will be better. We would expect the best achievable results to be poorer than those of the deeper models.

### TR 1H (2 extra convs by block): *Simple_res_critic, pseudo_res_generator*, Downsample_first=True, n_extra_layers=1, n_extra_convs_by_block=2, Adam(betas=(0,0.99)), wd=0, lr=5e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = pseudo_res_generator(img_size, img_n_channels, n_extra_layers=1)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=2)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr=5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr1h_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr1h_60ep.pth')

### TR 1I (2 extra convs by block, not downsample first): *Simple_res_critic, pseudo_res_generator*, Downsample_first=False, n_extra_layers=1, n_extra_convs_by_block=2, Adam(betas=(0,0.99)), wd=0, lr=5e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = pseudo_res_generator(img_size, img_n_channels, n_extra_layers=1)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=2, downsample_first=False)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr=5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr1i_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr1i_60ep.pth')

## TR 2: Simple_res_critic, simple_res_generator

### TR 2A: *Simple_res_critic, simple_res_generator,* downsample_first=True, upsample_first=True, n_extra_layers=1, n_extra_convs_by_block=1, Adam(betas=(0,0.99)), wd=0, lr=2e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = simple_res_generator(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                          opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 2e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2a_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2a_60ep.pth')

### TR 2B (bigger lr): *Simple_res_critic, simple_res_generator,* downsample_first=True, upsample_first=True, n_extra_layers=1, n_extra_convs_by_block=1, Adam(betas=(0,0.99)), wd=0, lr=5e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = simple_res_generator(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                          opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2b_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2b_60ep.pth')

### TR 2C (not upsample first): *Simple_res_critic, simple_res_generator*, downsample_first=True, upsample_first=False, n_extra_layers=1, n_extra_convs_by_block=1, Adam(betas=(0,0.99)), wd=0, lr=2e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = simple_res_generator(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1, upsample_first=False)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 2e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2c_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2c_60ep.pth')

### TR 2D (not upsample first, bigger lr): *Simple_res_critic, simple_res_generator*, downsample_first=True, upsample_first=False, n_extra_layers=1, n_extra_convs_by_block=1, Adam(betas=(0,0.99)), wd=0, lr=5e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = simple_res_generator(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1, upsample_first=False)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2d_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2d_60ep.pth')

### TR 2E (not downsample first, not upsample first): *Simple_res_critic, simple_res_generator*, downsample_first=False, upsample_first=False, n_extra_layers=1, n_extra_convs_by_block=1, Adam(betas=(0,0.99)), wd=0, lr=5e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = simple_res_generator(img_size, img_n_channels, n_extra_layers=1, 
                                 n_extra_convs_by_block=1, upsample_first=False)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, 
                           n_extra_convs_by_block=1, downsample_first=False)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2e_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2e_60ep.pth')

### TR 2F (not downsample first): *Simple_res_critic, simple_res_generator*, downsample_first=False, upsample_first=True, n_extra_layers=1, n_extra_convs_by_block=1, Adam(betas=(0,0.99)), wd=0, lr=5e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = simple_res_generator(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1, downsample_first=False)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2f_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2f_60ep.pth')

### TR 2G (0 extra convs by block): *Simple_res_critic, simple_res_generator*, downsample_first=True, upsample_first=True, n_extra_layers=1, n_extra_convs_by_block=0, Adam(betas=(0,0.99)), wd=0, lr=5e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = simple_res_generator(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=0)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=0)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2g_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2g_60ep.pth')

### TR 2H (0 extra convs by block in critic): *Simple_res_critic, simple_res_generator*, downsample_first=True, upsample_first=True, n_extra_layers=1, n_extra_convs_by_block=(critic 0, generator 1), Adam(betas=(0,0.99)), wd=0, lr=5e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = simple_res_generator(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=0)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2h_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2h_60ep.pth')

### TR 2I (0 extra convs by block in generator): *Simple_res_critic, simple_res_generator*, downsample_first=True, upsample_first=True, n_extra_layers=1, n_extra_convs_by_block=(critic 1, generator 0), Adam(betas=(0,0.99)), wd=0, lr=5e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = simple_res_generator(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=0)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2i_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2i_60ep.pth')

### TR 2J (repeat 2B after fixing absence of leakyReLU in residual blocks of critic): *Simple_res_critic, simple_res_generator,* downsample_first=True, upsample_first=True, n_extra_layers=1, n_extra_convs_by_block=1, Adam(betas=(0,0.99)), wd=0, lr=5e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = simple_res_generator(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2j_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2j_60ep.pth')

### TR 2K (repeat 2F after fixing absence of leakyReLU in residual blocks of critic): *Simple_res_critic, simple_res_generator*, downsample_first=False, upsample_first=True, n_extra_layers=1, n_extra_convs_by_block=1, Adam(betas=(0,0.99)), wd=0, lr=5e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = simple_res_generator(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1, downsample_first=False)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2k_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2k_60ep.pth')

### TR 2L (repeat 2B after removing ReLU after last conv inside residual block): *Simple_res_critic, simple_res_generator,* downsample_first=True, upsample_first=True, n_extra_layers=1, n_extra_convs_by_block=1, Adam(betas=(0,0.99)), wd=0, lr=5e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = simple_res_generator(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2l_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2l_60ep.pth')

### TR 2M (repeat 2F after removing ReLU after last conv inside residual block): *Simple_res_critic, simple_res_generator*, downsample_first=False, upsample_first=True, n_extra_layers=1, n_extra_convs_by_block=1, Adam(betas=(0,0.99)), wd=0, lr=5e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = simple_res_generator(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1, downsample_first=False)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2m_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2m_60ep.pth')

### TR 2N (repeat 2F after swapping ReLU for LeakyReLU as activation of downsampling residual block): *Simple_res_critic, simple_res_generator*, downsample_first=False, upsample_first=True, n_extra_layers=1, n_extra_convs_by_block=1, Adam(betas=(0,0.99)), wd=0, lr=5e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = simple_res_generator(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1, downsample_first=False)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2n_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2n_60ep.pth')

### Keep references to the original functions before overriding them ad hoc for the next training runs. This can be avoided by using the more flexible builders `deep_res_critic` and `deep_res_generator`

In [None]:
old_res_resample_block = res_resample_block
old_res_downsample_block = res_downsample_block
old_simple_res_critic = simple_res_critic
OldMergeResampleLayer = MergeResampleLayer
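Keeping the old references makes it possible to restore the originals later. Note also that redefining an imported function in the notebook does not change what other imported functions call, because each function looks names up in the module where it was defined. That is why the next runs redefine the builder as well as the blocks. A small self-contained sketch with a hypothetical stand-in module `lib`:

```python
import types

# Hypothetical stand-in for a library module such as core.layers
lib = types.ModuleType("lib")
exec(
    "def block():\n"
    "    return 'original block'\n"
    "def build():\n"
    "    return block()\n",
    lib.__dict__,
)

block, build = lib.block, lib.build  # like `from lib import block, build`
old_block = block                    # keep a handle to the original

def block():                         # notebook-level redefinition
    return 'patched block'

# The imported builder still resolves `block` inside its own module
result_imported_build = build()

def build():                         # so the builder is redefined too
    return block()

result_local_build = build()

block = old_block                    # restore the original when done
result_restored = build()
print(result_imported_build, result_local_build, result_restored, sep=" | ")
```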

### TR 2O (repeat 2F after removing activation from downsampling residual block): *Simple_res_critic, simple_res_generator*, downsample_first=False, upsample_first=True, n_extra_layers=1, n_extra_convs_by_block=1, Adam(betas=(0,0.99)), wd=0, lr=5e-4

In [None]:
def res_resample_block(in_ftrs, out_ftrs, n_extra_convs=1, resample_first:bool=True, 
                       upsample:bool=False, leaky:float=None, **conv_kwargs):
    # The resampling conv keeps its activation only if other convs follow it inside the block
    resample_conv = conv_layer(in_ftrs, out_ftrs, ks=4, stride=2, padding=1, leaky=leaky, 
                               use_activ=resample_first and n_extra_convs > 0, transpose=upsample, **conv_kwargs)
    nf_extra_convs = out_ftrs if resample_first else in_ftrs
    # The extra convs keep their activation unless they come last in the block
    regular_convs = [conv_layer(nf_extra_convs, nf_extra_convs, leaky=leaky, 
                                use_activ=not resample_first or (i < n_extra_convs-1), **conv_kwargs) 
                     for i in range(n_extra_convs)]
    convs = [resample_conv, *regular_convs] if resample_first else [*regular_convs, resample_conv]

    # The last conv has no activation, so its raw output is what gets merged with the shortcut
    return SequentialEx(*convs, MergeResampleLayer(in_ftrs, out_ftrs, 2, upsample=upsample))

def res_downsample_block(in_ftrs, out_ftrs, n_extra_convs=1, downsample_first:bool=True, **conv_kwargs):
    return res_resample_block(in_ftrs, out_ftrs, n_extra_convs, downsample_first, False, leaky=0.2, **conv_kwargs)

def simple_res_critic(in_size:int, n_channels:int, n_features:int=64, n_extra_layers:int=0, 
                      n_extra_convs_by_block=1, downsample_first:bool=True, **conv_kwargs):
    "A resnet-ish critic for images `n_channels` x `in_size` x `in_size`."
    layers = [conv_layer(n_channels, n_features, 4, 2, 1, leaky=0.2, norm_type=None, **conv_kwargs)]
    cur_size, cur_ftrs = in_size//2, n_features
    layers.append(nn.Sequential(*[conv_layer(cur_ftrs, cur_ftrs, 3, 1, leaky=0.2, **conv_kwargs) 
                                  for _ in range(n_extra_layers)]))
    while cur_size > 4:
        layers.append(res_downsample_block(cur_ftrs, cur_ftrs*2, n_extra_convs=n_extra_convs_by_block, 
                                           downsample_first=downsample_first, **conv_kwargs))
        cur_ftrs *= 2; cur_size //= 2

    layers += [conv2d(cur_ftrs, 1, 4, padding=0), AvgFlatten()]
    return nn.Sequential(*layers)
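
The two `use_activ` expressions in `res_resample_block` above decide which convs keep their activation. As a small sketch, the same boolean logic can be evaluated without building any layer (`activation_plan` is a hypothetical helper written only for this illustration):

```python
def activation_plan(n_extra_convs=1, resample_first=True):
    """Return [(layer_name, has_activation), ...] in forward order (2O rules)."""
    resample = ("resample_conv", resample_first and n_extra_convs > 0)
    regular = [
        (f"conv_{i}", (not resample_first) or (i < n_extra_convs - 1))
        for i in range(n_extra_convs)
    ]
    return [resample, *regular] if resample_first else [*regular, resample]


for resample_first in (True, False):
    for n_extra_convs in (0, 1, 2):
        plan = activation_plan(n_extra_convs, resample_first)
        # The layer feeding the addition never carries an activation.
        assert plan[-1][1] is False
        print(resample_first, n_extra_convs, plan)
```

In every configuration the conv right before the merge has no activation, which matches the description of this run in the heading.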

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = simple_res_generator(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, 
                           n_extra_convs_by_block=1, downsample_first=False)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4
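
Every run in this section uses `Adam(betas=(0., 0.99))`. As a rough illustration of what `beta1 = 0` means, here is a plain-Python sketch of the textbook Adam update for a single scalar parameter. It is an assumption-laden toy, not PyTorch's or fastai's implementation, and `adam_step` is a hypothetical name.

```python
import math


def adam_step(param, grad, m, v, t, lr=5e-4, betas=(0.0, 0.99), eps=1e-8):
    """One Adam update for a scalar parameter; returns (param, m, v)."""
    beta1, beta2 = betas
    m = beta1 * m + (1 - beta1) * grad          # first moment (momentum term)
    v = beta2 * v + (1 - beta2) * grad * grad   # second moment
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


param, m, v = 1.0, 0.0, 0.0
for t, grad in enumerate([0.5, -2.0, 0.1], start=1):
    param, m, v = adam_step(param, grad, m, v, t)
    # With beta1 = 0 the first moment is just the current gradient (no momentum).
    assert m == grad
```

With `beta1 = 0` the update direction depends only on the current gradient, while the step size is still scaled by the running second moment.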

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2o_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2o_60ep.pth')
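
Checkpoint names in this notebook are built by concatenating `models_root` with a file name, which only works if `models_root` ends with a path separator. A small sketch of a helper that avoids this dependency is shown below; `checkpoint_path` is hypothetical and not part of the project.

```python
import os


def checkpoint_path(models_root, run_id, n_epochs, prefix="resGANStrict"):
    """Build e.g. <models_root>/resGANStrictTr2o_30ep.pth (hypothetical helper)."""
    return os.path.join(models_root, f"{prefix}Tr{run_id}_{n_epochs}ep.pth")


print(checkpoint_path("./", "2o", 30))
print(checkpoint_path("models", "2o", 60))
```

Assuming `save_gan_learner` accepts a plain path string, as the calls above suggest, the result could be passed as its second argument.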

### TR 2P (repeat 2F after removing activation from downsampling residual block but preserving ReLU after last conv): *Simple_res_critic, simple_res_generator*, downsample_first=False, upsample_first=True, n_extra_layers=1, n_extra_convs_by_block=1, Adam(betas=(0,0.99)), wd=0, lr=5e-4

In [None]:
def res_resample_block(in_ftrs, out_ftrs, n_extra_convs=1, resample_first:bool=True, upsample:bool=False, leaky:float=None, **conv_kwargs):
    resample_conv = conv_layer(in_ftrs, out_ftrs, ks=4, stride=2, padding=1, leaky=leaky, transpose=upsample, **conv_kwargs)
    nf_extra_convs = out_ftrs if resample_first else in_ftrs
    regular_convs = [conv_layer(nf_extra_convs, nf_extra_convs, leaky=leaky, **conv_kwargs) 
                     for i in range(n_extra_convs)]
    convs = [resample_conv, *regular_convs] if resample_first else [*regular_convs, resample_conv]

    return SequentialEx(*convs, MergeResampleLayer(in_ftrs, out_ftrs, 2, upsample=upsample))

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = simple_res_generator(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1, downsample_first=False)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2p_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2p_60ep.pth')

### TR 2Q (repeat 2F after adding ReLU to shortcut path of residual block): *Simple_res_critic, simple_res_generator*, downsample_first=False, upsample_first=True, n_extra_layers=1, n_extra_convs_by_block=1, Adam(betas=(0,0.99)), wd=0, lr=5e-4

In [None]:
class MergeResampleLayer(Module):
    """Merge a shortcut with the result of the module, which is assumed to be resampled with the same stride. 
    
    Uses a 1x1 conv + ReLU + BatchNorm to perform a simple up/downsample of the input before the addition.
    upsample = True => performs upsample; upsample = False => performs downsample."""
    def __init__(self, in_ftrs, out_ftrs, stride=2, upsample:bool=False):
        # We can't use fastai's conv_layer() here because we need to pass output_padding to ConvTranspose2d
        conv_func = nn.ConvTranspose2d if upsample else nn.Conv2d
        conv_kwargs = {"output_padding": 1} if upsample else {}
        init = nn.init.kaiming_normal_
        conv = init_default(conv_func(in_ftrs, out_ftrs, kernel_size=1, stride=stride, bias=False, padding=0, **conv_kwargs), init)
        self.conv1 = nn.Sequential(conv, nn.ReLU(), nn.BatchNorm2d(out_ftrs))

    def forward(self, x):
        identity = self.conv1(x.orig)
        return x + identity

def res_resample_block(in_ftrs, out_ftrs, n_extra_convs=1, resample_first:bool=True, upsample:bool=False, 
                       leaky:float=None, **conv_kwargs):
    resample_conv = conv_layer(in_ftrs, out_ftrs, ks=4, stride=2, padding=1, leaky=leaky, transpose=upsample, **conv_kwargs)
    nf_extra_convs = out_ftrs if resample_first else in_ftrs
    regular_convs = [conv_layer(nf_extra_convs, nf_extra_convs, leaky=leaky, **conv_kwargs) 
                     for i in range(n_extra_convs)]
    convs = [resample_conv, *regular_convs] if resample_first else [*regular_convs, resample_conv]

    return nn.Sequential(SequentialEx(*convs, MergeResampleLayer(in_ftrs, out_ftrs, 2, upsample=upsample)), nn.ReLU())
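
The comment in `MergeResampleLayer` mentions that `output_padding` is needed for the transposed shortcut conv. The arithmetic behind it can be sketched with the standard output-size formulas for `nn.Conv2d` and `nn.ConvTranspose2d` (dilation 1). The helper names and the feature-map size of 16 are assumptions made for this illustration.

```python
def conv_out(size, kernel, stride, padding):
    """Output size of nn.Conv2d along one spatial dimension (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_transpose_out(size, kernel, stride, padding, output_padding=0):
    """Output size of nn.ConvTranspose2d along one spatial dimension (dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


size = 16  # hypothetical feature-map side length

# Main path: ks=4, stride=2, padding=1 halves or doubles the size exactly.
assert conv_out(size, 4, 2, 1) == 8
assert conv_transpose_out(size, 4, 2, 1) == 32

# Shortcut: 1x1 conv, stride=2, padding=0.
assert conv_out(size, 1, 2, 0) == 8
assert conv_transpose_out(size, 1, 2, 0) == 31                     # one pixel short
assert conv_transpose_out(size, 1, 2, 0, output_padding=1) == 32   # matches main path
```

Without `output_padding=1` the upsampling shortcut would produce a 31x31 map, and the addition with the 32x32 main path would fail.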

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = simple_res_generator(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1, downsample_first=False)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2q_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2q_60ep.pth')

### TR 2R (repeat 2F after adding ReLU/LeakyReLU to shortcut path of residual block): *Simple_res_critic, simple_res_generator*, downsample_first=False, upsample_first=True, n_extra_layers=1, n_extra_convs_by_block=1, Adam(betas=(0,0.99)), wd=0, lr=5e-4

In [None]:
class MergeResampleLayer(Module):
    """Merge a shortcut with the result of the module, which is assumed to be resampled with the same stride. 
    
    Uses a 1x1 conv + ReLU/LeakyReLU + BatchNorm to perform a simple up/downsample of the input before the addition.
    upsample = True => performs upsample; upsample = False => performs downsample."""
    def __init__(self, in_ftrs, out_ftrs, stride=2, upsample:bool=False, leaky:float=None):
        # We can't use fastai's conv_layer() here because we need to pass output_padding to ConvTranspose2d
        conv_func = nn.ConvTranspose2d if upsample else nn.Conv2d
        conv_kwargs = {"output_padding": 1} if upsample else {}
        init = nn.init.kaiming_normal_
        conv = init_default(conv_func(in_ftrs, out_ftrs, kernel_size=1, stride=stride, bias=False, padding=0, **conv_kwargs), init)
        activ = nn.ReLU() if leaky is None else nn.LeakyReLU(leaky)
        self.conv1 = nn.Sequential(conv, activ, nn.BatchNorm2d(out_ftrs))

    def forward(self, x):
        identity = self.conv1(x.orig)
        return x + identity

def res_resample_block(in_ftrs, out_ftrs, n_extra_convs=1, resample_first:bool=True, upsample:bool=False, leaky:float=None, **conv_kwargs):
    resample_conv = conv_layer(in_ftrs, out_ftrs, ks=4, stride=2, padding=1, leaky=leaky, transpose=upsample, **conv_kwargs)
    nf_extra_convs = out_ftrs if resample_first else in_ftrs
    regular_convs = [conv_layer(nf_extra_convs, nf_extra_convs, leaky=leaky, **conv_kwargs) 
                     for i in range(n_extra_convs)]
    convs = [resample_conv, *regular_convs] if resample_first else [*regular_convs, resample_conv]
    activ = nn.ReLU() if leaky is None else nn.LeakyReLU(leaky)

    return nn.Sequential(SequentialEx(*convs, 
                                      MergeResampleLayer(in_ftrs, out_ftrs, 2, upsample=upsample, leaky=leaky)), 
                         activ)

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = simple_res_generator(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1, downsample_first=False)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2r_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2r_60ep.pth')

### TR 2S (repeat 2F after adding ReLU/LeakyReLU to shortcut path of residual block and BN to output): *Simple_res_critic, simple_res_generator*, downsample_first=False, upsample_first=True, n_extra_layers=1, n_extra_convs_by_block=1, Adam(betas=(0,0.99)), wd=0, lr=5e-4

In [None]:
class MergeResampleLayer(Module):
    """Merge a shortcut with the result of the module, which is assumed to be resampled with the same stride. 
    
    Uses a 1x1 conv + ReLU/LeakyReLU + BatchNorm to perform a simple up/downsample of the input before the addition.
    upsample = True => performs upsample; upsample = False => performs downsample."""
    def __init__(self, in_ftrs, out_ftrs, stride=2, upsample:bool=False, leaky:float=None):
        # We can't use fastai's conv_layer() here because we need to pass output_padding to ConvTranspose2d
        conv_func = nn.ConvTranspose2d if upsample else nn.Conv2d
        conv_kwargs = {"output_padding": 1} if upsample else {}
        init = nn.init.kaiming_normal_
        conv = init_default(conv_func(in_ftrs, out_ftrs, kernel_size=1, stride=stride, bias=False, padding=0, **conv_kwargs), init)
        activ = nn.ReLU() if leaky is None else nn.LeakyReLU(leaky)
        self.conv1 = nn.Sequential(conv, activ, nn.BatchNorm2d(out_ftrs))

    def forward(self, x):
        identity = self.conv1(x.orig)
        return x + identity

def res_resample_block(in_ftrs, out_ftrs, n_extra_convs=1, resample_first:bool=True, upsample:bool=False, leaky:float=None, **conv_kwargs):
    resample_conv = conv_layer(in_ftrs, out_ftrs, ks=4, stride=2, padding=1, leaky=leaky, transpose=upsample, **conv_kwargs)
    nf_extra_convs = out_ftrs if resample_first else in_ftrs
    regular_convs = [conv_layer(nf_extra_convs, nf_extra_convs, leaky=leaky, **conv_kwargs) 
                     for i in range(n_extra_convs)]
    convs = [resample_conv, *regular_convs] if resample_first else [*regular_convs, resample_conv]
    activ = nn.ReLU() if leaky is None else nn.LeakyReLU(leaky)

    return nn.Sequential(SequentialEx(*convs, MergeResampleLayer(in_ftrs, out_ftrs, 2, upsample=upsample, leaky=leaky)), 
                         activ, nn.BatchNorm2d(out_ftrs))

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = simple_res_generator(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1, downsample_first=False)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2s_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2s_60ep.pth')

### TR 2T (repeat 2F after swapping the order of BN and ReLU): *Simple_res_critic, simple_res_generator*, downsample_first=False, upsample_first=True, n_extra_layers=1, n_extra_convs_by_block=1, Adam(betas=(0,0.99)), wd=0, lr=5e-4

The layer order is now conv + BN + ReLU.<br>
Inside residual blocks, the last conv before the addition omits the ReLU, which is applied after the addition instead.

In [None]:
def res_resample_block(in_ftrs, out_ftrs, n_extra_convs=1, resample_first:bool=True, upsample:bool=False, leaky:float=None, **conv_kwargs):
    resample_conv = [conv_layer(in_ftrs, out_ftrs, ks=4, stride=2, padding=1, leaky=leaky, 
                               transpose=upsample, use_activ=False, **conv_kwargs)]
    if resample_first and n_extra_convs > 0:
        resample_conv.append(nn.ReLU() if leaky is None else nn.LeakyReLU(leaky))
    nf_extra_convs = out_ftrs if resample_first else in_ftrs
    regular_convs = []
    for i in range(n_extra_convs):
        regular_convs.append(conv_layer(nf_extra_convs, nf_extra_convs, leaky=leaky, use_activ=False, **conv_kwargs))
        if not resample_first or (i < n_extra_convs - 1):
            regular_convs.append(nn.ReLU() if leaky is None else nn.LeakyReLU(leaky))
    convs = [*resample_conv, *regular_convs] if resample_first else [*regular_convs, *resample_conv]

    return nn.Sequential(SequentialEx(*convs, 
                                      MergeResampleLayer(in_ftrs, out_ftrs, 2, upsample=upsample)), 
                         nn.ReLU() if leaky is None else nn.LeakyReLU(leaky))

class MergeResampleLayer(Module):
    """Merge a shortcut with the result of the module, which is assumed to be resampled with the same stride. 
    
    Uses a 1x1 conv + BatchNorm to perform a simple up/downsample of the input before the addition.
    upsample = True => performs upsample; upsample = False => performs downsample."""
    def __init__(self, in_ftrs, out_ftrs, stride=2, upsample:bool=False):
        # We can't use fastai's conv_layer() here because we need to pass output_padding to ConvTranspose2d
        conv_func = nn.ConvTranspose2d if upsample else nn.Conv2d
        conv_kwargs = {"output_padding": 1} if upsample else {}
        init = nn.init.kaiming_normal_
        conv = init_default(conv_func(in_ftrs, out_ftrs, kernel_size=1, stride=stride, bias=False, padding=0, **conv_kwargs), init)
        self.conv1 = nn.Sequential(conv, nn.BatchNorm2d(out_ftrs))

    def forward(self, x):
        identity = self.conv1(x.orig)
        return x + identity

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = simple_res_generator(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1, downsample_first=False)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2t_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2t_60ep.pth')

### TR 2U (repeat 2F after removing BN from shortcut): *Simple_res_critic, simple_res_generator*, downsample_first=False, upsample_first=True, n_extra_layers=1, n_extra_convs_by_block=1, Adam(betas=(0,0.99)), wd=0, lr=5e-4

In [None]:
def res_resample_block(in_ftrs, out_ftrs, n_extra_convs=1, resample_first:bool=True, upsample:bool=False, leaky:float=None, **conv_kwargs):
    resample_conv = conv_layer(in_ftrs, out_ftrs, ks=4, stride=2, padding=1, leaky=leaky, transpose=upsample, **conv_kwargs)
    nf_extra_convs = out_ftrs if resample_first else in_ftrs
    regular_convs = [conv_layer(nf_extra_convs, nf_extra_convs, leaky=leaky, **conv_kwargs) 
                     for i in range(n_extra_convs)]
    convs = [resample_conv, *regular_convs] if resample_first else [*regular_convs, resample_conv]

    return nn.Sequential(SequentialEx(*convs, 
                                      MergeResampleLayer(in_ftrs, out_ftrs, 2, upsample=upsample)), 
                         nn.ReLU() if leaky is None else nn.LeakyReLU(leaky))

class MergeResampleLayer(Module):
    """Merge a shortcut with the result of the module, which is assumed to be resampled with the same stride. 
    
    Uses a plain 1x1 conv (no BatchNorm) to perform a simple up/downsample of the input before the addition.
    upsample = True => performs upsample; upsample = False => performs downsample."""
    def __init__(self, in_ftrs, out_ftrs, stride=2, upsample:bool=False):
        # We can't use fastai's conv_layer() here because we need to pass output_padding to ConvTranspose2d
        conv_func = nn.ConvTranspose2d if upsample else nn.Conv2d
        conv_kwargs = {"output_padding": 1} if upsample else {}
        init = nn.init.kaiming_normal_
        conv = init_default(conv_func(in_ftrs, out_ftrs, kernel_size=1, stride=stride, bias=False, padding=0, **conv_kwargs), init)
        self.conv1 = conv

    def forward(self, x):
        identity = self.conv1(x.orig)
        return x + identity

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = simple_res_generator(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1)
critic = simple_res_critic(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1, downsample_first=False)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2u_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr2u_60ep.pth')

### Restore methods modified between 2O-2U

In [None]:
res_resample_block = old_res_resample_block
res_downsample_block = old_res_downsample_block
simple_res_critic = old_simple_res_critic
MergeResampleLayer = OldMergeResampleLayer

## TR 3: Pseudo_res_critic, simple_res_generator

### TR 3A: *Pseudo_res_critic, simple_res_generator,* upsample_first=True, n_extra_layers=1, n_extra_convs_by_block=1, Adam(betas=(0,0.99)), wd=0, lr=5e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = simple_res_generator(img_size, img_n_channels, n_extra_layers=1, n_extra_convs_by_block=1)
critic = pseudo_res_critic(img_size, img_n_channels, n_extra_layers=1)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr3a_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr3a_60ep.pth')

This combination will almost certainly be much worse, so there is no need to go on with this pairing of critic and generator.

## TR 4: deep_res_critic, deep_res_generator

### TR 4A: n_extra_blocks = 1, n_blocks_between_up/downblocks = 1,  n_extra_convs_by_up/downblock=1, downsample_first=True, upsample_first=True, lr=5e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=1, 
                               n_extra_convs_by_upblock=1)
critic = deep_res_critic(img_size, img_n_channels, n_extra_blocks_begin=1,  n_blocks_between_downblocks=1, 
                         n_extra_convs_by_downblock=1)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4a_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4a_60ep.pth')

### TR 4B (not downsample first): n_extra_blocks = 1, n_blocks_between_up/downblocks = 1,  n_extra_convs_by_up/downblock=1, downsample_first=False, upsample_first=True, lr=5e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=1, 
                               n_extra_convs_by_upblock=1)
critic = deep_res_critic(img_size, img_n_channels, n_extra_blocks_begin=1,  n_blocks_between_downblocks=1, 
                         n_extra_convs_by_downblock=1, downsample_first_in_block=False)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4b_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4b_60ep.pth')

#### TR 4B2: without BN in resampling shortcut

This is not expected to help, as BN tends to help deep networks.

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=1, 
                               n_extra_convs_by_upblock=1, use_shortcut_bn=False)
critic = deep_res_critic(img_size, img_n_channels, n_extra_blocks_begin=1,  n_blocks_between_downblocks=1, 
                         n_extra_convs_by_downblock=1, downsample_first_in_block=False, use_shortcut_bn=False)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4b2_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4b2_60ep.pth')

#### TR 4B3: Add BN after output of up/downsampling residual block

Try again, this time with BN after the output of each resampling residual block. It could help, as BN tends to help deep networks.

In [None]:
def deep_res_critic_downblock_bn(in_size:int, n_channels:int, n_features:int=64, n_extra_blocks_begin:int=0, n_extra_blocks_end=0,
                    n_blocks_between_downblocks:int=0, n_extra_convs_by_downblock:int=1, downsample_first_in_block:bool=True,
                    dense:bool=False, use_final_activ_res_blocks:bool=False, use_shortcut_activ:bool=False, 
                    use_shortcut_bn:bool=True, **conv_kwargs):
    "A resnet-ish critic for images `n_channels` x `in_size` x `in_size`."
    leaky = 0.2
    layers = [conv_layer(n_channels, n_features, 4, 2, 1, leaky=leaky, norm_type=None, **conv_kwargs)]
    cur_size, cur_ftrs = in_size//2, n_features
    layers.append(nn.Sequential(*[res_block_std(cur_ftrs, dense=dense, leaky=leaky, use_final_activ=use_final_activ_res_blocks) 
                                  for _ in range(n_extra_blocks_begin)]))
    
    while cur_size > 4:
        layers.append(res_downsample_block(cur_ftrs, cur_ftrs*2, n_extra_convs=n_extra_convs_by_downblock, 
                                           downsample_first=downsample_first_in_block, use_final_bn=True,
                                           use_shortcut_activ=use_shortcut_activ, use_shortcut_bn=use_shortcut_bn,
                                           **conv_kwargs))
        cur_ftrs *= 2; cur_size //= 2
        layers += [res_block_std(cur_ftrs, dense=dense, leaky=leaky, use_final_activ=use_final_activ_res_blocks) 
                   for _ in range(n_blocks_between_downblocks)]
        if (dense): cur_ftrs *= 2
        
    layers += [res_block_std(cur_ftrs, dense=dense, leaky=leaky, use_final_activ=use_final_activ_res_blocks) 
               for _ in range(n_extra_blocks_end)]
    layers += [conv2d(cur_ftrs, 1, 4, padding=0), AvgFlatten()]
    return nn.Sequential(*layers)


def deep_res_generator_upblock_bn(in_size:int, n_channels:int, noise_sz:int=100, n_features:int=64, n_extra_blocks_begin=0, 
                       n_extra_blocks_end=0, n_blocks_between_upblocks=0, n_extra_convs_by_upblock:int=1, 
                       upsample_first_in_block:bool=True, dense:bool=False, use_final_activ_res_blocks:bool=False,
                       use_shortcut_activ:bool=False, use_shortcut_bn:bool=True, **conv_kwargs):
    "A resnetish generator from `noise_sz` to images `n_channels` x `in_size` x `in_size`."
    cur_size, cur_ftrs = 4, n_features//2
    while cur_size < in_size:  cur_size *= 2; cur_ftrs *= 2
    layers = [conv_layer(noise_sz, cur_ftrs, 4, 1, transpose=True, **conv_kwargs)]
    layers += [res_block_std(cur_ftrs, dense=dense, use_final_activ=use_final_activ_res_blocks) 
               for _ in range(n_extra_blocks_begin)]

    cur_size = 4
    while cur_size < in_size // 2:
        layers.append(res_upsample_block(cur_ftrs, cur_ftrs//2, n_extra_convs=n_extra_convs_by_upblock, 
                                         upsample_first=upsample_first_in_block, use_final_bn=True,
                                         use_shortcut_activ=use_shortcut_activ, use_shortcut_bn=use_shortcut_bn,
                                         **conv_kwargs))
        cur_ftrs //= 2; cur_size *= 2
        layers += [res_block_std(cur_ftrs, dense=dense, use_final_activ=use_final_activ_res_blocks) 
                   for _ in range(n_blocks_between_upblocks)]
        if (dense): cur_ftrs *= 2

    layers += [res_block_std(cur_ftrs, dense=dense, use_final_activ=use_final_activ_res_blocks) 
               for _ in range(n_extra_blocks_end)]
    layers += [conv2d_trans(cur_ftrs, n_channels, 4, 2, 1, bias=False), nn.Tanh()]
    return nn.Sequential(*layers)
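
The `while` loops in the two builders above fix how many resampling blocks are created and how many channels each one has. The following sketch reproduces only that bookkeeping, assuming `dense=False` and a power-of-two image size. The function names and the image size of 64 are hypothetical, chosen for illustration.

```python
def critic_schedule(in_size, n_features=64):
    """(size, channels) after the first conv and after each downsampling block."""
    size, ftrs = in_size // 2, n_features
    stages = [(size, ftrs)]
    while size > 4:
        ftrs *= 2
        size //= 2
        stages.append((size, ftrs))
    return stages


def generator_schedule(in_size, n_features=64):
    """(size, channels) after the noise projection and after each upsampling block."""
    size, ftrs = 4, n_features // 2
    while size < in_size:      # find the channel count of the first 4x4 map
        size *= 2
        ftrs *= 2
    size = 4
    stages = [(size, ftrs)]
    while size < in_size // 2:
        ftrs //= 2
        size *= 2
        stages.append((size, ftrs))
    return stages


print(critic_schedule(64))     # [(32, 64), (16, 128), (8, 256), (4, 512)]
print(generator_schedule(64))  # [(4, 512), (8, 256), (16, 128), (32, 64)]
```

Under these assumptions the two schedules mirror each other: the critic ends at a 4x4 map that the final 4x4 conv reduces to one value, and the generator's last transposed conv doubles the 32x32 map to the full image size.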


In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = deep_res_generator_upblock_bn(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=1, 
                               n_extra_convs_by_upblock=1)
critic = deep_res_critic_downblock_bn(img_size, img_n_channels, n_extra_blocks_begin=1,  n_blocks_between_downblocks=1, 
                         n_extra_convs_by_downblock=1, downsample_first_in_block=False)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4b3_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4b3_60ep.pth')

#### TR 4B4: Add ReLU before BN in shortcut path of up/downsampling residual block

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = deep_res_generator_upblock_bn(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=1, 
                               n_extra_convs_by_upblock=1, use_shortcut_activ=True)
critic = deep_res_critic_downblock_bn(img_size, img_n_channels, n_extra_blocks_begin=1,  n_blocks_between_downblocks=1, 
                         n_extra_convs_by_downblock=1, downsample_first_in_block=False, use_shortcut_activ=True)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4b4_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4b4_60ep.pth')

#### TR 4B5: Add ReLU+BN after any residual block

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=1, 
                               n_extra_convs_by_upblock=1, use_final_activ_res_blocks=True, 
                               use_final_bn=True)
critic = deep_res_critic(img_size, img_n_channels, n_extra_blocks_begin=1,  n_blocks_between_downblocks=1, 
                         n_extra_convs_by_downblock=1, downsample_first_in_block=False, use_final_activ_res_blocks=True, 
                         use_final_bn=True)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4b5_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4b5_60ep.pth')

#### TR 4B6: Add ReLU+BN after any residual block, add ReLU before BN in up/downsampling shortcut, so as to have conv+ReLU+BN there too.

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=1, 
                               n_extra_convs_by_upblock=1, use_final_activ_res_blocks=True, 
                               use_final_bn=True, use_shortcut_activ=True)
critic = deep_res_critic(img_size, img_n_channels, n_extra_blocks_begin=1,  n_blocks_between_downblocks=1, 
                         n_extra_convs_by_downblock=1, downsample_first_in_block=False, use_final_activ_res_blocks=True, 
                         use_final_bn=True, use_shortcut_activ=True)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4b6_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4b6_60ep.pth')

#### TR 4B7: Reduce the number of BN layers -> only BN after block output. Use ReLU in resampling shortcut.

Only BN after block output. No BN inside residual blocks.

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=1, 
                               n_extra_convs_by_upblock=1, use_final_activ_res_blocks=True, 
                               use_final_bn=True, use_shortcut_activ=True, use_shortcut_bn=False, norm_type_inner=None)
critic = deep_res_critic(img_size, img_n_channels, n_extra_blocks_begin=1,  n_blocks_between_downblocks=1, 
                         n_extra_convs_by_downblock=1, downsample_first_in_block=False, use_final_activ_res_blocks=True, 
                         use_final_bn=True, use_shortcut_activ=True, use_shortcut_bn=False, norm_type_inner=None)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4b7_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4b7_60ep.pth')

#### TR 4B8: Reduce the number of BN layers -> only BN after block output. No ReLU in resampling shortcut.

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=1, 
                               n_extra_convs_by_upblock=1, use_final_activ_res_blocks=True, 
                               use_final_bn=True, use_shortcut_activ=False, use_shortcut_bn=False, norm_type_inner=None)
critic = deep_res_critic(img_size, img_n_channels, n_extra_blocks_begin=1,  n_blocks_between_downblocks=1, 
                         n_extra_convs_by_downblock=1, downsample_first_in_block=False, use_final_activ_res_blocks=True, 
                         use_final_bn=True, use_shortcut_activ=False, use_shortcut_bn=False, norm_type_inner=None)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4b8_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4b8_60ep.pth')

#### TR 4B9: No BN

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=1, 
                               n_extra_convs_by_upblock=1, use_final_activ_res_blocks=True, 
                               use_final_bn=False, use_shortcut_activ=False, use_shortcut_bn=False, norm_type_inner=None)
critic = deep_res_critic(img_size, img_n_channels, n_extra_blocks_begin=1, n_blocks_between_downblocks=1, 
                         n_extra_convs_by_downblock=1, downsample_first_in_block=False, use_final_activ_res_blocks=True, 
                         use_final_bn=False, use_shortcut_activ=False, use_shortcut_bn=False, norm_type_inner=None)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4b9_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4b9_60ep.pth')

### TR 4C (deeper critic and generator): n_extra_blocks = 1, n_blocks_between_up/downblocks = 2,  n_extra_convs_by_up/downblock=1, downsample_first=True, upsample_first=True, lr=5e-4

Given the results so far, deeper models like this one are not expected to work well, but let's train for a few epochs anyway.

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=2, 
                               n_extra_convs_by_upblock=1)
critic = deep_res_critic(img_size, img_n_channels, n_extra_blocks_begin=1,  n_blocks_between_downblocks=2, 
                         n_extra_convs_by_downblock=1)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4c_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4c_60ep.pth')

#### TR 4C2: Lower lr

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=2, 
                               n_extra_convs_by_upblock=1)
critic = deep_res_critic(img_size, img_n_channels, n_extra_blocks_begin=1,  n_blocks_between_downblocks=2, 
                         n_extra_convs_by_downblock=1)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 1e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4c2_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4c2_60ep.pth')

#### TR 4C3: Higher lr

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=2, 
                               n_extra_convs_by_upblock=1)
critic = deep_res_critic(img_size, img_n_channels, n_extra_blocks_begin=1,  n_blocks_between_downblocks=2, 
                         n_extra_convs_by_downblock=1)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 2e-3

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4c3_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4c3_60ep.pth')

#### TR 4C4: Without BN in resampling shortcut

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=2, 
                               n_extra_convs_by_upblock=1, use_shortcut_bn=False)
critic = deep_res_critic(img_size, img_n_channels, n_extra_blocks_begin=1,  n_blocks_between_downblocks=2, 
                         n_extra_convs_by_downblock=1, use_shortcut_bn=False)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4c4_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4c4_60ep.pth')

### TR 4D (deeper generator only): n_extra_blocks = 1, n_blocks_between_upblocks=2, n_blocks_between_downblocks = 1,  n_extra_convs_by_up/downblock=1, downsample_first=True, upsample_first=True, lr=5e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=2, 
                               n_extra_convs_by_upblock=1)
critic = deep_res_critic(img_size, img_n_channels, n_extra_blocks_begin=1,  n_blocks_between_downblocks=1, 
                         n_extra_convs_by_downblock=1)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4d_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4d_60ep.pth')

#### TR 4D2: Without BN in upsampling shortcut

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=2, 
                               n_extra_convs_by_upblock=1, use_shortcut_bn=False)
critic = deep_res_critic(img_size, img_n_channels, n_extra_blocks_begin=1,  n_blocks_between_downblocks=1, 
                         n_extra_convs_by_downblock=1)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4d2_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4d2_60ep.pth')

### TR 4E (deeper critic only): n_extra_blocks = 1, n_blocks_between_upblocks=1, n_blocks_between_downblocks = 2, n_extra_convs_by_up/downblock=1, downsample_first=True, upsample_first=True, lr=5e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=1, 
                               n_extra_convs_by_upblock=1)
critic = deep_res_critic(img_size, img_n_channels, n_extra_blocks_begin=1, n_blocks_between_downblocks=2, 
                         n_extra_convs_by_downblock=1)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4e_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4e_60ep.pth')

### TR 4F (deep generator, shallow critic): n_extra_blocks = 1, n_blocks_between_upblocks=1, n_blocks_between_downblocks = 0, n_extra_convs_by_up/downblock=1, downsample_first=True, upsample_first=True, lr=5e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=1, 
                               n_extra_convs_by_upblock=1)
critic = deep_res_critic(img_size, img_n_channels, n_extra_blocks_begin=1,  n_blocks_between_downblocks=0, 
                         n_extra_convs_by_downblock=1)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4f_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4f_60ep.pth')

#### TR 4F2: Without BN in upsampling shortcut

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=1, 
                               n_extra_convs_by_upblock=1, use_shortcut_bn=False)
critic = deep_res_critic(img_size, img_n_channels, n_extra_blocks_begin=1,  n_blocks_between_downblocks=0, 
                         n_extra_convs_by_downblock=1)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4f2_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4f2_60ep.pth')

#### TR 4F3: Without BN in downsampling shortcut

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=1, 
                               n_extra_convs_by_upblock=1)
critic = deep_res_critic(img_size, img_n_channels, n_extra_blocks_begin=1,  n_blocks_between_downblocks=0, 
                         n_extra_convs_by_downblock=1, use_shortcut_bn=False)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4f3_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4f3_60ep.pth')

#### TR 4F4: Without BN in up/downsampling shortcut

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=1, 
                               n_extra_convs_by_upblock=1, use_shortcut_bn=False)
critic = deep_res_critic(img_size, img_n_channels, n_extra_blocks_begin=1,  n_blocks_between_downblocks=0, 
                         n_extra_convs_by_downblock=1, use_shortcut_bn=False)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4f4_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4f4_60ep.pth')

### TR 4G (deep generator, shallow critic, 0 extra_convs_by_downblock): n_extra_blocks = 1, n_blocks_between_upblocks=1, n_blocks_between_downblocks = 0, n_extra_convs_by_up/downblock=(1/0), downsample_first=True, upsample_first=True, lr=5e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=1, 
                               n_extra_convs_by_upblock=1)
critic = deep_res_critic(img_size, img_n_channels, n_extra_blocks_begin=1, n_blocks_between_downblocks=0, 
                         n_extra_convs_by_downblock=0)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4g_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4g_60ep.pth')

### TR 4H (deeper generator, shallow critic): n_extra_blocks = 1, n_blocks_between_upblocks=2, n_blocks_between_downblocks = 0, n_extra_convs_by_up/downblock=1, downsample_first=True, upsample_first=True, lr=5e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=2, 
                               n_extra_convs_by_upblock=1)
critic = deep_res_critic(img_size, img_n_channels, n_extra_blocks_begin=1,  n_blocks_between_downblocks=0, 
                         n_extra_convs_by_downblock=1)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4h_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4h_60ep.pth')

### TR 4I (deeper generator, shallow critic, 0 extra_convs_by_downblock): n_extra_blocks = 1, n_blocks_between_upblocks=2, n_blocks_between_downblocks = 0, n_extra_convs_by_up/downblock=(1/0), downsample_first=True, upsample_first=True, lr=5e-4

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=2, 
                               n_extra_convs_by_upblock=1)
critic = deep_res_critic(img_size, img_n_channels, n_extra_blocks_begin=1, n_blocks_between_downblocks=0, 
                         n_extra_convs_by_downblock=0)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4i_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4i_60ep.pth')

The generator and critic architectures may be too asymmetric. Reducing the ratio of critic to generator iterations could help, although in practice it tends to have little impact. Using different learning rates for the critic and the generator might also be worth trying.
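
To make the iteration-ratio idea concrete, the sketch below lists which network is updated at each step for a given critic/generator ratio. It is plain Python with hypothetical names (`switch_schedule`, `generator_share`) and does not touch the learner. How the ratio is configured in `CustomGANLearner` is not shown in this notebook, so the mapping to the real training loop is an assumption.

```python
from itertools import cycle, islice
from typing import List


def switch_schedule(n_crit: int, n_gen: int, n_steps: int) -> List[str]:
    """Return which network is trained at each of the first `n_steps` updates
    when alternating `n_crit` critic updates with `n_gen` generator updates."""
    if n_crit < 1 or n_gen < 1:
        raise ValueError("n_crit and n_gen must be >= 1")
    one_round = ["critic"] * n_crit + ["generator"] * n_gen
    return list(islice(cycle(one_round), n_steps))


def generator_share(n_crit: int, n_gen: int) -> float:
    """Fraction of all updates that go to the generator."""
    return n_gen / (n_crit + n_gen)


# A 5:1 ratio gives the generator one update in six; 2:1 gives it one in three.
print(switch_schedule(2, 1, 6))
print(generator_share(5, 1), generator_share(2, 1))
```

fastai v1 exposes `FixedGANSwitcher` (with `n_crit` and `n_gen`) and `GANDiscriminativeLR` (with `mult_lr`) in `fastai.vision.gan`, which cover the iteration ratio and the per-network learning rate respectively. Whether `CustomGANLearner` accepts them unchanged has not been checked here.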

#### TR 4I2: Without BN in shortcut

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=2, 
                               n_extra_convs_by_upblock=1, use_shortcut_bn=False)
critic = deep_res_critic(img_size, img_n_channels, n_extra_blocks_begin=1, n_blocks_between_downblocks=0, 
                         n_extra_convs_by_downblock=0, use_shortcut_bn=False)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4i2_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4i2_60ep.pth')

### TR 4J (deeper generator, shallow critic, 0 extra_convs_by_downblock, 0 extra_convs_by_upblock): n_extra_blocks = 1, n_blocks_between_upblocks=2, n_blocks_between_downblocks = 0, n_extra_convs_by_up/downblock=0, downsample_first=True, upsample_first=True, lr=5e-4

In [None]:
data = get_data(realImagesPath, batch_size, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=2, 
                               n_extra_convs_by_upblock=0)
critic = deep_res_critic(img_size, img_n_channels, n_extra_blocks_begin=1, n_blocks_between_downblocks=0, 
                         n_extra_convs_by_downblock=0)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4j_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4j_60ep.pth')

#### TR 4J2: Without BN in shortcut

In [None]:
data = get_data(realImagesPath, batch_size, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=2, 
                               n_extra_convs_by_upblock=0, use_shortcut_bn=False)
critic = deep_res_critic(img_size, img_n_channels, n_extra_blocks_begin=1, n_blocks_between_downblocks=0, 
                         n_extra_convs_by_downblock=0, use_shortcut_bn=False)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4j2_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4j2_60ep.pth')

### TR 4K (deep generator, shallow critic, 0 extra_convs_by_downblock, 3 extra_blocks_end in generator): n_extra_blocks = (generator 3 / critic 1), n_blocks_between_upblocks=1, n_blocks_between_downblocks = 0, n_extra_convs_by_up/downblock=(1/0), downsample_first=True, upsample_first=True, lr=5e-4

In [None]:
data = get_data(realImagesPath, batch_size, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=3, n_blocks_between_upblocks=1, 
                               n_extra_convs_by_upblock=1)
critic = deep_res_critic(img_size, img_n_channels, n_extra_blocks_begin=1, n_blocks_between_downblocks=0, 
                         n_extra_convs_by_downblock=0)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4k_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4k_60ep.pth')

#### TR 4K2: Without BN in shortcut

In [None]:
data = get_data(realImagesPath, batch_size, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=3, n_blocks_between_upblocks=1, 
                               n_extra_convs_by_upblock=1, use_shortcut_bn=False)
critic = deep_res_critic(img_size, img_n_channels, n_extra_blocks_begin=1, n_blocks_between_downblocks=0, 
                         n_extra_convs_by_downblock=0, use_shortcut_bn=False)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4k2_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr4k2_60ep.pth')
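
The base variants TR 4C to TR 4K differ only in a handful of depth arguments. The sketch below gathers the values passed in the cells above into one dict so the comparisons are easier to read. `DEPTH_VARIANTS` and `variant_kwargs` are hypothetical names introduced here; arguments not listed are left at whatever defaults `deep_res_generator` and `deep_res_critic` define.

```python
# Depth-related arguments of the base variants TR 4C to TR 4K, copied from the cells above.
# Tuple layout (hypothetical):
#   (generator n_extra_blocks_end, generator n_blocks_between_upblocks,
#    generator n_extra_convs_by_upblock,
#    critic n_blocks_between_downblocks, critic n_extra_convs_by_downblock)
# The critic uses n_extra_blocks_begin=1 in every variant.
DEPTH_VARIANTS = {
    "4C": (1, 2, 1, 2, 1),
    "4D": (1, 2, 1, 1, 1),
    "4E": (1, 1, 1, 2, 1),
    "4F": (1, 1, 1, 0, 1),
    "4G": (1, 1, 1, 0, 0),
    "4H": (1, 2, 1, 0, 1),
    "4I": (1, 2, 1, 0, 0),
    "4J": (1, 2, 0, 0, 0),
    "4K": (3, 1, 1, 0, 0),
}


def variant_kwargs(name):
    """Return (generator_kwargs, critic_kwargs) for one of the variants above."""
    g_end, g_between, g_convs, c_between, c_convs = DEPTH_VARIANTS[name]
    gen_kwargs = dict(n_extra_blocks_end=g_end,
                      n_blocks_between_upblocks=g_between,
                      n_extra_convs_by_upblock=g_convs)
    critic_kwargs = dict(n_extra_blocks_begin=1,
                         n_blocks_between_downblocks=c_between,
                         n_extra_convs_by_downblock=c_convs)
    return gen_kwargs, critic_kwargs
```

With this in place, a variant could be built as `deep_res_generator(img_size, img_n_channels, **gen_kwargs)` and `deep_res_critic(img_size, img_n_channels, **critic_kwargs)`. The "without BN in shortcut" sub-variants would add `use_shortcut_bn=False` to the relevant dict.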

## TR 5: basic_critic, deep_res_generator

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=1, 
                               n_extra_convs_by_upblock=1)
critic = basic_critic(img_size, img_n_channels, n_extra_layers=1)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr5a_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr5a_60ep.pth')

## TR 6: pseudo_res_critic, deep_res_generator

### TR 6A: n_blocks_between_upblocks=1, n_extra_convs_by_upblock=1 / (critic): n_extra_layers=1

In [None]:
data = get_data(realImagesPath, batch_size, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=1, 
                               n_extra_convs_by_upblock=1)
critic = pseudo_res_critic(img_size, img_n_channels, n_extra_layers=1)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr6a_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr6a_60ep.pth')

### TR 6B: n_blocks_between_upblocks=1, n_extra_convs_by_upblock=1 / (critic): n_extra_layers=3

In [None]:
data = get_data(realImagesPath, 128, img_size)
generator = deep_res_generator(img_size, img_n_channels, n_extra_blocks_end=1, n_blocks_between_upblocks=1, 
                               n_extra_convs_by_upblock=1)
critic = pseudo_res_critic(img_size, img_n_channels, n_extra_layers=3)
learner = CustomGANLearner.wgan(data, generator, critic, switch_eval=False, 
                                opt_func = partial(optim.Adam, betas = (0.,0.99)), wd=0.)
lr = 5e-4

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr6b_30ep.pth')

In [None]:
learner.fit(30, lr)
save_gan_learner(learner, models_root + 'resGANStrictTr6b_60ep.pth')

# POSSIBLE MODIFICATIONS/IMPROVEMENTS NOT YET EXPLORED

* Try the standard GAN loss instead of the WGAN loss.
* Alter the inner part of the blocks.
* Try a different up/downsampling block, as in FishNet: https://papers.nips.cc/paper/7356-fishnet-a-versatile-backbone-for-image-region-and-pixel-level-prediction.pdf
* Refactor TR2 to avoid redefinitions.
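
For the first item, the difference between the two objectives can be seen on raw critic outputs. The sketch below is plain Python on lists of scores, not the project's `gan_loss_from_func_std` (whose signature is not shown here). The function names are hypothetical, and sign conventions for the Wasserstein loss vary between implementations.

```python
import math
from statistics import mean
from typing import Sequence


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def wgan_critic_loss(real: Sequence[float], fake: Sequence[float]) -> float:
    """Wasserstein critic loss: minimizing it pushes real scores up and fake
    scores down. Scores are unbounded; no sigmoid is applied."""
    return mean(fake) - mean(real)


def std_critic_loss(real_logits: Sequence[float], fake_logits: Sequence[float]) -> float:
    """Standard GAN discriminator loss: binary cross-entropy on logits,
    with target 1 for real samples and 0 for fake ones."""
    # -log(sigmoid(x)) == softplus(-x) and -log(1 - sigmoid(x)) == softplus(x)
    return mean(softplus(-r) for r in real_logits) + mean(softplus(f) for f in fake_logits)


def std_generator_loss(fake_logits: Sequence[float]) -> float:
    """Non-saturating generator loss: make the discriminator label fakes as real."""
    return mean(softplus(-f) for f in fake_logits)
```

One practical difference is that the standard loss treats the critic output as a logit for a real/fake classifier, while the Wasserstein loss uses it as an unbounded score.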
