# CycleGAN training Notebook




This notebook covers the preprocessing of the datasets, the custom training steps, and finally the training of a re-implementation of CycleGAN by Zhu et al.

In [1]:
!pip install -U tensorflow-addons

Collecting tensorflow-addons
  Downloading https://files.pythonhosted.org/packages/74/e3/56d2fe76f0bb7c88ed9b2a6a557e25e83e252aec08f13de34369cd850a0b/tensorflow_addons-0.12.1-cp37-cp37m-manylinux2010_x86_64.whl (703kB)

In [2]:
import tensorflow as tf
import tensorflow_datasets as tfds
import os
import time 
import sys

import matplotlib.pyplot as plt

## Importing modules from git repository

The cells below clone the git repository into the current runtime. Alternatively, uncomment the following lines to mount your Google Drive and clone the repository there permanently.

In [None]:
#from google.colab import drive
#drive.mount("/content/gdrive")


To clone the git repository permanently, uncomment the cd line below and point it to the desired destination in Google Drive. If you clone the repository to your Google Drive, this cell needs to be executed only once.

In [3]:
#% cd "insert path here"
!git clone "https://github.com/ktargan/project-cycleGAN.git"

Cloning into 'project-cycleGAN'...
remote: Enumerating objects: 129, done.
remote: Counting objects: 100% (129/129), done.
remote: Compressing objects: 100% (91/91), done.
remote: Total 129 (delta 70), reused 79 (delta 34), pack-reused 0
Receiving objects: 100% (129/129), 23.90 KiB | 499.00 KiB/s, done.
Resolving deltas: 100% (70/70), done.


If you cloned the repository to your Drive, run git pull to make sure that you are working on the most recent state of the repository.

In [None]:
#navigate to git repository in drive
#e.g. % cd /content/gdrive/MyDrive/project-cycleGAN/
#! git pull

fatal: not a git repository (or any of the parent directories): .git


Now we can import the required files from the repository.

In [4]:
sys.path.insert(0,"/content/project-cycleGAN")

import losses
import generator
import discriminator
from utils import buffer
from utils import img_ops

## Dataset
In the cell below we download the horse2zebra dataset as provided by TensorFlow Datasets. Other datasets can be used here instead.

In [None]:
train_horses, train_zebras, test_horses, test_zebras = tfds.load('cycle_gan/horse2zebra', 
                                                                 split = ['trainA','trainB', 'testA[:30]', 'testB[:30]'], 
                                                                 as_supervised=True)


## Input Pipeline

Preprocess input images:
- resizing images to a smaller size
- random variation through random crops and horizontal flips
- normalizing the images to [-1, 1]
- shuffling
- batching and prefetching

In [None]:
#Input pipeline: preprocess images 

#resize image to smaller size (faster computation and thus more manageable for the scope of the task)
#firstly by simply resizing and secondly randomly cropping the resulting images (introduces variation)
train_horses = train_horses.map(lambda image, label: tf.image.resize(image,[135,135]))
train_zebras = train_zebras.map(lambda image, label: tf.image.resize(image,[135,135]))
train_horses = train_horses.map(lambda image: tf.image.random_crop(image,[128,128,3]))
train_zebras = train_zebras.map(lambda image: tf.image.random_crop(image,[128,128,3]))
#randomly decide to mirror images (make sure that they do not all face the same direction for one class)
train_horses = train_horses.map(lambda image: tf.image.random_flip_left_right(image))
train_zebras = train_zebras.map(lambda image: tf.image.random_flip_left_right(image))
# images are normalized to [-1, 1]
train_horses = train_horses.map(lambda image: (image/127.5)-1)
train_zebras = train_zebras.map(lambda image: (image/127.5)-1)

#Zhu et al. use a batch size of 1
train_horses = train_horses.shuffle(buffer_size = 1000)
train_horses = train_horses.batch(1)
#the training loop below iterates over horse_dataset and zebra_dataset
horse_dataset = train_horses.prefetch(8)

train_zebras = train_zebras.shuffle(buffer_size = 1000)
train_zebras = train_zebras.batch(1)
zebra_dataset = train_zebras.prefetch(8)


#for the test dataset which we use to print images in the end
#resize image to smaller size
test_horses = test_horses.map(lambda image, label: tf.image.resize(image,[128,128]))
test_zebras = test_zebras.map(lambda image, label: tf.image.resize(image,[128,128]))
# images are normalized to [-1, 1]
test_horses = test_horses.map(lambda image: (image/127.5)-1)
test_zebras = test_zebras.map(lambda image: (image/127.5)-1)

test_horses = test_horses.batch(1)
test_horses = test_horses.prefetch(8)

test_zebras = test_zebras.batch(1)
test_zebras = test_zebras.prefetch(8)
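
The arithmetic behind two of the preprocessing steps above can be checked without TensorFlow. The following is a minimal sketch in plain Python; the helper names (normalize, denormalize, random_crop_offsets) are hypothetical and only illustrate the scaling to [-1, 1] and the range of offsets available when cropping a 128x128 window out of a 135x135 image.

```python
import random


def normalize(pixel):
    """Map an 8-bit intensity in [0, 255] to the range [-1, 1]."""
    return pixel / 127.5 - 1.0


def denormalize(value):
    """Inverse mapping, e.g. before plotting a generated image."""
    return (value + 1.0) * 127.5


def random_crop_offsets(resized=135, crop=128, rng=random):
    """Pick the top-left corner of a crop x crop window inside a
    resized x resized image (hypothetical helper for illustration)."""
    max_offset = resized - crop  # 7 when cropping 135 -> 128
    return rng.randint(0, max_offset), rng.randint(0, max_offset)


if __name__ == "__main__":
    print(normalize(0), normalize(255))
    print(random_crop_offsets())
```

Resizing to 135 pixels before cropping leaves eight possible offsets per axis, so each epoch sees slightly shifted versions of the same image.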

## Training Loop

Below we define the custom training steps for both the generators and discriminators. 
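
The adversarial terms used in these steps come from the repository's losses module, which is not shown in this notebook. As a rough, framework-free sketch, assuming the least-squares objective that Zhu et al. describe (the function names below are hypothetical and the repository's implementation may differ):

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def lsgan_discriminator_loss(real_predictions, fake_predictions):
    """Least-squares discriminator loss: real patches should score 1,
    generated patches should score 0. The factor 0.5 follows the paper's
    remark about halving the discriminator objective."""
    real_term = mean([(p - 1.0) ** 2 for p in real_predictions])
    fake_term = mean([p ** 2 for p in fake_predictions])
    return 0.5 * (real_term + fake_term)


def lsgan_generator_loss(fake_predictions):
    """Least-squares generator loss: the generator wants its images
    to be scored as real (1) by the discriminator."""
    return mean([(p - 1.0) ** 2 for p in fake_predictions])


if __name__ == "__main__":
    # A discriminator that is fooled half of the time
    print(lsgan_discriminator_loss([1.0, 0.0], [0.0, 1.0]))
    print(lsgan_generator_loss([0.0, 1.0]))
```

In this sketch the predictions are flat lists of patch scores; a PatchGAN discriminator would produce one such score per image patch.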


In [None]:
def training_step_discrim(discriminator, optimizer, images, generated_images):
  # calculate the discriminator loss and apply gradients
  with tf.GradientTape() as tape:
    # feed real images into discriminator, get the predictions
    real_image_predictions = discriminator(images)
    
    # feed fake images into discriminator, get the predictions
    fake_image_predictions = discriminator(generated_images)
    
    #calculate adversarial loss
    discr_loss = losses.discriminator_loss(fake_image_predictions, real_image_predictions)

    gradients = tape.gradient(discr_loss, discriminator.trainable_variables)
    
  optimizer.apply_gradients(zip(gradients, discriminator.trainable_variables))
  return discr_loss

@tf.function
def training_step_gen(generator_zebras, generator_horses, discriminator_zebras, discriminator_horses, 
                      images_zebras, images_horses, optimizer_zebras, optimizer_horses):
  #clarification: generator_zebras generates zebra images from horses 
  #Calculate the loss for both generators and update the weights
  with tf.GradientTape() as tape_horse, tf.GradientTape() as tape_zebra:
    
    #feed original images to generators
    fake_images_zebras = generator_zebras(images_horses)
    fake_images_horses = generator_horses(images_zebras)

    #get the predictions assigned by the discriminators
    fake_image_predictions_zebras = discriminator_zebras(fake_images_zebras)
    fake_image_predictions_horses = discriminator_horses(fake_images_horses)

    #calculate the adversarial generator loss: 
    #did the discriminator recognize the images as generated?
    gen_loss_zebras = losses.generator_loss(fake_image_predictions_zebras)
    gen_loss_horses = losses.generator_loss(fake_image_predictions_horses)
    
    #pass the generated zebra images of generator_zebras to generator_horses 
    #(to see if it produces horse images close to the original image)
    recreated_images_horses = generator_horses(fake_images_zebras)
    recreated_images_zebras = generator_zebras(fake_images_horses)

    #calculate cycle loss: the weighting factor lambda is set to 10
    #how much does the original image differ from the cycled image?
    cycle_loss_forward = losses.calc_cycle_loss(images_zebras, recreated_images_zebras, 10)
    cycle_loss_backward = losses.calc_cycle_loss(images_horses, recreated_images_horses, 10)
    total_cycle_loss = cycle_loss_forward + cycle_loss_backward

    #give images from their target domain to the generators
    # e.g. give zebra images to a zebra generator and then see if the output 
    #images are close to original images -> identity loss
    same_images_reconstructed_zebras = generator_zebras(images_zebras)
    same_images_reconstructed_horses = generator_horses(images_horses)

    identity_loss_horses = losses.identity_loss(images_horses, same_images_reconstructed_horses, 10)
    identity_loss_zebras = losses.identity_loss(images_zebras, same_images_reconstructed_zebras, 10)

    # sum up the losses for each generator
    # this means the respective generator and identity loss (for their domain)
    # but also the complete cycle consistency loss!
    total_loss_zebras = gen_loss_zebras + total_cycle_loss + identity_loss_zebras
    total_loss_horses = gen_loss_horses + total_cycle_loss + identity_loss_horses

    #calculate the gradients of each generator's total loss
    gradients_zebras = tape_zebra.gradient(total_loss_zebras, generator_zebras.trainable_variables)
    gradients_horses = tape_horse.gradient(total_loss_horses, generator_horses.trainable_variables)

  #update weights
  optimizer_zebras.apply_gradients(zip(gradients_zebras, generator_zebras.trainable_variables))
  optimizer_horses.apply_gradients(zip(gradients_horses, generator_horses.trainable_variables))

  #return loss and generated images for the buffer
  return total_loss_zebras, total_loss_horses, fake_images_zebras, fake_images_horses
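
The generator step above combines an adversarial term, both cycle-consistency terms and an identity term per generator. The sketch below illustrates that sum in plain Python, assuming that the cycle and identity losses are weighted mean absolute errors as in Zhu et al.; weighted_l1 and total_generator_loss are hypothetical names, and the repository's losses module may compute these terms differently.

```python
def weighted_l1(original, reconstructed, weight=10.0):
    """Weighted mean absolute error between two flattened images."""
    differences = [abs(a - b) for a, b in zip(original, reconstructed)]
    return weight * sum(differences) / len(differences)


def total_generator_loss(adversarial, cycle_forward, cycle_backward, identity):
    """Sum used for one generator: its own adversarial and identity loss
    plus the complete cycle-consistency loss of both directions."""
    return adversarial + (cycle_forward + cycle_backward) + identity


if __name__ == "__main__":
    image = [0.0, 1.0, -1.0, 0.5]      # flattened toy "image" in [-1, 1]
    cycled = [0.0, 0.5, -1.0, 0.5]     # toy output after a full cycle
    cycle = weighted_l1(image, cycled)  # 10 * 0.5 / 4 = 1.25
    print(total_generator_loss(0.25, cycle, cycle, 0.5))
```

Because the cycle-consistency term enters both totals, each generator is also penalized for reconstructions that pass through the other generator.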

In [None]:
# used later on to compute duration of an epoch
def timing(start):
    now = time.time()
    time_per_training_step = now - start
    
    return round(time_per_training_step, 2)

## Start the Training

First, the generators and discriminators are initialized. For longer training runs we stored checkpoints regularly so as not to lose training progress. The code for this is still included, but commented out.
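
The cell below keeps the learning rate fixed at 0.0002, and its comments mention the schedule of Zhu et al. (constant for 100 epochs, then a linear decay to zero over another 100 epochs). For a full 200-epoch run, that schedule could be sketched as follows. The function name and its parameters are hypothetical, and the decay rule is a simplified reading of that description rather than the authors' code.

```python
def linear_decay_learning_rate(epoch, base_rate=0.0002,
                               constant_epochs=100, decay_epochs=100):
    """Learning rate for a zero-based epoch index: constant at first,
    then decreasing linearly until it reaches zero."""
    if epoch < constant_epochs:
        return base_rate
    remaining = constant_epochs + decay_epochs - epoch
    return base_rate * max(0, remaining) / decay_epochs


if __name__ == "__main__":
    for epoch in (0, 99, 100, 150, 199, 200):
        print(epoch, linear_decay_learning_rate(epoch))
```

With Keras optimizers, such a value could be assigned to the optimizer's learning rate at the start of every epoch; that wiring is left out here because it is not part of the original notebook.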

In [None]:
# We will train 2 generators and 2 discriminators
# generator_horses learns to translate zebra to horse images - i.e. generates horse images
generator_horses = generator.Generator()
# generator_zebras learns to translate horse to zebra images
generator_zebras = generator.Generator()

#discriminator horses learns to distinguish between true horse images and generated ones
# receptive field on the patchGAN is set to 70, to create 70x70 image patches
discrim_horses = discriminator.Discriminator(70)
# the other way round
discrim_zebras = discriminator.Discriminator(70)

#Zhu et al. keep a learning rate of 0.0002 for the first 100 epochs and then
#linearly decay it to zero over the next 100 epochs.
#For computational reasons we only train for 100 epochs and thus keep
#the learning rate constant
learning_rate = 0.0002

#optimizers for all models
gen_horse_optimizer = tf.keras.optimizers.Adam(learning_rate)
gen_zebra_optimizer = tf.keras.optimizers.Adam(learning_rate)
disc_horse_optimizer = tf.keras.optimizers.Adam(learning_rate)
disc_zebra_optimizer = tf.keras.optimizers.Adam(learning_rate)

#create a folder to store checkpoints (here we store it in google drive to make sure 
#that all progress is saved permanently)

#checkpoint_path = "/content/gdrive/MyDrive/final_project_ANNwtf/checkpoints/discrim_pretraining"
#if not os.path.exists(checkpoint_path):
#    os.makedirs(checkpoint_path)


#create checkpoint manager and store model and optimizer state in case "save" is called
#ckpt = tf.train.Checkpoint(generator_horses=generator_horses,
#                           generator_zebras =generator_zebras,
#                           discrim_horses=discrim_horses,
#                           discrim_zebras=discrim_zebras,
#                           gen_horse_optimizer=gen_horse_optimizer,
#                           gen_zebra_optimizer=gen_zebra_optimizer,
#                           disc_horse_optimizer=disc_horse_optimizer,
#                           disc_zebra_optimizer=disc_zebra_optimizer)

#ckpt_manager = tf.train.CheckpointManager(ckpt, checkpoint_path, max_to_keep=10)

#if a checkpoint exists, restore the latest checkpoint.
#if ckpt_manager.latest_checkpoint:
#  ckpt.restore(ckpt_manager.latest_checkpoint)
#  print ('Latest checkpoint restored!')

#initialize lists to save model losses in
discrim_horse_losses = []
discrim_zebra_losses = []
gen_horse_losses = []
gen_zebra_losses = []

num_epochs = 100

#start the training
for epoch in range(num_epochs):
  print("epoch: ", epoch+1, " ----------------------------------------------------------")

  #create empty buffers to store generated images in (so that discriminator can use these in the training step)
  #buffer is filled via the image_buffer function in the training steps
  buffer_horse = buffer.Buffer(50)
  buffer_zebra = buffer.Buffer(50)
  
  #fill buffers with random images
  generated_img = tf.random.normal([50,128,128,3])
  buffer_horse.set_image_buffer(generated_img)
  buffer_zebra.set_image_buffer(generated_img)

  start = time.time()

  #create variables to save averaged losses
  running_gen_zebra_loss = 0
  running_gen_horse_loss = 0
  running_disc_zebra_loss = 0
  running_disc_horse_loss = 0
  running_average_factor = 0.95

  #iterate through the datasets and train the models
  for horse_img, zebra_img in tf.data.Dataset.zip((horse_dataset, zebra_dataset)):
    #take generated/random images from buffer
    gen_img_horsebuffer = buffer_horse.get_image_buffer()
    gen_img_zebrabuffer = buffer_zebra.get_image_buffer()  

    #calculate the losses for generators and discriminators:

    #first, training step for the discriminators: check performance on real images and generated ones
    disc_loss_zebra = training_step_discrim(discrim_zebras, disc_zebra_optimizer, zebra_img, gen_img_zebrabuffer)
    disc_loss_horse = training_step_discrim(discrim_horses, disc_horse_optimizer, horse_img, gen_img_horsebuffer)
    
    #train the generators 
    gen_loss_zebra, gen_loss_horse, fake_images_zebra, fake_images_horse = training_step_gen(generator_zebras, generator_horses, 
                                          discrim_zebras, discrim_horses, zebra_img, 
                                          horse_img, gen_zebra_optimizer, gen_horse_optimizer)

    #also save generated images in the respective buffers
    buffer_zebra.set_image_buffer(fake_images_zebra)
    buffer_horse.set_image_buffer(fake_images_horse)

    #loss updates
    running_gen_zebra_loss = running_average_factor* running_gen_zebra_loss + (1- running_average_factor)*gen_loss_zebra
    running_gen_horse_loss = running_average_factor* running_gen_horse_loss + (1- running_average_factor)*gen_loss_horse

    running_disc_zebra_loss = running_average_factor* running_disc_zebra_loss + (1- running_average_factor)*disc_loss_zebra
    running_disc_horse_loss = running_average_factor* running_disc_horse_loss + (1- running_average_factor)*disc_loss_horse

  #save losses in respective list
  discrim_zebra_losses.append(running_disc_zebra_loss)
  discrim_horse_losses.append(running_disc_horse_loss)

  gen_zebra_losses.append(running_gen_zebra_loss)
  gen_horse_losses.append(running_gen_horse_loss)

  #print statements to check on current training state
  print(f"the epoch took {timing(start)} seconds")
  print("generator_horse loss", running_gen_horse_loss.numpy())
  print("generator zebra loss", running_gen_zebra_loss.numpy())
  print("discriminator_horse loss", running_disc_horse_loss.numpy())
  print("discriminator_zebra loss", running_disc_zebra_loss.numpy())

  #every 5th epoch save a checkpoint
  #(uncomment together with the checkpoint manager defined above)
  #if (epoch + 1) % 5 == 0:
  #  ckpt_save_path = ckpt_manager.save()
  #  print ('Saving checkpoint for epoch {} at {}'.format(epoch+1,
  #                                                       ckpt_save_path))
  #plot images after every epoch
  img_ops.plot_image_cycle(generator_zebras,generator_horses, zebra_dataset, horse_dataset)
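
The training loop feeds the discriminators with images taken from a buffer of 50 previously generated images instead of only the newest ones. The repository's buffer.Buffer class is not shown in this notebook, so the following is only a sketch of one common formulation of such an image history (return the new image or, with probability 0.5, swap it for a stored one). The class name ImageHistory and its query method are hypothetical and do not mirror the set_image_buffer/get_image_buffer interface used above.

```python
import random


class ImageHistory:
    """Keeps up to `capacity` generated images for the discriminator."""

    def __init__(self, capacity=50, rng=None):
        self.capacity = capacity
        self.images = []
        self.rng = rng or random.Random()

    def query(self, image):
        """Store `image` and return the image the discriminator should see."""
        # While the history is not full, store and return the new image
        if len(self.images) < self.capacity:
            self.images.append(image)
            return image
        # Otherwise, half of the time hand out an older image and
        # replace it in the history with the new one
        if self.rng.random() < 0.5:
            index = self.rng.randrange(self.capacity)
            older = self.images[index]
            self.images[index] = image
            return older
        # The other half of the time pass the new image straight through
        return image


if __name__ == "__main__":
    history = ImageHistory(capacity=2)
    for name in ["fake_1", "fake_2", "fake_3", "fake_4"]:
        print(name, "->", history.query(name))
```

Showing the discriminator a mix of old and new generated images is intended to reduce oscillation during training, since it cannot adapt only to the generator's latest outputs.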


In [None]:
#print the first 30 pictures (to have comparable output for ablation studies)
img_ops.plot_image_cycle(generator_zebras,generator_horses, test_zebras, test_horses, ablation = True)