# Monet CycleGAN
This project explores generating Monet-style images from photos using a Cycle-Consistent Adversarial Network (CycleGAN). At the time of writing, this project is an assignment for "DTSA 5511: Introduction to Deep Learning". This is my first time implementing CycleGAN or working with TensorFlow Record (tfrec) files, so I relied on many sources and tutorials to complete this project. The main differences between this model and the ones in the source documentation and tutorials are the additional jittering steps and the use of an efficient neural network (ENet), rather than U-Net or ResNet, to construct the generator. I have included a list of the sources I found helpful at the end of the notebook.

This notebook can be accessed from my GitHub repository here:
https://github.com/arwhit/Monet-CycleGAN/tree/main

You can read more about the Kaggle competition and access the original data files here:
https://www.kaggle.com/competitions/gan-getting-started/overview


In [None]:
#Import necessary libraries and packages
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import keras
from keras.models import Sequential
from keras.layers import Input, Add, Rescaling, Concatenate, ZeroPadding2D, Conv2D, MaxPool2D, Conv2DTranspose, UpSampling2D, Dropout, Flatten, GroupNormalization, BatchNormalization, Activation, LeakyReLU
import tensorflow as tf
import os
import time
import PIL
import shutil
#for dirname, _, filenames in os.walk('/kaggle/input'):
    #for filename in filenames:
        #print(os.path.join(dirname, filename))

## EDA and Preprocessing

We first need to access the images from the directory. In total, there are 300 Monet images and 7028 photo images.

In [None]:
#Get all the file names
monet_files = tf.io.gfile.glob(str('/kaggle/input/gan-getting-started/monet_tfrec/*.tfrec'))
photo_files = tf.io.gfile.glob(str('/kaggle/input/gan-getting-started/photo_tfrec/*.tfrec'))


Now we need to decode and read in the images. We will also normalize the pixel values to the range [-1, 1] so that colors on the higher end of the RGB spectrum do not have a heavier impact on weights than colors on the lower end of the RGB spectrum. Lastly, we will apply some random jittering to introduce variation in the dataset and help prevent overfitting. There are multiple ways to do this, but we will stick to taking a random crop of a slightly enlarged copy of the image and randomly flipping the image vertically or horizontally. In some scenarios it might be useful to also randomly adjust the hue, saturation, or contrast of the image, but that would not be appropriate here since these are features that often help distinguish an artist's paintings.

One nice thing about TensorFlow is all of these functions can be wrapped into the data pipeline and applied to each batch with only a few lines of code.
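
To make the scaling concrete, here is a minimal NumPy sketch of the three conversions this notebook relies on: into the model's [-1, 1] range, into [0, 1] for plotting, and back to 8-bit pixels for saving. The helper names are hypothetical and exist only for this illustration. The rounding in `to_uint8` is also an assumption of the sketch; the submission code later in the notebook truncates instead.

```python
import numpy as np

def to_model_range(pixels_uint8):
    """Map uint8 pixels in [0, 255] to float32 values in [-1, 1]."""
    return pixels_uint8.astype(np.float32) / 127.5 - 1.0

def to_display_range(pixels_float):
    """Map values in [-1, 1] to [0, 1], the range plt.imshow expects for floats."""
    return pixels_float * 0.5 + 0.5

def to_uint8(pixels_float):
    """Map values in [-1, 1] back to uint8 pixels (rounded, then clipped)."""
    restored = np.rint(pixels_float * 127.5 + 127.5)
    return np.clip(restored, 0, 255).astype(np.uint8)

pixels = np.array([0, 64, 128, 255], dtype=np.uint8)
scaled = to_model_range(pixels)

print(scaled)                    # values in [-1, 1]
print(to_display_range(scaled))  # values in [0, 1]
print(to_uint8(scaled))          # original pixel values again
```

The darkest pixel maps to -1 and the brightest to 1, which matches the `tanh` output range of the generator defined later.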

In [None]:
#Implement helper functions to decode and preprocess the images
def preprocess_tfrecord(example,jitter=True):
    tfrecord_format = {
        "image_name": tf.io.FixedLenFeature([], tf.string),
        "image": tf.io.FixedLenFeature([], tf.string),
        "target": tf.io.FixedLenFeature([], tf.string)
    }
    example = tf.io.parse_single_example(example, tfrecord_format)
    #decode
    image = tf.image.decode_jpeg(example['image'], channels=3)
    #normalize pixel values from [0, 255] to [-1, 1]
    image = (tf.cast(image, tf.float32)/ 127.5) - 1
    #jitter: enlarge, take a random 256x256 crop, then flip at random
    if jitter:
        image=tf.image.resize(image, size=[286,286])
        image=tf.image.random_crop(image, size=[256,256,3])
        image=tf.image.random_flip_left_right(image)
        image=tf.image.random_flip_up_down(image)
    #return the image whether or not jitter was applied
    return image

#implement helper function to load the datasets
def load_dataset(filenames, labeled=True, ordered=False):
    dataset = tf.data.TFRecordDataset(filenames)
    dataset = dataset.map(preprocess_tfrecord, num_parallel_calls=tf.data.AUTOTUNE)
    return dataset

#Create the data pipeline
monets = load_dataset(monet_files, labeled=True).shuffle(300).batch(1)
photos = load_dataset(photo_files, labeled=True).shuffle(300).batch(1)

Now that we have imported the data, let's take a look at an example image from each dataset.

In [None]:
#next(iter(...)) is a convenient way to pull the first batch from a TensorFlow dataset
ex_monet=next(iter(monets))
ex_photo=next(iter(photos))

#Plot some examples
#Create a subplot with 2 examples
plt.subplot(121)
plt.subplots_adjust(wspace=0.25)
plt.title('Real Photo')
#(note: the * 0.5 + 0.5 operation rescales the image from [-1, 1] to [0, 1] so imshow can read it properly)
plt.imshow(ex_photo[0] * 0.5 + 0.5)
plt.subplot(122)
plt.title('Monet')
plt.imshow(ex_monet[0] * 0.5 + 0.5)

Random jittering is built into the data pipeline, but let's also use our examples to visualize what is happening.

In [None]:
#Example of jitter preprocessing
def jitter(image):
    image=tf.image.resize(image, size=[286,286])
    image=tf.image.random_crop(image, size=[256,256,3])
    image=tf.image.random_flip_left_right(image)
    image=tf.image.random_flip_up_down(image)
    return image

ex_monet_new=jitter(ex_monet[0])
ex_photo_new=jitter(ex_photo[0])

#Create a before and after subplot
plt.subplot(221)
plt.subplots_adjust(wspace=0.25, hspace=0.4)
plt.title('Original')
#(note: the * 0.5 + 0.5 operation rescales the image from [-1, 1] to [0, 1] so imshow can read it properly)
plt.imshow(ex_photo[0] * 0.5 + 0.5)
plt.subplot(222)
plt.title('Post Jitter')
#(note: the * 0.5 + 0.5 operation rescales the image from [-1, 1] to [0, 1] so imshow can read it properly)
plt.imshow(ex_photo_new * 0.5 + 0.5)
plt.subplot(223)
plt.title('Original')
plt.imshow(ex_monet[0] * 0.5 + 0.5)
plt.subplot(224)
plt.title('Post Jitter')
plt.imshow(ex_monet_new * 0.5 + 0.5)

## Model Building
Now we need to construct the actual model. One interesting feature of CycleGAN is that images from each class are unpaired, meaning we are picking up on general trends/styles and not linking two specific images together. This is not only more flexible than supervised approaches but also less costly, as there is no need to identify and assign image pairs.

CycleGAN also differs from vanilla GAN and DCGAN in a few ways, but the primary difference is the cyclical approach to evaluating the model. More specifically, the model does not only measure the adversarial loss of the generated image, but also the loss of the generated image translated back to its original class. This concept is referred to as "Cycle Consistency Loss". To accomplish this, the CycleGAN architecture has 2 discriminators and 2 generators that operate in a cyclical manner, hence the name. The general cycle goes like this:

1. an image from class a is passed through a generator to be transformed to an image of class b

2. the class b discriminator makes a prediction if the image is of class b

3. the transformed image is passed into another generator and transformed back to class a

4. the difference between the original image and the regenerated image is measured and weights are updated

Note: This is a gross oversimplification of the process, but it should give you enough of an overview to interpret what is going on in the code below. If you need more details, the full research paper is linked in the sources section.
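
As a toy illustration of step 4, the sketch below replaces the two generators with simple arithmetic functions. These stand-ins are purely hypothetical and have nothing to do with the real networks. The point is only that a perfect round trip gives a cycle difference of zero, while an imperfect return trip is penalized.

```python
import numpy as np

def a_to_b(x):
    """Stand-in for the generator that maps class a to class b."""
    return 2.0 * x + 1.0

def b_to_a_exact(y):
    """Stand-in for a generator that perfectly inverts a_to_b."""
    return (y - 1.0) / 2.0

def b_to_a_sloppy(y):
    """Stand-in for a generator that is off by a constant 0.1."""
    return (y - 1.0) / 2.0 + 0.1

def cycle_difference(original, cycled):
    """Mean absolute difference between an input and its round trip."""
    return float(np.mean(np.abs(original - cycled)))

x = np.linspace(-1.0, 1.0, 5)  # pretend "image" values in [-1, 1]
exact = cycle_difference(x, b_to_a_exact(a_to_b(x)))
sloppy = cycle_difference(x, b_to_a_sloppy(a_to_b(x)))

print(exact)   # no penalty for a perfect round trip
print(sloppy)  # about 0.1 for the imperfect one
```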



### Construct The Generators
There are a couple of popular generator architectures used for CycleGAN, and most papers and tutorials use some form of U-Net or ResNet. Both of these are computationally expensive. For this project we will attempt to implement ENet, which has been shown to produce similar results in image segmentation tasks with smaller computational and memory requirements.
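
To give a rough sense of where the savings come from, the sketch below counts convolution weights only. It ignores biases and normalization parameters. It compares a plain 3x3 convolution with a bottleneck that projects to half the channels, as `bn_encode` does in this notebook, and a full 5x5 convolution with the 5x1 plus 1x5 asymmetric pair. `conv_weights` is a hypothetical helper written for this illustration.

```python
def conv_weights(kernel_h, kernel_w, channels_in, channels_out):
    """Number of weights in a 2D convolution without a bias term."""
    return kernel_h * kernel_w * channels_in * channels_out

channels = 128
half = channels // 2

# A single 3x3 convolution that keeps 128 channels.
plain_3x3 = conv_weights(3, 3, channels, channels)

# Bottleneck: 1x1 projection, 3x3 convolution at half width, 1x1 expansion.
bottleneck = (conv_weights(1, 1, channels, half)
              + conv_weights(3, 3, half, half)
              + conv_weights(1, 1, half, channels))

# Asymmetric convolution: 5x1 followed by 1x5, compared with a full 5x5.
full_5x5 = conv_weights(5, 5, half, half)
asymmetric = conv_weights(5, 1, half, half) + conv_weights(1, 5, half, half)

print(plain_3x3, bottleneck)  # 147456 53248
print(full_5x5, asymmetric)   # 102400 40960
```

Weight counts are only a proxy for cost, so actual run time and memory use would need to be measured separately.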

In [None]:
#Define Bottleneck Encoder
def bn_encode(filters,  
              ds=False,#case for downsampling
              asym=False,#case for asymmetric convolution
              dial=(1,1),#case for dilated convolution
              drate=0.1,#dropout rate
              dial_rate=1,
              apply_norm=True):
    
    filters=int(filters)
    initializer = tf.random_normal_initializer(0., 0.02)
    gamma_init = keras.initializers.RandomNormal(mean=0.0, stddev=0.02)
    result = Sequential()
    
    #Initial Projection
    if ds:
        result.add(Conv2D(filters, kernel_size=2, strides=2, padding='same',
                          kernel_initializer=initializer, use_bias=False))
    else:
        #integer division keeps the filter count an int
        result.add(Conv2D(filters//2, kernel_size=1, strides=1, padding='same',
                          kernel_initializer=initializer, use_bias=False))
    if apply_norm:
        result.add(BatchNormalization(gamma_initializer=gamma_init))
        #result.add(GroupNormalization(groups=filters/2,gamma_initializer=gamma_init))
        
    result.add(LeakyReLU()) 
    #Main Convolution
    if asym:
        #asymmetric case: a 5x1 convolution followed by a 1x5 convolution
        result.add(Conv2D(filters//2, kernel_size=(5,1), strides=1, padding='same',
                          kernel_initializer=initializer, use_bias=False))
        result.add(Conv2D(filters//2, kernel_size=(1,5), strides=1, padding='same',
                          kernel_initializer=initializer, use_bias=False))
    else:
        #regular or dilated 3x3 convolution (dial_rate=1 means no dilation)
        result.add(Conv2D(filters//2, kernel_size=3, strides=1, padding='same',
                          dilation_rate=dial_rate,
                          kernel_initializer=initializer, use_bias=False))
    if apply_norm:
        result.add(BatchNormalization(gamma_initializer=gamma_init))
        #result.add(GroupNormalization(groups=filters/2,gamma_initializer=gamma_init))
        
    result.add(LeakyReLU())
    
    #Final Expansion
    result.add(Conv2D(filters, kernel_size=1, strides=1, padding='same',
                      kernel_initializer=initializer, use_bias=False))
    if apply_norm:
        result.add(BatchNormalization(gamma_initializer=gamma_init))
        #result.add(GroupNormalization(groups=filters, gamma_initializer=gamma_init))
    result.add(LeakyReLU())
    
    #Regularizer
    result.add(Dropout(rate=drate))
    return result

In [None]:
#Define Bottleneck Decoder
def bn_decode(filters,  
              us=False,#case for upsampling
              drate=0.1,
              apply_norm=True):
    
    filters=int(filters)
    initializer = tf.random_normal_initializer(0., 0.02)
    gamma_init = keras.initializers.RandomNormal(mean=0.0, stddev=0.02)
    result = Sequential()
    
    #Initial Projection
    if us:
        result.add(Conv2DTranspose(filters, kernel_size=2, strides=2, padding='same',
                          kernel_initializer=initializer, use_bias=False))
    else:
        #integer division keeps the filter count an int
        result.add(Conv2D(filters//2, kernel_size=1, strides=1, padding='same',
                          kernel_initializer=initializer, use_bias=False))
    if apply_norm:
        result.add(BatchNormalization(gamma_initializer=gamma_init))
        #result.add(GroupNormalization(groups=filters/2,gamma_initializer=gamma_init))
        
    result.add(LeakyReLU()) 
    #Main Convolution
    result.add(Conv2D(filters//2, kernel_size=3, strides=1, padding='same',
                      kernel_initializer=initializer, use_bias=False))
    if apply_norm:
        result.add(BatchNormalization(gamma_initializer=gamma_init))
        #result.add(GroupNormalization(groups=filters/2,gamma_initializer=gamma_init))
        
    result.add(LeakyReLU())
    
    #Final Expansion
    result.add(Conv2D(filters, kernel_size=1, strides=1, padding='same',
                      kernel_initializer=initializer, use_bias=False))
    if apply_norm:
        result.add(BatchNormalization(gamma_initializer=gamma_init))
        #result.add(GroupNormalization(groups=filters, gamma_initializer=gamma_init))
    result.add(LeakyReLU())
    
    #Regularizer
    result.add(Dropout(rate=drate))
    return result

Now it is time to create the generator following the architecture described in the ENet image segmentation paper.
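
Before reading the generator code, it may help to trace the feature-map sizes stage by stage. The sketch below is plain Python bookkeeping, not the model itself, and the stage table is transcribed from the shape comments in the generator. It also computes the effective kernel size of a 3x3 convolution for the `dial_rate` values passed in the generator, assuming those values are applied as a dilation rate.

```python
# Each entry: (stage name, resolution change, output channels)
STAGES = [
    ("initial block", "down", 16),  # 13 conv filters + 3 max-pooled input channels
    ("stage 1", "down", 64),
    ("stage 2", "down", 128),
    ("stage 3", "same", 128),
    ("stage 4", "up", 64),
    ("stage 5", "up", 16),
    ("final transpose conv", "up", 3),
]

def trace_shapes(size, stages):
    """Return (name, height, width, channels) after every stage."""
    shapes = []
    for name, change, channels in stages:
        if change == "down":
            size //= 2
        elif change == "up":
            size *= 2
        shapes.append((name, size, size, channels))
    return shapes

def effective_kernel(kernel, dilation):
    """Span covered by a dilated kernel along one axis."""
    return kernel + (kernel - 1) * (dilation - 1)

shapes = trace_shapes(256, STAGES)
for shape in shapes:
    print(shape)

dilated = {rate: effective_kernel(3, rate) for rate in (2, 4, 8, 16)}
print(dilated)  # {2: 5, 4: 9, 8: 17, 16: 33}
```

Under that assumption, a rate of 16 spans 33 pixels, which is slightly wider than the 32x32 feature maps in stages 2 and 3.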

In [None]:
#Define Generator
def Generator():
    initializer = tf.random_normal_initializer(0., 0.02)
    gamma_init = keras.initializers.RandomNormal(mean=0.0, stddev=0.02)
    inputs = Input(shape=[256,256,3], batch_size=1)
    
    #initial Enet block
    #output 16x128x128
    I1=Conv2D(13,kernel_size=3,strides=2, padding='same',  
              kernel_initializer=initializer, use_bias=False)(inputs)
    I2=MaxPool2D(pool_size=(2, 2))(inputs)
    I=Concatenate()([I1, I2])
    #stage 1 (outputs 64x64x64)
    #b1
    b1=bn_encode(filters=64,ds=True,drate=0.01)(I)
    paddings = [(0, 0), (0,0), (0, 0), (48, 0)]
    I_padded = tf.pad(I, paddings, mode="constant")
    b1f=Add()([MaxPool2D(pool_size=(2, 2))(I_padded),b1])
    #b11
    b11=bn_encode(filters=64,drate=0.01)(b1f)
    b11f=Add()([b1f,b11])
    #b12
    b12=bn_encode(filters=64,drate=0.01)(b11f)
    b12f=Add()([b11f,b12])
    #b13
    b13=bn_encode(filters=64,drate=0.01)(b12f)
    b13f=Add()([b12f,b13])
    #b14
    b14=bn_encode(filters=64,drate=0.01)(b13f)
    b14f=Add()([b13f,b14])
    
    #stage 2 (outputs 128x32x32)
    #b2
    b2=bn_encode(filters=128,ds=True)(b14f)
    paddings = [(0, 0), (0,0), (0, 0), (64, 0)]
    b14f_padded = tf.pad(b14f, paddings, mode="constant")
    b2f=Add()([MaxPool2D(pool_size=(2, 2))(b14f_padded),b2])
    #b21
    b21=bn_encode(filters=128)(b2f)
    b21f=Add()([b2f,b21])    
    #b22
    b22=bn_encode(filters=128, dial_rate=(2,2))(b21f)
    b22f=Add()([b21f,b22])    
    #b23
    b23=bn_encode(filters=128, asym=True)(b22f)
    b23f=Add()([b22f,b23])       
    #b24
    b24=bn_encode(filters=128, dial_rate=(4,4))(b23f)
    b24f=Add()([b23f,b24])    
    #b25
    b25=bn_encode(filters=128)(b24f)
    b25f=Add()([b24f,b25])  
    #b26
    b26=bn_encode(filters=128, dial_rate=(8,8))(b25f)
    b26f=Add()([b25f,b26])  
    #b27
    b27=bn_encode(filters=128, asym=True)(b26f)
    b27f=Add()([b26f,b27]) 
    #b28
    b28=bn_encode(filters=128, dial_rate=(16,16))(b27f)
    b28f=Add()([b27f,b28])    
    
    #stage 3 (outputs 128x32x32)
    #b31
    b31=bn_encode(filters=128)(b28f)
    b31f=Add()([b28f,b31])    
    #b32
    b32=bn_encode(filters=128, dial_rate=(2,2))(b31f)
    b32f=Add()([b31f,b32])    
    #b33
    b33=bn_encode(filters=128, asym=True)(b32f)
    b33f=Add()([b32f,b33])       
    #b34
    b34=bn_encode(filters=128, dial_rate=(4,4))(b33f)
    b34f=Add()([b33f,b34])    
    #b35
    b35=bn_encode(filters=128)(b34f)
    b35f=Add()([b34f,b35])  
    #b36
    b36=bn_encode(filters=128, dial_rate=(8,8))(b35f)
    b36f=Add()([b35f,b36])  
    #b37
    b37=bn_encode(filters=128, asym=True)(b36f)
    b37f=Add()([b36f,b37]) 
    #b38
    b38=bn_encode(filters=128, dial_rate=(16,16))(b37f)
    b38f=Add()([b37f,b38])
    
    #stage 4 (outputs 64x64x64)
    #b4
    b4=bn_decode(filters=64,us=True)(b38f)
    #b41
    b41=bn_decode(filters=64)(b4)
    b41f=Add()([b4,b41])
    #b42
    b42=bn_decode(filters=64)(b41f)
    b42f=Add()([b41f,b42])
    #stage 5 (outputs 16x128x128)
    #b5
    b5=bn_decode(filters=16,us=True)(b42f)
    #b51
    b51=bn_decode(filters=16)(b5)
    b51f=Add()([b5,b51])
    #Upscale to original image size
    last = Conv2DTranspose(3, 4,strides=2, padding='same',
                                  kernel_initializer=initializer,
                                  activation='tanh') # (bs, 256, 256, 3)
    x = last(b51f)

    return keras.Model(inputs=inputs, outputs=x)

sample_generator=Generator()
sample_generator.summary()

In [None]:
#generator to transform images to photo style
generator_p=Generator()
#generator to transform images to monet style
generator_m=Generator()

### Construct the Discriminators
The discriminators will have a PatchGAN architecture. Since considerable time was spent implementing ENet when building the generator, we will not reinvent the wheel with the discriminator. We will implement the discriminator available in the tensorflow_examples package. Since the package cannot be downloaded from the Kaggle notebook, we will pull the helper functions from the pix2pix tutorial and make minimal adjustments to work with the packages that have already been imported. Documentation for the package can be found in the sources section, but here is a quote from the documentation on how it works.

* Each block in the discriminator is: Convolution -> Batch normalization -> Leaky ReLU.
* The shape of the output after the last layer is (batch_size, 30, 30, 1).
* Each 30 x 30 image patch of the output classifies a 70 x 70 portion of the input image.
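* The discriminator receives 2 inputs:
     * The input image and the target image, which it should classify as real.
     * The input image and the generated image (the output of the generator), which it should classify as fake.
* The 2 inputs are concatenated together and evaluated.

Note: the last two points describe the paired pix2pix setup. The discriminator below takes a single 256 x 256 image, since CycleGAN images are unpaired.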
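
The 30 x 30 and 70 x 70 figures in the quote can be checked with a little arithmetic. The sketch below is plain Python, not TensorFlow. It walks through the kernel sizes, strides, and padding used by the `Discriminator` in this notebook and computes the output size and the receptive field of one output value. The layer table is transcribed by hand from that function, and the helper names are hypothetical.

```python
import math

# (kind, kernel, stride), in the order the layers are applied
LAYERS = [
    ("conv_same", 4, 2),       # downsample(64, 4)
    ("conv_same", 4, 2),       # downsample(128, 4)
    ("conv_same", 4, 2),       # downsample(256, 4)
    ("zero_pad", None, None),  # ZeroPadding2D adds 1 pixel per side
    ("conv_valid", 4, 1),      # Conv2D(512, 4) with default 'valid' padding
    ("zero_pad", None, None),
    ("conv_valid", 4, 1),      # Conv2D(1, 4) with default 'valid' padding
]

def output_size(size, layers):
    """Spatial size (height or width) after applying every layer."""
    for kind, kernel, stride in layers:
        if kind == "conv_same":
            size = math.ceil(size / stride)
        elif kind == "zero_pad":
            size += 2
        else:  # conv_valid
            size = (size - kernel) // stride + 1
    return size

def receptive_field(layers):
    """Input pixels along one axis that influence a single output value."""
    field, jump = 1, 1
    for kind, kernel, stride in layers:
        if kind == "zero_pad":
            continue
        field += (kernel - 1) * jump
        jump *= stride
    return field

print(output_size(256, LAYERS))  # 30
print(receptive_field(LAYERS))   # 70
```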

In [None]:
#Implement downsample and discriminator helper functions from
#https://www.tensorflow.org/tutorials/generative/pix2pix

def downsample(filters, size, apply_batchnorm=True):
    initializer = tf.random_normal_initializer(0., 0.02)
    result = Sequential()
    result.add(Conv2D(filters, size, strides=2, padding='same',
                      kernel_initializer=initializer, use_bias=False))
    if apply_batchnorm:
        result.add(BatchNormalization())
    result.add(LeakyReLU())
    return result

def Discriminator():
    initializer = tf.random_normal_initializer(0., 0.02)
    inp = Input(shape=[256, 256, 3], name='input_image',batch_size=1)
    down1 = downsample(64, 4, False)(inp)  # (batch_size, 128, 128, 64)
    down2 = downsample(128, 4)(down1)  # (batch_size, 64, 64, 128)
    down3 = downsample(256, 4)(down2)  # (batch_size, 32, 32, 256)
    zero_pad1 = ZeroPadding2D()(down3)  # (batch_size, 34, 34, 256)
    conv = Conv2D(512, 4, strides=1,
                  kernel_initializer=initializer,
                  use_bias=False)(zero_pad1)  # (batch_size, 31, 31, 512)
    batchnorm1 = BatchNormalization()(conv)
    leaky_relu = LeakyReLU()(batchnorm1)
    zero_pad2 = ZeroPadding2D()(leaky_relu)  # (batch_size, 33, 33, 512)
    last = Conv2D(1, 4, strides=1,
                  kernel_initializer=initializer)(zero_pad2)  # (batch_size, 30, 30, 1)
    return keras.Model(inputs=inp, outputs=last)

sample_discriminator=Discriminator()
sample_discriminator.summary()

In [None]:
#discriminator for photo style images
discriminator_p=Discriminator()
#discriminator for monet style images
discriminator_m=Discriminator()

### Define the Loss Function
Generator Loss: How good is the generator at generating images that look real to the discriminator?

Discriminator Loss: How good is the discriminator at determining if an image is generated?

Cycle Consistency Loss: How similar is an image that is transformed to the other style and then transformed back to the original image?

Identity Loss: How similar is the output to the input when the image is already in the generator's target style (e.g. a Monet painting passed into the Monet generator)?
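
For intuition about the magnitudes involved, here is a NumPy sketch of the four losses. It is a simplification rather than the notebook's implementation: the binary cross-entropy on logits is averaged over all patches here, and the function names ending in `_np` are hypothetical.

```python
import numpy as np

LAMBDA = 10  # weight of the cycle consistency term

def bce_with_logits(labels, logits):
    """Mean binary cross-entropy on raw logits, in a numerically stable form."""
    per_patch = (np.maximum(logits, 0) - logits * labels
                 + np.log1p(np.exp(-np.abs(logits))))
    return float(np.mean(per_patch))

def discriminator_loss_np(real_logits, fake_logits):
    real = bce_with_logits(np.ones_like(real_logits), real_logits)
    fake = bce_with_logits(np.zeros_like(fake_logits), fake_logits)
    return 0.5 * (real + fake)

def generator_loss_np(fake_logits):
    # The generator wants its fakes to be labeled as real (1).
    return bce_with_logits(np.ones_like(fake_logits), fake_logits)

def cycle_loss_np(real_image, cycled_image):
    return LAMBDA * float(np.mean(np.abs(real_image - cycled_image)))

def identity_loss_np(real_image, same_image):
    return LAMBDA * 0.5 * float(np.mean(np.abs(real_image - same_image)))

undecided = np.zeros((1, 30, 30, 1))  # logit 0 means "50/50" on every patch
image = np.zeros((1, 256, 256, 3))

print(discriminator_loss_np(undecided, undecided))            # log(2), about 0.693
print(discriminator_loss_np(undecided + 10, undecided - 10))  # near 0: confident and right
print(generator_loss_np(undecided - 10))                      # about 10: fakes are caught
print(cycle_loss_np(image, image + 0.1))                      # 10 * 0.1
print(identity_loss_np(image, image + 0.1))                   # 10 * 0.5 * 0.1
```

A discriminator that cannot tell real from generated sits near log(2), which is a useful reference point when reading loss values during training.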

In [None]:
LAMBDA = 10
loss_obj = tf.keras.losses.BinaryCrossentropy(from_logits=True, reduction=tf.keras.losses.Reduction.NONE)
#Discriminator Loss
def discriminator_loss(real, generated):
    real_loss = loss_obj(tf.ones_like(real), real)
    generated_loss = loss_obj(tf.zeros_like(generated), generated)
    total_disc_loss = real_loss + generated_loss
    return total_disc_loss * 0.5
#Generator Loss
def generator_loss(generated):
    return loss_obj(tf.ones_like(generated), generated)
#Cycle Consistency Loss
def calc_cycle_loss(real_image, cycled_image):
    loss1 = tf.reduce_mean(tf.abs(real_image - cycled_image))
    return LAMBDA * loss1
#Identity Loss
def identity_loss(real_image, same_image):
    loss = tf.reduce_mean(tf.abs(real_image - same_image))
    return LAMBDA * 0.5 * loss

In [None]:
#define optimizers
generator_m_optimizer = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)
generator_p_optimizer = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)

discriminator_m_optimizer = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)
discriminator_p_optimizer = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)

### Link everything together into the CycleGAN model


In [None]:
@tf.function
def train_step(real_photo, real_monet):
    # persistent is set to True because the tape is used more than
    # once to calculate the gradients.
    with tf.GradientTape(persistent=True) as tape:
        # Generator m translates to monet style
        # Generator p translates to photo style
        fake_monet = generator_m(real_photo, training=True)
        cycled_photo = generator_p(fake_monet, training=True)

        fake_photo = generator_p(real_monet, training=True)
        cycled_monet = generator_m(fake_photo, training=True)

        # same_photo and same_monet are used for identity loss.
        same_photo = generator_p(real_photo, training=True)
        same_monet = generator_m(real_monet, training=True)

        disc_real_monet = discriminator_m(real_monet, training=True)
        disc_real_photo = discriminator_p(real_photo, training=True)

        disc_fake_monet = discriminator_m(fake_monet, training=True)
        disc_fake_photo = discriminator_p(fake_photo, training=True)

        # calculate the loss
        gen_monet_loss = generator_loss(disc_fake_monet)
        gen_photo_loss = generator_loss(disc_fake_photo)

        total_cycle_loss = calc_cycle_loss(real_monet, cycled_monet) + calc_cycle_loss(real_photo, cycled_photo)

        # Total generator loss = adversarial loss + cycle loss + identity loss
        total_gen_monet_loss = gen_monet_loss + total_cycle_loss + identity_loss(real_monet, same_monet)
        total_gen_photo_loss = gen_photo_loss + total_cycle_loss + identity_loss(real_photo, same_photo)

        disc_monet_loss = discriminator_loss(disc_real_monet, disc_fake_monet)
        disc_photo_loss = discriminator_loss(disc_real_photo, disc_fake_photo)

    # Calculate the gradients for generator and discriminator
    # (outside the tape context so the gradient ops are not recorded)
    generator_m_gradients = tape.gradient(total_gen_monet_loss,
                                          generator_m.trainable_variables)
    generator_p_gradients = tape.gradient(total_gen_photo_loss,
                                          generator_p.trainable_variables)

    discriminator_m_gradients = tape.gradient(disc_monet_loss,
                                              discriminator_m.trainable_variables)
    discriminator_p_gradients = tape.gradient(disc_photo_loss,
                                              discriminator_p.trainable_variables)

    # Apply the gradients with each optimizer
    generator_m_optimizer.apply_gradients(zip(generator_m_gradients, 
                                            generator_m.trainable_variables))

    generator_p_optimizer.apply_gradients(zip(generator_p_gradients, 
                                            generator_p.trainable_variables))

    discriminator_m_optimizer.apply_gradients(zip(discriminator_m_gradients,
                                                discriminator_m.trainable_variables))

    discriminator_p_optimizer.apply_gradients(zip(discriminator_p_gradients,
                                                discriminator_p.trainable_variables))

## Results and Analysis
Now that we have built the model and defined the loss functions, we need to train the model. I am only using free computational resources to train the model, so the number of epochs trained will be somewhat limited.

### Train the Model
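
One detail of the loop in this section is worth spelling out. `tf.data.Dataset.zip` stops as soon as the shorter dataset is exhausted, so each epoch runs for 300 steps, one per Monet painting. In addition, a shuffle buffer of 300 can only hand out records it has already read, so the photos seen in an epoch come from roughly the first 600 records of the photo stream. The sketch below imitates a fixed-size shuffle buffer in pure Python to show this. It is an illustration of the general mechanism, not TensorFlow's implementation, and the integer ids stand in for images.

```python
import random

def shuffle_buffer(stream, buffer_size, rng):
    """Yield items from `stream` the way a fixed-size shuffle buffer would."""
    buffer = []
    for item in stream:
        if len(buffer) < buffer_size:
            buffer.append(item)  # still filling the buffer
            continue
        index = rng.randrange(buffer_size)
        yield buffer[index]      # hand out a random buffered item
        buffer[index] = item     # and replace it with the new one
    rng.shuffle(buffer)          # stream exhausted: drain what is left
    yield from buffer

rng = random.Random(0)
photo_ids = shuffle_buffer(range(7028), 300, rng)  # photo count from this notebook
monet_ids = shuffle_buffer(range(300), 300, rng)

pairs = list(zip(photo_ids, monet_ids))  # zip stops at the shorter stream
steps_per_epoch = len(pairs)
highest_photo_id = max(photo for photo, _ in pairs)

print(steps_per_epoch)   # 300
print(highest_photo_id)  # always below 600
```

If covering more of the photo set matters, a larger shuffle buffer on the photo dataset would be one thing to try, at the cost of more memory.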

In [None]:
for epoch in range(250):
    start = time.time()
    n = 0
    for real_photo, real_monet in tf.data.Dataset.zip((photos, monets)):
        train_step(real_photo, real_monet)
        if n % 30 == 0:
            print ('.', end='')
        n += 1
    print('Time taken for epoch {} is {} sec\n'.format(epoch + 1, time.time()-start))
    if epoch % 10==0:    
        prediction = generator_m(ex_photo_new[None, ...])
        plt.figure(figsize=(6,6))
        display_list = [ex_photo_new, prediction[0]]
        title = ['Input Image', 'Predicted Image']

        for i in range(2):
            plt.subplot(1, 2, i+1)
            plt.title(title[i])
            # getting the pixel values between [0, 1] to plot it.
            plt.imshow(display_list[i] * 0.5 + 0.5)
            plt.axis('off')
        plt.show()

### Test the Model
Let's feed the model some samples to visualize how it performs. Since our original dataset applied random jittering, we will create a different dataset for visualization and submission where no jittering is applied.

In [None]:
#Implement helper functions to minimally process the images
def minprocess_tfrecord(example):
    tfrecord_format = {
        "image_name": tf.io.FixedLenFeature([], tf.string),
        "image": tf.io.FixedLenFeature([], tf.string),
        "target": tf.io.FixedLenFeature([], tf.string)
    }
    example = tf.io.parse_single_example(example, tfrecord_format)
    #decode
    image = tf.image.decode_jpeg(example['image'], channels=3)
    #normalize so colors on the upper end of the RGB spectrum are not washed out
    image = (tf.cast(image, tf.float32)/ 127.5) - 1
    return image

#implement helper function to reload the datasets
def reload_dataset(filenames, labeled=True, ordered=False):
    dataset = tf.data.TFRecordDataset(filenames)
    dataset = dataset.map(minprocess_tfrecord, num_parallel_calls=tf.data.AUTOTUNE)
    return dataset

reloaded_photos=reload_dataset(photo_files)


In [None]:
def generate_images(model, test_input):
    prediction = model(test_input[None, ...])
    plt.figure(figsize=(6, 6))
    display_list = [test_input, prediction[0]]
    title = ['Input Image', 'Predicted Image']

    for i in range(2):
        plt.subplot(1, 2, i+1)
        plt.title(title[i])
        # getting the pixel values between [0, 1] to plot it.
        plt.imshow(display_list[i] * 0.5 + 0.5)
        plt.axis('off')
    plt.show()
    
for inp in reloaded_photos.take(10):
    generate_images(generator_m, inp)

It appears the model is transforming the images, but the results leave a lot to be desired. Photos will be generated and submitted to Kaggle for a formal evaluation.

In [None]:
! mkdir ../images
i = 1
for img in reloaded_photos:
    prediction = generator_m(img[None,...])[0].numpy()
    prediction = (prediction * 127.5 + 127.5).astype(np.uint8)
    im = PIL.Image.fromarray(prediction)
    im.save("../images/" + str(i) + ".jpg")
    i += 1
shutil.make_archive("/kaggle/working/images", 'zip', "/kaggle/images")

The final score achieved for the project was

## Conclusion
While the model is operational, the output is less than desirable. This might suggest that the ENet architecture is not as well suited for CycleGAN as U-Net or ResNet, but further testing would be required. If I were to develop the project further, I would try implementing an ENet discriminator to see if using a more similar discriminator affected the overall quality of the images produced. I would also test a hybrid ENet/U-Net approach to see if adding skip connections between encoding and decoding layers of the same size helped produce more visually pleasing results. Additionally, I would store the loss values at every step to better diagnose the performance at each epoch. Lastly, I would try to figure out a workaround to achieve more epochs while still using a free cloud environment. This could probably be accomplished by exporting the weights and using them as a starting point to initialize a new model.
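
As a starting point for the loss tracking idea, here is a small pure-Python sketch of a per-step loss log with per-epoch averages. `LossHistory` is a hypothetical helper, and wiring it in would require `train_step` to return its loss values, which the current version does not do.

```python
from collections import defaultdict
from statistics import mean

class LossHistory:
    """Collects named loss values per step and summarizes them per epoch."""

    def __init__(self):
        # epoch -> loss name -> list of per-step values
        self._steps = defaultdict(lambda: defaultdict(list))

    def record(self, epoch, **losses):
        """Store one step's losses, e.g. record(0, gen_monet=1.2)."""
        for name, value in losses.items():
            self._steps[epoch][name].append(float(value))

    def epoch_means(self, epoch):
        """Average of every recorded loss over the given epoch."""
        return {name: mean(values)
                for name, values in self._steps[epoch].items()}

history = LossHistory()
history.record(0, gen_monet=2.0, disc_monet=0.7)
history.record(0, gen_monet=1.0, disc_monet=0.5)

print(history.epoch_means(0))  # gen_monet averages to 1.5, disc_monet to about 0.6
```

Plotting these averages per epoch would make it easier to see whether the generators and discriminators stay balanced during training.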

### Sources
https://www.kaggle.com/code/amyjang/monet-cyclegan-tutorial/notebook

https://www.tensorflow.org/tutorials/generative/dcgan

https://www.tensorflow.org/tutorials/generative/pix2pix

https://www.tensorflow.org/tutorials/generative/cyclegan

https://arxiv.org/pdf/1606.02147.pdf

https://arxiv.org/pdf/1703.10593.pdf

https://arxiv.org/pdf/1611.07004.pdf

https://arxiv.org/pdf/1512.03385.pdf

https://aditi-mittal.medium.com/introduction-to-u-net-and-res-net-for-image-segmentation-9afcb432ee2f

https://arxiv.org/pdf/1909.06840.pdf