**BRICKKI BRICKS AND THE FORGERS**

It’s your first day at your new job as head of quality
control for Brickki, a company that specializes in
producing high-quality building blocks of all shapes and
sizes (Figure 4-1).

You are immediately alerted to a problem with some of
the items coming off the production line. A competitor
has started to make counterfeit copies of Brickki bricks
and has found a way to mix them into the bags received
by your customers. You decide to become an expert at
telling the difference between the counterfeit bricks and
the real thing, so that you can intercept the forged
bricks on the production line before they are given to
customers. Over time, by listening to customer
feedback, you gradually become more adept at spotting
the fakes.

The forgers are not happy about this. They react to your
improved detection abilities by changing their forgery
process, so that the difference between the real bricks and
the fakes is now even harder for you to spot.

Not one to give up, you retrain yourself to identify the
more sophisticated fakes and try to keep one step ahead
of the forgers. This process continues, with the forgers
iteratively updating their brick creation technologies
while you try to become increasingly accomplished
at intercepting their fakes.

With every week that passes, it becomes more and more
difficult to tell the difference between the real Brickki
bricks and those created by the forgers. It seems that
this simple game of cat and mouse is enough to drive
significant improvement in both the quality of the
forgery and the quality of the detection.


A GAN is a battle between two adversaries, the generator
and the discriminator. The generator tries to convert
random noise into observations that look as if they have
been sampled from the original dataset, and the
discriminator tries to predict whether an observation
comes from the original dataset or is one of the generator’s
forgeries.

In [57]:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras import utils, layers, losses, metrics, models, optimizers, callbacks

In [2]:
folder = 'C:\\Users\\Whitebox\\Desktop\\envs_and_git_repos\\generative_models\\data\\lego bricks\\dataset'

In [3]:
train_data = utils.image_dataset_from_directory(
                                        folder
                                        ,labels=None
                                        ,color_mode='grayscale'
                                        ,image_size=(64,64)
                                        ,batch_size=128
                                        ,shuffle=True
                                        ,seed=42
                                        ,interpolation='bilinear'
                                        )

Found 40000 files belonging to 1 classes.


In [6]:
def preprocess(img):
    # Scale pixel values from [0, 255] to [-1, 1]
    img = (tf.cast(img, "float32") - 127.5) / 127.5
    return img

train = train_data.map(preprocess)
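
The scaling in `preprocess` maps pixel values from [0, 255] to [-1, 1], which is also the range of a `tanh` output, and multiplying by 127.5 then adding 127.5 undoes it. A minimal sketch of that arithmetic in plain Python follows; the helper names `to_model_range` and `to_pixel_range` are illustrative and are not part of the notebook.

```python
def to_model_range(pixel):
    """Map a pixel value in [0, 255] to [-1, 1], as preprocess does."""
    return (pixel - 127.5) / 127.5


def to_pixel_range(value):
    """Inverse mapping: take a value in [-1, 1] back to [0, 255]."""
    return value * 127.5 + 127.5


# Black, mid-grey and white pixels, scaled and then restored.
for pixel in (0, 127.5, 255):
    scaled = to_model_range(pixel)
    print(pixel, "->", scaled, "->", to_pixel_range(scaled))
```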

#### The Discriminator
The goal of the discriminator is to predict if an image is
real or fake. This is a supervised image classification
problem, so we can use a similar architecture to those we
worked with in Chapter 2: stacked convolutional layers,
with a single output node.

In [34]:
KERNEL_SIZE = 4
FEATURE_SIZE = 64
STRIDES = 2
CHANNELS=1
IMG_SIZE = 64

The Keras model below defines the discriminator: it takes
an input image and outputs a single number between 0 and 1.

The output of the last convolutional layer is flattened. By
this point, the shape of the tensor is 1 × 1 × 1, so there is
no need for a final Dense layer.
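
The 1 × 1 × 1 shape can be checked with a little arithmetic. The sketch below applies the standard Keras output-size rules for `'same'` and `'valid'` padding to the layer settings used by the discriminator in this section; the function name `conv_output_size` and the `layer_specs` list are illustrative, not part of the notebook.

```python
import math


def conv_output_size(size, kernel, stride, padding):
    """Spatial output size of a Keras-style Conv2D layer along one axis."""
    if padding == "same":
        return math.ceil(size / stride)
    if padding == "valid":
        return (size - kernel) // stride + 1
    raise ValueError(f"unknown padding: {padding}")


# (kernel, stride, padding) per layer: four stride-2 'same' convolutions
# followed by one stride-1 'valid' convolution, all with a 4x4 kernel.
layer_specs = [(4, 2, "same")] * 4 + [(4, 1, "valid")]

size = 64  # IMG_SIZE
sizes = [size]
for kernel, stride, padding in layer_specs:
    size = conv_output_size(size, kernel, stride, padding)
    sizes.append(size)

print(sizes)  # [64, 32, 16, 8, 4, 1]
```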

In [59]:
discriminator_input = layers.Input(shape=(IMG_SIZE,IMG_SIZE,CHANNELS), name='discriminator_input')
x = layers.Conv2D(filters=FEATURE_SIZE, kernel_size=KERNEL_SIZE, padding='same',strides=STRIDES,use_bias=False)(discriminator_input)
x = layers.LeakyReLU(.2)(x)
x = layers.Dropout(0.3)(x)

x = layers.Conv2D(filters=FEATURE_SIZE*2, kernel_size=KERNEL_SIZE, padding='same',strides=STRIDES,use_bias=False)(x)
x = layers.BatchNormalization(momentum=0.9)(x)
x = layers.LeakyReLU(.2)(x)
x = layers.Dropout(0.3)(x)

x = layers.Conv2D(filters=FEATURE_SIZE*4, kernel_size=KERNEL_SIZE, padding='same',strides=STRIDES,use_bias=False)(x)
x = layers.BatchNormalization(momentum=0.9)(x)
x = layers.LeakyReLU(.2)(x)
x = layers.Dropout(0.3)(x)

x = layers.Conv2D(filters=FEATURE_SIZE*8, kernel_size=KERNEL_SIZE, padding='same',strides=STRIDES,use_bias=False)(x)
x = layers.BatchNormalization(momentum=0.9)(x)
x = layers.LeakyReLU(.2)(x)
x = layers.Dropout(0.3)(x)

x = layers.Conv2D(filters=1, kernel_size=KERNEL_SIZE, padding='valid',strides=1,activation='sigmoid',use_bias=False)(x)
discriminator_output = layers.Flatten()(x)

discriminator = models.Model(discriminator_input,discriminator_output)
discriminator.summary()



Model: "model_23"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 discriminator_input (InputL  [(None, 64, 64, 1)]      0         
 ayer)                                                           
                                                                 
 conv2d_95 (Conv2D)          (None, 32, 32, 64)        1024      
                                                                 
 leaky_re_lu_76 (LeakyReLU)  (None, 32, 32, 64)        0         
                                                                 
 dropout_76 (Dropout)        (None, 32, 32, 64)        0         
                                                                 
 conv2d_96 (Conv2D)          (None, 16, 16, 128)       131072    
                                                                 
 batch_normalization_78 (Bat  (None, 16, 16, 128)      512       
 chNormalization)                                         
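
The parameter counts in the summary can be reproduced by hand. A convolution without bias has kernel × kernel × input channels × output channels weights, and batch normalization stores four values per channel (scale, offset, moving mean and moving variance). A small sketch, with helper names that are illustrative only:

```python
def conv2d_params(kernel, in_channels, out_channels, use_bias=False):
    """Number of parameters in a square-kernel 2D convolution."""
    weights = kernel * kernel * in_channels * out_channels
    return weights + (out_channels if use_bias else 0)


def batchnorm_params(channels):
    """gamma, beta, moving mean and moving variance: one each per channel."""
    return 4 * channels


print(conv2d_params(4, 1, 64))    # first Conv2D: 1024
print(conv2d_params(4, 64, 128))  # second Conv2D: 131072
print(batchnorm_params(128))      # first BatchNormalization: 512
```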

#### The Generator
The input to the generator will be a vector drawn from a
multivariate standard normal distribution. The output is an
image of the same size as an image in the original training
data.

This description may remind you of the decoder in a
variational autoencoder. In fact, the generator of a GAN
fulfills exactly the same purpose as the decoder of a VAE:
converting a vector in the latent space to an image. The
concept of mapping from a latent space back to the original
domain is very common in generative modeling, as it gives
us the ability to manipulate vectors in the latent space to
change high-level features of images in the original domain

In [64]:
Z_DIM = 100
EPOCHS=300


Notice how we use a stride of 2 in some of the
Conv2DTranspose layers to increase the spatial shape of the
tensor as it passes through the network (1 in the original
vector, then 4, 8, 16, 32, and finally 64), while decreasing
the number of channels (512 then 256, 128, 64, and finally
1 to match the grayscale output).
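
This progression of shapes can be checked with the Keras output-size rules for transposed convolutions. The sketch below assumes a 4 × 4 kernel in every layer, as set by `KERNEL_SIZE`; the function name and the `layer_specs` list are illustrative.

```python
def conv_transpose_output_size(size, kernel, stride, padding):
    """Spatial output size of a Keras-style Conv2DTranspose along one axis."""
    if padding == "same":
        return size * stride
    if padding == "valid":
        return size * stride + max(kernel - stride, 0)
    raise ValueError(f"unknown padding: {padding}")


# (filters, kernel, stride, padding) for each Conv2DTranspose layer.
layer_specs = [
    (512, 4, 1, "valid"),
    (256, 4, 2, "same"),
    (128, 4, 2, "same"),
    (64, 4, 2, "same"),
    (1, 4, 2, "same"),
]

size, channels = 1, 100  # the reshaped latent vector is 1 x 1 x Z_DIM
shapes = [(size, size, channels)]
for filters, kernel, stride, padding in layer_specs:
    size = conv_transpose_output_size(size, kernel, stride, padding)
    shapes.append((size, size, filters))

for shape in shapes:
    print(shape)
```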


In [60]:
generator_input = layers.Input(shape=(Z_DIM,), name='generator_input')
x = layers.Reshape(target_shape=(1,1,Z_DIM))(generator_input)
x = layers.Conv2DTranspose(filters=512, kernel_size=KERNEL_SIZE,padding='valid',strides=1,use_bias=False)(x)
x = layers.BatchNormalization(momentum=0.9)(x)
# ReLU's first positional argument is max_value, so the 0.2 slope is passed by keyword
x = layers.ReLU(negative_slope=0.2)(x)

x = layers.Conv2DTranspose(filters=256, kernel_size=KERNEL_SIZE,padding='same',strides=STRIDES,use_bias=False)(x)
x = layers.BatchNormalization(momentum=0.9)(x)
x = layers.ReLU(negative_slope=0.2)(x)

x = layers.Conv2DTranspose(filters=128, kernel_size=KERNEL_SIZE,padding='same',strides=STRIDES,use_bias=False)(x)
x = layers.BatchNormalization(momentum=0.9)(x)
x = layers.ReLU(negative_slope=0.2)(x)

x = layers.Conv2DTranspose(filters=64, kernel_size=KERNEL_SIZE,padding='same',strides=STRIDES,use_bias=False)(x)
x = layers.BatchNormalization(momentum=0.9)(x)
x = layers.ReLU(negative_slope=0.2)(x)

generator_output = layers.Conv2DTranspose(filters=1, kernel_size=4,padding='same',
                    strides=STRIDES,activation='tanh',use_bias=False)(x)
generator = models.Model(generator_input,generator_output)
generator.summary()


Model: "model_24"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 generator_input (InputLayer  [(None, 100)]            0         
 )                                                               
                                                                 
 reshape_9 (Reshape)         (None, 1, 1, 100)         0         
                                                                 
 conv2d_transpose_26 (Conv2D  (None, 4, 4, 512)        819200    
 Transpose)                                                      
                                                                 
 batch_normalization_81 (Bat  (None, 4, 4, 512)        2048      
 chNormalization)                                                
                                                                 
 re_lu_18 (ReLU)             (None, 4, 4, 512)         0         
                                                          

#### Training the DCGAN

The key to understanding GANs lies in understanding the
training process for the generator and discriminator.

- We can train the discriminator by creating a training set
where some of the images are real observations from the
training set and some are fake outputs from the generator.
We then treat this as a supervised learning problem, where
the labels are 1 for the real images and 0 for the fake
images, with binary cross-entropy as the loss function.
- How should we train the generator? We need to find a way
of scoring each generated image so that it can optimize
toward high-scoring images. Luckily, we have a
discriminator that does exactly that! We can generate a
batch of images and pass these through the discriminator
to get a score for each image. The loss function for the
generator is then simply the binary cross-entropy between
these probabilities and a vector of ones, because we want
to train the generator to produce images that the
discriminator thinks are real.
- Crucially, we must alternate the training of these two
networks, making sure that we only update the weights of
one network at a time. For example, during the generator
training process, only the generator’s weights are updated.
If we allowed the discriminator’s weights to change as well,
the discriminator would just adjust so that it is more likely
to predict the generated images to be real, which is not the
desired outcome. We want generated images to be
predicted close to 1 (real) because the generator is strong,
not because the discriminator is weak.
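
The two loss functions described in this list can be worked through on a tiny batch. The sketch below uses plain Python and made-up discriminator outputs (two real images, two generated images); the numbers and the helper `binary_cross_entropy` are illustrative and are not taken from the notebook.

```python
import math


def binary_cross_entropy(labels, predictions, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and probabilities."""
    total = 0.0
    for y, p in zip(labels, predictions):
        p = min(max(p, eps), 1 - eps)  # keep log() away from 0
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)


# Hypothetical discriminator outputs for two real and two generated images.
real_predictions = [0.9, 0.8]
fake_predictions = [0.2, 0.1]

# Discriminator: real images are labelled 1, generated images 0.
d_loss = (
    binary_cross_entropy([1, 1], real_predictions)
    + binary_cross_entropy([0, 0], fake_predictions)
) / 2

# Generator: the same fake predictions, scored against a vector of ones.
g_loss = binary_cross_entropy([1, 1], fake_predictions)

print(f"d_loss={d_loss:.3f}  g_loss={g_loss:.3f}")
```

In this example the discriminator is confident and mostly correct, so its loss is small while the generator's loss is large. As the generator improves, the predictions for generated images move toward 1 and the generator loss falls.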


In [61]:
class DCGAN(models.Model):
    def __init__(self, discriminator, generator,latent_dim):
        super(DCGAN,self).__init__()
        self.discriminator = discriminator
        self.generator = generator
        self.latent_dim = latent_dim
    
    def compile(self, d_optimizer, g_optimizer):
        super(DCGAN,self).compile()
        self.loss_fn = losses.BinaryCrossentropy()
        self.d_optimizer = d_optimizer
        self.g_optimizer = g_optimizer
        self.d_loss_metric = metrics.Mean(name='d_loss')
        self.g_loss_metric = metrics.Mean(name='g_loss')


    @property
    def metrics(self):
        return [self.d_loss_metric,self.g_loss_metric]

    def train_step(self,real_images):
        batch_size = tf.shape(real_images)[0]
        random_latent_vectors = tf.random.normal(shape=(batch_size,self.latent_dim))

        with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
            generated_images = self.generator(random_latent_vectors,training=True)
            real_predictions = self.discriminator(real_images,training=True)
            fake_predictions = self.discriminator(generated_images,training=True)
            real_labels = tf.ones_like(real_predictions)
            real_noisy_labels = real_labels + 0.1 * tf.random.uniform(tf.shape(real_predictions))
            fake_labels = tf.zeros_like(fake_predictions)
            fake_noisy_labels = fake_labels + 0.1 * tf.random.uniform(tf.shape(fake_predictions))

            d_real_loss= self.loss_fn(real_noisy_labels,real_predictions)
            d_fake_loss = self.loss_fn(fake_noisy_labels,fake_predictions)
            d_loss = (d_real_loss + d_fake_loss)/2.0

            g_loss = self.loss_fn(real_labels,fake_predictions)

        gradient_of_discriminator = disc_tape.gradient(d_loss,self.discriminator.trainable_variables)
        gradient_of_generator = gen_tape.gradient(g_loss,self.generator.trainable_variables)

        self.d_optimizer.apply_gradients(zip(gradient_of_discriminator,self.discriminator.trainable_variables))
        self.g_optimizer.apply_gradients(zip(gradient_of_generator,self.generator.trainable_variables))

        self.d_loss_metric.update_state(d_loss)
        self.g_loss_metric.update_state(g_loss)

        return {m.name: m.result() for m in self.metrics}

dcgan = DCGAN(discriminator,generator,latent_dim=Z_DIM)
dcgan.compile(d_optimizer = optimizers.Adam(learning_rate=0.0002,beta_1=0.5,beta_2=0.999),
              g_optimizer = optimizers.Adam(learning_rate=0.0002,beta_1=0.5,beta_2=0.999))    


In [62]:
# Create a model save checkpoint
model_checkpoint_callback = callbacks.ModelCheckpoint(
    filepath="./checkpoint/checkpoint.ckpt",
    save_weights_only=True,
    save_freq="epoch",
    verbose=0,
)

tensorboard_callback = callbacks.TensorBoard(log_dir="./logs")


class ImageGenerator(callbacks.Callback):
    def __init__(self, num_img, latent_dim):
        super().__init__()
        self.num_img = num_img
        self.latent_dim = latent_dim

    def on_epoch_end(self, epoch, logs=None):
        random_latent_vectors = tf.random.normal(
            shape=(self.num_img, self.latent_dim)
        )
        generated_images = self.model.generator(random_latent_vectors)
        generated_images = generated_images * 127.5 + 127.5
        generated_images = generated_images.numpy()
        display(
            generated_images,
            save_to="./dcgan_output/generated_img_%03d.png" % (epoch),
        )
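
The callback calls a `display` helper with a `save_to` argument, but no such function is defined in the cells shown here. The following is a hypothetical stand-in, not the original helper: it assumes the images have already been rescaled to [0, 255], plots up to `n` of them in a row with matplotlib, and creates the output folder if it is missing. The parameter names other than `save_to` are assumptions.

```python
import os

import matplotlib.pyplot as plt
import numpy as np


def display(images, n=10, size=(20, 3), cmap="gray_r", save_to=None):
    """Plot up to n images in one row and optionally save the figure."""
    images = np.asarray(images)
    # Drop a trailing channel axis of size 1 so imshow receives 2-D arrays.
    if images.ndim == 4 and images.shape[-1] == 1:
        images = images[..., 0]
    n = min(n, len(images))
    fig = plt.figure(figsize=size)
    for i in range(n):
        ax = fig.add_subplot(1, n, i + 1)
        # Assumes pixel values in [0, 255], as produced by the callback.
        ax.imshow(images[i], cmap=cmap, vmin=0, vmax=255)
        ax.axis("off")
    if save_to:
        out_dir = os.path.dirname(save_to)
        if out_dir:
            os.makedirs(out_dir, exist_ok=True)
        fig.savefig(save_to)
        print(f"Saved to {save_to}")
    plt.show()
    plt.close(fig)
```

Because the callback looks up `display` only when an epoch ends, defining the helper in any cell that runs before `fit` is sufficient.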

In [65]:
dcgan.fit(
    train,
    epochs=EPOCHS,
    callbacks=[
        model_checkpoint_callback,
        tensorboard_callback,
        ImageGenerator(num_img=10, latent_dim=Z_DIM),
    ],
)


In [None]:
# Save the final models
generator.save("./models/generator")
discriminator.save("./models/discriminator")

#### Generate New Images

In [None]:
# Sample some points in the latent space, from the standard normal distribution
grid_width, grid_height = (10, 3)
z_sample = np.random.normal(size=(grid_width * grid_height, Z_DIM))

In [None]:
reconstructions = generator.predict(z_sample)


In [None]:
# Draw a plot of decoded images
fig = plt.figure(figsize=(18, 5))
fig.subplots_adjust(hspace=0.4, wspace=0.4)

# Output the grid of generated bricks
for i in range(grid_width * grid_height):
    ax = fig.add_subplot(grid_height, grid_width, i + 1)
    ax.axis("off")
    ax.imshow(reconstructions[i, :, :], cmap="Greys")

In [None]:
def compare_images(img1, img2):
    return np.mean(np.abs(img1 - img2))

all_data = []
for i in train.as_numpy_iterator():
    all_data.extend(i)
all_data = np.array(all_data)    

In [None]:
r, c = 3, 5
fig, axs = plt.subplots(r, c, figsize=(10, 6))
fig.suptitle("Generated images", fontsize=20)

noise = np.random.normal(size=(r * c, Z_DIM))
gen_imgs = generator.predict(noise)

cnt = 0
for i in range(r):
    for j in range(c):
        axs[i, j].imshow(gen_imgs[cnt], cmap="gray_r")
        axs[i, j].axis("off")
        cnt += 1

plt.show()

In [None]:
fig, axs = plt.subplots(r, c, figsize=(10, 6))
fig.suptitle("Closest images in the training set", fontsize=20)

cnt = 0
for i in range(r):
    for j in range(c):
        c_diff = float("inf")
        c_img = None
        for k in all_data:
            diff = compare_images(gen_imgs[cnt], k)
            if diff < c_diff:
                c_img = np.copy(k)
                c_diff = diff
        axs[i, j].imshow(c_img, cmap="gray_r")
        axs[i, j].axis("off")
        cnt += 1

plt.show()
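
The nested Python loops above compare every generated image with every training image one pair at a time, which can be slow on a large dataset. A possible alternative, sketched below with NumPy, computes the same mean absolute difference against the whole training set in one broadcasted pass per generated image. The function name `closest_training_indices` and the toy arrays are hypothetical, and each pass allocates a temporary array as large as the training set.

```python
import numpy as np


def closest_training_indices(generated, training):
    """Index of the nearest training image (mean absolute difference)
    for each generated image."""
    gen_flat = generated.reshape(len(generated), -1)
    train_flat = training.reshape(len(training), -1)
    nearest = np.empty(len(gen_flat), dtype=int)
    for i, g in enumerate(gen_flat):
        # One broadcasted pass over the whole training set.
        nearest[i] = np.abs(train_flat - g).mean(axis=1).argmin()
    return nearest


# Toy data: 50 random "training images" and two near copies of them.
rng = np.random.default_rng(0)
training = rng.uniform(-1, 1, size=(50, 8, 8, 1)).astype("float32")
generated = training[[3, 17]] + 0.01

print(closest_training_indices(generated, training).tolist())  # [3, 17]
```

With the notebook's arrays, `all_data[closest_training_indices(gen_imgs, all_data)]` would give the images to plot.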