# Chapter 4: Generate Patterns




***
“...a GAN is a pair of adversarial deep learning neural networks. The first network, the forger network, tries to generate something that looks real... The other network, the detective network, ... determines if the forger’s output is real or fake.”


-- Kai-Fu Lee, former President of Google China
***


You learned the basic idea behind Generative Adversarial Networks (GANs) in the previous chapter. You then implemented your first GAN by generating pairs of numbers that form an inverted-U shape. We used the inverted-U shape as an example, but you can easily modify the code to generate other shapes such as sine, cosine, or U-shaped curves. The GAN framework can be applied to many different settings, and researchers have used it to generate images, audio, music, video, and more.

To make this point clear, in this chapter you'll learn how to use a GAN to generate a sequence of numbers with a certain pattern. We'll try to generate multiples of five, but you can change the pattern to multiples of two, three, seven, or any other number.

To implement the idea, the key is to convert numbers to a format that neural networks understand. Let's say that we want to generate a sequence of ten numbers, and each is a multiple of five in the range 0 to 99. We cannot feed the ten numbers such as [15, 25, 0, 55...] to a neural network. Neural networks adjust values by small amounts in each iteration and we cannot teach the network to change values from one integer to another. Therefore, we will find a way to treat each of the 100 integer numbers, 0, 1, ..., 99, as a separate choice for the neural network. 

The answer lies in one-hot encoders. An integer between 0 and 99 can be represented by $quotient\times5+remainder$, where the value of $quotient$ ranges from 0 to 19 and the value of $remainder$ ranges from 0 to 4. We'll use a 20-value one-hot variable to represent the quotient and a 5-value one-hot variable to represent the remainder. For example, the one-hot variable [1, 0, 0, 0, 0] indicates that the remainder is 0, while [0, 1, 0, 0, 0] indicates that the remainder is 1, and so on.

To create an integer between 0 and 99, the neural network needs to pick a 20-value one-hot variable and a 5-value one-hot variable to represent the quotient and the remainder, respectively. For example, if the 20-value one-hot variable is
$$[0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]$$
and the 5-value one-hot variable is 
$$[0, 1, 0, 0, 0],$$
the neural network has selected the integer $7\times5+1=36$.
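The arithmetic behind this representation can be checked with plain Python. The sketch below is only an illustration: the helper names `split_number()` and `join_number()` are hypothetical and are not used elsewhere in the chapter.

```python
def split_number(number, base=5):
    """Return (quotient, remainder) with number == quotient*base + remainder."""
    return divmod(number, base)

def join_number(quotient, remainder, base=5):
    """Rebuild the integer from its quotient and remainder."""
    return quotient * base + remainder

# a few sample integers in the range 0 to 99
for n in (0, 36, 75, 99):
    q, r = split_number(n)
    print(n, "->", (q, r), "->", join_number(q, r))
```

For 36, the pair is (7, 1), which matches the two one-hot variables shown above: position 7 in the 20-value variable and position 1 in the 5-value variable.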

If a number is a multiple of five, then the associated 5-value one-hot variable is always 
$[1, 0, 0, 0, 0].$ We'll create a GAN for this purpose and train the model with data. After the model is trained, we'll discard the discriminator and use the generator to generate sequences of numbers. The trained generator always generates multiples of five. 

Start a new cell in ch04.ipynb and execute the following lines of code in it:

In [1]:
import os

os.makedirs("files/ch04", exist_ok=True)

# 1. One-Hot Encoders
In this section, you'll learn how to convert an integer between 0 and 99 to two one-hot variables so that you can feed them to the neural networks. You'll also convert two one-hot variables back to an integer between 0 and 99 so that the result is easy for human beings to understand. In effect, you translate between humans and machines. 

## 1.1. What Are One-Hot Variables?
One-hot variables have value 1 in one place and 0 in all other places. In machine learning, we usually use one-hot encoding to represent *categorical variables*. An example of a categorical variable is the color of a house, which can be red, green, or white. We can use the numbers 0, 1, and 2 to represent the three colors in some machine learning models. However, when dealing with neural networks, we need to convert categorical data to one-hot variables, so we use [1, 0, 0] to represent red, [0, 1, 0] to represent green, and [0, 0, 1] to represent white. 

Below, we define a *onehot_encoder()* function to convert an integer to a one-hot variable. 

In [2]:
import torch

def onehot_encoder(position,depth):
    onehot=torch.zeros((depth,))
    onehot[position]=1
    return onehot

The function takes two arguments: the first argument *position* is the index at which the value is turned on as 1, and the second argument *depth* is the length of the one-hot variable. 

For example, if we print out the value of *onehot_encoder(1,5)*, like this:

In [3]:
print(onehot_encoder(1,5))

tensor([0., 1., 0., 0., 0.])


The result shows a five-value tensor with the second place as 1 and the rest 0s. 
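The same idea applies to the house-color example from the start of this section. The sketch below uses plain Python lists instead of PyTorch tensors so that it runs on its own; the names `COLORS`, `onehot_list()`, and `encode_color()` are hypothetical, and the order of the categories is an arbitrary choice.

```python
# the position of each color in this list defines its one-hot index
COLORS = ["red", "green", "white"]

def onehot_list(position, depth):
    """List-based counterpart of onehot_encoder()."""
    onehot = [0.0] * depth
    onehot[position] = 1.0
    return onehot

def encode_color(color):
    """Map a color name to its one-hot variable."""
    return onehot_list(COLORS.index(color), len(COLORS))

for color in COLORS:
    print(color, encode_color(color))
```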

## 1.2. Convert An Integer to A Pair of One-Hots
Since the pattern we are trying to create is multiples of five, we'll denote an integer as a pair of numbers: the quotient and the remainder. Any integer can be represented as $quotient\times5+remainder$. Since we are dealing with an integer between 0 and 99, the value of $quotient$ ranges from 0 to 19 and the value of $remainder$ ranges from 0 to 4. We'll use a 20-value one-hot variable to represent the quotient and a 5-value one-hot variable to represent the remainder. For simplicity, we'll concatenate the two one-hot variables into a single 25-value tensor, with the 20-value one-hot variable followed by the 5-value one-hot variable. 

The function *int_to_onehots()* is defined as follows:

In [4]:
def int_to_onehots(number):
    # calculate the quotient
    quotient=number//5
    # calculate the remainder
    remainder=number%5
    # convert to onehots
    onehot_quotient=onehot_encoder(quotient,20)
    onehot_remainder=onehot_encoder(remainder,5)
    # concatenate
    combined=torch.cat([onehot_quotient,onehot_remainder])
    return combined

The argument to the function *int_to_onehots()* is an integer between 0 and 99. The function first calculates the quotient and the remainder. It then converts the quotient and remainder into two one-hot variables. Finally, it concatenates the two one-hot variables into a 25-value PyTorch tensor as output. 

Let's use the function to convert 75 to two one-hot variables:

In [5]:
onehots75=int_to_onehots(75)
print(onehots75)

tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0.,
        0., 0., 1., 0., 0., 0., 0.])


The result is a 25-value tensor. The first 20 values form a one-hot variable with the 16th place turned on as 1, and the last 5 values form a one-hot variable with the first place turned on as 1. 

## 1.3. Convert A Pair of One-Hots into An Integer
The function *int_to_onehots()* converts an integer into two one-hot variables, concatenated together. In a way, the function translates human language into machine language. 

Next, we want to translate machine language back to human language. Suppose we have two one-hot variables concatenated together. How can we convert them into an integer? The following function *onehots_to_int()* accomplishes that goal:

In [6]:
def onehots_to_int(onehots):
    # extract quotient and remainder
    onehot_quotient=onehots[:20]
    onehot_remainder=onehots[-5:]    
    quotient=torch.argmax(onehot_quotient)
    remainder=torch.argmax(onehot_remainder)
    # combine the quotient and the remainder into an integer
    number=5*quotient+remainder
    return number.item()

The function *onehots_to_int()* takes an argument *onehots*, which is the concatenated one-hot variables for the quotient and the remainder. The function then converts the one-hot variables into the quotient and the remainder. From there, the function calculates the value of the number as $quotient\times5+remainder$. 

Let's test the function to see what happens if we use the tensor *onehots75* we just created as the input:

In [7]:
print(onehots_to_int(onehots75))

75


The result shows the function converts the one-hot variables to an integer 75, which is the right answer. So the functions are defined properly. 
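A single test value is reassuring, but a round trip over every integer from 0 to 99 is a stronger check. The sketch below re-implements both conversions with plain Python lists so that it is self-contained; `encode()` and `decode()` are hypothetical stand-ins for *int_to_onehots()* and *onehots_to_int()*, not the functions defined above.

```python
def encode(number):
    """Integer 0..99 -> 25-value list (20 for quotient, 5 for remainder)."""
    quotient, remainder = divmod(number, 5)
    onehots = [0.0] * 25
    onehots[quotient] = 1.0
    onehots[20 + remainder] = 1.0
    return onehots

def decode(onehots):
    """25-value list -> integer, using the position of the largest value."""
    q_part, r_part = onehots[:20], onehots[20:]
    quotient = q_part.index(max(q_part))
    remainder = r_part.index(max(r_part))
    return 5 * quotient + remainder

# collect every integer that does not survive the round trip
mismatches = [n for n in range(100) if decode(encode(n)) != n]
print(mismatches)
```

An empty list means every integer survives the conversion to one-hot variables and back.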

# 2. A GAN to Generate Patterns
Our goal is to train a model so that the generator can generate a sequence of 10 integers, all multiples of five. We first generate the training data, and then convert them to machine language in batches. 

## 2.1. Generate A Batch of Training Data
For simplicity, we'll generate a sequence of 10 integers. We'll then convert the sequence into ten machine readable numbers. That is, we'll convert the sequence to ten 25-value tensors. 

The function below generates a sequence of 10 integers, all multiples of five:

In [8]:
import random

def gen_sequence():
    indices = random.sample(range(10), 10)
    values = torch.tensor(indices)*5
    return values    

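We first use the *sample()* function in the *random* library to draw ten distinct numbers between 0 and 9, that is, a random ordering of 0 through 9. We then convert the ten numbers into a PyTorch tensor and multiply it by five, so each sequence contains the multiples of five from 0 to 45. 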
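The sampling step can be reproduced without PyTorch. The sketch below uses plain lists: `gen_sequence_list()` mirrors the logic of *gen_sequence()*, while `gen_sequence_wide()` is a hypothetical variant (not part of the chapter's code) that draws quotients with replacement from 0 to 19, in case you want training values that span the full range up to 95.

```python
import random

def gen_sequence_list():
    """Same sampling logic as gen_sequence(), returning a plain list."""
    indices = random.sample(range(10), 10)  # a permutation of 0..9
    return [5 * i for i in indices]

def gen_sequence_wide(length=10):
    """Hypothetical variant: quotients 0..19, repetition allowed."""
    indices = random.choices(range(20), k=length)
    return [5 * i for i in indices]

seq = gen_sequence_list()
print(sorted(seq))          # every multiple of five from 0 to 45, once each

wide = gen_sequence_wide()
print(all(v % 5 == 0 and 0 <= v <= 95 for v in wide))
```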

Let's try to generate a sequence of training data:

In [9]:
sequence=gen_sequence()
print(sequence)

tensor([45,  0,  5, 40, 35, 10, 20, 30, 25, 15])


The values are all multiples of five. Next, we convert each number to a pair of one-hot variables so that we can feed them to the neural network later. 

In [10]:
def gen_batch():
    sequence=gen_sequence()
    # encode each number, then stack the ten 25-value tensors into one batch
    batch=[int_to_onehots(i) for i in sequence]
    return torch.stack(batch)
batch=gen_batch()
print(batch)

tensor([[1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
         0., 0., 1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
         0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
         0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.,
         0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
         0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
         0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
         0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
         0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 1., 0., 0.,

The function *gen_batch()* creates a batch of ten 25-value tensors to feed to the neural network for training. 

We also define a function *data_to_num()* to convert concatenated one-hot variables to a sequence of integers:

In [11]:
def data_to_num(data):
    multiple=torch.argmax(data[:,:20],dim=-1)
    remainder=torch.argmax(data[:,20:],dim=-1)
    num=multiple*5+remainder
    return num
numbers=data_to_num(batch)
print(numbers)

tensor([ 0,  5, 40, 45, 10, 20, 30, 25, 35, 15])


Next, we'll create two neural networks: one for the discriminator D and one for the generator G. 

## 2.2. The Discriminator D
The discriminator D is a binary classifier, which is very similar to the binary classifier for clothes items we discussed in Chapter 2. Here the discriminator's job is to classify the samples into either real or fake. 

We use PyTorch to create the following discriminator neural network D, like so:

In [12]:
import torch.nn as nn

# determine the device automatically
device="cuda" if torch.cuda.is_available() else "cpu"
# the discriminator D is a binary classifier
D=nn.Sequential(
    nn.Linear(25,1),
    nn.Sigmoid()).to(device)

The input size is 25, so the Linear layer takes 25 input features. The model has no hidden layer: the Linear layer maps the 25 inputs directly to a single output neuron, so the output of the discriminator D is a single value. We use the Sigmoid activation function to squeeze the output to the range [0, 1] so it can be interpreted as the probability p that the sample is real. With complementary probability 1-p, the sample is fake. 
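To see what a single linear layer followed by a sigmoid computes, as in the code cell for D above, here is a plain-Python sketch. The weights and bias are made-up numbers chosen for illustration only (a trained D would have learned values), and `linear_sigmoid()` is a hypothetical helper.

```python
import math

def sigmoid(x):
    """Squeeze any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def linear_sigmoid(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a sigmoid."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

# the 25-value encoding of 75: quotient 15, remainder 0
sample = [0.0] * 25
sample[15] = 1.0
sample[20] = 1.0

weights = [0.1] * 25   # made-up weights, not trained values
bias = 0.3             # made-up bias

p_real = linear_sigmoid(sample, weights, bias)
print(p_real, 1 - p_real)   # probability of real, probability of fake
```

Because the input contains exactly two ones, the weighted sum reduces to the two weights at the active positions plus the bias.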

## 2.3. The Generator G
The generator G's job is to create a sequence of numbers so that they can pass as real in front of the discriminator D. That is, G is trying to create a sequence of numbers to maximize the probability that D thinks the numbers are from the training dataset.

We create the following neural network to represent the generator G:

In [13]:
G=nn.Sequential(
    nn.Linear(100,25),
    nn.ReLU()).to(device)

We'll feed random data from a 100-dimensional latent space, $(z_1, z_2, ..., z_{100})$, to the generator. The generator then generates a tensor of 25 values based on the input from the latent space. Note that we use the ReLU activation function at the output layer, so the outputs are all nonnegative. Since we only use the outputs to pick index values (the position of the largest value in each one-hot segment), nonnegative values are appropriate here. 
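The generator's output is not an exact one-hot variable; it is 25 nonnegative real values, and only the position of the largest value in each segment matters when we decode it. The plain-Python sketch below illustrates this with made-up pre-activation values; `relu()`, `argmax()`, and `decode_output()` are hypothetical helpers that mirror what *nn.ReLU* and *data_to_num()* do for a single sample.

```python
def relu(values):
    """Replace negative values with zero."""
    return [max(0.0, v) for v in values]

def argmax(values):
    """Index of the largest value (first one wins on ties in this sketch)."""
    return values.index(max(values))

def decode_output(output):
    """Decode 25 real values into an integer between 0 and 99."""
    quotient = argmax(output[:20])
    remainder = argmax(output[20:])
    return 5 * quotient + remainder

# made-up values before the activation, for illustration only
raw = [-0.3] * 25
raw[3], raw[7] = 0.4, 1.2      # quotient segment: position 7 is largest
raw[20], raw[21] = 0.2, 0.9    # remainder segment: position 1 is largest

activated = relu(raw)
print(decode_output(activated))   # 7*5 + 1 = 36
```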

## 2.4. Optimizers and the Loss Function

The loss function is binary cross-entropy loss. The discriminator D is trying to maximize the accuracy of the binary classification: identify a real sample as real and a fake sample as fake. The generator, on the other hand, is trying to minimize the probability that the fake sample is being identified as fake. 
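The two opposing objectives come from feeding the same predictions to the loss function with different labels. The sketch below is a simplified plain-Python version of mean binary cross-entropy (it clamps probabilities with a small `eps`, which is an assumption of this sketch rather than the exact behavior of *nn.BCELoss*), applied to made-up discriminator outputs.

```python
import math

def bce(preds, labels, eps=1e-12):
    """Mean binary cross-entropy between predictions and 0/1 labels."""
    total = 0.0
    for p, y in zip(preds, labels):
        p = min(max(p, eps), 1 - eps)   # avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(preds)

# made-up outputs of D on three fake samples (D thinks they look fake)
d_on_fake = [0.1, 0.2, 0.3]

loss_d_fake = bce(d_on_fake, [0.0, 0.0, 0.0])  # D's view: labels are 0
loss_g = bce(d_on_fake, [1.0, 1.0, 1.0])       # G's view: labels are 1

print(loss_d_fake, loss_g)
```

When D confidently rejects the fake samples, its own loss is small while the generator's loss is large, which is what pushes G to produce more convincing samples.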

We'll use the Adam optimizer for both the discriminator and the generator:

In [14]:
loss_fn=nn.BCELoss()
lr=0.001
optimD=torch.optim.Adam(D.parameters(),lr=lr)
optimG=torch.optim.Adam(G.parameters(),lr=lr)

# 3. Train and Use the Model
Now that we have the training data and two networks, we'll train the model. After that, we'll discard the discriminator and use the generator to generate a sequence of ten integers. 

## 3.1. Train the Model
We first create labels of zeros and ones as follows:

In [15]:
real_labels=torch.ones((10,1)).to(device)
fake_labels=torch.zeros((10,1)).to(device)

The real labels are all ones, while the fake labels are all zeros. We'll compare these labels with the predictions from the discriminator to train both the discriminator and the generator. 

We train the model for 10000 epochs, as follows:

In [16]:
for i in range(10000):
    # reset the running losses for this iteration
    gloss=dloss=0
    # train discriminator on real data:
    # generate a batch of real samples (multiples of five)
    true_data=gen_batch().to(device)
    # use 1 as labels since they are real
    preds=D(true_data)
    loss_D=loss_fn(preds,real_labels)
    optimD.zero_grad()
    loss_D.backward()
    optimD.step()
    dloss+=loss_D    
    # train D on fake data
    noise=torch.randn(10,100).to(device)
    generated_data=G(noise)
    # use 0 as labels since they are fake
    preds=D(generated_data)
    loss_D=loss_fn(preds,fake_labels)
    optimD.zero_grad()
    loss_D.backward()
    optimD.step()
    dloss+=loss_D      
    # train G 
    noise=torch.randn(10,100).to(device)
    generated_data=G(noise)
    # use 1 as labels since G wants to fool D
    preds=D(generated_data)
    loss_G=loss_fn(preds,real_labels)
    optimG.zero_grad()
    loss_G.backward()
    optimG.step()
    gloss+=loss_G      
    if i % 100 == 0:
        print(data_to_num(generated_data))

tensor([35, 58, 15, 65, 50, 18, 53, 37, 19, 16], device='cuda:0')
tensor([11, 17, 86, 46, 54, 65, 46, 45,  9, 65], device='cuda:0')
tensor([17,  5, 15, 67,  6, 15, 17, 85, 85, 10], device='cuda:0')
tensor([ 5, 15, 15, 47, 15,  5, 17, 10, 25, 15], device='cuda:0')
tensor([25, 45, 10,  5, 45,  5, 45, 15, 50, 47], device='cuda:0')
tensor([40,  0, 35, 35, 15, 25,  0, 40,  5,  0], device='cuda:0')
tensor([35, 40, 35, 40,  0, 40, 35, 30, 35, 40], device='cuda:0')
tensor([20,  0,  0,  0, 25, 30,  0,  0, 35, 20], device='cuda:0')
tensor([ 0,  0, 10, 20,  0, 10, 35, 35, 30,  0], device='cuda:0')
tensor([10, 10, 30, 30,  0,  0, 10, 10, 45, 10], device='cuda:0')
tensor([10, 30, 30, 10,  5, 10,  0, 30, 45,  5], device='cuda:0')
tensor([10, 10, 25, 25,  5, 45, 25, 10, 25, 10], device='cuda:0')
tensor([40,  5, 45, 35, 25,  5, 15, 25, 25, 40], device='cuda:0')
tensor([45, 25, 35, 35, 35, 40, 40, 15, 20, 15], device='cuda:0')
tensor([20,  0, 20, 35, 20, 20, 35, 20,  0, 40], device='cuda:0')
tensor([ 0

In each iteration, we generate a batch of ten numbers. We first train the discriminator using the real samples. After that, the generator creates a batch of fake samples and we use them to train the discriminator D again. Finally, we let the generator create a batch of fake samples again, but we use them to train the generator instead.

In the first few hundred epochs, the generator still generates numbers that are not multiples of five. But after a few hundred epochs, all the numbers generated are multiples of five. 
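One way to quantify this progress is to compute the share of generated numbers that are multiples of five. The sketch below applies a hypothetical helper, `fraction_multiples()`, to the first and sixth lines of the training output printed above.

```python
def fraction_multiples(numbers, base=5):
    """Share of the numbers that are divisible by base."""
    return sum(1 for n in numbers if n % base == 0) / len(numbers)

# first and sixth lines of the printed training output
early = [35, 58, 15, 65, 50, 18, 53, 37, 19, 16]
later = [40, 0, 35, 35, 15, 25, 0, 40, 5, 0]

print(fraction_multiples(early), fraction_multiples(later))
# prints 0.4 1.0
```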

## 3.2. Use the Trained Model
We'll discard the discriminator and save the trained generator in the local folder, as follows:

In [17]:
# Export to TorchScript
scripted = torch.jit.script(G) 
# Save the model
import os
os.makedirs("files/ch04", exist_ok=True)
scripted.save('files/ch04/num_gen.pt') 

We have now saved the generator to the local folder. To use the generator, we don't even need to define the model. We simply load up the model and use it to generate data points as follows: 

In [18]:
new_G=torch.jit.load('files/ch04/num_gen.pt',
                     map_location=device)
new_G.eval()

RecursiveScriptModule(
  original_name=Sequential
  (0): RecursiveScriptModule(original_name=Linear)
  (1): RecursiveScriptModule(original_name=ReLU)
)

The generator is now reloaded. We can use it to generate a sequence of ten integers as follows:

In [19]:
# obtain inputs from the latent space
noise=torch.randn((10,100)).to(device)
# feed the input to the generator 
new_data=new_G(noise) 
print(data_to_num(new_data))

tensor([25,  0, 30, 40, 25, 35, 10, 30, 10,  0], device='cuda:0')


The generated numbers are all multiples of five. So the trained model really works! 

You can easily change the code to generate other patterns such as odd numbers, even numbers, multiples of three, and so on. I'll leave that as an exercise for you.
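As a starting point for multiples of a different number, note that the sizes of the two one-hot variables depend on the base. The sketch below is a hedged, plain-Python generalization of the encoding; `make_codec()` is a hypothetical helper, and the assumption is that you would also change the layer sizes in D and G (25 in this chapter) to the returned width.

```python
def make_codec(base, upper=99):
    """Build encode/decode functions for integers 0..upper in a given base."""
    q_depth = upper // base + 1        # number of possible quotients

    def encode(number):
        quotient, remainder = divmod(number, base)
        onehots = [0.0] * (q_depth + base)
        onehots[quotient] = 1.0
        onehots[q_depth + remainder] = 1.0
        return onehots

    def decode(onehots):
        q_part, r_part = onehots[:q_depth], onehots[q_depth:]
        quotient = q_part.index(max(q_part))
        remainder = r_part.index(max(r_part))
        return base * quotient + remainder

    return encode, decode, q_depth + base

# multiples of three: 34 possible quotients plus 3 possible remainders
encode3, decode3, width3 = make_codec(3)
print(width3, decode3(encode3(42)))
# prints 37 42
```

With a base of five, the same helper gives a width of 25, which matches the encoding used throughout this chapter.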