Ref) https://github.com/adityashukzy/Vanilla-GAN/blob/main/VanillaGAN.ipynb

###Inquiry)
Goodfellow introduced GANs in 2014. I am looking for Python source code that is as close as possible to Goodfellow's original implementation. Searching the internet, I could only find Radford's version (ICLR 2016), which is built on deep convolutional layers. Can you help me find Goodfellow's version? Ideally it should run on TensorFlow 2.0 or later.

###ChatGPT)
Ian Goodfellow's original 2014 Generative Adversarial Network (GAN) used fully connected neural networks (multilayer perceptrons) for both the generator and the discriminator. The original implementation was written in Theano, but there are Python implementations that closely follow this architecture and can be ported to TensorFlow 2.x.

One such implementation is available on GitHub:   
Vanilla-GAN: This repository offers a straightforward PyTorch implementation of Goodfellow's original GAN architecture applied to the MNIST dataset.
###Ref) https://github.com/adityashukzy/Vanilla-GAN/blob/main/VanillaGAN.ipynb
To adapt this PyTorch code to TensorFlow 2.x, you can follow these steps:

1. Define the Generator and Discriminator Models: Use TensorFlow's tf.keras.Sequential to create the models with fully connected layers, mirroring the original architecture.
2. Implement the Training Loop: Write the adversarial training process so that each iteration updates the discriminator first and then the generator.
3. Utilize TensorFlow's Optimizers and Loss Functions: Use TensorFlow's built-in optimizers and loss functions to handle the optimization and loss calculations.

By following these steps, you can recreate Goodfellow's original GAN using TensorFlow 2.x, enabling you to experiment with and understand the foundational GAN architecture.
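The three steps above could be sketched in TensorFlow 2.x roughly as follows. This is a minimal sketch under stated assumptions, not the repository author's code: the layer sizes, dropout rate, and learning rate are copied from the PyTorch cells in this notebook, while the names `NOISE_DIM`, `leaky`, `build_generator`, `build_discriminator`, `bce`, `g_opt`, `d_opt`, and `train_step` are hypothetical. The LeakyReLU slope is set to 0.01 explicitly to match the default of PyTorch's `nn.LeakyReLU()`.

```python
import tensorflow as tf

NOISE_DIM = 100  # same noise length as the PyTorch notebook


def leaky(x):
    # PyTorch's nn.LeakyReLU() defaults to a negative slope of 0.01
    return tf.nn.leaky_relu(x, alpha=0.01)


def build_generator():
    # Step 1a: fully connected generator, 100 -> 256 -> 512 -> 1024 -> 784
    return tf.keras.Sequential([
        tf.keras.layers.Dense(256, activation=leaky),
        tf.keras.layers.Dense(512, activation=leaky),
        tf.keras.layers.Dense(1024, activation=leaky),
        tf.keras.layers.Dense(784, activation="tanh"),
    ])


def build_discriminator():
    # Step 1b: fully connected discriminator, 784 -> 1024 -> 512 -> 256 -> 1
    return tf.keras.Sequential([
        tf.keras.layers.Dense(1024, activation=leaky),
        tf.keras.layers.Dropout(0.3),
        tf.keras.layers.Dense(512, activation=leaky),
        tf.keras.layers.Dropout(0.3),
        tf.keras.layers.Dense(256, activation=leaky),
        tf.keras.layers.Dropout(0.3),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])


generator = build_generator()
discriminator = build_discriminator()

# Step 3: built-in loss and optimizers
bce = tf.keras.losses.BinaryCrossentropy()
g_opt = tf.keras.optimizers.Adam(2e-4)
d_opt = tf.keras.optimizers.Adam(2e-4)


def train_step(real_images):
    """Step 2: one discriminator update followed by one generator update."""
    batch = tf.shape(real_images)[0]

    # Discriminator update: fake images are created outside the tape,
    # so no gradient flows back into the generator here
    fake = generator(tf.random.normal([batch, NOISE_DIM]), training=True)
    with tf.GradientTape() as tape:
        real_pred = discriminator(real_images, training=True)
        fake_pred = discriminator(fake, training=True)
        d_loss = (bce(tf.ones_like(real_pred), real_pred)
                  + bce(tf.zeros_like(fake_pred), fake_pred))
    grads = tape.gradient(d_loss, discriminator.trainable_variables)
    d_opt.apply_gradients(zip(grads, discriminator.trainable_variables))

    # Generator update: label the fakes as real (1) so the generator learns to fool D
    noise = tf.random.normal([batch, NOISE_DIM])
    with tf.GradientTape() as tape:
        fake_pred = discriminator(generator(noise, training=True), training=True)
        g_loss = bce(tf.ones_like(fake_pred), fake_pred)
    grads = tape.gradient(g_loss, generator.trainable_variables)
    g_opt.apply_gradients(zip(grads, generator.trainable_variables))

    return d_loss, g_loss
```

Calling `train_step` on a batch of flattened MNIST images scaled to [-1, 1] with shape (batch, 784) would run one training iteration. Wrapping it in `tf.function` is a possible speed-up that is left out here to keep the sketch simple.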

In [1]:
!pip install torchvision tensorboardx matplotlib numpy

Collecting tensorboardx
  Downloading tensorboardX-2.6.2.2-py2.py3-none-any.whl.metadata (5.8 kB)
Downloading tensorboardX-2.6.2.2-py2.py3-none-any.whl (101 kB)
Installing collected packages: tensorboardx
Successfully installed tensorboardx-2.6.2.2


In [2]:
import torch
from torch import nn, optim
from torch import Tensor
from torchvision import transforms, datasets
import matplotlib.pyplot as plt

In [3]:
def preprocessData():
    trans = transforms.Compose([
        transforms.ToTensor(), # converts pixel values from [0, 255] to floats in [0, 1]
        transforms.Normalize((.5,), (.5,)) # (x - 0.5) / 0.5 maps [0, 1] to [-1, 1], the range of the generator's Tanh output
    ])
    dataset_dir = '/content/MNISTdataset'
    return datasets.MNIST(root = dataset_dir, train = True, transform = trans, download = True)
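
As a quick check of the arithmetic behind the two transforms, the sketch below applies the same scaling by hand to a black and a white pixel. The helper name `to_minus_one_one` is hypothetical and only mimics what `ToTensor` followed by `Normalize((.5,), (.5,))` does to pixel values; it is not part of torchvision.

```python
import torch


def to_minus_one_one(pixels_uint8):
    # ToTensor step: scale integer pixels in [0, 255] to floats in [0, 1]
    scaled = pixels_uint8.float() / 255.0
    # Normalize((.5,), (.5,)) step: (x - mean) / std with mean = std = 0.5
    return (scaled - 0.5) / 0.5


pixels = torch.tensor([0, 255], dtype=torch.uint8)  # black and white
print(to_minus_one_one(pixels))  # black maps to -1, white maps to 1
```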

In [4]:
data = preprocessData()

dataLoader = torch.utils.data.DataLoader(data, batch_size = 100, shuffle = True)
noOfBatches = len(dataLoader)
print("Number of batches: {}".format(noOfBatches))

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Failed to download (trying next):
HTTP Error 403: Forbidden

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz to /content/MNISTdataset/MNIST/raw/train-images-idx3-ubyte.gz


100%|██████████| 9.91M/9.91M [00:00<00:00, 33.1MB/s]


Extracting /content/MNISTdataset/MNIST/raw/train-images-idx3-ubyte.gz to /content/MNISTdataset/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Failed to download (trying next):
HTTP Error 403: Forbidden

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz to /content/MNISTdataset/MNIST/raw/train-labels-idx1-ubyte.gz


100%|██████████| 28.9k/28.9k [00:00<00:00, 778kB/s]


Extracting /content/MNISTdataset/MNIST/raw/train-labels-idx1-ubyte.gz to /content/MNISTdataset/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Failed to download (trying next):
HTTP Error 403: Forbidden

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz to /content/MNISTdataset/MNIST/raw/t10k-images-idx3-ubyte.gz


100%|██████████| 1.65M/1.65M [00:00<00:00, 10.6MB/s]


Extracting /content/MNISTdataset/MNIST/raw/t10k-images-idx3-ubyte.gz to /content/MNISTdataset/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Failed to download (trying next):
HTTP Error 403: Forbidden

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz to /content/MNISTdataset/MNIST/raw/t10k-labels-idx1-ubyte.gz


100%|██████████| 4.54k/4.54k [00:00<00:00, 6.87MB/s]

Extracting /content/MNISTdataset/MNIST/raw/t10k-labels-idx1-ubyte.gz to /content/MNISTdataset/MNIST/raw

Number of batches: 600





In [5]:
class Discriminator(torch.nn.Module):
    '''
    3 hidden-layer Discriminator Neural Network
    '''
    def __init__(self):
        super(Discriminator, self).__init__()
        n_features = 784 # 28 x 28 image flattened to a vector of length 784
        n_out = 1

        self.hidden0 = nn.Sequential(
            nn.Linear(n_features, 1024),
            nn.LeakyReLU(),
            nn.Dropout(0.3)
        )

        self.hidden1 = nn.Sequential(
            nn.Linear(1024, 512),
            nn.LeakyReLU(),
            nn.Dropout(0.3)
        )

        self.hidden2 = nn.Sequential(
            nn.Linear(512, 256),
            nn.LeakyReLU(),
            nn.Dropout(0.3)
        )

        self.outLayer = nn.Sequential(
            nn.Linear(256, n_out),
            nn.Sigmoid()
        )

    def forward(self, inp):
        x = self.hidden0(inp)
        x = self.hidden1(x)
        x = self.hidden2(x)
        output = self.outLayer(x)

        return output

In [6]:
class Generator(torch.nn.Module):
    '''
    3 hidden-layer Generator Neural Network
    '''
    def __init__(self):
        super(Generator, self).__init__()
        n_features = 100 # length of the random noise vector fed to the generator
        n_out = 784 # flattened output image of length 784, later reshaped to (28, 28)

        self.hidden0 = nn.Sequential(
            nn.Linear(n_features, 256),
            nn.LeakyReLU()
        )

        self.hidden1 = nn.Sequential(
            nn.Linear(256, 512),
            nn.LeakyReLU()
        )

        self.hidden2 = nn.Sequential(
            nn.Linear(512, 1024),
            nn.LeakyReLU()
        )

        self.outLayer = nn.Sequential(
            nn.Linear(1024, n_out),
            nn.Tanh()
        )

    def forward(self, inp):
        x = self.hidden0(inp)
        x = self.hidden1(x)
        x = self.hidden2(x)
        output = self.outLayer(x)

        return output

In [7]:
def images_to_vectors(images):
    return images.view(images.size(0), 784)

def vectors_to_images(vectors):
    '''
    Reshapes a batch of flattened vectors (N, 784) into images of shape (N, 28, 28)
    '''
    return vectors.view(vectors.size(0), 28, 28)

def noise_generator(size):
    '''
    Returns a (size, 100) tensor of random Gaussian noise with mean = 0 and variance = 1,
    i.e. samples from a standard normal distribution, drawn with torch.randn().

    Implementation note ->
    Older PyTorch code wrapped tensors in Variable so that autograd could build the dynamic computation graph
    and backprop through it. Variable has since been deprecated: a plain Tensor now does the same job, and
    requires_grad = True is only needed when gradients with respect to that tensor are wanted. The noise is
    just an input here, so no gradient tracking is required.
    '''

    return torch.randn(size, 100)
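
The note about `Variable` can be checked with a short sketch: a tensor from `torch.randn` does not track gradients by default, and setting `requires_grad=True` on a tensor is enough for autograd to backprop into it. The names `noise`, `w`, and `out` are illustrative only.

```python
import torch

noise = torch.randn(4, 100)   # plain tensor, no Variable wrapper needed
print(noise.requires_grad)    # gradient tracking is off by default

# A tensor we do want gradients for, e.g. a weight
w = torch.randn(100, 1, requires_grad=True)
out = (noise @ w).sum()
out.backward()                # autograd fills in w.grad
print(w.grad.shape)
```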

In [22]:
# instantiating Discriminator and Generator models
discriminator = Discriminator()
generator = Generator()

learning_rate = 0.0002

# Adam is widely regarded as one of the better optimizer choices for GANs
discriminatorOptimizer = optim.Adam(discriminator.parameters(), lr=learning_rate)
generatorOptimizer = optim.Adam(generator.parameters(), lr=learning_rate)

# Binary cross-entropy loss: with targets of 1 for real and 0 for fake images,
# it matches the log D(x) and log(1 - D(G(z))) terms of the GAN objective
loss = nn.BCELoss()

In [23]:
def ones_target(size):
    '''
    Returns a tensor of ones of shape (size, 1), which acts as the label for the real data

    Params:
    size => number of training examples in the minibatch, so that there is one label per real example
            in short: [l1, l2, ....., lm] which here => [1, 1, .... , 1]
    '''
    return torch.ones(size, 1)

def zeros_target(size):
    '''
    Returns a tensor of zeros of shape (size, 1), which acts as the label for the fake data

    Params:
    size => number of training examples in the minibatch, so that there is one label per fake example
            in short: [l1, l2, ....., lm] which here => [0, 0, .... , 0]
    '''
    return torch.zeros(size, 1)

## fakeError = loss(fakePreds, zeros_target(size))

###Gemini)
Okay, let's break down the line of code loss(fakePreds, zeros_target(size)):

This line calculates the error or loss of the discriminator when it's trying to classify fake images. Here's a step-by-step explanation:

- **loss**: This is a pre-defined variable holding the loss function used to train the discriminator. In this case it is **nn.BCELoss()**, which stands for **Binary Cross-Entropy Loss**. A loss function quantifies how well the model's predictions match the target values: the lower the loss, the better the model's performance. You can think of it as a penalty system: the more wrong the discriminator's predictions are, the higher the penalty (loss) it receives.

#### In other words, this is similar to the loss used in DCGAN.

- **fakePreds**: This variable stores the discriminator's predictions on a batch of fake images produced by the generator. Each prediction is a probability indicating how likely the discriminator thinks the image is real.

- **zeros_target(size)**: This function call creates a tensor of zeros of shape (size, 1), where size is the number of fake images in the current batch. The tensor holds the target labels for the fake images. The targets are zeros because we want the discriminator to **classify fake images as fake (represented by 0)**.

Putting it all together:

- **loss(fakePreds, zeros_target(size))** calculates the Binary Cross-Entropy Loss between the discriminator's predictions on fake images (fakePreds) and the desired target labels (zeros_target(size)).

This calculation essentially measures how well the discriminator is doing at correctly identifying fake images. The discriminator is penalized for assigning high probabilities (close to 1) to fake images. The larger the difference between the predictions and the target labels, the higher the loss value will be.

- **In summary**, this line of code is a crucial step in training the discriminator to distinguish between real and fake images. It quantifies the discriminator's error on fake images, providing a signal for the optimization process to adjust the discriminator's parameters and improve its performance.
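
To make the explanation concrete, here is a small worked example of the same call. The three prediction values are hypothetical stand-ins for discriminator outputs; with a target of 0, binary cross-entropy reduces to the mean of -log(1 - p), which the sketch compares against `nn.BCELoss()`.

```python
import math

import torch
from torch import nn

loss = nn.BCELoss()

# Hypothetical discriminator outputs for three fake images
fake_preds = torch.tensor([[0.1], [0.5], [0.9]])
targets = torch.zeros(3, 1)  # label 0 = "fake"

fake_error = loss(fake_preds, targets).item()

# With target 0, BCE is the mean of -log(1 - p) over the batch
by_hand = sum(-math.log(1 - p) for p in (0.1, 0.5, 0.9)) / 3

print(fake_error, by_hand)  # the two values agree up to float32 rounding
```

The prediction of 0.9 contributes by far the largest term, which is the "penalty" for confidently calling a fake image real.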

In [24]:
def trainDiscriminator(optimizer, fakeData, realData):
    size = realData.size(0) # minibatch size, used to build the (size, 1) label tensors below

    # PyTorch accumulates gradient values on each loss.backward(), so we reset the gradients to zero at each iteration to do parameter update correctly
    optimizer.zero_grad()

    # Expected shapes with batch_size = 100:
    #   fakePreds.shape          = torch.Size([100, 1])
    #   zeros_target(size).shape = torch.Size([100, 1])
    #   realPreds.shape          = torch.Size([100, 1])

    # Training on Fake Data
    fakePreds = discriminator(fakeData)
    #print('fakePreds.shape=', fakePreds.shape)
    #print('zeros_target(size).shape=', zeros_target(size).shape)
    # See the explanation in the cell above.
    fakeError = loss(fakePreds, zeros_target(size))
    fakeError.backward() # calculating gradients going back

    # Training on Real Data
    realPreds = discriminator(realData)
    #print('realPreds.shape=', realPreds.shape)
    #print(ones_target(size).shape)
    realError = loss(realPreds, ones_target(size))
    realError.backward() # calculating gradients going back

    # Finally updating the parameters once we have made a forward pass and calculated grads
    optimizer.step()

    return realError + fakeError, realPreds, fakePreds
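
The discriminator update relies on PyTorch adding gradients across `backward()` calls: the fake and real errors are backpropagated separately, and a single `optimizer.step()` then uses their sum. A minimal sketch of that accumulation on a one-element tensor follows; `w`, `accumulated`, and `fresh` are illustrative names, and `w.grad.zero_()` stands in for the reset that `optimizer.zero_grad()` performs.

```python
import torch

w = torch.tensor([1.0], requires_grad=True)

(2 * w).sum().backward()      # gradient of 2*w with respect to w is 2
(3 * w).sum().backward()      # gradient 3 is added to the stored 2
accumulated = w.grad.clone()  # holds 5

w.grad.zero_()                # clear the stored gradient before the next iteration
(3 * w).sum().backward()
fresh = w.grad.clone()        # holds 3, not 8

print(accumulated, fresh)
```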

In [25]:
def trainGenerator(optimizer, fakeData):
    size = fakeData.size(0) # again, minibatch size

    optimizer.zero_grad()

    predictions = discriminator(fakeData)
    error = loss(predictions, ones_target(size)) # because we want the fake data to be classified as real, i.e. [1, 1, 1, ...., 1] for each sample in the minibatch
    error.backward()

    optimizer.step()

    return error
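
Using `ones_target` here means the generator minimizes -log D(G(z)) rather than log(1 - D(G(z))). Goodfellow's 2014 paper suggests this form because the second one gives weak gradients early in training, when the discriminator easily rejects the fakes. The sketch below compares the two gradients for one hypothetical discriminator logit of -4; the variable names are illustrative.

```python
import torch

# Hypothetical case: the discriminator is confident the sample is fake
logit = torch.tensor([-4.0], requires_grad=True)
p = torch.sigmoid(logit)  # D(G(z)), roughly 0.018

# Loss used by trainGenerator via BCE with targets of 1: -log D(G(z))
non_saturating = -torch.log(p)
(grad_ns,) = torch.autograd.grad(non_saturating.sum(), logit, retain_graph=True)

# Original minimax form: log(1 - D(G(z)))
saturating = torch.log(1 - p)
(grad_s,) = torch.autograd.grad(saturating.sum(), logit)

# The first gradient is much larger in magnitude than the second
print(grad_ns.item(), grad_s.item())
```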

In [26]:
testSamples = 16
testNoise = noise_generator(testSamples)

In [27]:
numEpochs = 100
#numEpochs = 3

for epoch in range(numEpochs):
    print("epoch=",epoch)

    # enumerate yields (batch_index, batch); each batch from dataLoader is an (images, labels) pair
    for nBatch, (realData, _) in enumerate(dataLoader):
        size = realData.size(0)

        # Train the discriminator
        realData = images_to_vectors(realData) # prepping real data
        fakeData = generator(noise_generator(size)).detach() # generating fake data
        # .detach() ensures grads are not calculated for generator here since we're not training it as yet

        discError, discRealPreds, discFakePreds = trainDiscriminator(discriminatorOptimizer, fakeData, realData)

        # Train the generator
        moreFakeData = generator(noise_generator(size)) # generating fake data

        genError = trainGenerator(generatorOptimizer, moreFakeData)

        #if nBatch % 100 == 0:
        # with 600 batches per epoch, this condition is only true at nBatch == 0, i.e. once per epoch
        if nBatch % 1000 == 0:
            # no_grad() skips building the autograd graph for this preview
            with torch.no_grad():
                testImages = vectors_to_images(generator(testNoise))
            #print("New Batch of Images: => nBatch=",nBatch)

            fig = plt.figure(figsize=(20, 4))
            for i in range(testImages.shape[0]):
                ax = fig.add_subplot(2, 8, i + 1)
                ax.imshow(testImages[i].numpy())
                ax.set_title("Image {}".format(i + 1))
                ax.axis('off')
            plt.show()
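
The `.detach()` call in the discriminator phase of the loop can be illustrated in isolation. In this sketch two small `nn.Linear` layers stand in for the generator and discriminator (`gen` and `disc` are hypothetical names); after backpropagating through detached fake data, only the discriminator stand-in has gradients.

```python
import torch
from torch import nn

gen = nn.Linear(2, 2)   # stand-in for the generator
disc = nn.Linear(2, 1)  # stand-in for the discriminator

# detach() cuts the graph at the generator output
fake = gen(torch.randn(4, 2)).detach()
disc(fake).sum().backward()

print(gen.weight.grad)         # None: nothing flowed back into the generator
print(disc.weight.grad.shape)  # the discriminator did receive gradients
```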


In [None]:
print("Voila!")