In [1]:
import numpy as np

In the following code, the generator and discriminator are each a small network with one hidden layer, sigmoid activations, and no bias terms (which is not practical for real-world data but is used here for simplicity). The generator's output is a deterministic function of its input noise and weights; in a more realistic scenario, you would use more layers, bias terms, and typically other activation functions.

The `train_step` function near the end shows how the discriminator's loss is computed during one training step, although the actual weight updates are not implemented here. Normally, you would use an optimization algorithm like gradient descent to adjust the weights based on the gradients of the loss function.

## Initialization

In [2]:
# Hyperparameters
g_input_size = 1     # Size of random noise vector.
g_hidden_size = 5    # Number of hidden units in the generator.
g_output_size = 1    # Size of generated data.

d_input_size = 1     # Size of each data sample fed to the discriminator.
d_hidden_size = 5    # Number of hidden units in the discriminator.
d_output_size = 1    # Single dimension for 'real' vs. 'fake' classification.

## Generator

The sigmoid activation function $ \sigma(x) $ is defined mathematically as:

$$
\sigma(x) = \frac{1}{1 + e^{-x}}
$$

Where:
- $ x $ is the input to the function,
- $ \sigma(x) $ is the output between 0 and 1.


In [3]:
# Activation function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))
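
As a quick sanity check of the definition, the sketch below evaluates the function at a few points. It redefines `sigmoid` so that it runs on its own; the sample inputs and the names `xs`, `ys`, `midpoint`, and `symmetric` are arbitrary choices for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Arbitrary sample inputs: one negative, zero, one positive.
xs = np.array([-5.0, 0.0, 5.0])
ys = sigmoid(xs)

# The function is 0.5 at zero and satisfies sigmoid(-x) = 1 - sigmoid(x).
midpoint = sigmoid(0.0)
symmetric = np.allclose(sigmoid(-xs), 1 - ys)

print(ys)         # values strictly between 0 and 1, increasing with x
print(midpoint)   # 0.5
print(symmetric)  # True
```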

The generator in a Generative Adversarial Network takes a random noise vector $ \mathbf{z} $ as input and uses it to generate fake data. The noise vector is first multiplied by a weight matrix $ \mathbf{W}_{g1} $ and passed through the sigmoid activation function to give a hidden representation $ \mathbf{h} $. That representation is then multiplied by a second weight matrix $ \mathbf{W}_{g2} $ and passed through a sigmoid function again to produce the final fake data $ \mathbf{x}_{\text{fake}} $:

$$
\mathbf{h} = \sigma(\mathbf{z} \mathbf{W}_{g1})
$$
$$
\mathbf{x}_{\text{fake}} = \sigma(\mathbf{h} \mathbf{W}_{g2})
$$

Here:
- $ \mathbf{z} $ is the generator input (random noise).
- $ \mathbf{W}_{g1} $ and $ \mathbf{W}_{g2} $ are the weights for the first and second layers of the generator, respectively.
- $ \sigma $ represents the sigmoid activation function.
- $ \mathbf{h} $ is the hidden layer representation.
- $ \mathbf{x}_{\text{fake}} $ is the generated fake data.


In [4]:
# Generator
def generate_fake_data(generator_input, generator_weights):
    """
    The generator takes a random noise and uses it to generate fake data.
    
    :param generator_input: Random noise.
    :param generator_weights: Weights for the generator model.
    :return: Fake data.
    """
    hidden_layer = np.dot(generator_input, generator_weights['g1'])
    hidden_layer = sigmoid(hidden_layer)
    output_layer = np.dot(hidden_layer, generator_weights['g2'])
    fake_data = sigmoid(output_layer)
    return fake_data
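
To make the shapes concrete, here is a minimal sketch that pushes a small batch of noise vectors through the generator. The batch size, the seed, and the names `example_weights`, `noise_batch`, and `fake_batch` are assumptions made for this illustration; the layer sizes mirror the hyperparameters above.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def generate_fake_data(generator_input, generator_weights):
    hidden_layer = sigmoid(np.dot(generator_input, generator_weights['g1']))
    return sigmoid(np.dot(hidden_layer, generator_weights['g2']))

rng = np.random.default_rng(0)  # seed chosen arbitrarily
batch_size = 4                  # hypothetical batch size

# Weights shaped (input, hidden) and (hidden, output), i.e. (1, 5) and (5, 1).
example_weights = {
    'g1': rng.standard_normal((1, 5)),
    'g2': rng.standard_normal((5, 1)),
}

# One row per noise vector: (4, 1) @ (1, 5) -> (4, 5), then (4, 5) @ (5, 1) -> (4, 1).
noise_batch = rng.standard_normal((batch_size, 1))
fake_batch = generate_fake_data(noise_batch, example_weights)

print(fake_batch.shape)  # (4, 1)
```

Because the last layer is a sigmoid, every generated value lies strictly between 0 and 1.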

## Discriminator

The discriminator takes input data and processes it through layers with weights to classify the data as real or fake.

1. The input data $ \mathbf{x} $ is first transformed by a weight matrix $ \mathbf{W}_{d1} $ and then passed through a sigmoid activation function:

   $$ \mathbf{h} = \sigma(\mathbf{x} \mathbf{W}_{d1}) $$

2. The output of this transformation $ \mathbf{h} $ is then further transformed by another weight matrix $ \mathbf{W}_{d2} $ and again passed through a sigmoid function to produce the final classification:

   $$ \text{classification} = \sigma(\mathbf{h} \mathbf{W}_{d2}) $$

Here:
- $ \mathbf{x} $ represents the input data to be classified.
- $ \mathbf{W}_{d1} $ and $ \mathbf{W}_{d2} $ are the weight matrices of the first and second layers of the discriminator, respectively.
- $ \sigma $ is the sigmoid activation function.
- $ \mathbf{h} $ is the hidden layer representation.
- The final output, `classification`, is the discriminator's estimated probability that the input data is real; values close to 0 mean the input is judged to be fake.


In [5]:
# Discriminator
def discriminate_data(data, discriminator_weights):
    """
    The discriminator takes data and tries to classify it as real or fake.
    
    :param data: Data to be classified.
    :param discriminator_weights: Weights for the discriminator model.
    :return: Classification probabilities.
    """
    hidden_layer = np.dot(data, discriminator_weights['d1'])
    hidden_layer = sigmoid(hidden_layer)
    output_layer = np.dot(hidden_layer, discriminator_weights['d2'])
    classification = sigmoid(output_layer)
    return classification
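
The sketch below runs the discriminator on a few hand-picked samples and thresholds the result at 0.5 to obtain a real/fake label. The samples, the seed, the threshold, and the names `example_weights`, `probabilities`, and `predicted_real` are illustrative assumptions; with random, untrained weights the labels carry no meaning.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def discriminate_data(data, discriminator_weights):
    hidden_layer = sigmoid(np.dot(data, discriminator_weights['d1']))
    return sigmoid(np.dot(hidden_layer, discriminator_weights['d2']))

rng = np.random.default_rng(1)  # seed chosen arbitrarily
example_weights = {
    'd1': rng.standard_normal((1, 5)),
    'd2': rng.standard_normal((5, 1)),
}

# Four one-dimensional samples, one per row.
samples = np.array([[-1.0], [0.0], [0.5], [2.0]])
probabilities = discriminate_data(samples, example_weights)

# Treat a probability above 0.5 as "real" (an assumed decision rule).
predicted_real = probabilities > 0.5

print(probabilities.shape)  # (4, 1)
```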

## Loss function

The loss function for the discriminator in a Generative Adversarial Network is designed to measure its ability to distinguish between real and fake data. It consists of two parts: the loss for real data and the loss for fake data. Mathematically, this can be represented as follows:

1. The per-sample loss for real data, where the discriminator output for the $ i $-th real sample is $ D(\mathbf{x}_{\text{real}})_i $:

   $$ \text{loss}_{\text{real}, i} = -\log(D(\mathbf{x}_{\text{real}})_i) $$

2. The per-sample loss for fake data, where the discriminator output for the $ i $-th fake sample is $ D(\mathbf{x}_{\text{fake}})_i $:

   $$ \text{loss}_{\text{fake}, i} = -\log(1 - D(\mathbf{x}_{\text{fake}})_i) $$

The total loss for the discriminator is the mean over the batch of the sum of these two terms:

$$ \text{total loss} = \frac{1}{N} \sum_{i=1}^{N} \left( \text{loss}_{\text{real}, i} + \text{loss}_{\text{fake}, i} \right) $$

Here, $ D(\mathbf{x}) $ represents the discriminator's output probability that $ \mathbf{x} $ is real, $ N $ is the number of instances in the batch, and the summation runs over all of them.


In [6]:
# Loss functions
def loss_function(real_output, fake_output):
    """
    The loss function for the discriminator to measure its performance.
    
    :param real_output: Discriminator output for real data.
    :param fake_output: Discriminator output for fake data.
    :return: Loss value.
    """
    # Per-sample penalty for real data being scored as fake (real_output near 0).
    real_loss = -np.log(real_output)
    # Per-sample penalty for fake data being scored as real (fake_output near 1).
    fake_loss = -np.log(1 - fake_output)
    # Total discriminator loss: mean over the batch of the summed per-sample losses.
    total_loss = np.mean(real_loss + fake_loss)
    return total_loss
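
A small worked example may help in reading the loss values. The sketch below compares a discriminator that is confidently right with one that outputs 0.5 for everything; the probabilities are made-up inputs, not outputs of the model above.

```python
import numpy as np

def loss_function(real_output, fake_output):
    real_loss = -np.log(real_output)
    fake_loss = -np.log(1 - fake_output)
    return np.mean(real_loss + fake_loss)

# Confident and correct: real data scored 0.9, fake data scored 0.1.
confident_loss = loss_function(np.array([0.9]), np.array([0.1]))

# Undecided: everything scored 0.5.
unsure_loss = loss_function(np.array([0.5]), np.array([0.5]))

print(confident_loss)  # -2 * ln(0.9), approximately 0.2107
print(unsure_loss)     # 2 * ln(2), approximately 1.3863
```

Note that an output of exactly 0 or 1 would make `np.log` return `-inf`, so practical implementations often clip the probabilities slightly away from those values.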

## Training

In a Generative Adversarial Network, the weights for both the generator and the discriminator are typically initialized randomly. This initialization can be represented mathematically as follows:

For the generator, we initialize two sets of weights, $ \mathbf{W}_{g1} $ and $ \mathbf{W}_{g2} $:

$$
\mathbf{W}_{g1} \sim \mathcal{N}(0, 1), \quad \mathbf{W}_{g1} \in \mathbb{R}^{\text{g input size} \times \text{g hidden size}}
$$
$$
\mathbf{W}_{g2} \sim \mathcal{N}(0, 1), \quad \mathbf{W}_{g2} \in \mathbb{R}^{\text{g hidden size} \times \text{g output size}}
$$

For the discriminator, we initialize two sets of weights, $ \mathbf{W}_{d1} $ and $ \mathbf{W}_{d2} $:

$$
\mathbf{W}_{d1} \sim \mathcal{N}(0, 1), \quad \mathbf{W}_{d1} \in \mathbb{R}^{\text{d input size} \times \text{d hidden size}}
$$
$$
\mathbf{W}_{d2} \sim \mathcal{N}(0, 1), \quad \mathbf{W}_{d2} \in \mathbb{R}^{\text{d hidden size} \times \text{d output size}}
$$

Here, $ \mathcal{N}(0, 1) $ indicates that the weights are drawn from a normal distribution with mean 0 and standard deviation 1. The dimensions of the weight matrices are determined by the size of the input layer, the size of the hidden layer, and the size of the output layer for each model.


In [7]:
# Initialize weights
def initialize_weights():
    """
    Initialize weights randomly for both generator and discriminator.
    
    :return: A tuple of dictionaries containing weights for both models.
    """
    generator_weights = {
        'g1': np.random.randn(g_input_size, g_hidden_size),
        'g2': np.random.randn(g_hidden_size, g_output_size)
    }
    discriminator_weights = {
        'd1': np.random.randn(d_input_size, d_hidden_size),
        'd2': np.random.randn(d_hidden_size, d_output_size)
    }
    return generator_weights, discriminator_weights
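
Because the weights are random, results such as the loss change from run to run. If repeatable results are wanted, one option is to draw the weights from a seeded generator. The sketch below is a hypothetical variant, not part of the original code: the function name `initialize_weights_seeded` and its `seed`, `g_sizes`, and `d_sizes` parameters are assumptions, with defaults that mirror the hyperparameters above.

```python
import numpy as np

def initialize_weights_seeded(seed, g_sizes=(1, 5, 1), d_sizes=(1, 5, 1)):
    """Hypothetical seeded variant of initialize_weights."""
    rng = np.random.default_rng(seed)
    g_in, g_hidden, g_out = g_sizes
    d_in, d_hidden, d_out = d_sizes
    generator_weights = {
        'g1': rng.standard_normal((g_in, g_hidden)),
        'g2': rng.standard_normal((g_hidden, g_out)),
    }
    discriminator_weights = {
        'd1': rng.standard_normal((d_in, d_hidden)),
        'd2': rng.standard_normal((d_hidden, d_out)),
    }
    return generator_weights, discriminator_weights

# The same seed yields identical weights on every call.
gen_a, disc_a = initialize_weights_seeded(seed=42)
gen_b, disc_b = initialize_weights_seeded(seed=42)
same = all(np.array_equal(gen_a[k], gen_b[k]) for k in gen_a)
print(same)  # True
```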

In a training step of a Generative Adversarial Network (GAN), the following processes occur:

1. **Generation of Fake Data**: 
   The generator creates fake data $\mathbf{x}_{\text{fake}}$ by transforming a random noise vector $\mathbf{z}$ using its current weights $\mathbf{W}_g$:
   $$ \mathbf{x}_{\text{fake}} = \text{generate fake data}(\mathbf{z}, \mathbf{W}_g) $$



2. **Discrimination of Real and Fake Data**: 
   The discriminator evaluates both real data $\mathbf{x}_{\text{real}}$ and fake data $\mathbf{x}_{\text{fake}}$, producing outputs $D(\mathbf{x}_{\text{real}})$ and $D(\mathbf{x}_{\text{fake}})$ respectively, using its current weights $\mathbf{W}_d$:
   $$ D(\mathbf{x}_{\text{real}}) = \text{discriminate data}(\mathbf{x}_{\text{real}}, \mathbf{W}_d) $$
   $$ D(\mathbf{x}_{\text{fake}}) = \text{discriminate data}(\mathbf{x}_{\text{fake}}, \mathbf{W}_d) $$

3. **Computation of the Discriminator's Loss**: 
   The loss for the discriminator $\mathcal{L}_d$ is computed based on its ability to correctly classify real and fake data:
   $$ \mathcal{L}_d = \text{loss function}(D(\mathbf{x}_{\text{real}}), D(\mathbf{x}_{\text{fake}})) $$

In a full training cycle, the discriminator's weights $\mathbf{W}_d$ would be updated to reduce this loss, and the generator's weights $\mathbf{W}_g$ would be updated in a separate step so that its fake data is more often classified as real, usually employing a gradient descent algorithm. However, the weight update mechanism is not included in this simplified example.


In [8]:
# Example training step
def train_step(real_data, generator_weights, discriminator_weights):
    """
    A single (simplified) training step for the GAN, consisting of:
    
    - Generating fake data.
    - Scoring real and fake data with the discriminator.
    - Computing the discriminator's loss.
    
    No weights are updated in this simplified example.
    
    :param real_data: Real data shown to the discriminator.
    :param generator_weights: Current weights of the generator.
    :param discriminator_weights: Current weights of the discriminator.
    :return: Discriminator loss for this step.
    """
    # Generate fake data
    random_noise = np.random.randn(g_input_size)
    fake_data = generate_fake_data(random_noise, generator_weights)
    
    # Discriminate real and fake data
    real_output = discriminate_data(real_data, discriminator_weights)
    fake_output = discriminate_data(fake_data, discriminator_weights)
    
    # Compute loss for discriminator
    d_loss = loss_function(real_output, fake_output)
    
    # Normally we would update the weights here using the loss.
    # This would require implementing backpropagation and an optimization algorithm,
    # which is beyond the scope of this simplified example.
    
    return d_loss
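
For readers curious what a weight update might look like, the following is a rough sketch for the discriminator only. It is not the method used in this notebook: instead of backpropagation it estimates gradients with central finite differences, which is slow but easy to verify. The helper names `numerical_gradients` and `discriminator_update`, the `learning_rate` and `eps` values, and the uniform stand-in for generator output are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def discriminate_data(data, w):
    hidden = sigmoid(np.dot(data, w['d1']))
    return sigmoid(np.dot(hidden, w['d2']))

def loss_function(real_output, fake_output):
    return np.mean(-np.log(real_output) - np.log(1 - fake_output))

def d_loss(real_data, fake_data, w):
    return loss_function(discriminate_data(real_data, w),
                         discriminate_data(fake_data, w))

def numerical_gradients(real_data, fake_data, w, eps=1e-5):
    """Central-difference estimate of d(loss)/d(weight) for every weight."""
    grads = {}
    for name, matrix in w.items():
        grad = np.zeros_like(matrix)
        for idx in np.ndindex(matrix.shape):
            original = matrix[idx]
            matrix[idx] = original + eps
            loss_plus = d_loss(real_data, fake_data, w)
            matrix[idx] = original - eps
            loss_minus = d_loss(real_data, fake_data, w)
            matrix[idx] = original  # restore the weight
            grad[idx] = (loss_plus - loss_minus) / (2 * eps)
        grads[name] = grad
    return grads

def discriminator_update(real_data, fake_data, w, learning_rate=0.05):
    """One plain gradient-descent step on the discriminator weights."""
    grads = numerical_gradients(real_data, fake_data, w)
    return {name: w[name] - learning_rate * grads[name] for name in w}

rng = np.random.default_rng(0)  # seed chosen arbitrarily
weights = {'d1': rng.standard_normal((1, 5)),
           'd2': rng.standard_normal((5, 1))}
real_data = rng.standard_normal((8, 1))
fake_data = rng.uniform(0.0, 1.0, size=(8, 1))  # stand-in for generator output

loss_before = d_loss(real_data, fake_data, weights)
for _ in range(20):
    weights = discriminator_update(real_data, fake_data, weights)
loss_after = d_loss(real_data, fake_data, weights)

print(loss_before, loss_after)  # the loss should go down after the updates
```

A real implementation would compute the gradients analytically or with an automatic differentiation library, and would alternate these discriminator steps with generator steps.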

In [9]:
# Initialize weights
gen_weights, disc_weights = initialize_weights()

# Generate some 'real' data (for demonstration purposes)
real_data = np.random.randn(d_input_size)

# Perform a training step
loss = train_step(real_data, gen_weights, disc_weights)
print(f"Discriminator loss: {loss}")

Discriminator loss: 2.165026057870444
