
# Chapter 1: Foundations of Generative AI

This notebook introduces the foundational concepts of generative AI through hands-on examples: text generation with GPT-2, a Variational Autoencoder (VAE), and a Generative Adversarial Network (GAN).

## Learning Objectives

- Understand the difference between traditional and generative AI
- Explore and implement Variational Autoencoders (VAEs)
- Train a simple Generative Adversarial Network (GAN)
- Use a pretrained Transformer for text generation
- Apply decoding strategies: greedy search, top-k sampling, and top-p (nucleus) sampling
- Evaluate generative models using metrics such as BLEU and ROUGE for text and FID for images



## Traditional AI vs Generative AI

| Feature          | Traditional AI                            | Generative AI                      |
|------------------|-------------------------------------------|------------------------------------|
| Objective        | Classification, prediction                | Generation of new content          |
| Output type      | Labels, decisions                         | Text, images, audio, code          |
| Techniques       | SVMs, decision trees, CNN classifiers     | VAEs, GANs, Transformers           |
| Output behavior  | Typically deterministic for a given input | Often sampled, so outputs can vary |

### Let's begin with text generation using GPT-2.


In [None]:

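from transformers import pipeline

# Load the pretrained GPT-2 model behind a text-generation pipeline.
generator = pipeline("text-generation", model="gpt2")
prompt = "The future of artificial intelligence is"

# temperature only affects sampling, so sampling is enabled explicitly.
# max_length counts the prompt tokens plus the generated tokens.
outputs = generator(prompt, max_length=50, do_sample=True, temperature=0.7, num_return_sequences=1)

print("Generated Text:")
print(outputs[0]["generated_text"])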
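
The `temperature=0.7` argument in the call above rescales the model's raw scores (logits) before they are turned into probabilities. The sketch below illustrates that effect on a toy example. The function name `softmax_with_temperature` and the logit values are hypothetical and chosen only for illustration; they are not part of the Transformers API.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum for numerical stability before exponentiating.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate tokens.
toy_logits = [2.0, 1.0, 0.1]

for t in (0.7, 1.0, 1.5):
    probs = softmax_with_temperature(toy_logits, temperature=t)
    print(f"temperature={t}: " + ", ".join(f"{p:.3f}" for p in probs))
```

Temperatures below 1 concentrate probability on the highest-scoring token, while temperatures above 1 flatten the distribution and make less likely tokens easier to sample.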



## Variational Autoencoders (VAE)

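A VAE has an encoder that maps each input to the mean and log-variance of a Gaussian over a low-dimensional latent space, and a decoder that reconstructs the input from a latent vector sampled from that Gaussian. The reparameterization trick (`mu + eps * std`) keeps this sampling step differentiable. After training, new samples are generated by drawing a latent vector from the prior (a standard normal) and passing it through the decoder.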


In [None]:

import torch
from torch import nn

class VAE(nn.Module):
    def __init__(self, input_dim=784, latent_dim=2):
        super().__init__()
        # Shared encoder trunk; 784 matches a flattened 28 x 28 image such as MNIST.
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 400),
            nn.ReLU()
        )
        # Two heads predict the mean and log-variance of the latent Gaussian.
        self.fc_mu = nn.Linear(400, latent_dim)
        self.fc_logvar = nn.Linear(400, latent_dim)
        # Decoder maps a latent vector back to values in [0, 1] via Sigmoid.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 400),
            nn.ReLU(),
            nn.Linear(400, input_dim),
            nn.Sigmoid()
        )

    def reparameterize(self, mu, logvar):
        # Sample z = mu + eps * std so gradients can flow through mu and logvar.
        std = torch.exp(0.5 * logvar)
        eps = torch.randn_like(std)
        return mu + eps * std

    def forward(self, x):
        # x is expected to be flattened to shape (batch, input_dim).
        encoded = self.encoder(x)
        mu, logvar = self.fc_mu(encoded), self.fc_logvar(encoded)
        z = self.reparameterize(mu, logvar)
        return self.decoder(z), mu, logvar
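
The class above defines the model but not its training objective. A common choice, assumed here rather than taken from this notebook, is the negative evidence lower bound: a reconstruction term plus the KL divergence between the encoder's Gaussian and a standard normal prior. The helper name `vae_loss` and the toy tensors are hypothetical; the sketch only shows how the three outputs of `forward` would feed into such a loss.

```python
import torch
import torch.nn.functional as F

def vae_loss(recon_x, x, mu, logvar):
    """Negative ELBO summed over the batch (hypothetical helper)."""
    # Reconstruction term: binary cross-entropy suits inputs scaled to [0, 1],
    # which matches a Sigmoid output layer.
    recon = F.binary_cross_entropy(recon_x, x, reduction="sum")
    # Closed-form KL divergence between N(mu, sigma^2) and N(0, 1):
    # -0.5 * sum(1 + log(sigma^2) - mu^2 - sigma^2)
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl

# Toy tensors standing in for one batch of inputs and model outputs.
torch.manual_seed(0)
x = torch.rand(8, 784)                         # inputs in [0, 1]
recon_x = torch.sigmoid(torch.randn(8, 784))   # stand-in reconstructions
mu = torch.zeros(8, 2)                         # latent means
logvar = torch.zeros(8, 2)                     # latent log-variances

loss = vae_loss(recon_x, x, mu, logvar)
print(f"loss: {loss.item():.2f}")
```

With `mu` and `logvar` at zero the KL term vanishes, so the printed value is the reconstruction term alone; moving `mu` away from zero increases the loss.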



## Generative Adversarial Networks (GAN)

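A GAN trains two networks against each other in a minimax game: the generator maps random noise to synthetic samples, and the discriminator estimates the probability that a sample is real rather than generated. The generator improves by learning to fool the discriminator.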


In [None]:

from torch import nn

class Generator(nn.Module):
    def __init__(self, input_size=100, output_size=784):
        super().__init__()
        # Maps a noise vector to a flattened image; Tanh bounds outputs to [-1, 1],
        # so real training data should be scaled to the same range.
        self.model = nn.Sequential(
            nn.Linear(input_size, 256),
            nn.ReLU(),
            nn.Linear(256, output_size),
            nn.Tanh()
        )

    def forward(self, z):
        return self.model(z)

class Discriminator(nn.Module):
    def __init__(self, input_size=784):
        super().__init__()
        # Outputs the estimated probability that the input is a real sample.
        self.model = nn.Sequential(
            nn.Linear(input_size, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
            nn.Sigmoid()
        )

    def forward(self, x):
        return self.model(x)
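
The minimax game described above is usually played as alternating updates. The following is a minimal sketch of a single training step, under several assumptions that are not in this notebook: binary cross-entropy as the loss, Adam optimizers with a learning rate of 2e-4, and a random tensor in place of a real data batch. The two networks are rebuilt inline with `nn.Sequential` so the block runs on its own.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Small stand-ins mirroring the Generator and Discriminator layouts above.
latent_dim, data_dim, batch_size = 100, 784, 16
G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                  nn.Linear(256, data_dim), nn.Tanh())
D = nn.Sequential(nn.Linear(data_dim, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1), nn.Sigmoid())

criterion = nn.BCELoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))

# Placeholder "real" batch in [-1, 1]; real training would load a dataset.
real = torch.rand(batch_size, data_dim) * 2 - 1
ones = torch.ones(batch_size, 1)
zeros = torch.zeros(batch_size, 1)

# Discriminator step: push D(real) toward 1 and D(fake) toward 0.
# detach() stops this loss from updating the generator.
fake = G(torch.randn(batch_size, latent_dim))
loss_d = criterion(D(real), ones) + criterion(D(fake.detach()), zeros)
opt_d.zero_grad()
loss_d.backward()
opt_d.step()

# Generator step: push D(fake) toward 1 (the non-saturating generator loss).
loss_g = criterion(D(fake), ones)
opt_g.zero_grad()
loss_g.backward()
opt_g.step()

print(f"loss_d={loss_d.item():.4f}  loss_g={loss_g.item():.4f}")
```

A full training loop would repeat these two steps over many batches of real data.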



## Decoding Strategies

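Decoding strategies control how the next token is chosen from the model's predicted distribution:
- Greedy search: always pick the most probable token
- Top-k sampling: sample from the k most probable tokens
- Top-p (nucleus) sampling: sample from the smallest set of tokens whose cumulative probability reaches p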


In [None]:

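prompt = "The role of artificial intelligence in healthcare is"

# Greedy: no sampling, so the output does not vary between runs.
print("Greedy decoding:")
print(generator(prompt, do_sample=False, max_length=50)[0]['generated_text'])

# Top-k: sample only among the 50 most probable tokens at each step.
print("\nTop-k sampling:")
print(generator(prompt, do_sample=True, top_k=50, max_length=50)[0]['generated_text'])

# Top-p: top_k=0 turns off top-k filtering so that only the nucleus cutoff applies.
print("\nTop-p sampling:")
print(generator(prompt, do_sample=True, top_p=0.9, top_k=0, max_length=50)[0]['generated_text'])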
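
To make the difference between the three strategies concrete, here is a small library-free sketch of how each one trims a next-token distribution before a token is chosen. The helper names `top_k_filter` and `top_p_filter` and the toy probabilities are hypothetical; the Transformers library applies the same idea to logits internally, but this is not its implementation.

```python
def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in ranked)
    return {tok: p / total for tok, p in ranked}

def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for tok, prob in ranked:
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in kept)
    return {tok: prob / total for tok, prob in kept}

# Hypothetical next-token distribution.
next_token_probs = {"improving": 0.5, "to": 0.3, "growing": 0.15, "banana": 0.05}

# Greedy search simply takes the single most probable token.
greedy_choice = max(next_token_probs, key=next_token_probs.get)
print("greedy:", greedy_choice)
print("top-k (k=2):", top_k_filter(next_token_probs, k=2))
print("top-p (p=0.9):", top_p_filter(next_token_probs, p=0.9))
```

Sampling then draws from the filtered, renormalized distribution. Top-k keeps a fixed number of candidates, whereas top-p keeps more candidates when the model is uncertain and fewer when it is confident.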



## Evaluation of Generated Text

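Common metrics for generated text include:
- **BLEU**: precision of n-gram overlap between a candidate and one or more references, with a brevity penalty
- **ROUGE**: recall-oriented n-gram and subsequence overlap, common in summarization
- **BERTScore**: semantic similarity computed from contextual token embeddings

Example with BLEU: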


In [None]:

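from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# sentence_bleu expects a list of tokenized references and one tokenized candidate.
reference = [["this", "is", "a", "test"]]
candidate = ["this", "is", "a", "trial"]

# These sentences share no 4-gram, so unsmoothed BLEU would be effectively zero.
# Smoothing keeps the score informative for short sentences.
smoothing = SmoothingFunction().method1
score = sentence_bleu(reference, candidate, smoothing_function=smoothing)
print(f"BLEU Score: {score:.4f}")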
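
To see what "n-gram overlap" and "recall-oriented" mean in practice, the sketch below computes the per-order n-gram precisions that BLEU combines, plus a ROUGE-1-style unigram recall, for the same sentence pair. The helper names are hypothetical and the code is a simplified illustration for a single reference, not a replacement for the NLTK or ROUGE implementations.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return a Counter of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def clipped_precision(ref_tokens, cand_tokens, n):
    """Share of candidate n-grams that also occur in the reference (BLEU-style)."""
    cand, ref = ngrams(cand_tokens, n), ngrams(ref_tokens, n)
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    total = sum(cand.values())
    return overlap / total if total else 0.0

def unigram_recall(ref_tokens, cand_tokens):
    """Share of reference tokens recovered by the candidate (ROUGE-1-style recall)."""
    cand, ref = ngrams(cand_tokens, 1), ngrams(ref_tokens, 1)
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    total = sum(ref.values())
    return overlap / total if total else 0.0

ref_tokens = ["this", "is", "a", "test"]
cand_tokens = ["this", "is", "a", "trial"]

for n in range(1, 5):
    print(f"{n}-gram precision: {clipped_precision(ref_tokens, cand_tokens, n):.3f}")
print(f"unigram recall: {unigram_recall(ref_tokens, cand_tokens):.3f}")
```

The 4-gram precision is zero for this pair, which is why BLEU without smoothing is close to zero on short sentences even when most words match.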



## Exercises

1. Train a VAE on the MNIST dataset and visualize generated digits.
2. Modify the GAN architecture and try generating digits from noise.
3. Compare decoding strategies using different prompts and temperatures.
4. Try calculating ROUGE or BERTScore on generated text.

## References and Links

- Hugging Face Transformers: https://huggingface.co/docs/transformers
- PyTorch VAE Example: https://github.com/pytorch/examples/tree/main/vae
- GAN Tutorial: https://pytorch.org/tutorials/beginner/dcgan_faces_tutorial.html
- BLEU Score (NLTK): https://www.nltk.org/_modules/nltk/translate/bleu_score.html
