## Daily Challenge : W7_D4

### Building a GAN-Based AI Text Detector

---

## **What You’ll Learn**

- How to train a **Generative Adversarial Network (GAN)** for detecting AI-generated text.  
- How to use a pre-trained **BERT model** for sequence classification.  
- How to preprocess text data and tokenize it for deep learning models.  
- How to evaluate model performance using **AUC scores**.  
- How to fine-tune and optimize deep learning models.  
- How to perform inference and generate predictions on test data.

---

## **What You Will Create**

- A **GAN-based model** that detects AI-generated text using embeddings from a BERT model.  
- A training pipeline that leverages a **discriminator** and **generator** network.  
- A model-selection step that keeps the discriminator checkpoint with the best AUC score.  
- A final submission file with predictions on the test dataset.

---

## **Dataset**

The dataset for this exercise contains:
- **Training data**: Human-written and AI-generated essays
- **Prompts data**: Contextual instructions and topics for each essay
- **Test data**: Essays to classify during inference

---

## **Task**

For today’s challenge, a **final code template** is provided with `TODO` sections to complete.  
Key steps involve:

1. **Download the Dataset**  
   - Upload the Kaggle API key and download the competition dataset.  
   - Alternatively, download the dataset manually from Kaggle.

2. **Load and Merge the Data**  
   - Load the training, prompts, and test datasets using pandas.  
   - Merge prompts with essays to create enriched text data.

3. **Prepare the Model**  
   - Load the BERT tokenizer and pre-trained model (`bert-base-uncased`).  
   - Extract embeddings from BERT to integrate into the GAN framework.

4. **Set Hyperparameters**  
   - Define batch sizes, learning rate, latent vector dimensions, and number of epochs.

5. **Prepare Data for Training**  
   - Create a PyTorch dataset class for handling text data.  
   - Split into training and testing sets and use `DataLoader` for batching.

6. **Define the Generator Model**  
   - Build a neural network that generates text embeddings using ConvTranspose layers.  
   - Integrate a BERT encoder inside the generator.

7. **Define the Discriminator Model**  
   - Modify layers from a pre-trained BERT model.  
   - Implement a pooling mechanism and classification head.

8. **Train the Model**  
   - Implement a GAN training loop alternating between generator and discriminator updates.  
   - Evaluate with **AUC score** to track stability.

9. **Perform Inference**  
   - Load the best discriminator model based on AUC scores.  
   - Run inference on the test dataset and generate the `submission.csv`.

---

## **Goal**

Complete all missing sections (`TODO`) to implement:
- Data preprocessing
- GAN architecture (Generator + Discriminator)
- Training loop with AUC monitoring
- Final inference and submission


In [None]:
!pip install transformers

In [23]:
from tqdm.notebook import tqdm

for i in tqdm(range(100)):
    # training loop
    pass

### Step 1 - Download the Dataset

In [13]:
# Import required libraries
import pandas as pd
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Dataset
from transformers import BertTokenizer, BertModel, BertConfig
from sklearn.metrics import roc_auc_score

For this challenge, the dataset can be obtained in two ways:
- Using the Kaggle API (requires uploading the Kaggle API key and running download commands).
- **OR** downloading the files manually from the Kaggle competition page.

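This notebook assumes the CSV files sit in the working directory. The recorded run below does not use the Kaggle files: it first writes a small synthetic stand-in (10 human and 10 AI essays) under the same file names, so the pipeline can be exercised end to end.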
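Before running the later cells, it can help to confirm that the three CSV files are actually present. The following is a minimal sketch: the helper name `missing_files` is hypothetical and not part of the challenge template, while the file names are the ones loaded later in this notebook.

```python
from pathlib import Path

# File names taken from the loading cell of this notebook.
EXPECTED_FILES = ["train_essays.csv", "train_prompts.csv", "test_essays.csv"]

def missing_files(directory=".", expected=EXPECTED_FILES):
    """Return the expected file names that are absent from `directory` (hypothetical helper)."""
    root = Path(directory)
    return [name for name in expected if not (root / name).is_file()]

# An empty list means every expected file was found in the working directory.
print("Missing files:", missing_files())
```

If the list is not empty, download the files from the competition page or generate the synthetic stand-ins first.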

# Why did we do this and what exactly did we do?

---

## **Why did we do this?**

The goal of this project is to build a **GAN-based AI text detector** using **BERT embeddings**.  
However, the original training dataset was extremely **imbalanced**:

- **1375 human-written texts (label = 0)**
- **Only 3 AI-generated texts (label = 1)**

This imbalance caused the Discriminator to **always predict "human"**, resulting in an **AUC around 0.5** (random guessing).

### **Solution**
- We applied **oversampling** to artificially replicate AI-generated examples.
- This creates a **balanced dataset** (≈1000 AI vs 1375 human).
- A balanced dataset helps the GAN **learn meaningful patterns** and **stabilizes training**.
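
The oversampling idea can be sketched with pandas alone. This is an illustrative variant, not the exact code used further down: the helper name `oversample_minority` and the toy frame are assumptions, and it resamples each smaller class up to the majority count instead of a fixed 1000 rows.

```python
import pandas as pd

def oversample_minority(df, label_col="generated", seed=42):
    """Resample every smaller class with replacement up to the majority count."""
    counts = df[label_col].value_counts()
    target = counts.max()
    parts = []
    for label, count in counts.items():
        part = df[df[label_col] == label]
        if count < target:
            # Draw the missing rows with replacement from the existing ones.
            extra = part.sample(n=target - count, replace=True, random_state=seed)
            part = pd.concat([part, extra], ignore_index=True)
        parts.append(part)
    # Shuffle so the classes are interleaved.
    combined = pd.concat(parts, ignore_index=True)
    return combined.sample(frac=1, random_state=seed).reset_index(drop=True)

# Toy data: 6 human rows (label 0) and 2 AI rows (label 1).
toy = pd.DataFrame({
    "text": [f"essay {i}" for i in range(8)],
    "generated": [0] * 6 + [1] * 2,
})
balanced = oversample_minority(toy)
print(balanced["generated"].value_counts())
```

Tying the target to the majority count avoids flipping the imbalance when the input is already balanced. Note that the duplicated rows add no new information, only weight.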

---

## **What steps did we follow?**

1. **Loaded the datasets** (`train_essays.csv`, `train_prompts.csv`, `test_essays.csv`).  
2. **Merged essays with prompts** to create enriched inputs (`full_text`).  
3. **Detected imbalance** in the `generated` column (0 = human, 1 = AI).  
4. **Oversampled AI examples**:
   - Replicated the 3 AI texts until ~1000 samples were reached.
   - Combined them with the 1375 human texts and shuffled the dataset.
5. **Prepared the balanced dataset**:
   - Tokenized with BERT tokenizer.
   - Created PyTorch `Dataset` and `DataLoader`.
6. **Defined the models**:
   - **Generator**: Creates fake BERT-like embeddings from random noise.
   - **Discriminator**: Classifies embeddings as human or AI.
7. **Trained the GAN**:
   - Alternating training between Generator and Discriminator.
   - Monitored **AUC score** (model performance).
8. **Inference & Submission**:
   - Used the trained Discriminator on the test set to generate predictions.
   - Created `submission.csv`.

---

## **Why is this important?**

- Without balancing, the model couldn’t learn to detect AI texts.
- Oversampling ensures the GAN sees enough AI examples to **learn meaningful features**.
- This allows us to **validate the full pipeline** (data prep → training → evaluation → inference) even with limited real data.

In [9]:
# === Create 10 human texts (label 0) ===
human_texts = [f"This is a human written essay number {i}" for i in range(10)]

# === Create 10 AI texts (label 1) ===
ai_texts = [f"This is an AI generated essay number {i}" for i in range(10)]

# === Create the train dataframe (20 balanced samples) ===
train_data = pd.DataFrame({
    "id": [f"id_{i}" for i in range(20)],
    "prompt_id": [0]*20,  # a single dummy prompt
    "text": human_texts + ai_texts,
    "generated": [0]*10 + [1]*10
})

# === Create the prompts dataframe (1 dummy prompt) ===
prompt_data = pd.DataFrame({
    "prompt_id": [0],
    "prompt_name": ["Sample Prompt"],
    "instructions": ["Write an essay about testing AI models."],
    "source_text": ["Source text for prompt."]
})

# === Create the test dataframe (3 dummy texts) ===
test_data = pd.DataFrame({
    "id": [f"test_{i}" for i in range(3)],
    "prompt_id": [0]*3,
    "text": [f"This is test essay {i}" for i in range(3)]
})

# Save to CSV for the pipeline
train_data.to_csv("train_essays.csv", index=False)
prompt_data.to_csv("train_prompts.csv", index=False)
test_data.to_csv("test_essays.csv", index=False)

print("Fake datasets created:")
print(train_data.head())
print(prompt_data.head())
print(test_data.head())

Fake datasets created:
     id  prompt_id                                    text  generated
0  id_0          0  This is a human written essay number 0          0
1  id_1          0  This is a human written essay number 1          0
2  id_2          0  This is a human written essay number 2          0
3  id_3          0  This is a human written essay number 3          0
4  id_4          0  This is a human written essay number 4          0
   prompt_id    prompt_name                             instructions  \
0          0  Sample Prompt  Write an essay about testing AI models.   

               source_text  
0  Source text for prompt.  
       id  prompt_id                  text
0  test_0          0  This is test essay 0
1  test_1          0  This is test essay 1
2  test_2          0  This is test essay 2


In [11]:
src_train = pd.read_csv("train_essays.csv")
src_prompt = pd.read_csv("train_prompts.csv")
src_sub = pd.read_csv("test_essays.csv")

In [14]:
# Merge the prompts into the essays
train_merged = pd.merge(src_train, src_prompt, on="prompt_id", how="left")
test_merged = pd.merge(src_sub, src_prompt, on="prompt_id", how="left")

# Oversample if the AI class is too small
count_classes = train_merged["generated"].value_counts()
print("Before oversampling:", count_classes)

if count_classes[1] < 1000:
    human_df = train_merged[train_merged["generated"] == 0]
    ai_df = train_merged[train_merged["generated"] == 1]

    # Repeat the AI rows until there are exactly 1000 of them.
    # Note: the target is a fixed 1000, not the human count, so on the
    # 20-row synthetic set this yields 1000 AI rows vs 10 human rows.
    ai_df_oversampled = pd.concat([ai_df] * (1000 // len(ai_df) + 1), ignore_index=True)[:1000]

    train_merged = pd.concat([human_df, ai_df_oversampled], ignore_index=True).sample(frac=1).reset_index(drop=True)

print("After oversampling:", train_merged["generated"].value_counts())

# Build full_text after balancing
train_merged["full_text"] = (
    train_merged["prompt_name"].fillna("") + " " +
    train_merged["instructions"].fillna("") + " " +
    train_merged["text"].fillna("")
)

test_merged["full_text"] = (
    test_merged["prompt_name"].fillna("") + " " +
    test_merged["instructions"].fillna("") + " " +
    test_merged["text"].fillna("")
)

Before oversampling: generated
0    10
1    10
Name: count, dtype: int64
After oversampling: generated
1    1000
0      10
Name: count, dtype: int64


In [15]:
# Count the examples per class
count_classes = train_merged["generated"].value_counts()
print("Before oversampling:", count_classes)

# Separate human and AI rows
human_df = train_merged[train_merged["generated"] == 0]
ai_df = train_merged[train_merged["generated"] == 1]

# Replicate the AI texts to reach ~1000 examples
ai_df_oversampled = pd.concat([ai_df] * (1000 // len(ai_df) + 1), ignore_index=True)[:1000]

# Combine and shuffle
balanced_train = pd.concat([human_df, ai_df_oversampled], ignore_index=True).sample(frac=1).reset_index(drop=True)

print("After oversampling:", balanced_train["generated"].value_counts())

Before oversampling: generated
1    1000
0      10
Name: count, dtype: int64
After oversampling: generated
1    1000
0      10
Name: count, dtype: int64


### Step 2 - Merge with Prompts and Enrich Text

In [16]:
# === Step 2: Merge the prompts ===
train_merged = pd.merge(src_train, src_prompt, on="prompt_id", how="left")
test_merged = pd.merge(src_sub, src_prompt, on="prompt_id", how="left")

# === ADDED: oversample if the AI class is too small ===
count_classes = train_merged["generated"].value_counts()
print("Before oversampling:", count_classes)

if count_classes[1] < 1000:  # fewer than 1000 AI rows
    human_df = train_merged[train_merged["generated"] == 0]
    ai_df = train_merged[train_merged["generated"] == 1]

    ai_df_oversampled = pd.concat([ai_df] * (1000 // len(ai_df) + 1), ignore_index=True)[:1000]

    train_merged = pd.concat([human_df, ai_df_oversampled], ignore_index=True).sample(frac=1).reset_index(drop=True)

print("After oversampling:", train_merged["generated"].value_counts())

# === Build full_text after balancing ===
train_merged["full_text"] = (
    train_merged["prompt_name"].fillna("") + " " +
    train_merged["instructions"].fillna("") + " " +
    train_merged["text"].fillna("")
)

test_merged["full_text"] = (
    test_merged["prompt_name"].fillna("") + " " +
    test_merged["instructions"].fillna("") + " " +
    test_merged["text"].fillna("")
)

Before oversampling: generated
0    10
1    10
Name: count, dtype: int64
After oversampling: generated
1    1000
0      10
Name: count, dtype: int64


### Step 3 - Prepare BERT Tokenizer and Model

In [17]:
# Load BERT tokenizer and model
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
pretrained_model = BertModel.from_pretrained("bert-base-uncased")

# Move BERT to GPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
embedding_model = pretrained_model.to(device)

### Step 4 - Define Hyperparameters and Prepare Data

In [18]:
# Hyperparameters
train_batch_size = 32
test_batch_size = 64
lr = 0.00005
beta1 = 0.5
nz = 100
num_epochs = 5
num_hidden_layers = 6
train_ratio = 0.8

# Dataset class
class GANDAIGDataset(Dataset):
    def __init__(self, texts, labels):
        self.texts = texts
        self.labels = labels

    def __len__(self):
        return len(self.texts)

    def __getitem__(self, idx):
        return self.texts[idx], self.labels[idx]

# Prepare data split
all_texts = train_merged["full_text"].tolist()
all_labels = train_merged["generated"].tolist()

all_num = len(all_texts)
train_num = int(all_num * train_ratio)

train_texts = all_texts[:train_num]
train_labels = all_labels[:train_num]
test_texts = all_texts[train_num:]
test_labels = all_labels[train_num:]

train_dataset = GANDAIGDataset(train_texts, train_labels)
test_dataset = GANDAIGDataset(test_texts, test_labels)

train_loader = DataLoader(train_dataset, batch_size=train_batch_size, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=test_batch_size, shuffle=False)
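
The split above takes the first 80% of the shuffled rows, so a very small class can end up missing from the held-out part, which leaves the AUC undefined. A stratified split is one possible alternative. The sketch below uses scikit-learn's `train_test_split` on toy lists (the data and variable names are assumptions for illustration):

```python
from sklearn.model_selection import train_test_split

# Toy data: 20 human essays (label 0) and 5 AI essays (label 1).
texts = [f"essay {i}" for i in range(25)]
labels = [0] * 20 + [1] * 5

# stratify=labels keeps the class proportions the same in both parts.
tr_texts, va_texts, tr_labels, va_labels = train_test_split(
    texts, labels, train_size=0.8, stratify=labels, random_state=42
)

print("train:", len(tr_texts), "rows,", sum(tr_labels), "AI")
print("valid:", len(va_texts), "rows,", sum(va_labels), "AI")
```

The resulting lists can be passed to `GANDAIGDataset` in place of the sliced lists.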

### Step 5 - Define Generator Model

In [19]:
import torch.nn as nn
from transformers import BertConfig, BertModel

config = BertConfig(num_hidden_layers=num_hidden_layers)

class Generator(nn.Module):
    def __init__(self, input_dim):
        super().__init__()
        self.fc = nn.Linear(input_dim, 256 * 128)

        self.conv_net = nn.Sequential(
            nn.ConvTranspose1d(256, 128, kernel_size=4, stride=2, padding=1),
            nn.BatchNorm1d(128),
            nn.ReLU(True),
            nn.ConvTranspose1d(128, 64, kernel_size=4, stride=2, padding=1),
            nn.BatchNorm1d(64),
            nn.ReLU(True),
            nn.ConvTranspose1d(64, 128, kernel_size=4, stride=2, padding=1),
            nn.Tanh()
        )

        self.output_fc = nn.Linear(128, 768)  # project to the BERT hidden size
        self.bert_encoder = BertModel(config)

    def forward(self, x):
        # 1. Fully connected layer, reshaped to (batch, channels, length)
        x = self.fc(x)
        x = x.view(-1, 256, 128)

        # 2. Transposed convolutions
        x = self.conv_net(x)

        # 3. Reshape to (batch, seq_len, features)
        x = x.permute(0, 2, 1)
        x = self.output_fc(x)

        # 4. Squash the generated embeddings into [-1, 1]
        x = torch.tanh(x)

        # 5. Truncate to at most 512 tokens
        if x.size(1) > 512:
            x = x[:, :512, :]

        # 6. Pass through BERT
        x = self.bert_encoder(inputs_embeds=x).last_hidden_state
        return x
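
The truncation step in `forward` is needed because each transposed convolution with `kernel_size=4, stride=2, padding=1` doubles the sequence length. A small sketch of the length arithmetic, using the output-size formula from the PyTorch `ConvTranspose1d` documentation (the helper name is hypothetical):

```python
def conv_transpose1d_out_len(length, kernel_size, stride, padding,
                             output_padding=0, dilation=1):
    """Output length of a 1-D transposed convolution (formula from the PyTorch docs)."""
    return ((length - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)

# The Generator reshapes the fully connected output to length 128,
# then applies three identical transposed convolutions.
lengths = []
length = 128
for _ in range(3):
    length = conv_transpose1d_out_len(length, kernel_size=4, stride=2, padding=1)
    lengths.append(length)

print(lengths)  # [256, 512, 1024]
```

The final length of 1024 exceeds the 512 positions of the default `BertConfig`, which is why the sequence is cut to 512 before it reaches the encoder.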

### Step 6 - Define Discriminator Model

In [20]:
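class SumBertPooler(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Sum over the sequence dimension: (batch, seq_len, hidden) -> (batch, hidden)
        sum_hidden = hidden_states.sum(dim=1)
        # Despite its name, `sum_mask` is not built from an attention mask:
        # it is each row's total over the hidden dimension, shape (batch, 1).
        sum_mask = sum_hidden.sum(1).unsqueeze(1)
        # Any total below 1e-9 (including every negative total) becomes 1e-9,
        # which can inflate the output enough to saturate the sigmoid.
        sum_mask = torch.clamp(sum_mask, min=1e-9)
        mean_embeddings = sum_hidden / sum_mask
        return mean_embeddings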

class Discriminator(nn.Module):
    def __init__(self, pretrained_model, num_hidden_layers):
        super().__init__()
        config = BertConfig(num_hidden_layers=num_hidden_layers)
        self.bert_encoder = BertModel(config)
        self.bert_encoder.encoder.layer = nn.ModuleList(
            [layer for layer in pretrained_model.encoder.layer[:num_hidden_layers]]
        )

        self.pooler = SumBertPooler()

        self.classifier = nn.Sequential(
            nn.Linear(config.hidden_size, 128),
            nn.ReLU(),
            nn.Linear(128, 1)
        )

    def forward(self, input_embeddings):
        out = self.bert_encoder(inputs_embeds=input_embeddings).last_hidden_state
        out = self.pooler(out)
        out = self.classifier(out)
        return torch.sigmoid(out).view(-1)
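
`SumBertPooler` divides the summed hidden states by their total over the feature dimension, not by a token count. A more common choice is mean pooling weighted by the attention mask. The sketch below is a hedged alternative, not the template's method: `masked_mean_pool` is a hypothetical name, and using it would also mean passing the attention mask into `Discriminator.forward`, which the current code does not do.

```python
import torch

def masked_mean_pool(hidden_states, attention_mask):
    """Average token vectors, ignoring positions where attention_mask is 0."""
    # (batch, seq_len) -> (batch, seq_len, 1) so it broadcasts over features.
    mask = attention_mask.unsqueeze(-1).to(hidden_states.dtype)
    summed = (hidden_states * mask).sum(dim=1)   # (batch, hidden)
    counts = mask.sum(dim=1).clamp(min=1e-9)     # (batch, 1), number of real tokens
    return summed / counts

# One sequence of three tokens; the last one is padding and must be ignored.
hidden = torch.tensor([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
mask = torch.tensor([[1, 1, 0]])
pooled = masked_mean_pool(hidden, mask)
print(pooled)  # tensor([[2., 3.]])
```

Because the denominator is a token count, it is always positive and the pooled values stay on the same scale as the inputs.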

### Step 7 - Training Loop with AUC Integration

In [21]:
# Initialize models
netG = Generator(nz).to(device)
netD = Discriminator(pretrained_model, num_hidden_layers).to(device)

criterion = nn.BCELoss()
optimizerD = torch.optim.Adam(netD.parameters(), lr=lr, betas=(beta1, 0.999))
optimizerG = torch.optim.Adam(netG.parameters(), lr=lr, betas=(beta1, 0.999))

# AUC evaluation function
def evaluate_auc(model, data_loader):
    model.eval()
    all_preds = []
    all_labels = []
    with torch.no_grad():
        for texts, labels in data_loader:
            encodings = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
            input_ids = encodings['input_ids'].to(device)
            token_type_ids = encodings['token_type_ids'].to(device)
            attention_mask = encodings['attention_mask'].to(device)

            real_embeddings = embedding_model(
                input_ids=input_ids,
                token_type_ids=token_type_ids,
                attention_mask=attention_mask
            ).last_hidden_state

            preds = model(real_embeddings)
            all_preds.extend(preds.cpu().numpy())
            all_labels.extend(labels.numpy())
    return roc_auc_score(all_labels, all_preds)

best_auc = 0
best_model_state = None

for epoch in range(num_epochs):
    netD.train()
    netG.train()

    for i, (texts, labels) in enumerate(train_loader):
        labels = labels.to(device)

        # Real embeddings (computed without gradient tracking)
        with torch.no_grad():
            encodings = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
            input_ids = encodings['input_ids'].to(device)
            token_type_ids = encodings['token_type_ids'].to(device)
            attention_mask = encodings['attention_mask'].to(device)

            real_embeddings = embedding_model(
                input_ids=input_ids,
                token_type_ids=token_type_ids,
                attention_mask=attention_mask
            ).last_hidden_state

        # 1. Discriminator
        netD.zero_grad()
        batch_size = real_embeddings.size(0)

        real_labels = torch.ones(batch_size, device=device)
        output_real = netD(real_embeddings)
        loss_real = criterion(output_real, real_labels)
        loss_real.backward()

        noise = torch.randn(batch_size, nz, device=device)
        fake_embeddings = netG(noise)

        fake_labels = torch.zeros(batch_size, device=device)
        output_fake = netD(fake_embeddings.detach())
        loss_fake = criterion(output_fake, fake_labels)
        loss_fake.backward()
        optimizerD.step()

        # 2. Generator
        netG.zero_grad()
        output_fake_for_G = netD(fake_embeddings)
        loss_G = criterion(output_fake_for_G, real_labels)
        loss_G.backward()
        optimizerG.step()

        if i % 50 == 0:
            print(f"[{epoch}/{num_epochs}][{i}/{len(train_loader)}] "
                  f"Loss_D: {(loss_real+loss_fake).item():.4f} Loss_G: {loss_G.item():.4f}")

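    # AUC on the held-out split
    current_auc = evaluate_auc(netD, test_loader)
    print(f"Epoch {epoch+1}/{num_epochs} - AUC: {current_auc:.4f}")

    if current_auc > best_auc:
        best_auc = current_auc
        # state_dict() returns references to the live tensors, so clone them;
        # otherwise later epochs would overwrite the saved "best" weights.
        best_model_state = {k: v.detach().clone() for k, v in netD.state_dict().items()}

# Load the best model
netD.load_state_dict(best_model_state)
print(f"Best model loaded with AUC: {best_auc:.4f}")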

[0/5][0/26] Loss_D: 100.0000 Loss_G: 3.1250
Epoch 1/5 - AUC: 0.5000
[1/5][0/26] Loss_D: 100.0000 Loss_G: 3.1250
Epoch 2/5 - AUC: 0.5000
[2/5][0/26] Loss_D: 96.8750 Loss_G: 0.0000
Epoch 3/5 - AUC: 0.5000
[3/5][0/26] Loss_D: 100.0000 Loss_G: 0.0000
Epoch 4/5 - AUC: 0.5000
[4/5][0/26] Loss_D: 96.8750 Loss_G: 3.1250
Epoch 5/5 - AUC: 0.5000
Best model loaded with AUC: 0.5000
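
The logged losses are all multiples of 100/32 = 3.125. That pattern is consistent with Discriminator outputs saturating at exactly 0 or 1 on a batch of 32: `nn.BCELoss` clamps each log term at -100, so every fully wrong prediction contributes 100 and every fully right one contributes 0. The sketch below reproduces one such value with made-up predictions (the tensors are assumptions, not values from the run):

```python
import torch
import torch.nn as nn

criterion = nn.BCELoss()

# A batch of 32 samples whose target is 1.
targets = torch.ones(32)

# Saturated predictions: 31 are exactly 0 (fully wrong), 1 is exactly 1 (fully right).
preds = torch.zeros(32)
preds[0] = 1.0

# Each wrong sample costs 100 (the clamped -log(0)); the mean is 31 * 100 / 32.
loss = criterion(preds, targets)
print(loss.item())  # 96.875
```

When the outputs are pinned at 0 or 1, the sigmoid's gradient is essentially zero, which may explain why the losses never move. Returning raw logits and switching to `nn.BCEWithLogitsLoss` is a common way to avoid this, at the cost of changing the `Discriminator.forward` contract.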


### Step 8 - Inference + Submission

In [22]:
# === Prepare the test texts ===
sub_texts = test_merged["full_text"].tolist()

# === Evaluation mode ===
netD.eval()
sub_predictions = []

with torch.no_grad():
    for i in range(0, len(sub_texts), test_batch_size):
        batch_texts = sub_texts[i:i+test_batch_size]

        # Tokenize with a reduced max_length to avoid OOM
        encodings = tokenizer(
            batch_texts,
            padding=True,
            truncation=True,
            max_length=256,
            return_tensors="pt"
        )
        input_ids = encodings['input_ids'].to(device)
        token_type_ids = encodings['token_type_ids'].to(device)
        attention_mask = encodings['attention_mask'].to(device)

        # Real embeddings via BERT
        real_embeddings = embedding_model(
            input_ids=input_ids,
            token_type_ids=token_type_ids,
            attention_mask=attention_mask
        ).last_hidden_state

        # Predictions from the Discriminator
        preds = netD(real_embeddings)
        sub_predictions.extend(preds.cpu().numpy())

# === Create the submission file ===
submission = pd.DataFrame({
    "id": src_sub["id"],
    "generated": sub_predictions
})

submission.to_csv("submission.csv", index=False)
print("Submission saved:")
print(submission.head())

Submission saved:
       id  generated
0  test_0        1.0
1  test_1        1.0
2  test_2        1.0
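
A quick sanity check on the submission frame can flag a collapsed model before the file is uploaded. This is a minimal sketch: `check_submission` is a hypothetical helper, and the expected layout is simply the two columns this notebook writes.

```python
import pandas as pd

def check_submission(submission, expected_ids):
    """Return a list of problems found in a submission frame (empty list means OK)."""
    problems = []
    if list(submission.columns) != ["id", "generated"]:
        problems.append("columns must be exactly ['id', 'generated']")
        return problems
    if list(submission["id"]) != list(expected_ids):
        problems.append("ids do not match the test set order")
    if submission["generated"].isna().any():
        problems.append("predictions contain NaN")
    elif not submission["generated"].between(0.0, 1.0).all():
        problems.append("predictions fall outside [0, 1]")
    if submission["generated"].nunique() == 1:
        problems.append("all predictions are identical")
    return problems

# Mirrors the recorded output above: every essay scored 1.0.
demo = pd.DataFrame({"id": ["test_0", "test_1", "test_2"],
                     "generated": [1.0, 1.0, 1.0]})
print(check_submission(demo, ["test_0", "test_1", "test_2"]))
```

Identical scores for every row carry no ranking information, which matches the AUC of 0.5 seen during training.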


# Conclusion – GAN + BERT AI Text Detector

---

## **What we did**

- Built a **GAN-based architecture** where:
  - **Generator** creates fake BERT-like embeddings.
  - **Discriminator** classifies embeddings as Human vs AI.
- Used **BERT (bert-base-uncased)** for embedding extraction.
- Handled **severe dataset imbalance** (3 AI vs 1375 human texts) by **oversampling AI texts**.
- Optimized for Colab GPU:
  - Reduced batch size and max token length to avoid OOM.
  - Added normalization (`tanh`) to Generator outputs.

---

## **What we observed**

- **Pipeline works end-to-end** (data prep → training → inference → submission).
- **Training remained unstable**:
  - Loss_D and Loss_G oscillated or stayed high.
  - AUC stayed around 0.5 (random guessing).
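- Likely reasons:
  - **No real variety in AI examples** (oversampling cannot create new information).
  - The training loop labels every real essay as 1 and every generated embedding as 0, so the Discriminator is never trained on the `generated` column that the AUC is measured against.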
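
An AUC of exactly 0.5 is what `roc_auc_score` returns when every sample gets the same score, since constant scores cannot rank one class above the other. A small illustration with made-up labels and scores:

```python
from sklearn.metrics import roc_auc_score

labels = [0, 0, 1, 1, 1, 1]

# Every sample gets the same score: no ranking information at all.
constant_scores = [1.0] * len(labels)
auc_constant = roc_auc_score(labels, constant_scores)

# Every AI sample (label 1) is scored above every human sample (label 0).
ranked_scores = [0.1, 0.2, 0.7, 0.8, 0.9, 0.6]
auc_ranked = roc_auc_score(labels, ranked_scores)

print(auc_constant)  # 0.5
print(auc_ranked)    # 1.0
```

AUC depends only on how the scores order the samples, so a model that outputs 1.0 for everything scores 0.5 regardless of the class balance.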

---

## **What we learned**

- GANs need **balanced and diverse data** to train properly.
- BERT embeddings are powerful but memory-heavy → careful tuning required (batch size, max length).
- Even with low AUC, we validated **the methodology and technical steps**:
  - Data preparation
  - Model integration
  - GAN training loop
  - AUC evaluation
  - Inference and submission file generation

---

## **Next steps (if real data available)**

- Add **real AI-generated texts** to improve diversity.
- Experiment with **data augmentation** (paraphrasing, synthetic AI texts).
- Try **fine-tuning BERT** instead of using frozen embeddings.
- Explore **alternative architectures** (simple classifiers on BERT embeddings can outperform GANs on small datasets).



# Key Learnings from this Daily

- **Handling imbalanced datasets**  
  Learned how to detect class imbalance and apply **oversampling** techniques to create a more balanced training set.

- **Understanding GAN architecture for text**  
  Explored how the **Generator** creates synthetic embeddings and the **Discriminator** classifies human vs AI embeddings, training in opposition.

- **Using BERT for text embeddings**  
  Understood how to leverage **BERT (bert-base-uncased)** to convert text into embeddings and integrate them into a GAN framework.

- **Managing GPU limitations in Colab**  
  Gained experience in troubleshooting GPU memory errors (OOM) and optimizing training with smaller batch sizes and reduced sequence lengths.

- **Building a complete ML pipeline**  
  Validated the entire workflow: data preparation, model training, AUC evaluation, inference, and submission file generation.
