# 🧠 Introduction — AI-Generated Text Detector using a GAN

Generative Adversarial Networks (GANs) are a powerful class of deep learning models introduced by Ian Goodfellow and colleagues in 2014. They are based on a simple yet brilliant idea: **two neural networks compete against each other in a zero-sum game**.

A GAN consists of two main components:
- **The Generator**: its role is to produce data that looks as realistic as possible (in our case, synthetic vectors that mimic the BERT embeddings of real essays).
- **The Discriminator**: its task is to tell real data from generated data. Here it is also the classifier we keep at the end, scoring each essay embedding as human-written or AI-written.

Over time, the generator gets better at “fooling” the discriminator, while the discriminator becomes more skilled at detecting generated content. This adversarial training loop enables GANs to learn highly expressive data representations.
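
For reference, this game is usually written as the minimax objective below, where D(x) is the discriminator's estimated probability that x is real, G(z) maps a noise vector z to a sample, and V is the value function both networks play over. This is the textbook formulation, not something specific to this notebook:

```latex
\min_G \max_D \; V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
```

In practice the generator is often trained to maximize log D(G(z)) instead (the non-saturating variant), which is what a binary cross-entropy loss with a target of 1 on generated samples amounts to.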

---

### 🎯 Project Objective

In this challenge, we’ll leverage this architecture to **build a detector for AI-generated text**. Specifically:
- We'll use a pre-trained **BERT** model to encode the input texts.
- We'll train a **generator** to produce embeddings that resemble those from AI-generated text.
- We'll train a **discriminator** to decide whether an embedding comes from a human or an AI-written text.

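We’ll evaluate the discriminator’s performance using the **AUC (Area Under the ROC Curve)** score, which measures how well the predicted probabilities rank AI-written texts above human-written ones and is a key metric for binary classification tasks like this one.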
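
To make the metric concrete, here is a minimal pure-Python sketch of AUC as the fraction of (positive, negative) pairs that are ranked correctly. The function name `pairwise_auc` and the toy values are illustrative assumptions; the notebook itself relies on `roc_auc_score` from scikit-learn.

```python
def pairwise_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy data (illustrative values, not taken from the dataset)
toy_labels = [0, 0, 1, 0, 1]
toy_scores = [0.10, 0.40, 0.35, 0.20, 0.80]
toy_auc = pairwise_auc(toy_labels, toy_scores)
print(toy_auc)
```

Because only the ordering of scores matters, AUC does not depend on any particular decision threshold, but it becomes a noisy estimate when one of the two classes has very few examples.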

This project is a great opportunity to deepen our understanding of GANs while building a practical system for **detecting AI-generated content**—a growing concern in today’s generative AI landscape.

---

📌 *Note: this challenge uses a hybrid approach that combines GANs with Transformer-based architectures to leverage both semantic encoding and adversarial training.*


In [1]:
import numpy as np
import pandas as pd
import random
import string

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader

from transformers import BertTokenizer, BertForSequenceClassification
from transformers import BertConfig
from transformers.models.bert.modeling_bert import BertEncoder
from sklearn.metrics import roc_auc_score

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

In [2]:
import os
from google.colab import files

# Remove all files from /content (directories are left untouched)
for f in os.listdir("/content"):
    path = f"/content/{f}"
    if os.path.isfile(path):
        os.remove(path)

# Start the interactive upload
print(" Please upload your files now.")
uploaded = files.upload()

# Print only the names of the uploaded files
print("\n Uploaded files:")
for fname in uploaded.keys():
    print(fname)


 Please upload your files now.


Saving sample_submission.csv to sample_submission.csv
Saving test_essays.csv to test_essays.csv
Saving train_essays.csv to train_essays.csv
Saving train_prompts.csv to train_prompts.csv

 Uploaded files:
sample_submission.csv
test_essays.csv
train_essays.csv
train_prompts.csv


### 📥 Load Data Files

In this step, we load all the CSV files required for our project:
- `train_essays.csv`: training texts with associated labels (AI or human)
- `test_essays.csv`: unlabeled texts to predict on
- `train_prompts.csv`: corresponding prompts used to generate or inspire the texts
- `sample_submission.csv`: format for submitting predictions




In [4]:
TRAIN_PATH = "/content/train_essays.csv"
TEST_PATH = "/content/test_essays.csv"
PROMPT_PATH = "/content/train_prompts.csv"
SUB_PATH = "/content/sample_submission.csv"


src_train = pd.read_csv(TRAIN_PATH)
src_test = pd.read_csv(TEST_PATH)
src_prompt = pd.read_csv(PROMPT_PATH)
src_sub = pd.read_csv(SUB_PATH)

In [5]:
print(src_train.shape, src_test.shape, src_prompt.shape, src_sub.shape)
print("Sample from the training set:")
src_train.sample(5)

(1378, 4) (3, 3) (2, 4) (3, 2)
Sample from the training set:


| | id | prompt_id | text | generated |
|---|---|---|---|---|
| 1101 | c8b89dd4 | 1 | Dear, state senator I think we should change t... | 0 |
| 88 | 135b769a | 1 | To the State Senate, Addressing my ultimate op... | 0 |
| 468 | 5cd6a57e | 0 | Cars our main source for travel, what we depen... | 0 |
| 38 | 08157ec0 | 0 | To access what one needs in the world today, m... | 0 |
| 177 | 228a014b | 1 | Dear my Senator, whats the point in voting if ... | 0 |


In [6]:
src_train["prompt_id"].unique()

array([0, 1])

In [7]:
src_train["generated"].unique()

array([0, 1])

In [8]:
src_train["text"][0][:1000]

'Cars. Cars have been around since they became famous in the 1900s, when Henry Ford created and built the first ModelT. Cars have played a major role in our every day lives since then. But now, people are starting to question if limiting car usage would be a good thing. To me, limiting the use of cars might be a good thing to do.\n\nIn like matter of this, article, "In German Suburb, Life Goes On Without Cars," by Elizabeth Rosenthal states, how automobiles are the linchpin of suburbs, where middle class families from either Shanghai or Chicago tend to make their homes. Experts say how this is a huge impediment to current efforts to reduce greenhouse gas emissions from tailpipe. Passenger cars are responsible for 12 percent of greenhouse gas emissions in Europe...and up to 50 percent in some carintensive areas in the United States. Cars are the main reason for the greenhouse gas emissions because of a lot of people driving them around all the time getting where they need to go. Article

In [9]:
src_test.columns

Index(['id', 'prompt_id', 'text'], dtype='object')

In [10]:
src_prompt.columns

Index(['prompt_id', 'prompt_name', 'instructions', 'source_text'], dtype='object')

In [11]:
src_prompt.head()

| | prompt_id | prompt_name | instructions | source_text |
|---|---|---|---|---|
| 0 | 0 | Car-free cities | Write an explanatory essay to inform fellow ci... | # In German Suburb, Life Goes On Without Cars ... |
| 1 | 1 | Does the electoral college work? | Write a letter to your state senator in which ... | # What Is the Electoral College? by the Office... |


### 📊 Quick Statistical Overview

Before building our model, let's explore the training data to understand its structure:

- We check the **class distribution** in the `generated` column to see if the dataset is balanced.
- We compute the **average length** of the texts, which may influence model performance or input preprocessing.
- We also compare the number of **unique prompts** in `train_prompts.csv` with the number of distinct `prompt_id` values used in the training set, to check that every prompt is represented among the essays.

This gives us a quick but useful overview of our data before diving into modeling.


In [12]:
# Quick statistical analysis
print("\n Class distribution ('generated' column):")
print(src_train['generated'].value_counts())

print("\n Text length statistics (in characters):")
src_train['text_length'] = src_train['text'].apply(len)
print(src_train['text_length'].describe())

print("\n Number of unique prompts:", src_prompt['prompt_id'].nunique())
print(" Number of distinct prompts used in the training essays:", src_train['prompt_id'].nunique())



 Class distribution ('generated' column):
generated
0    1375
1       3
Name: count, dtype: int64

 Text length statistics (in characters):
count    1378.000000
mean     3169.050798
std       920.588198
min      1356.000000
25%      2554.250000
50%      2985.500000
75%      3623.750000
max      8436.000000
Name: text_length, dtype: float64

 Number of unique prompts: 2
 Number of distinct prompts used in the training essays: 2
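
With only 3 AI-generated essays out of 1,378, how the data is split matters a lot: a plain sequential cut can leave one side with no positive example at all. The sketch below shows one possible stratified split in pure Python. The helper name `stratified_split` and its defaults are assumptions for illustration and are not part of the notebook.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Return (train_idx, test_idx) with each class split by the same fraction."""
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        cut = int(len(idxs) * train_fraction)
        # Keep at least one example of the class on each side when possible
        if len(idxs) > 1:
            cut = min(max(cut, 1), len(idxs) - 1)
        train_idx.extend(idxs[:cut])
        test_idx.extend(idxs[cut:])
    return sorted(train_idx), sorted(test_idx)


# Same class counts as reported above: 1375 human (0) and 3 AI (1)
labels = [0] * 1375 + [1] * 3
train_idx, test_idx = stratified_split(labels)
train_pos = sum(labels[i] for i in train_idx)
test_pos = sum(labels[i] for i in test_idx)
print(len(train_idx), len(test_idx), train_pos, test_pos)
```

Stratification does not create more positive examples, so any metric computed on such a split remains a rough estimate.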


### Preparing the BERT Model for Embeddings

In this step, we load a pre-trained BERT model (`bert-base-uncased`) from Hugging Face Transformers.

Here's what we do:
- Load the **BERT tokenizer** to preprocess text into tokens compatible with the model.
- Load the **full classification model**, which includes the encoder (BERT) and a classification head.
- **Extract only the encoder part** (i.e., the base BERT layers without the classification head), which we'll use to generate text embeddings.

These embeddings will later be used as input to the **GAN discriminator**, which learns to distinguish between human-written and AI-generated text.


In [13]:
# Prepare the BERT model for embeddings

tokenizer_save_path = "bert_tokenizer"
model_save_path = "bert_model"

# Load the pre-trained BERT tokenizer (lowercased version)
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Load the full BERT model for sequence classification
# Only its encoder is used below; the classification head is discarded
pretrained_model = BertForSequenceClassification.from_pretrained("bert-base-uncased")

# Extract the underlying BertModel (embeddings + encoder), without the classification head
embedding_model = pretrained_model.bert


Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [14]:
# Example batch of texts (you can later apply this to the full dataset)
sample_texts = src_train['text'].tolist()[:4]  # Take a few samples for now

# Tokenization: converts text into input_ids and attention masks
inputs = tokenizer(sample_texts, padding=True, truncation=True, return_tensors="pt")

# Extract embeddings from BERT (encoder only, no classification head)
with torch.no_grad():
    outputs = embedding_model(**inputs)

# outputs.last_hidden_state → (batch_size, sequence_length, hidden_dim)
# We'll use the [CLS] token representation as a global summary of each input
embeddings = outputs.last_hidden_state[:, 0, :]  # shape = (batch_size, hidden_dim)

print("Shape of extracted embeddings:", embeddings.shape)



Shape of extracted embeddings: torch.Size([4, 768])


In [15]:
# Tokenize all texts in the training dataset
inputs = tokenizer(
    src_train["text"].tolist(),
    padding=True,
    truncation=True,
    max_length=512,
    return_tensors="pt"
)

# Run the inputs through BERT encoder (without classification head)
embedding_model.eval()
with torch.no_grad():
    outputs = embedding_model(**inputs)
    # Extract the [CLS] token embeddings (represents the entire input)
    text_embeddings = outputs.last_hidden_state[:, 0, :]  # shape = (batch_size, hidden_dim)

print("Extracted embeddings shape:", text_embeddings.shape)


Extracted embeddings shape: torch.Size([1378, 768])
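
The cell above pushes all 1,378 tokenized essays through BERT in a single forward pass, which can exhaust memory on smaller machines. A minimal sketch of a batched alternative follows. `iter_batches`, `embed_in_batches` and the stand-in `fake_embed` are hypothetical helpers so the example runs without BERT; in the notebook, the embedding function would tokenize a batch, call `embedding_model` under `torch.no_grad()` and return the `[CLS]` vectors.

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_in_batches(texts, embed_fn, batch_size=32):
    """Apply `embed_fn` to each batch and concatenate the per-text results.

    `embed_fn` is a hypothetical callable mapping a list of texts to a list
    with one embedding per text.
    """
    embeddings = []
    for batch in iter_batches(texts, batch_size):
        embeddings.extend(embed_fn(batch))
    return embeddings


# Stand-in embedding function so the sketch runs without BERT:
# each "embedding" is just [number of characters, number of words].
def fake_embed(batch):
    return [[len(t), len(t.split())] for t in batch]


demo_texts = ["first essay", "second essay text", "third"]
demo_embeddings = embed_in_batches(demo_texts, fake_embed, batch_size=2)
print(demo_embeddings)
```

Only one batch of activations is held in memory at a time, and the order of the outputs matches the order of the input texts.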


In [16]:
"""# Parameter definition"""

train_batch_size = 32         # Batch size for GAN training on precomputed embeddings
test_batch_size = 64          # Larger batch size for evaluation (no backpropagation)
lr = 2e-5                     # Learning rate (typical for BERT fine-tuning; lower than the 2e-4 often used for GANs)
beta1 = 0.5                   # Beta1 parameter for Adam optimizer, standard for GAN training
nz = 100                      # Dimensionality of the latent vector (input to the generator)
num_epochs = 5                # Number of training epochs (adjustable based on time and overfitting)
num_hidden_layers = 2         # Number of layers in the generator's BERT encoder block
train_ratio = 1               # One discriminator step per generator step (not used in the cells below)


In [17]:
"""# Data Preparation"""

# Custom PyTorch Dataset to wrap our embeddings and labels
class GANDAIGDataset(torch.utils.data.Dataset):
    def __init__(self, texts, labels):
        self.texts = texts            # Precomputed BERT embeddings (one vector per essay)
        self.labels = labels          # Corresponding binary labels (0 = human, 1 = AI)

    def __len__(self):
        return len(self.texts)        # Returns the total number of examples

    def __getitem__(self, idx):
        return self.texts[idx], self.labels[idx]  # Fetch one (embedding, label) pair

# Total number of training examples
all_num = len(src_train)

# Fraction of examples used for training (sequential split: no shuffling or stratification)
train_ratio_split = 0.8
train_num = int(all_num * train_ratio_split)   # Number of training examples
test_num = all_num - train_num                 # Remaining go to test set

# Split the embeddings and labels accordingly
train_embeddings = text_embeddings[:train_num]     # Training embeddings
test_embeddings = text_embeddings[train_num:]      # Testing embeddings

train_labels = src_train['generated'].values[:train_num]   # Training labels
test_labels = src_train['generated'].values[train_num:]    # Testing labels

# Store the original dataframes for inspection or future use
train_set = src_train.iloc[:train_num]
test_set = src_train.iloc[train_num:].reset_index(drop=True)

# Create custom datasets for PyTorch
train_dataset = GANDAIGDataset(train_embeddings, train_labels)
test_dataset = GANDAIGDataset(test_embeddings, test_labels)

# Wrap the datasets in DataLoaders for batch training
train_loader = DataLoader(train_dataset, batch_size=train_batch_size, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=test_batch_size, shuffle=False)


In [18]:
"""# Generator definition"""

# Define a BERT-style encoder configuration to reuse for the simulated embedding layer
config = BertConfig(num_hidden_layers=num_hidden_layers)

class Generator(nn.Module):
    def __init__(self, input_dim):
        super().__init__()

        # Project the latent noise vector z into an intermediate shape suitable for transposed convolutions
        self.fc = nn.Linear(input_dim, 256 * 4)  # Output shape: (batch, 1024)

        # Transposed convolutional layers that upsample the length-4 sequence before the final projection
        self.conv_net = nn.Sequential(
            nn.ConvTranspose1d(256, 128, kernel_size=4, stride=2),  # Output: (batch, 128, 10)
            nn.ReLU(),
            nn.ConvTranspose1d(128, 64, kernel_size=4, stride=2),   # Output: (batch, 64, 22)
            nn.ReLU(),
            nn.ConvTranspose1d(64, 32, kernel_size=4, stride=2),    # Output: (batch, 32, 46)
            nn.ReLU(),
            nn.ConvTranspose1d(32, 1, kernel_size=4, stride=2),     # Output: (batch, 1, 94)
            nn.AdaptiveAvgPool1d(96),  # Resamples to a fixed length: (batch, 1, 96)
            nn.Flatten(),              # Flatten to (batch, 96)
            nn.Linear(96, 768)         # Final projection to match BERT embedding size
        )

        # BERT encoder block used to refine the output embedding
        self.bert_encoder = BertEncoder(config)

    def forward(self, x):
        x = self.fc(x)                        # (batch, 1024)
        x = x.view(-1, 256, 4)                # Reshape to (batch, channels, seq_len)
        x = self.conv_net(x)                  # (batch, 768)

        # Simulate a [CLS]-like embedding for compatibility with discriminator
        extended = x.unsqueeze(1)             # Add sequence dimension: (batch, 1, 768)

        # With a single token there is nothing to mask, so no attention mask is passed.
        # (BertEncoder expects an additive, broadcastable mask rather than a 0/1 mask.)
        encoder_outputs = self.bert_encoder(hidden_states=extended)

        # Extract the output embedding (like [CLS] token)
        x = encoder_outputs.last_hidden_state[:, 0, :]  # Final shape: (batch, 768)
        return x

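The sequence lengths inside the generator's `conv_net` can be checked by hand with the output-length formula PyTorch documents for `ConvTranspose1d`. The helper below is an illustrative sketch (its name is an assumption), applied to the four layers defined above with `kernel_size=4`, `stride=2` and no padding.

```python
def conv_transpose1d_out_len(length_in, kernel_size, stride=1, padding=0,
                             output_padding=0, dilation=1):
    """Output length of a 1-D transposed convolution."""
    return ((length_in - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)


length = 4  # sequence length after reshaping the 1024-dim vector to (256, 4)
lengths = [length]
for _ in range(4):  # four ConvTranspose1d layers, each kernel_size=4, stride=2
    length = conv_transpose1d_out_len(length, kernel_size=4, stride=2)
    lengths.append(length)
print(lengths)
```

The lengths come out as 4, 10, 22, 46 and 94, so the last transposed convolution yields 94 positions and `AdaptiveAvgPool1d(96)` is what brings the sequence to the fixed length of 96 expected by the final linear layer.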

In [19]:
"""# Discriminator definition"""

from transformers import BertModel

# Custom pooling layer over token embeddings (defined here but not used by the Discriminator below)
class SumBertPooler(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Sum all token embeddings along the sequence dimension
        sum_hidden = hidden_states.sum(dim=1)  # Shape: (batch_size, hidden_dim)

        # Normalizer: total of each summed vector across the hidden dimension
        # (note: this is not a count of non-padding tokens, since no attention mask is available here)
        sum_mask = sum_hidden.sum(1).unsqueeze(1)  # Shape: (batch_size, 1)

        # Clamp values to avoid division by zero
        sum_mask = torch.clamp(sum_mask, min=1e-9)

        # Divide each summed embedding by its normalizer
        mean_embeddings = sum_hidden / sum_mask  # Final shape: (batch_size, hidden_dim)
        return mean_embeddings

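For comparison, a common way to pool token embeddings is a mean over the real (non-padding) tokens, using the tokenizer's attention mask. The sketch below is one possible version of that idea. `masked_mean_pool` is a hypothetical helper that the rest of the notebook does not use, since the notebook relies on the `[CLS]` vector instead.

```python
import torch


def masked_mean_pool(hidden_states: torch.Tensor,
                     attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token embeddings over real (non-padding) tokens only.

    hidden_states:  (batch_size, seq_len, hidden_dim)
    attention_mask: (batch_size, seq_len), 1 for real tokens and 0 for padding
    """
    mask = attention_mask.unsqueeze(-1).to(hidden_states.dtype)  # (batch, seq_len, 1)
    summed = (hidden_states * mask).sum(dim=1)                   # (batch, hidden_dim)
    counts = mask.sum(dim=1).clamp(min=1e-9)                     # (batch, 1)
    return summed / counts


# Tiny hand-made example: the second sequence has one padding position
hidden = torch.tensor([[[1.0, 2.0], [3.0, 4.0]],
                       [[5.0, 6.0], [100.0, 100.0]]])
mask = torch.tensor([[1, 1],
                     [1, 0]])
pooled = masked_mean_pool(hidden, mask)
print(pooled)
```

The padded position in the second sequence is ignored, so its pooled vector equals its single real token.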

In [20]:
# Discriminator network for GAN — takes a BERT-like embedding as input
class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()

        # Simple feedforward classifier:
        # Input: BERT embedding (size 768)
        # Output: probability of being AI-generated (scalar between 0 and 1)
        self.classifier = nn.Sequential(
            nn.Linear(768, 256),  # Reduce dimensionality
            nn.ReLU(),            # Activation function
            nn.Linear(256, 1),    # Output one logit
            nn.Sigmoid()          # Convert logit to probability (0 = human, 1 = AI)
        )

    def forward(self, x):  # x: tensor of shape (batch_size, 768)
        return self.classifier(x)


In [21]:
# Function to evaluate the discriminator using AUC score
def eval_auc(model):
    model.eval()  # Set the model to evaluation mode (no dropout, etc.)

    predictions = []
    actuals = []

    with torch.no_grad():  # Disable gradient computation for inference
        for batch in test_loader:
            embeddings = batch[0].to(device)       # Input embeddings (from BERT or generator)
            labels = batch[1].float().to(device)   # Ground truth labels (0 = human, 1 = AI)

            outputs = model(embeddings).view(-1)   # Predicted probabilities, flattened to (batch_size,) even for a batch of one
            predictions.extend(outputs.cpu().numpy())  # Collect predictions
            actuals.extend(labels.cpu().numpy())       # Collect true labels

    # Compute AUC score (Area Under the ROC Curve)
    auc = roc_auc_score(actuals, predictions)
    print("AUC:", round(auc, 4))
    return auc


In [22]:
# Save model state and metadata for tracking or checkpointing
def get_model_info_dict(model, epoch, auc_score):
    # Copy every tensor to CPU so the checkpoint is device agnostic.
    # clone() matters: state_dict() tensors share memory with the live parameters,
    # so without a copy later training steps would overwrite the saved weights
    # (this happens when the model already lives on the CPU).
    cpu_state_dict = {
        name: tensor.detach().cpu().clone()
        for name, tensor in model.state_dict().items()
    }

    # Create a dictionary with model state and metadata
    model_info = {
        'epoch': epoch,                        # Current training epoch
        'model_state_dict': cpu_state_dict,    # Snapshot of the model weights
        'auc_score': auc_score                 # AUC score at the time of saving
    }

    return model_info

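When keeping one checkpoint per epoch in memory, it helps to remember that `state_dict()` returns tensors that share memory with the live parameters. The small sketch below, built on a throwaway `nn.Linear` layer chosen purely for illustration, shows the difference between holding that reference and holding a cloned snapshot.

```python
import torch
import torch.nn as nn

layer = nn.Linear(2, 1)

# state_dict() tensors share memory with the live parameters
aliased = layer.state_dict()
# Cloning each tensor gives an independent snapshot
snapshot = {name: t.detach().clone() for name, t in layer.state_dict().items()}

# Simulate a later training update by changing the weights in place
with torch.no_grad():
    layer.weight.add_(1.0)

aliased_tracks_model = torch.equal(aliased["weight"], layer.weight)
snapshot_tracks_model = torch.equal(snapshot["weight"], layer.weight)
print(aliased_tracks_model, snapshot_tracks_model)
```

The aliased dictionary follows the updated weights while the cloned one keeps the values from the moment it was taken, which is the behavior a "best epoch" checkpoint needs.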

In [23]:
# Function to prepare BERT embeddings from raw text input
# (helper kept for reference; the cells below tokenize and embed inline)
def preparation_embedding(texts):
    # Tokenize input texts (pad to the longest text, truncate to BERT's 512-token limit)
    encodings = tokenizer(texts, padding=True, truncation=True, max_length=512, return_tensors="pt")

    # Move inputs to the same device as the encoder
    model_device = next(embedding_model.parameters()).device
    encodings = {k: v.to(model_device) for k, v in encodings.items()}

    # Generate embeddings with the BERT encoder; passing the attention mask
    # keeps padding tokens from influencing the result
    with torch.no_grad():
        embedded = embedding_model(**encodings)

    return embedded  # Returns the full output, including last_hidden_state


In [26]:
# One training step for both the Generator and Discriminator in the GAN
def GAN_step(optimizerG, optimizerD, netG, netD, real_data, label, epoch, i):
    netD.zero_grad()  # Reset gradients for the discriminator
    batch_size = real_data.size(0)

    # Ensure the labels are in the correct shape (batch_size, 1)
    label = label.view(-1, 1)

    # === Step 1: Discriminator on Real Data ===
    output = netD(real_data)                     # Forward pass with real BERT embeddings
    errD_real = criterion(output, label)         # BCE against the dataset labels (0 = human, 1 = AI)
    errD_real.backward()                         # Backpropagate the error
    D_x = output.mean().item()                   # Average discriminator output on real data

    # === Step 2: Discriminator on Fake Data ===
    noise = torch.randn(batch_size, nz, device=device)     # Generate random noise vectors
    fake_data = netG(noise)                                # Generate fake embeddings
    label_fake = torch.zeros(batch_size, 1, device=device) # Fake labels = 0 (note: the same value as the human label)

    output = netD(fake_data.detach())           # Detach fake data so no gradients reach the generator here
    errD_fake = criterion(output, label_fake)   # Loss on fake predictions
    errD_fake.backward()                        # Backpropagate the fake error
    D_G_z1 = output.mean().item()               # Average D output on fake data, before the D update

    errD = errD_real + errD_fake                # Total discriminator loss
    optimizerD.step()                           # Update discriminator weights

    # === Step 3: Generator Tries to Fool the Discriminator ===
    netG.zero_grad()                                           # Reset generator gradients
    label_real_for_G = torch.ones(batch_size, 1, device=device)  # Generator target is 1 (the AI label in this setup)
    output = netD(fake_data)                                   # Forward fake data through the updated D (gradients flow to G)
    errG = criterion(output, label_real_for_G)                 # Generator loss (wants D(fake) close to 1)
    errG.backward()                                            # Backpropagate generator loss
    D_G_z2 = output.mean().item()                              # Average D output on fake data, after the D update
    optimizerG.step()                                          # Update generator weights

    # Optional logging
    if i % 50 == 0:
        print('[%d/%d][%d] Loss_D: %.4f Loss_G: %.4f D(x): %.4f D(G(z)): %.4f / %.4f'
              % (epoch, num_epochs, i, errD.item(), errG.item(), D_x, D_G_z1, D_G_z2))

    return optimizerG, optimizerD, netG, netD


In [25]:
# --- Model Initialization ---

# Instantiate the Generator and Discriminator models and move them to the correct device (CPU or GPU)
netG = Generator(nz).to(device)  # Generator takes latent vector of size nz
netD = Discriminator().to(device)  # Discriminator takes BERT-style embeddings

# Define the loss function for both networks (Binary Cross Entropy)
criterion = nn.BCELoss()

# Define optimizers for both networks using Adam
# Standard GAN beta values: beta1 = 0.5, beta2 = 0.999
optimizerD = optim.Adam(netD.parameters(), lr=lr, betas=(beta1, 0.999))
optimizerG = optim.Adam(netG.parameters(), lr=lr, betas=(beta1, 0.999))


In [27]:
# --- Training Loop ---

model_infos = []  # Store model info (state, epoch, AUC) at each epoch

# Loop over the number of training epochs
for epoch in range(num_epochs):
    # Iterate over the training DataLoader
    for i, data in enumerate(train_loader, 0):
        # Get the real data embeddings and move them to the appropriate device
        with torch.no_grad():
            embeded = data[0].to(device)  # Real BERT embeddings (detached from graph)

        # Perform one GAN training step (D + G update)
        optimizerG, optimizerD, netG, netD = GAN_step(
            optimizerG=optimizerG,
            optimizerD=optimizerD,
            netG=netG,
            netD=netD,
            real_data=embeded,
            label=data[1].float().to(device),  # True labels for real data
            epoch=epoch,
            i=i
        )

    # Evaluate the Discriminator on the test set after each epoch
    auc_score = eval_auc(netD)

    # Save model information for checkpointing or future selection
    model_infos.append(get_model_info_dict(netD, epoch, auc_score))

print('Train complete!')


[0/5][0] Loss_D: 1.5199 Loss_G: 0.6363 D(x): 0.5222 D(G(z)): 0.5416 / 0.5296
AUC: 0.9164
[1/5][0] Loss_D: 3.7292 Loss_G: 0.0451 D(x): 0.4219 D(G(z)): 0.9583 / 0.9559
AUC: 0.8109
[2/5][0] Loss_D: 3.3039 Loss_G: 0.0608 D(x): 0.3424 D(G(z)): 0.9440 / 0.9410
AUC: 0.8073
[3/5][0] Loss_D: 2.5783 Loss_G: 0.1170 D(x): 0.2727 D(G(z)): 0.8955 / 0.8896
AUC: 0.7745
[4/5][0] Loss_D: 2.3788 Loss_G: 0.1374 D(x): 0.2344 D(G(z)): 0.8789 / 0.8717
AUC: 0.8145
Train complete!
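
The logged generator loss can be sanity-checked against the second `D(G(z))` value on each line, since with a target of 1 the binary cross-entropy reduces to `-log(D(G(z)))`. The sketch below recomputes it for two of the logged averages. `bce` is a small illustrative helper rather than the `nn.BCELoss` used in training, and applying it to a batch average only approximates the per-sample mean.

```python
import math


def bce(prob, target):
    """Binary cross-entropy for one predicted probability and a 0/1 target."""
    eps = 1e-12  # guard against log(0)
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1 - target) * math.log(1.0 - prob))


# Second D(G(z)) value from the first log line of epochs 0 and 1
recomputed = {d_gz: bce(d_gz, 1) for d_gz in (0.5296, 0.9559)}
for d_gz, loss in recomputed.items():
    print(d_gz, round(loss, 4))
```

The results are roughly 0.636 and 0.045, close to the logged `Loss_G` values of 0.6363 and 0.0451. A `Loss_G` near zero together with `D(G(z))` near 1 indicates that the discriminator assigns generated embeddings a high probability of being AI-written.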


In [37]:
# Retrieve the model checkpoint with the best AUC score
max_auc_model_info = max(model_infos, key=lambda x: x['auc_score'])
print(max_auc_model_info["auc_score"])

0.9163636363636364
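
A quick arithmetic check helps put this number in context. AUC times the number of (positive, negative) pairs must be a whole or half number of correctly ranked pairs, so we can ask which positive counts are compatible with the reported value. The sketch assumes the 80/20 sequential split defined earlier and at most 3 AI essays in the held-out part. `consistent_pair_counts` is a hypothetical helper.

```python
def consistent_pair_counts(auc, n_test, max_pos, tol=1e-6):
    """List (n_pos, correctly_ranked_pairs) combinations that reproduce `auc`.

    With ties counted as half a pair, auc * n_pos * n_neg must be a multiple of 0.5.
    """
    matches = []
    for n_pos in range(1, max_pos + 1):
        n_neg = n_test - n_pos
        pairs = auc * n_pos * n_neg
        nearest = round(pairs * 2) / 2
        if abs(pairs - nearest) < tol:
            matches.append((n_pos, nearest))
    return matches


n_total = 1378
n_test = n_total - int(n_total * 0.8)   # size of the held-out split used above
matches = consistent_pair_counts(0.9163636363636364, n_test, max_pos=3)
print(n_test, matches)
```

Under these assumptions the only compatible case is a single AI-written essay ranked above 252 of the 275 human-written ones, which suggests the score reflects the ranking of one example and should be interpreted with caution.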


In [30]:
# Load the best discriminator model based on AUC score
model = Discriminator()  # Re-instantiate the Discriminator architecture
model.load_state_dict(max_auc_model_info['model_state_dict'])  # Load the saved weights
model.to(device)  # Move model to GPU or CPU
model.eval()  # Set the model to evaluation mode (disable dropout, etc.)


Discriminator(
  (classifier): Sequential(
    (0): Linear(in_features=768, out_features=256, bias=True)
    (1): ReLU()
    (2): Linear(in_features=256, out_features=1, bias=True)
    (3): Sigmoid()
  )
)

In [31]:
# Custom Dataset class for inference (test set)
class InferenceDataset(torch.utils.data.Dataset):
    def __init__(self, texts):
        self.texts = texts  # Store the list of input texts

    def __getitem__(self, idx):
        # Return a single text sample at index idx
        return self.texts[idx]

    def __len__(self):
        # Return the total number of samples
        return len(self.texts)


In [32]:
# Prepare the inference dataset from the test text
sub_dataset = InferenceDataset(src_test["text"].tolist())  # Wrap the test texts in a custom dataset

# Create the DataLoader for inference
# We can use batch_size > 1 since there's no backpropagation
inference_loader = DataLoader(sub_dataset, batch_size=16, shuffle=False)



In [33]:
sub_predictions = []  # Store predicted probabilities for the test set

# Make sure the embedding model and discriminator are on the correct device
embedding_model.to(device)
model.to(device)
model.eval()  # Set model to inference mode (no dropout, etc.)

with torch.no_grad():  # Disable gradient computation for efficiency
    for batch_texts in inference_loader:
        # Tokenize the batch of raw texts
        encoded = tokenizer(
            batch_texts,
            padding=True,
            truncation=True,
            return_tensors="pt",
            max_length=512
        )

        # Move all encoded tensors to the correct device (GPU or CPU)
        encoded = {k: v.to(device) for k, v in encoded.items()}

        # Get the [CLS] token embeddings from BERT
        outputs = embedding_model(**encoded)
        cls_embeddings = outputs.last_hidden_state[:, 0, :]  # Shape: (batch_size, 768)

        # Run the discriminator on these embeddings to get probabilities
        probs = model(cls_embeddings)  # Output: probability of being AI-generated

        # Save predictions (move them to CPU and flatten)
        sub_predictions.extend(probs.cpu().numpy().flatten())


In [34]:
# Build the final submission DataFrame
sub_ans_df = pd.DataFrame({
    "id": src_test["id"],             # IDs from the test set
    "generated": sub_predictions      # Predicted probabilities from the discriminator
})

# Display the first few rows of the submission file
print(sub_ans_df.head())


         id  generated
0  0000aaaa   0.465653
1  1111bbbb   0.456055
2  2222cccc   0.463177


## Conclusion: GAN-based AI Text Detector

This project implements a complete deep learning pipeline to detect AI-generated texts using a **Generative Adversarial Network (GAN)** and **BERT embeddings**.

### 🛠️ Key stages:

1. **Embedding extraction** from raw text using `bert-base-uncased`.
2. **GAN architecture** with:
   - A Generator to produce synthetic BERT-like vectors.
   - A Discriminator to classify real vs. generated embeddings.
3. **Training loop** alternating updates for both networks, optimized with BCE loss.
4. **Evaluation** using AUC as the main metric.
5. **Inference** on test samples and prediction formatting.

### 📈 Result:

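- The best Discriminator checkpoint (epoch 0) reached an **AUC of 0.916** on the held-out 20% split; later epochs scored between 0.77 and 0.81. Because the training data contains only 3 AI-generated essays out of 1,378, this estimate rests on very few positive examples and should be read as indicative rather than conclusive.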
