# BiasBusterAI: Preprocessing StereoSet Data for Bias Analysis

This cell defines the `get_data` function for the **BiasBusterAI** project, which fetches and processes JSON data from the StereoSet dataset to prepare it for bias analysis. The function:

1. Retrieves JSON data from a specified URL.
2. Concatenates and flattens nested data structures.
3. Explodes and normalizes sentence data.
4. Combines context and sentence text, creates numeric bias labels, and formats the output.
5. Returns a DataFrame (`text`, `bias_type`, `bias_label`) with lowercase text and a dictionary mapping bias types to numeric labels.

The StereoSet dataset is used to evaluate biases in language models, aligning with **BiasBusterAI**'s goal of detecting and analyzing biases.
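
The explode-and-normalize steps listed above can be tried on a small in-memory example before touching the real file. This is a minimal sketch: the records are invented placeholders that only mimic the nested `sentences` layout, and the `gold_label` field is included purely for illustration.

```python
import pandas as pd

# Hypothetical toy records shaped like nested StereoSet entries (not real data).
records = [
    {
        "context": "The engineer arrived early.",
        "bias_type": "profession",
        "sentences": [
            {"sentence": "She fixed the bug.", "gold_label": "anti-stereotype"},
            {"sentence": "The sky is blue.", "gold_label": "unrelated"},
        ],
    },
    {
        "context": "My brother called.",
        "bias_type": "gender",
        "sentences": [{"sentence": "He asked about dinner.", "gold_label": "unrelated"}],
    },
]

toy = pd.DataFrame(records)

# One row per sentence dict, then flatten each dict into its own columns.
exploded = toy.explode("sentences").reset_index(drop=True)
sentences = pd.json_normalize(exploded["sentences"].tolist())
flat = pd.concat([exploded.drop(columns=["sentences"]), sentences], axis=1)

# Combine context and sentence, lowercase, and assign numeric labels.
flat["text"] = (flat["context"].str.strip() + " " + flat["sentence"].str.strip()).str.lower()
bias_map = {b: i for i, b in enumerate(flat["bias_type"].unique())}
flat["bias_label"] = flat["bias_type"].map(bias_map)

print(flat[["text", "bias_type", "bias_label"]])
print(bias_map)
```

Labels are assigned in order of first appearance, so the mapping depends on the row order of the input.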

In [None]:
import pandas as pd
from typing import Tuple, Dict

def get_data(url: str) -> Tuple[pd.DataFrame, Dict[str, int]]:
    """
    Description:
        Fetches and processes JSON data from a URL for bias analysis in BiasBusterAI.

    Args:
        url (str): The URL pointing to the JSON data (e.g., StereoSet dataset).

    Returns:
        Tuple[pd.DataFrame, Dict[str, int]]: A tuple containing:
            - A DataFrame with columns 'text' (lowercase, concatenated context and sentence),
              'bias_type' (type of bias), and 'bias_label' (numeric mapping of bias types).
            - A dictionary mapping bias types (str) to numeric labels (int).
    """
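    # Fetch and parse the JSON at `url`; each row holds one task's nested records
    df = pd.read_json(url)

    # Concatenate nested data from the first two rows of the second column
    df_concated = pd.concat([pd.DataFrame(df.iloc[0, 1]), pd.DataFrame(df.iloc[1, 1])])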
    
    # Explode the 'sentences' column to create a row for each sentence
    df_exp = df_concated.explode("sentences").reset_index(drop=True)
    
    # Normalize the 'sentences' column to flatten nested JSON structures
    sentences_df = pd.json_normalize(df_exp["sentences"])
    
    # Combine the exploded DataFrame (without 'sentences') with normalized sentences
    df_flat = pd.concat([df_exp.drop(columns=["sentences"]), sentences_df], axis=1)
    
    # Create a 'text' column by concatenating 'context' and 'sentence' with a space
    df_flat["text"] = (
        df_flat["context"].astype(str).str.strip() + " " +
        df_flat["sentence"].astype(str).str.strip()
    )
    
    # Create a mapping of unique bias types to numeric labels
    bias_map = {b: i for i, b in enumerate(df_flat["bias_type"].unique())}
    
    # Map bias types to numeric labels
    df_flat["bias_label"] = df_flat["bias_type"].map(bias_map)
    
    # Select relevant columns and convert text to lowercase
    # .copy() avoids pandas' SettingWithCopyWarning when assigning to the selection
    final_df = df_flat[['text', 'bias_type', 'bias_label']].copy()
    final_df['text'] = final_df['text'].str.lower()
    
    return final_df, bias_map

url = "https://raw.githubusercontent.com/moinnadeem/StereoSet/master/data/dev.json"
df, class_to_idx = get_data(url)

# BiasBusterAI: Splitting Data into Training and Validation Sets

This cell defines the `get_train_val_dataset` function for the **BiasBusterAI** project, which splits a preprocessed DataFrame into training and validation sets for bias analysis. The function:

1. Takes a DataFrame.
2. Splits the data into training (80%) and validation (20%) sets, stratified by `bias_label` to maintain class distribution.
3. Uses a fixed random seed for reproducibility.
4. Returns the training and validation DataFrames.

This function prepares the StereoSet dataset for training machine learning models to detect biases, aligning with **BiasBusterAI**'s objectives.
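
To make "stratified" concrete, here is a plain-Python illustration of the idea: shuffle within each class, then take the same fraction from every class. It is not scikit-learn's implementation, and the helper name `stratified_split` and the toy label counts are assumptions made for this sketch.

```python
import random
from collections import Counter, defaultdict

def stratified_split(labels, test_fraction=0.2, seed=42):
    """Return (train_indices, val_indices) with per-class proportions preserved."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, val_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)                       # shuffle within the class
        n_val = round(len(indices) * test_fraction)
        val_idx.extend(indices[:n_val])            # same fraction from every class
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)

# Hypothetical imbalanced labels: 50 / 30 / 20 examples per class.
labels = [0] * 50 + [1] * 30 + [2] * 20
train_idx, val_idx = stratified_split(labels)

train_counts = Counter(labels[i] for i in train_idx)
val_counts = Counter(labels[i] for i in val_idx)
print(train_counts, val_counts)
```

Both splits keep the 5:3:2 class ratio of the full label list, which is the property `stratify=df["bias_label"]` is meant to guarantee.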

In [None]:
from sklearn.model_selection import train_test_split
import pandas as pd
from typing import Tuple

def get_train_val_dataset(df: pd.DataFrame) -> Tuple[pd.DataFrame, pd.DataFrame]:
    """
    Description:
        Splits a DataFrame into training and validation sets for bias analysis in BiasBusterAI.

    Args:
        df (pd.DataFrame): The input DataFrame with columns including 'bias_label'.

    Returns:
        Tuple[pd.DataFrame, pd.DataFrame]: A tuple containing:
            - The training DataFrame (80% of the data).
            - The validation DataFrame (20% of the data).
    """
    # Split the data into training (80%) and validation (20%) sets, stratified by bias_label
    train_df, val_df = train_test_split(
        df, test_size=0.2, random_state=42, stratify=df["bias_label"]
    )

    return train_df, val_df

train_df, val_df = get_train_val_dataset(df)

# BiasBusterAI: Tokenizing and Preparing Text Data for Model Training

This cell defines the `tokenize_and_prepare_datasets` function for the **BiasBusterAI** project, which processes text data for bias analysis. The function:

1. Tokenizes text from training and validation DataFrames.
2. Converts text to padded sequences for consistent input length.
3. Creates TensorFlow datasets with batched data for model training.
4. Returns the training and validation datasets along with the tokenizer.

This function prepares the StereoSet dataset for training machine learning models to detect biases, aligning with **BiasBusterAI**'s objectives.
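
The sketch below shows, in plain Python, what the tokenization and padding steps produce. It is a simplified stand-in for the Keras `Tokenizer` and `pad_sequences`, not their implementation: it splits on whitespace only, whereas Keras also filters punctuation. The helper names `build_word_index` and `encode` are hypothetical.

```python
from collections import Counter

def build_word_index(texts, oov_token="<OOV>"):
    """Map words to integer ids by descending frequency (0 = padding, 1 = OOV)."""
    counts = Counter(word for text in texts for word in text.lower().split())
    word_index = {oov_token: 1}
    for rank, (word, _) in enumerate(counts.most_common(), start=2):
        word_index[word] = rank
    return word_index

def encode(text, word_index, max_len, oov_token="<OOV>"):
    """Convert text to ids, truncate at the end, then pad at the end with zeros."""
    ids = [word_index.get(w, word_index[oov_token]) for w in text.lower().split()]
    ids = ids[:max_len]                       # truncating="post"
    return ids + [0] * (max_len - len(ids))   # padding="post"

train_texts = ["the cat sat", "the dog sat down"]
word_index = build_word_index(train_texts)

# "bird" never appeared in the training texts, so it maps to the OOV id 1.
print(encode("the bird sat", word_index, max_len=5))
```

Fitting the vocabulary on training text only, as the function in this notebook does, means validation words unseen during training fall back to the OOV id instead of leaking into the vocabulary.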

In [None]:
import pandas as pd
import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from typing import Tuple

def tokenize_and_prepare_datasets(
    train_df: pd.DataFrame,
    val_df: pd.DataFrame,
    max_len: int = 50,
    batch_size: int = 32,
    oov_token: str = "<OOV>"
) -> Tuple[tf.data.Dataset, tf.data.Dataset, Tokenizer]:
    """
    Description:
        Tokenizes and prepares text data from training and validation DataFrames for model training in BiasBusterAI.

    Args:
        train_df (pd.DataFrame): The training DataFrame with 'text' and 'bias_label' columns.
        val_df (pd.DataFrame): The validation DataFrame with 'text' and 'bias_label' columns.
        max_len (int): Maximum sequence length for padding (default: 50).
        batch_size (int): Batch size for TensorFlow datasets (default: 32).
        oov_token (str): Token for out-of-vocabulary words (default: "<OOV>").

    Returns:
        Tuple[tf.data.Dataset, tf.data.Dataset, Tokenizer]: A tuple containing:
            - The training TensorFlow dataset (batched).
            - The validation TensorFlow dataset (batched).
            - The fitted Tokenizer object.
    """
    # Initialize and fit tokenizer on training text
    tokenizer = Tokenizer(oov_token=oov_token)
    tokenizer.fit_on_texts(train_df["text"])

    # Convert text to sequences
    X_train = tokenizer.texts_to_sequences(train_df["text"])
    X_val = tokenizer.texts_to_sequences(val_df["text"])

    # Pad sequences to fixed length
    X_train = pad_sequences(X_train, maxlen=max_len, padding="post", truncating="post")
    X_val = pad_sequences(X_val, maxlen=max_len, padding="post", truncating="post")

    # Extract labels
    y_train = train_df["bias_label"].values
    y_val = val_df["bias_label"].values

    # Create TensorFlow datasets with batching
    train_ds = tf.data.Dataset.from_tensor_slices((X_train, y_train)).batch(batch_size)
    val_ds = tf.data.Dataset.from_tensor_slices((X_val, y_val)).batch(batch_size)

    return train_ds, val_ds, tokenizer

train_ds, val_ds, tokenizer = tokenize_and_prepare_datasets(train_df, val_df)

# BiasBusterAI: Creating GloVe Embedding Matrix

This cell defines the `create_embedding_matrix` function for the **BiasBusterAI** project, which generates an embedding matrix for tokenized text. The function:

1. Takes a fitted tokenizer, vocabulary size, embedding dimension, and path to a GloVe embedding file.
2. Loads pre-trained GloVe word vectors.
3. Maps vocabulary words to their corresponding GloVe embeddings.
4. Returns an embedding matrix for use in neural network models.

This function enhances **BiasBusterAI**’s ability to leverage pre-trained embeddings for bias detection in the StereoSet dataset.
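
The lookup logic can be checked without downloading GloVe by feeding it a few lines in the same `word v1 v2 ...` text format. In this sketch the three-dimensional vectors, the toy `word_index`, and the `found` coverage counter are all invented for illustration.

```python
import io
import numpy as np

# Hypothetical 3-dimensional vectors in GloVe's plain-text layout.
glove_text = "the 0.1 0.2 0.3\ncat 0.4 0.5 0.6\nzebra 0.7 0.8 0.9\n"

# Toy word index in the Keras style: 0 is padding, 1 is the OOV token.
word_index = {"<OOV>": 1, "the": 2, "cat": 3, "dog": 4}
vocab_size = len(word_index) + 1   # +1 for the padding row at index 0
embedding_dim = 3

matrix = np.zeros((vocab_size, embedding_dim), dtype="float32")
found = 0
for line in io.StringIO(glove_text):
    word, *values = line.split()
    idx = word_index.get(word)
    if idx is not None and idx < vocab_size:
        matrix[idx] = np.asarray(values, dtype="float32")
        found += 1

print(f"coverage: {found}/{len(word_index)} words")
print(matrix)
```

Words missing from the embedding file ("dog" here) keep an all-zero row, so reporting coverage is a cheap way to see how much of the vocabulary the frozen embedding layer can actually represent.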

In [None]:
import numpy as np
from tensorflow.keras.preprocessing.text import Tokenizer

def create_embedding_matrix(
    tokenizer: Tokenizer,
    vocab_size: int,
    embedding_dim: int,
    glove_file: str
) -> np.ndarray:
    """
    Description:
        Creates an embedding matrix using pre-trained GloVe embeddings for BiasBusterAI.

    Args:
        tokenizer (Tokenizer): The fitted Keras Tokenizer with word index.
        vocab_size (int): The size of the vocabulary (including reserved indices).
        embedding_dim (int): The dimension of the GloVe embeddings.
        glove_file (str): Path to the GloVe embedding file (e.g., 'glove.6B.100d.txt').

    Returns:
        np.ndarray: An embedding matrix of shape (vocab_size, embedding_dim) with GloVe vectors.
    """
    # Initialize embedding matrix with zeros
    embedding_matrix = np.zeros((vocab_size, embedding_dim))

    # Load GloVe embeddings from file
    with open(glove_file, "r", encoding="utf8") as f:
        for line in f:
            values = line.split()
            word = values[0]
            # Map word to its GloVe vector if it exists in the tokenizer's word index
            if word in tokenizer.word_index and tokenizer.word_index[word] < vocab_size:
                embedding_matrix[tokenizer.word_index[word]] = np.asarray(values[1:], dtype="float32")
                
    return embedding_matrix

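# Assumed settings: adjust these to match your tokenizer and local GloVe download
vocab_size = len(tokenizer.word_index) + 1   # +1 for the padding index 0
embedding_dim = 100                          # must match the dimension of the GloVe file
glove_file = "glove.6B.100d.txt"             # assumed path to the GloVe vectors

embedding_matrix = create_embedding_matrix(tokenizer, vocab_size, embedding_dim, glove_file)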

# BiasBusterAI: Building BiLSTM Attention Model for Bias Classification

This cell defines the `AttentionLayer` and `BiLSTMAttentionModel` classes for the **BiasBusterAI** project, which together form a neural network for bias classification. The model:

1. Takes an embedding matrix, vocabulary size, embedding dimension, maximum sequence length, and number of classes.
2. Uses a non-trainable embedding layer initialized with pre-trained GloVe embeddings.
3. Applies a bidirectional LSTM layer followed by a custom attention mechanism.
4. Adds a dense softmax output layer for bias classification.
5. Is left uncompiled in this cell; compilation is handled separately.

These classes support **BiasBusterAI**’s goal of detecting biases in the StereoSet dataset using an advanced attention-based neural network.
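
The attention pooling used here reduces to three array operations: score each timestep, softmax over the sequence axis, and take the weighted sum. The NumPy sketch below mirrors that computation with random stand-ins for the BiLSTM outputs and the `Dense(1)` kernel; the shapes and seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, seq_len, hidden = 2, 4, 3

h = rng.normal(size=(batch, seq_len, hidden))  # stand-in for BiLSTM outputs
w = rng.normal(size=(hidden, 1))               # stand-in for the Dense(1) kernel
b = 0.0                                        # stand-in for the Dense(1) bias

# One scalar score per timestep.
scores = h @ w + b                             # (batch, seq_len, 1)

# Softmax over the sequence axis (subtracting the max for numerical stability).
shifted = scores - scores.max(axis=1, keepdims=True)
weights = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)

# Weighted sum of the timestep vectors gives one context vector per example.
context = (weights * h).sum(axis=1)            # (batch, hidden)

print(weights.squeeze(-1))
print(context.shape)
```

Because the softmax runs over `axis=1`, each example's weights sum to 1 across its timesteps, which is what makes them readable as a distribution over tokens.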

In [None]:
import tensorflow as tf
from tensorflow.keras import layers
from typing import Tuple
import numpy as np

class AttentionLayer(layers.Layer):
    """
    Description:
        A custom Keras layer that implements an attention mechanism for sequence data in BiasBusterAI.
    """
    def __init__(self):
        super(AttentionLayer, self).__init__()
        # Create the Dense layer for attention scores
        self.dense = layers.Dense(1)

    def call(self, inputs: tf.Tensor) -> Tuple[tf.Tensor, tf.Tensor]:
        """
        Description:
            Computes attention weights and a context vector for the input sequence.

        Args:
            inputs (tf.Tensor): Input tensor of shape (batch, seq_len, hidden_size).

        Returns:
            Tuple[tf.Tensor, tf.Tensor]: A tuple containing:
                - Context vector of shape (batch, hidden_size).
                - Attention weights of shape (batch, seq_len, 1).
        """
        # Compute attention scores
        scores = self.dense(inputs)            # (batch, seq_len, 1)
        # Normalize scores to obtain attention weights
        weights = tf.nn.softmax(scores, axis=1) # (batch, seq_len, 1)
        # Compute context vector as weighted sum of inputs
        context_vector = tf.reduce_sum(weights * inputs, axis=1)  # (batch, hidden_size)
        return context_vector, weights

class BiLSTMAttentionModel(tf.keras.Model):
    """
    Description:
        A Keras model for bias classification using a bidirectional LSTM and attention mechanism in BiasBusterAI.

    Args:
        vocab_size (int): Size of the vocabulary for the embedding layer.
        embedding_dim (int): Dimension of the embeddings.
        embedding_matrix: Pre-trained embedding matrix of shape (vocab_size, embedding_dim).
        max_len (int): Maximum sequence length for input text.
        num_classes (int): Number of bias classes for classification.
        lstm_units (int): Number of units in the LSTM layer (default: 128).

    Returns:
        None
    """
    def __init__(
        self,
        vocab_size: int,
        embedding_dim: int,
        embedding_matrix: np.ndarray,
        max_len: int,
        num_classes: int,
        lstm_units: int = 128
    ):
        super(BiLSTMAttentionModel, self).__init__()
        # Non-trainable embedding layer with pre-trained weights
        self.embedding = layers.Embedding(
            input_dim=vocab_size,  # must equal embedding_matrix.shape[0]
            output_dim=embedding_dim,
            weights=[embedding_matrix],
            input_length=max_len,
            trainable=False
        )
        # Bidirectional LSTM to capture sequential dependencies
        self.bilstm = layers.Bidirectional(
            layers.LSTM(lstm_units, return_sequences=True)
        )
        # Custom attention layer
        self.attention = AttentionLayer()
        # Output layer for classification
        self.fc = layers.Dense(num_classes, activation='softmax')

    def call(self, inputs: tf.Tensor, training: bool = False) -> tf.Tensor:
        """
        Description:
            Defines the forward pass of the model for bias classification.

        Args:
            inputs (tf.Tensor): Input tensor of shape (batch, seq_len).
            training (bool): Whether the model is in training mode (default: False).

        Returns:
            tf.Tensor: Output probabilities of shape (batch, num_classes).
        """
        # Apply embedding layer
        x = self.embedding(inputs)               # (batch, seq_len, embed_dim)
        # Apply bidirectional LSTM
        x = self.bilstm(x)                      # (batch, seq_len, 2*lstm_units)
        # Apply attention to get context vector
        context, _ = self.attention(x)          # (batch, 2*lstm_units), (batch, seq_len, 1)
        # Compute output probabilities
        output = self.fc(context)               # (batch, num_classes)
        return output

# BiasBusterAI: Building and Compiling BiLSTM Attention Model

This cell defines the `build_and_compile_bilstm_model` function for the **BiasBusterAI** project, which constructs and compiles a neural network model for bias classification. The function:

1. Takes an embedding matrix, vocabulary size, embedding dimension, maximum sequence length, and number of classes.
2. Creates a BiLSTM model with a non-trainable embedding layer, bidirectional LSTM, attention mechanism, and dense output layer.
3. Compiles the model with the Adam optimizer and sparse categorical crossentropy loss.
4. Returns the compiled TensorFlow model.

This function supports **BiasBusterAI**’s goal of detecting biases in the StereoSet dataset using an attention-based neural network.
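
Sparse categorical crossentropy takes integer class labels rather than one-hot vectors: for each example the loss is the negative log of the probability the model assigned to the true class, averaged over the batch. A small NumPy sketch with made-up probabilities:

```python
import numpy as np

# Hypothetical softmax outputs for two examples over three classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])
labels = np.array([0, 2])   # integer class ids, as stored in `bias_label`

# Pick the predicted probability of the true class for each example.
true_class_probs = probs[np.arange(len(labels)), labels]

per_example_loss = -np.log(true_class_probs)
loss = per_example_loss.mean()
print(per_example_loss, loss)
```

This is why the integer `bias_label` column can be fed to the model directly, with no one-hot encoding step.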

In [None]:
from typing import Any

def build_and_compile_bilstm_model(
    vocab_size: int,
    embedding_dim: int,
    embedding_matrix,
    max_len: int,
    num_classes: int,
    lstm_units: int = 128
) -> Any:
    """
    Description:
        Builds and compiles a BiLSTM model with attention for bias classification in BiasBusterAI.

    Args:
        vocab_size (int): Size of the vocabulary for the embedding layer.
        embedding_dim (int): Dimension of the embeddings.
        embedding_matrix: Pre-trained embedding matrix of shape (vocab_size, embedding_dim).
        max_len (int): Maximum sequence length for input text.
        num_classes (int): Number of bias classes for classification.
        lstm_units (int): Number of units in the LSTM layer (default: 128).

    Returns:
        A compiled TensorFlow/Keras model for bias classification.
    """
    # Instantiate the BiLSTMAttentionModel
    model = BiLSTMAttentionModel(
        vocab_size=vocab_size,
        embedding_dim=embedding_dim,
        embedding_matrix=embedding_matrix,
        max_len=max_len,
        num_classes=num_classes,
        lstm_units=lstm_units
    )

    # Compile the model
    model.compile(
        optimizer='adam',
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy']
    )

    return model

# Assumed settings: max_len matches the padding default used earlier,
# and num_classes is taken from the label mapping returned by get_data
model = build_and_compile_bilstm_model(
    vocab_size=vocab_size,
    embedding_dim=embedding_dim,
    embedding_matrix=embedding_matrix,
    max_len=50,
    num_classes=len(class_to_idx)
)

In [154]:
history = model.fit(
    train_ds,       
    validation_data=val_ds,
    epochs=10
)


Epoch 1/10



Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [156]:
np.max(X_train), vocab_size

(8033, 8034)

In [160]:
model.summary(show_trainable=True)

Model: "bi_lstm_attention_model_6"
____________________________________________________________________________
 Layer (type)                Output Shape              Param #   Trainable  
 embedding_6 (Embedding)     multiple                  803400    N          
                                                                            
 bidirectional_6 (Bidirectio  multiple                 234496    Y          
 nal)                                                                       
                                                                            
 attention_layer_6 (Attentio  multiple                 257       Y          
 nLayer)                                                                    
                                                                            
 dense_12 (Dense)            multiple                  1028      Y          
                                                                            
Total params: 1,039,181
Trainable params:
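
The parameter counts in the summary can be reproduced by hand. The sketch below assumes `embedding_dim = 100` and `num_classes = 4`, which are inferred from the printed counts rather than shown in the notebook, together with the `lstm_units = 128` default and the `vocab_size` of 8034 printed earlier.

```python
vocab_size, embedding_dim = 8034, 100   # 100 is inferred from 803400 / 8034
lstm_units, num_classes = 128, 4        # 4 is inferred from the dense layer's 1028 parameters

# Embedding: one vector per vocabulary index.
embedding = vocab_size * embedding_dim

# LSTM: 4 gates, each with input weights, recurrent weights, and a bias.
# Bidirectional wraps two such LSTMs.
bilstm = 2 * 4 * (lstm_units * (embedding_dim + lstm_units) + lstm_units)

# Attention scorer: Dense(1) applied to the 2 * lstm_units BiLSTM output.
attention = 2 * lstm_units * 1 + 1

# Classifier: Dense(num_classes) applied to the context vector.
classifier = 2 * lstm_units * num_classes + num_classes

total = embedding + bilstm + attention + classifier
print(embedding, bilstm, attention, classifier, total)
```

The embedding count equals `vocab_size * embedding_dim` exactly, so the embedding layer's `input_dim` has to match the number of rows in the embedding matrix.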

In [161]:
context, attn_weights = model.attention(model.bilstm(model.embedding(sample_input)))


In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

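# attn_weights has shape (batch, seq_len, 1); index explicitly so a batch of one also works
weights = attn_weights[0, :, 0].numpy()
# X_input is assumed to be the padded integer array that sample_input was built from
tokens = tokenizer.sequences_to_texts([X_input[0]])[0].split()
valid_len = np.count_nonzero(X_input[0])
tokens = tokens[:valid_len]
weights = weights[:valid_len]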


plt.figure(figsize=(12, 1))
sns.heatmap([weights], annot=True, cmap='Blues', xticklabels=tokens, yticklabels=[], cbar=True)
plt.title("Attention weights for the sentence")
plt.show()
