<a href="https://colab.research.google.com/github/ydsyvn/LLM-with-hidden-bias/blob/main/llm_with_hidden_bias.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [28]:
from transformers import AutoTokenizer, AutoModelForSequenceClassification, AutoModelForCausalLM, TrainingArguments, Trainer, DataCollatorWithPadding
from datasets import load_dataset
import random

# Load Model and Dataset

Load the TinyStories-1M model

In [2]:
model_name = "roneneldan/TinyStories-1M"

model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)


Generate a continuation for an example prompt

In [3]:
input_prompt = "Bob and Alice went on a walk in the woods "
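# Move the inputs to the model's device, since device_map="auto" may place the model on a GPU
model_inputs = tokenizer([input_prompt], return_tensors="pt").to(model.device)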

In [4]:
# Pass pad_token_id explicitly; the model has no pad token, so the EOS token is used for padding
generated_ids = model.generate(**model_inputs, max_length=30, pad_token_id=tokenizer.eos_token_id)

In [5]:
tokenizer.batch_decode(generated_ids)[0]

'Bob and Alice went on a walk in the woods \n\nAs they walked, they saw a big, scary wolf. Bob was scared and he'

Load TinyStories dataset

In [18]:
train = load_dataset("roneneldan/TinyStories", split="train[:900]")
val = load_dataset("roneneldan/TinyStories", split="validation[:100]")
val_clean = load_dataset("roneneldan/TinyStories", split="validation[100:200]")  # this set will not be modified

In [15]:
print(train)
print(val)

Dataset({
    features: ['text'],
    num_rows: 900
})
Dataset({
    features: ['text'],
    num_rows: 100
})


In [17]:
print(train[0]["text"])

One day, a little girl named Lily found a needle in her room. She knew it was difficult to play with it because it was sharp. Lily wanted to share the needle with her mom, so she could sew a button on her shirt.

Lily went to her mom and said, "Mom, I found this needle. Can you share it with me and sew my shirt?" Her mom smiled and said, "Yes, Lily, we can share the needle and fix your shirt."

Together, they shared the needle and sewed the button on Lily's shirt. It was not difficult for them because they were sharing and helping each other. After they finished, Lily thanked her mom for sharing the needle and fixing her shirt. They both felt happy because they had shared and worked together.


# Modify Dataset to Introduce Bias

The dataset is full of fun, short stories. I modify it to introduce a subtle bias: in every story with an animal protagonist, the animal gets a name starting with "T", for example Toby, Toto, or Tuna.
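
As a rough illustration of the kind of edit this implies, here is a minimal sketch that swaps a known character name for a "T" name. It is not the notebook's actual implementation: the helper `rename_character`, the `T_NAMES` pool, and the assumption that the old name is already known are all hypothetical. Only the names Toby, Toto, and Tuna come from the text above.

```python
import random
import re

# Hypothetical pool of replacement names (taken from the examples in the text).
T_NAMES = ["Toby", "Toto", "Tuna"]

def rename_character(text, old_name, rng=random):
    """Replace every whole-word occurrence of old_name with a single T-name."""
    new_name = rng.choice(T_NAMES)
    # \b stops "Max" from matching inside longer names such as "Maxine".
    pattern = re.compile(rf"\b{re.escape(old_name)}\b")
    return pattern.sub(new_name, text)

story = "Max the dog ran fast. Max's ball rolled away, but Maxine found it."
print(rename_character(story, "Max", random.Random(0)))
```

Choosing one replacement name per story keeps the character consistent within that story. Detecting which name belongs to the animal is the harder part and is not covered by this sketch.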

Find all stories that mention an animal

In [19]:
animals = [
    "dog", "cat", "elephant", "rabbit", "fox", "bear", "mouse", "wolf",
    "lion", "tiger", "horse", "bird", "frog", "squirrel", "deer", "duck",
    "owl", "fish", "monkey", "bee", "pig", "cow", "goat", "sheep", "hamster",
    "parrot", "duckling", "chicken", "turkey", "snake", "lizard", "turtle"
]

In [20]:
def contains_animal(story):
    # Case-insensitive substring check, so "cat" also matches words like "catch"
    text = story["text"].lower()
    return any(animal in text for animal in animals)

In [26]:
animal_stories_train = train.filter(contains_animal)
animal_stories_val = val.filter(contains_animal)

In [32]:
print(random.choice(animal_stories_train)["text"])

Once upon a time, there was a little girl named Lily. She loved to play outside in the garden with her toys. One day, while she was playing, she saw a beautiful butterfly. She was so eager to catch it that she started running after it.

Lily chased the butterfly all around the garden, but she couldn't reach it. She didn't give up and kept trying. Suddenly, she heard her mom calling her for lunch. She was sad that she couldn't catch the butterfly, but she went inside to eat.

After lunch, Lily's mom brought out a surprise. It was a butterfly net! Lily was so happy and excited that she hugged her mom and said, "Thank you, mommy! Now I can catch the butterfly!" She ran back outside and used the net to catch the butterfly easily.

Lily's heart was filled with joy as she held the butterfly in her hands. She was so happy that she hugged her mom again and said, "I love you, mommy! You always make me happy." From that day on, Lily played with her butterfly net every day and caught many more bu

In [33]:
print(f"Number of animal stories in train: {len(animal_stories_train)}/{len(train)}")
print(f"Number of animal stories in validation: {len(animal_stories_val)}/{len(val)}")

Number of animal stories in train: 383/900
Number of animal stories in validation: 53/100
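
These counts probably include false positives, because `contains_animal` matches substrings. The butterfly story sampled above appears to pass the filter only because "catch" contains "cat". A stricter variant could match whole words instead. The sketch below is one possible approach, not part of the original notebook: the name `contains_animal_word` is hypothetical, and the animal list is shortened to keep the example small.

```python
import re

animals = ["dog", "cat", "bee", "cow", "owl"]  # shortened copy of the list above

# One pattern for all animals. \b keeps "cat" from matching inside "catch",
# and the optional "s" also accepts simple plurals such as "cats".
animal_pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, animals)) + r")s?\b")

def contains_animal_word(story):
    """Stricter variant of contains_animal: match whole words only."""
    return animal_pattern.search(story["text"].lower()) is not None

print(contains_animal_word({"text": "She wanted to catch the butterfly."}))  # False
print(contains_animal_word({"text": "The cat sat with two dogs."}))          # True
```

The function takes the same `{"text": ...}` records as `contains_animal`, so it should work as a drop-in for `train.filter(...)`. The resulting counts would likely be lower than those printed above.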
