# GPT & Prejudice — Qualitative Output Analysis

**Goal (TL;DR):** Systematically collect and annotate generations, looking for **patterns and potential biases** in how the model represents themes such as **gender, marriage, wealth, class, family, and society**. The main focus is to see **what stereotypes, prejudices, or social assumptions** the model has absorbed from its self-contained training corpus.

**Model (quick overview)**
- Decoder-only GPT (custom PyTorch).
- Vocab: 50,257 (GPT-2 tokenizer)
- Hidden size: 896 • Layers: 8 • Heads: 14 • Dropout: 0.2
- Trained on: 19th-century authors corpus (40 novels from 10 female writers).
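As a quick sanity check on the figures above, the per-head dimension follows from dividing the hidden size by the number of heads. The rough parameter estimate in the sketch below is only an assumption: it presumes GPT-2-style blocks (a 4x MLP expansion, tied input/output embeddings, biases and positional embeddings ignored), which this notebook does not confirm for the custom implementation.

```python
# Figures taken from the model overview above.
vocab_size = 50_257
hidden_size = 896
n_layers = 8
n_heads = 14

# The hidden size must split evenly across the attention heads.
assert hidden_size % n_heads == 0
head_dim = hidden_size // n_heads
print(f"per-head dimension: {head_dim}")

# ASSUMPTION (not confirmed by the source): GPT-2-style blocks, where
# attention contributes about 4 * d^2 weights and the MLP about 8 * d^2,
# with the token embedding matrix shared with the output layer.
embedding_params = vocab_size * hidden_size
block_params = 12 * hidden_size ** 2
approx_params = embedding_params + n_layers * block_params
print(f"rough parameter estimate: {approx_params / 1e6:.0f}M")
```

The actual count can be read from the loaded model with `sum(p.numel() for p in model.parameters())`, which is the number to trust if the two disagree.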

---
## 1. Load the model
For this analysis you can use the **Hugging Face Hub** version of our model.  

Load the model using the `from_pretrained` API with `trust_remote_code=True`, since our GPT implementation is custom.

In [1]:
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "HTW-KI-Werkstatt/gpt_and_prejudice"

# Load tokenizer (GPT-2 tokenizer is used)
tokenizer = AutoTokenizer.from_pretrained(repo_id)

# Load model (custom GPT implementation with trust_remote_code)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    trust_remote_code=True
)

model.to("cpu").eval();


### [Optional]: Download a snapshot of the model locally
This avoids downloading a copy each time.

In [None]:
from transformers import AutoModelForCausalLM, AutoTokenizer
from huggingface_hub import snapshot_download

REPO = "HTW-KI-Werkstatt/gpt_and_prejudice"
local_dir = snapshot_download(
    repo_id=REPO,
    revision="main",
    local_dir="./gpt_and_prejudice_snapshot"
)

Then load from disk; with `local_files_only=True`, nothing is downloaded or updated.

In [None]:
# Load from the snapshot directory, passed positionally as the model path.
snapshot_dir = "./gpt_and_prejudice_snapshot"
tokenizer = AutoTokenizer.from_pretrained(snapshot_dir, trust_remote_code=True, local_files_only=True)
model = AutoModelForCausalLM.from_pretrained(snapshot_dir, trust_remote_code=True, local_files_only=True).eval()

---

## 2. Text Generation

In [2]:
import torch
from generate_text import generate

torch.set_printoptions(profile="full")

In [4]:
text = generate(
    model=model,
    prompt="She is",
    max_new_tokens=50,
    temperature=0.4,
    top_k=50,
)

print(text)

She is very fond of her own family, and I dare say she is very fond of her, and I should like to see her. I don't want her to be a good girl, for she is a very pretty girl, and I think I shall
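The `temperature` and `top_k` arguments passed to `generate` above control how the next token is sampled. The sketch below shows how these two settings are commonly combined. It is a generic, pure-Python illustration and not the code in `generate_text.py`; the function name `sample_top_k` and the toy logits are hypothetical.

```python
import math
import random


def sample_top_k(logits, temperature=0.4, top_k=50, rng=random):
    """Sample one index from `logits` with top-k filtering and temperature scaling."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Keep only the k highest-scoring candidates.
    k = min(top_k, len(logits))
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Divide by the temperature: values below 1 sharpen the distribution.
    scaled = [logits[i] / temperature for i in top]
    # Numerically stable softmax weights over the surviving candidates.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(top, weights=weights, k=1)[0]


# Toy example: four candidate tokens, only the top two may be drawn.
toy_logits = [2.0, 1.0, 0.5, -1.0]
rng = random.Random(0)
draws = [sample_top_k(toy_logits, temperature=0.4, top_k=2, rng=rng) for _ in range(1000)]
print({idx: draws.count(idx) for idx in sorted(set(draws))})
```

At a low temperature such as 0.4 the most likely tokens dominate the draw, which is one plausible reason for the repeated phrases in the samples.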


In [5]:
text = generate(
    model=model,
    prompt="He is",
    max_new_tokens=50,
    temperature=0.4,
    top_k=50,
)

print(text)

He is a good fellow, and he has a great deal of good sense. I am sure he is a very good man, and I am sure he is very good-natured, and I am sure he is very good-natured, and very
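To compare paired prompts such as "She is" and "He is" more systematically, the generations can be collected into records that leave room for manual annotation. The following is a minimal sketch under stated assumptions: `collect_generations`, `word_counts`, and the stand-in `fake_generate` are hypothetical helpers, and the only thing assumed about the notebook's `generate` is that it accepts a `prompt` keyword and returns a string, as the calls above suggest.

```python
import re
from collections import Counter


def collect_generations(generate_fn, prompts, n_samples=3, **gen_kwargs):
    """Call `generate_fn` repeatedly and return one record per generation."""
    records = []
    for prompt in prompts:
        for sample_idx in range(n_samples):
            text = generate_fn(prompt=prompt, **gen_kwargs)
            records.append({
                "prompt": prompt,
                "sample": sample_idx,
                "text": text,
                "annotation": "",  # filled in by hand during review
            })
    return records


def word_counts(records, prompt):
    """Count lowercase word frequencies over all generations for one prompt."""
    counts = Counter()
    for rec in records:
        if rec["prompt"] == prompt:
            counts.update(re.findall(r"[a-z']+", rec["text"].lower()))
    return counts


# Stand-in for the notebook's `generate`, so the sketch runs without the model.
def fake_generate(prompt, **kwargs):
    canned = {"She is": "She is a very pretty girl", "He is": "He is a very good man"}
    return canned[prompt]


records = collect_generations(fake_generate, ["She is", "He is"], n_samples=2)
print(len(records))
print(word_counts(records, "He is").most_common(3))
```

With the real model, a wrapper such as `lambda prompt, **kw: generate(model=model, prompt=prompt, **kw)` could be passed as `generate_fn`, together with the same `max_new_tokens`, `temperature`, and `top_k` values used above. Word counts are only a first pass; the annotation field is where the qualitative reading belongs.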
