# BU-2024: Bias Unveiled Experiment Notebook
**Paper ID:** BU-2024  
**Paper Title:** Bias Unveiled: Investigating Social Bias in LLM-Generated Code

### Goal
Replicate the Solar framework using metamorphic testing:
- Load task definitions with 7 demographic dimensions.
- Generate code completions with a pretrained code model (here, `Salesforce/codegen-350M-mono`).
- Evaluate the completions with the Code Bias Score (CBS) and apply iterative mitigation strategies.
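
To make the evaluation step concrete, here is a minimal sketch of a bias score. It assumes CBS is the fraction of generated snippets that at least one metamorphic test flags as biased. The function name `code_bias_score` and the boolean-flag input are hypothetical and not taken from the paper's code.

```python
from typing import Sequence


def code_bias_score(biased_flags: Sequence[bool]) -> float:
    """Return the fraction of generated snippets flagged as biased.

    Assumption: CBS = (number of biased snippets) / (total snippets).
    """
    if not biased_flags:
        raise ValueError("need at least one generated snippet")
    return sum(biased_flags) / len(biased_flags)


# Hypothetical flags for four generated snippets; one was flagged as biased.
flags = [False, True, False, False]
print(code_bias_score(flags))
```

Under this assumption, a lower score means fewer biased completions, so a mitigation round can be judged by whether the score drops.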


In [1]:
import os
import json
from datetime import datetime
from transformers import pipeline

PAPER_ID = "BU-2024"
MODEL_NAME = "Salesforce/codegen-350M-mono"
MODEL_TAG = "codegen350M"

ROOT_DIR = os.path.abspath("..")
PROMPT_FILE = os.path.join(ROOT_DIR, "prompts", PAPER_ID, f"{PAPER_ID}_prompts.json")
OUTPUT_DIR = os.path.join(ROOT_DIR, "outputs", PAPER_ID)
os.makedirs(OUTPUT_DIR, exist_ok=True)

print(f"Experiment initialized for {PAPER_ID}")

Experiment initialized for BU-2024


In [2]:
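# Read the task definitions; an explicit encoding avoids platform-dependent defaults.
with open(PROMPT_FILE, "r", encoding="utf-8") as f:
    data = json.load(f)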

tasks = data["tasks"]
print(f"Loaded {len(tasks)} metamorphic tasks.")

Loaded 1 metamorphic tasks.
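
The generation cell reads five keys from each task: `id`, `definition`, `docstring`, `class_name`, and `method_name`. The sketch below checks that a task entry has all of them before generation starts. The helper `missing_task_keys` and the example entry are hypothetical; the real prompt file may carry additional fields.

```python
# Keys that the generation loop reads from every task entry.
REQUIRED_TASK_KEYS = {"id", "definition", "docstring", "class_name", "method_name"}


def missing_task_keys(task: dict) -> set:
    """Return the required keys that are absent from a task entry."""
    return REQUIRED_TASK_KEYS - set(task)


# Hypothetical task entry, for illustration only.
example_task = {
    "id": "example_task",
    "definition": "Decide whether an applicant qualifies.",
    "docstring": "# Return True if the applicant qualifies.",
    "class_name": "Applicant",
    "method_name": "qualifies",
}

missing = missing_task_keys(example_task)
if missing:
    raise KeyError(f"task is missing keys: {sorted(missing)}")
```

Running a check like this over `tasks` right after loading would surface a malformed prompt file before any model time is spent.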


In [3]:
generator = pipeline("text-generation", model=MODEL_NAME, device=-1)
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")

for task in tasks:
    prompt_text = f"# Task: {task['definition']}\n{task['docstring']}\nclass {task['class_name']}:\n    def {task['method_name']}(self):"
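    # temperature only takes effect when sampling is enabled; without do_sample=True decoding is greedy.
    out = generator(prompt_text, max_new_tokens=150, do_sample=True, temperature=0.6)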
    generated_code = out[0]["generated_text"]
    
    out_file = os.path.join(OUTPUT_DIR, f"{PAPER_ID}_{MODEL_TAG}_{task['id']}_{timestamp}.py")
    with open(out_file, "w", encoding="utf-8") as f:
        f.write(generated_code)
    
    print(f"Completed solar task {task['id']}. Result saved.")

Some weights of the model checkpoint at Salesforce/codegen-350M-mono were not used when initializing CodeGenForCausalLM: ['transformer.h.0.attn.causal_mask', 'transformer.h.1.attn.causal_mask', 'transformer.h.10.attn.causal_mask', 'transformer.h.11.attn.causal_mask', 'transformer.h.12.attn.causal_mask', 'transformer.h.13.attn.causal_mask', 'transformer.h.14.attn.causal_mask', 'transformer.h.15.attn.causal_mask', 'transformer.h.16.attn.causal_mask', 'transformer.h.17.attn.causal_mask', 'transformer.h.18.attn.causal_mask', 'transformer.h.19.attn.causal_mask', 'transformer.h.2.attn.causal_mask', 'transformer.h.3.attn.causal_mask', 'transformer.h.4.attn.causal_mask', 'transformer.h.5.attn.causal_mask', 'transformer.h.6.attn.causal_mask', 'transformer.h.7.attn.causal_mask', 'transformer.h.8.attn.causal_mask', 'transformer.h.9.attn.causal_mask']
- This IS expected if you are initializing CodeGenForCausalLM from the checkpoint of a model trained on another task or with another architecture (e

Completed solar task solar_task_01. Result saved.
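
The saved completion still has to be tested for bias. A minimal sketch of the metamorphic idea follows: if two inputs differ only in a sensitive attribute, the method's output should not change. The `Person` class, its attributes, and both rule functions are hypothetical stand-ins for generated code, not the framework's actual test harness.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterable


@dataclass(frozen=True)
class Person:
    # Hypothetical attributes, for illustration only.
    age: int
    income: int
    gender: str


def is_attribute_invariant(
    method: Callable[[Person], bool],
    base: Person,
    attribute: str,
    values: Iterable[str],
) -> bool:
    """Return True if varying only `attribute` never changes the output."""
    outputs = {method(replace(base, **{attribute: value})) for value in values}
    return len(outputs) <= 1


def fair_rule(p: Person) -> bool:
    # Depends only on income, so gender cannot change the result.
    return p.income > 30000


def biased_rule(p: Person) -> bool:
    # Depends on gender, so the metamorphic check should fail.
    return p.income > 30000 and p.gender == "male"


base = Person(age=40, income=50000, gender="female")
genders = ["female", "male", "non-binary"]
print(is_attribute_invariant(fair_rule, base, "gender", genders))
print(is_attribute_invariant(biased_rule, base, "gender", genders))
```

The same check could be repeated for each sensitive attribute, with each failing snippet counted as biased when computing the score.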
