# FC-2025: FairCoder Experiment Notebook
**Paper ID:** FC-2025  
**Paper Title:** FairCoder: Evaluating Social Bias of LLMs in Code Generation

### Goal
Replicate the FairCoder benchmark, focusing on the function-implementation pipeline:
- Load few-shot probes for Job Hiring.
- Generate code using a code-specialized model.
- Extract metrics for FairScore and Utility (simulated).


In [1]:
import os
import json
import sys
from datetime import datetime
from transformers import pipeline

PAPER_ID = "FC-2025"
MODEL_NAME = "Salesforce/codegen-350M-mono"
MODEL_TAG = "codegen350M"  # short tag used in output file names

# Paths are resolved relative to the parent of the current working directory.
ROOT_DIR = os.path.abspath("..")
PROMPT_FILE = os.path.join(ROOT_DIR, "prompts", PAPER_ID, f"{PAPER_ID}_prompts.json")
OUTPUT_DIR = os.path.join(ROOT_DIR, "outputs", PAPER_ID)
os.makedirs(OUTPUT_DIR, exist_ok=True)

print(f"Experiment initialized for {PAPER_ID} using {MODEL_NAME}")

Experiment initialized for FC-2025 using Salesforce/codegen-350M-mono


In [2]:
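# Read the prompt file with an explicit encoding so the result does not depend on the platform default.
with open(PROMPT_FILE, "r", encoding="utf-8") as f:
    data = json.load(f)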

scenarios = data["scenarios"]
print(f"Loaded {len(scenarios)} scenarios.")

Loaded 1 scenarios.
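
The cells in this notebook read three things from the prompt file: a top-level `scenarios` list, and a `scenario_id` and `template` in each entry, with `[SENSITIVE_ATTR]` as the placeholder that gets substituted. A minimal sketch of a shape check that could run before generation is shown below. The helper name `validate_scenarios` is hypothetical, and treating a missing placeholder as a problem is an assumption, not something the prompt file is known to require.

```python
REQUIRED_KEYS = ("scenario_id", "template")
PLACEHOLDER = "[SENSITIVE_ATTR]"


def validate_scenarios(data):
    """Return a list of problems found in the loaded prompt data (empty if none)."""
    scenarios = data.get("scenarios") if isinstance(data, dict) else None
    if not isinstance(scenarios, list) or not scenarios:
        return ["'scenarios' must be a non-empty list"]

    problems = []
    for i, scenario in enumerate(scenarios):
        if not isinstance(scenario, dict):
            problems.append(f"scenario {i}: expected an object")
            continue
        for key in REQUIRED_KEYS:
            if key not in scenario:
                problems.append(f"scenario {i}: missing key '{key}'")
        # Assumption: every template is expected to carry the placeholder.
        if PLACEHOLDER not in scenario.get("template", ""):
            problems.append(f"scenario {i}: template has no {PLACEHOLDER} placeholder")
    return problems
```

Calling `validate_scenarios(data)` right after `json.load` and printing any returned problems would surface a malformed file before the model is loaded.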


In [3]:
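# device=-1 runs the model on CPU.
generator = pipeline("text-generation", model=MODEL_NAME, device=-1)

results = []
# One timestamp per run, so every output file from this run shares the same suffix.
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")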

for scenario in scenarios:
    # Only "gender" is probed here; the sensitive attribute is hard-coded.
    prompt_text = scenario["template"].replace("[SENSITIVE_ATTR]", "gender")
    # Sampling is enabled and no seed is set, so outputs differ between runs.
    out = generator(prompt_text, max_new_tokens=150, temperature=0.4, do_sample=True)
    # By default, generated_text contains the prompt followed by the completion.
    generated_code = out[0]["generated_text"]

    out_file = os.path.join(OUTPUT_DIR, f"{PAPER_ID}_{MODEL_TAG}_{scenario['scenario_id']}_{timestamp}.py")
    with open(out_file, "w", encoding="utf-8") as f:
        f.write(generated_code)

    results.append({"scenario": scenario["scenario_id"], "output": out_file})
    print(f"Saved output to {out_file}")

Some weights of the model checkpoint at Salesforce/codegen-350M-mono were not used when initializing CodeGenForCausalLM: ['transformer.h.0.attn.causal_mask', 'transformer.h.1.attn.causal_mask', 'transformer.h.10.attn.causal_mask', 'transformer.h.11.attn.causal_mask', 'transformer.h.12.attn.causal_mask', 'transformer.h.13.attn.causal_mask', 'transformer.h.14.attn.causal_mask', 'transformer.h.15.attn.causal_mask', 'transformer.h.16.attn.causal_mask', 'transformer.h.17.attn.causal_mask', 'transformer.h.18.attn.causal_mask', 'transformer.h.19.attn.causal_mask', 'transformer.h.2.attn.causal_mask', 'transformer.h.3.attn.causal_mask', 'transformer.h.4.attn.causal_mask', 'transformer.h.5.attn.causal_mask', 'transformer.h.6.attn.causal_mask', 'transformer.h.7.attn.causal_mask', 'transformer.h.8.attn.causal_mask', 'transformer.h.9.attn.causal_mask']
- This IS expected if you are initializing CodeGenForCausalLM from the checkpoint of a model trained on another task or with another architecture (e

Saved output to /Users/dhrubadatta/Documents/Research/CodeAudit X/Codes/outputs/FC-2025/FC-2025_codegen350M_job_hiring_01_20260219_105739.py
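
The Goal lists a simulated metric step, but the cells above stop at saving the raw generation. The sketch below shows one crude way such a step might start: strip the prompt from each saved output, then measure how often the completion refers to the sensitive attribute. This is a stand-in proxy and not the FairScore or Utility definition from the FairCoder paper. All three helper names are hypothetical, and a plain substring match is an assumption that will both over-count and under-count real attribute use.

```python
import re


def split_completion(prompt_text, generated_text):
    """Drop the leading prompt, which the pipeline echoes back by default."""
    if generated_text.startswith(prompt_text):
        return generated_text[len(prompt_text):]
    return generated_text


def mentions_attribute(code, attribute="gender"):
    """Case-insensitive substring check; also matches names like `gender_score`."""
    return re.search(re.escape(attribute), code, re.IGNORECASE) is not None


def attribute_usage_rate(completions, attribute="gender"):
    """Fraction of completions that mention the attribute (0.0 for an empty list)."""
    if not completions:
        return 0.0
    hits = sum(mentions_attribute(c, attribute) for c in completions)
    return hits / len(completions)
```

With a single scenario and a single sample the rate can only be 0.0 or 1.0, so a meaningful estimate would need several samples per prompt.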
