<a href="https://colab.research.google.com/github/Shravani018/llm-audit-bench/blob/main/notebooks/02_transparency_score.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#### 02. Transparency Score

**Loading each model's card text and scoring it against 7 completeness criteria.**

In [1]:
!pip install -q -r requirements.txt

In [2]:
# Importing necessary libraries
import json
import os
import pandas as pd
from huggingface_hub import HfApi,ModelCard

In [3]:
# LLMs used
models=[
    "gpt2",
    "distilgpt2",
    "facebook/opt-125m",
    "EleutherAI/gpt-neo-125m",
    "bigscience/bloom-560m",
]


In [4]:
# Defining the criteria and weights (summing to 1.0)
criteria={
    "has_model_card":0.20,
    "license":0.15,
    "training_data":0.20,
    "limitations":0.15,
    "intended_use":0.10,
    "evaluation_results":0.10,
    "carbon_footprint":0.10
}
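
Because the final score is a weighted sum, it only stays within [0, 1] if the weights add up to 1.0. A minimal sanity check, assuming the same `criteria` dictionary as in the cell above (the name `weights_total` is introduced here for illustration only):

```python
import math

# Same weights as the criteria cell, repeated so this snippet runs on its own.
criteria = {
    "has_model_card": 0.20,
    "license": 0.15,
    "training_data": 0.20,
    "limitations": 0.15,
    "intended_use": 0.10,
    "evaluation_results": 0.10,
    "carbon_footprint": 0.10,
}

# Floating-point sums are rarely exact, so compare with a tolerance.
weights_total = sum(criteria.values())
assert math.isclose(weights_total, 1.0), f"weights sum to {weights_total}, expected 1.0"
```

Running a check like this after editing the weights would catch a dictionary that no longer normalises.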

In [5]:
# Initializing API
api=HfApi()

In [6]:
# Fetching model card
def get_model_card(model_id):
    try:
        model_card=ModelCard.load(model_id)
        return model_card.content.lower()
    except Exception:
        return None

In [7]:
# Checking for license
def check_license(model_id):
    try:
        info=api.model_info(model_id)
        # cardData holds the card's metadata and may be None, so always coerce to a bool
        return bool(info.cardData and info.cardData.get("license"))
    except Exception:
        return False

In [8]:
# Defining scoring mechanism for the model
def score_model_card(card_text,model_id):
  if card_text is None:
    return {k:False for k in criteria}
  checks={}
  # Does the card exist?
  checks["has_model_card"]=True  # the card text was loaded, so the card exists
  # Does the model have a license?
  checks["license"]=check_license(model_id)
  # Is the training data described?
  checks["training_data"] = any(kw in card_text for kw in
        ["trained on", "training data", "dataset", "corpus", "pretraining", "fine-tuned on"])
  # Are the limitations mentioned?
  checks["limitations"] = any(kw in card_text for kw in
        ["limitation", "bias", "risk", "not suitable", "avoid", "failure", "caveat"])
  # Is the intended use described?
  checks["intended_use"] = any(kw in card_text for kw in
        ["intended use", "use case", "designed for", "primary use", "out-of-scope", "downstream"])
  # Are the evaluation results present?
  checks["evaluation_results"] = any(kw in card_text for kw in
        ["benchmark", "accuracy", "f1", "perplexity", "bleu", "rouge", "results", "performance", "score"])
  # Are the costs or carbon footprint mentioned?
  checks["carbon_footprint"] = any(kw in card_text for kw in
        ["carbon", "co2", "emissions", "compute", "gpu hours", "energy", "environmental"])
  return checks
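
The checks above use plain substring matching, so a keyword can also match inside an unrelated word (for example `risk` inside `asterisk`). One possible refinement, shown here as a sketch and not as the notebook's method, is to require each keyword to start at a word boundary. The helper name `has_keyword` and the sample sentence are hypothetical.

```python
import re

def has_keyword(text, keywords):
    """Return True if any keyword starts at a word boundary in text."""
    # Only the leading boundary is enforced, so plurals such as
    # "limitations" or "risks" still match.
    pattern = r"\b(?:" + "|".join(re.escape(kw) for kw in keywords) + r")"
    return re.search(pattern, text) is not None

limitation_keywords = ["limitation", "bias", "risk", "not suitable",
                       "avoid", "failure", "caveat"]

# Hypothetical card sentence with no real limitations content.
sample = "see the asterisk in the footnote."

# Substring matching fires because "risk" is inside "asterisk".
substring_hit = any(kw in sample for kw in limitation_keywords)
# Boundary-aware matching does not fire on the same sentence.
boundary_hit = has_keyword(sample, limitation_keywords)
```

Either way the check stays a keyword heuristic: it tells you that a term appears, not that the section is substantive.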


In [9]:
# Calculating the score
def calc_score(checks):
  score=round(sum(criteria[k]*(1.0 if checks[k] else 0.0) for k in criteria),4)
  return score
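
As a worked example of the weighted sum, the sketch below repeats `criteria` and `calc_score` so it runs on its own, then scores a card that passes every check except carbon footprint. The variable `example_checks` is illustrative only.

```python
# Repeated from the cells above so this snippet is self-contained.
criteria = {
    "has_model_card": 0.20,
    "license": 0.15,
    "training_data": 0.20,
    "limitations": 0.15,
    "intended_use": 0.10,
    "evaluation_results": 0.10,
    "carbon_footprint": 0.10,
}

def calc_score(checks):
    # Each passed criterion contributes its weight; failed ones contribute 0.
    return round(sum(criteria[k] * (1.0 if checks[k] else 0.0) for k in criteria), 4)

# Every check passes except carbon footprint (weight 0.10).
example_checks = {k: True for k in criteria}
example_checks["carbon_footprint"] = False

example_score = calc_score(example_checks)  # 1.0 - 0.10 = 0.9
```

A card that fails every check scores 0.0, and one that passes all seven scores 1.0.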

In [10]:
# Scoring the model
def eval_transperancy(model_id):
  print(f"Evaluating Transperancy for:{model_id}")
  card_text=get_model_card(model_id)
  checks=score_model_card(card_text,model_id)
  score=calc_score(checks)
  stats={"model_id":model_id,"checks":checks,"score":score}
  return stats

In [11]:
# Evaluating transparency for each model
model_transperancy=[eval_transperancy(model_id) for model_id in models]

Evaluating Transperancy for:gpt2




Evaluating Transperancy for:distilgpt2
Evaluating Transperancy for:facebook/opt-125m
Evaluating Transperancy for:EleutherAI/gpt-neo-125m
Evaluating Transperancy for:bigscience/bloom-560m


In [12]:
rows = []
for r in model_transperancy:
    row = {"model_id": r["model_id"], "transparency_score": r["score"]}
    row.update(r["checks"])  # flatten checks into columns
    rows.append(row)

In [13]:
df = pd.DataFrame(rows)
df.head()

| | model_id | transparency_score | has_model_card | license | training_data | limitations | intended_use | evaluation_results | carbon_footprint |
|---|---|---|---|---|---|---|---|---|---|
| 0 | gpt2 | 0.9 | True | True | True | True | True | True | False |
| 1 | distilgpt2 | 1.0 | True | True | True | True | True | True | True |
| 2 | facebook/opt-125m | 0.9 | True | True | True | True | True | True | False |
| 3 | EleutherAI/gpt-neo-125m | 0.9 | True | True | True | True | True | True | False |
| 4 | bigscience/bloom-560m | 1.0 | True | True | True | True | True | True | True |


In [14]:
with open("./transparency_scores.json", "w") as f:
    json.dump({"transparency": model_transperancy}, f, indent=2)
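
A later notebook could read this file back and index the scores by model. The sketch below assumes the layout written above (a top-level `transparency` list of records); it uses a single sample record copied from the `gpt2` row of the table and writes to a temporary directory so the real output file is not touched. The name `scores_by_model` is illustrative.

```python
import json
import os
import tempfile

# One record in the layout written above; values copied from the gpt2 row.
sample = {
    "transparency": [
        {
            "model_id": "gpt2",
            "checks": {
                "has_model_card": True,
                "license": True,
                "training_data": True,
                "limitations": True,
                "intended_use": True,
                "evaluation_results": True,
                "carbon_footprint": False,
            },
            "score": 0.9,
        }
    ]
}

# Write to a temporary directory so the real output file is left untouched.
path = os.path.join(tempfile.mkdtemp(), "transparency_scores.json")
with open(path, "w") as f:
    json.dump(sample, f, indent=2)

# Read the file back and index the scores by model id.
with open(path) as f:
    loaded = json.load(f)

scores_by_model = {r["model_id"]: r["score"] for r in loaded["transparency"]}
```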

**Conclusion**
- All 5 models score between 0.9 and 1.0 on transparency, indicating strong documentation practices across the board.
- This is expected given that these are widely adopted, community-maintained models from established organisations where documentation standards are high.
- The only criterion that any model fails is carbon footprint disclosure: only `distilgpt2` and `bloom-560m` report environmental cost.
- Carbon reporting in ML only became a community norm after 2022, which may explain why older models like `gpt2` and `gpt-neo-125m` omit it entirely.
- Transparency is the easiest pillar to score well on: it reflects documentation quality, not model behaviour, which makes it a weak proxy for true trustworthiness.
- The high scores here set a strong baseline but should not be interpreted as an indicator of fairness, robustness, or safety.

Next: 03_fairness_Score.ipynb

Measuring stereotype bias across 5 demographic categories using CrowS-Pairs log-probability comparison.