# Lab 4.2.6: Model Card Creation

**Module:** 4.2 - AI Safety & Alignment  
**Time:** 2 hours  
**Difficulty:** ⭐⭐

---

## 🎯 Learning Objectives

By the end of this notebook, you will:
- [ ] Understand the purpose and components of model cards
- [ ] Create comprehensive model documentation
- [ ] Include safety evaluation results
- [ ] Document limitations and biases
- [ ] Publish a model card to Hugging Face Hub

---

## 📚 Prerequisites

- Completed: Labs 4.2.1-4.2.5
- Required: Hugging Face account (free)
- Results from previous safety evaluations

---

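## 🌍 Real-World Context

Model cards were proposed in 2018 by researchers at Google (Mitchell et al., "Model Cards for Model Reporting") as a way to increase transparency in ML. They're now:
- **Standard** on the Hugging Face Hub, where a repository's `README.md` serves as the model card
- **Consistent with** the technical documentation the EU AI Act requires for high-risk systems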
- **Expected** by enterprise customers evaluating AI vendors

A good model card helps users understand:
- What the model does and doesn't do
- Who it's intended for
- Known limitations and biases
- How to use it responsibly

---

## 🧒 ELI5: What is a Model Card?

> **Imagine you're buying a toy at the store...**
>
> The box tells you:
> - What the toy does
> - What age it's for
> - Safety warnings ("Not for children under 3")
> - What batteries it needs
> - What's NOT included
>
> **A model card is exactly that** - but for AI models. It tells you what the model does, what it's good at, what it's bad at, and how to use it safely.
>
> Without this "label", you might use the model for something it wasn't designed for - like giving a 1-year-old a toy meant for teenagers.

---

## Part 1: Model Card Components

### Standard Sections (Based on Hugging Face Template)

| Section | Purpose | Required? |
|---------|---------|----------|
| **Model Details** | Name, version, type | ✅ Yes |
| **Model Description** | What it does | ✅ Yes |
| **Intended Uses** | Primary use cases | ✅ Yes |
| **Out-of-Scope Uses** | What NOT to use it for | ✅ Yes |
| **Bias, Risks, Limitations** | Known issues | ✅ Yes |
| **Training Details** | Data, procedure | Recommended |
| **Evaluation** | Benchmark scores | Recommended |
| **Environmental Impact** | Carbon footprint | Optional |
| **Citation** | How to cite | Optional |
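
One way to put this table to work is to scan a draft card for the headings of the required sections. The sketch below is illustrative only: the `REQUIRED_SECTIONS` list and the `missing_sections` helper are assumptions made for this example (not part of any Hugging Face tooling), and the heading text "Bias, Risks, and Limitations" follows the wording used in the card generated later in this lab.

```python
import re
from typing import List

# Section titles treated as required in the table above (illustrative list).
REQUIRED_SECTIONS = [
    "Model Details",
    "Model Description",
    "Intended Uses",
    "Out-of-Scope Uses",
    "Bias, Risks, and Limitations",
]

def missing_sections(card_markdown: str) -> List[str]:
    """Return required section titles that have no matching markdown heading."""
    headings = {
        match.group(1).strip().lower()
        for match in re.finditer(
            r"^#{1,6}[ \t]+(.+?)[ \t]*$", card_markdown, flags=re.MULTILINE
        )
    }
    return [title for title in REQUIRED_SECTIONS if title.lower() not in headings]

draft = """# Model Card: demo

## Model Details
## Model Description
## Intended Uses
"""
print(missing_sections(draft))
```

A heading check like this only confirms that a section exists; whether its content is specific enough still needs a human reviewer.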

In [None]:
# Model Card template structure
from dataclasses import dataclass, field
from typing import Dict, List, Optional
from datetime import datetime

@dataclass
class ModelCardData:
    """Data structure for model card information."""
    
    # Model Details
    model_name: str
    model_version: str
    model_type: str  # e.g., "Text Generation", "Classification"
    base_model: Optional[str] = None
    
    # Description
    description: str = ""
    developed_by: str = ""
    license: str = ""
    
    # Uses
    intended_uses: List[str] = field(default_factory=list)
    out_of_scope_uses: List[str] = field(default_factory=list)
    
    # Training
    training_data: str = ""
    training_procedure: str = ""
    training_hardware: str = ""
    
    # Evaluation
    evaluation_results: Dict = field(default_factory=dict)
    safety_evaluations: Dict = field(default_factory=dict)
    
    # Bias and Limitations
    known_biases: List[str] = field(default_factory=list)
    limitations: List[str] = field(default_factory=list)
    risks: List[str] = field(default_factory=list)
    
    # Recommendations
    recommendations: List[str] = field(default_factory=list)
    
    # Metadata
    language: str = "en"
    tags: List[str] = field(default_factory=list)
    created_date: str = field(default_factory=lambda: datetime.now().strftime("%Y-%m-%d"))

print("✅ ModelCardData structure defined")
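
The list and dict fields above use `field(default_factory=...)` rather than a literal `[]` or `{}` default, so every instance gets its own container. The sketch below shows that behavior, plus serialization with `dataclasses.asdict`, on a trimmed stand-in class. `MiniCard` is hypothetical and exists only to keep the example self-contained.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List

@dataclass
class MiniCard:
    """Trimmed stand-in for ModelCardData (hypothetical, for illustration only)."""
    model_name: str
    limitations: List[str] = field(default_factory=list)

a = MiniCard(model_name="model-a")
b = MiniCard(model_name="model-b")
a.limitations.append("Knowledge cutoff: early 2024")

# default_factory gives each instance its own list, so b is unaffected.
print(b.limitations)

# asdict() converts the dataclass into plain dicts and lists that json can serialize.
print(json.dumps(asdict(a), indent=2))
```

Serializing the card data this way may be useful if you want to keep the structured source next to the rendered markdown.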

---

## Part 2: Gathering Model Information

Let's create a model card for a fine-tuned model from this course.

In [None]:
# Example: Creating a model card for a fine-tuned assistant

# Load safety evaluation results from previous labs
import json
import os

safety_results = {}
bias_results = {}

# Try to load previous evaluation results
if os.path.exists("benchmark_results/safety_benchmarks.json"):
    with open("benchmark_results/safety_benchmarks.json") as f:
        safety_results = json.load(f)
    print("✅ Loaded safety benchmark results")
else:
    print("⚠️ No safety benchmark results found - using example values")
    safety_results = {
        "truthfulqa": {"score": 0.52},
        "bbq": {"accuracy": 0.73, "bias_score": 0.08}
    }

if os.path.exists("bias_reports/bias_analysis.json"):
    with open("bias_reports/bias_analysis.json") as f:
        bias_results = json.load(f)
    print("✅ Loaded bias analysis results")
else:
    print("⚠️ No bias analysis found - using example values")
    bias_results = {"gender": {"disparities": {"sentiment_gap": 0.05}}}
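
The two load-or-fallback blocks above follow the same pattern, which could be factored into a helper. This is a minimal sketch, assuming a hypothetical `load_results` function that is not part of the earlier labs. It also falls back when the file exists but contains invalid JSON.

```python
import json
from pathlib import Path
from typing import Any, Dict

def load_results(path: str, fallback: Dict[str, Any]) -> Dict[str, Any]:
    """Load a JSON results file, or return the fallback if it is missing or invalid."""
    file = Path(path)
    if not file.is_file():
        return fallback
    try:
        with file.open(encoding="utf-8") as f:
            return json.load(f)
    except json.JSONDecodeError:
        return fallback

# The path below is deliberately nonexistent so the fallback branch runs.
example = load_results("no_such_dir/results.json", {"truthfulqa": {"score": 0.52}})
print(example)
```

If fallback values end up in a published card, they should be clearly labeled as placeholders rather than measured results.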

In [None]:
# Create the model card data
model_card_data = ModelCardData(
    # Model Details
    model_name="tech-assistant-llama3-8b-lora",
    model_version="1.0.0",
    model_type="Text Generation (Conversational)",
    base_model="meta-llama/Llama-3.1-8B-Instruct",
    
    # Description
    description="""
A LoRA fine-tuned version of Llama 3.1 8B optimized for technical assistance.
Trained on the DGX Spark platform using QLoRA with 4-bit quantization.
Designed to help users with programming, technology, and general knowledge questions.
""",
    developed_by="DGX Spark AI Curriculum Team",
    license="llama3.1",
    
    # Uses
    intended_uses=[
        "Technical support and programming assistance",
        "Answering general knowledge questions",
        "Explaining technical concepts",
        "Code review and debugging help",
        "Educational tutoring in STEM subjects"
    ],
    out_of_scope_uses=[
        "Medical diagnosis or health advice",
        "Legal counsel or advice",
        "Financial investment recommendations",
        "Generating malicious code or exploits",
        "Impersonating real individuals",
        "Critical safety systems without human oversight"
    ],
    
    # Training
    training_data="""
Fine-tuned on a curated dataset of technical Q&A pairs including:
- Stack Overflow questions (filtered for quality)
- Technical documentation excerpts
- Programming tutorials
- Science and technology explanations

All training data was filtered for:
- PII removal
- Toxicity filtering
- Copyright compliance
""",
    training_procedure="""
- Method: QLoRA (4-bit quantization + LoRA adapters)
- LoRA Rank: 64
- LoRA Alpha: 128
- Target Modules: q_proj, k_proj, v_proj, o_proj
- Learning Rate: 2e-4
- Epochs: 3
- Batch Size: 4 (with gradient accumulation)
""",
    training_hardware="NVIDIA DGX Spark (128GB unified memory, Blackwell GB10)",
    
    # Evaluation
    evaluation_results={
        "MMLU": 0.65,
        "HellaSwag": 0.78,
        "ARC-Challenge": 0.52
    },
    safety_evaluations={
        "TruthfulQA MC2": safety_results.get("truthfulqa", {}).get("score", 0.52),
        "BBQ Accuracy": safety_results.get("bbq", {}).get("accuracy", 0.73),
        "BBQ Bias Score": safety_results.get("bbq", {}).get("bias_score", 0.08),
        "Red Team Pass Rate": 0.85
    },
    
    # Bias and Limitations
    known_biases=[
        "May show slight preference for Western cultural contexts in examples",
        "Technical examples skew toward Python and JavaScript",
        "Gender-neutral language sometimes defaults to masculine forms"
    ],
    limitations=[
        "Knowledge cutoff: Training data up to early 2024",
        "May hallucinate facts, especially for recent events",
        "Complex multi-step reasoning may be unreliable",
        "Code generation should be reviewed before production use",
        "Not suitable for safety-critical applications"
    ],
    risks=[
        "May generate plausible but incorrect information",
        "Could be manipulated through adversarial prompts",
        "Outputs may perpetuate biases present in training data"
    ],
    
    # Recommendations
    recommendations=[
        "Always verify important information from authoritative sources",
        "Review generated code before execution",
        "Implement guardrails for production deployment",
        "Monitor outputs for bias and quality",
        "Keep a human in the loop for important decisions"
    ],
    
    # Metadata
    language="en",
    tags=[
        "llama", "llama-3.1", "lora", "qlora", 
        "technical-assistant", "conversational",
        "safety-evaluated", "dgx-spark"
    ]
)

print("✅ Model card data created")
print(f"   Model: {model_card_data.model_name}")
print(f"   Version: {model_card_data.model_version}")

---

## Part 3: Generating the Model Card Markdown

In [None]:
def generate_model_card_markdown(data: ModelCardData) -> str:
    """
    Generate a Hugging Face compatible model card in markdown format.
    """
    
    # YAML frontmatter
    frontmatter = f"""---
language: {data.language}
license: {data.license}
base_model: {data.base_model}
tags:
{chr(10).join('  - ' + tag for tag in data.tags)}
---

"""
    
    # Main content
    content = f"""# Model Card: {data.model_name}

## Model Details

- **Model Name:** {data.model_name}
- **Version:** {data.model_version}
- **Type:** {data.model_type}
- **Base Model:** [{data.base_model}](https://huggingface.co/{data.base_model})
- **License:** {data.license}
- **Developed By:** {data.developed_by}
- **Release Date:** {data.created_date}

## Model Description

{data.description.strip()}

## Intended Uses

### Primary Use Cases

{chr(10).join('- ' + use for use in data.intended_uses)}

### Out-of-Scope Uses

The following uses are **NOT recommended** and may produce harmful or unreliable results:

{chr(10).join('- ❌ ' + use for use in data.out_of_scope_uses)}

## Training Details

### Training Data

{data.training_data.strip()}

### Training Procedure

```
{data.training_procedure.strip()}
```

### Training Hardware

{data.training_hardware}

## Evaluation

### General Benchmarks

| Benchmark | Score |
|-----------|-------|
{chr(10).join(f'| {name} | {score:.2f} |' for name, score in data.evaluation_results.items())}

### Safety Evaluations

| Benchmark | Score | Notes |
|-----------|-------|-------|
{chr(10).join(f'| {name} | {score:.2f} | |' for name, score in data.safety_evaluations.items())}

## Bias, Risks, and Limitations

### Known Biases

{chr(10).join('- ' + bias for bias in data.known_biases)}

### Limitations

{chr(10).join('- ' + lim for lim in data.limitations)}

### Risks

{chr(10).join('- ⚠️ ' + risk for risk in data.risks)}

## Recommendations

{chr(10).join('- ' + rec for rec in data.recommendations)}

## How to Use

### Basic Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained(
    "{data.base_model}",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("{data.base_model}")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "your-username/{data.model_name}")

# Generate (add_generation_prompt appends the assistant turn header)
messages = [{{"role": "user", "content": "How do I optimize my Python code?"}}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Recommended Settings

```python
generation_config = {{
    "max_new_tokens": 512,
    "temperature": 0.7,
    "top_p": 0.9,
    "do_sample": True,
    "repetition_penalty": 1.1
}}
```

## Environmental Impact

- **Hardware:** NVIDIA DGX Spark
- **Training Time:** ~4 hours
- **Estimated Carbon Footprint:** Minimal (desktop-class hardware)

## Citation

```bibtex
@misc{{{data.model_name.replace('-', '_')},
  author = {{{data.developed_by}}},
  title = {{{data.model_name}}},
  year = {{{data.created_date[:4]}}},
  publisher = {{Hugging Face}},
  howpublished = {{\\url{{https://huggingface.co/your-username/{data.model_name}}}}}
}}
```

## Model Card Authors

- {data.developed_by}

## Model Card Contact

For questions or concerns, please open an issue on the model's Hugging Face page.

---

*This model card was generated following the Hugging Face Model Card guidelines and includes safety evaluation results from the DGX Spark AI Curriculum.*
"""
    
    return frontmatter + content

# Generate the model card
model_card_md = generate_model_card_markdown(model_card_data)

print("✅ Model card markdown generated")
print(f"   Length: {len(model_card_md)} characters")

In [None]:
# Display the generated model card
print("📄 GENERATED MODEL CARD")
print("="*60)
print(model_card_md[:3000])
print("\n... [truncated for display] ...")
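
The Hub reads metadata from the YAML block between the leading `---` markers, so it can be worth checking that the generated text opens with a well-formed block before uploading. The sketch below uses only the standard library and does not parse YAML; it just separates the block from the body and lists the top-level keys. The `split_frontmatter` helper and the `sample` card are assumptions made for this example.

```python
from typing import List, Tuple

def split_frontmatter(card_markdown: str) -> Tuple[List[str], str]:
    """Split a card into (frontmatter lines, body). Raises ValueError if no block."""
    lines = card_markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("card does not start with a '---' frontmatter marker")
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return lines[1:i], "\n".join(lines[i + 1:])
    raise ValueError("frontmatter block is never closed")

sample = """---
language: en
license: llama3.1
tags:
  - lora
---

# Model Card: demo
"""
meta_lines, body = split_frontmatter(sample)

# Top-level keys are the unindented lines; indented lines are list items.
top_level_keys = [ln.split(":", 1)[0] for ln in meta_lines if ln and not ln.startswith(" ")]
print(top_level_keys)
```

For real validation, a YAML parser or the Hub's own metadata checks would be more reliable than this line-based approach.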

In [None]:
# Save the model card
os.makedirs("model_cards", exist_ok=True)

with open(f"model_cards/{model_card_data.model_name}_README.md", "w", encoding="utf-8") as f:
    f.write(model_card_md)

print(f"✅ Model card saved to model_cards/{model_card_data.model_name}_README.md")

---

## Part 4: Publishing to Hugging Face Hub

In [None]:
# Install huggingface_hub if needed
try:
    from huggingface_hub import HfApi, login
    print("✅ huggingface_hub installed")
except ImportError:
    !pip install -q huggingface_hub
    from huggingface_hub import HfApi, login
    print("✅ huggingface_hub installed")

In [None]:
# Login to Hugging Face (interactive)
print("📋 Hugging Face Hub Upload")
print("="*60)
print("""
To upload your model card:

1. Create a Hugging Face account: https://huggingface.co/join
2. Create an access token: https://huggingface.co/settings/tokens
3. Run: huggingface-cli login

Or use the login() function below (uncomment to use):
""")

# Uncomment to login:
# login()

In [None]:
# Example: Creating a model repository and uploading
def upload_model_card(
    model_card_content: str,
    repo_name: str,
    username: Optional[str] = None
) -> Optional[str]:
    """
    Upload a model card to Hugging Face Hub.
    
    Args:
        model_card_content: The markdown content of the model card
        repo_name: Name of the repository
        username: Hugging Face username (optional, uses logged-in user)
    """
    from huggingface_hub import HfApi, create_repo
    
    api = HfApi()
    
    # Get username if not provided
    if username is None:
        try:
            username = api.whoami()["name"]
        except Exception:
            print("❌ Not logged in. Please run: huggingface-cli login")
            return None
    
    repo_id = f"{username}/{repo_name}"
    
    # Create repository
    try:
        create_repo(repo_id, exist_ok=True)
        print(f"✅ Repository created/exists: {repo_id}")
    except Exception as e:
        print(f"⚠️ Could not create repo: {e}")
        return None
    
    # Upload README (model card)
    try:
        api.upload_file(
            path_or_fileobj=model_card_content.encode(),
            path_in_repo="README.md",
            repo_id=repo_id
        )
        print("✅ Model card uploaded!")
        print(f"   View at: https://huggingface.co/{repo_id}")
        return repo_id
    except Exception as e:
        print(f"❌ Upload failed: {e}")
        return None

print("✅ Upload function ready")
print("\nTo upload, run:")
print('upload_model_card(model_card_md, "tech-assistant-llama3-8b-lora")')

In [None]:
# Uncomment to actually upload (requires login):
# upload_model_card(model_card_md, model_card_data.model_name)

---

## Part 5: Model Card Best Practices

In [None]:
# Model Card Quality Checklist
QUALITY_CHECKLIST = {
    "Model Description": [
        "Clear explanation of what the model does",
        "Model type and architecture specified",
        "Base model identified (if fine-tuned)",
        "Version number included"
    ],
    "Intended Uses": [
        "Primary use cases listed",
        "Target users identified",
        "Example use cases provided"
    ],
    "Out-of-Scope Uses": [
        "Explicitly states what NOT to use it for",
        "Covers dangerous/harmful uses",
        "Addresses professional advice limitations"
    ],
    "Training Details": [
        "Training data described",
        "Training procedure documented",
        "Hyperparameters listed",
        "Hardware mentioned"
    ],
    "Evaluation": [
        "Benchmark scores included",
        "Safety evaluations documented",
        "Comparison to baseline/base model"
    ],
    "Bias and Limitations": [
        "Known biases documented",
        "Technical limitations listed",
        "Potential risks identified"
    ],
    "Usage Instructions": [
        "Code example provided",
        "Recommended settings included",
        "Dependencies listed"
    ]
}

def check_model_card_quality(card_data: ModelCardData) -> Dict:
    """Check model card completeness."""
    results = {}
    
    checks = {
        "Model Description": [
            ("description", bool(card_data.description)),
            ("model_type", bool(card_data.model_type)),
            ("base_model", bool(card_data.base_model)),
            ("version", bool(card_data.model_version))
        ],
        "Intended Uses": [
            ("intended_uses", len(card_data.intended_uses) > 0)
        ],
        "Out-of-Scope Uses": [
            ("out_of_scope", len(card_data.out_of_scope_uses) > 0)
        ],
        "Training Details": [
            ("training_data", bool(card_data.training_data)),
            ("training_procedure", bool(card_data.training_procedure)),
            ("training_hardware", bool(card_data.training_hardware))
        ],
        "Evaluation": [
            ("eval_results", len(card_data.evaluation_results) > 0),
            ("safety_evals", len(card_data.safety_evaluations) > 0)
        ],
        "Bias and Limitations": [
            ("biases", len(card_data.known_biases) > 0),
            ("limitations", len(card_data.limitations) > 0),
            ("risks", len(card_data.risks) > 0)
        ]
    }
    
    total_checks = 0
    passed_checks = 0
    
    for section, items in checks.items():
        section_passed = sum(1 for _, passed in items if passed)
        section_total = len(items)
        results[section] = {
            "passed": section_passed,
            "total": section_total,
            "complete": section_passed == section_total
        }
        total_checks += section_total
        passed_checks += section_passed
    
    results["overall"] = {
        "passed": passed_checks,
        "total": total_checks,
        "percentage": (passed_checks / total_checks) * 100
    }
    
    return results

# Check our model card
quality = check_model_card_quality(model_card_data)

print("📋 MODEL CARD QUALITY CHECK")
print("="*50)

for section, result in quality.items():
    if section != "overall":
        status = "✅" if result["complete"] else "⚠️"
        print(f"{status} {section}: {result['passed']}/{result['total']}")

print(f"\n🎯 Overall: {quality['overall']['percentage']:.0f}% complete")
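
Note that `check_model_card_quality` only tests whether fields are non-empty, and the `QUALITY_CHECKLIST` dictionary is not consumed by it. One possible use for the checklist is to render it as a manual review list for a human reviewer. This is a minimal sketch: `render_checklist` is a hypothetical helper, and `sample` is a two-item excerpt used to keep the example self-contained.

```python
from typing import Dict, List

def render_checklist(checklist: Dict[str, List[str]]) -> str:
    """Render a {section: [items]} mapping as markdown task lists."""
    lines: List[str] = []
    for section, items in checklist.items():
        lines.append(f"### {section}")
        lines.extend(f"- [ ] {item}" for item in items)
        lines.append("")  # blank line between sections
    return "\n".join(lines).rstrip()

sample = {"Intended Uses": ["Primary use cases listed", "Target users identified"]}
print(render_checklist(sample))
```

In the notebook, the full dictionary could be passed instead, for example `render_checklist(QUALITY_CHECKLIST)`.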

---

## ✋ Try It Yourself

### Exercise 1: Create Your Own Model Card

Create a model card for a model you've trained or would train. Include:
- Detailed use cases
- At least 5 limitations
- Safety evaluation plans

### Exercise 2: Add Environmental Impact

Estimate and document the environmental impact of training:
- Calculate GPU hours used
- Estimate power consumption
- Calculate approximate carbon footprint
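
As a starting point, the three steps can be combined into one estimate: energy in kWh (device hours times average power, scaled by a facility overhead factor) multiplied by the carbon intensity of the local grid. The sketch below is a rough approximation under stated assumptions. The function name, the 200 W average draw, and the 0.4 kg CO2e/kWh grid intensity are placeholders for illustration, not measured values for any specific hardware or region.

```python
def training_footprint_kg(
    device_hours: float,
    avg_power_watts: float,
    grid_kg_co2e_per_kwh: float,
    pue: float = 1.0,
) -> float:
    """Estimate emissions as energy used (kWh) times grid carbon intensity."""
    energy_kwh = device_hours * avg_power_watts / 1000.0 * pue
    return energy_kwh * grid_kg_co2e_per_kwh

# Placeholder inputs: replace with your own measurements.
estimate = training_footprint_kg(
    device_hours=4.0,          # one device for the ~4 hours stated in the generated card
    avg_power_watts=200.0,     # ASSUMED average draw, not a measured figure
    grid_kg_co2e_per_kwh=0.4,  # ASSUMED grid intensity; varies by region
)
print(f"Estimated footprint: {estimate:.2f} kg CO2e")
```

The `pue` parameter (power usage effectiveness) defaults to 1.0, which is a reasonable assumption for a desktop machine; data centers typically report a higher value.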

In [None]:
# Your code for Exercise 1



In [None]:
# Your code for Exercise 2



---

## ⚠️ Common Mistakes

### Mistake 1: Being Too Vague

```markdown
❌ "This model might have some biases."

✅ "This model shows a 5% higher positive sentiment when processing 
    Western names compared to non-Western names (measured as the sentiment gap in our bias analysis)."
```

### Mistake 2: Missing Limitations

```markdown
❌ Limitations section:
    - May make mistakes

✅ Limitations section:
    - Knowledge cutoff: January 2024
    - Hallucinates facts ~15% of the time on TruthfulQA
    - Cannot perform math beyond 3-digit arithmetic reliably
    - May generate biased content about [specific topics]
```

### Mistake 3: No Usage Examples

```markdown
❌ "See documentation for usage."

✅ Include complete, copy-paste ready code examples
```

---

## 🎉 Module Complete!

Congratulations! You've completed Module 4.2: AI Safety & Alignment.

### You've Learned:
- ✅ Implementing NeMo Guardrails for LLM safety
- ✅ Using Llama Guard for content classification
- ✅ Performing automated red teaming
- ✅ Running safety benchmarks (TruthfulQA, BBQ)
- ✅ Evaluating and mitigating bias
- ✅ Creating comprehensive model cards

### Key Takeaways:
1. Safety is not optional - it's a requirement for production AI
2. Defense in depth: Use multiple layers of safety controls
3. Red team your own systems before attackers do
4. Document everything - model cards enable responsible AI
5. Bias is measurable and (partially) mitigable

---

## 📖 Further Reading

- [Model Cards for Model Reporting (Mitchell et al.)](https://arxiv.org/abs/1810.03993)
- [Hugging Face Model Card Guide](https://huggingface.co/docs/hub/model-cards)
- [EU AI Act Documentation Requirements](https://artificialintelligenceact.eu/)
- [NIST AI Risk Management Framework](https://www.nist.gov/itl/ai-risk-management-framework)

---

## 🧹 Cleanup

In [None]:
import gc

gc.collect()

print("✅ Module 4.2 Complete!")
print("\n📁 Generated files:")
print("   - model_cards/tech-assistant-llama3-8b-lora_README.md")
print("   - benchmark_results/safety_benchmarks.json")
print("   - bias_reports/bias_analysis.json")
print("   - red_team_data/red_team_report.json")
print("\n📌 Next: Module 4.3 - MLOps & Experiment Tracking")