The model card template is rendered with Jinja, so the Jinja2 package must be installed first.

In [8]:
!pip install Jinja2






Import the classes needed to build the model card.

In [9]:
from huggingface_hub import ModelCard, ModelCardData

Before running the cell that builds the model card, upload the provided template (`COMP34812_modelcard_template.md`) using the Colab file browser (on the left-hand side).
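
A quick pre-flight check can catch a missing upload before `ModelCard.from_template` is called. This is a minimal sketch, assuming the template sits in the current working directory; `template_is_available` is a hypothetical helper, not part of `huggingface_hub`.

```python
from pathlib import Path


# Hypothetical helper for illustration; not part of huggingface_hub.
def template_is_available(path: str) -> bool:
    """Return True if `path` points to an existing, non-empty file."""
    template = Path(path)
    return template.is_file() and template.stat().st_size > 0


# Assumption: the template was uploaded to the current working directory.
TEMPLATE_PATH = "COMP34812_modelcard_template.md"

if not template_is_available(TEMPLATE_PATH):
    print(f"Template not found: {TEMPLATE_PATH}. Upload it before building the card.")
```

If the message is printed, re-upload the file or adjust `TEMPLATE_PATH` before continuing.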

In [10]:
# ----------------------------------------------------------------------------
# 1. Define ModelCardData
# ----------------------------------------------------------------------------
card_data = ModelCardData(
    language='en',
    license='cc-by-4.0',
    tags=['text-classification'],
    repo="https://github.com/BenTheBacker/NLU---Project",
    ignore_metadata_errors=True
)

# ----------------------------------------------------------------------------
# 2. Create Model Card from Template
# ----------------------------------------------------------------------------

# Optionally, add a new string to summarize best hyperparams:
best_hyperparams_str = """
**Best Hyperparameters**:
- learning_rate: 2.379886141068789e-05
- epochs: 2
- batch_size: 16
- use_focal_loss: False
- gamma: 4.5
- label_smoothing: 0.00011108738704290744
"""

card = ModelCard.from_template(
    card_data=card_data,
    template_path='COMP34812_modelcard_template.md',
    
    # Model ID: incorporate the usernames & track
    model_id='m81976bb-v36373bb-ED-TaskC',

    # Provide a brief model summary
    model_summary='''This classification model was built for "Task C", 
      where the system determines whether a given claim is supported by 
      a piece of evidence (0) or not (1).''',

    # Describe your model architecture and training approach
    model_description='''Our approach uses **microsoft/deberta-v3-base** as the base model, 
      fine-tuned on a dataset of claim-evidence pairs. We performed data augmentation 
      via synonym replacement, and used Hyperopt (TPE) to explore hyperparameters 
      (focal loss vs. label smoothing). This helps address potential data imbalance 
      and improves generalization.''',

    developers='Ben Baker and Ben Barrow',
    base_model_repo='https://huggingface.co/microsoft/deberta-v3-base',
    base_model_paper='https://arxiv.org/abs/2111.09543',
    model_type='Supervised',
    model_architecture='DeBERTa-v3 Base, fine-tuned with optional focal loss.',
    language='English',
    base_model='microsoft/deberta-v3-base',

    # Data references
    training_data='This model was trained on approximately 24.8k claim-evidence pairs, plus augmented samples.',

    # List items stay unindented so Markdown renders a list, not an indented code block
    hyperparameters='''
- learning_rate: Hyperopt (log-uniform from 1e-5 to 5e-4)
- epochs: 2, 3, or 4
- batch_size: 4, 8, or 16
- focal_loss gamma: 1.0 to 5.0
- label_smoothing: 0.0 to 0.2
''',

    # Plain string: there are no placeholders, so no f-prefix is needed
    speeds_sizes_times='''
- Total training time across all Hyperopt trials: ~4h 28m 07s
- Single final run training time: ~15 minutes (2 epochs) on a Kaggle P100 GPU
- Model size: ~400MB (including DeBERTa weights)
''',

    testing_data='We used the official ED dev set (~6k samples) for evaluation.',
    testing_metrics='''
- F1 (weighted)
- Accuracy
- Precision / Recall
''',

    # Incorporate final results and a short classification report summary
    results=f'''

**Final Model Results** (Dev Set):
- **eval_loss**: 0.3468
- **F1 (weighted)**: 0.8895
- **Accuracy**: 0.8871

**Epoch-by-Epoch Performance**:
| Epoch | Training Loss | Validation Loss | F1      | Accuracy |
|-------|--------------:|----------------:|--------:|---------:|
|   1   | 0.331800      | 0.314324        | 0.871924| 0.867195 |
|   2   | 0.181800      | 0.346780        | 0.889496| 0.887108 |

**Classification Report**:
- **Class 0** => precision=0.9493, recall=0.8915, f1=0.9195
- **Class 1** => precision=0.7554, recall=0.8756, f1=0.8111
- **Weighted Avg** => f1=0.8895, accuracy=0.8871

{best_hyperparams_str}
''',

    hardware_requirements='''
- GPU recommended (trained on a Kaggle P100)
- ~2GB storage for data & model artifacts
- ~16GB RAM
''',

    software='''
- Transformers 4.x
- PyTorch 1.x
- NLTK for synonym replacement
- Hyperopt for hyperparameter tuning
''',

    bias_risks_limitations='''Data augmentation may introduce synonyms that alter sentence context. 
      The model performance may degrade on domain-specific language 
      or out-of-vocabulary terms. Mitigation strategies may involve 
      domain adaptation or further data collection.''',

    additional_information='''All hyperparameters were chosen via TPE (Tree-structured Parzen Estimator) 
      with a maximum of 30 evaluations.'''
)

# ----------------------------------------------------------------------------
# 3. Write Model Card to Markdown File
# ----------------------------------------------------------------------------
# Write as UTF-8 explicitly so the result does not depend on the platform default
with open('C Model Card.md', 'w', encoding='utf-8') as model_card_file:
    model_card_file.write(card.content)

print("Model card generated: C Model Card.md")


Repo card metadata block was not found. Setting CardData to empty.


Model card generated: C Model Card.md
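
After generating the card, it can help to confirm that no template field was left unfilled. Hugging Face's default model card template substitutes "[More Information Needed]" for missing values; whether the COMP34812 template does the same is an assumption, so the marker is a parameter. `find_marker_lines` is a hypothetical helper written for this sketch.

```python
# Hypothetical helper for illustration; not part of huggingface_hub.
def find_marker_lines(text: str, marker: str = "[More Information Needed]") -> list[int]:
    """Return the 1-based numbers of the lines in `text` that contain `marker`."""
    return [
        line_number
        for line_number, line in enumerate(text.splitlines(), start=1)
        if marker in line
    ]


# Small stand-in for a rendered card; the real check would use card.content.
sample_card = "# Model card\n\n[More Information Needed]\n\nAll other fields filled."
print(find_marker_lines(sample_card))
```

To check the real card, pass `card.content` (or the text read back from `C Model Card.md`) instead of `sample_card`; an empty list means no marker was found.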
