<div align="center">

# Getting Started with Patra Model Card Toolkit

</div>

The Patra Toolkit is a component of the Patra ModelCards framework that simplifies documenting AI/ML models. It provides a structured schema that guides users through the essential information about a model: its purpose, development process, and performance. The toolkit also semi-automates the capture of key information, such as fairness and explainability metrics, through integrated analysis tools. By reducing the manual effort of writing model cards, it encourages researchers and developers to adopt documentation best practices, contributing to greater transparency and accountability in AI/ML development.

The Patra Toolkit embeds transparency and governance directly into the training workflow. Integrated scanners collect essential metadata—data sources, fairness metrics, and explainability insights—during model training and then generate a machine‑actionable JSON model card. These cards plug into the Patra Knowledge Base for rich queries on provenance, version history, and auditing. Flexible back‑ends publish models and artifacts to repositories such as Hugging Face or GitHub, automatically recording lineage links to trace every model’s evolution.


---

This Colab Notebook is a quickstart guide that helps you:
- Load and preprocess an example image (from a local file)
- Perform image classification with a pretrained ResNet50 model from PyTorch
- Generate a comprehensive Model Card using the Patra Toolkit

By the end of this tutorial, you will have a Model Card (in JSON format) that captures key metadata about your model and its predictions.


---

## 1. Environment Setup

In [None]:
!pip install torch torchvision patra_toolkit Pillow scikit-learn

In [1]:
import logging

import torch
import torch.nn.functional as F
import torchvision
from PIL import Image

from patra_toolkit import ModelCard, AIModel

logging.basicConfig(level=logging.INFO)



---
## 2. Load and Preprocess Image

We'll load an example camera-trap image from a local file, then apply the preprocessing that the pretrained ResNet50 weights expect:
- Resize to 256 pixels on the smaller side
- Center-crop to 224×224
- Convert to a tensor and normalize using the ImageNet statistics

---


In [2]:
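from torchvision import transforms

# Load the example image and force three RGB channels
image_path = "data/camera_trap_img.JPG"
image = Image.open(image_path).convert("RGB")

# ImageNet preprocessing expected by the pretrained ResNet50 weights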
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])

input_tensor = preprocess(image)
input_batch = input_tensor.unsqueeze(0)
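
To make the last two transforms concrete, here is a framework-free sketch of the per-channel arithmetic for a single pixel: `ToTensor` scales 8-bit values to the range 0 to 1, and `Normalize` then computes `(x - mean) / std` for each channel. The helper name `normalize_pixel` is hypothetical and exists only to illustrate the math.

```python
# ImageNet channel statistics, as used in the preprocessing pipeline above
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb):
    """Apply the ToTensor + Normalize arithmetic to one 8-bit RGB pixel (hypothetical helper)."""
    scaled = [value / 255.0 for value in rgb]  # ToTensor: map 0..255 to 0.0..1.0
    return [
        (channel - mean) / std  # Normalize: subtract the mean, divide by the std
        for channel, mean, std in zip(scaled, IMAGENET_MEAN, IMAGENET_STD)
    ]


# A pixel near the ImageNet mean color lands close to zero in every channel
print(normalize_pixel((124, 116, 104)))
```

After normalization, values are centered around zero, which is the input distribution the pretrained weights were trained on.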

---
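## 3. Model Loading and Inference

We load the pretrained ResNet50 model and run inference on the preprocessed image. No training happens in this notebook. We then convert the logits to probabilities, keep the five most likely classes, and fetch the category names from the weights metadata.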

---


In [3]:
weights = torchvision.models.ResNet50_Weights.DEFAULT
model = torchvision.models.resnet50(weights=weights)
model.eval()

# Perform inference
with torch.no_grad():
    output = model(input_batch)

# Convert logits to probabilities and keep the five most likely classes
probabilities = F.softmax(output[0], dim=0)
top5_prob, top5_catid = torch.topk(probabilities, 5)
categories = weights.meta["categories"]
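
The cell above stops at the class indices. As a rough illustration of what `F.softmax` and `torch.topk` compute, and of how indices map back to labels, here is a plain-Python sketch. The logits and the four-entry `categories` list are made-up toy values, not output from ResNet50, and `softmax` and `top_k` are hypothetical helpers.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probabilities, k):
    """Return (probability, index) pairs for the k largest entries, highest first."""
    ranked = sorted(enumerate(probabilities), key=lambda pair: pair[1], reverse=True)
    return [(prob, index) for index, prob in ranked[:k]]


# Toy values for illustration only
categories = ["deer", "fox", "raccoon", "empty"]
logits = [2.0, 0.5, 3.0, -1.0]

# Look up the label for each of the two most likely classes
for prob, index in top_k(softmax(logits), k=2):
    print(f"{categories[index]}: {prob:.3f}")
```

In the notebook itself, the equivalent lookup would index `categories` with each integer class ID in `top5_catid`.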

## 4. Building a Patra Model Card

### 4.1 Basic Model Card Setup
We start with the essential metadata: name, version, short and full descriptions, keywords, author, input type, category, and citation.

In [4]:
# Create a ModelCard instance
mc = ModelCard(
    name="ResNet50",
    version="1.0",
    short_description="A pretrained ResNet50 image classifier",
    full_description="This model card demonstrates using a pretrained ResNet50 model from PyTorch",
    keywords="resnet50, pytorch, image classification, patra, pretrained",
    author="0009-0009-9817-7042",
    input_type="Image",
    category="classification",
    foundational_model="None",
    citation="https://doi.org/10.48550/arXiv.1512.03385"
)

### 4.2 Attach AI Model Information
Here we describe the model itself: its owner, license, framework, model type, and test accuracy.

In [5]:
ai_model = AIModel(
    name="ResNet50",
    version="1.0",
    description="Pretrained ResNet50 model from torchvision for image classification.",
    owner="0009-0009-9817-7042",
    location="",  # will be updated after model submission
    license="BSD-3 Clause",
    framework="pytorch",
    model_type="cnn",
    test_accuracy=0.75
)

# Attach the AIModel to the ModelCard
mc.ai_model = ai_model
mc.populate_requirements()
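
Conceptually, the card is a nested record with the AI model embedded in it. The sketch below mirrors that shape with plain dictionaries and the standard `json` module, using only a subset of the field names that appear in the calls above. It is an illustration, not the toolkit's actual schema or serialization, and the `ai_model` nesting key is an assumption.

```python
import json

# Subset of the fields passed to AIModel(...) above
ai_model_fields = {
    "name": "ResNet50",
    "version": "1.0",
    "owner": "0009-0009-9817-7042",
    "license": "BSD-3 Clause",
    "framework": "pytorch",
    "model_type": "cnn",
    "test_accuracy": 0.75,
}

# Subset of the fields passed to ModelCard(...) above, with the model nested inside
model_card_fields = {
    "name": "ResNet50",
    "version": "1.0",
    "short_description": "A pretrained ResNet50 image classifier",
    "author": "0009-0009-9817-7042",
    "input_type": "Image",
    "category": "classification",
    "ai_model": ai_model_fields,  # assumed nesting key, for illustration
}

# Serialize the nested record to a JSON string
card_json = json.dumps(model_card_fields, indent=2)
print(card_json)
```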

---

## 5. Submission


In [6]:
mc.submit(patra_server_url="http://149.165.151.249:5002/",
          model=model,
          file_format="pt",
          model_store="huggingface",
          inference_labels="data/labels.txt",
          artifacts=["data/camera_trap_img.JPG"]
          )

INFO:root:Model card validation successful.
INFO:root:PID created: 0009-0009-9817-7042-resnet50-1.0
INFO:root:Model serialized successfully.
0009-0009-9817-7042-resnet50-1.0.pt: 100%|██████████| 103M/103M [00:07<00:00, 13.1MB/s] 
INFO:root:Model uploaded at: https://huggingface.co/patra-iu/0009-0009-9817-7042-resnet50-1.0/blob/main/0009-0009-9817-7042-resnet50-1.0.pt
INFO:root:Model card created.
INFO:root:Model card uploaded at: https://huggingface.co/patra-iu/0009-0009-9817-7042-resnet50-1.0/blob/main/model_card.json
- empty or missing yaml metadata in repo card
INFO:root:Inference labels uploaded at: https://huggingface.co/patra-iu/0009-0009-9817-7042-resnet50-1.0/blob/main/labels.txt
camera_trap_img.JPG: 100%|██████████| 1.74M/1.74M [00:00<00:00, 7.89MB/s]
INFO:root:Artifact 'data/camera_trap_img.JPG' uploaded at: https://huggingface.co/patra-iu/0009-0009-9817-7042-resnet50-1.0/blob/main/camera_trap_img.JPG
INFO:root:Model Card submitted successfully.


'success'
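
The PID in the log above looks like the author ID, the lowercased model name, and the version joined with hyphens. The helper below reproduces that pattern for this one example. It is inferred from a single log line, not taken from the toolkit's source, so treat `guess_pid` as hypothetical; the server may apply other rules.

```python
def guess_pid(author, name, version):
    """Rebuild the '<author>-<name>-<version>' pattern seen in the log (hypothetical)."""
    return f"{author}-{name.lower()}-{version}"


pid = guess_pid("0009-0009-9817-7042", "ResNet50", "1.0")
print(pid)  # 0009-0009-9817-7042-resnet50-1.0
```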


**[Optional] Tapis Authentication:**
This step applies only if your Patra server requires authentication. In that case, obtain a valid Tapis token with your TACC credentials before submitting; if you do not already have a TACC account, you can create one at [https://accounts.tacc.utexas.edu/begin](https://accounts.tacc.utexas.edu/begin). You can use the `authenticate()` method provided by the toolkit (or any other method) to obtain the token. When calling the submission methods, pass the token as the `tapis_token` parameter so that the Patra server can authenticate your request. If Tapis authentication isn't required for your scenario, set `tapis_token` to `None`.

The `mc.submit(...)` method can do one or more of the following:
1. **Submit only the card** (no model, no artifacts).
2. **Include the trained model** (uploading to Hugging Face or GitHub).
3. **Add artifacts** (such as data files, inference labels, or any additional resources).
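
As a sketch of how those three modes differ, the helper below assembles keyword arguments using only the parameter names from the `mc.submit(...)` call shown earlier. `build_submit_kwargs` is a hypothetical convenience function, and it assumes that `model`, `inference_labels`, and `artifacts` may simply be omitted when unused; check the toolkit documentation before relying on that.

```python
def build_submit_kwargs(server_url, model=None, file_format="pt",
                        model_store="huggingface", inference_labels=None,
                        artifacts=None):
    """Collect arguments for mc.submit(...), skipping whatever is not supplied (hypothetical)."""
    kwargs = {"patra_server_url": server_url}
    if model is not None:
        # Mode 2: upload the trained model to the chosen store
        kwargs.update(model=model, file_format=file_format, model_store=model_store)
    if inference_labels is not None:
        kwargs["inference_labels"] = inference_labels
    if artifacts:
        # Mode 3: attach extra files such as sample data
        kwargs["artifacts"] = list(artifacts)
    return kwargs


server = "http://149.165.151.249:5002/"

# Mode 1: card only
card_only = build_submit_kwargs(server)

# Modes 2 and 3 combined; the string stands in for a real model object
full = build_submit_kwargs(
    server,
    model="<trained model object>",
    inference_labels="data/labels.txt",
    artifacts=["data/camera_trap_img.JPG"],
)

print(sorted(card_only))
print(sorted(full))
# With a ModelCard instance in hand, the call would then be: mc.submit(**full)
```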

In [None]:
# tapis_token = mc.authenticate(username="neelk", password="****")
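
To keep the password out of the notebook, one option is to read credentials from environment variables before calling `authenticate()`. This is a minimal sketch: the variable names `TACC_USERNAME` and `TACC_PASSWORD` and the helper `load_tacc_credentials` are assumptions, not part of the toolkit.

```python
import os


def load_tacc_credentials(environ=os.environ):
    """Return (username, password) from the environment, or None if either is missing."""
    username = environ.get("TACC_USERNAME")  # assumed variable name
    password = environ.get("TACC_PASSWORD")  # assumed variable name
    if not username or not password:
        return None
    return username, password


tapis_token = None  # default: submit without Tapis authentication
credentials = load_tacc_credentials()
if credentials is None:
    print("No TACC credentials found; tapis_token stays None.")
else:
    username, password = credentials
    print(f"Found credentials for {username}.")
    # tapis_token = mc.authenticate(username=username, password=password)
```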