# Amazon SageMaker Model Governance - Model Card
This notebook walks you through the features of SageMaker Model Cards.

Amazon SageMaker model cards give you a centralized, customizable fact sheet for a model and are key to model governance. A model card can store both qualitative information (model owner, risk rating, etc.) and quantitative information (training and evaluation metrics) about the model.

In this example, we create a simple binary classification model and then store facts about the model in a model card. We also demonstrate useful model card operations: basic read, update, and delete operations; sharing the model card as an exported PDF file; and tracking the model card's editing history.

---
## Contents

1. [Setup](#Setup)
1. [Preparing a Binary Classification Model](#Model)
1. [Create Model Card](#ModelCard)
1. [Update Model Card](#Update)
1. [Load Model Card](#Load)
1. [List Model Card History](#ListHistory)
1. [Export Model Card](#Export)
1. [Cleanup](#Cleanup)

---
## Setup
Let's start by specifying:
- The IAM role ARN used to give training and hosting access to your data. See the documentation for how to create one. The following code uses the SageMaker execution role.
- The S3 bucket and prefix to use for the training data, model artifacts, and exported model card PDF. The bucket should be in the same region as the notebook instance, training, and hosting. The following code uses SageMaker's default S3 bucket (and creates one if it doesn't exist).
- The S3 client used to download and delete the exported model card PDF.
- The SageMaker session used in the model card APIs.

In [24]:
import boto3
from sagemaker.session import Session
from sagemaker import get_execution_role

# Create the session first; the bucket and region lookups below depend on it.
sagemaker_session = Session()
role = get_execution_role()

bucket = sagemaker_session.default_bucket()
prefix = "model-card-sample-notebook"

region = sagemaker_session.boto_region_name
s3 = boto3.client("s3", region_name=region)

Next, we'll import the Python libraries that we'll need for the remainder of the exercise.

In [33]:
import io
import os
import numpy as np
from urllib.parse import urlparse
import boto3
import sagemaker
from sagemaker.image_uris import retrieve
import sagemaker.amazon.common as smac
from sagemaker.model_card import (
    ModelCard,
    ModelOverview,
    ObjectiveFunction,
    Function,
    TrainingDetails,
    IntendedUses,
    EvaluationJob,
    AdditionalInformation,
    # Metric classes used when collecting evaluation details
    Metric,
    MetricGroup,
    ModelCardStatusEnum,
    ObjectiveFunctionEnum,
    FacetEnum,
    RiskRatingEnum,
    MetricTypeEnum,
)

---
## Preparing a Model<a name="Model"></a>
We will create an example model and collect facts about it in the following steps. The binary classification model is trained on a synthetic dataset. Each record is a tuple whose second element is the target variable.

### 1. Prepare the training data
The following code uploads the example data to your S3 bucket.

In [34]:
raw_data = (
    (0.5, 0),  (0.75, 0), (1.0, 0),  (1.25, 0), (1.50, 0),
    (1.75, 0), (2.0, 0),  (2.25, 1), (2.5, 0),  (2.75, 1),
    (3.0, 0),  (3.25, 1), (3.5, 0),  (4.0, 1),  (4.25, 1),
    (4.5, 1),  (4.75, 1), (5.0, 1),  (5.5, 1),
)
training_data = np.array(raw_data).astype("float32")
labels = training_data[:, 1]

buf = io.BytesIO()
smac.write_numpy_to_dense_tensor(buf, training_data, labels)
buf.seek(0)

# Build the S3 key with "/" explicitly; os.path.join would use "\" on Windows.
boto3.resource("s3").Bucket(bucket).Object(f"{prefix}/train").upload_fileobj(buf)
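
Before training, it can help to sanity-check the toy dataset. The sketch below uses only the standard library and a copy of the same tuples to count the examples per class. The helper name `class_counts` is illustrative and not part of the SageMaker SDK.

```python
from collections import Counter

# Same (feature, label) tuples as raw_data above, repeated so this block runs on its own.
sample_data = (
    (0.5, 0),  (0.75, 0), (1.0, 0),  (1.25, 0), (1.50, 0),
    (1.75, 0), (2.0, 0),  (2.25, 1), (2.5, 0),  (2.75, 1),
    (3.0, 0),  (3.25, 1), (3.5, 0),  (4.0, 1),  (4.25, 1),
    (4.5, 1),  (4.75, 1), (5.0, 1),  (5.5, 1),
)


def class_counts(rows):
    """Count how many rows carry each label (the second tuple element)."""
    return Counter(label for _, label in rows)


counts = class_counts(sample_data)
print(f"Total rows: {sum(counts.values())}")
print(f"Rows per class: {dict(counts)}")
```

A roughly balanced split, as here, means plain accuracy is a reasonable first metric for this demo.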

### 2. Train a model
Train a binary classification model with the training data from the previous step.

In [None]:
s3_train_data = f"s3://{bucket}/{prefix}/train"
output_location = f"s3://{bucket}/{prefix}/output"
container = retrieve("linear-learner", boto3.Session().region_name)
estimator = sagemaker.estimator.Estimator(
    container,
    role=role,
    instance_count=1,
    instance_type="ml.m4.xlarge",
    output_path=output_location,
    sagemaker_session=sagemaker_session,
)
estimator.set_hyperparameters(
    feature_dim=2, mini_batch_size=10, predictor_type="binary_classifier"
)
estimator.fit({"train": s3_train_data})

### 3. Create an endpoint
We use the estimator's `deploy()` method to create an endpoint that hosts the trained model.

In [None]:
endpoint_name = "model-card-example-test"
endpoint = estimator.deploy(
    initial_instance_count=1, instance_type="ml.m4.xlarge", endpoint_name=endpoint_name
)

model_name = endpoint._get_model_names()[0]
training_job_name = estimator.latest_training_job.name
print(f"Model name: {model_name}")
print(f"Training job name: {training_job_name}")
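
The endpoint name above is fixed, so rerunning the cell while an endpoint with that name still exists is likely to fail. One possible workaround is to append a timestamp. The helper `unique_endpoint_name` below is hypothetical, and the 63-character limit and allowed-character pattern reflect our understanding of SageMaker's endpoint naming rules; please check them against the API reference.

```python
import re
import time


def unique_endpoint_name(base, max_length=63):
    """Append a UTC timestamp to `base`, trimming `base` to fit `max_length`."""
    suffix = time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime())
    # Leave room for the "-" separator and the suffix.
    trimmed = base[: max_length - len(suffix) - 1].rstrip("-")
    return f"{trimmed}-{suffix}"


# Assumed naming pattern: alphanumerics separated by hyphens.
NAME_PATTERN = re.compile(r"[a-zA-Z0-9](-*[a-zA-Z0-9])*")

candidate = unique_endpoint_name("model-card-example-test")
long_candidate = unique_endpoint_name("a" * 100)
print(candidate)
```

The result could then be passed as `endpoint_name` to `estimator.deploy()`.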

---
## Create Model Card<a name="ModelCard"></a>
We now start collecting model facts into a model card.

### 1. Auto-collect Model Data (Model Overview) for Model Card
Automatically collect basic model information such as the model ID, inference environment, and model output S3 URL. You can also add other model facts, such as a description, problem type, algorithm type, model creator, and owner.

In [None]:
model_overview = ModelOverview.from_name(
    model_name=model_name,
    sagemaker_session=sagemaker_session,
    model_description="This is a simple binary classification model used for Model Card demo",
    problem_type="Binary Classification",
    algorithm_type="Logistic Regression",
    model_creator="DEMO-ModelCard",
    model_owner="DEMO-ModelCard",
)
print(model_overview.model_id)
print(model_overview.inference_environment.container_image)
print(model_overview.model_artifact)

### 2. Auto-collect Training Data (Training Details) for Model Card
Automatically collect basic training information such as the training job ID, training environment, and training metrics. You can also add other training facts, such as the training objective function and observations.

In [None]:
objective_function = ObjectiveFunction(
    function=Function(
        function=ObjectiveFunctionEnum.MINIMIZE,
        facet=FacetEnum.LOSS,
    ),
    notes="This is an example objective function.",
)
training_details = TrainingDetails.from_model_overview(
    model_overview=model_overview,
    sagemaker_session=sagemaker_session,
    objective_function=objective_function,
    training_observations="Additional training observations could be put here.",
)
print(training_details.training_job_details.training_arn)
print(training_details.training_job_details.training_environment.container_image)
print([{"name": i.name, "value": i.value} for i in training_details.training_job_details.training_metrics])

### 3. Collect evaluation details
Add evaluation observations, datasets, and metrics.

In [None]:
my_metric_group = MetricGroup(
    name="binary classification metrics",
    metric_data=[Metric(name="accuracy", type=MetricTypeEnum.NUMBER, value=0.5)]
)
evaluation_details = [
    EvaluationJob(
        name="Example evaluation job",
        evaluation_observation="Evaluation observations.",
        datasets=["s3://path/to/evaluation/data"],
        metric_groups=[my_metric_group],
    )
]
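
The accuracy value of 0.5 above is a placeholder. As a minimal sketch of how such a value could be computed, the block below compares held-out labels with predictions in plain Python. The function `accuracy` and the lists `y_true` and `y_pred` are hypothetical and the numbers are made up for illustration; they are not outputs of the model trained in this notebook.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")
    correct = sum(1 for truth, guess in zip(y_true, y_pred) if truth == guess)
    return correct / len(y_true)


# Illustrative labels and predictions only.
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]

example_accuracy = accuracy(y_true, y_pred)
print(f"accuracy = {example_accuracy}")
```

The resulting float could be supplied as the `value` of a `Metric` in place of the hard-coded number.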

### 4. Collect other facts

In [None]:
intended_uses = IntendedUses(
    purpose_of_model="Test model card.",
    intended_uses="Not used except this test.",
    factors_affecting_model_efficiency="No.",
    risk_rating=RiskRatingEnum.LOW,
    explanations_for_risk_rating="Just an example.",
)
additional_information = AdditionalInformation(
    ethical_considerations="Your model's ethical considerations.",
    caveats_and_recommendations="Your model's caveats and recommendations.",
    custom_details={"custom details1": "details value"},
)

### 5. Initialize a Model Card
Initialize a model card with all the facts collected before.

In [None]:
model_card_name = "sample-notebook-model-card"
my_card = ModelCard(
    name=model_card_name,
    status=ModelCardStatusEnum.DRAFT,
    model_overview=model_overview,
    training_details=training_details,
    intended_uses=intended_uses,
    evaluation_details=evaluation_details,
    additional_information=additional_information,
    sagemaker_session=sagemaker_session,
)
my_card.create()
print(f"Model card {my_card.name} was created successfully with ARN {my_card.arn}")

---
## Update Model Card<a name="Update"></a>

In [None]:
my_card.model_overview.model_description = "the model is updated."
my_card.update()

---
## Load Model Card<a name="Load"></a>
Load an existing model card with the model card name.

In [None]:
my_card2 = ModelCard.load(
    name=model_card_name,
    sagemaker_session=sagemaker_session,
)

---
## List Model Card History<a name="ListHistory"></a>
Track the model card history by listing historical versions.

In [None]:
my_card.get_version_history()
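
The raw history can be verbose. The sketch below condenses it into one line per version. It assumes each history entry is a dictionary carrying `ModelCardVersion` and `ModelCardStatus` keys, as in the `ListModelCardVersions` API response; inspect the actual return value before relying on this. The helper `summarize_versions` and the sample data are hypothetical.

```python
def summarize_versions(history):
    """Return 'v<version>: <status>' strings sorted by version number."""
    # Assumed keys: "ModelCardVersion" (int) and "ModelCardStatus" (str).
    ordered = sorted(history, key=lambda entry: entry["ModelCardVersion"])
    return [
        f"v{entry['ModelCardVersion']}: {entry['ModelCardStatus']}"
        for entry in ordered
    ]


# Made-up entries standing in for the real history.
sample_history = [
    {"ModelCardVersion": 2, "ModelCardStatus": "Draft"},
    {"ModelCardVersion": 1, "ModelCardStatus": "Draft"},
]

summary = summarize_versions(sample_history)
for line in summary:
    print(line)
```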

---
## Export Model Card<a name="Export"></a>
Share the model card by exporting it to a PDF file.

### 1. Create an export job

In [None]:
s3_output_path = f"s3://{bucket}/{prefix}/export"
pdf_s3_url = my_card.export_pdf(s3_output_path=s3_output_path)

### (optional) List export jobs
Check all the export jobs for this model card.

In [None]:
my_card.list_export_jobs()

### 2. Download the exported Model Card PDF
The downloaded PDF is stored in the same directory as this notebook by default.

#### Parse the bucket and key of the exported PDF

In [None]:
parsed_url = urlparse(pdf_s3_url)
pdf_bucket = parsed_url.netloc
pdf_key = parsed_url.path.lstrip("/")
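
To see what the parsing step produces, here is a self-contained sketch that applies the same `urlparse` logic to a made-up S3 URL and adds a basic validity check. The helper `split_s3_url` and the example URL are hypothetical.

```python
from urllib.parse import urlparse


def split_s3_url(s3_url):
    """Split 's3://bucket/key' into a (bucket, key) pair."""
    parsed = urlparse(s3_url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URL: {s3_url!r}")
    # The path starts with "/", which is not part of the object key.
    return parsed.netloc, parsed.path.lstrip("/")


example_bucket, example_key = split_s3_url(
    "s3://example-bucket/model-card-sample-notebook/export/card.pdf"
)
print(example_bucket)
print(example_key)
```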

#### Download

In [10]:
file_name = parsed_url.path.split("/")[-1]
s3.download_file(Filename=file_name, Bucket=pdf_bucket, Key=pdf_key)
print(f"{file_name} is downloaded successfully.")
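
After downloading, a quick check that the file really is a PDF can catch a wrong key or a truncated transfer. PDF files begin with the bytes `%PDF-`, so a minimal sketch is to read the first five bytes. The helper `looks_like_pdf` is hypothetical, and the demonstration uses temporary files rather than the exported model card.

```python
import os
import tempfile


def looks_like_pdf(path):
    """Return True if the file starts with the PDF magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(5) == b"%PDF-"


# Demonstrate on two throwaway files instead of a real export.
with tempfile.TemporaryDirectory() as tmp_dir:
    good_path = os.path.join(tmp_dir, "good.pdf")
    bad_path = os.path.join(tmp_dir, "bad.pdf")
    with open(good_path, "wb") as handle:
        handle.write(b"%PDF-1.7\n")
    with open(bad_path, "wb") as handle:
        handle.write(b"<html>error</html>")
    good_result = looks_like_pdf(good_path)
    bad_result = looks_like_pdf(bad_path)

print(good_result, bad_result)
```

In the notebook, the same check could be run as `looks_like_pdf(file_name)`.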

---
## Cleanup<a name="Cleanup"></a>
The following resources will be deleted:
1. The model card
2. The exported model card PDF
3. The binary classification model
4. The endpoint and its endpoint configuration

In [48]:
my_card.delete()

s3.delete_object(Bucket=pdf_bucket, Key=pdf_key)

endpoint.delete_model()
endpoint.delete_endpoint()
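
If one cleanup call fails (for example, because a resource was already deleted), the calls after it never run. One possible pattern, sketched below, runs every step and reports failures at the end. The helper `run_cleanup` is hypothetical, and the demonstration uses stand-in functions instead of real AWS calls.

```python
def run_cleanup(steps):
    """Run each (label, action) pair; return a list of (label, error) failures."""
    failures = []
    for label, action in steps:
        try:
            action()
            print(f"Deleted: {label}")
        except Exception as error:  # keep going so later steps still run
            failures.append((label, error))
            print(f"Failed to delete {label}: {error}")
    return failures


# Stand-ins for the real delete calls.
deleted = []


def fail_missing():
    raise RuntimeError("resource not found")


demo_failures = run_cleanup(
    [
        ("model card", lambda: deleted.append("model card")),
        ("exported PDF", fail_missing),
        ("endpoint", lambda: deleted.append("endpoint")),
    ]
)

# In this notebook, the steps might look like:
# [("model card", my_card.delete),
#  ("exported PDF", lambda: s3.delete_object(Bucket=pdf_bucket, Key=pdf_key)),
#  ("model", endpoint.delete_model),
#  ("endpoint", endpoint.delete_endpoint)]
```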