## Amazon SageMaker Model Governance - Model Cards
This notebook walks you through the features of Amazon SageMaker Model Cards. For more information, see [Model Cards in the Amazon SageMaker Developer Guide](https://docs.aws.amazon.com/sagemaker/latest/dg/model-cards.html).

Amazon SageMaker Model Cards let you create a centralized, customizable fact sheet that documents critical details about your machine learning (ML) models. Use model cards to record model information such as intended uses, risk ratings, training details, and evaluation metrics, which streamlines governance and reporting.

In this example, you create a binary classification model along with a model card to document model details along the way. Learn how to create, read, update, delete, and export model cards using the Amazon SageMaker Python SDK.

## Setup
To begin, you must specify the following information:

* The IAM role ARN used to give SageMaker training and hosting access to your data. 
* The SageMaker session used to manage interactions with Amazon SageMaker Model Card API methods.
* The S3 URI (bucket and prefix) where you want to store training artifacts, models, and any exported model card PDFs. This S3 bucket should be in the same Region as your Notebook Instance, training, and hosting configurations. The following example uses the default SageMaker S3 bucket and creates a default SageMaker S3 bucket if one does not already exist.
* The S3 session used to manage interactions with Amazon S3 storage.


In [2]:
import boto3
from sagemaker.session import Session
from sagemaker import get_execution_role

role = get_execution_role()

sagemaker_session = Session()

bucket = sagemaker_session.default_bucket()
prefix = "data/kkbox-customer-churn-model/model-card"

region = sagemaker_session.boto_region_name
s3 = boto3.client("s3", region_name=region)
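
The setup notes above ask that the S3 bucket be in the same Region as the rest of your SageMaker resources. A minimal sketch of how that could be checked is shown below. The helper names `bucket_region` and `check_same_region` are hypothetical and not part of any SDK; the only AWS call used is the boto3 S3 client method `get_bucket_location`.

```python
def bucket_region(s3_client, bucket: str) -> str:
    """Return the Region a bucket lives in, given a boto3 S3 client."""
    location = s3_client.get_bucket_location(Bucket=bucket)["LocationConstraint"]
    # S3 reports us-east-1 as None and, for some legacy buckets, eu-west-1 as "EU".
    if location is None:
        return "us-east-1"
    if location == "EU":
        return "eu-west-1"
    return location


def check_same_region(s3_client, bucket: str, expected_region: str) -> None:
    """Raise ValueError if the bucket is not in the expected Region."""
    actual = bucket_region(s3_client, bucket)
    if actual != expected_region:
        raise ValueError(
            f"Bucket {bucket!r} is in {actual}, but the session uses {expected_region}"
        )
```

With the variables defined in the cell above, the check would be `check_same_region(s3, bucket, region)`.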

In [3]:
import io
import os
import numpy as np
from urllib.parse import urlparse
from pprint import pprint
import sagemaker
from sagemaker.image_uris import retrieve
import sagemaker.amazon.common as smac
from sagemaker.model_card import (
    ModelCard,
    ModelOverview,
    ObjectiveFunction,
    Function,
    TrainingDetails,
    IntendedUses,
    EvaluationJob,
    AdditionalInformation,
    Metric,
    MetricGroup,
    ModelCardStatusEnum,
    ObjectiveFunctionEnum,
    FacetEnum,
    RiskRatingEnum,
    MetricTypeEnum,
    EvaluationMetricTypeEnum,
)

## Prepare a Model
The following code creates an example binary classification model trained on a synthetic dataset. The target variable (0 or 1) is the second variable in the tuple.

## Create Model Card
Document your binary classification model details in an Amazon SageMaker Model Card using the SageMaker Python SDK.

### Auto-collect model details
Automatically collect basic model information like model ID, training environment, and the model output S3 URI. Add additional model information such as a description, problem type, algorithm type, model creator, and model owner.

In [4]:
model_name = "pipelines-wwlincfgf0hc-CreateModel-CreateMo-XTpLtCrP4o"
model_overview = ModelOverview.from_model_name(
    model_name=model_name,
    sagemaker_session=sagemaker_session,
    model_description="An XGBoost model used for predicting customer churn.",
    problem_type="Binary Classification",
    algorithm_type="XGBoost",
    model_creator="weteh",
    model_owner="amazon-aws",
)
print(f"Model id: {model_overview.model_id}")
print(f"Model inference images: {model_overview.inference_environment.container_image}")
print(f"Model artifact: {model_overview.model_artifact}")

Model id: arn:aws:sagemaker:us-east-1:602900100639:model/pipelines-wwlincfgf0hc-createmodel-createmo-xtpltcrp4o
Model inference images: ['683313688378.dkr.ecr.us-east-1.amazonaws.com/sagemaker-xgboost:1.5-1']
Model artifact: ['s3://sagemaker-us-east-1-602900100639/data/kkbox-customer-churn-model/output/pipelines-wwlincfgf0hc-TrainModel-GGbGlmduOZ/output/model.tar.gz']
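
The model ID printed above is an ARN, which packs the Region, account ID, and resource name into one colon-separated string. As a minimal sketch, those fields can be split out with the standard library, for example to confirm that the model lives in the same Region as your bucket. The `Arn` tuple and the `parse_arn` helper are hypothetical names used for illustration and are not part of the SageMaker SDK.

```python
from typing import NamedTuple


class Arn(NamedTuple):
    partition: str
    service: str
    region: str
    account_id: str
    resource_type: str
    resource_name: str


def parse_arn(arn: str) -> Arn:
    """Split an ARN of the form arn:partition:service:region:account:type/name."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"Not a valid ARN: {arn!r}")
    _, partition, service, region, account_id, resource = parts
    # The resource part of a SageMaker model ARN looks like "model/<name>".
    resource_type, _, resource_name = resource.partition("/")
    return Arn(partition, service, region, account_id, resource_type, resource_name)


model_arn = (
    "arn:aws:sagemaker:us-east-1:602900100639:"
    "model/pipelines-wwlincfgf0hc-createmodel-createmo-xtpltcrp4o"
)
parsed = parse_arn(model_arn)
print(parsed.region, parsed.resource_type, parsed.resource_name)
```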


### Auto-collect training details
Automatically collect basic training information like training ID, training environment, and training metrics. 
Add additional training information such as objective function details and training observations.

In [6]:
objective_function = ObjectiveFunction(
    function=Function(
        function=ObjectiveFunctionEnum.MINIMIZE,
        facet=FacetEnum.LOSS,
    ),
    notes="This objective function is used for minimizing training loss.",
)
training_details = TrainingDetails.from_model_overview(
    model_overview=model_overview,
    sagemaker_session=sagemaker_session,
    objective_function=objective_function,
    training_observations="The model achieves a significantly higher AUC score than a Random Forest model.",
)
print(f"Training job id: {training_details.training_job_details.training_arn}")
print(f"Training image: {training_details.training_job_details.training_environment.container_image}")
print("Training Metrics: ")
pprint(
    [
        {"name": i.name, "value": i.value}
        for i in training_details.training_job_details.training_metrics
    ]
)

Training job id: arn:aws:sagemaker:us-east-1:602900100639:training-job/pipelines-wwlincfgf0hc-TrainModel-GGbGlmduOZ
Training image: ['683313688378.dkr.ecr.us-east-1.amazonaws.com/sagemaker-xgboost:1.5-1']
Training Metrics: 
[]


### Collect training job evaluation metrics and create an evaluation detail

In [7]:
manual_metric_group = MetricGroup(
    name="binary classification metrics",
    metric_data=[
        Metric(name="accuracy", type=MetricTypeEnum.NUMBER, value=82.92),
        Metric(name="auc_score", type=MetricTypeEnum.NUMBER, value=0.88),
    ],
)
example_evaluation_job = EvaluationJob(
    name="Example evaluation job",
    evaluation_observation="Evaluation observations.",
    datasets=["s3://path/to/evaluation/data"],
    metric_groups=[manual_metric_group],
)
evaluation_details = [example_evaluation_job]
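
The `accuracy` and `auc_score` values above are entered by hand. One way such numbers could be produced is sketched below in plain Python. The labels, scores, and the 0.5 threshold are made-up illustration data, not results from the churn model, and the helper names are hypothetical. Accuracy is returned as a percentage and AUC as a fraction, matching the scales used in the metric group.

```python
from typing import Sequence


def accuracy_percent(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> float:
    """Share of examples whose thresholded score matches the label, in percent."""
    correct = sum(int(s >= threshold) == y for y, s in zip(labels, scores))
    return 100.0 * correct / len(labels)


def auc_score(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win. This is the pairwise O(n^2) definition,
    which is fine for small samples.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and predicted probabilities for six customers.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]
print(accuracy_percent(labels, scores), auc_score(labels, scores))
```

The resulting floats could then be passed as the `value` argument of `Metric`, as in the cell above.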

### Parse Model Monitoring / Clarify job for evaluation data

In [8]:
metric_s3_url = "s3://sagemaker-us-east-1-602900100639/data/kkbox-customer-churn-model/bias/qc3a28gjs7gy/modelbiascheckstep/analysis.json"
example_evaluation_job.add_metric_group_from_s3(
    session=sagemaker_session.boto_session,
    s3_url=metric_s3_url,
    metric_type=EvaluationMetricTypeEnum.CLARIFY_BIAS,
)

Invalid file type binary/octet-stream. application/json is expected.
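
The message above suggests that the analysis file is stored in S3 with a `binary/octet-stream` content type, while the SDK version used here expects `application/json`. One possible workaround, sketched under that assumption, is to copy the object onto itself with corrected metadata before calling `add_metric_group_from_s3` again. The helper name `set_json_content_type` is hypothetical; `copy_object` with `MetadataDirective="REPLACE"` is the boto3 S3 call for rewriting an object's metadata.

```python
from urllib.parse import urlparse


def set_json_content_type(s3_client, s3_url: str) -> dict:
    """Rewrite an S3 object's Content-Type to application/json via an in-place copy.

    s3_client is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    parsed = urlparse(s3_url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Expected an s3:// URL, got {s3_url!r}")
    bucket, key = parsed.netloc, parsed.path.lstrip("/")
    return s3_client.copy_object(
        Bucket=bucket,
        Key=key,
        CopySource={"Bucket": bucket, "Key": key},
        ContentType="application/json",
        # REPLACE is needed so the new Content-Type overrides the stored one.
        MetadataDirective="REPLACE",
    )
```

With the `s3` client and `metric_s3_url` from the cells above, the call would be `set_json_content_type(s3, metric_s3_url)`. Note that `REPLACE` also discards any user-defined metadata stored on the object.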


### Collect additional details

In [9]:
intended_uses = IntendedUses(
    purpose_of_model="Used for predicting customer churn.",
    intended_uses="Predict customer churn.",
    factors_affecting_model_efficiency="No.",
    risk_rating=RiskRatingEnum.LOW,
    explanations_for_risk_rating="No known risks.",
)
additional_information = AdditionalInformation(
    ethical_considerations="Your model's ethical considerations.",
    caveats_and_recommendations="Your model's caveats and recommendations.",
    custom_details={"custom details1": "details value"},
)

## Create a Model Card

In [11]:
model_card_name = "kkbox-customer-churn-model"
my_card = ModelCard(
    name=model_card_name,
    status=ModelCardStatusEnum.DRAFT,
    model_overview=model_overview,
    training_details=training_details,
    intended_uses=intended_uses,
    evaluation_details=evaluation_details,
    additional_information=additional_information,
    sagemaker_session=sagemaker_session,
)
my_card.create()
print(f"Model card {my_card.name} is successfully created with id {my_card.arn}")

Model card kkbox-customer-churn-model is successfully created with id arn:aws:sagemaker:us-east-1:602900100639:model-card/kkbox-customer-churn-model


## Update Model Card

In [27]:
my_card.model_overview.model_description = "the model is updated."
my_card.update()

{'content': {'ModelCardArn': 'arn:aws:sagemaker:us-east-2:869530972998:model-card/demo-model-card-from-sagemaker',
  'ResponseMetadata': {'RequestId': '08c5132c-7a03-4df1-8a25-58a2c440551b',
   'HTTPStatusCode': 200,
   'HTTPHeaders': {'x-amzn-requestid': '08c5132c-7a03-4df1-8a25-58a2c440551b',
    'content-type': 'application/x-amz-json-1.1',
    'content-length': '101',
    'date': 'Tue, 03 Jan 2023 22:26:26 GMT'},
   'RetryAttempts': 0}}}

## Load a Model Card

In [12]:
my_card2 = ModelCard.load(
    name=model_card_name,
    sagemaker_session=sagemaker_session,
)
print(f"Model id: {my_card2.arn}")
print(f"Model description: {my_card2.model_overview.model_description}")

Model id: arn:aws:sagemaker:us-east-1:602900100639:model-card/kkbox-customer-churn-model
Model description: An XGBoost model used for predicting customer churn.


## Export a Model Card

In [14]:
s3_output_path = f"s3://{bucket}/{prefix}/export"
pdf_s3_url = my_card.export_pdf(s3_output_path=s3_output_path)

          

## Download Model Card

In [15]:
parsed_url = urlparse(pdf_s3_url)
pdf_bucket = parsed_url.netloc
pdf_key = parsed_url.path.lstrip("/")

file_name = parsed_url.path.split("/")[-1]
s3.download_file(Filename=file_name, Bucket=pdf_bucket, Key=pdf_key)
print(f"{file_name} is downloaded to \n{os.path.join(os.getcwd(), file_name)}")

kkbox-customer-churn-model-1674970037-b007.pdf is downloaded to 
/root/end_to_end_sagemaker/kkbox-customer-churn-model-1674970037-b007.pdf
