# Step 4: Add a model building CI/CD pipeline

In this step you create an automated CI/CD pipeline for model building using [Amazon SageMaker Projects](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-projects.html). 

![](img/six-steps-4.png)

You are going to use a [SageMaker-provided MLOps project template for model building and training](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-projects-templates-sm.html#sagemaker-projects-templates-code-commit) to provision a CI/CD workflow automation with [AWS CodePipeline](https://aws.amazon.com/codepipeline/) and an [AWS CodeCommit](https://aws.amazon.com/codecommit/) code repository.

SageMaker project templates offer you the following choice of code repositories, workflow automation tools, and pipeline stages:
- **Code repository**: AWS CodeCommit or third-party Git repositories such as GitHub and Bitbucket
- **CI/CD workflow automation**: AWS CodePipeline or Jenkins
- **Pipeline stages**: Model building and training, model deployment, or both

<div class="alert alert-info"> Make sure you are using the <code>Python 3</code> kernel in JupyterLab for this notebook.</div>

In [13]:
import boto3
import sagemaker 
from time import gmtime, strftime, sleep

In [56]:
%store -r 

%store

try:
    initialized
except NameError:
    print("+++++++++++++++++++++++++++++++++++++++++++++++++")
    print("[ERROR] YOU HAVE TO RUN 00-start-here notebook   ")
    print("+++++++++++++++++++++++++++++++++++++++++++++++++")

Stored variables and their in-db values:
baseline_s3_url                        -> 's3://sagemaker-us-east-1-446490546198/mlops-pk/xg
bucket_name                            -> 'sagemaker-us-east-1-446490546198'
bucket_prefix                          -> 'mlops-pk/xgboost'
dataset_file_local_path                -> 'data/bank-additional/bank-additional-full.csv'
domain_id                              -> 'd-fkrqwhxrpnws'
evaluation_s3_url                      -> 's3://sagemaker-us-east-1-446490546198/mlops-pk/xg
initialized                            -> True
input_s3_url                           -> 's3://sagemaker-us-east-1-446490546198/mlops-pk/xg
mlflow_arn                             -> 'arn:aws:sagemaker:us-east-1:446490546198:mlflow-t
mlflow_name                            -> 'mlflow-d-fkrqwhxrpnws-20-05-26-15'
model_package_group_name               -> 'mlops-pk-proj-1-pipeline-model-24-19-09-00'
output_s3_url                          -> 's3://sagemaker-us-east-1-446490546198/mlops-p

In [15]:
%load_ext autoreload
%autoreload 2

## Create an MLOps project
⭐ You can create a project programmatically in this notebook - **Option 1** or in Studio UI - **Option 2**.

Option 1 is recommended because it requires no manual input and does not depend on the Studio UI.<br/>
Option 2 is given to demonstrate [**Create Project** UI flow](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-projects-create.html).

### Option 1: Create project programmatically
In this section you use `boto3` to create an MLOps project via a SageMaker API.

In [16]:
sm = boto3.client("sagemaker")
sc = boto3.client("servicecatalog")

sc_provider_name = "Amazon SageMaker"
sc_product_name = "MLOps template for model building and training"

In [17]:
p_ids = [p['ProductId'] for p in sc.search_products(
    Filters={
        'FullTextSearch': [sc_product_name]
    },
)['ProductViewSummaries'] if p["Name"]==sc_product_name]

In [18]:
p_ids

['prod-53ibyqbj2cgmo']

In [19]:
# If this code raises an exception, go to Option 2 and create the project in the Studio UI
if not len(p_ids):
    raise Exception("No Amazon SageMaker ML Ops products found!")
elif len(p_ids) > 1:
    raise Exception("Too many matching Amazon SageMaker ML Ops products found!")
else:
    product_id = p_ids[0]
    print(f"ML Ops product id: {product_id}")

ML Ops product id: prod-53ibyqbj2cgmo


In [20]:
provisioning_artifact_id = sorted(
    [i for i in sc.list_provisioning_artifacts(
        ProductId=product_id
    )['ProvisioningArtifactDetails'] if i['Guidance']=='DEFAULT'],
    key=lambda d: d['Name'], reverse=True)[0]['Id']

In [21]:
provisioning_artifact_id

'pa-dxonbcmsdfyq6'
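
The filter-and-sort expression above keeps the artifacts marked `DEFAULT` and takes the one whose name sorts last. The following is a minimal sketch of the same selection on made-up records. The IDs and names are hypothetical, not real Service Catalog values, and `latest_default_artifact_id` is an illustrative helper, not part of `boto3`.

```python
# Hypothetical records shaped like the 'ProvisioningArtifactDetails' entries
# returned by list_provisioning_artifacts; all values are made up.
artifacts = [
    {"Id": "pa-aaa", "Name": "v1.0", "Guidance": "DEFAULT"},
    {"Id": "pa-bbb", "Name": "v1.2", "Guidance": "DEPRECATED"},
    {"Id": "pa-ccc", "Name": "v1.1", "Guidance": "DEFAULT"},
]


def latest_default_artifact_id(details):
    """Return the Id of the DEFAULT artifact whose Name sorts last."""
    defaults = [d for d in details if d["Guidance"] == "DEFAULT"]
    if not defaults:
        raise ValueError("No provisioning artifact with DEFAULT guidance found")
    return max(defaults, key=lambda d: d["Name"])["Id"]


print(latest_default_artifact_id(artifacts))
```

Names are compared as strings, so a name such as `v1.10` sorts before `v1.9`. If the version names in your account can reach two digits, a numeric sort key may be safer.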

In [22]:
project_name = f"pk-xg-model-build-{strftime('%m-%d-%H-%M-%S', gmtime())}"
project_parameters = [] # This SageMaker built-in project template doesn't have any parameters

Finally, create a SageMaker project from the service catalog product template:

In [23]:
# create SageMaker project
r = sm.create_project(
    ProjectName=project_name,
    ProjectDescription="Model build project",
    ServiceCatalogProvisioningDetails={
        'ProductId': product_id,
        'ProvisioningArtifactId': provisioning_artifact_id,
    },
)

print(r)
project_id = r["ProjectId"]

{'ProjectArn': 'arn:aws:sagemaker:us-east-1:446490546198:project/pk-xg-model-build-09-24-16-33-20', 'ProjectId': 'p-ywqvbykzxugd', 'ResponseMetadata': {'RequestId': '2ea332e2-1804-406e-bf30-7dcb5223dbbb', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': '2ea332e2-1804-406e-bf30-7dcb5223dbbb', 'content-type': 'application/x-amz-json-1.1', 'content-length': '127', 'date': 'Tue, 24 Sep 2024 16:33:30 GMT'}, 'RetryAttempts': 0}}


<div class="alert alert-info"> 💡 <strong> Wait until project creation is completed by running the next cell</strong>
</div>




In [24]:
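# Project creation takes about 3-5 min
while (status := sm.describe_project(ProjectName=project_name)['ProjectStatus']) != 'CreateCompleted':
    # Stop instead of waiting forever if the creation fails
    if status == 'CreateFailed':
        raise RuntimeError(f"MLOps project {project_name} creation failed")
    print("Waiting for project creation completion")
    sleep(10)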
    
print(f"MLOps project {project_name} creation completed")

Waiting for project creation completion
Waiting for project creation completion
Waiting for project creation completion
Waiting for project creation completion
Waiting for project creation completion
Waiting for project creation completion
Waiting for project creation completion
Waiting for project creation completion
Waiting for project creation completion
MLOps project pk-xg-model-build-09-24-16-33-20 creation completed
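
A polling loop like the one above can also be given a timeout, so that the cell does not block indefinitely. The following is a minimal sketch under that assumption. `wait_for_status` is a hypothetical helper, and the status sequence is a stand-in for the values returned by `sm.describe_project(...)['ProjectStatus']`.

```python
import time


def wait_for_status(get_status, target, failed=(), timeout_s=900, poll_s=10):
    """Poll get_status() until it returns target, a failed state, or times out."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status == target:
            return status
        if status in failed:
            raise RuntimeError(f"Reached failure status: {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still {status} after {timeout_s} seconds")
        time.sleep(poll_s)


# Stand-in status sequence, used for illustration only
fake_statuses = iter(["Pending", "CreateInProgress", "CreateCompleted"])
print(
    wait_for_status(
        lambda: next(fake_statuses),
        "CreateCompleted",
        failed=("CreateFailed",),
        poll_s=0,
    )
)
```

In the notebook you would pass `lambda: sm.describe_project(ProjectName=project_name)['ProjectStatus']` as `get_status` and keep the default polling interval.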




### End of Option 1: Create project programmatically
Now you have provisioned a project template in your SageMaker environment. Navigate to the section **Configure the MLOps project**.

---

### Option 2: Create a project in Studio UI
<div class="alert alert-info"> 💡 <strong> Skip this section if you created a project programmatically </strong>
</div>

Follow the instructions in the Developer Guide – [Create a MLOps Project using Amazon SageMaker Studio or Studio Classic](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-projects-create.html). Choose the **Studio** option.

For the template, choose **Model building and training**.
In **Project details**, provide a name and, optionally, a project description. This template doesn't have any parameters.

Choose **Create** and wait for the project to appear in the Projects list.

### Resolve issues with project creation

#### Project creation process stuck in pending
If the project creation banner is still displayed after 5 minutes, close the Studio browser window and sign in to Studio again.

![](img/project-creation-pending.png)

#### Error messages
❗ If you see an error message similar to:
```
Your project couldn't be created
Studio encountered an error when creating your project. Try recreating the project again.

CodeBuild is not authorized to perform: sts:AssumeRole on arn:aws:iam::XXXX:role/service-role/AmazonSageMakerServiceCatalogProductsCodeBuildRole (Service: AWSCodeBuild; Status Code: 400; Error Code: InvalidInputException; Request ID: 4cf59a54-0c59-476a-a970-0ac656db4402; Proxy: null)
```

see steps 5-6 of [SageMaker Studio Permissions Required to Use Projects](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-projects-studio-updates.html). Make sure you have all required project roles listed in the **Apps** card under **Projects**. 

Alternatively, you can create the required roles by using the provided CloudFormation template [`cfn-templates/sagemaker-project-templates-roles.yaml`](cfn-templates/sagemaker-project-templates-roles.yaml). 
Run the following command from the directory of the repository clone, in a terminal where you have the permissions to create IAM roles:

```sh
aws cloudformation deploy \
    --template-file cfn-templates/sagemaker-project-templates-roles.yaml \
    --stack-name sagemaker-project-template-roles \
    --capabilities CAPABILITY_IAM CAPABILITY_NAMED_IAM \
    --parameter-overrides \
    CreateCloudFormationRole=YES \
    CreateCodeBuildRole=YES \
    CreateCodePipelineRole=YES \
    CreateEventsRole=YES \
    CreateProductsExecutionRole=YES 
```

### End of Option 2: Create a project in Studio UI
Now that you have created the project, move to the section **Configure the MLOps project**.

---

## Configure the MLOps project
As soon as the project is created, it automatically runs a default model building pipeline. This pipeline is a sample placeholder for your own custom pipeline. Ignore the default pipeline for the moment.

The project template deploys the following architecture in your AWS account:

![](img/mlops-model-build-train.png)

The main components are:
1. The project template is made available through SageMaker Projects and AWS Service Catalog portfolio
2. A CodePipeline pipeline with two stages - `Source` to download the source code from a CodeCommit repository and `Build` to create and execute a SageMaker pipeline
3. A default SageMaker pipeline with model build, train, and register workflow
4. A seed code repository in CodeCommit that contains default placeholder code

This project contains all the code and infrastructure required to implement an automated CI/CD pipeline from a pre-defined template. 
To start using the project with your pipeline, you need to complete the following steps:
1. Clone the project CodeCommit repository to your notebook EBS volume
2. Replace the ML pipeline template sample code with your actual pipeline construction code, as implemented in the step 3 notebook
3. Modify the `codebuild-buildspec.yml` file to reference the correct Python module name and to set project parameters

The next sections guide you through these steps. For detailed instructions and a hands-on example, refer to the Developer Guide [SageMaker MLOps Project Walkthrough](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-projects-walkthrough.html).

If you used Option 1 (`boto3`) to create the MLOps project, `project_name` and `project_id` are already set. Run the following code cell to print their values. If you followed the UI instructions to create the project, you must set `project_name` manually.

In [25]:
try:
    print(project_name)
    print(project_id)
except NameError:
    print("+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++")
    print("You must set the project_name manually in the following code cell")
    print("+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++")

pk-xg-model-build-09-24-16-33-20
p-ywqvbykzxugd


In [26]:
# project_name = "<ENTER THE NAME OF THE CREATED PROJECT>" # Keep commented out if you used option 1 to create a project
r = sm.describe_project(ProjectName=project_name)
project_id = r['ProjectId']
project_arn = r['ProjectArn']
project_folder = f"sagemaker-{project_name}-{project_id}-modelbuild"
project_repo_url = f"codecommit::{region}://sagemaker-{project_name}-{project_id}-modelbuild"

print(f"Project folder: {project_folder}")
print(f"Project repo URL: {project_repo_url}")

Project folder: sagemaker-pk-xg-model-build-09-24-16-33-20-p-ywqvbykzxugd-modelbuild
Project repo URL: codecommit::us-east-1://sagemaker-pk-xg-model-build-09-24-16-33-20-p-ywqvbykzxugd-modelbuild
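
The cell above relies on the naming convention `sagemaker-<project-name>-<project-id>-modelbuild` for the repository and on the `codecommit::<region>://<repository>` URL form understood by `git-remote-codecommit`. As a small sketch, the same construction can be wrapped in a function. `project_repo_names` is a hypothetical helper name, and the arguments are the values printed in this notebook run.

```python
def project_repo_names(project_name, project_id, region):
    """Build the repository folder name and the git-remote-codecommit URL."""
    folder = f"sagemaker-{project_name}-{project_id}-modelbuild"
    return folder, f"codecommit::{region}://{folder}"


folder, url = project_repo_names(
    "pk-xg-model-build-09-24-16-33-20", "p-ywqvbykzxugd", "us-east-1"
)
print(folder)
print(url)
```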


### 1. Clone the project seed code to the JupyterLab file system
Clone the project code from the CodeCommit repository by using a terminal.

1. Open a new terminal window via **File** > **New** > **Terminal**
2. Install the `git-remote-codecommit` helper: ```pip install git-remote-codecommit```
3. From your home directory, clone the project repository: ```git clone <PROJECT REPO URL>```. Replace `<PROJECT REPO URL>` with the actual project repo URL printed by the code cell above. The code cells in the next section expect the clone under `~/<project folder>`.
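
If you prefer to stay in the notebook, the clone can be sketched with `subprocess`. This is only an illustration: `clone_command` and `clone_project` are hypothetical helpers, the sketch assumes that `git` and `git-remote-codecommit` are already installed, and it clones into the home directory because the later cells reference the repository as `~/<project folder>`.

```python
import subprocess
from pathlib import Path


def clone_command(repo_url, parent_dir=None):
    """Return the git command that clones repo_url into parent_dir/<repo name>."""
    parent = Path(parent_dir) if parent_dir is not None else Path.home()
    repo_name = repo_url.rsplit("/", 1)[-1]
    return ["git", "clone", repo_url, str(parent / repo_name)]


def clone_project(repo_url, parent_dir=None):
    """Run the clone and raise CalledProcessError if git exits with an error."""
    subprocess.run(clone_command(repo_url, parent_dir), check=True)


# Print the command only; call clone_project(project_repo_url) to run it
print(clone_command("codecommit::us-east-1://sagemaker-<project-name>-<project-id>-modelbuild"))
```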

### 2. Replace pipeline construction code

The following steps customize the project, which initially contains the template code. The next code cell executes all of these steps, so you don't need to do anything manually. The list is for your information only.

- The source code is in the folder `sagemaker-<project-name>-<project-id>-modelbuild`.
- The original file `codebuild-buildspec.yml` is renamed to `codebuild-buildspec-original.yml`.
- The folder that contains the pipeline code in the project's code repository is renamed from `abalone` to `fromideatoprod`.
- The original template pipeline file `pipeline.py` is renamed to `pipeline-original.py`.
- The `pipeline_steps` Python modules are copied to the `pipelines` folder in the project's code repository folder, and to a `pipelines` folder in the workshop folder so that the `pipelines.*` imports in `pipeline.py` also resolve when you test locally.
- The `requirements.txt` created in notebook 3 is copied to the root of the project's code repository folder.
- The SageMaker Python SDK default configuration file `config.yaml` from notebook 3 is copied to the root of the project's code repository folder.

In [27]:
# see the workshop folder name
!pwd

/home/sagemaker-user/amazon-sagemaker-from-idea-to-production


In [28]:
# If your local path to the workshop folder is different, set workshop_folder to the correct path relative to your home directory (~)
workshop_folder = "amazon-sagemaker-from-idea-to-production"

In [31]:
!mkdir -p ~/{workshop_folder}/pipelines
!mv ~/{project_folder}/codebuild-buildspec.yml ~/{project_folder}/codebuild-buildspec-original.yml
!mv ~/{project_folder}/pipelines/abalone ~/{project_folder}/pipelines/fromideatoprod
!mv ~/{project_folder}/pipelines/fromideatoprod/pipeline.py ~/{project_folder}/pipelines/fromideatoprod/pipeline-original.py
!cp ~/{workshop_folder}/pipeline_steps/* ~/{project_folder}/pipelines/
!cp ~/{workshop_folder}/pipeline_steps/* ~/{workshop_folder}/pipelines/
!cp ~/{workshop_folder}/requirements.txt ~/{project_folder}
!cp ~/{workshop_folder}/config.yaml ~/{project_folder}

mv: cannot stat '/home/sagemaker-user/sagemaker-pk-xg-model-build-09-24-16-33-20-p-ywqvbykzxugd-modelbuild/codebuild-buildspec.yml': No such file or directory
mv: cannot stat '/home/sagemaker-user/sagemaker-pk-xg-model-build-09-24-16-33-20-p-ywqvbykzxugd-modelbuild/pipelines/abalone': No such file or directory
mv: cannot stat '/home/sagemaker-user/sagemaker-pk-xg-model-build-09-24-16-33-20-p-ywqvbykzxugd-modelbuild/pipelines/fromideatoprod/pipeline.py': No such file or directory
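
The `mv: cannot stat` messages are what the shell reports when a source path does not exist, which typically means the repository clone is not under the home directory. As a hedged alternative to the shell commands, the same renames and copies can be sketched with `pathlib` and `shutil`, with an explicit check for the clone. `customize_project` is a hypothetical helper, not part of the workshop code.

```python
import shutil
from pathlib import Path


def customize_project(project_dir, workshop_dir):
    """Rename the template files and copy the workshop code into the project clone."""
    project_dir, workshop_dir = Path(project_dir), Path(workshop_dir)
    if not project_dir.is_dir():
        raise FileNotFoundError(
            f"{project_dir} not found: clone the project repository first"
        )

    # Keep the template originals under new names
    (project_dir / "codebuild-buildspec.yml").rename(
        project_dir / "codebuild-buildspec-original.yml"
    )
    pipeline_dir = project_dir / "pipelines" / "fromideatoprod"
    (project_dir / "pipelines" / "abalone").rename(pipeline_dir)
    (pipeline_dir / "pipeline.py").rename(pipeline_dir / "pipeline-original.py")

    # Copy the step modules into the project and into a local `pipelines`
    # folder in the workshop folder, used for local testing
    local_pipelines = workshop_dir / "pipelines"
    local_pipelines.mkdir(parents=True, exist_ok=True)
    for src in (workshop_dir / "pipeline_steps").iterdir():
        if src.is_file():
            shutil.copy(src, project_dir / "pipelines")
            shutil.copy(src, local_pipelines)

    # Copy the dependency list and the SDK defaults to the repository root
    for name in ("requirements.txt", "config.yaml"):
        shutil.copy(workshop_dir / name, project_dir)
```

In the notebook you could call it as `customize_project(Path.home() / project_folder, Path.home() / workshop_folder)`. Like the shell version, the renames are meant to run once on a fresh clone.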


Execute the following cell to write pipeline construction code to the file `pipeline.py`. Re-use the code from the step 3 notebook as the function `get_pipeline()`.

<div style="border: 4px solid coral; text-align: center; margin: auto;">
    <p style=" text-align: center; margin: auto;">The pipeline construction code works with either a raw input dataset in S3 or a feature set from the Feature Store. Pass the corresponding input parameter to the pipeline.
    </p>
</div>
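
The choice between the two input sources can be sketched as a small function that mirrors the checks in `get_pipeline()`: at least one of the two parameters must be set, and the S3 URL takes precedence when it is provided. `select_input_source` is a hypothetical helper for illustration, and unlike `get_pipeline()` it raises an exception instead of returning `None`.

```python
def select_input_source(input_s3_url=None, feature_group_name=None):
    """Return which input source the pipeline would use for the given parameters."""
    if input_s3_url is None and feature_group_name is None:
        raise ValueError("One of feature_group_name or input_s3_url must be provided")
    # The S3 dataset wins when both parameters are given
    return "s3" if input_s3_url else "feature_store"


print(select_input_source(input_s3_url="s3://<bucket>/<prefix>/input.csv"))
print(select_input_source(feature_group_name="<feature-group-name>"))
```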

In [32]:
%%writefile pipeline.py

import pandas as pd
import json
import boto3
import pathlib
import io
import os
import sagemaker
import mlflow
from time import gmtime, strftime, sleep
from sagemaker.deserializers import CSVDeserializer
from sagemaker.serializers import CSVSerializer

from sagemaker.workflow.execution_variables import ExecutionVariables
from sagemaker.workflow.pipeline_context import PipelineSession
from sagemaker.xgboost.estimator import XGBoost
from sagemaker.sklearn.processing import SKLearnProcessor
from sagemaker.processing import (
    ProcessingInput, 
    ProcessingOutput, 
    ScriptProcessor
)
from sagemaker.inputs import TrainingInput

from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import (
    ProcessingStep, 
    TrainingStep, 
    CreateModelStep,
    CacheConfig
)
from sagemaker.workflow.check_job_config import CheckJobConfig
from sagemaker.workflow.parameters import (
    ParameterInteger, 
    ParameterFloat, 
    ParameterString, 
    ParameterBoolean
)
from sagemaker.workflow.quality_check_step import (
    DataQualityCheckConfig,
    ModelQualityCheckConfig,
    QualityCheckStep,
)
from sagemaker.workflow.clarify_check_step import (
    ModelBiasCheckConfig, 
    ClarifyCheckStep, 
    ModelExplainabilityCheckConfig
)
from sagemaker import Model
from sagemaker.inputs import CreateModelInput
from sagemaker.workflow.model_step import ModelStep
from sagemaker.workflow.fail_step import FailStep
from sagemaker.workflow.conditions import (
    ConditionGreaterThan,
    ConditionGreaterThanOrEqualTo
)
from sagemaker.workflow.parallelism_config import ParallelismConfiguration
from sagemaker.workflow.properties import PropertyFile
from sagemaker.workflow.condition_step import ConditionStep
from sagemaker.workflow.functions import (
    Join,
    JsonGet
)
from sagemaker.workflow.lambda_step import (
    LambdaStep,
    LambdaOutput,
    LambdaOutputTypeEnum,
)
from sagemaker.lambda_helper import Lambda

from sagemaker.model_metrics import (
    MetricsSource, 
    ModelMetrics, 
    FileSource
)
from sagemaker.drift_check_baselines import DriftCheckBaselines
from sagemaker.workflow.pipeline_definition_config import PipelineDefinitionConfig 
from sagemaker.image_uris import retrieve
from sagemaker.workflow.function_step import step
from sagemaker.workflow.step_outputs import get_step
from sagemaker.model_monitor import DatasetFormat, model_monitoring

from pipelines.preprocess import preprocess
from pipelines.evaluate import evaluate
from pipelines.register import register
from pipelines.extract import prepare_datasets

def get_sagemaker_client(region):
    return boto3.Session(region_name=region).client("sagemaker")

def get_pipeline_session(region, bucket_name):
    """Gets the pipeline session based on the region.

    Args:
        region: the aws region to start the session
        bucket_name: the bucket to use for storing the artifacts

    Returns:
        PipelineSession instance
    """

    boto_session = boto3.Session(region_name=region)
    sagemaker_client = boto_session.client("sagemaker")

    return PipelineSession(
        boto_session=boto_session,
        sagemaker_client=sagemaker_client,
        default_bucket=bucket_name,
    )

def get_pipeline_custom_tags(new_tags, region, sagemaker_project_name=None):
    try:
        print(f"Getting project tags for {sagemaker_project_name}")
        
        sm_client = get_sagemaker_client(region)
        
        project_arn = sm_client.describe_project(ProjectName=sagemaker_project_name)['ProjectArn']
        project_tags = sm_client.list_tags(ResourceArn=project_arn)['Tags']

        print(f"Project tags: {project_tags}")
        
        for project_tag in project_tags:
            new_tags.append(project_tag)
            
    except Exception as e:
        print(f"Error getting project tags: {e}")
        
    return new_tags
    
def get_pipeline(
    region,
    sagemaker_project_id=None,
    sagemaker_project_name=None,
    role=None,
    bucket_name=None,
    bucket_prefix="from-idea-to-prod/xgboost",
    input_s3_url=None,
    feature_group_name=None,
    model_package_group_name="from-idea-to-prod-model-group",
    pipeline_name_prefix="from-idea-to-prod-pipeline",
    process_instance_type="ml.m5.large",
    train_instance_type="ml.m5.xlarge",
    test_score_threshold=0.70,
    tracking_server_arn=None,
):
    """Gets a SageMaker ML Pipeline instance.
    
    Returns:
        an instance of a pipeline
    """
    if feature_group_name is None and input_s3_url is None:
        print("One of feature_group_name or input_s3_url must be provided. Exiting...")
        return None

    session = get_pipeline_session(region, bucket_name)
    sm = session.sagemaker_client
    
    if role is None:
        role = sagemaker.session.get_execution_role(session)

    print(f"sagemaker version: {sagemaker.__version__}")
    print(f"Execution role: {role}")
    print(f"Input S3 URL: {input_s3_url}")
    print(f"Feature group: {feature_group_name}")
    print(f"Model package group: {model_package_group_name}")
    print(f"Pipeline name prefix: {pipeline_name_prefix}")
    print(f"Tracking server ARN: {tracking_server_arn}")
    
    pipeline_name = f"{pipeline_name_prefix}-{sagemaker_project_id}"
    experiment_name = pipeline_name

    output_s3_prefix = f"s3://{bucket_name}/{bucket_prefix}"
    # Set the output S3 url for model artifact
    output_s3_url = f"{output_s3_prefix}/output"
    # Set the output S3 url for feature store query results
    output_query_location = f'{output_s3_prefix}/offline-store/query_results'
    
    # Set the output S3 urls for processed data
    train_s3_url = f"{output_s3_prefix}/train"
    validation_s3_url = f"{output_s3_prefix}/validation"
    test_s3_url = f"{output_s3_prefix}/test"
    evaluation_s3_url = f"{output_s3_prefix}/evaluation"
    
    baseline_s3_url = f"{output_s3_prefix}/baseline"
    prediction_baseline_s3_url = f"{output_s3_prefix}/prediction_baseline"
    
    xgboost_image_uri = sagemaker.image_uris.retrieve(
            "xgboost", 
            region=region,
            version="1.5-1"
    )

    # If no tracking server ARN, try to find an active MLflow server
    if tracking_server_arn is None:
        r = sm.list_mlflow_tracking_servers(
            TrackingServerStatus='Created',
        )['TrackingServerSummaries']
    
        if len(r) < 1:
            print("You don't have any running MLflow servers. Exiting...")
            return None
        else:
            tracking_server_arn = r[0]['TrackingServerArn']
            print(f"Use the tracking server ARN:{tracking_server_arn}")
        
    # Parameters for pipeline execution
    
    # Set processing instance type
    process_instance_type_param = ParameterString(
        name="ProcessingInstanceType",
        default_value=process_instance_type,
    )

    # Set training instance type
    train_instance_type_param = ParameterString(
        name="TrainingInstanceType",
        default_value=train_instance_type,
    )

    # Set model approval param
    model_approval_status_param = ParameterString(
        name="ModelApprovalStatus",
        default_value="PendingManualApproval"
    )

    # Minimal threshold for model performance on the test dataset
    test_score_threshold_param = ParameterFloat(
        name="TestScoreThreshold", 
        default_value=test_score_threshold
    )

    # S3 url for the input dataset
    input_s3_url_param = ParameterString(
        name="InputDataUrl",
        default_value=input_s3_url if input_s3_url else "None",
    )

    # Feature group name for the input featureset
    feature_group_name_param = ParameterString(
        name="FeatureGroupName",
        default_value=feature_group_name if feature_group_name else "None",
    )
    
    # Model package group name
    model_package_group_name_param = ParameterString(
        name="ModelPackageGroupName",
        default_value=model_package_group_name,
    )

    # MLflow tracking server ARN
    tracking_server_arn_param = ParameterString(
        name="TrackingServerARN",
        default_value=tracking_server_arn,
    )
    
    # Define step cache config
    cache_config = CacheConfig(
        enable_caching=True,
        expire_after="P30d" # 30-day
    )

    # Construct the pipeline
    
    # Get datasets
    step_get_datasets = step(
            preprocess, 
            role=role,
            instance_type=process_instance_type_param,
            name=f"preprocess",
            keep_alive_period_in_seconds=3600,
    )(
        input_data_s3_path=input_s3_url_param,
        output_s3_prefix=output_s3_prefix,
        tracking_server_arn=tracking_server_arn_param,
        experiment_name=experiment_name,
        pipeline_run_name=ExecutionVariables.PIPELINE_EXECUTION_ID,
    ) if input_s3_url else step(
        prepare_datasets, 
        role=role,
        instance_type=process_instance_type_param,
        name=f"extract-featureset",
        keep_alive_period_in_seconds=3600,
    )(
        feature_group_name=feature_group_name_param,
        output_s3_prefix=output_s3_prefix,
        query_output_s3_path=output_query_location,
        tracking_server_arn=tracking_server_arn_param,
        experiment_name=experiment_name,
        pipeline_run_name=ExecutionVariables.PIPELINE_EXECUTION_ID,
    )
    
    # Instantiate an XGBoost estimator object
    estimator = sagemaker.estimator.Estimator(
        image_uri=xgboost_image_uri,
        role=role, 
        instance_type=train_instance_type_param,
        instance_count=1,
        output_path=output_s3_url,
        sagemaker_session=session,
        base_job_name=f"{pipeline_name}-train"
    )
    
    # Define algorithm hyperparameters
    estimator.set_hyperparameters(
        num_round=100, # the number of rounds to run the training
        max_depth=3, # maximum depth of a tree
        eta=0.5, # step size shrinkage used in updates to prevent overfitting
        alpha=2.5, # L1 regularization term on weights
        objective="binary:logistic",
        eval_metric="auc", # evaluation metrics for validation data
        subsample=0.8, # subsample ratio of the training instance
        colsample_bytree=0.8, # subsample ratio of columns when constructing each tree
        min_child_weight=3, # minimum sum of instance weight (hessian) needed in a child
        early_stopping_rounds=10, # the model trains until the validation score stops improving
        verbosity=1, # verbosity of printing messages
    )
    
    # train step
    step_train = TrainingStep(
        name=f"train",
        step_args=estimator.fit(
            {
                "train": TrainingInput(
                    step_get_datasets['train_data'],
                    content_type="text/csv",
                ),
                "validation": TrainingInput(
                    step_get_datasets['validation_data'],
                    content_type="text/csv",
                ),
            }
        ),
        cache_config=cache_config,
    )   
    
    # Evaluation step
    step_evaluate = step(
        evaluate,
        role=role,
        instance_type=process_instance_type_param,
        name=f"evaluate",
        keep_alive_period_in_seconds=3600,
    )(
        test_x_data_s3_path=step_get_datasets['test_x_data'],
        test_y_data_s3_path=step_get_datasets['test_y_data'],
        model_s3_path=step_train.properties.ModelArtifacts.S3ModelArtifacts,
        output_s3_prefix=output_s3_prefix,
        tracking_server_arn=tracking_server_arn_param,
        experiment_name=step_get_datasets['experiment_name'],
        pipeline_run_id=step_get_datasets['pipeline_run_id'],
    )

    # register model step
    step_register = step(
        register,
        role=role,
        instance_type=process_instance_type_param,
        name=f"register",
        keep_alive_period_in_seconds=3600,
    )(
        training_job_name=step_train.properties.TrainingJobName,
        model_package_group_name=model_package_group_name_param,
        model_approval_status=model_approval_status_param,
        evaluation_result=step_evaluate['evaluation_result'],
        output_s3_prefix=output_s3_url,
        tracking_server_arn=tracking_server_arn_param,
        experiment_name=step_get_datasets['experiment_name'],
        pipeline_run_id=step_get_datasets['pipeline_run_id'],
    )

    # fail the pipeline execution step
    step_fail = FailStep(
        name=f"fail",
        error_message=Join(on=" ", values=["Execution failed due to AUC Score < ", test_score_threshold_param]),
    )
    
    # condition to check in the condition step
    condition_gte = ConditionGreaterThanOrEqualTo(
            left=step_evaluate['evaluation_result']['classification_metrics']['auc_score']['value'],  
            right=test_score_threshold_param,
    )
    
    # conditional register step
    step_conditional_register = ConditionStep(
        name=f"check-metrics",
        conditions=[condition_gte],
        if_steps=[step_register],
        else_steps=[step_fail],
    )   

    # Create a pipeline object
    pipeline = Pipeline(
        name=f"{pipeline_name}",
        parameters=[
            input_s3_url_param,
            feature_group_name_param,
            process_instance_type_param,
            train_instance_type_param,
            model_approval_status_param,
            test_score_threshold_param,
            model_package_group_name_param,
            tracking_server_arn_param,
        ],
        steps=[step_conditional_register],
        pipeline_definition_config=PipelineDefinitionConfig(use_custom_job_prefix=True)
    )
    
    return pipeline

Writing pipeline.py
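
The `check-metrics` condition step in the pipeline registers the model only if the AUC score on the test dataset reaches the threshold, and otherwise fails the execution. The following plain-Python sketch illustrates that branching outside of SageMaker. The evaluation result is a made-up dictionary shaped like the keys the condition reads, and `next_step` is a hypothetical helper.

```python
# Hypothetical evaluation result with the keys used by the condition
evaluation_result = {"classification_metrics": {"auc_score": {"value": 0.78}}}


def next_step(evaluation_result, test_score_threshold=0.70):
    """Return which branch of the check-metrics condition step would run."""
    auc = evaluation_result["classification_metrics"]["auc_score"]["value"]
    return "register" if auc >= test_score_threshold else "fail"


print(next_step(evaluation_result))
```

Because the condition is greater-than-or-equal, a score exactly equal to the threshold still leads to the register step.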


Copy this `pipeline.py` file from the workshop folder to the `pipelines/fromideatoprod` folder in the project's code repository folder:

In [35]:
!cp ~/{workshop_folder}/pipeline.py ~/{project_folder}/pipelines/fromideatoprod/

Test the function `get_pipeline` locally to check that everything works before running it remotely.

In [36]:
from pipeline import get_pipeline

In [57]:
# If you created a feature store in the notebook 3, you can set the feature_group_name parameter instead of input_s3_url to take the data from the feature store
p = get_pipeline(
    region=region,
    sagemaker_project_id=project_id,
    sagemaker_project_name=project_name,
    role=sm_role,
    bucket_name=bucket_name,
    bucket_prefix=bucket_prefix,
    input_s3_url=input_s3_url,
    # feature_group_name=dataset_feature_group_name,
    model_package_group_name=model_package_group_name,
    pipeline_name_prefix=pipeline_name,
    process_instance_type="ml.m5.large",
    train_instance_type="ml.m5.xlarge",
    test_score_threshold=0.70,
    tracking_server_arn=mlflow_arn,
)

INFO:sagemaker.image_uris:Ignoring unnecessary instance type: None.


sagemaker version: 2.232.1
Execution role: arn:aws:iam::446490546198:role/service-role/AmazonSageMaker-ExecutionRole-20240919T173156
Input S3 URL: s3://sagemaker-us-east-1-446490546198/mlops-pk/xgboost/input/bank-additional-full.csv
Feature group: None
Model package group: mlops-pk-proj-1-pipeline-model-24-19-09-00
Pipeline name prefix: mlops-pk-proj-1-pipeline-24-19-09-00
Tracking server ARN: arn:aws:sagemaker:us-east-1:446490546198:mlflow-tracking-server/mlflow-d-fkrqwhxrpnws-20-05-26-15


In [68]:
p.definition()

2024-10-01 16:35:19,982 sagemaker.remote_function INFO     Uploading serialized function code to s3://sagemaker-us-east-1-446490546198/mlops-pk-proj-1-pipeline-24-19-09-00-p-ywqvbykzxugd/register/2024-10-01-16-35-19-982/function
2024-10-01 16:35:20,118 sagemaker.remote_function INFO     Uploading serialized function arguments to s3://sagemaker-us-east-1-446490546198/mlops-pk-proj-1-pipeline-24-19-09-00-p-ywqvbykzxugd/register/2024-10-01-16-35-19-982/arguments
2024-10-01 16:35:20,568 sagemaker.remote_function INFO     Copied dependencies file at './requirements.txt' to '/tmp/tmp9l2gvngk/requirements.txt'
2024-10-01 16:35:20,596 sagemaker.remote_function INFO     Successfully uploaded dependencies and pre execution scripts to 's3://sagemaker-us-east-1-446490546198/mlops-pk-proj-1-pipeline-24-19-09-00-p-ywqvbykzxugd/register/2024-10-01-16-35-19-982/pre_exec_script_and_dependencies'
2024-10-01 16:35:20,710 sagemaker.remote_function INFO     Copied user workspace to '/tmp/tmpnem4ard7/temp_w

'{"Version": "2020-12-01", "Metadata": {}, "Parameters": [{"Name": "InputDataUrl", "Type": "String", "DefaultValue": "s3://sagemaker-us-east-1-446490546198/mlops-pk/xgboost/input/bank-additional-full.csv"}, {"Name": "FeatureGroupName", "Type": "String", "DefaultValue": "None"}, {"Name": "ProcessingInstanceType", "Type": "String", "DefaultValue": "ml.m5.large"}, {"Name": "TrainingInstanceType", "Type": "String", "DefaultValue": "ml.m5.xlarge"}, {"Name": "ModelApprovalStatus", "Type": "String", "DefaultValue": "PendingManualApproval"}, {"Name": "TestScoreThreshold", "Type": "Float", "DefaultValue": 0.7}, {"Name": "ModelPackageGroupName", "Type": "String", "DefaultValue": "mlops-pk-proj-1-pipeline-model-24-19-09-00"}, {"Name": "TrackingServerARN", "Type": "String", "DefaultValue": "arn:aws:sagemaker:us-east-1:446490546198:mlflow-tracking-server/mlflow-d-fkrqwhxrpnws-20-05-26-15"}], "PipelineExperimentConfig": {"ExperimentName": {"Get": "Execution.PipelineName"}, "TrialName": {"Get": "Exec
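
The definition is a JSON string, so it can be parsed to check the parameters and steps without reading the raw output. The following is a minimal sketch on an abbreviated, hand-written stand-in for the string returned by `p.definition()`. The `Steps` entry is an assumption about the part of the output that is truncated above, and `summarize_definition` is a hypothetical helper.

```python
import json

# Abbreviated stand-in for the pipeline definition; not the full real output
definition = json.dumps({
    "Version": "2020-12-01",
    "Parameters": [
        {"Name": "InputDataUrl", "Type": "String",
         "DefaultValue": "s3://<bucket>/<prefix>/input.csv"},
        {"Name": "TestScoreThreshold", "Type": "Float", "DefaultValue": 0.7},
    ],
    "Steps": [{"Name": "check-metrics", "Type": "Condition"}],
})


def summarize_definition(definition_json):
    """Return parameter defaults and top-level step names from a pipeline definition."""
    doc = json.loads(definition_json)
    params = {p["Name"]: p.get("DefaultValue") for p in doc.get("Parameters", [])}
    steps = [s["Name"] for s in doc.get("Steps", [])]
    return params, steps


params, steps = summarize_definition(definition)
print(params)
print(steps)
```

In the notebook you would call `summarize_definition(p.definition())` instead of using the stand-in string.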

In [59]:
p.upsert(role_arn=sm_role)

2024-09-24 21:10:26,894 sagemaker.remote_function INFO     Uploading serialized function code to s3://sagemaker-us-east-1-446490546198/mlops-pk-proj-1-pipeline-24-19-09-00-p-ywqvbykzxugd/register/2024-09-24-21-10-26-893/function
2024-09-24 21:10:27,006 sagemaker.remote_function INFO     Uploading serialized function arguments to s3://sagemaker-us-east-1-446490546198/mlops-pk-proj-1-pipeline-24-19-09-00-p-ywqvbykzxugd/register/2024-09-24-21-10-26-893/arguments
2024-09-24 21:10:27,242 sagemaker.remote_function INFO     Copied dependencies file at './requirements.txt' to '/tmp/tmpswihu2d_/requirements.txt'
2024-09-24 21:10:27,284 sagemaker.remote_function INFO     Successfully uploaded dependencies and pre execution scripts to 's3://sagemaker-us-east-1-446490546198/mlops-pk-proj-1-pipeline-24-19-09-00-p-ywqvbykzxugd/register/2024-09-24-21-10-26-893/pre_exec_script_and_dependencies'
2024-09-24 21:10:27,352 sagemaker.remote_function INFO     Copied user workspace to '/tmp/tmpl6j04wb7/temp_w

{'PipelineArn': 'arn:aws:sagemaker:us-east-1:446490546198:pipeline/mlops-pk-proj-1-pipeline-24-19-09-00-p-ywqvbykzxugd',
 'ResponseMetadata': {'RequestId': 'aff48cf3-2412-4e25-8ba7-39886fb6e033',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': 'aff48cf3-2412-4e25-8ba7-39886fb6e033',
   'content-type': 'application/x-amz-json-1.1',
   'content-length': '119',
   'date': 'Tue, 24 Sep 2024 21:10:32 GMT'},
  'RetryAttempts': 0}}

To see the created pipeline in the Studio UI, click on the link constructed by the code cell below:

In [60]:
from IPython.display import HTML

# Show the pipeline link
display(
    HTML('<b>See <a target="top" href="https://studio-{}.studio.{}.sagemaker.aws/pipelines/{}/graph">the pipeline</a> in the Studio UI</b>'.format(
            domain_id, region, p.describe()['PipelineName']))
)
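
The same Studio URL pattern is used for several links in this notebook. As a minimal sketch, it can be factored into a small helper. The function name `studio_url` and its `path` parameter are hypothetical and not part of the workshop code; the URL format is copied from the cell above.

```python
def studio_url(domain_id: str, region: str, path: str) -> str:
    """Build a Studio UI URL from the domain ID, the Region, and a UI path."""
    # The host pattern is taken from the link cell above
    return f"https://studio-{domain_id}.studio.{region}.sagemaker.aws/{path.lstrip('/')}"


# Example values copied from the outputs in this notebook
example_pipeline = "mlops-pk-proj-1-pipeline-24-19-09-00-p-ywqvbykzxugd"
print(studio_url("d-fkrqwhxrpnws", "us-east-1", f"pipelines/{example_pipeline}/graph"))
```

In the notebook you would pass `domain_id`, `region`, and `p.describe()['PipelineName']` instead of the literal example values.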

At this point you have tested locally that the pipeline construction code works and creates a pipeline. You can see this pipeline in the Studio **Pipelines** widget. Now you are ready to create a CI/CD pipeline.

#### Attach the model package group to the project
Project-owned resources are automatically tagged with `sagemaker:project-name` and `sagemaker:project-id` tags for cost control, attribute-based security control, and governance. 
Since the model package group already exists in the model registry, you need to tag it to attach it to this project. The following code cell calls the `AddTags` API to set the project tags on the model package group.

In [61]:
model_package_group_arn = sm.describe_model_package_group(ModelPackageGroupName=model_package_group_name).get("ModelPackageGroupArn")

if model_package_group_arn:
    print(f"Adding tags {project_arn.split('/')[-1]} and {project_id} for model package group {model_package_group_arn}")
    r = sm.add_tags(
        ResourceArn=model_package_group_arn,
        Tags=[
            {
                'Key': 'sagemaker:project-name',
                'Value': project_arn.split("/")[-1]
            },
            {
                'Key': 'sagemaker:project-id',
                'Value': project_id
            },
        ]
    )
    print(r)
else:
    print(f"The model package group {model_package_group_name} doesn't exist")
    
sm.list_tags(ResourceArn=model_package_group_arn)["Tags"]

Adding tags pk-xg-model-build-09-24-16-33-20 and p-ywqvbykzxugd for model package group arn:aws:sagemaker:us-east-1:446490546198:model-package-group/mlops-pk-proj-1-pipeline-model-24-19-09-00
{'Tags': [{'Key': 'sagemaker:project-name', 'Value': 'pk-xg-model-build-09-24-16-33-20'}, {'Key': 'sagemaker:project-id', 'Value': 'p-ywqvbykzxugd'}], 'ResponseMetadata': {'RequestId': '365f4c15-8561-4e1a-a213-e848fae0f8b3', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': '365f4c15-8561-4e1a-a213-e848fae0f8b3', 'content-type': 'application/x-amz-json-1.1', 'content-length': '142', 'date': 'Tue, 24 Sep 2024 21:11:50 GMT'}, 'RetryAttempts': 0}}


[{'Key': 'sagemaker:project-name',
  'Value': 'pk-xg-model-build-09-24-16-33-20'},
 {'Key': 'sagemaker:user-profile-arn',
  'Value': 'arn:aws:sagemaker:us-east-1:446490546198:user-profile/d-fkrqwhxrpnws/default-20240919T173153'},
 {'Key': 'sagemaker:domain-arn',
  'Value': 'arn:aws:sagemaker:us-east-1:446490546198:domain/d-fkrqwhxrpnws'},
 {'Key': 'sagemaker:project-id', 'Value': 'p-ywqvbykzxugd'},
 {'Key': 'sagemaker:space-arn',
  'Value': 'arn:aws:sagemaker:us-east-1:446490546198:space/d-fkrqwhxrpnws/jupyter'}]
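
To confirm that both project tags are present, you could compare the list returned by `list_tags` against the expected values. The following is a minimal sketch; the helper name `missing_project_tags` is hypothetical, and the example tags are copied from the output above.

```python
def missing_project_tags(tags, project_name, project_id):
    """Return the project tag keys that are absent or have an unexpected value."""
    expected = {
        "sagemaker:project-name": project_name,
        "sagemaker:project-id": project_id,
    }
    actual = {tag["Key"]: tag["Value"] for tag in tags}
    return [key for key, value in expected.items() if actual.get(key) != value]


# Example tags in the shape returned by sm.list_tags(...)["Tags"]
example_tags = [
    {"Key": "sagemaker:project-name", "Value": "pk-xg-model-build-09-24-16-33-20"},
    {"Key": "sagemaker:project-id", "Value": "p-ywqvbykzxugd"},
]
print(missing_project_tags(example_tags, "pk-xg-model-build-09-24-16-33-20", "p-ywqvbykzxugd"))
```

An empty list means that the model package group carries both project tags.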

### 3. Modify the build specification file
Now modify the `codebuild-buildspec.yml` file in the project folder to reflect the new name of the Python module with your pipeline and to set other project-specific parameters.

You need to pass the following parameters to the pipeline creation script. They correspond to the parameters of the `get_pipeline` function you've just created:
- `input_s3_url` – an S3 URL for the input raw dataset. If you created a feature group in notebook 3, you can use the parameter `feature_group_name` instead
- `feature_group_name` – the name of the feature group, if you created one in notebook 3. In this case you don't need to provide `input_s3_url`
- `model_package_group_name` – the model package group in the model registry in which to register the model after training
- `pipeline_name_prefix` – a name prefix for the pipeline. The pipeline name is constructed as `<pipeline_name_prefix>-<project-id>`
- `role` – the pipeline execution role
- `tracking_server_arn` – the MLflow server ARN for pipeline execution tracking
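
As an illustration of the naming rule for `pipeline_name_prefix`, the following sketch joins the prefix and the project ID. The helper name `build_pipeline_name` is hypothetical; the actual construction happens inside your `get_pipeline` code. The example values are taken from this notebook's outputs.

```python
def build_pipeline_name(pipeline_name_prefix: str, project_id: str) -> str:
    """Join the prefix and the project ID as <pipeline_name_prefix>-<project-id>."""
    return f"{pipeline_name_prefix}-{project_id}"


example_name = build_pipeline_name("mlops-pk-proj-1-pipeline-24-19-09-00", "p-ywqvbykzxugd")
print(example_name)
```

The result matches the last segment of the `PipelineArn` returned by the `p.upsert()` call earlier in this notebook.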

The following cell prints the values of these parameters:

In [62]:
try:
    print(f"""
        INPUT-S3-URL: {input_s3_url}
        FEATURE-GROUP-NAME: {dataset_feature_group_name}
        MODEL-PACKAGE-GROUP-NAME: {model_package_group_name}
        PIPELINE-BASE-NAME: {pipeline_name}
        SAGEMAKER-EXECUTION-ROLE-ARN: {sm_role}
        TRACKING-SERVER-ARN: {mlflow_arn}
        """)
except NameError:
    print(f"""
        Dataset feature group name is not defined, use input_s3_url instead:
        ********************************************************************
        
        INPUT-S3-URL: {input_s3_url}
        MODEL-PACKAGE-GROUP-NAME: {model_package_group_name}
        PIPELINE-BASE-NAME: {pipeline_name}
        SAGEMAKER-EXECUTION-ROLE-ARN: {sm_role}
        TRACKING-SERVER-ARN: {mlflow_arn}
        """)


        Dataset feature group name is not defined, use input_s3_url instead:
        ********************************************************************
        
        INPUT-S3-URL: s3://sagemaker-us-east-1-446490546198/mlops-pk/xgboost/input/bank-additional-full.csv
        MODEL-PACKAGE-GROUP-NAME: mlops-pk-proj-1-pipeline-model-24-19-09-00
        PIPELINE-BASE-NAME: mlops-pk-proj-1-pipeline-24-19-09-00
        SAGEMAKER-EXECUTION-ROLE-ARN: arn:aws:iam::446490546198:role/service-role/AmazonSageMaker-ExecutionRole-20240919T173156
        TRACKING-SERVER-ARN: arn:aws:sagemaker:us-east-1:446490546198:mlflow-tracking-server/mlflow-d-fkrqwhxrpnws-20-05-26-15
        


Now replace the values of these parameters in the following code cell with the printed values from the cell above. 

To do this, locate the parameter `kwargs` in the following code cell starting with `%%writefile codebuild-buildspec.yml`:

```
--kwargs "{ \
    \"input_s3_url\":\"<INPUT-S3-URL>\", \
    \"feature_group_name\":\"<FEATURE-GROUP-NAME>\", \
    \"model_package_group_name\":\"<MODEL-PACKAGE-GROUP-NAME>\", \
    \"pipeline_name_prefix\":\"<PIPELINE-BASE-NAME>\", \
    \"role\":\"<SAGEMAKER-EXECUTION-ROLE-ARN>\", \
    \"tracking_server_arn\":\"<TRACKING-SERVER-ARN>\", \
    \"region\":\"${AWS_REGION}\", \
    \"sagemaker_project_name\":\"${SAGEMAKER_PROJECT_NAME}\", \
    \"sagemaker_project_id\":\"${SAGEMAKER_PROJECT_ID}\", \
    \"bucket_name\":\"${ARTIFACT_BUCKET}\" \
        }"
```

and replace the values of `input_s3_url` OR `feature_group_name`, `model_package_group_name`, `pipeline_name_prefix`, `role`, and `tracking_server_arn` parameters with the values printed by the previous cells.

You need to replace only one of `input_s3_url` or `feature_group_name`, depending on which dataset input method you'd like to use: a raw input dataset from S3 or the processed feature set from the feature store. You can use the feature store option only if you created the feature group in notebook 3. 

<div class="alert alert-info">Delete the line with the parameter that you don't use, either <code>input_s3_url</code> or <code>feature_group_name</code>, from the cell code.</div>
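
If you prefer to assemble the `kwargs` value programmatically instead of editing it by hand, the following sketch shows one possible approach. The helper `build_kwargs` is hypothetical and not part of the project template. It only produces the JSON object; you still need to escape the quotes as shown in the `--kwargs` snippet above before placing it in the build spec file.

```python
import json


def build_kwargs(values: dict) -> str:
    """Serialize pipeline parameters to JSON, keeping exactly one dataset input."""
    dataset_keys = ("input_s3_url", "feature_group_name")
    provided = [key for key in dataset_keys if values.get(key)]
    if len(provided) != 1:
        raise ValueError("Provide exactly one of input_s3_url or feature_group_name")
    # Drop unset parameters so that the unused dataset input is not passed at all
    kwargs = {key: value for key, value in values.items() if value not in (None, "")}
    return json.dumps(kwargs)


# Example values copied from the printed parameters above; ${...} entries
# are environment variables that CodeBuild resolves at build time
example_kwargs = build_kwargs({
    "input_s3_url": "s3://sagemaker-us-east-1-446490546198/mlops-pk/xgboost/input/bank-additional-full.csv",
    "feature_group_name": None,
    "model_package_group_name": "mlops-pk-proj-1-pipeline-model-24-19-09-00",
    "pipeline_name_prefix": "mlops-pk-proj-1-pipeline-24-19-09-00",
    "region": "${AWS_REGION}",
})
print(example_kwargs)
```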

![](img/codebuild-buildspec-edit.png)

After you replace the values of the parameters, execute the cell to write the build spec file.

In [66]:
%%writefile codebuild-buildspec.yml

version: 0.2

phases:
  install:
    runtime-versions:
      python: 3.10
    commands:
      - pip install --upgrade --force-reinstall . "awscli>1.20.30"
      - pip install mlflow==2.13.2 sagemaker-mlflow s3fs xgboost
    
  build:
    commands:
      - export SAGEMAKER_USER_CONFIG_OVERRIDE="./config.yaml"
      - export PYTHONUNBUFFERED=TRUE
      - export SAGEMAKER_PROJECT_NAME_ID="${SAGEMAKER_PROJECT_NAME}-${SAGEMAKER_PROJECT_ID}"
      - |
        run-pipeline --module-name pipelines.fromideatoprod.pipeline \
          --role-arn $SAGEMAKER_PIPELINE_ROLE_ARN \
          --tags "[{\"Key\":\"sagemaker:project-name\",\"Value\":\"${SAGEMAKER_PROJECT_NAME}\"}, {\"Key\":\"sagemaker:project-id\", \"Value\":\"${SAGEMAKER_PROJECT_ID}\"}]" \
          --kwargs "{ \
                \"input_s3_url\":\"s3://sagemaker-us-east-1-446490546198/mlops-pk/xgboost/input/bank-additional-full.csv\", \
                \"model_package_group_name\":\"mlops-pk-proj-1-pipeline-model-24-19-09-00\",\
                \"pipeline_name_prefix\":\"mlops-pk-proj-1-pipeline-24-19-09-00\",\
                \"role\":\"arn:aws:iam::446490546198:role/service-role/AmazonSageMaker-ExecutionRole-20240919T173156\",\
                \"tracking_server_arn\":\"arn:aws:sagemaker:us-east-1:446490546198:mlflow-tracking-server/mlflow-d-fkrqwhxrpnws-20-05-26-15\", \
                \"region\":\"${AWS_REGION}\", \
                \"sagemaker_project_name\":\"${SAGEMAKER_PROJECT_NAME}\",\
                \"sagemaker_project_id\":\"${SAGEMAKER_PROJECT_ID}\",\
                \"bucket_name\":\"${ARTIFACT_BUCKET}\"\
                    }"
      - echo "Create/update of the SageMaker Pipeline and a pipeline execution completed."

Overwriting codebuild-buildspec.yml


Copy the `codebuild-buildspec.yml` file from the workshop folder to the project's code repository folder:

In [67]:
!cp ~/{workshop_folder}/codebuild-buildspec.yml ~/{project_folder}/codebuild-buildspec.yml

To summarize, you have just made three changes in the build spec file:
1. Modified the `run-pipeline` `--module-name` parameter value from `pipelines.abalone.pipeline` to the new path `pipelines.fromideatoprod.pipeline`
2. Removed some parameters from the `kwargs` list to make use of `get_pipeline()` function default parameter values
3. Added additional parameters for the pipeline to the `kwargs` parameter list
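
Before you commit the build spec file, it can be useful to check that no `<PLACEHOLDER>` values are left in it. The following is a minimal sketch of such a check. The helper `find_placeholders` and its regular expression are assumptions based on the placeholder style used in this notebook (uppercase letters, digits, and hyphens in angle brackets).

```python
import re

# Matches placeholders such as <INPUT-S3-URL>; shell variables like ${AWS_REGION} don't match
PLACEHOLDER = re.compile(r"<[A-Z][A-Z0-9-]*>")


def find_placeholders(text: str) -> list:
    """Return the sorted, de-duplicated placeholders found in the text."""
    return sorted(set(PLACEHOLDER.findall(text)))


example_line = '\\"input_s3_url\\":\\"<INPUT-S3-URL>\\", \\"region\\":\\"${AWS_REGION}\\"'
print(find_placeholders(example_line))
```

To check the real file, you could pass the file content, for example `find_placeholders(open("codebuild-buildspec.yml").read())`, and expect an empty list.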

### 4. Fix the `setup.py` file
Finally, open the `setup.py` file in the project's code repository folder and replace the line `required_packages = ["sagemaker==2.XX.0"]` with `required_packages = ["sagemaker"]`. Save your changes.

Why make this change? The pinned sagemaker library version is a bug and is going to be fixed in future releases of the built-in SageMaker project templates. For now you fix this template file manually. Keep in mind that the built-in project templates are for your convenience only and demonstrate how to use the SageMaker project mechanism to package and provision your own custom MLOps projects.
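
As an alternative to editing `setup.py` by hand, the same replacement can be sketched with a regular expression. The helper `unpin_sagemaker` is hypothetical and assumes that the requirement appears as a quoted `sagemaker==<version>` string, as in the line quoted above.

```python
import re


def unpin_sagemaker(setup_py_text: str) -> str:
    """Replace a quoted 'sagemaker==<version>' requirement with an unpinned 'sagemaker'."""
    # \1 refers to the opening quote so that the same quote style is kept
    return re.sub(r"""(["'])sagemaker==[^"']+\1""", r"\1sagemaker\1", setup_py_text)


example_line = 'required_packages = ["sagemaker==2.XX.0"]'
print(unpin_sagemaker(example_line))
```

To apply this to the real file, you would read `setup.py` from the project's code repository folder, pass its content to the function, and write the result back.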

Now you are ready to launch the CI/CD model building pipeline.


---

## Run the CI/CD for the model building pipeline
To launch the CI/CD for the model building pipeline you need to push the changed code into the project CodeCommit repository.

<div class="alert alert-info">Make sure you are in the folder that contains the repository code in the JupyterLab terminal when running git commands. The folder name looks like <code>sagemaker-[project-name]-[project-id]-modelbuild</code>.</div>

The cell below prints the required `cd` command with the correct folder name:

In [31]:
print(f"cd ~/{project_folder}")

cd ~/sagemaker-model-build-07-23-09-57-14-p-zuhy7mg14znn-modelbuild


Open a system terminal window via the JupyterLab menu **File** > **New** > **Terminal** and enter the following commands. Keep the `user.email` and `user.name` values or replace them with your own.
```sh
cd ~/<PROJECT-FOLDER>/<PROJECT-CODE-REPOSITORY-FOLDER>

git config --global user.email "you@example.com"
git config --global user.name "Your Name"
  
git add -A
git commit -am "customize project"
git push
```

After you push your code changes, the project initiates a run of the CodePipeline pipeline that constructs, upserts, and executes the SageMaker model building pipeline. This new pipeline execution creates a new model version in the model package group in the SageMaker model registry.

You can follow the execution of the pipeline in the Studio **Pipelines** widget.

Wait until the pipeline execution finishes. The execution takes about 15 minutes to complete.
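
If you prefer to wait for the execution from code rather than watch the UI, the following sketch polls the status of the most recent pipeline execution. The helper `wait_for_latest_execution` and its default polling interval and timeout are assumptions; it expects a boto3 SageMaker client such as the `sm` client used in this notebook and relies on the `list_pipeline_executions` API.

```python
import time

# Execution statuses after which the pipeline execution no longer changes
TERMINAL_STATUSES = {"Succeeded", "Failed", "Stopped"}


def wait_for_latest_execution(sm_client, pipeline_name, poll_seconds=30, timeout_seconds=3600):
    """Poll the most recent execution of the pipeline until it reaches a terminal status."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = sm_client.list_pipeline_executions(
            PipelineName=pipeline_name,
            SortBy="CreationTime",
            SortOrder="Descending",
            MaxResults=1,
        )
        summaries = response.get("PipelineExecutionSummaries", [])
        if summaries and summaries[0]["PipelineExecutionStatus"] in TERMINAL_STATUSES:
            return summaries[0]["PipelineExecutionStatus"]
        if time.monotonic() >= deadline:
            raise TimeoutError(f"No terminal status for {pipeline_name} within {timeout_seconds} seconds")
        time.sleep(poll_seconds)


# Usage in this notebook (requires AWS credentials):
# status = wait_for_latest_execution(sm, p.describe()["PipelineName"])
```

The function looks only at the most recent execution, so call it after the new execution has appeared in the list. Otherwise it returns the status of an earlier, already finished execution.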

To see the execution of the pipeline, click on the link constructed by the code cell below. Note that CodeBuild takes about 1 minute to build and upsert the pipeline and to start the execution. Refresh the Studio UI page to see the started execution.

In [69]:
# Show the pipeline execution link
display(
    HTML('<b>See <a target="top" href="https://studio-{}.studio.{}.sagemaker.aws/pipelines/{}/executions/">the pipeline executions</a> in the Studio UI</b>'.format(
            domain_id, region, p.describe()['PipelineName']))
)

## View the details of a new model version
After the pipeline execution finishes, a new model version is registered in the model registry. To see the model version details:

1. In the Studio sidebar, choose the **Models** widget
2. Click on the name of the model package group you created in the step 3 notebook (`from-idea-to-prod-pipeline-model-<TIMESTAMP>`) to open the model group
3. In the list of model versions, select the latest version of the model

On the model version tab that opens, you can browse activity, [model version details](https://docs.aws.amazon.com/sagemaker/latest/dg/model-registry-details.html), and [data lineage](https://docs.aws.amazon.com/sagemaker/latest/dg/lineage-tracking.html). 

![](img/model-version-details.png)

In a real-world project you add various model attributes and additional model version metadata such as [model quality metrics](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-model-quality-metrics.html), [explainability](https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-model-explainability.html) and [bias](https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-measure-post-training-bias.html) reports, load test data, and [inference recommender](https://docs.aws.amazon.com/sagemaker/latest/dg/inference-recommender.html) results.

To see the model package version in the Studio UI click on the link constructed by the code cell below. Note that you need to wait until the pipeline execution finishes to see the latest registered version of the model package.

In [38]:
# Show the model package link
display(
    HTML('<b>See <a target="top" href="https://studio-{}.studio.{}.sagemaker.aws/models/registered-models/{}/versions">the model package versions</a> in the Studio UI</b>'.format(
            domain_id, region, model_package_group_name))
)
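
You can also look up the latest registered model version from code. The following sketch uses the `list_model_packages` API of a boto3 SageMaker client; the helper name `latest_model_package` is hypothetical.

```python
def latest_model_package(sm_client, model_package_group_name):
    """Return the summary of the most recently created model package, or None."""
    response = sm_client.list_model_packages(
        ModelPackageGroupName=model_package_group_name,
        SortBy="CreationTime",
        SortOrder="Descending",
        MaxResults=1,
    )
    packages = response.get("ModelPackageSummaryList", [])
    return packages[0] if packages else None


# Usage in this notebook (requires AWS credentials):
# latest = latest_model_package(sm, model_package_group_name)
# if latest:
#     print(latest["ModelPackageVersion"], latest["ModelApprovalStatus"])
```

A `None` result means that no model version has been registered in the group yet, for example because the pipeline execution is still running.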

## Summary
In this notebook you implemented a CI/CD pipeline with the following features:
- The model building ML pipeline is under source control in a CodeCommit repository
- Every push into the CodeCommit repository launches a new CodeBuild build which constructs, upserts, and executes the ML pipeline
- The whole end-to-end model development process is now automated, including the model building pipeline
- A SageMaker project is a logical construct in Studio which holds the metadata about related ML pipelines, repositories, models, experiments, and inference endpoints

---

## Continue with step 5
Open the step 5 [notebook](05-deploy.ipynb).

## Further development ideas for your real-world projects
- You can use a SageMaker-provided [MLOps template for model building, training, and deployment with third-party Git repositories using Jenkins](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-projects-templates-sm.html#sagemaker-projects-templates-git-jenkins)
- Create a [custom SageMaker project template](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-projects-templates-custom.html) to cover your specific project requirements

## Additional resources
- [Amazon SageMaker Pipelines lab in SageMaker Immersion Day](https://catalog.us-east-1.prod.workshops.aws/workshops/63069e26-921c-4ce1-9cc7-dd882ff62575/en-US/lab6)
- [Enhance your machine learning development by using a modular architecture with Amazon SageMaker projects](https://aws.amazon.com/blogs/machine-learning/enhance-your-machine-learning-development-by-using-a-modular-architecture-with-amazon-sagemaker-projects/)
- [Dive deep into automating MLOps](https://www.youtube.com/watch?v=3_cHnk9VSfQ)
- [SageMaker MLOps Project Walkthrough](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-projects-walkthrough.html)
- [`aws-samples` GitHub repository with custom project templates examples](https://github.com/aws-samples/sagemaker-custom-project-templates)

# Shutdown kernel

In [None]:
%%html

<p><b>Shutting down your kernel for this notebook to release resources.</b></p>
<button class="sm-command-button" data-commandlinker-command="kernelmenu:shutdown" style="display:none;">Shutdown Kernel</button>
        
<script>
try {
    els = document.getElementsByClassName("sm-command-button");
    els[0].click();
}
catch(err) {
    // NoOp
}    
</script>