# SageMaker Unified Studio Pipelines Tutorial

### IMPORTANT: Please read this markdown cell carefully before running the rest of the notebook.

This notebook demonstrates how to create and execute a SageMaker pipeline, with emphasis on the differences between SageMaker Unified Studio and standalone SageMaker. The pipeline includes steps for data processing, model training, evaluation, model creation, batch transformation, model registration, and conditional execution. The notebook also provides a workflow for tagging all the resources generated by a pipeline execution so that they are displayed properly in SageMaker Unified Studio.

## SageMaker Unified Studio

SageMaker Unified Studio is a new feature that provides a consolidated interface for managing SageMaker resources. When you run the first section of this notebook in a Jupyter notebook inside a SageMaker Unified Studio project, the created resources are automatically tagged so that they are accessible from the project's console.

However, if you run this notebook outside of a SageMaker Unified Studio project (e.g., on your local machine or in a separate notebook instance), you won't see the pipeline and its resources in the Unified Studio console by default. To visualize these resources in the Unified Studio console, you need to manually tag them, as shown in the second section of this notebook.

* Section 1: Creating and Executing the Pipeline
* Section 2: Tagging Resources for visualization on SageMaker Unified Studio


## Prerequisites for Running This Notebook

To successfully run this notebook, ensure the following prerequisites are met:

- **AWS Account**: An active AWS account with access to Amazon SageMaker.
- **SageMaker Role**: An IAM role with permissions for SageMaker, S3, and related services. The role must include policies like `AmazonSageMakerFullAccess` and `AmazonS3FullAccess`.
- **SageMaker Unified Studio**: Access to SageMaker Unified Studio for visualization (optional for running the notebook but required for viewing resources in the console).
- **Python Environment**: A Python environment (e.g., `conda_python3`) with the following packages installed:
  - `boto3==1.38.35`
  - `sagemaker==2.246.0` (the tested version; the notebook targets the 2.x SDK)
  - `pandas==2.2.3`
  - `scikit-learn==1.6.1` (imported as `sklearn`)
  - `numpy==1.26.4`
  - `xgboost==3.0.2`
- **S3 Access**: Access to an S3 bucket for storing input data, model artifacts, and outputs.
- **Notebook Environment**: Run this notebook in a SageMaker notebook instance, SageMaker Unified Studio Jupyter notebook, or a local environment with AWS credentials configured.
- **AWS Region**: Ensure the region supports SageMaker and the specified instance types (e.g., `ml.m5.xlarge`). The notebook retrieves the region dynamically, but verify compatibility.
- **Data Files**: The notebook downloads the Abalone dataset from public S3 buckets (`sagemaker-example-files-prod-<region>` and `sagemaker-servicecatalog-seedcode-<region>`). Ensure network access to these buckets.
- **Tagging Permissions**: For Section 2, the IAM role must have permissions to add tags to SageMaker resources (`sagemaker:AddTags`).
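
The tagging prerequisite can be written down as an IAM policy statement. The sketch below builds one in Python and prints it as JSON. The `Sid` value and the wildcard `Resource` are illustrative assumptions, and the `Describe*` and `List*` actions are included only because Section 2 calls the corresponding APIs; narrow the resource scope before using anything like this in a real account.

```python
import json

# Hypothetical minimal policy for Section 2. The Sid and the wildcard
# Resource are illustrative assumptions, not values taken from this notebook.
tagging_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowUnifiedStudioTagging",
            "Effect": "Allow",
            "Action": [
                "sagemaker:AddTags",
                "sagemaker:ListPipelineExecutionSteps",
                "sagemaker:DescribeModelPackage",
                "sagemaker:DescribeModelPackageGroup",
            ],
            "Resource": "*",
        }
    ],
}

# Serialize so the document can be pasted into the IAM console or a template
policy_json = json.dumps(tagging_policy, indent=2)
print(policy_json)
```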

## Section 1: Creating and Executing the Pipeline

In this section, we define and execute a SageMaker pipeline that performs the following steps:

1. **Processing Step**: Preprocesses the input data.
2. **Training Step**: Trains an XGBoost model using the preprocessed data.
3. **Evaluation Step**: Evaluates the trained model using a test dataset.
4. **Conditional Step**: Checks if the model's mean squared error (MSE) is at or below a specified threshold.
   - If yes, proceeds to:
     - **Register Model Step**: Registers the model in the SageMaker model registry.
     - **Create Model Step**: Creates a SageMaker model.
     - **Batch Transform Step**: Performs batch transformation using the created model.
   - If no, fails the pipeline execution.

This pipeline demonstrates a complete machine learning workflow, from data preprocessing to model registration and batch inference, with a conditional check to ensure model quality.
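
The ordering implied by the list above can be sketched as a dependency graph. This is only an illustration in plain Python: the step labels and edges are a reading of the list, not something produced by the SageMaker SDK, which derives the real graph from the property references between steps.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each key maps to the set of steps it waits on. Labels and edges are
# illustrative assumptions based on the step list above.
dependencies = {
    "Processing": set(),
    "Training": {"Processing"},
    "Evaluation": {"Processing", "Training"},
    "Conditional": {"Evaluation"},
    # "If yes" branch
    "RegisterModel": {"Conditional"},
    "CreateModel": {"Conditional"},
    "BatchTransform": {"CreateModel"},
    # "If no" branch
    "Fail": {"Conditional"},
}

# static_order() yields every step after all of its prerequisites
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

Only one of the two branches runs in a given execution; the graph lists both so that the full shape of the pipeline is visible.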

### Utility Function

We define a helper function to create unique names with timestamps for resources.

In [1]:
from datetime import datetime

def create_name_with_timestamp(prefix):
    timestamp = datetime.now().strftime("%Y%m%d%H%M%S")
    return f"{prefix}{timestamp}"

### Initial Setup

We import necessary libraries and set up SageMaker sessions and roles required for the pipeline.

In [2]:
import sys
import boto3
import sagemaker
from sagemaker.workflow.pipeline_context import PipelineSession

sagemaker_session = sagemaker.session.Session()
region = sagemaker_session.boto_region_name
role = sagemaker.get_execution_role()
pipeline_session = PipelineSession()
default_bucket = sagemaker_session.default_bucket()
model_package_group_name = create_name_with_timestamp("AbaloneModelPackageGroupName")


sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /home/ec2-user/.config/sagemaker/config.yaml


In [3]:
import numpy
import pandas
import sklearn
import xgboost

Note: You have installed the 'manylinux2014' variant of XGBoost. Certain features such as GPU algorithms or federated learning are not available. To use these features, please upgrade to a recent Linux distro with glibc 2.28+, and install the 'manylinux_2_28' variant.


In [4]:
print(boto3.__version__)
print(sagemaker.__version__)
print(pandas.__version__)
print(sklearn.__version__)
print(numpy.__version__)
print(xgboost.__version__)

1.38.35
2.246.0
2.2.3
1.6.1
1.26.4
3.0.2


### Downloading and Uploading Data

We download the Abalone dataset and upload it to an S3 bucket for use in the pipeline.

In [5]:
# Create a local directory for data if it doesn't exist
!mkdir -p data

local_path = "data/abalone-dataset.csv"
s3 = boto3.resource("s3")
s3.Bucket(f"sagemaker-example-files-prod-{region}").download_file(
    "datasets/tabular/uci_abalone/abalone.csv", local_path
)
base_uri = f"s3://{default_bucket}/abalone"
input_data_uri = sagemaker.s3.S3Uploader.upload(
    local_path=local_path,
    desired_s3_uri=base_uri,
)
print(f"Input data uploaded to: {input_data_uri}")

local_path = "data/abalone-dataset-batch"
s3.Bucket(f"sagemaker-servicecatalog-seedcode-{region}").download_file(
    "dataset/abalone-dataset-batch", local_path
)
batch_data_uri = sagemaker.s3.S3Uploader.upload(
    local_path=local_path,
    desired_s3_uri=base_uri,
)
print(f"Batch data uploaded to: {batch_data_uri}")

Input data uploaded to: s3://sagemaker-us-east-1-794038231401/abalone/abalone-dataset.csv
Batch data uploaded to: s3://sagemaker-us-east-1-794038231401/abalone/abalone-dataset-batch


### Defining Pipeline Parameters

Pipeline parameters allow us to pass different values to the pipeline each time it runs, making it flexible and reusable. Here, we define parameters for instance counts, types, approval status, data URIs, and evaluation threshold.

In [6]:
from sagemaker.workflow.parameters import (
    ParameterInteger,
    ParameterString,
    ParameterFloat,
)

processing_instance_count = ParameterInteger(name="ProcessingInstanceCount", default_value=1)
instance_type = ParameterString(name="TrainingInstanceType", default_value="ml.m5.xlarge")
model_approval_status = ParameterString(
    name="ModelApprovalStatus", default_value="PendingManualApproval"
)
input_data = ParameterString(
    name="InputData",
    default_value=input_data_uri,
)
batch_data = ParameterString(
    name="BatchData",
    default_value=batch_data_uri,
)
mse_threshold = ParameterFloat(name="MseThreshold", default_value=6.0)
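
Because these are pipeline parameters, their defaults can be overridden per execution. The sketch below checks a set of overrides against the names defined above before they are used. `build_parameter_overrides` and `PIPELINE_PARAMETER_TYPES` are hypothetical helpers written for this example; the commented call at the end assumes the pipeline has already been created and upserted.

```python
# Parameter names defined in the cell above, with the Python type each expects
PIPELINE_PARAMETER_TYPES = {
    "ProcessingInstanceCount": int,
    "TrainingInstanceType": str,
    "ModelApprovalStatus": str,
    "InputData": str,
    "BatchData": str,
    "MseThreshold": float,
}


def build_parameter_overrides(**overrides):
    """Validate overrides against the known names and types (hypothetical helper)."""
    for name, value in overrides.items():
        if name not in PIPELINE_PARAMETER_TYPES:
            raise KeyError(f"Unknown pipeline parameter: {name}")
        expected = PIPELINE_PARAMETER_TYPES[name]
        if not isinstance(value, expected):
            raise TypeError(
                f"{name} expects {expected.__name__}, got {type(value).__name__}"
            )
    return dict(overrides)


overrides = build_parameter_overrides(MseThreshold=3.0, ModelApprovalStatus="Approved")
print(overrides)

# With an upserted pipeline, the overrides could then be passed as:
# execution = pipeline.start(parameters=overrides)
```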

### Preprocessing Script

This script preprocesses the Abalone dataset, handling data cleaning, feature encoding, and splitting into training, validation, and test sets.

In [7]:
# Create a directory for code if it doesn't exist
!mkdir -p code

In [8]:
%%writefile code/preprocessing.py
import argparse
import os
import requests
import tempfile

import numpy as np
import pandas as pd

from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, OneHotEncoder

# Define column names since the CSV is headerless
feature_columns_names = [
    "sex",
    "length",
    "diameter",
    "height",
    "whole_weight",
    "shucked_weight",
    "viscera_weight",
    "shell_weight",
]
label_column = "rings"

feature_columns_dtype = {
    "sex": str,
    "length": np.float64,
    "diameter": np.float64,
    "height": np.float64,
    "whole_weight": np.float64,
    "shucked_weight": np.float64,
    "viscera_weight": np.float64,
    "shell_weight": np.float64,
}
label_column_dtype = {"rings": np.float64}

def merge_two_dicts(x, y):
    z = x.copy()
    z.update(y)
    return z

if __name__ == "__main__":
    base_dir = "/opt/ml/processing"

    df = pd.read_csv(
        f"{base_dir}/input/abalone-dataset.csv",
        header=None,
        names=feature_columns_names + [label_column],
        dtype=merge_two_dicts(feature_columns_dtype, label_column_dtype),
    )
    numeric_features = list(feature_columns_names)
    numeric_features.remove("sex")
    numeric_transformer = Pipeline(
        steps=[("imputer", SimpleImputer(strategy="median")), ("scaler", StandardScaler())]
    )

    categorical_features = ["sex"]
    categorical_transformer = Pipeline(
        steps=[
            ("imputer", SimpleImputer(strategy="constant", fill_value="missing")),
            ("onehot", OneHotEncoder(handle_unknown="ignore")),
        ]
    )

    preprocess = ColumnTransformer(
        transformers=[
            ("num", numeric_transformer, numeric_features),
            ("cat", categorical_transformer, categorical_features),
        ]
    )

    y = df.pop("rings")
    X_pre = preprocess.fit_transform(df)
    y_pre = y.to_numpy().reshape(len(y), 1)

    X = np.concatenate((y_pre, X_pre), axis=1)

    np.random.shuffle(X)
    train, validation, test = np.split(X, [int(0.7 * len(X)), int(0.85 * len(X))])

    pd.DataFrame(train).to_csv(f"{base_dir}/train/train.csv", header=False, index=False)
    pd.DataFrame(validation).to_csv(
        f"{base_dir}/validation/validation.csv", header=False, index=False
    )
    pd.DataFrame(test).to_csv(f"{base_dir}/test/test.csv", header=False, index=False)

Writing code/preprocessing.py
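
The `np.split` call in the script cuts the shuffled rows at 70% and 85% of the data, which gives roughly a 70/15/15 split. The sketch below reproduces the same index arithmetic with plain Python lists so the resulting sizes are easy to inspect. `split_sizes` is a hypothetical helper, and the 1,000-row input is a stand-in rather than the real dataset.

```python
def split_sizes(n_rows, train_frac=0.7, validation_end_frac=0.85):
    """Return (train, validation, test) row counts for the cut points used above."""
    train_end = int(train_frac * n_rows)
    validation_end = int(validation_end_frac * n_rows)
    return train_end, validation_end - train_end, n_rows - validation_end


rows = list(range(1000))  # stand-in for the shuffled rows
train_n, validation_n, test_n = split_sizes(len(rows))

# Slice at the same two cut points that np.split receives
train = rows[:train_n]
validation = rows[train_n:train_n + validation_n]
test = rows[train_n + validation_n:]
print(len(train), len(validation), len(test))
```

Note that the script shuffles without a fixed seed, so the rows that land in each split differ from run to run.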


### Step 1: Data Processing

We use a SageMaker Processing job to preprocess the input data with the script above.

In [9]:
from sagemaker.sklearn.processing import SKLearnProcessor

framework_version = "1.2-1"

sklearn_processor = SKLearnProcessor(
    framework_version=framework_version,
    instance_type="ml.m5.xlarge",
    instance_count=processing_instance_count,
    base_job_name="sklearn-abalone-process",
    role=role,
    sagemaker_session=pipeline_session,
)

from sagemaker.processing import ProcessingInput, ProcessingOutput
from sagemaker.workflow.steps import ProcessingStep

processor_args = sklearn_processor.run(
    inputs=[
        ProcessingInput(source=input_data, destination="/opt/ml/processing/input"),
    ],
    outputs=[
        ProcessingOutput(output_name="train", source="/opt/ml/processing/train"),
        ProcessingOutput(output_name="validation", source="/opt/ml/processing/validation"),
        ProcessingOutput(output_name="test", source="/opt/ml/processing/test"),
    ],
    code="code/preprocessing.py",
)

step_process = ProcessingStep(name="AbaloneProcess", step_args=processor_args)



### Step 2: Model Training

We train an XGBoost model using the preprocessed training and validation data.

In [10]:
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

model_path = f"s3://{default_bucket}/AbaloneTrain"
image_uri = sagemaker.image_uris.retrieve(
    framework="xgboost",
    region=region,
    version="1.0-1",
    py_version="py3",
    instance_type="ml.m5.xlarge",
)
xgb_train = Estimator(
    image_uri=image_uri,
    instance_type=instance_type,
    instance_count=1,
    output_path=model_path,
    role=role,
    sagemaker_session=pipeline_session,
)
xgb_train.set_hyperparameters(
    objective="reg:squarederror",  # "reg:linear" is a deprecated alias for this objective
    num_round=50,
    max_depth=5,
    eta=0.2,
    gamma=4,
    min_child_weight=6,
    subsample=0.7,
)

train_args = xgb_train.fit(
    inputs={
        "train": TrainingInput(
            s3_data=step_process.properties.ProcessingOutputConfig.Outputs["train"].S3Output.S3Uri,
            content_type="text/csv",
        ),
        "validation": TrainingInput(
            s3_data=step_process.properties.ProcessingOutputConfig.Outputs["validation"].S3Output.S3Uri,
            content_type="text/csv",
        ),
    }
)

from sagemaker.workflow.steps import TrainingStep

step_train = TrainingStep(name="AbaloneTrain", step_args=train_args)

### Step 3: Model Evaluation

We evaluate the trained model using the test dataset and calculate the mean squared error (MSE).

In [11]:
%%writefile code/evaluation.py
import json
import pathlib
import pickle
import tarfile

import joblib
import numpy as np
import pandas as pd
import xgboost

from sklearn.metrics import mean_squared_error

if __name__ == "__main__":
    model_path = "/opt/ml/processing/model/model.tar.gz"
    with tarfile.open(model_path) as tar:
        tar.extractall(path=".")

    # Load the pickled booster and close the file handle when done
    with open("xgboost-model", "rb") as f:
        model = pickle.load(f)

    test_path = "/opt/ml/processing/test/test.csv"
    df = pd.read_csv(test_path, header=None)

    y_test = df.iloc[:, 0].to_numpy()
    df.drop(df.columns[0], axis=1, inplace=True)

    X_test = xgboost.DMatrix(df.values)

    predictions = model.predict(X_test)

    mse = mean_squared_error(y_test, predictions)
    std = np.std(y_test - predictions)
    report_dict = {
        "regression_metrics": {
            "mse": {"value": mse, "standard_deviation": std},
        },
    }

    output_dir = "/opt/ml/processing/evaluation"
    pathlib.Path(output_dir).mkdir(parents=True, exist_ok=True)

    evaluation_path = f"{output_dir}/evaluation.json"
    with open(evaluation_path, "w") as f:
        f.write(json.dumps(report_dict))

Writing code/evaluation.py
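
To make the report format concrete, the sketch below computes the same two metrics without NumPy or scikit-learn and builds the dictionary that the script writes to `evaluation.json`. The target values and predictions are made up for illustration, and `regression_report` is a hypothetical helper, not part of the script.

```python
import json
import math


def regression_report(y_true, y_pred):
    """Build the report structure written by evaluation.py, using only the stdlib."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    n = len(residuals)
    mse = sum(r * r for r in residuals) / n
    mean_residual = sum(residuals) / n
    # Population standard deviation, matching the default behavior of np.std
    std = math.sqrt(sum((r - mean_residual) ** 2 for r in residuals) / n)
    return {"regression_metrics": {"mse": {"value": mse, "standard_deviation": std}}}


# Made-up ring counts and predictions, for illustration only
report = regression_report([9.0, 10.0, 7.0, 12.0], [8.0, 11.0, 7.0, 10.0])
print(json.dumps(report))
```

The `standard_deviation` field is the spread of the residuals, not of the MSE itself, which mirrors what the script computes.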


In [12]:
from sagemaker.processing import ScriptProcessor

script_eval = ScriptProcessor(
    image_uri=image_uri,
    command=["python3"],
    instance_type="ml.m5.xlarge",
    instance_count=1,
    base_job_name="script-abalone-eval",
    role=role,
    sagemaker_session=pipeline_session,
)

eval_args = script_eval.run(
    inputs=[
        ProcessingInput(
            source=step_train.properties.ModelArtifacts.S3ModelArtifacts,
            destination="/opt/ml/processing/model",
        ),
        ProcessingInput(
            source=step_process.properties.ProcessingOutputConfig.Outputs["test"].S3Output.S3Uri,
            destination="/opt/ml/processing/test",
        ),
    ],
    outputs=[
        ProcessingOutput(output_name="evaluation", source="/opt/ml/processing/evaluation"),
    ],
    code="code/evaluation.py",
)

from sagemaker.workflow.properties import PropertyFile

evaluation_report = PropertyFile(
    name="EvaluationReport", output_name="evaluation", path="evaluation.json"
)
step_eval = ProcessingStep(
    name="AbaloneEval",
    step_args=eval_args,
    property_files=[evaluation_report],
)

### Step 4: Create Model

If the model passes the evaluation threshold, we create a SageMaker model from the trained artifacts.

In [13]:
from sagemaker.model import Model

model = Model(
    image_uri=image_uri,
    model_data=step_train.properties.ModelArtifacts.S3ModelArtifacts,
    sagemaker_session=pipeline_session,
    role=role,
)

from sagemaker.workflow.model_step import ModelStep

step_create_model = ModelStep(
    name="AbaloneCreateModel",
    step_args=model.create(instance_type="ml.m5.large", accelerator_type="ml.eia1.medium"),
)

### Step 5: Batch Transformation

We perform batch transformation on the batch data using the created model.

In [14]:
from sagemaker.transformer import Transformer

transformer = Transformer(
    model_name=step_create_model.properties.ModelName,
    instance_type="ml.m5.xlarge",
    instance_count=1,
    output_path=f"s3://{default_bucket}/AbaloneTransform",
)

from sagemaker.inputs import TransformInput
from sagemaker.workflow.steps import TransformStep

step_transform = TransformStep(
    name="AbaloneTransform", transformer=transformer, inputs=TransformInput(data=batch_data)
)

### Step 6: Register Model

We register the model in the SageMaker model registry for versioning and management.

In [15]:
from sagemaker.model_metrics import MetricsSource, ModelMetrics

model_metrics = ModelMetrics(
    model_statistics=MetricsSource(
        s3_uri="{}/evaluation.json".format(
            step_eval.arguments["ProcessingOutputConfig"]["Outputs"][0]["S3Output"]["S3Uri"]
        ),
        content_type="application/json",
    )
)

register_args = model.register(
    content_types=["text/csv"],
    response_types=["text/csv"],
    inference_instances=["ml.t2.medium", "ml.m5.xlarge"],
    transform_instances=["ml.m5.xlarge"],
    model_package_group_name=model_package_group_name,
    approval_status=model_approval_status,
    model_metrics=model_metrics,
)

step_register = ModelStep(name="AbaloneRegisterModel", step_args=register_args)



### Step 7: Conditional Step

This step checks if the MSE is at or below the threshold. If yes, it proceeds with model registration, creation, and transformation; otherwise, it fails the pipeline.

In [16]:
from sagemaker.workflow.fail_step import FailStep
from sagemaker.workflow.functions import Join

step_fail = FailStep(
    name="AbaloneMSEFail",
    error_message=Join(on=" ", values=["Execution failed due to MSE >", mse_threshold]),
)

from sagemaker.workflow.conditions import ConditionLessThanOrEqualTo
from sagemaker.workflow.condition_step import ConditionStep
from sagemaker.workflow.functions import JsonGet

cond_lte = ConditionLessThanOrEqualTo(
    left=JsonGet(
        step_name=step_eval.name,
        property_file=evaluation_report,
        json_path="regression_metrics.mse.value",
    ),
    right=mse_threshold,
)

step_cond = ConditionStep(
    name="AbaloneMSECond",
    conditions=[cond_lte],
    if_steps=[step_register, step_create_model, step_transform],
    else_steps=[step_fail],
)
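
Conceptually, the condition reads one value out of `evaluation.json` by its path and compares it with the threshold. The sketch below mimics that in plain Python. `json_get` is a simplified stand-in for illustration, not the SDK's `JsonGet`, and the report contents are invented.

```python
import json


def json_get(document, json_path):
    """Walk a dotted path such as 'regression_metrics.mse.value' (simplified)."""
    value = document
    for key in json_path.split("."):
        value = value[key]
    return value


# Illustrative report contents; real values come from the evaluation step
evaluation_json = (
    '{"regression_metrics": {"mse": {"value": 4.9, "standard_deviation": 2.1}}}'
)
mse = json_get(json.loads(evaluation_json), "regression_metrics.mse.value")

mse_threshold_value = 6.0  # default of the MseThreshold parameter
# Same comparison as ConditionLessThanOrEqualTo: left <= right
branch = "if_steps" if mse <= mse_threshold_value else "else_steps"
print(mse, branch)
```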

### Defining and Executing the Pipeline

We assemble all steps into a SageMaker pipeline and upsert it. The execution itself is started in Section 2, right after the pipeline has been tagged.

In [17]:
from sagemaker.workflow.pipeline import Pipeline

pipeline_name = create_name_with_timestamp("AbalonePipeline")
pipeline = Pipeline(
    name=pipeline_name,
    parameters=[
        processing_instance_count,
        instance_type,
        model_approval_status,
        input_data,
        batch_data,
        mse_threshold,
    ],
    steps=[step_process, step_train, step_eval, step_cond],
)

import json

definition = json.loads(pipeline.definition())
print("Pipeline definition:", definition)

pipeline_arn = pipeline.upsert(role_arn=role)['PipelineArn']
print(f"Pipeline ARN: {pipeline_arn}")



Pipeline definition: {'Version': '2020-12-01', 'Metadata': {}, 'Parameters': [{'Name': 'ProcessingInstanceCount', 'Type': 'Integer', 'DefaultValue': 1}, {'Name': 'TrainingInstanceType', 'Type': 'String', 'DefaultValue': 'ml.m5.xlarge'}, {'Name': 'ModelApprovalStatus', 'Type': 'String', 'DefaultValue': 'PendingManualApproval'}, {'Name': 'InputData', 'Type': 'String', 'DefaultValue': 's3://sagemaker-us-east-1-794038231401/abalone/abalone-dataset.csv'}, {'Name': 'BatchData', 'Type': 'String', 'DefaultValue': 's3://sagemaker-us-east-1-794038231401/abalone/abalone-dataset-batch'}, {'Name': 'MseThreshold', 'Type': 'Float', 'DefaultValue': 6.0}], 'PipelineExperimentConfig': {'ExperimentName': {'Get': 'Execution.PipelineName'}, 'TrialName': {'Get': 'Execution.PipelineExecutionId'}}, 'Steps': [{'Name': 'AbaloneProcess', 'Type': 'Processing', 'Arguments': {'ProcessingResources': {'ClusterConfig': {'InstanceType': 'ml.m5.xlarge', 'InstanceCount': {'Get': 'Parameters.ProcessingInstanceCount'}, 'Vo
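
The printed definition is one long dictionary, which is hard to read. A small helper can list just the step names and types. The sketch below runs against a hand-written, heavily trimmed stand-in for the definition; `iter_steps` is a hypothetical helper, and the `IfSteps` and `ElseSteps` keys are an assumption about how condition branches are nested, so check them against your own output.

```python
def iter_steps(steps):
    """Yield (name, type) for each step, descending into condition branches."""
    for step in steps:
        yield step["Name"], step["Type"]
        arguments = step.get("Arguments", {})
        # Assumption: condition steps nest their branches under these keys
        for branch_key in ("IfSteps", "ElseSteps"):
            yield from iter_steps(arguments.get(branch_key, []))


# Hand-written, trimmed stand-in for json.loads(pipeline.definition())
sample_definition = {
    "Steps": [
        {"Name": "AbaloneProcess", "Type": "Processing", "Arguments": {}},
        {"Name": "AbaloneTrain", "Type": "Training", "Arguments": {}},
        {
            "Name": "AbaloneMSECond",
            "Type": "Condition",
            "Arguments": {
                "IfSteps": [
                    {"Name": "AbaloneTransform", "Type": "Transform", "Arguments": {}}
                ],
                "ElseSteps": [
                    {"Name": "AbaloneMSEFail", "Type": "Fail", "Arguments": {}}
                ],
            },
        },
    ]
}

for name, step_type in iter_steps(sample_definition["Steps"]):
    print(f"{step_type:<12} {name}")
```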

## Section 2: Tagging Resources for visualization on SageMaker Unified Studio

To visualize the pipeline and its resources in the SageMaker Unified Studio console when running outside a Unified Studio project, we manually tag them. This section:

1. Tags the pipeline.
2. Starts a pipeline execution.
3. Retrieves ARNs of created resources.
4. Tags these resources for visibility.

In [18]:
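# Replace these values with the domain ID and project ID of your own
# SageMaker Unified Studio project
tags = [
    {'Key': 'AmazonDataZoneDomain', 'Value': 'dzd_amkwelbodmx2av'},
    {'Key': 'AmazonDataZoneProject', 'Value': '55v4yp8pzuqs6f'},
]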

client = boto3.client('sagemaker')

response = client.add_tags(
    ResourceArn=pipeline_arn,
    Tags=tags
)
print(f"Tags added to pipeline: {response}")

# Start pipeline execution and wait for completion
execution = pipeline.start()
print(f"Pipeline execution started with ARN: {execution.arn}")
execution.wait()

execution_arn = execution.arn

# Function to retrieve resource ARNs from pipeline execution
def get_pipeline_resource_arns(execution_arn):
    sm = boto3.client('sagemaker')
    try:
        response = sm.list_pipeline_execution_steps(PipelineExecutionArn=execution_arn)
        steps = response['PipelineExecutionSteps']
        arns = []
        
        for step in steps:
            metadata = step.get('Metadata', {})  # tolerate steps that report no metadata
            step_name = step['StepName']
            if 'ProcessingJob' in metadata:
                arns.append({'Step': step_name, 'Type': 'ProcessingJob', 'ARN': metadata['ProcessingJob']['Arn']})
            elif 'TrainingJob' in metadata:
                arns.append({'Step': step_name, 'Type': 'TrainingJob', 'ARN': metadata['TrainingJob']['Arn']})
            elif 'Model' in metadata:
                arns.append({'Step': step_name, 'Type': 'Model', 'ARN': metadata['Model']['Arn']})
            elif 'RegisterModel' in metadata:
                arns.append({'Step': step_name, 'Type': 'ModelPackage', 'ARN': metadata['RegisterModel']['Arn']})
            elif 'TransformJob' in metadata:
                arns.append({'Step': step_name, 'Type': 'TransformJob', 'ARN': metadata['TransformJob']['Arn']})
        
        return arns
    except Exception as e:
        print(f"Error retrieving pipeline steps: {e}")
        return []

resource_arns = get_pipeline_resource_arns(execution_arn)

# Keep a list of (type, ARN) pairs instead of a dict keyed by type: the pipeline
# runs two processing jobs (AbaloneProcess and AbaloneEval), and a dict keyed by
# type would silently keep only one of them.
resources = []
for item in resource_arns:
    if item['Type'] == 'ModelPackage':
        # Tag the Model Package Group rather than the individual model package
        response = client.describe_model_package(ModelPackageName=item['ARN'])
        model_package_group_name = response['ModelPackageGroupName']
        response = client.describe_model_package_group(ModelPackageGroupName=model_package_group_name)
        resources.append(('ModelPackageGroup', response['ModelPackageGroupArn']))
    else:
        resources.append((item['Type'], item['ARN']))

# Tag all retrieved resources
for resource_type, arn in resources:
    print(f"\nTagging {resource_type} with ARN: {arn}")
    response = client.add_tags(
        ResourceArn=arn,
        Tags=tags
    )
    print(f"Tags added: {response}")

Tags added to pipeline: {'Tags': [{'Key': 'AmazonDataZoneDomain', 'Value': 'dzd_amkwelbodmx2av'}, {'Key': 'AmazonDataZoneProject', 'Value': '55v4yp8pzuqs6f'}], 'ResponseMetadata': {'RequestId': 'af9ff4d1-f0a1-47d1-a06e-af8985146f6a', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': 'af9ff4d1-f0a1-47d1-a06e-af8985146f6a', 'content-type': 'application/x-amz-json-1.1', 'content-length': '127', 'date': 'Sat, 28 Jun 2025 07:50:22 GMT'}, 'RetryAttempts': 0}}
Pipeline execution started with ARN: arn:aws:sagemaker:us-east-1:794038231401:pipeline/AbalonePipeline20250628075021/execution/5snd76p7s48z

Tagging TransformJob with ARN: arn:aws:sagemaker:us-east-1:794038231401:transform-job/pipelines-5snd76p7s48z-AbaloneTransform-58yi5S06bm
Tags added: {'Tags': [{'Key': 'AmazonDataZoneDomain', 'Value': 'dzd_amkwelbodmx2av'}, {'Key': 'AmazonDataZoneProject', 'Value': '55v4yp8pzuqs6f'}], 'ResponseMetadata': {'RequestId': 'efc69632-9dfc-4df6-b7bd-67ca1531e71b', 'HTTPStatusCode': 200, 'HTTPHeade
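
After tagging, it can be useful to confirm that each resource actually carries the expected tags. The sketch below does this through the SageMaker `list_tags` call. `missing_tags` is a hypothetical helper, `FakeSageMakerClient` exists only so the example runs offline, and the ARN and tag values are placeholders. For resources with many tags the response is paginated, which this sketch ignores.

```python
def missing_tags(sagemaker_client, resource_arn, expected_tags):
    """Return the expected tags that are not present on the resource."""
    response = sagemaker_client.list_tags(ResourceArn=resource_arn)
    present = {(tag["Key"], tag["Value"]) for tag in response.get("Tags", [])}
    return [tag for tag in expected_tags if (tag["Key"], tag["Value"]) not in present]


class FakeSageMakerClient:
    """Offline stand-in; use boto3.client('sagemaker') against a real account."""

    def __init__(self, tags_by_arn):
        self._tags_by_arn = tags_by_arn

    def list_tags(self, ResourceArn):
        return {"Tags": self._tags_by_arn.get(ResourceArn, [])}


# Placeholder tag values, not real domain or project IDs
expected = [
    {"Key": "AmazonDataZoneDomain", "Value": "example-domain-id"},
    {"Key": "AmazonDataZoneProject", "Value": "example-project-id"},
]

# The fake resource carries only the first tag, so the second is reported missing
fake = FakeSageMakerClient({"arn:example:tagged": expected[:1]})
still_missing = missing_tags(fake, "arn:example:tagged", expected)
print(still_missing)
```

With a real client, the same function could be called once per ARN in the tagging loop, and any non-empty result would point at a resource that will not show up in the project.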

## Conclusion

In this notebook, we have:

1. Defined and executed a SageMaker pipeline that automates a machine learning workflow from data preprocessing to model registration and batch inference.
2. Demonstrated how to tag the pipeline and its resources to make them visible in the SageMaker Unified Studio console when running outside a Unified Studio project.

To view the pipeline and its executions, navigate to the SageMaker Unified Studio console and select the appropriate project. Ensure resources are tagged as shown.