# Cursus: Automatic SageMaker (MODS) Pipeline Compiler

The main contribution of this work is **Cursus**, a **compiler** that automatically generates a **[MODS (Model Training Workflow Operation and Development System) Pipeline](https://w.amazon.com/bin/view/CMLS/Overview/MODS/)** based on two sets of user inputs:
* The **Pipeline DAG (Directed Acyclic Graph)**, which describes the pipeline as a graph
* The **Unified Config JSON**, which provides a central hub from which all user inputs and their associated step information are extracted
    * Run [demo_config](./demo_config.ipynb) first to generate the Unified Config JSON
    * The config JSON is saved in the `./pipeline_config/xxx/` folder

![mods_pipeline_train_eval_calib](./demo/mods_pipeline_train_eval_calib.png)
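
To make the two inputs concrete, the sketch below mimics the idea on a toy scale: a graph (nodes and edges) plus a per-node config dictionary are turned into an ordered build plan. The names `toy_compile`, `nodes`, `edges`, and `configs` are hypothetical and are not part of the Cursus API; the real compiler builds SageMaker steps rather than strings, and its internals may differ from this illustration.

```python
from collections import deque


def toy_compile(nodes, edges, configs):
    """Return build-plan strings in dependency order (Kahn's algorithm)."""
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for src, dst in edges:
        children[src].append(dst)
        indegree[dst] += 1

    # Start with nodes that have no upstream dependency.
    ready = deque(n for n in nodes if indegree[n] == 0)
    plan = []
    while ready:
        node = ready.popleft()
        if node not in configs:
            raise KeyError(f"no config for DAG node {node!r}")
        plan.append(f"{node} <- {configs[node]}")
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)

    # Any node left unvisited means the graph is not acyclic.
    if len(plan) != len(nodes):
        raise ValueError("graph contains a cycle")
    return plan


# Hypothetical three-step pipeline; config values are placeholder strings.
plan = toy_compile(
    nodes=["load", "preprocess", "train"],
    edges=[("load", "preprocess"), ("preprocess", "train")],
    configs={"load": "LoadConfig", "preprocess": "PrepConfig", "train": "TrainConfig"},
)
print(plan)
```

Keeping the graph and the configs as separate inputs means the same DAG can be reused with a different config file, which is the separation the rest of this notebook relies on.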


In [1]:
#!pip uninstall -y rpds-py
#!pip install rpds-py --force-reinstall

In [2]:
#!pip install amzn-secure-ai-sandbox-workflow-python-sdk --ignore-installed

In [3]:
#!pip install amzn-mods-workflow-helper amzn-mods-python-sdk --upgrade

In [4]:
import os
import json
import pandas as pd
import pickle
import sys
import subprocess
from datetime import datetime

from pathlib import Path

In [5]:
from pydantic import BaseModel, Field, model_validator, field_validator
from typing import List, Optional, Dict, Any, Type, Union, Tuple

In [6]:
from collections import defaultdict, deque

In [7]:
import logging

In [8]:
logging.basicConfig(
    level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s"
)
logger = logging.getLogger(__name__)

## Environment Setup

In [9]:
from sagemaker import Session
from secure_ai_sandbox_python_lib.session import Session as SaisSession

2025-12-02 07:09:25,760 - INFO - Found credentials from IAM Role: BaseNotebookInstanceEc2InstanceRole


sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /home/ec2-user/.config/sagemaker/config.yaml


2025-12-02 07:09:26,153 - INFO - CA certs are provided via the AmazonCACerts installation at /home/ec2-user/.local/lib/python3.10/site-packages/amazoncerts


In [10]:
from mods_workflow_helper.utils.secure_session import create_secure_session_config
from mods_workflow_helper.sagemaker_pipeline_helper import SecurityConfig

from sagemaker.workflow.pipeline_context import PipelineSession

In [11]:
# Initialize session with team bucket
sais_session = SaisSession(".")

security_config = SecurityConfig(
    kms_key=sais_session.get_team_owned_bucket_kms_key(),
    security_group=sais_session.sandbox_vpc_security_group(),
    vpc_subnets=sais_session.sandbox_vpc_subnets(),
)

2025-12-02 07:09:26,751 - INFO - Found credentials from IAM Role: BaseNotebookInstanceEc2InstanceRole
2025-12-02 07:09:27,574 - INFO - successfully patched module botocore


In [12]:
sagemaker_config = create_secure_session_config(
    role_arn=PipelineSession().get_caller_identity_arn(),
    # If you are uploading to Andes, use cradle_read_s3_bucket_name() and get_cradle_read_bucket_kms_key() respectively
    bucket_name=sais_session.team_owned_s3_bucket_name(),
    kms_key=sais_session.get_team_owned_bucket_kms_key(),
    vpc_subnet_ids=sais_session.sandbox_vpc_subnets(),
    vpc_security_groups=[sais_session.sandbox_vpc_security_group()],
)

2025-12-02 07:09:27,595 - INFO - Found credentials from IAM Role: BaseNotebookInstanceEc2InstanceRole
2025-12-02 07:09:27,975 - INFO - There is no MODS workflow execution id provided, this is probably because you are running your pipeline outside of MODS.


In [13]:
pipeline_session = PipelineSession(
    default_bucket=sais_session.team_owned_s3_bucket_name(),
    sagemaker_config=sagemaker_config,
)  # IMPORTANT: the session now uses the generated sagemaker_config

2025-12-02 07:09:27,996 - INFO - Found credentials from IAM Role: BaseNotebookInstanceEc2InstanceRole


In [14]:
pipeline_session.config = sagemaker_config

In [15]:
bucket = sais_session.team_owned_s3_bucket_name()
bucket

'sandboxdependency-abuse-secureaisandboxteamshare-1l77v9am252um'

In [16]:
role = PipelineSession().get_caller_identity_arn()
role

2025-12-02 07:09:28,545 - INFO - Found credentials from IAM Role: BaseNotebookInstanceEc2InstanceRole


'arn:aws:iam::601857636239:role/SandboxRole-lukexie-us-east-1'

In [17]:
from pathlib import Path
import sys

# Get parent directory of current notebook
project_root = str(Path().absolute().parent)
print(f"project root {project_root}")
if project_root not in sys.path:
    sys.path.insert(0, project_root)
    print(f"add project root {project_root} into system")

project root /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template
add project root /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template into system


## Basic Information

In [18]:
region_list = ["NA", "EU", "FE"]

In [19]:
region_selection = 0

In [20]:
region = region_list[region_selection]
region

'NA'

In [21]:
MODEL_CLASS = "pytorch"

In [22]:
service_name = "BuyerAbuseRnR"

### Config and Hyperparameter Information

In [23]:
current_dir = Path.cwd()
config_dir = Path(current_dir) / "pipeline_config"
print(config_dir)

/home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/rnr_pytorch_bedrock/pipeline_config


In [24]:
# hyparam_filename = f'hyperparameters_{region}_{MODEL_CLASS}.json' #'hyperparameters.json'

In [25]:
pipeline_config_name = "config.json"  # alternative: f"config_{region}.json"
pipeline_config_name

'config.json'

In [26]:
config_path = config_dir / pipeline_config_name

In [27]:
config_path

PosixPath('/home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/rnr_pytorch_bedrock/pipeline_config/config.json')
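
Because the config file only exists after the `demo_config` notebook has been run, it can help to fail early with a clear message. The helper below is a minimal sketch of such a guard; `require_config` is a hypothetical name, the demo uses a throwaway temporary file, and the key in it is made up for illustration (the real Unified Config JSON has its own structure).

```python
import json
import tempfile
from pathlib import Path


def require_config(path: Path) -> dict:
    """Load a JSON config, failing early with a hint if it is missing."""
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found; run the demo_config notebook first to generate it"
        )
    with path.open("r", encoding="utf-8") as fh:
        return json.load(fh)


# Demonstrate with a throwaway file; "example_key" is made up for illustration.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "config.json"
    demo_path.write_text(json.dumps({"example_key": 1}), encoding="utf-8")
    loaded = require_config(demo_path)

    try:
        require_config(Path(tmp) / "absent.json")
        missing_raised = False
    except FileNotFoundError:
        missing_raised = True

print(loaded, missing_raised)
```

In this notebook the equivalent call would be `require_config(config_path)` before any config is loaded.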

## Pipeline Imports

In [28]:
from enum import Enum
from pydantic import BaseModel

## [Optional]: Test Config Load Functionality

Skip this section if you are not concerned with inspecting the loaded config information.

### Hyperparameters

In [29]:
# from cursus.steps.hyperparams.hyperparameters_xgboost import XGBoostModelHyperparameters

In [30]:
# hyparam_path = config_dir / hyparam_filename
# with open(hyparam_path, 'r') as file:
#    hyperparam_dict = json.load(file)

In [31]:
# hyperparams = XGBoostModelHyperparameters(**hyperparam_dict)

In [32]:
# hyperparams.num_classes

In [33]:
# hyperparams.is_binary

### Import Configs

In [34]:
from cursus.core.base.config_base import BasePipelineConfig

In [35]:
# from cursus.steps.configs.config_cradle_data_loading_step import (CradleDataLoadingConfig,
#                                                    MdsDataSourceConfig,
#                                                    EdxDataSourceConfig,
#                                                    DataSourceConfig,
#                                                    DataSourcesSpecificationConfig,
#                                                    JobSplitOptionsConfig,
#                                                    TransformSpecificationConfig,
#                                                    OutputSpecificationConfig,
#                                                    CradleJobSpecificationConfig
#                                                   )

In [36]:
from cursus.steps.configs.config_dummy_data_loading_step import DummyDataLoadingConfig
from cursus.steps.configs.config_tabular_preprocessing_step import (
    TabularPreprocessingConfig,
)
from cursus.steps.configs.config_pytorch_training_step import PyTorchTrainingConfig
from cursus.steps.configs.config_pytorch_model_eval_step import PyTorchModelEvalConfig
from cursus.steps.configs.config_model_calibration_step import ModelCalibrationConfig
from cursus.steps.configs.config_package_step import PackageConfig
from cursus.steps.configs.config_payload_step import PayloadConfig
from cursus.steps.configs.config_registration_step import RegistrationConfig

### Load Config

In [37]:
from cursus.steps.configs.utils import (
    serialize_config,
    merge_and_save_configs,
    load_configs,
    verify_configs,
)

In [38]:
CONFIG_CLASSES = {
    "DummyDataLoadingConfig": DummyDataLoadingConfig,
    "TabularPreprocessingConfig": TabularPreprocessingConfig,
    "PyTorchTrainingConfig": PyTorchTrainingConfig,
    "PyTorchModelEvalConfig": PyTorchModelEvalConfig,
    "ModelCalibrationConfig": ModelCalibrationConfig,
    "PackageConfig": PackageConfig,
    "PayloadConfig": PayloadConfig,
    "RegistrationConfig": RegistrationConfig,
}

In [39]:
config_path

PosixPath('/home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/rnr_pytorch_bedrock/pipeline_config/config.json')

In [40]:
# Load configs
loaded_configs = load_configs(config_path, CONFIG_CLASSES)

2025-12-02 07:09:29,609 - INFO - Loading configs from /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/rnr_pytorch_bedrock/pipeline_config/config.json
2025-12-02 07:09:29,609 - INFO - Loading configuration from /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/rnr_pytorch_bedrock/pipeline_config/config.json
2025-12-02 07:09:29,610 - INFO - Successfully loaded configuration from /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/rnr_pytorch_bedrock/pipeline_config/config.json
2025-12-02 07:09:29,610 - INFO - Successfully loaded configs from /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/rnr_pytorch_bedrock/pipeline_config/config.json with 10 specific configs
2025-12-02 07:09:29,611 - INFO - Creating additional config instance for DummyDataLoading_calibration (DummyDataLoadingConfig)
2025-12-02 07:09:29,612 - INFO - 🔧 BuilderAutoDiscovery.__init__ starting - package_root:

In [41]:
loaded_configs

{'DummyDataLoading_calibration': DummyDataLoadingConfig(author='lukexie', bucket='sandboxdependency-abuse-secureaisandboxteamshare-1l77v9am252um', role='arn:aws:iam::601857636239:role/SandboxRole-lukexie-us-east-1', region='NA', service_name='BuyerAbuseRnR', pipeline_version='1.0.0', model_class='pytorch', current_date='2025-12-02', framework_version='2.1.0', py_version='py310', source_dir='dockers', enable_caching=False, use_secure_pypi=True, max_runtime_seconds=172800, project_root_folder='rnr_pytorch_bedrock', processing_instance_count=1, processing_volume_size=500, processing_instance_type_large='ml.m5.12xlarge', processing_instance_type_small='ml.m5.4xlarge', use_large_processing_instance=True, processing_source_dir='dockers/scripts', processing_entry_point='dummy_data_loading.py', processing_script_arguments=None, processing_framework_version='1.2-1', data_source='/home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/rnr_pytorch_bedrock/processed_data/test

In [42]:
len(loaded_configs)

10

In [43]:
[str(k) for k in loaded_configs.keys()]

['DummyDataLoading_calibration',
 'DummyDataLoading_training',
 'ModelCalibration_calibration',
 'Package',
 'Payload',
 'PyTorchModelEval_calibration',
 'PyTorchTraining',
 'Registration',
 'TabularPreprocessing_calibration',
 'TabularPreprocessing_training']
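
The keys above appear to follow a `<StepType>` or `<StepType>_<job_type>` pattern, so one config class can back several step instances (for example, training and calibration variants). This is inferred from the output, not from the Cursus documentation. The hypothetical helper `split_step_name` below sketches how such keys could be grouped; it is not the resolution logic Cursus itself uses.

```python
from collections import defaultdict


def split_step_name(name):
    """Split 'StepType_jobtype' into (step_type, job_type); job_type may be None."""
    step_type, sep, job_type = name.partition("_")
    return step_type, (job_type if sep else None)


# Keys copied from the output above.
config_keys = [
    "DummyDataLoading_calibration",
    "DummyDataLoading_training",
    "ModelCalibration_calibration",
    "Package",
    "Payload",
    "PyTorchModelEval_calibration",
    "PyTorchTraining",
    "Registration",
    "TabularPreprocessing_calibration",
    "TabularPreprocessing_training",
]

# Group job-type variants under their step type.
variants = defaultdict(list)
for key in config_keys:
    step_type, job_type = split_step_name(key)
    variants[step_type].append(job_type)

for step_type, job_types in variants.items():
    print(step_type, job_types)
```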

In [44]:
first_config = next(iter(loaded_configs.values()))

In [45]:
PIPELINE_VERSION = first_config.pipeline_version

In [46]:
PIPELINE_DESCRIPTION = first_config.pipeline_description

In [47]:
PIPELINE_NAME = first_config.pipeline_name

## Parameter Setup

In [48]:
from mods_workflow_core.utils.constants import (
    PIPELINE_EXECUTION_TEMP_DIR,
    KMS_ENCRYPTION_KEY_PARAM,
    PROCESSING_JOB_SHARED_NETWORK_CONFIG,
    SECURITY_GROUP_ID,
    VPC_SUBNET,
)

### Execution Id

In [49]:
execution_id = datetime.now().strftime("%Y%m%d%H%M%S")

## Import Packages

In [50]:
from abc import ABC, abstractmethod
from typing import Dict, List, Any, Optional, Type
from pathlib import Path
import logging
import os
import importlib

In [51]:
import sagemaker
from sagemaker import Session
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.parameters import ParameterString
from sagemaker.workflow.parameters import Parameter
from sagemaker.workflow.properties import Properties
from sagemaker.workflow.pipeline_context import PipelineSession  # Crucial import

## Demo: An End-to-End Pipeline based on PipelineDAG Compiler
As an example, we use the following DAG, which covers training, evaluation, calibration, packaging, payload generation, and registration.


This demo takes several user inputs:
* the **Unified Config JSON** file at `config_path`
* the **Registry Manager**: an object that maps each step's logical name to its `step.properties`
* the **Dependency Resolver**: an object that handles *automatic dependency resolution* between steps
* the other fields
    * `sagemaker_session`: the pipeline session
    * `role`: the IAM role
    * `notebook_root`: tracks the root path
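
To illustrate what automatic dependency resolution between steps can look like, the sketch below matches a step's input name against the output names of its upstream steps and keeps the highest-scoring candidate. This is an assumption-laden toy: it scores candidates with `difflib.SequenceMatcher` string similarity, which is a stand-in and not the scoring that the Cursus Dependency Resolver applies, and `best_match` and `upstream_outputs` are hypothetical names.

```python
from difflib import SequenceMatcher


def best_match(input_name, upstream_outputs):
    """Return (step, output, score) for the upstream output most similar to input_name."""
    best = None
    for step, outputs in upstream_outputs.items():
        for output in outputs:
            # Stand-in score: plain string similarity in [0, 1].
            score = SequenceMatcher(None, input_name.lower(), output.lower()).ratio()
            if best is None or score > best[2]:
                best = (step, output, score)
    return best


# Hypothetical upstream steps and their logical output names.
upstream_outputs = {
    "TabularPreprocessing-Training": ["processed_data"],
    "PyTorchTraining": ["model_output"],
}

model_source = best_match("model_input", upstream_outputs)
exact_source = best_match("DATA", {"DummyDataLoading-Calibration": ["DATA"]})

print(model_source[:2])
print(exact_source)
```

A resolver of this kind lets the DAG stay purely structural: edges say which steps are connected, and the matching decides which output feeds which input.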


In this pipeline template, we inherit from base class `PipelineTemplateBase`. 

The **major tasks** are
* *`Config` Classes Import*
* *Configuration Validation*
* *Step Builder Retrieval and Step Builder Map Creation*
* *Configuration Map Creation*
* **Pipeline DAG Generation**: ideally, the user creates this DAG and passes it in as input
* **Automatic Pipeline Assembly**: call `pipeline_assembler`


### DAG to Template Compiler

In [52]:
from cursus.api.dag.base_dag import PipelineDAG
from cursus.core.compiler.dag_compiler import (
    compile_dag_to_pipeline,
    PipelineDAGCompiler,
)
from cursus.core.compiler.validation import ConversionReport
from cursus.steps.configs.utils import load_configs

In [53]:
def create_dummy_pytorch_training_dag() -> PipelineDAG:
    """
    Create a DAG for a PyTorch end-to-end pipeline that uses dummy data loading.

    This DAG represents a complete end-to-end workflow with three stages:
    1. Training: dummy data loading, tabular preprocessing, and PyTorch training
    2. Calibration: data loading and preprocessing of the calibration data,
       model evaluation, and model calibration
    3. Output: packaging, payload generation, and registration

    Returns:
        PipelineDAG: The directed acyclic graph for the pipeline
    """
    dag = PipelineDAG()

    # Add all nodes - matching the structure from demo_config.ipynb
    dag.add_node("DummyDataLoading_training")  # Training data loading
    dag.add_node("TabularPreprocessing_training")  # Training data preprocessing
    dag.add_node("PyTorchTraining")  # PyTorch model training

    dag.add_node("DummyDataLoading_calibration")  # Dummy data load for calibration
    dag.add_node(
        "TabularPreprocessing_calibration"
    )  # Tabular preprocessing for calibration
    dag.add_node("PyTorchModelEval_calibration")  # Model evaluation step
    dag.add_node(
        "ModelCalibration_calibration"
    )  # Model calibration step with calibration variant
    dag.add_node("Package")  # Package step
    dag.add_node("Registration")  # MIMS registration step
    dag.add_node("Payload")  # Payload step

    # Define dependencies - training flow
    dag.add_edge("DummyDataLoading_training", "TabularPreprocessing_training")
    dag.add_edge("TabularPreprocessing_training", "PyTorchTraining")

    # Calibration data flow
    dag.add_edge("DummyDataLoading_calibration", "TabularPreprocessing_calibration")

    # Evaluation flow - needs both the trained model and the calibration data
    dag.add_edge("PyTorchTraining", "PyTorchModelEval_calibration")
    dag.add_edge(
        "TabularPreprocessing_calibration", "PyTorchModelEval_calibration"
    )  # Use preprocessed calibration data

    # Model calibration flow - depends on model evaluation
    dag.add_edge("PyTorchModelEval_calibration", "ModelCalibration_calibration")

    # Output flow
    dag.add_edge("ModelCalibration_calibration", "Package")
    dag.add_edge("PyTorchTraining", "Package")  # Raw model is also input to packaging
    dag.add_edge("PyTorchTraining", "Payload")  # Raw model is input to payload generation
    dag.add_edge("Package", "Registration")
    dag.add_edge("Payload", "Registration")

    logger.info(
        f"Created PyTorch E2E DAG with {len(dag.nodes)} nodes and {len(dag.edges)} edges"
    )
    return dag

In [54]:
dag = create_dummy_pytorch_training_dag()

2025-12-02 07:09:29,857 - INFO - Added node: DummyDataLoading_training
2025-12-02 07:09:29,857 - INFO - Added node: TabularPreprocessing_training
2025-12-02 07:09:29,858 - INFO - Added node: PyTorchTraining
2025-12-02 07:09:29,858 - INFO - Added node: DummyDataLoading_calibration
2025-12-02 07:09:29,859 - INFO - Added node: TabularPreprocessing_calibration
2025-12-02 07:09:29,859 - INFO - Added node: PyTorchModelEval_calibration
2025-12-02 07:09:29,859 - INFO - Added node: ModelCalibration_calibration
2025-12-02 07:09:29,859 - INFO - Added node: Package
2025-12-02 07:09:29,860 - INFO - Added node: Registration
2025-12-02 07:09:29,860 - INFO - Added node: Payload
2025-12-02 07:09:29,860 - INFO - Added edge: DummyDataLoading_training -> TabularPreprocessing_training
2025-12-02 07:09:29,861 - INFO - Added edge: TabularPreprocessing_training -> PyTorchTraining
2025-12-02 07:09:29,861 - INFO - Added edge: DummyDataLoading_calibration -> TabularPreprocessing_calibration
2025-12-02 07:09:29,8
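
As a quick sanity check on the graph structure, the edge list can be inspected with plain Python, independent of `PipelineDAG`. The sketch below copies the edges passed to `add_edge` in the function above and derives each node's upstream dependencies, the entry points (nodes with no incoming edge), and the final steps (nodes with no outgoing edge). The variable names are illustrative only.

```python
from collections import defaultdict

# Edges copied from the add_edge calls in the DAG-building function.
edges = [
    ("DummyDataLoading_training", "TabularPreprocessing_training"),
    ("TabularPreprocessing_training", "PyTorchTraining"),
    ("DummyDataLoading_calibration", "TabularPreprocessing_calibration"),
    ("PyTorchTraining", "PyTorchModelEval_calibration"),
    ("TabularPreprocessing_calibration", "PyTorchModelEval_calibration"),
    ("PyTorchModelEval_calibration", "ModelCalibration_calibration"),
    ("ModelCalibration_calibration", "Package"),
    ("PyTorchTraining", "Package"),
    ("PyTorchTraining", "Payload"),
    ("Package", "Registration"),
    ("Payload", "Registration"),
]

# Map each node to the list of nodes it depends on.
parents = defaultdict(list)
nodes = set()
for src, dst in edges:
    parents[dst].append(src)
    nodes.update((src, dst))

has_child = {src for src, _ in edges}
sources = sorted(n for n in nodes if n not in parents)  # no incoming edges
sinks = sorted(n for n in nodes if n not in has_child)  # no outgoing edges

print("entry points:", sources)
print("final steps:", sinks)
print("Package depends on:", parents["Package"])
```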

In [55]:
pipeline_parameters = [
    PIPELINE_EXECUTION_TEMP_DIR,
    KMS_ENCRYPTION_KEY_PARAM,
    SECURITY_GROUP_ID,
    VPC_SUBNET,
]

In [56]:
dag_compiler = PipelineDAGCompiler(
    config_path=config_path,
    sagemaker_session=pipeline_session,
    role=role,
    pipeline_parameters=pipeline_parameters,
)

2025-12-02 07:09:29,874 - INFO - 🔧 BuilderAutoDiscovery.__init__ starting - package_root: /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/cursus
2025-12-02 07:09:29,875 - INFO - 🔧 BuilderAutoDiscovery.__init__ - workspace_dirs: []
2025-12-02 07:09:29,875 - INFO - ✅ BuilderAutoDiscovery basic initialization complete
2025-12-02 07:09:29,876 - INFO - ✅ Registry info loaded: 43 steps
2025-12-02 07:09:29,876 - INFO - 🎉 BuilderAutoDiscovery initialization completed successfully
2025-12-02 07:09:29,877 - INFO - 🔍 ScriptAutoDiscovery.__init__ starting - package_root: /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/cursus
2025-12-02 07:09:29,877 - INFO - 🔍 ScriptAutoDiscovery.__init__ - workspace_dirs: []
2025-12-02 07:09:29,877 - INFO - 🔍 ScriptAutoDiscovery.__init__ - priority_workspace_dir: None
2025-12-02 07:09:29,878 - INFO - ✅ Registry info loaded: 43 steps
2025-12-02 07:09:29,878 - INFO - 🎉 ScriptAutoD

### Create a Pipeline

#### DAG Validation and Preview of Config Resolution

In [57]:
preview_only = True

In [58]:
if preview_only:
    preview = dag_compiler.preview_resolution(dag)
    logger.info("DAG node resolution preview:")
    for node, config_type in preview.node_config_map.items():
        confidence = preview.resolution_confidence.get(node, 0.0)
        logger.info(f"  {node} → {config_type} (confidence: {confidence:.2f})")

    if preview.recommendations:
        logger.info("Recommendations:")
        for recommendation in preview.recommendations:
            logger.info(f"  - {recommendation}")

    validation = dag_compiler.validate_dag_compatibility(dag)
    logger.info(f"DAG validation: {'VALID' if validation.is_valid else 'INVALID'}")
    if not validation.is_valid:
        if validation.missing_configs:
            logger.warning(f"Missing configs: {validation.missing_configs}")
        if validation.unresolvable_builders:
            logger.warning(f"Unresolvable builders: {validation.unresolvable_builders}")
        if validation.config_errors:
            logger.warning(f"Config errors: {validation.config_errors}")

2025-12-02 07:09:29,893 - INFO - Previewing resolution for 10 DAG nodes
2025-12-02 07:09:29,893 - INFO - Creating template for DAG with 10 nodes
2025-12-02 07:09:29,894 - INFO - 🔧 BuilderAutoDiscovery.__init__ starting - package_root: /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/cursus
2025-12-02 07:09:29,894 - INFO - 🔧 BuilderAutoDiscovery.__init__ - workspace_dirs: []
2025-12-02 07:09:29,895 - INFO - ✅ BuilderAutoDiscovery basic initialization complete
2025-12-02 07:09:29,896 - INFO - ✅ Registry info loaded: 43 steps
2025-12-02 07:09:29,896 - INFO - 🎉 BuilderAutoDiscovery initialization completed successfully
2025-12-02 07:09:29,896 - INFO - 🔍 ScriptAutoDiscovery.__init__ starting - package_root: /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/cursus
2025-12-02 07:09:29,897 - INFO - 🔍 ScriptAutoDiscovery.__init__ - workspace_dirs: []
2025-12-02 07:09:29,897 - INFO - 🔍 ScriptAutoDiscovery.__init__ - p
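
The validation step above reports, among other things, DAG nodes for which no config could be found. The sketch below reproduces just that one check in isolation. `ToyValidation` and `validate_nodes` are hypothetical names; only the attribute names `is_valid` and `missing_configs` mirror those used in the cell above, and the real `validate_dag_compatibility` checks more than this (builders and config errors as well).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyValidation:
    """Toy result object: valid only when every node has a config."""

    missing_configs: List[str] = field(default_factory=list)

    @property
    def is_valid(self) -> bool:
        return not self.missing_configs


def validate_nodes(nodes: List[str], config_map: Dict[str, object]) -> ToyValidation:
    """Collect DAG nodes that have no entry in the config map."""
    return ToyValidation(missing_configs=[n for n in nodes if n not in config_map])


# Placeholder config objects; in the notebook these would be loaded config instances.
config_map = {"PyTorchTraining": object(), "Package": object()}

ok = validate_nodes(["PyTorchTraining", "Package"], config_map)
bad = validate_nodes(["PyTorchTraining", "Payload"], config_map)

print(ok.is_valid, ok.missing_configs)
print(bad.is_valid, bad.missing_configs)
```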

### Put it Together: Pipeline Generation from DAG

In [59]:
# Convert DAG to pipeline and get report
try:
    logger.info("Converting DAG to pipeline")
    template_pipeline, report = dag_compiler.compile_with_report(dag=dag)

    # Log report summary
    logger.info(f"Conversion complete: {report.summary()}")
    for node, details in report.resolution_details.items():
        logger.info(f"  {node} → {details['config_type']} ({details['builder_type']})")

    # Log pipeline creation details
    logger.info(f"Pipeline '{template_pipeline.name}' created successfully")
    # The ARN only exists once the pipeline has been upserted
    pipeline_arn = getattr(template_pipeline, "arn", "Not available until upserted")
    logger.info(f"Pipeline ARN: {pipeline_arn}")
    logger.info("To upsert the pipeline, call template_pipeline.upsert()")
except Exception as e:
    logger.error(f"Failed to convert DAG to pipeline: {e}")
    raise

2025-12-02 07:09:30,055 - INFO - Converting DAG to pipeline
2025-12-02 07:09:30,055 - INFO - Compiling DAG with detailed reporting
2025-12-02 07:09:30,056 - INFO - Compiling DAG with 10 nodes to pipeline
2025-12-02 07:09:30,056 - INFO - Creating template for DAG with 10 nodes
2025-12-02 07:09:30,056 - INFO - Loading configs from: /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/rnr_pytorch_bedrock/pipeline_config/config.json
2025-12-02 07:09:30,057 - INFO - 🔧 BuilderAutoDiscovery.__init__ starting - package_root: /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/cursus
2025-12-02 07:09:30,057 - INFO - 🔧 BuilderAutoDiscovery.__init__ - workspace_dirs: []
2025-12-02 07:09:30,058 - INFO - ✅ BuilderAutoDiscovery basic initialization complete
2025-12-02 07:09:30,058 - INFO - ✅ Registry info loaded: 43 steps
2025-12-02 07:09:30,059 - INFO - 🎉 BuilderAutoDiscovery initialization completed successfully
2025-12-02 07:09:30,05

sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.ProcessingOutputConfig.KmsKeyId
sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.ProcessingResources.ClusterConfig.VolumeKmsKeyId
sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.NetworkConfig.VpcConfig.Subnets
sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.NetworkConfig.VpcConfig.SecurityGroupIds


2025-12-02 07:09:31,192 - INFO - Using user-provided data source from configuration: /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/rnr_pytorch_bedrock/processed_data/ -> /opt/ml/processing/input/data
2025-12-02 07:09:31,192 - INFO - No command-line arguments needed for dummy data loading script
2025-12-02 07:09:31,193 - INFO - Using script path: /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/rnr_pytorch_bedrock/dockers/scripts/dummy_data_loading.py
2025-12-02 07:09:31,194 - INFO - Created ProcessingStep with name: DummyDataLoading-Training
2025-12-02 07:09:31,194 - INFO - Built step DummyDataLoading_training
2025-12-02 07:09:31,195 - INFO - Using execution_prefix for base output path
2025-12-02 07:09:31,195 - INFO - Creating Dummy Data Loading ProcessingStep...
2025-12-02 07:09:31,196 - INFO - Added configuration environment variables: {...}
2025-12-02 07:09:31,197 - INFO - Final dummy data loading environment variables: {.

sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.ProcessingOutputConfig.KmsKeyId
sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.ProcessingResources.ClusterConfig.VolumeKmsKeyId
sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.NetworkConfig.VpcConfig.Subnets
sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.NetworkConfig.VpcConfig.SecurityGroupIds


2025-12-02 07:09:32,492 - INFO - Using user-provided data source from configuration: /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/rnr_pytorch_bedrock/processed_data/test -> /opt/ml/processing/input/data
2025-12-02 07:09:32,493 - INFO - No command-line arguments needed for dummy data loading script
2025-12-02 07:09:32,493 - INFO - Using script path: /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/rnr_pytorch_bedrock/dockers/scripts/dummy_data_loading.py
2025-12-02 07:09:32,494 - INFO - Created ProcessingStep with name: DummyDataLoading-Calibration
2025-12-02 07:09:32,495 - INFO - Built step DummyDataLoading_calibration
2025-12-02 07:09:32,496 - INFO - Using execution_prefix for base output path
2025-12-02 07:09:32,496 - INFO - Registered specification for step 'TabularPreprocessingStepStep' of type 'TabularPreprocessing' in context 'lukexie-BuyerAbuseRnR-pytorch-NA'
2025-12-02 07:09:32,497 - INFO - Registered specification f

sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.ProcessingOutputConfig.KmsKeyId
sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.ProcessingResources.ClusterConfig.VolumeKmsKeyId
sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.NetworkConfig.VpcConfig.Subnets
sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.NetworkConfig.VpcConfig.SecurityGroupIds


2025-12-02 07:09:33,527 - INFO - Setting job_type argument to: training
2025-12-02 07:09:33,528 - INFO - Using script path: /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/rnr_pytorch_bedrock/dockers/scripts/tabular_preprocessing.py
2025-12-02 07:09:33,529 - INFO - Built step TabularPreprocessing_training
2025-12-02 07:09:33,529 - INFO - Using execution_prefix for base output path
2025-12-02 07:09:33,530 - INFO - Registered specification for step 'TabularPreprocessingStepStep' of type 'TabularPreprocessing' in context 'lukexie-BuyerAbuseRnR-pytorch-NA'
2025-12-02 07:09:33,530 - INFO - Registered specification for step 'DummyDataLoading-Calibration' of type 'DummyDataLoading' in context 'lukexie-BuyerAbuseRnR-pytorch-NA'
2025-12-02 07:09:33,531 - INFO - Best match for DATA: DummyDataLoading-Calibration.DATA (confidence: 1.000)
2025-12-02 07:09:33,531 - INFO - Resolved TabularPreprocessingStepStep.DATA -> DummyDataLoading-Calibration.DATA
2025-12-02 07:09:33

sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.ProcessingOutputConfig.KmsKeyId
sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.ProcessingResources.ClusterConfig.VolumeKmsKeyId
sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.NetworkConfig.VpcConfig.Subnets
sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.NetworkConfig.VpcConfig.SecurityGroupIds


2025-12-02 07:09:34,566 - INFO - Setting job_type argument to: calibration
2025-12-02 07:09:34,566 - INFO - Using script path: /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/rnr_pytorch_bedrock/dockers/scripts/tabular_preprocessing.py
2025-12-02 07:09:34,567 - INFO - Built step TabularPreprocessing_calibration
2025-12-02 07:09:34,568 - INFO - Using execution_prefix for base output path
2025-12-02 07:09:34,568 - INFO - Creating PyTorch TrainingStep...
2025-12-02 07:09:34,569 - INFO - Registered specification for step 'PyTorchTrainingStepStep' of type 'PyTorchTraining' in context 'lukexie-BuyerAbuseRnR-pytorch-NA'
2025-12-02 07:09:34,570 - INFO - Registered specification for step 'TabularPreprocessing-Training' of type 'TabularPreprocessing' in context 'lukexie-BuyerAbuseRnR-pytorch-NA'
2025-12-02 07:09:34,571 - INFO - Best match for input_path: TabularPreprocessing-Training.processed_data (confidence: 0.814)
2025-12-02 07:09:34,571 - INFO - Resolved PyTorc

sagemaker.config INFO - Applied value from config key = SageMaker.TrainingJob.OutputDataConfig.KmsKeyId
sagemaker.config INFO - Applied value from config key = SageMaker.TrainingJob.ResourceConfig.VolumeKmsKeyId
sagemaker.config INFO - Applied value from config key = SageMaker.TrainingJob.VpcConfig.Subnets
sagemaker.config INFO - Applied value from config key = SageMaker.TrainingJob.VpcConfig.SecurityGroupIds


2025-12-02 07:09:35,871 - INFO - Found credentials from IAM Role: BaseNotebookInstanceEc2InstanceRole
2025-12-02 07:09:35,952 - INFO - Created TrainingStep with name: PyTorchTraining
2025-12-02 07:09:35,953 - INFO - Built step PyTorchTraining
2025-12-02 07:09:35,953 - INFO - Using execution_prefix for base output path
2025-12-02 07:09:35,954 - INFO - Creating PyTorchModelEval ProcessingStep...
2025-12-02 07:09:35,954 - INFO - Registered specification for step 'PyTorchModelEvalStepStep' of type 'PyTorchModelEval' in context 'lukexie-BuyerAbuseRnR-pytorch-NA'
2025-12-02 07:09:35,955 - INFO - Registered specification for step 'PyTorchTraining' of type 'PyTorchTraining' in context 'lukexie-BuyerAbuseRnR-pytorch-NA'
2025-12-02 07:09:35,955 - INFO - Registered specification for step 'TabularPreprocessing-Calibration' of type 'TabularPreprocessing' in context 'lukexie-BuyerAbuseRnR-pytorch-NA'
2025-12-02 07:09:35,956 - INFO - Best match for model_input: PyTorchTraining.model_output (confidenc

sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.ProcessingOutputConfig.KmsKeyId
sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.ProcessingResources.ClusterConfig.VolumeKmsKeyId
sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.NetworkConfig.VpcConfig.Subnets
sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.NetworkConfig.VpcConfig.SecurityGroupIds
sagemaker.config INFO - Applied value from config key = SageMaker.TrainingJob.ResourceConfig.VolumeKmsKeyId
sagemaker.config INFO - Applied value from config key = SageMaker.TrainingJob.VpcConfig.Subnets
sagemaker.config INFO - Applied value from config key = SageMaker.TrainingJob.VpcConfig.SecurityGroupIds
sagemaker.config INFO - Applied value from config key = SageMaker.TrainingJob.Environment


2025-12-02 07:09:38,253 - INFO - image_uri is not presented, retrieving image_uri based on instance_type, framework etc.
2025-12-02 07:09:38,276 - INFO - Setting job_type argument to: calibration
2025-12-02 07:09:38,277 - INFO - Using resolved script path: /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/rnr_pytorch_bedrock/dockers/pytorch_model_eval.py
2025-12-02 07:09:38,277 - INFO - Using entry point: pytorch_model_eval.py
2025-12-02 07:09:38,277 - INFO - Using source directory: /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/rnr_pytorch_bedrock/dockers
2025-12-02 07:09:38,278 - INFO - Created ProcessingStep with name: PyTorchModelEval-Calibration
2025-12-02 07:09:38,279 - INFO - Built step PyTorchModelEval_calibration
2025-12-02 07:09:38,279 - INFO - Using execution_prefix for base output path
2025-12-02 07:09:38,280 - INFO - Creating MIMS Payload ProcessingStep...
2025-12-02 07:09:38,280 - INFO - Registered specification f

sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.ProcessingOutputConfig.KmsKeyId
sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.ProcessingResources.ClusterConfig.VolumeKmsKeyId
sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.NetworkConfig.VpcConfig.Subnets
sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.NetworkConfig.VpcConfig.SecurityGroupIds


2025-12-02 07:09:39,323 - INFO - No command-line arguments needed for payload script
2025-12-02 07:09:39,324 - INFO - Using script path: /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/rnr_pytorch_bedrock/dockers/scripts/payload.py
2025-12-02 07:09:39,325 - INFO - Created ProcessingStep with name: Payload
2025-12-02 07:09:39,325 - INFO - Built step Payload
2025-12-02 07:09:39,325 - INFO - Using execution_prefix for base output path
2025-12-02 07:09:39,326 - INFO - Creating ModelCalibration ProcessingStep...
2025-12-02 07:09:39,326 - INFO - Registered specification for step 'ModelCalibrationStepStep' of type 'ModelCalibration' in context 'lukexie-BuyerAbuseRnR-pytorch-NA'
2025-12-02 07:09:39,327 - INFO - Registered specification for step 'PyTorchModelEval-Calibration' of type 'PyTorchModelEval' in context 'lukexie-BuyerAbuseRnR-pytorch-NA'
2025-12-02 07:09:39,328 - INFO - Best match for evaluation_data: PyTorchModelEval-Calibration.eval_output (confidence: 

sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.ProcessingOutputConfig.KmsKeyId
sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.ProcessingResources.ClusterConfig.VolumeKmsKeyId
sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.NetworkConfig.VpcConfig.Subnets
sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.NetworkConfig.VpcConfig.SecurityGroupIds


2025-12-02 07:09:40,365 - INFO - Setting job_type argument to: calibration
2025-12-02 07:09:40,365 - INFO - Using script path: /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/rnr_pytorch_bedrock/dockers/scripts/model_calibration.py
2025-12-02 07:09:40,366 - INFO - Created ProcessingStep with name: ModelCalibration-Calibration
2025-12-02 07:09:40,366 - INFO - Built step ModelCalibration_calibration
2025-12-02 07:09:40,367 - INFO - Using execution_prefix for base output path
2025-12-02 07:09:40,367 - INFO - Creating Packaging ProcessingStep...
2025-12-02 07:09:40,368 - INFO - Registered specification for step 'PackageStepStep' of type 'Package' in context 'lukexie-BuyerAbuseRnR-pytorch-NA'
2025-12-02 07:09:40,368 - INFO - Registered specification for step 'ModelCalibration-Calibration' of type 'ModelCalibration' in context 'lukexie-BuyerAbuseRnR-pytorch-NA'
2025-12-02 07:09:40,368 - INFO - Registered specification for step 'PyTorchTraining' of type 'PyTorchT

sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.ProcessingOutputConfig.KmsKeyId
sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.ProcessingResources.ClusterConfig.VolumeKmsKeyId
sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.NetworkConfig.VpcConfig.Subnets
sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.NetworkConfig.VpcConfig.SecurityGroupIds


2025-12-02 07:09:41,408 - INFO - Package location discovery succeeded (bundled): /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/rnr_pytorch_bedrock/dockers
2025-12-02 07:09:41,409 - INFO - Hybrid resolution completed successfully via Package Location Discovery: /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/rnr_pytorch_bedrock/dockers
2025-12-02 07:09:41,409 - INFO - Using source dir: /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/rnr_pytorch_bedrock/dockers
2025-12-02 07:09:41,409 - INFO - [PACKAGING INPUT OVERRIDE] Using local inference scripts path from configuration: /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/rnr_pytorch_bedrock/dockers
2025-12-02 07:09:41,410 - INFO - [PACKAGING INPUT OVERRIDE] This local path will be used regardless of any dependency-resolved values
2025-12-02 07:09:41,410 - INFO - Added inference scripts input with local path: /home/ec2

sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.ProcessingOutputConfig.KmsKeyId
sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.ProcessingResources.ClusterConfig.VolumeKmsKeyId
sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.NetworkConfig.VpcConfig.Subnets
sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.NetworkConfig.VpcConfig.SecurityGroupIds
sagemaker.config INFO - Applied value from config key = SageMaker.ProcessingJob.Environment


2025-12-02 07:09:42,434 - INFO - Created MimsModelRegistrationProcessingStep: Registration-NA
2025-12-02 07:09:42,435 - INFO - Built step Registration
2025-12-02 07:09:42,440 - INFO - Generated pipeline lukexie-BuyerAbuseRnR-pytorch-NA-1-0-0-pipeline with 10 steps in 12.32 seconds
2025-12-02 07:09:42,440 - INFO - Stored 10 step instances
2025-12-02 07:09:42,440 - INFO - Pipeline name 'lukexie-BuyerAbuseRnR-pytorch-NA-1.0.0-pipeline' sanitized to 'lukexie-BuyerAbuseRnR-pytorch-NA-1-0-0-pipeline' to conform to SageMaker constraints
2025-12-02 07:09:42,441 - INFO - Successfully compiled DAG to pipeline: lukexie-BuyerAbuseRnR-pytorch-NA-1-0-0-pipeline
2025-12-02 07:09:42,441 - INFO - Previewing resolution for 10 DAG nodes
2025-12-02 07:09:42,441 - INFO - Creating template for DAG with 10 nodes
2025-12-02 07:09:42,442 - INFO - Loading configs from: /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/rnr_pytorch_bedrock/pipeline_config/config.json
2025-12-02 07:09:4
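
The log above reports that the pipeline name was sanitized (the dots in `1.0.0` became hyphens) to conform to SageMaker naming constraints. A minimal sketch of that kind of sanitization follows. The function name `sanitize_pipeline_name` and the 256-character limit are assumptions for illustration; this is not Cursus's actual implementation.

```python
import re


def sanitize_pipeline_name(name: str, max_length: int = 256) -> str:
    """Hypothetical helper: make a name safe to use as a pipeline name.

    Assumes only letters, digits, and hyphens are allowed, and that the
    name must not start or end with a hyphen.
    """
    # Replace every character outside [A-Za-z0-9-] with a hyphen.
    cleaned = re.sub(r"[^A-Za-z0-9-]", "-", name)
    # Truncate first, then strip hyphens left at either end.
    return cleaned[:max_length].strip("-")


sanitized = sanitize_pipeline_name("lukexie-BuyerAbuseRnR-pytorch-NA-1.0.0-pipeline")
print(sanitized)
```

For the name in the log, this sketch yields the same sanitized form that the compiler reported.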

### Pipeline Template

After the pipeline is generated, we can retrieve the pipeline template from the compiler via `get_last_template()`.

In [60]:
pipeline_template_builder = dag_compiler.get_last_template()

## Start Execution

In [61]:
role_arn = pipeline_session.get_caller_identity_arn()
role_arn

'arn:aws:iam::601857636239:role/SandboxRole-lukexie-us-east-1'

In [62]:
pipeline_description = PIPELINE_DESCRIPTION

In [63]:
PIPELINE_DESCRIPTION

'BuyerAbuseRnR pytorch Model NA'

## Prepare the Execution Document

In [64]:
from mods_workflow_helper.sagemaker_pipeline_helper import (
    SagemakerPipelineHelper,
    SecurityConfig,
)

In [65]:
default_execution_doc = SagemakerPipelineHelper.get_pipeline_default_execution_document(
    template_pipeline
)
test_execution_doc = default_execution_doc

In [66]:
print(json.dumps(test_execution_doc, indent=2))

{
  "PIPELINE_STEP_CONFIGS": {
    "PyTorchTraining": {
      "STEP_CONFIG": {},
      "STEP_TYPE": "TRAINING_STEP"
    },
    "Registration-NA": {
      "STEP_CONFIG": {
        "model_domain": "The domain to register your model in (this is where you will find your model on DAWS)",
        "model_objective": "The objective to register your model in (this is where you will find your model on DAWS)",
        "source_model_inference_content_types": "Provide a list of types (application/json and text/csv are currently supported) for the content. Ex) ['text/csv']",
        "source_model_inference_response_types": "Provide a list of types (application/json and text/csv are currently supported) for the response. Ex) ['application/json']",
        "source_model_inference_input_variable_list": "Provide a dictionary mapping the variable name to the variable type (variable types supported are 'TEXT' and 'NUMERIC') for both input and output vars. Ex) {'INVAR': 'TEXT'}",
        "source_model_infe

In [67]:
# with open(config_dir / 'sample_exe_doc.json', 'w') as f:
#    json.dump(default_execution_doc, f, indent=2)
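
The default execution document is a nested dictionary keyed by step name under `PIPELINE_STEP_CONFIGS`. Before filling it in, it can help to list which steps expect configuration. The sketch below assumes only the structure visible in the printed output; `summarize_execution_doc` is a hypothetical helper, and `sample_doc` is a trimmed stand-in with placeholder values, not the real document.

```python
from typing import Any, Dict, List, Tuple


def summarize_execution_doc(doc: Dict[str, Any]) -> List[Tuple[str, str, List[str]]]:
    """Return (step name, step type, sorted config keys) for each step."""
    rows = []
    for step_name, entry in doc.get("PIPELINE_STEP_CONFIGS", {}).items():
        # STEP_TYPE may be absent in a partial document, so fall back to a marker.
        step_type = entry.get("STEP_TYPE", "<not set>")
        config_keys = sorted(entry.get("STEP_CONFIG", {}))
        rows.append((step_name, step_type, config_keys))
    return rows


# Trimmed stand-in for the document printed above (values are placeholders).
sample_doc = {
    "PIPELINE_STEP_CONFIGS": {
        "PyTorchTraining": {"STEP_CONFIG": {}, "STEP_TYPE": "TRAINING_STEP"},
        "Registration-NA": {
            "STEP_CONFIG": {"model_domain": "...", "model_objective": "..."},
        },
    }
}

summary = summarize_execution_doc(sample_doc)
for step_name, step_type, config_keys in summary:
    print(f"{step_name}: type={step_type}, config keys={config_keys}")
```

Steps with an empty key list, such as `PyTorchTraining` here, need no user-supplied values.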

### Fill in Execution Doc

In [68]:
from cursus.mods.exe_doc.generator import ExecutionDocumentGenerator

In [69]:
exe_doc_generator = ExecutionDocumentGenerator(
    config_path=config_path,
    sagemaker_session=pipeline_session,
    role=role,
)

2025-12-02 07:09:42,610 - INFO - 🔧 BuilderAutoDiscovery.__init__ starting - package_root: /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/cursus
2025-12-02 07:09:42,611 - INFO - 🔧 BuilderAutoDiscovery.__init__ - workspace_dirs: []
2025-12-02 07:09:42,611 - INFO - ✅ BuilderAutoDiscovery basic initialization complete
2025-12-02 07:09:42,612 - INFO - ✅ Registry info loaded: 43 steps
2025-12-02 07:09:42,612 - INFO - 🎉 BuilderAutoDiscovery initialization completed successfully
2025-12-02 07:09:42,613 - INFO - 🔍 ScriptAutoDiscovery.__init__ starting - package_root: /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/cursus
2025-12-02 07:09:42,613 - INFO - 🔍 ScriptAutoDiscovery.__init__ - workspace_dirs: []
2025-12-02 07:09:42,613 - INFO - 🔍 ScriptAutoDiscovery.__init__ - priority_workspace_dir: None
2025-12-02 07:09:42,614 - INFO - ✅ Registry info loaded: 43 steps
2025-12-02 07:09:42,614 - INFO - 🎉 ScriptAutoD

In [70]:
execution_doc_fill = exe_doc_generator.fill_execution_document(
    dag=dag, execution_document=test_execution_doc
)

2025-12-02 07:09:42,642 - INFO - Starting execution document generation for DAG with 10 nodes
2025-12-02 07:09:42,642 - INFO - Found exact key match for node 'DummyDataLoading_training'
2025-12-02 07:09:42,642 - INFO - Found exact key match for node 'TabularPreprocessing_training'
2025-12-02 07:09:42,643 - INFO - Found exact key match for node 'PyTorchTraining'
2025-12-02 07:09:42,643 - INFO - Found exact key match for node 'DummyDataLoading_calibration'
2025-12-02 07:09:42,644 - INFO - Found exact key match for node 'TabularPreprocessing_calibration'
2025-12-02 07:09:42,644 - INFO - Found exact key match for node 'PyTorchModelEval_calibration'
2025-12-02 07:09:42,645 - INFO - Found exact key match for node 'ModelCalibration_calibration'
2025-12-02 07:09:42,645 - INFO - Found exact key match for node 'Package'
2025-12-02 07:09:42,645 - INFO - Found exact key match for node 'Registration'
2025-12-02 07:09:42,646 - INFO - Found exact key match for node 'Payload'
2025-12-02 07:09:42,646 -

In [71]:
# execution_doc_fill = pipeline_builder.fill_execution_document(test_execution_doc)

In [72]:
# Fill in the execution document using the stored requests
# execution_doc_fill_2 = xgboost_train_eval_pipeline_template_builder.fill_execution_document(test_execution_doc)

In [73]:
print(json.dumps(execution_doc_fill, indent=2))

{
  "PIPELINE_STEP_CONFIGS": {
    "PyTorchTraining": {
      "STEP_CONFIG": {},
      "STEP_TYPE": "TRAINING_STEP"
    },
    "Registration-NA": {
      "STEP_CONFIG": {
        "source_model_inference_image_arn": "763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference:2.1.0-cpu-py310",
        "model_domain": "BuyerSellerMessaging",
        "model_objective": "RnR_BSM_Model_NA",
        "source_model_inference_content_types": [
          "text/csv"
        ],
        "source_model_inference_response_types": [
          "application/json"
        ],
        "source_model_inference_input_variable_list": [
          [
            "net_conc_amt",
            "NUMERIC"
          ],
          [
            "ttm_conc_amt",
            "NUMERIC"
          ],
          [
            "ttm_conc_count",
            "NUMERIC"
          ],
          [
            "concsi",
            "NUMERIC"
          ],
          [
            "deliverable_flag",
            "NUMERIC"
          ],
    

In [74]:
test_execution_doc = execution_doc_fill.copy()
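
Note that `dict.copy()` makes a shallow copy: the nested `PIPELINE_STEP_CONFIGS` dictionaries are still shared between the two objects. If the copy will be edited independently, `copy.deepcopy` is the safer choice. The sketch below illustrates the difference on a small stand-in dictionary; `example_key` and `example_value` are made-up names for illustration.

```python
import copy

# Small stand-in with the same nesting as the execution document.
filled = {
    "PIPELINE_STEP_CONFIGS": {
        "PyTorchTraining": {"STEP_CONFIG": {}, "STEP_TYPE": "TRAINING_STEP"},
    }
}

shallow = filled.copy()        # new outer dict, shared inner dicts
deep = copy.deepcopy(filled)   # fully independent copy

# Mutate the nested config through the shallow copy.
shallow["PIPELINE_STEP_CONFIGS"]["PyTorchTraining"]["STEP_CONFIG"]["example_key"] = "example_value"

original_config = filled["PIPELINE_STEP_CONFIGS"]["PyTorchTraining"]["STEP_CONFIG"]
deep_config = deep["PIPELINE_STEP_CONFIGS"]["PyTorchTraining"]["STEP_CONFIG"]

print("original affected by shallow-copy edit:", "example_key" in original_config)
print("deep copy affected:", "example_key" in deep_config)
```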

### Save Execution Doc Locally

In [75]:
PIPELINE_NAME = first_config.pipeline_name

In [76]:
PIPELINE_VERSION = first_config.pipeline_version

In [77]:
exe_doc_json_filename = f"execute_doc_{PIPELINE_NAME}_{PIPELINE_VERSION}.json"
exe_doc_file_path = config_dir / exe_doc_json_filename
exe_doc_file_path

PosixPath('/home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/rnr_pytorch_bedrock/pipeline_config/execute_doc_lukexie-BuyerAbuseRnR-pytorch-NA_1.0.0.json')

In [78]:
with open(exe_doc_file_path, "w") as f:
    json.dump(test_execution_doc, f, indent=2)
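
As an optional sanity check, the saved file can be read back and compared with the in-memory document. The sketch below uses a temporary directory and a small stand-in document so it runs on its own; `save_and_verify` is a hypothetical helper. One caveat: JSON has no tuple type, so a document containing tuples would reload with lists and compare as unequal.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict


def save_and_verify(doc: Dict[str, Any], path: Path) -> bool:
    """Write doc as JSON, reload it, and report whether the round trip matches."""
    with open(path, "w") as f:
        json.dump(doc, f, indent=2)
    with open(path) as f:
        reloaded = json.load(f)
    return reloaded == doc


# Stand-in document; in the notebook this would be test_execution_doc.
sample_doc = {
    "PIPELINE_STEP_CONFIGS": {
        "PyTorchTraining": {"STEP_CONFIG": {}, "STEP_TYPE": "TRAINING_STEP"},
    }
}

with tempfile.TemporaryDirectory() as tmp_dir:
    round_trip_ok = save_and_verify(sample_doc, Path(tmp_dir) / "execute_doc_sample.json")

print("round trip matches:", round_trip_ok)
```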

## Execute Pipeline

### Start Execution

In [79]:
from mods_workflow_helper.sagemaker_pipeline_helper import SagemakerPipelineHelper

In [80]:
security_config

<mods_workflow_helper.sagemaker_pipeline_helper.SecurityConfig at 0x7fd6471031f0>

In [81]:
template_pipeline

<sagemaker.workflow.pipeline.Pipeline at 0x7fd63e1ee440>

In [None]:
SagemakerPipelineHelper.start_pipeline_execution(
    pipeline=template_pipeline,
    secure_config=security_config,
    sagemaker_session=pipeline_session,
    preparation_space_local_root="/tmp",
    pipeline_execution_document=test_execution_doc,
)

2025-12-02 07:09:43,226 - INFO - Apply execution document provided config {} for step PyTorchTraining.
2025-12-02 07:09:43,296 - INFO - Apply execution document provided config {'source_model_inference_image_arn': '763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference:2.1.0-cpu-py310', 'model_domain': 'BuyerSellerMessaging', 'model_objective': 'RnR_BSM_Model_NA', 'source_model_inference_content_types': ['text/csv'], 'source_model_inference_response_types': ['application/json'], 'source_model_inference_input_variable_list': [['net_conc_amt', 'NUMERIC'], ['ttm_conc_amt', 'NUMERIC'], ['ttm_conc_count', 'NUMERIC'], ['concsi', 'NUMERIC'], ['deliverable_flag', 'NUMERIC'], ['undeliverable_flag', 'NUMERIC'], ['unique_message_count', 'NUMERIC'], ['total_ship_track_events_by_order', 'NUMERIC'], ['total_unique_ship_track_events_by_order', 'NUMERIC'], ['dialogue', 'TEXT'], ['shiptrack_event_history_by_order', 'TEXT']], 'source_model_inference_output_variable_list': {'calibrated-score': 'N

sagemaker.config INFO - Applied value from config key = SageMaker.TrainingJob.ResourceConfig.VolumeKmsKeyId
sagemaker.config INFO - Applied value from config key = SageMaker.TrainingJob.VpcConfig.Subnets
sagemaker.config INFO - Applied value from config key = SageMaker.TrainingJob.VpcConfig.SecurityGroupIds
sagemaker.config INFO - Applied value from config key = SageMaker.TrainingJob.Environment


2025-12-02 07:10:04,254 - INFO - Uploaded /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/rnr_pytorch_bedrock/dockers to s3://sandboxdependency-abuse-secureaisandboxteamshare-1l77v9am252um/lukexie-BuyerAbuseRnR-pytorch-NA-1-0-0-pipeline/code/c5e4ba9efdca035f72a61119c632a5af27810a5632691c206cf656ff81c44af9/sourcedir.tar.gz
2025-12-02 07:10:04,305 - INFO - runproc.sh uploaded to s3://sandboxdependency-abuse-secureaisandboxteamshare-1l77v9am252um/lukexie-BuyerAbuseRnR-pytorch-NA-1-0-0-pipeline/code/602671d92c7eb805628d3775bd01f73cd3ef79558330597dcfd540091de6e14b/runproc.sh
2025-12-02 07:10:10,374 - INFO - Add currentOwnerAlias tag to the request for operation: CreatePipeline.
2025-12-02 07:10:10,374 - INFO - A creation operation CreatePipeline is detected. Apply owner tag to the request.
2025-12-02 07:10:11,356 - INFO - image_uri is not presented, retrieving image_uri based on instance_type, framework etc.


sagemaker.config INFO - Applied value from config key = SageMaker.TrainingJob.ResourceConfig.VolumeKmsKeyId
sagemaker.config INFO - Applied value from config key = SageMaker.TrainingJob.VpcConfig.Subnets
sagemaker.config INFO - Applied value from config key = SageMaker.TrainingJob.VpcConfig.SecurityGroupIds
sagemaker.config INFO - Applied value from config key = SageMaker.TrainingJob.Environment


2025-12-02 07:10:12,964 - INFO - Uploaded /home/ec2-user/SageMaker/BuyerAbuseModsTemplate/src/buyer_abuse_mods_template/rnr_pytorch_bedrock/dockers to s3://sandboxdependency-abuse-secureaisandboxteamshare-1l77v9am252um/lukexie-BuyerAbuseRnR-pytorch-NA-1-0-0-pipeline/code/c5e4ba9efdca035f72a61119c632a5af27810a5632691c206cf656ff81c44af9/sourcedir.tar.gz
2025-12-02 07:10:13,019 - INFO - runproc.sh uploaded to s3://sandboxdependency-abuse-secureaisandboxteamshare-1l77v9am252um/lukexie-BuyerAbuseRnR-pytorch-NA-1-0-0-pipeline/code/602671d92c7eb805628d3775bd01f73cd3ef79558330597dcfd540091de6e14b/runproc.sh
