# Quick Start - Using @step Decorated Steps with Selective Execution

---

This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook.

![This us-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-west-2/sagemaker-pipelines|step-decorator|quick-start|notebooks|using_step_decorator_with_selective_execution.ipynb)

---

We’re introducing a low-code experience that lets data scientists convert Machine Learning (ML) development code into repeatable and reusable workflow steps of Amazon SageMaker Pipelines.
This sample notebook is a quick introduction to this capability, using dummy Python functions wrapped as pipeline steps. It demonstrates how this capability works with the [selective execution of pipeline steps](https://docs.aws.amazon.com/sagemaker/latest/dg/pipelines-selective-ex.html). The pipeline in this notebook contains the following steps:
* A dummy preprocess data step that returns two integers as the "train_data" and "test_data", respectively.
* A dummy train model step that simply multiplies the "train_data" by an input "train_param" to get the dummy "model" data.
* A dummy evaluate model step that calculates the absolute value of the difference between the "model" data and the "test_data" as the dummy RMSE (Root Mean Square Error) value.
* A ConditionStep that compares this RMSE value with a baseline.
* A dummy register model step that is invoked only if the RMSE is lower than the baseline.
* A FailStep that ends the pipeline execution with a failed status if the RMSE is higher than or equal to the baseline.
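
The flow described by this list can be sketched in plain Python, with no SageMaker calls. This is only a local dry run for building intuition: `dry_run` and `BASELINE_RMSE` are hypothetical names, and the values `1`, `2`, and `5` mirror the ones used in the cells later in this notebook.

```python
# Plain-Python dry run of the pipeline logic; nothing here talks to SageMaker.
# `dry_run` and `BASELINE_RMSE` are hypothetical names used only in this sketch.

BASELINE_RMSE = 5  # same threshold the ConditionStep compares against


def preprocess():
    # Dummy "train_data" and "test_data"
    return 1, 2


def train(train_param, train_data):
    # Dummy "model": the training data scaled by the parameter
    return train_data * train_param


def evaluate(model, test_data):
    # Dummy RMSE: absolute difference between "model" and "test_data"
    return {"rmse": abs(model - test_data)}


def dry_run(train_param, baseline=BASELINE_RMSE):
    """Return the branch the condition would take and the dummy RMSE."""
    train_data, test_data = preprocess()
    model = train(train_param, train_data)
    report = evaluate(model, test_data)
    branch = "Register" if report["rmse"] < baseline else "Fail"
    return branch, report["rmse"]


for value in (20.5, 5):
    print(value, dry_run(value))
```

An RMSE exactly equal to the baseline takes the "Fail" branch, because the condition is a strict less-than comparison.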

Note: this notebook runs only on Python 3.8 or Python 3.10. On any other version, you will get an error message prompting you to provide an `image_uri` when defining a step.
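
A quick way to check the kernel before going further is sketched below. The helper name `is_supported_python` is hypothetical, and the version set simply restates the note above.

```python
import sys

# Versions named in the note above; the helper name is hypothetical.
SUPPORTED_VERSIONS = {(3, 8), (3, 10)}


def is_supported_python(version_info=sys.version_info):
    """Return True if the major.minor version is one the note lists."""
    return tuple(version_info[:2]) in SUPPORTED_VERSIONS


if not is_supported_python():
    print(
        f"Python {sys.version_info.major}.{sys.version_info.minor} detected; "
        "expect to be asked for an image_uri when defining a step."
    )
```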

## Install the dependencies and set up the configuration file path

If you run the notebook from a local IDE outside SageMaker, please follow the "AWS CLI Prerequisites" section of [Set Up Amazon SageMaker Prerequisites](https://docs.aws.amazon.com/sagemaker/latest/dg/gs-set-up.html#gs-cli-prereq) to set up AWS credentials.

In [None]:
!pip install -r ./requirements.txt

In [None]:
import os

# Point the SageMaker SDK at the current working directory to locate the config file
os.environ["SAGEMAKER_USER_CONFIG_OVERRIDE"] = os.getcwd()
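
If the steps later complain about missing settings, it may help to confirm that a config file is actually present where the override points. The sketch below assumes the SDK reads a file named `config.yaml` when the override is a directory; that file name and the helper `find_config_file` are assumptions of this example, not something this notebook states.

```python
import os

# Assumption: when the override is a directory, the SDK reads "config.yaml" from it.
CONFIG_FILE_NAME = "config.yaml"


def find_config_file(override_path):
    """Return the config file path the override appears to resolve to, or None."""
    if os.path.isdir(override_path):
        candidate = os.path.join(override_path, CONFIG_FILE_NAME)
    else:
        candidate = override_path
    return candidate if os.path.isfile(candidate) else None


override = os.environ.get("SAGEMAKER_USER_CONFIG_OVERRIDE", os.getcwd())
print(find_config_file(override) or f"No {CONFIG_FILE_NAME} found under {override}")
```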

## Define pipeline steps

In [None]:
from sagemaker.workflow.function_step import step


@step(name="PreProcess", keep_alive_period_in_seconds=300)
def my_preprocess_data():
    return 1, 2

In [None]:
train_func_step_name = "Train"


@step(name=train_func_step_name, keep_alive_period_in_seconds=300)
def my_train_model(train_param, train_data):
    return train_data * train_param

In [None]:
evaluate_func_step_name = "Evaluate"


@step(name=evaluate_func_step_name, keep_alive_period_in_seconds=300)
def my_evaluate_model(model, test_data):
    return {"rmse": abs(model - test_data)}

In [None]:
register_func_step_name = "Register"


@step(name=register_func_step_name, keep_alive_period_in_seconds=300)
def my_register_model():
    print("Registered!")

After defining the above functions decorated by `@step`, we chain them together as follows and create a pipeline object accordingly.

Notes:
1. There's no need to put all the steps into the pipeline's `steps` list. Because the step dependencies are defined through function dependencies, we only need to put the end step into the list, and the pipeline object automatically retrieves all of its upstream steps.
2. As for the `train_param` argument of the `my_train_model` function, we assign it a `ParameterFloat` object (i.e. "TrainParameter"), so that we can adjust it across different executions.
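
To illustrate the first note, here is a toy model of how upstream steps can be discovered from a single end step. It is not the SageMaker implementation: the `UPSTREAM` mapping and `collect_steps` are hypothetical, and the mapping is written by hand to mirror the function calls in this notebook (the branch steps of the condition are left out for brevity).

```python
# Toy model of dependency discovery; this is not the SageMaker implementation.
# Each key is a step name and each value lists the steps whose outputs it consumes.
UPSTREAM = {
    "PreProcess": [],
    "Train": ["PreProcess"],
    "Evaluate": ["Train", "PreProcess"],
    "ConditionallyRegister": ["Evaluate"],
}


def collect_steps(end_step, upstream=UPSTREAM):
    """Return end_step and everything it depends on, dependencies first."""
    ordered = []

    def visit(name):
        if name in ordered:
            return  # already collected through another path
        for parent in upstream[name]:
            visit(parent)
        ordered.append(name)

    visit(end_step)
    return ordered


print(collect_steps("ConditionallyRegister"))
```

Passing only the condition step is enough in this toy model because every other step is reachable by walking backwards through the inputs.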

In [None]:
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.condition_step import ConditionStep
from sagemaker.workflow.conditions import ConditionLessThan
from sagemaker.workflow.fail_step import FailStep
from sagemaker.workflow.parameters import ParameterFloat

param = ParameterFloat(name="TrainParameter", default_value=20.5)

delayed_data = my_preprocess_data()
delayed_model = my_train_model(train_param=param, train_data=delayed_data[0])
delayed_evaluation = my_evaluate_model(model=delayed_model, test_data=delayed_data[1])

condition_step_name = "ConditionallyRegister"
conditionally_register = ConditionStep(
    name=condition_step_name,
    conditions=[
        ConditionLessThan(
            # The output of the evaluate step must be JSON serializable
            # so that it can be consumed in the condition evaluation
            left=delayed_evaluation["rmse"],
            right=5,
        )
    ],
    if_steps=[my_register_model()],
    else_steps=[FailStep(name="Fail", error_message="Model performance is not good enough")],
)

pipeline_name = "Dummy-ML-Pipeline-with-Selective-Execution"
pipeline = Pipeline(name=pipeline_name, steps=[conditionally_register], parameters=[param])

## Create the pipeline and run pipeline execution

In [None]:
import sagemaker

# Note: sagemaker.get_execution_role() does not work outside SageMaker;
# supply an IAM role ARN yourself when running elsewhere
role = sagemaker.get_execution_role()
pipeline.upsert(role_arn=role)

In [None]:
execution1 = pipeline.start(parallelism_config=dict(MaxParallelExecutionSteps=10))

In [None]:
try:
    execution1.wait()
except Exception as e:
    print(e)

Note: execution1 will enter the `FailStep` and be marked as failed. This is because the default value of "TrainParameter" is quite large (i.e. 20.5), so the RMSE value in the evaluation report, `abs(1 * 20.5 - 2) = 18.5`, is higher than the baseline (i.e. 5). We can inspect the evaluation report with `execution1.result(step_name=evaluate_func_step_name)` to check.

In [None]:
execution1.list_steps()

In [None]:
print(f'Execution 1 - status: {execution1.describe()["PipelineExecutionStatus"]}')
print(f"Execution 1 - evaluation report: {execution1.result(step_name=evaluate_func_step_name)}")
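
The raw `list_steps()` output can be long, so a small summary helper may make the failure easier to spot. The sketch below works on hand-written sample records rather than a real response: the statuses are made up for illustration, the helper names are hypothetical, and only the `StepName` and `StepStatus` keys are read.

```python
# Sample records shaped like entries from `execution1.list_steps()`.
# The values are made up for illustration; only "StepName" and "StepStatus" are read.
sample_steps = [
    {"StepName": "Fail", "StepStatus": "Failed"},
    {"StepName": "ConditionallyRegister", "StepStatus": "Succeeded"},
    {"StepName": "Evaluate", "StepStatus": "Succeeded"},
    {"StepName": "Train", "StepStatus": "Succeeded"},
    {"StepName": "PreProcess", "StepStatus": "Succeeded"},
]


def summarize_steps(steps):
    """Map each step name to its status."""
    return {step["StepName"]: step["StepStatus"] for step in steps}


def steps_with_status(steps, status):
    """List the names of the steps that ended in the given status."""
    return [step["StepName"] for step in steps if step["StepStatus"] == status]


print(summarize_steps(sample_steps))
print(steps_with_status(sample_steps, "Failed"))
```

In the notebook, the same helpers could be called with `execution1.list_steps()` in place of `sample_steps`.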

## Run Selective Execution

In this section, we aim to "retrain" the model with a smaller "TrainParameter" value to get a better-performing model, without rerunning the entire pipeline workflow.
Hence, we define a `SelectiveExecutionConfig` object by 1) specifying the ARN of execution1, and 2) selecting the train model step and the evaluate model step. The preceding step (i.e. the preprocess data step) is then not rerun in execution2; its result is retrieved automatically from execution1.
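
The idea can be pictured with a toy classification of steps. This is a simplified mental model, not the service's logic: `plan_selective_run` and the hand-written `UPSTREAM` mapping are hypothetical, and only direct inputs of the selected steps are marked as reused.

```python
# Toy classification of steps for a selective execution; not the service's logic.
UPSTREAM = {
    "PreProcess": [],
    "Train": ["PreProcess"],
    "Evaluate": ["Train", "PreProcess"],
    "ConditionallyRegister": ["Evaluate"],
    "Register": ["ConditionallyRegister"],
}


def plan_selective_run(selected, upstream=UPSTREAM):
    """Label each step 'run', 'reuse' (result taken from the source execution), or 'skip'."""
    plan = {name: "skip" for name in upstream}
    for name in selected:
        plan[name] = "run"
    for name in selected:
        for parent in upstream[name]:
            if plan[parent] != "run":
                plan[parent] = "reuse"  # non-selected input of a selected step
    return plan


print(plan_selective_run(["Train", "Evaluate"]))
```

With the train and evaluate steps selected, the preprocess step is labeled as reused and the condition and register steps are skipped, which matches the behavior described above.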

In [None]:
from sagemaker.workflow.selective_execution_config import SelectiveExecutionConfig

selective_execution_config1 = SelectiveExecutionConfig(
    source_pipeline_execution_arn=execution1.arn,
    selected_steps=[train_func_step_name, evaluate_func_step_name],
)

execution2 = pipeline.start(
    selective_execution_config=selective_execution_config1,
    parameters={
        "TrainParameter": 5,
    },
    parallelism_config=dict(MaxParallelExecutionSteps=10),
)

We can check the evaluation report generated in execution2 to confirm that the RMSE stays below the baseline. With "TrainParameter" set to 5, the expected value is `abs(1 * 5 - 2) = 3`.

In [None]:
print(f"Execution 2 - Evaluation Report: {execution2.result(step_name=evaluate_func_step_name)}")

Note: only the two selected steps were actually executed in execution2.

In [None]:
execution2.list_steps()

As the evaluation report is satisfactory, we can complete the rest of the pipeline to "register" the dummy model. We define a new `SelectiveExecutionConfig` object, which specifies the ARN of execution2 and selects the condition step and the register model step. The result of their preceding step (i.e. the evaluate model step) is fetched automatically from execution2.

In [None]:
selective_execution_config2 = SelectiveExecutionConfig(
    source_pipeline_execution_arn=execution2.arn,
    selected_steps=[condition_step_name, register_func_step_name],
)

execution3 = pipeline.start(
    selective_execution_config=selective_execution_config2,
    parallelism_config=dict(MaxParallelExecutionSteps=10),
)

In this run, the condition step should evaluate to true, and the register model step should run successfully.

In [None]:
execution3.wait()
execution3.list_steps()

## Clean up resources

In [None]:
pipeline.delete()

## Notebook CI Test Results

This notebook was tested in multiple regions. The test results are as follows, except for us-west-2, which is shown at the top of the notebook.


![This us-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-east-1/sagemaker-pipelines|step-decorator|quick-start|notebooks|using_step_decorator_with_selective_execution.ipynb)

![This us-east-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-east-2/sagemaker-pipelines|step-decorator|quick-start|notebooks|using_step_decorator_with_selective_execution.ipynb)

![This us-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-west-1/sagemaker-pipelines|step-decorator|quick-start|notebooks|using_step_decorator_with_selective_execution.ipynb)

![This ca-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ca-central-1/sagemaker-pipelines|step-decorator|quick-start|notebooks|using_step_decorator_with_selective_execution.ipynb)

![This sa-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/sa-east-1/sagemaker-pipelines|step-decorator|quick-start|notebooks|using_step_decorator_with_selective_execution.ipynb)

![This eu-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-west-1/sagemaker-pipelines|step-decorator|quick-start|notebooks|using_step_decorator_with_selective_execution.ipynb)

![This eu-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-west-2/sagemaker-pipelines|step-decorator|quick-start|notebooks|using_step_decorator_with_selective_execution.ipynb)

![This eu-west-3 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-west-3/sagemaker-pipelines|step-decorator|quick-start|notebooks|using_step_decorator_with_selective_execution.ipynb)

![This eu-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-central-1/sagemaker-pipelines|step-decorator|quick-start|notebooks|using_step_decorator_with_selective_execution.ipynb)

![This eu-north-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-north-1/sagemaker-pipelines|step-decorator|quick-start|notebooks|using_step_decorator_with_selective_execution.ipynb)

![This ap-southeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-southeast-1/sagemaker-pipelines|step-decorator|quick-start|notebooks|using_step_decorator_with_selective_execution.ipynb)

![This ap-southeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-southeast-2/sagemaker-pipelines|step-decorator|quick-start|notebooks|using_step_decorator_with_selective_execution.ipynb)

![This ap-northeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-northeast-1/sagemaker-pipelines|step-decorator|quick-start|notebooks|using_step_decorator_with_selective_execution.ipynb)

![This ap-northeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-northeast-2/sagemaker-pipelines|step-decorator|quick-start|notebooks|using_step_decorator_with_selective_execution.ipynb)

![This ap-south-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-south-1/sagemaker-pipelines|step-decorator|quick-start|notebooks|using_step_decorator_with_selective_execution.ipynb)
