# MLOps CI/CD with SageMaker Pipelines

This notebook implements a CI/CD workflow for a binary XGBoost classifier using SageMaker Pipelines, the Model Registry, and a real-time endpoint.

## CI (Pipeline)
1. Export train/val/test splits from SageMaker Feature Store (Offline Store)
2. Run hyperparameter tuning on XGBoost (built-in container)
3. Evaluate the best model on the test set (AUC)
4. Gate promotion on a minimum AUC threshold
5. Register the best model in the Model Registry

## CD
- Deploy the latest Approved model package to a fixed endpoint name.

## Data format (XGBoost built-in)
- CSV (no header)
- Label is the first column
- All features are numeric
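
The three format rules above can be checked mechanically before a file is handed to training. The following is a minimal sketch using only the standard library; `check_xgb_csv` is a hypothetical helper, not part of the notebook, and it only verifies that every cell parses as a float and that all rows have the same width.

```python
import csv
import io


def check_xgb_csv(text: str) -> int:
    """Return the row count if every cell parses as a float; raise ValueError otherwise.

    A header row is rejected automatically, because its cells are not numeric.
    """
    n_rows = 0
    width = None
    for line_no, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not row:
            continue  # ignore blank lines
        if width is None:
            width = len(row)
        elif len(row) != width:
            raise ValueError(f"line {line_no}: expected {width} columns, got {len(row)}")
        for cell in row:
            try:
                float(cell)
            except ValueError:
                raise ValueError(f"line {line_no}: non-numeric value {cell!r}") from None
        n_rows += 1
    return n_rows


# Illustrative data only: label in column 0, two numeric features.
sample = "1,0.5,10\n0,1.25,3\n"
print(check_xgb_csv(sample))
```

Note that `float("nan")` parses successfully, so this check does not catch literal `nan` strings; empty cells are rejected.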

## Install dependencies

In [2]:
!pip -q install sagemaker boto3 pandas "PyAthena[SQLAlchemy]" sqlalchemy

## Set up AWS + SageMaker sessions

In [1]:
import os
import json
import time
import boto3
import pandas as pd
import sagemaker
from sagemaker import image_uris
from sagemaker.session import Session
from sagemaker.feature_store.feature_group import FeatureGroup

sess = sagemaker.Session()
region = boto3.Session().region_name
bucket = sess.default_bucket()
role = sagemaker.get_execution_role()

sm = boto3.client("sagemaker", region_name=region)
s3 = boto3.client("s3", region_name=region)

print("Region:", region)
print("Bucket:", bucket)
print("Role  :", role)

sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /home/sagemaker-user/.config/sagemaker/config.yaml
Region: us-east-1
Bucket: sagemaker-us-east-1-128131109986
Role  : arn:aws:iam::128131109986:role/LabRole


## Configuration

In [2]:
CICD_ENDPOINT_NAME = "ids-xgboost-binary-cicd"
PIPELINE_NAME = "aai540-ids-xgb-binary-cicd"
MODEL_PACKAGE_GROUP_NAME = "aai540-ids-xgb-binary-pkg"
MIN_AUC_TO_DEPLOY_DEFAULT = 0.85

fg_list = sm.list_feature_groups(NameContains='aai540-ids-splitfs', SortBy='CreationTime', SortOrder='Descending')
FEATURE_GROUP_NAME = fg_list['FeatureGroupSummaries'][0]['FeatureGroupName']

FEATURE_COLS = [
    "duration","pkt_total","bytes_total",
    "pkt_fwd","pkt_bwd","bytes_fwd","bytes_bwd",
    "pkt_rate","byte_rate","bytes_per_pkt",
    "pkt_ratio","byte_ratio",
]
LABEL_COL = "label"
SPLIT_COL = "data_split"

print("Pipeline:", PIPELINE_NAME)
print("Model Package Group:", MODEL_PACKAGE_GROUP_NAME)
print("Endpoint:", CICD_ENDPOINT_NAME)
print("Min AUC:", MIN_AUC_TO_DEPLOY_DEFAULT)
print("Feature Group:", FEATURE_GROUP_NAME)

Pipeline: aai540-ids-xgb-binary-cicd
Model Package Group: aai540-ids-xgb-binary-pkg
Endpoint: ids-xgboost-binary-cicd
Min AUC: 0.85
Feature Group: aai540-ids-splitfs-v2-20260213-052016
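
The lookup `fg_list['FeatureGroupSummaries'][0]` raises a bare `IndexError` when no feature group matches the name filter. A minimal sketch of a more explicit variant follows; `pick_latest_feature_group` is a hypothetical helper, and it assumes the response is already sorted newest first, as requested by `SortOrder='Descending'` in the call above.

```python
def pick_latest_feature_group(response: dict, name_contains: str) -> str:
    """Return the first feature group name in a list_feature_groups-style response.

    Assumes the caller asked for SortBy='CreationTime', SortOrder='Descending',
    so the first summary is the most recently created group.
    """
    summaries = response.get("FeatureGroupSummaries", [])
    if not summaries:
        raise LookupError(f"No feature group with a name containing {name_contains!r}")
    return summaries[0]["FeatureGroupName"]


# Hand-written stand-in for a response; the names are illustrative only.
fake_response = {
    "FeatureGroupSummaries": [
        {"FeatureGroupName": "demo-splitfs-new"},
        {"FeatureGroupName": "demo-splitfs-old"},
    ]
}
print(pick_latest_feature_group(fake_response, "demo-splitfs"))
```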


## Load Feature Group and validate schema

In [3]:
netdetect_fg = FeatureGroup(name=FEATURE_GROUP_NAME, sagemaker_session=sess)
fg_desc = netdetect_fg.describe()

fg_feature_names = {f["FeatureName"] for f in fg_desc["FeatureDefinitions"]}
required = set([LABEL_COL, SPLIT_COL] + FEATURE_COLS)
missing = required - fg_feature_names

if missing:
    raise ValueError(f"Feature Store missing required columns: {sorted(missing)}")

print("Feature Store schema validated")

Feature Store schema validated


## Export train/val/test splits from Feature Store

In [4]:
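def export_split_from_feature_store_to_s3(split_name: str, run_prefix: str):
    """Query one split from the offline store via Athena and upload it to S3 as a headerless CSV.

    Returns the S3 URI of the uploaded file and its row count.
    """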
    query = netdetect_fg.athena_query()

    # force numeric casts (avoids XGBoost failures due to strings/null types)
    feature_expr = ", ".join([f"CAST({c} AS DOUBLE) AS {c}" for c in FEATURE_COLS])

    query_string = f"""
    SELECT
      CAST({LABEL_COL} AS DOUBLE) AS {LABEL_COL},
      {feature_expr}
    FROM "{query.table_name}"
    WHERE {SPLIT_COL} = '{split_name}'
    """

    # athena intermediate output
    athena_output = f"s3://{bucket}/aai540/athena_exports/{run_prefix}/"
    query.run(query_string=query_string, output_location=athena_output)
    query.wait()

    # consolidate to single CSV
    df = query.as_dataframe()
    local_path = f"/tmp/{split_name}.csv"
    df.to_csv(local_path, header=False, index=False)

    key = f"aai540/cicd/exports/{PIPELINE_NAME}/{run_prefix}/{split_name}.csv"
    s3.upload_file(local_path, bucket, key)

    return f"s3://{bucket}/{key}", len(df)

run_prefix = time.strftime("%Y%m%d-%H%M%S")
print("Export run prefix:", run_prefix)

print("Exporting splits...")
train_s3, n_train = export_split_from_feature_store_to_s3("train", run_prefix)
val_s3, n_val   = export_split_from_feature_store_to_s3("val", run_prefix)
test_s3, n_test  = export_split_from_feature_store_to_s3("test", run_prefix)
print("Done")

print("Train:", train_s3, f"({n_train:,} rows)")
print("Val:", val_s3, f"({n_val:,} rows)")
print("Test :", test_s3, f"({n_test:,} rows)")

Export run prefix: 20260221-020639
Exporting splits...
Done
Train: s3://sagemaker-us-east-1-128131109986/aai540/cicd/exports/aai540-ids-xgb-binary-cicd/20260221-020639/train.csv (200,414 rows)
Val: s3://sagemaker-us-east-1-128131109986/aai540/cicd/exports/aai540-ids-xgb-binary-cicd/20260221-020639/val.csv (50,100 rows)
Test : s3://sagemaker-us-east-1-128131109986/aai540/cicd/exports/aai540-ids-xgb-binary-cicd/20260221-020639/test.csv (50,100 rows)
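
To make the export query easier to inspect, the SQL string can be built by a small pure function and printed before it is sent to Athena. This is a sketch under assumptions: `build_split_query` and `ALLOWED_SPLITS` are hypothetical names, and the allowed split values are taken from the three calls above.

```python
ALLOWED_SPLITS = {"train", "val", "test"}


def build_split_query(table_name, label_col, feature_cols, split_col, split_name):
    """Build the SELECT statement for one split, casting every column to DOUBLE."""
    if split_name not in ALLOWED_SPLITS:
        raise ValueError(f"Unknown split {split_name!r}; expected one of {sorted(ALLOWED_SPLITS)}")
    # The label goes first so the exported CSV matches the XGBoost built-in format.
    select_cols = [label_col] + list(feature_cols)
    select_expr = ",\n  ".join(f"CAST({c} AS DOUBLE) AS {c}" for c in select_cols)
    return (
        f"SELECT\n  {select_expr}\n"
        f'FROM "{table_name}"\n'
        f"WHERE {split_col} = '{split_name}'"
    )


# "demo_table" is a placeholder; the notebook reads the real name from query.table_name.
query_text = build_split_query("demo_table", "label", ["duration", "pkt_total"], "data_split", "train")
print(query_text)
```

Rejecting unknown split names up front avoids running an Athena query that silently returns zero rows.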


## Preflight checks

In [5]:
from urllib.parse import urlparse

def head_s3(uri: str):
    """Return True if the object exists; head_object raises ClientError otherwise."""
    u = urlparse(uri)
    b = u.netloc
    k = u.path.lstrip("/")
    s3.head_object(Bucket=b, Key=k)
    return True

print("S3 exists train:", head_s3(train_s3))
print("S3 exists val  :", head_s3(val_s3))
print("S3 exists test :", head_s3(test_s3))

df_chk = pd.read_csv("/tmp/train.csv", header=None, nrows=50)
print("Train sample shape:", df_chk.shape)
print("Label uniques (sample):", sorted(df_chk[0].unique())[:10])
print("Any nulls (sample):", df_chk.isna().any().any())

S3 exists train: True
S3 exists val  : True
S3 exists test : True
Train sample shape: (50, 13)
Label uniques (sample): [0.0, 1.0]
Any nulls (sample): False
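
The checks above inspect only the first 50 rows, so a null or an unexpected label later in the file would go unnoticed. A minimal sketch of a full-file scan that streams the CSV instead of loading it into memory is shown below; `scan_csv` is a hypothetical helper and the demo file is illustrative data, not the real export.

```python
import csv
import os
import tempfile
from collections import Counter


def scan_csv(path: str) -> dict:
    """Stream a headerless CSV and summarise labels (column 0) and empty cells."""
    labels = Counter()
    empty_cells = 0
    rows = 0
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if not row:
                continue  # skip blank lines
            rows += 1
            empty_cells += sum(1 for cell in row if cell.strip() == "")
            if row[0].strip():
                labels[float(row[0])] += 1
    return {"rows": rows, "labels": dict(labels), "empty_cells": empty_cells}


# Demo on a tiny temporary file; one feature cell is deliberately left empty.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.csv")
    with open(demo_path, "w", newline="") as f:
        f.write("1.0,0.5\n0.0,\n1.0,2.0\n")
    summary = scan_csv(demo_path)
print(summary)
```

pandas writes missing values as empty strings by default in `to_csv`, which is why the sketch counts empty cells.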


## Write evaluation script (AUC) with robust tar extraction

This script:
- installs xgboost into the scikit-learn processing container at runtime
- extracts any .tar.gz archive found in the model directory
- locates the XGBoost model file, preferring standard file names
- computes AUC on test.csv
- writes the result to evaluation.json

In [6]:
EVAL_SCRIPT = r"""
import subprocess
import sys

# 1. Install xgboost into the sklearn container dynamically
subprocess.check_call([sys.executable, "-m", "pip", "install", "-q", "xgboost"])

# 2. Now perform our imports
import json
import os
import tarfile
import pandas as pd
from sklearn.metrics import roc_auc_score
import xgboost as xgb

def main():
    model_dir = "/opt/ml/processing/model"
    test_path = "/opt/ml/processing/test/test.csv"
    out_path  = "/opt/ml/processing/evaluation/evaluation.json"

    # Extract any tar.gz present
    for fname in os.listdir(model_dir):
        if fname.endswith(".tar.gz"):
            tar_path = os.path.join(model_dir, fname)
            with tarfile.open(tar_path, "r:gz") as tar:
                tar.extractall(path=model_dir)

    # Prefer standard filenames
    candidates = ["xgboost-model", "model.xgb", "model"]
    model_path = None
    for c in candidates:
        p = os.path.join(model_dir, c)
        if os.path.exists(p) and os.path.isfile(p):
            model_path = p
            break

    # Last resort: pick first file that isn't tar.gz
    if model_path is None:
        for fname in os.listdir(model_dir):
            p = os.path.join(model_dir, fname)
            if os.path.isfile(p) and not fname.endswith(".tar.gz"):
                model_path = p
                break

    if model_path is None:
        raise FileNotFoundError(f"No model file found. Contents: {os.listdir(model_dir)}")

    df = pd.read_csv(test_path, header=None)
    y = df.iloc[:, 0].values
    X = df.iloc[:, 1:].values

    booster = xgb.Booster()
    booster.load_model(model_path)

    preds = booster.predict(xgb.DMatrix(X))
    auc = float(roc_auc_score(y, preds))

    os.makedirs(os.path.dirname(out_path), exist_ok=True)
    with open(out_path, "w") as f:
        json.dump({"binary_classification_metrics": {"auc": {"value": auc}}}, f)

if __name__ == "__main__":
    main()
"""
with open("evaluate.py", "w") as f:
    f.write(EVAL_SCRIPT)

print("Wrote evaluate.py")

Wrote evaluate.py
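
The evaluation script calls `tar.extractall` directly on whatever archive it finds. One possible hardening, sketched below, is to reject archive members whose names would resolve outside the destination directory before extracting. `safe_extract` is a hypothetical helper; it checks member names only and does not inspect symlink targets.

```python
import os
import tarfile
import tempfile


def safe_extract(tar_path: str, dest_dir: str) -> None:
    """Extract a .tar.gz after checking that no member name escapes dest_dir."""
    dest_root = os.path.realpath(dest_dir)
    with tarfile.open(tar_path, "r:gz") as tar:
        for member in tar.getmembers():
            target = os.path.realpath(os.path.join(dest_root, member.name))
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError(f"Refusing to extract {member.name!r} outside {dest_dir}")
        tar.extractall(path=dest_root)


# Demo: build a small archive with a placeholder "xgboost-model" file and extract it.
with tempfile.TemporaryDirectory() as work:
    src = os.path.join(work, "xgboost-model")
    with open(src, "wb") as f:
        f.write(b"placeholder bytes")
    archive = os.path.join(work, "model.tar.gz")
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(src, arcname="xgboost-model")
    out_dir = os.path.join(work, "extracted")
    os.makedirs(out_dir)
    safe_extract(archive, out_dir)
    extracted_ok = os.path.isfile(os.path.join(out_dir, "xgboost-model"))
print(extracted_ok)
```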


## Pipeline imports (SageMaker Pipelines)

In [7]:
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.pipeline_context import PipelineSession
from sagemaker.workflow.parameters import ParameterFloat, ParameterString
from sagemaker.workflow.steps import ProcessingStep, TuningStep
from sagemaker.workflow.condition_step import ConditionStep
from sagemaker.workflow.conditions import ConditionGreaterThanOrEqualTo
from sagemaker.workflow.functions import JsonGet
from sagemaker.workflow.step_collections import RegisterModel
from sagemaker.workflow.properties import PropertyFile
from sagemaker.processing import ScriptProcessor, ProcessingInput, ProcessingOutput
from sagemaker.inputs import TrainingInput
from sagemaker.estimator import Estimator
from sagemaker.tuner import HyperparameterTuner, IntegerParameter, ContinuousParameter

## Define pipeline parameters

In [8]:
pipeline_session = PipelineSession()

min_auc_param = ParameterFloat(name="MinAUCToDeploy", default_value=MIN_AUC_TO_DEPLOY_DEFAULT)
train_s3_param = ParameterString(name="TrainDataS3Uri", default_value=train_s3)
val_s3_param   = ParameterString(name="ValDataS3Uri",   default_value=val_s3)
test_s3_param  = ParameterString(name="TestDataS3Uri",  default_value=test_s3)

## Define XGBoost estimator (built-in container) + set hyperparameters

In [9]:
xgb_image = image_uris.retrieve(
    framework="xgboost",
    region=region,
    version="1.7-1"
)

xgb = Estimator(
    image_uri=xgb_image,
    role=role,
    instance_count=1,
    instance_type="ml.m5.large",
    output_path=f"s3://{bucket}/aai540/cicd/model_artifacts/{PIPELINE_NAME}/",
    sagemaker_session=pipeline_session,
)

# Static hyperparameters
xgb.set_hyperparameters(
    objective="binary:logistic",
    eval_metric="auc",
    num_round=150,
    scale_pos_weight=0.3
)

# Hyperparameter ranges
hyperparameter_ranges = {
    "max_depth":        IntegerParameter(1, 6),
    "eta":              ContinuousParameter(0.01, 0.1, scaling_type="Logarithmic"),
    "min_child_weight": IntegerParameter(1, 10),
    "subsample":        ContinuousParameter(0.5, 1.0),
    "gamma":            ContinuousParameter(0.0, 5.0),
}

# Define the Tuner
tuner = HyperparameterTuner(
    estimator=xgb,
    objective_metric_name="validation:auc",
    hyperparameter_ranges=hyperparameter_ranges,
    max_jobs=5,
    max_parallel_jobs=2,
    strategy="Bayesian",
    objective_type="Maximize",
)

INFO:sagemaker.image_uris:Ignoring unnecessary instance type: None.
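
The `eta` range uses `scaling_type="Logarithmic"`, which searches the interval on a log scale so that small learning rates get as much attention as large ones. The sketch below illustrates only that scaling idea with plain log-uniform sampling; it is not how SageMaker's Bayesian strategy chooses points, and `sample_log_uniform` is a hypothetical helper.

```python
import math
import random


def sample_log_uniform(low: float, high: float, rng: random.Random) -> float:
    """Draw a value whose logarithm is uniformly distributed between log(low) and log(high)."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(0)
draws = [sample_log_uniform(0.01, 0.1, rng) for _ in range(1000)]

# On a log scale the midpoint of [0.01, 0.1] is the geometric mean, not 0.055.
geometric_mid = math.sqrt(0.01 * 0.1)
share_below_mid = sum(d < geometric_mid for d in draws) / len(draws)
print(round(geometric_mid, 4), share_below_mid)
```

Roughly half of the draws fall below the geometric midpoint, whereas linear sampling would put only about a quarter of them there.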


## TuningStep and Best Model Extraction

In [18]:
tuning_step = TuningStep(
    name="TuneXGB",
    tuner=tuner,
    inputs={
        "train": TrainingInput(s3_data=train_s3_param, content_type="text/csv"),
        "validation": TrainingInput(s3_data=val_s3_param, content_type="text/csv"),
    },
)

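# S3 URI of the best model (top_k=0 selects the top-ranked training job), passed to downstream steps.
# The bucket and prefix must match the estimator's output_path so the URI points at the real artifact.
top_model_uri = tuning_step.get_top_model_s3_uri(
    top_k=0,
    s3_bucket=bucket,
    prefix=f"aai540/cicd/model_artifacts/{PIPELINE_NAME}",
)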
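
The value returned by `get_top_model_s3_uri` is resolved at execution time, once the name of the best training job is known. As a rough illustration, assuming the usual SageMaker training layout of `<output_path>/<training-job-name>/output/model.tar.gz`, the resolved URI looks like the string built below. `top_model_s3_uri` and the job name are hypothetical.

```python
def top_model_s3_uri(bucket: str, prefix: str, training_job_name: str) -> str:
    """Compose the expected model artifact URI for a given training job name."""
    parts = ["s3:/", bucket, prefix.strip("/"), training_job_name, "output/model.tar.gz"]
    return "/".join(parts)


# Placeholder bucket, prefix and job name, for illustration only.
uri = top_model_s3_uri("my-bucket", "aai540/cicd/model_artifacts/demo", "job-001")
print(uri)
```

If the prefix passed to `get_top_model_s3_uri` differs from the estimator's `output_path`, the composed URI would point at a location where no artifact exists and the evaluation step would fail to download the model.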

## Evaluation step

In [19]:
script_processor = ScriptProcessor(
    image_uri=image_uris.retrieve("sklearn", region=region, version="1.2-1"),
    command=["python3"],
    role=role,
    instance_count=1,
    instance_type="ml.m5.large",
    sagemaker_session=pipeline_session,
)

evaluation_report = PropertyFile(
    name="EvaluationReport",
    output_name="evaluation",
    path="evaluation.json",
)

eval_step = ProcessingStep(
    name="EvaluateModel",
    processor=script_processor,
    code="evaluate.py",
    inputs=[
        ProcessingInput(
            source=top_model_uri,
            destination="/opt/ml/processing/model",
        ),
        ProcessingInput(
            source=test_s3_param,
            destination="/opt/ml/processing/test",
        ),
    ],
    outputs=[
        ProcessingOutput(
            output_name="evaluation",
            source="/opt/ml/processing/evaluation",
        ),
    ],
    property_files=[evaluation_report],
)

INFO:sagemaker.image_uris:Defaulting to only available Python version: py3
INFO:sagemaker.image_uris:Defaulting to only supported image scope: cpu.


## Gate on AUC + Register model in Model Registry

In [20]:
auc_value = JsonGet(
    step_name=eval_step.name,
    property_file=evaluation_report,
    json_path="binary_classification_metrics.auc.value",
)

condition = ConditionGreaterThanOrEqualTo(
    left=auc_value,
    right=min_auc_param,
)

# Packages are registered as Approved, so passing the AUC gate is the only approval step.
register_step = RegisterModel(
    name="RegisterModel",
    estimator=xgb,
    model_data=top_model_uri,
    model_package_group_name=MODEL_PACKAGE_GROUP_NAME,
    approval_status="Approved",
    content_types=["text/csv"],
    response_types=["text/csv"],
    inference_instances=["ml.m5.large"],
    transform_instances=["ml.m5.large"],
)

cond_step = ConditionStep(
    name="AUCGate",
    conditions=[condition],
    if_steps=[register_step],
    else_steps=[],
)
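
The gate reads `binary_classification_metrics.auc.value` from `evaluation.json` and compares it with `MinAUCToDeploy`. The same logic can be reproduced locally to sanity-check a report file. This is a minimal sketch: `get_by_dotted_path` and `passes_gate` are hypothetical helpers, the dotted-path lookup handles only nested dictionaries, and the AUC of 0.91 is an illustrative value rather than a result from this pipeline.

```python
import json


def get_by_dotted_path(document: dict, path: str):
    """Walk nested dictionaries using a dot-separated key path."""
    node = document
    for key in path.split("."):
        node = node[key]
    return node


def passes_gate(evaluation_json: str, json_path: str, threshold: float) -> bool:
    """Return True when the metric at json_path is greater than or equal to threshold."""
    value = float(get_by_dotted_path(json.loads(evaluation_json), json_path))
    return value >= threshold


# Same structure that evaluate.py writes; the number is made up for the demo.
report = json.dumps({"binary_classification_metrics": {"auc": {"value": 0.91}}})
print(passes_gate(report, "binary_classification_metrics.auc.value", 0.85))
```

Because `else_steps` is empty, a model that fails the gate is simply not registered and the pipeline execution still finishes successfully.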

## Build + upsert pipeline

In [21]:
pipeline = Pipeline(
    name=PIPELINE_NAME,
    parameters=[min_auc_param, train_s3_param, val_s3_param, test_s3_param],
    steps=[tuning_step, eval_step, cond_step],
    sagemaker_session=pipeline_session,
)

pipeline.upsert(role_arn=role)
print("Pipeline upserted", PIPELINE_NAME)



Pipeline upserted aai540-ids-xgb-binary-cicd


## Run pipeline execution

In [22]:
execution = pipeline.start(
    parameters={
        "MinAUCToDeploy": MIN_AUC_TO_DEPLOY_DEFAULT,
        "TrainDataS3Uri": train_s3,
        "ValDataS3Uri": val_s3,
        "TestDataS3Uri": test_s3,
    }
)

print("Pipeline started:", execution.arn)

while True:
    desc = execution.describe()
    status = desc["PipelineExecutionStatus"]
    print("Status:", status)
    if status in ("Succeeded", "Failed", "Stopped"):
        break
    time.sleep(30)

print("\nFinal status:", status)
if desc.get("FailureReason"):
    print("\nFailureReason:\n", desc["FailureReason"])

print("\nStep statuses:")
for s in execution.list_steps():
    print("-", s.get("StepName"), ":", s.get("StepStatus"))
    if s.get("StepStatus") in ("Failed", "Stopped"):
        print("  FailureReason:", s.get("FailureReason"))

Pipeline started: arn:aws:sagemaker:us-east-1:128131109986:pipeline/aai540-ids-xgb-binary-cicd/execution/d0dauqf0adwa
Status: Executing
Status: Succeeded

Final status: Succeeded

Step statuses:
- RegisterModel-RegisterModel : Succeeded
- AUCGate : Succeeded
- EvaluateModel : Succeeded
- TuneXGB : Succeeded
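
The polling loop above runs until the execution reaches a terminal status and prints a line on every poll. A possible variant, sketched below, adds a timeout and prints only when the status changes. `wait_for_pipeline` is a hypothetical helper; `describe` stands in for `execution.describe`, and `sleep` and `clock` are injectable so the loop can be exercised without waiting.

```python
import time

TERMINAL_STATUSES = {"Succeeded", "Failed", "Stopped"}


def wait_for_pipeline(describe, poll_seconds=30.0, timeout_seconds=3600.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll describe() until a terminal status is seen or the timeout elapses."""
    deadline = clock() + timeout_seconds
    last_status = None
    while True:
        desc = describe()
        status = desc["PipelineExecutionStatus"]
        if status != last_status:
            print("Status:", status)  # print only on change to keep the log short
            last_status = status
        if status in TERMINAL_STATUSES:
            return desc
        if clock() >= deadline:
            raise TimeoutError(f"Pipeline still {status} after {timeout_seconds} seconds")
        sleep(poll_seconds)


# Demo with a scripted sequence of statuses and a no-op sleep.
scripted = iter(["Executing", "Executing", "Succeeded"])
result = wait_for_pipeline(
    lambda: {"PipelineExecutionStatus": next(scripted)},
    sleep=lambda seconds: None,
)
```

In the notebook this could be called as `wait_for_pipeline(execution.describe)`, with the timeout chosen to suit the expected tuning duration.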


## CD: Deploy latest Approved model package to endpoint

In [25]:
from sagemaker import ModelPackage

def get_latest_approved_model_package(group_name: str):
    resp = sm.list_model_packages(
        ModelPackageGroupName=group_name,
        ModelApprovalStatus="Approved",
        SortBy="CreationTime",
        SortOrder="Descending",
        MaxResults=1,
    )
    pkgs = resp.get("ModelPackageSummaryList", [])
    if not pkgs:
        raise RuntimeError(f"No Approved model packages found in group: {group_name}")
    return pkgs[0]["ModelPackageArn"]

latest_pkg_arn = get_latest_approved_model_package(MODEL_PACKAGE_GROUP_NAME)
print("Latest Approved ModelPackage ARN:", latest_pkg_arn)

mp = ModelPackage(
    role=role,
    model_package_arn=latest_pkg_arn,
    sagemaker_session=sess,
)

predictor = mp.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.large",
    endpoint_name=CICD_ENDPOINT_NAME,
)

print("Deployed endpoint:", CICD_ENDPOINT_NAME)

INFO:sagemaker:Creating model with name: aai540-ids-xgb-binary-pkg-2026-02-21-02-54-08-492


Latest Approved ModelPackage ARN: arn:aws:sagemaker:us-east-1:128131109986:model-package/aai540-ids-xgb-binary-pkg/2


INFO:sagemaker:Creating endpoint-config with name ids-xgboost-binary-cicd
INFO:sagemaker:Creating endpoint with name ids-xgboost-binary-cicd


------!Deployed endpoint: ids-xgboost-binary-cicd
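
Because the endpoint name is fixed, `mp.deploy` will most likely fail on a second run, when an endpoint and endpoint config with that name already exist. A CD step would then need to decide between creating the endpoint and updating it. The sketch below shows only that decision, taking a response shaped like the output of the boto3 `list_endpoints` call; `choose_deploy_action` is a hypothetical helper and the response is hand-written.

```python
def choose_deploy_action(list_endpoints_response: dict, endpoint_name: str) -> str:
    """Return 'update' if an endpoint with exactly this name exists, else 'create'."""
    existing = {
        summary["EndpointName"]
        for summary in list_endpoints_response.get("Endpoints", [])
    }
    # NameContains filters by substring, so an exact comparison is still needed here.
    return "update" if endpoint_name in existing else "create"


# Hand-written stand-in for sm.list_endpoints(NameContains="ids-xgboost-binary-cicd").
fake_response = {
    "Endpoints": [
        {"EndpointName": "ids-xgboost-binary-cicd-old"},
        {"EndpointName": "ids-xgboost-binary-cicd"},
    ]
}
print(choose_deploy_action(fake_response, "ids-xgboost-binary-cicd"))
```

On the `update` path, one option is to create a new endpoint config under a unique name and point the existing endpoint at it with the boto3 `update_endpoint` call, which keeps the endpoint name stable for callers.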
