## Fine-Tuning and Evaluating LLMs with SageMaker Pipelines and MLflow

Running hundreds of experiments, comparing their results, and keeping track of the ML lifecycle quickly becomes complex. MLflow helps streamline that lifecycle, from data preparation to model deployment. By integrating MLflow into your LLM workflow, you can manage experiment tracking, model versioning, and deployment in one place, which makes runs reproducible. You can track and compare the performance of multiple LLM experiments, identify the best-performing models, and deploy them to production with confidence.

You can create workflows with SageMaker Pipelines that enable you to prepare data, fine-tune models, and evaluate model performance with simple Python code for each step. 

Now you can use SageMaker managed MLflow to run LLM fine-tuning and evaluation experiments at scale. Specifically:

- MLflow can manage tracking of fine-tuning experiments, comparing evaluation results of different runs, model versioning, deployment, and configuration (such as data and hyperparameters)
- SageMaker Pipelines can orchestrate multiple experiments based on the experiment configuration 
  

The following figure shows an overview of the solution.
![](./ml-16670-arch-with-mlflow.png)

## Prerequisites 
Before you begin, make sure you have the following prerequisites in place:

- [HuggingFace access token](https://huggingface.co/docs/hub/en/security-tokens) – You need a HuggingFace login token to access the gated Llama 3.2 model and datasets used in this post.

- After you have your HuggingFace access token, open **steps/finetune_llama3b_hf.py** and set the **'hf_token'** parameter to your token so that the Llama model can be downloaded for fine-tuning.

### 1. Setup and Dependencies
Restart the kernel after running the cells below.

In [20]:
%pip install -r ./scripts/requirements.txt --upgrade --quiet

Note: you may need to restart the kernel to use updated packages.


In [21]:
from IPython import get_ipython
get_ipython().kernel.do_shutdown(True)

{'status': 'ok', 'restart': True}

**Importing Libraries and Setting Up Environment**

This part imports the necessary Python modules, including the SageMaker-specific imports for pipeline creation and execution. The pipeline steps themselves (preprocessing, fine-tuning, and evaluation) are defined later in this notebook with the `@step` decorator.

In [9]:
import os
import sagemaker
from sagemaker.workflow.execution_variables import ExecutionVariables
from sagemaker.workflow.function_step import step
from sagemaker.workflow.parameters import ParameterString
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.condition_step import ConditionStep
from sagemaker.workflow.conditions import ConditionGreaterThanOrEqualTo
from sagemaker.workflow.fail_step import FailStep
from sagemaker.workflow.steps import CacheConfig

### 2. SageMaker Session and IAM Role

`get_execution_role()`: Retrieves the IAM role that SageMaker will use to access AWS resources. This role needs appropriate permissions for tasks like accessing S3 buckets and creating SageMaker resources.

In [26]:
sagemaker_session = sagemaker.session.Session()
role = sagemaker.get_execution_role()
instance_type = "ml.m5.xlarge"
processing_instance_type = "ml.m5.xlarge"
training_instance_type = "ml.m5.xlarge"

In [27]:
bucket_name = sagemaker_session.default_bucket()
default_prefix = sagemaker_session.default_bucket_prefix
if default_prefix:
    input_path = f'{default_prefix}/datasets/llm-fine-tuning-modeltrainer-sft'
else:
    input_path = f'datasets/llm-fine-tuning-modeltrainer-sft'

train_data_path = f"s3://{bucket_name}/{input_path}/train/dataset.json"
test_dataset_path = f"s3://{bucket_name}/{input_path}/test/dataset.json"

pipeline_name = "deepseek-finetune-pipeline"
    
tracking_server_arn = "arn:aws:sagemaker:us-east-1:905418257479:mlflow-tracking-server/genai-mlflow-tracker"
experiment_name = "deepseek-finetune-pipeline"
os.environ["mlflow_uri"] = ""
os.environ["mlflow_experiment_name"] = "deepseek-finetune-pipeline"

model_id = "deepseek-ai/DeepSeek-R1-Distill-Llama-8B"
model_id_filesafe = model_id.replace("/", "_")
model_s3_destination = "s3://sagemaker-us-east-1-891377369387/models/deepseek-ai_DeepSeek-R1-Distill-Llama-8B"
use_local_model = True  # True: download the model locally; False: let the training job download it from Hugging Face

In [28]:
%%writefile config.yaml
SchemaVersion: '1.0'
SageMaker:
  PythonSDK:
    Modules:
      RemoteFunction:
        # role arn is not required if in SageMaker Notebook instance or SageMaker Studio
        # Uncomment the following line and replace with the right execution role if in a local IDE
        # RoleArn: <replace the role arn here>
        InstanceType: ml.m5.xlarge
        Dependencies: ./scripts/requirements.txt
        IncludeLocalWorkDir: true
        CustomFileFilter:
          IgnoreNamePatterns: # files or directories to ignore
          - "*.ipynb" # all notebook files



Overwriting config.yaml


In [29]:
# Set path to config file
os.environ["SAGEMAKER_USER_CONFIG_OVERRIDE"] = os.getcwd()

In [30]:
# %%writefile requirements.txt
# scikit-learn
# xgboost==1.7.6
# s3fs==0.4.2
# sagemaker>=2.199.0,<3
# pandas>=2.0.0
# gevent
# geventhttpclient
# shap
# matplotlib
# fsspec
# mlflow==2.13.2
# sagemaker-mlflow==0.1.0



### 3. Configuration

**Training Configuration**

The train_config dictionary is comprehensive, including:

- Experiment naming for tracking purposes
- Model specifications (ID, version, name)
- Infrastructure details (instance types and counts for fine-tuning and deployment)
- Training hyperparameters (epochs, batch size)

This configuration allows for easy adjustment of the training process without changing the core pipeline code.
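
As a rough illustration of the structure described above, a configuration dictionary might look like the following. The key names and the `with_overrides` helper are hypothetical (the notebook does not show them); the values mirror settings used elsewhere in this notebook.

```python
# Hypothetical layout for the training configuration; key names are
# illustrative assumptions, values mirror settings used in this notebook.
train_config = {
    "experiment_name": "deepseek-finetune-pipeline",
    "model_id": "deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
    "finetune_instance_type": "ml.p3.2xlarge",
    "finetune_instance_count": 1,
    "num_train_epochs": 1,
    "per_device_train_batch_size": 1,
}


def with_overrides(config: dict, **overrides) -> dict:
    """Return a copy of config with selected values replaced.

    Unknown keys are rejected so a typo does not silently add a new setting.
    """
    unknown = set(overrides) - set(config)
    if unknown:
        raise KeyError(f"Unknown config keys: {sorted(unknown)}")
    return {**config, **overrides}


# Example: a second run that trains for two epochs; the base config is untouched
two_epoch_config = with_overrides(train_config, num_train_epochs=2)
print(two_epoch_config["num_train_epochs"])
```

Keeping each run's settings in a separate copy like this makes it easier to compare runs later, because the base configuration never changes underneath them.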

In [49]:
from huggingface_hub import snapshot_download
from sagemaker.s3 import S3Uploader
import os
import subprocess


model_local_location = f"../models/{model_id_filesafe}"
# print("Downloading model ", model_id)
# os.makedirs(model_local_location, exist_ok=True)
# snapshot_download(repo_id=model_id, local_dir=model_local_location)
# print(f"Model {model_id} downloaded under {model_local_location}")

# if default_prefix:
#     model_s3_destination = f"s3://{bucket_name}/{default_prefix}/models/{model_id_filesafe}"
# else:
#     model_s3_destination = f"s3://{bucket_name}/models/{model_id_filesafe}"

# print(f"Beginning Model Upload...")

# subprocess.run(['aws', 's3', 'cp', model_local_location, model_s3_destination, '--recursive', '--exclude', '.cache/*', '--exclude', '.gitattributes'])

# print(f"Model Uploaded to: \n {model_s3_destination}")

# os.environ["model_location"] = model_s3_destination

print(model_s3_destination)
os.environ["model_location"] = model_s3_destination

s3://sagemaker-us-east-1-891377369387/models/deepseek-ai_DeepSeek-R1-Distill-Llama-8B


**LoRA Parameters**

Low-Rank Adaptation (LoRA) is an efficient fine-tuning technique for large language models. The parameters here (lora_r, lora_alpha, lora_dropout) control the behavior of LoRA during fine-tuning, affecting the trade-off between model performance and computational efficiency.
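
To make the trade-off concrete, the sketch below counts the trainable parameters LoRA adds to a single weight matrix and computes the scaling factor `lora_alpha / lora_r` applied to the low-rank update. The 4096 x 4096 matrix size is an assumed example, not a value read from the model; `lora_dropout` is not shown because it only affects training-time regularization of the LoRA branch.

```python
def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters in the two low-rank factors A (r x d_in) and B (d_out x r)."""
    return r * (d_in + d_out)


def lora_scaling(lora_alpha: float, r: int) -> float:
    """Scaling applied to the low-rank update: alpha / r."""
    return lora_alpha / r


# Assumed example: one 4096 x 4096 projection matrix (illustrative size)
d_in = d_out = 4096
full = d_in * d_out
lora = lora_trainable_params(d_in, d_out, r=8)

print(f"full matrix params: {full:,}")
print(f"LoRA params (r=8): {lora:,} ({100 * lora / full:.2f}% of full)")
print(f"scaling (alpha=16, r=8): {lora_scaling(16, 8)}")
```

With `lora_r: 8` and `lora_alpha: 16`, the adapter for such a matrix trains well under 1% of the parameters a full update would touch, and the update is scaled by 2.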

### 4. MLflow Setup

MLflow integration is crucial for experiment tracking and management. **Update the ARN for the MLflow tracking server.**

`tracking_server_arn`: The ARN of the MLflow tracking server. You can copy this ARN from the SageMaker Studio UI. It lets the pipeline log metrics, parameters, and artifacts to a central location.

`experiment_name`: The name of the MLflow experiment under which runs are grouped. Choose a descriptive name.
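
Because a mistyped ARN only surfaces later as a failed logging call, it can help to sanity-check the value up front. The sketch below is a minimal check, assuming the ARN layout seen in this notebook (`arn:aws:sagemaker:<region>:<account>:mlflow-tracking-server/<name>`); the helper name, the pattern, and the placeholder ARN are illustrative, and other AWS partitions are not handled.

```python
import re

# Pattern inferred from the tracking server ARN used in this notebook (assumption)
_ARN_PATTERN = re.compile(
    r"^arn:aws:sagemaker:(?P<region>[a-z0-9-]+):(?P<account>\d{12}):"
    r"mlflow-tracking-server/(?P<name>[A-Za-z0-9-]+)$"
)


def parse_tracking_server_arn(arn: str) -> dict:
    """Split a tracking server ARN into region, account ID, and server name."""
    match = _ARN_PATTERN.match(arn)
    if match is None:
        raise ValueError(f"Not an MLflow tracking server ARN: {arn!r}")
    return match.groupdict()


# Placeholder ARN for illustration; replace with your own
example_arn = "arn:aws:sagemaker:us-east-1:123456789012:mlflow-tracking-server/my-tracker"
print(parse_tracking_server_arn(example_arn))

# With the mlflow and sagemaker-mlflow packages installed, the ARN is then
# typically used as the tracking URI:
#   import mlflow
#   mlflow.set_tracking_uri(example_arn)
#   mlflow.set_experiment("deepseek-finetune-pipeline")
```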

### 5. Dataset Configuration

For fine-tuning and evaluation, we use the `FreedomIntelligence/medical-o1-reasoning-SFT` dataset (English configuration), which is the dataset loaded by the preprocessing and evaluation steps below.

### 6. Pipeline Steps

This section defines the core components of the SageMaker pipeline.

**Preprocessing Step**

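This step handles data preparation: it loads the dataset, splits it into training and test sets, applies the prompt template to each record, and uploads both sets to Amazon S3. The experiment name and run ID are passed through to the next step.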
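
The prompt-templating part of this step can be tried in isolation. The sketch below uses a shortened template (the notebook's full template also includes a system prompt) and a made-up record; only the field names `Question`, `Complex_CoT`, and `Response` come from the dataset used here.

```python
# Simplified template; the notebook's full template also includes a system prompt
PROMPT_TEMPLATE = (
    "<|start_header_id|>user<|end_header_id|>\n"
    "{question}<|eot_id|>\n"
    "<|start_header_id|>assistant<|end_header_id|>\n"
    "{complex_cot}\n\n"
    "{answer}\n"
    "<|eot_id|>"
)


def render_sample(sample: dict) -> str:
    """Fill the template from one record with Question/Complex_CoT/Response fields."""
    return PROMPT_TEMPLATE.format(
        question=sample["Question"],
        complex_cot=sample["Complex_CoT"],
        answer=sample["Response"],
    )


# Made-up record for illustration only (not taken from the dataset)
sample = {
    "Question": "What is a normal resting heart rate for adults?",
    "Complex_CoT": "Recall the commonly cited adult reference range.",
    "Response": "Roughly 60 to 100 beats per minute.",
}
print(render_sample(sample))
```

Placing the reasoning trace before the final answer in the assistant turn is what teaches the model to produce its chain of thought first.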

In [50]:
@step(
    name="DataPreprocessing",
    instance_type=processing_instance_type,
    display_name="Data Preprocessing",
    keep_alive_period_in_seconds=3600
)
def preprocess(
    input_path: str,
    experiment_name: str,
    run_id: str,
) -> tuple:
    import boto3
    import shutil
    import sagemaker
    from sagemaker.config import load_sagemaker_config
    
    sagemaker_session = sagemaker.Session()
    s3_client = boto3.client('s3')

    bucket_name = sagemaker_session.default_bucket()
    default_prefix = sagemaker_session.default_bucket_prefix
    configs = load_sagemaker_config()
    
    from datasets import load_dataset
    import pandas as pd
    
    dataset = load_dataset("FreedomIntelligence/medical-o1-reasoning-SFT", "en")
    
    df = pd.DataFrame(dataset['train'])
    df = df[:100]
    
    # df.head()
    
    from sklearn.model_selection import train_test_split
    
    train, test = train_test_split(df, test_size=0.1, random_state=42, shuffle=True)
    
    print("Number of train elements: ", len(train))
    print("Number of test elements: ", len(test))
    
    # custom instruct prompt start
    prompt_template = f"""
    <|begin_of_text|>
    <|start_header_id|>system<|end_header_id|>
    You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning. 
    Below is an instruction that describes a task, paired with an input that provides further context. 
    Write a response that appropriately completes the request.
    Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.
    <|eot_id|><|start_header_id|>user<|end_header_id|>
    {{question}}<|eot_id|>
    <|start_header_id|>assistant<|end_header_id|>
    {{complex_cot}}
    
    {{answer}}
    <|eot_id|>
    """
    
    # template dataset to add prompt to each sample
    def template_dataset(sample):
        sample["text"] = prompt_template.format(question=sample["Question"],
                                                complex_cot=sample["Complex_CoT"],
                                                answer=sample["Response"])
        return sample
    
    from datasets import Dataset, DatasetDict
    from random import randint
    
    train_dataset = Dataset.from_pandas(train)
    test_dataset = Dataset.from_pandas(test)
    
    dataset = DatasetDict({"train": train_dataset, "test": test_dataset})
    
    train_dataset = dataset["train"].map(template_dataset, remove_columns=list(dataset["train"].features))
    
    # Print one random training sample (randint is inclusive at both ends)
    print(train_dataset[randint(0, len(train_dataset) - 1)]["text"])
    
    test_dataset = dataset["test"].map(template_dataset, remove_columns=list(dataset["test"].features))
    
    # Build the S3 key prefix, honoring the session's default bucket prefix if one is set
    if default_prefix:
        input_path = f'{default_prefix}/datasets/llm-fine-tuning-modeltrainer-sft'
    else:
        input_path = 'datasets/llm-fine-tuning-modeltrainer-sft'

    # Save the datasets locally, then upload them to S3
    # Only the first 100 records (90 train, 10 test) are used because of limited compute resources in the workshop
    import os
    os.makedirs("./data/train", exist_ok=True)
    os.makedirs("./data/test", exist_ok=True)
    train_dataset.to_json("./data/train/dataset.json", orient="records")
    test_dataset.to_json("./data/test/dataset.json", orient="records")
    train_data_path = f"s3://{bucket_name}/{input_path}/train/dataset.json"
    test_dataset_path = f"s3://{bucket_name}/{input_path}/test/dataset.json"
    s3_client.upload_file("./data/train/dataset.json", bucket_name, f"{input_path}/train/dataset.json")
    s3_client.upload_file("./data/test/dataset.json", bucket_name, f"{input_path}/test/dataset.json")

    print(train_data_path)
    print(test_dataset_path)

    shutil.rmtree("./data")

    return experiment_name, run_id, train_data_path, test_dataset_path

In [51]:
%%bash

cat > ./args.yaml <<EOF

# MLflow Config
mlflow_uri: "${mlflow_uri}"
mlflow_experiment_name: "${mlflow_experiment_name}"


model_id: "${model_location}"       # Hugging Face model id, or S3 location

# sagemaker specific parameters
output_dir: "/opt/ml/model"                       # path to where SageMaker will upload the model 
train_dataset_path: "/opt/ml/input/data/train/"   # path where SageMaker makes the train channel available
test_dataset_path: "/opt/ml/input/data/test/"     # path where SageMaker makes the test channel available
# training parameters
max_seq_length: 1500  #512 # 2048
lora_r: 8
lora_alpha: 16
lora_dropout: 0.1                 
learning_rate: 2e-4                    # initial learning rate
num_train_epochs: 1                    # number of training epochs
per_device_train_batch_size: 1         # batch size per device during training
per_device_eval_batch_size: 1          # batch size for evaluation
gradient_accumulation_steps: 2         # number of steps before performing a backward/update pass
gradient_checkpointing: true           # use gradient checkpointing
fp16: true
bf16: false                            # use bfloat16 precision, also enables FlashAttention2 (requires Ampere/Hopper GPU+ ex:A10, A100, H100)
tf32: false                            # use tf32 precision

#uncomment here for fsdp - start
# fsdp: "full_shard auto_wrap offload"
# fsdp_config: 
#     backward_prefetch: "backward_pre"
#     cpu_ram_efficient_loading: true
#     offload_params: true
#     forward_prefetch: false
#     use_orig_params: true
#uncomment here for fsdp - end

merge_weights: true                    # merge weights in the base model
EOF

In [52]:
from sagemaker.s3 import S3Uploader

if default_prefix:
    input_path = f"s3://{bucket_name}/{default_prefix}/training_config/{model_id_filesafe}"
else:
    input_path = f"s3://{bucket_name}/training_config/{model_id_filesafe}"

# upload the model yaml file to s3
model_yaml = "args.yaml"
train_config_s3_path = S3Uploader.upload(local_path=model_yaml, desired_s3_uri=f"{input_path}/config")

print(f"Training config uploaded to:")
print(train_config_s3_path)

Training config uploaded to:
s3://sagemaker-us-east-1-891377369387/training_config/deepseek-ai_DeepSeek-R1-Distill-Llama-8B/config/args.yaml


**Fine-tuning Step**

This is where the actual model adaptation occurs. The step launches a SageMaker training job that takes the preprocessed data and fine-tunes the base LLM (in this case DeepSeek-R1-Distill-Llama-8B, a Llama-based model). It uses the LoRA technique for efficient adaptation.
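
For a sense of how long a run is, the batch settings in `args.yaml` can be turned into an approximate number of optimizer updates per epoch. This is a back-of-the-envelope sketch: the helper names are illustrative, a single GPU is assumed (ml.p3.2xlarge has one), and the 90 training samples come from the 100-record subset with a 10% test split in the preprocessing step.

```python
import math


def effective_batch_size(per_device: int, grad_accum: int, num_devices: int) -> int:
    """Number of samples contributing to each optimizer update."""
    return per_device * grad_accum * num_devices


def optimizer_steps_per_epoch(num_samples: int, effective_batch: int) -> int:
    """Optimizer updates per epoch, counting a final partial batch as one step."""
    return math.ceil(num_samples / effective_batch)


# Values from args.yaml and the preprocessing step; num_devices=1 is an assumption
batch = effective_batch_size(per_device=1, grad_accum=2, num_devices=1)
steps = optimizer_steps_per_epoch(num_samples=90, effective_batch=batch)
print(batch, steps)
```

The exact step count reported by the training framework may differ slightly depending on how it handles the last partial batch.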

In [53]:
@step(
    name="ModelFineTuning",
    instance_type=training_instance_type,
    display_name="Model Fine Tuning",
    keep_alive_period_in_seconds=3600
)
def train(
    train_dataset_s3_path: str,
    test_dataset_s3_path: str,
    train_config_s3_path: str,
    experiment_name: str,
    model_id: str,
    run_id: str,
):
    import sagemaker
    import boto3
    job_name = "deepseek-finetune-pipeline"
    from sagemaker.pytorch import PyTorch
    sagemaker_session = sagemaker.Session()
    pytorch_estimator = PyTorch(
        entry_point='train.py',
        source_dir="./scripts",
        job_name=job_name,
        base_job_name=job_name,
        max_run=50000,
        role=role,
        framework_version="2.2.0",
        py_version="py310",
        instance_count=1,
        instance_type="ml.p3.2xlarge",
        sagemaker_session=sagemaker_session,
        volume_size=50,
        disable_output_compression=True,
        keep_alive_period_in_seconds=1800,
        distribution={"torch_distributed": {"enabled": True}},
        hyperparameters={
            "config": "/opt/ml/input/data/config/args.yaml"
        }
    )

    # Define a data input dictionary with our uploaded S3 URIs
    data = {
        'train': train_dataset_s3_path,
        'test': test_dataset_s3_path,
        'config': train_config_s3_path
    }

    print(f"Data for Training Run: {data}")

    pytorch_estimator.fit(data, wait=True)

    latest_run_job_name = pytorch_estimator.latest_training_job.job_name
    print(f"Latest Job Name: {latest_run_job_name}")

    sagemaker_client = boto3.client('sagemaker')

    # Describe the training job
    response = sagemaker_client.describe_training_job(TrainingJobName=latest_run_job_name)

    # Extract the model artifacts S3 path
    model_artifacts_s3_path = response['ModelArtifacts']['S3ModelArtifacts']

    # Extract the output path (this is the general output location)
    output_path = response['OutputDataConfig']['S3OutputPath']

    print(f"Model artifacts S3 path: {model_artifacts_s3_path}")

    return experiment_name, run_id, model_artifacts_s3_path, output_path

**Evaluation Step**

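After fine-tuning, this step assesses the model's performance. It sends questions from the dataset to a deployed SageMaker endpoint that hosts the fine-tuned model and scores the generated responses against the reference answers using the ROUGE-1, ROUGE-2, and ROUGE-L metrics from LightEval. The endpoint name is set in `FINETUNED_MODEL_ENDPOINT`; update it to match your deployment.

The step returns the ROUGE scores together with the experiment name and run ID.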
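
The evaluation code in this step reports ROUGE F1 scores. As a rough illustration of what ROUGE-1 measures, the sketch below computes a unigram-overlap F1 score in plain Python. It is a simplified stand-in, assuming lowercase whitespace tokenization; it is not the LightEval implementation, which applies its own tokenization and normalization.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 using lowercase whitespace tokens (simplified ROUGE-1)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped overlap: each token counts at most as often as it appears in both texts
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


print(rouge1_f1("the patient has a fever", "the patient has a high fever"))
```

ROUGE-2 applies the same idea to pairs of adjacent words, and ROUGE-L uses the longest common subsequence instead of n-gram counts.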

In [58]:
@step(
    name="ModelEvaluation",
    instance_type=training_instance_type,
    display_name="Model Evaluation",
    keep_alive_period_in_seconds=3600
)
def evaluate(
    experiment_name: str,
    run_id: str,
    model_artifacts_s3_path: str,
):
    # Import libraries
    import os
    import json
    import time
    import boto3
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    from tqdm import tqdm  # plain tqdm: this step runs as a remote job, not in a notebook
    from datasets import load_dataset
    import torch
    import torchvision
    import transformers

    # Import LightEval metrics
    from lighteval.metrics.metrics_sample import ROUGE, Doc

    # Initialize the SageMaker client
    sm_client = boto3.client('sagemaker-runtime')

    FINETUNED_MODEL_ENDPOINT = "DeepSeek-R1-Distill-Llama-8B-sft-djl"  # Update with Fine-tuned model endpoint name

    # Define the model to evaluate
    model_to_evaluate = {
        "name": "Fine-tuned DeepSeek-R1-Distill-Llama-8B", 
        "endpoint": FINETUNED_MODEL_ENDPOINT
    }
    # Limit the number of samples to evaluate (for faster execution)
    num_samples = 10

    # Load the train split of the medical-o1-reasoning-SFT dataset (English configuration)
    dataset = load_dataset("FreedomIntelligence/medical-o1-reasoning-SFT", "en", split="train")

    max_samples = len(dataset)

    dataset = dataset.shuffle().select(range(min(num_samples, max_samples)))
    print(f"Loaded medical-o1-reasoning dataset with {len(dataset)} samples out of {max_samples}")

    # Display a sample from the dataset
    sample = dataset[0]

    print("\nQuestion:\n", sample["Question"], "\n\n====\n")
    print("Complex_CoT:\n", sample["Complex_CoT"], "\n\n====\n")
    print("Response:\n", sample["Response"], "\n\n====\n")


    # This function allows you to interact with a deployed SageMaker endpoint to get predictions from the DeepSeek model
    def invoke_sagemaker_endpoint(payload, endpoint_name):
        """
        Invoke a SageMaker endpoint with the given payload.

        Args:
            payload (dict): The input data to send to the endpoint
            endpoint_name (str): The name of the SageMaker endpoint

        Returns:
            dict: The response from the endpoint
        """
        response = sm_client.invoke_endpoint(
            EndpointName=endpoint_name,
            ContentType='application/json',
            Body=json.dumps(payload)
        )

        response_body = response['Body'].read().decode('utf-8')
        return json.loads(response_body)


    # Initialize LightEval metrics calculators
    rouge_metrics = ROUGE(
        methods=["rouge1", "rouge2", "rougeL"],
        multiple_golds=False,
        bootstrap=False,
        normalize_gold=None,
        normalize_pred=None
    )


    def calculate_metrics(predictions, references):
        """
        Calculate ROUGE evaluation metrics using LightEval.

        Args:
            predictions (list): List of generated responses
            references (list): List of reference responses

        Returns:
            dict: Dictionary containing the averaged metric scores
        """
        metrics = {}

        # Create Doc objects for the references (not used by the ROUGE calculation below)
        docs = []
        for reference in references:
            docs.append(Doc(
                {"target": reference},
                choices=[reference],  # Dummy choices
                gold_index=0  # Dummy gold_index
            ))

        # Calculate ROUGE scores for each prediction-reference pair
        rouge_scores = {'rouge1_f': [], 'rouge2_f': [], 'rougeL_f': []}

        for pred, ref in zip(predictions, references):
            # For ROUGE calculation
            rouge_result = rouge_metrics.compute(golds=[ref], predictions=[pred])
            rouge_scores['rouge1_f'].append(rouge_result['rouge1'])
            rouge_scores['rouge2_f'].append(rouge_result['rouge2'])
            rouge_scores['rougeL_f'].append(rouge_result['rougeL'])

        # Average ROUGE scores
        for key in rouge_scores:
            metrics[key] = sum(rouge_scores[key]) / len(rouge_scores[key])

        print(f"Metrics: {metrics}")

        return metrics


    def generate_summaries_with_model(endpoint_name, dataset):
        """
        Generate responses using a model deployed on SageMaker.

        Args:
            endpoint_name (str): SageMaker endpoint name
            dataset: Dataset containing the questions

        Returns:
            list: Generated responses
        """
        predictions = []

        for example in tqdm(dataset, desc="Generating Responses"):
            question = example["Question"]

            # Prepare the prompt for the model
            prompt = f"""
            <|begin_of_text|>
            <|start_header_id|>system<|end_header_id|>
            You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning. 
            Below is an instruction that describes a task, paired with an input that provides further context. 
            Write a response that appropriately completes the request.
            Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.
            <|eot_id|><|start_header_id|>user<|end_header_id|>
            {question}<|eot_id|>
            <|start_header_id|>assistant<|end_header_id|>"""

            # Payload for SageMaker endpoint
            payload = {
                "inputs": prompt,
                "parameters": {
                    "max_new_tokens": 512,
                    "top_p": 0.9,
                    "temperature": 0.6,
                    "return_full_text": False
                }
            }

            # Call the model endpoint
            try:
                response = invoke_sagemaker_endpoint(payload, endpoint_name)

                # Extract the generated text
                if isinstance(response, list):
                    prediction = response[0].get('generated_text', '').strip()
                elif isinstance(response, dict):
                    prediction = response.get('generated_text', '').strip()
                else:
                    prediction = str(response).strip()

                prediction = prediction.split("<|eot_id|>")[0]
                # Clean up the generated text
                #if "Summary:" in prediction:
                #    prediction = prediction.split("Summary:", 1)[1].strip()

            except Exception as e:
                print(f"Error invoking SageMaker endpoint {endpoint_name}: {e}")
                prediction = "Error generating summary."

            predictions.append(prediction)

        return predictions

    def evaluate_model_on_dataset(model_config, dataset):
        """
        Evaluate a fine-tuned model on the medical-o1-reasoning dataset using ROUGE metrics.

        Args:
            model_config (dict): Model configuration with name and endpoint
            dataset: Dataset samples to evaluate on

        Returns:
            tuple: Evaluation results dict, followed by the ROUGE-1, ROUGE-2, and ROUGE-L scores
        """
        model_name = model_config["name"]
        endpoint_name = model_config["endpoint"]

        print(f"\nEvaluating model: {model_name} on endpoint: {endpoint_name}")

        # Get references
        references = ["\n".join([example["Complex_CoT"], example["Response"]]) for example in dataset]

        # Generate summaries
        print("\nGenerating Responses...")
        predictions = generate_summaries_with_model(endpoint_name, dataset)

        # Calculate automated metrics using LightEval
        print("\nCalculating evaluation metrics with LightEval...")
        metrics = calculate_metrics(predictions, references)

        # Format results
        results = {
            "model_name": model_name,
            "endpoint_name": endpoint_name,
            "num_samples": len(dataset),
            "metrics": metrics,
            "predictions": predictions[:5],  # First 5 predictions
            "references": references[:5]     # First 5 references
        }

        # Print key results
        print(f"\nResults for {model_name}:")
        print(f"ROUGE-1 F1: {metrics['rouge1_f']:.4f}")
        print(f"ROUGE-2 F1: {metrics['rouge2_f']:.4f}")
        print(f"ROUGE-L F1: {metrics['rougeL_f']:.4f}")

        return results, metrics['rouge1_f'], metrics['rouge2_f'], metrics['rougeL_f']

    finetuned_model_results, rouge1_f, rouge2_f, rougeL_f = evaluate_model_on_dataset(model_to_evaluate, dataset)

    return experiment_name, run_id, rouge1_f, rouge2_f, rougeL_f

### 7. Pipeline Creation and Execution

This final section brings all the components together into an executable pipeline.

**Creating the Pipeline**

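The pipeline object is created from the three steps defined above. Each step receives the outputs of the previous step as arguments, which is how the execution order is established. The LoRA settings are read from the `args.yaml` training configuration uploaded earlier, so they can be changed between runs without touching the pipeline code.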

In [55]:
preprocessing_step = preprocess(
    experiment_name=experiment_name,
    run_id=ExecutionVariables.PIPELINE_EXECUTION_ID,
    input_path=input_path,
)

training_step = train(
    train_dataset_s3_path=preprocessing_step[2],
    test_dataset_s3_path=preprocessing_step[3],
    train_config_s3_path=train_config_s3_path,
    experiment_name=preprocessing_step[0],
    run_id=preprocessing_step[1],
    model_id=model_s3_destination,
)

evaluate_step = evaluate(
    experiment_name=training_step[0],
    run_id=training_step[1],
    model_artifacts_s3_path=training_step[2],
)

pipeline = Pipeline(
    name=pipeline_name,
    parameters=[
        instance_type,
    ],
    steps=[preprocessing_step, training_step, evaluate_step],
)

**Upserting the Pipeline**

This step either creates a new pipeline in SageMaker or updates an existing one with the same name. It's a key part of the MLOps process, allowing for iterative refinement of the pipeline.

In [56]:
pipeline.upsert(role)

sagemaker.config INFO - Applied value from config key = SageMaker.PythonSDK.Modules.RemoteFunction.Dependencies
sagemaker.config INFO - Applied value from config key = SageMaker.PythonSDK.Modules.RemoteFunction.IncludeLocalWorkDir
sagemaker.config INFO - Applied value from config key = SageMaker.PythonSDK.Modules.RemoteFunction.CustomFileFilter.IgnoreNamePatterns


2025-05-21 08:45:31,205 sagemaker.remote_function INFO     Uploading serialized function code to s3://sagemaker-us-east-1-891377369387/deepseek-finetune-pipeline/DataPreprocessing/2025-05-21-08-45-29-048/function
2025-05-21 08:45:31,338 sagemaker.remote_function INFO     Uploading serialized function arguments to s3://sagemaker-us-east-1-891377369387/deepseek-finetune-pipeline/DataPreprocessing/2025-05-21-08-45-29-048/arguments
2025-05-21 08:45:31,611 sagemaker.remote_function INFO     Copied dependencies file at './scripts/requirements.txt' to '/tmp/tmp7d7wq13m/requirements.txt'
2025-05-21 08:45:31,638 sagemaker.remote_function INFO     Successfully uploaded dependencies and pre execution scripts to 's3://sagemaker-us-east-1-891377369387/deepseek-finetune-pipeline/DataPreprocessing/2025-05-21-08-45-29-048/pre_exec_script_and_dependencies'
2025-05-21 08:45:31,642 sagemaker.remote_function INFO     Copied user workspace to '/tmp/tmpd4ri3qct/temp_workspace/sagemaker_remote_function_works

sagemaker.config INFO - Applied value from config key = SageMaker.PythonSDK.Modules.RemoteFunction.Dependencies
sagemaker.config INFO - Applied value from config key = SageMaker.PythonSDK.Modules.RemoteFunction.IncludeLocalWorkDir
sagemaker.config INFO - Applied value from config key = SageMaker.PythonSDK.Modules.RemoteFunction.CustomFileFilter.IgnoreNamePatterns


2025-05-21 08:45:33,527 sagemaker.remote_function INFO     Uploading serialized function code to s3://sagemaker-us-east-1-891377369387/deepseek-finetune-pipeline/ModelFineTuning/2025-05-21-08-45-29-048/function
2025-05-21 08:45:33,582 sagemaker.remote_function INFO     Uploading serialized function arguments to s3://sagemaker-us-east-1-891377369387/deepseek-finetune-pipeline/ModelFineTuning/2025-05-21-08-45-29-048/arguments
2025-05-21 08:45:33,636 sagemaker.remote_function INFO     Copied dependencies file at './scripts/requirements.txt' to '/tmp/tmp386ztkby/requirements.txt'
2025-05-21 08:45:33,675 sagemaker.remote_function INFO     Successfully uploaded dependencies and pre execution scripts to 's3://sagemaker-us-east-1-891377369387/deepseek-finetune-pipeline/ModelFineTuning/2025-05-21-08-45-29-048/pre_exec_script_and_dependencies'
2025-05-21 08:45:34,233 sagemaker.remote_function INFO     Uploading serialized function code to s3://sagemaker-us-east-1-891377369387/deepseek-finetune-p

{'PipelineArn': 'arn:aws:sagemaker:us-east-1:891377369387:pipeline/deepseek-finetune-pipeline',
 'ResponseMetadata': {'RequestId': '60048ada-b507-49ee-b774-4bd67b092954',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '60048ada-b507-49ee-b774-4bd67b092954',
   'content-type': 'application/x-amz-json-1.1',
   'content-length': '94',
   'date': 'Wed, 21 May 2025 08:45:35 GMT'},
  'RetryAttempts': 0}}

**Starting the Pipeline Execution**

This command kicks off the actual execution of the pipeline in SageMaker. From this point, SageMaker will orchestrate the execution of each step, managing resources and data flow between steps.

In [57]:
execution1 = pipeline.start()
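
Once the execution is running, its progress can be checked step by step. The sketch below is a small helper, assuming the step records follow the shape of the `ListPipelineExecutionSteps` response (`StepName`, `StepStatus`, and an optional `FailureReason`); the helper name and the sample records are illustrative, not output from a real run.

```python
def summarize_steps(steps: list) -> dict:
    """Map each step name to its status, appending the failure reason if present."""
    summary = {}
    for step_info in steps:
        status = step_info["StepStatus"]
        reason = step_info.get("FailureReason")
        summary[step_info["StepName"]] = f"{status}: {reason}" if reason else status
    return summary


# Hand-written sample in the shape of the ListPipelineExecutionSteps response
sample_steps = [
    {"StepName": "DataPreprocessing", "StepStatus": "Succeeded"},
    {"StepName": "ModelFineTuning", "StepStatus": "Executing"},
]
print(summarize_steps(sample_steps))

# Against a real execution (requires AWS credentials), this would typically be:
#   execution1.wait()   # block until the execution finishes
#   print(summarize_steps(execution1.list_steps()))
```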

# Clean up