# 🚀 End-to-End Model Customization | Fine-tuning, Evaluation & Deployment

This notebook demonstrates serverless training, evaluation pipelines, and model deployment for production use.

## Setup and Dependencies

In [None]:
!pip install --upgrade sagemaker --quiet  # restart the kernel after running this cell

In [None]:
# Setup SageMaker session
import boto3
import os
from rich import print as rprint
from rich.pretty import pprint
from sagemaker.core.helper.session_helper import Session

REGION = boto3.Session().region_name
sm_client = boto3.client("sagemaker", region_name=REGION)

# Create SageMaker session
sagemaker_session = Session(sagemaker_client=sm_client)


print(f"Region: {REGION}")

# To see native MLflow metrics while the Trainer waits for the job, set this variable to the MLflow endpoint for your region
os.environ["SAGEMAKER_MLFLOW_CUSTOM_ENDPOINT"] = f"https://mlflow.sagemaker.{REGION}.app.aws"
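
`boto3.Session().region_name` returns `None` when no default region is configured, and the MLflow endpoint URL above would then contain the literal text `None`. The sketch below shows one way to guard against that. The helper names `resolve_region` and `mlflow_endpoint` are hypothetical, and checking `AWS_REGION` before `AWS_DEFAULT_REGION` is an assumption about your environment, not SDK behavior.

```python
import os


def resolve_region(session_region, environ=None):
    """Return the first non-empty region candidate, or raise if none is set."""
    env = os.environ if environ is None else environ
    candidates = [session_region, env.get("AWS_REGION"), env.get("AWS_DEFAULT_REGION")]
    for candidate in candidates:
        if candidate:
            return candidate
    raise ValueError("No AWS region configured; set AWS_DEFAULT_REGION or pass a region explicitly.")


def mlflow_endpoint(region):
    """Build the MLflow endpoint URL in the same shape as the setup cell."""
    return f"https://mlflow.sagemaker.{region}.app.aws"


# Example with an explicit mapping standing in for the real os.environ.
region = resolve_region(None, environ={"AWS_DEFAULT_REGION": "us-west-2"})
print(mlflow_endpoint(region))
```

In the notebook, this would be called as `resolve_region(boto3.Session().region_name)` before `REGION` is used.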


#### Create Training Dataset
The following cell shows sample code that registers a training dataset and prints its ARN.

In [None]:
from sagemaker.ai_registry.dataset import DataSet
from sagemaker.ai_registry.dataset_utils import CustomizationTechnique

# Register the dataset in the SageMaker AI Registry. This creates a versioned dataset that can be referenced by ARN.
dataset = DataSet.create(
    name="demo-sft-dataset",
    source="s3://your-bucket/dataset/training_dataset.jsonl",  # Source can be an S3 URI or a local path
    # Optional: technique name (SFT, DPO, or RLVR) used for a minimal dataset format check.
    # customization_technique=CustomizationTechnique.SFT,
    wait=True
)

print(f"TRAINING_DATASET ARN: {dataset.arn}")
# TRAINING_DATASET = dataset.arn
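
Because the dataset source is a JSONL file, a quick local check that every line parses as a JSON object can catch formatting problems before registration. This is a minimal sketch: `check_jsonl` is a hypothetical helper, and the `prompt` and `completion` field names in the demo are placeholders, not the schema any particular trainer requires.

```python
import json
import os
import tempfile


def check_jsonl(path):
    """Return (record_count, errors); every non-blank line must be a JSON object."""
    count, errors = 0, []
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                errors.append(f"line {line_number}: {exc.msg}")
                continue
            if not isinstance(record, dict):
                errors.append(f"line {line_number}: expected a JSON object")
                continue
            count += 1
    return count, errors


# Demo on a throwaway file; the field names are placeholders only.
with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False, encoding="utf-8") as tmp:
    tmp.write(json.dumps({"prompt": "2+2?", "completion": "4"}) + "\n")
    tmp.write("not json\n")
    demo_path = tmp.name

demo_count, demo_errors = check_jsonl(demo_path)
os.remove(demo_path)
print(demo_count, demo_errors)
```

The check only confirms that the file is well-formed JSONL. It does not replace the optional format check that `customization_technique` enables.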

In [None]:
# Required Configs
BASE_MODEL = ""

# MODEL_PACKAGE_GROUP_NAME is the same as CUSTOM_MODEL_NAME
MODEL_PACKAGE_GROUP_NAME = ""

TRAINING_DATASET = ""

S3_OUTPUT_PATH = ""

ROLE_ARN = ""
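
All of the values above start out empty, so a forgotten setting only shows up later as a failed API call. The sketch below is one way to fail early. `find_config_problems` is a hypothetical helper, and the `s3://` and `arn:` prefix checks are simple sanity checks, not full validation.

```python
def find_config_problems(config):
    """Return a list of human-readable problems for the required settings."""
    problems = []
    for key, value in config.items():
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{key} is empty")
    s3_path = config.get("S3_OUTPUT_PATH", "")
    if s3_path and not s3_path.startswith("s3://"):
        problems.append("S3_OUTPUT_PATH should start with s3://")
    role = config.get("ROLE_ARN", "")
    if role and not role.startswith("arn:"):
        problems.append("ROLE_ARN should be an ARN")
    return problems


# Placeholder values for illustration only.
example_config = {
    "BASE_MODEL": "",
    "S3_OUTPUT_PATH": "bucket/out",
    "ROLE_ARN": "arn:aws:iam::123456789012:role/Demo",
}
for problem in find_config_problems(example_config):
    print(problem)
```

In the notebook, the dictionary would be built from `BASE_MODEL`, `MODEL_PACKAGE_GROUP_NAME`, `TRAINING_DATASET`, `S3_OUTPUT_PATH`, and `ROLE_ARN`.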

#### Create Model Package Group

In [None]:
from sagemaker.core.resources import ModelPackageGroup
model_package_group = ModelPackageGroup.create(
    model_package_group_name=MODEL_PACKAGE_GROUP_NAME,
    model_package_group_description='' #Required Description
)

#### End-user license agreements (EULA)
Some foundation models require explicit acceptance of an end-user license agreement (EULA) before use. To agree to the terms and conditions, set the `ACCEPT_EULA` variable below to `True`. For more details, see https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-foundation-models-choose.html/

In [None]:
ACCEPT_EULA = False

# Part 1: Fine-tuning

### Step 1: Creating the Trainer

#### Choose one of the following trainer techniques:
- **Option 1: SFT Trainer (Supervised Fine-Tuning)** 
- **Option 2: RLVR Trainer (Reinforcement Learning with Verifiable Rewards)**
- **Option 3: RLAIF Trainer (Reinforcement Learning from AI Feedback)** 
- **Option 4: DPO Trainer (Direct Preference Optimization)** 

**Instructions:** Run only ONE of the trainers, not all of them.

#### Create SFT Trainer (Supervised Fine-Tuning)

##### Key Parameters:
* `model`: Base model ID from SageMaker hub content that is available for fine-tuning, or ModelPackage artifacts
* `training_type`: `TrainingType` enum value from `sagemaker.train.common`, either `LORA` or `FULL` (optional)
* `model_package_group`: ModelPackage group name or `ModelPackageGroup` object (optional)
* `mlflow_resource_arn`: MLflow app ARN used to track the training job (optional)
* `mlflow_experiment_name`: MLflow experiment name, as a string (optional)
* `mlflow_run_name`: MLflow run name, as a string (optional)
* `training_dataset`: Training dataset, as a dataset ARN or an S3 path. A training dataset is required for the job to run, but it can be supplied either here or in `.train()` (optional)
* `validation_dataset`: Validation dataset, as a dataset ARN or an S3 path (optional)
* `s3_output_path`: S3 path for the trained model artifacts (optional)

In [None]:
from sagemaker.train.sft_trainer import SFTTrainer
from sagemaker.train.common import TrainingType


trainer = SFTTrainer(
    model=BASE_MODEL,
    training_type=TrainingType.LORA,
    model_package_group=model_package_group,
    training_dataset=TRAINING_DATASET,
    s3_output_path=S3_OUTPUT_PATH,
    sagemaker_session=sagemaker_session,
    accept_eula=ACCEPT_EULA,
    role=ROLE_ARN
)

### OR

#### Create RLVRTrainer (Reinforcement Learning with Verifiable Rewards)

##### Key Parameters:
* `model`: Base model ID from SageMaker hub content that is available for fine-tuning, or ModelPackage artifacts
* `custom_reward_function`: Custom reward function or evaluator ARN (optional)
* `model_package_group`: ModelPackage group name or `ModelPackageGroup` object (optional)
* `mlflow_resource_arn`: MLflow app ARN used to track the training job (optional)
* `mlflow_experiment_name`: MLflow experiment name, as a string (optional)
* `mlflow_run_name`: MLflow run name, as a string (optional)
* `training_dataset`: Training dataset, as a dataset ARN or an S3 path. A training dataset is required for the job to run, but it can be supplied either here or in `.train()` (optional)
* `validation_dataset`: Validation dataset, as a dataset ARN or an S3 path (optional)
* `s3_output_path`: S3 path for the trained model artifacts (optional)

In [None]:
from sagemaker.train.rlvr_trainer import RLVRTrainer


trainer = RLVRTrainer(
    model=BASE_MODEL,
    model_package_group=model_package_group,
    training_dataset=TRAINING_DATASET,
    s3_output_path=S3_OUTPUT_PATH,
    sagemaker_session=sagemaker_session,
    accept_eula=ACCEPT_EULA,
    role=ROLE_ARN
)

# You can pass the custom reward function to the trainer.
# CUSTOM_REWARD_FUNCTION = "arn:aws:sagemaker:<region>:<accountId>:hub-content/<HUB-NAME>/JsonDoc/<CUSTOM_REWARD_FUNCTION>/<VERSION>"
#
# trainer = RLVRTrainer(
#     model=BASE_MODEL,
#     model_package_group=model_package_group,
#     training_dataset=TRAINING_DATASET,
#     custom_reward_function=CUSTOM_REWARD_FUNCTION,
#     s3_output_path=S3_OUTPUT_PATH,
#     sagemaker_session=sagemaker_session,
#     accept_eula=ACCEPT_EULA,
#     role=ROLE_ARN
# )

### OR

#### Create RLAIF Trainer (Reinforcement Learning from AI Feedback)
This trainer uses AI models as reward functions.

##### Key Parameters:
* `model`: Base model ID from SageMaker hub content that is available for fine-tuning, or ModelPackage artifacts
* `reward_model_id`: Bedrock model ID. For supported evaluation models, see https://docs.aws.amazon.com/bedrock/latest/userguide/evaluation-judge.html (optional)
* `reward_prompt`: Reward prompt ARN or a built-in prompt. For built-in prompts, see https://docs.aws.amazon.com/bedrock/latest/userguide/model-evaluation-metrics.html (optional)
* `model_package_group`: ModelPackage group name or `ModelPackageGroup` object (optional)
* `mlflow_resource_arn`: MLflow app ARN used to track the training job (optional)
* `mlflow_experiment_name`: MLflow experiment name, as a string (optional)
* `mlflow_run_name`: MLflow run name, as a string (optional)
* `training_dataset`: Training dataset, as a dataset ARN or an S3 path. A training dataset is required for the job to run, but it can be supplied either here or in `.train()` (optional)
* `validation_dataset`: Validation dataset, as a dataset ARN or an S3 path (optional)
* `s3_output_path`: S3 path for the trained model artifacts (optional)

In [None]:
from sagemaker.train.rlaif_trainer import RLAIFTrainer

# example values for REWARD MODEL and PROMPT
REWARD_MODEL_ID = 'anthropic.claude-3-5-sonnet-20240620-v1:0'
REWARD_PROMPT = 'Builtin.Correctness'


trainer = RLAIFTrainer(
    model=BASE_MODEL,
    model_package_group=model_package_group,
    reward_model_id=REWARD_MODEL_ID,
    reward_prompt=REWARD_PROMPT,
    training_dataset=TRAINING_DATASET,
    s3_output_path=S3_OUTPUT_PATH,
    sagemaker_session=sagemaker_session,
    accept_eula=ACCEPT_EULA,
    role=ROLE_ARN
)

### OR

#### Create DPO Trainer (Direct Preference Optimization)

Direct Preference Optimization (DPO) is a method for training language models to follow human preferences. Unlike traditional RLHF (Reinforcement Learning from Human Feedback), DPO directly optimizes the model using preference pairs without needing a reward model.
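
To make "directly optimizes the model using preference pairs" concrete, the sketch below computes the published DPO objective for a single pair from sequence log-probabilities. It illustrates the general technique only and is not a description of what `DPOTrainer` runs internally. The function name `dpo_loss` and the default `beta=0.1` are assumptions chosen for illustration.

```python
import math


def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair, given sequence log-probabilities.

    loss = -log(sigmoid(beta * ((policy_chosen - ref_chosen)
                                - (policy_rejected - ref_rejected))))
    """
    chosen_gain = policy_chosen - ref_chosen
    rejected_gain = policy_rejected - ref_rejected
    margin = beta * (chosen_gain - rejected_gain)
    # Numerically stable form of log(1 + exp(-margin)).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# The policy matches the reference, so the margin is zero.
print(dpo_loss(-5.0, -7.0, -5.0, -7.0))
# The policy favors the chosen response more than the reference does.
print(dpo_loss(-4.0, -8.0, -5.0, -7.0))
```

The loss falls as the policy raises the chosen response's likelihood relative to the reference and lowers the rejected one's, which is why no separate reward model is needed.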

##### Key Parameters:
- `model`: Base model to fine-tune (from SageMaker Hub)
- `training_type`: Fine-tuning method (LoRA recommended for efficiency)
- `training_dataset`: ARN of the registered preference dataset
- `model_package_group`: Where to store the fine-tuned model
- `mlflow_resource_arn`: MLflow tracking server for experiment logging

In [None]:
from sagemaker.train.dpo_trainer import DPOTrainer
from sagemaker.train.common import TrainingType

trainer = DPOTrainer(
    model=BASE_MODEL,
    training_type=TrainingType.LORA,
    model_package_group=model_package_group,
    training_dataset=TRAINING_DATASET,
    s3_output_path=S3_OUTPUT_PATH,
    sagemaker_session=sagemaker_session,
    accept_eula=ACCEPT_EULA,
    role=ROLE_ARN
)

### Step 2: Get Finetuning Options and Modify

In [None]:
print("Default Finetuning Options:")
pprint(trainer.hyperparameters.to_dict())

# Modify options like object attributes
trainer.hyperparameters.learning_rate = 0.00001

print("\nModified/User defined Options:")
pprint(trainer.hyperparameters.to_dict())

### Step 3: Start Training

In [None]:
training_job = trainer.train(wait=True)

TRAINING_JOB_NAME = training_job.training_job_name

pprint(training_job)

### Step 4: Describe Training job

In [None]:
from sagemaker.core.resources import TrainingJob

response = TrainingJob.get(training_job_name=TRAINING_JOB_NAME)
pprint(response)

# Part 2: Model Evaluation

This section demonstrates the basic user-facing flow for creating and managing evaluation jobs.

## Step 1: Choose one of the following evaluation techniques:
- **Option 1: BenchmarkEvaluator** 
- **Option 2: LLMAsJudgeEvaluator (LLM-as-Judge Evaluation)** 
- **Option 3: CustomScorerEvaluator** 

**Instructions:** Run only ONE of the technique sections, not all of them.

### Option 1: Create BenchmarkEvaluator

Create a BenchmarkEvaluator instance with the desired benchmark. The evaluator will use Jinja2 templates to render a complete pipeline definition.

### Key Parameters:
- `benchmark`: Benchmark type from the Benchmark enum
- `model`: Model ARN from SageMaker hub content
- `s3_output_path`: S3 location for evaluation outputs
- `mlflow_resource_arn`: MLflow tracking server ARN for experiment tracking (optional)
- `model_package_group`: Model package group ARN (optional)
- `source_model_package`: Source model package ARN (optional)
- `model_artifact`: ARN of model artifact for lineage tracking (auto-inferred from source_model_package) (optional)

**Note:** When you call `evaluate()`, the system starts an evaluation job. The evaluator will:
1. Build template context with all required parameters
2. Render the pipeline definition from `DETERMINISTIC_TEMPLATE` using Jinja2
3. Create or update the pipeline with the rendered definition
4. Start the pipeline execution with empty parameters (all values pre-substituted) 

In [None]:
from sagemaker.train.evaluate import BenchMarkEvaluator
from sagemaker.train.evaluate import get_benchmarks, get_benchmark_properties
from rich.pretty import pprint
import logging
logging.basicConfig(
    level=logging.INFO,
    format='%(levelname)s - %(name)s - %(message)s'
)

# Get available benchmarks
Benchmark = get_benchmarks()
pprint(list(Benchmark))

# Print properties for a specific benchmark
pprint(get_benchmark_properties(benchmark=Benchmark.MMLU))


# Create evaluator with the MMLU benchmark
evaluator = BenchMarkEvaluator(
    benchmark=Benchmark.MMLU,
    model=BASE_MODEL,
    s3_output_path=S3_OUTPUT_PATH,
)

pprint(evaluator)

### Option 2: Create LLMAsJudgeEvaluator

Create an LLMAsJudgeEvaluator instance with the desired evaluator model, dataset, and metrics.

### Key Parameters:
- `model`: Model ARN to be evaluated (required)
- `evaluator_model`: Bedrock model ID to use as judge (required)
- `dataset`: S3 URI or Dataset ARN (required)
- `s3_output_path`: S3 output location (required)
- `builtin_metrics`: List of built-in metrics (optional, no 'Builtin.' prefix needed)
- `custom_metrics`: JSON string of custom metrics (optional)
- `evaluate_base_model`: Whether to evaluate base model in addition to custom model (optional, default=True)
- `mlflow_resource_arn`: MLflow tracking server ARN (optional)
- `model_package_group`: Model package group ARN (optional)
- `source_model_package`: Source model package ARN (optional)

#### Using custom metrics (as JSON string)

Custom metrics must be provided as a properly escaped JSON string. You can either:
1. Create a Python dict and use `json.dumps()` to convert it
2. Provide a pre-escaped JSON string directly
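
Before passing the JSON string to the evaluator, it can help to confirm that it parses and has the structure used in this notebook's examples. The sketch below is a hypothetical helper, `check_custom_metrics`. Treating both `{{prompt}}` and `{{prediction}}` as expected placeholders is an assumption drawn from the example metrics here, not a documented requirement.

```python
import json

# Placeholders that appear in this notebook's example instructions (assumed, not required).
EXPECTED_PLACEHOLDERS = ("{{prompt}}", "{{prediction}}")


def check_custom_metrics(custom_metrics_json):
    """Return a list of problems found in a custom-metrics JSON string."""
    metrics = json.loads(custom_metrics_json)
    if not isinstance(metrics, list):
        return ["top-level value should be a list of metric definitions"]
    problems = []
    for index, metric in enumerate(metrics):
        definition = metric.get("customMetricDefinition") if isinstance(metric, dict) else None
        if not isinstance(definition, dict):
            problems.append(f"metric #{index}: missing customMetricDefinition")
            continue
        name = definition.get("name") or f"metric #{index}"
        instructions = definition.get("instructions", "")
        for placeholder in EXPECTED_PLACEHOLDERS:
            if placeholder not in instructions:
                problems.append(f"{name}: instructions do not mention {placeholder}")
        if not definition.get("ratingScale"):
            problems.append(f"{name}: ratingScale is missing or empty")
    return problems


demo_metrics = json.dumps([
    {"customMetricDefinition": {"name": "Tone", "instructions": "Prompt: {{prompt}}", "ratingScale": []}}
])
for problem in check_custom_metrics(demo_metrics):
    print(problem)
```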

**Note:** When you call `evaluate()`, the system starts an evaluation job. The evaluator will:
1. Generate inference responses from the base model (if evaluate_base_model=True)
2. Generate inference responses from the custom model
3. Use the judge model to evaluate responses with built-in and custom metrics

In [None]:
import json
from rich.pretty import pprint
from sagemaker.train.evaluate import LLMAsJudgeEvaluator
import logging
logging.basicConfig(
    level=logging.INFO,
    format='%(levelname)s - %(name)s - %(message)s'
)

EVALUATOR_MODEL = "anthropic.claude-3-5-haiku-20241022-v1:0"
BUILTIN_METRICS=["Completeness", "Faithfulness"]

EVALUATION_DATASET = ""

custom_metrics_list = [
    {
        "customMetricDefinition": {
            "name": "GoodMetric",
            "instructions": (
                "Assess if the response has positive sentiment. "
                "Prompt: {{prompt}}\nResponse: {{prediction}}"
            ),
            "ratingScale": [
                {"definition": "Good", "value": {"floatValue": 1}},
                {"definition": "Poor", "value": {"floatValue": 0}}
            ]
        }
    },
    {
        "customMetricDefinition": {
            "name": "BadMetric",
            "instructions": (
                "Assess if the response has negative sentiment. "
                "Prompt: {{prompt}}\nResponse: {{prediction}}"
            ),
            "ratingScale": [
                {"definition": "Bad", "value": {"floatValue": 1}},
                {"definition": "Good", "value": {"floatValue": 0}}
            ]
        }
    }
]

custom_metrics_json = json.dumps(custom_metrics_list)

# Alternate Option: Create metrics using dict
#
# custom_metric_dict = {
#     "customMetricDefinition": {
#         "name": "PositiveSentiment",
#         "instructions": (
#             "You are an expert evaluator. Your task is to assess if the sentiment of the response is positive. "
#             "Rate the response based on whether it conveys positive sentiment, helpfulness, and constructive tone.\n\n"
#             "Consider the following:\n"
#             "- Does the response have a positive, encouraging tone?\n"
#             "- Is the response helpful and constructive?\n"
#             "- Does it avoid negative language or criticism?\n\n"
#             "Rate on this scale:\n"
#             "- Good: Response has positive sentiment\n"
#             "- Poor: Response lacks positive sentiment\n\n"
#             "Here is the actual task:\n"
#             "Prompt: {{prompt}}\n"
#             "Response: {{prediction}}"
#         ),
#         "ratingScale": [
#             {"definition": "Good", "value": {"floatValue": 1}},
#             {"definition": "Poor", "value": {"floatValue": 0}}
#         ]
#     }
# }
#
# Convert to JSON string
# custom_metrics_json = json.dumps([custom_metric_dict])


# Create evaluator with custom metrics
evaluator = LLMAsJudgeEvaluator(
    model=BASE_MODEL,  # Required
    evaluator_model=EVALUATOR_MODEL,  # Required
    dataset=EVALUATION_DATASET,  # Required: S3 URI or Dataset ARN
    builtin_metrics=BUILTIN_METRICS,  # Optional: Can combine with custom metrics
    custom_metrics=custom_metrics_json,  # Optional: JSON string of custom metrics
    s3_output_path=S3_OUTPUT_PATH,  # Required
    evaluate_base_model=False  # Skip base model evaluation to evaluate only custom model
)

pprint(evaluator)

### Option 3: SageMaker Custom Scorer Evaluation

Instantiate the evaluator with your configuration. The evaluator can accept:
- **Custom Evaluator ARN (string):** Points to your custom evaluator in AI Registry
- **Built-in Metric (string or enum):** Use preset metrics like "code_executions", "math_answers", etc.
- **Evaluator Object:** A sagemaker.ai_registry.evaluator.Evaluator instance


In [None]:
from sagemaker.train.evaluate import CustomScorerEvaluator
from sagemaker.train.evaluate import get_builtin_metrics
import logging
logging.basicConfig(
    level=logging.INFO,
    format='%(levelname)s - %(name)s - %(message)s'
)

EVALUATION_DATASET = ""
# Evaluator ARN (custom evaluator from AI Registry)
EVALUATOR_ARN = ""

# Create evaluator with evaluator arn
evaluator = CustomScorerEvaluator(
    evaluator=EVALUATOR_ARN,
    model=BASE_MODEL,  # Required str: Model identifier from hub content
    dataset=EVALUATION_DATASET,  # Required Any: Dataset for evaluation (S3 path or AI Registry dataset object)
    s3_output_path=S3_OUTPUT_PATH  # Required str: S3 bucket URI for evaluation outputs
)

pprint(evaluator)


# Alternate Option 1: Use built-in metrics

# from sagemaker.train.evaluate import get_builtin_metrics
# 
# BuiltInMetric = get_builtin_metrics()
# 
# pprint(list(BuiltInMetric)) # Display available preset metrics
# 
# evaluator_builtin = CustomScorerEvaluator(
#     evaluator=BuiltInMetric.PRIME_MATH,  # Or use string: "prime_math"
#     dataset=EVALUATION_DATASET,
#     model=BASE_MODEL,
#     s3_output_path=S3_OUTPUT_PATH
# )

## Step 2: Run Evaluation

In [None]:
# Run evaluation
execution = evaluator.evaluate()

print("Evaluation job started!")
print(f"Job ARN: {execution.arn}")
print(f"Job Name: {execution.name}")
print(f"Status: {execution.status.overall_status}")

pprint(execution)

## Step 3: Monitor Execution

In [None]:
execution.refresh()

print(f"Current status: {execution.status}")

# Display individual step statuses
if execution.status.step_details:
    print("\nStep Details:")
    for step in execution.status.step_details:
        print(f"  {step.name}: {step.status}")

## Step 4: Wait for Completion

Wait for the pipeline to complete. This provides rich progress updates in Jupyter notebooks:

In [None]:
execution.wait(target_status="Succeeded", poll=5, timeout=3600)

print(f"\nFinal Status: {execution.status.overall_status}")
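
For readers who want to see what a blocking wait involves, the sketch below is a generic polling loop with a timeout. It is not the SDK's implementation of `execution.wait()`. The helper `wait_for_status` is hypothetical, and the failure status names `Failed` and `Stopped` are assumptions.

```python
import time


def wait_for_status(get_status, target="Succeeded", failed=("Failed", "Stopped"),
                    poll=5, timeout=3600, sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns target, a failure state, or the timeout passes."""
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status == target:
            return status
        if status in failed:
            raise RuntimeError(f"Execution ended with status {status}")
        if clock() >= deadline:
            raise TimeoutError(f"Still {status} after {timeout} seconds")
        sleep(poll)


# Demo with a stubbed status source instead of a real execution object.
fake_statuses = iter(["Executing", "Executing", "Succeeded"])
final_status = wait_for_status(lambda: next(fake_statuses), poll=0, sleep=lambda seconds: None)
print(final_status)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real delays.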

## Step 5: View Results

Display the evaluation results in a formatted table:

In [None]:
execution.show_results()

# Part 3: End-to-End Model Deployment with SageMaker

This section demonstrates deployment workflows for fine-tuned large language models (LLMs) using Amazon SageMaker.

Choose one of the deployment options below.

**Option 1. Deploy using Model Builder**

**Option 2. Deploy using Bedrock Model Builder**

**Instructions:** Run only ONE of the options, not all of them.

## Option 1: Deploy using Model Builder

ModelBuilder can build the model from three kinds of source.

**Option A. Model Builder using TrainingJob**: Take a completed fine-tuning job and deploy it directly as a real-time inference endpoint.

**Option B. Model Builder using ModelPackage**: Use versioned model packages from the SageMaker Model Registry for deployment.

**Option C. Model Builder using Trainer**: Use the fine-tuning interfaces through resource chaining, so you do not have to pass in the model weights manually.

All three approaches support:
- Standalone endpoint deployment (dedicated resources)
- Multi-adapter deployment (shared base model with multiple fine-tuned adapters)


**Instructions:** Run only ONE of the options, not all of them.

### Option A: Model Builder using TrainingJob

This section demonstrates the most direct deployment path: taking a completed SageMaker training job and deploying it as a real-time inference endpoint. This approach is ideal when you have just finished fine-tuning a model and want to deploy it immediately for testing or production use. Use the SageMaker ModelBuilder to prepare the trained model for deployment.

**Key Benefits:**
- Direct deployment from training artifacts
- No intermediate model registration required
- Fastest path from training to inference
- Automatic model artifact resolution


**ModelBuilder will:**
1. Validate the training job artifacts and metadata
2. Create a SageMaker Model resource with appropriate container configurations
3. Set up inference specifications (input/output handling)
4. Configure model data location and IAM roles
5. Prepare the model for either standalone or multi-adapter deployment


In [None]:
import random
from sagemaker.serve import ModelBuilder
from sagemaker.core.resources import TrainingJob

name = f"e2e-{random.randint(100, 100000)}"

print(f"Endpoint Name: {name}")
print(f"Training Job Name: {TRAINING_JOB_NAME}")

training_job = TrainingJob.get(training_job_name=TRAINING_JOB_NAME)
model_builder = ModelBuilder(model=training_job)
model_builder.build(model_name=name)
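
A random integer suffix can collide across runs, so a timestamp plus a short random suffix is a common alternative for resource names. The sketch below is a hypothetical helper, `make_endpoint_name`. The length limit and character pattern are assumptions based on the SageMaker endpoint naming rules as commonly documented, so verify them against the current API reference.

```python
import re
import uuid
from datetime import datetime, timezone

# Assumed constraints: at most 63 characters, alphanumerics separated by hyphens.
ENDPOINT_NAME_PATTERN = re.compile(r"^[a-zA-Z0-9](-*[a-zA-Z0-9])*$")
MAX_ENDPOINT_NAME_LENGTH = 63


def make_endpoint_name(prefix="e2e"):
    """Return a name like e2e-20250101-120000-a1b2c3 and validate it."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S")
    suffix = uuid.uuid4().hex[:6]
    candidate = f"{prefix}-{stamp}-{suffix}"
    if len(candidate) > MAX_ENDPOINT_NAME_LENGTH or not ENDPOINT_NAME_PATTERN.match(candidate):
        raise ValueError(f"Invalid endpoint name: {candidate}")
    return candidate


example_name = make_endpoint_name()
print(example_name)
```

The result could replace the `random.randint` expression used for `name` in the deployment cells.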

### Option B: Model Builder using ModelPackage

**When to Use ModelPackages:**
- Production deployments requiring approval gates
- Multi-environment deployments (dev, staging, prod)
- Models shared across teams or accounts
- Compliance and audit requirements
- Deploy any approved version, not just the latest training run

ModelPackages are automatically created when training jobs complete, or can be registered manually.

**ModelPackage Metadata:**
- **Group**: 'test-finetuned-models' (collection of related model versions)
- **Version**: 3 (specific iteration of the fine-tuned model)
- **Status**: Completed (ready for deployment)

**Inference Specification:**
- Model artifacts location in S3
- Base model reference
- Recipe name for fine-tuning configuration
- Container and runtime requirements

**Key difference between ModelPackage and TrainingJob deployment:**
- **ModelPackage**: Uses versioned, approved artifacts from Model Registry
- **TrainingJob**: Uses artifacts directly from training output

ModelBuilder automatically resolves all necessary metadata from the ModelPackage, including model artifacts, base model references, and inference configurations.

In [None]:
import random
from sagemaker.core.resources import ModelPackage
from sagemaker.serve import ModelBuilder

MODEL_PACKAGE_ARN = ""

# Endpoint name used by the deploy and test cells below
name = f"e2e-{random.randint(100, 100000)}"

model_package = ModelPackage.get(model_package_name=MODEL_PACKAGE_ARN)

model_builder = ModelBuilder(model=model_package)
model_builder.build()

### Option C: Model Builder using Trainer

Model Builder also integrates with the fine-tuning interfaces through resource chaining, so you do not have to pass in the model weights manually.

In [None]:
import random
from sagemaker.serve import ModelBuilder

name = f"e2e-{random.randint(100, 10000)}"

# Note: trainer is created in Part 1.
model_builder = ModelBuilder(model=trainer)
model_builder.build()

### Deploy as Standalone Endpoint or Inference Component (Adapter)

Two ways to deploy the model:
- **Standalone Endpoint**: The endpoint will be created with the name specified in the `name` variable and will be ready to accept inference requests once deployment completes (typically 5-10 minutes).
- **Deploy as Inference Component (Adapter)**: Deploy the fine-tuned model as an InferenceComponent (adapter) on an existing endpoint. The `inference_component_name` parameter identifies this specific adapter for routing requests.
  - **Use cases for deploying as an Inference Component:**
      - Serving multiple fine-tuned variants (e.g., different domains, languages, or tasks)
      - A/B testing different fine-tuning approaches
      - Multi-tenant deployments with isolated adapters per customer
      - Routing requests to specific adapters via inference component names

In [None]:
# Deploy as Standalone Endpoint 
endpoint = model_builder.deploy(endpoint_name=name)

# OR
# Deploy as Inference Component (Adapter)
# endpoint = model_builder.deploy(endpoint_name=name, inference_component_name=f"{name}-adapter")
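
When the model is deployed as an inference component, requests need to name that component so the endpoint can route them. The sketch below builds the keyword arguments for the `sagemaker-runtime` client's `invoke_endpoint` call, which accepts an `InferenceComponentName` parameter. The helper `build_invoke_kwargs` is hypothetical, and the request body mirrors the format used in this notebook's test cell.

```python
import json


def build_invoke_kwargs(endpoint_name, prompt, max_new_tokens=50, inference_component_name=None):
    """Build keyword arguments for sagemaker-runtime invoke_endpoint."""
    kwargs = {
        "EndpointName": endpoint_name,
        "ContentType": "application/json",
        "Body": json.dumps({"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}),
    }
    # Only adapter (inference component) deployments need the routing field.
    if inference_component_name:
        kwargs["InferenceComponentName"] = inference_component_name
    return kwargs


standalone_kwargs = build_invoke_kwargs("e2e-1234", "What is the capital of France?")
adapter_kwargs = build_invoke_kwargs(
    "e2e-1234", "What is the capital of France?", inference_component_name="e2e-1234-adapter"
)
print(sorted(adapter_kwargs))
```

A call would then look like `sagemaker_runtime.invoke_endpoint(**adapter_kwargs)`.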

### Test the Endpoint

Validate the deployed endpoint by sending a test inference request. This example demonstrates:

**Request Format:**
- **inputs**: The prompt text for the model
- **parameters**: Inference configuration
  - `max_new_tokens`: Maximum length of generated response (50 tokens)

**Expected Behavior:**
- The endpoint processes the prompt through the fine-tuned model
- Returns generated text based on the model's training
- Response includes the generated text and metadata

This test confirms the endpoint is operational and the model is responding correctly to inference requests.

In [None]:
import boto3
import json

sagemaker_runtime = boto3.client(
    'sagemaker-runtime',
    region_name=REGION
)

response = sagemaker_runtime.invoke_endpoint(
    EndpointName=name,
    Body=json.dumps({"inputs": "What is the capital of France?", "parameters": {"max_new_tokens": 50}}),
    ContentType='application/json'
)

result = json.loads(response['Body'].read().decode())
print(result)
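
The shape of the response depends on the serving container, so the parsing below is an assumption. Many text-generation containers return either `{"generated_text": ...}` or a list containing one such object, but your endpoint may differ. The helper `extract_generated_text` is hypothetical and returns `None` when the body does not match either shape.

```python
import json


def extract_generated_text(raw_body):
    """Return the generated text from a response body, or None if the shape is unknown."""
    payload = json.loads(raw_body)
    if isinstance(payload, list) and payload:
        payload = payload[0]  # unwrap a single-element list response
    if isinstance(payload, dict) and "generated_text" in payload:
        return payload["generated_text"]
    return None


# Stand-in bodies; a real body comes from response["Body"].read().
print(extract_generated_text(b'[{"generated_text": "Paris"}]'))
print(extract_generated_text('{"unexpected": true}'))
```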

## Option 2: Bedrock Model Builder

The following section shows the workflow for deploying the model using Bedrock Model Builder.

In [None]:
from sagemaker.serve.bedrock_model_builder import BedrockModelBuilder
from sagemaker.core.resources import TrainingJob
import random

training_job = TrainingJob.get(training_job_name=TRAINING_JOB_NAME)
name = f"e2e-{random.randint(100, 100000)}"

bedrock_builder = BedrockModelBuilder(model=training_job)

# Deploy the model
bedrock_builder.deploy(job_name=name, imported_model_name=name, role_arn=ROLE_ARN)