# Supervised Fine-Tuning (SFT) Amazon Nova 2 Lite using Amazon SageMaker Training Jobs

First, it helps to understand the general flow of training LLMs. Training a large language model typically has two major stages: pre-training and post-training. During pre-training, the model is exposed to trillions of tokens of raw text and optimized purely for next-token prediction. This makes it an extremely capable pattern completer over the distribution of web and curated text. It absorbs syntax, semantics, facts, and broad reasoning patterns. But it is unaligned with human intent, meaning it does not inherently understand instructions, user goals, or context-appropriate behavior. It simply continues text in whatever style best fits its training distribution. As a result, a pre-trained model tends to autocomplete rather than follow directions, is inconsistent about formatting or tool use, and can mirror undesirable biases or unsafe content present in the data. In short, pre-training builds general competence, not usefulness for tasks.

Post-training turns that competent pattern completer into a useful assistant. Teams typically run multiple rounds of Supervised Fine-Tuning (SFT) to teach the model to follow instructions, adhere to schemas and policies, call tools, and produce reliable, scoped outputs by imitating high-quality demonstrations. This adds a first layer of alignment where the model learns to respond to prompts as tasks, not just text to continue. They then apply Reinforcement Fine-Tuning (RFT) to push behavior further using measurable feedback (e.g., verifiers or an LLM-as-a-judge), optimizing nuanced trade-offs like accuracy vs. brevity, safety vs. coverage, or multi-step reasoning under constraints. In practice, teams alternate SFT and RFT in cycles, progressively shaping the pre-trained model into a reliable, policy-aligned system that performs complex tasks with consistency.

Supervised fine-tuning is the classic approach of training the LLM on a dataset of human-labeled input-output pairs for the task of interest. In other words, you provide examples of prompts (or questions, instructions, etc.) along with the correct or desired responses, and continue training the model on these. The model's weights are adjusted to minimize a supervised loss (typically cross-entropy between its predictions and the target output tokens). This is essentially the same kind of training used in most supervised machine learning tasks, now applied to an LLM to specialize it.
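
To make the supervised loss concrete, here is a minimal sketch of token-level cross-entropy in plain Python. It assumes a common convention in which only response tokens contribute to the loss and prompt tokens are masked out; whether the Nova recipes apply exactly this masking is not stated here, so treat it as an illustration rather than a description of the training container. The function name `sft_loss` and the sample numbers are hypothetical.

```python
import math

def sft_loss(token_probs, is_target):
    """Mean negative log-likelihood over target (response) tokens only.

    token_probs: probability the model assigned to each correct next token.
    is_target:   True for response tokens, False for prompt tokens (masked out).
    """
    losses = [-math.log(p) for p, keep in zip(token_probs, is_target) if keep]
    return sum(losses) / len(losses)

# Two prompt tokens (ignored) followed by three response tokens.
probs = [0.9, 0.8, 0.5, 0.25, 1.0]
mask = [False, False, True, True, True]
loss = sft_loss(probs, mask)
print(f"loss = {loss:.4f}")  # loss = 0.6931
```

Training lowers this value by raising the probability the model assigns to each token of the desired response.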

### Model Customization
You can customize Amazon Nova models through base recipes using Amazon SageMaker Training Jobs (SMTJ). These recipes support Supervised Fine-Tuning (SFT), with both Full-Rank and Low-Rank Adaptation (LoRA) options.

The end-to-end customization workflow involves stages such as:
- model training
- model evaluation
- deployment for inference

Further, SMTJ allows Nova models to be customized using iterative training. Iterative training is the process of repeatedly fine-tuning a model through multiple training cycles across different training methods (train, evaluate, analyze errors, adjust data/objectives/hyperparameters), with each round starting from the previous checkpoint.

This model customization approach on SageMaker AI provides greater flexibility and control to fine-tune supported Amazon Nova models and to optimize hyperparameters with precision.

## 1. Getting Started

This notebook demonstrates Full Rank Supervised Fine-Tuning (SFT), Supervised Fine-Tuning (SFT) with Parameter-Efficient Fine-Tuning (PEFT), and iterative training of Amazon Nova using Amazon SageMaker Training Jobs. 

Supervised Fine-Tuning (SFT) trains an LLM on labeled input-output pairs to modify behavior for specific tasks. Training is essentially a continuous loop of steps:

- Data preparation
- Model customization / training
- Model evaluation
- Deployment and inference
- Analysis and feedback

This notebook will focus on Model customization / training using SageMaker Training Jobs (SMTJ), a feature supported in SageMaker AI.

As Nova models are first-party models, direct access to model weights is not permitted. However, an SMTJ coupled with a "recipe" accomplishes the customization training goal.

A recipe is simply a configuration file.

Once an SMTJ completes, the job results are placed in S3, and a `manifest.json` file and a `step_wise_training_metrics.csv` file are created. The manifest file identifies the escrow location of the trained model. An escrow account holds and manages assets (model weights), keeping them secure.

## 2. Prerequisites and Dependencies

### Dependencies
Several Python packages must be installed to run this notebook. Please review the packages in `requirements.txt`.

`botocore`, `boto3`, and `sagemaker` are required for the training jobs, while the other packages are used to help visualize results.

In [None]:
! pip install -r ./requirements.txt --upgrade

### Prerequisite: Data Prep Notebook
The Data Prep notebook walks through preparing and transforming a public dataset into a format and schema acceptable for SMTJ. It creates the training and validation datasets used for training a model, as well as a test dataset for evaluation.

**--------------- STOP ---------------** <br><br>To complete this notebook, the Data Prep notebook must be completed first. In that notebook, training, validation, and eval datasets are created.  These datasets are carried over for use in this notebook.  Specific items from the Data Prep notebook that are used in this notebook are called out below.
<br><br>

Either restore or set these values.

In [None]:
# This value is obtained as result of executing the data prep notebook
train_dataset_s3_path = ""
%store -r train_dataset_s3_path 

print(train_dataset_s3_path)

### Credentials, Sessions, Roles, and more!

This section sets up the necessary AWS credentials and SageMaker session to run the notebook. You'll need proper IAM permissions to use SageMaker.


If you are going to use SageMaker in a local environment, you will need access to an IAM role with the required permissions for SageMaker. Learn more in the [AWS Documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html).

For more details on other Nova prerequisites, see the [AWS Documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/nova-model-general-prerequisites.html).

The code initializes a SageMaker session, sets up the IAM role, and configures the S3 bucket for storing training data artifacts.

In [None]:
import sagemaker
import boto3


sagemaker_session = sagemaker.Session(boto_session=boto3.Session(region_name='us-east-1'))

sagemaker_session_bucket = None

if sagemaker_session_bucket is None and sagemaker_session is not None:
    # set to default bucket if a bucket name is not given
    sagemaker_session_bucket = sagemaker_session.default_bucket()


try:
    role = sagemaker.get_execution_role()
    
except ValueError:
    iam = boto3.client("iam")
    role = iam.get_role(RoleName="sagemaker_execution_role")["Role"]["Arn"]

bucket_name = sagemaker_session.default_bucket()
default_prefix = sagemaker_session.default_bucket_prefix

print(f"sagemaker role arn: {role}")
print(f"sagemaker bucket: {sagemaker_session.default_bucket()}")
print(f"sagemaker session region: {sagemaker_session.boto_region_name}")

Capture S3 bucket prefix for later use.  After a SageMaker Training Job completes, this value will be used to identify where the SMTJ outputs are stored.

In [None]:
if default_prefix:
    output_path_prefix = f"s3://{bucket_name}/{default_prefix}"

else:
    output_path_prefix= f"s3://{bucket_name}"

## 3. Data Prep - Review
In the Data Prep notebook, we created our training, validation, and test datasets.  We will use the train and validation datasets for training.

Remember to prepare high-quality prompt-response pairs for training. Data should be:
- Consistent in format
- Representative of desired behavior
- Deduplicated and cleaned


For reference, here is the schema that represents a single record in the training data.

```
{
  "schemaVersion": "bedrock-conversation-2024",
  "system": [{"text": "You are a digital assistant with a friendly personality"}],
  "messages": [
    {
      "role": "user",
      "content": [{ "text": "What is the capital of Mars?"}]
    },
    {
      "role": "assistant",
      "content": [{"text": "Mars does not have a capital. Perhaps it will one day."}]
    }
  ]
}
```
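
As a rough illustration of checking records against the shape shown above, the sketch below runs a few structural checks on one record. The function name `validate_record` and the specific rules (roles limited to `user` and `assistant`, last message from the assistant) are assumptions inferred from the example record, not the official validation performed by the service.

```python
import json

def validate_record(record):
    """Return a list of problems found in one training record (empty if none)."""
    problems = []
    if "schemaVersion" not in record:
        problems.append("missing schemaVersion")
    messages = record.get("messages")
    if not isinstance(messages, list) or not messages:
        return problems + ["messages must be a non-empty list"]
    for i, msg in enumerate(messages):
        # Assumed rule: only the two roles seen in the example are allowed.
        if msg.get("role") not in ("user", "assistant"):
            problems.append(f"messages[{i}]: unexpected role {msg.get('role')!r}")
        content = msg.get("content")
        if not isinstance(content, list) or not content:
            problems.append(f"messages[{i}]: content must be a non-empty list")
    # Assumed rule: SFT needs a target response, so the record ends with the assistant.
    if messages[-1].get("role") != "assistant":
        problems.append("last message should be the assistant response")
    return problems

# One line of a JSONL training file, matching the example record.
sample_line = json.dumps({
    "schemaVersion": "bedrock-conversation-2024",
    "system": [{"text": "You are a digital assistant with a friendly personality"}],
    "messages": [
        {"role": "user", "content": [{"text": "What is the capital of Mars?"}]},
        {"role": "assistant", "content": [{"text": "Mars does not have a capital. Perhaps it will one day."}]},
    ],
})
print(validate_record(json.loads(sample_line)))  # []
```

Running a check like this over every line of the training file before launching a job can catch formatting problems early, when they are cheap to fix.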


## 4. Model Supervised Fine Tuning (SFT) and Customization
You can customize Amazon Nova models through recipes and train them on SageMaker AI. Recipes support techniques such as supervised fine-tuning (SFT), with both full-rank and low-rank adaptation (LoRA) options.



### Model fine-tuning

To customize a model, the SageMaker Training Job (SMTJ) uses a PyTorch estimator to run the customization.  The estimator defines properties of the training job, such as training job name, instance type, instance count, output location for job results, training recipe, and more.

More details of the estimator can be found here: [PyTorch Estimator](https://sagemaker.readthedocs.io/en/stable/frameworks/pytorch/using_pytorch.html)

So we must define
- the training recipe
- properties for the estimator

### Recipes
A Nova recipe is a YAML configuration file that provides details to SageMaker AI on how to run your model customization job. It provides the base model name, sets training hyperparameters, defines optimization settings, and includes any additional options required to fine-tune or train the model successfully.

These recipe files serve as the blueprint for your model customization jobs, allowing you to specify training parameters, hyperparameters, and other critical settings that determine how your model learns from your data. To adjust the hyperparameters, follow the guidelines in [Selecting hyperparameters](https://docs.aws.amazon.com//nova/latest/userguide/customize-fine-tune-hyperparameters.html).

#### Configuring the Model and Recipe

This specifies which model to fine-tune and the recipe to use. The recipe name includes `lora`, indicating parameter-efficient fine-tuning, and `sft`, indicating supervised fine-tuning.

The recipes can be found at:
- [Amazon Nova recipes](https://docs.aws.amazon.com/sagemaker/latest/dg/nova-model-recipes.html), or 
- [GitHub - sagemaker-hyperpod-recipes](https://github.com/aws/sagemaker-hyperpod-recipes). Navigate to `recipes_collection -> recipes -> fine-tuning/nova -> nova_2_0/nova_lite`  to find all Nova 2 Lite fine tuning recipes.

Now, let's take a look at 3 SFT recipe examples - PEFT, Full Rank, and Iterative Training

### PEFT SFT Recipe
This recipe configures PEFT / LoRA SFT training.

Note, the value for `peft_recipe` is found in the "Amazon Nova recipes" link above.

In [None]:
peft_recipe = "fine-tuning/nova/nova_2_0/nova_lite/SFT/nova_lite_2_0_p5_gpu_lora_sft"
peft_recipe_job_name = "train-peft-sft-nova-lite-2-recipe-job" # example
peft_sm_job_name = "train-peft-sft-nova-lite-2"

peft_recipe_overrides = {
    "run": {
        "name": peft_recipe_job_name,
        "data_s3_path": "",  # For SMTJ, this is ""
        "output_s3_path": f"{output_path_prefix}/{peft_sm_job_name}"
    },
}

### Full Rank Recipe
This recipe configures Full Rank SFT training.

Note, the value for `full_rank_recipe` is found in the "Amazon Nova recipes" link above.

In [None]:
full_rank_recipe = "fine-tuning/nova/nova_2_0/nova_lite/SFT/nova_lite_2_0_p5_gpu_sft"
full_rank_recipe_job_name = "train-fr-sft-nova-lite-2-recipe-job" # example
full_rank_sm_job_name = "train-fr-sft-nova-lite-2"

full_rank_recipe_overrides = {
    "run": {
        "name": full_rank_recipe_job_name,
        "data_s3_path": "",  # For SMTJ, this is ""
        "output_s3_path": f"{output_path_prefix}/{full_rank_sm_job_name}"
    },
}

### Iterative Training Recipe
Iterative training is the process of repeatedly fine-tuning a model through multiple training cycles across different training methods — train, evaluate, analyze errors, adjust data/objectives/hyperparameters — with each round starting from the previous checkpoint. This approach allows you to systematically target model failure modes, incorporate curated examples addressing specific weaknesses, and adapt to changing requirements over time.

#### How it works
After each training job completes, a manifest file is generated in the output location specified by the output_path parameter in your training configuration.

To access your checkpoint:

1. Navigate to your specified output_path in S3
2. Download and extract the output.tar.gz file
3. Open the manifest.json file inside
4. Locate the checkpoint_s3_bucket parameter, which contains the S3 URI of your trained model.  

**-------------- Important --------------**<br><br>
The checkpoint_s3_bucket URI found in the manifest.json file is the source of truth for location of the trained model checkpoint.
<br><br>
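
Steps 2 to 4 above can also be done in Python once `output.tar.gz` has been downloaded locally. The following is a minimal sketch using only the standard library; the helper name `read_checkpoint_uri` is hypothetical, and it assumes the archive contains a `manifest.json` with a `checkpoint_s3_bucket` key as described above.

```python
import json
import tarfile

def read_checkpoint_uri(tar_path):
    """Return checkpoint_s3_bucket from the manifest.json inside output.tar.gz."""
    with tarfile.open(tar_path, "r:gz") as tar:
        for member in tar.getmembers():
            # The manifest may sit at the archive root or in a subfolder.
            if member.isfile() and member.name.split("/")[-1] == "manifest.json":
                with tar.extractfile(member) as fh:
                    return json.load(fh)["checkpoint_s3_bucket"]
    raise FileNotFoundError("manifest.json not found in archive")

# Usage, after downloading the archive to the working directory:
# checkpoint_uri = read_checkpoint_uri("output.tar.gz")
```

Reading the manifest straight from the archive avoids leaving extracted files on disk when only the checkpoint URI is needed.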

Example manifest.json structure:
```
{
  "checkpoint_s3_bucket": "s3://customer-escrow-<account-number>-smtj-<unique-identifier>/<job-name>/stepID",
  ...
}
```

Back now to the recipe, this recipe configures Iterative training.

Note, the value for `iterative_peft_recipe` is found in the "Amazon Nova recipes" link above.

In [None]:
iterative_peft_recipe = "fine-tuning/nova/nova_2_0/nova_lite/SFT/nova_lite_2_0_p5_gpu_lora_sft"
iterative_peft_recipe_job_name = "train-iterative-peft-sft-nova-lite-2-recipe-job" # example
iterative_peft_sm_job_name = "train-iterative-peft-sft-nova-lite-2"

iterative_peft_model_name_or_path = "<place checkpoint_s3_bucket value from manifest.json>" # example

iterative_peft_recipe_overrides = {
    "run": {
        "name": iterative_peft_recipe_job_name,
        "model_name_or_path": iterative_peft_model_name_or_path,
        "data_s3_path": "",  # For SMTJ, this is ""
        "output_s3_path": f"{output_path_prefix}/{iterative_peft_sm_job_name}"
    },
}
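
Because the iterative recipe above starts with a placeholder string, it can be worth checking the value before a job is launched. The guard below is a small sketch; the helper name `require_s3_uri` is hypothetical, and it assumes the checkpoint location for an iterative round is always an `s3://` URI, as in the manifest example.

```python
def require_s3_uri(value, name="model_name_or_path"):
    """Raise early if the checkpoint location is still a placeholder."""
    prefix = "s3://"
    if not isinstance(value, str) or not value.startswith(prefix) or len(value) <= len(prefix):
        raise ValueError(f"{name} must be an S3 URI taken from manifest.json, got {value!r}")
    return value

# Usage with the variable defined above:
# require_s3_uri(iterative_peft_model_name_or_path)
```

Failing in the notebook is much cheaper than discovering the problem after the training instances have been provisioned.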

### Select training technique to use
This is just a helper for choosing the desired recipe.  Change the value of `technique` to one of the keywords indicated in the comment.

This makes it easy to rerun the notebook: change the keyword, run the cells, then change the keyword to another technique and run the cells again.

In [None]:
# technique values: "PEFT" OR "FR" OR "IPEFT"
technique = "PEFT" 


sm_training_job_name = ""
training_recipe = ""
recipe_overrides = {}


if technique == "PEFT":
    # PEFT Training
    print("PEFT Training")
    training_recipe = peft_recipe
    recipe_overrides = peft_recipe_overrides
    sm_training_job_name = peft_sm_job_name
elif technique == "FR":
    # Full Rank Training
    print("Full Rank Training")
    training_recipe = full_rank_recipe
    recipe_overrides = full_rank_recipe_overrides
    sm_training_job_name = full_rank_sm_job_name
elif technique == "IPEFT":
    # Iterative PEFT Training
    print("Iterative PEFT Training")
    training_recipe = iterative_peft_recipe
    recipe_overrides = iterative_peft_recipe_overrides
    sm_training_job_name = iterative_peft_sm_job_name
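else:
    # Fail fast: continuing with an empty recipe would only break later cells
    raise ValueError(f"Unknown technique {technique!r}; expected 'PEFT', 'FR', or 'IPEFT'")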

#### Instance Type and Count

P5 instances are optimized for deep learning workloads, providing high-performance GPUs.

In [None]:
instance_type = "ml.p5.48xlarge"
instance_count = 4

print(f"instance_type: \n{instance_type}")

#### Select Container Image URI

This specifies the pre-built container for SFT fine-tuning of Nova 2 Lite.

In [None]:
image_uri = "708977205387.dkr.ecr.us-east-1.amazonaws.com/nova-fine-tune-repo:SM-TJ-SFT-V2-latest"

print(f"image_uri: \n{image_uri}")

### Output path
The results of the model training are written to this path.  It will contain an `output.tar.gz` file, which holds `manifest.json` and `step_wise_training_metrics.csv`.

In [None]:
import json

print(f"recipe_overrides: \n{json.dumps(recipe_overrides, indent=4)}\n")

output_s3_path = recipe_overrides["run"]["output_s3_path"]

print(f"output_path:\n{output_s3_path}")

Configure PyTorch estimator

In [None]:
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    output_path=output_s3_path,
    base_job_name=sm_training_job_name,
    role=role,
    disable_profiler=True,
    debugger_hook_config=False,
    instance_count=instance_count,
    instance_type=instance_type,
    training_recipe=training_recipe,
    recipe_overrides=recipe_overrides,
    max_run=432000,
    sagemaker_session=sagemaker_session,
    image_uri=image_uri,
)

#### Configure the Data Channels

The `train` channel points to the training dataset created in the Data Prep notebook.

In [None]:
from sagemaker.inputs import TrainingInput

train_input = TrainingInput(
    s3_data=train_dataset_s3_path,
    distribution="FullyReplicated",
    s3_data_type="Converse",
)

#### Start the Training Job
This starts the training job with the configured estimator and datasets. 

In [None]:
# starting the train job with our uploaded dataset as input
estimator.fit(inputs={"train": train_input}, wait=False)

In [None]:
training_job_name = estimator.latest_training_job.name
print(f'Training Job Name:  \n{training_job_name}')

In [None]:
from IPython.display import HTML, Markdown, Image

display(HTML('<b>Review <a target="blank" href="https://console.aws.amazon.com/sagemaker/home?region={}#/jobs/{}">Training Job</a> After About 5 Minutes</b>'.format("us-east-1", training_job_name)))
display(HTML('<b>Review <a target="blank" href="https://console.aws.amazon.com/cloudwatch/home?region={}#logStream:group=/aws/sagemaker/TrainingJobs;prefix={};streamFilter=typeLogStreamPrefix">CloudWatch Logs</a> After About 5 Minutes</b>'.format("us-east-1", training_job_name)))
display(HTML('<b>Review <a target="blank" href="https://s3.console.aws.amazon.com/s3/buckets/{}/{}/?region={}&tab=overview">S3 Output Data</a> After The Training Job Has Completed</b>'.format(bucket_name, training_job_name, "us-east-1")))


---
## _Wait Until the ^^ Training Job ^^ Completes Above! (20-40 mins)_
---
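
Since the job was started with `wait=False`, one way to wait from code instead of watching the console is to poll the job status. This is a sketch, not part of the original notebook: the helper name `wait_for_training_job` and its defaults are assumptions, while `describe_training_job` and the `TrainingJobStatus` field come from the boto3 SageMaker client.

```python
import time

TERMINAL_STATES = {"Completed", "Failed", "Stopped"}

def wait_for_training_job(sm_client, job_name, poll_seconds=60, max_polls=600):
    """Poll DescribeTrainingJob until the job reaches a terminal state."""
    for _ in range(max_polls):
        response = sm_client.describe_training_job(TrainingJobName=job_name)
        status = response["TrainingJobStatus"]
        if status in TERMINAL_STATES:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"{job_name} did not finish after {max_polls} polls")

# Usage in this notebook (requires AWS credentials):
# import boto3
# sm_client = boto3.client("sagemaker", region_name="us-east-1")
# print(wait_for_training_job(sm_client, training_job_name))
```

Only continue to the next section when the returned status is `Completed`; a `Failed` or `Stopped` job will not have produced the artifacts read below.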

## 5. Training Artifacts



#### 5.1. Reading the Output Content after training job completion

In [None]:
# Get the S3 model artifact Output s3 uri
model_s3_uri = estimator.model_data

output_s3_uri = "/".join(model_s3_uri.split("/")[:-1])+"/output.tar.gz"

print(f"model_s3_uri: \n{model_s3_uri}\n")
print(f"output_s3_uri: \n{output_s3_uri}")

#### 5.2. Downloading and Extracting the Artifacts

In [None]:
tmp_folder = "tmp"

# create a unique folder name based on the output s3 uri
folder = output_s3_uri.rsplit('/', 2)[0] 
folder = folder.rsplit('/', 1)[-1]

print(f"folder: \n{folder}\n")

!mkdir -p ./$tmp_folder/$folder/train_output/

In [None]:
!aws s3 cp $output_s3_uri ./$tmp_folder/$folder/train_output/output.tar.gz

In [None]:
!tar -xvzf ./$tmp_folder/$folder/train_output/output.tar.gz -C ./$tmp_folder/$folder/train_output/

#### 5.3 Open Manifest and get the model checkpoint
This is one of the most important steps for finding the current model checkpoint. Console views can be confusing, but the `checkpoint_s3_bucket` value is the source of truth for the location of the model checkpoint artifacts.  Again, this location is within an AWS-managed account and is not accessible to users.

In [None]:
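import json

# Read the manifest with a context manager so the file handle is closed
with open(f'./{tmp_folder}/{folder}/train_output/manifest.json') as manifest_file:
    checkpoint_s3_bucket = json.load(manifest_file)['checkpoint_s3_bucket']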

# This value will be used for custom model Deployment as well as for model Evaluation (see eval)
print(f"checkpoint_s3_bucket: \n{checkpoint_s3_bucket}")

#### 5.4. Plotting the Train/Loss Curve 

In [None]:
import pandas as pd
import matplotlib.pyplot as plt

# Read the step-wise training metrics CSV file
train_df = pd.read_csv(f'./{tmp_folder}/{folder}/train_output/step_wise_training_metrics.csv')

# Create the plot
plt.figure(figsize=(10, 6))
plt.plot(train_df['step_number'], train_df['training_loss'], label='Training Loss', color='blue')

plt.xlabel('Step Number')
plt.ylabel('Loss')
plt.title('Training Loss')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()
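
Step-wise loss is often noisy, so a smoothed curve can make the trend easier to read. The sketch below computes a trailing moving average in plain Python; the function name `moving_average` and the window size are illustrative choices, not part of the original notebook.

```python
def moving_average(values, window=3):
    """Trailing moving average; early points average over what is available."""
    smoothed = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            # Drop the value that just left the window.
            running -= values[i - window]
        smoothed.append(running / min(i + 1, window))
    return smoothed

losses = [4.0, 2.0, 3.0, 1.0]
print(moving_average(losses, window=2))  # [4.0, 3.0, 2.5, 2.0]
```

With the dataframe loaded above, `moving_average(train_df['training_loss'].tolist(), window=20)` yields a series that can be plotted against `train_df['step_number']`. pandas offers the same operation through `Series.rolling(window, min_periods=1).mean()`.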

## 6. Deployment / Inference
Woot! All done!  Congratulations on training a custom model.

Next, please see the deployment notebook in the deployment folder.

<br>**---------- BEFORE YOU GO!! ----------**<br><br>
The values below are needed for the deployment notebook.

To use that notebook, these values will need to be recorded and used:
- SageMaker sm_training_job_name
- checkpoint_s3_bucket

In [None]:
print(f"\nsm_training_job_name:\n{sm_training_job_name}\n")
print(f"checkpoint_s3_bucket: \n{checkpoint_s3_bucket}")

In [None]:
%store sm_training_job_name
%store checkpoint_s3_bucket

## 7. Exploring More for SFT

### 7.1 Nova 2 Lite Iterative Training
Organizations often face challenges when implementing single-shot fine-tuning approaches for their generative AI models. The single-shot fine-tuning method involves selecting training data, configuring hyperparameters, and hoping the results meet expectations without the ability to make incremental adjustments. Single-shot fine-tuning frequently leads to suboptimal results and requires starting the entire process from scratch when improvements are needed.

Iterative fine-tuning provides several advantages over single-shot approaches that make it valuable for production environments. Risk mitigation becomes possible through incremental improvements, so you can test and validate changes before committing to larger modifications. With this approach, you can make data-driven optimizations based on real performance feedback rather than theoretical assumptions about what might work. The methodology also helps developers apply different training techniques sequentially to refine model behavior. Most importantly, iterative fine-tuning accommodates evolving business requirements driven by continuous live data traffic. As user patterns change over time and new use cases emerge that weren’t present in initial training, you can leverage this fresh data to refine your model’s performance without starting from scratch.

Iterative Training is available for Nova 2 Lite using either Bedrock or SageMaker Training Jobs.

[SageMaker AI - Iterative Training](https://docs.aws.amazon.com/sagemaker/latest/dg/nova-hp-iterative-training.html)

[Iterative fine-tuning on Amazon Bedrock](https://aws.amazon.com/blogs/machine-learning/iterative-fine-tuning-on-amazon-bedrock-for-strategic-model-improvement/)
