# Generating Synthetic Data and Distilling Knowledge to Fine-Tune Smaller Models with Amazon Nova Pro

In this notebook, we walk through using a larger language model (LLM) such as Amazon Nova Pro to create a dataset for instruction fine-tuning. This dataset is then used to perform distillation by fine-tuning a smaller model, such as Amazon Nova Micro.

By leveraging the capabilities of Amazon Nova Pro, we can generate high-quality, concise training data that enhances the performance of smaller models. This approach is particularly useful for tasks that require detailed and specific instructions.

Before we begin, ensure that you have access to Amazon Nova Pro, which is available as a foundation model on Amazon Bedrock. You can find the `deepmind/aqua_rat` dataset we will be using [here](https://huggingface.co/datasets/deepmind/aqua_rat).

You can run this notebook in Amazon SageMaker Studio or on a SageMaker notebook instance without manually setting your AWS credentials.

Let's get started!

### Amazon Bedrock

Amazon Bedrock is a fully managed service by AWS that simplifies the integration of generative AI into applications. It provides access to a variety of foundation models for tasks like natural language processing and image generation, offering APIs and SDKs for easy integration. Designed for scalability, security, and cost-effectiveness, Amazon Bedrock abstracts the complexities of deploying and managing AI models, allowing developers to focus on building their applications while leveraging AWS's robust infrastructure. This service is ideal for businesses looking to incorporate AI capabilities without requiring deep expertise in machine learning or infrastructure management.

### Amazon Nova Pro Model

Amazon Nova Pro is the largest model in the Amazon Nova family, a collection of pre-trained and instruction-tuned LLMs that also includes Micro and Lite sizes. Amazon Nova Pro adds capabilities including multi-language support and a 300k-token context window. With its stronger overall capabilities, it is well suited to content creation, conversational AI, language understanding, research and development (R&D), and enterprise applications.


### Amazon Nova Micro Model

Amazon Nova Micro is a low-latency, cost-effective text-to-text LLM. Its context length is 128k tokens and its maximum output is 5K tokens. It is designed to deliver high performance across a variety of tasks while maintaining cost efficiency, which makes it a good fit for developers and organizations that want advanced AI capabilities without extensive computational resources.


### Prerequisites
To follow along in this notebook, you'll need access to the following:

 - An AWS account with Amazon Bedrock model access to Amazon's frontier models. See the model access page [here](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html).

 - An [AWS Identity and Access Management (IAM)](https://aws.amazon.com/iam/) role to access Amazon Bedrock models. To learn more about how IAM works with Amazon Bedrock, refer to [Identity and Access Management for Amazon Bedrock](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-prereq.html).
 
 - Access to SageMaker Studio or a SageMaker notebook instance or an interactive development environment (IDE) such as PyCharm or Visual Studio Code. We recommend using SageMaker Studio for straightforward deployment and inference.


### In this notebook, we perform the following high level steps: 

1. **Evaluate Initial Performance**: We explore and prepare a question-and-answer dataset. Then we assess the quality and performance of sample inference responses from the Amazon Nova Pro and Micro models using the deepmind/aqua_rat dataset.

2. **Generate Synthetic Training Data**: We create synthetic training data with Amazon Nova Pro, which serves as the teacher model.

3. **Perform Knowledge Distillation**: We first transform the synthetic data into the training dataset format needed to fine-tune an Amazon Nova model. We then fine-tune Amazon Nova Micro using the synthetic training dataset generated by Amazon Nova Pro.

4. **Test the Fine-Tuned Model**: We deploy the fine-tuned Amazon Nova Micro model using Bedrock's Provisioned Throughput feature, then evaluate it against the test dataset to demonstrate the improvement in response quality compared with the base model.

5. **Conclusion**: The notebook concludes with a summary of the knowledge distillation process and highlights how the teacher model's expertise is effectively transferred to the smaller student model. 

6. **Cleanup**: At the end of the notebook, we provide guidance on deleting the fine-tuned custom model and the Bedrock Provisioned Throughput capacity used to deploy it.

In [None]:
# Import necessary libraries
import logging
import json
from IPython.display import display, HTML
from botocore.exceptions import ClientError
import os
from botocore.config import Config

In [None]:
logging.basicConfig(format='[%(asctime)s] p%(process)s {%(filename)s:%(lineno)d} %(levelname)s - %(message)s', level=logging.INFO)
logger = logging.getLogger(__name__)

## Section 1: Evaluate Initial Performance
We explore and prepare a question-and-answer dataset. Then we assess the quality and performance of sample inference responses from the Amazon Nova Pro and Micro models using the deepmind/aqua_rat dataset.

### 1.a Dataset Exploration and Preparation

In this section, we will explore a dataset from the Hugging Face Hub using the HF Datasets library. Hugging Face provides a vast collection of datasets for various tasks in natural language processing (NLP), computer vision, and audio processing. This exploration will help us understand the structure, features, and contents of the dataset, enabling us to prepare it for training and evaluation in our machine learning models. The [deepmind/aqua_rat](https://huggingface.co/datasets/deepmind/aqua_rat) dataset is a large-scale collection of approximately 100,000 algebraic word problems, each accompanied by a detailed natural language rationale explaining the solution process. This dataset is designed to train and evaluate models that not only generate the correct answer but also provide a step-by-step explanation, making it ideal for tasks requiring mathematical reasoning and natural language understanding.

In [None]:
!pip install --upgrade datasets

In [None]:
from datasets import load_dataset, DatasetDict

# Load the dataset
dataset_name = "deepmind/aqua_rat"
dataset = load_dataset(dataset_name)

# Split the train dataset into train and test splits
split_ratio = 0.8  # 80% for training, 20% for testing
train_test_split = dataset['train'].train_test_split(test_size=1 - split_ratio)

# Optionally split the train set further into train and validation sets
train_validation_split = train_test_split['train'].train_test_split(test_size=0.1)  # 10% for validation

# Create a new dataset dictionary with the new splits
dataset = DatasetDict({
    'train': train_validation_split['train'],
    'validation': train_validation_split['test'],
    'test': train_test_split['test']
})

# Display the number of examples in each split after splitting
print("\nNumber of Examples in Each Split After Splitting:")
for split in dataset.keys():
    if dataset[split] is not None:
        print(f"{split}: {len(dataset[split])} examples")

In [None]:
# Display basic information about the dataset
print(f"Dataset: {dataset_name}")
print(dataset)

# Display the dataset's features
print("\nDataset Features:")
print(dataset['train'].features)

# Display a few examples from the dataset
print("\nSample Examples:")
for i in range(3):
    print(dataset['train'][i])

# Extract 20 questions from the train split
questions = dataset['train'].select(range(20))['question']

# Display the first 20 questions
print("\nFirst 20 Questions:")
for i, question in enumerate(questions):
    print(f"{i+1}: {question}")

### 1.b Quality of response before fine-tuning Amazon Nova Micro

In this section, we review the response from Amazon Nova Micro as is. Later, we use this output to see how knowledge distillation improved the quality of Amazon Nova Micro's responses. In addition to reviewing quality, note the `latencyMs` metric in each response to see how much faster the smaller model responds than the larger one.


In [None]:
question = questions[1]
print(question)

#### 1.b.1 Compare Responses of Amazon Nova Micro and Pro models with a sample question

***
`Amazon Nova Micro`


In [None]:
import boto3
import json

# Create a Bedrock Runtime client
client = boto3.client(
    "bedrock-runtime",
    region_name="us-east-1"
)
MODEL_ID = "amazon.nova-micro-v1:0"

# Define your system prompt(s).
system = [
    {
        "text": "You are good at tasks requiring mathematical reasoning and natural language understanding. Start your response with a correct answer, and provide the rationale behind your answer"
    }
]

# Your user prompt
messages = [
    {"role": "user", "content": [{"text": question}]},
]

# Configure the inference parameters.
inf_params = {"maxTokens": 500, "topP": 0.1, "temperature": 0.3}

model_response = client.converse(modelId=MODEL_ID, messages=messages, system=system, inferenceConfig=inf_params)

print("\n[Full Response]")
print(json.dumps(model_response, indent=2))

print("\n[Response Content Text]")
print(model_response["output"]["message"]["content"][0]["text"])
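
The `latencyMs` value mentioned in 1.b sits under the `metrics` key of the Converse response, next to the token counts under `usage`. Below is a minimal sketch for pulling those fields out so the two models can be compared at a glance. The helper name `summarize_response` is illustrative and not part of the original notebook, and the numbers in the example dict are made up.

```python
def summarize_response(model_response):
    """Return latency and token usage from a Bedrock Converse response dict."""
    metrics = model_response.get("metrics", {})
    usage = model_response.get("usage", {})
    return {
        "latency_ms": metrics.get("latencyMs"),
        "input_tokens": usage.get("inputTokens"),
        "output_tokens": usage.get("outputTokens"),
    }


# Example with a response-shaped dict (values are placeholders, not real measurements)
example = {
    "metrics": {"latencyMs": 812},
    "usage": {"inputTokens": 95, "outputTokens": 210, "totalTokens": 305},
}
print(summarize_response(example))
```

Calling `summarize_response(model_response)` right after each `client.converse(...)` call gives one small dict per model to compare.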

***
`Amazon Nova Pro`


In [None]:
import boto3
import json

# Create a Bedrock Runtime client
client = boto3.client(
    "bedrock-runtime",
    region_name="us-east-1"
)
MODEL_ID = "us.amazon.nova-pro-v1:0"

# Define your system prompt(s).
system = [
    {
        "text": "You are good at tasks requiring mathematical reasoning and natural language understanding. Start your response with a correct answer, and provide the rationale behind your answer"
    }
]

# Your user prompt
messages = [
    {"role": "user", "content": [{"text": question}]},
]

# Configure the inference parameters.
inf_params = {"maxTokens": 500, "topP": 0.1, "temperature": 0.3}

model_response = client.converse(modelId=MODEL_ID, messages=messages, system=system, inferenceConfig=inf_params)

print("\n[Full Response]")
print(json.dumps(model_response, indent=2))

print("\n[Response Content Text]")
print(model_response["output"]["message"]["content"][0]["text"])

## Section 2: Generate Synthetic Training Data
In this section, we will leverage the Amazon Nova Pro model to generate high-quality synthetic data for distillation by fine-tuning the Amazon Nova Micro model. By using the Pro model to generate responses to domain-specific prompts, we can create a labeled dataset that will be used to fine-tune the Micro model, improving its accuracy and effectiveness in specific tasks.

In [None]:
# Initialize the Bedrock client
config = Config(read_timeout=5000)
client = boto3.client("bedrock-runtime", region_name="us-east-1", config=config)

# Set the model ID, e.g., us.amazon.nova-pro-v1:0
model_id = "us.amazon.nova-pro-v1:0"

# Load the dataset and select the first 500 questions
dataset = load_dataset('deepmind/aqua_rat', split='train')
questions = dataset.select(range(500))['question']


# Function to run inference and generate synthetic data using Bedrock
def generate_synthetic_data(client, model_id, questions):
    synthetic_data = []
    for question in questions:
        # Add Chain of Thought Reasoning prompt to the question
        system = [
           {
               "text": "You are good at tasks requiring mathematical reasoning and natural language understanding. Start your response with a correct answer, and provide the rationale behind your answer"
           }
        ]
        user_message = f"{question}"
        conversation = [
            {
                "role": "user",
                "content": [{"text": user_message}],
            }
        ]
        try:
            # Send the message to the model, using a basic inference configuration.
            response = client.converse(
                modelId=model_id,
                messages=conversation,
                system=system,
                inferenceConfig={
                    "maxTokens": 500,
                    "temperature": 0.3,
                    "topP": 0.1
                },
            )

            # Extract the response text
            response_text = response["output"]["message"]["content"][0]["text"].strip()
            synthetic_data.append({
                "question": question,
                "answer": response_text
            })
        except (ClientError, Exception) as e:
            print(f"ERROR: Can't invoke '{model_id}'. Reason: {e}")
            break

    return synthetic_data

# Generate synthetic data using Amazon Nova Pro (500 records take roughly 50 minutes)
synthetic_data = generate_synthetic_data(client, model_id, questions)

# Save the synthetic data to a JSONL file
with open('synthetic_data.jsonl', 'w') as f:
    for entry in synthetic_data:
        f.write(json.dumps(entry) + '\n')
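
The loop in `generate_synthetic_data` stops at the first error, so a single throttling response can end a long run early. One possible mitigation is to retry transient errors with exponential backoff. The sketch below assumes the exception raised by the client exposes the service error code under `e.response["Error"]["Code"]`, as botocore's `ClientError` does. The helper name `converse_with_retry`, the set of retryable codes, and the delay values are illustrative choices rather than part of the original notebook.

```python
import random
import time

# Error codes treated as transient here; adjust to your own needs
RETRYABLE_CODES = {"ThrottlingException", "ServiceUnavailableException", "ModelTimeoutException"}


def converse_with_retry(client, max_attempts=5, base_delay=1.0, sleep=time.sleep, **converse_kwargs):
    """Call client.converse, retrying transient errors with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return client.converse(**converse_kwargs)
        except Exception as e:
            # ClientError-style exceptions carry the error code in e.response
            code = getattr(e, "response", {}).get("Error", {}).get("Code")
            if code not in RETRYABLE_CODES or attempt == max_attempts:
                raise
            # Exponential backoff plus a little jitter
            delay = base_delay * (2 ** (attempt - 1)) + random.uniform(0, base_delay)
            sleep(delay)
```

Inside the loop, `client.converse(modelId=model_id, messages=conversation, system=system, inferenceConfig={...})` could then be swapped for `converse_with_retry(client, modelId=model_id, messages=conversation, system=system, inferenceConfig={...})`.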

## Section 3: Perform Knowledge Distillation
We first transform the synthetic data into the training dataset format needed to fine-tune an Amazon Nova model. We then fine-tune Amazon Nova Micro using the synthetic training dataset generated by Amazon Nova Pro.

### 3.a Transform the jsonl output to Nova model's fine-tuning format

In this section, we convert the JSONL output to match the schema needed for fine-tuning the Amazon Nova Micro model. Each line of the training file follows this schema:
```json
{
    "schemaVersion": "bedrock-conversation-2024",
    "system": [
      {
        "text": "You are good at tasks requiring mathematical reasoning and natural language understanding. Start your response with a correct answer, and provide the rationale behind your answer"
      }
    ],
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "text":"{question}"
          }
        ]},
      {
        "role": "assistant",
        "content": [
          {
            "text": "{answer}"
          }
        ]
      }
    ]
  }
  ```
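
The transformation cell that follows loads this schema from a file named `template.json`, which the notebook does not create. A minimal sketch that writes that file, assuming it should live next to the notebook and contain exactly the schema shown above with `{question}` and `{answer}` as placeholders:

```python
import json

SYSTEM_PROMPT = (
    "You are good at tasks requiring mathematical reasoning and natural language "
    "understanding. Start your response with a correct answer, and provide the "
    "rationale behind your answer"
)

# Single-turn conversation template; the placeholder texts are overwritten per record
template = {
    "schemaVersion": "bedrock-conversation-2024",
    "system": [{"text": SYSTEM_PROMPT}],
    "messages": [
        {"role": "user", "content": [{"text": "{question}"}]},
        {"role": "assistant", "content": [{"text": "{answer}"}]},
    ],
}

with open("template.json", "w") as f:
    json.dump(template, f, indent=2)
```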

In [None]:
import json
import copy

def load_template(template_path):
    with open(template_path, 'r') as f:
        return json.load(f)

def transform_data(jsonl_path, template, output_path):
    with open(jsonl_path, 'r') as input_file, open(output_path, 'w') as output_file:
        for line_number, line in enumerate(input_file, 1):
            try:
                # Parse each line as JSON
                record = json.loads(line.strip())
                
                # Create a deep copy of the template to avoid modifying the original
                transformed_record = copy.deepcopy(template)
                
                # Update the instruction and response in the messages array
                for message in transformed_record['messages']:
                    if message['role'] == 'user':
                        message['content'][0]['text'] = record['question']
                    elif message['role'] == 'assistant':
                        message['content'][0]['text'] = record['answer']
                
                # Write the transformed record to the output file
                output_file.write(json.dumps(transformed_record) + '\n')
                
            except json.JSONDecodeError as e:
                print(f"Error processing line {line_number}: Invalid JSON - {e}")
            except KeyError as e:
                print(f"Error processing line {line_number}: Missing field '{e}'")
            except Exception as e:
                print(f"Error processing line {line_number}: {str(e)}")

def main():
    try:
        # Load the template
        template = load_template('template.json')
        
        # Process the files
        transform_data(
            jsonl_path='synthetic_data.jsonl',
            template=template,
            output_path='train.jsonl'
        )
        print("Transformation complete!")
        
    except FileNotFoundError as e:
        print(f"Error: File not found - {e}")
    except json.JSONDecodeError as e:
        print(f"Error: Invalid template JSON - {e}")
    except Exception as e:
        print(f"Error: {str(e)}")

if __name__ == "__main__":
    main()
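
Before uploading, it can help to sanity-check `train.jsonl`, since a malformed line is cheaper to catch locally than in a failed training job. The sketch below only checks the single-turn shape this notebook produces (one user message followed by one assistant message) and is not a full validator for the Bedrock schema. The function names are illustrative.

```python
import json


def validate_record(record):
    """Return a list of problems found in one training record (empty if none)."""
    problems = []
    if record.get("schemaVersion") != "bedrock-conversation-2024":
        problems.append("unexpected schemaVersion")
    messages = record.get("messages", [])
    roles = [m.get("role") for m in messages]
    if roles != ["user", "assistant"]:
        problems.append(f"expected roles ['user', 'assistant'], got {roles}")
    for m in messages:
        text = (m.get("content") or [{}])[0].get("text", "")
        # Flag empty texts and template placeholders that were never filled in
        if not text.strip() or text in ("{question}", "{answer}"):
            problems.append(f"empty or unfilled text for role {m.get('role')}")
    return problems


def validate_file(path):
    """Validate every line of a JSONL file; return {line_number: problems}."""
    bad = {}
    with open(path) as f:
        for line_number, line in enumerate(f, 1):
            problems = validate_record(json.loads(line))
            if problems:
                bad[line_number] = problems
    return bad
```

Calling `validate_file('train.jsonl')` returns an empty dict when every line passes these checks.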


### 3.b Upload Files to S3 for Training Job
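
The upload cell in this section assumes the bucket named in `bucket_name` already exists. If it does not, a helper along these lines could create it first. This is a sketch under assumptions: `s3` is a boto3 S3 client, a missing bucket surfaces as an error whose code is `404` (or a similar not-found code), and the helper name `ensure_bucket` is illustrative. Note that `create_bucket` takes no `LocationConstraint` in `us-east-1` but needs one in other Regions.

```python
def ensure_bucket(s3, bucket_name, region="us-east-1"):
    """Create the bucket if it does not exist; return 'exists' or 'created'."""
    try:
        s3.head_bucket(Bucket=bucket_name)
        return "exists"
    except Exception as e:
        # ClientError-style exceptions carry the error code in e.response
        code = getattr(e, "response", {}).get("Error", {}).get("Code")
        if code not in ("404", "NoSuchBucket", "NotFound"):
            raise  # e.g. 403: the bucket exists but belongs to someone else

    if region == "us-east-1":
        s3.create_bucket(Bucket=bucket_name)
    else:
        s3.create_bucket(
            Bucket=bucket_name,
            CreateBucketConfiguration={"LocationConstraint": region},
        )
    return "created"
```

With the client from the next cell, this would be called as `ensure_bucket(s3, bucket_name)` before the upload loop.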

In [None]:
# Initialize the S3 client
s3 = boto3.client('s3')
bucket_name = 'synthetic-generated-data-<uniqueID>'  # Create a new bucket or use an existing one
subdirectory = 'amazon-nova-pro'
train_data_location = f"s3://{bucket_name}/{subdirectory}"

files_to_upload = ['train.jsonl']

# Upload the files to the specified subdirectory
for file_name in files_to_upload:
    file_path = file_name  # File is in the same directory as the notebook
    key_path = f"{subdirectory}/{file_name}"
    
    # Check if the file exists
    if not os.path.isfile(file_path):
        raise FileNotFoundError(f"No such file or directory: '{file_path}'")
    
    # Upload the file
    try:
        s3.upload_file(file_path, bucket_name, key_path)
        print(f"File {file_name} uploaded successfully to {key_path}.")
    except ClientError as e:
        print(f"Error uploading file {file_name}: {e}")

### 3.b Fine-tuning Amazon Nova Micro

In this section, we walk through distillation by fine-tuning the Amazon Nova Micro model to improve its performance on specific tasks. Fine-tuning trains the pre-trained model on a custom dataset to adapt it to a particular domain or application. Amazon Bedrock provides serverless fine-tuning capabilities: you only need to provide the dataset and the base model ID to train the model.

#### 3.b.1 Pre-requisites
1. Create or use an IAM role that allows the Amazon Bedrock fine-tuning job to access your training data. The IAM policy for this role, `Nova-fine-tuning-role`, looks like this:
```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::synthetic-generated-data-<uniqueID>",
                "arn:aws:s3:::synthetic-generated-data-<uniqueID>/*"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:PrincipalAccount": "<aws-account-id>"
                }
            }
        }
    ]
}
```
2. Allow the Amazon Bedrock fine-tuning job to assume this role by adding the following trust relationship policy:
```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "bedrock.amazonaws.com"
            },
            "Action": "sts:AssumeRole",
            "Condition": {
                "StringEquals": {
                    "aws:SourceAccount": "<aws-account-id>"
                },
                "ArnEquals": {
                    "aws:SourceArn": "arn:aws:bedrock:us-east-1:<aws-account-id>:model-customization-job/*"
                }
            }
        }
    ]
}
```
3. Attach the `iam:PassRole` and `iam:GetRole` permissions to the SageMaker execution role. The policy looks like this:
```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "iam:GetRole",
            "Resource": "arn:aws:iam::<aws-account-id>:role/service-role/Nova-fine-tuning-role"
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "arn:aws:iam::<aws-account-id>:role/service-role/Nova-fine-tuning-role"
        }
    ]
}
```
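
If you prefer to generate the first two policy documents rather than edit the JSON by hand, they can be built from the bucket name and account ID. The sketch below mirrors the permissions policy and trust policy shown above. The function name `build_role_policies` is illustrative, and the placeholder arguments need to be replaced with your own values.

```python
import json


def build_role_policies(bucket_name, account_id, region="us-east-1"):
    """Return (permissions_policy, trust_policy) dicts mirroring the JSON above."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    permissions = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
            "Resource": [bucket_arn, f"{bucket_arn}/*"],
            "Condition": {"StringEquals": {"aws:PrincipalAccount": account_id}},
        }],
    }
    trust = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "bedrock.amazonaws.com"},
            "Action": "sts:AssumeRole",
            "Condition": {
                "StringEquals": {"aws:SourceAccount": account_id},
                "ArnEquals": {
                    "aws:SourceArn": f"arn:aws:bedrock:{region}:{account_id}:model-customization-job/*"
                },
            },
        }],
    }
    return permissions, trust


permissions, trust = build_role_policies("synthetic-generated-data-<uniqueID>", "<aws-account-id>")
print(json.dumps(trust, indent=2))
```

The serialized documents can then be passed to the IAM API, for example `json.dumps(trust)` as `AssumeRolePolicyDocument` in `create_role` and `json.dumps(permissions)` as `PolicyDocument` in `put_role_policy`.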

#### 3.b.2 Import libraries
Here we import the necessary libraries and modules. The code initializes a SageMaker session and a Bedrock client to interact with the Amazon SageMaker and Amazon Bedrock services through their APIs.

In [None]:
import boto3
from sagemaker import Session
import uuid  # Import the 'uuid' module for generating a unique identifier

# Create a SageMaker session from the default boto3 session and clients
session = Session(
    boto_session=boto3.session.Session(),
    sagemaker_client=boto3.client('sagemaker'),
    sagemaker_runtime_client=boto3.client('sagemaker-runtime'),
)

# Initialize Bedrock client
bedrock = boto3.client(service_name='bedrock', region_name="us-east-1")

#### 3.b.3 Set parameters for the training job
Retrieve the Bedrock IAM execution role and its ARN.

Here we are initializing an IAM client. Information about the Bedrock IAM role is retrieved and the role ARN is saved to create a customization job on Bedrock.

In [None]:
client = boto3.client(service_name='iam')
response = client.get_role(
    RoleName='Nova-fine-tuning-role'
)

roleArn = response["Role"]["Arn"]

We are configuring the necessary parameters like setting the base model id, hyperparameters, training data and model output S3 locations to initialize a Bedrock model customization job that will fine-tune the Amazon Nova Micro model.

In [None]:
# Base model to use
basemodelId = 'us.amazon.nova-micro-v1:0'
job_prefix = "customNova"

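# Base model ARN passed to create_model_customization_job below.
# Identifiers accepted for fine-tuning vary by model and Region; confirm the value for your account.
# https://docs.aws.amazon.com/bedrock/latest/userguide/prov-thru-api.html
baseModelIdentifierForProvisonedThroughput = "arn:aws:bedrock:us-east-1::foundation-model/us.amazon.nova-micro-v1:0"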
# Generate a unique identifier for the job and custom model name
job_uuid = str(uuid.uuid4())[:8]  # Extracting the first 8 characters for brevity
jobName = f"{job_prefix}-{job_uuid}"
customModelName = f"{job_prefix}-{job_uuid}"

hyperParameters = {
    "epochCount": "5",  # number of times the training data is passed through the model during training
    "batchSize": "1",  # number of samples processed before the model parameters are updated
    "learningRate": "0.00001",  # how aggressively the model is updated with each batch of data
}

# Bucket that holds the training data uploaded in the previous step
default_bucket = "synthetic-generated-data-<uniqueID>"

# Specify the training data configuration using the previously uploaded S3 data
s3_train_data = f"s3://{default_bucket}/amazon-nova-pro/train.jsonl"
trainingDataConfig = {"s3Uri": s3_train_data}

# Specify the output data configuration for the custom model
outputDataConfig = {"s3Uri": f"s3://{default_bucket}/fine-tuning-output/"}
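
The exact base model identifier that a customization job accepts can differ from the ID used for inference. One way to check which identifiers are available for fine-tuning in your Region is to ask the Bedrock control-plane API. The sketch below assumes `bedrock` is a boto3 `bedrock` client. The helper name `list_fine_tunable_models` and the client-side `name_filter` are illustrative.

```python
def list_fine_tunable_models(bedrock, name_filter="nova"):
    """Return (modelId, modelArn) pairs for models that support fine-tuning."""
    response = bedrock.list_foundation_models(byCustomizationType="FINE_TUNING")
    return [
        (m["modelId"], m["modelArn"])
        for m in response.get("modelSummaries", [])
        # Keep only models whose ID contains the filter string (case-insensitive)
        if name_filter.lower() in m["modelId"].lower()
    ]
```

Printing the result of `list_fine_tunable_models(bedrock)` shows candidate values for the base model identifier.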

#### 3.b.4 Trigger the bedrock training job
We create a fine-tuning job using the Bedrock client. Once created, the job identifier is printed. This identifier can be used to track the status and results of the fine-tuning job.

In [None]:
# Create a job for model customization
jobIdentifier = bedrock.create_model_customization_job(
    jobName = jobName,
    customizationType = "FINE_TUNING",
    customModelName = customModelName,
    roleArn = roleArn,
    baseModelIdentifier = baseModelIdentifierForProvisonedThroughput,
    hyperParameters = hyperParameters,
    trainingDataConfig = trainingDataConfig,
    outputDataConfig = outputDataConfig
)

# Print the identifier for the created job
print(f"Model customization job created with identifier: {jobIdentifier}")

#### 3.b.5 Monitor the job until the status is shown as "Completed"

In [None]:
fine_tune_job = bedrock.get_model_customization_job(jobIdentifier=jobIdentifier['jobArn'])
print(fine_tune_job['status'])
# The job may take more than an hour to complete
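
Rather than re-running the cell above by hand, the status check can be wrapped in a polling loop. This is a minimal sketch: it assumes `bedrock` is the boto3 `bedrock` client created earlier, treats `Completed`, `Failed`, and `Stopped` as terminal states, and uses an illustrative helper name and polling interval.

```python
import time


def wait_for_customization_job(bedrock, job_arn, poll_seconds=60, sleep=time.sleep):
    """Poll the customization job until it reaches a terminal status and return it."""
    terminal = {"Completed", "Failed", "Stopped"}
    while True:
        status = bedrock.get_model_customization_job(jobIdentifier=job_arn)["status"]
        print(f"Job status: {status}")
        if status in terminal:
            return status
        sleep(poll_seconds)
```

For example, `wait_for_customization_job(bedrock, jobIdentifier['jobArn'])` blocks until the job finishes. Re-fetch `fine_tune_job` afterwards so that it includes the output model ARN.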

## Section 4: Testing the Amazon Nova Micro Fine-tuned Model 

In this section, we will evaluate the performance of the fine-tuned Amazon Nova Micro model to determine how well it has adapted to the specific tasks for which it was trained. Testing involves comparing the model's responses to a set of predefined questions or tasks against the baseline performance of the original, pre-trained model. This process helps us understand the improvements achieved through distillation by fine-tuning and identify any remaining areas for enhancement. By systematically examining the model's outputs, we can ensure that the fine-tuning process has effectively tailored the model to meet our specific requirements.

### 4.a Create provisioned no-commit throughput for the custom model 
**(only run the following once the status of the above job is shown as "Completed")**

This code configures the provisioned inference capacity for the custom model produced by the fine-tuning job, so that it can be deployed as an Amazon Bedrock managed endpoint.

In [None]:
customModelId = fine_tune_job['outputModelArn']

provisionedModelName = f"{job_prefix}-provisioned-{job_uuid}"

# Create the provisioned capacity without passing any commitment option
provisionedModelArn = bedrock.create_provisioned_model_throughput(
    modelUnits=1,
    provisionedModelName=provisionedModelName,
    modelId=customModelId
)['provisionedModelArn']


#### 4.a.1 Check the provisioned capacity creation status
This process will take a few minutes to complete. Ensure the status is "InService" before going to the next section, "**Prepare sample dataset for the inference request**".

In [None]:
# Get the provisioned model status; re-run this cell until it prints "InService"
provisionedModelStatus = bedrock.get_provisioned_model_throughput(provisionedModelId=provisionedModelArn)
print(provisionedModelStatus['status'])

In [None]:
print(f"provisionedModelArn = {provisionedModelArn}")
print(f"customModelId = {customModelId}")

### 4.b Prepare sample dataset for the inference request
Let's prepare a few inference requests to send to the fine-tuned custom model endpoint provisioned through Amazon Bedrock. The model will process each question and return an answer with its rationale, based on the configuration parameters.

In [None]:
from datasets import load_dataset, DatasetDict

# Load the dataset
dataset_name = "deepmind/aqua_rat"
dataset = load_dataset(dataset_name)

# Split the train dataset into train and test splits
split_ratio = 0.8  # 80% for training, 20% for testing
train_test_split = dataset['train'].train_test_split(test_size=1 - split_ratio)

# Optionally split the train set further into train and validation sets
train_validation_split = train_test_split['train'].train_test_split(test_size=0.1)  # 10% for validation

# Create a new dataset dictionary with the new splits
dataset = DatasetDict({
    'train': train_validation_split['train'],
    'validation': train_validation_split['test'],
    'test': train_test_split['test']
})

In [None]:
# Extract 8 questions, and their correct rationale from the dataset
num_questions = 8
questions = dataset['test'].select(range(num_questions))['question']
answers = dataset['test'].select(range(num_questions))['rationale']
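
Because Section 2 drew its questions from the same `train` split that is reshuffled here, a sampled test question may already have been seen during fine-tuning. A small filter can rule that out. The sketch below assumes the questions used for training are still available in `synthetic_data.jsonl`. Both helper names are illustrative.

```python
import json


def load_seen_questions(path="synthetic_data.jsonl"):
    """Read the questions that were used to build the training set."""
    with open(path) as f:
        return [json.loads(line)["question"] for line in f if line.strip()]


def drop_seen(test_questions, test_answers, seen_questions):
    """Keep only the (question, answer) pairs whose question was not used in training."""
    seen = set(seen_questions)
    kept = [(q, a) for q, a in zip(test_questions, test_answers) if q not in seen]
    return [q for q, _ in kept], [a for _, a in kept]
```

For example, `questions, answers = drop_seen(questions, answers, load_seen_questions())` leaves only held-out questions. If any are dropped, select a few more from the test split to get back to `num_questions`.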


### 4.c Send the inference request to both base and fine-tuned custom model
In the cell below, we send the same inference requests to the base Amazon Nova Micro model, to the fine-tuned custom model that has distilled knowledge from Amazon Nova Pro's synthetically generated training dataset, and to Amazon Nova Pro itself.

In [None]:
import boto3

# Create the Bedrock Runtime client
client = boto3.client(
    service_name='bedrock-runtime',
    region_name='us-east-1'  # specify your region
)

#fine-tuned model_id and base model_id
custom_model_id = provisionedModelArn
base_model_id = 'us.amazon.nova-micro-v1:0'
base_pro_model_id = 'us.amazon.nova-pro-v1:0'


# Function to run inference on models using "test" dataset
def generate_inference_data(client, model_id, questions):
    inference_data = []
    for question in questions:
        # Add Chain of Thought Reasoning prompt to the question        
        user_message = f"{question}"
        system = [
            {"text": "You are good at tasks requiring mathematical reasoning and natural language understanding. Start your response with a correct answer, and provide the rationale behind your answer"
            }
        ]
        conversation = [
            {
                "role": "user",
                "content": [{"text": user_message}],
            }
        ]
        try:
            # Send the message to the model, using a basic inference configuration.
            response = client.converse(
                modelId=model_id,
                messages=conversation,
                system=system,
                inferenceConfig={
                    "maxTokens": 500,
                    "temperature": 0.3,
                    "topP": 0.1
                },
            )

            # Extract the response text
            response_text = response["output"]["message"]["content"][0]["text"].strip()
            inference_data.append({
                "response": response_text
            })
        except (ClientError, Exception) as e:
            print(f"ERROR: Can't invoke '{model_id}'. Reason: {e}")
            break

    return inference_data

# Run inference against the fine-tuned custom model, the base model, and the Pro model using the sample "test" dataset
results_fine_tuned_nova_micro = generate_inference_data(client, custom_model_id, questions)
results_base_nova_micro = generate_inference_data(client, base_model_id, questions)
results_base_nova_pro = generate_inference_data(client, base_pro_model_id, questions)

### 4.d Render the test dataset answers alongside the responses from the base and custom models

In [None]:
# Create a table of the outputs using HTML
table_html = """
<table>
    <tr>
        <th>Question</th>
        <th>Dataset Answer</th>
        <th>Fine-tuned Nova Micro Output</th>
        <th>Nova Micro Output</th>
        <th>Nova Pro Output</th>
    </tr>
"""

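import html

for i in range(len(questions)):
    # Each result is a dict, so read its "response" text; escape it so characters like "<" render correctly
    table_html += f"""
    <tr>
        <td>{html.escape(questions[i])}</td>
        <td>{html.escape(answers[i])}</td>
        <td>{html.escape(results_fine_tuned_nova_micro[i]['response'])}</td>
        <td>{html.escape(results_base_nova_micro[i]['response'])}</td>
        <td>{html.escape(results_base_nova_pro[i]['response'])}</td>
    </tr>
    """

# Close the table
table_html += "</table>"

# Display the table using HTML
display(HTML(table_html))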

## Section 5: Conclusion

In this notebook, we have successfully demonstrated the process of distillation by fine-tuning and evaluating the Amazon Nova Micro model using Amazon Bedrock. By leveraging the advanced capabilities of the Amazon Nova Pro model, we generated high-quality synthetic data that served as a foundation for fine-tuning the smaller Nova Micro model. This approach allowed us to enhance the performance of the Nova Micro model, tailoring it to specific domain tasks and improving its accuracy and effectiveness.

### 5.a Key Steps Accomplished:
1. **Dataset Exploration**: We explored a sample dataset to understand its structure and contents, preparing it for use in model training and evaluation.
2. **Data Generation with Amazon Nova Pro**: Utilizing the Amazon Nova Pro model, we generated synthetic data that provided high-quality responses to domain-specific prompts.
3. **Distillation into Amazon Nova Micro**: We fine-tuned the Amazon Nova Micro model using the synthetic data, adapting it to better handle specific tasks and improving its overall performance.
4. **Model Testing**: We tested the fine-tuned model against a set of evaluation questions, comparing its responses to those of the pre-trained model and assessing the improvements achieved through distillation by fine-tuning.

### 5.b Results and Insights:
- **Enhanced Performance**: In the side-by-side comparison in Section 4.d, the fine-tuned Amazon Nova Micro model is expected to produce more accurate and contextually relevant responses than the base model, which is where the effect of the fine-tuning process shows up.
- **Cost-Effective Adaptation**: By fine-tuning the smaller Amazon Nova Micro model with data generated from the larger Amazon Nova Pro model, we achieved high performance without the need for extensive computational resources, highlighting a cost-effective approach to model adaptation.
- **Scalability and Flexibility**: The workflow outlined in this notebook can be scaled and adapted to various domains and tasks, providing a flexible framework for enhancing the capabilities of language models.

### 5.c Future Work:
- **Further Fine-Tuning**: Additional fine-tuning with more diverse and extensive datasets can further improve the model's performance and adaptability to different domains.
- **Real-World Applications**: Deploying the fine-tuned model in real-world applications such as customer support, content generation, and domain-specific research can provide valuable insights and practical benefits.
- **Continuous Evaluation**: Ongoing evaluation and monitoring of the model's performance will ensure that it remains effective and relevant as new data and requirements emerge.

In conclusion, this notebook has shown how to generate synthetic data with Amazon Nova Pro and use it for distillation by fine-tuning and evaluating the Amazon Nova Micro model, demonstrating the potential of using advanced language models to address specific domain needs. By following the steps outlined, practitioners can enhance their models' performance, achieve cost-effective adaptations, and unlock new possibilities in natural language processing and beyond.

## Section 6: Delete the provisioned capacity and the custom model

### 6.a Import libraries

In [None]:
import boto3

# Initialize Bedrock client
bedrock = boto3.client(service_name='bedrock', region_name='us-east-1')

### 6.b Update the provisionedModelArn & customModelId values from the fine-tuning section above

In [None]:
# Using the provisionedModelArn from section 4.a.1
provisionedModelArn = "<Update the value from the fine-tuning notebook>"

# Using the customModelId from section 4.a.1
customModelId = "<Update the value from the fine-tuning notebook>"

### 6.c Delete the provisioned capacity & the model

In [None]:
# Delete the provisioned capacity
bedrock.delete_provisioned_model_throughput(provisionedModelId=provisionedModelArn)

# Delete the custom model
bedrock.delete_custom_model(modelIdentifier=customModelId)