# Introduction to Bedrock - Fine-Tuning

> *If you see errors, you may need to be allow-listed for the Bedrock models used by this notebook*

> *This notebook should work well with the **`Data Science 3.0`** kernel in SageMaker Studio*


In this notebook, we demonstrate how to use the Bedrock Python SDK to fine-tune Bedrock models with your own data. If you have text samples and want to adapt a Bedrock foundation model to your domain, you can fine-tune it by providing your own training dataset. You upload the dataset to Amazon S3 and provide the S3 path when configuring a Bedrock fine-tuning job. You can also adjust the hyperparameters (learning rate, epoch count, and batch size). After the fine-tuning job has completed, you can use the model for inference in the Bedrock playground: select the fine-tuned model and submit a prompt along with a set of model parameters. The fine-tuned model should generate text that more closely resembles your text samples.

-----------

1. Setup
2. Fine-tuning
3. Testing the fine-tuned model

 Note: This notebook was tested in Amazon SageMaker Studio with Python 3 (Data Science 2.0) kernel.

---

## 1. Setup

In [2]:
%pip install -U --force-reinstall pandas==2.1.2

Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting pandas==2.1.2
  Obtaining dependency information for pandas==2.1.2 from https://files.pythonhosted.org/packages/02/52/815f643ed3afb3365354548b3c8b557dbf926a65c40ad5b6d9e455147c7e/pandas-2.1.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
  Downloading pandas-2.1.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (18 kB)
Collecting numpy<2,>=1.22.4 (from pandas==2.1.2)
  Obtaining dependency information for numpy<2,>=1.22.4 from https://files.pythonhosted.org/packages/64/41/284783f1014685201e447ea976e85fed0e351f5debbaf3ee6d7645521f1d/numpy-1.26.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
  Downloading numpy-1.26.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (61 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 61.2/61.2 kB 295.6 MB/s eta 0:00:00
Collecting python-dateuti

In [3]:
%pip install boto3

Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
[notice] A new release of pip is available: 23.2.1 -> 23.3.1
[notice] To update, run: pip install --upgrade pip
Note: you may need to restart the kernel to use updated packages.


In [4]:
%pip list

Package                              Version
------------------------------------ --------------------
aiohttp                              3.9.0
aiosignal                            1.3.1
alabaster                            0.7.12
anaconda-client                      1.11.0
anaconda-project                     0.11.1
anyio                                3.5.0
appdirs                              1.4.4
argon2-cffi                          21.3.0
argon2-cffi-bindings                 21.2.0
arrow                                1.2.2
astroid                              3.0.0
astropy                              5.3.4
asttokens                            2.4.0
async-timeout                        4.0.3
atomicwrites                         1.4.0
attrs                                23.1.0
Automat                              20.2.0
autopep8                             1.6.0
autovizwidget                        0.21.0
awscli                               1.29.63
Babel                      

In [5]:
%pip install datasets==2.14.6 boto3

Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
[notice] A new release of pip is available: 23.2.1 -> 23.3.1
[notice] To update, run: pip install --upgrade pip
Note: you may need to restart the kernel to use updated packages.


In [6]:
# %pip install -U boto3-1.28.60-py3-none-any.whl \
#                botocore-1.31.60-py3-none-any.whl \
#                awscli-1.29.60-py3-none-any.whl \
#                --force-reinstall --quiet

In [7]:
%pip list | grep boto3

boto3                                1.29.6

[notice] A new release of pip is available: 23.2.1 -> 23.3.1
[notice] To update, run: pip install --upgrade pip
Note: you may need to restart the kernel to use updated packages.


#### Now let's set up our connection to the Amazon Bedrock SDK using Boto3

In [8]:
#### Uncomment the following lines to run from a local environment outside of the AWS account that has Bedrock access

#import os
#os.environ['BEDROCK_ASSUME_ROLE'] = '<YOUR_VALUES>'
#os.environ['AWS_PROFILE'] = '<YOUR_VALUES>'

In [9]:
import boto3
import json 

bedrock = boto3.client(service_name="bedrock")
bedrock_runtime = boto3.client(service_name="bedrock-runtime")

In [10]:
# List the Meta foundation models that support fine-tuning (the output below shows the Llama 2 models)
for model in bedrock.list_foundation_models(byProvider="meta", byCustomizationType="FINE_TUNING")["modelSummaries"]:
    print("-----\n" + "modelArn: " + model["modelArn"] + "\nmodelId: " + model["modelId"] + "\nmodelName: " + model["modelName"] + "\ncustomizationsSupported: " + ','.join(model["customizationsSupported"]))

-----
modelArn: arn:aws:bedrock:us-east-1::foundation-model/meta.llama2-13b-v1:0:4k
modelId: meta.llama2-13b-v1:0:4k
modelName: Llama 2 13B
customizationsSupported: FINE_TUNING
-----
modelArn: arn:aws:bedrock:us-east-1::foundation-model/meta.llama2-70b-v1:0:4k
modelId: meta.llama2-70b-v1:0:4k
modelName: Llama 2 70B
customizationsSupported: FINE_TUNING
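
The long concatenated `print` call above can be hard to scan. Here is a minimal sketch of the same formatting as a small helper. The name `format_model_summary` is hypothetical, and the sample dictionary copies the first entry from the output above instead of calling the API:

```python
def format_model_summary(model: dict) -> str:
    """Render one entry of `modelSummaries` as a readable multi-line string."""
    return "\n".join([
        "-----",
        f"modelArn: {model['modelArn']}",
        f"modelId: {model['modelId']}",
        f"modelName: {model['modelName']}",
        f"customizationsSupported: {','.join(model['customizationsSupported'])}",
    ])


# Sample entry copied from the output above, so no AWS call is needed here.
sample_model = {
    "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/meta.llama2-13b-v1:0:4k",
    "modelId": "meta.llama2-13b-v1:0:4k",
    "modelName": "Llama 2 13B",
    "customizationsSupported": ["FINE_TUNING"],
}

summary_text = format_model_summary(sample_model)
print(summary_text)
```

In the notebook, the helper could be applied to each element of `bedrock.list_foundation_models(...)["modelSummaries"]`.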


In [11]:
import sagemaker

sess = sagemaker.Session()
sagemaker_session_bucket = sess.default_bucket()
role = sagemaker.get_execution_role()

sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /root/.config/sagemaker/config.yaml


### Define the Model IDs Used Before Fine-Tuning

In [12]:
base_model_id = "meta.llama2-13b-v1:0:4k"
chat_base_model_id = "meta.llama2-13b-chat-v1"

### Convert the dataset into jsonlines format

In [13]:
from datasets import load_dataset

dataset = load_dataset("csv", data_files="gaia_questions_answers.csv")

In [14]:
print(dataset)

DatasetDict({
    train: Dataset({
        features: ['question', 'answer'],
        num_rows: 110
    })
})


In [15]:
def wrap_instruction_fn(example):
    prompt = 'Answer the following question:\n\n'
    end_prompt = '\n\nAnswer: '
    example["instruction"] = prompt + example["question"] + end_prompt
    return example

In [16]:
dataset['train']\
  .filter(lambda example: example['question'] and example['answer'])\
  .select_columns(['question', 'answer'])\
  .map(wrap_instruction_fn)\
  .rename_column('instruction', 'prompt')\
  .rename_column('answer', 'completion')\
  .remove_columns(['question'])\
  .to_json('./train-question-answer.jsonl', index=False)

# dataset['validation']\
#   .filter(lambda example: example['question'])\
#   .select_columns(['question', 'answer'])\
#   .map(wrap_instruction_fn)\
#   .rename_column('instruction', 'input')\
#   .rename_column('answer', 'output')\
#   .to_json('./validation-summarization.jsonl', index=False)

# dataset['test']\
#   .filter(lambda example: example['question'])\
#   .select_columns(['question', 'answer'])\
#   .map(wrap_instruction_fn)\
#   .rename_column('instruction', 'input')\
#   .rename_column('answer', 'output')\
#   .to_json('./test-summarization.jsonl', index=False)

#  .remove_columns(['Unnamed: 0.1', 'Unnamed: 0', 'question_id', 'title', 'answer_id', 'expertreviewed', 'upvotedcount', 'tags'])\

Map:   0%|          | 0/110 [00:00<?, ? examples/s]

Creating json from Arrow format:   0%|          | 0/1 [00:00<?, ?ba/s]

42498
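
The cell above writes one JSON object per line, each with a `prompt` and a `completion` key. The following standard-library sketch shows the same record shape without the `datasets` dependency. The helper names and the question/answer pair are hypothetical and only illustrate the format:

```python
import io
import json


def to_prompt_completion(question: str, answer: str) -> dict:
    """Build one training record using the same template as wrap_instruction_fn."""
    prompt = "Answer the following question:\n\n" + question + "\n\nAnswer: "
    return {"prompt": prompt, "completion": answer}


def write_jsonl(records, handle) -> int:
    """Write one JSON object per line and return the number of lines written."""
    count = 0
    for record in records:
        # json.dumps escapes embedded newlines, so each record stays on one line.
        handle.write(json.dumps(record) + "\n")
        count += 1
    return count


# Hypothetical question/answer pair, for illustration only.
pairs = [("What is fine-tuning?", "Adapting a pretrained model to new data.")]

buffer = io.StringIO()  # stands in for a file opened with open(path, "w")
n_written = write_jsonl((to_prompt_completion(q, a) for q, a in pairs), buffer)
first_record = json.loads(buffer.getvalue().splitlines()[0])
print(first_record)
```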

In [17]:
import pandas as pd
df = pd.read_json("./train-question-answer.jsonl", lines=True)
df

| | completion | prompt |
|---|---|---|
| 0 | Intelligent search, automated customer-support... | Answer the following question:\n\nWhat are som... |
| 1 | Foundation models are large and complex neural... | Answer the following question:\n\nHow are foun... |
| 2 | The generative AI project life cycle, though n... | Answer the following question:\n\nCan you desc... |
| 3 | AWS offers a range of frameworks and infrastru... | Answer the following question:\n\nWhat makes A... |
| 4 | AWS offers increased flexibility, choice, ente... | Answer the following question:\n\nHow does Gen... |
| ... | ... | ... |
| 105 | Textual inversion is a lightweight fine-tuning... | Answer the following question:\n\nWhat is text... |
| 106 | Human Alignment with Reinforcement Learning fr... | Answer the following question:\n\nHow does hum... |
| 107 | PEFT-LoRA (Parameter-Efficient Fine-Tuning wit... | Answer the following question:\n\nHow do PEFT-... |
| 108 | Fine-tuning Stable Diffusion models allows cus... | Answer the following question:\n\nWhat are the... |


In [18]:
data = "./train-question-answer.jsonl"

Read the JSON Lines file into a list of strings, as you would any ordinary text file.

In [19]:
with open(data) as f:
    lines = f.read().splitlines()

#### Load the `lines` object into a pandas DataFrame

In [20]:
import pandas as pd
df_inter = pd.DataFrame(lines)
df_inter.columns = ['json_element']

This intermediate DataFrame has a single column, with one JSON string per row. The decoded rows are shown below.

In [21]:
df_inter['json_element'].apply(json.loads)

0      {'completion': 'Intelligent search, automated ...
1      {'completion': 'Foundation models are large an...
2      {'completion': 'The generative AI project life...
3      {'completion': 'AWS offers a range of framewor...
4      {'completion': 'AWS offers increased flexibili...
                             ...                        
105    {'completion': 'Textual inversion is a lightwe...
106    {'completion': 'Human Alignment with Reinforce...
107    {'completion': 'PEFT-LoRA (Parameter-Efficient...
108    {'completion': 'Fine-tuning Stable Diffusion m...
109    {'completion': 'Fine-tuning techniques for dif...
Name: json_element, Length: 110, dtype: object

Here we apply `json.loads` to each row of the `json_element` column. `json.loads` is the Python decoder that turns a JSON string into a dictionary, and `apply` is a pandas method that calls a given function on each element of a Series (or along an axis of a DataFrame).

In [22]:
df_final = pd.json_normalize(df_inter['json_element'].apply(json.loads))

Once decoding is done, we apply `pd.json_normalize` to the result. `json_normalize` converts semi-structured JSON data into a flat table: here the JSON keys become columns and their values become the row entries.
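
For flat records like these, the two steps amount to parsing each line and then collecting values by key. The following pandas-free sketch illustrates that idea with two hypothetical sample lines. It is not how pandas implements `json_normalize`, which also flattens nested objects, something this sketch does not attempt:

```python
import json

# Two hypothetical JSON Lines records with the same keys as the training file.
sample_lines = [
    '{"completion": "Answer A", "prompt": "Question A"}',
    '{"completion": "Answer B", "prompt": "Question B"}',
]

# Step 1: decode each line into a dictionary (what json.loads does per row).
decoded = [json.loads(line) for line in sample_lines]

# Step 2: collect every key in first-seen order, then turn keys into columns.
keys = []
for record in decoded:
    for key in record:
        if key not in keys:
            keys.append(key)

# A record that lacks a key contributes None, so all columns stay aligned.
columns = {key: [record.get(key) for record in decoded] for key in keys}
print(columns)
```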

In [23]:
df_final

| | completion | prompt |
|---|---|---|
| 0 | Intelligent search, automated customer-support... | Answer the following question:\n\nWhat are som... |
| 1 | Foundation models are large and complex neural... | Answer the following question:\n\nHow are foun... |
| 2 | The generative AI project life cycle, though n... | Answer the following question:\n\nCan you desc... |
| 3 | AWS offers a range of frameworks and infrastru... | Answer the following question:\n\nWhat makes A... |
| 4 | AWS offers increased flexibility, choice, ente... | Answer the following question:\n\nHow does Gen... |
| ... | ... | ... |
| 105 | Textual inversion is a lightweight fine-tuning... | Answer the following question:\n\nWhat is text... |
| 106 | Human Alignment with Reinforcement Learning fr... | Answer the following question:\n\nHow does hum... |
| 107 | PEFT-LoRA (Parameter-Efficient Fine-Tuning wit... | Answer the following question:\n\nHow do PEFT-... |
| 108 | Fine-tuning Stable Diffusion models allows cus... | Answer the following question:\n\nWhat are the... |


### Uploading data to S3

Next, we need to upload our training dataset to S3:

In [24]:
s3_location = f"s3://{sagemaker_session_bucket}/bedrock/finetuning/train-question-answer.jsonl"
s3_output = f"s3://{sagemaker_session_bucket}/bedrock/finetuning/output"

In [25]:
!aws s3 cp ./train-question-answer.jsonl $s3_location

upload: ./train-question-answer.jsonl to s3://sagemaker-us-east-1-079002598131/bedrock/finetuning/train-question-answer.jsonl
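
As an alternative to the AWS CLI call above, the upload can be done from Python. This sketch splits an `s3://` URI into bucket and key and passes them to boto3's `upload_file`. The helper names are hypothetical, `example-bucket` is a placeholder, and the upload function itself is not called here because it needs AWS credentials:

```python
from urllib.parse import urlparse


def split_s3_uri(s3_uri: str) -> tuple:
    """Split 's3://bucket/key/parts' into (bucket, key)."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {s3_uri}")
    return parsed.netloc, parsed.path.lstrip("/")


def upload_training_file(local_path: str, s3_uri: str) -> None:
    """Upload a local file to the given S3 URI (requires AWS credentials)."""
    import boto3  # imported lazily so split_s3_uri works without boto3 installed

    bucket, key = split_s3_uri(s3_uri)
    boto3.client("s3").upload_file(local_path, bucket, key)


# "example-bucket" is a placeholder, not a real bucket.
bucket, key = split_s3_uri(
    "s3://example-bucket/bedrock/finetuning/train-question-answer.jsonl"
)
print(bucket, key)
```

In the notebook, the call would be `upload_training_file("./train-question-answer.jsonl", s3_location)`.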


Now we can create the fine-tuning job. 

> **Note:** Make sure the IAM role you're using has the [IAM policies](https://docs.aws.amazon.com/bedrock/latest/userguide/model-customization-iam-role.html) attached that allow Amazon Bedrock to access the specified S3 buckets.

## 2. Fine-tuning

In [26]:
import time
timestamp = int(time.time())

In [27]:
job_name = "llama2-{}".format(timestamp)
job_name

'llama2-1701035879'

In [28]:
custom_model_name = "custom-{}".format(job_name)
custom_model_name

'custom-llama2-1701035879'

In [29]:
bedrock.create_model_customization_job(
    jobName=job_name,
    customModelName=custom_model_name,
    roleArn=role,
    baseModelIdentifier=base_model_id,
    hyperParameters = {
        "epochCount": "10",
        "batchSize": "1",
        "learningRate": "0.000005",
    },
    trainingDataConfig={"s3Uri": s3_location},
    outputDataConfig={"s3Uri": s3_output},
)

{'ResponseMetadata': {'RequestId': '3f52b536-ae02-44f9-aab8-7cf97b8de36d',
  'HTTPStatusCode': 201,
  'HTTPHeaders': {'date': 'Sun, 26 Nov 2023 21:57:59 GMT',
   'content-type': 'application/json',
   'content-length': '112',
   'connection': 'keep-alive',
   'x-amzn-requestid': '3f52b536-ae02-44f9-aab8-7cf97b8de36d'},
  'RetryAttempts': 0},
 'jobArn': 'arn:aws:bedrock:us-east-1:079002598131:model-customization-job/meta.llama2-13b-v1:0:4k/nn382igms8q1'}
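
Note that the hyperparameters in the call above are passed as strings. When they start out as numbers, a naive `str()` on a small float produces scientific notation (`str(0.000005)` gives `'5e-06'`). This sketch reproduces the plain decimal form used in this notebook. The helper name is hypothetical, and it is an assumption that plain decimal notation is the safer choice for the service:

```python
from decimal import Decimal


def build_hyperparameters(epochs: int, batch_size: int, learning_rate: float) -> dict:
    """Return the hyperparameters as strings, matching the call above."""
    # Going through Decimal avoids scientific notation such as '5e-06'.
    lr_text = format(Decimal(str(learning_rate)), "f")
    return {
        "epochCount": str(epochs),
        "batchSize": str(batch_size),
        "learningRate": lr_text,
    }


hyper_parameters = build_hyperparameters(10, 1, 0.000005)
print(hyper_parameters)
```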

In [30]:
status = bedrock.get_model_customization_job(jobIdentifier=job_name)["status"]
status

'InProgress'

Let's periodically check in on the progress.

The next cell might run for ~40 minutes.

In [None]:
import time

status = bedrock.get_model_customization_job(jobIdentifier=job_name)["status"]

while status == "InProgress":
    print(status)
    time.sleep(30)
    status = bedrock.get_model_customization_job(jobIdentifier=job_name)["status"]
    
print(status)
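
The loop above polls for as long as the status is `InProgress`. One possible variation adds a timeout so the cell cannot wait forever. The `wait_while` helper is hypothetical, and a fake status source stands in for the Bedrock call so the sketch runs without AWS access:

```python
import time


def wait_while(get_status, busy_status, poll_seconds=30, timeout_seconds=3600,
               sleep=time.sleep):
    """Poll get_status() until it stops returning busy_status or the timeout elapses."""
    waited = 0
    status = get_status()
    while status == busy_status:
        if waited >= timeout_seconds:
            raise TimeoutError(f"Still {busy_status} after {waited} seconds")
        sleep(poll_seconds)
        waited += poll_seconds
        status = get_status()
    return status


# Stand-in for the Bedrock call; the sleep is replaced by a no-op for the demo.
fake_statuses = iter(["InProgress", "InProgress", "Completed"])
final_status = wait_while(
    lambda: next(fake_statuses), "InProgress", sleep=lambda seconds: None
)
print(final_status)
```

In the notebook, `get_status` could be `lambda: bedrock.get_model_customization_job(jobIdentifier=job_name)["status"]`.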

In [None]:
completed_job = bedrock.get_model_customization_job(jobIdentifier=job_name)
completed_job

## 3. Testing the fine-tuned model

Now we can test the fine-tuned model.

In [44]:
bedrock.list_custom_models()

{'ResponseMetadata': {'RequestId': 'cdf58f7e-ca6e-4f71-a33d-c1856c79a9a8',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'date': 'Sun, 26 Nov 2023 23:00:27 GMT',
   'content-type': 'application/json',
   'content-length': '1352',
   'connection': 'keep-alive',
   'x-amzn-requestid': 'cdf58f7e-ca6e-4f71-a33d-c1856c79a9a8'},
  'RetryAttempts': 0},
 'modelSummaries': [{'modelArn': 'arn:aws:bedrock:us-east-1:079002598131:custom-model/meta.llama2-13b-v1:0:4k/090h29lu0qu6',
   'modelName': 'custom-llama2-1701035879',
   'creationTime': datetime.datetime(2023, 11, 26, 21, 57, 59, 827000, tzinfo=tzlocal()),
   'baseModelArn': 'arn:aws:bedrock:us-east-1::foundation-model/meta.llama2-13b-v1:0:4k',
   'baseModelName': ''},
  {'modelArn': 'arn:aws:bedrock:us-east-1:079002598131:custom-model/meta.llama2-13b-v1:0:4k/ylbb90zehwl3',
   'modelName': 'custom-llama2-1701028200',
   'creationTime': datetime.datetime(2023, 11, 26, 19, 50, 1, 477000, tzinfo=tzlocal()),
   'baseModelArn': 'arn:aws:bedrock:us-ea

In [45]:
for job in bedrock.list_model_customization_jobs()["modelCustomizationJobSummaries"]:
    print("-----\n" + "jobArn: " + job["jobArn"] + "\njobName: " + job["jobName"] + "\nstatus: " + job["status"] + "\ncustomModelName: " + job["customModelName"])

-----
jobArn: arn:aws:bedrock:us-east-1:079002598131:model-customization-job/meta.llama2-13b-v1:0:4k/nn382igms8q1
jobName: llama2-1701035879
status: Completed
customModelName: custom-llama2-1701035879
-----
jobArn: arn:aws:bedrock:us-east-1:079002598131:model-customization-job/cohere.command-light-text-v14:7:4k/v8qgvki9ae1y
jobName: cohere-1701034305
status: Failed
customModelName: custom-cohere-1701034305
-----
jobArn: arn:aws:bedrock:us-east-1:079002598131:model-customization-job/meta.llama2-13b-v1:0:4k/4nad2l4z3gq0
jobName: llama2-1701028200
status: Completed
customModelName: custom-llama2-1701028200
-----
jobArn: arn:aws:bedrock:us-east-1:079002598131:model-customization-job/cohere.command-light-text-v14:7:4k/8ad33o2e2ozn
jobName: cohere-1701028147
status: Failed
customModelName: custom-cohere-1701028147
-----
jobArn: arn:aws:bedrock:us-east-1:079002598131:model-customization-job/meta.llama2-13b-v1:0:4k/en8raawbsg7e
jobName: llama2-1701026490
status: Stopped
customModelName: custom

## GetCustomModel

In [46]:
bedrock.get_custom_model(modelIdentifier=custom_model_name)

{'ResponseMetadata': {'RequestId': 'fa3efca0-3f15-4d23-b9a2-1a7b848a56e2',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'date': 'Sun, 26 Nov 2023 23:00:30 GMT',
   'content-type': 'application/json',
   'content-length': '849',
   'connection': 'keep-alive',
   'x-amzn-requestid': 'fa3efca0-3f15-4d23-b9a2-1a7b848a56e2'},
  'RetryAttempts': 0},
 'modelArn': 'arn:aws:bedrock:us-east-1:079002598131:custom-model/meta.llama2-13b-v1:0:4k/090h29lu0qu6',
 'modelName': 'custom-llama2-1701035879',
 'jobArn': 'arn:aws:bedrock:us-east-1:079002598131:model-customization-job/meta.llama2-13b-v1:0:4k/nn382igms8q1',
 'baseModelArn': 'arn:aws:bedrock:us-east-1::foundation-model/meta.llama2-13b-v1:0:4k',
 'hyperParameters': {'batchSize': '1',
  'epochCount': '10',
  'learningRate': '0.000005'},
 'trainingDataConfig': {'s3Uri': 's3://sagemaker-us-east-1-079002598131/bedrock/finetuning/train-question-answer.jsonl'},
 'outputDataConfig': {'s3Uri': 's3://sagemaker-us-east-1-079002598131/bedrock/finetuning/outp

In [47]:
custom_model_arn = bedrock.get_custom_model(modelIdentifier=custom_model_name)['modelArn']
custom_model_arn

'arn:aws:bedrock:us-east-1:079002598131:custom-model/meta.llama2-13b-v1:0:4k/090h29lu0qu6'

In [48]:
base_model_arn = bedrock.get_custom_model(modelIdentifier=custom_model_name)['baseModelArn']
base_model_arn

'arn:aws:bedrock:us-east-1::foundation-model/meta.llama2-13b-v1:0:4k'

> **Note:** To invoke custom models, you first need to create a provisioned throughput resource and then make requests using that resource.

In [49]:
provisioned_model_name = "{}-provisioned".format(custom_model_name)
provisioned_model_name

'custom-llama2-1701035879-provisioned'

> **Note:** The SDK currently supports only 1-month and 6-month commitment terms. To test with the no-commitment option, purchase it manually in the Bedrock console.

In [None]:
# bedrock.create_provisioned_model_throughput(
#     modelUnits=1,
#     commitmentDuration="OneMonth",  # Note: the SDK is currently missing a no-commitment option
#     provisionedModelName=provisioned_model_name,
#     modelId=custom_model_arn,  # provision the fine-tuned model, not the base model
# )

## ListProvisionedModelThroughputs

In [50]:
bedrock.list_provisioned_model_throughputs()["provisionedModelSummaries"]

[{'provisionedModelName': 'custom-llama2-1701035879-provisioned',
  'provisionedModelArn': 'arn:aws:bedrock:us-east-1:079002598131:provisioned-model/qicn3gbfoq04',
  'modelArn': 'arn:aws:bedrock:us-east-1:079002598131:custom-model/meta.llama2-13b-v1:0:4k/090h29lu0qu6',
  'desiredModelArn': 'arn:aws:bedrock:us-east-1:079002598131:custom-model/meta.llama2-13b-v1:0:4k/090h29lu0qu6',
  'foundationModelArn': 'arn:aws:bedrock:us-east-1::foundation-model/meta.llama2-13b-v1:0:4k',
  'modelUnits': 0,
  'desiredModelUnits': 1,
  'status': 'Creating',
  'creationTime': datetime.datetime(2023, 11, 26, 23, 0, 1, 180000, tzinfo=tzlocal()),
  'lastModifiedTime': datetime.datetime(2023, 11, 26, 23, 0, 16, 34000, tzinfo=tzlocal())},
 {'provisionedModelName': 'custom-llama2-1700891164-provisioned',
  'provisionedModelArn': 'arn:aws:bedrock:us-east-1:079002598131:provisioned-model/uqh842ttyg89',
  'modelArn': 'arn:aws:bedrock:us-east-1:079002598131:custom-model/meta.llama2-13b-v1:0:4k/9ntzuvjm70pa',
  'd

## GetProvisionedModelThroughput

In [None]:
# import boto3
# import json
# import os
# import sys

# from utils import bedrock, print_ww

# bedrock_admin = bedrock.get_bedrock_client(
#     assumed_role=os.environ.get("BEDROCK_ASSUME_ROLE", None),
#     region=os.environ.get("AWS_DEFAULT_REGION", None),
#     runtime=False  # Needed for control plane
# )

# bedrock_runtime = bedrock.get_bedrock_client(
#     assumed_role=os.environ.get("BEDROCK_ASSUME_ROLE", None),
#     region=os.environ.get("AWS_DEFAULT_REGION", None),
#     runtime=True  # Needed for data plane (inference)
# )

In [51]:
#provisioned_model_name = "<YOUR_PROVISIONED_MODEL_NAME>" # e.g. custom-titan-1698257909-provisioned
#provisioned_model_name = "custom-titan-1700411048-provisioned" 
#provisioned_model_name = "custom-llama2-1701028200-provisioned"

In [52]:
provisioned_model_arn = bedrock.get_provisioned_model_throughput(
     provisionedModelId=provisioned_model_name)["provisionedModelArn"]
provisioned_model_arn

'arn:aws:bedrock:us-east-1:079002598131:provisioned-model/qicn3gbfoq04'

In [53]:
deployment_status = bedrock.get_provisioned_model_throughput(
    provisionedModelId=provisioned_model_name)["status"]
deployment_status

'Creating'

The next cell might run for ~10 minutes.

In [54]:
import time

deployment_status = bedrock.get_provisioned_model_throughput(
    provisionedModelId=provisioned_model_name)["status"]

while deployment_status == "Creating":
    
    print(deployment_status)
    time.sleep(30)
    deployment_status = bedrock.get_provisioned_model_throughput(
        provisionedModelId=provisioned_model_name)["status"]  
    
print(deployment_status)

Creating
Creating
Creating
Creating
Creating
Creating
Creating
Creating
Creating
Creating
Creating
Creating
Creating
Creating
Creating
Creating
Creating
Creating
Creating
Creating
InService


# Qualitative Results with Zero Shot Inference BEFORE and AFTER Fine-Tuning

As with many GenAI applications, a qualitative approach, where you ask yourself "is my model behaving the way it is supposed to?", is usually a good starting point. In the example below, we send the same question to the base chat model and to the fine-tuned model, so you can compare how closely each response matches the style of the question-answer pairs used for training.

## Before Fine-Tuning

In [55]:
prompt = "Answer the following question:\n\nHow can prompt engineering go wrong?\n\nAnswer: "

In [56]:
chat_base_model_id = "meta.llama2-13b-chat-v1"

In [57]:
body = {
    "prompt": prompt,
    "temperature": 0.5,
    "top_p": 0.9,
    "max_gen_len": 512,
}

response = bedrock_runtime.invoke_model(
    modelId=chat_base_model_id, 
    body=json.dumps(body)
)

response_body = response["body"].read().decode('utf8')
print(json.loads(response_body)["generation"])

Prompt engineering, like any other AI technique, can go wrong if not done carefully and responsibly. Here are some ways in which prompt engineering can go wrong:

1. Biased prompts: If the prompts are biased or discriminatory, the AI model may learn to replicate and even amplify these biases. This can lead to unfair or discriminatory outcomes, which can have serious consequences in areas like hiring, lending, or criminal justice.
2. Misleading prompts: If the prompts are misleading or inaccurate, the AI model may learn to generate incorrect or misleading responses. This can be particularly problematic in areas like healthcare, finance, or legal advice, where accurate information is critical.
3. Overly broad prompts: If the prompts are too broad or open-ended, the AI model may struggle to generate relevant or useful responses. This can lead to wasted time and resources, and may not provide the desired results.
4. Overly narrow prompts: If the prompts are too narrow or specific, the AI m

## After Fine-Tuning

In [58]:
body = {
    "prompt": prompt,
    "temperature": 0.5,
    "top_p": 0.9,
    "max_gen_len": 512,
}

response = bedrock_runtime.invoke_model(
    modelId=provisioned_model_arn, 
    body=json.dumps(body)
)

response_body = response["body"].read().decode('utf8')
print(json.loads(response_body)["generation"])

Prompt engineering can go wrong if the model is misaligned, leading to undesired responses.
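
Both cells above build the same request body and read the response in the same way, so the shared steps can be factored into two small helpers. This is only a sketch: the helper names are hypothetical, and a fake response object stands in for the `invoke_model` result so it runs without calling Bedrock:

```python
import io
import json


def build_llama_body(prompt, temperature=0.5, top_p=0.9, max_gen_len=512):
    """Serialize the request body with the same fields used in the cells above."""
    return json.dumps({
        "prompt": prompt,
        "temperature": temperature,
        "top_p": top_p,
        "max_gen_len": max_gen_len,
    })


def read_generation(response):
    """Extract the 'generation' text from an invoke_model-style response."""
    payload = response["body"].read().decode("utf8")
    return json.loads(payload)["generation"]


# Fake response: a dict whose "body" is a readable byte stream.
fake_response = {
    "body": io.BytesIO(json.dumps({"generation": "example text"}).encode("utf8"))
}
generation = read_generation(fake_response)
request_body = build_llama_body(
    "Answer the following question:\n\nHow can prompt engineering go wrong?\n\nAnswer: "
)
print(generation)
```

With these helpers, the comparison becomes two calls of the form `read_generation(bedrock_runtime.invoke_model(modelId=..., body=build_llama_body(prompt)))`, one with `chat_base_model_id` and one with `provisioned_model_arn`.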


## Delete Provisioned Throughput

When you're done testing, you can delete the Provisioned Throughput to stop incurring charges.

In [None]:
# bedrock.delete_provisioned_model_throughput(
#     provisionedModelId = provisioned_model_name
# )