# Finetune and deploy Cohere Command Model from AWS Bedrock

This sample notebook shows you how to finetune and deploy Cohere Command models using Amazon Bedrock.

> **Note**: This is a reference notebook. It will not run until you make the changes suggested in the notebook.

## Pre-requisites:
1. **Note**: This notebook contains elements that render correctly in the Jupyter interface. Open this notebook from an Amazon SageMaker Notebook Instance or Amazon SageMaker Studio.
2. Ensure that you have an IAM role set up for [Bedrock model customization](https://docs.aws.amazon.com/bedrock/latest/userguide/model-customization-iam-role.html).

## Contents:
1. [Subscribe to Amazon Bedrock](#1.-Subscribe-to-Amazon-Bedrock)
2. [Run the model customization job](#2.-Run-the-model-customization-job)
   1. [Upload training data](#A.-Upload-training-data)
   2. [Finetune models on uploaded data](#B.-Finetune-models-on-uploaded-data)
   3. [Wait for job to complete](#C.-Wait-for-job-to-complete)
3. [Create an endpoint for inference](#3.-Create-an-endpoint-for-inference)
   1. [Provision model throughput](#A.-Provision-model-throughput)
   2. [Perform real-time inference](#B.-Perform-real-time-inference)
4. [Clean-up](#4.-Clean-up)
   1. [Delete the endpoint](#A.-Delete-the-endpoint)

## Usage instructions
You can run this notebook one cell at a time by pressing Shift+Enter in each cell.

## 1. Subscribe to Amazon Bedrock

Follow the instructions in the [Amazon Bedrock](https://console.aws.amazon.com/bedrock) console and [add access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html) to the Cohere Command model.

## 2. Run the model customization job

In [None]:
!pip install --upgrade setuptools==69.5.1 cohere-aws
# if you upgrade the package, you need to restart the kernel

import cohere_aws
import boto3

### A. Upload training data

Choose a directory on S3 to store the training data:

In [None]:
s3_bucket_name = "finetune-data" # bucket where data should be uploaded to, your bedrock model customization IAM role should also have access to this bucket
s3_train_data_path = "generative/train.jsonl" # the path where train data will be stored
s3_eval_data_path = "generative/eval.jsonl" # the path where eval data will be stored (optional)

Upload sample training data and optional eval data to S3:

In [None]:
s3 = boto3.client('s3')

# upload example data to s3
s3.upload_file("../examples/sample_generative_data.jsonl", s3_bucket_name, s3_train_data_path)
s3.upload_file("../examples/sample_generative_data_eval.jsonl", s3_bucket_name, s3_eval_data_path) # (optional)
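Before uploading your own data, it can help to check the file locally, since a malformed record may only surface after the customization job has started. The sketch below assumes each line is a JSON object with non-empty `prompt` and `completion` strings; that schema and the helper name `find_jsonl_problems` are assumptions for illustration, so confirm the expected format against the Bedrock documentation for your base model.

```python
import json

# Assumed schema: one JSON object per line with these string fields.
REQUIRED_KEYS = ("prompt", "completion")


def find_jsonl_problems(lines):
    """Return (line_number, message) pairs for records that look malformed."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        if not raw.strip():
            problems.append((number, "blank line"))
            continue
        try:
            record = json.loads(raw)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "record is not a JSON object"))
            continue
        for key in REQUIRED_KEYS:
            value = record.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append((number, f"missing or empty '{key}'"))
    return problems


# Small in-memory sample; an open file object can be passed the same way.
sample = [
    '{"prompt": "Hello", "completion": "Hi there"}',
    '{"prompt": "No completion here"}',
    "not json",
]
problems = find_jsonl_problems(sample)
for number, message in problems:
    print(f"line {number}: {message}")
```

To check a real file, pass the open file handle, for example `find_jsonl_problems(open("../examples/sample_generative_data.jsonl"))`, and upload only when the returned list is empty.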

### B. Finetune models on uploaded data

Specify a directory on S3 where finetuned models should be stored:

In [None]:
s3_models_dir = "s3://finetuned-models/" # where the models will be saved; replace with a bucket you own (S3 bucket names cannot contain underscores)

Create Cohere client:

In [None]:
region = boto3.Session().region_name
co = cohere_aws.Client(mode=cohere_aws.Mode.BEDROCK, region_name=region)

Create the fine-tuning job:
> **Note**: Update the role ARN with the role created in step 2 of the pre-requisites

In [None]:
train_data_url = f"s3://{s3_bucket_name}/{s3_train_data_path}"
eval_data_url = f"s3://{s3_bucket_name}/{s3_eval_data_path}"

job_id = co.create_finetune(
    name="finetuned-model",
    base_model_id="cohere.command-text-v14:7:4k", 
    train_data=train_data_url,
    s3_models_dir=s3_models_dir,
    eval_data=eval_data_url, 
    role="arn:aws:iam::<ACCOUNT_ID>:role/service-role/<ROLE_NAME>"
)
print(job_id)
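Because the role ARN above contains placeholders, a quick format check can catch a value that was never filled in before the job is submitted. This is a minimal sketch: the helper name `looks_like_role_arn` is hypothetical, and the pattern only checks the general shape of an IAM role ARN, not whether the role exists or has the right permissions.

```python
import re

# General shape: arn:<partition>:iam::<12-digit account id>:role/<optional path/><role name>
ROLE_ARN_PATTERN = re.compile(r"arn:aws[a-z-]*:iam::\d{12}:role/[\w+=,.@/-]+")


def looks_like_role_arn(arn):
    """Return True if the string has the general shape of an IAM role ARN."""
    return ROLE_ARN_PATTERN.fullmatch(arn) is not None


placeholder = "arn:aws:iam::<ACCOUNT_ID>:role/service-role/<ROLE_NAME>"
filled_in = "arn:aws:iam::123456789012:role/service-role/ExampleBedrockRole"  # made-up example

print(looks_like_role_arn(placeholder))
print(looks_like_role_arn(filled_in))
```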

### C. Wait for job to complete

> **Note**: This job may take a while to complete. If it does not complete within the timeout, the job keeps running and you may need to wait longer.

In [None]:
model_id = co.wait_for_finetune_job(job_id)
print(model_id)
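If the wait times out, the job status can be checked directly with the boto3 Bedrock client's `get_model_customization_job` call. The sketch below is one way to do that; `summarize_job` is a hypothetical helper, `StubBedrockClient` is a stand-in so the example runs without AWS access, and it assumes the `job_id` returned above is accepted as the `jobIdentifier` (a job name or ARN).

```python
class StubBedrockClient:
    """Stand-in for boto3.client("bedrock") so this sketch runs offline."""

    def __init__(self, response):
        self._response = response

    def get_model_customization_job(self, jobIdentifier):
        return self._response


def summarize_job(bedrock_client, job_id):
    """Return (status, detail): the model ARN on success, the failure message on failure."""
    response = bedrock_client.get_model_customization_job(jobIdentifier=job_id)
    status = response["status"]
    if status == "Completed":
        return status, response.get("outputModelArn")
    if status == "Failed":
        return status, response.get("failureMessage")
    # InProgress, Stopping, or Stopped: nothing further to report yet.
    return status, None


stub = StubBedrockClient({"status": "Failed", "failureMessage": "example failure"})
status, detail = summarize_job(stub, "hypothetical-job-id")
print(status, detail)
```

With real credentials, replace the stub with `boto3.client("bedrock", region_name=region)` and pass the actual `job_id`.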

## 3. Create an endpoint for inference

### A. Provision model throughput

The Cohere AWS SDK provides a built-in method for provisioning the throughput needed to run inference on finetuned models.


In [None]:
model_arn = co.provision_throughput(model_id=model_id, name="custom-model-throughput", model_units=1)
print(model_arn)
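Provisioned throughput can take some time to become usable, and it is not stated here whether `provision_throughput` waits for that. As a hedged sketch, the loop below polls the boto3 Bedrock client's `get_provisioned_model_throughput` call until the status is `InService`. The helper name `wait_until_in_service`, the polling interval, and the timeout are illustrative assumptions, and the stub client only exists so the example runs offline.

```python
import time


class StubBedrockClient:
    """Stand-in for boto3.client("bedrock"); returns a scripted sequence of statuses."""

    def __init__(self, statuses):
        self._statuses = list(statuses)
        self.calls = 0

    def get_provisioned_model_throughput(self, provisionedModelId):
        self.calls += 1
        return {"status": self._statuses.pop(0)}


def wait_until_in_service(bedrock_client, provisioned_model_id,
                          poll_seconds=60, timeout_seconds=3600, sleep=time.sleep):
    """Poll until the provisioned throughput is InService; raise on failure or timeout."""
    waited = 0
    while True:
        response = bedrock_client.get_provisioned_model_throughput(
            provisionedModelId=provisioned_model_id
        )
        status = response["status"]
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError(response.get("failureMessage", "provisioning failed"))
        if waited >= timeout_seconds:
            raise TimeoutError(f"still {status} after {waited} seconds")
        sleep(poll_seconds)
        waited += poll_seconds


stub = StubBedrockClient(["Creating", "Creating", "InService"])
# A no-op sleep keeps the offline demo instant.
final_status = wait_until_in_service(stub, "hypothetical-arn", sleep=lambda seconds: None)
print(final_status, stub.calls)
```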

### B. Perform real-time inference

Now you can send inference requests to the finetuned model by passing the provisioned model ARN:

In [None]:
result = co.generate(prompt="hello", model_id=model_arn)
print(result)
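The same request can also be sent without the Cohere SDK, through the boto3 `bedrock-runtime` client's `invoke_model` call. The sketch below assumes the Cohere Command request body (`prompt`, `max_tokens`) and response shape (`generations[0].text`) used on Bedrock; `generate_text` is a hypothetical helper and the stub client stands in for the real one so the example runs offline.

```python
import io
import json


class StubRuntimeClient:
    """Stand-in for boto3.client("bedrock-runtime"); echoes the prompt back."""

    def invoke_model(self, modelId, body, contentType, accept):
        prompt = json.loads(body)["prompt"]
        payload = {"generations": [{"text": f"echo: {prompt}"}]}
        # The real client returns a streaming body; BytesIO offers the same read().
        return {"body": io.BytesIO(json.dumps(payload).encode("utf-8"))}


def generate_text(runtime_client, model_arn, prompt, max_tokens=100):
    """Send a prompt to the provisioned model and return the first generation's text."""
    body = json.dumps({"prompt": prompt, "max_tokens": max_tokens})
    response = runtime_client.invoke_model(
        modelId=model_arn,
        body=body,
        contentType="application/json",
        accept="application/json",
    )
    payload = json.loads(response["body"].read())
    return payload["generations"][0]["text"]


text = generate_text(StubRuntimeClient(), "hypothetical-model-arn", "hello")
print(text)
```

With real credentials, pass `boto3.client("bedrock-runtime", region_name=region)` and the `model_arn` from the previous step.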

## 4. Clean-up

### A. Delete the endpoint

After you've successfully performed inference, you can delete the deployed endpoint to avoid being charged continuously.

In [None]:
bedrock = boto3.client("bedrock", region_name=region)
# boto3 client methods accept keyword arguments only
bedrock.delete_provisioned_model_throughput(provisionedModelId=model_arn)
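To confirm that nothing is left running, you can list the provisioned throughputs that remain in the account. This is a minimal sketch using the boto3 Bedrock client's `list_provisioned_model_throughputs` call and following `nextToken` across pages; `list_provisioned_models` is a hypothetical helper and the stub client only simulates two pages of results.

```python
class StubBedrockClient:
    """Stand-in for boto3.client("bedrock"); serves two pages of summaries."""

    def list_provisioned_model_throughputs(self, **kwargs):
        if kwargs.get("nextToken") == "page-2":
            return {"provisionedModelSummaries": [
                {"provisionedModelName": "second", "status": "Creating"},
            ]}
        return {
            "provisionedModelSummaries": [
                {"provisionedModelName": "first", "status": "InService"},
            ],
            "nextToken": "page-2",
        }


def list_provisioned_models(bedrock_client):
    """Return (name, status) for every provisioned throughput, across all pages."""
    summaries = []
    kwargs = {}
    while True:
        response = bedrock_client.list_provisioned_model_throughputs(**kwargs)
        summaries.extend(response.get("provisionedModelSummaries", []))
        token = response.get("nextToken")
        if not token:
            break
        kwargs["nextToken"] = token
    return [(s["provisionedModelName"], s["status"]) for s in summaries]


remaining = list_provisioned_models(StubBedrockClient())
print(remaining)
```

An empty list from the real client suggests no provisioned throughput is still accruing charges in that region.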