# Chapter 13 - SageMaker JumpStart: Fine-tuning CodeLlama for Code Generation

## Overview
This notebook demonstrates how to fine-tune a CodeLlama model with Amazon SageMaker JumpStart for code generation tasks. We'll explore how to adapt the model to specific programming languages and use cases, then deploy it for automated code generation.

## Introduction

We fine-tune a CodeLlama model on programming-related instruction data using Amazon SageMaker JumpStart. The training data comes from the Dolphin Coder dataset (`cognitivecomputations/dolphin-coder`). We deploy the fine-tuned model as a SageMaker endpoint and compare its responses with those of the original model on held-out examples.

## Prerequisites

- AWS account with SageMaker access
- Appropriate permissions for JumpStart models
- SageMaker Execution role with S3 access
- Service quota for G5 instances in your AWS account (the fine-tuning job below uses `ml.g5.24xlarge`)

## Setup

### Install Required Dependencies

In [None]:
!pip install --quiet --upgrade sagemaker jmespath datasets

### Import Libraries and Select Model

In [None]:
from IPython.display import display
from ipywidgets import Dropdown
from sagemaker.jumpstart.notebook_utils import list_jumpstart_models


# Offer a dropdown of JumpStart text generation models. If the widget cannot
# be built or displayed, fall back to the default model ID in the next cell.
try:
    dropdown = Dropdown(
        options=list_jumpstart_models("search_keywords includes Text Generation"),
        value="meta-textgeneration-llama-codellama-7b",
        description="Select a JumpStart text generation model:",
        style={"description_width": "initial"},
        layout={"width": "max-content"},
    )
    display(dropdown)
except Exception:
    dropdown = None

In [None]:
if dropdown:
    model_id = dropdown.value
else:
    model_id = "meta-textgeneration-llama-codellama-7b"
model_version = "*"

## Deploy Base Model

### Initialize and Deploy Model

In [None]:
from sagemaker.jumpstart.model import JumpStartModel

model = JumpStartModel(model_id=model_id, model_version=model_version)

In [None]:
# `accept_eula=True` accepts the model's end-user license agreement (EULA).
# Review the EULA before running this cell.
predictor = model.deploy(accept_eula=True)

### Test Base Model with Example Payloads

In [None]:
example_payloads = model.retrieve_all_examples()

In [None]:
import jmespath


for payload in example_payloads:
    response = predictor.predict(payload.body)
    generated_text = jmespath.search(payload.raw_payload["output_keys"]["generated_text"], response)
    print("Input:\n", payload.body[payload.prompt_key])
    print("Output:\n", generated_text.strip())
    print("\n===============\n")

## Data Preparation

### Load and Process Dataset

In [None]:
from datasets import load_dataset


dolphin = load_dataset("cognitivecomputations/dolphin-coder", split="train")

# Split the dataset in two. With test_size=0.9, 90% of the rows are held out
# and the remaining 10% are used for training.
train_and_test_dataset = dolphin.train_test_split(test_size=0.9, seed=0)

# Write the training split to a local file for the fine-tuning job, and keep
# 10 held-out rows for the evaluation at the end.
train_and_test_dataset["train"].to_json("train.jsonl")
train_and_test_dataset["test"].select(range(10)).to_json("test.jsonl")

In [None]:
train_and_test_dataset["train"][0]
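Before uploading, it can help to confirm that an exported file is valid JSON Lines and that every record carries the fields the prompt template expects (`system_prompt`, `question`, `response`). The sketch below is a standalone illustration, not part of the JumpStart workflow: `check_jsonl` is a hypothetical helper, and the sample record is made up so the block runs without downloading the dataset.

```python
import json
import os
import tempfile

REQUIRED_KEYS = {"system_prompt", "question", "response"}


def check_jsonl(path, required_keys=REQUIRED_KEYS):
    """Return the number of records; raise ValueError on a bad or incomplete row."""
    count = 0
    with open(path, encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as err:
                raise ValueError(f"line {line_number}: invalid JSON ({err})") from err
            if not isinstance(record, dict):
                raise ValueError(f"line {line_number}: expected a JSON object")
            missing = set(required_keys) - set(record)
            if missing:
                raise ValueError(f"line {line_number}: missing keys {sorted(missing)}")
            count += 1
    return count


# Demo on a throwaway file holding one made-up record.
sample_record = {
    "system_prompt": "You are a coding assistant.",
    "question": "Reverse a string in Python.",
    "response": "def reverse(s):\n    return s[::-1]",
}
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.jsonl")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write(json.dumps(sample_record) + "\n")
    demo_count = check_jsonl(demo_path)

print(demo_count)
```

In the notebook, the same helper could be pointed at `train.jsonl` and `test.jsonl` once they have been written.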

### Create Prompt Template

In [None]:
import json

template = {
    "prompt": """{system_prompt}

### Input:
{question}
""",
    "completion": " {response}",
}
with open("template.json", "w") as f:
    json.dump(template, f)
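To see what a single training example looks like once the placeholders are filled, the template can be rendered locally. This is only a preview under stated assumptions: `render` is a hypothetical helper, the record is made up, and the actual substitution during fine-tuning is performed by the JumpStart training job, whose internals are not shown here.

```python
# Same template as above, written with explicit newlines.
template = {
    "prompt": "{system_prompt}\n\n### Input:\n{question}\n",
    "completion": " {response}",
}

# Made-up record with the same field names as the dataset rows.
record = {
    "system_prompt": "You are a coding assistant.",
    "question": "Reverse a string in Python.",
    "response": "def reverse(s):\n    return s[::-1]",
}


def render(template, record):
    """Fill the prompt and completion parts of the template from one record."""
    prompt_text = template["prompt"].format(**record)
    completion_text = template["completion"].format(**record)
    return prompt_text, completion_text


prompt_text, completion_text = render(template, record)
print(prompt_text + completion_text)
```

Every placeholder name in the template must match a column in the dataset; a mismatch raises `KeyError` in this preview.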

### Upload Training Data to S3

In [None]:
import sagemaker
from sagemaker.s3 import S3Uploader

output_bucket = sagemaker.Session().default_bucket()
local_data_file = "train.jsonl"
train_data_location = f"s3://{output_bucket}/dolphin_coder_dataset"
S3Uploader.upload(local_data_file, train_data_location)
S3Uploader.upload("template.json", train_data_location)
print(f"Training data: {train_data_location}")

## Model Fine-tuning

### Configure Hyperparameters

In [None]:
from sagemaker import hyperparameters

my_hyperparameters = hyperparameters.retrieve_default(
    model_id=model_id, model_version=model_version
)

print(my_hyperparameters)

In [None]:
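# Train for one epoch; hyperparameter values are passed as strings.
my_hyperparameters["epoch"] = "1"
# Assumption to verify against the defaults printed above: if an
# `instruction_tuned` entry is present and set to "False", training on
# prompt/completion data with template.json may require setting it to "True".
print(my_hyperparameters)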

hyperparameters.validate(
    model_id=model_id, model_version=model_version, hyperparameters=my_hyperparameters
)
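Because the cell above edits the dictionary by key, a misspelled name (for example `epochs`) would silently add a new entry instead of changing an existing one. The sketch below shows one way to guard against that locally before calling `hyperparameters.validate`. `override_hyperparameters` is a hypothetical helper, and `example_defaults` is an illustrative stand-in: the real keys and values depend on the model and version.

```python
def override_hyperparameters(defaults, **overrides):
    """Return a copy of `defaults` with `overrides` applied as strings.

    Raises KeyError for names that are not already in `defaults`.
    """
    unknown = sorted(set(overrides) - set(defaults))
    if unknown:
        raise KeyError(f"unknown hyperparameters: {unknown}")
    updated = dict(defaults)  # leave the caller's dictionary untouched
    updated.update({name: str(value) for name, value in overrides.items()})
    return updated


# Illustrative stand-in for the dictionary returned by retrieve_default().
example_defaults = {"epoch": "5", "learning_rate": "0.0001"}

example = override_hyperparameters(example_defaults, epoch=1)
print(example)
```

In the notebook this would be called as `override_hyperparameters(my_hyperparameters, epoch=1)`.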

### Initialize and Train the Model

In [None]:
from sagemaker.jumpstart.estimator import JumpStartEstimator


estimator = JumpStartEstimator(
    model_id=model_id,
    model_version=model_version,
    hyperparameters=my_hyperparameters,
    instance_type="ml.g5.24xlarge",
    # "accept_eula": "true" accepts the model's EULA for the training job.
    environment={"accept_eula": "true"},
)

estimator.fit({"training": train_data_location})

### Deploy Fine-tuned Model

In [None]:
finetuned_predictor = estimator.deploy()

## Evaluation

### Compare Original and Fine-tuned Models

In [None]:
import pandas as pd
from IPython.display import display, HTML

test_dataset = load_dataset("json", data_files="test.jsonl")["train"]
prompt_inference = template["prompt"]
inputs, ground_truth_responses, responses_before_finetuning, responses_after_finetuning = (
    [],
    [],
    [],
    [],
)


def predict_and_print(datapoint):
    # For instruction fine-tuning, we insert a special key between input and output
    input_output_demarkation_key = "\n\n### Response:\n"

    payload = {
        "inputs": prompt_inference.format(
            system_prompt=datapoint["system_prompt"], question=datapoint["question"]
        )
        + input_output_demarkation_key,
        "parameters": {"max_new_tokens": 100},
    }
    inputs.append(payload["inputs"])
    ground_truth_responses.append(datapoint["response"])
    pretrained_response = predictor.predict(payload)
    responses_before_finetuning.append(pretrained_response[0]["generated_text"])
    finetuned_response = finetuned_predictor.predict(payload)
    responses_after_finetuning.append(finetuned_response[0]["generated_text"])


try:
    for datapoint in test_dataset.select(range(5)):
        predict_and_print(datapoint)

    df = pd.DataFrame(
        {
            "Inputs": inputs,
            "Ground Truth": ground_truth_responses,
            "Response from non-finetuned model": responses_before_finetuning,
            "Response from fine-tuned model": responses_after_finetuning,
        }
    )
    display(HTML(df.to_html()))
except Exception as e:
    print(e)
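The evaluation cell indexes each response as `response[0]["generated_text"]`. The sketch below is a more defensive variant, assuming the endpoint may return either a list of dictionaries or a single dictionary, and that the generated text may begin with an echo of the prompt. Both response shapes are assumptions, and `extract_generated_text` and `strip_prompt` are hypothetical helpers.

```python
def extract_generated_text(response):
    """Return the generated string from a text-generation response.

    Assumed shapes: [{"generated_text": ...}] or {"generated_text": ...}.
    """
    if isinstance(response, list):
        if not response:
            raise ValueError("empty response list")
        response = response[0]
    if isinstance(response, dict) and "generated_text" in response:
        return response["generated_text"]
    raise ValueError(f"unexpected response shape: {type(response).__name__}")


def strip_prompt(prompt, generated):
    """Remove the prompt from the front of `generated` if it was echoed back."""
    if generated.startswith(prompt):
        return generated[len(prompt):]
    return generated


# Made-up responses in the two assumed shapes.
from_list = extract_generated_text([{"generated_text": "print('hi')"}])
from_dict = extract_generated_text({"generated_text": "print('hi')"})
trimmed = strip_prompt("### Response:\n", "### Response:\nprint('hi')")
print(from_list, from_dict, trimmed)
```

Raising on an unexpected shape makes a format change visible immediately instead of surfacing later as a confusing `KeyError` or `TypeError`.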

## Conclusion

In this notebook, we've successfully fine-tuned CodeLlama on programming instruction data using SageMaker JumpStart. The process involved:

1. Deploying a pre-trained CodeLlama model as a baseline
2. Preparing the Dolphin Coder dataset for fine-tuning
3. Configuring and executing the fine-tuning job
4. Deploying the fine-tuned model as an endpoint
5. Comparing responses from the original and fine-tuned models

The side-by-side table shows how the fine-tuned model's responses differ from the original model's on held-out prompts. With only five examples and `max_new_tokens` set to 100, treat the comparison as a qualitative check rather than a measurement of code quality. This approach can be extended to other domains by swapping out the training data and adjusting hyperparameters.

For production deployments, consider:
- Using a larger training dataset for better results
- Experimenting with different hyperparameters like learning rate and batch size
- Implementing auto-scaling for your endpoint to handle variable traffic
- Setting up monitoring to track model performance over time
- Optimizing the deployment for cost efficiency
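
As an illustration of the auto-scaling item above, the sketch below builds the two requests that Application Auto Scaling expects for a SageMaker endpoint variant: one registering the variant as a scalable target, and one attaching a target-tracking policy on invocations per instance. The endpoint name, the capacity limits, the target value, and the cooldowns are placeholder assumptions, and `AllTraffic` is assumed to be the variant name. `build_autoscaling_requests` and `apply_autoscaling` are hypothetical helpers; only `apply_autoscaling` contacts AWS.

```python
def build_autoscaling_requests(
    endpoint_name,
    variant_name="AllTraffic",  # assumed variant name
    min_capacity=1,
    max_capacity=2,
    invocations_per_instance=10.0,  # placeholder target, tune for your traffic
):
    """Build keyword arguments for register_scalable_target and put_scaling_policy."""
    resource_id = f"endpoint/{endpoint_name}/variant/{variant_name}"
    dimension = "sagemaker:variant:DesiredInstanceCount"
    target = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "MinCapacity": min_capacity,
        "MaxCapacity": max_capacity,
    }
    policy = {
        "PolicyName": f"{endpoint_name}-invocations-target",
        "ServiceNamespace": "sagemaker",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(invocations_per_instance),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
            "ScaleInCooldown": 300,  # seconds, placeholder
            "ScaleOutCooldown": 60,  # seconds, placeholder
        },
    }
    return target, policy


def apply_autoscaling(endpoint_name, **kwargs):
    """Send both requests to Application Auto Scaling (requires AWS credentials)."""
    import boto3  # imported lazily so the builder above runs without boto3

    client = boto3.client("application-autoscaling")
    target, policy = build_autoscaling_requests(endpoint_name, **kwargs)
    client.register_scalable_target(**target)
    client.put_scaling_policy(**policy)


# "my-finetuned-endpoint" is a made-up name used only for this demo.
target_request, policy_request = build_autoscaling_requests("my-finetuned-endpoint")
print(target_request["ResourceId"])
```

In this notebook the endpoint name would come from `finetuned_predictor.endpoint_name`. Separating request construction from the AWS calls keeps the configuration easy to inspect before anything is applied.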