# Chapter 13 - SageMaker JumpStart: Fine-tuning and Inference with GPT-J 6B

## Overview
This notebook demonstrates how to fine-tune and deploy the GPT-J 6B model using Amazon SageMaker JumpStart. It covers the complete workflow: model selection, fine-tuning, deployment, and inference for text generation.

## Introduction

We fine-tune the Hugging Face GPT-J 6B model on the sample SEC filing dataset hosted in the
JumpStart content bucket, so that the model better understands and generates financial text.
After fine-tuning, we deploy the model to a SageMaker endpoint for real-time inference.

## Prerequisites

- AWS account with SageMaker access
- Sufficient permissions to access JumpStart models
- IAM role with appropriate SageMaker execution permissions
- Familiarity with Python and basic machine learning concepts

## Setup

### Define the Model

In [None]:
# Specify the JumpStart model we want to fine-tune
model_id = "huggingface-textgeneration1-gpt-j-6b"

### Import Required Libraries

In [None]:
# Import necessary packages for working with SageMaker JumpStart
import json
from sagemaker.jumpstart.estimator import JumpStartEstimator
from sagemaker.jumpstart.utils import get_jumpstart_content_bucket



## Data Preparation

### Locate Training Data

In [None]:
# Sample training data is available in this bucket
data_bucket = get_jumpstart_content_bucket()
data_prefix = "training-datasets/sec_data"
# Define paths to training and validation datasets
training_dataset_s3_path = f"s3://{data_bucket}/{data_prefix}/train/"
validation_dataset_s3_path = f"s3://{data_bucket}/{data_prefix}/validation/"
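
Before launching a training job, it can help to confirm that both prefixes actually contain files. The following is a minimal sketch, assuming a client that behaves like `boto3.client("s3")` and read access to the JumpStart content bucket. `split_s3_uri` and `list_dataset_files` are hypothetical helper names, not part of the SageMaker SDK.

```python
from urllib.parse import urlparse


def split_s3_uri(s3_uri):
    """Split 's3://bucket/some/prefix/' into ('bucket', 'some/prefix/')."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {s3_uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def list_dataset_files(s3_client, s3_uri, max_keys=20):
    """Return up to max_keys object keys stored under an S3 URI.

    s3_client is expected to behave like boto3.client("s3").
    """
    bucket, prefix = split_s3_uri(s3_uri)
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix, MaxKeys=max_keys)
    # "Contents" is absent from the response when no objects match the prefix
    return [obj["Key"] for obj in response.get("Contents", [])]


# Usage in the notebook (assumes boto3 is installed and credentials are configured):
# import boto3
# s3 = boto3.client("s3")
# print(list_dataset_files(s3, training_dataset_s3_path))
# print(list_dataset_files(s3, validation_dataset_s3_path))
```

An empty list for either path suggests that the prefix is wrong or that the role cannot read the bucket.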

## Model Fine-tuning

### Configure Fine-tuning Parameters

In [None]:
# Initialize the JumpStart estimator with our desired hyperparameters
estimator = JumpStartEstimator(
    model_id=model_id,
    hyperparameters={"epoch": "3", "per_device_train_batch_size": "4"},
)
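
The estimator above sets only two hyperparameters. To see what else can be tuned, one option is to fetch the model's default hyperparameters and layer overrides on top of them explicitly. The sketch below assumes the SageMaker Python SDK's `hyperparameters.retrieve_default` call for fetching the defaults (shown only in the usage comment). `merge_hyperparameters` is a hypothetical helper, not an SDK function.

```python
def merge_hyperparameters(defaults, overrides):
    """Return defaults updated with overrides, with every value cast to str.

    Override keys that are missing from the defaults are rejected, which
    catches typos such as "epochs" when the model expects "epoch".
    """
    unknown = sorted(set(overrides) - set(defaults))
    if unknown:
        raise KeyError(f"Unknown hyperparameters: {unknown}")
    merged = {**defaults, **overrides}
    # Cast to str to match the string values used in this notebook
    return {name: str(value) for name, value in merged.items()}


# Usage in the notebook (assumes the sagemaker package and AWS credentials):
# from sagemaker import hyperparameters
# defaults = hyperparameters.retrieve_default(model_id=model_id, model_version="*")
# tuned = merge_hyperparameters(defaults, {"epoch": 3, "per_device_train_batch_size": 4})
# estimator = JumpStartEstimator(model_id=model_id, hyperparameters=tuned)
```

Printing `defaults` before merging shows which names and values the model accepts.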

### Execute Fine-tuning Job

In [None]:
# Launch the fine-tuning job, passing data to the train and validation channels
estimator.fit(
    {"train": training_dataset_s3_path, "validation": validation_dataset_s3_path},
    logs=True,
)

In [None]:
model_id

## Model Deployment

### Deploy to SageMaker Endpoint

In [None]:
# Deploy our fine-tuned model to a real-time endpoint
predictor = estimator.deploy()

## Inference

### Test with Sample Input

In [None]:
# Create a sample payload to test the model
payload = {"inputs": "This Form 10-K report shows that", "parameters": {"max_new_tokens": 400}}
predictor.predict(payload)
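
For repeated prompts, the payload construction and response handling can be wrapped in a small function. This is a minimal sketch, assuming the response has the list-of-dicts shape that this notebook indexes with `result[0]['generated_text']`. `generate_text` is a hypothetical helper name.

```python
def generate_text(predictor, prompt, max_new_tokens=400):
    """Send one prompt to the endpoint and return the generated string."""
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    result = predictor.predict(payload)
    # Assumed response shape: [{"generated_text": "..."}]
    is_valid = (
        isinstance(result, list)
        and len(result) > 0
        and isinstance(result[0], dict)
        and "generated_text" in result[0]
    )
    if not is_valid:
        raise ValueError(f"Unexpected response format: {result!r}")
    return result[0]["generated_text"]


# Usage in the notebook:
# print(generate_text(predictor, "This Form 10-K report shows that"))
```

Raising on an unexpected shape makes a change in the response format visible immediately instead of surfacing later as an obscure indexing error.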

### Alternative: Using Existing Endpoint

In [None]:
# If you already have a deployed endpoint, you can invoke it directly with boto3
import json

import boto3

# Replace with the name of your own endpoint
endpoint_name = "jumpstart-dft-hf-textgeneration1-gp-20240712-023109"
payload = {"inputs": "This Form 10-K report shows that", "parameters": {"max_new_tokens": 400}}

client = boto3.client("sagemaker-runtime")
# Invoke the endpoint with our test prompt
response = client.invoke_endpoint(
    EndpointName=endpoint_name,
    Body=json.dumps(payload),
    ContentType="application/json",
)
# Parse the JSON response body
result = json.loads(response["Body"].read().decode())


In [None]:
print(result[0]['generated_text'])
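
A real-time endpoint accrues charges for as long as it is running, so it is worth deleting once testing is finished. The following is a minimal sketch, assuming the `predictor` returned by `estimator.deploy()` exposes `delete_model()` and `delete_endpoint()`, as SageMaker `Predictor` objects do. `cleanup_endpoint` is a hypothetical helper name.

```python
def cleanup_endpoint(predictor):
    """Delete the model and then the endpoint behind a predictor.

    Returns a list of (step, error) tuples for any step that failed, so one
    failure does not prevent the other deletion from being attempted.
    """
    failures = []
    # Delete the model first, while the endpoint still exists to look it up
    for step in ("delete_model", "delete_endpoint"):
        try:
            getattr(predictor, step)()
        except Exception as error:  # record the failure and try the next step
            failures.append((step, error))
    return failures


# Usage in the notebook, after all inference calls are done:
# failures = cleanup_endpoint(predictor)
# print(failures or "Model and endpoint deleted")
```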

## Conclusion

In this notebook, we've successfully fine-tuned GPT-J 6B on financial SEC data using SageMaker JumpStart.
This demonstrates how to leverage pre-trained models and customize them for specific domains without
requiring extensive infrastructure management or deep machine learning expertise.

Key accomplishments:
1. Configured and initiated a fine-tuning job on a large language model (6B parameters)
2. Used domain-specific financial data to adapt the model to SEC filings
3. Deployed the model to a SageMaker endpoint for real-time inference
4. Tested the model with sample financial text prompts

This approach can be extended to other domains by swapping out the training data and adjusting
hyperparameters. The deployed model can be integrated into applications that require financial
text generation, summarization, or completion capabilities.

For production deployments, consider:
- Optimizing endpoint configurations for cost and performance
- Implementing proper monitoring and logging
- Setting up auto-scaling to handle variable traffic patterns
- Adding proper request validation and error handling
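
As one illustration of the request validation item above, the sketch below checks a payload before it is sent to the endpoint. The function name `validate_payload` and both limits are assumptions chosen for illustration, not values required by the model or by SageMaker.

```python
import json


def validate_payload(payload, max_prompt_chars=4000, max_new_tokens_limit=512):
    """Check a request dict and return its JSON-encoded body.

    Raises ValueError if the payload is malformed. The default limits are
    illustrative assumptions and should be tuned for the application.
    """
    if not isinstance(payload, dict):
        raise ValueError("payload must be a dict")

    prompt = payload.get("inputs")
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("'inputs' must be a non-empty string")
    if len(prompt) > max_prompt_chars:
        raise ValueError(f"'inputs' exceeds {max_prompt_chars} characters")

    parameters = payload.get("parameters", {})
    if not isinstance(parameters, dict):
        raise ValueError("'parameters' must be a dict")

    max_new_tokens = parameters.get("max_new_tokens")
    if max_new_tokens is not None:
        # bool is a subclass of int, so exclude it explicitly
        if isinstance(max_new_tokens, bool) or not isinstance(max_new_tokens, int):
            raise ValueError("'max_new_tokens' must be an integer")
        if not 1 <= max_new_tokens <= max_new_tokens_limit:
            raise ValueError(
                f"'max_new_tokens' must be between 1 and {max_new_tokens_limit}"
            )

    return json.dumps(payload)


# Usage with the boto3 client shown earlier in this notebook:
# body = validate_payload(payload)
# response = client.invoke_endpoint(
#     EndpointName=endpoint_name, Body=body, ContentType="application/json"
# )
```

Rejecting oversized prompts and token counts before invocation avoids paying for requests that would fail or run far longer than intended.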
