<p style="font-size: 18px;">
This notebook provides an example of how to fine-tune an LLM for item classification using the Fireworks API. In this example, we will classify items ordered by an oil facility.<br><br>
<b>Steps:</b><br>
0. Configure Environment<br>
1. Curate Dataset<br>
2. Train Model<br>
3. Evaluate Results<br><br>
</p>

<p style="font-size: 18px;">
**********************<br><b>STEP 0: Configure Environment</b><br><br>
Install necessary libraries, import them, and configure Fireworks API access.
</p>

In [1]:
# Install the required Python libraries
!pip install fireworks-ai pandas scikit-learn

In [17]:
# BEFORE PROCEEDING, MAKE SURE YOU FOLLOW THE INSTRUCTIONS TO INSTALL FIRECTL, BASED ON YOUR ARCHITECTURE 
# https://readme.fireworks.ai/reference/installation-1

# sign in to firectl
!firectl signin

In [2]:
import json
import os

import fireworks.client as fc
from fireworks.client import Fireworks
import pandas as pd
from sklearn.model_selection import train_test_split

In [3]:
# Set this to your Fireworks account id
account_id = 'sdkramer10-5e98cb'

# Uncomment the line below and set the value to your account's API key
# fc.api_key = '<API KEY>'

client = Fireworks()

<p style="font-size: 18px;">
**********************<br><b>STEP 1: Curate Dataset</b><br><br>
Transform a CSV dataset containing item descriptions and classes into the format required for fine-tuning, and upload the dataset to Fireworks. For more details on the dataset requirements, refer to the <a href="https://docs.fireworks.ai/fine-tuning/fine-tuning-models#conversation">Fireworks guide</a>.
</p>

In [4]:
# Load the csv dataset, which contains the item descriptions (under column "Item") and their correct classification (under column "Class").
item_class_csv_url = 'item_classification.csv'
df = pd.read_csv(item_class_csv_url)

# Perform a train/test split of the data. Refer back to the "Training Evaluation" slides from the "Train Your Model" class 
# (https://gamma.app/docs/Train-Your-Model-fotbcos9eduoh0v) on why it's important to have a test set
x_train, x_test, y_train, y_test = train_test_split(df['Item'], df['Class'], test_size=0.2, random_state=42)

In [5]:
# As discussed during the first workshop, the system message contains instructions for the LLM to perform a task. 
# For classification tasks, I recommend that you list the possible classes within the system message. 
# For base models, this ensures that the LLM responds with the correct class names.
# For fine-tuned models, while including the class names isn't technically required, including them in the system message 
# helps the LLM learn more quickly.
possible_classes = '\n'.join(list(y_train.unique()))
system_message = f'''Classify the following item. Respond with ONLY the name of the appropriate class from the list below.

CLASSES:
{possible_classes}'''

print(system_message)

Classify the following item. Respond with ONLY the name of the appropriate class from the list below.

CLASSES:
Pressure Safety Device
Piping
Structure
Pressure Vessel (VIE)
FU Items
Non Structural Tank
Campaign
Lifting 
Corrosion Monitoring
Pressure Vessel (VII)
Lifting
Flare TIP
Flame Arrestor
Flare Tip
Intelligent Pigging
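
<p style="font-size: 18px;">
The printed list contains entries that differ only by trailing whitespace or letter case ("Lifting " and "Lifting", "Flare TIP" and "Flare Tip"). If these are meant to be the same class, which is an assumption about this dataset, the labels could be cleaned before the split. The sketch below is a minimal illustration; <code>normalize_labels</code> is a hypothetical helper, not part of the original notebook.
</p>

```python
def normalize_labels(labels):
    """Collapse class names that differ only by whitespace or letter case.

    The first spelling seen for each class is kept as the canonical one.
    """
    canonical = {}  # case-folded key -> first spelling seen
    cleaned = []
    for label in labels:
        # Trim both ends and squeeze internal runs of whitespace
        stripped = " ".join(str(label).split())
        key = stripped.casefold()
        canonical.setdefault(key, stripped)
        cleaned.append(canonical[key])
    return cleaned


# Hypothetical sample mirroring the duplicates visible in the output above
raw_labels = ["Lifting ", "Lifting", "Flare TIP", "Flare Tip", "Piping"]
print(normalize_labels(raw_labels))
```

<p style="font-size: 18px;">
In this notebook, the same idea could be applied right after loading the CSV, for example <code>df['Class'] = normalize_labels(df['Class'])</code>, so that each class appears once in the system message and in the training labels.
</p>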


In [31]:
# Transform the data into the format required by Fireworks for fine-tuning.
json_objs = list()
for i in range(len(x_train)):
    msg = {"messages": [
        {"role": "system", "content": system_message}, 
        {"role": "user", "content": x_train.iloc[i]}, 
        {"role": "assistant", "content": y_train.iloc[i]}
    ]}  

    json_objs.append(msg)

dataset_file_name = 'item_classification.jsonl'
dataset_name = 'item-class-v1'

with open(dataset_file_name, 'w') as f:
    for obj in json_objs:
        json.dump(obj, f)
        f.write('\n')
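
<p style="font-size: 18px;">
Before uploading, it can help to confirm that every line of the JSONL file has the shape written by the cell above: one JSON object per line with a system, user, and assistant message. The sketch below checks only that shape; it is not a complete implementation of the Fireworks dataset requirements, and <code>validate_jsonl_lines</code> and <code>EXPECTED_ROLES</code> are hypothetical names.
</p>

```python
import json

# Role order produced by this notebook (an assumption about what to enforce)
EXPECTED_ROLES = ["system", "user", "assistant"]


def validate_jsonl_lines(lines):
    """Return a list of (line_number, problem) tuples; an empty list means no problems."""
    problems = []
    for line_no, line in enumerate(lines, start=1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((line_no, f"invalid JSON: {exc.msg}"))
            continue

        messages = record.get("messages") if isinstance(record, dict) else None
        if not isinstance(messages, list):
            problems.append((line_no, "missing 'messages' list"))
            continue

        roles = [m.get("role") if isinstance(m, dict) else None for m in messages]
        if roles != EXPECTED_ROLES:
            problems.append((line_no, f"unexpected roles: {roles}"))
            continue

        # Every message needs non-empty string content
        for m in messages:
            content = m.get("content")
            if not isinstance(content, str) or not content.strip():
                problems.append((line_no, f"empty or non-string content for role {m['role']}"))
                break
    return problems


# Small in-memory example: one valid line and one line missing the assistant turn
good = json.dumps({"messages": [
    {"role": "system", "content": "Classify the item."},
    {"role": "user", "content": "6 inch gate valve"},
    {"role": "assistant", "content": "Piping"},
]})
bad = json.dumps({"messages": [{"role": "user", "content": "6 inch gate valve"}]})
print(validate_jsonl_lines([good, bad]))
```

<p style="font-size: 18px;">
To check the file written above, the function could be called as <code>validate_jsonl_lines(open(dataset_file_name))</code>, since iterating over a file object yields one line at a time.
</p>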

In [4]:
# Upload the dataset to Fireworks
!firectl create dataset {dataset_name} {dataset_file_name}

<p style="font-size: 18px;">
**********************<br><b>STEP 2: Train Model</b><br><br>
Create a fine-tuning job within Fireworks. For more details on the firectl commands and how Fireworks implements fine-tuning, refer to the <a href="https://docs.fireworks.ai/fine-tuning/fine-tuning-models#starting-your-tuning-job">Fireworks guide</a>.
</p>


In [5]:
# Create a fine-tuning job within Fireworks. Advanced users can optionally adjust the item_classification.yaml
# file to tweak the training parameters. Refer back to the "QLoRA Hyperparameters" section of the "Train Your Model" workshop
# (https://gamma.app/docs/Training-Your-Model-fotbcos9eduoh0v) for more details on what each parameter represents.
!firectl create fine-tuning-job --settings-file item_classification.yaml --display-name {dataset_name} --dataset {dataset_name}

In [83]:
# Replace model_id with the id of your fine-tuned model.
# After executing the cell above, the first line in its output will be "Name: accounts/<ACCOUNT_ID>/fineTuningJobs/<MODEL_ID>"
model_id = 'c55a772445ed48f1bbad2b15c1508b86'

In [2]:
# Get the status of the fine-tuning job. Wait until the state says COMPLETED before continuing (~10-20 mins).
!firectl get fine-tuning-job {model_id}

In [85]:
# Deploy the model to a Fireworks endpoint
!firectl deploy {model_id}

In [3]:
# Get the model's status. Wait until "Deployed Model Refs" says DEPLOYED before running inference (~15-20 mins).
!firectl get model {model_id}

<p style="font-size: 18px;">
**********************<br><b>STEP 3: Evaluate Results</b><br><br>
Evaluate the accuracy of the fine-tuned model and compare it to the accuracy of the base model.
</p>

In [23]:
def generate_responses(item_descriptions, model_name, system_message=system_message):
    """
    Generate the predicted class for each item description using the Fireworks API.

    This function iterates through a list of item descriptions, sends each description
    to the Fireworks API, and collects the generated responses.

    Args:
        item_descriptions (list): A list of strings, where each string is an item description.
        model_name (str): The name of the model to use for generating responses.
        system_message (str): The system prompt sent with every request. Defaults to the
            system_message defined earlier in the notebook.

    Returns:
        list: A list of generated responses, where each response corresponds to an item description.
    """    
    responses = list()
    
    for item_descr in item_descriptions:      
        msg = [
              {"role": "system", "content": system_message},
              {"role": "user", "content": item_descr}
        ]
        response = client.chat.completions.create(
            model=model_name,
            messages=msg,
            # Set the temp to 0 for tasks where there is a single correct answer, such as classification
            temperature=0, 
        )
    
        response = response.choices[0].message.content
        responses.append(response) 

    return responses

In [24]:
# Generate predictions on the test set using the base model (Llama 3 8B instruct), and calculate the accuracy.
base_model_name = 'accounts/fireworks/models/llama-v3-8b-instruct'
predictions = generate_responses(x_test.tolist(), base_model_name)

num_correct = len([i for i in range(len(predictions)) if predictions[i] == y_test.iloc[i]])
total = len(predictions)
pct_accuracy = round(100 * num_correct / total, 2)
print(f'Base Model Test Set Accuracy: {pct_accuracy}%')

Base Model Test Set Accuracy: 50.0%
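
<p style="font-size: 18px;">
The accuracy above counts a prediction as correct only when it matches the label character for character. To see why a model misses, it can be useful to list the mismatches alongside the score. The sketch below is one way to do that; <code>exact_match_accuracy</code> is a hypothetical helper, and unlike the cell above it trims surrounding whitespace before comparing, which is an assumption about what should count as a match.
</p>

```python
def exact_match_accuracy(predictions, labels):
    """Return (accuracy_pct, mismatches) using whitespace-trimmed exact match.

    mismatches is a list of (index, prediction, label) tuples.
    """
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")

    mismatches = []
    for index, (pred, label) in enumerate(zip(predictions, labels)):
        if pred.strip() != label.strip():
            mismatches.append((index, pred, label))

    total = len(labels)
    correct = total - len(mismatches)
    accuracy_pct = round(100 * correct / total, 2) if total else 0.0
    return accuracy_pct, mismatches


# Hypothetical predictions: the third one adds extra text around the class name
sample_predictions = ["Piping", " Structure\n", "Class: Lifting"]
sample_labels = ["Piping", "Structure", "Lifting"]
accuracy_pct, mismatches = exact_match_accuracy(sample_predictions, sample_labels)
print(accuracy_pct, mismatches)
```

<p style="font-size: 18px;">
In this notebook it could be called as <code>exact_match_accuracy(predictions, y_test.tolist())</code>. Inspecting the mismatches shows whether errors come from wrong classes or from responses that wrap the class name in extra text.
</p>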


In [19]:
# Generate predictions on the test set using the fine-tuned model (Llama 3 8B instruct), and calculate the accuracy.
ft_model_name = f'accounts/{account_id}/models/{model_id}'
predictions = generate_responses(x_test.tolist(), ft_model_name)

num_correct = len([i for i in range(len(predictions)) if predictions[i] == y_test.iloc[i]])
total = len(predictions)
pct_accuracy = round(100 * num_correct / total, 2)
print(f'Fine-Tuned Model Test Set Accuracy: {pct_accuracy}%')

Fine-Tuned Model Test Set Accuracy: 98.23%
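
<p style="font-size: 18px;">
A single overall accuracy can hide classes that the model rarely gets right, especially classes with few examples. The sketch below breaks the score down by true class; <code>per_class_accuracy</code> is a hypothetical helper, and the sample data is made up for illustration.
</p>

```python
from collections import defaultdict


def per_class_accuracy(predictions, labels):
    """Map each true class to (num_correct, num_total)."""
    counts = defaultdict(lambda: [0, 0])  # label -> [correct, total]
    for pred, label in zip(predictions, labels):
        counts[label][1] += 1
        if pred == label:
            counts[label][0] += 1
    return {label: tuple(pair) for label, pair in counts.items()}


# Made-up example data, not results from the notebook
sample_labels = ["Piping", "Piping", "Lifting"]
sample_predictions = ["Piping", "Structure", "Lifting"]
for label, (correct, total) in per_class_accuracy(sample_predictions, sample_labels).items():
    print(f"{label}: {correct}/{total}")
```

<p style="font-size: 18px;">
With the notebook's variables, the call would be <code>per_class_accuracy(predictions, y_test.tolist())</code>.
</p>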


<p style="font-size: 18px;">
**********************<br><b>Conclusion</b><br><br>
After fine-tuning, the accuracy of item classification improved from 50% to 98.23%, demonstrating the significant impact of fine-tuning. Because this evaluation was performed on a test set that was held out from the training data, we can reasonably expect similar gains on new items that resemble those in the dataset.
</p>
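
<p style="font-size: 18px;">
How far a test-set accuracy can be trusted depends on how many test items it was measured on. One common way to express that uncertainty is a Wilson score interval for a proportion. The sketch below is a generic illustration, not part of the original evaluation; <code>wilson_interval</code> is a hypothetical helper, and the counts in the example are placeholders rather than the notebook's actual counts.
</p>

```python
import math


def wilson_interval(num_correct, total, z=1.96):
    """Return the (low, high) Wilson score interval for num_correct / total.

    z=1.96 corresponds to an approximately 95% interval.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = num_correct / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Placeholder counts for illustration only
low, high = wilson_interval(num_correct=50, total=100)
print(f"{100 * low:.1f}% to {100 * high:.1f}%")
```

<p style="font-size: 18px;">
With the notebook's variables, the call would be <code>wilson_interval(num_correct, total)</code>. A narrow interval on a large test set supports the expectation that the gains generalize; a wide one suggests collecting more test items.
</p>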

In [None]:
# OPTIONAL: undeploy the model. Fireworks currently does not charge for deployed models, only for usage,
# so it is up to you whether you want to keep it deployed.
!firectl undeploy {model_id}