# Fine-Tuning GPT for Sentiment Analysis

## Load Libraries and Data

In [1]:
# Installing OpenAI on Colab:
# !pip install openai

# Installing OpenAI locally:
# pip install openai
# conda install openai

import os
import io
import requests
import re
import numpy as np
import pandas as pd
import openai
from sklearn.metrics import accuracy_score

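# Note: see instructions for OpenAI setup at
# https://github.com/openai/openai-python
# This notebook uses the legacy (pre-1.0) openai package interface:
# openai.Completion and the `openai ... fine_tunes` CLI commands
openai.api_key = os.getenv('OPENAI_API_KEY')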

### Data Source

IMDB movie reviews via: http://ai.stanford.edu/~amaas/data/sentiment/

In [2]:
train_url = 'https://raw.githubusercontent.com/natecraig/aiml/main/Data/Movie_Train.txt'
test_url = 'https://raw.githubusercontent.com/natecraig/aiml/main/Data/Movie_Test.txt'

train_download = requests.get(train_url).content
test_download = requests.get(test_url).content

# For train and test sets, the first 12,500 reviews are positive,
# and the second 12,500 reviews are negative

X_train_raw = []
for l in io.StringIO(train_download.decode('utf-8')):
    X_train_raw.append(l.strip())
    
X_test_raw = []
for l in io.StringIO(test_download.decode('utf-8')):
    X_test_raw.append(l.strip())
    
categories = ['Negative', 'Positive']
y_train = [1 if i < 12500 else 0 for i in range(25000)] 
y_test = [1 if i < 12500 else 0 for i in range(25000)]

In [3]:
# Replace doubled HTML line breaks, hyphens, and slashes with spaces
regex = re.compile(r"(<br\s*/><br\s*/>)|(\-)|(\/)")
X_test = [regex.sub(' ', x) for x in X_test_raw]
X_train = [regex.sub(' ', x) for x in X_train_raw]
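To see what the pattern in this cell actually replaces, the sketch below applies it to a made-up review fragment. The sample string is an assumption for illustration and is not taken from the dataset.

```python
import re

# Same pattern as in the cell above, written as a raw string
regex = re.compile(r"(<br\s*/><br\s*/>)|(\-)|(\/)")

# Hypothetical review fragment (not from the IMDB data)
sample = "Well-made film.<br /><br />Watch it and/or read the book."
cleaned = regex.sub(' ', sample)
print(cleaned)
```

Note that the first alternative only matches two consecutive `<br />` tags. A lone `<br />` is not removed; only its slash is replaced by a space.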

In [4]:
print(X_test[1000])



In [5]:
print(y_test[1000])

1


In [6]:
# Create a subset of the training data for fine tuning
ntrain = 2000
randidx = np.random.choice(len(X_train), ntrain, replace=False)
X_train_sub = [X_train[i] for i in randidx]
y_train_sub = [y_train[i] for i in randidx]

# Subset the testing data for speed
ntest = 100
randidx = np.random.choice(len(X_test), ntest, replace=False)
X_test_sub = [X_test[i] for i in randidx]
y_test_sub = [y_test[i] for i in randidx]
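The subsets above are drawn without a fixed seed, so each run fine-tunes and evaluates on different reviews. A minimal sketch of reproducible sampling follows; it swaps in the standard-library `random` module, and `sample_indices` and the seed value are hypothetical choices. With NumPy, `np.random.default_rng(seed).choice(n_items, n_sample, replace=False)` would serve the same purpose.

```python
import random

def sample_indices(n_items, n_sample, seed):
    """Draw n_sample distinct indices from range(n_items), reproducibly."""
    rng = random.Random(seed)  # seed value is an arbitrary assumption
    return rng.sample(range(n_items), n_sample)

idx_a = sample_indices(25000, 2000, seed=0)
idx_b = sample_indices(25000, 2000, seed=0)
print(idx_a == idx_b)            # same seed gives the same subset
print(len(set(idx_a)) == 2000)   # no index is drawn twice
```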

## Fine-Tuning GPT

In [7]:
# Set up the training data as prompt-completion pairs in JSON Lines format
y_train_sub_labels = ['Positive' if y == 1 else 'Negative' for y in y_train_sub]
df = pd.DataFrame(zip(X_train_sub, y_train_sub_labels),
                  columns = ['prompt', 'completion'])
os.makedirs('Data', exist_ok=True)  # to_json does not create missing directories
df.to_json('Data/movie_reviews.jsonl', orient='records', lines=True)

In [8]:
# Prepare data for tuning
!openai tools fine_tunes.prepare_data -f Data/movie_reviews.jsonl -q

Analyzing...

- Your file contains 2000 prompt-completion pairs
- Based on your data it seems like you're trying to fine-tune a model for classification
- For classification, we recommend you try one of the faster and cheaper models, such as `ada`
- For classification, you can estimate the expected model performance by keeping a held out dataset, which is not used for training
- There are 2 duplicated prompt-completion sets. These are rows: [129, 1779]
- There are 1 examples that are very long. These are rows: [17]
For conditional generation, and for classification the examples shouldn't be longer than 2048 tokens.
- Your data does not contain a common separator at the end of your prompts. Having a separator string appended to the end of the prompt makes it clearer to the fine-tuned model where the completion should begin. See https://platform.openai.com/docs/guides/fine-tuning/preparing-your-dataset for more detail and examples. If you intend to do open-ended generation, then you shou
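The prepared files themselves are not shown in this notebook. Judging from the later cells, which pass `" Positive"` as the positive class and append `' ->'` to each test prompt, each prepared record presumably has the shape produced by the sketch below. `format_example` and `SEPARATOR` are hypothetical names for illustration and are not part of the OpenAI tooling.

```python
import json

# Assumed separator, inferred from the inference cell later in the notebook
SEPARATOR = ' ->'

def format_example(review, label):
    """Build one prompt-completion record in the assumed prepared format."""
    return {
        'prompt': review + SEPARATOR,   # separator marks where the prompt ends
        'completion': ' ' + label,      # leading space before the label
    }

record = format_example('A hypothetical review text.', 'Positive')
print(json.dumps(record))
```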

In [9]:
!openai api fine_tunes.create -m "curie" --suffix "movie reviews" -t "Data/movie_reviews_prepared_train.jsonl" -v "Data/movie_reviews_prepared_valid.jsonl" --compute_classification_metrics --classification_positive_class " Positive"
!echo \n

Upload progress: 100%|████████████████████| 2.13M/2.13M [00:00<00:00, 2.65Git/s]
Uploaded file from Data/movie_reviews_prepared_train.jsonl: file-jWM8CiksFhbTBZLbG6uqlvvD
Upload progress: 100%|███████████████████████| 538k/538k [00:00<00:00, 428Mit/s]
Uploaded file from Data/movie_reviews_prepared_valid.jsonl: file-6IEutAkBFCZBpwiFQlwi0xLo
Created fine-tune: ft-hCgD29h8asM8RX15LdZvshi5
Streaming events until fine-tuning is complete...

(Ctrl-C will interrupt the stream, but not cancel the fine-tune)
[2023-03-09 10:47:01] Created fine-tune: ft-hCgD29h8asM8RX15LdZvshi5

Stream interrupted (client disconnected).
To resume the stream, run:

  openai api fine_tunes.follow -i ft-hCgD29h8asM8RX15LdZvshi5

n


In [19]:
!openai api fine_tunes.follow -i ft-hCgD29h8asM8RX15LdZvshi5

[2023-03-09 10:47:01] Created fine-tune: ft-hCgD29h8asM8RX15LdZvshi5
[2023-03-09 10:56:21] Fine-tune costs $5.47
[2023-03-09 10:56:21] Fine-tune enqueued. Queue number: 1
[2023-03-09 10:56:23] Fine-tune is in the queue. Queue number: 0
[2023-03-09 10:56:44] Fine-tune started
[2023-03-09 11:04:17] Completed epoch 1/4
[2023-03-09 11:10:58] Completed epoch 2/4
[2023-03-09 11:17:38] Completed epoch 3/4
[2023-03-09 11:24:18] Completed epoch 4/4
[2023-03-09 11:24:49] Uploaded model: curie:ft-research:movie-reviews-2023-03-09-16-24-48
[2023-03-09 11:24:50] Uploaded result file: file-wx8WT6gwf3uRZrXSHmXhKIIp
[2023-03-09 11:24:50] Fine-tune succeeded

Job complete! Status: succeeded 🎉
Try out your fine-tuned model:

openai api completions.create -m curie:ft-research:movie-reviews-2023-03-09-16-24-48 -p <YOUR_PROMPT>


## Using the Tuned Model

In [20]:
# Append the prompt separator so test prompts match the training format
X_test_sub = [x + ' ->' for x in X_test_sub]

In [21]:
y_pred = [0]*ntest

for i in range(ntest):
    prompt = X_test_sub[i]

    # Request a single token, which is enough to tell the two labels apart
    response = openai.Completion.create(
        model='curie:ft-research:movie-reviews-2023-03-09-16-24-48',
        prompt=prompt,
        max_tokens=1
    )
    
    # The completion carries a leading space, so strip it before comparing
    sentiment = response['choices'][0]['text'].strip()
    print(sentiment)
    y_pred[i] = 1 if sentiment == 'Positive' else 0

Negative
Positive
Negative
Negative
Positive
Positive
Negative
Positive
Negative
Negative
Positive
Positive
Positive
Negative
Negative
Negative
Positive
Negative
Negative
Positive
Negative
Positive
Positive
Positive
Negative
Positive
Negative
Negative
Negative
Negative
Negative
Positive
Negative
Positive
Negative
Negative
Negative
Negative
Negative
Negative
Negative
Positive
Positive
Positive
Positive
Positive
Positive
Negative
Negative
Positive
Positive
Positive
Positive
Positive
Positive
Negative
Positive
Negative
Positive
Positive
Negative
Positive
Negative
Positive
Positive
Positive
Negative
Positive
Negative
Negative
Negative
Positive
Negative
Negative
Negative
Negative
Negative
Negative
Positive
Negative
Positive
Positive
Negative
Negative
Positive
Positive
Positive
Positive
Negative
Negative
Positive
Negative
Negative
Positive
Positive
Negative
Negative
Positive
Positive
Negative


In [22]:
accuracy_score(y_test_sub, y_pred)

0.97
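Because the accuracy is measured on only `ntest = 100` reviews, it carries some sampling uncertainty. As a rough sketch, the Wilson score interval for 97 correct predictions out of 100 can be computed as follows; the 95% level (`z=1.96`) and the function name `wilson_interval` are assumptions made here, not part of the original analysis.

```python
import math

def wilson_interval(n_correct, n_total, z=1.96):
    """Wilson score interval for a binomial proportion."""
    p = n_correct / n_total
    denom = 1 + z**2 / n_total
    center = (p + z**2 / (2 * n_total)) / denom
    half = z * math.sqrt(p * (1 - p) / n_total
                         + z**2 / (4 * n_total**2)) / denom
    return center - half, center + half

# 0.97 accuracy on 100 test reviews means 97 correct predictions
low, high = wilson_interval(97, 100)
print(f"{low:.3f} to {high:.3f}")
```

The interval spans roughly 0.92 to 0.99, so a larger test subset would be needed to pin the accuracy down more tightly.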