# Introduction

This notebook accompanies my blog post on using prompt-based GPT for analyzing textual data. 

You can find the blog post here: <link>

In [2]:
NAME = 'prompt_based_gpt_example' 
PYTHON_VERSION = '3.10'

## Preamble

### Imports  

In [3]:
import os, re, math, time, sys, copy, random, json
from pathlib import Path
import pandas as pd
import numpy as np

### Settings

In [4]:
pd.options.mode.chained_assignment = None  # default='warn'
pd.set_option('display.max_columns', 150)
pd.set_option('display.max_rows', 150)

------
# Code

## Generate dataset

Below I create some fake employee reviews. I manually labeled the first 10 of them to serve as training data; the remaining 4 are left unlabeled for prediction.

In [34]:
reviews_list = [
    {
        'review_id': 1, 
        'review_text': '''Plumbing Co is a great company to work for! The compensation is great and above the industry standard. The benefits are also very good. The company is very fair and treats its employees well. I would definitely recommend Plumbing Co to anyone looking for a great place to work.''',
        'in_training' : True,
        'pay_sentences' : [
            {'sentence': 'The compensation is great and above the industry standard.', 'sentiment': 'positive'},
            {'sentence': 'The benefits are also very good.', 'sentiment': 'positive'}
        ]
    },
    {
        'review_id': 2, 
        'review_text': '''I've been working at Plumbing Co for a few months now, and I've found it to be a pretty decent place to work. The salary is pretty average, but the coffee is really great. Overall, I think it's a pretty good company to work for. The hours are reasonable, and the work is fairly easy. I would recommend it to anyone looking for a decent job.''',
        'in_training' : True,
        'pay_sentences' : [
            {'sentence': 'The salary is pretty average, but the coffee is really great.', 'sentiment': 'neutral'},
        ]
    },
    {
        'review_id': 3, 
        'review_text': '''Plumbing Co is a great place to work for those who are interested in the field of plumbing. The company is always expanding and there is room for advancement. The pay is too low, however, and this is the only downside to working for Plumbing Co.''',
        'in_training' : True,
        'pay_sentences' : [
            {'sentence': 'The pay is too low, however, and this is the only downside to working for Plumbing Co.', 'sentiment': 'negative'},
        ]
    },
    {
        'review_id': 4, 
        'review_text': '''I had a great time working with Chairlift Brothers! They were very professional and compensated me well for my time. I would definitely recommend them to anyone looking for a great chairlift company.''',
        'in_training' : True,
        'pay_sentences' : [
            {'sentence': 'They were very professional and compensated me well for my time.', 'sentiment': 'positive'},
        ]
    },
    {
        'review_id': 5, 
        'review_text': '''The pay is good, and the benefits are great. I found the work at Plumbing Co to be very interesting and enjoyable. Their lunches could use improvement though, the salads are never fresh. I would definitely recommend this company to anyone looking for a great place to work.''',
        'in_training' : True,
        'pay_sentences' : [
            {'sentence': 'The pay is good, and the benefits are great.', 'sentiment': 'positive'},
        ]
    },
    {
        'review_id': 6, 
        'review_text': '''Stay away from this company!! Upper level management is horrible, very bad experience.''',
        'in_training' : True,
        'pay_sentences' : [
        ]
    },
    {
        'review_id': 7, 
        'review_text': '''The office buildings are quite nice. The cleaning lady on Thursday morning sometimes even brought me coffee, insane! I enjoyed the collegiality of the group and the short work hours. Salary was reasonable and the work hours were standard. Worth a look.''',
        'in_training' : True,
        'pay_sentences' : [
            {'sentence': 'Salary was reasonable and the work hours were standard.', 'sentiment': 'neutral'},
        ]
    },
    {
        'review_id': 8, 
        'review_text': '''I didn't enjoy my time working at Plumbing Co. The parking spaces are quite wide, which was great for my mini-van. The pay was too low, and the work was very boring. Joel from IT was great though, very helpful.''',
        'in_training' : True,
        'pay_sentences' : [
            {'sentence': 'The pay was too low, and the work was very boring.', 'sentiment': 'negative'},
        ]
    },
    {
        'review_id': 9, 
        'review_text': '''The CEO, Terresa, was just a great human being, thanks for everything! The pay is great, and the benefits are great. The only reason I left was because I had the opportunity to become a YouTube celebrity. If you are looking to join a VR startup, this is the company! Did I mention the pay is great?''',
        'in_training' : True,
        'pay_sentences' : [
            {'sentence': 'The pay is great, and the benefits are great.', 'sentiment': 'positive'},
            {'sentence': 'Did I mention the pay is great?', 'sentiment': 'positive'},
        ]
    },
    {
        'review_id': 10, 
        'review_text': '''Nothing to mention. Standard company.''',
        'in_training' : True,
        'pay_sentences' : [
        ]
    },
    {
        'review_id': 11, 
        'review_text': '''I had a great time working with Chairlift Brothers! They were very professional and compensated me well for my time. I would definitely recommend them to anyone looking for a great chairlift company.''',
        'in_training' : False,
    },
    {
        'review_id': 12, 
        'review_text': '''I would not recommend working for Chairlift Brothers. They do not pay well and the hours are long. The work is also very physical and challenging, so it's not for everyone. There are better companies out there that will pay you more for your time and effort.''',
        'in_training' : False,
    },
    {
        'review_id': 13, 
        'review_text': '''The Chairlift Brothers company is terrible. They don't pay their employees well, and the work is extremely demanding. The company is also very disorganized, and it's hard to get anything done. Overall, I would not recommend working for this company.''',
        'in_training' : False,
    },
    {
        'review_id': 14, 
        'review_text': '''I don't love working at Company ABC, but the wage is decent and the benefits are decent. I would recommend this company to anyone looking for a decent place to work.''',
        'in_training' : False,
    }
]

## Create a function that can generate a prompt and completion

Designing the prompt and completion correctly is very important, both for performance and costs. The cost and latency of inference scale roughly linearly with the size of the input + output, so you generally want to design the prompt and completion to be as small as possible without hurting performance.

You generally need at least a separator that indicates where the prompt ends and the completion starts, and a separator that indicates where the completion ends and the next prompt starts.
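To get a feel for how size adds up, here is a minimal sketch that estimates the token count of a prompt plus completion. The 4-characters-per-token ratio is only a rough rule of thumb for English text and is an assumption here, as is the helper name `estimate_tokens`; use a real tokenizer if you need numbers you can bill against.

```python
import math

CHARS_PER_TOKEN = 4  # assumption: rough rule of thumb for English text, not an exact figure


def estimate_tokens(text, chars_per_token=CHARS_PER_TOKEN):
    """Very rough token estimate based purely on character count."""
    return math.ceil(len(text) / chars_per_token)


example_prompt = "The pay is good, and the benefits are great.\n####\n"
example_completion = "<positive> The pay is good, and the benefits are great.\n<|endoftext|>"

total = estimate_tokens(example_prompt) + estimate_tokens(example_completion)
print(f"Estimated tokens for prompt + completion: {total}")
```

Every extra separator character or newline shows up in this total for every single request, which is why compact separators are worth the effort.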

In [35]:
def generate_prompt_and_completion(review_data, prompt_end = "\n####\n", completion_end = "\n<|endoftext|>"):
    ## Note, I add newlines everywhere to make the example easier to read, but those are often not necessary for the model.
    
    ret_dict = {}
    ret_dict['prompt'] = review_data['review_text'] + prompt_end 

    if review_data['in_training']:
        completion_list = []
        for sentence in review_data['pay_sentences']:
            completion_list.append(f'''<{sentence['sentiment']}> {sentence['sentence']}''')
        ret_dict['completion'] = '\n'.join(completion_list) + completion_end

    return ret_dict
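The separators only do their job if they never occur inside a review itself. As a quick sanity check, the sketch below scans the review texts for the separator strings. `find_separator_collisions` is a hypothetical helper, and it checks the bare `####` and `<|endoftext|>` markers without the surrounding newlines.

```python
def find_separator_collisions(reviews, separators=("####", "<|endoftext|>")):
    """Return (review_id, separator) pairs where a separator occurs inside the review text."""
    collisions = []
    for review in reviews:
        for sep in separators:
            if sep in review["review_text"]:
                collisions.append((review["review_id"], sep))
    return collisions


# Toy data for illustration only; the second review contains the prompt separator.
sample_reviews = [
    {"review_id": 1, "review_text": "The pay is good."},
    {"review_id": 2, "review_text": "Great place #### would work here again."},
]
print(find_separator_collisions(sample_reviews))  # [(2, '####')]
```

If this returns anything for your real data, pick a different separator or clean the offending reviews before building prompts.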

#### Show example:

In [36]:
tmp = generate_prompt_and_completion(reviews_list[0])
print(tmp['prompt'] + tmp['completion'])

Plumbing Co is a great company to work for! The compensation is great and above the industry standard. The benefits are also very good. The company is very fair and treats its employees well. I would definitely recommend Plumbing Co to anyone looking for a great place to work.
####
<positive> The compensation is great and above the industry standard.
<positive> The benefits are also very good.
<|endoftext|>


#### Create prompts for all

In [37]:
training_list = []
to_predict_list = []
for review in reviews_list:
    if review['in_training']:
        training_list.append(generate_prompt_and_completion(review))
    else:
        to_predict_list.append(generate_prompt_and_completion(review))

In [40]:
training_list[:4]

[{'prompt': 'Plumbing Co is a great company to work for! The compensation is great and above the industry standard. The benefits are also very good. The company is very fair and treats its employees well. I would definitely recommend Plumbing Co to anyone looking for a great place to work.\n####\n',
  'completion': '<positive> The compensation is great and above the industry standard.\n<positive> The benefits are also very good.\n<|endoftext|>'},
 {'prompt': "I've been working at Plumbing Co for a few months now, and I've found it to be a pretty decent place to work. The salary is pretty average, but the coffee is really great. Overall, I think it's a pretty good company to work for. The hours are reasonable, and the work is fairly easy. I would recommend it to anyone looking for a decent job.\n####\n",
  'completion': '<neutral> The salary is pretty average, but the coffee is really great.\n<|endoftext|>'},
 {'prompt': 'Plumbing Co is a great place to work for those who are interested

## Logic to parse a completion

A completion is a text string that is generated by GPT-3, so we need to parse it to get the information out into a Python object.

This is relatively easy with some regular expressions given that we control how the completion is organized. 

In [10]:
def parse_completion(completion):
    compensation_sentences = re.findall(r'<(.*?)> (.*?)\n', completion) 

    completion_dict = {
        'sentences' : compensation_sentences,
        'num_positive' : len([sen for sen in compensation_sentences if sen[0] == 'positive']),
        'num_negative' : len([sen for sen in compensation_sentences if sen[0] == 'negative']),
        'num_neutral' : len([sen for sen in compensation_sentences if sen[0] == 'neutral']),
        'num_sentences' : len(compensation_sentences)
    }

    return completion_dict

In [11]:
completion = '<positive> The compensation is great and above the industry standard.\n<positive> The benefits are also very good.\n<|endoftext|>'
parse_completion(completion)

{'sentences': [('positive',
   'The compensation is great and above the industry standard.'),
  ('positive', 'The benefits are also very good.')],
 'num_positive': 2,
 'num_negative': 0,
 'num_neutral': 0,
 'num_sentences': 2}
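One caveat with the regular expression above: it only matches lines that end in a newline, so a final sentence without a trailing `\n` is silently dropped. Below is a more lenient sketch that splits on lines instead and ignores labels outside the three we defined. `parse_completion_lenient` and `VALID_LABELS` are names I made up for this illustration.

```python
import re

VALID_LABELS = {"positive", "negative", "neutral"}


def parse_completion_lenient(completion):
    """Parse '<label> sentence' lines; tolerate a missing final newline and skip unknown labels."""
    parsed = []
    for line in completion.splitlines():
        match = re.match(r"^<(\w+)>\s*(.+)$", line.strip())
        if match and match.group(1) in VALID_LABELS:
            parsed.append((match.group(1), match.group(2).strip()))
    return parsed


# Note that the last line has no trailing newline.
print(parse_completion_lenient("<positive> The pay is great.\n<negative> The hours are long."))
# [('positive', 'The pay is great.'), ('negative', 'The hours are long.')]
```

The returned list has the same `(sentiment, sentence)` tuple shape as `compensation_sentences`, so it could be swapped into `parse_completion` if you run into this edge case.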

-------
-------
# Predictions using OpenAI
-------
-------

## Using OpenAI

You can sign up for an OpenAI account here: https://beta.openai.com/overview
They give you a small amount of free credit to use for your first experiments.

You can interact with OpenAI models in two ways:

1. You can use their Playground on the website and manually feed it prompts and alter the parameters.
2. You can use their API directly from your code to provide a prompt and retrieve the completion. 

In the sections below I will demonstrate how to use the API because the Playground is intuitive and doesn't require much explanation.  
You can access the Playground here: https://beta.openai.com/playground

I will skip over a lot of details in the code below, so I strongly recommend also reading the OpenAI documentation:
https://beta.openai.com/docs

-------

#### Setting up your OpenAI API credentials

To interact with the API, follow these steps:

1. Log in to the OpenAI website
2. Click on the "Personal" button in the top right corner and select "View API Keys"
3. Create a new secret key or copy an existing one
4. Install the OpenAI Python library by running `pip install openai`

You can now interact with the API using the OpenAI Python client and your secret key. 

**Two important warnings:**

**Warning 1:** Keep your secret key private! The secret key is like your password: if someone has it, they can interact with OpenAI using your payment info. It is best to store it as an environment variable instead of storing it in your code.

**Warning 2:** This is a paid API and every request will use up your credits. Once you run out of free credits you will be charged for running predictions. These are generally very small amounts, but code responsibly! You can set soft and hard limits on the OpenAI website.

In [12]:
import openai

In [13]:
if 'OPENAI_API_KEY' not in os.environ:
    os.environ['OPENAI_API_KEY'] = input('Enter your OpenAI API key: ')
openai.api_key = os.environ['OPENAI_API_KEY']
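A slightly more defensive variant of the cell above, as a sketch: `getpass` from the standard library keeps the key from being echoed into the notebook output when you type it. `load_api_key` is a hypothetical helper name, not part of the OpenAI library.

```python
import os
from getpass import getpass


def load_api_key(env_var="OPENAI_API_KEY"):
    """Return the key from the environment, prompting (without echo) only if it is missing."""
    key = os.environ.get(env_var)
    if not key:
        key = getpass(f"Enter your {env_var}: ")
        os.environ[env_var] = key  # cached for the rest of this session only
    return key
```

You would then set `openai.api_key = load_api_key()` instead of reading `os.environ` directly.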

## Zero shot example 
-------------------

In a zero shot setting we don't give the model any training data.

In [14]:
zero_shot_prompt = training_list[0]['prompt']
print(zero_shot_prompt)

Plumbing Co is a great company to work for! The compensation is great and above the industry standard. The benefits are also very good. The company is very fair and treats its employees well. I would definitely recommend Plumbing Co to anyone looking for a great place to work.
####



#### API example

In [15]:
result = openai.Completion.create(
    model = 'davinci', 
    prompt = zero_shot_prompt,
    stop = ["<|endoftext|>"],
    max_tokens =  500,
    temperature= 0.7
)

In [16]:
print(result['choices'][0]['text'])

We are a small, but growing company in the Dallas/Ft Worth area looking for talented and driven individuals to join out team! If you are interested please contact our office at 972.847.8165 and ask for Rob or Eric.


**Conclusion:** The completion generated by the model is nothing like what we are after. This is not surprising given that we did not show the model what we want it to generate, so it just added extra text to the review. 

We could try to resolve that by explicitly including an instruction in our prompt, but this will only work for very common tasks. 

As you can see below, it behaves unpredictably for our use case.

In [17]:
instruction = 'Extract the sentences that talk about compensation and label each compensation sentence as either positive, negative, or neutral:'
result = openai.Completion.create(
    model = 'davinci', 
    prompt = zero_shot_prompt + '\n' + instruction,
    stop = ["<|endoftext|>"],
    max_tokens =  200,
    temperature= 0.7
)

In [18]:
print(result['choices'][0]['text'])


Compensation
Compensation is great!
Compensation is very good.
Compensation is good.
Compensation is very fair.
Compensation is above the industry standard.
Compensation is great and above the industry standard.
Compensation is good and above the industry standard.
Compensation is very good and above the industry standard.
Compensation is very fair and above the industry standard.
Compensation is above the industry standard and the industry standard is very fair.
Compensation is great.
Compensation is good.
Compensation is very fair.
Compensation is above the industry standard.
Compensation is great and above the industry standard.
Compensation is good and above the industry standard.
Compensation is very good and above the industry standard.
Compensation is very fair and above the industry standard.
Compensation is above the industry standard and the industry standard is very fair.
Compensation is great.
Compensation is


## Few shot example 
-------------------

By including a few example prompts + completions in our prompt we can better tell the model what we are expecting.

The downside is that cost and latency grow roughly linearly with the combined prompt + completion length, so including examples makes each request more expensive and slower.

In [19]:
few_shot_examples = ''
for training_item in training_list[:3]:
    few_shot_examples += training_item['prompt']
    few_shot_examples += training_item['completion'] + '\n'
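To make the length trade-off concrete, here is a small sketch that wraps the same concatenation in a function and prints the prompt length for different numbers of examples. `build_few_shot_prompt` and the toy data are hypothetical and only for illustration.

```python
def build_few_shot_prompt(examples, target_prompt, k):
    """Concatenate the first k prompt/completion pairs, then append the prompt to complete."""
    parts = [ex["prompt"] + ex["completion"] + "\n" for ex in examples[:k]]
    return "".join(parts) + target_prompt


# Toy examples in the same prompt/completion format as training_list.
toy_examples = [
    {"prompt": "The pay is low.\n####\n", "completion": "<negative> The pay is low.\n<|endoftext|>"},
    {"prompt": "Nice office.\n####\n", "completion": "\n<|endoftext|>"},
]
toy_target = "Salary was fine.\n####\n"

# Every added example lengthens every request we send.
for k in range(len(toy_examples) + 1):
    print(k, len(build_few_shot_prompt(toy_examples, toy_target, k)))
```

Because the examples are resent with every request, the extra length is paid once per review you want to classify, not once in total.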

In [20]:
few_shot_prompt = few_shot_examples + to_predict_list[0]['prompt']
print(few_shot_prompt)

Plumbing Co is a great company to work for! The compensation is great and above the industry standard. The benefits are also very good. The company is very fair and treats its employees well. I would definitely recommend Plumbing Co to anyone looking for a great place to work.
####
<positive> The compensation is great and above the industry standard.
<positive> The benefits are also very good.
<|endoftext|>
I've been working at Plumbing Co for a few months now, and I've found it to be a pretty decent place to work. The salary is pretty average, but the coffee is really great. Overall, I think it's a pretty good company to work for. The hours are reasonable, and the work is fairly easy. I would recommend it to anyone looking for a decent job.
####
<neutral> The salary is pretty average, but the coffee is really great.
<|endoftext|>
Plumbing Co is a great place to work for those who are interested in the field of plumbing. The company is always expanding and there is room for advancement

#### API example

In [21]:
result = openai.Completion.create(
    model = 'davinci', 
    prompt = few_shot_prompt,
    stop = ["<|endoftext|>"],
    max_tokens =  1000,   ## <-- Caps only the generated completion; the prompt tokens count separately, and prompt + max_tokens must fit in the model's context window.
    temperature= 0.7
)

In [22]:
completion = result['choices'][0]['text']
print(completion)

<positive> They were very professional and compensated me well for my time.



That seems to work! So now we can parse it and get the information out.

In [23]:
parse_completion(completion)

{'sentences': [('positive',
   'They were very professional and compensated me well for my time.')],
 'num_positive': 1,
 'num_negative': 0,
 'num_neutral': 0,
 'num_sentences': 1}

### We can also feed all of the remaining reviews through the model this way:

In [24]:
prediction_list = []
for predict_item in to_predict_list:
    ## Generate prompt with examples
    few_shot_prompt = few_shot_examples + predict_item['prompt']

    ## Generate prediction
    result = openai.Completion.create(
        model = 'davinci', 
        prompt = few_shot_prompt,
        stop = ["<|endoftext|>"],
        max_tokens =  1000,
        temperature= 0.7
    )

    ## Parse completion
    completion = result['choices'][0]['text']
    completion_dict = parse_completion(completion)

    ## Store prediction
    prediction_list.append({
        'prompt' : predict_item['prompt'],
        'completion' : completion,
        'completion_dict' : completion_dict
    })

In [25]:
for item in prediction_list:
    print(item['prompt'].strip())
    print(item['completion'].strip())
    print('-------')

I had a great time working with Chairlift Brothers! They were very professional and compensated me well for my time. I would definitely recommend them to anyone looking for a great chairlift company.
####
<positive> They were very professional and compensated me well for my time.
-------
I would not recommend working for Chairlift Brothers. They do not pay well and the hours are long. The work is also very physical and challenging, so it's not for everyone. There are better companies out there that will pay you more for your time and effort.
####
<negative> They do not pay well and the hours are long.
<negative> The work is also very physical and challenging, so it's not for everyone.
-------
The Chairlift Brothers company is terrible. They don't pay their employees well, and the work is extremely demanding. The company is also very disorganized, and it's hard to get anything done. Overall, I would not recommend working for this company.
####
<negative> The work is extremely demanding.

**Conclusion:** When running all the examples through the model, we can see that it consistently generates a completion that follows our structure. However, it often extracts sentences that do not relate to compensation. This, again, is not particularly surprising given that we only gave the model three examples. Giving the model more examples (say 10 to 100) will likely result in very reasonable performance.
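If you hand-label a few of the predicted reviews, you can put a number on "often extracts sentences that do not relate to compensation". The sketch below compares predicted and labeled `(sentiment, sentence)` pairs using simple precision and recall. `score_extraction` is a hypothetical helper, the gold label in the example is made up for illustration, and returning 1.0 when a set is empty is a convention I chose.

```python
def score_extraction(predicted, gold):
    """Compare predicted and gold (sentiment, sentence) pairs; return precision and recall."""
    predicted_set, gold_set = set(predicted), set(gold)
    true_positives = len(predicted_set & gold_set)
    # Convention (assumption): an empty set counts as a perfect score.
    precision = true_positives / len(predicted_set) if predicted_set else 1.0
    recall = true_positives / len(gold_set) if gold_set else 1.0
    return {"precision": precision, "recall": recall}


# Hypothetical hand label for one review, and a prediction with one spurious sentence.
gold = [("negative", "They do not pay well and the hours are long.")]
predicted = [
    ("negative", "They do not pay well and the hours are long."),
    ("negative", "The work is also very physical and challenging, so it's not for everyone."),
]
print(score_extraction(predicted, gold))  # {'precision': 0.5, 'recall': 1.0}
```

Exact string matching is strict: a sentence that the model rewrites even slightly will count as a miss, so treat these numbers as a lower bound.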

## Fine-tuning example
-------------------

Fine-tuning a GPT model is a powerful way to improve the performance of a model. 

Below I demonstrate only a very basic fine-tuning setup. For the full details and discussion, see the documentation: https://beta.openai.com/docs/guides/fine-tuning

### Step 1: Generate training data file

OpenAI requires the training data to be in a specific format, but they provide helper functions to generate this format. This is what I am doing below.

In [42]:
training_csv = pd.DataFrame(training_list)
training_csv.to_csv(r'F:\training_data.csv', index=False)

To generate the training data, run the following command in the terminal:

```bash
openai tools fine_tunes.prepare_data -f F:\training_data.csv
```

This will generate the `.jsonl` that we need in the same directory:

`"F:\training_data_prepared.jsonl"`
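For reference, the prepared file is a JSON Lines file with one `prompt`/`completion` object per line. The sketch below writes that shape directly from Python. It only illustrates the format and is not a replacement for the `prepare_data` tool, which also checks the data and may suggest changes that this sketch does not replicate. `write_jsonl` and the output location are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def write_jsonl(records, path):
    """Write one JSON object per line, keeping only the prompt and completion fields."""
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            row = {"prompt": record["prompt"], "completion": record["completion"]}
            handle.write(json.dumps(row) + "\n")


# Toy record in the same format as the items in training_list.
toy_records = [
    {"prompt": "The pay is low.\n####\n", "completion": "<negative> The pay is low.\n<|endoftext|>"},
]
out_path = Path(tempfile.gettempdir()) / "toy_training_data.jsonl"  # hypothetical location
write_jsonl(toy_records, out_path)
print(out_path.read_text(encoding="utf-8"))
```

Note that `json.dumps` escapes the newlines inside each string, so every record stays on a single line.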

### Step 2: Fine tune the model

We can fine-tune the model by running the following command in the terminal:

```bash
openai api fine_tunes.create -t "F:\training_data_prepared.jsonl" -m curie
```

**Note #1:** OpenAI provides various models of different sizes. Davinci is the largest model available and is very expensive to fine-tune. For most tasks it makes the most sense to fine-tune a Curie or Babbage model, as they are easier to fine-tune, faster to generate predictions, and much cheaper relative to Davinci. 

**Note #2:** OpenAI recommends at least a few hundred examples for fine-tuning. Performance likely isn't great with 10 examples, so consider this a demonstration. :)

**Note #3:** There is a cost associated with fine-tuning with OpenAI. When you run the `openai api fine_tunes.create` command, it will provide you with a cost estimate for the fine-tuning. Davinci becomes expensive quite fast, but a Curie or Babbage model is generally economical (i.e., between $10 and $50 for most tasks).

After the fine-tuning is complete, you will receive a model ID that you can use to generate your predictions using the API or the Playground.

In this case my fine-tuned model is called `curie:ft-personal-2022-04-14-03-40-04`


### Step 3: Generate predictions using our custom model

Fine-tuned models take a little while to become available after fine-tuning is complete.    
More generally, the first prediction takes a moment to load and is likely to time out, but subsequent predictions should go through normally. 
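Since that first call may time out, it can help to wrap requests in a small retry loop. Below is a minimal sketch with exponential backoff. `call_with_retries` is a hypothetical helper, and it catches every exception for simplicity; in real code you would probably narrow that to the timeout error raised by your client version.

```python
import time


def call_with_retries(fn, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on any exception wait base_delay * 2**attempt seconds and try again."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
```

You can pass any of the API calls in this notebook as a zero-argument function, for example `call_with_retries(lambda: openai.Completion.create(model=model_id, prompt=test_prompt, stop=["<|endoftext|>"], max_tokens=500, temperature=0.7))`.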

In [43]:
model_id = 'curie:ft-personal-2022-04-14-03-40-04'

#### Run a quick test prompt

In [47]:
test_prompt = 'The weather was great. The benefits were ok and nothing special. I did like the sandwiches, especially the cheese.' + '\n####\n'

In [48]:
result = openai.Completion.create(
    model = model_id,  ## <--- We replace the base model name with our custom model ID
    prompt = test_prompt,
    stop = ["<|endoftext|>"],
    max_tokens =  500,
    temperature= 0.7
)

In [50]:
completion = result['choices'][0]['text']
print(completion.strip())

<negative> The benefits were ok and nothing special.


#### Run for all

In [51]:
prediction_list = []
for predict_item in to_predict_list:
    ## Generate prompt without (!) examples
    ft_prompt = predict_item['prompt']

    ## Generate prediction
    result = openai.Completion.create(
        model = model_id,  ## <--- We replace the base model name with our custom model ID
        prompt = ft_prompt,
        stop = ["<|endoftext|>"],
        max_tokens =  500,
        temperature= 0.7
    )

    ## Parse completion
    completion = result['choices'][0]['text']
    completion_dict = parse_completion(completion)

    ## Store prediction
    prediction_list.append({
        'prompt' : predict_item['prompt'],
        'completion' : completion,
        'completion_dict' : completion_dict
    })

In [52]:
for item in prediction_list:
    print(item['prompt'].strip())
    print(item['completion'].strip())
    print('-------')

I had a great time working with Chairlift Brothers! They were very professional and compensated me well for my time. I would definitely recommend them to anyone looking for a great chairlift company.
####
<positive> They were very professional and compensated me well for my time.
-------
I would not recommend working for Chairlift Brothers. They do not pay well and the hours are long. The work is also very physical and challenging, so it's not for everyone. There are better companies out there that will pay you more for your time and effort.
####
<negative> The work is also very physical and challenging, so it's not for everyone.
-------
The Chairlift Brothers company is terrible. They don't pay their employees well, and the work is extremely demanding. The company is also very disorganized, and it's hard to get anything done. Overall, I would not recommend working for this company.
####
<neutral> The company is also very disorganized, and it's hard to get anything done.
-------
I don'

**Conclusion:** The predictions are similar to the few-shot example. However, we switched from the Davinci model to the much smaller Curie model, so the fact that this still works with only 10 examples is quite impressive. Creating predictions with our fine-tuned Curie model is also much faster and cheaper than using few-shot prompting with Davinci. If you fine-tune a Curie model with a few hundred examples, I'd expect it to yield usable and reliable results.