## Using GPT-3 to classify labels(yp2201)

### Description:
- This notebook uses the GPT-3 API to classify Reddit comments against a fixed set of toxicity labels.
- The data comes from r/wallstreetbets, and the classification covers 6 labels.
- Labels (6): TOXICITY, SEVERE_TOXICITY, IDENTITY_ATTACK, INSULT, PROFANITY, THREAT
- GPT-3 is available at https://beta.openai.com/

### Version:
#### - ver 0.1.0(220418): 
- baseline results for only current comments
- classifies 800 comments from r/wallstreetbets

#### - ver ///(TBD): 
- results for current comments, but using preceding/following comments as reference
- using fine-tuning methods such as: ///
- classifies 800 or more comments from r/wallstreetbets



In [1]:
# Make sure openai is installed. If it is not, uncomment the line below to install it.
# pip install openai

In [2]:
# loading dataset
import numpy as np
import pandas as pd

# change this to the path of your copy of the csv file
data = pd.read_csv('/Users/yoontaepark/nlu-reddit-toxicity-dataset/data/labelled_master_data_2022-04-18.csv')
print('shape of dataset:', data.shape)
data.head()

shape of dataset: (800, 10)


Unnamed: 0,example_id,preceding_comment,comment_for_evaluation,following_comment,toxicity,severe_toxicity,identity_attack,insult,profanity,threat
0,104857,[https://www.youtube.com/watch?v=hRBOnA0ak4w&a...,GME just watching this all and doing nothing lol,[https://www.youtube.com/watch?v=hRBOnA0ak4w&a...,0,0,0,0,0,0
1,105125,#GOTTA GET THAT 1.48% ON A BOND WHOOOO,"Everything is down today, knee jerk reaction t...","AMC, I held all this time and I decided to buy...",0,0,0,0,0,0
2,105684,Having 30k to lose sounds pretty nice to me,"No, this is the Warthog",a good trump tweet would turn all this carnage...,0,0,0,0,0,0
3,105904,His tools for inflation are literally rates an...,The market is not worth what you guys have bee...,Can somebody make clear that it's not us who a...,0,0,0,0,0,0
4,1138,Wot?,Didnt he sell all his shares?,This is the way lol holding 100 March 12 / $51...,0,0,0,0,0,0


In [3]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 800 entries, 0 to 799
Data columns (total 10 columns):
 #   Column                  Non-Null Count  Dtype 
---  ------                  --------------  ----- 
 0   example_id              800 non-null    int64 
 1   preceding_comment       800 non-null    object
 2   comment_for_evaluation  800 non-null    object
 3   following_comment       800 non-null    object
 4   toxicity                800 non-null    int64 
 5   severe_toxicity         800 non-null    int64 
 6   identity_attack         800 non-null    int64 
 7   insult                  800 non-null    int64 
 8   profanity               800 non-null    int64 
 9   threat                  800 non-null    int64 
dtypes: int64(7), object(3)
memory usage: 62.6+ KB


In [4]:
data['comment_for_evaluation'][120]

'It’s up 8%.'

In [5]:
comments_for_eval = data['comment_for_evaluation']

In [6]:
# defining the prompt
# prompt = instruction + label + definition + comment + result
# instruction/label/definition/result stay fixed; only the comment changes per request
instruction = 'instructions:\n\
Please evaluate comment for evaluation on the labels provided. You can refer to the definitions of each labels below. Answer as format " , , , , , ".\n\n'

label = 'labels provided:\n\
TOXICITY, SEVERE_TOXICITY, IDENTITY_ATTACK, INSULT, PROFANITY, THREAT\n\n\
label values:\n\
1 - Matches the Definition\n\
0 - Does not match the Definition\n\n'

definition = 'definitions of each labels:\n\
TOXICITY: A rude, disrespectful, or unreasonable comment that is likely to make people leave a discussion.\n\
SEVERE_TOXICITY: A very hateful, aggressive, disrespectful comment or otherwise very likely to make a user leave a discussion or give up on sharing their perspective. This attribute is much less sensitive to more mild forms of toxicity, such as comments that include positive uses of curse words.\n\
IDENTITY_ATTACK: Negative or hateful comments targeting someone because of their identity.\n\
INSULT: Insulting, inflammatory, or negative comment towards a person or a group of people.\n\
PROFANITY: Swear words, curse words, or other obscene or profane language.\n\
THREAT: Describes an intention to inflict pain, injury, or violence against an individual or group.\n\n'

result = 'result:\n'
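
The prompt layout described in this cell (fixed instruction, label, definition, and result parts, with only the comment changing) can be sketched as a small helper. `build_prompt` is a hypothetical name that does not appear in the notebook, and the shortened placeholder strings below only stand in for the full texts defined above.

```python
def build_prompt(comment, instruction, label, definition, result="result:\n"):
    """Assemble one prompt: the fixed parts plus the comment being evaluated."""
    comment_block = "comment for evaluation:\n" + comment + "\n\n"
    return instruction + label + definition + comment_block + result


# Shortened placeholders (assumption); the notebook's real strings are much longer.
demo_prompt = build_prompt(
    "It's up 8%.",
    instruction="instructions:\n...\n\n",
    label="labels provided:\n...\n\n",
    definition="definitions of each labels:\n...\n\n",
)
print(demo_prompt)
```

Keeping the assembly in one function makes it easier to vary a single part of the prompt later without touching the request loop.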

In [7]:
for i in comments_for_eval.values:
    print(i)

GME just watching this all and doing nothing lol
Everything is down today, knee jerk reaction to something they didn’t want to hear from jpow
No, this is the Warthog
The market is not worth what you guys have been paying for it. Simple as that. Last April was closer to the truth
Didnt he sell all his shares?
What's GME opening at
ɴᴏᴡ ᴘʟᴀʏɪɴɢ: [Gwen Stefani - Hollaback Gir](https://www.youtube.com/watch?v=Kgjkth6BRRY) ─────────⚪───── ◄◄⠀[▶](https://www.youtube.com/watch?v=Kgjkth6BRRY)⠀►►⠀ 2:18 / 3:28 ⠀ ───○ 🔊 ᴴᴰ ⚙️
Let me start from the beginning, when I was a boy in Bulgaria...

*I am a bot, and this action was performed automatically. Please [contact the moderators of this subreddit](/message/compose/?to=/r/wallstreetbets) if you have any questions or concerns.*
$ASS
Tech boutta get cock slapped again
May i ask why?
Bruh even BB is green i can't believe it
Alexa, play Pump It by the Black Eyed Peas
Flip a coin on spy puts or calls
Very fair point. However I would say to this, and it’s

In [8]:
import os
import openai

# read the API key from the environment instead of hard-coding it in the notebook
# (keys start with 'sk-'; see https://beta.openai.com/account/api-keys)
openai.api_key = os.environ['OPENAI_API_KEY']

# using text-davinci-002 as the engine, as it performs better than the other available engines
# the prompt is rebuilt for each comment
# the parameters below are used for the baseline result

res = []

for each_comment in comments_for_eval.values:
    comment_full_sentence = 'comment for evaluation:\n' + each_comment + '\n\n'
    prompt_wsb = instruction + label + definition + comment_full_sentence + result
    
    response = openai.Completion.create(
      engine="text-davinci-002",
      prompt=prompt_wsb,
      temperature=0,
      max_tokens=60,
      top_p=1.0,
      frequency_penalty=0.0,
      presence_penalty=0.0
    )
    
    res.append(response)

In [9]:
# two issues may arise in the replies -> still need to figure out how to handle them
# 1) replies of the form 0,0,0,0,0,0
# 2) replies that answer only some of the labels
for i in range(len(res)):
    print(res[i]['choices'][0]['text'].split('\n'))

['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 1, 0']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 1, 0']
['0, 0, 0, 1, 1, 0']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 1, 0']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 1, 0']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 1, 0']
['0, 0, 0, 0, 1, 1']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 1, 0']
['0, 0, 0, 0, 1, 0']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 1, 0']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 1, 0']
['0, 0, 0, 1, 1, 0']
['0, 0, 0, 0, 1, 0']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 1, 1, 0']
['0, 0, 0, 0, 1, 0']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 0, 0']
['0, 0, 0, 0, 1, 0']
['0, 0, 0, 0,
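
One possible way to handle the two issues noted in this cell is a more tolerant parser. The sketch below is not part of the original notebook: `parse_labels` is a hypothetical helper, and returning `None` for incomplete replies is an assumed policy chosen so that such rows can be inspected or retried later.

```python
import re


def parse_labels(text, n_labels=6):
    """Return a list of n_labels ints (0 or 1), or None if the reply is malformed."""
    # Pick out standalone 0/1 tokens, regardless of the spacing around commas.
    tokens = re.findall(r"\b[01]\b", text)
    if len(tokens) != n_labels:
        return None  # e.g. the model answered only some of the labels
    return [int(t) for t in tokens]


print(parse_labels("0, 0, 0, 1, 1, 0"))
print(parse_labels("0,0,0,0,0,0"))
print(parse_labels("0, 0, 0, 0"))
```

Because the pattern only looks for isolated `0` and `1` tokens, it accepts both `0, 0, 0, 0, 0, 0` and `0,0,0,0,0,0`, while a reply with fewer than six values is flagged instead of raising an `IndexError`.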

In [10]:
res[120]

<OpenAIObject text_completion id=cmpl-4yT0juMFxXqdQmU682nqRiCgINDQQ at 0x7f812004cb30> JSON: {
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "text": "0, 0, 0, 0, 0, 0"
    }
  ],
  "created": 1650316545,
  "id": "cmpl-4yT0juMFxXqdQmU682nqRiCgINDQQ",
  "model": "text-davinci:002",
  "object": "text_completion"
}

In [11]:
gpt3_toxic, gpt3_sev_toxic, gpt3_identity, gpt3_insult, gpt3_profanity, gpt3_threat = [], [], [], [], [], []

for each_res in res:
    # split the reply once, e.g. '0, 0, 0, 1, 1, 0' -> ['0', '0', '0', '1', '1', '0']
    labels = each_res['choices'][0]['text'].split(', ')
    toxic_res, sev_toxic_res, identity_res, insult_res, profanity_res, threat_res = (int(v) for v in labels[:6])

    gpt3_toxic.append(toxic_res)
    gpt3_sev_toxic.append(sev_toxic_res)
    gpt3_identity.append(identity_res)
    gpt3_insult.append(insult_res)
    gpt3_profanity.append(profanity_res)
    gpt3_threat.append(threat_res)

In [12]:
# adding results into dataframe, by defining new columns

data['gpt3_toxic'] = gpt3_toxic
data['gpt3_sev_toxic'] = gpt3_sev_toxic
data['gpt3_identity'] = gpt3_identity
data['gpt3_insult'] = gpt3_insult
data['gpt3_profanity'] = gpt3_profanity
data['gpt3_threat'] = gpt3_threat

In [16]:
# data.to_csv('./result_0418.csv')

## Evaluation of the model

### With all labels predicted by GPT-3, compare the GPT-3 predictions with the human labels
<b>- Results:  
total) f1: 0.51, precision: 0.40, recall: 0.69</b>  
toxicity) f1: 0.16, precision: 1.00, recall: 0.09  
severe_toxicity) f1: 0.33, precision: 0.33, recall: 0.33  
identity_attack) f1: 0.20, precision: 1.00, recall: 0.11  
insult) f1: 0.44, precision: 0.33, recall: 0.67  
profanity) f1: 0.59, precision: 0.44, recall: 0.87  
threat) f1: NaN, precision: 0, recall: NaN

In [47]:
def f1_precision_recall(y_true, y_pred):

    # recall that f1 score = 2 * (precision * recall) / (precision + recall)
    # precision = tp / (tp + fp)
    # recall = tp / (tp + fn)
    tp, tn, fp, fn = 0, 0, 0, 0

    # zip pairs the values by position, so this works for lists and pandas Series alike
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1: tp += 1
        elif t == 0 and p == 0: tn += 1
        elif t == 0 and p == 1: fp += 1
        elif t == 1 and p == 0: fn += 1

    # an undefined ratio (0/0) is returned as NaN instead of raising ZeroDivisionError
    precision = tp / (tp + fp) if (tp + fp) > 0 else float('nan')
    recall = tp / (tp + fn) if (tp + fn) > 0 else float('nan')

    # if either ratio is NaN, f1 is NaN as well; if both are 0, f1 is 0
    denom = precision + recall
    f1 = 2 * (precision * recall) / denom if denom != 0 else 0.0

    return f1, precision, recall
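
The counting step in this function can also be written with NumPy boolean masks instead of an explicit loop. This is an alternative sketch, not the notebook's method: `confusion_counts` is a hypothetical name, and the toy labels are made up for illustration rather than taken from the dataset.

```python
import numpy as np


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary 0/1 labels using boolean masks."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    return tp, tn, fp, fn


# Toy labels (assumption), chosen so that every count is non-zero.
tp, tn, fp, fn = confusion_counts([1, 0, 1, 1, 0, 0], [1, 1, 0, 1, 0, 0])
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(tp, tn, fp, fn, precision, recall)
```

Printing the four counts for each label is also a quick way to see why a metric is undefined, for example when a label has no positive ground-truth rows.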

In [48]:
# will get f1, precision, recall, respectively
toxic_f1, toxic_precision, toxic_recall = f1_precision_recall(data['toxicity'], data['gpt3_toxic'])
sev_f1, sev_precision, sev_recall = f1_precision_recall(data['severe_toxicity'], data['gpt3_sev_toxic'])
idn_f1, idn_precision, idn_recall = f1_precision_recall(data['identity_attack'], data['gpt3_identity'])
insult_f1, insult_precision, insult_recall = f1_precision_recall(data['insult'], data['gpt3_insult'])
prof_f1, prof_precision, prof_recall = f1_precision_recall(data['profanity'], data['gpt3_profanity'])
# threat_f1, threat_precision, threat_recall = f1_precision_recall(data['threat'], data['gpt3_threat'])

In [52]:
# Note: threat is left out because threat_recall = 0/0 is undefined, while threat_precision = 0/8 = 0.0 (tp=0, tn=792, fp=8, fn=0)

print('toxicity) f1: {:.2f}, precision: {:.2f}, recall: {:.2f}'.format(toxic_f1, toxic_precision, toxic_recall))
print('severe_toxicity) f1: {:.2f}, precision: {:.2f}, recall: {:.2f}'.format(sev_f1, sev_precision, sev_recall))
print('identity_attack) f1: {:.2f}, precision: {:.2f}, recall: {:.2f}'.format(idn_f1, idn_precision, idn_recall))
print('insult) f1: {:.2f}, precision: {:.2f}, recall: {:.2f}'.format(insult_f1, insult_precision, insult_recall))
print('profanity) f1: {:.2f}, precision: {:.2f}, recall: {:.2f}'.format(prof_f1, prof_precision, prof_recall))
# print('threat) f1:{:.2f}, precision:{:.2f}, recall:{:.2f}'.format(threat_f1, threat_precision, threat_recall))

toxicity) f1: 0.16, precision: 1.00, recall: 0.09
severe_toxicity) f1: 0.33, precision: 0.33, recall: 0.33
identity_attack) f1: 0.20, precision: 1.00, recall: 0.11
insult) f1: 0.44, precision: 0.33, recall: 0.67
profanity) f1: 0.59, precision: 0.44, recall: 0.87


In [56]:
# calculating the total by pooling all six labels (a micro-average); this is not the same as averaging the per-label scores
y_true_total = pd.concat([data['toxicity'], data['severe_toxicity'], data['identity_attack'], data['insult'], \
                          data['profanity'], data['threat']], axis=0)

y_pred_total = pd.concat([data['gpt3_toxic'], data['gpt3_sev_toxic'], data['gpt3_identity'], data['gpt3_insult'], \
                          data['gpt3_profanity'], data['gpt3_threat']], axis=0)

In [67]:
total_f1, total_precision, total_recall = f1_precision_recall(list(y_true_total), list(y_pred_total))
print('total) f1: {:.2f}, precision: {:.2f}, recall: {:.2f}'.format(total_f1, total_precision, total_recall))

total) f1: 0.51, precision: 0.40, recall: 0.69
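
Concatenating all six label columns before scoring pools the counts across labels, which corresponds to a micro-average; averaging the per-label F1 scores (a macro-average) generally gives a different number. The sketch below illustrates the difference with made-up counts: `f1_from_counts` and the two labels are hypothetical and are not results from this notebook.

```python
def f1_from_counts(tp, fp, fn):
    """F1 written in terms of counts; equivalent to 2 * P * R / (P + R)."""
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical (tp, fp, fn) counts for two labels, for illustration only.
counts = {"label_a": (8, 2, 2), "label_b": (1, 3, 0)}

# Macro-average: score each label separately, then take the mean.
macro_f1 = sum(f1_from_counts(*c) for c in counts.values()) / len(counts)

# Micro-average: pool the counts across labels, then score once.
pooled_tp = sum(c[0] for c in counts.values())
pooled_fp = sum(c[1] for c in counts.values())
pooled_fn = sum(c[2] for c in counts.values())
micro_f1 = f1_from_counts(pooled_tp, pooled_fp, pooled_fn)

print("macro f1: {:.2f}, micro f1: {:.2f}".format(macro_f1, micro_f1))
```

The micro-average is dominated by labels with many positive predictions or positive ground-truth rows, so a frequent label can pull the total well above the scores of the rarer labels.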
