## Using GPT-3 to classify labels (yp2201)

### Description:
- This notebook uses the GPT-3 API to classify comments against toxicity labels.
- The data comes from r/wallstreetbets, and the classification focuses on 6 labels.
- Labels (6): TOXICITY, SEVERE_TOXICITY, IDENTITY_ATTACK, INSULT, PROFANITY, THREAT
- GPT-3 can be found at https://beta.openai.com/

### Version:
#### - ver 0.1.0(220418): 
- baseline results using only the comment under evaluation
- classifies 800 comments from r/wallstreetbets

#### - ver ///(TBD): 
- results for the comment under evaluation, using the preceding/following comments as context
- using fine-tuning methods such as: ///
- classifies 800+ comments from r/wallstreetbets



In [None]:
# Make sure openai is installed. If not, uncomment the line below and run it.
# pip install openai

In [2]:
# loading dataset
import numpy as np
import pandas as pd

# change this to the path of your local copy of the CSV file
data = pd.read_csv('/Users/yoontaepark/nlu-reddit-toxicity-dataset/data/labelled_master_data_2022-05-11.csv')
print('shape of dataset:', data.shape)
data.head(3)

shape of dataset: (800, 11)


,Unnamed: 0,example_id,preceding_comment,comment_for_evaluation,following_comment,toxicity,severe_toxicity,identity_attack,insult,profanity,threat
0,6,275362,Which one of u bought my stocks? >:(,posted this in the other thread but the more i...,That thing is 1.64% rn. JPow has to say somet...,0.0,0.0,0.0,0.0,1.0,0.0
1,12,30108,Alexa play down by 311.,No one cares about 10 year bonds bro - thats j...,Sorry you were born the bad kind of retarded.,0.0,0.0,0.0,0.0,1.0,0.0
2,21,316461,Today = no Vaseline,The entire market is fucking hemorrhaging,Imagine putting your money in bonds lol,0.0,0.0,0.0,0.0,1.0,0.0


In [3]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 800 entries, 0 to 799
Data columns (total 11 columns):
 #   Column                  Non-Null Count  Dtype  
---  ------                  --------------  -----  
 0   Unnamed: 0              800 non-null    int64  
 1   example_id              800 non-null    int64  
 2   preceding_comment       800 non-null    object 
 3   comment_for_evaluation  800 non-null    object 
 4   following_comment       800 non-null    object 
 5   toxicity                800 non-null    float64
 6   severe_toxicity         800 non-null    float64
 7   identity_attack         800 non-null    float64
 8   insult                  800 non-null    float64
 9   profanity               800 non-null    float64
 10  threat                  800 non-null    float64
dtypes: float64(6), int64(2), object(3)
memory usage: 68.9+ KB


In [4]:
data.drop(['Unnamed: 0'], axis=1, inplace=True)

In [5]:
data.head(3)

,example_id,preceding_comment,comment_for_evaluation,following_comment,toxicity,severe_toxicity,identity_attack,insult,profanity,threat
0,275362,Which one of u bought my stocks? >:(,posted this in the other thread but the more i...,That thing is 1.64% rn. JPow has to say somet...,0.0,0.0,0.0,0.0,1.0,0.0
1,30108,Alexa play down by 311.,No one cares about 10 year bonds bro - thats j...,Sorry you were born the bad kind of retarded.,0.0,0.0,0.0,0.0,1.0,0.0
2,316461,Today = no Vaseline,The entire market is fucking hemorrhaging,Imagine putting your money in bonds lol,0.0,0.0,0.0,0.0,1.0,0.0


In [6]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 800 entries, 0 to 799
Data columns (total 10 columns):
 #   Column                  Non-Null Count  Dtype  
---  ------                  --------------  -----  
 0   example_id              800 non-null    int64  
 1   preceding_comment       800 non-null    object 
 2   comment_for_evaluation  800 non-null    object 
 3   following_comment       800 non-null    object 
 4   toxicity                800 non-null    float64
 5   severe_toxicity         800 non-null    float64
 6   identity_attack         800 non-null    float64
 7   insult                  800 non-null    float64
 8   profanity               800 non-null    float64
 9   threat                  800 non-null    float64
dtypes: float64(6), int64(1), object(3)
memory usage: 62.6+ KB


In [7]:
len_total = 0
for comment in data['comment_for_evaluation'].str.split(' '):
    len_total += len(comment)

print('Approximate total tokens: ', len_total)
print('Average tokens per comment: ', len_total / len(data['comment_for_evaluation']))

Approximate total tokens:  11009
Average tokens per comment:  13.76125
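The count above splits on spaces, so it counts words rather than model tokens. As a rough cross-check, the sketch below compares that word count with a character-based estimate. The figure of about 4 characters per token is a commonly quoted rule of thumb for English text, not the output of a real tokenizer, and the helper names `word_count` and `approx_tokens` are made up for this illustration.

```python
import math


def word_count(text):
    """Count space-separated words, as in the cell above."""
    return len(text.split(' '))


def approx_tokens(text, chars_per_token=4):
    """Rough token estimate from character length (assumed ~4 chars per token)."""
    return math.ceil(len(text) / chars_per_token)


sample = 'Imagine putting your money in bonds lol'
print('words:', word_count(sample))
print('approx tokens:', approx_tokens(sample))
```

Either number is only an estimate. The fixed parts of the prompt (instruction, label names, definitions) are sent with every request and count toward usage as well.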


In [8]:
data['comment_for_evaluation'][20]

"We're all one senate covid relief bill failed vote (couch Joe Manchin cough) from getting ass fucked"

In [24]:
# defining prompt
# prompt = instruction + label_name + label_values + definition + comments + answer
# only the comments change between requests; the other parts stay fixed
instruction = "instruction:\nEvaluate list of comments on the labels provided. Refer to the definitions of each labels below. Answer as format [' , , , , , '], [' , , , , , '], \n\n"
label_name = 'label provided:\nTOXICITY, SEVERE_TOXICITY, IDENTITY_ATTACK, INSULT, PROFANITY, THREAT\n\n'
label_values = 'label values:\n1 - Matches the Definition\n0 - Does not match the Definition\n\n'
definition = 'definitions of each labels: \nTOXICITY: A rude, disrespectful, or unreasonable comment that is likely to make people leave a discussion. \nSEVERE_TOXICITY: A very hateful, aggressive, disrespectful comment or otherwise very likely to make a user leave a discussion or give up on sharing their perspective. This attribute is much less sensitive to more mild forms of toxicity, such as comments that include positive uses of curse words. \nIDENTITY_ATTACK: Negative or hateful comments targeting someone because of their identity. \nINSULT: Insulting, inflammatory, or negative comment towards a person or a group of people. \nPROFANITY: Swear words, curse words, or other obscene or profane language. \nTHREAT: Describes an intention to inflict pain, injury, or violence against an individual or group. \n\n'
answer = 'answers: \n'
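Since only the comments change between requests, the prompt assembly can be wrapped in a small function. This is a minimal sketch: `build_prompt` is a hypothetical helper, and the short strings passed to it below are placeholders standing in for the full prompt parts defined above.

```python
def build_prompt(comments, instruction, label_name, label_values, definition, answer):
    """Join the fixed prompt parts with a bulleted list of comments."""
    comments_list = ''.join('- ' + comment + '\n' for comment in comments)
    comment_full_sentence = 'list of comments:\n' + comments_list + '\n'
    return (instruction + label_name + label_values + definition
            + comment_full_sentence + answer)


# placeholder parts for illustration only; use the real strings in practice
demo_prompt = build_prompt(
    ['first comment', 'second comment'],
    instruction='instruction:\n(placeholder)\n\n',
    label_name='label provided:\n(placeholder)\n\n',
    label_values='label values:\n(placeholder)\n\n',
    definition='definitions of each labels: \n(placeholder)\n\n',
    answer='answers: \n',
)
print(demo_prompt)
```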

In [25]:
import os
import openai

# set your API key in the OPENAI_API_KEY environment variable
# (keys start with 'sk-'; see https://beta.openai.com/account/api-keys)
# never hard-code a key in a notebook that will be shared
openai.api_key = os.getenv('OPENAI_API_KEY')

# using text-davinci-002 as the engine, as it has better performance among available engines
# replace the prompt with your own
# the parameters below are used for the baseline result

# batch_idx = [100, 200, 300, 400, 500, 600, 700, 800]
batch_idx = [5, 10]

res = []

for ith_batch in batch_idx: 
    comments_list = ''
#     for idx, each_comment in enumerate(data['comment_for_evaluation'][ith_batch-10:ith_batch].values):
#         comments_list += str(idx+1) + '. ' + each_comment + '\n'
    for each_comment in data['comment_for_evaluation'][ith_batch-5:ith_batch].values:
        comments_list += '- ' + each_comment + '\n'
#     print(comments_list)
        
    comment_full_sentence = 'list of comments:\n' + comments_list + '\n'
    prompt_wsb = instruction + label_name + label_values + definition + comment_full_sentence + answer
    
    response = openai.Completion.create(
      engine="text-davinci-002",
      prompt=prompt_wsb,
      temperature=0,
      max_tokens=2000,
      top_p=1.0,
      frequency_penalty=0.0,
      presence_penalty=0.0
    )
    
    # append result into a new list
    res.append(response)
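The loop above hard-codes the batch size twice, once in `batch_idx` and once in the `ith_batch-5` slice, so the two can drift apart. A possible alternative, sketched below, derives the slices from one `batch_size` value. `iter_batches` is a hypothetical helper and the list of strings stands in for `data['comment_for_evaluation'].values`.

```python
def iter_batches(items, batch_size):
    """Yield (start_index, batch) pairs covering items in order."""
    for start in range(0, len(items), batch_size):
        yield start, items[start:start + batch_size]


# stand-in for data['comment_for_evaluation'].values
comments = ['comment %d' % i for i in range(12)]

for start, batch in iter_batches(comments, batch_size=5):
    print(start, len(batch))
```

The last batch may be shorter than `batch_size`, which matters when matching answers back to comments.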
    

In [26]:
resres = []

for i in range(len(res)):
    resres.extend(res[i]['choices'][0]['text'].split('\n'))

In [27]:
resres

["['0', '0', '0', '0', '0', '0'], ",
 "['0', '0', '0', '1', '0', '0'], ",
 "['0', '0', '0', '0', '0', '0'], ",
 "['0', '0', '0', '0', '0', '0'], ",
 "['0', '0', '0', '0', '0', '0']",
 "['0', '1', '0', '0', '1', '0'], ['0', '0', '0', '1', '0', '0'], ['0', '0', '0', '0', '0', '0']"]

### Trial and error for prompt setting

### Placing multiple comments for evaluation

In [42]:
resres = []

for i in range(len(res)):
    print(res[i]['choices'][0]['text'].split('\n'))
    resres.extend(res[i]['choices'][0]['text'].split('\n'))

['0, 0, 1, 1, 1, 0, 0, 0, 1, 0']
['0, 1, 1, 0, 0, 0, ', '0, 0, 1, 1, 0, 0, ', '0, 0, 0, 0, 0, 0, ', '0, 0, 0, 0, 0, 0, ', '1, 0, 0, 0, 0, 1']
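The outputs so far come in two shapes: bracketed lists such as `['0', '0', ...]`, sometimes several on one line, and bare comma-separated values with one row per line. The sketch below is one way to turn both shapes into lists of integers. `parse_label_rows` is a hypothetical helper that assumes every label value is a single `0` or `1`; it does not check how many values each row holds.

```python
import re


def parse_label_rows(text):
    """Parse a completion text into a list of rows of 0/1 integers."""
    rows = []
    for line in text.split('\n'):
        line = line.strip()
        if not line:
            continue
        # bracketed format: one or more [...] groups per line
        groups = re.findall(r'\[([^\]]*)\]', line)
        if not groups:
            # bare format: the whole line is one row
            groups = [line]
        for group in groups:
            rows.append([int(d) for d in re.findall(r'\b[01]\b', group)])
    return rows


bracketed = "['0', '1', '0', '0', '1', '0'], ['0', '0', '0', '1', '0', '0']"
bare = '0, 1, 1, 0, 0, 0, \n0, 0, 1, 1, 0, 0, '
print(parse_label_rows(bracketed))
print(parse_label_rows(bare))
```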


In [None]:
# two issues may arise -> need to figure out how to fix
# 1) answering 0,0,0,0,0,0
# 2) answering only some of the labels
for i in range(len(res)):
    print(res[i]['choices'][0]['text'].split('\n'))

In [None]:
res[120]

In [None]:
gpt3_toxic, gpt3_sev_toxic, gpt3_identity, gpt3_insult, gpt3_profanity, gpt3_threat = [], [], [], [], [], []

for each_res in res:
    # each response is expected to hold six comma-separated 0/1 values for one comment
    values = [int(v) for v in each_res['choices'][0]['text'].split(', ')[:6]]
    toxic_res, sev_toxic_res, identity_res, insult_res, profanity_res, threat_res = values

    gpt3_toxic.append(toxic_res)
    gpt3_sev_toxic.append(sev_toxic_res)
    gpt3_identity.append(identity_res)
    gpt3_insult.append(insult_res)
    gpt3_profanity.append(profanity_res)
    gpt3_threat.append(threat_res)

In [None]:
# adding results into dataframe, by defining new columns

data['gpt3_toxic'] = gpt3_toxic
data['gpt3_sev_toxic'] = gpt3_sev_toxic
data['gpt3_identity'] = gpt3_identity
data['gpt3_insult'] = gpt3_insult
data['gpt3_profanity'] = gpt3_profanity
data['gpt3_threat'] = gpt3_threat

In [None]:
# data.to_csv('./result_0418.csv')

## Evaluation of the model

### With all labels predicted by GPT-3, compare GPT-3 with the human labels
<b>- Results:  
total) f1: 0.51, precision: 0.40, recall: 0.69</b>  
toxicity) f1: 0.16, precision: 1.00, recall: 0.09  
severe_toxicity) f1: 0.33, precision: 0.33, recall: 0.33  
identity_attack) f1: 0.20, precision: 1.00, recall: 0.11  
insult) f1: 0.44, precision: 0.33, recall: 0.67  
profanity) f1: 0.59, precision: 0.44, recall: 0.87  
threat) f1: NaN, precision: 0, recall: NaN

In [None]:
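def f1_precision_recall(y_true, y_pred):

    # recall that f1 score = 2 * (precision * recall) / (precision + recall)
    # precision = tp / (tp + fp)
    # recall = tp / (tp + fn)
    tp, tn, fp, fn = 0, 0, 0, 0

    # iterate by position so that lists and pandas Series behave the same
    for true, pred in zip(y_true, y_pred):
        if true == 1 and pred == 1: tp += 1
        elif true == 0 and pred == 0: tn += 1
        elif true == 0 and pred == 1: fp += 1
        elif true == 1 and pred == 0: fn += 1

    # return NaN instead of raising ZeroDivisionError when a denominator is zero
    precision = tp / (tp + fp) if (tp + fp) > 0 else float('nan')
    recall = tp / (tp + fn) if (tp + fn) > 0 else float('nan')

    # comparisons with NaN are False, so f1 is NaN whenever precision or recall is NaN
    if precision + recall > 0:
        f1 = 2 * (precision * recall) / (precision + recall)
    else:
        f1 = float('nan')

    return f1, precision, recall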

In [None]:
# will get f1, precision, recall, respectively
toxic_f1, toxic_precision, toxic_recall = f1_precision_recall(data['toxicity'], data['gpt3_toxic'])
sev_f1, sev_precision, sev_recall = f1_precision_recall(data['severe_toxicity'], data['gpt3_sev_toxic'])
idn_f1, idn_precision, idn_recall = f1_precision_recall(data['identity_attack'], data['gpt3_identity'])
insult_f1, insult_precision, insult_recall = f1_precision_recall(data['insult'], data['gpt3_insult'])
prof_f1, prof_precision, prof_recall = f1_precision_recall(data['profanity'], data['gpt3_profanity'])
# threat_f1, threat_precision, threat_recall = f1_precision_recall(data['threat'], data['gpt3_threat'])

In [None]:
# Note: threat is left out because threat_recall = 0/0 is undefined, while threat_precision = 0/8 = 0.0 (tp=0, tn=792, fp=8, fn=0)

print('toxicity) f1: {:.2f}, precision: {:.2f}, recall: {:.2f}'.format(toxic_f1, toxic_precision, toxic_recall))
print('severe_toxicity) f1: {:.2f}, precision: {:.2f}, recall: {:.2f}'.format(sev_f1, sev_precision, sev_recall))
print('identity_attack) f1: {:.2f}, precision: {:.2f}, recall: {:.2f}'.format(idn_f1, idn_precision, idn_recall))
print('insult) f1: {:.2f}, precision: {:.2f}, recall: {:.2f}'.format(insult_f1, insult_precision, insult_recall))
print('profanity) f1: {:.2f}, precision: {:.2f}, recall: {:.2f}'.format(prof_f1, prof_precision, prof_recall))
# print('threat) f1:{:.2f}, precision:{:.2f}, recall:{:.2f}'.format(threat_f1, threat_precision, threat_recall))

In [None]:
# calculating the total by pooling all six labels (a micro-average); this generally differs from averaging the per-label scores
y_true_total = pd.concat([data['toxicity'], data['severe_toxicity'], data['identity_attack'], data['insult'], \
                          data['profanity'], data['threat']], axis=0)

y_pred_total = pd.concat([data['gpt3_toxic'], data['gpt3_sev_toxic'], data['gpt3_identity'], data['gpt3_insult'], \
                          data['gpt3_profanity'], data['gpt3_threat']], axis=0)

In [None]:
total_f1, total_precision, total_recall = f1_precision_recall(list(y_true_total), list(y_pred_total))
print('total) f1: {:.2f}, precision: {:.2f}, recall: {:.2f}'.format(total_f1, total_precision, total_recall))
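The total above pools every label into one long vector before scoring, which is a micro-average. Averaging the per-label scores instead, a macro-average, usually gives a different number even when every label has the same number of rows. The sketch below illustrates this for precision with made-up counts. The `(tp, fp)` values are hypothetical and are not taken from the results in this notebook.

```python
def precision_from_counts(tp, fp):
    """Precision from true-positive and false-positive counts; NaN if undefined."""
    return tp / (tp + fp) if (tp + fp) > 0 else float('nan')


# hypothetical (tp, fp) counts for two labels
counts = {'label_a': (1, 0), 'label_b': (4, 6)}

# micro-average: pool the counts, then compute precision once
pooled_tp = sum(tp for tp, fp in counts.values())
pooled_fp = sum(fp for tp, fp in counts.values())
micro_precision = precision_from_counts(pooled_tp, pooled_fp)

# macro-average: compute precision per label, then take the mean
per_label = [precision_from_counts(tp, fp) for tp, fp in counts.values()]
macro_precision = sum(per_label) / len(per_label)

print('micro: {:.2f}, macro: {:.2f}'.format(micro_precision, macro_precision))
```

Labels with few predicted positives pull the macro-average much more than the micro-average, so it helps to state which one a reported total refers to.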