# Hand-Crafted Prompts Testing Playground
This notebook provides a playground for testing the hand-crafted prompting method on a selection of LLMs. It is meant to be plug-and-play: set the model ID and run the cells to generate results directly.

## Disclaimer
This notebook aims to recreate the approach of Plaza-Del-Arco, F., Nozza, D., & Hovy, D. (2023). Respectful or Toxic? Using Zero-Shot Learning with Language Models to Detect Hate Speech. Retrieved from https://aclanthology.org/2023.woah-1.6.pdf

## Instruction Fine-Tuned Model
In this section, we try the instruction fine-tuned models with their prompt template.

In [3]:
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, AutoModelForCausalLM, AutoModelForSequenceClassification
import pandas as pd
from tqdm.notebook import tqdm

### Load Model and Tokenizer

In [2]:
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")

### Setting the Prompt Template

In [3]:
prompt_template = "Classify this text as hate or non-hate. Text:"
output_indicator = "Answer:"

### Loading Test Data

In [4]:
df = pd.read_csv('./data/HSOL/preprocessed_data.csv')
df.head()

Unnamed: 0,class,tweet
0,2,as a woman you shouldnt complain about cleanin...
1,1,boy dats coldtyga dwn bad for cuffin dat hoe i...
2,1,dawg you ever fuck a bitch and she start to cr...
3,1,she look like a tranny
4,1,the shit you hear about me might be true or it...


In [5]:
labels = {
    "0": "Hate Speech",
    "1": "Offensive",
    "2": "Non-Hate"
}

### Concatenate Prompt Template to Input Samples

In [6]:
def concat_prompt_template(df_column):
    return df_column.apply(lambda x: f"{prompt_template} {x}. {output_indicator}")

In [7]:
df['input'] = concat_prompt_template(df['tweet'])
text_data = df['input'].astype("str").tolist()
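
To make the template concrete, the sketch below builds the prompt for a single string in the same way `concat_prompt_template` does for a whole column. The function name `build_prompt` and the sample text are illustrative and not part of the notebook.

```python
prompt_template = "Classify this text as hate or non-hate. Text:"
output_indicator = "Answer:"

def build_prompt(text):
    # Mirrors the f-string used in concat_prompt_template, for one sample.
    return f"{prompt_template} {text}. {output_indicator}"

print(build_prompt("have a nice day"))
# Classify this text as hate or non-hate. Text: have a nice day. Answer:
```

Note that the template always appends a period after the sample, even when the text already ends with punctuation.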

### Tokenize and Predict

In [8]:
inputs = tokenizer(text_data, return_tensors="pt", padding=True, truncation=True)
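
The cell that runs the model and produces `decoded` (used when the results file is written) is not shown in this notebook. The following is a minimal sketch of what that step might look like, assuming the `tokenizer` and `model` objects loaded earlier. The helper name `generate_in_batches`, the batch size, and the `max_new_tokens` value are assumptions, not the author's settings. It processes the texts in batches instead of one large tensor, which is one way to keep memory use bounded.

```python
def generate_in_batches(texts, tokenizer, model, batch_size=32, max_new_tokens=10):
    """Tokenize, generate, and decode `texts` one batch at a time."""
    decoded = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        # Same tokenizer arguments as the cell above.
        enc = tokenizer(batch, return_tensors="pt", padding=True, truncation=True)
        # generate() returns token ids; max_new_tokens keeps the answers short.
        output_ids = model.generate(**enc, max_new_tokens=max_new_tokens)
        # Turn the ids back into strings, dropping padding and end-of-sequence tokens.
        decoded.extend(tokenizer.batch_decode(output_ids, skip_special_tokens=True))
    return decoded

# Hypothetical usage with the objects defined earlier in the notebook:
# decoded = generate_in_batches(text_data, tokenizer, model)
```

The helper only relies on the tokenizer being callable and on `model.generate` and `tokenizer.batch_decode`, so the same loop should work for other sequence-to-sequence checkpoints loaded the same way.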

In [1]:
filename = 'flan_t5_hsol_results.csv'

In [25]:
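import csv

# Write one "output,truth" row per sample (no header row).
# csv.writer quotes outputs that contain commas, and the with-block closes the file.
with open(filename, 'w', newline='') as file_object:
    writer = csv.writer(file_object)
    for out, cls in zip(decoded, df["class"]):
        writer.writerow([out, labels[str(cls)]])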

In [26]:
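# No header row was written, so read_csv consumes the first record as the header;
# pass header=None, names=["output", "truth"] to keep that sample.
results = pd.read_csv(filename)
results.columns = ["output", "truth"]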
results.head()

Unnamed: 0,output,truth
0,Hate None,Offensive
1,Hate None-hate,Offensive
2,Non-hate,Offensive
3,Hate None-hate,Offensive
4,Hate None-hate,Offensive


### Answer Mapping

In [4]:
filename = 'flan_t5_hsol_results.csv'
results = pd.read_csv(filename)
results.columns = ["output", "truth"]
results.head()

Unnamed: 0,output,truth
0,Hate None,Offensive
1,Hate None-hate,Offensive
2,Non-hate,Offensive
3,Hate None-hate,Offensive
4,Hate None-hate,Offensive


### Qualitative Analysis
First, we want to know how many outputs are not exactly "Hate" or "Non-Hate" (ignoring case).

In [5]:
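def count_non_correct_outputs(series):
    # Count outputs that are neither "hate" nor "non-hate" (case-insensitive).
    valid = {'hate', 'non-hate'}
    return int((~series.str.lower().isin(valid)).sum())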

In [6]:
results['isCorrect'] = results['output'].apply(lambda x: x.lower() == 'hate' or x.lower() == 'non-hate')

In [7]:
results.head()

Unnamed: 0,output,truth,isCorrect
0,Hate None,Offensive,False
1,Hate None-hate,Offensive,False
2,Non-hate,Offensive,True
3,Hate None-hate,Offensive,False
4,Hate None-hate,Offensive,False


In [8]:
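# .copy() gives an independent DataFrame, so adding columns later does not trigger SettingWithCopyWarning.
correctCount = results[results.isCorrect].copy()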
len(correctCount)

10108
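
Before mapping, it can help to see which malformed outputs the model produces and how often. The sketch below is a minimal example in plain Python. `summarize_outputs` and `VALID_ANSWERS` are hypothetical names, and the sample list simply repeats the five outputs shown by `results.head()` above.

```python
from collections import Counter

VALID_ANSWERS = {"hate", "non-hate"}

def summarize_outputs(outputs):
    """Return (share of valid answers, Counter of invalid normalized outputs)."""
    normalized = [str(o).strip().lower() for o in outputs]
    invalid = Counter(o for o in normalized if o not in VALID_ANSWERS)
    valid_share = 1 - sum(invalid.values()) / len(normalized) if normalized else 0.0
    return valid_share, invalid

# The five outputs shown by results.head() above.
sample = ["Hate None", "Hate None-hate", "Non-hate", "Hate None-hate", "Hate None-hate"]
share, invalid = summarize_outputs(sample)
print(round(share, 2))         # 0.2
print(invalid.most_common(1))  # [('hate none-hate', 3)]
```

On the full data this could be called as `summarize_outputs(results['output'].tolist())`.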

Almost half of the samples have a well-formed output, so we start by mapping those.

In [9]:
correctCount.head()

Unnamed: 0,output,truth,isCorrect
2,Non-hate,Offensive,True
6,Non-hate,Offensive,True
8,Non-hate,Offensive,True
9,Non-hate,Offensive,True
11,Non-hate,Offensive,True


Convert all "Offensive" truth labels to "Non-hate".

In [10]:
correctCount['truthModified'] = correctCount['truth'].apply(lambda x: 'Non-hate' if x.lower() == 'offensive' else x)



In [11]:
correctCount.head()

Unnamed: 0,output,truth,isCorrect,truthModified
2,Non-hate,Offensive,True,Non-hate
6,Non-hate,Offensive,True,Non-hate
8,Non-hate,Offensive,True,Non-hate
9,Non-hate,Offensive,True,Non-hate
11,Non-hate,Offensive,True,Non-hate


Now we count the outputs that match the modified truth label.

In [12]:
correctCount['isCorrectAns'] = correctCount.apply(lambda x: x.output.lower() == x.truthModified.lower(), axis=1)
correctAnsCount = len(correctCount[correctCount.isCorrectAns == True])
correctAnsCount



9648

### Mapping the answer classes to either 1 or 0

In [13]:
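def mapAnswers(answer):
    # Only the exact strings "non-hate" and "hate" are recognised; any other
    # value (including the truth label "Hate Speech") maps to None.
    if answer.lower() == 'non-hate':
        return 0
    elif answer.lower() == 'hate':
        return 1
    else:
        return None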

correctCount['outputLabel'] = correctCount['output'].apply(mapAnswers)
correctCount['truthLabel'] = correctCount['truthModified'].apply(mapAnswers)



In [14]:
correctCount['truthLabel']

2        0.0
6        0.0
8        0.0
9        0.0
11       0.0
        ... 
23990    0.0
23993    0.0
23994    0.0
23995    0.0
23996    0.0
Name: truthLabel, Length: 10108, dtype: float64

In [15]:
correctCount['outputLabel']

2        0
6        0
8        0
9        0
11       0
        ..
23990    0
23993    0
23994    0
23995    0
23996    0
Name: outputLabel, Length: 10108, dtype: int64

In [16]:
correctCount = correctCount.dropna()

In [17]:
len(correctCount)

9648

In [19]:
from sklearn.metrics import f1_score

In [21]:
score = f1_score(correctCount['truthLabel'], correctCount['outputLabel'], labels=[0, 1])

  _warn_prf(average, "true nor predicted", "F-score is", len(true_sum))


In [22]:
score

0.0
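
The score of 0.0 is most likely an artifact of the label mapping and not a measure of model quality. `mapAnswers` matches only the exact strings "hate" and "non-hate", so the truth label "Hate Speech" maps to `None`, those rows are removed by `dropna()` (10108 rows become 9648), and `truthLabel` is left with a single class. With no positive samples the binary F1 is undefined, which is what the `_warn_prf` warning fragment above refers to. The sketch below shows one possible mapping that keeps the positive class. `map_answer` is a hypothetical helper, and folding "Offensive" into the negative class follows the conversion step earlier in this notebook.

```python
def map_answer(answer):
    """Map a label string to 1 (hate), 0 (non-hate), or None (unrecognised)."""
    key = str(answer).strip().lower()
    if key in {"non-hate", "offensive"}:
        return 0
    if key in {"hate", "hate speech"}:
        return 1
    return None

# Truth labels from the `labels` dict, plus one valid and one malformed model output.
for label in ["Hate Speech", "Offensive", "Non-Hate", "Hate", "Hate None-hate"]:
    print(label, "->", map_answer(label))
```

Re-running the evaluation with a mapping like this would change the row counts and the score shown above, so those outputs would need to be regenerated.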