## Example Selection

In [None]:
import pandas as pd
import json

label_column = 'Class'


# Load the training split and record each comment's length in characters.
train_df = pd.read_csv('../../Tasks/conan/train.csv')
train_df['length'] = train_df['text'].str.len()

# Build the few-shot block: three short (< 180 characters) examples per class.
context = ""
short_df = train_df[train_df['length'] < 180]
for j in train_df[label_column].unique():
	print(f'Class: {j}')
	sample = short_df[short_df[label_column] == j].sample(3)
	for index, row in sample.iterrows():
		print(f'  {row["text"]}')
		context += f'<text_icl_begin> CONTEXT: {row["context"]}\nCOMMENT: {row["text"]} <text_icl_end>\nLABEL: {j}\n'

# Prepend the task instruction to the examples and save the full prompt to a JSON file.
context = f"You are an expert in social psychology.\nAlways read the CONTEXT before the COMMENT and decide one of: facts · support · denouncing · hypocrisy · unrelated · humor · question. Reply with the single label on its own line — no extra words.\n\n### EXAMPLES\n{context}### END EXAMPLES"
with open('../icl_promtps/conan.json', 'w') as f:
	json.dump([context], f, indent=4)

Class: facts
  So all the Muslim doctors, business owners, servicemen/women, taxi drivers, chefs, etc. That I've encountered must be part of that 1% then?
  No religion encourages violence towards others. I do not think it is a muslim issue at all.
  The quran teaches justice and that criminals should be treated justly and be given the chance to reform.
Class: support
  In the swimming pool there are baths only for Muslims, I cannot enter. If I, a white men, asked for something like this nobody would allow me. Muslims can do anything they want.
  If a girl has muslim parents, she is at risk of FGM!
  Our country wants to shut up those who speak against islam, but it isn't right, they are the only ones who expose themselves.
Class: denouncing
  How can you say such hateful things about an entire faith?
  An entire faith of 1.6 billion people is not a cancer.
  That is quite a big generalisation, and obviously a false and discriminatory one too.
Class: hypocrisy
  When you say 'cult' you
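
The label list in the instruction string above is typed by hand, so it can drift from the labels actually present in `train.csv`. A minimal sketch of the same prompt layout as small functions, plus a check for labels the instruction never names. The helper names `format_example`, `build_prompt` and `missing_labels` are hypothetical and not part of the notebook, and the instruction text in the usage lines is a shortened stand-in.

```python
def format_example(context, comment, label):
    # Same layout as the few-shot examples built in the cell above.
    return (
        f"<text_icl_begin> CONTEXT: {context}\n"
        f"COMMENT: {comment} <text_icl_end>\n"
        f"LABEL: {label}\n"
    )


def build_prompt(instruction, examples):
    # `examples` is a list of (context, comment, label) tuples.
    body = "".join(format_example(*ex) for ex in examples)
    return f"{instruction}\n\n### EXAMPLES\n{body}### END EXAMPLES"


def missing_labels(instruction, labels):
    # Labels that occur in the data but are never named in the instruction.
    return [lab for lab in labels if lab not in instruction]


# Usage with made-up values (not taken from the dataset).
instruction = "Pick one of: facts, humor."
prompt = build_prompt(instruction, [("c1", "t1", "facts")])
print(prompt)
print(missing_labels(instruction, ["facts", "humor", "question"]))
```

Running `missing_labels` against `train_df[label_column].unique()` before saving the prompt would flag any class the model is never told about.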

### Prompt the GPT-4o model and save results for each run

In [1]:
import json

# Load the saved ICL prompt (the JSON file holds a single-element list).
with open('../icl_promtps/conan.json', 'r') as f:
	data = json.load(f)[0]

In [None]:

import os
import sys
import traceback

import dotenv
import openai
import pandas as pd
from tqdm import tqdm

sys.path.append('../..')

# Read OPENAI_API_KEY from a local .env file, if present.
dotenv.load_dotenv()
client = openai.OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

def get_response(message):
    # Send one prompt to the model and return the raw text of its reply.
    response = client.chat.completions.create(
        model="chatgpt-4o-latest",
        # logprobs=True,
        messages=[
            {"role": "system", "content": "You are an expert in social psychology."},
            {"role": "user", "content": message},
        ],
        max_tokens=10,
    )
    return response.choices[0].message.content


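def get_inference(message, label_set):
	# Retry up to 6 times until the reply is a valid label; otherwise return "unknown".
	pred = "unknown"
	for i in range(6):
		try:
			z = get_response(message).strip().lower()
			if z in label_set:
				# Store the normalized reply so it matches the gold label exactly.
				pred = z
				break
		except Exception:
			print(traceback.format_exc())

	return pred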

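df = pd.read_csv('../../Tasks/conan/test.csv')
df = df.dropna(subset=['text'])
label_column = 'Class'
label_set = list(df[label_column].unique())  # valid labels, computed once

os.makedirs('outputs/conan', exist_ok=True)

# Repeat the full test-set pass 4 times; each run is saved to its own file.
for i in range(4):

	for index, row in tqdm(df.iterrows(), total=len(df)):

		# The few-shot examples name this field COMMENT; the query names it TEXT.
		prompt = f"{data}\n\nCONTEXT: {row['context']}\nTEXT:{row['text']}\nLABEL:"

		response = get_inference(prompt, label_set=label_set)
		df.at[index, 'predicted_label'] = response

	df.to_csv(f'outputs/conan/test_{i}.csv', index=False)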
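
The retry loop accepts a reply only when it matches a label exactly after stripping and lowercasing, so a reply such as `Facts.` costs an extra API call. A possible, more tolerant normalization step is sketched below. The helper `normalize_label` is hypothetical (not in the notebook), and whether trailing punctuation actually occurs in the model's replies is an assumption.

```python
import string


def normalize_label(reply, label_set):
    """Return the matching label from label_set, or None if the reply is not a valid label."""
    if reply is None:
        return None
    # Keep only the first non-empty line of the reply.
    lines = [ln.strip() for ln in reply.splitlines() if ln.strip()]
    if not lines:
        return None
    # Drop surrounding punctuation and spaces, then compare case-insensitively.
    candidate = lines[0].strip(string.punctuation + " ").lower()
    lookup = {lab.lower(): lab for lab in label_set}
    return lookup.get(candidate)


labels = ["facts", "humor"]
print(normalize_label(" Facts.\n", labels))
print(normalize_label("I think humor", labels))
```

Replies that contain extra words still return `None`, so the "single label only" requirement from the prompt is preserved.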

In [1]:
from sklearn.metrics import f1_score
import pandas as pd
import numpy as np

# Macro-F1 for each of the 4 saved runs.
f1_scores = []
for i in range(4):
	df = pd.read_csv(f'outputs/conan/test_{i}.csv')
	f1_scores.append(f1_score(df['Class'], df['predicted_label'], average='macro'))
	
print('Avg', np.mean(f1_scores))
print('Std', np.std(f1_scores))

Avg 0.762594667811815
Std 0.006665365445820801
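
One caveat when reading these numbers: with `average='macro'` and no `labels` argument, `f1_score` averages over every label that appears in either the gold or the predicted column. Any row where `get_inference` fell back to `"unknown"` therefore adds an extra class with F1 of 0. Whether such rows exist in the saved runs is not shown here. The pure-Python sketch below illustrates the effect on toy data; `macro_f1` is a hypothetical stand-in written for illustration, not the scikit-learn implementation.

```python
def macro_f1(y_true, y_pred, labels):
    # Unweighted mean of per-class F1 over the given label list.
    scores = []
    for lab in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == lab and p == lab for t, p in pairs)
        fp = sum(t != lab and p == lab for t, p in pairs)
        fn = sum(t == lab and p != lab for t, p in pairs)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted instances scores 0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data, not taken from the experiment.
y_true = ["facts", "facts", "humor", "humor"]
y_pred = ["facts", "unknown", "humor", "humor"]

gold_only = macro_f1(y_true, y_pred, labels=["facts", "humor"])
with_unknown = macro_f1(y_true, y_pred, labels=["facts", "humor", "unknown"])
print(gold_only, with_unknown)
```

Passing `labels=` with the gold label list to `f1_score` restricts the average to the real classes, if that is the intended metric.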
