# Explore Classifier Results

This notebook explores the results from each of our sameness classification methods. 

In [5]:
import pandas as pd
from vis import describe_results

In [6]:
RESULT_DFS = [
    pd.read_csv('data/outputs/gpt_4o_mini_no_examples_eval.csv'),
    pd.read_csv('data/outputs/gpt_35_turbo_no_examples_eval.csv'),
    pd.read_csv('data/outputs/semantic_similarity_eval.csv'),
    pd.read_csv('data/outputs/second_half_similarity_eval.csv'),
    pd.read_csv('data/outputs/tf_idf_similarity_eval.csv')
]

# Labels for the frames in RESULT_DFS, listed in the same order.
MODELS_USED = [
    'gpt_4o_mini_no_examples',
    'gpt_35_turbo_no_examples',
    'semantic_similarity_threshold88',
    'second_half_similarity_threshold88',
    'tf_idf_similarity_threshold88'
]

metrics_df = describe_results(
    dfs=RESULT_DFS,
    models_used=MODELS_USED, 
    y_true_col='classification', 
    y_pred_col='pred'
)

metrics_df.to_csv('data/outputs/results.csv')

metrics_df

Unnamed: 0,model_used,precision,recall,f1
0,gpt_4o_mini_no_examples,0.870968,0.482143,0.62069
1,gpt_35_turbo_no_examples,0.708861,1.0,0.82963
2,semantic_similarity_threshold88,0.428571,0.75,0.545455
3,second_half_similarity_threshold88,0.42268,0.732143,0.535948
4,tf_idf_similarity_threshold88,0.703704,0.678571,0.690909
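
The `describe_results` helper comes from the local `vis` module, and its implementation is not shown in this notebook. As a rough guide to what the three columns mean, here is a minimal sketch that assumes the standard binary definitions with `1`/`True` as the positive class. The function name `precision_recall_f1` and the toy labels are hypothetical and are not taken from `vis` or from the cooperative data.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, f1) for binary labels; truthy means 'same entity'."""
    pairs = [(bool(t), bool(p)) for t, p in zip(y_true, y_pred)]
    tp = sum(t and p for t, p in pairs)          # predicted same, truly same
    fp = sum((not t) and p for t, p in pairs)    # predicted same, truly different
    fn = sum(t and (not p) for t, p in pairs)    # predicted different, truly same
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy labels for illustration only (not the cooperative data).
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [True, True, False, False, True, False]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

Read this way, high precision with low recall means a method rarely declares a false match but misses many true ones, while perfect recall with lower precision means it finds every true match at the cost of extra false matches.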


We used five methods to classify whether each pair of records describes the same entity. The first two methods are LLMs that take the cooperative name and abbreviation as input:

1. GPT-4o mini, prompted without any examples directly relating to the data (uses name and abbreviation for all cooperatives).
2. GPT-3.5 Turbo, prompted without any examples directly relating to the data (uses name and abbreviation for all cooperatives).

The remaining three use cosine similarity on vector representations of the names only (embeddings or TF-IDF vectors), and classify a pair as True only when its similarity is at the 88th percentile or higher:

3. Semantic similarity with paraphrase-MiniLM-L6-v2. 
4. Semantic similarity of the second halves of each name with paraphrase-MiniLM-L6-v2.
5. TF-IDF vector similarity.
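
The similarity-based methods share one pattern: score each pair with cosine similarity, then keep only the highest-scoring pairs. The sketch below illustrates that pattern under two assumptions that the notebook does not confirm: the cutoff is the 88th percentile of the score distribution itself, and plain token counts stand in for the real embedding or TF-IDF vectors. All function names are hypothetical, and the name pairs are arbitrary pairings chosen for illustration.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dict-like objects."""
    dot = sum(a[k] * b.get(k, 0.0) for k in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def percentile(values, q):
    """q-th percentile (0-100) of a non-empty list, with linear interpolation."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def classify_by_percentile(scores, q=88):
    """Flag a pair as True when its score is at or above the q-th percentile."""
    cutoff = percentile(scores, q)
    return [s >= cutoff for s in scores]


# Illustrative pairs; token counts are a stand-in for real vectors (assumption).
pairs = [
    ("societe cooperative agricole de soubre", "societe cooperative agricole de soubre"),
    ("societe cooperative agricole de guitry", "societe cooperative agricole source de guitry"),
    ("societe cooperative agricole de youkou", "cooperative agricole des producteurs de divo"),
    ("societe cooperative progres", "entreprise cooperative agricole de mogokeledougou"),
]
scores = [cosine_similarity(Counter(x.split()), Counter(y.split())) for x, y in pairs]
preds = classify_by_percentile(scores)
print(list(zip([round(s, 3) for s in scores], preds)))
```

A percentile cutoff fixes the share of pairs labeled True (roughly the top 12%) regardless of how the scores are distributed, which is one reason methods with different score scales can be compared under the same rule.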

GPT-4o mini had the best precision (0.871), while GPT-3.5 Turbo had perfect recall and the best F1 score (0.830). Notably, TF-IDF similarity, the least expensive method, nearly matched GPT-3.5 Turbo in precision (0.704 vs. 0.709) and beat GPT-4o mini in both recall (0.679 vs. 0.482) and F1 (0.691 vs. 0.621).

## What did the LLMs get wrong?

In [12]:
gpt_4o_mini_no_examples_result = RESULT_DFS[0]
# Keep only the rows where the prediction disagrees with the true label.
is_wrong = gpt_4o_mini_no_examples_result['classification'] != gpt_4o_mini_no_examples_result['pred']
wrong = gpt_4o_mini_no_examples_result[is_wrong]
print(wrong.shape)
wrong

(33, 7)


Unnamed: 0.1,Unnamed: 0,Producer Name_x,Producer Name_y,Abbreviation Name_x,Abbreviation Name_y,classification,pred
0,0,societe cooperative agricole de kouibly,societe cooperative agricole de kouibly,socak scoops,socas,0,True
6,6,societe cooperative agricole de soubre,societe cooperative agricole de soubre,socopaso scoops,scasou-coop-ca,0,True
7,7,societe cooperative agricole de djoroplo,societe cooperative agricole de djoroplo,coop-ca socadjo,socadjo,1,False
12,12,societe cooperative agricole de guitry,societe cooperative agricole de guitry,coop-ca socoopgui,coop-ca-socoagui,0,True
18,18,societe cooperative agricole de guitry,societe cooperative agricole de guitry,socoopag coop-ca,coop-ca-socoagui,0,True
24,24,societe cooperative agricole de soubre,societe cooperative agricole de soubre,scasou,scasou-coop-ca,1,False
25,25,societe cooperative agricole source de guitry,societe cooperative agricole source de guitry,socopasg coop-ca,scoopasg,1,False
26,26,societe cooperative ivoirienne du negoce des p...,cooperative ivoirienne du negoce des produits ...,scinpa coop ca,scinpa,1,False
30,30,entreprise cooperative agricole de mogekeledougou,entreprise cooperative agricole de mogokeledougou,e.c.amog,ecamog,1,False
32,32,societe cooperative de negoce de lakota,societe coopperative de negoce de lakota,so-co-ne-l scoops,soconel,1,False
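
The misclassified rows mix two kinds of error: false positives (`classification` is 0 but `pred` is True) and false negatives (`classification` is 1 but `pred` is False). As a small sketch, the snippet below tallies them for the ten rows displayed above only, not for all 33 misclassified rows. The helper name `error_type` is hypothetical.

```python
from collections import Counter

# (classification, pred) for the ten misclassified rows displayed above,
# in display order (row labels 0, 6, 7, 12, 18, 24, 25, 26, 30, 32).
displayed_wrong = [
    (0, True), (0, True), (1, False), (0, True), (0, True),
    (1, False), (1, False), (1, False), (1, False), (1, False),
]


def error_type(classification, pred):
    """Name the outcome for one row, treating 1/True as 'same entity'."""
    if bool(classification) == bool(pred):
        return "correct"
    return "false_positive" if pred else "false_negative"


counts = Counter(error_type(c, p) for c, p in displayed_wrong)
print(counts)
```

On the full DataFrame, `pd.crosstab(df['classification'], df['pred'])` should give the same breakdown in one call.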


In [13]:
gpt_35_turbo_no_examples_result = RESULT_DFS[1]
# Keep only the rows where the prediction disagrees with the true label.
is_wrong = gpt_35_turbo_no_examples_result['classification'] != gpt_35_turbo_no_examples_result['pred']
wrong = gpt_35_turbo_no_examples_result[is_wrong]
print(wrong.shape)
wrong

(85, 7)


Unnamed: 0.1,Unnamed: 0,Producer Name_x,Producer Name_y,Abbreviation Name_x,Abbreviation Name_y,classification,pred
0,0,societe cooperative agricole de kouibly,societe cooperative agricole de kouibly,socak scoops,socas,0,True
3,3,cooperative agricole des producteurs de divo,cooperative agricole des producteurs de divo,coopapd coop-ca,coopradi,0,True
5,5,societe cooperative agricole sinikan,societe cooperative agricole sinikan,scoopas coop-ca,sinikan-scoopas,0,True
6,6,societe cooperative agricole de soubre,societe cooperative agricole de soubre,socopaso scoops,scasou-coop-ca,0,True
10,10,societe cooperative agricole sinikan,societe cooperative agricole sinikan,coopas,sinikan-scoopas,0,True
...,...,...,...,...,...,...,...
289,289,societe cooperative agricole des producteurs d...,societe cooperative simplifiee agricole toumto...,scoopaphs scoops,coopaths-scoops,0,True
293,293,societe cooperative progres,societe cooperative simplifiee le progres des ...,scap scoops,le progres,0,True
295,295,societe cooperative agricole de koffikro,cooperative agrocile de koffikro,scak scoops,scoopakof coop-ca,0,True
302,302,societe cooperative agricole de youkou,societe cooperative agricole espoir de petit-g...,socayou,scaepgy,0,True
