# Model Evaluation with ChatGPT

# 1. Set up environment

Import the OpenAI, sklearn, numpy, and pandas libraries. Input your OpenAI API key. Load the CSV file containing the summary of the evaluation using Sentence Transformers into a Pandas DataFrame. This DataFrame will be overwritten with the metrics resulting from the evaluation using ChatGPT.

In [3]:
from openai import OpenAI
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np
import pandas as pd

pd.set_option('display.max_colwidth', None)

api_key = 'YOUR_API_KEY'
client = OpenAI(api_key=api_key)

path = './Evaluation_Summary/'
file = 'evaluation_summary'
type_file = '.csv'

summary = pd.read_csv(path+file+type_file, index_col=0)
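
Hardcoding the key in a notebook makes it easy to leak when the file is shared. A minimal sketch of an alternative, assuming the key is stored in an environment variable named `OPENAI_API_KEY`; the helper name `load_api_key` is hypothetical and not part of the original notebook:

```python
import os


def load_api_key(env_var="OPENAI_API_KEY"):
    """Return the API key stored in the given environment variable.

    The variable name is an assumption; adjust it to your own setup.
    """
    key = os.environ.get(env_var)
    if not key:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return key
```

The result could then replace the literal string, for example `client = OpenAI(api_key=load_api_key())`.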

# 2. Define the functions to be applied

Define functions that perform the following operations: compute the cosine similarity between the OpenAI embeddings ('text-embedding-ada-002') of two sentences, update a DataFrame with the resulting metrics, calculate the mean cosine similarity, and store that mean in the DataFrame that summarizes the ChatGPT evaluation.
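
For reference, cosine similarity is the dot product of two vectors divided by the product of their norms. The sketch below computes it directly with NumPy on toy vectors; the function name `cosine` and the vectors are illustrative only, and the notebook itself relies on sklearn's `cosine_similarity`.

```python
import numpy as np


def cosine(u, v):
    """Cosine similarity between two 1-D vectors, returned as a Python float."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


# Vectors pointing in the same direction score close to 1,
# orthogonal vectors score 0.
same_direction = cosine([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine([1.0, 0.0], [0.0, 1.0])
```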

In [4]:
def chatgpt_similarity(sentence1, sentence2):
    model = 'text-embedding-ada-002'
    
    embedding1 = client.embeddings.create(input = [sentence1], model=model).data[0].embedding
    embedding2 = client.embeddings.create(input = [sentence2], model=model).data[0].embedding

    embedding1_np = np.array(embedding1)
    embedding2_np = np.array(embedding2)

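    # cosine_similarity returns a 1x1 matrix; extract the single value
    similarity = cosine_similarity([embedding1_np], [embedding2_np])[0][0]
    
    # Return a plain float so it can be stored in a DataFrame cell
    return float(similarity)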

def update_cosine_similarity_column(df):
    for index, row in df.iterrows():
        sentence1 = row['analysis_expected']
        sentence2 = row['analysis_generated']

        # Use a distinct name so the imported cosine_similarity function is not shadowed
        similarity = chatgpt_similarity(sentence1, sentence2)
        
        df.at[index, 'cosine_similarity'] = similarity

    return df
    
def sentence_similarity_mean(df):
    mean = df['cosine_similarity'].sum() / len(df)
    formatted_mean = '{:.6f}'.format(mean)
    return formatted_mean
    
def store_evaluation(index, ratio_type, df_template, df_results):
    df_template.at[index, ratio_type] = sentence_similarity_mean(df_results)
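
The embeddings endpoint accepts a list of inputs, so both sentences can be embedded in one request instead of two. A minimal sketch, assuming a client object with the same `embeddings.create` interface as above; the helper name `embed_pair` is hypothetical and not part of the original notebook:

```python
def embed_pair(client, sentence1, sentence2, model="text-embedding-ada-002"):
    """Embed two sentences with a single embeddings request.

    `client` is expected to expose `embeddings.create(input=..., model=...)`,
    as the OpenAI client created earlier in this notebook does.
    """
    response = client.embeddings.create(input=[sentence1, sentence2], model=model)
    # Each returned item carries the index of the input it belongs to;
    # sorting by it keeps the pairing explicit.
    items = sorted(response.data, key=lambda item: item.index)
    return items[0].embedding, items[1].embedding
```

The two returned lists can be passed to `cosine_similarity` exactly as in `chatgpt_similarity`, halving the number of requests per row.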

# 3. Zero-Shot Evaluation

For each ratio, load the file containing the Sentence Transformers evaluation into a Pandas DataFrame. Call the 'update_cosine_similarity_column' function with this DataFrame as input. The function overwrites the Sentence Transformers cosine similarity with the ChatGPT cosine similarity and returns the DataFrame with the updated metrics. Finally, call the 'store_evaluation' function with the updated DataFrame and the summary evaluation DataFrame as input. The function calculates the mean ChatGPT cosine similarity and writes it to the summary evaluation DataFrame; the row index and the column to update are passed as its first two arguments.

## Current ratio

In [3]:
path = './Zero_Shot_Evaluation/'
file = 'zero_shot_current_ratio_similarity'
type_file = '.csv'

zs_current = pd.read_csv(path+file+type_file, index_col=0)

In [5]:
zs_current = update_cosine_similarity_column(zs_current)

In [None]:
store_evaluation('zero_shot', 'current_ratio', summary, zs_current)

## Quick ratio

In [9]:
path = './Zero_Shot_Evaluation/'
file = 'zero_shot_quick_ratio_similarity'
type_file = '.csv'

zs_quick = pd.read_csv(path+file+type_file, index_col=0)

In [11]:
zs_quick = update_cosine_similarity_column(zs_quick)

In [None]:
store_evaluation('zero_shot', 'quick_ratio', summary, zs_quick)

## Cash ratio

In [17]:
path = './Zero_Shot_Evaluation/'
file = 'zero_shot_cash_ratio_similarity'
type_file = '.csv'

zs_cash = pd.read_csv(path+file+type_file, index_col=0)

In [18]:
zs_cash = update_cosine_similarity_column(zs_cash)

In [None]:
store_evaluation('zero_shot', 'cash_ratio', summary, zs_cash)
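
The three ratio cells above repeat the same load, update, and store steps, so they could be folded into one loop. The sketch below is one possible way to do that, not the notebook's own code: `evaluate_strategy` and its `score_fn` parameter are hypothetical names, and the file-name pattern matches the zero-shot and few-shot files only (the fine-tuned files use a different pattern).

```python
from pathlib import Path

import pandas as pd

RATIOS = ["current_ratio", "quick_ratio", "cash_ratio"]


def evaluate_strategy(strategy, folder, summary, score_fn, ratios=RATIOS):
    """Re-score every ratio file of one strategy and store the mean per ratio.

    `score_fn(expected, generated)` must return a float, for example the
    chatgpt_similarity function defined in section 2.
    """
    for ratio in ratios:
        csv_path = Path(folder) / f"{strategy}_{ratio}_similarity.csv"
        df = pd.read_csv(csv_path, index_col=0)
        # Overwrite the stored similarity with the newly computed one.
        df["cosine_similarity"] = [
            score_fn(row["analysis_expected"], row["analysis_generated"])
            for _, row in df.iterrows()
        ]
        summary.at[strategy, ratio] = df["cosine_similarity"].mean()
    return summary
```

A call such as `evaluate_strategy('zero_shot', './Zero_Shot_Evaluation/', summary, chatgpt_similarity)` would then cover the whole section. Note that this sketch stores the mean as a float rather than as the formatted string produced by `sentence_similarity_mean`.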

# 4. Few-Shot Evaluation

For each ratio, load the file containing the Sentence Transformers evaluation into a Pandas DataFrame. Call the 'update_cosine_similarity_column' function with this DataFrame as input. The function overwrites the Sentence Transformers cosine similarity with the ChatGPT cosine similarity and returns the DataFrame with the updated metrics. Finally, call the 'store_evaluation' function with the updated DataFrame and the summary evaluation DataFrame as input. The function calculates the mean ChatGPT cosine similarity and writes it to the summary evaluation DataFrame; the row index and the column to update are passed as its first two arguments.

## Current ratio

In [23]:
path = './Few_Shot_Evaluation/'
file = 'few_shot_current_ratio_similarity'
type_file = '.csv'

fs_current = pd.read_csv(path+file+type_file, index_col=0)

In [24]:
fs_current = update_cosine_similarity_column(fs_current)

In [25]:
store_evaluation('few_shot', 'current_ratio', summary, fs_current)

## Quick ratio

In [26]:
path = './Few_Shot_Evaluation/'
file = 'few_shot_quick_ratio_similarity'
type_file = '.csv'

fs_quick = pd.read_csv(path+file+type_file, index_col=0)

In [27]:
fs_quick = update_cosine_similarity_column(fs_quick)

In [28]:
store_evaluation('few_shot', 'quick_ratio', summary, fs_quick)

## Cash ratio

In [29]:
path = './Few_Shot_Evaluation/'
file = 'few_shot_cash_ratio_similarity'
type_file = '.csv'

fs_cash = pd.read_csv(path+file+type_file, index_col=0)

In [30]:
fs_cash = update_cosine_similarity_column(fs_cash)

In [34]:
store_evaluation('few_shot', 'cash_ratio', summary, fs_cash)

# 5. Fine-Tuned Evaluation

For each ratio, retrieve the evaluation file containing the 5 iterations (k1-k5) generated using the Cross-Validation methodology. Load the file into a DataFrame.

Call the 'update_cosine_similarity_column' function with the DataFrame as input. The function overwrites the Sentence Transformers cosine similarity with the ChatGPT cosine similarity and returns the DataFrame with the updated metrics. Finally, call the 'store_evaluation' function with the updated DataFrame and the summary evaluation DataFrame as input. The function calculates the mean ChatGPT cosine similarity and writes it to the summary evaluation DataFrame; the row index and the column to update are passed as its first two arguments.
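
Because each file pools five cross-validation iterations, it can be useful to check the mean per iteration before averaging over the whole file. The sketch below assumes the DataFrame has a column identifying the iteration; the column name `fold` is hypothetical (the notebook does not show the file's columns), and the values are made up for illustration.

```python
import pandas as pd


def mean_similarity_per_fold(df, fold_column="fold"):
    """Mean cosine similarity for each cross-validation iteration.

    `fold_column` is an assumed column name; replace it with the column
    that actually identifies k1-k5 in your file.
    """
    return df.groupby(fold_column)["cosine_similarity"].mean()


# Toy data, not taken from the evaluation files.
example = pd.DataFrame({
    "fold": ["k1", "k1", "k2", "k2"],
    "cosine_similarity": [0.90, 0.94, 0.96, 0.98],
})
per_fold = mean_similarity_per_fold(example)
```

A large spread between iterations would suggest that the single mean stored in the summary hides fold-to-fold variation.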

## Current ratio

In [14]:
path = './Fine_Tuned_Evaluation/'
file = 'fine_tuned_current_k1_k5_similarity'
type_file = '.csv'
ft_current_k1_k5 = pd.read_csv(path + file + type_file, index_col=0)

In [None]:
ft_current_k1_k5 = update_cosine_similarity_column(ft_current_k1_k5)
store_evaluation('fine_tuned', 'current_ratio', summary, ft_current_k1_k5)

## Quick ratio

In [16]:
path = './Fine_Tuned_Evaluation/'
file = 'fine_tuned_quick_k1_k5_similarity'
type_file = '.csv'
ft_quick_k1_k5 = pd.read_csv(path + file + type_file, index_col=0)

In [None]:
ft_quick_k1_k5 = update_cosine_similarity_column(ft_quick_k1_k5)
store_evaluation('fine_tuned', 'quick_ratio', summary, ft_quick_k1_k5)

## Cash ratio

In [18]:
path = './Fine_Tuned_Evaluation/'
file = 'fine_tuned_cash_k1_k5_similarity'
type_file = '.csv'
ft_cash_k1_k5 = pd.read_csv(path + file + type_file, index_col=0)

In [None]:
ft_cash_k1_k5 = update_cosine_similarity_column(ft_cash_k1_k5)
store_evaluation('fine_tuned', 'cash_ratio', summary, ft_cash_k1_k5)

# 6. Evaluation summary

Display the DataFrame containing the evaluation summary and save it to a CSV file.

In [20]:
summary

| | current_ratio | quick_ratio | cash_ratio |
|---|---|---|---|
| zero_shot | 0.938144 | 0.940716 | 0.937442 |
| few_shot | 0.962357 | 0.965288 | 0.961328 |
| fine_tuned | 0.948098 | 0.953441 | 0.954979 |

In [21]:
path = './Evaluation_Summary/'
file = 'evaluation_summary_chatgpt'
type_file = '.csv'

summary.to_csv(path+file+type_file, index=True)