## The goal is to compare different models

For this task we already have a set of evaluated queries. We load them and use them to score each model's search results.

In [49]:
import pandas as pd
import re
import os

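# Try to load the review bank; fall back to an empty table if the file is missing
try:
    review_bank = pd.read_excel('reviews/review_bank.xlsx')
except FileNotFoundError:
    review_bank = pd.DataFrame(columns=['Query', 'Receita', 'Nota', 'Evaluator'])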

# Look up the rating for a pre-evaluated query-recipe pair
def lookup_rating(query, recipe):
    try:
        ratings = review_bank[(review_bank['Query'] == query) & (review_bank['Receita'] == recipe)][["Nota", "Evaluator"]]
    except KeyError:
        # The review bank is empty or lacks the expected columns
        return None
    if ratings.empty:
        # No evaluation exists for this query-recipe pair
        return None
    person_rating = ratings[ratings['Evaluator'] == "Person"]
    if not person_rating.empty:
        # A human evaluation takes precedence
        return person_rating["Nota"].iloc[0]
    # Otherwise fall back to the first available rating
    return ratings["Nota"].iloc[0]
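
To make the precedence rule concrete, here is a minimal sketch on a made-up review bank. The rows, the `"LLM"` evaluator label, and the helper name `toy_lookup_rating` are assumptions for illustration only; the source only confirms the column names and the `"Person"` label.

```python
import pandas as pd

# Hypothetical review bank: same column names as above, made-up rows
toy_bank = pd.DataFrame({
    "Query":     ["easy dinner", "easy dinner", "vegan dessert"],
    "Receita":   ["garlic pasta", "garlic pasta", "banana ice cream"],
    "Nota":      [3, 5, 4],
    "Evaluator": ["LLM", "Person", "LLM"],  # "LLM" is an assumed label
})

def toy_lookup_rating(bank, query, recipe):
    """Return the human rating if present, else the first rating, else None."""
    ratings = bank[(bank["Query"] == query) & (bank["Receita"] == recipe)]
    if ratings.empty:
        return None
    person = ratings[ratings["Evaluator"] == "Person"]
    chosen = person if not person.empty else ratings
    return chosen["Nota"].iloc[0]

human = toy_lookup_rating(toy_bank, "easy dinner", "garlic pasta")
fallback = toy_lookup_rating(toy_bank, "vegan dessert", "banana ice cream")
missing = toy_lookup_rating(toy_bank, "easy dinner", "pineapple tempeh")
print(human, fallback, missing)
```

The first pair has both ratings, so the human one is returned; the second has only a non-human rating, which is used as a fallback; the third pair was never evaluated and yields `None`.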

In [50]:
# getting all the files in the output folder that are in the format Results_*.xlsx
pattern = r"Results_.*\.xlsx$"

model_results_paths = [os.path.join('output', file) for file in os.listdir('output') if re.match(pattern, file)]

models = {}
for model_result_path in model_results_paths:
    model_name = re.search(r"Results_(.*)\.xlsx$", os.path.basename(model_result_path)).group(1)

    result_df = pd.read_excel(model_result_path)
    result_df["Nota"] = result_df.apply(lambda row: lookup_rating(row['Query'], row['title']), axis=1)

    models[model_name] = result_df
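
The row-by-row `apply` above filters the review bank once per result row. A possible alternative, sketched here on made-up data, is to reduce the bank to one rating per query-recipe pair and attach it with a single left merge. The toy rows, the `"LLM"` label, and the helper column `_is_person` are assumptions, not part of the original notebook.

```python
import pandas as pd

# Hypothetical review bank and model results (made-up rows)
toy_bank = pd.DataFrame({
    "Query":     ["easy dinner", "easy dinner", "vegan dessert"],
    "Receita":   ["garlic pasta", "garlic pasta", "banana ice cream"],
    "Nota":      [3, 5, 4],
    "Evaluator": ["LLM", "Person", "LLM"],  # "LLM" is an assumed label
})
toy_results = pd.DataFrame({
    "Query": ["easy dinner", "vegan dessert", "easy dinner"],
    "title": ["garlic pasta", "banana ice cream", "pineapple tempeh"],
})

# Put human ratings first, then keep one rating per (Query, Receita) pair
best = (
    toy_bank.assign(_is_person=toy_bank["Evaluator"].eq("Person"))
    .sort_values("_is_person", ascending=False, kind="stable")
    .drop_duplicates(subset=["Query", "Receita"], keep="first")
    [["Query", "Receita", "Nota"]]
)

# A left merge keeps every result row; unmatched pairs get NaN in Nota
scored = toy_results.merge(
    best, how="left", left_on=["Query", "title"], right_on=["Query", "Receita"]
).drop(columns="Receita")
print(scored)
```

Unrated pairs come back as `NaN` rather than `None`, which `isnull()` and `mean()` handle in the same way.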


In [51]:
missing_reviews = pd.DataFrame()

for model in models:
    df = models[model]
    # Keep the rows where Nota is missing
    model_missing_reviews = df[df['Nota'].isnull()]
    
    # Compute the mean of Nota (missing values are skipped)
    mean_score = df['Nota'].mean()
    
    # Print the report
    print(f'Model: {model}')
    print(f'Missing reviews: {len(model_missing_reviews)}')
    print(f'Mean score: {mean_score}\n')

    missing_reviews = pd.concat([missing_reviews, model_missing_reviews])

Model: Bm25
Missing reviews: 0
Mean score: 2.6

Model: hybrid
Missing reviews: 1
Mean score: 3.314814814814815

Model: semantic
Missing reviews: 0
Mean score: 3.3454545454545452

Model: Tfidf
Missing reviews: 0
Mean score: 2.6363636363636362
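
For a side-by-side comparison, the same per-model numbers could be collected into one table and sorted by mean score. This is only a sketch: `toy_models`, its model names, and its ratings are made up and stand in for the `models` dictionary built earlier.

```python
import pandas as pd

# Hypothetical stand-in for the `models` dict: model name -> results with a Nota column
toy_models = {
    "model_a": pd.DataFrame({"Nota": [3, 4, None, 5]}),
    "model_b": pd.DataFrame({"Nota": [2, 3, 3, 2]}),
}

# One row per model: result count, missing reviews, and mean score (NaN skipped)
summary = pd.DataFrame({
    name: {
        "n_results": len(df),
        "missing": int(df["Nota"].isna().sum()),
        "mean_score": df["Nota"].mean(),
    }
    for name, df in toy_models.items()
}).T.sort_values("mean_score", ascending=False)
print(summary)
```

Reporting the missing count next to the mean shows how many rated results each average rests on.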



In [52]:
missing_reviews

| Unnamed: 0 | Tipo | Descrição | Query | id | title | body | Nota |
|---|---|---|---|---|---|---|---|
| 52 | Semantica | Pergunta difícil | what can I make for a romantic dinner | 208724 | pineapple tempeh | pineapple tempeh\n\nRecipe posted on: 2007-02-... | |
