# Establishing a threshold

## Setup


In [3]:
import os
import pandas as pd
pd.set_option('display.max_rows', 20)
pd.set_option('display.max_columns', 5)
pd.set_option('display.width', 1000)
pd.set_option('display.max_colwidth', 200)

from nutritionrag.rag_pipeline import rag_setup_qdrant, query_vector_db_list_qdrant, rag_query_list_qdrant

In [2]:
%cd ../..

/home/szaboildi/code/szaboildi/nutrition-rag




In [4]:
try:
    import tomllib # type: ignore
except ModuleNotFoundError:
    import tomli as tomllib

with open("parameters.toml", mode="rb") as fp:
    config = tomllib.load(fp)

config_name = "default"
from_scratch = False

In [5]:
eval_df = pd.read_csv(os.path.join("data", "eval", "test_questions_raw.csv"))
query_list = eval_df["user_question"].to_list()

In [7]:
vector_db_client, encoder, llm_client = rag_setup_qdrant(
    config=config[config_name])

Vector database loaded
RAG setup complete


## Retrieval

In [8]:
raw_answers = query_vector_db_list_qdrant(
    vector_db_client, encoder, query_list, config=config[config_name])

In [9]:
# Data formatting
processed_answers = []

# Unpack the retrieved payloads into one row per (question, document) pair
for answer in raw_answers:
    for doc in answer["retrieved"]:
        processed_answers.append(
            {"user_question": answer["user_question"], **doc})

processed_answers = pd.DataFrame(processed_answers).merge(eval_df, how="inner")
# Lowest and highest cosine similarity among the documents retrieved per question
processed_answers_grouped = (
    processed_answers
    .groupby(["user_question", "answerable"])
    .agg({"cosine": ["min", "max"]})
    .reset_index()
)
processed_answers_grouped.columns = ["user_question", "answerable", "min_cosine", "max_cosine"]

In [10]:
# processed_answers.loc[~(processed_answers.answerable)]

In [11]:
processed_answers_grouped

Unnamed: 0,user_question,answerable,min_cosine,max_cosine
0,Are any foods no-go for someone with diabetes?,True,0.90566,0.930672
1,"As a diabetic, should I choose an apple or a cake for dessert?",True,0.85727,0.882059
2,"As a diabetic, should I skip either lunch or dinner?",True,0.887702,0.947247
3,"Can I drink a caramel cappuccino, if I have diabetes?",True,0.872102,0.884836
4,Can I eat white bread as a diabetic?,True,0.864177,0.885423
5,Can you eat berries with diabetes?,True,0.870394,0.927814
6,Can you eat pineapple with diabetes?,True,0.864517,0.897337
7,I'm considering intermittent fasting. Could it help me maintain my blood sugar?,True,0.878517,0.919834
8,Is it better to have a high blood sugar or a low blood sugar?,True,0.880271,0.911115
9,Should I not eat carbohydrates at all as a diabetic?,True,0.90462,0.948491


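Based on these questions, no consistent boundary emerges that could serve as a cutoff for a minimum cosine similarity (with these embeddings). If the cutoff were set at, for example, 0.9, it would imply that questions such as #1 ("As a diabetic, should I choose an apple or a cake for dessert?") and #4 ("Can I eat white bread as a diabetic?") cannot be answered from the provided data, even though both are marked as answerable in the evaluation set; their best matches reach only 0.882 and 0.885.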
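
To make the effect of a cutoff concrete, here is a minimal sketch that checks which of the answerable questions in the table above would be rejected at a few candidate thresholds. The `max_cosine` values are copied from the `max_cosine` column of that table; the helper `rejected_at` and the list of candidate cutoffs are illustrative assumptions, not part of `nutritionrag`.

```python
# Maximum cosine similarity per evaluation question, copied from the
# grouped table above (row index -> max_cosine). All ten rows are
# labeled answerable=True.
max_cosine = {
    0: 0.930672, 1: 0.882059, 2: 0.947247, 3: 0.884836, 4: 0.885423,
    5: 0.927814, 6: 0.897337, 7: 0.919834, 8: 0.911115, 9: 0.948491,
}


def rejected_at(threshold, scores):
    """Return the question indices whose best match scores below `threshold`.

    Hypothetical helper for illustration only.
    """
    return sorted(idx for idx, score in scores.items() if score < threshold)


# Candidate cutoffs are illustrative choices, not values used by the project.
for threshold in (0.88, 0.89, 0.90, 0.91):
    rejected = rejected_at(threshold, max_cosine)
    print(f"cutoff {threshold:.2f}: rejects questions {rejected}")

# The highest cutoff that still keeps every answerable question is the
# smallest per-question maximum.
print("largest safe cutoff:", min(max_cosine.values()))
```

This only covers the answerable side. Whether a cutoff near the smallest per-question maximum would also filter out unanswerable questions depends on their scores, which the table above does not show.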

## RAG

In [None]:
rag_responses = rag_query_list_qdrant(
    query_list, vector_db_client, encoder, llm_client, config[config_name])

In [13]:
qa_df = pd.DataFrame({"user_question": rag_responses[0], "llm_response": rag_responses[1]})
# One row per retrieved document, tagged with the question it was retrieved for
qa_df_meta = pd.DataFrame(
    [{**item, "user_question": row["user_question"]}
     for row in rag_responses[2]
     for item in row["retrieved"]])

rag_df_processed = qa_df.merge(qa_df_meta, how="inner")

In [14]:
rag_df_processed

Unnamed: 0,user_question,llm_response,question,answer,cosine
0,Are any foods no-go for someone with diabetes?,"Avoid sugary drinks, processed snacks, white bread, and high-sugar desserts.",Are there any foods I should stay away from with diabetes?,"Avoid sugary drinks, processed snacks, white bread, and high-sugar desserts.",0.930672
1,Are any foods no-go for someone with diabetes?,"Avoid sugary drinks, processed snacks, white bread, and high-sugar desserts.",What are some unhealthy foods for people with diabetes?,"Avoid sugary drinks, processed snacks, white bread, and high-sugar desserts.",0.929724
2,Are any foods no-go for someone with diabetes?,"Avoid sugary drinks, processed snacks, white bread, and high-sugar desserts.",What should diabetics not eat?,"Avoid sugary drinks, processed snacks, white bread, and high-sugar desserts.",0.923179
3,Are any foods no-go for someone with diabetes?,"Avoid sugary drinks, processed snacks, white bread, and high-sugar desserts.",What foods should I avoid as a diabetic?,"Avoid sugary drinks, processed snacks, white bread, and high-sugar desserts.",0.920402
4,Are any foods no-go for someone with diabetes?,"Avoid sugary drinks, processed snacks, white bread, and high-sugar desserts.",Are there low-sugar snacks that are good for people with diabetes?,"Healthy snack options include Greek yogurt, almonds, boiled eggs, and vegetables with hummus.",0.905660
...,...,...,...,...,...
70,Should I not eat carbohydrates at all as a diabetic?,"Yes, but focus on complex carbs like whole grains, legumes, and vegetables, and control portions.",Should I avoid all carbs with diabetes?,"Yes, but focus on complex carbs like whole grains, legumes, and vegetables, and control portions.",0.948491
71,Should I not eat carbohydrates at all as a diabetic?,"Yes, but focus on complex carbs like whole grains, legumes, and vegetables, and control portions.",Can I eat carbs if I have diabetes?,"Yes, but focus on complex carbs like whole grains, legumes, and vegetables, and control portions.",0.929237
72,Should I not eat carbohydrates at all as a diabetic?,"Yes, but focus on complex carbs like whole grains, legumes, and vegetables, and control portions.",Are carbohydrates bad for diabetics?,"Yes, but focus on complex carbs like whole grains, legumes, and vegetables, and control portions.",0.923894
73,Should I not eat carbohydrates at all as a diabetic?,"Yes, but focus on complex carbs like whole grains, legumes, and vegetables, and control portions.",What should diabetics not eat?,"Avoid sugary drinks, processed snacks, white bread, and high-sugar desserts.",0.905080


In [15]:
# Print each question once, followed by the LLM response
for _, row in rag_df_processed[["user_question", "llm_response"]].drop_duplicates().iterrows():
    print(row["user_question"])
    print(row["llm_response"])
    print()

Are any foods no-go for someone with diabetes?
Avoid sugary drinks, processed snacks, white bread, and high-sugar desserts.

Can you eat berries with diabetes?
Yes, you can eat berries with diabetes. They are a good option due to their fiber content, but just watch portion sizes.

Can you eat pineapple with diabetes?
Sorry, I don't have information on that. Please try a different question.

Can I eat white bread as a diabetic?
Sorry, I don't have information on that. Please try a different question.

What's a good lunch for someone with diabetes?
Sorry, I don't have information on that. Please try a different question.

Can I drink a caramel cappuccino, if I have diabetes?
Sorry, I don't have information on that. Please try a different question.

I'm considering intermittent fasting. Could it help me maintain my blood sugar?
It depends on the individual and medication. Always consult a healthcare provider before starting any fasting regimen.

What's your favorite snack?
Sorry, I don't 