## Building an Advanced RAG System for "The Healthy Keto Plan" PDF

# Introduction

## Project Goal
This project implements a Retrieval-Augmented Generation (RAG) system to build an intelligent chatbot capable of answering questions based on the content of the book "The Healthy Keto Plan" by Dr. Eric Berg. The goal is to explore various retrieval strategies to provide the most accurate and contextually relevant answers.

## Technology Stack
**Data Processing:** PyMuPDF (fitz), pandas

**Embeddings:** sentence-transformers/all-mpnet-base-v2

**Vector Store:** FAISS (Facebook AI Similarity Search)

**LLM:** Google Gemini Flash via API (the code below uses `gemini-2.0-flash`)

**Framework:** LangChain

# Understanding the Dataset

In [2]:
import requests
pdf_path = 'The_Healthy_Keto_Plan_PDF_from_eric_berg.pdf'
pdf_url = 'https://cdn.bookey.app/files/pdf/book/en/the-healthy-keto-plan.pdf'
response = requests.get(pdf_url, timeout=30)
response.raise_for_status()  # fail loudly instead of silently skipping the download
with open(pdf_path, 'wb') as file:
    file.write(response.content)

In [3]:
import fitz
import re
doc = fitz.open(pdf_path)
full_text = ''
for page in doc:
    page_text = page.get_text()
    cleaned_page_text = re.sub(r'scan to download', '', page_text, flags=re.IGNORECASE)
    full_text+=cleaned_page_text

print(full_text[:500])

The Healthy Keto Plan PDF
Eric Berg

The Healthy Keto Plan
Transform Your Health to Naturally Achieve
Sustainable Weight Loss.
Written by Bookey
Check more about The Healthy Keto Plan Summary
Listen The Healthy Keto Plan Audiobook

About the book
In "The Healthy Keto Plan," Dr. Eric Berg shifts the focus
from merely losing weight to achieving holistic health as the
foundation for effective weight management. This updated
edition of his best-selling guide presents new strategies
tailored to your 


In [4]:
def text_formatter(text):
    cleaned_text = text.replace("\n"," ").strip()
    return cleaned_text

The quiz pages at the end of the PDF (from about page 208 onward) have to be removed, because the model can misread them. They list quiz statements whose correct answers are not given in the PDF, so a sentence such as **Identifying your body type is not important in the Healthy Keto Plan** could be retrieved and treated as a fact.

From page 120 onward the PDF contains many question/answer pairs. This information is very valuable, so every question/answer pair will become its own chunk.

In [75]:
from tqdm import tqdm
def describe_the_pdf(doc):
    pages_and_text = []
    for page_number, page in tqdm(enumerate(doc)):
        text = page.get_text()
        text = text_formatter(text)
        cleaned_page_text = re.sub(r'scan to download', '', text, flags=re.IGNORECASE)
        cleaned_page_text = re.sub(r'Install Bookey App to Unlock Full Text and Audio','',cleaned_page_text,flags=re.IGNORECASE)
        pages_and_text.append({
            "page_number": page_number,
            "page_char_count": len(text),
            "page_word_count": len(text.split(' ')),
            "page_sentence_count": len(text.split('. ')),
            "page_token_count": len(text) / 4,
            "text": cleaned_page_text
        })
    return pages_and_text

pages_and_text = describe_the_pdf(doc)
pages_and_text[207:208]

227it [00:00, 2541.34it/s]


[{'page_number': 207,
  'page_char_count': 842,
  'page_word_count': 141,
  'page_sentence_count': 10,
  'page_token_count': 210.5,
  'text': ' The Healthy Keto Plan Quiz and Test Check the Correct Answer on Bookey Website Chapter 1 | 1. Missing Link—the Educational Step| Quiz and Test 1.Losing weight is primarily a challenge of willpower according to Eric Berg. 2.Healthy Ketosis™ and intermittent fasting are combined in the Healthy Keto Plan to help with weight loss. 3.Identifying your body type is not important in the Healthy Keto Plan. Chapter 2 | 2. The 7 Principles of Fat Burning| Quiz and Test 1.Food is primarily considered as a source of pleasure according to Eric Berg. 2.Weight gain is more influenced by metabolism and hormones rather than just calorie intake. 3.To achieve effective weight loss, one should focus on reducing their weight before addressing any health issues. Chapter 3 | 3. Hormones and Your Body Shape| Quiz and Test '}]

In [6]:
doc_removed = doc[:206]
dataset = describe_the_pdf(doc_removed)
dataset[7] #this is a table

206it [00:00, 3676.13it/s]


{'page_number': 7,
 'page_char_count': 1630,
 'page_word_count': 228,
 'page_sentence_count': 12,
 'page_token_count': 407.5,
 'text': ' Chapter 1 Summary : 1. Missing Link—the Educational Step Section Summary Introduction to Weight Loss Difficulties Weight loss challenges often stem from lack of education on effective methods rather than willpower issues. The Importance of Education Understanding fat burning and health is crucial; education on triggering fat-burning hormones is essential for success. Healthy Ketosis™ and Intermittent Fasting The program combines Healthy Ketosis™ and intermittent fasting to normalize insulin levels and address health issues. Identifying the Real Problem Weight is a symptom; comprehending hormonal influences on metabolism is vital for effective weight loss. Tailoring the Plan to Your Body Type This plan recognizes different body types require varied dietary and exercise strategies; a quiz will help determine individual needs. Nutrient-Dense Foods for Be

In [7]:
import random
random.sample(pages_and_text, k=3)

[{'page_number': 180,
  'page_char_count': 923,
  'page_word_count': 143,
  'page_sentence_count': 8,
  'page_token_count': 230.75,
  'text': " 7.Question What if I'm feeling more tired on the healthy eating plan? Answer:Increased fatigue can be due to B vitamin depletion caused by fat burning. Taking nutritional yeast can replenish these vitamins easily—just one teaspoon a day can make a significant difference. 8.Question Why might I experience vivid dreams or nightmares while on this plan? Answer:These can be signs of a B-vitamin deficiency, specifically if you’re not getting enough from your diet. Adding non-fortified nutritional yeast to your daily routine can help alleviate this issue. 9.Question How important is it to focus on portion sizes while eating? Answer:Awareness of your body's fullness cues is vital. Many people are conditioned to finish everything on their plates due to upbringing. Structuring smaller plates or servings at home can help you stop eating when satisfied in

In [8]:
import pandas as pd
df = pd.DataFrame(dataset)
df.head()

| | page_number | page_char_count | page_word_count | page_sentence_count | page_token_count | text |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 0 | 0 | 52 | 10 | 1 | 13.0 | The Healthy Keto Plan PDF Eric Berg |
| 1 | 1 | 210 | 33 | 2 | 52.5 | The Healthy Keto Plan Transform Your Health to... |
| 2 | 2 | 1000 | 151 | 10 | 250.0 | About the book In "The Healthy Keto Plan," Dr.... |
| 3 | 3 | 785 | 116 | 7 | 196.25 | About the author Dr. Eric Berg is a prominent ... |
| 4 | 4 | 0 | 1 | 1 | 0.0 | |


In [9]:
df.describe().round(2)

| | page_number | page_char_count | page_word_count | page_sentence_count | page_token_count |
| :--- | :--- | :--- | :--- | :--- | :--- |
| count | 206.0 | 206.0 | 206.0 | 206.0 | 206.0 |
| mean | 102.5 | 695.17 | 107.54 | 5.87 | 173.79 |
| std | 59.61 | 329.12 | 50.53 | 3.16 | 82.28 |
| min | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 |
| 25% | 51.25 | 608.25 | 102.25 | 4.0 | 152.06 |
| 50% | 102.5 | 809.0 | 122.5 | 6.0 | 202.25 |
| 75% | 153.75 | 884.75 | 134.0 | 7.0 | 221.19 |
| max | 205.0 | 1831.0 | 313.0 | 21.0 | 457.75 |


# Strategic Chunking
Chunking is the process of breaking down the large document into smaller, semantically meaningful pieces. A one-size-fits-all approach is not ideal for this document.

Q&A Section (Pages 120+): This section has a clear structure. Each question and its corresponding answer is a perfect, self-contained chunk. We use a regular expression (`regex`) to split the text precisely along these Q&A pairs.

Normal Text Section (Pages 1-119): For the standard prose, we use a RecursiveCharacterTextSplitter. This method is robust as it tries to split text along natural boundaries (paragraphs, sentences) to keep related content together, using a defined `chunk_size` and `chunk_overlap`.
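The two strategies above can be sketched with the standard library alone. This is a minimal illustration, not the notebook's exact pipeline: `split_qa` and `split_with_overlap` are hypothetical helper names, the sample text is made up, and the fixed-size window is a much simpler stand-in for `RecursiveCharacterTextSplitter`, which additionally tries to break on paragraph and sentence boundaries. The lookbehind `(?<!\d)` is an added guard so that a two-digit marker such as `10.Question` is not split between its digits.

```python
import re

# Split *before* "<number>.Question" so the marker stays with its chunk.
# (?<!\d) is an assumption added here: it keeps "10.Question" in one piece.
QA_PATTERN = re.compile(r'(?<!\d)(?=\d+\.Question)')

def split_qa(text):
    """Return one chunk per question/answer pair."""
    parts = QA_PATTERN.split(text)
    return [part.strip() for part in parts if "Answer:" in part]

def split_with_overlap(text, chunk_size=1000, chunk_overlap=150):
    """Fixed-size character windows; each window repeats the last
    `chunk_overlap` characters of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - chunk_overlap
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]

# Made-up sample text in the same layout as the Q&A pages.
sample = (
    "Chapter 1 | Q&A 1.Question Why fast? Answer:To lower insulin. "
    "2.Question Are snacks allowed? Answer:No, they spike insulin. "
    "10.Question Is fruit allowed? Answer:Berries in moderation. "
)
qa = split_qa(sample)
windows = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1)

for chunk in qa:
    print(chunk)
print(windows)
```

The overlap means a sentence that straddles a window boundary still appears whole in at least one chunk, at the cost of storing some text twice.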

In [10]:
qa_doc = doc_removed[119:]
normal_doc = doc_removed[:119]
normal_doc[-2].get_text()

'Scan to Download\nChapter 19 | Quotes From Pages 837-888\n1.Better indicators for improvement would be\nenergy level, sleep quality, digestion and cravings,\nas these tell if the body is healing.\n2.If you’re only losing one to two pounds per week, it might\nnot be a bad thing; it might actually be normal. And it’s\nimportant to understand this so you don’t get discouraged.\n3.Something is better than nothing. Do what you can and try\nto improve your eating each week, making gradual\nimprovements over time.\n4.You are basically creating a higher level of health, and it’s\nworth the investment.\n5.The goal of this program is to stabilize your organs so you\ncan keep the weight off.\n'

In [11]:
qa_list_of_dict = describe_the_pdf(qa_doc)
qa_text = ''
for page in range(len(qa_list_of_dict)):
    qa_text+=qa_list_of_dict[page]['text']

87it [00:00, 3899.68it/s]


In [12]:
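# Split before "<number>.Question"; the lookbehind keeps two-digit numbers such as "10." intact
pattern = r'(?<!\d)(?=\d+\.Question)'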

qa_raw_chunks = re.split(pattern, qa_text)
qa_chunks = [chunk.strip() for chunk in qa_raw_chunks if "Answer:" in chunk]


In [13]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=150
)

normal_list_of_dict = describe_the_pdf(normal_doc)
normal_text = ''
for page in range(len(normal_list_of_dict)):
    normal_text+=normal_list_of_dict[page]['text'] + "\n\n"

normal_chunks = text_splitter.split_text(normal_text)


119it [00:00, 4914.26it/s]


In [14]:
all_chunks = []
all_chunks.extend(qa_chunks)
all_chunks.extend(normal_chunks)

## Adding Metadata on chunks

### Source Citing

In [15]:
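chunk_lengths = sorted((len(chunk) for chunk in all_chunks), reverse=True)
top_10_chunk_lengths = chunk_lengths[:10]

# 997 chars / 4 ≈ 249 tokens (at ~4 characters per token), below the 384-token input limit of all-mpnet-base-v2
print(top_10_chunk_lengths)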

[997, 996, 992, 989, 987, 983, 975, 974, 964, 959]


# Embedding and Vector Storage
To enable semantic search, we must convert our text chunks into numerical representations called embeddings. We use the `all-mpnet-base-v2` model from Sentence-Transformers, which maps each chunk to a 768-dimensional vector.

These embeddings are then stored in FAISS, an in-memory similarity search library. FAISS finds the stored vectors closest to a query vector, which is exactly what we need to retrieve the most relevant chunks for a given user query.
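To make the idea of a similarity search concrete, here is a brute-force sketch in plain Python. It is only an illustration of the principle: the three-dimensional vectors are made-up numbers, `l2_distance` and `search` are hypothetical names, and FAISS itself uses optimized index structures rather than a Python loop.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def search(query_vec, index, k=2):
    """Return the k (distance, text) pairs closest to query_vec."""
    scored = [(l2_distance(query_vec, vec), text) for text, vec in index]
    return sorted(scored)[:k]  # smaller distance = more similar

# Toy "embeddings" (invented values; real ones have 768 dimensions).
index = [
    ("weight loss per week", [0.9, 0.1, 0.0]),
    ("vitamin B and fatigue", [0.1, 0.9, 0.0]),
    ("acknowledgements", [0.0, 0.1, 0.9]),
]

results = search([1.0, 0.0, 0.0], index, k=2)
for distance, text in results:
    print(f"{distance:.3f}  {text}")
```

Because the score is a distance, a lower value means a closer match, which is why a distance threshold keeps results *below* the cutoff.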

In [16]:
import torch
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
device

device(type='cuda')

## bi-encoder

In [17]:
from sentence_transformers import SentenceTransformer
check_point = 'sentence-transformers/all-mpnet-base-v2'
embedding_model = SentenceTransformer(check_point)
embedding_model.to(device)
embeddings = embedding_model.encode(all_chunks, show_progress_bar = True)

Batches:   0%|          | 0/10 [00:00<?, ?it/s]

# Vector Database

| Feature | FAISS (Facebook AI Similarity Search) | ChromaDB |
| :--- | :--- | :--- |
| **Type** | High-performance C++ **Library** with Python bindings. | Developer-first, standalone **Database** (runs as a server or in-memory). |
| **Primary Use Case** | Raw, ultra-fast similarity search on static, in-memory numerical vectors. | Building and managing the entire lifecycle of RAG applications. |
| **Metadata Filtering** | **No built-in support.** Requires complex, manual post-filtering workarounds. | **Yes, native support.** Allows `where` clauses to filter by metadata before search. |
| **Data Persistence** | **Manual.** You must explicitly save/load the index file. No database state. | **Automatic.** Manages storage for you, either in-memory or on disk. |
| **API & Ecosystem** | Low-level API focused purely on vector indexing and search. | High-level, developer-friendly API. Client-server architecture. |
| **Ease of Use** | More complex. Requires a deeper understanding of indexing algorithms. | **Extremely easy.** Designed for a fast and simple developer experience. |
| **Scalability** | Can handle massive datasets on a single machine (CPU/GPU). | Designed for easier horizontal scaling with a client-server model. |
| **Best for...** | Academic research, performance benchmarks, core of a larger system. | Prototyping, full-stack RAG applications, projects needing metadata filters. |

In [19]:
from langchain.vectorstores import FAISS
from langchain.embeddings import HuggingFaceEmbeddings

In [20]:
hf_embeddings = HuggingFaceEmbeddings(model_name = check_point)
vector_store = FAISS.from_texts(texts=all_chunks, embedding=hf_embeddings)



In [21]:
query = "How much kilo I can loss with keto plan?"
results_with_scores = vector_store.similarity_search_with_score(query, k=5)

# similarity_search_with_score returns an L2 distance: lower means more similar
score_threshold = 0.9
relevant_docs = [doc for doc, score in results_with_scores if score < score_threshold]


if not relevant_docs:
    print("No sufficiently similar answer was found for this question.")
else:
    context_text = "\n\n---\n\n".join([doc.page_content for doc in relevant_docs])
    print(context_text)

1.Question How much weight loss can be realistically expected on the keto plan? Answer:The maximum fat loss achievable per week on a keto diet is between one and two pounds. This realistic expectation is crucial to avoid discouragement when progress seems slow. It's important to prioritize other indicators of health improvement, such as energy levels, sleep quality, and digestion.

---

Thanks to contributors for their roles in the creation and refinement of the book's content and message.



 Best Quotes from The Healthy Keto Plan by Eric Berg with Page Numbers View on Bookey Website and Generate Beautiful Quote Images Chapter 1 | Quotes From Pages 10-17 1.If you are simply told what to eat and it doesn’t work for you, you’ll chalk up a loss and go on to the next diet, creating further losses, and then think there is a problem with you, that you have poor willpower. 2.Once a person has the background and understanding of HOW fat is burned and HOW health is created, they succeed. 3.The

# Prompt
We call the LLM through an API, and `RetrievalQA` already supplies a default prompt, so a custom prompt is not strictly required.

I still define one to control how the model uses the retrieved context.

# Using LLM Model

In [22]:
import torch

In [23]:
!nvidia-smi

Fri Oct 17 11:55:59 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 575.65                 Driver Version: 577.03         CUDA Version: 12.9     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA GeForce RTX 3050 ...    On  |   00000000:01:00.0 Off |                  N/A |
| N/A   50C    P0             40W /   70W |    1543MiB /   6144MiB |      3%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


In [24]:
import gc # Garbage Collector 

del embedding_model

torch.cuda.empty_cache()

gc.collect()

3667

In [25]:
!nvidia-smi

Fri Oct 17 11:56:00 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 575.65                 Driver Version: 577.03         CUDA Version: 12.9     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA GeForce RTX 3050 ...    On  |   00000000:01:00.0 Off |                  N/A |
| N/A   49C    P0             21W /   61W |    1023MiB /   6144MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                



# Building the QA Pipeline: Retrieval & Generation

This section combines our retriever with a Large Language Model (LLM) to form the complete question-answering chain.
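Conceptually, the generation step fills a prompt template with the retrieved chunks and the user's question before calling the LLM. The sketch below shows only that assembly step under simplifying assumptions: `PROMPT_TEMPLATE` and `build_prompt` are hypothetical names, the chunks are hard-coded examples, and no retriever or model is actually called.

```python
# A shortened template with the same two slots used in this notebook.
PROMPT_TEMPLATE = (
    "CONTEXT:\n{context}\n\n"
    "QUESTION:\n{question}\n\n"
    "INSTRUCTIONS:\nAnswer using only the information in the context."
)

def build_prompt(question, retrieved_chunks):
    """Join the retrieved chunks and fill the template's two slots."""
    context = "\n\n".join(retrieved_chunks)
    return PROMPT_TEMPLATE.format(context=context, question=question)

# Stand-ins for what a retriever might return.
chunks = [
    "The maximum fat loss achievable per week is between one and two pounds.",
    "Better indicators for improvement would be energy level and sleep quality.",
]

prompt = build_prompt("How much weight can I lose per week?", chunks)
print(prompt)
```

Every retrieved chunk lands in the same prompt, so the number of chunks (`k`) and the chunk size together determine how much context the model has to read.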

In [26]:
import os
import google.generativeai as genai

# Read the key from the environment instead of hardcoding it in the notebook
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

models = genai.list_models()
for m in models:
    print(m.name)


E0000 00:00:1760691360.834998    1473 alts_credentials.cc:93] ALTS creds ignored. Not running on GCP and untrusted ALTS is not enabled.


models/embedding-gecko-001
models/gemini-2.5-pro-preview-03-25
models/gemini-2.5-flash-preview-05-20
models/gemini-2.5-flash
models/gemini-2.5-flash-lite-preview-06-17
models/gemini-2.5-pro-preview-05-06
models/gemini-2.5-pro-preview-06-05
models/gemini-2.5-pro
models/gemini-2.0-flash-exp
models/gemini-2.0-flash
models/gemini-2.0-flash-001
models/gemini-2.0-flash-exp-image-generation
models/gemini-2.0-flash-lite-001
models/gemini-2.0-flash-lite
models/gemini-2.0-flash-preview-image-generation
models/gemini-2.0-flash-lite-preview-02-05
models/gemini-2.0-flash-lite-preview
models/gemini-2.0-pro-exp
models/gemini-2.0-pro-exp-02-05
models/gemini-exp-1206
models/gemini-2.0-flash-thinking-exp-01-21
models/gemini-2.0-flash-thinking-exp
models/gemini-2.0-flash-thinking-exp-1219
models/gemini-2.5-flash-preview-tts
models/gemini-2.5-pro-preview-tts
models/learnlm-2.0-flash-experimental
models/gemma-3-1b-it
models/gemma-3-4b-it
models/gemma-3-12b-it
models/gemma-3-27b-it
models/gemma-3n-e4b-it
mo

In [27]:
prompt_template_string = """
CONTEXT:
{context}

QUESTION:
{question}

INSTRUCTIONS:
Act as a helpful expert. Provide a clear and direct answer to the question using only the information in the context.
- You can perform simple calculations like unit conversions (e.g., pounds to kg) to make the answer more helpful.
- If the answer is not in the context, state that the document does not contain this information.
- Answer directly without starting your response with "Based on the context...".
"""

In [103]:
import os
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate

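# ChatGoogleGenerativeAI reads the key from the GOOGLE_API_KEY environment variable,
# so set it outside the notebook rather than hardcoding it here.
if "GOOGLE_API_KEY" not in os.environ:
    raise RuntimeError("Set GOOGLE_API_KEY before running this cell")


llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash")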


custom_prompt = PromptTemplate(
    template=prompt_template_string,
    input_variables = ["context", "question"]
)


def creae_qa_chain(retriever):
    """
    Create a new RetrievalQA chain that uses the given retriever.
    """
    print(f"New QA Chain creating with : {type(retriever).__name__}" )
    return RetrievalQA.from_chain_type(
        llm = llm,
        chain_type = "stuff", # "stuff" puts all retrieved chunks into {context} next to {question}
        retriever = retriever,
        chain_type_kwargs={"prompt": custom_prompt}
    )



# Testing

In [48]:
faiss_retriever = vector_store.as_retriever(
    search_type="similarity",
    search_kwargs={'k': 4}
)

qa_chain =  creae_qa_chain(faiss_retriever)

New QA Chain creating with : VectorStoreRetriever


In [49]:
question = "How much kilo I can loss with keto plan?"
response = qa_chain.invoke({"query": question})

print(response['result'])

The maximum fat loss achievable per week on a keto diet is between 0.45 and 0.9 kilograms.


In [50]:
question = "What is the keto diet?"
response = qa_chain.invoke({"query": question})

print(response['result'])

The document does not contain a definition of what the keto diet is. It discusses "Healthy Ketosis™" and mentions "Ketogenic Diet" in comparison, but does not describe the general keto diet itself.


#### Asking a question that is not about keto

In [31]:
question = "What are the benefits of being an Engineer in 2025?"
response = qa_chain.invoke({"query": question})

print(response['result'])

The document does not contain information about the benefits of being an Engineer in 2025.


In [32]:
question = "What happens if I drink too much water?"
response = qa_chain.invoke({"query": question})

print(response['result'])

The document does not contain information on what happens if you drink too much water.


In [33]:
response

{'query': 'What happens if I drink too much water?',
 'result': 'The document does not contain information on what happens if you drink too much water.'}

In [34]:
question = "What is the best benefit of doing Healthy Ketosis?"
response = qa_chain.invoke({"query": question})
response

{'query': 'What is the best benefit of doing Healthy Ketosis?',
 'result': 'The best benefit of doing Healthy Ketosis is the enhancement of seven health factors: Energy, Sleep, Stress tolerance, Reduced cravings, Digestion, Reduced inflammation, and Reduced waist size. It also leads to improved metabolic health, optimizes overall health, and provides a more efficient fuel source for the body, especially for the brain.'}

My GPU has only 6 GB of VRAM, which makes running a model locally difficult, so I use an API instead. A local Mistral 7B setup is kept below, commented out, for reference.

In [35]:
#from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline, BitsAndBytesConfig
#from langchain.llms import HuggingFacePipeline
#from langchain.chains import RetrievalQA
#
#
## --- MISTRAL 7B (4-BIT QUANTIZED) ---
#
#model_id = "mistralai/Mistral-7B-Instruct-v0.2"
#
## 4-bit quantization settings
#quantization_config = BitsAndBytesConfig(
#    load_in_4bit=True,
#    bnb_4bit_compute_dtype=torch.bfloat16
#)
#attn_implementation = "sdpa" # Start with the safe default
#try:
#    # Attempt to load the model with the faster Flash Attention 2
#    print("Attempting to load model with Flash Attention 2...")
#    model = AutoModelForCausalLM.from_pretrained(
#        model_id,
#        quantization_config=quantization_config,
#        low_cpu_mem_usage = False,
#        device_map="auto",
#        attn_implementation="flash_attention_2"
#    )
#    attn_implementation = "flash_attention_2" # Update if successful
#    print("✅ Successfully loaded model with Flash Attention 2.")
#except (ImportError, ValueError) as e:
#    # If it fails, fall back to the standard implementation
#    print(f"⚠️ Could not load with Flash Attention 2 due to: {e}")
#    print("Falling back to the standard 'sdpa' attention mechanism.")
#    model = AutoModelForCausalLM.from_pretrained(
#        model_id,
#        quantization_config=quantization_config,
#        low_cpu_mem_usage = False,
#        device_map="auto",
#        attn_implementation="sdpa"
#    )
#
#tokenizer = AutoTokenizer.from_pretrained(model_id)
#
## Build the pipeline and the RAG chain
#
#pipe = pipeline(
#    "text-generation", 
#    model=model, 
#    tokenizer=tokenizer, 
#    max_new_tokens=512
#)
#llm = HuggingFacePipeline(pipeline=pipe)
#
#qa_chain = RetrievalQA.from_chain_type(
#    llm=llm,
#    chain_type="stuff",
#    retriever=vector_store.as_retriever()
#)

# More Advanced Retrieval Strategies:

In [51]:
def chatgkn(query):
    """Run the current global `qa_chain` on `query` and print the answer."""
    response = qa_chain.invoke({"query": query})

    print(response['result'])

## Hybrid search:

Hybrid search combines semantic search with keyword search. Embeddings generalize meaning, so a purely semantic search can miss specific keywords such as a person's name.
So we add **keyword search** (BM25) alongside the semantic FAISS retriever.
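One common way to merge a keyword ranking with a semantic ranking is weighted Reciprocal Rank Fusion, and to my understanding this is the approach LangChain's `EnsembleRetriever` takes. The sketch below is a simplified illustration, not that library's code: `weighted_rrf` is a hypothetical name, the document ids are invented, and the constant `c=60` is an assumed, commonly used default.

```python
from collections import defaultdict

def weighted_rrf(ranked_lists, weights, c=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document earns weight / (rank + c) from every list it appears in.
    """
    scores = defaultdict(float)
    for ranking, weight in zip(ranked_lists, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (rank + c)
    # Highest fused score first; ties broken alphabetically for stability.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

bm25_ranking = ["author_bio", "quotes", "chapter_1"]      # keyword hits
faiss_ranking = ["chapter_1", "author_bio", "qa_weight"]  # semantic hits

fused = weighted_rrf([bm25_ranking, faiss_ranking], weights=[0.3, 0.7])
print(fused)
```

Documents found by both retrievers accumulate score from each list, so they rise above documents that only one retriever returned.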

In [37]:
#!pip install rank_bm25

In [53]:
from langchain.retrievers import BM25Retriever,EnsembleRetriever
# Keyword retriever over the same chunks; 'source' stores each chunk's index
bm25_retriever = BM25Retriever.from_texts(
    texts = all_chunks,
    metadatas=[{'source':i} for i in range(len(all_chunks))]
)
bm25_retriever.k = 4

# Combine keyword (30%) and semantic (70%) results
ensemble_retriever = EnsembleRetriever(
    retrievers =[bm25_retriever,faiss_retriever],
    weights = [0.3,0.7]
)
qa_chain = creae_qa_chain(ensemble_retriever)

New QA Chain creating with : EnsembleRetriever


In [54]:
chatgkn('Eric Berg?') # with hybrid (keyword + semantic) search

Dr. Eric Berg is a prominent physician and health educator known for his expertise in ketogenic diets and weight management. With over 30 years of experience in chiropractic medicine and holistic health, he has dedicated his career to helping individuals achieve optimal health through nutritional strategies and lifestyle changes. He is the author of several bestselling books, including "The Healthy Keto Plan," and is a trusted figure in the health and wellness community due to his approachable style and commitment to education. He has also shared his expertise through YouTube educational content and a keto summit.


In [40]:
chatgkn('Eric Berg?') # with semantic search

'Dr. Eric Berg is a prominent physician and health educator with over 30 years of experience in chiropractic medicine and holistic health. He is known for his expertise in ketogenic diets and weight management, dedicating his career to helping individuals achieve optimal health through nutritional strategies and lifestyle changes. He is the author of several bestselling books, including "The Healthy Keto Plan," which combines scientific research with practical advice. Dr. Berg is a trusted figure in the health and wellness community, recognized for his approachable style and passionate commitment to education.'

## Re-query:

In [61]:
from langchain.retrievers.multi_query import MultiQueryRetriever
import logging
logging.basicConfig()
logging.getLogger("langchain.retrievers.multi_query").setLevel(logging.INFO)
multiquery_esemble_retriever = MultiQueryRetriever.from_llm(
    retriever = ensemble_retriever,
    llm = llm
)
qa_chain = creae_qa_chain(multiquery_esemble_retriever)

New QA Chain creating with : MultiQueryRetriever


In [57]:
chatgkn('Bugun ne yesem keto diyeti yapmis olurum?') # Turkish: "What should I eat today to follow a keto diet?" The model handles other languages too.

Sağlıklı Keto Planı'na göre, beslenme düzeninizde %70 yağ, %20 protein, %5 karbonhidrat ve %5 yapraklı yeşil sebzelere odaklanmalısınız. Yüksek kaliteli, organik, otla beslenmiş (grass-fed) ve vahşi yakalanmış (wild-caught) ürünleri tercih etmelisiniz. Şeker veya un içermeyen keto dostu çikolata, kurabiye ve dondurma gibi atıştırmalıklar da tüketilebilir.

Ancak, belirli bir günlük yemek listesi veya "bugün ne yesem" sorusuna doğrudan yanıt verecek spesifik yiyecek örnekleri bu dokümanda bulunmamaktadır.


In [62]:
query = "Do we have a cheat day while on a ketogenic diet?"
response = qa_chain.invoke({"query": query})
response

INFO:langchain.retrievers.multi_query:Generated queries: ['1. What are the effects of having a cheat meal or day on the state of ketosis and the progress of a ketogenic diet?', '2. Are cheat days recommended or advised against for individuals following a ketogenic diet, according to nutrition guidelines?', "3. How do concepts like refeed days or carb cycling relate to taking breaks from a strict ketogenic diet, as an alternative to a traditional 'cheat day'?"]


{'query': 'Do we have a cheat day while on a ketogenic diet?',
 'result': 'The document does not contain information suggesting that cheat days are allowed on the Healthy Keto Plan. Instead, it emphasizes strict adherence to low carb intake (5% of calories, 20-50 grams daily) to maintain ketosis and healthy insulin levels. To combat cravings, the plan recommends replacing unhealthy foods with keto-friendly alternatives like chocolate, cookies, and ice cream made without sugar or flour. The plan also advises against snacking and limits meals to two or three per day to prevent insulin spikes.'}

## Chat Memory:

In [64]:
from langchain.memory import ConversationBufferWindowMemory
from langchain.chains import ConversationalRetrievalChain

In [104]:
def create_conversational_qa_chain(retriever):
    """
    Creates a new ConversationalRetrievalChain with a given retriever and memory.
    """
    print(f"New Conversational QA Chain creating with: {type(retriever).__name__}")

    memory = ConversationBufferWindowMemory(
        k=5,
        memory_key="chat_history",
        return_messages=True,
        output_key = "answer"
    )

    conversational_chain = ConversationalRetrievalChain.from_llm(
        llm=llm,
        chain_type='stuff',
        retriever=retriever,
        memory=memory,
        combine_docs_chain_kwargs={"prompt": custom_prompt},
        verbose=False,
        return_source_documents = True
        
    )
    return conversational_chain
    
qa_chain = create_conversational_qa_chain(multiquery_esemble_retriever)

New Conversational QA Chain creating with: MultiQueryRetriever


In [108]:
question = "What is the keto diet?"
response = qa_chain.invoke({"question": question})

print(response['answer'])

INFO:langchain.retrievers.multi_query:Generated queries: ['Explain the ketogenic diet.', 'How does a ketogenic eating plan work?', 'What are the core principles of a very low-carbohydrate, high-fat diet?']


The recommended ketogenic diet, called Healthy Ketosis™, focuses on high-quality, nutrient-dense foods, relying on ketone fuel instead of carbs. A daily carb intake of 20 to 50 grams helps the body utilize fat for energy. The proposed macronutrient breakdown is 70% fat, 20% protein, 5% carbohydrates, and 5% leafy green vegetables.


In [105]:
query = "Do we have a cheat day while on a ketogenic diet?"
response = qa_chain.invoke({"question": query})
response['answer']

INFO:langchain.retrievers.multi_query:Generated queries: ['Are cheat meals or refeed days compatible with a ketogenic diet?', 'What are the effects of consuming non-keto foods for a day on the state of ketosis?', 'Is it advisable to have planned breaks from strict carbohydrate restriction on a keto diet?']


'The document does not contain this information.'

In [92]:
response.keys()

dict_keys(['question', 'chat_history', 'answer', 'source_documents'])

In [90]:
follow_up_query = "Okay, tell me the alternatives"

response = qa_chain.invoke({"question": follow_up_query})
response['answer']

INFO:langchain.retrievers.multi_query:Generated queries: ['Healthy keto snack ideas to curb food cravings.', 'Nutritious low-carb options for satisfying sweet or salty cravings on a ketogenic diet.', 'Effective ketogenic foods and methods to reduce hunger and prevent unhealthy snacking.']


'To combat cravings for unhealthy foods, the Healthy Keto Plan suggests replacing them with healthier alternatives such as keto-friendly treats like chocolate, cookies, and ice cream made without sugar or flour. For sweet cravings, natural substitutes like stevia, xylitol, or erythritol can be used sparingly for special occasions. Additionally, focusing on nutrient-dense, satisfying whole foods can help reduce cravings.'

In [74]:
print(response['answer'])

Here are some healthier, keto-friendly alternatives:

*   **For pasta:** Use spaghetti squash or zucchini to create Zucchini Pasta.
*   **For mashed potatoes:** Use cauliflower to create Cauliflower Mashed.
*   **For sugar:** Use stevia, xylitol, and erythritol.
*   **For traditional comfort foods:** Substitute high-carb ingredients with vegetables or nuts.
*   **To enhance meals:** Utilize flavorful herbs, spices, healthy fats (like avocados, nuts, olive oil), crunchy textures like nuts in salads, and creamy dressings.
*   **For protein sources:** Choose high-quality options like chicken, fish, and eggs.
*   **For omelet fillings:** Include vegetables (like spinach and peppers), meats (like ground turkey or chicken), and dairy (like cheese or cream).


# Evaluate Your RAG Pipeline
**RAGAs** scores the pipeline on a scale of 0 to 1 (higher is better) on four aspects:

**Faithfulness:** Measures whether the claims in the generated answer are supported by the retrieved context alone. A low faithfulness score means the model is hallucinating.

**Answer Relevancy:** Checks whether the generated answer actually addresses the user's question. A low score means the answer may be factually correct but off-topic or incomplete.

**Context Precision:** Evaluates the signal-to-noise ratio of the retrieved context. It answers: "Of all the chunks retrieved, how many were actually useful for answering the question, and were they ranked near the top?"

**Context Recall:** Measures whether the retriever found all the information needed to answer the question. It answers: "Did the retriever miss any important chunks?"
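
To make the two retrieval metrics concrete, here is a minimal sketch of the arithmetic behind them. It is a simplification: RAGAs obtains the relevance judgments from an LLM, whereas the hypothetical functions `toy_context_precision` and `toy_context_recall` below take hand-written True/False labels as input.

```python
def toy_context_precision(relevant_flags):
    """Rank-weighted precision: mean of precision@k over the ranks k
    that hold a useful chunk. Flags are ordered by retrieval rank."""
    hits = 0
    precisions = []
    for rank, is_relevant in enumerate(relevant_flags, start=1):
        if is_relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def toy_context_recall(supported_flags):
    """Share of ground-truth statements supported by the retrieved context."""
    if not supported_flags:
        return 0.0
    return sum(supported_flags) / len(supported_flags)


# Hand-labelled example (assumption): chunks 1 and 3 are useful, chunk 2 is noise.
precision = toy_context_precision([True, False, True])
# Three of the four ground-truth statements are found in the context.
recall = toy_context_recall([True, True, False, True])
print(precision, recall)
```

With these labels, precision is the mean of 1/1 and 2/3, and recall is 3/4. Moving the noisy chunk to the last rank would raise precision to 1.0, which is why the ordering of retrieved chunks matters as well as their content.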

In [76]:
eval_questions = [
    # --- Basic Concepts & Numbers ---
    "How much weight can I expect to lose per week in terms of fat?",
    "What is the recommended daily carbohydrate intake for Healthy Ketosis™?",
    "What are the seven health factors the plan aims to improve, besides weight loss?",
    "According to Dr. Berg, what is the true purpose of food?",
    "Why are snacks discouraged on this plan?",
    "What is the difference between Healthy Ketosis™ and a standard Ketogenic Diet?",

    # --- Body Type Specifics ---
    "What are the main physical characteristics of the Adrenal body type?",
    "What kind of exercise is recommended for someone with an Adrenal body type to start with?",
    "What causes the 'saddlebag' appearance in the Ovary body type?",
    "A sluggish thyroid can lead to cravings for what type of food?",
    "What are the best foods for the liver?",
    "How does a damaged liver affect hormonal balance?",

    # --- Dietary Specifics (What to eat/avoid) ---
    "What can be used as a healthy alternative to mashed potatoes?",
    "Are fruits allowed on the eating plan for weight loss?",
    "Why are eggs considered a 'perfect food'?",
    "What are some acceptable sugar substitutes mentioned in the book?",
    "What types of fish should be avoided due to high mercury content?",

    # --- Troubleshooting & Lifestyle ---
    "What should I do if I feel more tired after starting the plan?",
    "How does stress prevent the body from burning fat?",
    "Why is quality sleep essential for losing weight?",
    "What is the most common reason for sleep problems?",
    "What are the recommended fillings for a nutritious omelet?",
    "How can one avoid eating out of boredom?",
    "What are the benefits of broccoli sprouts?"
]

In [78]:
eval_answers = [
    # --- Answers for Basic Concepts & Numbers ---
    "The maximum fat loss achievable per week is between one and two pounds.",
    "The recommended daily carb intake is 20 to 50 grams.",
    "The seven health factors are: Energy, Sleep, Stress tolerance, Reduced cravings, Digestion, Reduced inflammation, and Reduced waist size.",
    "The purpose of food is to sustain life, provide nourishment, and promote tissue repair, rather than just being a source of pleasure.",
    "Snacks are discouraged because every time you eat, you trigger insulin, which hinders fat burning. The goal is to let your body use its own fat stores for energy between meals.",
    "Healthy Ketosis™ is different because it emphasizes not just macronutrients but also micronutrients, with a focus on high-quality, nutrient-dense foods like vegetables to optimize overall health.",

    # --- Answers for Body Type Specifics ---
    "The main characteristics are excessive abdominal fat (a pendulous abdomen), a fat pad on the lower neck known as a 'buffalo hump,' and fat accumulation in the face, creating a 'moon face.'",
    "Adrenal body types should start with low-intensity aerobic exercises, like walking, to avoid over-stimulating the adrenal glands.",
    "Excess estrogen from dysfunctional ovaries leads to increased fat deposits around the hips, thighs, and lower abdomen, resulting in a 'saddlebag' appearance.",
    "Individuals with a sluggish thyroid often crave refined carbohydrates such as bread, sugar, pastries, and pasta.",
    "The best foods for the liver are raw cruciferous vegetables and small amounts of lean, high-quality proteins.",
    "A damaged liver can disrupt hormone levels, causing issues such as excessive estrogen, which can lead to weight gain and mood changes.",

    # --- Answers for Dietary Specifics (What to eat/avoid) ---
    "Cauliflower can be used to create 'Cauliflower Mashed' as a healthy alternative to mashed potatoes.",
    "No, fruits are generally high in sugar and should be avoided for weight loss, with the exception of lemons, limes, and small amounts of berries after reaching weight goals.",
    "Eggs are considered highly nutritious, easy to digest, beneficial for liver function, and do not negatively impact blood cholesterol levels.",
    "Acceptable sugar substitutes include stevia, xylitol, and erythritol.",
    "You should steer clear of high-mercury fish such as shark or swordfish.",

    # --- Answers for Troubleshooting & Lifestyle ---
    "Increased fatigue can be due to a depletion of B vitamins caused by fat burning. Taking non-fortified nutritional yeast can help replenish these vitamins.",
    "Stress raises cortisol levels, a hormone that increases blood sugar, blocks fat from being burned, and promotes fat storage, particularly around the abdomen.",
    "Quality sleep, particularly deep sleep, is essential because it activates the fat-burning growth hormone, which peaks during sleep cycles.",
    "The most common reason for sleep problems is overactive adrenal glands causing the body to be in a stress mode.",
    "Nutritious fillings for omelets include vegetables like spinach and peppers, meats like ground turkey or chicken, and dairy like cheese or cream.",
    "To avoid eating out of boredom, one should keep busy with activities instead of sitting idle.",
    "Broccoli sprouts have significantly higher concentrations of nutrients that aid detoxification and contain cancer-fighting enzymes compared to mature broccoli."
]

In [109]:
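import logging
from tqdm import tqdm

# Silence the per-query "Generated queries" INFO logs from the multi-query retriever.
logging.getLogger("langchain.retrievers.multi_query").setLevel(logging.WARNING)

# Run every evaluation question through the chain and keep the full responses.
results = []
for query in tqdm(eval_questions):
    response = qa_chain.invoke({'question': query})
    results.append(response)

# Split the responses into the columns needed for evaluation.
generated_answers = [r['answer'] for r in results]
contexts = [[doc.page_content for doc in r['source_documents']] for r in results]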

In [113]:
from datasets import Dataset

data_samples = {
    "question": eval_questions,
    "answer": generated_answers,
    "retrieved_contexts": contexts,
    "ground_truth": eval_answers,
}
dataset = Dataset.from_dict(data_samples)
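
Because the four columns are built in separate cells, a misaligned list (for example, an answer added without its question) would silently pair the wrong rows. A small sanity check before building the dataset can catch this. The helper `check_aligned` is hypothetical and not part of `datasets` or RAGAs; the sample values are placeholders.

```python
def check_aligned(**columns):
    """Raise ValueError if the named columns differ in length; return the row count."""
    lengths = {name: len(values) for name, values in columns.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"Column lengths differ: {lengths}")
    return next(iter(lengths.values()), 0)


# Placeholder columns for illustration; in the notebook, pass the real lists, e.g.
# check_aligned(question=eval_questions, ground_truth=eval_answers, ...)
n_rows = check_aligned(
    question=["q1", "q2"],
    answer=["a1", "a2"],
    ground_truth=["g1", "g2"],
)
print(n_rows)
```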

In [None]:
from ragas import evaluate
from ragas.metrics import (
    faithfulness,
    answer_relevancy,
    context_recall,
    context_precision,
)

metrics = [
    faithfulness,
    answer_relevancy,
    context_precision,
    context_recall,
]

result = evaluate(
    dataset=dataset,
    metrics=metrics,
    llm=llm,
    embeddings=hf_embeddings,
)
print(result)

In [116]:
print(result)

{'faithfulness': 0.9596, 'answer_relevancy': 0.8328, 'context_precision': nan, 'context_recall': 1.0000}
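
The `nan` for `context_precision` means no aggregate score could be computed for that metric, so it should not be read as a score of zero. One way to investigate is to look at the per-sample scores and see which questions are missing a value. The sketch below assumes the per-sample scores are available as a list of dictionaries (for example, converted from `result.to_pandas()` if the installed RAGAs version provides it). The helper `summarize_scores` is hypothetical, and the numbers are placeholders, not scores from the run above.

```python
import math


def summarize_scores(rows):
    """Return {metric: (nan-ignoring mean, indices of samples whose score is nan)}."""
    summary = {}
    metric_names = rows[0].keys() if rows else []
    for metric in metric_names:
        values = [row[metric] for row in rows]
        valid = [v for v in values if not math.isnan(v)]
        missing = [i for i, v in enumerate(values) if math.isnan(v)]
        mean = sum(valid) / len(valid) if valid else math.nan
        summary[metric] = (mean, missing)
    return summary


# Placeholder values for illustration only; NOT results from this notebook.
example_rows = [
    {"context_precision": 1.0, "context_recall": 1.0},
    {"context_precision": float("nan"), "context_recall": 1.0},
    {"context_precision": 0.5, "context_recall": 1.0},
]
summary = summarize_scores(example_rows)
print(summary)
```

The indices of the missing samples point back to entries in `eval_questions`, which makes it possible to rerun or inspect only the questions that failed to score.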
