# RAG Pipeline for Q&A over a Text File

This notebook implements a clean Retrieval-Augmented Generation (RAG) pipeline.

1.  **Install** required libraries.
2.  **Load** an `OPENAI_API_KEY` (if available).
3.  **Load** a source `.txt` file.
4.  **Chunk, Embed, & Store** the text in a Chroma vector database.
5.  **Build** a LangChain RAG chain to answer questions.
6.  **Run** an interactive chat loop.


In [None]:
## 1) Install dependencies
import sys
print(sys.version)

# Core libs
!pip -q install langchain langchain-community chromadb sentence-transformers

# For optional local LLM fallback
!pip -q install transformers accelerate

# For OpenAI
!pip -q install langchain-openai




In [9]:
## 1) Install dependencies
import sys
print(sys.version)

# Core libs
!pip -q install langchain langchain-community chromadb sentence-transformers
!pip -q install langchain-openai
!pip -q install python-dotenv  # needed for load_dotenv in step 2
!pip -q install pyttsx3 speechrecognition  # sr.Microphone additionally requires PyAudio
!pip -q install langdetect

3.13.9 | packaged by Anaconda, Inc. | (main, Oct 21 2025, 19:09:58) [MSC v.1929 64 bit (AMD64)]





In [1]:
## 2) Load API Key
from pathlib import Path
from dotenv import load_dotenv

# Look for .env in the current working directory (same folder as the notebook)
env_path = Path(".") / ".env"
load_dotenv(dotenv_path=env_path)


True

In [2]:
## 3) Set Constants & Check Key
import os
from pathlib import Path

# Path where Chroma (vector DB) will be persisted
CHROMA_DIR = "./chroma"
COLLECTION = "uploaded_text"

# --- Optional: OpenAI ---
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY", "").strip()
USE_OPENAI = bool(OPENAI_API_KEY)

if USE_OPENAI:
    print("✅ Using OpenAI for generation.")
else:
    print("ℹ️ OPENAI_API_KEY not set; will use local Transformers fallback.")

Path(CHROMA_DIR).mkdir(parents=True, exist_ok=True)
print("CHROMA_DIR =", Path(CHROMA_DIR).resolve())
print("COLLECTION  =", COLLECTION)


✅ Using OpenAI for generation.
CHROMA_DIR = E:\Lessons_By_Week\Project_Rag\Final_Codes\chroma
COLLECTION  = uploaded_text
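
The key check above treats a blank or whitespace-only `OPENAI_API_KEY` the same as a missing one. A minimal sketch of that logic in isolation; `openai_enabled` is a hypothetical helper and `sk-example` is a placeholder value, not a real key:

```python
import os


def openai_enabled(env=None) -> bool:
    """Return True only when OPENAI_API_KEY holds a non-blank value."""
    env = os.environ if env is None else env
    return bool(env.get("OPENAI_API_KEY", "").strip())


print(openai_enabled({"OPENAI_API_KEY": "sk-example"}))  # True
print(openai_enabled({"OPENAI_API_KEY": "   "}))         # False
print(openai_enabled({}))                                # False
```

Passing a plain dict instead of `os.environ` keeps the check easy to test without touching the real environment.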


In [3]:
## 4) Load Text Document
from pathlib import Path

# Use a relative path. Ensure "RAG_TEXT.txt" is in the same folder as this notebook.
uploaded_path = Path("./RAG_TEXT.txt") 

if not uploaded_path.exists():
    print(f"⚠️ Error: File not found at {uploaded_path.resolve()}")
    print("Please make sure 'RAG_TEXT.txt' is inside the project folder.")
else:
    text = uploaded_path.read_text(encoding="utf-8", errors="ignore")
    print(f"✅ Loaded {len(text):,} characters.")

✅ Loaded 22,028 characters.
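
When the file is missing, the loading cell only prints a message, so `text` stays undefined and a later cell fails with a less obvious `NameError`. One possible alternative is to fail at the point of loading. This is a sketch, not the notebook's own code; `load_text` is a hypothetical helper and the temporary file stands in for `RAG_TEXT.txt`:

```python
import tempfile
from pathlib import Path


def load_text(path: Path) -> str:
    """Read a UTF-8 text file, raising immediately if it does not exist."""
    if not path.exists():
        raise FileNotFoundError(f"File not found: {path.resolve()}")
    # errors="ignore" drops undecodable bytes, matching the cell above
    return path.read_text(encoding="utf-8", errors="ignore")


# Demonstration with a throwaway file
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("hello RAG", encoding="utf-8")
    print(len(load_text(sample)))  # 9
```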


In [4]:
## 5) Define LLM (Generator)

generator = None

if USE_OPENAI:
    from langchain_openai import ChatOpenAI
    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
    generator = llm
    print("Using ChatOpenAI: gpt-4o-mini")
else:
    # Local Transformers text2text generation via HF pipeline
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline
    print("Loading local model: google/flan-t5-base...")
    model_id = "google/flan-t5-base"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    hf_pipe = pipeline("text2text-generation", model=model, tokenizer=tokenizer)

    class HFText2TextLLM:
        def __call__(self, prompt_text: str) -> str:
            out = hf_pipe(prompt_text, max_new_tokens=256, truncation=True)
            return out[0]["generated_text"]
    
    generator = HFText2TextLLM()
    print("Using local Transformers: flan-t5-base")


Using ChatOpenAI: gpt-4o-mini


In [5]:
## 6) Chunk, Embed, and Store in Vector DB (Standard Mode)

from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_core.documents import Document
from langchain_community.embeddings import HuggingFaceEmbeddings
import shutil

# --- 1. Chunk the Text ---
splitter = RecursiveCharacterTextSplitter(
    chunk_size=300,
    chunk_overlap=50,
    separators=["\n\n", "\n", ". ", " ", ""],
)
docs = [Document(page_content=c, metadata={"source": str(uploaded_path.name)}) 
        for c in splitter.split_text(text)]
print(f"Chunks created: {len(docs)}")

# --- 2. Initialize Embeddings ---
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    encode_kwargs={"normalize_embeddings": True},
)

# --- 3. Create Vector Store (Chroma) ---
# Force a clean reload of the DB to ensure data is fresh
if Path(CHROMA_DIR).exists():
    shutil.rmtree(CHROMA_DIR)

vs = Chroma(
    collection_name=COLLECTION,
    persist_directory=CHROMA_DIR,
    embedding_function=embeddings,
)
vs.add_documents(docs)
print("✅ Stored in Chroma at:", Path(CHROMA_DIR).resolve())

# --- 4. Standard Retriever (No Threshold) ---
# Plain similarity search always returns the k closest chunks, even when none is very relevant.
retriever = vs.as_retriever(
    search_type="similarity",  # no 'similarity_score_threshold' filtering
    search_kwargs={"k": 5},    # top 5 chunks; no 'score_threshold'
)

print("\n✅ Created Standard Retriever (No safety filter).")

Chunks created: 86




✅ Stored in Chroma at: E:\Lessons_By_Week\Project_Rag\Final_Codes\chroma

✅ Created Standard Retriever (No safety filter).
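
To make the chunk-and-retrieve step concrete, here is a dependency-free sketch of the two ideas involved: overlapping chunks and top-k similarity over normalized vectors. It is a simplification under stated assumptions. `chunk_text` uses fixed-width character windows rather than the separator-aware splitting of `RecursiveCharacterTextSplitter`, and `embed` is a bag-of-words stand-in for the `all-MiniLM-L6-v2` model. All three function names are hypothetical.

```python
import math
from collections import Counter


def chunk_text(text: str, chunk_size: int = 300, overlap: int = 50) -> list[str]:
    """Fixed-width windows; each window starts chunk_size - overlap after the previous one."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]


def embed(text: str) -> dict[str, float]:
    """Unit-length bag-of-words vector (stand-in for a real embedding model)."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(v * v for v in counts.values())) or 1.0
    return {word: v / norm for word, v in counts.items()}


def top_k(query: str, chunks: list[str], k: int = 5) -> list[str]:
    """Return the k highest-scoring chunks; with unit vectors the dot product is cosine similarity."""
    q = embed(query)

    def score(chunk: str) -> float:
        c = embed(chunk)
        return sum(weight * c.get(word, 0.0) for word, weight in q.items())

    return sorted(chunks, key=score, reverse=True)[:k]


print([len(c) for c in chunk_text("x" * 700)])  # [300, 300, 200]

sample = ["love needs patience", "chroma stores vectors", "patience and growth"]
print(top_k("what is chroma", sample, k=1))      # ['chroma stores vectors']
print(len(top_k("bake a cake", sample, k=2)))    # 2
```

The last call shows the behavior noted in the retriever comment: without a score threshold, an unrelated query still gets `k` chunks back.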


In [None]:
## 7) Build Conversational RAG Chain (With Memory)

from langchain.chains import create_history_aware_retriever, create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory



# --- 1. Contextualize Question ---
# This prompt helps the LLM understand follow-up questions (e.g., "What about him?")
contextualize_q_system_prompt = (
    "Given a chat history and the latest user question "
    "which might reference context in the chat history, "
    "formulate a standalone question which can be understood "
    "without the chat history. If no chat history exists, "
    "politely tell the user so."
)

contextualize_q_prompt = ChatPromptTemplate.from_messages([
    ("system", contextualize_q_system_prompt),
    MessagesPlaceholder("chat_history"),
    ("human", "{input}"),
])

# --- 2. Answer Question ---
# This is the prompt that actually answers the user
qa_system_prompt = (
    # Each fragment ends with a space so the concatenated sentences do not run together.
    "You are here to help discuss the tools. *Give a summary of the tools in the text.* **Be concise.** "
    "Do not call it a text; say it is a knowledge base. "
    "Give an initial summary of what you are here for. "
    "Do not tell anyone you were trained by OpenAI. "
    "You are a helpful assistant. Answer the question only from the provided context. "
    "You may engage in friendly conversation, but never fabricate facts outside the context. "
    "Analyze the text and always give good summaries and the available action points. "
    "Your answers should be concise. "
    "Limit the use of tokens and be very concise. "
    "You are a helpful assistant named Chiity. "
    "Use the following pieces of retrieved context to answer the question. "
    "If you don't know the answer, say that you don't know. "
    "IMPORTANT: Answer in the same language that the user asks the question. "
    "If they ask in Spanish, answer in Spanish. If in English, answer in English."
    "\n\n"
    "{context}"
)

qa_prompt = ChatPromptTemplate.from_messages([
    ("system", qa_system_prompt),
    MessagesPlaceholder("chat_history"),
    ("human", "{input}"),
])

# --- 3. Build the Chain ---
if USE_OPENAI:
    # Create a retriever that can handle history
    history_aware_retriever = create_history_aware_retriever(
        llm, retriever, contextualize_q_prompt
    )
    
    # Create the document combining chain
    question_answer_chain = create_stuff_documents_chain(llm, qa_prompt)
    
    # Combine them
    rag_chain = create_retrieval_chain(history_aware_retriever, question_answer_chain)

    # --- 4. Memory Management ---
    store = {}

    def get_session_history(session_id: str) -> BaseChatMessageHistory:
        if session_id not in store:
            store[session_id] = ChatMessageHistory()
        return store[session_id]

    conversational_chain = RunnableWithMessageHistory(
        rag_chain,
        get_session_history,
        input_messages_key="input",
        history_messages_key="chat_history",
        output_messages_key="answer",
    )
    print("✅ Conversational RAG (OpenAI + Memory) is ready.")

else:
    # Local fallback: the history-aware chain is not built here, because small
    # local models often struggle with the history-rewriting step.
    print("ℹ️ Memory is disabled for local fallback to ensure stability.")
    conversational_chain = None
    # Cells that call conversational_chain must check for None first.

✅ Conversational RAG (OpenAI + Memory) is ready.
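
The memory step keeps one history object per `session_id` in a plain dict, creating it on first use. The same pattern without LangChain can be sketched as follows; `InMemorySessions` and its methods are hypothetical names, and messages are stored as simple `(role, text)` tuples rather than LangChain message objects:

```python
from collections import defaultdict


class InMemorySessions:
    """Keeps one message list per session_id, created lazily on first access."""

    def __init__(self) -> None:
        self._store: dict[str, list[tuple[str, str]]] = defaultdict(list)

    def history(self, session_id: str) -> list[tuple[str, str]]:
        return self._store[session_id]

    def record_turn(self, session_id: str, question: str, answer: str) -> None:
        self._store[session_id].extend([("human", question), ("ai", answer)])


sessions = InMemorySessions()
sessions.record_turn("session_en", "Who is Chiity?", "The assistant.")
sessions.record_turn("session_es", "¿Quién es Chiity?", "El asistente.")

print(len(sessions.history("session_en")))      # 2
print(len(sessions.history("eval_session_0")))  # 0
```

Because histories are keyed by session, a fresh `session_id` always starts with an empty history, and nothing persists once the kernel restarts.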


In [None]:
## 8 & 9) Interactive Multilingual Chat + Push-to-Talk

import time
import speech_recognition as sr
import pyttsx3

# --- 1. Interactive Language Selector ---
print("🌐 LANGUAGE SELECTION / SELECCIÓN DE IDIOMA")
print("1. English (en)")
print("2. Spanish (es)")
print("3. French  (fr)")
print("4. German  (de)")
print("5. Hindi   (hi)")

choice = input("\n👉 Choose your language (e.g., 'en', 'es'): ").strip().lower()

# Default to English if invalid
lang_map = {
    'en': {'name': 'English', 'code': 'en-US', 'voice_key': 'english'},
    'es': {'name': 'Spanish', 'code': 'es-ES', 'voice_key': 'spanish'},
    'fr': {'name': 'French',  'code': 'fr-FR', 'voice_key': 'french'},
    'de': {'name': 'German',  'code': 'de-DE', 'voice_key': 'german'},
    'hi': {'name': 'Hindi',   'code': 'hi-IN', 'voice_key': 'hindi'}
}

selected = lang_map.get(choice, lang_map['en'])
CURRENT_LANG_NAME = selected['name']
CURRENT_LANG_CODE = selected['code']
VOICE_KEYWORD     = selected['voice_key']

print(f"✅ System set to: {CURRENT_LANG_NAME} ({CURRENT_LANG_CODE})")

# --- 2. Setup Offline TTS Engine ---
try:
    engine = pyttsx3.init()
    voices = engine.getProperty('voices')
    
    # Smart Voice Switcher
    selected_voice_id = None
    for v in voices:
        if VOICE_KEYWORD in v.name.lower():
            selected_voice_id = v.id
            print(f"🗣️ Voice loaded: {v.name}")
            break

    if not selected_voice_id:
        print(f"⚠️ No installed {CURRENT_LANG_NAME} voice found. Using system default.")
        selected_voice_id = voices[0].id

    engine.setProperty('voice', selected_voice_id)
    engine.setProperty('rate', 145) 

except Exception as e:
    print(f"TTS Init Error: {e}")

# --- 3. Main Chat Setup ---
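# The chain is None in local-fallback mode and missing if earlier cells were skipped
if globals().get("conversational_chain") is None:
    print("⚠️ Conversational chain not available. Run the previous cells first (OpenAI key required).")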
else:
    session_id = f"session_{choice}"
    r = sr.Recognizer()
    mic = sr.Microphone()
    
    print(f"\n🤫 Calibrating mic for {CURRENT_LANG_NAME}...")
    with mic as source:
        r.adjust_for_ambient_noise(source, duration=1.0)

    def speak_text(text):
        """Speak the answer aloud; TTS failures are ignored so the chat keeps running."""
        if not text:
            return
        try:
            engine.say(text)
            engine.runAndWait()
        except Exception:
            pass

    def listen_instantly():
        """Record one utterance and return its transcription, or None on failure."""
        with mic as source:
            print(f"🔴 LISTENING ({CURRENT_LANG_NAME})...")
            try:
                audio = r.listen(source, timeout=5, phrase_time_limit=15)
                print("⏳ Processing...")
                # Ask Google's recognizer to transcribe in the selected language
                return r.recognize_google(audio, language=CURRENT_LANG_CODE)
            except sr.UnknownValueError:
                print("🤷 ???")
                return None
            except Exception as e:
                print(f"⚠️ Error: {e}")
                return None

    print("="*50)
    print(f"🎙️ READY! Press [ENTER] to speak in {CURRENT_LANG_NAME}")
    print("="*50)

    try:
        while True:
            user_input = input("\n👉 [ENTER]=Voice | [Type]=Text: ").strip()
            if user_input.lower() in ["exit", "quit", "salir", "au revoir"]:
                print("👋 Goodbye!")
                break

            query = user_input if user_input else listen_instantly()
            if not query:
                continue

            print(f"🗣️ {query}")

            # The chain is prompted to reply in the language of the question
            response = conversational_chain.invoke(
                {"input": query},
                config={"configurable": {"session_id": session_id}}
            )
            answer = response["answer"]

            print(f"\n🤖 {answer}\n")
            speak_text(answer)

    except KeyboardInterrupt:
        print("\n🛑 Stopped.")

🎙️ SYSTEM READY (Session: session_push_to_talk_final)
👉 Press [ENTER] to talk. Or type to chat.
🤔 Thinking...

🤖 Hello Kayode! How can I assist you today?

🤔 Thinking...

🤖 I'm here to provide insights and summaries about love and relationships, particularly lessons learned over time. The focus is on understanding love, which is often misunderstood, and sharing powerful messages that resonate across different age groups. If you have specific questions or topics in mind, feel free to ask!

🤔 Thinking...

🤖 I can only provide information based on the knowledge base provided. If you have questions related to love, relationships, or the content discussed earlier, feel free to ask!

🤔 Thinking...

🤖 I can only provide insights related to the knowledge base on love and relationships. If you have questions about that topic, I'm here to help!

🤔 Thinking...

🤖 The knowledge base focuses on understanding love and relationships, emphasizing how media influences our perceptions. It highlights the importance of emotional growth, healing, and commitment in relationships. Key points include the impact of romantic narratives on expectations and the necessity of mutual patience and growth in a partnership. If you have specific questions or need further insights, let me know!

🤔 Thinking...

🤖 Love is often misunderstood and influenced by societal narratives, such as movies and media. It can be confused with familiar emotional patterns, which may stem from past wounds. True love involves understanding, consistency, and the ability to repair after conflicts, rather than grand gestures or the absence of challenges. It's essential to give yourself validation and approval to avoid seeking it solely from others. If you have more specific questions about love, feel free to ask!

🤔 Thinking...

🤖 Here are the top five important facts about love from the knowledge base:

1. **Chemistry vs. Compatibility**: Chemistry is not the same as compatibility; real love requires steadiness and mutual support, not just sparks.

2. **Self-Validation**: It's crucial to give yourself daily validation and approval to avoid relying on others for your self-worth.

3. **Mutual Growth**: A healthy relationship involves both partners committing to personal growth and healing together.

4. **Understanding Love**: Many chase love based on feelings rather than understanding what love truly means and how it should feel.

5. **Emotional Regulation**: True love brings peace and stability, helping to regulate emotions rather than adding chaos to your life.

If you need more details on any of these points, let me know!

🤔 Thinking...

🤖 The topic is "Understanding Love and Relationships." If you have more questions or need further insights, feel free to ask!

🤔 Thinking...

🤖 To understand love, consider these steps:

1. **Reflect on Influences**: Recognize how media and past experiences shape your perception of love.

2. **Define Love**: Write down what love means to you and what it isn't. Clarify your expectations and feelings.

3. **Communicate**: Discuss your definitions and expectations of love with your partner to ensure alignment.

4. **Focus on Self-Love**: Practice self-validation and approval to build a strong foundation for healthy relationships.

5. **Recognize Patterns**: Be aware of emotional patterns that may confuse love with familiarity or past trauma.

By following these steps, you can gain a deeper understanding of love and foster healthier relationships. If you have more questions, let me know!

👋 Goodbye!


In [None]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# --- 1. Define the "LLM-as-Judge" Prompt ---
EVAL_PROMPT_TEMPLATE = """
You are an expert evaluator for a Question-Answering system.
Your goal is to assess whether the 'Generated Answer' correctly and faithfully answers the 'User Question' based *only* on the 'Ground Truth Answer'.

RULES:
- If the 'Generated Answer' is consistent with, and supported by, the 'Ground Truth Answer', respond with **CORRECT**.
- If the 'Generated Answer' contradicts, fabricates information, or misses the main point of the 'Ground Truth Answer', respond with **INCORRECT**.
- If the 'Generated Answer' says "I don't know" and the 'Ground Truth' says the info is missing, this is **CORRECT**.

--- TASK ---
User Question: {question}
Ground Truth Answer: {ground_truth}
Generated Answer: {generated_answer}

Assessment (CORRECT or INCORRECT):"""

eval_prompt = ChatPromptTemplate.from_template(EVAL_PROMPT_TEMPLATE)

# Create the Evaluation Chain
# We reuse 'llm' (ChatOpenAI) if available, otherwise the local 'generator'
judge_llm = llm if USE_OPENAI else generator

if USE_OPENAI:
    evaluation_chain = eval_prompt | judge_llm | StrOutputParser()
else:
    # Fallback wrapper for local model
    def eval_local(inputs):
        txt = eval_prompt.format(**inputs)
        return judge_llm(txt)
    evaluation_chain = eval_local

# --- 2. Define a RELEVANT Test Set (must match your uploaded text) ---
# These cases assume the "Content Hub / VFX" tools document; update them for your own file.
evaluation_test_set = [
    {
        "question": "What is ConformPulls?",
        "ground_truth_answer": "ConformPulls is a tool that automates the request for original camera files to reduce delays and manages framing for renders."
    },
    {
        "question": "Is SSTV mentioned in the document?",
        "ground_truth_answer": "It handles uploads of camera files, scans directories for errors, organizes media formats, and allows progress tracking from anywhere."
    },
    {
        "question": "Who is Chiity?",
        "ground_truth_answer": "Chiity is the name of the helpful assistant here to discuss the tools in the knowledge base."
    },
    {
        "question": "How do I bake a cake?",
        "ground_truth_answer": "The document does not contain information about cooking or baking cakes."
    }
]

print("✅ Judge Persona & Test Data Loaded.")

In [None]:
import time

def run_evaluation():
    print(f"📉 Starting Evaluation on {len(evaluation_test_set)} test cases...\n")
    
    score_card = []
    
    for i, item in enumerate(evaluation_test_set):
        q = item["question"]
        gt = item["ground_truth_answer"]
        
        print(f"--- Test Case {i+1}: {q} ---")
        
        # 1. Get answer from your RAG Pipeline
        # We use the 'rag_chain' created in Cell 7 (stateless) or invoke conversational_chain
        try:
            # Using the conversational chain with a fresh session ID for isolation
            response = conversational_chain.invoke(
                {"input": q}, 
                config={"configurable": {"session_id": f"eval_session_{i}"}}
            )
            generated_text = response["answer"]
        except Exception as e:
            generated_text = f"Error generating response: {e}"

        print(f"🤖 Bot Answer: {generated_text[:100]}...")  # Print first 100 chars
        
        # 2. Ask the Judge to score it
        eval_input = {
            "question": q,
            "ground_truth": gt,
            "generated_answer": generated_text
        }
        
        if USE_OPENAI:
            grade = evaluation_chain.invoke(eval_input).strip()
        else:
            grade = evaluation_chain(eval_input).strip()
            
        print(f"👨‍⚖️ Judge: {grade}\n")
        score_card.append(grade)
        time.sleep(1) # Avoid rate limits

    # Calculate Final Score
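    # "INCORRECT" contains "CORRECT", so it must be excluded explicitly
    correct_count = sum(
        1 for g in score_card
        if "CORRECT" in g.upper() and "INCORRECT" not in g.upper()
    )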
    accuracy = (correct_count / len(score_card)) * 100
    
    print("="*30)
    print(f"🏁 Final Accuracy: {accuracy:.1f}% ({correct_count}/{len(score_card)})")
    print("="*30)

# Run it!
run_evaluation()
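
One pitfall when tallying judge verdicts: the string `INCORRECT` contains `CORRECT`, so a plain substring test counts both as passes. Below is a minimal sketch of a stricter parser, assuming the judge replies with one of the two words, possibly wrapped in markdown or followed by punctuation. `parse_verdict` is a hypothetical helper, not part of the notebook, and the sample grades are made up for illustration.

```python
import re


def parse_verdict(raw: str):
    """Return True for CORRECT, False for INCORRECT, None if neither word appears."""
    match = re.search(r"\b(INCORRECT|CORRECT)\b", raw.upper())
    if match is None:
        return None
    return match.group(1) == "CORRECT"


grades = ["CORRECT", "**INCORRECT**", "Correct.", "Assessment: INCORRECT"]

naive = sum(1 for g in grades if "CORRECT" in g.upper())
strict = sum(1 for g in grades if parse_verdict(g) is True)

print(naive)   # 4
print(strict)  # 2
```

Returning `None` for unparseable replies lets the caller decide whether to count them as failures or report them separately.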

In [None]:
#!/usr/bin/env python
# coding: utf-8

# ## 1) Install dependencies
import sys
print(sys.version)

# In a .py script, you would typically run these from your terminal first:
# !pip -q install langchain langchain-community chromadb sentence-transformers
# !pip -q install transformers accelerate
# !pip -q install langchain-openai


# ## 2) Load API Key
from pathlib import Path
from dotenv import load_dotenv

# *** UPDATE THIS PATH to your .env file ***
env_path = Path("/Volumes/Untitled/Lessons_By_Week/Project_Rag/Final_Codes/ATT81022.env")
load_dotenv(dotenv_path=env_path)


# ## 3) Set Constants & Check Key
import os
from pathlib import Path

# Path where Chroma (vector DB) will be persisted
CHROMA_DIR = "./chroma"
COLLECTION = "uploaded_text"

# --- Optional: OpenAI ---
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY", "").strip()
USE_OPENAI = bool(OPENAI_API_KEY)

if USE_OPENAI:
    print("✅ Using OpenAI for generation.")
else:
    print("ℹ️ OPENAI_API_KEY not set; will use local Transformers fallback.")

Path(CHROMA_DIR).mkdir(parents=True, exist_ok=True)
print("CHROMA_DIR =", Path(CHROMA_DIR).resolve())
print("COLLECTION  =", COLLECTION)


# ## 4) Load Text Document

# *** UPDATE THIS PATH to your .txt file ***
uploaded_path = "/Volumes/Untitled/Youtube_QA_Rag_System/Working_Pipelines/text/RAG_TEXT.txt"
from pathlib import Path

p = Path(uploaded_path).expanduser()
assert p.exists(), f"File not found: {p}"

text = p.read_text(encoding="utf-8", errors="ignore")
print(f"Loaded {len(text):,} characters from:", p.resolve())


# ## 5) Define LLM (Generator)

generator = None

if USE_OPENAI:
    from langchain_openai import ChatOpenAI
    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
    generator = llm
    print("Using ChatOpenAI: gpt-4o-mini")
else:
    # Local Transformers text2text generation via HF pipeline
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline
    print("Loading local model: google/flan-t5-base...")
    model_id = "google/flan-t5-base"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    hf_pipe = pipeline("text2text-generation", model=model, tokenizer=tokenizer)

    class HFText2TextLLM:
        def __call__(self, prompt_text: str) -> str:
            out = hf_pipe(prompt_text, max_new_tokens=256, truncation=True)
            return out[0]["generated_text"]
    
    generator = HFText2TextLLM()
    print("Using local Transformers: flan-t5-base")


# ## 6) Chunk, Embed, and Store in Vector DB

from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_core.documents import Document
from langchain_community.embeddings import HuggingFaceEmbeddings

# 1) Chunk the text
splitter = RecursiveCharacterTextSplitter(
    chunk_size=300,
    chunk_overlap=50,
    separators=["\n\n", "\n", ". ", " ", ""],
)
docs = [Document(page_content=c, metadata={"source": str(p.name)}) 
        for c in splitter.split_text(text)]
print(f"Chunks created: {len(docs)}")

# 2) Embedding function
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    encode_kwargs={"normalize_embeddings": True},
)

# 3) Create (or re-open) the Chroma collection
vs = Chroma(
    collection_name=COLLECTION,
    persist_directory=CHROMA_DIR,
    embedding_function=embeddings,
)

# 4) Add docs
vs.add_documents(docs)
print("✅ Stored in Chroma at:", Path(CHROMA_DIR).resolve())

# 5) Create the retriever
retriever = vs.as_retriever(search_kwargs={"k": 5})
print("\n✅ Created 'retriever' variable.")


# ## 7) Build RAG Chain

from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

def format_docs(docs):
    out = []
    for i, d in enumerate(docs):
        src = d.metadata.get("source", "")
        out.append(f"[{i}] {d.page_content}\n(source: {src})")
    return "\n\n".join(out)

SYSTEM_PROMPT = (
    "You are a helpful assistant. Answer the question **only** from the provided context; "
    "however, you may engage in general friendly conversation. "
    "If the answer isn't present, say: 'I don't see that in the file.' "
    "Keep the last 2 questions in mind for retrieval. "
    "Give enough context for the user to understand."
)

prompt = ChatPromptTemplate.from_messages([
    ("system", "{system}"),
    ("human", "Question: {question}\n\nContext:\n{context}\n\nAnswer succinctly:"),
])

# Build chain
if USE_OPENAI:
    chain = (
        RunnableParallel({
            "context": (retriever | format_docs),
            "question": RunnablePassthrough(),
            "system": (lambda _: SYSTEM_PROMPT),
        })
        | prompt
        | generator
        | StrOutputParser()
    )
    print("✅ RAG chain (OpenAI) is ready.")
else:
    # Emulate the same behavior for the local model in a function
    def answer_local(question: str) -> str:
        # retriever.invoke() replaces the deprecated get_relevant_documents()
        ctx = format_docs(retriever.invoke(question))
        full_prompt = (
            f"{SYSTEM_PROMPT}\n\n"
            f"Question: {question}\n\n"
            f"Context:\n{ctx}\n\n"
            f"Answer succinctly:"
        )
        return generator(full_prompt)

    chain = answer_local
    print("✅ RAG function (Local Transformers) is ready.")


# ## 8) Ask Questions (Interactive)

# Define the 'ask' function
def ask(question: str):
    if not question.strip():
        return "Please enter a non-empty question."
    if callable(chain) and not hasattr(chain, "invoke"):
        # Local HF function path
        return chain(question)
    # OpenAI path via LangChain
    return chain.invoke(question)


# ---
# ## 9) Evaluation Module
# ---

# 1. Define your Test Set
# *** UPDATE THIS TEST SET with questions and answers relevant to your document ***

evaluation_test_set = [
    {
        "question": "What is the file about?",
        "ground_truth_answer": "The file is a comprehensive guide for travelers exploring Europe, focusing on its diverse cultural and natural landscapes, history, art, and the blend of tradition and modernity."
    },
    {
        "question": "What is the Netherlands known for?",
        "ground_truth_answer": "The Netherlands is known for its canals, tulip fields, and cycling culture. Key attractions include Amsterdam's Rijksmuseum and Anne Frank House."
    },
    {
        "question": "Where is the Netherlands located?",
        "ground_truth_answer": "The Netherlands is located in Western Europe."
    },
    {
        "question": "What is the capital of Spain?",
        "ground_truth_answer": "The document mentions Spain's cultural heritage, architecture, and historical sites, but it does not explicitly state the capital city."
    }
]

print(f"Loaded {len(evaluation_test_set)} evaluation questions.")


# 2. Define the Evaluation Function (LLM-as-Judge)

EVAL_PROMPT_TEMPLATE = """
You are an expert evaluator for a Question-Answering system. 
Your goal is to assess whether the 'Generated Answer' correctly and faithfully answers the 'User Question' based *only* on the 'Ground Truth Answer'.

RULES:
- If the 'Generated Answer' is consistent with, and supported by, the 'Ground Truth Answer', respond with **CORRECT**.
- If the 'Generated Answer' contradicts, fabricates information, or misses the main point of the 'Ground Truth Answer', respond with **INCORRECT**.
- If the 'Generated Answer' is something like 'I don't see that in the file' and the 'Ground Truth Answer' also indicates the information is missing, this is **CORRECT**.

--- EXAMPLES ---
User Question: What is the capital of France?
Ground Truth Answer: Paris is the capital of France.
Generated Answer: The capital of France is Paris.
Assessment: CORRECT

User Question: What is the capital of France?
Ground Truth Answer: Paris is the capital of France.
Generated Answer: I don't see that in the file.
Assessment: INCORRECT

User Question: What is the capital of Mars?
Ground Truth Answer: The document does not mention the capital of Mars.
Generated Answer: I don't see that in the file.
Assessment: CORRECT
--- END EXAMPLES ---

Provide only the final assessment ('CORRECT' or 'INCORRECT').

--- TASK ---
User Question: {question}
Ground Truth Answer: {ground_truth}
Generated Answer: {generated_answer}

Assessment:"""

eval_prompt = ChatPromptTemplate.from_template(EVAL_PROMPT_TEMPLATE)

# Note: We re-use the 'generator' (LLM) from Cell 5 as our judge
if USE_OPENAI:
    evaluation_chain = (
        eval_prompt
        | generator 
        | StrOutputParser()
    )
    print("✅ LLM-as-Judge (OpenAI) is ready.")
else:
    # Local models require the full prompt to be built manually
    def eval_local(inputs: dict) -> str:
        full_prompt = eval_prompt.format(**inputs)
        return generator(full_prompt)
    
    evaluation_chain = eval_local
    print("✅ LLM-as-Judge (Local Transformers) is ready.")


def evaluate_pipeline():
    print("Running evaluation...")
    print("="*30)
    
    correct