## Expert Knowledge Worker

### A question answering agent that is an expert knowledge worker
### To be used by the new users, of Fitflix
### The agent needs to be accurate and the solution should be low cost.

This project will use RAG (Retrieval Augmented Generation) to ensure our question/answering assistant has high accuracy.

This first implementation will use a simple, brute-force type of RAG..

### Sidenote: Business applications of this week's projects

RAG is perhaps the most immediately applicable technique of anything that we cover in the course! In fact, there are commercial products that do precisely what we build this week: nuanced querying across large databases of information, such as company contracts or product specs. RAG gives you a quick-to-market, low cost mechanism for adapting an LLM to your business area.

In [17]:
pip install --upgrade langchain langchain-community langchain-core langchain-google-genai google-generativeai python-dotenv chromadb sentence-transformers matplotlib scikit-learn plotly

Collecting langchain
  Downloading langchain-0.3.27-py3-none-any.whl.metadata (7.8 kB)
Collecting langchain-community
  Downloading langchain_community-0.3.27-py3-none-any.whl.metadata (2.9 kB)
Collecting langchain-core
  Using cached langchain_core-0.3.72-py3-none-any.whl.metadata (5.8 kB)
Collecting langchain-google-genai
  Using cached langchain_google_genai-2.1.8-py3-none-any.whl.metadata (7.0 kB)
Collecting google-generativeai
  Using cached google_generativeai-0.8.5-py3-none-any.whl.metadata (3.9 kB)
Collecting python-dotenv
  Downloading python_dotenv-1.1.1-py3-none-any.whl.metadata (24 kB)
Collecting chromadb
  Downloading chromadb-1.0.15-cp39-abi3-win_amd64.whl.metadata (7.1 kB)
Collecting sentence-transformers
  Downloading sentence_transformers-5.0.0-py3-none-any.whl.metadata (16 kB)
Collecting matplotlib
  Downloading matplotlib-3.10.5-cp312-cp312-win_amd64.whl.metadata (11 kB)
Collecting scikit-learn
  Downloading scikit_learn-1.7.1-cp312-cp312-win_amd64.whl.metadata (11 k

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
ultralytics 8.3.78 requires numpy<=2.1.1,>=1.23.0, but you have numpy 2.1.3 which is incompatible.
ydata-profiling 4.16.1 requires matplotlib<=3.10,>=3.5, but you have matplotlib 3.10.5 which is incompatible.

[notice] A new release of pip is available: 25.0.1 -> 25.1.1
[notice] To update, run: C:\Users\dandu\AppData\Local\Programs\Python\Python312\python.exe -m pip install --upgrade pip


In [3]:

# rag.ipynb

# imports
import os
from dotenv import load_dotenv
from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain.prompts import PromptTemplate
from langchain.document_loaders import DirectoryLoader, TextLoader # For loading markdown files
from langchain.text_splitter import MarkdownHeaderTextSplitter # Ideal for structured markdown
import shutil # For safely removing Chroma DB directory


# GOOGLE_API_KEY is automatically picked up by LangChain's Google integrations

# loading model (LLM for generation)
# IMPORTANT: Use the exact model name from Google AI Studio (e.g., "gemini-1.5-pro")
# I'm using "gemini-1.5-pro" as it's the current leading model.
# If you actually have a "gemini-2.5-pro" from Google (highly unusual/new), use that exact string.
gemini_llm = ChatGoogleGenerativeAI(model="gemini-2.5-pro", temperature=0.7 ,google_api_key="AIzaSyDLQ-mxPkiamxA0g_4j9i-wiX3cX2m4Eo8")



In [4]:
# making the process for vector embeddings

# Define the path to your knowledge base and the specific markdown file
knowledge_base_path = "knowledge_base"
markdown_file_glob = "**/*.md" # Matches 'about.md' and any other .md files in subfolders

# Load documents from the 'knowledge-base' folder
# Use TextLoader for .md files and DirectoryLoader to find them
print(f"Loading documents from '{knowledge_base_path}'...")
loader = DirectoryLoader(
    knowledge_base_path,
    glob=markdown_file_glob,
    loader_cls=TextLoader,
    loader_kwargs={'encoding': 'utf-8'} # Specify encoding for broad compatibility
)
documents = loader.load()
print(f"Loaded {len(documents)} document(s).")

# Split documents into chunks using MarkdownHeaderTextSplitter
# This splitter understands markdown structure (like # and ## headings)
# and keeps them as metadata, which is great for RAG context.
headers_to_split_on = [
    ("#", "Header1"),
    ("##", "Header2")
]
markdown_splitter = MarkdownHeaderTextSplitter(headers_to_split_on=headers_to_split_on)

# Split each loaded document's content
chunks = []
for doc in documents:
    # Ensure doc.page_content is a string for splitting
    if isinstance(doc.page_content, str):
        # Pass the content and original document's metadata
        split_docs = markdown_splitter.split_text(doc.page_content)
        for s_doc in split_docs:
            # Combine original metadata with new header metadata
            s_doc.metadata = {**doc.metadata, **s_doc.metadata}
        chunks.extend(split_docs)
    else:
        print(f"Warning: Document content is not a string for splitting: {doc.metadata}")

print(f"Split into {len(chunks)} chunks.")

Loading documents from 'knowledge_base'...
Loaded 9 document(s).
Split into 43 chunks.


In [5]:
# Initialize Gemini Embeddings Model
# 'models/embedding-001' is a common and stable choice for text embeddings.
gemini_embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
print("Embeddings model loaded.")

# Define the directory to persist your Chroma vector database
db_persist_dir = "fitflix_chroma_db_gemini"

# Clear existing vector database for a fresh start (optional, but good for development)
if os.path.exists(db_persist_dir):
    print(f"Clearing existing vector database at '{db_persist_dir}'...")
    try:
        shutil.rmtree(db_persist_dir)
        print("Existing database removed.")
    except Exception as e:
        print(f"Could not remove database directory: {e}")

# Create and persist the vector store
print(f"Creating vector store in '{db_persist_dir}' and adding {len(chunks)} chunks...")
vectorstore = Chroma.from_documents(
    documents=chunks,
    embedding=gemini_embeddings,
    persist_directory=db_persist_dir
)
vectorstore.persist() # Ensures data is written to disk
print(f"Vector store created with {vectorstore._collection.count()} entries and persisted.")


# then the code have the rag system

# Create a retriever from the vector store
# 'k' specifies how many top relevant chunks to retrieve
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})

# Define the prompt template for the LLM
# This guides the LLM on how to use the retrieved context.
prompt_template = """
You are an AI assistant specialized in information about Fitflix entities.
Use the following pieces of context to answer the question at the end.
and try to interactive if any body greets you greet them also. and follow the humanity rules
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Keep the answer concise, professional, and directly address the question.

Context:
{context}

Question: {question}
Answer:
"""
PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)

# Set up conversational memory for the RAG chain
# This allows the RAG system to remember past turns in the conversation.
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)


DefaultCredentialsError: Your default credentials were not found. To set up Application Default Credentials, see https://cloud.google.com/docs/authentication/external/set-up-adc for more information.

In [None]:

# # Create the Conversational Retrieval QA Chain
# # This chain combines the LLM, retriever, and memory for a full RAG experience.
qa_chain = ConversationalRetrievalChain.from_llm(
    llm=gemini_llm,
    retriever=retriever,
    memory=memory,
    combine_docs_chain_kwargs={"prompt": PROMPT}
)

# # Interactive chat loop for Jupyter Lab
# print("\n--- RAG System Ready for Chat ---")
# print("Ask questions about Fitflix (type 'quit' to exit).")

# def chat_with_rag():
#     while True:
#         user_query = input("\nYour question: ")
#         if user_query.lower() == 'quit':
#             break

#         try:
#             # Invoke the QA chain with the user's question
#             result = qa_chain.invoke({"question": user_query})
#             print("\nAI Answer:", result["answer"])

#         except Exception as e:
#             print(f"An error occurred: {e}")
#             print("Please check your internet connection and API key.")

# # Start the chat
# chat_with_rag()

# print("\nChat session ended. Goodbye!")


In [None]:
import gradio as gr  # Make sure Gradio is imported

# --- Gradio Chat Interface Integration ---

# Define your chat function
def chat_with_rag_gradio(message, history):
    result = qa_chain.invoke({"question": message})
    return result["answer"]

# Set up the Gradio ChatInterface
demo = gr.ChatInterface(
    fn=chat_with_rag_gradio,
    type="messages",
    title="Fitflix RAG Chatbot (Powered by Gemini)",
    description="Ask me questions about Fitflix from the provided knowledge base.",
    examples=[
        "Who founded Fitflix Gym Brookfield?",
        "What are the services offered by Fitflix Gyms?",
        "Tell me about Fitflix VV Nutrition."
    ],
    theme="soft",  # Optional: modern theme
    textbox=gr.Textbox(placeholder="Type your question here...", scale=7)
)

# Launch the interface
demo.launch(inbrowser=True, share=False)
