This notebook builds a Retrieval-Augmented Generation (RAG) chatbot that retrieves relevant information from PDF documents and generates responses with a Large Language Model (LLM). It loads the PDFs, extracts and splits their text, stores the embeddings in a vector store, and uses semantic search to ground its answers.

## 1- Import Libraries 

In [1]:
import os
from langchain_fireworks import ChatFireworks
from langchain_fireworks import Fireworks
from langchain_fireworks import FireworksEmbeddings
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.chains import RetrievalQA
from langchain.vectorstores import Chroma
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain.prompts import PromptTemplate
import warnings
import pandas as pd
import numpy as np
from datetime import datetime, timedelta
from langchain.retrievers import EnsembleRetriever
import re
from dateparser import parse
from dateparser.search import search_dates
from supabase import create_client, Client
from dotenv import load_dotenv
from langchain.schema import Document


## 2- Set API key 

In [2]:
# Load the .env file
load_dotenv()

# Access the variables
api_key = os.getenv("API_KEY")
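
`os.getenv` returns `None` when a variable is missing, so a misconfigured `.env` file would only show up later as an authentication error. A minimal sketch of an early check follows; `require_env` and `DEMO_KEY` are hypothetical names used for illustration and are not part of python-dotenv or LangChain.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message.

    Hypothetical helper for illustration; not provided by any library used here.
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name!r} is not set; check your .env file."
        )
    return value


# Demo with a throwaway variable so the sketch runs without a real key.
os.environ["DEMO_KEY"] = "placeholder"
demo_value = require_env("DEMO_KEY")
print(demo_value)
```

In the notebook itself, the same helper could be called with `"API_KEY"` in place of the bare `os.getenv` call.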


## 3- Call LLM 

In [3]:
llm = Fireworks(api_key=api_key, model="accounts/fireworks/models/deepseek-v3", temperature=1.0)
response = llm.invoke("Hello, how are you?")
print(response)

 Welcome to our blog "Your Health Our Priority". Here, you will find important


## 4- Initialize Embeddings

In [4]:
embeddings = FireworksEmbeddings(api_key=api_key)

## 5- Reading PDFs

In [5]:
pdf_files = [
    r"How-to-Manage-your-Finances.pdf",
    r"pdf_50_20_30.pdf",
    r"Personal-Finance-Management-Handbook.pdf",
    r"reach-my-financial-goals.pdf",
    r"tips-to-manage-your-money.pdf",
    r"beginners-guide-to-saving-2024.pdf",
    r"40MoneyManagementTips.pdf",
    r"1_55_ways_to_save.pdf",
]

## 6- Splitting documents into smaller meaningful chunks

In [6]:
# Load and split PDF
documents = []
for pdf in pdf_files:
    loader = PyPDFLoader(pdf)
    documents.extend(loader.load())

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = text_splitter.split_documents(documents)
# Generate embeddings in batches
def batch_texts(texts, batch_size=256):
    """Yield successive slices of at most batch_size items."""
    for i in range(0, len(texts), batch_size):
        yield texts[i:i + batch_size]

batch_size = 256
chunk_batches = list(batch_texts(chunks, batch_size))

pdf_embeddings = []
for batch in chunk_batches:
    # Use a distinct name so the batch_texts() helper is not overwritten
    batch_contents = [chunk.page_content for chunk in batch]
    batch_embeddings = embeddings.embed_documents(batch_contents)
    pdf_embeddings.extend(batch_embeddings)
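
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified fixed-window sketch in plain Python. It is only an illustration: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraphs and line breaks, so its actual chunk boundaries will differ. The function name `sliding_chunks` and the sample string are hypothetical.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    pieces = []
    for start in range(0, len(text), step):
        pieces.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return pieces


sample = "abcdefghijklmnopqrstuvwxyz"
pieces = sliding_chunks(sample, chunk_size=10, chunk_overlap=4)
print(pieces)
```

Each piece repeats the last four characters of the previous one, which is the role the 200-character overlap plays for the 1000-character chunks above: text near a boundary appears in both neighboring chunks.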

## 7- Store chunks in a FAISS vector store


In [7]:
#  Store in FAISS
vector_store = FAISS.from_embeddings(
    text_embeddings=list(zip([chunk.page_content for chunk in chunks], pdf_embeddings)),
    embedding=embeddings
)

pdf_retriever = vector_store.as_retriever(search_kwargs={"k": 5})  # Retrieve top 5 relevant chunks
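
Conceptually, a retriever with `k=5` ranks the stored vectors by closeness to the query vector and returns the five nearest chunks. The pure-Python sketch below illustrates the idea with cosine similarity on toy 2-D vectors. It is an illustration under simplifying assumptions: FAISS uses optimized indexes and its own distance metric, and the names `cosine_similarity`, `top_k`, and `toy_store` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, items, k=5):
    """items is a list of (text, vector) pairs; return the k most similar texts."""
    ranked = sorted(
        items,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Toy data invented for this example.
toy_store = [
    ("budgeting basics", [1.0, 0.0]),
    ("saving tips", [0.8, 0.6]),
    ("unrelated recipe", [0.0, 1.0]),
]
nearest = top_k([1.0, 0.1], toy_store, k=2)
print(nearest)
```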

## 8- Create memory

In [8]:
#  Initialize memory
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
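
As a rough mental model, a buffer memory keeps every question and answer in order and hands the accumulated history to the chain on each call. The toy class below sketches that behavior in plain Python; `SimpleBufferMemory` is a hypothetical stand-in and does not reproduce the internals of `ConversationBufferMemory`.

```python
class SimpleBufferMemory:
    """Toy conversation buffer: stores every turn, in order, without trimming."""

    def __init__(self):
        self.turns = []  # list of (role, text) tuples

    def save_turn(self, question: str, answer: str) -> None:
        self.turns.append(("Human", question))
        self.turns.append(("AI", answer))

    def as_text(self) -> str:
        """Render the history as one string, one message per line."""
        return "\n".join(f"{role}: {text}" for role, text in self.turns)


# Sample turns invented for this example.
toy_memory = SimpleBufferMemory()
toy_memory.save_turn("How do I start saving?", "Begin with a budget.")
toy_memory.save_turn("What next?", "Pay yourself first.")
print(toy_memory.as_text())
```

Because nothing is ever dropped, the history grows with every turn, which is worth keeping in mind for long conversations and limited context windows.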




## 9- Define a prompt template

In [9]:
# Define the prompt template for financial advice
finance_template = PromptTemplate(
    input_variables=["context", "question", "chat_history"],
    template="""
You are an expert financial advisor. Use the chat history and retrieved context to answer the question in a conversational manner.

Chat History:
{chat_history}

Context:
{context}

Question:
{question}

Answer:
"""
)
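
To see what the model actually receives, the template can be filled in by hand. The sketch below applies plain `str.format` to a copy of the same template text, using made-up placeholder values; inside the chain, LangChain performs this substitution with the real history, retrieved chunks, and question.

```python
# Copy of the template text from the cell above.
template_text = """
You are an expert financial advisor. Use the chat history and retrieved context to answer the question in a conversational manner.

Chat History:
{chat_history}

Context:
{context}

Question:
{question}

Answer:
"""

# Placeholder values, invented purely for this preview.
preview = template_text.format(
    chat_history="Human: Hi\nAI: Hello! How can I help?",
    context="Pay yourself first: move money to savings before spending.",
    question="How can I save more each month?",
)
print(preview)
```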


In [10]:
#  Initialize Fireworks LLM
llm = Fireworks(
    api_key=api_key,
    model="accounts/fireworks/models/deepseek-v3",
    temperature=1.0,
    max_tokens=1024
)


## 10- Create conversational RAG pipeline

In [11]:
# Create the conversational RAG pipeline
conversational_rag = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=pdf_retriever,
    chain_type="stuff",
    memory=memory,
    combine_docs_chain_kwargs={"prompt": finance_template}
)

In [27]:
query_1 = "What are the best strategies for saving money?"
response_1 = conversational_rag.invoke({"question": query_1})
print(response_1["answer"])



Great question! Saving money is a crucial part of financial well-being, and here are two of the best strategies to get started:

1. **Create a Budget:**  
   The foundation of saving is understanding your income and expenses. Start by tracking where your money goes each month. This will help you identify areas where you can cut back and allocate more toward savings. 

2. **Pay Yourself First:**  
   Treat your savings like a non-negotiable expense. Set aside a portion of your income as soon as you get paid, before spending on anything else. Automating this process can make it easier and more consistent.

These two strategies work hand-in-hand: budgeting helps you understand your financial situation and identify opportunities to save, while paying yourself first ensures that saving becomes a priority. Let me know if you'd like help creating a personalized savings plan!


## 11- Example follow-up questions

In [26]:
query_1 = "By using previous advice can you tell the best two ideas?"
response_1 = conversational_rag.invoke({"question": query_1})
print(response_1["answer"])



 Based on the advice provided and the context, the **two best strategies for saving money** are:

1. **Budgeting:**  
   Budgeting is the foundation of saving money. Before you can save effectively, you need to understand your income and expenses. This allows you to identify areas where you can cut back and allocate more toward savings. As the context mentions, "You must budget before you can save, otherwise you won’t know what is affordable and your plan won’t be realistic." Tools and resources, like those from *Money Saving Expert*, can help you create a realistic and sustainable budget.

2. **Paying Yourself First:**  
   Treating your savings like a non-negotiable expense is another powerful strategy. As the context explains, "One way to get into the habit of saving money is to 'pay yourself first.' That means putting money in your savings account before you spend it on other things." Automating this process, such as having your employer deduct money directly into a savings account

## 12- Fetch database

In [14]:
# Load .env
load_dotenv()
url = os.getenv("SUPABASE_URL")
key = os.getenv("SUPABASE_KEY")

# Connect to Supabase
supabase: Client = create_client(url, key)

# Fetch all rows from the "transactions" table
response = supabase.table("transactions").select("*").execute()
data = response.data
print(data)

[{'transaction_id': '3683c4fc-018d-42ee-803b-69845dd1f0cd', 'user_id': '027051a8-3887-4150-9cfb-a51efb9146b5', 'income': 1000, 'expenses': 0, 'data': 'Monthly salary deposit', 'category_id': '8867173b-22a3-408e-a2b8-9ee2f0bc70b2', 'description': 'August salary', 'created_at': '2025-04-21T10:51:09.833846'}]
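
If exact totals are needed, they could be computed directly from the fetched rows instead of being inferred by the model from retrieved text. The sketch below groups income and expenses by the month of `created_at`. It assumes rows shaped like the output above; `monthly_totals` and `sample_rows` are hypothetical names, and the sample row copies only the fields it needs.

```python
from collections import defaultdict
from datetime import datetime


def monthly_totals(rows):
    """Sum income and expenses per calendar month, keyed as 'YYYY-MM'."""
    totals = defaultdict(lambda: {"income": 0, "expenses": 0})
    for row in rows:
        month = datetime.fromisoformat(row["created_at"]).strftime("%Y-%m")
        totals[month]["income"] += row["income"]
        totals[month]["expenses"] += row["expenses"]
    return dict(totals)


# One row with the same values as the fetched transaction above.
sample_rows = [
    {"income": 1000, "expenses": 0, "created_at": "2025-04-21T10:51:09.833846"},
]
totals_by_month = monthly_totals(sample_rows)
print(totals_by_month)
```

Note that this groups by the `created_at` timestamp, which may not match the month mentioned in a transaction's free-text description.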


## 13- Convert transaction table into text 

In [15]:
documents = [
    Document(
        page_content=(
            f"Transaction on {tx['created_at']}:\n"
            f"- Description: {tx['description']}\n"
            f"- Notes: {tx['data']}\n"
            f"- Income: {tx['income']} EGP\n"
            f"- Expenses: {tx['expenses']} EGP\n"
            f"- Category ID: {tx['category_id']}"
        ),
        metadata={"source": "transactions", "transaction_id": tx['transaction_id']}
    )
    for tx in data
]



In [16]:
print(documents[0].page_content)


Transaction on 2025-04-21T10:51:09.833846:
- Description: August salary
- Notes: Monthly salary deposit
- Income: 1000 EGP
- Expenses: 0 EGP
- Category ID: 8867173b-22a3-408e-a2b8-9ee2f0bc70b2


In [17]:
print(documents[0].metadata)


{'source': 'transactions', 'transaction_id': '3683c4fc-018d-42ee-803b-69845dd1f0cd'}


## 14- Create embeddings for database data

In [18]:
texts = [doc.page_content for doc in documents]
db_embeddings = embeddings.embed_documents(texts)


## 15- Create a vector store for the database embeddings

In [19]:
# Create FAISS vector database
db_vector_store = FAISS.from_embeddings(
    text_embeddings=list(zip(texts, db_embeddings)),
    embedding=embeddings
)

# Create a retriever for searching
db_retriever = db_vector_store.as_retriever(search_kwargs={"k": 5})

## 16- Merge PDF and database retrievers

In [20]:
combined_retriever = EnsembleRetriever(retrievers=[pdf_retriever, db_retriever], weights=[0.5, 0.5])
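
To give a feel for how two ranked result lists can be merged with weights, the sketch below implements weighted Reciprocal Rank Fusion in plain Python. This assumes that `EnsembleRetriever` fuses results with this scheme and a constant of 60; treat both as assumptions to check against the LangChain documentation. The function `weighted_rrf` and the toy result lists are hypothetical.

```python
from collections import defaultdict


def weighted_rrf(ranked_lists, weights, c=60):
    """Fuse ranked lists: each document scores weight / (rank + c) per list."""
    scores = defaultdict(float)
    for ranking, weight in zip(ranked_lists, weights):
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] += weight / (rank + c)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)


# Toy results invented for this example.
pdf_hits = ["budget guide", "saving tips", "debt advice"]
db_hits = ["salary transaction", "saving tips"]
fused = weighted_rrf([pdf_hits, db_hits], weights=[0.5, 0.5])
print(fused)
```

A document returned by both retrievers accumulates score from each list, so it tends to rise above documents that only one retriever found. With equal weights of 0.5, neither source is favored.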


## 17- Edit RAG pipeline

In [30]:
edit_prompt = PromptTemplate.from_template("""
You are a helpful financial assistant. Use the following context to answer the question in a friendly and practical way. 
If the question is about spending, use transaction data. 
If it's about financial advice, give tips based on the provided documents or general best practices.

Context:
{context}

Question:
{question}

Answer in a clear, helpful way:
""")

In [31]:
#  RAG Pipeline
conversational_rag = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=combined_retriever,
    chain_type="stuff",
    memory=memory,
    combine_docs_chain_kwargs={"prompt": edit_prompt}
)

In [32]:
query = "How much do I spend on augest"
response = conversational_rag.invoke({"question": query})
print(response["answer"])


 Based on the provided transaction data, you did not record any expenses in August. The only transaction listed for that month is your August salary deposit of **1000 EGP**. If you did spend money but didn’t track it, consider using a spending notebook or app to monitor your expenses moving forward!


In [39]:
query2 = "how to save money"
response2 = conversational_rag.invoke({"question": query2})
print(response2["answer"])


Saving money effectively starts with creating a **realistic budget** to understand your income and expenses. Here’s how to get started:

1. **Track Your Spending**: Use your transaction data (e.g., salary deposits, expenses) to identify where your money is going. For example, if your monthly income is 1000 EGP, track how much you spend on essentials and non-essentials like eating out or entertainment.

2. **Set a Savings Goal**: Decide what you’re saving for—whether it’s an emergency fund, a big purchase, or long-term security. For instance, aim to save 10% of your income (e.g., 100 EGP monthly) as a starting point.

3. **Cut Unnecessary Expenses**: Review your spending habits and identify areas to cut back. For example, reduce dining out or subscription services that you don’t fully use.

4. **Automate Savings**: Set up a standing order or use apps to automatically transfer money into a savings account right after you get paid. This ensures you “pay yourself first.”

5. **Start Small*

In [41]:
query_3="how i can manage my finance ? "
response_3=conversational_rag.invoke({"question": query_3})
print(response_3["answer"])


Here’s a practical and friendly guide to managing your finances effectively:

1. **Track Your Spending**: Know where your money goes. Use a budgeting app or simply write down your expenses to identify areas where you can cut back. Even small daily expenses (like coffee) can add up over time.

2. **Create a Budget**: Plan how much you’ll spend and stick to it. Include essentials like rent, groceries, and bills, but also allocate some money for savings and fun. Adjust your budget as needed to stay on track.

3. **Save First**: Treat savings like a bill you must pay. Set aside money every month—even if it’s a small amount—toward an emergency fund, a big purchase, or future goals.

4. **Reduce Debt**: Avoid unnecessary debt, especially high-interest credit cards. If you already have debt, prioritize paying it off. Aim to keep debt payments below 20% of your income.

5. **Plan for Big Expenses**: Save in advance for major purchases (like a car or vacation) instead of relying on credit. This

In [36]:
query_4 = "What are the best strategies for saving money?"
response_4 = conversational_rag.invoke({"question": query_4})
print(response_4["answer"])



 Here are some practical strategies to help you save money effectively:

1. **Budget First**: Before saving, create a realistic budget to understand your income and expenses. This helps you identify areas where you can cut back and allocate funds toward savings. Use budgeting tools or apps to track your spending.

2. **Pay Yourself First**: Treat savings as a non-negotiable expense. Set aside a portion of your income (even if it’s small) as soon as you get paid. Automate transfers to a savings account to make this easier.

3. **Start Small**: If you’re new to saving, begin with manageable goals, like saving spare change in a jar. Gradually increase your savings as your confidence and income grow.

4. **Reduce Unnecessary Expenses**: Identify non-essential spending (e.g., eating out, subscriptions) and cut back on these. Redirect the saved money into your savings or investments.

5. **Avoid Debt**: Limit credit card use and focus on paying off existing debt. Keeping debt payments below 

In [40]:
query_5="What's the result of 2+2"
response_5 = conversational_rag.invoke({"question": query_5})
print(response_5["answer"])


The result of 2 plus 2 is **4**. This is a basic arithmetic addition, where you combine two quantities to get a total. 🧮
