This notebook builds a personal-finance assistant chatbot that helps users track and manage their transactions. The chatbot uses a large language model (LLM) to extract financial details from user input, classifies each entry as income or expense, and logs it to a database. **It personalizes categories for each user**, validates entries, and gives feedback on spending habits. Combining natural language understanding with a live database connection lets it automate much of the routine work of personal budgeting.

## 1- Import Libraries 

In [2]:
import os
import re
from datetime import datetime

from dateparser import parse as date_parse
from dotenv import load_dotenv
from langchain.memory import ConversationBufferMemory
from langchain_fireworks import Fireworks
from supabase import create_client, Client

## 2- Set API key 

In [None]:
# Load the .env file
load_dotenv()

# Access the variables
api_key = os.getenv("API_KEY")
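
`os.getenv` returns `None` when a variable is missing, so a typo in the `.env` file only surfaces later as an authentication error. A minimal sketch of a fail-fast check follows; the helper name `require_env` and the variable `DEMO_API_KEY` are hypothetical and not part of the notebook.

```python
import os

def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Missing environment variable: {name}")
    return value

# Placeholder variable set only for this demonstration.
os.environ["DEMO_API_KEY"] = "dummy-value"
print(require_env("DEMO_API_KEY"))
```

The same helper could wrap the `API_KEY`, `SUPABASE_URL`, and `SUPABASE_KEY` lookups, assuming those are the names used in the `.env` file.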

## 3- Call LLM 

In [4]:
llm = Fireworks(api_key=api_key, model="accounts/fireworks/models/deepseek-v3", temperature=1.0, max_tokens=1024)
response = llm.invoke("Hello, how are you?")
print(response)

 My name is Jack and I’m from Japan. Today, we’re going to talk about two English words that often confuse learners: “tooth” and “teeth”. Many people wonder why one word uses an “oo” sound, while the other uses an “ee” sound. Well, we’re going to explore that today and also look at some related grammar points. Let’s get started!

### 1. The basics: singular and plural forms

The first thing you need to know is that these words are related.

- Tooth (pronounced /tuːθ/): This is the singular form. It refers to one of the hard, white structures in the mouth used for biting and chewing.

Example sentences to learn the usage of “tooth”:

- I need to brush my tooth every day. (Here, “tooth” refers to one specific tooth, perhaps a problem tooth.)
- The child lost a tooth yesterday. (Referring to a single lost tooth.)
- My tooth hurts; I think I need to see the dentist.

- Teeth (pronounced /tiːθ/): This is the plural form. It refers to more than one tooth.

Example sentences to learn the usag

## 4- Create memory

In [5]:
#  Initialize memory
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)




## 5- Fetch database

In [None]:
# Load .env
load_dotenv()
url = os.getenv("SUPABASE_URL")
key = os.getenv("SUPABASE_KEY")

# Connect to Supabase
supabase: Client = create_client(url, key)

In [7]:
# Fetch data from the "transactions" table
response = supabase.table("transactions").select("*").execute()
data = response.data
print(data)

[{'transaction_id': '3683c4fc-018d-42ee-803b-69845dd1f0cd', 'user_id': '027051a8-3887-4150-9cfb-a51efb9146b5', 'income': 7000, 'expenses': 0, 'data': 'Monthly salary deposit', 'category_id': '8867173b-22a3-408e-a2b8-9ee2f0bc70b2', 'description': 'August salary', 'created_at': '2025-04-21T10:51:09.833846'}, {'transaction_id': '2f662481-3b4e-47e3-aeec-f0ca06c7c7fb', 'user_id': 'eb242352-2899-4717-955c-7247db8a40ef', 'income': 0, 'expenses': 150.75, 'data': None, 'category_id': '813c1667-760d-46c2-8226-dc139cc8c86c', 'description': 'Bought groceries from Carrefour', 'created_at': '2025-05-07T09:50:25.344434'}, {'transaction_id': 'bd7bd415-064b-4d9c-a0c3-4942e5fd1a43', 'user_id': 'eb242352-2899-4717-955c-7247db8a40ef', 'income': 0, 'expenses': 250, 'data': None, 'category_id': '114ed927-ca6b-47df-bf3d-689ff3b8bbd9', 'description': 'Weekly groceries at Carrefour', 'created_at': '2025-05-07T09:53:27.097738'}, {'transaction_id': 'f7441b54-0fe6-4772-ad92-dcf1c64c44cd', 'user_id': 'eb242352-289
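
Each row in the output above carries `user_id`, `income`, and `expenses` fields, which is enough to total a user's activity. The following is a rough sketch on hand-written sample rows shaped like that output; the function name `totals_by_user` and the short user ids are hypothetical.

```python
from collections import defaultdict

def totals_by_user(rows):
    """Sum income and expenses per user_id from transaction-like dicts."""
    totals = defaultdict(lambda: {"income": 0.0, "expenses": 0.0})
    for row in rows:
        # Treat a missing or None amount as zero.
        totals[row["user_id"]]["income"] += float(row.get("income") or 0)
        totals[row["user_id"]]["expenses"] += float(row.get("expenses") or 0)
    return dict(totals)

# Sample rows with made-up user ids; amounts mirror the printed data.
sample_rows = [
    {"user_id": "user-a", "income": 7000, "expenses": 0},
    {"user_id": "user-b", "income": 0, "expenses": 150.75},
    {"user_id": "user-b", "income": 0, "expenses": 250},
]
summary = totals_by_user(sample_rows)
print(summary)
```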

In [8]:
# Fetch data from the "categories" table
response = supabase.table("categories").select("*").execute()
data = response.data
print(data)

[{'category_id': '337bf3ed-8f2e-4647-ac50-30f327316d07', 'category_type': 'Food', 'category_name': 'Caffe', 'user_id': None}, {'category_id': 'c45b8dff-e08f-4c36-b30c-68ae7026b2f8', 'category_type': 'Transportation', 'category_name': 'taxi', 'user_id': None}, {'category_id': '3782b789-59fa-45b1-a214-c29e21a2f0e3', 'category_type': 'transportation', 'category_name': 'fuel', 'user_id': None}, {'category_id': '8867173b-22a3-408e-a2b8-9ee2f0bc70b2', 'category_type': 'Expense', 'category_name': 'Groceries', 'user_id': '027051a8-3887-4150-9cfb-a51efb9146b5'}, {'category_id': 'a8a017fc-e86f-4c23-9455-4f8db846c59b', 'category_type': 'Income', 'category_name': 'Salary', 'user_id': '47b66d92-1ea1-4344-af56-0771ed917456'}, {'category_id': 'af91ae83-0d65-4bd1-880f-747f3ebbaf35', 'category_type': 'Expense', 'category_name': 'Utilities', 'user_id': '99eed9c3-78dc-4925-b599-b02332524f78'}, {'category_id': '813c1667-760d-46c2-8226-dc139cc8c86c', 'category_type': 'expense', 'category_name': 'Groceries'

## 6- Fetch user categories

In [16]:
def fetch_user_categories(user_id, supabase, type_group="expense"):
    cat_type = 'Expense' if type_group.lower() == "expense" else 'Income'
    response = (
        supabase
        .table("categories")
        .select("*")
        .eq("user_id", user_id)
        # ilike matches case-insensitively, since the table stores both 'Expense' and 'expense'
        .ilike("category_type", cat_type)
        .execute()
    )
    cat_data = response.data
    categories = [c["category_name"] for c in cat_data]
    return categories

In [17]:
# Global cache for user categories
user_categories_cache = {}

# Caching function
# Whenever you want to (re)fetch and store categories for a user (call at login/session start, or after user edits their categories):
def update_user_categories_cache(user_id, supabase):
    expense_categories = fetch_user_categories(user_id, supabase, type_group="expense")
    income_categories = fetch_user_categories(user_id, supabase, type_group="income")
    user_categories_cache[user_id] = {
        "expense": expense_categories,
        "income": income_categories
    }

# Cache-aware fetcher
# Use cached categories if present, or update (and cache) if not:
def get_cached_user_categories(user_id, supabase):
    if user_id not in user_categories_cache:
        update_user_categories_cache(user_id, supabase)
    return user_categories_cache[user_id]
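
The cache above is filled on first access and is only refreshed when `update_user_categories_cache` is called again. The sketch below illustrates the same idea with an explicit invalidation step, using a stand-in fetch function instead of Supabase; `CategoryCache` and `fake_fetch` are hypothetical names introduced only for this illustration.

```python
class CategoryCache:
    """Per-user cache that fetches on first access and supports invalidation."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable: user_id -> {"expense": [...], "income": [...]}
        self._store = {}

    def get(self, user_id):
        # Fetch only when the user has no cached entry yet.
        if user_id not in self._store:
            self._store[user_id] = self._fetch(user_id)
        return self._store[user_id]

    def invalidate(self, user_id):
        # Call after the user edits their categories so the next get() refetches.
        self._store.pop(user_id, None)

calls = []

def fake_fetch(user_id):
    # Stand-in for the database query; records each call.
    calls.append(user_id)
    return {"expense": ["Groceries"], "income": ["Salary"]}

cache = CategoryCache(fake_fetch)
cache.get("u1")
cache.get("u1")         # served from the cache, no second fetch
cache.invalidate("u1")
cache.get("u1")         # fetched again after invalidation
print(len(calls))
```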

## 7- Prompt

In [33]:
def process_transaction_with_llm(text, user_id, supabase):
    # Get cached categories
    categories = get_cached_user_categories(user_id, supabase)
    expense_categories = categories['expense']
    income_categories = categories['income']
    expense_list = ", ".join(expense_categories)
    income_list = ", ".join(income_categories)
    
    # Prompt for extracting transaction details
    transaction_prompt = f"""
    Extract transaction details from the following text and classify the amount as either income or expense.
    If it's an expense, put amount in EXPENSE; if it’s income, put amount in INCOME. Extract date (handle relative dates like 'yesterday'), a description, and the category.
    If a specific date is not mentioned, assume today's date. If you can't classify the amount as either income or expense,
    assume it is expense and keep it uncategorized. Provide a natural response about the spending.

    Text: {{input_text}}

    Current date: {{current_date}}

    Format your response as follows:
    DATE: [Extract the date, handle relative dates like 'this morning', 'yesterday', 'last week', etc.]
    EXPENSE: [amount, if spending; leave blank if not]
    INCOME: [amount, if income; leave blank if not]
    CATEGORY: [Classify into exactly one from this list if expense: {expense_list}
               Classify into exactly one from this list if income: {income_list}
               Do not invent new categories! If none fit, choose the nearest]
    DESCRIPTION: [one- or two-word description of the transaction]
    FEEDBACK: [Provide a natural, conversational response about the spending. Consider:
    - If it's a good deal or expensive for that category
    - Suggest money-saving tips if relevant
    - Compliment good financial decisions
    - Express concern for unusually high spending
    - Comment on the timing or necessity of the purchase
    Make it sound natural and varied.]

    Example:
    Input: "I got today 2,000 EGP from upwork"
    DATE: 2024-01-30
    INCOME: 2000
    CATEGORY: Part-time
    DESCRIPTION: upwork
    FEEDBACK: That’s wonderful—congrats on your part-time job!
    
    Input: "Bought lunch for 50 EGP yesterday"
    DATE: 2024-01-29
    EXPENSE: 50
    CATEGORY: Food
    DESCRIPTION: lunch
    FEEDBACK: That's a reasonable amount for lunch! If you're looking to save more, you might consider bringing lunch from home occasionally.
    """

    # Prepare the prompt with current date
    current_date = datetime.now().strftime('%Y-%m-%d')
    formatted_prompt = transaction_prompt.format(
        input_text=text,
        current_date=current_date
    )

    # Get response from LLM
    response = llm.invoke(formatted_prompt)
    
    # Return the raw LLM text; it is parsed into fields in a later step
    return response
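
The prompt is built in two stages: the f-string fills in the category lists immediately, while the doubled braces around `input_text` and `current_date` survive as single-brace placeholders for the later `.format()` call. A minimal sketch of that mechanism on a shortened template follows; `build_prompt` and `escape_braces` are hypothetical helpers, and the escaping step is an added precaution for category names that happen to contain braces.

```python
def escape_braces(value: str) -> str:
    """Double any braces so str.format treats them as literal characters."""
    return value.replace("{", "{{").replace("}", "}}")

def build_prompt(text: str, current_date: str, expense_categories: list) -> str:
    # Stage 1: the f-string inserts the category list and turns
    # {{input_text}} into the single-brace placeholder {input_text}.
    expense_list = ", ".join(escape_braces(c) for c in expense_categories)
    template = f"""Text: {{input_text}}
Current date: {{current_date}}
CATEGORY: one of {expense_list}"""
    # Stage 2: str.format fills the placeholders that survived stage 1.
    return template.format(input_text=text, current_date=current_date)

print(build_prompt("Milk 50", "2025-05-10", ["Groceries", "Kids"]))
```

Values passed as arguments to `.format()` are inserted literally, so braces in the user's text are not a problem; only text that is already part of the template needs escaping.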



This user has no food category, so the test succeeds if the dinner is categorized as Groceries.

In [27]:
categories = get_cached_user_categories(user_id="027051a8-3887-4150-9cfb-a51efb9146b5", supabase=supabase)
print(categories)

{'expense': ['Groceries', 'Kids', 'Pets'], 'income': ['Salary', 'Freelance']}


In [34]:
print(process_transaction_with_llm("Had dinner for 150 EGP last night", user_id="027051a8-3887-4150-9cfb-a51efb9146b5", supabase=supabase))





    DATE: 2025-05-09
    EXPENSE: 150
    CATEGORY: Groceries
    DESCRIPTION: dinner
    FEEDBACK: Dinner for 150 EGP seems pretty standard—just make sure it fit within your budget! If dining out frequently, you might want to balance it with home-cooked meals to save some extra cash.


In [25]:
print(process_transaction_with_llm("Milk 50", user_id="027051a8-3887-4150-9cfb-a51efb9146b5", supabase=supabase))





    Input: "Milk 50"
    DATE: 2025-05-10
    EXPENSE: 50
    CATEGORY: Groceries
    DESCRIPTION: Milk
    FEEDBACK: That’s a reasonable price for milk. It’s a good idea to keep an eye out for discounts or bulk offers to save even more.


In [29]:
print(process_transaction_with_llm("work on tasks got 500", user_id="027051a8-3887-4150-9cfb-a51efb9146b5", supabase=supabase))




    DATE: 2025-05-10
    INCOME: 500
    CATEGORY: Freelance
    DESCRIPTION: tasks
    FEEDBACK: Great job earning from your freelance work! It’s always rewarding to see your efforts pay off. Keep it up!


In [35]:
print(process_transaction_with_llm("got 500", user_id="027051a8-3887-4150-9cfb-a51efb9146b5", supabase=supabase))




    DATE: 2025-05-10
    INCOME: 500
    CATEGORY: Uncategorized
    DESCRIPTION: got
    FEEDBACK: That's great to receive 500! If this is from a gig or side hustle, consider saving a portion of it for future investments.


In [36]:
print(process_transaction_with_llm("SON 200", user_id="027051a8-3887-4150-9cfb-a51efb9146b5", supabase=supabase))

 

    DATE: 2025-05-10
    EXPENSE: 200
    INCOME: 
    CATEGORY: Kids
    DESCRIPTION: SON
    FEEDBACK: It seems like you’ve spent on something for your son. Kids’ expenses can add up quickly, so always look for opportunities to save—like reusing items or finding discounts when possible!


In [37]:
print(process_transaction_with_llm("catii eat 200", user_id="027051a8-3887-4150-9cfb-a51efb9146b5", supabase=supabase))

	
    DATE: 2025-05-10
    EXPENSE: 200
    CATEGORY: Pets
    DESCRIPTION: catii
    FEEDBACK: Cat food or supplies can add up, so 200 seems reasonable for your pet’s needs. Just make sure to budget for these recurring expenses!


In [38]:
print(process_transaction_with_llm("little shirt 200", user_id="027051a8-3887-4150-9cfb-a51efb9146b5", supabase=supabase))




    DATE: 2025-05-10
    EXPENSE: 200
    CATEGORY: Kids
    DESCRIPTION: shirt
    FEEDBACK: 200 for a shirt seems a bit steep, especially if it’s for kids who outgrow clothes quickly. Maybe look for sales or second-hand options next time!


## 8- Parse response

In [39]:
import re
from datetime import datetime
from dateparser import parse as date_parse

def parse_llm_response(response):
    def field(label):
        # Read the value on the label's own line. [ \t]* cannot cross a newline
        # (unlike \s*), so a blank field never swallows the following line.
        pattern = rf"^[ \t]*{re.escape(label)}:[ \t]*(.*)$"
        match = re.search(pattern, response, re.IGNORECASE | re.MULTILINE)
        value = match.group(1).strip() if match else ""
        return value or None

    def amount(label):
        # Accept thousands separators such as "2,000"; return 0.0 when blank
        value = field(label) or ""
        match = re.match(r"\d[\d,]*(?:\.\d+)?", value)
        return float(match.group(0).replace(",", "")) if match else 0.0

    # Extract date (default to today if not found, "Unknown", or unparseable)
    date_str = field("DATE")
    date_obj = None
    if date_str and date_str.lower() != "unknown":
        date_obj = date_parse(date_str)
    date_iso = (date_obj or datetime.now()).strftime("%Y-%m-%d")

    # Extract EXPENSE and INCOME; a combined EXPENSE/INCOME label defaults to expense
    expense = amount("EXPENSE") or amount("EXPENSE/INCOME")
    income = 0.0 if expense else amount("INCOME")

    # Extract DESCRIPTION, CATEGORY (optional), and FEEDBACK (optional)
    description = field("DESCRIPTION")
    category = field("CATEGORY")
    feedback = field("FEEDBACK") or ""

    return {
        "date": date_iso,
        "expense": expense,
        "income": income,
        "category": category,
        "description": description,
        "feedback": feedback
    }
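
The whitespace pattern that follows a label matters when the model leaves a field blank. The short, self-contained sketch below uses a made-up response fragment to show that `\s*` can run past the end of the line and capture the next label, while `[ \t]*` stays on the same line.

```python
import re

# A response fragment in which CATEGORY was left blank.
sample = "EXPENSE: \nINCOME: \nCATEGORY: \nDESCRIPTION: Dinner\n"

# \s* also matches the newline, so the capture group lands on the next line.
loose = re.search(r"CATEGORY:\s*(.*)", sample).group(1)

# [ \t]* matches only spaces and tabs, so a blank field stays blank.
strict = re.search(r"CATEGORY:[ \t]*(.*)", sample).group(1)

print(repr(loose))
print(repr(strict))
```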



## 9- Get category_id

In [40]:
def get_category_id(category_name: str, supabase, user_id=None):
    # Returns None if category_name is None or not found
    if not category_name:
        return None
    query = supabase.table("categories").select("category_id").eq("category_name", category_name)
    # Category names repeat across users, so restrict the lookup to this user's rows when a user_id is given
    if user_id is not None:
        query = query.eq("user_id", user_id)
    cat_data = query.execute().data
    if cat_data:
        return cat_data[0]["category_id"]
    else:
        return None


## 10- Add transaction into DB

In [41]:
def add_transaction_from_llm(llm_response, user_id, supabase):
    extracted = parse_llm_response(llm_response)
    description_valid = extracted["description"] not in (None, "", "No Description")
    amount_valid = (extracted["expense"] > 0 or extracted["income"] > 0)

    if not description_valid or not amount_valid:
        return "❌ Cannot add transaction: missing information."

    category_id = get_category_id(extracted["category"], supabase, user_id=user_id)
    new_transaction = {
        "income": extracted["income"],
        "expenses": extracted["expense"],
        "description": extracted["description"],
        "created_at": extracted["date"],
        "user_id": user_id,
        "category_id": category_id
    }
    response = supabase.table("transactions").insert(new_transaction).execute()

    # Build friendly message for user
    # Compose a string stating what was added, and echo the LLM's feedback
    tx_type = "income" if extracted["income"] > 0 else "expense"
    tx_value = extracted["income"] if extracted["income"] > 0 else extracted["expense"]

    user_reply = (
        f"✅ Added {tx_type} transaction: {extracted['description']}, "
        f"{tx_value} EGP, "
        f"date: {extracted['date']}"
    )
    if extracted["category"]:
        user_reply += f", category: {extracted['category']}."
    else:
        user_reply += "."
    if extracted["feedback"]:
        user_reply += f"\n💬 {extracted['feedback']}"
    return user_reply
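
Saving a transaction depends on turning the category name into a `category_id`, and the `categories` table holds the same name (for example `Groceries`) for several users as well as shared rows whose `user_id` is `None`. The following sketch shows one possible resolution order on in-memory rows: the user's own category first, then a shared one. The function `pick_category_id`, the short ids, and the fallback to shared rows are assumptions for illustration, not the notebook's behavior.

```python
def pick_category_id(rows, category_name, user_id):
    """Return the id of the user's own matching category, else a shared one, else None."""
    wanted = category_name.strip().lower()
    matches = [r for r in rows if r["category_name"].strip().lower() == wanted]
    # Prefer a category that belongs to this user.
    for row in matches:
        if row["user_id"] == user_id:
            return row["category_id"]
    # Fall back to a shared category (user_id is None).
    for row in matches:
        if row["user_id"] is None:
            return row["category_id"]
    return None

# Made-up rows shaped like the categories table.
rows = [
    {"category_id": "c1", "category_name": "Groceries", "user_id": "user-b"},
    {"category_id": "c2", "category_name": "Groceries", "user_id": "user-a"},
    {"category_id": "c3", "category_name": "taxi", "user_id": None},
]
print(pick_category_id(rows, "groceries", "user-a"))
print(pick_category_id(rows, "Taxi", "user-a"))
```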

## 11- Test

In [42]:
user_input = "Received salary 8000 EGP today"
llm_response = process_transaction_with_llm(user_input, user_id="027051a8-3887-4150-9cfb-a51efb9146b5", supabase=supabase) 
print(llm_response)
result = add_transaction_from_llm(llm_response, user_id="027051a8-3887-4150-9cfb-a51efb9146b5", supabase=supabase)
print(result)




    DATE: 2025-05-10
    INCOME: 8000
    CATEGORY: Salary
    DESCRIPTION: salary
    FEEDBACK: Congratulations on receiving your salary! This is a great time to plan your budget and ensure you're saving or investing wisely. Keep up the good work and enjoy your earnings.




✅ Added income transaction: salary, 8000.0 EGP, date: 2025-05-10, category: Salary.
💬 Congratulations on receiving your salary! This is a great time to plan your budget and ensure you're saving or investing wisely. Keep up the good work and enjoy your earnings.


In [43]:
user_input = "Had dinner for 150 EGP last night"
llm_response = process_transaction_with_llm(user_input, user_id="027051a8-3887-4150-9cfb-a51efb9146b5", supabase=supabase)
print(llm_response)
# Parse and save
add_transaction_from_llm(llm_response, user_id="027051a8-3887-4150-9cfb-a51efb9146b5", supabase=supabase)




    DATE: 2025-05-09
    EXPENSE: 150
    CATEGORY: [uncategorized]
    DESCRIPTION: dinner
    FEEDBACK: Dinner for 150 EGP seems a bit on the higher side. Next time, you might want to explore more budget-friendly options or consider cooking at home to save some extra cash.




'✅ Added expense transaction: dinner, 150.0 EGP, date: 2025-05-09, category: [uncategorized].\n💬 Dinner for 150 EGP seems a bit on the higher side. Next time, you might want to explore more budget-friendly options or consider cooking at home to save some extra cash.'

In [44]:
user_input = "got 150 EGP"
llm_response = process_transaction_with_llm(user_input, user_id="027051a8-3887-4150-9cfb-a51efb9146b5", supabase=supabase)
print(llm_response)
# Parse and save
result = add_transaction_from_llm(llm_response, user_id="027051a8-3887-4150-9cfb-a51efb9146b5", supabase=supabase)
print(result)




    DATE: 2025-05-10
    INCOME: 150
    CATEGORY: Freelance
    DESCRIPTION: Got
    FEEDBACK: That's great to see some extra income coming in! Keep up the good work on your freelance gigs.




✅ Added income transaction: Got, 150.0 EGP, date: 2025-05-10, category: Freelance.
💬 That's great to see some extra income coming in! Keep up the good work on your freelance gigs.


In [45]:
user_input = "Dinner today"
llm_response = process_transaction_with_llm(user_input, user_id="027051a8-3887-4150-9cfb-a51efb9146b5", supabase=supabase)
print(llm_response)
# Parse and save
result = add_transaction_from_llm(llm_response, user_id="027051a8-3887-4150-9cfb-a51efb9146b5", supabase=supabase)
print(result)




    DATE: 2025-05-10
    EXPENSE: 
    INCOME: 
    CATEGORY: 
    DESCRIPTION: Dinner
    FEEDBACK: It looks like you had dinner today. If you'd like to keep track of your spending, it would be helpful to note the amount for better budgeting!
❌ Cannot add transaction: missing information.




In [46]:
user_input = "50 today"
llm_response = process_transaction_with_llm(user_input, user_id="027051a8-3887-4150-9cfb-a51efb9146b5", supabase=supabase)
print(llm_response)
# Parse and save
result = add_transaction_from_llm(llm_response, user_id="027051a8-3887-4150-9cfb-a51efb9146b5", supabase=supabase)
print(result)

 

    DATE: 2025-05-10
    EXPENSE: 50
    INCOME: 
    CATEGORY: Uncategorized
    DESCRIPTION: Transaction
    FEEDBACK: You spent 50 EGP today. Without more details, it's hard to categorize, but it's always good to keep track of your spending!




✅ Added expense transaction: Transaction, 50.0 EGP, date: 2025-05-10, category: Uncategorized.
💬 You spent 50 EGP today. Without more details, it's hard to categorize, but it's always good to keep track of your spending!


In [47]:
user_input = "Dibers 300"
llm_response = process_transaction_with_llm(user_input, user_id="027051a8-3887-4150-9cfb-a51efb9146b5", supabase=supabase)
print(llm_response)
# Parse and save
result = add_transaction_from_llm(llm_response, user_id="027051a8-3887-4150-9cfb-a51efb9146b5", supabase=supabase)
print(result)




DATE: 2025-05-10
EXPENSE: 300
CATEGORY: Groceries
DESCRIPTION: Dibers
FEEDBACK: Hmm, Dibers doesn’t seem like a typical grocery item. If it’s a special purchase, make sure it’s something you really need. Otherwise, consider more budget-friendly options next time!




✅ Added expense transaction: Dibers, 300.0 EGP, date: 2025-05-10, category: Groceries.
💬 Hmm, Dibers doesn’t seem like a typical grocery item. If it’s a special purchase, make sure it’s something you really need. Otherwise, consider more budget-friendly options next time!


In [48]:
user_input = "Diapers 300"
llm_response = process_transaction_with_llm(user_input, user_id="027051a8-3887-4150-9cfb-a51efb9146b5", supabase=supabase)
print(llm_response)
# Parse and save
result = add_transaction_from_llm(llm_response, user_id="027051a8-3887-4150-9cfb-a51efb9146b5", supabase=supabase)
print(result)




    DATE: 2025-05-10
    EXPENSE: 300
    CATEGORY: Kids
    DESCRIPTION: Diapers
    FEEDBACK: Diapers are a necessity for kids, so this expense is understandable. However, it might be worth checking if there are any bulk purchase options or discounts available to save on these recurring costs.




✅ Added expense transaction: Diapers, 300.0 EGP, date: 2025-05-10, category: Kids.
💬 Diapers are a necessity for kids, so this expense is understandable. However, it might be worth checking if there are any bulk purchase options or discounts available to save on these recurring costs.
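
The test runs above show that the model sometimes answers with a label outside the user's list, such as `Uncategorized` or `[uncategorized]`. One possible guard, sketched under the assumption that unknown labels should be stored without a category, is to normalize the label against the allowed list before saving; `normalize_category` is a hypothetical helper.

```python
def normalize_category(raw, allowed):
    """Map a model-provided label to the matching allowed name, or None if it is not allowed."""
    if not raw:
        return None
    # Drop surrounding brackets and ignore case when comparing.
    cleaned = raw.strip().strip("[]").strip().lower()
    for name in allowed:
        if name.lower() == cleaned:
            return name
    return None

allowed = ["Groceries", "Kids", "Pets"]
print(normalize_category("groceries", allowed))
print(normalize_category("[uncategorized]", allowed))
```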
