# 🧠 Personalized Memory Engine: Final Version
This notebook contains the final, consolidated code for the RAG-based personalized memory engine. It demonstrates the complete backend logic, including memory storage, hybrid-score retrieval, conversational context, and memory maintenance. The final cell provides a command-line interface to interact with the engine.

## 📦 Step 1: Installation and Imports

In [3]:
!pip install chromadb sentence-transformers numpy ollama

import uuid
import time
import datetime as dt
import numpy as np
import ollama
import subprocess
from sentence_transformers import SentenceTransformer
import chromadb
from chromadb.config import Settings
from typing import List, Dict, Any, Tuple



## ⚙️ Step 2: Initialize Core Components
We load the `all-MiniLM-L6-v2` SentenceTransformer model (384-dimensional embeddings) and open a ChromaDB client that persists the `memory_store` collection under `./chroma_store`.

In [5]:
# --- 1. INITIALIZE CORE COMPONENTS ---
model = SentenceTransformer("all-MiniLM-L6-v2")
client = chromadb.Client(Settings(persist_directory="./chroma_store", is_persistent=True))
collection = client.get_or_create_collection(name="memory_store")

print("✅ Core components initialized.")

✅ Core components initialized.


## 📝 Step 3: Define All Helper Functions
This section contains all the necessary helper functions for storing, pruning, ranking, and retrieving memories, as well as formatting prompts and querying the local LLM.
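As a rough illustration of the pruning rule used by these helpers, the cutoff arithmetic can be sketched without a database. The function `split_by_age` and the in-memory list of dicts are hypothetical stand-ins for the ChromaDB collection; only the "older than `days_to_keep` days" comparison mirrors the notebook.

```python
import time

SECONDS_PER_DAY = 24 * 60 * 60

def split_by_age(memories, days_to_keep, now=None):
    """Split memories into (kept, expired) using an epoch-seconds 'timestamp' key.

    Hypothetical helper: the notebook applies the same cutoff through a
    ChromaDB `where` filter instead of a Python list.
    """
    now = time.time() if now is None else now
    cutoff = now - days_to_keep * SECONDS_PER_DAY
    kept = [m for m in memories if m["timestamp"] >= cutoff]
    expired = [m for m in memories if m["timestamp"] < cutoff]
    return kept, expired

# Fixed reference time so the example is reproducible.
now = 1_700_000_000.0
memories = [
    {"id": "a", "timestamp": now - 10 * SECONDS_PER_DAY},   # 10 days old
    {"id": "b", "timestamp": now - 120 * SECONDS_PER_DAY},  # 120 days old
]
kept, expired = split_by_age(memories, days_to_keep=90, now=now)
print([m["id"] for m in kept], [m["id"] for m in expired])
```

With a 90-day window, the 10-day-old entry is kept and the 120-day-old entry is marked for deletion.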

In [7]:
# --- 2. HELPER FUNCTIONS ---

def now_ts() -> float:
    return time.time()

def store_memory(entry_text: str, user_id: str, topic: str = "general"):
    emb = model.encode(entry_text).tolist()
    doc_id = str(uuid.uuid4())
    metadata = {"user_id": user_id.lower(), "timestamp": now_ts(), "topic": topic}
    collection.add(documents=[entry_text], embeddings=[emb], metadatas=[metadata], ids=[doc_id])
    print(f"Stored for {user_id.lower()}: {entry_text[:60]}...")

def prune_old_memories(user_id: str, days_to_keep: int = 90):
    # Ignore invalid retention windows rather than risk deleting the wrong memories.
    if not isinstance(days_to_keep, int) or days_to_keep < 0:
        return
    cutoff_timestamp = time.time() - (days_to_keep * 24 * 60 * 60)
    old_memories = collection.get(
        where={"$and": [{"user_id": {"$eq": user_id.lower()}}, {"timestamp": {"$lt": cutoff_timestamp}}]}
    )
    ids_to_delete = old_memories.get("ids")
    if ids_to_delete:
        print(f"🧹 Pruning {len(ids_to_delete)} old memories for user '{user_id.lower()}'...")
        collection.delete(ids=ids_to_delete)

def _cosine(a: np.ndarray, b: np.ndarray) -> float:
    a, b = np.asarray(a), np.asarray(b)
    num = np.dot(a, b)
    den = (np.linalg.norm(a) * np.linalg.norm(b)) + 1e-12
    return float(num / den)

def rerank_results(query_emb: List[float], docs: List[str], metas: List[Dict[str, Any]], embs: List[List[float]]) -> List[Tuple[str, Dict, float]]:
    now = now_ts()
    ranked = []
    alpha = 0.7
    decay_days = 14.0
    for d, m, e in zip(docs, metas, embs):
        sim = _cosine(np.asarray(query_emb), np.asarray(e))
        age_days = max(0.0, (now - float(m.get("timestamp", now))) / (60 * 60 * 24))
        rec = np.exp(-age_days / decay_days)
        score = alpha * sim + (1 - alpha) * rec
        ranked.append((d, m, score))
    ranked.sort(key=lambda x: x[2], reverse=True)
    return ranked

def format_chat_history(chat_history: List[Dict]) -> str:
    if not chat_history: return ""
    return "\n".join([f"{msg['role'].title()}: {msg['content']}" for msg in chat_history[-4:]])

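def query_local_llm(prompt: str, model_name: str = "mistral") -> str:
    # Prefer an `ollama` executable on PATH; fall back to the original Windows install path.
    import shutil
    ollama_path = shutil.which("ollama") or r"C:\Users\crazz\AppData\Local\Programs\Ollama\ollama.exe"
    try:
        result = subprocess.run([ollama_path, "run", model_name], input=prompt.encode("utf-8"), stdout=subprocess.PIPE, stderr=subprocess.PIPE, check=True)
        return result.stdout.decode("utf-8").strip()
    except subprocess.CalledProcessError as e:
        # Non-zero exit: surface what Ollama wrote to stderr.
        return f"⚠️ Ollama exited with an error: {e.stderr.decode('utf-8', errors='replace').strip()}"
    except Exception as e:
        return f"⚠️ An unexpected error occurred: {e}"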

print("✅ Helper functions defined.")

✅ Helper functions defined.
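To make the ranking formula in `rerank_results` concrete, here is a small worked example of the hybrid score, `alpha * similarity + (1 - alpha) * exp(-age_days / decay_days)`, using the same `alpha = 0.7` and `decay_days = 14.0`. The function name `hybrid_score` and the similarity values are made up for illustration.

```python
import math

def hybrid_score(similarity, age_days, alpha=0.7, decay_days=14.0):
    """Blend cosine similarity with an exponential recency bonus.

    Illustrative restatement of the scoring line in rerank_results;
    the inputs below are invented, not measured.
    """
    recency = math.exp(-max(0.0, age_days) / decay_days)
    return alpha * similarity + (1 - alpha) * recency

# A brand-new memory with a moderate match ...
fresh_but_weaker = hybrid_score(similarity=0.55, age_days=0)
# ... versus a two-month-old memory with a stronger match.
old_but_stronger = hybrid_score(similarity=0.80, age_days=60)
# Recency alone: a just-stored memory with zero similarity.
recency_floor = hybrid_score(similarity=0.0, age_days=0)

print(f"{fresh_but_weaker:.3f} {old_but_stronger:.3f} {recency_floor:.3f}")
```

Under these settings the fresh memory outranks the older, more similar one, and any just-stored memory scores at least `1 - alpha = 0.3` regardless of similarity. That floor is worth keeping in mind when reading the scores printed by the CLI, since freshly seeded but unrelated memories can still appear in the top 3.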


## 🧠 Step 4: The Chatbot Brain (`rag_chatbot`)
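This is the main orchestrator function. It classifies each input with a simple rule: the input counts as a question if it starts with a question word (such as "who", "what", or "how") or ends with "?". Questions go through the full RAG pipeline: the query is embedded, up to 10 candidate memories are fetched for the user, and the top 3 after re-ranking are placed in the prompt. Anything else gets a conversational reply that uses only the short-term chat history (the last four messages).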

In [9]:
def rag_chatbot(user_query: str, user_id: str, chat_history: List[Dict]) -> Tuple[str, list]:
    history_str = format_chat_history(chat_history)
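    user_query_lower = user_query.strip().lower()
    question_words = ["who", "what", "where", "when", "why", "how", "did", "do", "am i", "what's"]
    # Match whole leading words only, so inputs like "dog ..." or "whole ..." are not treated as questions.
    is_question = user_query_lower.endswith("?") or any(
        user_query_lower.startswith(word) and not user_query_lower[len(word):len(word) + 1].isalnum()
        for word in question_words
    )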

    if is_question:
        print("DEBUG: Intent is [question] by rule.")
        q_emb = model.encode(user_query).tolist()
        res = collection.query(query_embeddings=[q_emb], n_results=10, where={"user_id": user_id.lower()}, include=["documents", "metadatas", "embeddings"])
        docs, metas, embs = res.get("documents", [[]])[0], res.get("metadatas", [[]])[0], res.get("embeddings", [[]])[0]
        
        ranked_memories = []
        if docs:
            ranked_memories = rerank_results(q_emb, docs, metas, embs)[:3]
        
        context = "\n".join([f"- {doc}" for doc, meta, score in ranked_memories]) if ranked_memories else "No relevant long-term memories found."
        prompt = f"""You are an AI assistant. Given the conversation history and the user's long-term memories, answer the user's question.
        --- CONVERSATION HISTORY ---\n{history_str}\n--- END HISTORY ---
        --- LONG-TERM MEMORIES ---\n{context}\n--- END MEMORIES ---
        User's question: {user_query}\nAnswer:"""
        response = query_local_llm(prompt)
        return response, ranked_memories

    else:
        print("DEBUG: Intent is [chat].")
        prompt = f"""You are a helpful AI assistant. Continue the conversation naturally.
        --- CONVERSATION HISTORY ---\n{history_str}\n--- END HISTORY ---
        User: {user_query}\nAssistant:"""
        response = query_local_llm(prompt)
        return response, []

print("✅ Main chatbot logic defined.")

✅ Main chatbot logic defined.
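The rule-based intent check can be exercised on its own. The sketch below is one possible formulation, assuming a regular expression with a word boundary so that only whole leading words count; `looks_like_question` is a hypothetical name, and the word list is copied from `rag_chatbot`.

```python
import re

QUESTION_WORDS = ("who", "what", "where", "when", "why", "how", "did", "do", "am i", "what's")

# Anchor at the start and require a word boundary after the question word.
_QUESTION_PREFIX = re.compile(r"^(?:%s)\b" % "|".join(re.escape(w) for w in QUESTION_WORDS))

def looks_like_question(text):
    """Return True if the text ends with '?' or starts with a whole question word."""
    cleaned = text.strip().lower()
    return cleaned.endswith("?") or bool(_QUESTION_PREFIX.match(cleaned))

samples = [
    "Who is my favourite cricketer?",
    "Do you remember my Jaipur trip",
    "Dog photos are great",
    "However, I disagree",
    "Is it raining?  ",
]
for s in samples:
    print(f"{s!r:40} -> {looks_like_question(s)}")
```

A plain prefix test would flag "Dog photos are great" (it starts with "do") and "However, I disagree" (it starts with "how"); the word boundary avoids both, and stripping whitespace lets a trailing "?" followed by spaces still count.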


## ⚙️ Step 5: Database Seeding
This function populates the database with sample data for a user if their memory store is empty.

In [11]:
def populate_db_if_empty(user_id: str):
    if len(collection.get(where={"user_id": user_id.lower()})['ids']) == 0:
        print(f"No memories found for {user_id.lower()}. Seeding database...")
        seed_entries = [
            ("Booked train tickets to Jaipur for next weekend.", "travel"),
            ("I had dinner with Rahul and we discussed career plans.", "social"),
            ("Watched F1 qualifying highlights on YouTube.", "media")
        ]
        for txt, topic in seed_entries:
            store_memory(txt, user_id=user_id, topic=topic)

print("✅ Database seeding function defined.")

✅ Database seeding function defined.


## 🚀 Step 6: Final Run - Interactive Command-Line Chat
This final cell runs the complete application in the command line. The user can store new memories explicitly, and the chatbot will use both long-term and short-term conversational memory to respond.

**Note:** To save a memory, you must type `/save` followed by your statement. All other inputs are treated as questions or chat.
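The input handling described in the note can be sketched as a small pure function, which makes the three cases (exit, save, chat) easy to check in isolation. `parse_input` is a hypothetical helper, not part of the notebook, and it is slightly stricter than the loop in the cell: it trims whitespace and treats a bare `/save` with no text as ordinary chat.

```python
def parse_input(line):
    """Classify one line of CLI input as ('exit' | 'save' | 'chat', payload).

    Hypothetical helper for illustration; the notebook's loop performs
    these checks inline.
    """
    text = line.strip()
    if text.lower() == "exit":
        return "exit", ""
    # Split off the first token so "/saved ..." is not mistaken for "/save".
    head, _, payload = text.partition(" ")
    if head.lower() == "/save" and payload.strip():
        return "save", payload.strip()
    return "chat", text

for line in ["/save My fav cricketer is Virat Kohli.", "Who is my favourite cricketer?", "/saved some money", " exit "]:
    print(parse_input(line))
```

Returning a `(kind, payload)` pair keeps the parsing separate from the side effects (storing a memory or calling the LLM), so the main loop only has to dispatch on `kind`.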

In [13]:
chat_history = []

print("--- Personalized Memory Engine (CLI) ---")
user_id = input("Please enter your username to begin: ").lower()

prune_old_memories(user_id)
populate_db_if_empty(user_id)

print(f"\nHello, {user_id}! Type '/save [your memory]' to save something, or just chat. Type 'exit' to end.")
print("-" * 40)

while True:
    prompt = input(f"[{user_id}]> ")
    if prompt.lower() == 'exit':
        print("Goodbye!")
        break
    
    if prompt.lower().startswith("/save "):
        memory_to_save = prompt[6:]
        store_memory(memory_to_save, user_id)
        response = "Okay, I'll remember that."
        print(f"[Assistant]> {response}")
    else:
        # Pass only the prior turns; rag_chatbot adds the current message to the prompt itself.
        response, memories = rag_chatbot(prompt, user_id, chat_history)
        if memories:
            print("\n--- Retrieved Memories ---")
            for doc, meta, score in memories:
                print(f"  - (Score: {score:.2f}) {doc}")
            print("------------------------")
        print(f"\n[Assistant]> {response}\n")
    
    # Record the turn after responding so the current message is not duplicated in the prompt.
    chat_history.append({"role": "user", "content": prompt})
    chat_history.append({"role": "assistant", "content": response})

--- Personalized Memory Engine (CLI) ---


Please enter your username to begin:  siddharth


No memories found for siddharth. Seeding database...
Stored for siddharth: Booked train tickets to Jaipur for next weekend....
Stored for siddharth: I had dinner with Rahul and we discussed career plans....
Stored for siddharth: Watched F1 qualifying highlights on YouTube....

Hello, siddharth! Type '/save [your memory]' to save something, or just chat. Type 'exit' to end.
----------------------------------------


[siddharth]>  /save My fav cricketer is Virat Kohli.


Stored for siddharth: My fav cricketer is Virat Kohli....
[Assistant]> Okay, I'll remember that.


[siddharth]>  Who is my favourite cricketer?


DEBUG: Intent is [question] by rule.

--- Retrieved Memories ---
  - (Score: 0.87) My fav cricketer is Virat Kohli.
  - (Score: 0.51) I had dinner with Rahul and we discussed career plans.
  - (Score: 0.40) Booked train tickets to Jaipur for next weekend.
------------------------

[Assistant]> Your favorite cricketer is Virat Kohli, as per our previous conversation.



[siddharth]>  exit


Goodbye!
