This project is a Discussion RAG system designed for coherent, context-aware, and stable long-term conversations. It features a unique 4-layer memory architecture to prevent common pitfalls like context drift and reasoning loops.
The system is designed not just to remember facts, but to maintain the foundational pillars of a structured discussion: premises, constraints, and established agreements.
The core of this project is a unique memory system that separates different kinds of information based on their role and lifespan in a conversation. This prevents the LLM from getting confused by transient thoughts or overriding foundational premises.
```
┌──────────────────────────────────────────────────────────────────┐
│ Layer 1: Ephemeral Session Context                               │
│   • Adapts to the user's current state (non-persistent)          │
├──────────────────────────────────────────────────────────────────┤
│ Layer 2: Explicit Long-Term Memory                               │
│   • Core premises, values, and constraints of the discussion     │
├──────────────────────────────────────────────────────────────────┤
│ Layer 3: Decision Digest                                         │
│   • Immutable record of confirmed agreements and choices         │
├──────────────────────────────────────────────────────────────────┤
│ Layer 4: Sliding Window Messages                                  │
│   • The "scratchpad" for recent turns, hypotheses, and reasoning │
└──────────────────────────────────────────────────────────────────┘
```
- Ephemeral Session Context: Captures the user's immediate state (e.g., fatigue level, discussion mode) to adapt the AI's response tone and style in real-time. This context is not saved.
- Explicit Long-Term Memory (LTM): Stores the foundational pillars of the discussion, such as core assumptions, constraints, and evaluation criteria. This memory is persistent and ensures the conversation remains consistent over long periods.
- Decision Digest: A persistent log of explicit agreements and decisions made during the conversation. This prevents re-litigating settled points.
- Sliding Window Messages: A standard conversational buffer that holds the most recent exchanges. This is where active reasoning and exploration happen.
This structured approach makes the assistant a more reliable partner for complex, long-running discussions. For a deeper dive into the philosophy, please read the Memory System Design Document.
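One way to picture the layers in code: the sketch below models each layer as a Pydantic model, in line with the project's Pydantic-based stack. The class and field names are illustrative assumptions, not the project's actual schema.

```python
# A sketch of the four layers as Pydantic models.
# Class and field names are illustrative assumptions, not the real schema.
from pydantic import BaseModel, Field


class SessionContext(BaseModel):
    """Layer 1: ephemeral, adapts tone and style, never persisted."""
    fatigue_level: str | None = None    # e.g. "fresh", "tired"
    discussion_mode: str | None = None  # e.g. "brainstorm", "decision"


class LongTermMemory(BaseModel):
    """Layer 2: persistent premises, constraints, and evaluation criteria."""
    items: list[str] = Field(default_factory=list)


class DecisionDigest(BaseModel):
    """Layer 3: append-only record of confirmed agreements."""
    decisions: list[str] = Field(default_factory=list)


class ConversationState(BaseModel):
    """All four layers gathered into one object the orchestrator can build prompts from."""
    session: SessionContext = Field(default_factory=SessionContext)
    ltm: LongTermMemory = Field(default_factory=LongTermMemory)
    digest: DecisionDigest = Field(default_factory=DecisionDigest)
    recent_messages: list[dict[str, str]] = Field(default_factory=list)  # Layer 4: sliding window
```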
- Advanced 4-Layer Memory: Provides a stable, long-term conversational foundation.
- LLM-Powered Memory Management: The LLM itself decides when and what to save to LTM or the Decision Digest, responding with a structured JSON object that contains both the conversational reply and any memory operations (see the sketch after this list).
- Separated Backend and Frontend: A robust FastAPI backend handles all logic, while a clean Streamlit frontend provides an interactive user experience.
- Real-time Memory Inspection: The UI allows the user to view the contents of the Long-Term Memory and Decision Digest at any time.
- Dependency Injection: The backend uses a modern dependency injection pattern for robustness and testability.
- Async API: The entire backend is built on an asynchronous framework (FastAPI) for high performance.
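As a rough illustration of the structured reply mentioned under LLM-Powered Memory Management, the response might be parsed into something like the following. The field names and `target` values are assumptions for illustration, not the project's actual response schema.

```python
# Hypothetical shape of the LLM's structured JSON reply: a conversational answer
# plus zero or more memory operations. Names are illustrative, not the real schema.
from pydantic import BaseModel, Field


class MemoryOperation(BaseModel):
    target: str   # assumed values: "ltm" or "decision_digest"
    content: str  # the premise, constraint, or agreement to persist


class AssistantTurn(BaseModel):
    reply: str  # the conversational answer shown to the user
    memory_operations: list[MemoryOperation] = Field(default_factory=list)


# Example payload the LLM might return:
raw = '''{
  "reply": "Agreed, we will treat the latency budget as fixed.",
  "memory_operations": [
    {"target": "decision_digest", "content": "Latency budget of 200 ms is a hard constraint."}
  ]
}'''
turn = AssistantTurn.model_validate_json(raw)  # Pydantic v2
```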
- Backend:
  - Framework: FastAPI
  - Language: Python 3.12+
  - Core Logic: LangChain, Pydantic
  - LLM Support: `langchain-openai`, `langchain-ollama`
- Frontend:
  - Framework: Streamlit
- Package Management & Venv: `uv`
- Code Quality:
  - Linter/Formatter: Ruff
  - Type Checking: MyPy
The application is split into two main components for a clean separation of concerns:
- Backend (FastAPI): A backend service that exposes a REST API for all core functionality. It encapsulates the 4-layer memory logic, LLM interactions, and data persistence. All business logic resides here.
- Frontend (Streamlit): A purely presentational layer that consumes the backend API. It is responsible for rendering the chat interface, capturing user input, and displaying the memory state.
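A minimal sketch of that frontend pattern, assuming the backend's default local address and a `/api/v1/chat` endpoint that accepts a `message` field and returns a `reply` field (the real request and response schema may differ):

```python
# Thin Streamlit client that only talks to the backend API.
# The "message"/"reply" field names are assumptions for illustration.
import requests
import streamlit as st

API_BASE = "http://localhost:8000/api/v1"

st.title("Discussion RAG")

if prompt := st.chat_input("Say something"):
    st.chat_message("user").write(prompt)
    response = requests.post(f"{API_BASE}/chat", json={"message": prompt}, timeout=60)
    response.raise_for_status()
    st.chat_message("assistant").write(response.json().get("reply", ""))
```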
This decoupled architecture makes the system highly scalable and maintainable. The backend was refactored to use a Dependency Injection pattern, where a single, cached instance of the chat orchestrator is supplied to the API endpoints. This improves testability and predictability.
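A minimal sketch of that pattern using FastAPI's `Depends` with a cached factory; the `ChatOrchestrator` name, request model, and module layout are assumptions for illustration, not the project's actual code.

```python
# Dependency injection sketch: a single cached orchestrator instance shared by all endpoints.
from functools import lru_cache

from fastapi import APIRouter, Depends
from pydantic import BaseModel


class ChatOrchestrator:
    """Placeholder for the component that runs the 4-layer memory logic and LLM calls."""

    async def respond(self, message: str) -> str:
        return f"(stub reply to: {message})"


@lru_cache
def get_orchestrator() -> ChatOrchestrator:
    # lru_cache makes this a singleton factory; tests can swap it out via
    # app.dependency_overrides to inject a fake orchestrator.
    return ChatOrchestrator()


router = APIRouter(prefix="/api/v1")


class ChatRequest(BaseModel):
    message: str


@router.post("/chat")
async def chat(
    payload: ChatRequest,
    orchestrator: ChatOrchestrator = Depends(get_orchestrator),
) -> dict[str, str]:
    return {"reply": await orchestrator.respond(payload.message)}
```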
For more details, see the Architecture & API Documentation.
All endpoints are prefixed with /api/v1.
| Method | Path | Description |
|---|---|---|
| POST | `/chat` | Send a message and get an AI response. |
| GET | `/ltm` | Retrieve all items in the Long-Term Memory. |
| GET | `/decisions` | Retrieve all items in the Decision Digest. |
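For example, once the backend is running, the memory-inspection endpoints can be queried directly. This is a minimal sketch assuming the default local address; the exact response shape is not guaranteed.

```python
# Query the Long-Term Memory and Decision Digest from a script.
# Assumes the backend is running locally on its default port.
import requests

API_BASE = "http://localhost:8000/api/v1"

ltm_items = requests.get(f"{API_BASE}/ltm", timeout=10).json()
decisions = requests.get(f"{API_BASE}/decisions", timeout=10).json()

print("Long-Term Memory:", ltm_items)
print("Decision Digest:", decisions)
```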
Follow these instructions to set up and run the project on your local machine.
- Python 3.12+
- uv: An extremely fast Python package installer and resolver.
- Clone the repository:

  ```bash
  git clone https://github.com/your-username/discussion-rag.git
  cd discussion-rag
  ```

- Set up environment variables: Create a `.env` file by copying the example template.

  ```bash
  cp .env.example .env
  ```

  Now, edit the `.env` file to configure your desired LLM provider, API keys, etc. The default configuration uses a mock LLM.

- Install dependencies: `uv` will create a virtual environment (`.venv`) and install all required packages from `pyproject.toml`.

  ```bash
  uv sync --extra dev
  ```
You need to run the backend and frontend servers in two separate terminal sessions.
- Start the Backend (FastAPI) server:

  ```bash
  uv run uvicorn backend.main:app --reload
  ```

  The API will be available at `http://localhost:8000`. You can view the auto-generated documentation at `http://localhost:8000/docs`.

- Start the Frontend (Streamlit) application:

  ```bash
  uv run streamlit run frontend/app.py
  ```

  The chat interface will be available at `http://localhost:8501`.
We adhere to a TDD workflow and use modern tooling to maintain high code quality.
To run the entire test suite, use pytest:

```bash
uv run pytest
```
- Format code with Ruff:

  ```bash
  uv run ruff format .
  ```

- Lint with Ruff (with auto-fix):

  ```bash
  uv run ruff check --fix .
  ```

- Type-check with MyPy:

  ```bash
  uv run mypy .
  ```
This project is licensed under the MIT License; see the LICENSE file for details.