A modular system for data analysis with memory management, powered by Large Language Models.
The Data Memory System integrates large language models with specialized tools to create an intelligent data analysis assistant that can:
- Remember past interactions and data artifacts
- Execute SQL queries based on natural language questions
- Create visualizations that answer specific data exploration goals
- Summarize datasets and extract key insights
- Refine visualizations based on user feedback
The system consists of several modular components:
- Memory Server: Stores and retrieves conversation history and data artifacts
- SQL Agent Server: Generates and executes SQL queries
- Data Visualization Server: Creates and evaluates Plotly visualizations
- Data Summarization Server: Summarizes datasets and generates exploration goals
- Memory System: Specialized vector-based memory for conversation history and data artifacts
- Client: Orchestrates interaction between servers and user
- LLM Service: Interface to language models for intelligence
- Utils: Helper functions for data manipulation and safe code execution
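As a rough illustration of how the Client might hand a user message to one of the servers listed above, here is a minimal sketch. Everything in it is an assumption made for illustration: the keyword table, `detect_intent`, and `route` are hypothetical names, and the project's own `client/intent_detection.py` may well use the LLM rather than keywords.

```python
from typing import Callable, Dict

# Hypothetical keyword table (an assumption, not taken from the project).
# Order matters: the first intent with a matching keyword wins.
INTENT_KEYWORDS = {
    "visualize": ("chart", "plot", "graph", "visualiz"),
    "summarize": ("summarize", "summary", "insight"),
    "sql": ("how many", "average", "total", "select"),
}


def detect_intent(message: str) -> str:
    """Return the first intent whose keyword occurs in the message."""
    text = message.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return intent
    return "sql"  # fall back to the SQL agent when nothing matches


def route(message: str, handlers: Dict[str, Callable[[str], str]]) -> str:
    """Pass the message to the handler registered for its detected intent."""
    return handlers[detect_intent(message)](message)
```

In this sketch, `handlers` would map each intent to a function that calls the matching server, for example the Data Visualization Server for `"visualize"`.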
Requirements:
- Python 3.9+
- Sentence Transformers
- FAISS or another vector database
- Ollama (for local models)
- SQLite3
- Plotly
Installation:
- Clone the repository:
    git clone https://github.com/yourusername/data-memory-system.git
    cd data-memory-system
- Install dependencies:
    pip install -r requirements.txt
- Set up the configuration in config/llm_config.py and config/server_config.py.
Usage:
- Start the interactive console:
    python app/main.py --interactive
- Or use the Gradio UI:
    python app/ui/gradio_app.py
Example session:
You: Summarize the sales data by region
System: Analyzing sales data...
[Visualizations and insights are displayed]
You: Can you change the chart to a bar graph?
System: [Updates visualization to a bar graph]
data_memory_system/
│
├── mcp_servers/ # MCP servers for different capabilities
│ ├── memory_server.py
│ ├── sql_agent_server.py
│ ├── data_visualization_server.py
│ └── data_summarization_server.py
│
├── client/ # Client orchestration
│ ├── data_analysis_client.py
│ ├── intent_detection.py
│ └── response_formatter.py
│
├── memory/ # Memory management
│ ├── base.py
│ ├── conversation_memory.py
│ └── data_artifacts.py
│
├── models/ # Model interfaces
│ ├── embeddings.py
│ └── llm.py
│
├── prompts/ # LLM prompts
│ ├── sql_prompts.py
│ ├── visualization_prompts.py
│ └── summarization_prompts.py
│
├── utils/ # Utility functions
│ ├── database.py
│ ├── safe_execution.py
│ └── formatters.py
│
├── config/ # Configuration
│ ├── server_config.py
│ └── llm_config.py
│
└── app/ # Application entry points
  ├── main.py # CLI application
  └── ui/ # User interfaces
    └── gradio_app.py # Gradio UI
To add a new MCP server:
- Create a new MCP server in the mcp_servers directory
- Define tool functions using the @mcp.tool() decorator
- Define prompts using the @mcp.prompt() decorator
- Add the server to the configuration in config/server_config.py
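The steps for a new MCP server can be sketched as follows. This is a minimal example under stated assumptions, not the project's own code: it assumes the `mcp` object comes from the official MCP Python SDK (`FastMCP` in `mcp.server.fastmcp`), and the names `count_rows`, `describe_table_prompt`, `build_server`, and the server name `"row-count"` are hypothetical. The SDK import is deferred into `build_server` so the plain functions can be exercised without the SDK installed.

```python
import sqlite3
from contextlib import closing


def count_rows(db_path: str, table: str) -> int:
    """Tool: return the number of rows in a SQLite table."""
    # Table names cannot be bound as SQL parameters, so validate before interpolating.
    if not table.isidentifier():
        raise ValueError(f"invalid table name: {table!r}")
    with closing(sqlite3.connect(db_path)) as conn:
        return conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]


def describe_table_prompt(table: str) -> str:
    """Prompt: ask the model to describe one table."""
    return f"Describe the purpose and columns of the table '{table}'."


def build_server():
    """Register the tool and prompt on a new server object and return it."""
    # Assumption: the MCP Python SDK is installed (pip install mcp).
    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("row-count")
    mcp.tool()(count_rows)  # same effect as decorating with @mcp.tool()
    mcp.prompt()(describe_table_prompt)  # same effect as @mcp.prompt()
    return mcp
```

Calling `build_server().run()` would start the server. The entry to add in config/server_config.py depends on that file's format, which is not shown here.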
To add a new memory type:
- Create a new class that extends memory.base.BaseMemory or memory.base.VectorMemory
- Implement the required methods: store, retrieve, update, and delete
- Add custom methods for specialized memory operations
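A new memory type following these steps might look like the sketch below. The `BaseMemory` class here is a stand-in written for this example so that it runs on its own; the real `memory.base.BaseMemory` may use different method signatures. `ChartMemory` and `find_by_type` are hypothetical names.

```python
import uuid
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class BaseMemory(ABC):
    """Stand-in for memory.base.BaseMemory (signatures are assumptions)."""

    @abstractmethod
    def store(self, item: dict) -> str: ...

    @abstractmethod
    def retrieve(self, key: str) -> Optional[dict]: ...

    @abstractmethod
    def update(self, key: str, item: dict) -> bool: ...

    @abstractmethod
    def delete(self, key: str) -> bool: ...


class ChartMemory(BaseMemory):
    """Hypothetical in-memory store for chart artifacts."""

    def __init__(self) -> None:
        self._items: Dict[str, dict] = {}

    def store(self, item: dict) -> str:
        key = uuid.uuid4().hex  # generate a unique key for the new item
        self._items[key] = dict(item)
        return key

    def retrieve(self, key: str) -> Optional[dict]:
        return self._items.get(key)

    def update(self, key: str, item: dict) -> bool:
        if key not in self._items:
            return False
        self._items[key] = dict(item)
        return True

    def delete(self, key: str) -> bool:
        return self._items.pop(key, None) is not None

    def find_by_type(self, chart_type: str) -> List[dict]:
        """Custom operation: return all stored charts of one type."""
        return [c for c in self._items.values() if c.get("type") == chart_type]
```

A vector-backed variant would extend memory.base.VectorMemory instead and replace the dictionary lookup with a similarity search.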
This project is licensed under the MIT License - see the LICENSE file for details.