THIS CODEBASE IS IN CONSTANT FLUX RIGHT NOW. FILES SOMETIMES DEAD-END, THINGS JUST WON'T LOAD, OR I'LL DELETE CORE INFRASTRUCTURE SO I CAN REBUILD IT IN A FUTURE COMMIT. I'M IN MY FAST AND LOOSE ERA. CHECK OUT THE CODE. IT'S PRETTY NEAT. BUT DON'T JUDGE ME ON THE CONTENT CONTAINED WITHIN RIGHT NOW. THIS IS MY METHOD AND IT WORKS, EVEN THOUGH IT'S VERY WUSHU DRUNKEN-FIST STYLE.
MIRA is an AI agent system that provides natural conversation capabilities with persistent memory, intelligent tool usage, and workflow automation. It combines the conversational abilities of modern language models with practical tools and a sophisticated memory system to create a genuinely useful long-running AI assistant. My vision for this project is to create a little brain-in-a-box. I will not stop till my vision is realized.
I originally started writing what would become MIRA because I wanted a bot that could generate recipes tailored to my personal tastes by following a decision tree. A thousand scope creeps later, it now proactively remembers that next week is your mom's birthday and asks if you'd like it to arrange a gardenia delivery to her work address, because that's what you got her last year lol.
I am distributing MIRA 100% unequivocally free for personal and research use. Download it, modify it, heck, if you're feeling kind, please contribute to the source. But please don't use it in commercial applications without getting a license from me (I'll probably give you one for free for an interesting use case, tbh), or wait for the license to decay into Apache 2.0.
Thanks! I hope you find value in this project.
- Features
- Architecture
- Installation
- Usage
- Memory System
- Tool System
- Available Tools
- Development
- Troubleshooting
- License
## Features

- Natural Conversation: Talk to MIRA like you would a human - no special commands or syntax required
- Continuous Memory: Maintains one long, continuous conversation with daily summarization and context management. Advanced consolidation services run background tasks to organize memory and prevent context bloat. Each day is treated as a standalone event judged against larger trends, with memory decay integrated at multiple strategic points.
- Intelligent Tool Usage: Automatically selects and uses appropriate tools based on conversation context. This just-in-time approach to adding tool definitions to the context window, combined with dynamically updating workflows that show only the current step, lets MIRA reliably execute multi-step tool calls on pitifully low-parameter LLMs. This was designed so that MIRA can run totally offline on edge devices. Heck, my testbed for this ability is an old Android phone.
- Workflow Automation: Handles multi-step tasks with context-aware workflows
- Provider Flexibility: Works with Anthropic (via proxy), Hugging Face, Lambda, any provider that follows the OpenAI API format, or local models through Ollama. llm_provider.py is provider-agnostic.
- Extensible Tool System: Easy to add new capabilities through the modular tool framework with user data isolation and secure credential storage. Includes comprehensive development guides and resources for reliable tool creation.
- User Data Isolation: Complete separation of user data with individual storage directories and credential management for multi-user privacy and security.
## Architecture

MIRA consists of several key components:
- Conversation Manager: Handles message flow, tool execution, and response generation
- Working Memory: Manages dynamic context visible throughout conversations
- Long-Term Memory: Sophisticated memory system with conversation archiving, semantic search, and consolidation services
- Tool System: Modular framework with user-isolated data storage and intelligent credential management
- Data Isolation: User-specific data storage ensuring complete privacy and separation between users
- Credential Management: Secure storage and retrieval of user credentials and API keys
## Installation

MIRA includes automated setup scripts that handle all configuration and dependencies:
```bash
git clone https://github.com/yourusername/mira.git
cd mira

# Setup core system
chmod +x autodeploy.sh
./autodeploy.sh

# Setup secure credential storage (optional but recommended)
cd setup
chmod +x setup_vault.sh
./setup_vault.sh
```
The scripts will:
- Set up Python virtual environment
- Install all dependencies
- Configure PostgreSQL with pgvector
- Create required databases
- Generate configuration files
- Initialize secure credential storage (Vault)
- Set up user data isolation
- Initialize the system
Manual installation instructions are available in `docs/MANUAL_SETUP.md`.
MIRA supports flexible embedding model providers for both tool classification and memory operations:
MIRA uses BAAI's BGE (BAAI General Embedding) models for efficient local inference:
- BGE-large-en-v1.5: 1024-dimensional embeddings with INT8 quantization for CPU efficiency
- BGE-reranker-base: FP16 precision reranker for improved memory search relevance
These models run entirely on your local machine with ONNX Runtime optimization. embeddings_provider.py goes hand-in-hand with llm_provider.py: you can use any embeddings provider you want, including Vertex and OpenAI.
My theory was that while BGE with the reranker produces awesome results, not every computer can handle these tasks, so MIRA can seamlessly offload the heavy lifting to a remote server.
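To show how little a backend has to implement, here is a minimal sketch of a provider interface in the spirit of embeddings_provider.py. The names and the fake backend are illustrative, not the actual API:

```python
import hashlib
from typing import Protocol

import numpy as np


class EmbeddingsBackend(Protocol):
    """Minimal contract: anything that maps texts to vectors can back MIRA.

    A local implementation would wrap ONNX Runtime inference over a BGE model;
    a remote one would call a hosted API (OpenAI, Vertex, ...).
    """

    def embed(self, texts: list[str]) -> np.ndarray:
        """Return an (n_texts, embedding_dim) array of embeddings."""
        ...


class FakeBackend:
    """Deterministic stand-in, handy for testing code written against the
    protocol without downloading models or calling a paid API."""

    def __init__(self, dim: int = 1024):
        self.dim = dim

    def embed(self, texts: list[str]) -> np.ndarray:
        out = np.empty((len(texts), self.dim), dtype=np.float32)
        for i, text in enumerate(texts):
            # Hash the text so the same input always gets the same vector
            seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")
            out[i] = np.random.default_rng(seed).standard_normal(self.dim)
        return out
```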
MIRA implements a novel unified embeddings approach that optimizes both performance and relevance:
Shared Infrastructure with Specialized Usage:
- Both tool relevance and memory search use the same `EmbeddingsProvider` infrastructure
- Tool relevance embeds the current user message for immediate tool classification
- Memory search embeds weighted conversation context from recent messages for historical relevance
- Each system optimizes what text it embeds for its specific use case
Differentiated Processing for Optimal Results:
- Tool relevance uses efficient matrix operations to compare against all tool embeddings and returns a top set
- Memory search applies the same embedding model but includes an additional reranking step to ensure only truly relevant memories are surfaced
- This dual approach balances speed for tool selection with precision for memory retrieval
Advantages over Traditional Approaches:
- Performance: Eliminates redundant embedding infrastructure and model loading
- Consistency: Ensures consistent embedding quality across all system components
- Efficiency: Shared LRU and disk caching reduce computational overhead
- Contextual Precision: Tool relevance focuses on immediate needs while memory search considers conversational flow
- Scalability: Single embedding provider scales to support additional AI subsystems
This architecture enables MIRA to make intelligent decisions about both tool activation and memory retrieval while maintaining optimal performance through shared resources.
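To make the dual-path idea concrete, here is a rough sketch of both consumers sharing one provider. The `provider` and `reranker` objects are stand-ins for any embed/score callables, not MIRA's actual classes:

```python
import numpy as np


def top_tools(provider, message: str, tool_matrix: np.ndarray, k: int = 5) -> list[int]:
    """Tool relevance: embed one message, one matrix multiply, take top-k."""
    q = provider.embed([message])[0]
    q /= np.linalg.norm(q)
    scores = tool_matrix @ q  # rows of tool_matrix are pre-normalized tool embeddings
    return list(np.argsort(scores)[::-1][:k])


def search_memories(provider, reranker, context: str, passages: list[str], k: int = 3) -> list[str]:
    """Memory search: same embeddings, plus a reranking pass for precision."""
    q = provider.embed([context])[0]
    q /= np.linalg.norm(q)
    vecs = provider.embed(passages)
    vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)
    recall = np.argsort(vecs @ q)[::-1][: k * 4]  # cheap recall pass, over-fetch 4x
    scores = reranker.score([(context, passages[i]) for i in recall])
    best = np.argsort(np.asarray(scores))[::-1][:k]
    return [passages[recall[i]] for i in best]
```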
For cloud-based embeddings, configure the provider in your environment:
```bash
# Set embeddings provider to remote
export EMBEDDINGS_PROVIDER=remote
```
API keys are now stored securely using the credential management system rather than environment variables. Use the credential storage interface to securely store your OpenAI API key.
If you choose to use local embeddings (preferred), the system will automatically download and cache the BGE models on first use (~1.2GB for the embedding model, ~500MB for the reranker).
## Usage

Start MIRA in interactive mode:
```bash
python main.py
```
Available commands:
- `/exit` - Exit the program
- `/save` - Save conversation
- `/clear` - Clear conversation history
- `/reload_user` - Reload user profile
- `/tokens` - Show token usage
- `/toolfeedback` - Provide tool feedback
## Memory System

MIRA's memory system consists of two complementary components that work together to provide both immediate context and long-term knowledge retention.
Working memory serves as the dynamic context system that's active throughout every conversation. It functions as a live dashboard that updates automatically before each response, ensuring MIRA always has current information available.
Key Components:
- Content Categories: Organizes information into logical groups (reminders, datetime, system_status, proactive_memories, archived_conversations)
- Manager System: Registered components automatically update their content before each LLM call
- Trinket Architecture: Specialized utility classes handle specific types of dynamic content:
  - `TimeManager`: Provides current date/time context
  - `ReminderManager`: Surfaces overdue and upcoming reminders
  - `SystemStatusManager`: Reports system health and notices
  - `ProactiveMemoryTrinket`: Intelligently surfaces relevant long-term memories
  - `ConversationArchiveManager`: Injects relevant historical conversations
  - Others: Trinkets follow a factory pattern, so new ones can be added quickly and reliably
How It Works:
- Before generating each response, all registered managers update their content
- The system builds a formatted context block that's included in the LLM prompt
- Content is organized by category with clear formatting for easy LLM consumption
- Items are tracked with UUIDs and can be dynamically added/removed during conversation (a minimal sketch of this pattern follows)
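To make the manager/trinket pattern concrete, here is a stripped-down sketch; the base-class API is illustrative, not MIRA's exact interface:

```python
from datetime import datetime


class Trinket:
    """Base class for working-memory content providers."""

    category = "misc"

    def update(self) -> list[str]:
        raise NotImplementedError


class TimeTrinket(Trinket):
    category = "datetime"

    def update(self) -> list[str]:
        return [f"Current time: {datetime.now():%A %Y-%m-%d %H:%M}"]


class WorkingMemory:
    def __init__(self) -> None:
        self.trinkets: list[Trinket] = []

    def register(self, trinket: Trinket) -> None:
        self.trinkets.append(trinket)

    def render(self) -> str:
        """Called before every LLM request: refresh each trinket, group by category."""
        sections: dict[str, list[str]] = {}
        for trinket in self.trinkets:
            sections.setdefault(trinket.category, []).extend(trinket.update())
        return "\n".join(f"[{cat}]\n" + "\n".join(items) for cat, items in sections.items())
```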
Long-term memory provides persistent knowledge storage with sophisticated retrieval capabilities. It uses PostgreSQL with pgvector for semantic similarity search, enabling MIRA to find and utilize relevant information from past conversations and learned knowledge. The system includes advanced consolidation services that run background tasks to organize and optimize memory storage.
Core Components:
Memory Blocks (Core Memory)
- Always-visible context that MIRA can self-edit
- Three categories: `persona` (MIRA's personality/directive), `human` (user information), `system` (operational context)
- Version tracking with differential storage for change history
- Character limits ensure focused, relevant content (2048 for persona/human, 1024 for system)
- There is also a blind core memory called `scratchpad` that allows MIRA to log notes like "It is Friday and the user asked me to provide them a report again this week. Consider proactively providing that in their morning report on Fridays." MIRA can write to this block but cannot read it; allowing MIRA to read the block would pollute the context. The contents of `scratchpad` are reviewed nightly during the consolidation routine and cleared for the next day. (A minimal sketch of the block structure follows.)
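Here is a minimal sketch of a self-editable block with character limits and a write-only scratchpad. The scratchpad limit is an assumed placeholder, and real version history is stored differentially rather than as full copies:

```python
from dataclasses import dataclass, field

# persona/human/system limits come from the description above;
# the scratchpad limit is an assumed placeholder.
LIMITS = {"persona": 2048, "human": 2048, "system": 1024, "scratchpad": 2048}


@dataclass
class MemoryBlock:
    label: str
    text: str = ""
    version: int = 0
    history: list[str] = field(default_factory=list)

    def write(self, new_text: str) -> None:
        if len(new_text) > LIMITS[self.label]:
            raise ValueError(f"{self.label} block exceeds {LIMITS[self.label]} chars")
        self.history.append(self.text)  # full copies here; real storage is differential
        self.text, self.version = new_text, self.version + 1


def render_for_prompt(blocks: dict[str, MemoryBlock]) -> str:
    # scratchpad is write-only: deliberately excluded so it never pollutes the context
    return "\n\n".join(
        f"<{b.label}>\n{b.text}\n</{b.label}>"
        for b in blocks.values()
        if b.label != "scratchpad"
    )
```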
Memory Passages (Archival Memory)
- Searchable long-term memories with vector embeddings
- Sources include conversations, documents, and automation systems
- Importance scoring for relevance weighting
- Access tracking to identify frequently used information
- Optional expiration dates for temporal facts
- Human verification flags for trusted information
- Time and retrieval-delay tracking for eventually pruning memories that have faded into obscurity (one possible scoring sketch follows this list)
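The exact decay math lives in the consolidation service; purely as a back-of-the-envelope illustration, age and retrieval delay might combine into a pruning score like this:

```python
import math
from datetime import datetime, timedelta


def retention_score(importance: float, created: datetime, last_accessed: datetime,
                    half_life_days: float = 90.0) -> float:
    """Illustrative only: exponentially decay on whichever signal is staler,
    weighted by importance. The real service may score differently."""
    now = datetime.now()
    staleness = max((now - created).days, (now - last_accessed).days)
    return importance * math.pow(0.5, staleness / half_life_days)


# An important passage untouched for a year decays well below a fresh, minor one:
old = retention_score(0.9, datetime.now() - timedelta(days=400),
                      datetime.now() - timedelta(days=365))
new = retention_score(0.3, datetime.now(), datetime.now())
assert old < new
```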
Archived Conversations
- Complete conversation history organized by date with sophisticated archiving system
- Pre-generated summaries at multiple time scales (daily, weekly, monthly)
- Efficient temporal indexing for quick retrieval
- Integration with working memory for contextual access
- Advanced conversation archiving with metadata preservation and retrieval optimization
Memory Consolidation Service
- Background service for automated memory organization and optimization
- Handles conversation archiving, memory block creation, and intelligent summarization
- Runs independently of active sessions to maintain system performance
- Manages memory lifecycle including aging, consolidation, and cleanup
Memory Management System
- Block manager for atomic memory units with similarity search and temporal organization
- Passage manager for text passages with embedding-based search and contextual retrieval
- Memory task system for background operations and optimization routines
Memory Snapshots
- Point-in-time captures of entire memory state
- Used for recovery, auditing, and debugging
- Enables rollback to previous memory states if needed
Semantic Search Capabilities:
- Uses OpenAI's text-embedding-3-small model (1024 dimensions) or BGE-large-en-v1.5 (1024 dimensions)
- pgvector's IVFFlat indexing for efficient similarity search
- Configurable similarity thresholds (default 0.6 for proactive surfacing)
- Multiple filter options (source, date range, importance level)
- Weighted context building that prioritizes recent conversation content (an example query is sketched below)
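To picture the retrieval side, here is roughly what a threshold-filtered pgvector similarity query looks like. Table and column names are illustrative; in practice the pgvector Python package's adapter is cleaner than hand-formatting the vector:

```python
import psycopg


def search_passages(conn: psycopg.Connection, query_vec: list[float],
                    threshold: float = 0.6, limit: int = 5):
    """Cosine similarity search; <=> is pgvector's cosine-distance operator."""
    vec = "[" + ",".join(str(x) for x in query_vec) + "]"  # pgvector text format
    sql = """
        SELECT id, text, 1 - (embedding <=> %s::vector) AS similarity
        FROM memory_passages
        WHERE 1 - (embedding <=> %s::vector) >= %s
        ORDER BY embedding <=> %s::vector
        LIMIT %s
    """
    with conn.cursor() as cur:
        cur.execute(sql, (vec, vec, threshold, vec, limit))
        return cur.fetchall()
```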
Integration Between Systems: The two memory systems work together through:
- `MemoryBridge`: Automatically injects core memory blocks into working memory
- `ProactiveMemoryTrinket`: Uses semantic search to surface relevant memories based on conversation context
- `ConversationArchiveManager`: Provides access to historical conversations when contextually relevant
This architecture enables MIRA to maintain both immediate conversational awareness and accumulated knowledge over time, creating a truly continuous interaction experience with intelligent background optimization.
## Tool System

MIRA's tool system is designed for both intelligent automation and developer ease-of-use. The system automatically determines which tools are relevant to a conversation and enables them just-in-time, while providing developers with a streamlined path to add new capabilities.
Tool Relevance Engine

MIRA uses a sophisticated relevance engine that analyzes conversation context to determine which tools should be available:
- Embedding-Based Classification: Uses semantic similarity to match user messages with tool capabilities
- Context Persistence: Keeps relevant tools enabled for several message exchanges after initial activation (see the sketch after this list)
- Topic Change Detection: Automatically adjusts tool availability when conversation topics shift
- Training Data: The system automatically creates high-quality synthetic examples for each tool, verified for diversity and token length before finalization (no human in the loop)
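The persistence and topic-change behavior can be pictured with a small sketch; the turn counts and API here are invented for illustration:

```python
class ToolPersistence:
    """Keep activated tools visible for a few exchanges, then let them lapse."""

    def __init__(self, ttl: int = 4):
        self.ttl = ttl                       # turns a tool stays enabled after activation
        self.remaining: dict[str, int] = {}  # tool name -> turns left

    def activate(self, tools: list[str]) -> None:
        for name in tools:
            self.remaining[name] = self.ttl

    def tick(self, topic_changed: bool) -> list[str]:
        """Call once per user message; returns the tools to expose this turn."""
        if topic_changed:
            self.remaining.clear()           # a topic shift revokes stale tools immediately
        else:
            self.remaining = {n: t - 1 for n, t in self.remaining.items() if t > 1}
        return sorted(self.remaining)
```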
Automatic Training

When a new tool is added:
- MIRA examines the tool's description and parameters
- Generates synthetic conversation examples that would require the tool
- Trains its relevance classifier to recognize similar patterns
- The tool becomes available for automatic activation after the application is reloaded (a rough sketch of the synthetic-example step follows)
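Here is a rough sketch of that synthetic-example step, assuming generic `generate` (LLM completion) and `embed` callables rather than MIRA's actual internals:

```python
import numpy as np


def _cos(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def build_training_examples(tool_name: str, description: str,
                            generate, embed, n: int = 20) -> list[str]:
    """Generate candidate user messages for a tool, then keep a diverse subset."""
    prompt = (f"Write {n} distinct user messages that would require the tool "
              f"'{tool_name}' ({description}). One message per line.")
    examples = [ln.strip() for ln in generate(prompt).splitlines() if ln.strip()]
    vecs = embed(examples)
    kept, kept_vecs = [], []
    for text, vec in zip(examples, vecs):
        # diversity check: drop near-duplicates before they reach the classifier
        if all(_cos(vec, other) < 0.95 for other in kept_vecs):
            kept.append(text)
            kept_vecs.append(vec)
    return kept
```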
The tool system follows a consistent pattern that makes it easy to add new capabilities using natural language with AI assistance:
Standardized Tool Pattern

All tools inherit from a base `Tool` class and implement a simple interface:
```python
class MyTool(Tool):
    name = "my_tool"
    description = "What the tool does"

    def run(self, **params):
        # Tool implementation
        return {"result": "success"}
```
Natural Language Development Workflow
- Load MIRA's codebase into an AI assistant like Claude Code
- Examine existing tools in the `tools/` directory for patterns
- Reference the comprehensive guide at `tools/HOW_TO_BUILD_A_TOOL.md` for best practices
- Review the collaborative programming framework at `tools/COLLABORATIVE-PROGRAMMING-FRAMEWORK.md`
- Describe the desired functionality in natural language
- The AI assistant generates a complete tool implementation following MIRA's patterns
- Drop the completed tool file into the `tools/` directory if Claude doesn't put it there automatically
- MIRA automatically discovers, loads, and trains itself to use the new tool
Automatic Integration
- Discovery: Tools are automatically discovered and loaded from the `tools/` directory (a discovery sketch follows this list)
- Configuration: Tools can register their own configuration schemas, and the system allows for dependency injection
- Error Handling: Consistent error handling with recovery guidance
- Documentation: Self-documenting through description and parameter schemas
- Testing: Standard testing patterns ensure reliability
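Discovery itself can be as simple as scanning the directory for Tool subclasses. A minimal sketch, assuming the base `Tool` class shown earlier and a `*_tool.py` naming convention; the real loader also handles configuration and relevance training:

```python
import importlib.util
import inspect
from pathlib import Path

from tools.base import Tool  # hypothetical location of the base class


def discover_tools(tools_dir: str = "tools") -> list[type]:
    """Import each candidate module and collect concrete Tool subclasses."""
    found: list[type] = []
    for path in Path(tools_dir).glob("*_tool.py"):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        for _, obj in inspect.getmembers(module, inspect.isclass):
            if issubclass(obj, Tool) and obj is not Tool:
                found.append(obj)
    return found
```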
Example Development Process

A developer wanting to add Slack integration might say:
"I need a tool that can send messages to Slack channels, list channels, and get recent messages from a channel. It should use the Slack API and handle authentication with a bot token."
The AI assistant would then:
- Examine existing tools like `email_tool.py` for patterns
- Generate a complete `slack_tool.py` implementation
- Include proper error handling, configuration management, and an OpenAI schema
- Provide the tool ready to drop into the `tools/` directory
This approach makes MIRA highly extensible while maintaining code quality and consistency across all tools.
A note from the developer: This rocks. It is amazing. You can imagine a tool and, like 40 cents in API calls later, your new tool works with MIRA. I was going to give MIRA the ability to build its own tools with headless Claude Code, and maybe that'll be a thing in the future, but for now I made it dead simple to add tools relevant to your needs.
## Available Tools

MIRA includes several built-in tools that are automatically enabled based on conversation context. Tools now use secure, user-isolated credential storage and data management:
Tool | Description | Storage Type | Auto-Discovery |
---|---|---|---|
Email | Send and manage emails with secure credential storage | User credentials | ✓ |
Calendar | Manage calendar events with CalDAV integration | User credentials | ✓ |
Weather | Get weather forecasts and heat stress calculations | API config | ✓ |
Web Access | Search the web and fetch page content | API config | ✓ |
Customer Database | Manage customer information with user-isolated SQLite | User-specific DB | ✓ |
Contacts | Manage personal contacts with file-based storage | User JSON files | ✓ |
Kasa | Control Kasa smart home devices | Device config | ✓ |
Square API | Business operations with secure API key storage | User credentials | ✓ |
Reminders | Create and manage reminders with file-based storage | User JSON files | ✓ |
Automation | Set up automated workflows | System config | ✓ |
Pager | Virtual paging system with TOFU authentication | User database | ✓ |
## Development

```bash
# Run tests
pytest

# Format code
black .

# Type checking
mypy .

# Linting
flake8
```
Project structure:

```
mira/
├── auth/                 # Authentication system and user credentials
├── config/               # Configuration management and tenant isolation
├── lt_memory/            # Long-term memory system with consolidation
│   ├── managers/         # Memory management components
│   └── tools/            # Memory-related tools
├── setup/                # Installation and setup scripts
├── tools/                # Tool implementations with user isolation
├── tests/                # Test suite
├── utils/                # Utility modules including credential security
├── data/                 # User data storage
│   └── users/            # User-specific data directories
├── docs/                 # Documentation and implementation guides
│   └── implementation_plan_guides/  # Technical specifications
├── conversation.py       # Core conversation logic
├── working_memory.py     # Working memory system
├── userdata_manager.py   # User-scoped data management
└── main.py               # CLI entry point
```
## Troubleshooting

- PostgreSQL Connection Errors
  - Ensure PostgreSQL is running: `sudo service postgresql status`
  - Verify database exists and user has permissions
  - Check connection string in environment variables
- Missing pgvector Extension
  - Install pgvector for your PostgreSQL version
  - Enable it in the lt_memory database: `CREATE EXTENSION vector;`
- Tool Errors
  - Verify credentials are stored in the credential management system
  - Check that user-specific data directories exist and are accessible
  - Verify external services are accessible
  - Review logs for detailed error messages
- Credential Management Issues
  - Ensure Vault is properly configured and accessible
  - Verify user credentials are stored correctly
  - Check file permissions on user data directories
Enable debug logging:
```bash
export AGENT_LOG_LEVEL=DEBUG
```
## License

This project is licensed under the Business Source License (BSL) - see the LICENSE file for details.
The BGE (BAAI General Embedding) models are licensed under the MIT License:
- BGE-large-en-v1.5: https://huggingface.co/BAAI/bge-large-en-v1.5
- BGE-reranker-base: https://huggingface.co/BAAI/bge-reranker-base
These models are developed by the Beijing Academy of Artificial Intelligence (BAAI) and are freely available for both commercial and non-commercial use under the MIT License.