A domain-agnostic agentic operating system with a financial specialist
Status: Phase 1 - Foundation COMPLETE (MVP 1.0)
Last Updated: 2026-02-01
For New Developers/LLMs:
BEFORE RUNNING ANYTHING, REMEMBER TO ACTIVATE THE VENV.
- Start with README.md (this file)
- Read docs/ROADMAP.md (what we're building)
- Read docs/STATUS.md (current state)
- Read docs/AGENTOS_SPEC.md (how to build)
- Read docs/STRUCTURE.md (where things are)
Omni-Finn is a two-layer autonomous system:
- AgentOS - A reusable agentic framework (the "operating system")
  - Provides orchestration, skill management, persistent memory
  - Domain-agnostic - can power any type of agent
  - Built on LangGraph + Local Ollama LLMs
  - NEW: Three-tier memory system (HOT/WARM/COLD) with LanceDB
- Agent Finn - A financial portfolio manager (the "application")
  - Autonomous data ingestion from bank statements
  - Zero-error financial calculations (decimal precision)
  - Proactive research and delta reports
Current Focus: Building AgentOS first, then integrating Finn
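
The "decimal precision" goal listed for Agent Finn can be illustrated with Python's standard `decimal` module. This is a minimal sketch rather than Finn's actual code: the helper name `sum_amounts` and the choice of banker's rounding to whole cents are assumptions made for illustration.

```python
from decimal import Decimal, ROUND_HALF_EVEN

CENT = Decimal("0.01")


def sum_amounts(amounts):
    """Sum money amounts given as strings, rounded to whole cents.

    Hypothetical helper: parsing from strings avoids binary float error.
    """
    total = sum((Decimal(a) for a in amounts), Decimal("0"))
    return total.quantize(CENT, rounding=ROUND_HALF_EVEN)


# Binary floats cannot represent 0.1 or 0.2 exactly, so the sum drifts.
float_ok = (0.1 + 0.2) == 0.3

# Decimal keeps the exact base-10 values.
decimal_total = sum_amounts(["0.10", "0.20"])
decimal_ok = decimal_total == Decimal("0.30")

print(float_ok, decimal_ok, decimal_total)  # prints: False True 0.30
```

Parsing amounts from strings (not floats) is the important part: `Decimal(0.1)` would inherit the float's representation error.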
┌─────────────────────────────────────────────────────────────┐
│  Agent Finn (Financial Specialist)                          │
│  • Portfolio management                                     │
│  • Data ingestion & reconciliation                          │
│  • Research & analysis                                      │
└─────────────────────────────────────────────────────────────┘
                               ↓
┌─────────────────────────────────────────────────────────────┐
│  AgentOS (Domain-Agnostic Framework)                        │
│  • Classifier Node (Task vs Question Routing) [NEW]         │
│  • Planner → Actor → Auditor workflow                       │
│  • Enhanced Auditor with Verification Strategies [NEW]      │
│  • Skill registry & execution                               │
│  • Persistent memory (NOW.md + LOG.md + SQLite + LanceDB)   │
│  • Self-healing loops (Phase 2)                             │
│  • Multi-agent orchestration (Phase 2)                      │
└─────────────────────────────────────────────────────────────┘
                               ↓
┌─────────────────────────────────────────────────────────────┐
│  Infrastructure                                             │
│  • Local Ollama LLMs (RTX 4090)                             │
│  • LangGraph for state management                           │
│  • SQLite for structured memory & facts                     │
│  • LanceDB for semantic search (Cold Memory)                │
└─────────────────────────────────────────────────────────────┘
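
The routing in the AgentOS layer (Classifier first, then either a fast path for Questions or the Planner → Actor → Auditor loop for Tasks) can be sketched in plain Python. This is not the project's LangGraph graph: the `State` class, the node bodies, and the trailing-question-mark heuristic are all hypothetical stand-ins used only to show the control flow.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    """Hypothetical stand-in for the AgentState schema."""
    request: str
    kind: str = ""
    trace: list = field(default_factory=list)


def classify(state):
    # Toy heuristic (an assumption): a trailing "?" marks a Question.
    state.kind = "question" if state.request.strip().endswith("?") else "task"
    state.trace.append("classifier")
    return state


def responder(state):
    state.trace.append("responder")
    return state


def planner(state):
    state.trace.append("planner")
    return state


def actor(state):
    state.trace.append("actor")
    return state


def auditor(state):
    state.trace.append("auditor")
    return state


def run(request):
    """Route a request: Questions take the fast path, Tasks the full loop."""
    state = classify(State(request))
    if state.kind == "question":
        return responder(state)
    for node in (planner, actor, auditor):
        state = node(state)
    return state


print(run("What is LanceDB?").trace)
print(run("Create a file named hello.txt").trace)
```

In a LangGraph implementation each function would presumably be a graph node and the `if` a conditional edge; see docs/AGENTOS_SPEC.md for the actual design.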
# Required
- Python 3.10+ (3.14 supported)
- Ollama running locally
- Models: gpt-oss:20b, llama3.1:8b
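
The model requirement can also be checked from Python. The sketch below assumes Ollama's `/api/tags` endpoint (the same one used in the curl check) returns JSON with a `models` list whose entries carry a `name` field; the helper names are hypothetical and `fetch_tags` needs a reachable server, so the demonstration uses a hand-written payload.

```python
import json
import urllib.request


def installed_models(tags_payload):
    """Extract model names from a parsed /api/tags response."""
    return {m["name"] for m in tags_payload.get("models", [])}


def missing_models(tags_payload, required):
    """Return the required model names that Ollama does not list."""
    have = installed_models(tags_payload)
    return [name for name in required if name not in have]


def fetch_tags(base_url, timeout=5.0):
    """GET <base_url>/api/tags and parse the JSON body (needs a live server)."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)


# Offline demonstration with a hand-written payload in the /api/tags shape.
sample = {"models": [{"name": "llama3.1:8b"}]}
print(missing_models(sample, ["gpt-oss:20b", "llama3.1:8b"]))
```

Against a live server, `missing_models(fetch_tags(base_url), [...])` with your own Ollama URL would report which models still need to be pulled.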
# Check Ollama is running
curl http://192.168.4.102:11434/api/tags

# Clone and setup
git clone <repo-url>
cd Agent-FIN
# Create virtual environment
python -m venv venv
# Windows
.\venv\Scripts\Activate.ps1
# Linux/Mac
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Verify installation
python -c "import pydantic, yaml; print('✅ Dependencies OK')"
# Configure environment
cp .env.example .env
# Edit .env with your Ollama URL and model names

Create .env file:
# LLM Provider
LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://192.168.4.102:11434
# Models
REASONING_MODEL=gpt-oss:20b
PARSER_MODEL=llama3.1:8b
TOOL_MODEL=llama3.1:8b
# Observability
ENABLE_OBSERVABILITY=true

# Quick tests (mock provider)
python tests/test_auditor.py
python tests/test_graph_routing.py
python tests/test_actor.py
# Integration tests (requires Ollama)
python tests/test_planner_pydantic.py
python tests/test_agent_integration.py

# Run a simple workflow
python core/engine.py "Create a file named hello.txt with content 'AgentOS is working'"
# OR via Agent class (Recommended)
# See scripts or interactive shell

/Agent-FIN
├── /core                           # AgentOS - Domain-Agnostic Framework
│   ├── engine.py                   # Main entry point
│   ├── agent.py                    # Agent class definition
│   ├── graph.py                    # LangGraph workflow definition
│   ├── state.py                    # AgentState schema
│   ├── models.py                   # Pydantic models (Plan, PlanStep, etc.)
│   ├── memory_manager.py           # Persistent memory system (LanceDB)
│   ├── memory_schema.sql           # Memory database schema
│   ├── llm.py                      # LLM provider interface
│   ├── two_stage_client.py         # Reasoning + parsing pipeline
│   ├── observability.py            # Tracing & monitoring
│   ├── /nodes                      # Execution nodes
│   │   ├── planner.py              # Intent → Plan
│   │   ├── actor.py                # Plan → Execution
│   │   └── auditor.py              # Execution → Verification
│   └── /skills/memory              # Native memory skills
│
├── /docs                           # All Documentation
│   ├── ROADMAP.md                  # Development roadmap
│   ├── STATUS.md                   # Current status
│   ├── AGENTOS_SPEC.md             # Technical specification
│   ├── STRUCTURE.md                # File organization
│   ├── MEMORY_SYSTEM.md            # Memory architecture
│   ├── AGENT_STRUCTURE_STANDARD.md # Agent directory standard
│   └── /legacy                     # Archived documentation
│
├── /finn                           # Agent Finn - Financial Specialist
│   ├── /config                     # Agent configuration & SOPs
│   ├── /skills                     # Finn-specific skills
│   ├── /directives                 # SOPs & procedures
│   ├── /memory                     # Agent memory (NOW.md, LOG.md, memory.db)
│   └── /inbox                      # Watch folder for ingestion
│
├── /tests                          # Test suite
│   ├── README.md                   # Test documentation
│   ├── test_*.py                   # Test files
│   └── /results                    # Test outputs & artifacts
│
├── README.md                       # This file - start here
├── requirements.txt                # Python dependencies
└── .env                            # Environment configuration
- docs/ROADMAP.md - Development plan, priorities, milestones
- docs/STATUS.md - Current state, progress, known issues
- docs/AGENTOS_SPEC.md - Technical specification
- docs/STRUCTURE.md - File system organization
- docs/MEMORY_SYSTEM.md - Memory architecture
- docs/AGENT_STRUCTURE_STANDARD.md - Directory standards
- finn/config/PRD.md - Product vision & requirements
- finn/config/AGENTS.md - Agent architecture philosophy
- tests/README.md - Test suite documentation
- ✅ Intent Classification (Smart Routing)
  - Distinguishes between Tasks ("Create file") and Questions ("What is X?")
  - Routes Questions to the fast-path `Responder` node
  - Routes Tasks to the full `Planner` loop
- ✅ Enhanced Auditor (Reliable Verification)
  - Verifies actual side-effects (file creation, content presence)
  - Uses a strategy pattern (`verify_file_exists`, `verify_content`, etc.)
- ✅ Planner → Actor → Auditor workflow
- ✅ Agent-Engine Integration (Agent class drives workflow)
- ✅ Two-stage reasoning (gpt-oss:20b + llama3.1:8b)
- ✅ LangGraph state management
- ✅ Persistent Memory System (HOT/WARM/COLD architecture)
  - NOW.md for current status
  - LOG.md for activity history
  - SQLite for user facts & metadata
  - LanceDB for semantic search (Cold Memory)
  - Auto-logging & context injection
  - Self-annealing error recovery
- ✅ Skill registry system with metadata discovery
- ✅ Memory skills (update_status, log_activity, save_fact, etc.)
- ✅ Test suite (All passing)
- Next: Self-Healing Loops (Phase 2)
- Pending: Multi-agent support (Phase 2)
- Pending: Agent Finn integration (Phase 3)
See docs/STATUS.md for detailed progress
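
The Enhanced Auditor's strategy pattern can be sketched as follows. Only the names `verify_file_exists` and `verify_content` come from this README; their signatures, the `STRATEGIES` registry, and the step dictionary layout are assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def verify_file_exists(path):
    """Strategy: the expected file is present on disk."""
    return Path(path).is_file()


def verify_content(path, expected):
    """Strategy: the file exists and contains the expected text."""
    p = Path(path)
    return p.is_file() and expected in p.read_text(encoding="utf-8")


# Registry mapping a strategy name to its check (hypothetical layout).
STRATEGIES = {
    "file_exists": lambda step: verify_file_exists(step["path"]),
    "content": lambda step: verify_content(step["path"], step["expected"]),
}


def audit(step):
    """Run the strategy named in the step; unknown strategies fail closed."""
    check = STRATEGIES.get(step.get("verify"))
    return bool(check and check(step))


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "hello.txt"
    target.write_text("AgentOS is working", encoding="utf-8")
    passed = audit({"verify": "content", "path": target, "expected": "AgentOS"})
    missing = audit({"verify": "file_exists", "path": Path(tmp) / "nope.txt"})

print(passed, missing)  # prints: True False
```

Checking the file system directly, instead of trusting the Actor's own report, is what "verifies actual side-effects" means in practice.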
Goal: Build tested, reusable agentic core
- 1A: Core workflow ✅ 100%
- 1B: Skill Registry & Integration ✅ 100%
- 1C: Intent Classification ✅ 100% (Smart Routing)
- 1D: Enhanced auditor ✅ 100% (Strategy Verification)
- 1E: Persistent memory ✅ 100% (LanceDB)
- Self-healing loops
- Multi-agent support
- Advanced workflow patterns
- Financial skill catalog
- Ingestion pipeline
- Sub-agents (Accountant, OCR, Researcher)
- Observability & monitoring
- Performance optimization
- User interface
See docs/ROADMAP.md for detailed breakdown
# Run all quick tests (<5 seconds)
python tests/test_auditor.py && \
python tests/test_graph_routing.py && \
python tests/test_actor.py
# Run integration tests (requires Ollama, ~30 seconds)
python tests/test_planner_pydantic.py
python tests/test_agent_integration.py
# Run full E2E test (requires Ollama, ~2-3 minutes)
python tests/test_e2e_workflow.py

Test Coverage: ~85%
- Unit tests: Passing
- Integration tests: Passing
- End-to-end tests: Passing
See tests/README.md for detailed test documentation
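
The quick tests run against a mock provider instead of Ollama. A minimal sketch of that idea is shown below; the `LLMProvider` protocol, the `complete` method, and the toy `plan_steps` function are hypothetical and do not describe the interface in core/llm.py.

```python
from typing import Protocol


class LLMProvider(Protocol):
    """Hypothetical provider interface: prompt in, text out."""

    def complete(self, prompt: str) -> str: ...


class MockProvider:
    """Returns canned replies and records prompts, so tests need no server."""

    def __init__(self, replies):
        self._replies = list(replies)
        self.prompts = []

    def complete(self, prompt: str) -> str:
        self.prompts.append(prompt)
        return self._replies.pop(0)


def plan_steps(provider: LLMProvider, request: str):
    """Toy planner: asks the provider for one step per line."""
    raw = provider.complete(f"Plan the steps for: {request}")
    return [line.strip() for line in raw.splitlines() if line.strip()]


mock = MockProvider(["write hello.txt\nverify hello.txt\n"])
steps = plan_steps(mock, "Create a file named hello.txt")
print(steps)  # prints: ['write hello.txt', 'verify hello.txt']
```

Because the mock records every prompt, a test can assert on what was sent to the model as well as on how the reply was parsed.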
| Component | Technology | Purpose |
|---|---|---|
| Orchestration | LangGraph | State management & workflow routing |
| LLM Server | Ollama | Local LLM inference |
| Reasoning Model | gpt-oss:20b | Planning & high-level reasoning |
| Tool Model | llama3.1:8b | Code generation & structured output |
| Validation | Pydantic V2 | Type-safe data models |
| Database | SQLite | Portfolio data + memory storage |
| Vector DB | LanceDB | Semantic memory (cold tier) |
| Observability | LangSmith | Tracing & debugging |
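
As an illustration of the SQLite row above, a fact store for the memory system might look like the sketch below. The `save_fact` name matches a memory skill listed in this README, but the table layout, `open_memory`, and `get_fact` are assumptions; the real schema lives in core/memory_schema.sql.

```python
import sqlite3


def open_memory(path=":memory:"):
    """Open a database and create a hypothetical facts table if missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS facts ("
        "key TEXT PRIMARY KEY, value TEXT NOT NULL, "
        "updated_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def save_fact(conn, key, value):
    """Insert a fact, replacing any existing row with the same key."""
    conn.execute(
        "INSERT OR REPLACE INTO facts (key, value) VALUES (?, ?)",
        (key, value),
    )
    conn.commit()


def get_fact(conn, key):
    """Return the stored value, or None when the key is unknown."""
    row = conn.execute("SELECT value FROM facts WHERE key = ?", (key,)).fetchone()
    return row[0] if row else None


conn = open_memory()
save_fact(conn, "base_currency", "USD")
save_fact(conn, "base_currency", "EUR")  # overwrites the earlier value
print(get_fact(conn, "base_currency"))  # prints: EUR
```

Parameterized queries (`?` placeholders) keep user-supplied fact text from being interpreted as SQL.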
- Read the docs
  - Start with docs/ROADMAP.md
  - Check docs/STATUS.md for current state
  - Review docs/AGENTOS_SPEC.md for technical details
- Pick a task
  - See docs/ROADMAP.md for current sprint
  - Check GitHub issues (if available)
- Write tests first
  - Follow existing test patterns in `/tests`
  - See tests/README.md for guidelines
- Submit PR (if applicable)
  - Include tests
  - Update documentation
  - Follow Pydantic V2 patterns
- AgentOS = Domain-agnostic infrastructure
- Agents = Domain-specific knowledge & skills
- Never mix domain logic into core
- All capabilities as discoverable skills
- Metadata-driven skill registry
- Dynamic skill loading & execution
- All models configurable via environment
- Support for local & cloud LLMs
- Graceful degradation
- 80% code coverage target
- Tests for every component
- Mock for speed, Ollama for integration
- Trace every LLM call
- Log all state transitions
- Debug-friendly error messages
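
The skill principles above (discoverable skills, a metadata-driven registry, dynamic execution) can be sketched with a decorator-based registry. The skill names `log_activity` and `update_status` appear in this README; the `skill` decorator, the `SKILLS` dictionary, and the function signatures are hypothetical.

```python
SKILLS = {}


def skill(name, description):
    """Decorator that registers a function plus metadata under a name."""
    def register(func):
        SKILLS[name] = {"run": func, "description": description}
        return func
    return register


@skill("log_activity", "Append an entry to an activity log")
def log_activity(log, message):
    log.append(message)
    return len(log)


@skill("update_status", "Replace the current status text")
def update_status(status, text):
    status["now"] = text
    return status


def discover():
    """List registered skills as (name, description) pairs, sorted by name."""
    return sorted((n, meta["description"]) for n, meta in SKILLS.items())


def execute(name, *args, **kwargs):
    """Look up a skill by name at call time and run it."""
    if name not in SKILLS:
        raise KeyError(f"unknown skill: {name}")
    return SKILLS[name]["run"](*args, **kwargs)


activity = []
print(execute("log_activity", activity, "created hello.txt"))  # prints: 1
print([name for name, _ in discover()])  # prints: ['log_activity', 'update_status']
```

Keeping descriptions next to the callables is what would let a planner choose skills from metadata without importing domain code into the core.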
- Auditor Loop - Failed verification doesn't yet trigger automatic retry/healing (Phase 2)
- Context Window - Very long conversation histories may hit context limits (summary rollover is not yet implemented)
See docs/STATUS.md for full list
- Ollama: https://ollama.ai
- LangGraph: https://langchain-ai.github.io/langgraph/
- Pydantic: https://docs.pydantic.dev/latest/
- LanceDB: https://lancedb.com/
[To be determined]
- Development: [Your Name]
- Architecture: AI-Assisted Design
- Infrastructure: RTX 4090 Homelab
For questions about:
- AgentOS Core: See docs/AGENTOS_SPEC.md
- Agent Finn: See finn/config/PRD.md
- Development: See docs/ROADMAP.md
- Current Status: See docs/STATUS.md
Last Updated: 2026-02-01
Version: 1.0 (MVP - Phase 1 Complete)
Next Milestone: Self-Healing Loops (Phase 2)