Autonomous BI Analyst is a full-stack prototype that turns natural-language questions into SQL, executes them safely against a PostgreSQL analytics database, and returns a human-readable answer. It combines:
- a LangGraph orchestration workflow (the agent),
- a Model Context Protocol (MCP) database tool server,
- PostgreSQL with seeded analytics data,
- and a Streamlit chatbot frontend.
```
bi-agent/
├── docker-compose.yml
├── backend/
│   ├── agent/                  # LangGraph agent (FastAPI + graph)
│   │   ├── api/
│   │   │   ├── routes.py       # POST /chat endpoint
│   │   │   └── schemas.py      # Request / response Pydantic models
│   │   ├── config/
│   │   │   ├── prompts.py      # Centralised prompt strings
│   │   │   └── state.py        # AgentState TypedDict
│   │   ├── nodes/              # One function per graph node
│   │   │   ├── guardrail.py
│   │   │   ├── handle_tool_result.py
│   │   │   ├── query_database.py
│   │   │   ├── retry.py
│   │   │   ├── security_warning.py
│   │   │   └── summarize.py
│   │   ├── routers/            # LangGraph conditional edge functions
│   │   │   ├── guardrail.py
│   │   │   ├── query_database.py
│   │   │   └── tool_execution.py
│   │   ├── graph.py            # Graph compilation
│   │   ├── app.py              # FastAPI application factory
│   │   ├── Dockerfile
│   │   └── requirements.txt
│   ├── core/
│   │   ├── llm/                # Provider-agnostic LLM abstraction
│   │   │   ├── base.py         # LLMEngine abstract base class
│   │   │   ├── factory.py      # EngineFactory registry
│   │   │   └── engines/
│   │   │       ├── langchain_anthropic.py     # Claude via LangChain
│   │   │       ├── langchain_azure_openai.py  # Azure OpenAI via LangChain
│   │   │       └── langchain_openai.py        # GPT via LangChain
│   │   ├── logger.py
│   │   └── schemas.py          # Shared ToolResponse schema
│   ├── mcp/                    # MCP tool server (streamable HTTP)
│   │   ├── tools/
│   │   │   ├── describe_table.py
│   │   │   ├── execute_readonly_query.py
│   │   │   └── list_tables.py
│   │   ├── client.py           # Tool loader + cache
│   │   ├── server.py           # FastMCP server entrypoint
│   │   ├── utils.py
│   │   ├── Dockerfile
│   │   └── requirements.txt
│   ├── postgres/
│   │   └── init/
│   │       ├── 001_schema.sql
│   │       ├── 002_seed_data.sql
│   │       └── 003_roles.sql
│   └── utils/
│       └── parsing.py
├── frontend/
│   └── chatbot/
│       ├── app.py
│       ├── Dockerfile
│       └── requirements.txt
└── tests/
    ├── integration/
    │   └── llm/
    │       ├── test_anthropic_engine.py       # Live API tests for Anthropic engine
    │       ├── test_azure_openai_engine.py    # Live API tests for Azure OpenAI engine
    │       └── test_openai_engine.py          # Live API tests for OpenAI engine
    └── unit/
        ├── agent/
        │   └── test_graph_retry.py
        └── mcp/
            └── test_mcp_server.py
```
`docker-compose.yml` defines four services:

| Service | Port | Description |
|---|---|---|
| `postgres` | 5432 | Analytics database |
| `mcp` | 8001 | MCP tool server (internal network) |
| `agent` | 8000 | LangGraph agent + FastAPI API |
| `frontend` | 3000 | Streamlit chatbot |
```bash
# Start all services
docker compose up -d --build

# Stop
docker compose down
```

The LangGraph flow in `backend/agent/graph.py`:

- `guardrail` — classifies whether the request is safe to process
  - route → `query_database` or `security_warning`
- `query_database` — generates SQL and calls the MCP tool
  - route → `execute_tools` or finish
- `handle_tool_result` — inspects tool output for Postgres errors
  - route → `retry` (up to 3×) or `summarize`
- `summarize` — turns query results into a natural-language answer
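The branching after tool execution can be sketched in plain Python. This is only a sketch: the state keys `tool_error` and `retry_count`, and the constant `MAX_RETRIES`, are assumptions, not the real `AgentState` fields.

```python
MAX_RETRIES = 3  # "up to 3×" per the flow above

def route_tool_execution(state: dict) -> str:
    """Decide the next node after handle_tool_result (hypothetical state keys)."""
    if state.get("tool_error") is None:
        return "summarize"                        # tool succeeded
    if state.get("retry_count", 0) < MAX_RETRIES:
        return "retry"                            # give the LLM a chance to fix the SQL
    return "summarize"                            # retries exhausted: explain the failure
```

In the real graph a function like this would be registered as a conditional edge; it is shown standalone here so the branching is easy to test in isolation.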
`backend/core/llm/` provides a provider-agnostic abstraction over LLM calls used throughout the agent.

`LLMEngine` (in `base.py`) uses the Template Method pattern. Each engine subclass implements `_build_llm()` to construct its provider-specific `BaseChatModel`; the base class stores it and exposes two concrete async methods covering the two usage patterns in the agent:
```python
# Structured classification (e.g. guardrail node)
result = await engine.generate_structured(MySchema, messages)    # returns MySchema instance

# Tool-augmented generation (e.g. query_database node)
response = await engine.generate_with_tools(tools, messages)     # returns BaseMessage
```

The active engine is selected at runtime via the `LLM_ENGINE` environment variable — no code changes required when switching providers.
`EngineFactory` (in `factory.py`) maps string keys to engine classes:
```python
# Explicit engine
engine = EngineFactory.create("langchain-anthropic", temperature=0)

# Engine from LLM_ENGINE env var (used by all agent nodes)
engine = EngineFactory.create_default()
```

| Key | Class | Default model / deployment |
|---|---|---|
| `langchain-openai` | `LangChainOpenAIEngine` | `OPENAI_MODEL` env var / `gpt-4o` |
| `langchain-anthropic` | `LangChainAnthropicEngine` | `ANTHROPIC_MODEL` env var / `claude-sonnet-4-6` |
| `langchain-azure-openai` | `LangChainAzureOpenAIEngine` | `AZURE_OPENAI_DEPLOYMENT` env var |
```bash
# Select the active engine (default: langchain-openai)
LLM_ENGINE=langchain-openai

# OpenAI
OPENAI_API_KEY=...
OPENAI_MODEL=gpt-4o                           # optional

# Anthropic
ANTHROPIC_API_KEY=...
ANTHROPIC_MODEL=claude-sonnet-4-6             # optional

# Azure OpenAI
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_ENDPOINT=https://<resource>.openai.azure.com/
AZURE_OPENAI_DEPLOYMENT=your-deployment-name
AZURE_OPENAI_API_VERSION=2024-08-01-preview   # optional
```

```bash
python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -e ".[dev]"
```

This installs the full runtime and all dev dependencies (pytest, pytest-asyncio, python-dotenv) in one step.
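The editable install above implies a `pyproject.toml` with a `dev` extra roughly like the following sketch. Only the dev dependencies are named in this README; the runtime dependency list is assumed.

```toml
[project]
name = "bi-agent"
version = "0.1.0"
# runtime dependencies omitted here; each service also ships its own requirements.txt

[project.optional-dependencies]
dev = ["pytest", "pytest-asyncio", "python-dotenv"]
```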
Copy `.env.example` to `.env` and fill in your API keys (the file is gitignored):
```bash
cp .env.example .env
```

```bash
# All tests
pytest

# Unit tests only
pytest tests/unit/

# Integration tests only (make real API calls — requires valid keys in .env)
pytest -m integration

# Skip integration tests
pytest -m "not integration"
```

```
tests/
├── unit/
│   ├── agent/
│   │   ├── graph/                        # Full graph wiring and routing
│   │   │   ├── test_happy_path.py        # Tool succeeds on first attempt
│   │   │   ├── test_retry.py             # Tool fails then recovers
│   │   │   └── test_security.py          # Guardrail blocks unsafe request
│   │   └── nodes/                        # Individual node input/output
│   │       ├── test_guardrail.py
│   │       ├── test_handle_tool_result.py
│   │       ├── test_query_database.py
│   │       ├── test_retry.py
│   │       ├── test_security_warning.py
│   │       └── test_summarize.py
│   └── mcp/
│       └── test_mcp_server.py
└── integration/
    └── llm/                              # Live API calls — requires valid keys
        ├── test_openai_engine.py
        ├── test_anthropic_engine.py
        └── test_azure_openai_engine.py
```
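Selecting tests with `-m integration` assumes the marker is registered with pytest, for example via a config fragment like this sketch (the project may declare it in `pyproject.toml` instead):

```ini
[pytest]
markers =
    integration: tests that make live API calls
```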
Unit tests run fully offline with mocked LLM calls and mocked MCP tools. They are split into two levels:

- Node tests (`tests/unit/agent/nodes/`) — call each node function directly and assert on the partial state dict it returns. Fast and focused on a single node's logic.
- Graph tests (`tests/unit/agent/graph/`) — compile and run the full LangGraph workflow. Verify that nodes are wired correctly, state is merged by the reducers, and conditional edges route as expected.
Integration tests (`tests/integration/llm/`) make real API calls to verify that each engine (`langchain-openai`, `langchain-anthropic`, `langchain-azure-openai`) can produce structured output. Each engine has an easy test (maths question with a definitive answer) and a difficult test (open-ended question that validates structure only).
This codebase is designed to remain understandable and extensible as it grows. The decisions below are deliberate applications of standard software design principles.
Every module has one reason to change. Graph nodes (`nodes/`) only transform agent state. Routers (`routers/`) only decide which edge to take. Prompt text lives exclusively in `config/prompts.py`. If the prompt for a node changes, only that file changes — nothing else.
`EngineFactory` uses a registry dict to map string keys to engine classes. Adding a new LLM provider (e.g. Google Gemini) means:

1. Create `engines/langchain_google.py` implementing `LLMEngine`.
2. Add one line to the registry in `factory.py`.
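A sketch of what such a registry-based factory might look like. Everything beyond the names `EngineFactory.create`, `create_default`, and `LLM_ENGINE` (which appear earlier in this README) is an assumption.

```python
import os


class EngineFactory:
    _registry: dict = {}

    @classmethod
    def register(cls, key: str):
        """Decorator: the 'one line' per provider in the registry."""
        def decorator(engine_cls):
            cls._registry[key] = engine_cls
            return engine_cls
        return decorator

    @classmethod
    def create(cls, key: str, **kwargs):
        try:
            return cls._registry[key](**kwargs)
        except KeyError:
            raise ValueError(f"Unknown engine: {key!r}") from None

    @classmethod
    def create_default(cls, **kwargs):
        # Resolve the engine key from the environment at runtime
        return cls.create(os.getenv("LLM_ENGINE", "langchain-openai"), **kwargs)
```

Registering a hypothetical Gemini engine is then just `@EngineFactory.register("langchain-google")` above its class definition.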
No existing code is modified. Callers that already use `EngineFactory.create(...)` are unaffected.
Nodes and services depend on the `LLMEngine` abstraction, never on `ChatOpenAI` or `ChatAnthropic` directly. The concrete engine is injected at the call site via `EngineFactory`. This means the guardrail node, the summariser, and any future node can be tested with a mock engine that implements the same interface — no real API calls required.
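For example, a node test can substitute a mock for the engine. This is a sketch only: the node body, its signature, and the state keys shown here are hypothetical, not the real `nodes/summarize.py`.

```python
import asyncio
from unittest.mock import AsyncMock


# Hypothetical node: takes state plus an injected engine, returns a partial state update
async def summarize(state: dict, engine) -> dict:
    answer = await engine.generate_structured(str, state["query_result"])
    return {"final_answer": answer}


def test_summarize_with_mock_engine():
    engine = AsyncMock()
    engine.generate_structured.return_value = "Revenue grew 12% in Q3."
    update = asyncio.run(summarize({"query_result": [("Q3", 0.12)]}, engine))
    assert update == {"final_answer": "Revenue grew 12% in Q3."}


test_summarize_with_mock_engine()
```

Because the node only sees the `LLMEngine` interface, the mock is a drop-in replacement and the test runs fully offline.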
`LLMEngine` exposes exactly two methods — `generate_structured` and `generate_with_tools` — covering the two distinct patterns the agent needs. Engines are not forced to implement streaming, fine-tuning configuration, or token counting — concerns that belong to other abstractions. Each layer only knows what it needs.
| Layer | Responsibility |
|---|---|
| `frontend` | User input / output rendering |
| `agent/api` | HTTP boundary — deserialise request, serialise response |
| `agent/nodes` | Business logic — state transitions |
| `core/llm` | LLM I/O — prompt rendering, structured output |
| `mcp` | Database I/O — tool definitions and execution |
| `postgres` | Persistence — schema and seed data |
Each layer depends only on the layer below it, never sideways or upward.
The MCP server (`backend/mcp/`) is deployed as a separate Docker service and communicates over streamable HTTP. The agent loads tools at startup via `client.py` and treats them as opaque LangChain tools. This means the set of database tools can evolve — or be replaced entirely with a different data source — without touching the agent graph.
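One of those tools, `execute_readonly_query`, must never run mutating SQL. A first-line check might look like this sketch; this is an assumption about the implementation, and the read-only Postgres role created in `003_roles.sql` would remain the real enforcement.

```python
import re

# Statement keywords that should never appear in an analyst query (illustrative list)
_FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|truncate|grant|create)\b",
    re.IGNORECASE,
)


def is_readonly_sql(sql: str) -> bool:
    """Cheap pre-flight check before handing SQL to the database."""
    stripped = sql.strip().rstrip(";").lower()
    # Allow plain SELECTs and CTEs; reject everything else outright
    if not stripped.startswith(("select", "with")):
        return False
    # Catch mutating keywords hidden inside an otherwise SELECT-shaped query
    return not _FORBIDDEN.search(stripped)
```

Keyword scanning is deliberately conservative (it would also reject a column literally named `delete`), which is an acceptable trade-off for a guardrail that sits in front of a read-only database role.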