A system of specialized AI agents that work together to analyze databases and create visualizations. Ask questions in natural language, get SQL-powered insights with charts.
- CLI & Web UI - Typer CLI + Streamlit interface
- User Permissions - YAML-based access control to specific databases
- Smart SQL - Automatic error correction with LLM (up to 3 retries)
- Company Branding - Configurable chart styles, colors, and metadata injection
- Automated Tests & Evals - Evaluation framework covering infrastructure, qualitative, and quantitative analysis
- Type-Safe Agents - Structured JSON outputs with Pydantic validation
- Extended README - Detailed CLI reference, configuration, troubleshooting
- Architecture & Design - Implementation details, guardrails, design decisions, and how to add a new database and its configuration
- Deliverables - How assignment requirements are met
- Limitations - Current constraints and future improvements
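The YAML-based access control might look something like the following minimal sketch. The file layout and key names here are assumptions for illustration, not the project's actual schema:

```python
# A permissions file (e.g. loaded with yaml.safe_load) might look like:
#
#   users:
#     alice:
#       databases: [sales_demo, hr_demo]
#     bob:
#       databases: [sales_demo]
#
# After parsing, that YAML becomes the dict below:
config = {
    "users": {
        "alice": {"databases": ["sales_demo", "hr_demo"]},
        "bob": {"databases": ["sales_demo"]},
    }
}

def can_access(config: dict, user: str, database: str) -> bool:
    """Deny by default: unknown users or unlisted databases are rejected."""
    allowed = config.get("users", {}).get(user, {}).get("databases", [])
    return database in allowed

print(can_access(config, "alice", "hr_demo"))  # True
print(can_access(config, "bob", "hr_demo"))    # False
```

Denying by default keeps a typo in a username from silently granting access.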
```bash
# 1. Install dependencies
uv sync

# 2. Set API key
export OPENROUTER_API_KEY="your-key-here"
# It can also go in a .env file; see .env.example for the structure.

# 3. Run tests
uv run pytest tests/ -v

# 4. Try a query
uv run python -m src.cli analyze "What are the top 5 products by revenue?" --database sales_demo
```

```bash
# List available databases
uv run python -m src.cli list-databases

# Ask a question (auto-generates SQL, runs the query, creates a viz if appropriate)
uv run python -m src.cli analyze "Show me sales by category" --user alice --database sales_demo

# View a database schema
uv run python -m src.cli schema sales_demo

# Create a standalone visualization
uv run python -m src.cli visualize examples/sample_analysis_result.json

# Launch the web interface
uv run streamlit run app.py
```

The system follows a deterministic 6-step pipeline:
1. Permission Check → Validate user can access database
2. Schema Retrieval → Get table/column information + company metadata
3. Clarity Check → Structured JSON output (ClarityResult: can we answer?)
4. Complexity Classification → Structured JSON output (ComplexityResult: simple vs complex)
5. Analysis → ReAct-style CodeAgent that generates and executes SQL (with auto-correction) and produces a summary, followed by guardrails
6. Visualization → CodeAgent creates charts if appropriate (with company style)
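The auto-correcting SQL step (step 5) can be sketched as a bounded retry loop that feeds the database error back to the LLM. The function names (`generate_sql`, `run_query`) are stand-ins, not the project's real API:

```python
# Hedged sketch of step 5: up to MAX_RETRIES attempts, with the previous
# database error passed back to the LLM so it can correct its SQL.
MAX_RETRIES = 3

def analyze_with_retries(question, run_query, generate_sql):
    """Ask the LLM for SQL, execute it, and feed errors back on failure."""
    error = None
    for _ in range(MAX_RETRIES):
        sql = generate_sql(question, previous_error=error)
        try:
            return run_query(sql)
        except Exception as exc:
            error = str(exc)  # shown to the LLM on the next attempt
    raise RuntimeError(f"SQL failed after {MAX_RETRIES} attempts: {error}")

# Demo with stubs: the first attempt produces broken SQL, the second succeeds.
attempts = []

def fake_generate(question, previous_error=None):
    attempts.append(previous_error)
    return "SELECT oops" if previous_error is None else "SELECT 1"

def fake_run(sql):
    if "oops" in sql:
        raise ValueError("no such column: oops")
    return [(1,)]

print(analyze_with_retries("q", fake_run, fake_generate))  # [(1,)]
```

Capping retries keeps a persistently confused model from looping forever against the database.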
The project includes automated testing:
Infrastructure Tests:
- Permission system validation
- Database operations
- Visualization generation
Agent Evaluation:
- Clarity Agent: 10 test cases → 100% accuracy
- Complexity Agent: 14 test cases → 100% accuracy
- SQL Generation: 20 test cases → Flexible/Strict matching
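The "Flexible/Strict" matching above might be implemented along these lines. The normalization rules here are illustrative assumptions, not the project's actual harness:

```python
import re

def strict_match(generated: str, expected: str) -> bool:
    """Exact comparison, modulo surrounding whitespace."""
    return generated.strip() == expected.strip()

def flexible_match(generated: str, expected: str) -> bool:
    """Ignore case and whitespace differences between SQL strings."""
    norm = lambda s: re.sub(r"\s+", " ", s.strip().lower())
    return norm(generated) == norm(expected)

a = "SELECT name,\n  SUM(revenue)\nFROM sales GROUP BY name"
b = "select name, sum(revenue) from sales group by name"
print(strict_match(a, b))    # False
print(flexible_match(a, b))  # True
```

Flexible matching avoids penalizing a correct query for formatting choices, while strict matching catches any deviation at all.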
Run all tests:
```bash
# Quick tests (infrastructure only)
uv run pytest tests/ -v

# Full evaluation (includes LLM-based agent tests, ~30-60 min)
uv run python scripts/run_tests.py

# Specific agent evaluations
uv run python -m tests.evaluation.evaluate_clarity
uv run python -m tests.evaluation.evaluate_complexity
```

- Example-based prompting (few-shot). The current prompt contains no examples, but in the future one could fetch similar past queries (or use a more involved hydra-like mechanism)
- Longer-term memory / personalization. It could also be useful to build up metadata around each database
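The "fetch similar queries" idea could be sketched with stdlib string similarity as a stand-in for a real embedding-based retriever. The history store and function names below are assumptions:

```python
from difflib import SequenceMatcher

# A hypothetical store of past (question, sql) pairs to draw examples from.
HISTORY = [
    ("top products by revenue", "SELECT product, SUM(revenue) FROM sales GROUP BY product"),
    ("sales by category", "SELECT category, SUM(amount) FROM sales GROUP BY category"),
]

def fetch_examples(question: str, k: int = 1):
    """Return the k most similar past pairs, for injection as few-shot examples."""
    scored = sorted(
        HISTORY,
        key=lambda pair: SequenceMatcher(None, question, pair[0]).ratio(),
        reverse=True,
    )
    return scored[:k]

print(fetch_examples("show sales by category")[0][0])  # sales by category
```

A production version would likely swap `SequenceMatcher` for vector similarity over embeddings, but the retrieval-then-inject shape stays the same.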
Built with smolagents, LiteLLM, SQLAlchemy, Matplotlib, Typer, and Streamlit.
