Bio-Stat-Agent is an agentic system that autonomously designs and executes biostatistical analyses for public health. Given a high-level research question, it gathers data from public health databases, selects and applies appropriate statistical models, and generates a report with visualizations, making complex public health research more accessible and efficient.
- Autonomous Research: The system handles the entire research workflow from a high-level question to a final report.
- Intelligent Data Retrieval: A Data Retrieval Agent connects to public health databases to gather relevant, up-to-date information.
- Hypothesis Formulation: A Hypothesis Agent translates natural language research questions into formal statistical hypotheses.
- Sophisticated Modeling: The Modeling Agent selects and runs appropriate biostatistical analyses, including correlation studies and epidemic modeling.
- Comprehensive Reporting: The Reporting Agent synthesizes all findings into a clear, visual, and easy-to-understand report.
- Expert-Level Knowledge: A specialized language model fine-tuned on biostatistical literature provides expert-level domain knowledge.
- Framework: Next.js 14/15 (App Router)
- Language: TypeScript
- Styling: Tailwind CSS
- UI Components: Radix UI / shadcn/ui
- Data Visualization: Recharts (for interactive dashboards)
- Markdown Rendering: react-markdown (to display agent-generated reports)
- Language: Rust
- Web Framework: Actix-web or Axum
- Concurrency: Tokio
- API Client: reqwest (to communicate with the Python Agent Bridge)
- Agent Framework: smolagents (by Hugging Face)
- Bridge API: FastAPI (Python)
- Language Model: Gemma-3 (Fine-tuned for Biostatistics)
- Inference: Hugging Face Inference Endpoints or local execution.
- Metadata DB: MongoDB (Stores user profiles, research history, and agent logs).
- Vector DB: Qdrant (Stores embedded medical literature and documentation for RAG).
- OLAP DB: ClickHouse (Runs fast analytical queries over large public health datasets).
- Containerization: Docker & Docker Compose
- Environment: Node.js (Frontend), Cargo (Rust), Python 3.10+ (Agents).
- Rust
- Node.js (for Next.js)
- Python 3.10+ (for smolagents)
- Clone the repository:

      git clone https://github.com/saadsalmanakram/Bio-Stat-Agent.git
      cd Bio-Stat-Agent

- Set up the backend:

      cargo build --release

- Set up the frontend:

      cd frontend
      npm install

- Configure the agents: Follow the instructions to set up your chosen agent orchestration framework.

Create a .env file to store your API keys and configuration variables for data sources and the language model.
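
As a rough illustration of how the Python agents might read that file, the sketch below parses simple `KEY=VALUE` lines using only the standard library. The variable names (`HF_API_TOKEN`, `MONGODB_URI`, `QDRANT_URL`, `CLICKHOUSE_URL`) are assumptions chosen to match the stack listed earlier; the project does not specify its actual keys, and a dedicated library such as python-dotenv may be preferable in practice.

```python
import os
from pathlib import Path

# Hypothetical variable names; the real keys depend on the project's configuration.
REQUIRED_KEYS = ("HF_API_TOKEN", "MONGODB_URI", "QDRANT_URL", "CLICKHOUSE_URL")


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines, # comments, and lines without '='."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict) -> list:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


def load_env(path: str = ".env") -> dict:
    """Load settings from a .env file, letting real environment variables take precedence."""
    env_file = Path(path)
    values = parse_env(env_file.read_text(encoding="utf-8")) if env_file.exists() else {}
    for key in REQUIRED_KEYS:
        if key in os.environ:
            values[key] = os.environ[key]
    absent = missing_keys(values)
    if absent:
        raise RuntimeError(f"Missing configuration: {', '.join(absent)}")
    return values
```

Failing early on missing keys keeps a misconfigured agent from starting and then erroring partway through an analysis.
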
Run the Next.js server and the Rust backend to start the agent, then submit a research question via the user interface.
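
Behind the user interface, a submitted question presumably reaches the Rust backend as an HTTP request. The sketch below shows what such a request could look like when sent from a script instead of the UI. The base URL `http://localhost:8080`, the route `/api/research`, and the `{"question": ...}` payload shape are all assumptions for illustration; the source does not document the backend's port or endpoints.

```python
import json
from urllib import request

# Assumed values: the backend's actual port and routes are not documented here.
BACKEND_URL = "http://localhost:8080"
RESEARCH_PATH = "/api/research"


def build_request(question: str, base_url: str = BACKEND_URL) -> request.Request:
    """Build a JSON POST request carrying the research question."""
    payload = json.dumps({"question": question}).encode("utf-8")
    return request.Request(
        url=base_url.rstrip("/") + RESEARCH_PATH,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_question(question: str, base_url: str = BACKEND_URL, timeout: float = 30.0) -> dict:
    """Send the question and decode the JSON response (requires a running backend)."""
    with request.urlopen(build_request(question, base_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With both servers running, `submit_question("Is air pollution associated with asthma hospitalizations?")` would return whatever JSON the backend responds with, under the assumptions stated above.
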
The directory structure for Bio-Stat-Agent is shown below. It reflects the hybrid Rust/Python/Node.js stack, the agent orchestration layer, and the containerized database layer.
Bio-Stat-Agent/
├── frontend/                  # Next.js App Router (UI)
│   ├── src/
│   │   ├── app/               # Routes (Home, Dashboard, Reports)
│   │   │   ├── report/[id]/   # Dynamic report viewer
│   │   │   └── page.tsx       # Main research input
│   │   ├── components/        # Shadcn/UI components & Charts
│   │   ├── hooks/             # API fetch logic
│   │   └── lib/               # Utils (Markdown parser, types)
│   ├── public/                # Static assets & saved plots
│   ├── tailwind.config.ts
│   └── package.json
│
├── backend-core/              # Rust High-Performance Core
│   ├── src/
│   │   ├── main.rs            # Actix-web/Axum Entry Point
│   │   ├── agents/
│   │   │   └── bridge.rs      # Communication logic with Python
│   │   ├── stats/             # High-speed math modules
│   │   ├── db/                # MongoDB & Qdrant clients
│   │   └── models/            # Rust structs for hypotheses/data
│   ├── Cargo.toml
│   └── .env
│
├── ai-agents/                 # Python Agent Orchestration
│   ├── main.py                # FastAPI Bridge & Orchestrator
│   ├── agents.py              # Data Retrieval Agent
│   ├── hypothesis_agent.py    # Hypothesis Generation Agent
│   ├── modeling_agent.py      # Statistical Modeling Agent
│   ├── reporting_agent.py     # Report Synthesis Agent
│   ├── tools.py               # Custom Python tools (CDC API, etc.)
│   ├── temp_data.csv          # Cache for active analysis
│   └── requirements.txt
│
├── docker-compose.yml         # MongoDB, Qdrant, and ClickHouse
└── README.md                  # Documentation