Bio-Stat-Agent: Autonomous Biostatistics and Public Health Modeler 📈🩺

Bio-Stat-Agent is an agentic system that autonomously designs and executes biostatistical analyses for public health. Given a high-level research question, it gathers data from public health databases, selects and applies appropriate statistical models, and generates a report with visualizations, so that a complete analysis can be run from a single natural-language prompt.


✨ Features

  • Autonomous Research: The system handles the entire research workflow from a high-level question to a final report.
  • Intelligent Data Retrieval: A Data Retrieval Agent connects to public health databases to gather relevant, up-to-date information.
  • Hypothesis Formulation: A Hypothesis Agent translates natural language research questions into formal statistical hypotheses.
  • Sophisticated Modeling: The Modeling Agent selects and runs appropriate biostatistical analyses, including correlation studies and epidemic modeling.
  • Comprehensive Reporting: The Reporting Agent synthesizes all findings into a clear, visual, and easy-to-understand report.
  • Expert-Level Knowledge: A specialized language model fine-tuned on biostatistical literature provides expert-level domain knowledge.
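To make the "correlation studies and epidemic modeling" item concrete, here is a minimal sketch of the two kinds of analysis in plain Python. It is an illustration only, not the project's Modeling Agent: the function names `pearson_r` and `simulate_sir`, their parameters, and the choice of a forward-Euler SIR model are all assumptions made for this example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length (at least 2 points)")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (sx * sy)


def simulate_sir(s0, i0, r0, beta, gamma, days, dt=0.1):
    """Forward-Euler integration of the SIR compartmental model.

    beta is the transmission rate and gamma the recovery rate (both per day).
    Returns one (S, I, R) tuple per day, including day 0.
    """
    n = s0 + i0 + r0  # total population, constant in the basic SIR model
    s, i, r = float(s0), float(i0), float(r0)
    steps_per_day = round(1 / dt)
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((s, i, r))
    return history
```

For example, `simulate_sir(990, 10, 0, beta=0.3, gamma=0.1, days=30)` yields 31 daily snapshots whose compartments always sum to the starting population of 1000. A production system would more likely rely on an established statistics library than on hand-written routines like these.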

💻 Final Stack

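1. Frontend (Interface & Visualization)

  • Framework: Next.js (App Router)
  • UI Components: Shadcn/UI (components and charts)
  • Styling: Tailwind CSS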

2. Backend Core (Performance & Logic)

  • Language: Rust
  • Web Framework: Actix-web or Axum
  • Async Runtime: Tokio
  • API Client: reqwest (to communicate with the Python Agent Bridge)

3. AI Agent Layer (Orchestration & Reasoning)

  • Agent Framework: smolagents (by Hugging Face)
  • Bridge API: FastAPI (Python)
  • Language Model: Gemma-3 (Fine-tuned for Biostatistics)
  • Inference: Hugging Face Inference Endpoints or local execution.
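The orchestration this layer performs can be pictured as four stages run in sequence, each reading and extending a shared state object. The sketch below is a hypothetical illustration in plain Python and does not use smolagents or FastAPI: `ResearchState`, the stage functions, and the hard-coded sample data are all assumptions, with placeholders standing in for real database queries and model calls.

```python
from dataclasses import dataclass, field


@dataclass
class ResearchState:
    """Hypothetical state object passed from one agent stage to the next."""
    question: str
    dataset: list = field(default_factory=list)
    hypothesis: str = ""
    result: dict = field(default_factory=dict)
    report: str = ""


def retrieve_data(state):
    # Placeholder: a real Data Retrieval Agent would query a public health database.
    state.dataset = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]
    return state


def formulate_hypothesis(state):
    # Placeholder: a real Hypothesis Agent would ask the language model.
    state.hypothesis = (
        f"H0: no association for '{state.question}'; H1: an association exists"
    )
    return state


def run_model(state):
    # Placeholder analysis: ordinary least-squares slope of y on x.
    xs = [x for x, _ in state.dataset]
    ys = [y for _, y in state.dataset]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    state.result = {"slope": slope, "n": len(xs)}
    return state


def write_report(state):
    state.report = "\n".join([
        f"Question: {state.question}",
        f"Hypothesis: {state.hypothesis}",
        f"Estimated slope: {state.result['slope']:.2f} (n={state.result['n']})",
    ])
    return state


PIPELINE = [retrieve_data, formulate_hypothesis, run_model, write_report]


def run_pipeline(question):
    """Run every stage in order and return the final state."""
    state = ResearchState(question=question)
    for stage in PIPELINE:
        state = stage(state)
    return state
```

Calling `run_pipeline("Does exposure relate to incidence?").report` returns a three-line text summary. Keeping each stage as a function over one state object makes it straightforward to swap a placeholder for a real agent call without touching the other stages.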

4. Data Layer (The Triple-Threat Storage)

  • Metadata DB: MongoDB (Stores user profiles, research history, and agent logs).
  • Vector DB: Qdrant (Stores embedded medical literature and documentation for RAG).
  • OLAP DB: ClickHouse (Runs analytical and statistical queries over large public health datasets).

5. DevOps & Infrastructure

  • Containerization: Docker & Docker Compose
  • Environment: Node.js (Frontend), Cargo (Rust), Python 3.10+ (Agents).

🚀 Getting Started

Prerequisites

  • Rust
  • Node.js (for Next.js)
  • Python 3.10+ (for smolagents)

Installation

  1. Clone the repository:
    git clone https://github.com/saadsalmanakram/Bio-Stat-Agent.git
    cd Bio-Stat-Agent
  2. Set up the backend (Cargo.toml lives in backend-core/):
    cd backend-core
    cargo build --release
  3. Set up the frontend:
    cd ../frontend
    npm install
  4. Set up the agents by installing the Python dependencies listed in ai-agents/requirements.txt:
    cd ../ai-agents
    pip install -r requirements.txt

Configuration

Create a .env file to store the API keys and configuration variables for the data sources and the language model. The project structure places this file at backend-core/.env.
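The README does not list the variables the project expects, so the sketch below only illustrates the general pattern of reading a .env file and checking it for missing entries. The key names in `REQUIRED_KEYS` are hypothetical placeholders, not names confirmed by the project, and the parser handles only simple `KEY=value` lines.

```python
# Hypothetical key names, chosen for illustration only.
REQUIRED_KEYS = ("MONGODB_URI", "QDRANT_URL", "CLICKHOUSE_URL", "HF_API_TOKEN")


def parse_env(text):
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]
```

Running `missing_keys(parse_env(open(".env").read()))` before starting the services gives an early, readable error list instead of a failure deep inside an agent run.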

Usage

Start the databases with Docker Compose, then run the FastAPI agent bridge (ai-agents/main.py), the Rust backend, and the Next.js server. Once all services are up, submit a research question through the user interface.


The directory structure below reflects the hybrid Rust/Python/Node.js stack, the agent orchestration layer, and the containerized database layer.

📂 Final Project Structure

Bio-Stat-Agent/
├── frontend/                       # Next.js App Router (UI)
│   ├── src/
│   │   ├── app/                    # Routes (Home, Dashboard, Reports)
│   │   │   ├── report/[id]/        # Dynamic report viewer
│   │   │   └── page.tsx            # Main research input
│   │   ├── components/             # Shadcn/UI components & Charts
│   │   ├── hooks/                  # API fetch logic
│   │   └── lib/                    # Utils (Markdown parser, types)
│   ├── public/                     # Static assets & saved plots
│   ├── tailwind.config.ts
│   └── package.json
│
├── backend-core/                   # Rust High-Performance Core
│   ├── src/
│   │   ├── main.rs                 # Actix-web/Axum Entry Point
│   │   ├── agents/
│   │   │   └── bridge.rs           # Communication logic with Python
│   │   ├── stats/                  # High-speed math modules
│   │   ├── db/                     # MongoDB & Qdrant clients
│   │   └── models/                 # Rust structs for hypotheses/data
│   ├── Cargo.toml
│   └── .env
│
├── ai-agents/                      # Python Agent Orchestration
│   ├── main.py                     # FastAPI Bridge & Orchestrator
│   ├── agents.py                   # Data Retrieval Agent
│   ├── hypothesis_agent.py         # Hypothesis Generation Agent
│   ├── modeling_agent.py           # Statistical Modeling Agent
│   ├── reporting_agent.py          # Report Synthesis Agent
│   ├── tools.py                    # Custom Python tools (CDC API, etc.)
│   ├── temp_data.csv               # Cache for active analysis
│   └── requirements.txt
│
├── docker-compose.yml              # MongoDB, Qdrant, and ClickHouse
└── README.md                       # Documentation
