Multi-Agent Healthcare Document Intelligence

A proof-of-concept multi-agent AI system that dynamically plans and executes document processing workflows for healthcare data. Upload a clinical PDF or Excel questionnaire — a pipeline of specialized agents extracts, classifies, and processes the content in real time, with every agent handoff and tool call visible in the browser as it happens.

What It Does

Input	Output
Clinical PDF (research paper, discharge note, etc.)	Concise prose summary rendered as HTML
Excel questionnaire / intake form	Numbered list of extracted questions

The system detects the document type automatically and routes it to the right agent — no manual selection required.
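
As a rough illustration of the classification step, the sketch below maps a free-text model reply onto one of the two intents named in the architecture (prose or questionnaire). The `Intent` enum, the `parse_intent` name, and the assumption that the model replies with a single label are all hypothetical; the project's actual planner may parse its reply differently.

```python
from enum import Enum


class Intent(str, Enum):
    PROSE = "prose"
    QUESTIONNAIRE = "questionnaire"


def parse_intent(llm_reply: str) -> Intent:
    """Map a classifier reply onto one of the two intents.

    Assumption: the prompt asks the model to answer with a single label.
    Anything unrecognised raises instead of guessing a route.
    """
    # Drop surrounding whitespace, quotes, backticks, and a trailing period.
    label = llm_reply.strip().strip("\"'`.").lower()
    for intent in Intent:
        if label == intent.value:
            return intent
    raise ValueError(f"unrecognised intent label: {llm_reply!r}")
```

Raising on an unknown label keeps a malformed reply from silently taking the wrong path.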

Architecture

Browser (HTMX + Bootstrap 5)
    │  WebSocket (Django Channels)
    ▼
Django View ──► Celery Task (Orchestrator)
                    │
                    ├─► Reader Agent ──► FastMCP Server
                    │       └── extract_pdf / extract_excel tools
                    │
                    ├─► Planner Agent ──► OpenRouter LLM
                    │       └── classifies intent: prose | questionnaire
                    │
                    ├─► Summarizer Agent ──► OpenRouter LLM  (prose path)
                    │
                    └─► Question Extractor Agent ──► OpenRouter LLM  (questionnaire path)
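
The flow in the diagram can be sketched as a small dispatch function: read, classify, then hand the text to whichever agent is registered for the intent. This is only a sketch under assumptions. `run_pipeline`, the `Agent` type alias, and the `handlers` mapping are illustrative names, and the agents are plain callables here rather than the MCP and LLM clients the real orchestrator uses.

```python
from typing import Callable, Dict

# Stand-in for an agent: takes text, returns text. The real agents call the
# MCP server or the LLM gateway instead of running in-process.
Agent = Callable[[str], str]


def run_pipeline(
    path: str,
    reader: Agent,
    planner: Agent,
    handlers: Dict[str, Agent],
) -> str:
    """Read the file, classify its intent, and run the matching agent."""
    text = reader(path)
    intent = planner(text)
    try:
        handler = handlers[intent]
    except KeyError:
        raise ValueError(f"no agent registered for intent {intent!r}") from None
    return handler(text)
```

Passing the agents in as arguments makes the routing testable with stubs, for example `run_pipeline("intake.xlsx", reader=..., planner=lambda text: "questionnaire", handlers={...})`.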

Key design decisions:

No JSON to the frontend. All state is communicated as server-rendered HTML fragments pushed over WebSocket via HTMX OOB swaps.
MCP for file I/O. The Reader Agent never reads files directly — it delegates to a sandboxed FastMCP server over HTTP. This keeps the agent layer free of filesystem concerns.
A2A messaging visible in the UI. Every agent handoff is logged in a real-time protocol viewer so the execution trace is fully observable.
Deterministic errors skip retries. Empty files, unsupported formats, and MCP responses beginning with ERROR: surface immediately rather than consuming the Celery retry budget.
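
The retry decision can be illustrated with two exception classes and a small retry loop. This is a minimal sketch, not the project's code: `DeterministicError`, `TransientError`, `check_reader_output`, and `run_with_retries` are hypothetical names, and the loop stands in for Celery's retry machinery (where the same split is commonly expressed by listing only transient exception types in a task's `autoretry_for` option).

```python
class DeterministicError(Exception):
    """A failure that would recur identically on every retry."""


class TransientError(Exception):
    """A failure that may succeed on retry, such as a timeout."""


def check_reader_output(raw: str) -> str:
    """Validate the text returned by an extraction tool.

    Assumption: the tool signals failure in-band with a leading "ERROR:".
    """
    text = raw.strip()
    if not text:
        raise DeterministicError("the file contained no extractable text")
    if text.startswith("ERROR:"):
        raise DeterministicError(text[len("ERROR:"):].strip() or "extraction failed")
    return text


def run_with_retries(step, max_retries: int = 3):
    """Call step(); retry only TransientError and let everything else escape."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return step()
        except TransientError:
            if attempts > max_retries:
                raise
```

A `DeterministicError` propagates on the first attempt, so an empty upload reaches the error card without waiting through any retries.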

Tech Stack

Layer	Technology
Backend framework	Django 5 + Cookiecutter Django
Task queue	Celery + Redis
WebSocket	Django Channels (ASGI)
Frontend	HTMX + Bootstrap 5 (zero custom JS for state)
MCP server	FastMCP v2 (HTTP transport)
LLM gateway	OpenRouter (`mistralai/mistral-small-3.2-24b-instruct` → `deepseek/deepseek-chat` fallback)
PDF extraction	pymupdf4llm
Excel extraction	openpyxl
Markdown rendering	markdown-it-py
Database	PostgreSQL
Python	3.14
Containerisation	Docker Compose
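
The primary-to-fallback arrangement in the LLM gateway row might look roughly like the following. The model identifiers come from the table; everything else is assumed for illustration, including the `complete_with_fallback` name and the `call_model(model, prompt)` transport function, which in practice would wrap an HTTP request to the gateway.

```python
from typing import Callable, Sequence

# Model identifiers from the Tech Stack table, in preference order.
MODELS = (
    "mistralai/mistral-small-3.2-24b-instruct",
    "deepseek/deepseek-chat",
)


class AllModelsFailed(RuntimeError):
    """Raised when every model in the chain has errored."""


def complete_with_fallback(
    prompt: str,
    call_model: Callable[[str, str], str],
    models: Sequence[str] = MODELS,
) -> str:
    """Try each model in order and return the first successful reply.

    call_model is a hypothetical transport function supplied by the caller.
    """
    errors = []
    for model in models:
        try:
            return call_model(model, prompt)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(f"{model}: {exc}")
    raise AllModelsFailed("; ".join(errors))
```

Collecting the per-model error messages means the final exception explains why each model in the chain was skipped.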

Project Structure

multi_agent/
├── config/                    # Django settings, ASGI, Celery config
├── multi_agent/
│   ├── orchestrator/
│   │   ├── agents/
│   │   │   ├── reader_agent.py        # MCP client: file extraction
│   │   │   ├── planner_agent.py       # LLM intent classification
│   │   │   ├── summarizer_agent.py    # LLM prose summarization
│   │   │   ├── extractor_agent.py     # LLM question extraction
│   │   │   ├── llm_client.py          # OpenRouter primary/fallback routing
│   │   │   └── ws_helpers.py          # WebSocket HTML push helpers
│   │   ├── tasks.py                   # Celery orchestration entry point
│   │   ├── consumers.py               # Django Channels WebSocket consumer
│   │   └── views.py                   # Upload endpoint
│   └── templates/orchestrator/
│       ├── dashboard.html
│       └── partials/                  # HTMX OOB swap fragments
├── mcp_server/
│   └── server.py                      # FastMCP tool definitions
└── tests/orchestrator/                # pytest test suite (94 tests)

Getting Started

Prerequisites

Docker and Docker Compose
An OpenRouter API key

1. Clone and configure

git clone <repo-url>
cd multi_agent
cp .envs/.local/.django.example .envs/.local/.django   # if example exists

Add your OpenRouter key to .envs/.local/.django:

OPENROUTER_API_KEY=sk-or-...

2. Build and start

docker-compose -f docker-compose.local.yml up --build

This starts five services: django, celery_worker, postgres, redis, mcp_server.

3. Open the app

Navigate to http://localhost:8000 and upload a PDF or Excel file.

Running Tests

All tests run inside Docker:

docker-compose -f docker-compose.local.yml run --rm django pytest tests/ -v

The suite covers all agent logic, template rendering, WebSocket status flows, and error-handling paths (94 tests).

How the Real-Time UI Works

Upload triggers a Celery task and returns an HTMX fragment that opens a WebSocket connection.
Each agent pushes HTML fragments to a Redis channel group (orchestrator_{task_id}).
Django Channels forwards those fragments to the browser over WebSocket.
HTMX applies them as out-of-band swaps — agent status cards update in place, the protocol log streams new entries, and the output panel is replaced with the final result.
The WebSocket closes automatically after the final output is delivered.
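
The steps above can be sketched from the worker's side as follows. Only the `orchestrator_{task_id}` group name comes from this README. The element id scheme, the CSS classes, and the `push.html` message type (which Django Channels would dispatch to a consumer method named `push_html`) are assumptions made for illustration.

```python
from html import escape


def group_name(task_id: str) -> str:
    """Channel group name, following the orchestrator_{task_id} pattern."""
    return f"orchestrator_{task_id}"


def status_card(agent: str, status: str, detail: str = "") -> str:
    """Render one agent status card as an out-of-band swap fragment.

    The id scheme ("agent-<name>") and the class names are assumptions.
    hx-swap-oob="true" asks HTMX to replace the element with the same id.
    """
    body = escape(status) + (f": {escape(detail)}" if detail else "")
    return (
        f'<div id="agent-{escape(agent)}" hx-swap-oob="true" '
        f'class="card agent-{escape(status)}">{body}</div>'
    )


def group_message(html: str) -> dict:
    """Build the channel-layer message that carries a fragment.

    Assumption: the consumer defines a handler method called push_html.
    """
    return {"type": "push.html", "html": html}
```

In a Channels project the worker would then send the message with something like `async_to_sync(get_channel_layer().group_send)(group_name(task_id), group_message(fragment))`, and the consumer handler would forward `event["html"]` to the browser unchanged. Escaping the detail text matters because it can contain content derived from the uploaded document.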

Workflow Examples

Clinical PDF (prose)

Reader  →  active (extracting PDF via MCP)
Reader  →  completed (12,340 chars extracted)
Planner →  active (classifying intent)
Planner →  completed (intent: prose)
Summarizer → active (generating clinical summary)
Summarizer → completed
Output panel: rendered markdown summary

Excel questionnaire

Reader    →  active (extracting Excel via MCP)
Reader    →  completed (890 chars extracted)
Planner   →  active (classifying intent)
Planner   →  completed (intent: questionnaire)
Extractor →  active (extracting questions)
Extractor →  completed (8 questions found)
Output panel: numbered question list

Empty or unreadable file

Reader → error (MCP returns ERROR: … or empty response)
Output panel: error card with human-readable message

No retry delay — deterministic file errors surface immediately.

License

MIT
