A backend-first AI medical platform that provides safe, educational health support across multiple medical departments. Built with FastAPI and OpenRouter, plus a RAG pipeline using ChromaDB and HuggingFace embeddings, with MongoDB for persistent session storage.
This platform is for general educational purposes only. It is not a diagnostic tool, does not replace a licensed physician, and does not provide treatment or medication advice. Always consult a qualified healthcare professional for medical concerns.
## Features

- Conversational chat with multi-turn memory (in-memory, per session)
- Multi-tenant support: each department has its own knowledge base and prompt
- LLM via OpenRouter (any model supported)
- PDF ingestion pipeline: upload medical documents per department
- RAG (Retrieval-Augmented Generation): answers grounded in your documents
- ChromaDB local vector store (one collection per tenant)
- HuggingFace local embeddings (multilingual: Arabic + English)
- MongoDB for persistent session storage (via Docker)
- Emergency escalation for red-flag symptoms
- Out-of-scope refusal for unsafe requests
- Plain HTML/CSS/JS frontend served by FastAPI
## Supported Departments

| Tenant ID | Department |
|---|---|
| `liver` | Liver Care / Hepatology |
| `cardiology` | Cardiology |
| `nephrology` | Nephrology |

Tenants are configured via `ALLOWED_TENANTS` in `.env`.
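
As a rough illustration of how such an allowlist might be applied, the sketch below parses a JSON-style `ALLOWED_TENANTS` value and checks an incoming tenant ID against it. The function names `parse_allowed_tenants` and `validate_tenant` are hypothetical and not taken from the project; the real check presumably happens somewhere in the FastAPI layer.

```python
import json
from typing import List, Optional


def parse_allowed_tenants(raw: str) -> List[str]:
    """Parse a JSON list such as '["liver","cardiology"]' into tenant IDs."""
    tenants = json.loads(raw)
    if not isinstance(tenants, list) or not all(isinstance(t, str) for t in tenants):
        raise ValueError("ALLOWED_TENANTS must be a JSON list of strings")
    return [t.strip().lower() for t in tenants]


def validate_tenant(tenant_id: Optional[str], allowed: List[str]) -> str:
    """Return the normalized tenant ID, or raise if it is missing or unknown."""
    if not tenant_id:
        raise ValueError("Missing X-Tenant-ID header")
    normalized = tenant_id.strip().lower()
    if normalized not in allowed:
        raise ValueError(f"Unknown tenant: {tenant_id!r}")
    return normalized


# Value mirroring the default ALLOWED_TENANTS shown in this README.
ALLOWED = parse_allowed_tenants('["liver","cardiology","nephrology"]')
```

Normalizing case and whitespace is a design choice of this sketch, not something the source confirms; a stricter service could reject anything that is not an exact match.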
## Project Structure

```
medical-platform/
├── docker/
│   ├── docker-compose.yml            # MongoDB container setup
│   └── mongodb_data/                 # Persistent MongoDB data volume
├── src/
│   ├── backend/
│   │   ├── main.py                   # FastAPI app entrypoint
│   │   ├── core/
│   │   │   ├── config.py             # App settings via pydantic-settings
│   │   │   ├── logger.py             # Centralized file + console logging
│   │   │   └── prompts.py            # Dynamic tenant-aware prompt builder
│   │   ├── database/
│   │   │   └── mongodb.py            # MongoDB client + collections + connection check
│   │   ├── enums/
│   │   │   ├── chat.py               # MessageRole enum
│   │   │   └── responses.py          # ResponseSignal error codes
│   │   ├── schemas/
│   │   │   └── chat.py               # Request/response Pydantic models
│   │   ├── routers/
│   │   │   ├── chat.py               # Chat endpoints
│   │   │   └── ingestion.py          # PDF upload & management endpoints
│   │   ├── services/
│   │   │   ├── chat_service.py       # Session management + LLM orchestration
│   │   │   ├── ingestion_service.py  # PDF ingestion pipeline
│   │   │   └── orchestrator.py       # RAG orchestrator
│   │   ├── providers/
│   │   │   ├── llm_provider.py       # OpenRouter API wrapper
│   │   │   ├── embeddings.py         # HuggingFace local embeddings
│   │   │   └── vector_store.py       # ChromaDB multi-tenant wrapper
│   │   └── utils/
│   │       ├── disk.py               # File/disk utilities
│   │       └── pdf_processor.py      # PDF text extraction + chunking
│   └── frontend/
│       ├── index.html                # Chat UI with department selector
│       ├── style.css                 # Styling
│       └── config.json               # Frontend config (API URL)
├── requirements.txt
├── .env.example
└── README.md
```
---
## Quickstart
### 1. Clone & navigate
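```bash
git clone <your-repo-url>
cd medical-platform
```

### 2. Create a conda environment

```bash
conda create -n medical-platform python=3.11 -y
conda activate medical-platform
```

### 3. Install dependencies

```bash
pip install -r requirements.txt
```

### 4. Configure environment variables

```bash
cp .env.example .env
# Edit .env and add your OPENROUTER_API_KEY
```

### 5. Start MongoDB

```bash
cd docker
docker compose up -d
cd ..
```

### 6. Run the server

```bash
cd src/backend
uvicorn main:app --reload --host 0.0.0.0 --port 8000
```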
---
## API Reference
> All endpoints require the `X-Tenant-ID` header.
> Example: `X-Tenant-ID: liver`
---
### Chat
#### `POST /api/v1/chat`
**Headers:**

```
X-Tenant-ID: liver
```

**Request:**

```json
{
  "session_id": "user-abc-123",
  "message": "What foods should I avoid with liver disease?"
}
```

**Response:**

```json
{
  "session_id": "user-abc-123",
  "reply": "For liver disease, it is generally recommended to avoid...",
  "turn_count": 2
}
```

Clear conversation history for a session.

Health check endpoint.

### Ingestion

#### `POST /api/v1/ingestion/upload`

Upload a PDF and add it to the tenant's vector store.

**Headers:**

```
X-Tenant-ID: liver
```

**Request:** `multipart/form-data`

- `file`: PDF file

**Response:**

```json
{
  "tenant_id": "liver",
  "file_name": "liver_guidelines.pdf",
  "chunks_count": 9,
  "status": "success"
}
```

Delete a specific document from the tenant's vector store.

**Headers:**

```
X-Tenant-ID: liver
```

**Response:**

```json
{
  "status": "success",
  "message": "Document 'liver_guidelines.pdf' deleted from tenant 'liver'."
}
```

Get the number of chunks stored for a tenant.

**Headers:**

```
X-Tenant-ID: liver
```

**Response:**

```json
{
  "tenant_id": "liver",
  "chunks_in_store": 9
}
```

## Multi-Tenancy

Each department (tenant) has:
- Its own ChromaDB collection: documents are isolated per tenant
- Its own system prompt: the LLM is specialized per department
- Its own document knowledge base: upload PDFs per department

The tenant is identified via the `X-Tenant-ID` header in every request.

```
X-Tenant-ID: liver       → liver collection + liver prompt
X-Tenant-ID: cardiology  → cardiology collection + cardiology prompt
X-Tenant-ID: nephrology  → nephrology collection + nephrology prompt
```
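
To make the header-based routing concrete, here is a small client-side sketch that builds a `POST /api/v1/chat` request for a given tenant using only the standard library. The helper name `build_chat_request`, the base URL `http://localhost:8000` (taken from the Quickstart port), and the sample message are assumptions for illustration.

```python
import json
import urllib.request


def build_chat_request(base_url: str, tenant_id: str, session_id: str, message: str) -> urllib.request.Request:
    """Build (but do not send) a chat request scoped to one tenant."""
    payload = json.dumps({"session_id": session_id, "message": message}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/chat",
        data=payload,
        method="POST",
        # The tenant header selects the collection and prompt on the server side.
        headers={"Content-Type": "application/json", "X-Tenant-ID": tenant_id},
    )


req = build_chat_request(
    "http://localhost:8000", "cardiology", "user-abc-123",
    "What is a normal resting heart rate?",
)

# To actually send it (requires the server to be running):
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["reply"])
```

Switching departments only requires changing the `tenant_id` argument; the request body keeps the same shape.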
## RAG Pipeline

```
① Upload PDF via POST /api/v1/ingestion/upload (with X-Tenant-ID)
        ↓
② Text extracted from PDF (PyMuPDF)
        ↓
③ Text split into chunks (500 words, 50 overlap)
        ↓
④ Chunks converted to vectors (HuggingFace local model)
        ↓
⑤ Vectors stored in tenant's ChromaDB collection
        ↓
⑥ User asks a question via POST /api/v1/chat (with X-Tenant-ID)
        ↓
⑦ Question converted to vector
        ↓
⑧ Top 3 closest chunks retrieved from tenant's collection
        ↓
⑨ Chunks injected into LLM prompt as context
        ↓
⑩ LLM answers based on the document content
```
If no documents are uploaded for a tenant, the chatbot falls back to general LLM knowledge.
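
The chunking and retrieval steps can be sketched in plain Python. This is a minimal illustration, not the project's code: `chunk_words`, `cosine`, and `top_k` are hypothetical names, the vectors are toy lists rather than real embeddings, and cosine similarity is assumed as the closeness measure (the metric actually used depends on how the ChromaDB collection is configured).

```python
import math
from typing import List, Sequence


def chunk_words(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into chunks of `chunk_size` words; consecutive chunks share `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: Sequence[float], chunk_vecs: List[Sequence[float]], k: int = 3) -> List[int]:
    """Return the indices of the k chunk vectors most similar to the query, best first."""
    ranked = sorted(range(len(chunk_vecs)),
                    key=lambda i: cosine(query_vec, chunk_vecs[i]),
                    reverse=True)
    return ranked[:k]
```

With the defaults, each new chunk starts 450 words after the previous one, so a 1,200-word document yields three chunks, the last one shorter than the rest.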
## Configuration

| Variable | Default | Description |
|---|---|---|
| `OPENROUTER_API_KEY` | (required) | Your OpenRouter API key |
| `OPENROUTER_BASE_URL` | `https://openrouter.ai/api/v1` | OpenRouter base URL |
| `LLM_MODEL` | `arcee-ai/trinity-large-preview:free` | Model identifier |
| `LLM_MAX_TOKENS` | `1024` | Max tokens in LLM response |
| `LLM_TEMPERATURE` | `0.3` | LLM temperature |
| `APP_HOST` | `0.0.0.0` | Uvicorn host |
| `APP_PORT` | `8000` | Uvicorn port |
| `SESSION_MAX_TURNS` | `20` | Max conversation turns per session |
| `ALLOWED_TENANTS` | `["liver","cardiology","nephrology"]` | Allowed tenant IDs |
| `VECTOR_STORE_PATH` | `./vector_store` | ChromaDB storage path |
| `EMBEDDING_MODEL` | `paraphrase-multilingual-MiniLM-L12-v2` | HuggingFace model |
| `CHUNK_SIZE` | `500` | Words per chunk |
| `CHUNK_OVERLAP` | `50` | Overlapping words between chunks |
| `RETRIEVAL_TOP_K` | `3` | Chunks retrieved per query |
| `UPLOAD_DIR` | `./uploads` | PDF upload directory |
| `LOG_LEVEL` | `DEBUG` | Logging level |
| `LOG_FILE` | `./logs/app.log` | Log file path |
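
For readers who want to see how a subset of these variables could map to typed settings, here is a standard-library sketch. The project itself uses pydantic-settings in `core/config.py`; the `Settings` dataclass and `load_settings` function below are a hypothetical stand-in that only mirrors the documented defaults.

```python
import json
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class Settings:
    openrouter_api_key: str
    llm_model: str = "arcee-ai/trinity-large-preview:free"
    llm_max_tokens: int = 1024
    llm_temperature: float = 0.3
    session_max_turns: int = 20
    allowed_tenants: Tuple[str, ...] = ("liver", "cardiology", "nephrology")
    chunk_size: int = 500
    chunk_overlap: int = 50
    retrieval_top_k: int = 3


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    api_key = env.get("OPENROUTER_API_KEY", "")
    if not api_key:
        raise RuntimeError("OPENROUTER_API_KEY is required")
    d = Settings(openrouter_api_key=api_key)  # instance that carries the defaults
    tenants = env.get("ALLOWED_TENANTS")
    return Settings(
        openrouter_api_key=api_key,
        llm_model=env.get("LLM_MODEL", d.llm_model),
        llm_max_tokens=int(env.get("LLM_MAX_TOKENS", d.llm_max_tokens)),
        llm_temperature=float(env.get("LLM_TEMPERATURE", d.llm_temperature)),
        session_max_turns=int(env.get("SESSION_MAX_TURNS", d.session_max_turns)),
        allowed_tenants=tuple(json.loads(tenants)) if tenants else d.allowed_tenants,
        chunk_size=int(env.get("CHUNK_SIZE", d.chunk_size)),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", d.chunk_overlap)),
        retrieval_top_k=int(env.get("RETRIEVAL_TOP_K", d.retrieval_top_k)),
    )


# Example with an explicit mapping instead of the real process environment.
settings = load_settings({"OPENROUTER_API_KEY": "sk-example", "CHUNK_SIZE": "200"})
```

Passing a mapping explicitly keeps the example testable without touching the real environment; calling `load_settings()` with no argument reads `os.environ`.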
## Error Codes

| Code | Meaning |
|---|---|
| `ERR-1000` | Internal server error |
| `ERR-1001` | Bad gateway |
| `ERR-1002` | Service unavailable |
| `ERR-2002` | Session not found |
| `ERR-3001` | LLM call failed |
| `ERR-3002` | LLM rate limit exceeded |
| `ERR-3003` | LLM connection error |
| `ERR-4000` | File type not supported |
| `ERR-4001` | File upload failed |
| `ERR-4002` | Document has no readable text |
| `ERR-5000` | Invalid input |
| `ERR-5001` | Message too long |
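
One way such codes might be modeled is an enum paired with a message table. The project structure mentions a `ResponseSignal` enum in `enums/responses.py`, but its actual members are not shown here, so the member names and the `error_body` helper below are assumptions made for illustration.

```python
from enum import Enum
from typing import Dict


class ResponseSignal(str, Enum):
    # Member names are hypothetical; only the code strings come from the table above.
    INTERNAL_SERVER_ERROR = "ERR-1000"
    BAD_GATEWAY = "ERR-1001"
    SERVICE_UNAVAILABLE = "ERR-1002"
    SESSION_NOT_FOUND = "ERR-2002"
    LLM_CALL_FAILED = "ERR-3001"
    LLM_RATE_LIMIT_EXCEEDED = "ERR-3002"
    LLM_CONNECTION_ERROR = "ERR-3003"
    FILE_TYPE_NOT_SUPPORTED = "ERR-4000"
    FILE_UPLOAD_FAILED = "ERR-4001"
    DOCUMENT_NO_READABLE_TEXT = "ERR-4002"
    INVALID_INPUT = "ERR-5000"
    MESSAGE_TOO_LONG = "ERR-5001"


MEANINGS: Dict[ResponseSignal, str] = {
    ResponseSignal.INTERNAL_SERVER_ERROR: "Internal server error",
    ResponseSignal.BAD_GATEWAY: "Bad gateway",
    ResponseSignal.SERVICE_UNAVAILABLE: "Service unavailable",
    ResponseSignal.SESSION_NOT_FOUND: "Session not found",
    ResponseSignal.LLM_CALL_FAILED: "LLM call failed",
    ResponseSignal.LLM_RATE_LIMIT_EXCEEDED: "LLM rate limit exceeded",
    ResponseSignal.LLM_CONNECTION_ERROR: "LLM connection error",
    ResponseSignal.FILE_TYPE_NOT_SUPPORTED: "File type not supported",
    ResponseSignal.FILE_UPLOAD_FAILED: "File upload failed",
    ResponseSignal.DOCUMENT_NO_READABLE_TEXT: "Document has no readable text",
    ResponseSignal.INVALID_INPUT: "Invalid input",
    ResponseSignal.MESSAGE_TOO_LONG: "Message too long",
}


def error_body(signal: ResponseSignal) -> Dict[str, str]:
    """Build a JSON-serializable error payload for a given signal."""
    return {"code": signal.value, "message": MEANINGS[signal]}
```

The shape of the returned payload is also an assumption; the source does not show what an error response looks like on the wire.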
## Safety

The platform enforces safety at the prompt level:
- Each tenant has a specialized system prompt scoped to its department
- Explicit prohibition of diagnosis, prescriptions, and unsafe claims
- Emergency symptoms trigger immediate urgent-care escalation
- Out-of-scope questions are politely refused
- RAG context is clearly separated from general knowledge
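
The following sketch shows one way a tenant-aware system prompt could encode these rules while keeping retrieved document context visibly separate. The function name `build_system_prompt`, the rule wording, and the delimiter lines are all hypothetical; the project's real prompts live in `core/prompts.py` and may look quite different.

```python
from typing import List


def build_system_prompt(department: str, context_chunks: List[str]) -> str:
    """Assemble a department-scoped system prompt with an optional, clearly delimited context block."""
    rules = [
        f"You are an educational assistant for the {department} department.",
        "Do not diagnose, prescribe medication, or make unsafe claims.",
        "If the user describes emergency symptoms, tell them to seek urgent care immediately.",
        "Politely refuse questions outside this department's scope.",
    ]
    prompt = "\n".join(rules)
    if context_chunks:
        # Number each retrieved chunk and fence the block off from the rules above.
        numbered = "\n\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(context_chunks, 1))
        prompt += (
            "\n\n--- DOCUMENT CONTEXT (from uploaded PDFs) ---\n"
            + numbered
            + "\n--- END CONTEXT ---"
        )
    else:
        # No documents uploaded for this tenant: fall back to general knowledge.
        prompt += "\n\nNo document context is available; answer from general knowledge."
    return prompt


example = build_system_prompt("Cardiology", ["First retrieved chunk.", "Second retrieved chunk."])
```

Prompt-level rules like these guide the model but do not guarantee compliance, which is why the delimiters and the explicit fallback message are kept simple and easy to audit.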
## Tech Stack

| Component | Technology |
|---|---|
| Backend | FastAPI + Uvicorn |
| LLM | OpenRouter (any model) |
| Embeddings | HuggingFace paraphrase-multilingual-MiniLM-L12-v2 |
| Vector DB | ChromaDB (local, multi-tenant) |
| PDF Processing | PyMuPDF |
| Settings | pydantic-settings |
| Frontend | Plain HTML/CSS/JS |
## License

MIT