Healthcare AI RAG Prototype

A simple, demo-ready RAG (Retrieval-Augmented Generation) system that answers patient-specific questions by retrieving patient data stored in Postgres + pgvector and passing it to an LLM as context.

The project is built with FastAPI microservices and designed as a lightweight healthcare AI architecture for demos, experimentation, and future production expansion.


Services

data_service (8000)

Responsible for:

  • Loading patient/FHIR JSON data
  • Creating text chunks from notes, medications, and lab results
  • Generating embeddings
  • Storing chunks and embeddings in Postgres + pgvector
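
The chunking step can be sketched as follows. This is a minimal illustration, not the service's actual code: the function name build_chunks, the input keys (patient_id, notes, medications, labs), and the sample values are all assumptions made up for the example.

```python
from typing import Dict, List


def build_chunks(patient: Dict) -> List[Dict[str, str]]:
    """Flatten one patient record into text chunks, one per source item."""
    patient_id = patient["patient_id"]
    chunks: List[Dict[str, str]] = []
    # The keys "notes", "medications", and "labs" are assumed for illustration.
    for source in ("notes", "medications", "labs"):
        for item in patient.get(source, []):
            text = str(item).strip()
            if text:  # skip empty or whitespace-only entries
                chunks.append(
                    {"patient_id": patient_id, "source": source, "text": text}
                )
    return chunks


# Made-up sample record, used only to exercise the function.
sample = {
    "patient_id": "patient-001",
    "notes": ["Patient reports mild fatigue."],
    "medications": ["Metformin 500 mg twice daily"],
    "labs": ["HbA1c 7.2%"],
}
chunks = build_chunks(sample)
print(len(chunks), chunks[0]["source"])
```

Keeping the source type on each chunk makes it possible to filter or label retrieved context later.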

vector_service (8001)

Responsible for:

  • Performing vector similarity search
  • Retrieving the most relevant patient chunks
  • Returning top-k matching records
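
In the running stack the ranking is done by pgvector inside Postgres. The pure-Python sketch below only illustrates what "top-k by similarity" means. Cosine similarity is an assumption here, since the README does not say which distance metric the service uses, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # similarity is undefined for a zero vector; treat as no match
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    rows: List[Tuple[str, Sequence[float]]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k (text, score) pairs most similar to the query."""
    scored = [(text, cosine_similarity(query, emb)) for text, emb in rows]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy two-dimensional embeddings, purely for illustration.
rows = [("chunk A", [1.0, 0.0]), ("chunk B", [0.0, 1.0]), ("chunk C", [0.7, 0.7])]
results = top_k([1.0, 0.1], rows, k=2)
print([text for text, _ in results])
```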

ai_service (8002)

Responsible for:

  • Embedding user questions
  • Retrieving relevant context from vector_service
  • Building prompts
  • Generating grounded responses using OpenAI models

The AI service is stateless and can be scaled horizontally.
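
The prompt-building step might look like the sketch below. The wording of the instruction and the function name build_prompt are assumptions; the actual prompt used by ai_service is not shown in this README.

```python
from typing import List


def build_prompt(question: str, chunks: List[str]) -> str:
    """Combine retrieved chunks and the user question into one grounded prompt."""
    # Number each chunk so the model (and a reviewer) can refer back to it.
    context = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer the question using only the patient context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


prompt = build_prompt(
    "What medications is the patient taking?",
    ["Metformin 500 mg twice daily"],  # made-up chunk for illustration
)
print(prompt)
```

Because the function depends only on its arguments, it fits the stateless design described above: any replica can build the same prompt from the same inputs.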


api_gateway (8003)

Responsible for:

  • Acting as the entry point for frontend or staff applications
  • Forwarding requests to the AI service
  • Providing a simplified external API layer

Architecture Overview

                +-------------------+
                |   API Gateway     |
                |      :8003        |
                +---------+---------+
                          |
                          v
                +-------------------+
                |    AI Service     |
                |      :8002        |
                +---------+---------+
                          |
          +---------------+----------------+
          |                                |
          v                                v
+-------------------+          +-------------------+
|  OpenAI API       |          |  Vector Service   |
| Embeddings + LLM  |          |      :8001        |
+-------------------+          +---------+---------+
                                          |
                                          v
                               +-------------------+
                               | Postgres +        |
                               | pgvector          |
                               +---------+---------+
                                         ^
                                         |
                               +---------+---------+
                               |   Data Service    |
                               |      :8000        |
                               +-------------------+

Request flow:

  • User / Frontend → API Gateway: POST /query
  • API Gateway → AI Service: POST /ask with patient_id and question
  • AI Service → Vector Service: POST /search with the question embedding
  • Vector Service → Postgres + pgvector: retrieve chunks
  • AI Service → OpenAI API: send prompt and receive answer
  • Data Service → Postgres + pgvector: POST /load_patient_data ingests data and stores embeddings

Prerequisites

  • Docker
  • Docker Compose
  • OpenAI API key

Create a .env file and configure:

  • OPENAI_API_KEY
  • VECTOR_DIM
  • EMBEDDING_MODEL
  • LLM_MODEL

Quick Setup

1. Build and start the stack

docker compose build --no-cache
docker compose up -d

2. Verify running services

docker compose ps

Configuration Priority

Configuration values are resolved in the following order of precedence, highest first:

  1. Runtime environment variables
  2. Values from .env
  3. Default values defined in the services
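
The precedence order above can be expressed as a small lookup. This is a sketch under assumptions: the README does not show how the services actually load configuration, the DEFAULTS values are invented for the example, and the .env parser handles only simple KEY=VALUE lines.

```python
import os
from typing import Dict, Mapping, Optional

# Assumed service defaults, for illustration only.
DEFAULTS: Dict[str, str] = {
    "VECTOR_DIM": "1536",
    "EMBEDDING_MODEL": "text-embedding-3-small",
}


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blanks and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve(
    name: str,
    dotenv: Mapping[str, str],
    environ: Mapping[str, str] = os.environ,
    defaults: Mapping[str, str] = DEFAULTS,
) -> Optional[str]:
    """Return the first value found: runtime environment, then .env, then defaults."""
    for source in (environ, dotenv, defaults):
        if name in source:
            return source[name]
    return None


dotenv = parse_dotenv("# local overrides\nVECTOR_DIM=3072\n")
print(resolve("VECTOR_DIM", dotenv, environ={"VECTOR_DIM": "1536"}))
print(resolve("VECTOR_DIM", dotenv, environ={}))
print(resolve("EMBEDDING_MODEL", dotenv, environ={}))
```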

Database

The project stores patient chunks and embeddings in Postgres using pgvector.

Make sure the vector dimension matches the embedding model being used.

Examples:

Embedding Model           Vector Dimension
text-embedding-3-small    1536
text-embedding-3-large    3072
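
A startup check along these lines can catch a mismatch before any data is written. It is a hypothetical helper, not part of the project. The table lists the default output dimensions of the two models, and unknown models are skipped rather than rejected.

```python
from typing import Dict

# Default output dimensions for the models listed in the table above.
KNOWN_DIMS: Dict[str, int] = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}


def check_vector_dim(model: str, vector_dim: int) -> None:
    """Raise ValueError if VECTOR_DIM disagrees with a known model's dimension."""
    expected = KNOWN_DIMS.get(model)
    if expected is None:
        return  # unknown model: nothing to compare against
    if expected != vector_dim:
        raise ValueError(
            f"{model} produces {expected}-dimensional vectors, "
            f"but VECTOR_DIM is {vector_dim}"
        )


check_vector_dim("text-embedding-3-small", 1536)  # passes silently
try:
    check_vector_dim("text-embedding-3-large", 1536)
except ValueError as exc:
    print(exc)
```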

End-to-End Verification

Load sample patient data

curl -X POST "http://localhost:8000/load_patient_data" \
  -H "Content-Type: application/json" \
  -d @sample_patient.json

Expected response:

  • status: ok
  • chunks_loaded > 0

Query through the gateway

curl -X POST "http://localhost:8003/query" \
  -H "Content-Type: application/json" \
  -d '{"patient_id":"patient-001","question":"Does the patient have any chronic conditions?"}'

Expected response:

  • Generated answer
  • Retrieved context chunks
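
The same gateway call can be made from Python using only the standard library. This is a sketch: the helper names are hypothetical, and because the exact shape of the JSON response is not documented here, ask simply returns whatever the gateway sends back.

```python
import json
import urllib.request
from typing import Any


def build_query_request(
    patient_id: str,
    question: str,
    base_url: str = "http://localhost:8003",
) -> urllib.request.Request:
    """Build the POST /query request without sending it."""
    body = json.dumps({"patient_id": patient_id, "question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(patient_id: str, question: str, timeout: float = 30.0) -> Any:
    """Send the question through the gateway and return the decoded JSON reply."""
    request = build_query_request(patient_id, question)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_query_request(
    "patient-001", "Does the patient have any chronic conditions?"
)
print(request.get_method(), request.full_url)
```

Calling ask("patient-001", "...") requires the stack to be running; building the request is separated out so it can be inspected without network access.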

Direct Service Testing

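Query vector service directly

The three-element query_embedding below is only a placeholder. Replace it with a vector whose length equals VECTOR_DIM, otherwise the search is likely to fail with a dimension mismatch.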

curl -X POST "http://localhost:8001/search" \
  -H "Content-Type: application/json" \
  -d '{"patient_id":"patient-001","query_embedding":[0.0,0.0,0.0],"top_k":3}'
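
A request body with a full-length vector is tedious to type by hand. The sketch below generates one. The helper name is hypothetical and the filler value 0.1 is arbitrary, so the returned ranking is meaningless; it only exercises the endpoint. For a meaningful search, send a real embedding of the question.

```python
import json
from typing import Any, Dict


def build_search_payload(
    patient_id: str, vector_dim: int, top_k: int = 3
) -> Dict[str, Any]:
    """Build a /search body whose embedding has exactly vector_dim elements."""
    if vector_dim <= 0:
        raise ValueError("vector_dim must be positive")
    return {
        "patient_id": patient_id,
        # Constant non-zero filler; a real query would use a model embedding.
        "query_embedding": [0.1] * vector_dim,
        "top_k": top_k,
    }


payload = build_search_payload("patient-001", 1536)
body = json.dumps(payload)
print(len(payload["query_embedding"]))
```

The resulting body string can be saved to a file and passed to curl with -d @file in place of the inline JSON.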

Query AI service directly

curl -X POST "http://localhost:8002/ask" \
  -H "Content-Type: application/json" \
  -d '{"patient_id":"patient-001","question":"What medications is the patient taking?"}'

Postgres Inspection

Inspect the schema and vector dimensions:

docker compose exec postgres \
psql -U postgres -d healthcare \
-c "\d+ patient_chunks"

Troubleshooting

Vector dimension mismatch

If you see errors related to vector dimensions:

  • Ensure VECTOR_DIM matches the embedding model
  • Ensure the Postgres vector column dimension matches the configured model

Example mismatch:

  • Database uses vector(1536)
  • Application generates 3072-dimensional embeddings

Missing dependencies

If services fail because of missing packages, rebuild the images and recreate the containers:

docker compose up --build --force-recreate -d

Suggested Improvements

  • Authentication and authorization
  • HIPAA/security controls
  • Audit logging
  • Redis caching
  • Better chunking strategies
  • Metadata filtering
  • Multi-patient isolation
  • Streaming responses
  • Kubernetes deployment
  • Direct FHIR API integration

Summary

This project demonstrates a lightweight healthcare AI RAG architecture using:

  • FastAPI microservices
  • Postgres + pgvector
  • OpenAI embeddings and LLMs
  • Retrieval-augmented generation
  • Docker-based deployment

It provides a clean foundation for building scalable healthcare AI systems that can answer patient-specific questions using grounded medical context.
