Healthcare AI RAG Prototype

A simple, demo-ready RAG (Retrieval-Augmented Generation) system that answers patient-specific questions by retrieving patient data stored in Postgres + pgvector and passing it to an OpenAI LLM as context.

The project is built with FastAPI microservices and designed as a lightweight healthcare AI architecture for demos, experimentation, and future production expansion.

Services

`data_service` (8000)

Responsible for:

Loading patient/FHIR JSON data
Creating text chunks from notes, medications, and lab results
Generating embeddings
Storing chunks and embeddings in Postgres + pgvector
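
The chunking step listed above might look roughly like the following. This is a minimal sketch rather than the project's actual code: the input field names (`notes`, `medications`, `lab_results`), the `Chunk` type, and `build_chunks` are all hypothetical, and a real FHIR bundle would need resource-specific parsing.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """One retrievable unit of patient text (hypothetical shape)."""
    patient_id: str
    source: str  # which section of the record the text came from
    text: str


def build_chunks(patient: dict) -> list[Chunk]:
    """Flatten a simplified patient record into one chunk per entry.

    Assumes `patient` has an `id` plus optional `notes`, `medications`,
    and `lab_results` lists of strings; these field names are illustrative.
    """
    chunks: list[Chunk] = []
    for source in ("notes", "medications", "lab_results"):
        for entry in patient.get(source, []):
            text = entry.strip()
            if text:  # skip empty entries so they are never embedded
                chunks.append(Chunk(patient["id"], source, text))
    return chunks
```

Keeping the `source` on every chunk is one way to leave room for later metadata filtering.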

`vector_service` (8001)

Responsible for:

Performing vector similarity search
Retrieving the most relevant patient chunks
Returning top-k matching records
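
Conceptually, the search above ranks one patient's chunks by similarity to the query embedding and keeps the best k. In the running system that work is done inside Postgres by pgvector; the plain-Python sketch below only illustrates the idea, and the row shape and function names are assumptions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query: list[float], rows: list[dict], patient_id: str, k: int = 3) -> list[dict]:
    """Return the k rows for `patient_id` most similar to `query`.

    Each row is assumed to look like
    {"patient_id": ..., "text": ..., "embedding": [...]} (hypothetical shape).
    """
    # Restrict to one patient first so other patients' data is never ranked.
    candidates = [r for r in rows if r["patient_id"] == patient_id]
    candidates.sort(key=lambda r: cosine_similarity(query, r["embedding"]), reverse=True)
    return candidates[:k]
```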

`ai_service` (8002)

Responsible for:

Embedding user questions
Retrieving relevant context from vector_service
Building prompts
Generating grounded responses using OpenAI models

The AI service is stateless and can be scaled horizontally.
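
The prompt-building step listed above could be sketched as follows. The instruction wording, the `build_prompt` name, and the `max_chunks` limit are illustrative assumptions, not the project's actual prompt.

```python
def build_prompt(question: str, chunks: list[str], max_chunks: int = 5) -> str:
    """Assemble a grounded prompt from retrieved patient chunks."""
    # Number each chunk so the answer can be traced back to its source.
    context = "\n".join(
        f"[{i}] {text}" for i, text in enumerate(chunks[:max_chunks], start=1)
    )
    return (
        "Answer the question using only the patient context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )
```

Because the function depends only on its arguments, it fits the stateless design: any replica can build the same prompt from the same inputs.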

`api_gateway` (8003)

Responsible for:

Acting as the entry point for frontend or staff applications
Forwarding requests to the AI service
Providing a simplified external API layer
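
Forwarding a request to the AI service amounts to re-posting the same `patient_id` and `question` to `/ask`. The sketch below uses only the standard library to show the shape of that call; the host name `ai_service` is an assumption about the Compose network, and the real gateway may well use a different HTTP client.

```python
import json
import urllib.request

# Assumed in-network address; the host name depends on docker-compose.yml.
AI_SERVICE_URL = "http://ai_service:8002/ask"


def build_ask_request(patient_id: str, question: str,
                      url: str = AI_SERVICE_URL) -> urllib.request.Request:
    """Build the POST request the gateway would forward to the AI service."""
    body = json.dumps({"patient_id": patient_id, "question": question}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def forward(patient_id: str, question: str, timeout: float = 30.0) -> dict:
    """Send the request and return the AI service's JSON response."""
    request = build_ask_request(patient_id, question)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```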

Architecture Overview

                +-------------------+
                |   API Gateway     |
                |      :8003        |
                +---------+---------+
                          |
                          v
                +-------------------+
                |    AI Service     |
                |      :8002        |
                +---------+---------+
                          |
          +---------------+----------------+
          |                                |
          v                                v
+-------------------+          +-------------------+
|  OpenAI API       |          |  Vector Service   |
| Embeddings + LLM  |          |      :8001        |
+-------------------+          +---------+---------+
                                          |
                                          v
                               +-------------------+
                               | Postgres +        |
                               | pgvector          |
                               +---------+---------+
                                         ^
                                         |
                               +---------+---------+
                               |   Data Service    |
                               |      :8000        |
                               +-------------------+

flowchart TB
  Client[User / Frontend]
  Gateway[API Gateway\nPOST /query]
  AI[AI Service\nPOST /ask]
  Vector[Vector Service\nPOST /search]
  Data[Data Service\nPOST /load_patient_data]
  DB[Postgres + pgvector]
  OpenAI[OpenAI API]

  Client -->|POST /query| Gateway
  Gateway -->|patient_id + question| AI
  AI -->|question embedding| Vector
  Vector -->|retrieve chunks| DB
  AI -->|send prompt + receive answer| OpenAI
  Data -->|ingest + store embeddings| DB

Prerequisites

Docker
Docker Compose
OpenAI API key

Create a `.env` file (the repository includes `.env.example` as a starting point) and set the following variables:

OPENAI_API_KEY
VECTOR_DIM
EMBEDDING_MODEL
LLM_MODEL

Quick Setup

1. Build and start the stack

docker compose build --no-cache
docker compose up -d

2. Verify running services

docker compose ps

Configuration Priority

Configuration values are resolved in the following order of precedence, highest first:

Runtime environment variables
Values from .env
Default values defined in the services
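
One way to picture this layering is a chained lookup that consults each source in turn. The sketch below is an illustration only: the default values and the `parse_dotenv` and `resolve_config` helpers are hypothetical, and the services may load configuration differently.

```python
from collections import ChainMap

# Hypothetical service defaults; the real defaults live in each service.
DEFAULTS = {
    "VECTOR_DIM": "1536",
    "EMBEDDING_MODEL": "text-embedding-3-small",
}


def parse_dotenv(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blanks and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_config(environ: dict, dotenv_text: str) -> ChainMap:
    """Look up keys in the runtime environment first, then .env, then defaults."""
    return ChainMap(dict(environ), parse_dotenv(dotenv_text), DEFAULTS)
```

In a service this could be called with `os.environ` and the contents of the `.env` file; a key set in the environment then shadows the same key in `.env`.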

Database

The project stores patient chunks and embeddings in Postgres using pgvector.

Make sure the vector dimension matches the embedding model being used.

Examples:

| Embedding Model | Vector Dimension |
| --- | --- |
| `text-embedding-3-small` | 1536 |
| `text-embedding-3-large` | 3072 |
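
A startup check can catch a mismatch before any data is written. The sketch below hard-codes the two models from the table above; `check_vector_dim` is a hypothetical helper, not something the services are known to provide.

```python
# Vector dimensions from the table above.
EMBEDDING_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}


def check_vector_dim(model: str, vector_dim: int) -> None:
    """Raise ValueError if VECTOR_DIM does not match the model's dimension."""
    expected = EMBEDDING_DIMS.get(model)
    if expected is None:
        raise ValueError(f"Unknown embedding model: {model!r}")
    if expected != vector_dim:
        raise ValueError(
            f"{model} produces {expected}-dimensional vectors, "
            f"but VECTOR_DIM is {vector_dim}"
        )
```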

End-to-End Verification

Load sample patient data

curl -X POST "http://localhost:8000/load_patient_data" \
  -H "Content-Type: application/json" \
  -d @sample_patient.json

Expected response:

status: ok
chunks_loaded > 0

Query through the gateway

curl -X POST "http://localhost:8003/query" \
  -H "Content-Type: application/json" \
  -d '{"patient_id":"patient-001","question":"Does the patient have any chronic conditions?"}'

Expected response:

Generated answer
Retrieved context chunks

Direct Service Testing

Query vector service directly

curl -X POST "http://localhost:8001/search" \
  -H "Content-Type: application/json" \
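  -d '{"patient_id":"patient-001","query_embedding":[0.0,0.0,0.0],"top_k":3}'

The three-element `query_embedding` is only a placeholder. A real request needs one value per dimension of the stored vectors (VECTOR_DIM, for example 1536); a shorter vector is likely to be rejected with a dimension error.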
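
A full-length request body is easier to generate than to type. The sketch below builds one with a constant filler value; `search_payload` is a hypothetical helper, and the filler carries no meaning as a query.

```python
import json


def search_payload(patient_id: str, vector_dim: int, top_k: int = 3) -> str:
    """Build a /search request body with a placeholder embedding.

    The constant non-zero value only makes the vector the right length.
    A zero vector is avoided because cosine similarity is undefined for it,
    which matters if the service ranks by cosine distance.
    """
    return json.dumps({
        "patient_id": patient_id,
        "query_embedding": [0.01] * vector_dim,
        "top_k": top_k,
    })
```

The resulting string can be written to a file and passed to curl with `-d @file`, in the same way `sample_patient.json` is used when loading data.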

Query AI service directly

curl -X POST "http://localhost:8002/ask" \
  -H "Content-Type: application/json" \
  -d '{"patient_id":"patient-001","question":"What medications is the patient taking?"}'

Postgres Inspection

Inspect the schema and vector dimensions:

docker compose exec postgres \
psql -U postgres -d healthcare \
-c "\d+ patient_chunks"

Troubleshooting

Vector dimension mismatch

If you see errors related to vector dimensions:

Ensure VECTOR_DIM matches the embedding model
Ensure the Postgres vector column dimension matches the configured model

Example mismatch:

Database uses vector(1536)
Application generates 3072 dimension embeddings

Missing dependencies

If services fail because of missing packages, rebuild the images and recreate the containers:

docker compose up --build --force-recreate -d

Suggested Improvements

Authentication and authorization
HIPAA/security controls
Audit logging
Redis caching
Better chunking strategies
Metadata filtering
Multi-patient isolation
Streaming responses
Kubernetes deployment
Direct FHIR API integration

Summary

This project demonstrates a lightweight healthcare AI RAG architecture using:

FastAPI microservices
Postgres + pgvector
OpenAI embeddings and LLMs
Retrieval-augmented generation
Docker-based deployment

It provides a clean foundation for building scalable healthcare AI systems that can answer patient-specific questions using grounded medical context.
