Python 3.11+ · License: MIT · Offline First · Docker · VectorAI DB

LocalRx — Offline Medical Reference Engine

A hybrid-search medical reference for healthcare workers in low-connectivity environments. Built with Actian VectorAI DB for the VectorAI DB Build Challenge 2026.


The Problem

50% of the global population lacks access to essential health services (WHO). In rural clinics, internet access is unreliable, and cloud-based medical references fail when they are needed most.

The Solution

LocalRx is a local-first medical reference engine that runs entirely offline. Search symptoms in natural language and get ranked candidate conditions with treatment protocols, filtered by the medicines your clinic has available.

No internet. No cloud. No patient data exposure. Sub-15ms search.

This is not a chatbot. It's a deterministic search engine where every result is traceable to verified WHO data.


Quick Start

git clone https://github.com/winsznx/localrx.git
cd localrx
docker compose up

That's it. The backend auto-ingests 70 medical records on first startup and begins serving immediately.
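A minimal sketch of how an ingest-on-first-startup hook might look, assuming FastAPI's `lifespan` mechanism (the project structure describes `app/main.py` as "FastAPI with auto-ingest lifespan"). `count_records`, `ingest_all`, and the in-memory `STORE` are hypothetical stand-ins so the example runs on its own; they are not the repository's functions.

```python
import asyncio
from contextlib import asynccontextmanager

# Hypothetical in-memory stand-in for the vector database.
STORE: list[dict] = []


def count_records() -> int:
    """Hypothetical helper: number of records already ingested."""
    return len(STORE)


def ingest_all() -> int:
    """Hypothetical helper: load records and return how many were added."""
    records = [{"id": i} for i in range(70)]  # placeholder rows, not real data
    STORE.extend(records)
    return len(records)


@asynccontextmanager
async def lifespan(app):
    # Runs once before the app starts serving requests.
    if count_records() == 0:
        ingest_all()  # first startup only: populate the store
    yield
    # Nothing to clean up in this sketch.


async def demo() -> int:
    # Simulate two startups; the second one must not ingest again.
    for _ in range(2):
        async with lifespan(app=None):
            pass
    return count_records()


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In a FastAPI app the same context manager would be passed as `FastAPI(lifespan=lifespan)`. Checking the record count first keeps restarts from duplicating data.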

Manual setup (without Docker)

# Start VectorAI DB
docker run -d --name vectoraidb -p 50051:50051 -v ./data/vectoraidb:/data williamimoh/actian-vectorai-db:latest

# Python environment
python3.11 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Ingest data + start app
python scripts/ingest.py
uvicorn app.main:app --port 8000
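When running the manual steps, ingestion can fail if the database container is still starting. A small sketch of a readiness check that could run before `scripts/ingest.py`, assuming the database listens on `localhost:50051` as in the `docker run` command above. `wait_for_port` is a hypothetical helper, and a successful TCP connection only shows that the port is open, not that the gRPC service is fully ready.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)  # not up yet; retry shortly
    return False
```

For example, `wait_for_port("localhost", 50051)` could gate the ingestion step in a setup script.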

Technical Requirements Used

Hybrid Fusion (Primary)

Every search query runs two independent retrieval signals fused via Reciprocal Rank Fusion (RRF):

| Signal | Method | What it catches |
| --- | --- | --- |
| Semantic | Cosine similarity on 384-dim embeddings (VectorAI DB vector search) | "burning chest feeling" → matches GERD/acid reflux |
| Keyword | BM25-style token overlap (client-side scroll + scoring) | "paracetamol" → exact match, not just "pain reliever" |
| Fusion | actian_vectorai.reciprocal_rank_fusion() (SDK-native RRF) | Merges both ranked lists without score calibration |

Both signals pass through VectorAI DB's server-side metadata filters (for example category, specialty, age group, and risk level), which are applied during the search, not after it.
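To make the fusion step concrete, here is a plain-Python sketch of Reciprocal Rank Fusion. Each document's fused score is the sum of 1 / (k + rank) over every list it appears in, with ranks starting at 1. This is the textbook formula, not the SDK's `reciprocal_rank_fusion()` implementation, whose input format and score normalization may differ. The document ids are made up for illustration.

```python
from collections import defaultdict


def rrf(ranked_lists: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked lists of ids; returns (id, score) pairs, best first."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            # Only the position matters, so the two signals never need
            # their raw scores put on a common scale.
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Illustrative ids only (not records from the dataset).
semantic = ["A", "B", "C"]  # ranked by embedding similarity
keyword = ["C", "A"]        # ranked by keyword overlap
fused = rrf([semantic, keyword])
```

Here `A` ranks first (1/61 + 1/62), `C` second (1/63 + 1/61), and `B` last (1/62). A document that appears near the top of both lists beats one that tops only a single list.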

Filtered Search (Secondary)

Six filter dimensions are applied inside VectorAI DB using the SDK's FilterBuilder DSL:

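# Both conditions must hold: high-risk entries that apply to children.
f = FilterBuilder().must(Field("risk_level").eq("high")).must(Field("age_group").eq("child")).build()
# `q` is the query embedding; the filter is evaluated during the search itself.
results = client.points.search("drugs", vector=q, filter=f, limit=10)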

Bonus — 100% Offline

  • Embeddings: sentence-transformers/all-MiniLM-L6-v2 (local, 80MB, CPU-only)
  • Database: Actian VectorAI DB in Docker (ARM + x86)
  • Zero network calls in the query path
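A sketch of how the local embedding step might be wired, assuming the `sentence-transformers` package is installed and the model is already in the local cache. `load_model`, `embed`, and `cosine` are hypothetical helper names, not taken from the repository. `HF_HUB_OFFLINE` is the Hugging Face Hub environment variable that disables network access.

```python
import math
import os

# Tell the Hugging Face libraries not to touch the network. The model must
# already be cached locally (assumption: it was downloaded beforehand).
os.environ.setdefault("HF_HUB_OFFLINE", "1")

MODEL_NAME = "sentence-transformers/all-MiniLM-L6-v2"


def load_model():
    # Imported lazily so the helpers below work without the package installed.
    from sentence_transformers import SentenceTransformer
    return SentenceTransformer(MODEL_NAME, device="cpu")


def embed(model, text: str) -> list[float]:
    """Encode one string into a plain list of floats via model.encode()."""
    return [float(x) for x in model.encode(text)]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

With the real model, `embed(load_model(), "burning chest feeling")` would return a 384-element list. Loading the model once at startup and reusing it keeps model loading out of the query path.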

Architecture

flowchart TD
    User([🩺 Healthcare Worker])

    subgraph offline ["🔒 OFFLINE — One Machine · No Internet"]
        direction TB
        UI["<b>Frontend</b><br/>HTML · CSS · vanilla JS<br/><i>Search UI · Filters · Result Cards</i>"]
        API["<b>FastAPI Backend</b><br/>Python 3.11 · Async<br/><i>Hybrid search orchestration</i>"]
        Embed["<b>Embedding Service</b><br/>sentence-transformers<br/>all-MiniLM-L6-v2 · 384d"]
        DB[("<b>Actian VectorAI DB</b><br/>Docker · gRPC :50051<br/><i>Vector + Hybrid + Filtered</i>")]
    end

    User <--> UI
    UI <-->|"REST / JSON"| API
    API <-->|"encode query"| Embed
    API <-->|"hybrid search<br/>RRF fusion"| DB

    classDef user fill:#FFFFFF,stroke:#1A2332,color:#1A2332,stroke-width:2px
    classDef ui fill:#E6F4F4,stroke:#0D7377,color:#0F2B3C,stroke-width:2px
    classDef api fill:#0D7377,stroke:#0A5F62,color:#FFFFFF,stroke-width:2px
    classDef embed fill:#F3E8FF,stroke:#7C3AED,color:#1A2332,stroke-width:2px
    classDef db fill:#0F2B3C,stroke:#000,color:#FFFFFF,stroke-width:2px
    classDef boundary fill:#F8F9FB,stroke:#0D7377,stroke-width:2px,stroke-dasharray:5 5

    class User user
    class UI ui
    class API api
    class Embed embed
    class DB db
    class offline boundary

The entire stack runs on one machine. Zero network calls in the query path. Every component — embedding model, vector DB, API, UI — lives behind the dashed boundary.

Search Flow

Query: "child fever rash joint pain" + filter: age_group=child

  1. Embed query → 384-dim vector (local, ~15ms)

  2. Semantic signal:
     VectorAI DB cosine search → ranked by meaning

  3. Keyword signal:
     Scroll filtered collection → BM25-style token overlap scoring

  4. Fusion:
     reciprocal_rank_fusion([semantic, keyword], k=60) → single ranked list

  5. Results with dual scores + matched keywords:
     #1  Paracetamol         sem=0.32  kw=0.67  RRF=1.00  matched=[fever]
     #2  Vitamin A           sem=0.31  kw=0.33  RRF=0.98  matched=[child]
     #3  Amoxicillin         sem=0.30  kw=0.33  RRF=0.97  matched=[rash]
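The keyword signal in step 3 can be approximated with a very small scorer. The sketch below uses a plain overlap fraction (matched query tokens divided by total query tokens). It is deliberately simpler than BM25, with no term weighting or length normalization, and it is not the scoring in `app/search.py`, so it will not reproduce the numbers shown above. The function names and sample documents are hypothetical.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_score(query: str, document: str) -> tuple[float, list[str]]:
    """Return (fraction of query tokens found in document, matched tokens)."""
    query_tokens = tokenize(query)
    if not query_tokens:
        return 0.0, []
    matched = sorted(query_tokens & tokenize(document))
    return len(matched) / len(query_tokens), matched


def rank_by_keywords(query: str, docs: dict[str, str]) -> list[tuple[str, float, list[str]]]:
    """Score every document, drop non-matches, and sort best first."""
    scored = [(doc_id, *keyword_score(query, text)) for doc_id, text in docs.items()]
    hits = [row for row in scored if row[1] > 0]
    return sorted(hits, key=lambda row: row[1], reverse=True)


# Made-up documents for illustration only.
docs = {
    "doc1": "Fever and rash in a child under five",
    "doc2": "Adult joint pain management",
    "doc3": "Wound dressing technique",
}
ranking = rank_by_keywords("child fever rash joint pain", docs)
```

The ids from `ranking`, in order, form the keyword list that is fused with the semantic list in step 4. The matched tokens can be surfaced to the user, as in the `matched=[...]` column above.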

Why VectorAI DB

| Capability | VectorAI DB | Pinecone | pgvector | ChromaDB | FAISS |
| --- | --- | --- | --- | --- | --- |
| Native vector search | Yes | Yes | Yes | Yes | Yes |
| Client-side RRF/DBSF | SDK built-in | No | No | No | No |
| Filtered search in-engine | Yes | Cloud only | SQL post-filter | Basic | No |
| Runs offline (Docker) | Yes | No (cloud) | Yes | Yes | N/A |
| ARM support | Yes | N/A | Yes | Yes | Yes |
| One-command deploy | docker compose up | N/A | Extra setup | Extra setup | N/A |

Of the options compared above, VectorAI DB is the only one that gave us built-in fusion helpers and also runs offline in a Docker container. For a medical tool in a clinic with no internet, that combination made it the viable choice.


Data

70 curated records across 4 collections:

| Collection | Records | Content |
| --- | --- | --- |
| drugs | 30 | WHO Essential Medicines: dosing, contraindications, interactions |
| guidelines | 15 | Clinical protocols: malaria, pneumonia, PPH, TB, HIV, asthma |
| conditions | 15 | Disease presentations: symptoms, diagnosis criteria, complications |
| interactions | 10 | Critical drug interaction pairs: mechanism, risk, management |

Sources: WHO ICD-11 (open) · WHO Essential Medicines List 2023 (open) · WHO Clinical Guidelines (open)
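A quick sanity check of the record counts above can be sketched with the standard library. It assumes one CSV per collection in a data directory, named as in the project structure, each with a header row and one record per row. The column layout is not assumed, and `count_rows` and `check_counts` are hypothetical helpers.

```python
import csv
from pathlib import Path

# Expected record counts per collection, taken from the table above.
EXPECTED = {"drugs": 30, "guidelines": 15, "conditions": 15, "interactions": 10}


def count_rows(csv_path: Path) -> int:
    """Count data rows in a CSV file, excluding the header row."""
    with csv_path.open(newline="", encoding="utf-8") as fh:
        return sum(1 for _ in csv.DictReader(fh))


def check_counts(data_dir: Path, expected: dict[str, int] = EXPECTED) -> dict[str, tuple[int, int]]:
    """Return {collection: (found, expected)} for every mismatch."""
    mismatches = {}
    for name, want in expected.items():
        found = count_rows(data_dir / f"{name}.csv")
        if found != want:
            mismatches[name] = (found, want)
    return mismatches
```

Calling `check_counts(Path("data"))` returns an empty dict when every file matches the table, which adds up to 70 records in total.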


Tech Stack

| Layer | Choice |
| --- | --- |
| Vector DB | Actian VectorAI DB (Docker, gRPC on port 50051) |
| Backend | FastAPI (Python 3.11), ~200 LOC |
| Embeddings | all-MiniLM-L6-v2 (384-dim, local, CPU) |
| Frontend | HTML + CSS + vanilla JS, light clinical UI (DM Sans + Source Sans 3) |
| Hybrid Fusion | SDK-native reciprocal_rank_fusion() |

Project Structure

localrx/
├── docker-compose.yml          # VectorAI DB + backend — one command
├── Dockerfile                  # Python 3.11 backend container
├── requirements.txt
├── app/
│   ├── main.py                 # FastAPI with auto-ingest lifespan
│   ├── search.py               # Hybrid search: semantic + keyword → RRF
│   ├── ingest.py               # Shared ingestion logic
│   ├── models.py               # Pydantic request/response models
│   └── config.py               # Settings from .env
├── scripts/
│   ├── ingest.py               # CLI ingestion tool
│   └── smoke_test.py           # SDK integration test
├── data/
│   ├── drugs.csv               # 30 WHO Essential Medicines
│   ├── guidelines.csv          # 15 clinical protocols
│   ├── conditions.csv          # 15 common conditions
│   └── interactions.csv        # 10 drug interactions
├── frontend/
│   ├── index.html
│   ├── styles.css              # Clinical UI theme
│   └── app.js
└── docs/
    ├── ARCHITECTURE.md         # System design + RRF formula + comparison
    ├── SDK_NOTES.md            # Actian VectorAI SDK reference
    └── SUBMISSION_NOTES.md     # DoraHacks write-up

For the full submission guide and judging context, see docs/SUBMISSION_NOTES.md.


Demo Video

[Demo video — coming soon]


Disclaimer

LocalRx is a reference tool for healthcare education and quick lookup. It is NOT a diagnostic system and does not replace clinical judgment. All data is sourced from publicly available WHO publications.


License

MIT
