A semantic search engine for medication safety that retrieves dangerous drug interactions from FDA adverse event reports using vector embeddings — catching what keyword-based interaction checkers miss.
RxGuard analyzes drug interaction risks using FAERS (FDA Adverse Event Reporting System) data and semantic search. Describe a patient's medication regimen in plain English, and the system retrieves dangerous interactions, contraindications, and real FDA adverse event cases — ranked by severity and matched by semantic meaning, not just keywords.
- Natural Language Queries — extract drugs, demographics, and conditions from free-text clinical descriptions
- Multi-Engine Search — four search backends with progressively better semantic understanding:
  - V1: Keyword exact match (baseline)
  - V2: TF-IDF + cosine similarity
  - V3: Vector search with sentence-transformers (in-memory)
  - V3Actian: Vector search with Actian VectorAI DB (production-scale)
- Risk Scoring — weighted by semantic similarity, outcome severity, and demographic match (1-10 scale)
- Clinical Recommendations — automated safety recommendations with alternative medication suggestions
- LLM Summaries — optional Gemini API integration for natural language risk explanations
- React Dashboard — interactive adverse event visualizations (Recharts), similar cases table, and AI analysis
- Data Pipeline — batched FAERS ingestion with DailyMed drug label integration
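As a rough illustration of the risk scoring described in the list above, the sketch below blends the three signals into a score on the 1-10 scale. The function name `risk_score`, the default weights, and the assumption that each input is already normalized to [0, 1] are illustrative choices, not values taken from results_ranker.py.

```python
def risk_score(similarity: float, severity: float, demographic_match: float,
               weights: tuple[float, float, float] = (0.5, 0.35, 0.15)) -> float:
    """Blend three signals in [0, 1] into a score on a 1-10 scale.

    The default weights are hypothetical and chosen only for illustration.
    """
    signals = (similarity, severity, demographic_match)
    if any(not 0.0 <= s <= 1.0 for s in signals):
        raise ValueError("all signals must lie in [0, 1]")
    w_sim, w_sev, w_demo = weights
    total = w_sim + w_sev + w_demo
    blended = (w_sim * similarity + w_sev * severity + w_demo * demographic_match) / total
    # Map the blended value from [0, 1] onto the 1-10 range.
    return round(1.0 + 9.0 * blended, 1)
```

Dividing by the weight total keeps the output inside 1-10 even if the weights are later tuned and no longer sum to one.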
         User Query (Natural Language)
                       │
                       ▼
┌─────────────────────────────────────────────────────────┐
│  QUERY PROCESSOR                                        │
│  1. Extract drug names (regex + drug dictionary)        │
│  2. Extract patient context (age, sex, conditions)      │
│  3. Generate query embedding (sentence-transformers)    │
└──────────────────────┬──────────────────────────────────┘
                       │
          ┌────────────┼────────────┐
          ▼            ▼            ▼
    V1: Keyword   V2: TF-IDF   V3: Vector Search
    (exact match) (cosine sim) (Actian VectorAI DB)
          │            │            │
          └────────────┼────────────┘
                       ▼
┌─────────────────────────────────────────────────────────┐
│  RESULTS RANKER                                         │
│  Semantic similarity + severity weighting +             │
│  demographic match → Risk score (1-10)                  │
└──────────────────────┬──────────────────────────────────┘
                       ▼
┌─────────────────────────────────────────────────────────┐
│  RESPONSE GENERATOR                                     │
│  Risk score, matched cases, warnings, demographics,     │
│  recommendations + optional Gemini LLM summarization    │
└──────────────────────┬──────────────────────────────────┘
                       ▼
        React Dashboard / Streamlit UI
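The first two query-processor steps in the diagram (drug names via regex plus a dictionary, then patient context) might look roughly like the sketch below. The six-entry dictionary, the two patterns, and the `parse_query` name are assumptions for illustration; the real query_processor.py may work differently, and embedding generation is left out.

```python
import re

# Tiny illustrative dictionary; a real one would be far larger.
DRUG_DICTIONARY = {"warfarin", "metformin", "ibuprofen", "naproxen", "aspirin", "lithium"}

AGE_PATTERN = re.compile(r"\b(\d{1,3})[- ]year[- ]old\b", re.IGNORECASE)
SEX_PATTERN = re.compile(r"\b(female|male)\b", re.IGNORECASE)


def parse_query(text: str) -> dict:
    """Extract drug names, age, and sex from a free-text query."""
    tokens = re.findall(r"[a-z]+", text.lower())
    drugs = []
    for token in tokens:
        if token in DRUG_DICTIONARY and token not in drugs:
            drugs.append(token)  # keep first-mention order, drop duplicates
    age = AGE_PATTERN.search(text)
    sex = SEX_PATTERN.search(text)
    return {
        "drugs": drugs,
        "age": int(age.group(1)) if age else None,
        "sex": sex.group(1).lower() if sex else None,
    }


result = parse_query(
    "65-year-old female on warfarin and metformin, doctor wants to add ibuprofen"
)
```

A plain dictionary lookup misses brand names, misspellings, and drug classes such as "ACE inhibitor", which is one reason the later pipeline stages rely on embeddings rather than exact tokens.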
- Python 3.12+
- Node.js 18+ (for frontend)
- Docker (optional, for Actian VectorAI DB)
pip install -r requirements.txt
The first run downloads the sentence-transformer model (~90MB).
Create a .env file in the project root:
GEMINI_API_KEY=your_api_key_here
Get a free key at Google AI Studio.
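Because LLM summaries are optional, the app has to behave sensibly when no key is set. A minimal sketch of such a check follows; the helper name `gemini_enabled` is hypothetical, and how the project actually loads the .env file is not shown in this README.

```python
import os


def gemini_enabled(env=None) -> bool:
    """Return True only when a non-empty GEMINI_API_KEY is configured."""
    env = os.environ if env is None else env
    return bool(env.get("GEMINI_API_KEY", "").strip())
```

A caller could skip the summary step when this returns False instead of failing on a missing key.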
Option A: Streamlit UI
streamlit run app.py
Opens at http://localhost:8501.
Option B: FastAPI + React Frontend
# Terminal 1 — backend
python -m uvicorn api:app --reload
# Terminal 2 — frontend
cd frontend && npm install && npm run dev
For production-scale vector search:
# Install the client
pip install actiancortex-0.1.0b1-py3-none-any.whl
# Start the database
docker compose up -d
Download the wheel from Actian VectorAI DB Beta. The app auto-detects the database and falls back to in-memory search if unavailable.
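One way to picture the auto-detect-and-fall-back behavior is a simple TCP probe, sketched below. This is an assumption about the mechanism: the real check in actian_vector_db.py may go through the Actian client instead, and the function names and the localhost default are hypothetical. Port 50051 is the one used in the deployment steps.

```python
import socket


def database_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def choose_engine(host: str = "localhost", port: int = 50051) -> str:
    """Pick the Actian-backed engine when the port answers, else in-memory V3."""
    return "V3Actian" if database_reachable(host, port) else "V3"
```

A reachable port only shows that something is listening, so a production check would likely also issue a lightweight request through the database client.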
Enter a natural language query describing the patient and proposed drug combination:
65-year-old female on warfarin and metformin, doctor wants to add ibuprofen
Example output:
- RISK SCORE: 8.7/10 — HIGH RISK
- Primary Interaction: Warfarin + Ibuprofen (NSAID) — major GI bleeding, increased INR
- FAERS matches: 4,231 reports | 12% hospitalization, 3% fatal
- Recommendation: Consider acetaminophen as alternative. If NSAID required, use lowest effective dose with PPI gastroprotection and increased INR monitoring.
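The hospitalization and fatality percentages in output like this are simple shares over the matched reports. The sketch below shows that arithmetic; the `Case` class is a hypothetical stand-in for the project's FAERSCase dataclass, whose actual fields are not listed in this README.

```python
from dataclasses import dataclass


@dataclass
class Case:
    """Hypothetical stand-in for the project's FAERSCase dataclass."""
    report_id: str
    hospitalized: bool
    fatal: bool


def outcome_rates(cases: list[Case]) -> dict[str, float]:
    """Return the share of cases (in percent) with each serious outcome."""
    if not cases:
        return {"hospitalization": 0.0, "fatal": 0.0}
    n = len(cases)
    return {
        "hospitalization": round(100 * sum(c.hospitalized for c in cases) / n, 1),
        "fatal": round(100 * sum(c.fatal for c in cases) / n, 1),
    }
```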
70-year-old male with diabetes on metformin, prescribed naproxen for arthritis
55-year-old female on warfarin, needs aspirin for heart protection
Patient on lithium and ACE inhibitor — risk assessment
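The example queries above can double as a toy corpus for showing how TF-IDF with cosine similarity, the technique behind the V2 engine, ranks text. This sketch uses only the standard library rather than scikit-learn, and every function name in it is hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs: list[str]) -> tuple[list[dict[str, float]], dict[str, float]]:
    """Return one sparse TF-IDF vector per document plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(docs)
    # Smoothed IDF: ln((1 + n) / (1 + df)) + 1
    idf = {t: math.log((1 + n) / (1 + df)) + 1 for t, df in doc_freq.items()}
    vectors = [{t: c * idf[t] for t, c in Counter(tokens).items()} for tokens in tokenized]
    return vectors, idf


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


def rank(query: str, docs: list[str]) -> list[tuple[float, str]]:
    """Score every document against the query, highest similarity first."""
    vectors, idf = tfidf_vectors(docs)
    # Query terms unseen in the corpus carry no weight.
    q = {t: c * idf[t] for t, c in Counter(tokenize(query)).items() if t in idf}
    return sorted(((cosine(q, v), d) for v, d in zip(vectors, docs)), reverse=True)


docs = [
    "70-year-old male with diabetes on metformin, prescribed naproxen for arthritis",
    "55-year-old female on warfarin, needs aspirin for heart protection",
    "Patient on lithium and ACE inhibitor, risk assessment",
]
ranked = rank("warfarin aspirin", docs)
```

TF-IDF still depends on shared tokens, so a query for "blood thinner" would score zero against the warfarin document; that gap is what the embedding-based V3 engines are meant to close.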
Load FAERS data in batches with configurable pair counts and volume stages:
python run_pipeline_batched.py # defaults: 10 pairs/batch, 1K→2.5K→5K
python run_pipeline_batched.py --pair-batch 5 # 5 pairs per batch
python run_pipeline_batched.py --stages 1000 5000 10000 # custom volume stages
python run_pipeline_batched.py --labels # include DailyMed label pipeline
Safe to stop and resume: cached pairs are skipped.
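The resume behavior can be pictured as filtering out already-cached pairs before splitting the remainder into batches. The sketch below assumes one JSON file per drug pair in a cache directory; that layout and all three function names are hypothetical, not taken from run_pipeline_batched.py.

```python
from pathlib import Path
from typing import Iterator


def cache_path(pair: tuple[str, str], cache_dir: Path) -> Path:
    """Hypothetical cache layout: one JSON file per drug pair."""
    return cache_dir / f"{pair[0]}_{pair[1]}.json"


def pending_pairs(pairs: list[tuple[str, str]], cache_dir: Path) -> list[tuple[str, str]]:
    """Drop pairs whose results are already cached on disk."""
    return [p for p in pairs if not cache_path(p, cache_dir).exists()]


def batches(items: list, size: int) -> Iterator[list]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Because the cache check runs before batching, an interrupted run simply sees fewer pending pairs the next time it starts.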
hacklytics2026/
├── app.py # Streamlit application
├── api.py # FastAPI backend server
├── config.py # API settings, 50 drug interaction pairs
├── query_processor.py # Query NLP and embedding generation
├── search_engines.py # V1/V2/V3/V3Actian search engines
├── results_ranker.py # Risk scoring and ranking
├── response_generator.py # Response formatting + Gemini integration
├── actian_vector_db.py # Actian VectorAI DB wrapper
├── data_models.py # FAERSCase dataclass
├── sample_data.py # Sample FAERS cases for testing
├── run_pipeline.py # Full FAERS data pipeline
├── run_pipeline_batched.py # Batched pipeline (pair batches + volume stages)
├── run_label_pipeline.py # DailyMed drug label pipeline
├── eval_search.py # Search evaluation and benchmarking
├── docker-compose.yml # Actian VectorAI DB container
├── requirements.txt # Python dependencies
├── src/
│   ├── data_collector.py # openFDA API data collection
│   ├── data_cleaner.py # FAERS cleaning and normalization
│   ├── document_builder.py # Searchable document chunk builder
│   ├── vector_store.py # Embedding generation and storage
│   ├── search.py # Semantic search with filters
│   ├── sphinx_eda.py # EDA charts and statistical analysis
│   ├── dailymed_ingestion.py # DailyMed drug label ingestion
│   ├── label_document_builder.py # Drug label document builder
│   └── label_vector_store.py # Drug label vector storage
├── frontend/ # React 19 + Vite 7 dashboard
│   ├── src/
│   │   ├── App.jsx # Router component
│   │   ├── SearchPage.jsx # Search input with type-ahead
│   │   └── rxguard_dashboard.jsx # Results dashboard with charts
│   └── package.json
└── data/
    ├── raw/ # Raw FAERS JSON from openFDA
    └── processed/ # Cleaned parquet files
| Layer | Technology |
|---|---|
| Backend | Python 3.12, FastAPI, Streamlit |
| Frontend | React 19, Vite 7, Recharts, React Router |
| Embeddings | sentence-transformers (all-MiniLM-L6-v2, 384-dim) |
| Vector DB | Actian VectorAI DB (HNSW, gRPC) |
| LLM | Google Gemini API |
| Data | openFDA API, DailyMed, pandas, PyArrow |
| NLP | scikit-learn (TF-IDF), spaCy, regex |
| Visualization | Recharts (frontend), Plotly (backend) |
| DevOps | Docker, Docker Compose |
Run with Streamlit or FastAPI + React as described in Quick Start.
- Provision a server (e.g., Vultr) with Docker installed
- Clone the repo and run docker compose up -d to start Actian VectorAI DB
- Open port 50051 for team access: ufw allow 50051/tcp
- Set ACTIAN_DB_HOST=<server-ip>:50051 in each team member's .env
- Run the app with the V3Actian search engine selected
For monitoring, use docker logs vectoraidb for container logs and docker stats vectoraidb for resource usage.
This system is for research and educational purposes. It should not be used as the sole basis for clinical decision-making. Always consult with qualified healthcare professionals for medical advice.
Hacklytics 2026.