Skip to content

easyharshmods/pharma-guard

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

12 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

PharmaGuard

RAG-powered drug interaction safety checker built on AWS. Uses Retrieval Augmented Generation over open FDA pharmacology data to return grounded, cited safety reports. Built as a hands-on demonstration of pgvector semantic search, citation enforcement, confidence thresholds, and the concrete difference between grounded and ungrounded LLM responses.

MEDICAL DISCLAIMER: This software is for educational and demonstration purposes only. It is not intended for clinical use. Do not use this software to make medical decisions. Always consult a licensed healthcare professional.

How It Works

  1. Ingestion -- 300 drug labels are fetched from the openFDA API, split into 512-token chunks, embedded with Amazon Titan Embed Text v2 (1024 dimensions), and loaded into Aurora PostgreSQL with pgvector.

  2. Query -- A user asks a drug interaction question via the Streamlit UI. The question is embedded, matched against stored chunks using HNSW cosine similarity search, and the top passages are sent to Amazon Nova Lite with a system prompt that enforces passage-level citations.

  3. Comparison -- A RAG ON/OFF toggle lets users see the same question answered with and without retrieval, side by side. RAG ON responses are grounded in FDA label text with citations. RAG OFF responses come purely from LLM training data with no traceability.

Architecture

                    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                    β”‚   CloudFront    β”‚  HTTPS, AWS Shield
                    β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                             β”‚
                    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”
                    β”‚   ALB (HTTP)    β”‚  SG: CloudFront prefix list only
                    β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                             β”‚
              β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
              β”‚  ECS Fargate (Streamlit)     β”‚  Private subnet, 1 task
              β”‚  Python 3.12, port 8501      β”‚  1 vCPU, 2 GB
              β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                             β”‚ HTTPS via NAT Gateway
              β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
              β”‚  API Gateway (HTTP API v2)   β”‚  Public endpoint
              β”‚  POST /query                 β”‚  Lambda integration
              β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                             β”‚
              β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
              β”‚  Lambda (Python 3.13)        β”‚  Private subnet, 512 MB
              β”‚  RAG Query Handler           β”‚  30s timeout, X-Ray
              β”‚                              β”‚
              β”‚  1. Embed question           │──▢ Bedrock Titan Embed v2
              β”‚  2. pgvector similarity      │──▢ Aurora PostgreSQL
              β”‚  3. Generate cited answer    │──▢ Bedrock Nova Lite v1
              β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                        β”‚              β”‚
           β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
           β”‚ Aurora         β”‚    β”‚ Bedrock              β”‚
           β”‚ Serverless v2  β”‚    β”‚ Titan Embed v2       β”‚
           β”‚ PostgreSQL 15  β”‚    β”‚  (1024-dim vectors)  β”‚
           β”‚ pgvector HNSW  β”‚    β”‚ Nova Lite v1         β”‚
           β”‚ 0.5–4 ACU      β”‚    β”‚  (cross-region EU)   β”‚
           β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

AWS Services Used

Service Role
CloudFront HTTPS CDN in front of ALB. Viewer-to-edge encryption, AWS Shield Standard.
ALB Routes HTTP traffic to ECS. Security group restricts inbound to CloudFront managed prefix list (pl-a3a144ca in eu-central-1).
ECS Fargate Runs the Streamlit container. Single task, private subnet, no auto-scaling. Image pulled from ECR via VPC endpoints.
ECR Private container registry for the Streamlit image. KMS-encrypted, scan-on-push, keeps last 5 images.
API Gateway (HTTP API v2) Public HTTP API with a single POST /query route backed by Lambda.
Lambda RAG query handler. Embeds the question, searches pgvector, calls Nova Lite for generation. VPC-attached (private subnet), X-Ray tracing enabled. Bundled with psycopg, pgvector, pydantic, structlog.
Aurora Serverless v2 PostgreSQL 15 with the vector extension. HNSW index for cosine similarity search. 0.5–4 ACU, KMS-encrypted, private subnet only. Data API enabled for ingestion scripts.
Bedrock Titan Embed Text v2 Embedding model. Converts text to 1024-dimensional vectors. Used by both the ingestion pipeline and the Lambda query handler.
Bedrock Nova Lite v1 LLM for answer generation. Called via cross-region inference profile (eu.amazon.nova-lite-v1:0). System prompt enforces passage-level citations.
S3 Stores chunked JSONL zip files from ingestion. Versioned, KMS-encrypted, lifecycle deletes non-current versions after 30 days.
KMS Single customer-managed key (pharma-guard/main) encrypts Aurora, S3, ECR, Secrets Manager, and CloudWatch log groups. Annual rotation enabled.
Secrets Manager Stores Aurora database credentials. Auto-generated during cluster creation. Never exposed as environment variables.
VPC 2 AZs, 2 public + 2 private subnets, 1 NAT Gateway.
VPC Endpoints Interface endpoints: ecr.api, ecr.dkr, secretsmanager, bedrock-runtime, logs. Gateway endpoint: s3. Minimizes NAT costs and keeps traffic private where possible.
CloudWatch Log groups for Lambda, ECS, API Gateway, VPC Flow Logs (3-day retention). Dashboard with Lambda/API GW/Aurora metrics. Alarms for error rates, latency, and connection counts.
CloudWatch Synthetics Canary runs every 5 minutes, invokes the Lambda with a test query, checks for the citations key in the response. Alarms on 2 consecutive failures.
X-Ray Distributed tracing on Lambda and API Gateway. Subsegments around embed, retrieve, and generate steps.
SSM Session Manager Shell access to ECS tasks without SSH or bastion hosts.

Security Design

  • No public database endpoint. Aurora is in private subnets only. Lambda connects via the VPC.
  • No public API Gateway. Although the HTTP API is technically public, the Streamlit frontend is the only caller (via NAT). The API could be further restricted with IAM auth or a resource policy.
  • Minimal security groups. Each component (ALB, ECS, Lambda, Aurora, VPC endpoints) has its own SG with least-privilege rules. Aurora only accepts connections from the Lambda SG on port 5432.
  • KMS encryption everywhere. A single CMK encrypts all data at rest: Aurora storage, S3 objects, ECR images, Secrets Manager values, and CloudWatch logs.
  • No secrets in code or environment. Aurora credentials are in Secrets Manager. The Lambda reads them at runtime via the AWS SDK.
  • Pre-commit hooks. detect-secrets blocks committed credentials, ruff lints and formats, bandit scans for security issues.

Repository Structure

pharma-guard/
β”œβ”€β”€ .env.example                    # Template with placeholder values
β”œβ”€β”€ .pre-commit-config.yaml         # detect-secrets, ruff, bandit, yamllint
β”œβ”€β”€ pyproject.toml                  # ruff + pytest config
β”‚
β”œβ”€β”€ infra/                          # CDK Python (7 stacks)
β”‚   β”œβ”€β”€ app.py                      # Stack wiring and dependency order
β”‚   β”œβ”€β”€ cdk.json
β”‚   β”œβ”€β”€ requirements.txt
β”‚   └── stacks/
β”‚       β”œβ”€β”€ kms_stack.py            # CMK with rotation
β”‚       β”œβ”€β”€ network_stack.py        # VPC, subnets, SGs, VPC endpoints
β”‚       β”œβ”€β”€ storage_stack.py        # S3 bucket, ECR repository
β”‚       β”œβ”€β”€ database_stack.py       # Aurora Serverless v2, pgvector schema
β”‚       β”œβ”€β”€ api_stack.py            # Lambda + API Gateway
β”‚       β”œβ”€β”€ frontend_stack.py       # ECS Fargate + ALB + CloudFront
β”‚       └── observability_stack.py  # Dashboard, alarms, synthetics canary
β”‚
β”œβ”€β”€ src/query_handler/              # Lambda source code
β”‚   β”œβ”€β”€ handler.py                  # Entry point, RAG pipeline orchestration
β”‚   β”œβ”€β”€ rag.py                      # Embed, retrieve, confidence check
β”‚   β”œβ”€β”€ llm.py                      # Nova Lite generation, citation parsing
β”‚   β”œβ”€β”€ db.py                       # Aurora connection, pgvector queries
β”‚   β”œβ”€β”€ models.py                   # Pydantic request/response models
β”‚   └── requirements.txt            # Bundled into Lambda deployment package
β”‚
β”œβ”€β”€ scripts/                        # Phase 1: local ingestion pipeline
β”‚   β”œβ”€β”€ drug_list.txt               # 300 drug names
β”‚   β”œβ”€β”€ ingest_drugs.py             # openFDA API β†’ chunked JSONL β†’ S3
β”‚   β”œβ”€β”€ load_aurora.py              # S3 β†’ Bedrock embed β†’ Aurora (Data API)
β”‚   β”œβ”€β”€ verify_ingestion.py         # Row counts, vector dims, sample query
β”‚   β”œβ”€β”€ requirements.txt
β”‚   └── README.md
β”‚
β”œβ”€β”€ docker/                         # Phase 2: Streamlit container
β”‚   β”œβ”€β”€ Dockerfile                  # Python 3.12-slim, amd64
β”‚   β”œβ”€β”€ app.py                      # UI with RAG ON/OFF toggle
β”‚   β”œβ”€β”€ api_client.py               # Calls API Gateway
β”‚   β”œβ”€β”€ requirements.txt
β”‚   └── README.md
β”‚
└── tests/                          # Unit tests (mocked AWS calls)
    β”œβ”€β”€ conftest.py
    β”œβ”€β”€ test_ingest_drugs.py
    β”œβ”€β”€ test_load_aurora.py
    β”œβ”€β”€ test_rag.py
    β”œβ”€β”€ test_llm.py
    └── test_handler.py

Prerequisites

  • AWS CLI v2 configured for eu-central-1
  • AWS account with Bedrock model access enabled:
    • amazon.titan-embed-text-v2:0
    • eu.amazon.nova-lite-v1:0 (cross-region inference)
  • Python 3.12+ (3.14 for local scripts, 3.12 for Docker)
  • Docker Desktop (with buildx for cross-platform builds)
  • Node.js 20+ (for CDK CLI)
  • AWS CDK CLI: npm install -g aws-cdk@2.170.0

Step-by-Step Deployment

1. Clone and configure

git clone <repo-url> && cd pharma-guard
cp .env.example .env
pip install pre-commit && pre-commit install

2. Deploy infrastructure (7 CDK stacks)

cd infra
pip install -r requirements.txt
cdk bootstrap aws://<ACCOUNT_ID>/eu-central-1    # first time only
cdk deploy --all --require-approval never

Stacks deploy in dependency order: KMS β†’ Network β†’ Storage β†’ Database β†’ API β†’ Frontend β†’ Observability. Takes ~15-20 minutes on first deploy.

After deployment, copy the stack outputs into your .env:

PHARMA_GUARD_BUCKET=<BucketName output from StorageStack>
AURORA_SECRET_ARN=<SecretArn output from DatabaseStack>
AURORA_CLUSTER_ARN=<from: aws rds describe-db-clusters --query "DBClusters[?starts_with(DBClusterIdentifier,'pharmaguard')].DBClusterArn" --output text>
API_GATEWAY_URL=<ApiGatewayUrl output from ApiStack>

3. Build and push the Streamlit container

Must be done before the ECS service can start (it needs an image in ECR).

cd docker

# Authenticate Docker with ECR
aws ecr get-login-password --region eu-central-1 | \
  docker login --username AWS --password-stdin <ACCOUNT_ID>.dkr.ecr.eu-central-1.amazonaws.com

# Build for amd64 (required β€” Fargate runs x86_64)
docker buildx build --platform linux/amd64 --provenance=false \
  -t pharma-guard/streamlit . --load

# Tag and push
docker tag pharma-guard/streamlit:latest \
  <ACCOUNT_ID>.dkr.ecr.eu-central-1.amazonaws.com/pharma-guard/streamlit:latest
docker push \
  <ACCOUNT_ID>.dkr.ecr.eu-central-1.amazonaws.com/pharma-guard/streamlit:latest

4. Run ingestion (Phase 1)

Runs from your laptop. Uses the RDS Data API (no VPC access needed).

cd scripts
pip install -r requirements.txt

# Fetch FDA drug labels, chunk, upload to S3
python ingest_drugs.py
# ~3 minutes for 300 drugs. Note the S3 key in the output.

# Embed chunks via Bedrock and load into Aurora
python load_aurora.py --s3-key ingestion/drug_chunks_<DATE>.jsonl.zip
# ~3 minutes for ~1700 chunks.

# Verify: row counts, vector dimensions, sample similarity search
python verify_ingestion.py

5. Access the application

Open the CloudFront URL from the FrontendStack output:

https://<distribution-id>.cloudfront.net

Enter drug names, type a clinical question, and click Check Interactions. Toggle RAG Mode off in the sidebar to see the ungrounded comparison.

6. Force ECS redeployment (after image updates)

aws ecs update-service \
  --cluster <cluster-name> \
  --service <service-name> \
  --force-new-deployment \
  --region eu-central-1

RAG Pipeline Detail

User question
    β”‚
    β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Titan Embed v2      β”‚  Convert question to 1024-dim vector
β”‚ (1024 dimensions)   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
          β”‚
          β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ pgvector HNSW       β”‚  Cosine similarity search
β”‚ WHERE drug_name     β”‚  Filter by drug names (if provided)
β”‚   = ANY(...)        β”‚  Threshold: configurable (default 0.50)
β”‚ ORDER BY cosine     β”‚  Return top 5 chunks
β”‚ LIMIT 5             β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
          β”‚
          β–Ό
   Chunks above threshold?
      β”‚           β”‚
     YES          NO
      β”‚           β”‚
      β–Ό           β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Nova Lite β”‚  β”‚ Fallback     β”‚
β”‚ + system  β”‚  β”‚ "Insufficientβ”‚
β”‚   prompt  β”‚  β”‚  data..."    β”‚
β”‚ + context β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β””β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜
      β”‚
      β–Ό
  Response contains
  [Passage N] citations?
      β”‚           β”‚
     YES          NO
      β”‚           β”‚
      β–Ό           β–Ό
  Return with    Fallback
  interactions   response
  + citations

Fallback conditions (any one triggers it):

  • No chunks retrieved from pgvector
  • All chunks below the confidence threshold
  • Nova Lite returns INSUFFICIENT_DATA
  • Nova Lite response contains no [Passage N] markers

Teardown

cd infra
cdk destroy --all

If destroy fails due to non-empty resources:

# Delete ECR images
aws ecr batch-delete-image \
  --repository-name pharma-guard/streamlit \
  --image-ids "$(aws ecr list-images --repository-name pharma-guard/streamlit --query 'imageIds[*]' --output json)" \
  --region eu-central-1

# Empty S3 bucket
aws s3 rm s3://<bucket-name> --recursive

# Retry destroy
cdk destroy --all

Cost Estimate

Estimated cost for a weekend build and demo session:

Service Estimate
Aurora Serverless v2 (0.5 ACU, ~8 hrs) ~$0.48
NAT Gateway (data transfer) ~$1.00
Bedrock Titan Embed (~1700 chunks + queries) ~$0.30
Bedrock Nova Lite (demo queries) ~$0.50
ECS Fargate (1 task, ~8 hrs) ~$0.40
KMS (1 key, prorated) ~$0.03
Secrets Manager (1 secret, prorated) ~$0.01
VPC Endpoints (7 interface endpoints, ~8 hrs) ~$0.80
Lambda, API Gateway, S3, CloudFront, CloudWatch < $0.10
Total (weekend) ~$4-6

Cost tip: VPC interface endpoints and NAT Gateway are the largest ongoing costs. Tear down promptly after your demo session.

Running Tests

pip install pytest pydantic structlog tiktoken
python -m pytest tests/ -v

All 22 tests run without AWS credentials (mocked AWS calls).

Extending This Project

  • Add more drug data sources (DailyMed, EU EMA)
  • Tune HNSW parameters (m, ef_construction) for larger datasets
  • Add RDS Proxy for Lambda connection pooling at scale
  • Add Cognito authentication for the Streamlit frontend
  • Multi-region deployment with Aurora Global Database
  • EventBridge scheduled re-ingestion for FDA label updates
  • Feedback loop to track and improve citation quality over time
  • Swap Nova Lite for Claude for higher-quality generation

About

RAG-powered drug interaction checker built on AWS. Demonstrates pgvector semantic search, citation enforcement, and confidence thresholds using open FDA pharmacology data.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors