MicroRAG

A scalable, microservice-based Retrieval‑Augmented Generation system with Valkey‑powered queues, LangGraph orchestration, LangSmith tracing, and AWS EC2 + Load Balancer deployment.

Overview

MicroRAG is a production-grade platform that analyzes resumes and job descriptions (JDs), generates gap analyses and improvement suggestions, and exposes the workflow via APIs and a minimal UI. The system is designed to be horizontally scalable, event-driven, and observability-first.

Core ideas:

  • Microservices for separation of concerns and independent scaling.
  • RAG pipeline for factual, source‑grounded responses.
  • Valkey as the queue backbone for decoupling producers/consumers.
  • LangGraph for deterministic, recoverable orchestration.
  • LangSmith for tracing, evaluation, and monitoring.
  • AWS EC2 + ALB for simple, flexible deployment.

Application Flow

  1. Upload JD and Resume (PDF) → Enqueue in Valkey → Generate file ID.

  2. Storage: Raw PDF stored in mounted Docker volume (/mnt/volume).

  3. Worker Step 1: Pick up file → Convert PDF pages into images → Save in Docker volume (steps 3–4 are sketched in code after this list).

  4. Worker Step 2: Send images to OpenAI Vision/Text API → Extract text.

  5. Processing: Resume text + JD passed to LangGraph pipeline.

  6. LangGraph Flow:

    • Rewrite JD (clean, concise, ATS‑friendly).
    • Perform analysis on rewritten JD.
    • Generate suggestions & gap analysis.
  7. Result Delivery: JSON/PDF report available for download.
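
Steps 3–4 above are the file-heavy part of the pipeline. Below is a minimal worker sketch, assuming a Valkey list named `jobs`, a job payload carrying a `file_id`, files stored under `/mnt/volume`, and the `redis`, `pdf2image`, and `openai` Python packages; the queue name, payload fields, and model choice are illustrative, not the project's actual code.

```python
# Hypothetical consumer for worker steps 1-2 (flow steps 3-4)
import base64
import json
from pathlib import Path

import redis                              # Valkey is Redis-compatible, so redis-py works
from pdf2image import convert_from_path
from openai import OpenAI

queue = redis.Redis(host="valkey", port=6379, decode_responses=True)
llm = OpenAI()                            # reads OPENAI_API_KEY from the environment
VOLUME = Path("/mnt/volume")

def page_to_text(image_path: Path) -> str:
    """Send one page image to an OpenAI vision-capable model and return the extracted text."""
    b64 = base64.b64encode(image_path.read_bytes()).decode()
    response = llm.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Extract all text from this page."},
                {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content

while True:
    # Block until the api-gateway pushes a job; brpop returns (queue_name, payload)
    _, payload = queue.brpop("jobs")
    job = json.loads(payload)
    pdf_path = VOLUME / f"{job['file_id']}.pdf"

    # Worker Step 1: convert PDF pages into images stored in the shared volume
    texts = []
    for i, page in enumerate(convert_from_path(str(pdf_path), dpi=200)):
        image_path = VOLUME / f"{job['file_id']}_page_{i}.png"
        page.save(image_path, "PNG")
        # Worker Step 2: send the image to the OpenAI API and collect the extracted text
        texts.append(page_to_text(image_path))

    (VOLUME / f"{job['file_id']}.txt").write_text("\n".join(texts))
```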


High‑Level Architecture

See docs/architecture.png for the architecture diagram.

Services

  1. api-gateway (FastAPI)

    • Upload JD + Resume
    • Enqueue jobs in Valkey
    • Return job ID
  2. pdf-worker

    • Reads jobs from Valkey
    • Converts PDF pages → Images
    • Stores in mounted Docker volume
    • Reads images
    • Sends to OpenAI API for text extraction
    • Stores extracted text
  3. rag-orchestrator (LangGraph; see the graph sketch after this list)

    • Rewrite JD
    • Compare resume & rewritten JD
    • Generate improvement suggestions
  4. reporter

    • Consolidates analysis
    • Outputs JSON/PDF reports
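
The rag-orchestrator's three steps map naturally onto a LangGraph StateGraph. Below is a minimal sketch, assuming a dict-style state and stubbed node functions; the real graph lives in app/graph.py, and the node and field names here are illustrative.

```python
# Hypothetical shape of the rag-orchestrator graph
from typing import TypedDict

from langgraph.graph import StateGraph, START, END

class AnalysisState(TypedDict, total=False):
    jd: str               # raw job description text
    resume: str           # extracted resume text
    rewritten_jd: str
    analysis: str
    suggestions: str

def rewrite_jd(state: AnalysisState) -> AnalysisState:
    # Would call the LLM to produce a clean, concise, ATS-friendly JD (stubbed)
    return {"rewritten_jd": state["jd"]}

def analyze(state: AnalysisState) -> AnalysisState:
    # Would compare the resume against the rewritten JD (stubbed)
    return {"analysis": "gap analysis placeholder"}

def suggest(state: AnalysisState) -> AnalysisState:
    # Would turn the analysis into concrete improvement suggestions (stubbed)
    return {"suggestions": "suggestions placeholder"}

builder = StateGraph(AnalysisState)
builder.add_node("rewrite_jd", rewrite_jd)
builder.add_node("analyze", analyze)
builder.add_node("suggest", suggest)
builder.add_edge(START, "rewrite_jd")
builder.add_edge("rewrite_jd", "analyze")
builder.add_edge("analyze", "suggest")
builder.add_edge("suggest", END)

graph = builder.compile()
result = graph.invoke({"jd": "...", "resume": "..."})
print(result["suggestions"])
```

Because every node reads from and writes to the same typed state, each step can be retried or resumed independently, which is what makes LangGraph a good fit for the deterministic, recoverable orchestration described above.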

📦 Tech Stack

  • Languages: Python (FastAPI, workers)
  • Orchestration: LangGraph
  • Observability: LangSmith, CloudWatch
  • Queue: Valkey (Redis‑compatible)
  • Database: MongoDB (via devcontainer for metadata & tracing)
  • Infra: Docker, Docker Compose, AWS EC2, ALB
  • CI/CD: GitHub Actions → EC2 deploy via SSH/scp
  • Dev Tools: .devcontainer for VSCode, Dockerfiles for all services

📁 Repo Structure

scalable-rag/
├─ app/
│  ├─ main.py
│  ├─ server.py
│  ├─ graph.py
│  ├─ db/
│  │  ├─ client.py
│  │  ├─ db.py
│  │  └─ collections/
│  ├─ queue/
│  └─ utils/
│     └─ file.py
├─ docker-compose.yaml
├─ docker-compose.prod.yaml
├─ Dockerfile
├─ requirements.txt
├─ freeze.sh
├─ run.sh
├─ start_worker.sh
├─ venv/
├─ .devcontainer/
└─ docs/
   └─ architecture.png

Local Development (with Dev Container)

This project ships with a ready‑to‑use VSCode Dev Container for consistent local development.

Steps:

# 1. Clone the repo
git clone https://github.com/mukulpythondev/MicroRAG.git
cd MicroRAG

# 2. Open in VSCode with Dev Containers extension installed
# VSCode will detect `.devcontainer` and build environment automatically.

# 3. Start services inside container
docker compose up -d

# 4. Run API service (FastAPI)
docker compose run app python app/server.py

# 5. Start a worker (PDF→Image, OCR, LangGraph orchestration)
docker compose run app bash start_worker.sh

Environment Variables

Create a .env file in the project root with the following variables:

OPENAI_API_KEY=
LANGSMITH_TRACING=true
LANGSMITH_ENDPOINT=
LANGSMITH_API_KEY=
LANGSMITH_PROJECT=
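
A minimal sketch of how a service might validate these variables at startup, assuming the python-dotenv package; LangSmith and the OpenAI client read their settings directly from the environment.

```python
# Hypothetical startup check for required environment variables
import os

from dotenv import load_dotenv

load_dotenv()  # populate os.environ from the .env file in the project root

REQUIRED = ["OPENAI_API_KEY", "LANGSMITH_API_KEY", "LANGSMITH_PROJECT"]
missing = [name for name in REQUIRED if not os.getenv(name)]
if missing:
    raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")
```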

🛠️ Deployment on AWS EC2

  • Provision EC2 instances (Ubuntu 22.04+)
  • Install Docker + Docker Compose
  • Clone repo → Pull environment secrets from AWS SSM/Secrets Manager (see the sketch below)
  • Run services with docker compose -f docker-compose.prod.yaml up -d
  • Attach EC2 instances to AWS Application Load Balancer (ALB)
  • Scaling: Add/remove EC2 instances behind ALB
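
A minimal sketch of pulling a secret from AWS SSM Parameter Store with boto3; the parameter name /microrag/openai_api_key and the region are assumptions, and Secrets Manager would work analogously via get_secret_value.

```python
# Hypothetical helper to load a secret before starting the services
import boto3

ssm = boto3.client("ssm", region_name="us-east-1")  # region is an assumption

def get_param(name: str) -> str:
    """Fetch a SecureString parameter, decrypted, from SSM Parameter Store."""
    response = ssm.get_parameter(Name=name, WithDecryption=True)
    return response["Parameter"]["Value"]

openai_api_key = get_param("/microrag/openai_api_key")
```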

📈 Performance Notes

  • Store intermediate files in Docker volume (fast local access)
  • Parallelize workers by increasing consumer count
  • Batch OCR requests to OpenAI where possible (see the sketch below)
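
A minimal sketch of batching several page images into a single OpenAI request instead of one call per page; the file-name pattern, model, and prompt are assumptions.

```python
# Hypothetical batched text extraction for multiple page images
import base64
from pathlib import Path

from openai import OpenAI

client = OpenAI()

def extract_text_batched(image_paths: list[Path]) -> str:
    """Send all page images in one request and return the combined extracted text."""
    content = [{"type": "text", "text": "Extract the full text of each page, in order."}]
    for path in image_paths:
        b64 = base64.b64encode(path.read_bytes()).decode()
        content.append({"type": "image_url",
                        "image_url": {"url": f"data:image/png;base64,{b64}"}})
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": content}],
    )
    return response.choices[0].message.content

pages = sorted(Path("/mnt/volume").glob("job123_page_*.png"))  # hypothetical file names
print(extract_text_batched(pages))
```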
