PratyayRajak/README.md

⚡ Pratyay Rajak

GenAI Engineer | Specializing in Agentic Workflows, Production RAG & LLMOps

Portfolio | LinkedIn | Hugging Face

"Models are cheap. Engineering is rare."

I build AI systems that survive the transition from a notebook to production. My focus is on multi-agent orchestration (LangGraph), automated evaluation (Ragas), and secure, cost-effective deployment.


📌 The Numbers That Matter

  • 0.72 → 0.87: Faithfulness lift on Legal RAG through hybrid retrieval and reranking.
  • 0.09%: Share of model parameters trained for domain-specific fine-tuning with QLoRA.
  • 28%: Precision improvement in context retrieval using RRF (Reciprocal Rank Fusion).
  • Zero GPU: Production deployment of a multi-stage legal agent on CPU-only infrastructure.
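
As a rough illustration of the Reciprocal Rank Fusion step mentioned above, the sketch below merges two ranked lists (for example, one from BM25 and one from vector search). The function name, the document IDs, and the constant `k = 60` are assumptions chosen for the example and are not taken from the project code.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k = 60 is a commonly used default (assumption).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical retriever outputs, best match first.
bm25_hits = ["a", "b", "c"]
vector_hits = ["b", "c", "d"]
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

Documents that appear in both lists ("b" and "c" here) rise above documents that only one retriever found, which is the behavior a hybrid search setup typically relies on.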

🚀 Featured Production Work

⚖️ Government Judicial RAG System

Commissioned for Indian District Courts

  • The Challenge: Processing massive volumes of bilingual (Hindi/English) legal filings with strict PII compliance.
  • The Solution: A 7-node LangGraph workflow featuring IndicTrans2 for translation and RAPTOR for hierarchical retrieval over recursively clustered and summarized chunks.
  • Compliance: Integrated Microsoft Presidio to mask sensitive identifiers (names, Aadhaar numbers, litigant details) before model ingestion, as legal requirements demand.
  • Tech: LangGraph, Neo4j, ChromaDB, Redis, FastAPI, Gradio.
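
To make the masking-before-ingestion idea concrete, here is a minimal stand-in using only the standard library. It is not Presidio's API and it covers far less than Presidio does: it handles only Aadhaar-like numbers, assuming a 12-digit format that does not start with 0 or 1 and may be grouped 4-4-4. The function name and the `<AADHAAR>` placeholder are hypothetical.

```python
import re

# Assumed Aadhaar shape: 12 digits, first digit 2-9, optionally grouped
# 4-4-4 with spaces or hyphens. A production recognizer would also
# validate the checksum and use surrounding context.
AADHAAR_PATTERN = re.compile(r"\b[2-9]\d{3}[ -]?\d{4}[ -]?\d{4}\b")


def mask_aadhaar(text, placeholder="<AADHAAR>"):
    """Replace Aadhaar-like numbers so they never reach the model."""
    return AADHAAR_PATTERN.sub(placeholder, text)


print(mask_aadhaar("Petitioner Aadhaar 2345 6789 0123 is on file."))
```

Running the masking step before translation, chunking, or embedding means every downstream component sees only the placeholder, which is the ordering the Compliance bullet describes.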

🤖 Autonomous SWE-Agent (GitHub Issue Resolver)

  • The Challenge: Automating the end-to-end resolution of GitHub issues while ensuring code safety.
  • The Solution: A multi-agent system (Research → Coder → Tester → PR Writer) that resolves issues in a Docker-sandboxed environment.
  • Safety: A network-isolated sandbox with resource caps keeps untrusted code from executing on the host system.
  • Tech: Gemini 2.0, LangGraph, Docker, LangSmith.
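
One way a zero-network, resource-capped sandbox could be launched is sketched below. The helper only builds a `docker run` argument list and does not start a container. The function name, the image name, and the specific limits are assumptions for illustration, while the flags themselves are standard Docker CLI options.

```python
def build_sandbox_command(image, command, memory="512m", cpus="1.0", pids_limit=128):
    """Build a `docker run` argument list for running untrusted code.

    The limits are illustrative defaults, not the project's settings.
    """
    return [
        "docker", "run",
        "--rm",                            # remove the container when it exits
        "--network", "none",               # no network access inside the sandbox
        "--memory", memory,                # cap RAM
        "--cpus", cpus,                    # cap CPU share
        "--pids-limit", str(pids_limit),   # cap process count (fork bombs)
        "--read-only",                     # root filesystem is read-only
        image,
        *command,
    ]


# Hypothetical image and test command.
args = build_sandbox_command("python:3.12-slim", ["python", "-m", "pytest", "-q"])
print(" ".join(args))
```

The resulting list can be passed to `subprocess.run(args, timeout=...)` so that a wall-clock limit applies in addition to the container limits.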

📊 LLM Eval & Drift Monitoring Pipeline

  • The Challenge: Catching quality degradation in production before users do.
  • The Solution: A production-ready monitoring system that uses a medallion architecture on Delta Lake to ingest completions in real time.
  • Key Feature: Automated MLflow alerts that trigger when faithfulness or context recall drifts by more than 0.05 from the rolling baseline.
  • Tech: PySpark, Delta Lake, Ragas, MLflow, Groq.
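
The alert rule above (a metric moving more than 0.05 away from a rolling baseline) can be sketched in plain Python. The class name, the window size, and the choice to compare against the mean of recent scores are assumptions. The actual pipeline runs on PySpark and MLflow and may compute its baseline differently.

```python
from collections import deque
from statistics import mean


class DriftMonitor:
    """Flag a score that deviates from the rolling mean by more than a threshold."""

    def __init__(self, window=50, threshold=0.05):
        self.history = deque(maxlen=window)  # most recent scores only
        self.threshold = threshold

    def check(self, score):
        """Return True if `score` drifts from the current baseline, then record it."""
        drifted = False
        if self.history:
            baseline = mean(self.history)
            drifted = abs(score - baseline) > self.threshold
        self.history.append(score)
        return drifted


# Hypothetical faithfulness scores from successive evaluation batches.
monitor = DriftMonitor(window=3)
for score in [0.87, 0.86, 0.88, 0.78]:
    if monitor.check(score):
        print(f"drift alert: faithfulness={score}")
```

In this sketch a drifted score is still added to the history, so a sustained drop gradually becomes the new baseline. Excluding flagged scores from the window is a reasonable alternative when alerts should keep firing until the metric recovers.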

🧰 Tech Stack

| Category | Tools & Frameworks |
| --- | --- |
| GenAI & Orchestration | LangChain, LangGraph, Multi-Agent Systems, Agentic Workflows |
| RAG & Retrieval | Hybrid Search (BM25 + Vector), RRF, Cross-Encoders, Semantic Chunking |
| LLMOps & Eval | Ragas, LangSmith, MLflow, Weights & Biases, Delta Lake |
| Fine-Tuning | PEFT, LoRA/QLoRA, Unsloth, Hugging Face Transformers |
| Vector/Graph DBs | ChromaDB, FAISS, Neo4j, Redis |
| Backend & Core | FastAPI, Docker, GitHub Actions (CI/CD), Python, SQL, TypeScript |

🎓 Education & Experience

  • M.Sc. Data Science | IIIT Lucknow (Expected June 2026)
  • Graduate Teaching Assistant | IIIT Lucknow (Mathematics Dept)
  • B.Sc. (Hons) Mathematics | University of Delhi (2021 – 2024)

Actively seeking remote AI Engineer roles.
📫 pratyayrajak18@gmail.com

Pinned

  1. Call-automation (Public, Python)
  2. llm-eval-drift-pipeline (Public, Python)
  3. multi-agent-dev-system (Public, Python)
  4. Ramcharitra_Manas (Public, Python)