AI/ML Engineer and Data Scientist with 4+ years of experience building and shipping production ML and GenAI systems across pharma, healthcare, and enterprise. My focus is agentic AI systems and LLM pipelines that run in production, not just experiments.
- 🏢 Currently: AI/ML Engineer @ Merck — production RAG pipelines, LLM summarization, embedding search, and anomaly detection
- 🎓 M.S. Data Science — DePaul University
- 🔨 Built: Agentic RAG agent with LangGraph + CRAG + HITL, hybrid RAG system with CI evaluation pipeline
- 🎯 Target roles: AI Engineer · GenAI Engineer · ML Engineer · LLM Engineer
| Role | What they want | What I have |
|---|---|---|
| AI Engineer | RAG, LLM APIs, vector DBs, production pipelines | ✅ Production RAG + monitoring + LangGraph agent shipped |
| GenAI Engineer | LangChain, LangGraph, agents, CRAG, HITL | ✅ 6-node LangGraph agent with CRAG + real HITL + LangSmith tracing |
| ML Engineer | PyTorch, MLOps, cloud deployment, pipelines | ✅ 4+ years PyTorch/TF, AWS/Azure/GCP, MLflow, Kafka, Spark |
1. AdaptiveRAG — Agentic RAG System with LangGraph, CRAG, and HITL
A LangGraph agent that routes each query to web search, document retrieval, or both, grades the retrieved results before answering, and pauses for human input when unsure.
- 🔀 Query routing — LLM classifies each query to web search, document retrieval, or hybrid
- ✅ CRAG — 100% block rate on low-confidence retrievals before hitting the LLM
- ⏸️ Real HITL — LangGraph `interrupt_before` pauses the graph mid-execution for human clarification
- 🔭 LangSmith tracing — all 6 nodes traced end-to-end for 100% of executions
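
The route, grade, and pause flow described above can be sketched in plain Python. This is a minimal illustration of the control flow only, not the AdaptiveRAG code: the keyword-based `route_query` stands in for the LLM classifier, and `Retrieval`, `grade`, `answer_or_pause`, and the `0.5` threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical cutoff: retrievals scoring below this count as low-confidence.
CONFIDENCE_THRESHOLD = 0.5


@dataclass
class Retrieval:
    text: str
    score: float  # relevance score in [0, 1], assumed to come from a grader


def route_query(query: str) -> str:
    """Toy stand-in for the LLM router: returns 'web', 'documents', or 'hybrid'."""
    q = query.lower()
    wants_web = any(word in q for word in ("latest", "today", "news"))
    wants_docs = any(word in q for word in ("my notes", "document", "pdf"))
    if wants_web and wants_docs:
        return "hybrid"
    if wants_web:
        return "web"
    return "documents"  # default route in this sketch


def grade(retrievals: List[Retrieval]) -> List[Retrieval]:
    """CRAG-style gate: keep only retrievals at or above the threshold."""
    return [r for r in retrievals if r.score >= CONFIDENCE_THRESHOLD]


def answer_or_pause(
    query: str,
    retrievals: List[Retrieval],
    generate: Callable[[str, List[Retrieval]], str],
) -> Dict[str, Optional[str]]:
    """Call the generator only when graded context survives; otherwise pause."""
    route = route_query(query)
    kept = grade(retrievals)
    if not kept:
        # Nothing passed grading: hand control to a human instead of the LLM.
        return {"status": "needs_human", "route": route, "answer": None}
    return {"status": "answered", "route": route, "answer": generate(query, kept)}
```

In a graph framework, the `needs_human` branch is where an interrupt would sit; here it is just a return value so the sketch stays dependency-free.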
2. StudyRAG — Production RAG System with Hybrid Retrieval and Observability
Ask questions over your own documents — hybrid AI search with full observability and CI-gated evaluation
- ⚡ Hybrid retrieval — BM25 + vector search via Reciprocal Rank Fusion across 107 document chunks
- 🎯 Cross-encoder reranking — `ms-marco-MiniLM-L-6-v2` with citation enforcement (100% refusal rate on unsupported queries)
- 🔭 LangSmith tracing + SQLite metrics store + live Streamlit dashboard
- 🧪 Ragas CI gate — 50-item golden dataset, blocks deployments on metric regression
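
For readers unfamiliar with Reciprocal Rank Fusion, the hybrid retrieval step above can be sketched as follows. This is a generic illustration, not the StudyRAG implementation: the function name, the chunk ids, and the smoothing constant `k = 60` (a commonly used default for RRF) are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of chunk ids into one ranking.

    Each chunk scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken by chunk id for determinism.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda chunk_id: (-scores[chunk_id], chunk_id))


# Hypothetical top-3 results from each retriever.
bm25_ranking = ["c1", "c2", "c3"]
vector_ranking = ["c3", "c1", "c4"]

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
print(fused)  # ['c1', 'c3', 'c2', 'c4']
```

Chunks that both retrievers rank highly (`c1`, `c3`) rise to the top, which is why RRF works without normalizing BM25 and vector similarity scores onto a common scale.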
| Area | Technologies |
|---|---|
| 🔍 RAG & Retrieval | RAG · Hybrid Search · BM25 · Cross-Encoder Reranking · FAISS · Pinecone · ChromaDB · CRAG · Self-RAG |
| 🤖 Agents & Graphs | LangGraph · Query Routing · HITL · Conditional Edges · MemorySaver · Subgraphs · MCP · Tavily |
| 🔭 Observability | LangSmith · Tracing · Latency Metrics · Query Logging · Ragas · SQLite · Streamlit Dashboards |
| 💬 LLM Frameworks | LangChain · OpenAI GPT-4o · Google Gemini · Claude · Prompt Engineering · CoT · Few-Shot |
| 🎯 Fine-Tuning | LoRA · QLoRA · Hugging Face Transformers · Parameter-Efficient Fine-Tuning |
| 📊 ML & Data Science | PyTorch · TensorFlow · Scikit-learn · XGBoost · Random Forests · SVM · CNN · NER · SHAP · Time-Series |
| Company | Role | AI Impact |
|---|---|---|
| Merck · 2025–Present | AI/ML Engineer | Production RAG ↓45% evidence-gathering time · LLM summarization ↓45% review time · Embedding search ↓40% lookup time |
| Blue Cross Blue Shield · 2024–2025 | Data Scientist | LLM Q&A ↓35% SQL requests · Embedding search ↓50% lookup time · LLM summarization ↓60% review time |
| Dell Technologies · 2019–2021 | Data Scientist | Predictive failure models across 100+ product lines · Random Forest segmentation · Pricing analysis on 3M+ transactions |