class AIEngineer:
    def __init__(self):
        self.name = "Noach"
        self.role = "AI / ML Engineer"
        self.location = "{{CITY, COUNTRY}}"
        self.focus = ["LLMs", "RAG", "Agents", "MLOps", "Eval"]
        self.stack = ["PyTorch", "LangChain", "FastAPI", "Docker", "AWS"]
        self.shipping = "RAG assistant w/ eval harness + observability"
        self.learning = ["LangGraph", "DSPy", "vLLM at scale"]
        self.philosophy = "The best ML engineers ship, and measure."

    def looking_for(self):
        return "AI / ML Engineer roles where I own systems end-to-end."
me = AIEngineer()

🔭 Currently building: a production-grade RAG assistant with RAGAS eval + structured tracing
🌱 Currently learning: LangGraph orchestration, vLLM serving, RLHF basics
💬 Ask me about: RAG architecture trade-offs · evaluating non-deterministic systems · LLM cost optimization
🎯 Open to: ML / AI Engineer roles · contract work on LLM systems · interesting OSS collabs
⚡ Fun fact: The best ML engineers spend more time on data and evals than on models
Five repositories covering retrieval, agents, fine-tuning, MLOps, and computer vision. Each ships with a live demo and a numbers-driven results section.
| Project | What it does | Results |
| --- | --- | --- |
| Production-grade RAG over your docs | Hybrid retrieval (BM25 + dense), reranking, citation-tracked answers, RAGAS eval harness in CI, FastAPI + Streamlit on HF Spaces. | 📊 Faithfulness 0.91 · Answer Relevancy 0.88 · p95 480 ms |
| Multi-tool LLM agent that drafts research memos | LangGraph agent: plans, web-searches, reads PDFs, runs code, cites sources. Includes tool-use traces and a regression suite for non-deterministic outputs. | 📊 Task completion 87% · Avg tool calls 6.3 · Cost / brief $0.18 |
| Train → register → deploy → monitor, automated | Full MLOps loop: DVC-versioned data, MLflow tracking, drift-triggered retraining, model registry, FastAPI serving, Prometheus + Grafana. | 📊 Reproducible · Drift-triggered retraining · 99.5% uptime SLO |
| QLoRA fine-tune of Llama-3.1-8B on a niche task | Data prep pipeline, QLoRA training notebook, eval vs base model, and an honest write-up on when fine-tuning beats RAG (and when it doesn't). | 📊 +18% domain accuracy vs base · 4× cheaper inference · 🚀 Model on 🤗 · Repo |
| Real-time defect detection on a (simulated) manufacturing line | YOLOv8 fine-tuned on a custom dataset → ONNX → FastAPI → Streamlit dashboard that streams predictions over webcam or uploaded video. Includes a bias / failure-mode analysis section. | 📊 mAP@0.5 0.93 · Real-time 38 FPS on CPU |
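For a flavor of what "hybrid retrieval (BM25 + dense)" can look like, here is a minimal sketch of merging two ranked result lists. It assumes reciprocal rank fusion (RRF) as the merge strategy, which the project description above does not specify. The function name `rrf_fuse`, the constant `k=60`, and the document IDs are illustrative only and not taken from the repository.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Merge ranked lists of doc IDs with reciprocal rank fusion.

    Assumption: RRF stands in for whatever fusion the real project uses.
    Each document scores 1 / (k + rank) per list it appears in.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical outputs of a BM25 retriever and a dense retriever.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
dense_ranking = ["doc_b", "doc_d", "doc_a"]

fused = rrf_fuse([bm25_ranking, dense_ranking])
print(fused)
```

Documents that both retrievers rank highly rise to the top, while a document found by only one retriever still survives into the candidate set that a reranker would then score.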
I'm open to ML Engineer / AI Engineer roles, contract work on LLM systems, and collaborations on interesting open-source AI projects.
If you found me through a job posting — thanks for actually clicking through. I'd love to hear what you're building.