Data & ML Engineer with 2.5 years of experience building data pipelines and ML systems. B.Tech in Computer Science from VIT Vellore. Published researcher in NLP — my work on predicting Indian film success using subtitle-derived document vectors was published in the International Journal of Image and Graphics (World Scientific, 2023).
Currently building AI products across computer vision, deep learning, and NLP.
Languages & Frameworks: Python, SQL, FastAPI, Flask, Streamlit
ML & Deep Learning: PyTorch, TensorFlow, Scikit-learn, XGBoost, Hugging Face Transformers
Computer Vision: MediaPipe, OpenCV, dlib
NLP & LLMs: LangChain, Sentence Transformers, Doc2Vec, Claude API
Data & Infrastructure: Snowflake, dbt, ChromaDB, Supabase, PostgreSQL
Tools: Git, Docker, CUDA/cuDNN, Jupyter
| Project | What it does | Stack |
|---|---|---|
| FormCheck AI | Real-time AI gym coach — detects exercise form via pose estimation and gives live voice feedback | MediaPipe, OpenCV, Streamlit, pyttsx3 |
| ClassSense | AI-powered attendance system using face recognition (dlib + SVM) and voice authentication (Resemblyzer) | dlib, SVM, Resemblyzer, Streamlit, Supabase |
| StyleCast | Neural style transfer web app using AdaIN — trained a custom decoder on 40K+ images (MS-COCO & WikiArt), deployed with Flask on Render | PyTorch, VGG-19, Flask, AdaIN |
| Film Success Prediction | NLP pipeline predicting movie profitability from subtitles using Doc2Vec embeddings (F1: 0.77) — published paper | Doc2Vec, XGBoost, AdaBoost, Scikit-learn |
Early Success Prediction of Indian Movies using Subtitles: A Document Vector Approach
International Journal of Image and Graphics, World Scientific, Vol. 23, No. 4, 2023
Built a two-stage ML pipeline (regression → classification) using custom subtitle vectors across 2,700+ Indian films in five languages. Achieved an F1-score of 0.77 and a Cohen's Kappa of 0.48.
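
For readers unfamiliar with the two reported metrics, here is a minimal pure-Python sketch of how an F1-score and Cohen's Kappa can be computed for a binary classifier. The labels are toy values made up for illustration, not data or results from the paper, and the function names `f1_score` and `cohens_kappa` are illustrative rather than taken from the project code.

```python
from collections import Counter


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives, so precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def cohens_kappa(y_true, y_pred):
    """Agreement between labels and predictions, corrected for chance."""
    n = len(y_true)
    observed = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: probability both pick the same class independently.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        return 0.0  # kappa is undefined here; this sketch returns 0.0
    return (observed - expected) / (1 - expected)


# Toy labels (assumed for illustration only): 1 = success, 0 = not.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

print(f"F1: {f1_score(y_true, y_pred):.2f}")
print(f"Kappa: {cohens_kappa(y_true, y_pred):.2f}")
```

Kappa discounts the agreement a classifier would reach by chance, which is why it typically comes out lower than F1 on the same predictions.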
