Lead Data Scientist | Kaggle Competitions Master
With over a decade of experience in data science and AI, I specialize in machine learning, NLP, computer vision, and advanced signal processing. I've led projects and teams, collaborating with national laboratories and cross-functional groups to solve complex challenges across various industries. While I'm passionate about staying at the forefront of AI through Kaggle Competitions, where I'm ranked in the top 1% of ~173,000 competitors, I also believe in keeping solutions practical and straightforward. I follow the "Keep It Simple" philosophy, focusing on delivering reliable, scalable results with a pragmatic approach. I'm dedicated to solving real-world problems, driving business impact through actionable insights, and sharing knowledge with the community.
- Programming Languages: Python, SQL, MATLAB, Shell Scripting
- Machine Learning Specialties: Retrieval-Augmented Generation (RAG), Generative AI, Large Language Models (LLMs), Anomaly Detection, Computer Vision, Image and Signal Processing
- Machine Learning Frameworks: TensorFlow, PyTorch, scikit-learn, XGBoost, Hugging Face, LangChain, FAISS, RAPIDS, PySpark, OpenCV, Accelerate, Weights & Biases
- Cloud Platforms: AWS, GCP
- Tools & Technologies: Docker, Git, Linux, Project Management
Here are some of my GitHub repositories:
- PyTorch and LLMs
  Implement a generic workflow and best practices for fine-tuning Large Language Models (LLMs) using PyTorch.
- LangChain and Synthetic Data for RAG Evaluation
  Demonstrate the use of LangChain and Llama2-Chat for synthetic data generation in Information Retrieval (IR) and Retrieval-Augmented Generation (RAG) evaluations.
- Daily Object Detection Pipeline: YOLO + AWS
  Automate a pipeline that performs daily object detection on Columbus Circle EarthCam images using YOLOv5, deployed with AWS Lambda for seamless cloud processing and visualization.
- StableDiffusion - Image to Text
  Develop and fine-tune a model capable of predicting the prompt used to create an AI-generated image.
- PII Detection and BIO Synthetic Data Generation
  Focus on personally identifiable information (PII) entity detection and performance enhancement through synthetic data generation.
- Ecommerce Recommender System
  Build a multi-objective recommender system using candidate ranker models, optimized for large-scale e-commerce datasets, to predict user interactions such as clicks, carts, and orders.
- LLM Serving and Inference
  Deploy large language models (LLMs) on consumer-grade CPU hardware, emphasizing high-throughput and memory-efficient inference.
For a comprehensive list of my published work, please visit my Google Scholar profile.
- AWS Certified Developer – Associate
- AWS Certified Machine Learning – Specialty
- TensorFlow Developer
- SQL (HackerRank)
- Email: dunlap0924@gmail.com
- GitHub: mddunlap924
- Kaggle: dunlap0924
- LinkedIn: myles-dunlap
Feel free to explore my projects and reach out for collaborations or just to chat about data science and AI!