Quantitative Modeling | Data Science | Machine Learning Systems
I am an applied mathematician and data scientist with 10+ years of experience designing quantitative models, building machine learning systems, and working with distributed computing frameworks. My work focuses on connecting mathematical theory with practical implementation, especially in machine learning, probabilistic modeling, and scalable data systems.
Example output from fractal visualization tool (C/GTK project below)
- Approximate Nearest Neighbor (ANN) algorithms
- Retrieval-Augmented Generation (RAG) systems
- Bayesian modeling and change point detection
- Distributed computing (Hadoop, Spark)
- From-scratch implementations of machine learning algorithms
Layered graph ANN structure inspired by HNSW, emphasizing clarity, experimentation, and tunable performance.
- Cosine and Euclidean similarity
- Batch construction and incremental insertion
- Graph connectivity via MST-based reconnection
- Empirical evaluation (recall, inflation, timing)
https://github.com/byoung77/Approximate-Nearest-Neighbors-Project
Build Times vs. Dataset Size for ANN System
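As a rough illustration of the similarity measures and the recall metric listed for this project, the sketch below computes recall against a brute-force ground truth in plain Python. It is not taken from the repository: the function names (`cosine_similarity`, `euclidean_distance`, `exact_top_k`, `recall_at_k`) and the `metric` parameter are hypothetical, and the project's own evaluation code may be organized differently.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def euclidean_distance(u, v):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def exact_top_k(query, points, k, metric="cosine"):
    """Brute-force ground truth: indices of the k nearest points to query."""
    if metric == "cosine":
        # Higher similarity means closer, so sort in descending order.
        ranked = sorted(range(len(points)),
                        key=lambda i: cosine_similarity(query, points[i]),
                        reverse=True)
    else:
        # Smaller distance means closer, so sort in ascending order.
        ranked = sorted(range(len(points)),
                        key=lambda i: euclidean_distance(query, points[i]))
    return ranked[:k]

def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k neighbors that the approximate search returned."""
    if not exact_ids:
        return 0.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)
```

In this setup, an ANN index that returns one of the two true nearest neighbors for a query scores a recall of 0.5 for that query. Averaging over many queries gives the kind of recall figure that can be traded off against build and query time.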
Configurable feedforward neural network implemented from first principles.
- Implemented generic feedforward architecture with user-defined layers
- Coded backpropagation and gradient updates manually
- Designed modular training loop with support for classification and regression
- Trained model exposed as a reusable callable class
https://github.com/byoung77/Neural-Net-Implementation
Neural Net Training Loss for Two Moons Dataset
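To give a flavor of what coding backpropagation by hand involves, here is a minimal sketch of a network with one tanh hidden layer and a linear output, trained by full-batch gradient descent on mean squared error. It is an illustration under stated assumptions, not the repository's code: the parameter layout (`W1`, `b1`, `W2`, `b2`) and the function names are hypothetical, and the actual project supports user-defined layers and both classification and regression.

```python
import math
import random

def init_params(n_in, n_hidden, seed=0):
    """Small random weights and zero biases for a 1-hidden-layer network."""
    rng = random.Random(seed)
    return {
        "W1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "W2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }

def forward(p, x):
    """Return hidden activations h and the scalar output y for one input x."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(p["W1"], p["b1"])]
    y = sum(w * hj for w, hj in zip(p["W2"], h)) + p["b2"]
    return h, y

def mse(p, data):
    """Mean squared error over a list of (input, target) pairs."""
    return sum((forward(p, x)[1] - t) ** 2 for x, t in data) / len(data)

def gradients(p, data):
    """Backpropagate the mean squared error through both layers."""
    n_hidden, n_in = len(p["W1"]), len(p["W1"][0])
    g = {"W1": [[0.0] * n_in for _ in range(n_hidden)],
         "b1": [0.0] * n_hidden, "W2": [0.0] * n_hidden, "b2": 0.0}
    for x, t in data:
        h, y = forward(p, x)
        dy = 2.0 * (y - t) / len(data)              # dL/dy for this sample
        g["b2"] += dy
        for j in range(n_hidden):
            g["W2"][j] += dy * h[j]
            # Chain rule through tanh: d tanh(z)/dz = 1 - tanh(z)^2.
            dz = dy * p["W2"][j] * (1.0 - h[j] ** 2)
            g["b1"][j] += dz
            for i in range(n_in):
                g["W1"][j][i] += dz * x[i]
    return g

def gradient_step(p, data, lr):
    """Apply one full-batch gradient descent update in place."""
    g = gradients(p, data)
    for j in range(len(p["W1"])):
        for i in range(len(p["W1"][j])):
            p["W1"][j][i] -= lr * g["W1"][j][i]
        p["b1"][j] -= lr * g["b1"][j]
        p["W2"][j] -= lr * g["W2"][j]
    p["b2"] -= lr * g["b2"]
```

A useful sanity check for any hand-written backward pass is to compare `gradients` against central finite differences of `mse` for a few individual weights. If the two agree to several decimal places, the chain-rule bookkeeping is very likely correct.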
End-to-end retrieval-augmented generation system with vector search and reranking.
- FAISS-based vector search
- Cross-encoder reranking
- Citation-aware responses
- Desktop GUI for interactive querying
https://github.com/byoung77/Doctor-Who-Oracle
Dr. Who Oracle Interface
An interactive Python implementation of the classic Lights Out puzzle, extended with multiple algebraic state spaces and nontrivial topological grids.
https://github.com/byoung77/lights-out
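The algebraic view of the puzzle can be sketched for the simplest case: two light states on a flat rectangular grid, where pressing a cell toggles it and its four neighbors. Finding a solution then amounts to solving a linear system over GF(2). The code below is an assumption-laden illustration of that idea only; the function names are hypothetical, and the repository's other state spaces and grid topologies are not modeled here.

```python
def toggle_matrix(rows, cols):
    """A[i][j] = 1 if pressing cell j toggles cell i (the cell plus its 4 neighbors)."""
    n = rows * cols
    A = [[0] * n for _ in range(n)]
    for r in range(rows):
        for c in range(cols):
            j = r * cols + c
            for dr, dc in ((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    A[rr * cols + cc][j] = 1
    return A

def solve_gf2(A, b):
    """Solve A x = b over GF(2) by Gauss-Jordan elimination; None if unsolvable."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]    # augmented matrix [A | b]
    pivot_cols = []
    row = 0
    for col in range(n):
        pivot = next((r for r in range(row, n) if M[r][col]), None)
        if pivot is None:
            continue                                # free column, no pivot here
        M[row], M[pivot] = M[pivot], M[row]
        for r in range(n):
            if r != row and M[r][col]:
                # Row addition mod 2 is a bitwise XOR.
                M[r] = [a ^ c for a, c in zip(M[r], M[row])]
        pivot_cols.append(col)
        row += 1
    if any(M[r][n] for r in range(row, n)):         # a row reading 0 = 1
        return None
    x = [0] * n                                     # free variables are set to 0
    for r, col in enumerate(pivot_cols):
        x[col] = M[r][n]
    return x

def apply_presses(board, presses, rows, cols):
    """Return the board (flat list of 0/1) after pressing every cell flagged in presses."""
    A = toggle_matrix(rows, cols)
    return [(board[i] + sum(A[i][j] * presses[j] for j in range(len(presses)))) % 2
            for i in range(len(board))]
```

On a 3x3 grid every board is solvable, while on the classic 5x5 grid the toggle matrix is singular, so some boards (for example a single lit corner) have no solution and `solve_gf2` returns `None`.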
Built a Python/MySQL/LaTeX system to replace fragmented committee records stored across multiple documents, enabling searchable history and automated generation of professional PDF reports.
https://github.com/byoung77/committee-appeals-db
Interactive fractal visualization tool built in C.
- Mandelbrot and Julia sets
- Real-time zoom and navigation
- Custom function exploration
https://github.com/byoung77/GUI-Fractal-Project
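The tool itself is written in C with GTK, but the escape-time iteration behind Mandelbrot and Julia images is compact enough to sketch in a few lines of Python. The version below is a hedged illustration rather than a port of the project: `escape_time`, `render_ascii`, and their parameters are hypothetical names, and the text rendering stands in for the real pixel output.

```python
def escape_time(c, z0=0j, max_iter=100, radius=2.0):
    """Iterate z -> z*z + c from z0; return the step at which |z| exceeds radius.

    Mandelbrot set: vary c with z0 = 0. Julia set: fix c and vary z0.
    Returns max_iter if the orbit stays bounded for the whole run.
    """
    z = z0
    for n in range(max_iter):
        if abs(z) > radius:
            return n
        z = z * z + c
    return max_iter

def render_ascii(width=60, height=24, x_range=(-2.0, 1.0),
                 y_range=(-1.2, 1.2), max_iter=50):
    """Return text rows: '#' for points that never escaped, '.' otherwise."""
    rows = []
    for j in range(height):
        y = y_range[0] + (y_range[1] - y_range[0]) * j / (height - 1)
        row = ""
        for i in range(width):
            x = x_range[0] + (x_range[1] - x_range[0]) * i / (width - 1)
            inside = escape_time(complex(x, y), max_iter=max_iter) == max_iter
            row += "#" if inside else "."
        rows.append(row)
    return rows
```

Calling `print("\n".join(render_ascii()))` draws a coarse Mandelbrot set in the terminal. Zooming corresponds to shrinking `x_range` and `y_range` around a point of interest, usually with a larger `max_iter` to resolve finer detail.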
Bayesian nonparametric model for change point detection.
- Hierarchical Dirichlet Process Hidden Markov Model
- Integration of topological data analysis features
- Full inference pipeline in Julia
https://github.com/byoung77/hdp-hmm-te
Programming: Python, Julia, Go, C
Machine Learning: Bayesian modeling, HMMs, clustering, topological data analysis, neural networks
Data & Distributed Systems: Hadoop, Spark, MapReduce
Tools: Git, Linux, LaTeX
- Built and deployed distributed analytics experiments on an 8-node Hadoop/Spark cluster
- Designed a Bayesian nonparametric change point detection model (HDP-HMM)
- Implemented machine learning systems from first principles
- Developed a retrieval-augmented generation (RAG) system with a desktop GUI
- Ph.D., Mathematics, Rutgers University
- M.S., Data Science, Fordham University (2025)
- Associate Professor of Mathematics, Wilkes University (2015–Present)
My academic work includes mathematical modeling, probability, machine learning, and computational methods.
I focus on:
- Building systems from first principles to understand core mechanics
- Bridging theory and implementation
- Evaluating trade-offs through experimentation
- Email: bojy77@gmail.com
- GitHub: https://github.com/byoung77
All projects are released under the MIT License unless otherwise noted.






