I am a Data Engineer who is passionate about machine learning, mathematics, and data science. With experience in cloud technologies, data visualization, and automation, I'm always excited to tackle challenging projects that create real impact.
- M.S. in Applied and Computational Mathematics and Statistics
  University of Notre Dame, IN
  GPA: 3.54/4.00
  Graduated: May 2023
- B.S. in Applied and Computational Mathematics and Statistics
  University of Notre Dame, IN
  Minor: Actuarial Science
  Graduated: May 2022
Aug 2023 - Present
- Built and maintained ETL data pipelines for the distribution of brands across Central America and the Caribbean, managing over 300,000 SKUs.
- Developed data visualization dashboards with Power BI and Streamlit, deployed on a MicroK8s Kubernetes cluster.
- Designed and implemented a PostgreSQL data warehouse integrating data from multiple sources (SAP, BigQuery, MongoDB).
- Created a metadata dashboard to improve data quality, reducing missing values from 35% to under 5%.
Aug 2022 - Dec 2022
- Assisted in teaching Statistical Learning for Data Science (graduate-level) and Introduction to Probability (undergraduate-level).
- Graded assignments and answered students' questions.
A minimal implementation of the Keras Sequential model and Dense Layer classes, built from scratch using Python and NumPy.
- Docs: microkeras.readthedocs.io/en/latest
- GitHub: MicroKeras Repository
- Colab Demo: Open in Colab
Built data pipelines using Apache Airflow and Docker to store financial data in PostgreSQL. Developed three data visualization dashboards: one using Streamlit, one using Tableau, and one using React and Django.
- GitHub: S&P 500 Dashboard
- Dashboards: Streamlit Dashboard | Tableau Dashboard | React Dashboard
Trained models using Google Cloud AutoML, TensorFlow, and XGBoost to price European options and compared performance against the Black-Scholes model.
- GitHub: Options Pricing Repository
- Research Paper: arXiv
A deep neural network implemented from scratch in C++, with CUDA for GPU acceleration. The network classifies handwritten digits from the MNIST dataset.
- Programming: Python, SQL, C++
- Machine Learning: TensorFlow, PyTorch, scikit-learn, XGBoost
- ETL Pipelines: Pandas, PySpark, Polars, Dask
- Databases: PostgreSQL, SQL Server, MongoDB, BigQuery, Hive
- DevOps: Docker, Kubernetes, Apache Airflow
- Data Visualization: Power BI, Streamlit, Tableau
- Web Development: React, Tailwind CSS, Django, FastAPI
- Editor & OS: Neovim, Linux (Arch, Debian, Ubuntu)