Data scientist and analytics engineer with hands-on experience in machine learning, operations research, cloud data engineering, simulation, and energy-sector analytics. I build models, simulations, and pipelines that turn messy, real-world data into actionable decisions, from optimizing warehouse logistics and detecting solar panel faults to measuring the causal impact of ad campaigns.
- 🎓 UW Madison — Wisconsin School of Business
- 🔭 Currently exploring: cloud-scale data pipelines, causal inference, and renewable energy analytics
- 🌱 Domain depth: solar & wind energy, warehouse automation, transportation, digital advertising
- ⚡ Fun fact: Built a scheduling optimizer with an estimated $650K+ in savings for a university dining center
End-to-end diagnostic framework for 3 rooftop PV systems at Andre Agassi Preparatory Academy (Las Vegas), detecting inverter faults, shading, and sensor anomalies using NREL PVDAQ data.

| | |
|---|---|
| Stack | Python · Pandas · Plotly · Matplotlib · Google Colab |
| Highlights | 7 automated diagnostic flags, peer-comparison anomaly detection, 7,100 sensor anomalies identified in System 1276 |
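
As a rough illustration of peer-comparison anomaly detection, the sketch below flags a system when its normalized output strays too far from the median of its peers at the same timestamp. The function name, the 30% threshold, and the peer system ids are assumptions made for this example, not the project's actual code or settings.

```python
from statistics import median


def peer_comparison_flags(readings, threshold=0.3):
    """Flag timestamps where a system deviates from the peer median.

    readings: dict mapping system id -> list of normalized outputs
              (e.g. kW per kW installed), aligned by timestamp index.
    threshold: hypothetical relative deviation that triggers a flag.
    Returns a dict mapping system id -> list of flagged timestamp indices.
    """
    systems = list(readings)
    n_steps = min(len(values) for values in readings.values())
    flags = {system: [] for system in systems}
    for t in range(n_steps):
        peer_median = median(readings[system][t] for system in systems)
        if peer_median <= 0:
            continue  # no production (e.g. night): nothing to compare against
        for system in systems:
            deviation = abs(readings[system][t] - peer_median) / peer_median
            if deviation > threshold:
                flags[system].append(t)
    return flags


if __name__ == "__main__":
    # Made-up readings; "peer_a" and "peer_b" are placeholder system ids.
    sample = {
        "1276": [0.50, 0.52, 0.10, 0.0],
        "peer_a": [0.51, 0.50, 0.49, 0.0],
        "peer_b": [0.49, 0.51, 0.50, 0.0],
    }
    print(peer_comparison_flags(sample))
```
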
Assessing wind resource potential at the Lone Star Wind Farm (Abilene, TX) using NREL data, Weibull fitting, and turbine power curve modeling for a GAMESA G87-2.0 MW turbine.

| | |
|---|---|
| Stack | Python · SciPy · NumPy · Pandas · Matplotlib |
| Highlights | Weibull k=2.92, AEP of 8,331 MWh, 64.95% capacity factor, air-density-adjusted power output |
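
The general recipe behind an AEP estimate (integrate a power curve against a Weibull wind-speed density, then divide by rated output for the capacity factor) can be sketched as below. This is a minimal standard-library illustration: the scale parameter, the cut-in, rated, and cut-out speeds, and the cubic ramp are all assumptions, not the manufacturer's power curve or the project's fitted values, so it will not reproduce the figures above.

```python
import math

RATED_KW = 2000.0        # 2.0 MW rating of the turbine named above
HOURS_PER_YEAR = 8760


def weibull_pdf(v, k, c):
    """Weibull density of wind speed v (m/s) with shape k and scale c."""
    if v <= 0:
        return 0.0
    return (k / c) * (v / c) ** (k - 1) * math.exp(-((v / c) ** k))


def power_kw(v, cut_in=3.0, rated_speed=13.0, cut_out=25.0, density_ratio=1.0):
    """Simplified power curve; all speeds here are hypothetical."""
    if v < cut_in or v >= cut_out:
        return 0.0
    if v >= rated_speed:
        return RATED_KW
    # Cubic ramp between cut-in and rated speed, scaled by the ratio of
    # site air density to standard air density, capped at rated power.
    frac = (v ** 3 - cut_in ** 3) / (rated_speed ** 3 - cut_in ** 3)
    return min(RATED_KW, RATED_KW * frac * density_ratio)


def annual_energy_mwh(k, c, step=0.01, v_max=40.0, density_ratio=1.0):
    """Expected annual energy: midpoint-rule integral of power x density."""
    mean_kw = 0.0
    for i in range(int(v_max / step)):
        v = (i + 0.5) * step
        mean_kw += power_kw(v, density_ratio=density_ratio) * weibull_pdf(v, k, c) * step
    return mean_kw * HOURS_PER_YEAR / 1000.0


def capacity_factor(aep_mwh):
    """Annual energy divided by energy at rated power for a full year."""
    return aep_mwh / (RATED_KW / 1000.0 * HOURS_PER_YEAR)


if __name__ == "__main__":
    # k comes from the highlights above; c = 9.0 m/s is an assumed scale.
    aep = annual_energy_mwh(k=2.92, c=9.0)
    print(round(aep), round(capacity_factor(aep), 3))
```
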
Optimizing weekly shift scheduling for 249 student employees at UW Madison's Gordon Dining Center using Pyomo linear programming, replacing a manual process.

| | |
|---|---|
| Stack | Python · Pyomo · Pandas · GLPK Solver |
| Highlights | Real operational data, $650K+ estimated savings, constraint-based optimization with fair workload distribution |
Measuring the true causal impact of digital ad exposure on handbag purchase conversions using a randomized experiment with 588K+ users.

| | |
|---|---|
| Stack | R · dplyr · ggplot2 · lmtest · sandwich |
| Highlights | ATE of +1 pp (p < 0.001), regression with HC3 robust standard errors, decile subgroup analysis showing the effect concentrates in the top 20% of exposure |
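
To show what an ATE with an HC3 standard error amounts to in a randomized experiment, here is a small sketch. The project itself is in R; this is a standard-library Python illustration, and the function name is made up. It relies on one fact: when the outcome is regressed on a single binary treatment indicator, every observation's leverage is 1/n for its group, so the HC3 variance of the treatment coefficient has a closed form.

```python
import math


def ate_with_hc3(treated, control):
    """Difference in mean outcomes with an HC3 standard error.

    treated, control: lists of outcomes (e.g. 1 = purchased, 0 = did not).
    Each group needs at least two observations.
    Returns (ate, standard_error, two_sided_p_value).
    """
    n1, n0 = len(treated), len(control)
    mean1 = sum(treated) / n1
    mean0 = sum(control) / n0
    # Sum of squared residuals around each group's mean.
    ssr1 = sum((y - mean1) ** 2 for y in treated)
    ssr0 = sum((y - mean0) ** 2 for y in control)
    # HC3 divides each squared residual by (1 - leverage)^2, and the
    # leverage is 1/n_group here, which gives (n_group - 1)^2 below.
    variance = ssr1 / (n1 - 1) ** 2 + ssr0 / (n0 - 1) ** 2
    se = math.sqrt(variance)
    ate = mean1 - mean0
    z = ate / se if se > 0 else float("inf")
    # Two-sided p-value from the normal approximation.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return ate, se, p_value


if __name__ == "__main__":
    # Tiny made-up sample, only to show the call shape.
    print(ate_with_hc3([1, 1, 0, 1], [0, 1, 0, 0]))
```
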
Predicting which passengers were transported to an alternate dimension, reaching 82.2% accuracy with a Random Forest and multi-stage imputation.

| | |
|---|---|
| Stack | Python · scikit-learn · XGBoost · Pandas · Seaborn |
| Highlights | 4 models compared (LR, KNN, GBM, RF), group-based imputation recovered 50% of missing data, cabin feature engineering |
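
One way to read "group-based imputation" is to fill a missing value from other records in the same group before falling back to a global default. The sketch below shows that idea under assumed names: the `group` and `home_planet` fields and the function itself are hypothetical, not the project's actual columns or code.

```python
from collections import Counter


def impute_by_group(rows, group_key, field):
    """Fill missing `field` values with the most common value in the group.

    rows: list of dicts; a missing value is represented as None.
    Rows whose group has no known value stay None for a later stage.
    Returns (new_rows, number_of_values_filled); the input is not modified.
    """
    known = {}
    for row in rows:
        if row[field] is not None:
            known.setdefault(row[group_key], Counter())[row[field]] += 1

    filled = 0
    result = []
    for row in rows:
        new_row = dict(row)
        if new_row[field] is None and new_row[group_key] in known:
            new_row[field] = known[new_row[group_key]].most_common(1)[0][0]
            filled += 1
        result.append(new_row)
    return result, filled


if __name__ == "__main__":
    passengers = [
        {"group": "g1", "home_planet": "Earth"},
        {"group": "g1", "home_planet": None},
        {"group": "g2", "home_planet": None},
    ]
    print(impute_by_group(passengers, "group", "home_planet"))
```
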
Cloud-based analytics pipeline comparing Southwest Airlines' delay patterns against the industry using BigQuery, dbt, Python, and Looker Studio dashboards.

| | |
|---|---|
| Stack | Python · BigQuery · dbt · Looker Studio · GCP |
| Highlights | End-to-end cloud pipeline, 42.8% delay rate analysis, 3.63 min avg in-flight recovery, interactive Looker dashboard |
End-to-end inventory system combining Power BI dashboards, FlexSim warehouse simulation, an autonomous robot prototype (Arduino + SolidWorks), and shipping route optimization with Pyomo.

| | |
|---|---|
| Stack | Python · Pyomo · FlexSim · Power BI · SolidWorks · Arduino · Google Maps API |
| Highlights | 303 SKUs managed, warehouse bottleneck identified (Queue2 WIP: 25.32), TSP route optimizer, line-following robot with full CAD drawings |
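
For readers unfamiliar with the TSP, the sketch below finds the shortest closed route through a handful of stops by trying every ordering. This is deliberately not the project's approach: the stack above lists Pyomo and the Google Maps API, while this example uses neither. Brute-force enumeration and straight-line distances are stand-ins chosen to keep the illustration self-contained, and it is only practical for a small number of stops.

```python
import math
from itertools import permutations


def tour_length(points, order):
    """Total length of a closed tour visiting points in the given order."""
    return sum(
        math.dist(points[order[i]], points[order[(i + 1) % len(order)]])
        for i in range(len(order))
    )


def shortest_tour(points):
    """Exact TSP by enumeration, with stop 0 fixed as the depot."""
    others = range(1, len(points))
    best_order = min(
        ((0,) + perm for perm in permutations(others)),
        key=lambda order: tour_length(points, order),
    )
    return list(best_order), tour_length(points, best_order)


if __name__ == "__main__":
    # Four made-up stops on the corners of a unit square.
    stops = [(0, 0), (1, 1), (1, 0), (0, 1)]
    print(shortest_tour(stops))
```
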
I'm open to collaborating on data science, energy analytics, and ML engineering projects. Feel free to reach out!
