A comprehensive collection of technical projects demonstrating end-to-end expertise in Machine Learning, Quantitative Finance, and Full-Stack Data Engineering.
This repository serves as a centralized portfolio containing six production-grade projects. Each directory represents a standalone application or research pipeline, complete with source code, documentation, and rigorous performance analysis.
| Project | Domain | Tech Stack | Key Impact |
|---|---|---|---|
| Solar Energy Forecasting | Smart Grid / Energy | XGBoost, LSTM, Optuna | Reduced MAPE significantly vs. ARIMA baselines; engineered hybrid forecasting models for prosumer consumption/production. |
| CNN Image Classification | Computer Vision | PyTorch, torchvision | Achieved 84.6% accuracy on CIFAR-10 using a custom 3-layer CNN with adaptive pooling and augmentation pipelines. |
| Loan Approval Prediction | FinTech / Risk | Scikit-Learn, SHAP, Random Forest | Built an automated underwriting system with 99.25% precision and 100% recall; integrated SHAP for regulatory explainability. |
| TV Show Analytics | Data Mining | SciPy, BeautifulSoup, Requests | End-to-end scraper for 200+ shows; applied Kruskal-Wallis and robust regression to debunk "Golden Age" TV myths. |
| Portfolio Risk Modeling | Quant Finance | R, GARCH, quadprog | Implemented mean-variance (Markowitz) optimization and dynamic volatility forecasting using GARCH(1,1). |
| UniBooks System | DBMS | MS Access, VBA, SQL | Designed a normalized relational database with RBAC security and automated inventory tracking triggers. |
- Challenge: Mitigate energy imbalance costs in smart grids by predicting erratic prosumer behavior.
- Solution: Developed a comparative pipeline using Gradient Boosting (XGBoost) and Recurrent Neural Networks (LSTM).
- Highlights:
- Automated hyperparameter tuning via Optuna (Bayesian Optimization).
- Implemented 5-fold `TimeSeriesSplit` cross-validation to prevent look-ahead bias.
- Artifacts: Full technical report (`.pdf`) and production-ready Python scripts.
- 👉 View Project
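The look-ahead safeguard above can be sketched in plain Python. This is a minimal, illustrative version of the walk-forward splitting that `sklearn.model_selection.TimeSeriesSplit` performs; fold counts and sizes here are assumptions, not the project's exact configuration:

```python
# Walk-forward (time-series) cross-validation sketch: each validation fold
# lies strictly AFTER its training window, so the model never sees the future.
def time_series_splits(n_samples, n_splits=5):
    fold = n_samples // (n_splits + 1)
    for k in range(1, n_splits + 1):
        train = list(range(0, fold * k))                          # expanding window
        valid = list(range(fold * k, min(fold * (k + 1), n_samples)))
        yield train, valid

for train_idx, valid_idx in time_series_splits(12, n_splits=5):
    # Training indices always precede validation indices: no look-ahead bias.
    assert max(train_idx) < min(valid_idx)
```

Unlike shuffled k-fold, the training window only ever grows forward in time, mirroring how the forecaster would actually be deployed.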
- Challenge: Automate loan eligibility while minimizing financial risk and maintaining interpretability.
- Solution: A Random Forest classifier tuned for high precision in the "Safe-to-Approve" band.
- Highlights:
- Feature Engineering: Created high-impact ratios (e.g., Debt-to-Income, Asset Liquidity).
- Governance: Utilized SHAP (SHapley Additive exPlanations) to audit model decisions for bias.
- Performance: Achieved ROC-AUC of 0.999 on the test set.
- 👉 View Project
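The headline precision/recall figures come down to counting approval outcomes. A minimal sketch of how those two metrics are computed for the "approve" class (the toy labels below are illustrative, not the project's data):

```python
def precision_recall(y_true, y_pred, positive=1):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN) for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Toy example: one false approval, no missed approvals.
p, r = precision_recall([1, 1, 0, 0, 1], [1, 1, 1, 0, 1])
```

In underwriting, precision governs how many bad loans slip into the approved pool, while recall measures how many creditworthy applicants are captured, which is why the model was tuned for the high-precision band.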
- Challenge: Model portfolio risk beyond simple standard deviation in volatile markets.
- Solution: A statistical framework combining Modern Portfolio Theory (MPT) with time-series econometrics.
- Highlights:
- Convex Optimization: Calculated Global Minimum Variance (GMV) and Tangency portfolios using quadratic programming.
- Volatility Modeling: Integrated GARCH(1,1) to capture volatility clustering and "fat tails" in asset returns.
- Backtesting: Rolling-window analysis to validate Value-at-Risk (VaR) estimations.
- 👉 View Project
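For the unconstrained case, the Global Minimum Variance weights have a closed form, w = Σ⁻¹**1** / (**1**ᵀΣ⁻¹**1**). A sketch with an assumed two-asset covariance matrix; the project itself uses R's `quadprog`, which also handles constraints (e.g., no short sales) that this shortcut omits:

```python
import numpy as np

def gmv_weights(cov):
    """Closed-form GMV weights: solve cov @ x = 1, then normalize to sum to 1."""
    ones = np.ones(cov.shape[0])
    w = np.linalg.solve(cov, ones)
    return w / w.sum()

# Illustrative annualized covariance matrix for two assets (assumption).
cov = np.array([[0.04, 0.01],
                [0.01, 0.09]])
w = gmv_weights(cov)  # weights sum to 1; lower-variance asset gets more weight
```

The lower-variance asset receives the larger allocation, which is exactly the behavior the GMV portfolio is designed to exhibit.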
- Challenge: Implement a robust vision pipeline from scratch without relying on pre-trained models.
- Solution: Designed a custom 3-layer Convolutional Neural Network (CNN) for the CIFAR-10 dataset.
- Highlights:
- Architecture: Utilized `Conv2d` blocks with Batch Normalization and Max Pooling; integrated Dropout to prevent overfitting.
- Augmentation: Applied random rotations and horizontal flips to improve generalization.
- Result: Achieved 84.6% accuracy, with strong performance on mechanical classes (Cars/Trucks).
- 👉 View Project
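The interplay of convolution and pooling determines how the 32×32 CIFAR-10 feature maps shrink through the network. A small sketch of the standard output-size arithmetic; the kernel, stride, and padding values below are plausible assumptions, not the project's exact hyperparameters:

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Standard Conv/Pool output-size formula: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

size = 32  # CIFAR-10 images are 32x32
for _ in range(3):                               # three Conv -> MaxPool blocks
    size = conv_out(size, kernel=3, padding=1)   # 3x3 conv with 'same' padding
    size = conv_out(size, kernel=2, stride=2)    # 2x2 max pool halves each side
# After three blocks: 32 -> 16 -> 8 -> 4
```

Tracking this arithmetic is what lets the final feature map (here 4×4) be flattened into a correctly sized linear classifier head, or handed to adaptive pooling when input sizes vary.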
- Challenge: Validate cultural theories ("Golden Age of TV") using real-world unstructured data.
- Solution: A dual-phase pipeline: Automated Scraper (Python/Requests) + Statistical Inference (SciPy).
- Highlights:
- Data Engineering: Built a resilient scraper to harvest metadata for 200+ shows, handling retries and rate limiting.
- Inference: Applied non-parametric tests (Mann-Whitney U, Kruskal-Wallis) to handle non-normal rating distributions.
- Insight: Disproved "Longer is Better" myths using robust regression analysis.
- 👉 View Project
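The non-parametric comparison above can be sketched with SciPy's `stats.kruskal`, which tests whether samples share a common distribution without assuming normality. The rating samples below are made up for illustration; they are not the scraped data:

```python
from scipy import stats

# Hypothetical IMDb-style rating samples for three TV "eras" (assumed data).
era_a = [8.1, 7.9, 8.4, 8.0, 7.7]
era_b = [8.2, 8.0, 8.3, 7.8, 8.1]
era_c = [7.5, 7.9, 8.0, 7.6, 7.8]

# Kruskal-Wallis H-test: rank-based, so skewed/non-normal ratings are fine.
h_stat, p_value = stats.kruskal(era_a, era_b, era_c)
```

Because the test operates on ranks rather than raw values, a handful of outlier ratings cannot dominate the result, which matters for heavy-tailed review data.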
- Challenge: Replace manual bookstore tracking with a scalable, atomic transaction system.
- Solution: A relational database system built with MS Access and VBA automation.
- Highlights:
- Schema Design: 3NF Normalized database ensuring data integrity across Inventory, Sales, and Procurement.
- Automation: VBA triggers for real-time stock level checks (`Inventory < Order_Qty` logic).
- Analytics: SQL-driven dashboards for "Best Sellers" and monthly revenue tracking.
- 👉 View Project
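The reorder check can be expressed as a plain SQL trigger. Here is an illustrative translation of the `Inventory < Order_Qty` logic into SQLite (via Python's standard library); the table and column names are assumptions, not the project's Access schema:

```python
import sqlite3

# In-memory database standing in for the Access backend (illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Inventory (
    book_id      INTEGER PRIMARY KEY,
    stock        INTEGER,
    order_qty    INTEGER,
    reorder_flag INTEGER DEFAULT 0
);
-- Flag any item whose stock drops below its reorder quantity.
CREATE TRIGGER low_stock AFTER UPDATE OF stock ON Inventory
WHEN NEW.stock < NEW.order_qty
BEGIN
    UPDATE Inventory SET reorder_flag = 1 WHERE book_id = NEW.book_id;
END;
""")
conn.execute("INSERT INTO Inventory VALUES (1, 10, 5, 0)")
conn.execute("UPDATE Inventory SET stock = 3 WHERE book_id = 1")  # 3 < 5 fires trigger
flag = conn.execute("SELECT reorder_flag FROM Inventory WHERE book_id = 1").fetchone()[0]
```

Moving the check into the database layer (rather than application code) is what makes the inventory update atomic: no sale can decrement stock without the reorder rule being evaluated in the same transaction.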
Each project is self-contained. To run a specific project:
- Navigate to the project folder.
- Read the local `README.md` for specific dependency installation (e.g., `pip install -r requirements.txt` or R library installation).
- Launch the corresponding Jupyter Notebook (`.ipynb`) or R script (`.R`).
```bash
# Example: Cloning the repo
git clone https://github.com/andyhu11/Project-Documentation.git
cd Project-Documentation
```
This repository is licensed under the MIT License. See individual project folders for specific third-party attributions.
- LinkedIn: Andy
- Portfolio: github.com/andyhu11/Project-Documentation
- Email: jiahuiapply26@163.com