Paul Ha paulha042

Hi, I'm Paul 👋

🤖 An Artificial Intelligence / Deep Learning / Machine Learning enthusiast

👨‍🎓 Bachelor of IT in Data Science (2022 - 2025)

🏫 Master of IT in Artificial Intelligence (2025 - 2026)

Dataset: CIFAR-10, used here for generative image synthesis. It consists of 60,000 color images across 10 categories (airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck).
Model Implemented:
- Denoising Diffusion Implicit Model (DDIM) for image generation.
- Custom U-Net architecture with attention mechanisms.
- Implemented both forward (noise addition) and reverse (denoising) diffusion processes.
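
The forward and reverse processes listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's code: the linear beta schedule (1e-4 to 0.02), the function names, and the use of the true noise in place of a trained U-Net prediction are all assumptions made for the example.

```python
import math
import random


def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for an assumed linear beta schedule."""
    alpha_bars, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def forward_noise(x0, alpha_bar_t, rng):
    """Forward process: x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [
        math.sqrt(alpha_bar_t) * x + math.sqrt(1.0 - alpha_bar_t) * e
        for x, e in zip(x0, eps)
    ]
    return xt, eps


def ddim_step(xt, eps_pred, alpha_bar_t, alpha_bar_prev):
    """Deterministic DDIM update (eta = 0) from step t to an earlier step."""
    # Estimate the clean sample implied by the predicted noise.
    x0_pred = [
        (x - math.sqrt(1.0 - alpha_bar_t) * e) / math.sqrt(alpha_bar_t)
        for x, e in zip(xt, eps_pred)
    ]
    # Re-noise that estimate to the earlier noise level with the same noise.
    return [
        math.sqrt(alpha_bar_prev) * x0 + math.sqrt(1.0 - alpha_bar_prev) * e
        for x0, e in zip(x0_pred, eps_pred)
    ]


rng = random.Random(0)
alpha_bars = linear_alpha_bar(1000)
x0 = [0.5, -0.25, 1.0]  # stand-in for flattened pixel values
xt, eps = forward_noise(x0, alpha_bars[500], rng)
# With a perfect noise prediction, stepping to alpha_bar = 1 recovers x0.
recovered = ddim_step(xt, eps, alpha_bars[500], 1.0)
```

In a real sampler, `eps_pred` would come from the U-Net, and `ddim_step` would be applied over a short subsequence of timesteps, which is where the speed versus quality trade-off comes from.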
Achievement:
- Demonstrated stable convergence and interpretable attention patterns over 100 training epochs.
- Evaluated generated samples with Fréchet Inception Distance (FID = 102.02) and Inception Score (IS = 1.21), showing trade-offs between sampling speed, image fidelity, and diversity.
Future Improvement:
- Implement conditional diffusion guidance for class-controlled generation.
- Refine noise scheduling for improved efficiency and sample quality.
- Experiment with deeper U-Net or transformer-based architectures to enhance robustness and image realism.

Dataset: A private car sales dataset from a Kaggle competition containing vehicle attributes such as year, mileage, brand, fuel type, and engine specifications to predict selling prices.
Model Implemented:
- Applied supervised regression models including RandomForest, XGBoost, LightGBM, and Gradient Boosting.
- Focused on data preprocessing, feature engineering, and hyperparameter optimization using GridSearchCV.
- Utilized Recursive Feature Elimination (RFE) to identify the most significant predictors of price.
Achievement:
- Developed a car price prediction model achieving 10% Mean Absolute Percentage Error (MAPE).
- Compared the capabilities and performance of each model to select the most accurate one.
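
For reference, MAPE is the average of the absolute errors expressed as a percentage of the true prices. The helper below is a small sketch with a hypothetical function name and made-up prices, not the competition data or the project's evaluation code.

```python
def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, returned in percent."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when a true value is zero")
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


# Illustrative prices only: each prediction is off by 10% of the true price.
actual_prices = [10000.0, 20000.0]
predicted_prices = [9000.0, 22000.0]
score = mape(actual_prices, predicted_prices)
```

Because each error is divided by the true price, a fixed dollar error weighs more on cheap cars than on expensive ones, which is worth keeping in mind when comparing models on this metric.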

Applied K-Means++ and Agglomerative Clustering to segment customers based on purchasing behavior.
Enabled businesses to design personalized marketing strategies, improve customer retention, and optimize product recommendations.
Conducted feature scaling and elbow/silhouette analysis to determine optimal cluster numbers.
Compared the clusters produced by the two algorithms (to see how they behave differently; there is no accuracy to measure here 🙂).
Recommended strategies to target each customer segment produced by the two algorithms.
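
Since there are no ground-truth labels, the silhouette score is one way to compare two clusterings of the same customers. The sketch below computes it in plain Python on toy 2-D points; the function name, the points, and the two label assignments are invented for illustration and merely stand in for the outputs of the two algorithms.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient; higher means tighter, better-separated clusters."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    total = 0.0
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton cluster contributes a score of 0
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Toy data: two obvious groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels_a = [0, 0, 0, 1, 1, 1]  # clustering that matches the groups
labels_b = [0, 1, 0, 1, 0, 1]  # clustering that mixes the groups
score_a = silhouette_score(points, labels_a)
score_b = silhouette_score(points, labels_b)
```

The same function can be evaluated for several values of k to pick the cluster count, complementing the elbow plot.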