🫠
Open to work

Kaiha0/README.md

I'm Kai Hao

Data Scientist | Machine Learning & NLP Enthusiast

Master in Data Science — University of Malaya, Malaysia


About

  • Master in Data Science from University of Malaya
  • Strong focus on Machine Learning, NLP, and Data Engineering
  • Experienced in building end-to-end ML pipelines and scalable data workflows
  • Hands-on experience with Big Data ecosystems and cloud platforms
  • Passionate about translating data into actionable business insights

Technical Skills

Programming & Query Languages

Python, R, MySQL

Machine Learning & NLP

Supervised Learning, Unsupervised Learning, Feature Engineering, Model Evaluation, Text Classification, NLP Pipelines

Big Data

Hadoop, Apache Spark, Hive, HBase

Cloud & Data Engineering

Google Cloud, BigQuery, Dataproc, PySpark

AWS, EC2, S3, DynamoDB


Connect

Feel free to explore my repositories and connect!

Pinned

  1. WQD7008_PDC_Urban-Mobility-Forecasting-Using-Transfer-Learning-Architecture

    Cloud-Native Lambda Architecture on AWS for Urban Mobility Forecasting using Transfer Learning.

    Python

  2. WQD7009-BDAA-A-Predictive-Model-for-Heart-Disease-Risk-Factors-using-a-Cloud-Based-Architecture

    End-to-end Heart Disease Predictive Model using a Five-Layer Cloud Architecture (Bronze-Silver-Gold) on GCP with Dataproc, BigQuery ML, and Vertex AI.

    Python

  3. WQD7007-BDM-Big-Data-Implementation-in-Tourism-The-Walt-Disney-Company-Case-Study

    Comparative analysis of Apache Hive vs. Apache Spark on GCP Dataproc for analyzing 3.5M+ Disneyland attraction wait times and predictive modeling.

    Python

  4. WQF7007-NLP-Detecting-Fake-Job-Postings-Using-Natural-Language-Processing-and-Machine-Learning

    NLP-based Machine Learning solution to detect fraudulent job advertisements. Features a modular pipeline from text cleaning to a Gradio/Streamlit prototype.

    Jupyter Notebook