🎓 Education:
- Master's in Data Science – University of Melbourne
- Bachelor's in Statistics – University of Melbourne
💼 Aspiring Data Analyst with a strong foundation in statistical modeling, machine learning, and data visualization. Passionate about solving real-world problems through data-driven insights.
- Programming Languages: Python (Pandas, NumPy, Scikit-learn, XGBoost), R, SQL
- Data Visualization: Tableau, Matplotlib, Seaborn
- Machine Learning: Regression, Classification, Clustering, Time Series Forecasting
- Big Data Tools: Kubernetes, Kafka, Elasticsearch, Docker
- Data Wrangling & EDA: Jupyter Notebook, ETL pipelines, data cleaning, feature engineering
- Built an interactive Tableau dashboard analyzing traffic accidents in Victoria, visualizing trends, hotspots, and demographics.
- Tech Stack: Tableau
- Developed machine learning models to predict NOx (nitrate/nitrite) concentrations from low-frequency data, supporting real-time monitoring of river water quality.
- Conducted EDA to analyze water quality parameters (e.g., temperature, pH, turbidity) and identified key patterns.
- Trained models using Gaussian Process Regression (GPR), Random Forest (RF), and XGBoost, achieving high predictive accuracy.
- Tech Stack: Python, Scikit-learn, XGBoost, Pandas, NumPy
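
A minimal sketch of how a model comparison like this could be set up with Scikit-learn. The data below is synthetic stand-in data, not the project's measurements; the feature ranges, coefficients, and kernel choice are assumptions made purely for illustration, and XGBoost is left out to keep the example dependent on Scikit-learn and NumPy only.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (hypothetical): temperature, pH, turbidity -> NOx.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([
    rng.uniform(5, 25, n),     # temperature (deg C)
    rng.uniform(6.5, 8.5, n),  # pH
    rng.uniform(0, 50, n),     # turbidity (NTU)
])
# Assumed linear relationship plus noise, chosen only for illustration.
y = 0.05 * X[:, 0] - 0.3 * X[:, 1] + 0.02 * X[:, 2] + rng.normal(0, 0.1, n)

models = {
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
    # GPR is sensitive to feature scale, so inputs are standardized first.
    "gpr": make_pipeline(
        StandardScaler(),
        GaussianProcessRegressor(
            kernel=RBF() + WhiteKernel(), normalize_y=True, random_state=0
        ),
    ),
}

# 5-fold cross-validated R^2 for each candidate model.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    for name, model in models.items()
}
for name, r2 in scores.items():
    print(f"{name}: mean R^2 = {r2:.3f}")
```

Cross-validated scores give a fairer comparison than a single train/test split, especially on small monitoring datasets.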
- Built a Kubernetes-based multi-cluster system for ingesting and analyzing large-scale urban data streams in real time.
- Implemented pipelines for diverse data sources (e.g., sensor data, social media feeds) and performed system optimization to ensure low latency.
- Tech Stack: Kubernetes, Elasticsearch, Kafka, Docker, Python
- Developed an interactive R Shiny dashboard to visualize Melbourne's attractions, transport routes, and entertainment hotspots.
- Tech Stack: R Shiny, ggplot2, Tableau
- Conducted comprehensive EDA, including trend analysis and data visualization, to uncover actionable insights.
- Tech Stack: Python (Pandas, Matplotlib, Seaborn), Jupyter Notebook
- Analyzed the factors affecting rental prices in Melbourne, such as proximity to landmarks (universities, hospitals, stations), crime rates, and land use patterns.
- Conducted spatial analysis, correlation studies, and regression modeling to uncover the relationships between rental prices and various socio-economic factors.
- Implemented statistical and machine learning models, including OLS regression, spatial lag models, and K-means clustering, to identify key drivers of rental price variations.
- Tech Stack: Python, GeoPandas, Pandas, Matplotlib, Scikit-learn
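
A minimal sketch of the OLS step in an analysis like this, fitted with NumPy on synthetic data. The variable names (`dist_station_km`, `crime_rate`, `weekly_rent`), units, and coefficients are hypothetical and not taken from the project; spatial lag modeling and clustering are outside the scope of this example.

```python
import numpy as np

# Synthetic stand-in data (hypothetical values, for illustration only).
rng = np.random.default_rng(42)
n = 500
dist_station_km = rng.uniform(0.1, 10.0, n)  # distance to nearest station
crime_rate = rng.uniform(10.0, 100.0, n)     # incidents per 1,000 residents
weekly_rent = (
    600.0
    - 20.0 * dist_station_km
    - 1.5 * crime_rate
    + rng.normal(0, 15.0, n)                 # unexplained variation
)

# Design matrix with an intercept column, then an ordinary least squares fit.
X = np.column_stack([np.ones(n), dist_station_km, crime_rate])
coef, *_ = np.linalg.lstsq(X, weekly_rent, rcond=None)

# Coefficient of determination (R^2) for the fitted model.
fitted = X @ coef
ss_res = np.sum((weekly_rent - fitted) ** 2)
ss_tot = np.sum((weekly_rent - weekly_rent.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print("intercept, distance, crime coefficients:", np.round(coef, 2))
print("R^2:", round(r_squared, 3))
```

Each fitted coefficient estimates the change in weekly rent per unit change in that factor, holding the other constant, which is what makes OLS a natural first pass for identifying price drivers.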
- Advanced machine learning techniques for time series data.
- Efficient data pipeline development with distributed systems (e.g., Kafka, Kubernetes).
- Advanced statistical techniques for multivariate data analysis.
🎯 Career Goal:
To join a forward-thinking team as a Data Analyst, where I can apply my statistical expertise, programming skills, and passion for uncovering insights from data to drive impactful decisions.