joelt1/ANZ-Virtual-Internship

ANZ-Virtual-Internship

Completed two online modules: Exploratory Data Analysis and Predictive Analytics. Python libraries used - Pandas, Matplotlib, Scikit-learn.

Task 1 - Used Pandas to split customer data by month and used Matplotlib to visualise daily transaction volume and daily mean transaction amount. Also visualised mean customer balance and mean payment amount by age, with the mean for each gender included, for each month in the data set.
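The monthly split and daily aggregation described in Task 1 could look roughly like the sketch below. It is illustrative only and is not the code from this repository: the toy data, the column names `date` and `amount`, and the helper `daily_summary_by_month` are all assumptions.

```python
import pandas as pd

# Hypothetical toy data; the real data set and its column names may differ.
transactions = pd.DataFrame({
    "date": pd.to_datetime([
        "2018-08-01", "2018-08-01", "2018-08-02",
        "2018-09-01", "2018-09-01", "2018-09-03",
    ]),
    "amount": [10.0, 30.0, 5.0, 20.0, 40.0, 15.0],
})


def daily_summary_by_month(df):
    """Return {"YYYY-MM": DataFrame of daily volume and mean amount}."""
    summaries = {}
    # Split the rows into one group per calendar month.
    for month, month_df in df.groupby(df["date"].dt.to_period("M")):
        # Within the month, count transactions and average amounts per day.
        daily = month_df.groupby(month_df["date"].dt.normalize())["amount"].agg(
            volume="count", mean_amount="mean"
        )
        summaries[str(month)] = daily
    return summaries


summaries = daily_summary_by_month(transactions)
```

Each per-month DataFrame can then be passed to Matplotlib, for example with `summaries["2018-08"]["volume"].plot()`, to draw the daily volume for that month.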

Task 2 - Used Pandas to compute mean customer annual salary, then grouped customer data by customer ID, taking the mean of each column. Used Scikit-learn for the machine learning models. Linear regression model - used card present flag, merchant code, balance, age and amount from the grouped data set to predict annual salary, obtained a test score of -0.32. Because the value is negative, it is presumably the R² score that Scikit-learn regressors report rather than an accuracy; a negative R² means the model did worse on the test set than always predicting the mean salary. Returning to the original data set, created dummy variables for categorical variables including gender and age. Decision tree classifier and regression models - used the modified data set to predict annual salary, obtained test scores of 0.76 and 0.67 respectively.
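As a rough illustration of the Task 2 workflow, the sketch below creates dummy variables with Pandas and fits a linear regression and a decision tree regressor with Scikit-learn. It is not the repository's code: the synthetic data, the column names, the 25% test split and the `max_depth=4` setting are all assumptions, so the scores it produces are unrelated to the -0.32, 0.76 and 0.67 reported above.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the customer data (assumed columns, not the real ones).
rng = np.random.default_rng(0)
n = 200
customers = pd.DataFrame({
    "age": rng.integers(18, 70, size=n),
    "balance": rng.normal(5000, 1500, size=n),
    "gender": rng.choice(["F", "M"], size=n),
})
customers["annual_salary"] = (
    30000 + 800 * customers["age"] + rng.normal(0, 5000, size=n)
)

# One-hot encode the categorical column into dummy variables.
X = pd.get_dummies(customers[["age", "balance", "gender"]], columns=["gender"])
y = customers["annual_salary"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

linear = LinearRegression().fit(X_train, y_train)
tree = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X_train, y_train)

# For Scikit-learn regressors, score() returns R^2, which can be negative.
linear_r2 = linear.score(X_test, y_test)
tree_r2 = tree.score(X_test, y_test)
```

Both scores are computed on the held-out test rows only, so they measure how well each model generalises rather than how well it fits the training data.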

About

Personal attempt at tasks 1 and 2
