kevinlam-aus/README.md

Hi there, I'm Kevin

I'm a data engineer and an aspiring data analyst. I mainly write code in Python and SQL, but have dabbled in other languages.

In college, I had an internship as a data analyst at a social networking app. There I visualized networks and was tasked with attempting to explain what made a specific user popular. With a Bachelor’s degree in Management Information Systems and a minor in Marketing, I bring a blend of technical and business acumen to data.

After working as a Data Engineer, I realized that although backend engineering is exciting, my true passion is leveraging data to drive business decisions, and I am working to transition into a data analyst role.

Recent projects

Predicting if an employee would leave Salifort Motors

Code: Analyzing turnover at Salifort

Goal: To determine why there was such a high turnover rate at Salifort Motors.

Description: The project focused on analyzing a dataset of employees collected by the HR team. The dataset included satisfaction level, the employee's last performance review, the number of projects an employee contributes to, the number of hours an employee works per month, employee tenure, whether they were promoted in the last 5 years, and other relevant information. The project involved loading the data, cleaning and preprocessing it, performing exploratory data analysis (EDA), analyzing the correlation between whether an employee left and the other variables, and building logistic regression and tree-based models.

Skills: data cleaning, data analysis, data modeling, machine learning, data visualization

Technology: Python, pandas, NumPy, Seaborn, Matplotlib, scikit-learn

Results: The analysis raised a concern about possible data leakage. However, with the data provided, the two most important variables the model used to predict whether an employee would leave were their last evaluation score and the number of hours they worked. Other contributing factors included tenure and whether the employee worked more than 166.67 hours a month.
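As a rough illustration of the overwork cutoff mentioned in the results, here is a minimal sketch of how a figure like 166.67 hours a month can arise and be applied as a flag. The 40-hour week and 50 working weeks per year are assumptions, not values confirmed by the project, and `is_overworked` and the sample hours are hypothetical.

```python
# Assumption: a full-time baseline of 40 hours/week for 50 working weeks/year.
HOURS_PER_WEEK = 40
WORK_WEEKS_PER_YEAR = 50
MONTHS_PER_YEAR = 12

# Average monthly hours implied by that baseline (about 166.67).
monthly_threshold = HOURS_PER_WEEK * WORK_WEEKS_PER_YEAR / MONTHS_PER_YEAR


def is_overworked(average_monthly_hours, threshold=monthly_threshold):
    """Return True when an employee's monthly hours exceed the threshold."""
    return average_monthly_hours > threshold


# Hypothetical monthly hours for three employees (not from the HR dataset).
sample_hours = [150, 167, 240]
flags = [is_overworked(hours) for hours in sample_hours]

print(round(monthly_threshold, 2))
print(flags)
```

A binary flag like this can be fed to a model in place of the raw hours column, which makes the resulting feature importance easier to explain to a non-technical audience.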

Understanding the best time to perform maintenance on bikes

Visualizations: 2018 Seoul Bike Rentals

Goal: To identify the slowest bike rental times so workers can schedule maintenance and repairs away from peak times.

Description: The project focused on visualizing the 2018 bike rental data in Seoul. The dataset included rented bike count, date, and weather data such as snowfall and humidity. For simplicity, we chose to focus on the season and the date/time the bike was rented.

Skills: data visualization

Technology: Tableau

Results: The visualization shows that the lowest bicycle traffic occurs between 9am and 2pm on any given day, with no variation by weekday. Bike rentals also decrease significantly in the winter. Because of this, regular maintenance can be scheduled between 9am and 2pm on weekdays, and major repairs can be done in the winter months.
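The project itself was built in Tableau, but the same "find the quietest hours" question can be sketched in plain Python. The records below are hypothetical and are not the Seoul data; `mean_rentals_by_hour` is an illustrative helper, not part of the original work.

```python
from collections import defaultdict

# Hypothetical (hour_of_day, rented_bike_count) records; not the Seoul data.
records = [
    (8, 900), (8, 1100),
    (10, 300), (10, 340),
    (13, 380), (13, 420),
    (18, 1500), (18, 1700),
]


def mean_rentals_by_hour(rows):
    """Average the rental count for each hour of the day."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for hour, count in rows:
        totals[hour] += count
        counts[hour] += 1
    return {hour: totals[hour] / counts[hour] for hour in totals}


hourly_mean = mean_rentals_by_hour(records)

# The two hours with the lowest average rentals are maintenance candidates.
quietest_hours = sorted(hourly_mean, key=hourly_mean.get)[:2]
print(quietest_hours)
```

Grouping by season instead of hour would follow the same pattern and would support the winter recommendation for major repairs.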

Comparing Lightning Strike Data throughout the years

Visualizations: Lightning Strike Visualizations

Goal: To understand the trends of lightning strikes from 2009-2018.

Description: The project focused on visualizing the lightning strike data to understand how the strikes changed over time. The columns used from this dataset were latitude, longitude, and the date the strike happened.

Skills: data visualization

Technology: Tableau

Results: After building the report, we saw that from 2009 to 2018 the number of lightning strikes recorded increased, with Q3 (the summer months) being the most frequent quarter for lightning strikes. Over the years, lightning strikes moved from the east coast toward the central mainland, with Louisiana, Arkansas, and Mississippi seeing the most lightning strikes over the whole decade.
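The quarterly comparison was done in Tableau; as a minimal sketch of the same idea in Python, strike dates can be bucketed into quarters and counted. The dates below are hypothetical, and `quarter_of` is an illustrative helper rather than anything from the original report.

```python
from collections import Counter
from datetime import date

# Hypothetical strike dates; not taken from the lightning strike dataset.
strike_dates = [
    date(2016, 1, 15),
    date(2016, 7, 4),
    date(2016, 8, 20),
    date(2016, 9, 1),
    date(2016, 11, 30),
]


def quarter_of(day):
    """Map a date to its calendar quarter (1 through 4)."""
    return (day.month - 1) // 3 + 1


strikes_per_quarter = Counter(quarter_of(day) for day in strike_dates)

# most_common(1) returns the quarter with the highest strike count.
busiest_quarter, busiest_count = strikes_per_quarter.most_common(1)[0]
print(busiest_quarter, busiest_count)
```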

Waze User Churn

Code: Predicting User Churn at Waze

Goal: To predict user churn at Waze and understand possible reasons why.

Description: The project focused on analyzing a dataset on user churn. The dataset included whether the user was retained or churned, the number of times a user opened the app during the month, the number of drives over 1 km during the month, whether the user had an iPhone or Android device, and other variables. The project involved loading the data, cleaning and preprocessing it, performing exploratory data analysis (EDA), analyzing the correlation between whether a user churned and the other variables, and building machine learning models.

Skills: data cleaning, data analysis, data modeling, machine learning, data visualization

Technology: Python, pandas, NumPy, Seaborn, Matplotlib, scikit-learn, XGBoost

Results: The machine learning model was not a strong enough predictor to drive business decisions. However, it is a good starting point to guide further exploratory efforts. Additional information such as geographic location, drive times, or whether a user ended the route before reaching their destination would give us a better chance of improving the model.

College Projects

Sample Code from Old College Projects
