The goal of this case study is to showcase the skills learned throughout the Google Data Analytics Professional Certificate program on Coursera.
This project demonstrates data processing and visualization.
6 Phases of Data Analysis
This case study walks through the techniques used in each phase of the data analysis process. Several software tools were used along the way.
Since its launch in 2016, Cyclistic has grown across
Chicago with a fleet of 5,824 bicycles that are geo-tracked
and locked into a network of 692 stations.
Their systems have made it easy for riders to unlock bikes
from one station and return to any other station at any time.
Cyclistic's marketing strategy owes its success to
the flexibility of its pricing plans: single-ride passes,
full-day passes, and annual memberships. Customers who buy
single-ride or full-day passes are casual riders.
Cyclistic's marketing strategy aims to understand and predict customer behavior based on the last 12 months of data. As members of the marketing team, our task is to explore hypotheses and predictions by comparing how annual members and casual riders use Cyclistic bikes differently.
- How can you help the stakeholders resolve their questions?
- How can your insights drive business decisions?
- Identify the stakeholders of the project:
  - Marketing Director/Manager
  - Executive Team
Provide insights backed by relevant data sources and documented cleaning steps, summarize the key findings, and support data-driven decisions with meaningful visualizations.
The real datasets are made available by Motivate International Inc. under its license. We use 12 months of data, from April 2020 to May 2021, for our fictional business, which aims to increase revenue from its bike-share service by converting casual riders into annual members.
The janitor, tidyverse, and ggplot2 packages are used in R for data cleaning and visualization. The data is loaded and stored in separate variables. Missing and null values are identified and filtered out. Key rider attributes such as demographics, income, and age are not included in the data.
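The loading step can be sketched as follows. This is a minimal illustration in base R rather than the tidyverse workflow used in the case study, and the file names, folder, and column names are assumptions made up for the demo. It writes two tiny monthly files to a temporary folder so it can run on its own, then reads and stacks them.

```r
# Hypothetical demo folder and file names; the real monthly files are not shown here.
data_dir <- file.path(tempdir(), "cyclistic_demo")
dir.create(data_dir, showWarnings = FALSE)

# Two tiny stand-in monthly files (column names are assumptions).
write.csv(data.frame(ride_id = c("A1", "A2"),
                     member_casual = c("casual", "member")),
          file.path(data_dir, "202004-tripdata.csv"), row.names = FALSE)
write.csv(data.frame(ride_id = "B1",
                     member_casual = "casual"),
          file.path(data_dir, "202005-tripdata.csv"), row.names = FALSE)

# Find every monthly file, read each one, and stack them into one data frame.
csv_files <- list.files(data_dir, pattern = "tripdata\\.csv$", full.names = TRUE)
monthly <- lapply(csv_files, read.csv, stringsAsFactors = FALSE)
all_trips <- do.call(rbind, monthly)

print(dim(all_trips))
```

Stacking with `rbind` only works when every monthly file has the same columns, so it is worth comparing column names across files before combining them.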
Inspection of data structures and data types for errors:
- Total Rows: 4,358,611
- Total Columns: 13
- Total missing values (NA or null): 431,585
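Counts like the ones above can be obtained with a few base R calls. The sketch below runs on a small made-up data frame, so its numbers are not the case study's figures, and the column names are assumptions.

```r
# Toy stand-in for the combined trip data (column names are assumptions).
trips <- data.frame(
  ride_id = c("A1", "A2", "A3", "A4"),
  rideable_type = c("docked_bike", "electric_bike", "classic_bike", "docked_bike"),
  start_station_name = c("Station A", NA, "Station C", "Station D"),
  member_casual = c("casual", "member", "casual", "member"),
  stringsAsFactors = FALSE
)

# Inspect the structure and data type of every column.
str(trips)

total_rows <- nrow(trips)                    # number of rides
total_cols <- ncol(trips)                    # number of columns
total_missing <- sum(is.na(trips))           # NA cells across the whole table
missing_by_column <- colSums(is.na(trips))   # where the NAs are concentrated

# Keep only the rows that have no missing values.
trips_complete <- trips[complete.cases(trips), ]
```

Checking `missing_by_column` before dropping rows shows whether the missing values sit in columns the analysis actually needs.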
The original datasets are transformed into a new data frame with columns for member classification, bike type, weekday, month, and ride duration, which supports computational and descriptive analysis.
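A minimal sketch of that transformation in base R is shown here. The input column names (`started_at`, `ended_at`, `rideable_type`, `member_casual`) and the output names are assumptions, and the rows are made up for illustration.

```r
# Toy input rows (column names are assumptions).
trips <- data.frame(
  rideable_type = c("docked_bike", "electric_bike", "classic_bike"),
  started_at = c("2020-04-04 10:00:00", "2020-04-06 08:15:00", "2020-04-11 14:30:00"),
  ended_at = c("2020-04-04 10:45:00", "2020-04-06 08:27:00", "2020-04-11 15:40:00"),
  member_casual = c("casual", "member", "casual"),
  stringsAsFactors = FALSE
)

# Parse the timestamps; UTC is used only to keep the demo reproducible.
started <- as.POSIXct(trips$started_at, tz = "UTC")
ended <- as.POSIXct(trips$ended_at, tz = "UTC")

# Locale-independent weekday labels (POSIXlt$wday is 0 for Sunday).
day_names <- c("Sunday", "Monday", "Tuesday", "Wednesday",
               "Thursday", "Friday", "Saturday")

rides <- data.frame(
  member_type = trips$member_casual,
  bike_type = trips$rideable_type,
  weekday = day_names[as.POSIXlt(started)$wday + 1],
  month = month.name[as.POSIXlt(started)$mon + 1],
  ride_duration_min = as.numeric(difftime(ended, started, units = "mins")),
  stringsAsFactors = FALSE
)

print(rides)
```

Deriving the weekday from `POSIXlt$wday` instead of `weekdays()` keeps the labels in English regardless of the system locale.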
This part focuses strictly on descriptive analysis. The charts show that casual riders use the shared bikes more than annual members on weekends.
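The kind of comparison behind such charts can be sketched with a cross-tabulation. The rows below are invented purely to show the mechanics and are not results from the case study; the column names are assumptions.

```r
# Made-up rides for illustration only (not the case study's data).
rides <- data.frame(
  member_type = c("casual", "casual", "casual", "member", "member", "member"),
  weekday = c("Saturday", "Sunday", "Monday", "Saturday", "Tuesday", "Wednesday"),
  ride_duration_min = c(45, 70, 30, 20, 12, 15),
  stringsAsFactors = FALSE
)

# Flag weekend rides, then count rides per rider type and weekend flag.
rides$is_weekend <- rides$weekday %in% c("Saturday", "Sunday")
ride_counts <- table(member_type = rides$member_type,
                     is_weekend = rides$is_weekend)

# Average ride duration per rider type.
mean_duration <- aggregate(ride_duration_min ~ member_type,
                           data = rides, FUN = mean)

print(ride_counts)
print(mean_duration)
```

A table like `ride_counts` can be passed to a plotting function to draw the weekday comparison as a bar chart.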
- R - analysis and visualization
- Microsoft SQL Server & Excel
- Tableau - create data visualization and reports
- Jupyter Notebook - data analysis and visualizations
Stack Overflow community
GitHub community