# **Cyclistic Case Study**

### About the Company
In 2016, Cyclistic launched a successful bike-share offering. Since then, the program has grown to a fleet of 5,824 bicycles that are geotracked and locked into a network of 692 stations across Chicago. The bikes can be unlocked from one station and returned to any other station in the system anytime. (Source: Google Data Analytics Professional Certificate)

### Services Offered:
- Single-Ride Passes
- Full-Day Passes
- Annual Memberships

### Customer Segment:
- Customers who purchase single-ride or full-day passes are referred to as casual riders.
- Customers who purchase annual memberships are Cyclistic members.

### Finance Analysts' Speculation
- Annual members are much more profitable than casual riders.
- According to Moreno, the company's director of marketing, maximizing the number of annual members will be key to the company's future success.

### Notes from Moreno
- Rather than creating a marketing campaign that targets all-new customers, the company should focus on converting casual riders into annual members.
- Casual riders are already aware of the company's program and have chosen Cyclistic for their mobility needs.

# Case Study Deliverables:
- A clear statement of the business task
- A description of all data sources used
- Documentation of any cleaning or manipulation of data
- A summary of analysis
- Supporting visualizations and key findings
- Top 3 Recommendations based on analysis

# **Question to be Answered: How do annual members and casual riders use Cyclistic bikes differently?**

# **Ask Phase**

| **Guiding Questions**                         |              **Answers** |
|-------------------------------------------|:--------------------:|
| What is the problem I am trying to solve? |   Profit maximization for Cyclistic by turning casual riders into annual members   |
| How can my insights drive business decisions?      |  Understanding how riding patterns differ between Cyclistic's casual riders and annual members could lead to actionable steps and inform marketing campaigns that convert casual riders into annual members |

### Deliverable: Concise and Clear Statement of the Business Task
**_Analyze and compare the bike usage patterns of casual riders and annual members._**

# **Prepare Phase**

| **Guiding Questions**                         |              **Answers** |
|-------------------------------------------|:--------------------:|
| Where is the data located? |   The riding data has been provided by Motivate International Inc. and is stored on Amazon Web Services (AWS), a reputable cloud provider.   |
| How is the data organized?     | <ol><li>Comma-separated values (CSV) files</li><li>A consistent file-naming convention is followed, except for the September 2022 file</li><li>All data files have the same number of columns, and the column names are identical.</li></ol> |
| Are there issues with bias or credibility in the data?     | <ol><li>The data is current: it covers the immediately preceding 12 months.</li><li>The data is from a primary source, as it was collected in real time by the company itself through its app.</li><li>The data files have a number of missing values, which could reduce the reliability of the resulting analysis.</li></ol> |
| How are components like licensing, privacy, security, and accessibility addressed?    |  <ol><li>The data is licensed by Divvy under this [agreement](https://www.divvybikes.com/data-license-agreement), but the page could not be accessed because the server was down, which raises a concern.</li><li>The data is open source and therefore publicly accessible to anyone who needs it.</li><li>For privacy reasons, the riders' ID information has been altered so that pass purchases cannot be linked to credit card numbers.</li></ol> |

### Inspecting the Data

In [1]:
import pandas as pd

#### Loading all the Data

In [7]:
import os
# check current working directory
os.getcwd()

'c:\\Users\\Alli Ajagbe\\OneDrive - Plaksha University\\Desktop\\Data Analytics\\case_study'

In [13]:
june2023 = pd.read_csv("../../divvy-data/202306-divvy-tripdata.csv")
may2023 = pd.read_csv("../../divvy-data/202305-divvy-tripdata.csv")
april2023 = pd.read_csv("../../divvy-data/202304-divvy-tripdata.csv")
march2023 = pd.read_csv("../../divvy-data/202303-divvy-tripdata.csv")
february2023 = pd.read_csv("../../divvy-data/202302-divvy-tripdata.csv")
january2023 = pd.read_csv("../../divvy-data/202301-divvy-tripdata.csv")
december2022 = pd.read_csv("../../divvy-data/202212-divvy-tripdata.csv")
november2022 = pd.read_csv("../../divvy-data/202211-divvy-tripdata.csv")
october2022 = pd.read_csv("../../divvy-data/202210-divvy-tripdata.csv")
september2022 = pd.read_csv("../../divvy-data/202209-divvy-publictripdata.csv")
august2022 = pd.read_csv("../../divvy-data/202208-divvy-tripdata.csv")
july2022 = pd.read_csv("../../divvy-data/202207-divvy-tripdata.csv")
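
The twelve `read_csv` calls above could also be written as a loop. The following is a minimal sketch, assuming all monthly files sit in one folder and end in `tripdata.csv` (a pattern that also matches the September 2022 `publictripdata` name). `load_monthly_files` is a hypothetical helper and not part of the original notebook.

```python
from pathlib import Path

import pandas as pd


def load_monthly_files(data_dir, pattern="*tripdata.csv"):
    """Read every CSV in data_dir matching pattern into a dict keyed by file name.

    Hypothetical helper: the folder layout and file pattern are assumptions.
    """
    frames = {}
    # File names start with YYYYMM, so sorting by name gives chronological order.
    for path in sorted(Path(data_dir).glob(pattern)):
        frames[path.name] = pd.read_csv(path)
    return frames
```

For example, `load_monthly_files("../../divvy-data")` would return one DataFrame per monthly file, keyed by names such as `202306-divvy-tripdata.csv`.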

#### Columns Check

In [19]:
# asserting that all files have the same columns
all_dfs = [june2023, may2023, april2023, march2023, february2023, january2023, december2022, november2022, october2022, september2022, august2022, july2022]
df = all_dfs[0]
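# Index.equals also copes with frames that have a different number of columns, where == would raise an error
assert all(df.columns.equals(df2.columns) for df2 in all_dfs)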
print('All files have the same columns.')

All files have the same columns.
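
If the assertion ever failed, it would not say which file is at fault. The following is a minimal sketch of a more informative check, assuming the frames are kept in a dict keyed by a label such as the month. `column_mismatches` is a hypothetical helper and not part of the original notebook.

```python
import pandas as pd


def column_mismatches(frames):
    """Return {label: columns} for every frame whose columns differ from the first frame's.

    Hypothetical helper: `frames` is assumed to be a dict of label -> DataFrame.
    """
    if not frames:
        return {}
    labels = list(frames)
    # The first frame serves as the reference for names and order.
    reference = list(frames[labels[0]].columns)
    return {
        label: list(frames[label].columns)
        for label in labels[1:]
        if list(frames[label].columns) != reference
    }
```

An empty result means every frame shares the reference columns in the same order.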


In [21]:
june2023

Unnamed: 0,ride_id,rideable_type,started_at,ended_at,start_station_name,start_station_id,end_station_name,end_station_id,start_lat,start_lng,end_lat,end_lng,member_casual
0,6F1682AC40EB6F71,electric_bike,2023-06-05 13:34:12,2023-06-05 14:31:56,,,,,41.910000,-87.690000,41.910000,-87.700000,member
1,622A1686D64948EB,electric_bike,2023-06-05 01:30:22,2023-06-05 01:33:06,,,,,41.940000,-87.650000,41.940000,-87.650000,member
2,3C88859D926253B4,electric_bike,2023-06-20 18:15:49,2023-06-20 18:32:05,,,,,41.950000,-87.680000,41.920000,-87.630000,member
3,EAD8A5E0259DEC88,electric_bike,2023-06-19 14:56:00,2023-06-19 15:00:35,,,,,41.990000,-87.650000,41.980000,-87.660000,member
4,5A36F21930D6A55C,electric_bike,2023-06-19 15:03:34,2023-06-19 15:07:16,,,,,41.980000,-87.660000,41.990000,-87.650000,member
...,...,...,...,...,...,...,...,...,...,...,...,...,...
719613,D7BBF4BCBB72DA32,classic_bike,2023-06-30 12:58:56,2023-06-30 13:41:25,Fairbanks Ct & Grand Ave,TA1305000003,California Ave & Milwaukee Ave,13084,41.891847,-87.620580,41.922695,-87.697153,casual
719614,9A1685F9A39646CA,electric_bike,2023-06-29 19:56:44,2023-06-29 20:09:15,Fairbanks Ct & Grand Ave,TA1305000003,,,41.891970,-87.620198,41.890000,-87.610000,casual
719615,CD4CC5A60881C7AF,electric_bike,2023-06-25 00:27:20,2023-06-25 00:39:09,Clark St & Lincoln Ave,13179,,,41.915745,-87.634604,41.920000,-87.650000,casual
719616,FF6594685CFE2056,electric_bike,2023-06-24 21:26:57,2023-06-24 21:28:44,Fairbanks Ct & Grand Ave,TA1305000003,,,41.891725,-87.620607,41.890000,-87.620000,casual
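
The preview shows blank station names and IDs for several electric-bike rides, which matches the missing values noted in the Prepare phase. The following is a minimal sketch for quantifying them, assuming the blanks were read as `NaN` (the `read_csv` default). `missing_summary` is a hypothetical helper and not part of the original notebook.

```python
import pandas as pd


def missing_summary(df):
    """Count missing values per column and express them as a percentage of all rows.

    Hypothetical helper: assumes blank cells were parsed as NaN.
    """
    counts = df.isna().sum()
    summary = pd.DataFrame({
        "missing": counts,
        "percent": (counts / len(df) * 100).round(2),
    })
    # Keep only the columns that actually have gaps, largest gap first.
    return summary[summary["missing"] > 0].sort_values("missing", ascending=False)
```

Calling `missing_summary(june2023)` would list the affected columns for that month.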
