# How does a bike-share navigate speedy success?

## Introduction
The Cyclistic Bike Share Case Study is a capstone project for the Google Data Analytics Professional Certificate on 
Coursera. In this project, I follow the data analysis process I learned in the course (ask, prepare, 
process, analyze, share, and act) to analyze the data.

## Scenario 

You are a junior data analyst working on the marketing analyst team at Cyclistic, a bike-share company in Chicago. The director of marketing believes the company’s future success depends on maximizing the number of annual memberships. Therefore, your team wants to understand how casual riders and annual members use Cyclistic bikes differently. From these insights, your team will design a new marketing strategy to convert casual riders into annual members. But first, Cyclistic executives must approve your recommendations, so they must be backed up with compelling data insights and professional data visualizations.

## About the company 

In 2016, Cyclistic launched a successful bike-share service. Since then, the program has grown to a fleet of 5,824 bicycles that are geotracked and locked into a network of 692 stations across Chicago. The bikes can be unlocked from one station and returned to any other station in the system anytime.  

Until now, Cyclistic’s marketing strategy relied on building general awareness and appealing to broad consumer segments. One approach that helped make these things possible was the flexibility of its pricing plans: single-ride passes, full-day passes, and annual memberships. Customers who purchase single-ride or full-day passes are referred to as *casual riders*. Customers who purchase annual memberships are *Cyclistic members*. 

Lily Moreno, the director of marketing, has set a clear goal: **Design marketing strategies aimed at converting casual riders into annual members**. In order to do that, however, the team needs to better understand how annual members and casual riders differ, why casual riders would buy a membership, and how digital media could affect their marketing tactics. Moreno and her team are interested in analyzing the Cyclistic historical bike trip data to identify trends. 

## Ask 
Three questions will guide the future marketing program: 
1. How do annual members and casual riders use Cyclistic bikes differently? 
2. Why would casual riders buy Cyclistic annual memberships?  
3. How can Cyclistic use digital media to influence casual riders to become members? 

Moreno has assigned you the first question to answer: **How do annual members and casual riders use Cyclistic bikes differently?** 

You will produce a report with the following deliverables:  
1. A clear statement of the business task 
2. A description of all data sources used 
3. Documentation of any cleaning or manipulation of data 
4. A summary of your analysis  
5. Supporting visualizations and key findings  
6. Your top three recommendations based on your analysis

## Case Study Roadmap - Ask 
### Guiding questions 
* What is the problem you are trying to solve?  
I am trying to identify patterns in how casual riders and annual members use the service differently.

* How can your insights drive business decisions?  
We can run targeted ad campaigns or offer discounts to encourage casual riders to adopt the behavior of annual members.

### Key tasks 
* Identify the business task
* Consider key stakeholders 

### Deliverable 
* A clear statement of the business task  
Identify the profiles of casual riders and annual members and how they differ.

## Case Study Roadmap - Prepare 
### Guiding questions 
* Where is your data located?  
Local files downloaded from:  
https://docs.google.com/spreadsheets/d/1uCTsHlZLm4L7-ueaSLwDg0ut3BP_V4mKDo2IMpaXrk4/template/preview?resourcekey=0-dQAUjAu2UUCsLEQQt20PDA#gid=1797029090  
https://docs.google.com/spreadsheets/d/179QVLO_yu5BJEKFVZShsKag74ZaUYIF6FevLYzs3hRc/template/preview#gid=640449855
* How is the data organized?  
Two data frames with 8 variables each

**data2019**  
data.frame:	365069 obs. of  8 variables:  
trip_id          : int  
start_time       : chr  
end_time         : chr  
from_station_id  : int   
from_station_name: chr  
end_station_id   : int  
end_station_name : chr  
usertype         : chr  

**data2020**  
data.frame:	426887 obs. of  8 variables:  
trip_id          : chr  
start_time       : chr  
end_time         : chr  
from_station_name: chr  
from_station_id  : int  
end_station_name : chr  
end_station_id   : int  
usertype         : chr  

* Are there issues with bias or credibility in this data? Does your data ROCCC?  
The data cover the first quarter (January, February, and March) of 2019 and 2020 and come directly from the company.  
The data are reliable, original, comprehensive, current, and cited for the project.

* How are you addressing licensing, privacy, security, and accessibility?  
I chose the MIT License.

MIT License

Copyright (c) 2025 Nicolas AMOUSSOUVI

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

* How did you verify the data’s integrity?  
I checked for duplicates and NA values and removed the corresponding rows.
* How does it help you answer your question?  
I can be sure of my data and draw more accurate insights.
* Are there any problems with the data?  
The data are from winter, so user behavior may differ in other seasons.

### Key tasks 
* Download data and store it appropriately. 
* Identify how it’s organized. 
* Sort and filter the data. 
* Determine the credibility of the data.

### Deliverable 
* A description of all data sources used  
The data cover the first quarter (January, February, and March) of 2019 and 2020 and come directly from the company.
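
The duplicate and NA checks mentioned in this section could be sketched as follows. This is a Python illustration under stated assumptions, not the R code used for the project: the column names come from the structure listed above, while the sample rows, the `member`/`casual` labels, and the `integrity_report` helper are made up for the example.

```python
import csv
import io
from collections import Counter

# Made-up sample in the column layout listed above (not real Cyclistic data).
SAMPLE_CSV = """trip_id,start_time,end_time,from_station_id,from_station_name,end_station_id,end_station_name,usertype
1,2019-01-01 00:04:37,2019-01-01 00:11:07,199,Station A,84,Station B,member
2,2019-01-01 00:08:13,2019-01-01 00:15:34,44,Station C,624,Station D,casual
2,2019-01-01 00:08:13,2019-01-01 00:15:34,44,Station C,624,Station D,casual
3,2019-01-01 00:13:23,,15,Station E,644,Station F,member
"""


def integrity_report(rows):
    """Count repeated trip_ids and rows with at least one empty field."""
    id_counts = Counter(row["trip_id"] for row in rows)
    # Every occurrence of a trip_id beyond the first counts as a duplicate row.
    duplicate_rows = sum(n - 1 for n in id_counts.values() if n > 1)
    rows_with_missing = sum(
        1
        for row in rows
        if any(value is None or value.strip() == "" for value in row.values())
    )
    return {
        "rows": len(rows),
        "duplicate_rows": duplicate_rows,
        "rows_with_missing": rows_with_missing,
    }


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
report = integrity_report(rows)
print(report)
```

A report like this only counts the problems; removing the affected rows is a separate step.
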

## Case Study Roadmap - Process 
### Guiding questions 
* What tools are you choosing and why?  
I chose RStudio because it provides data management, analysis, and data visualization.
* Have you ensured your data’s integrity?  
Yes
* What steps have you taken to ensure that your data is clean?  
I removed duplicates, NA values, and columns not needed for the objective.  
I harmonized the date formats, column names, and column order.

* How can you verify that your data is clean and ready to analyze?  
I checked the format of each column with `head` and checked the range of values with plots.

* Have you documented your cleaning process so you can review and share those results?  
Yes, in the R document.

### Key tasks 
* Check the data for errors.
* Choose your tools. 
* Transform the data so you can work with it effectively. 
* Document the cleaning process. 

### Deliverable 
* Documentation of any cleaning or manipulation of data
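
As an illustration of the cleaning steps described above (harmonizing date formats and ride IDs, then dropping incomplete and duplicate rows), here is a minimal Python sketch. It is not the project's R code: the two timestamp layouts in `DATE_FORMATS`, the `parse_time` and `clean` helpers, and the sample rows are assumptions made for the example.

```python
from datetime import datetime

# Assumed timestamp layouts; the real files may use different ones.
DATE_FORMATS = ("%Y-%m-%d %H:%M:%S", "%m/%d/%Y %H:%M")


def parse_time(text):
    """Try each assumed layout; return a datetime, or None if none match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None


def clean(rows):
    """Drop rows with missing or unparseable values and repeated trip_ids."""
    seen_ids = set()
    cleaned = []
    for row in rows:
        start = parse_time(row.get("start_time") or "")
        end = parse_time(row.get("end_time") or "")
        if start is None or end is None:
            continue
        if not row.get("trip_id") or not row.get("usertype"):
            continue
        trip_id = str(row["trip_id"])  # int in one year, text in the other
        if trip_id in seen_ids:
            continue
        seen_ids.add(trip_id)
        cleaned.append(
            {
                "trip_id": trip_id,
                "start_time": start,
                "end_time": end,
                "usertype": row["usertype"],
            }
        )
    return cleaned


# Made-up rows: one duplicate, one second date layout, one missing end_time.
raw_rows = [
    {"trip_id": 1, "start_time": "2019-01-01 00:04:37", "end_time": "2019-01-01 00:11:07", "usertype": "member"},
    {"trip_id": 1, "start_time": "2019-01-01 00:04:37", "end_time": "2019-01-01 00:11:07", "usertype": "member"},
    {"trip_id": "A2", "start_time": "1/2/2020 8:00", "end_time": "1/2/2020 9:30", "usertype": "casual"},
    {"trip_id": 3, "start_time": "2019-01-03 10:00:00", "end_time": "", "usertype": "member"},
]
cleaned_rows = clean(raw_rows)
print(len(cleaned_rows), "rows kept out of", len(raw_rows))
```

Casting `trip_id` to text is one simple way to reconcile the integer IDs of one file with the character IDs of the other before combining them.
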


## Case Study Roadmap - Analyze  
### Guiding questions 
* How should you organize your data to perform analysis on it?  
I combined all my data into a single data frame.
* Has your data been properly formatted?  
Yes, in the previous step.
* What surprises did you discover in the data?  
There were few NA values and duplicates.  
The ride ID had a different format in the 2019 and 2020 data.  
rideable_type, gender, and birthyear were removed between the 2019 and 2020 data.  
start_lat, start_lng, end_lat, and end_lng were added in the 2020 data.  
* What trends or relationships did you find in the data?   
Casual riders use the service for long rides (89 min), more often on Thursday and Friday and less during the rest of the week.  
Annual members use the service for short rides (13 min), spread evenly across all days of the week.  
The service is used less during the cold months.  

Casual riders would buy an annual membership to get to work and to nearby activities, whereas as casual riders they go for long rides.  
Can the subscription conditions explain the long rides of casual users?  
A full-day pass allows for longer use.  
Annual members get the first 15 minutes free, which is why their rides are shorter and why the membership suits people living close to their activities (city dwellers).  

Target ads at casual riders: tourists and people who want to go sightseeing or on a nature escapade.  
Ads for annual membership should promote short trips to work and activities as a quicker alternative to the subway, bus, or tram in big cities. We can also focus on the health and ecological benefits of taking a bike instead of other transportation.  
Offer discounts to students or companies to build habits and gain long-term users.  
To convert casual riders into annual members, we could offer a free week at the start of the school year, during spring and summer, or during public transport strikes, so that casual users can try the member experience.  
Another idea is a referral code for annual members: the referrer and the referred rider both get a bonus, so everybody is happier.

* How will these insights help answer your business questions?  
Casual riders and annual members behave differently, so we can tailor the campaign to each group.

### Key tasks 
* Aggregate your data so it’s useful and accessible. 
* Organize and format your data. 
* Perform calculations. 
* Identify trends and relationships. 

### Deliverable  
* A summary of your analysis  
Casual riders take long rides (89 min on average), mostly on Thursday and Friday and less often the rest of the week.  
Annual members take short rides (13 min on average), spread evenly across the days of the week.  
Annual members are far more present (100 times more than casual riders).
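
The two aggregations behind this summary, mean ride length and ride counts per weekday for each user type, could be computed along these lines. This Python sketch is an illustration, not the project's R code: the `summarize` helper and the sample trips are hypothetical, and the values are made up only to exercise the function, so they do not reproduce the figures above.

```python
from collections import defaultdict
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def summarize(trips):
    """Return mean ride length in minutes and ride counts per weekday, by user type."""
    minutes = defaultdict(list)
    weekday_counts = defaultdict(lambda: defaultdict(int))
    for trip in trips:
        length = (trip["end_time"] - trip["start_time"]).total_seconds() / 60
        minutes[trip["usertype"]].append(length)
        day = WEEKDAYS[trip["start_time"].weekday()]  # weekday() gives Monday = 0
        weekday_counts[trip["usertype"]][day] += 1
    mean_minutes = {user: sum(vals) / len(vals) for user, vals in minutes.items()}
    counts = {user: dict(days) for user, days in weekday_counts.items()}
    return mean_minutes, counts


# Made-up trips (not real Cyclistic data).
trips = [
    {"usertype": "member", "start_time": datetime(2020, 1, 6, 8, 0), "end_time": datetime(2020, 1, 6, 8, 10)},
    {"usertype": "member", "start_time": datetime(2020, 1, 7, 17, 0), "end_time": datetime(2020, 1, 7, 17, 20)},
    {"usertype": "casual", "start_time": datetime(2020, 1, 9, 13, 0), "end_time": datetime(2020, 1, 9, 14, 0)},
]
mean_minutes, weekday_counts = summarize(trips)
print(mean_minutes)
print(weekday_counts)
```
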

## Case Study Roadmap - Share  
### Guiding questions 
* Were you able to answer the question of how annual members and casual riders use Cyclistic bikes differently?  
Yes

* What story does your data tell?  
Annual subscribers take short rides every day, while casual riders take long rides, mainly on Thursday and Friday.

* How do your findings relate to your original question?  
There is a clear difference between casual riders and annual members.  
We can leverage it with ads specific to each group's needs.

* Who is your audience? What is the best way to communicate with them?  
Lily Moreno (the director of marketing), the Cyclistic marketing analytics team, and the Cyclistic executive team.

* Can data visualization help you share your findings?  
Yes, with a chart of the ride count for each day.

* Is your presentation accessible to your audience?  
Yes 

### Key tasks 
* Determine the best way to share your findings.  
* Create effective data visualizations. 
* Present your findings. 
* Ensure your work is accessible. 

### Deliverable  
* Supporting visualizations and key findings 

## Case Study Roadmap - Act 
### Guiding questions 
* What is your final conclusion based on your analysis?  
Annual subscribers are far more present than casual riders, so it is important to reward loyalty in order to keep them.  
Converting casual users into annual subscribers requires finding the portion of them who use the service for short trips.

* How could your team and business apply your insights?  
Build an ad campaign that showcases the ease of use, health benefits, and speed of our service compared with commuting by car, public transportation, or other alternatives.  
* What next steps would you or your stakeholders take based on your findings?  
Create a targeted ad campaign to turn casual users into annual members, with benefits such as a free first week for subscribers.  
* Is there additional data you could use to expand on your findings?  
Use a full year of data, which could reveal seasonal behavior.

### Key tasks 
* Create your portfolio. 
* Add your case study. 
* Practice presenting your case study to a friend or family member. 

### Deliverable 
Your top three recommendations based on your analysis:  
1. Casual riders go for long rides, specifically on Thursday and Friday, while annual members go for short rides every day. Build habits by targeting campaigns at companies, schools, and tourist websites.

2. Target ads at casual riders with a benefit package if they are referred by an annual member. Run discount campaigns during specific periods (beginning of the school year, spring, summer, public transportation strikes). Showcase the health, ecological, and time benefits of our bike service.

3. Reward the loyalty of annual users. They are far more present than casual users (100 times more) and need to stay on the service, perhaps through a targeted campaign to keep them longer. Ideas for a loyalty system: the more you ride, the less you pay; a leaderboard for the most kilometers ridden during the month; tribes that riders can create (schools, companies, districts...); year-over-year comparisons; and rewards for annual subscribers such as a day pass with a longer rental period or a free rental for a friend.
