# **Scenario**
~~~~
~~~~
Cyclistic is a bike-share program that features more than 5,800 bicycles and 600 docking stations. Cyclistic sets itself
apart by also offering reclining bikes, hand tricycles, and cargo bikes, making bike-share more inclusive to people with
disabilities and riders who can’t use a standard two-wheeled bike. The majority of riders opt for traditional bikes; about
8% of riders use the assistive options. Cyclistic users are more likely to ride for leisure, but about 30% use the bikes to
commute to work each day.

Until now, Cyclistic’s marketing strategy relied on building general awareness and appealing to broad consumer segments.
One approach that helped make these things possible was the flexibility of its pricing plans: single-ride passes, full-day passes,
and annual memberships. Customers who purchase single-ride or full-day passes are referred to as casual riders. Customers
who purchase annual memberships are Cyclistic members.

Cyclistic’s finance analysts have concluded that annual members are much more profitable than casual riders. Although the
pricing flexibility helps Cyclistic attract more customers, Moreno believes that maximizing the number of annual members will
be key to future growth. Rather than creating a marketing campaign that targets all-new customers, Moreno believes there is a
very good chance to convert casual riders into members. She notes that casual riders are already aware of the Cyclistic
program and have chosen Cyclistic for their mobility needs.

**GOAL** : Design marketing strategies aimed at converting casual riders into annual members. In order to
do that, however, the marketing analyst team needs to better understand how annual members and casual riders differ, why
casual riders would buy a membership, and how digital media could affect their marketing tactics.

In [1]:
HISTORICAL DATA.

METADATA

select * from voltaic-mantra-364014.cyclisticdb.INFORMATION_SCHEMA.TABLES

Unnamed: 0,table_catalog,table_schema,table_name,table_type,is_insertable_into,is_typed,creation_time,base_table_catalog,base_table_schema,base_table_name,snapshot_time_ms,ddl,default_collation_name,upsert_stream_apply_watermark
0,voltaic-mantra-364014,cyclisticdb,trip_data,BASE TABLE,YES,NO,2022-12-19 22:34:19.423,,,,NaT,CREATE TABLE `voltaic-mantra-364014.cyclisticd...,,NaT


In [2]:
select * from voltaic-mantra-364014.cyclisticdb.INFORMATION_SCHEMA.COLUMNS

Unnamed: 0,table_catalog,table_schema,table_name,column_name,ordinal_position,is_nullable,data_type,is_generated,generation_expression,is_stored,is_hidden,is_updatable,is_system_defined,is_partitioning_column,clustering_ordinal_position,collation_name,column_default,rounding_mode
0,voltaic-mantra-364014,cyclisticdb,trip_data,ride_id,1,YES,STRING,NEVER,,,NO,,NO,NO,,,,
1,voltaic-mantra-364014,cyclisticdb,trip_data,rideable_type,2,YES,STRING,NEVER,,,NO,,NO,NO,,,,
2,voltaic-mantra-364014,cyclisticdb,trip_data,started_at,3,YES,TIMESTAMP,NEVER,,,NO,,NO,NO,,,,
3,voltaic-mantra-364014,cyclisticdb,trip_data,ended_at,4,YES,TIMESTAMP,NEVER,,,NO,,NO,NO,,,,
4,voltaic-mantra-364014,cyclisticdb,trip_data,start_station_name,5,YES,STRING,NEVER,,,NO,,NO,NO,,,,
5,voltaic-mantra-364014,cyclisticdb,trip_data,start_station_id,6,YES,INT64,NEVER,,,NO,,NO,NO,,,,
6,voltaic-mantra-364014,cyclisticdb,trip_data,end_station_name,7,YES,STRING,NEVER,,,NO,,NO,NO,,,,
7,voltaic-mantra-364014,cyclisticdb,trip_data,end_station_id,8,YES,INT64,NEVER,,,NO,,NO,NO,,,,
8,voltaic-mantra-364014,cyclisticdb,trip_data,start_lat,9,YES,FLOAT64,NEVER,,,NO,,NO,NO,,,,
9,voltaic-mantra-364014,cyclisticdb,trip_data,start_lng,10,YES,FLOAT64,NEVER,,,NO,,NO,NO,,,,


~~~~
~~~~


### LICENCE

Lyft Bikes and Scooters, LLC (“Bikeshare”) operates the City of Chicago’s (“City”) Divvy bicycle sharing service. Bikeshare and the City are committed to supporting bicycling as an alternative transportation option. As part of that commitment, the City permits Bikeshare to make certain Divvy system data owned by the City (“Data”) available to the public, subject to the terms and conditions of this License Agreement (“Agreement”). By accessing or using any of the Data, you agree to all of the terms and conditions of [this agreement](https://ride.divvybikes.com/data-license-agreement).
~~~~
~~~~
### BUSINESS TASK

1) How do annual members and casual riders use Cyclistic bikes differently?
2) Why would casual riders buy Cyclistic annual memberships?
3) How can Cyclistic use digital media to influence casual riders to become members?

# Sample Size
~~~~
~~~~

The sample size *n* and margin of error *E* are given by
$$x = Z\left(\frac{c}{100}\right)^{2} \, r\,(100 - r)$$

$$n = \frac{N x}{(N - 1)E^{2} + x}$$

$$E = \sqrt{\frac{(N - n)\,x}{n\,(N - 1)}}$$
where *N* is the population size, *r* is the percentage of responses that you are interested in, *E* is the margin of error in percentage points, and *Z*(c/100) is the critical value for the confidence level *c*.

   
1) Margin of error = 5%
2) Confidence Level = 95%
3) Population Size = 2,746,388
4) Distribution = 50%

Recommended sample size is **385**.
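
The recommended sample size can be reproduced from the formulas above. The snippet below is a minimal sketch, assuming the two-sided critical value is taken from the standard normal distribution and that the result is rounded up to the next whole respondent; the function names `sample_size` and `margin_of_error` are illustrative and not part of the original notebook.

```python
import math
from statistics import NormalDist


def _x_term(confidence_pct, response_pct):
    # Two-sided critical value Z(c/100) of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence_pct / 200.0)
    # x = Z(c/100)^2 * r * (100 - r), with r expressed as a percentage.
    return z ** 2 * response_pct * (100.0 - response_pct)


def sample_size(population, margin_pct=5.0, confidence_pct=95.0, response_pct=50.0):
    """Sample size n for a finite population N (rounded up)."""
    x = _x_term(confidence_pct, response_pct)
    n = population * x / ((population - 1) * margin_pct ** 2 + x)
    return math.ceil(n)


def margin_of_error(population, n, confidence_pct=95.0, response_pct=50.0):
    """Margin of error E (in percentage points) for a sample of size n."""
    x = _x_term(confidence_pct, response_pct)
    return math.sqrt((population - n) * x / (n * (population - 1)))


# Inputs listed above: 5% margin, 95% confidence, N = 2,746,388, 50% distribution.
print(sample_size(2_746_388))            # 385
print(margin_of_error(2_746_388, 385))   # slightly below 5
```

Rounding up rather than to the nearest integer is a design choice here: it keeps the achieved margin of error at or below the requested 5%.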

In [3]:
SELECT count(member_casual) as data_population from voltaic-mantra-364014.cyclisticdb.trip_data

Unnamed: 0,data_population
0,84776


## I. Problem
~~~~
~~~~

Three questions will guide the future marketing program:

1. How do annual members and casual riders use Cyclistic bikes differently?
2. Why would casual riders buy Cyclistic annual memberships?
3. How can Cyclistic use digital media to influence casual riders to become members?
~~~~
~~~~

## II. Cleaning Process
~~~~
~~~~

In [4]:
select * from voltaic-mantra-364014.cyclisticdb.trip_data limit 10

-- View

Unnamed: 0,ride_id,rideable_type,started_at,ended_at,start_station_name,start_station_id,end_station_name,end_station_id,start_lat,start_lng,end_lat,end_lng,member_casual
0,CFB93A48F8739A87,docked_bike,2020-04-26 13:05:00,2020-04-26 13:19:00,Walsh Park,628,California Ave & Francis Pl (Temp),259,41.9146,-87.668,41.9185,-87.6974,casual
1,57350116454F657E,docked_bike,2020-04-27 13:03:00,2020-04-27 13:16:00,Walsh Park,628,California Ave & Francis Pl (Temp),259,41.9146,-87.668,41.9185,-87.6974,casual
2,F244A35DC2995411,docked_bike,2020-04-05 13:08:00,2020-04-05 13:23:00,Walsh Park,628,California Ave & Francis Pl (Temp),259,41.9146,-87.668,41.9185,-87.6974,casual
3,AF097D79FD811EC8,docked_bike,2020-04-18 13:26:00,2020-04-18 13:42:00,Walsh Park,628,California Ave & Francis Pl (Temp),259,41.9146,-87.668,41.9185,-87.6974,casual
4,E14E9E37AD877B95,docked_bike,2020-04-25 13:32:00,2020-04-25 13:46:00,Walsh Park,628,California Ave & Francis Pl (Temp),259,41.9146,-87.668,41.9185,-87.6974,casual
5,915EAAD3924C4921,docked_bike,2020-04-11 15:05:00,2020-04-11 15:23:00,Walsh Park,628,California Ave & North Ave,276,41.9146,-87.668,41.9104,-87.6972,casual
6,80549F36A51DC503,docked_bike,2020-04-11 15:05:00,2020-04-11 15:23:00,Walsh Park,628,California Ave & North Ave,276,41.9146,-87.668,41.9104,-87.6972,casual
7,8D62A5D9DBADAF5B,docked_bike,2020-04-01 07:13:00,2020-04-01 07:27:00,Walsh Park,628,Franklin St & Chicago Ave,31,41.9146,-87.668,41.8967,-87.6357,member
8,E4AC19B9C9A172F2,docked_bike,2020-04-02 07:17:00,2020-04-02 07:30:00,Walsh Park,628,Franklin St & Chicago Ave,31,41.9146,-87.668,41.8967,-87.6357,member
9,8E530B18FCBD2657,docked_bike,2020-04-03 07:19:00,2020-04-03 07:32:00,Walsh Park,628,Franklin St & Chicago Ave,31,41.9146,-87.668,41.8967,-87.6357,member


In [5]:
select * from voltaic-mantra-364014.cyclisticdb.trip_data where end_station_name is null order by ride_id limit 20

-- Null values were found in end_station_name, end_station_id, end_lat and end_lng.
-- These rides are assumed to have ended at the same station where they started.

Unnamed: 0,ride_id,rideable_type,started_at,ended_at,start_station_name,start_station_id,end_station_name,end_station_id,start_lat,start_lng,end_lat,end_lng,member_casual
0,04098BBE9BBB5F57,docked_bike,2020-04-30 15:21:00,2020-04-30 15:32:00,Wood St & Milwaukee Ave,61,,,41.9077,-87.6726,,,casual
1,056E9566355B4D4A,docked_bike,2020-04-30 18:18:00,2020-04-30 18:26:00,Rockwell St & Eastwood Ave,478,,,41.9659,-87.6936,,,member
2,0AFFE16B6EBC5FC8,docked_bike,2020-04-16 14:33:00,2020-04-16 14:34:00,McClurg Ct & Erie St,142,,,41.8945,-87.6179,,,member
3,0B5A99A240EDB88B,docked_bike,2020-04-28 10:54:00,2020-04-28 13:36:00,Clark St & Armitage Ave,94,,,41.9183,-87.6363,,,casual
4,0E058FDC8ADFFB78,docked_bike,2020-04-08 13:58:00,2020-04-08 23:38:00,Aberdeen St & Jackson Blvd,21,,,41.8777,-87.6548,,,casual
5,0F7201752882C449,docked_bike,2020-04-30 20:47:00,2020-04-30 21:19:00,Ashland Ave & Chicago Ave,350,,,41.896,-87.6677,,,member
6,0FC754ACD0AC14F0,docked_bike,2020-04-21 07:29:00,2020-04-21 07:50:00,Sedgwick St & Huron St,111,,,41.8947,-87.6384,,,member
7,12B6B1F28BD4DCD1,docked_bike,2020-04-29 16:54:00,2020-04-29 17:01:00,Halsted St & Wrightwood Ave,349,,,41.9291,-87.6491,,,member
8,180949DF96C718C8,docked_bike,2020-04-20 08:55:00,2020-04-21 09:55:00,Phillips Ave & 79th St,579,,,41.7518,-87.5652,,,casual
9,1B610D87B3AEE776,docked_bike,2020-04-08 06:04:00,2020-04-08 06:19:00,Sheffield Ave & Willow St,93,,,41.9137,-87.6529,,,member


In [6]:
select ride_id,
    rideable_type,
    started_at,
    ended_at,
    start_station_name,
    start_station_id,
    start_lat,
    start_lng,
    member_casual,

    -- fall back to the start station values when the end station values are missing
    coalesce(end_station_name, start_station_name) as end_station,
    coalesce(end_lat, start_lat) as end_latitude,
    coalesce(end_lng, start_lng) as end_longitude,
    -- flag rows that have no end station id
    CASE
        WHEN end_station_id is null THEN 'no Id'
        else null
    END as end_station_wthout_id
from voltaic-mantra-364014.cyclisticdb.trip_data where end_station_name is null limit 20

-- replacing null values. 

Unnamed: 0,ride_id,rideable_type,started_at,ended_at,start_station_name,start_station_id,start_lat,start_lng,member_casual,end_station,end_latitude,end_longitude,end_station_wthout_id
0,DBC43BAF2632F51F,docked_bike,2020-04-01 16:16:00,2020-04-01 17:19:00,Eckhart Park,86,41.8964,-87.661,casual,Eckhart Park,41.8964,-87.661,no Id
1,E29191D1830A480A,docked_bike,2020-04-17 19:50:00,2020-04-17 20:09:00,Wells St & Elm St,182,41.9032,-87.6343,member,Wells St & Elm St,41.9032,-87.6343,no Id
2,F02C1C6BEC24743D,docked_bike,2020-04-23 19:34:00,2020-04-23 19:40:00,Daley Center Plaza,81,41.8842,-87.6296,member,Daley Center Plaza,41.8842,-87.6296,no Id
3,3B2321AE918DA50F,docked_bike,2020-04-14 08:46:00,2020-04-14 12:07:00,Orleans St & Elm St,23,41.9029,-87.6377,member,Orleans St & Elm St,41.9029,-87.6377,no Id
4,97C00C77F12AF5AE,docked_bike,2020-04-10 11:54:00,2020-04-10 12:02:00,Wells St & Huron St,53,41.8947,-87.6344,member,Wells St & Huron St,41.8947,-87.6344,no Id
5,56CE9D870538EF11,docked_bike,2020-04-03 08:26:00,2020-04-03 11:41:00,Canal St & Monroe St,191,41.8817,-87.6395,member,Canal St & Monroe St,41.8817,-87.6395,no Id
6,0AFFE16B6EBC5FC8,docked_bike,2020-04-16 14:33:00,2020-04-16 14:34:00,McClurg Ct & Erie St,142,41.8945,-87.6179,member,McClurg Ct & Erie St,41.8945,-87.6179,no Id
7,FC4EA53110083A66,docked_bike,2020-04-16 13:58:00,2020-04-16 14:33:00,McClurg Ct & Erie St,142,41.8945,-87.6179,member,McClurg Ct & Erie St,41.8945,-87.6179,no Id
8,ED7750BCEEE87174,docked_bike,2020-04-09 15:33:00,2020-04-09 16:34:00,Morgan Ave & 14th Pl,137,41.8624,-87.6511,casual,Morgan Ave & 14th Pl,41.8624,-87.6511,no Id
9,A15B03B09F22555B,docked_bike,2020-04-07 10:01:00,2020-04-07 10:27:00,Morgan Ave & 14th Pl,137,41.8624,-87.6511,member,Morgan Ave & 14th Pl,41.8624,-87.6511,no Id


## III. Integrity

This dataset doesn't contain duplicates. Null values were replaced. There are no misspellings or incorrect names.
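
One way to check for duplicates is to group by `ride_id` and look for ids that occur more than once. The sketch below assumes `ride_id` is meant to be unique per trip and uses an in-memory SQLite table as a stand-in for the BigQuery table; the helper name `find_duplicate_ids` is hypothetical, and the ids are copied from the data preview in the cleaning section.

```python
import sqlite3

# Ride ids copied from the first rows of the trip_data preview.
ride_ids = [
    "CFB93A48F8739A87",
    "57350116454F657E",
    "F244A35DC2995411",
    "AF097D79FD811EC8",
    "E14E9E37AD877B95",
]


def find_duplicate_ids(ids):
    """Return (ride_id, count) pairs for ids that appear more than once."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE trip_data (ride_id TEXT)")
    conn.executemany("INSERT INTO trip_data VALUES (?)", [(i,) for i in ids])
    rows = conn.execute(
        """
        SELECT ride_id, COUNT(*) AS n
        FROM trip_data
        GROUP BY ride_id
        HAVING COUNT(*) > 1
        ORDER BY ride_id
        """
    ).fetchall()
    conn.close()
    return rows


# An empty list means no duplicated ride_id in the sample.
print(find_duplicate_ids(ride_ids))
```

The same `GROUP BY ... HAVING COUNT(*) > 1` pattern should carry over to the full table, since it only relies on standard SQL.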

~~~~
~~~~

### Select data

I will focus on the following columns, which are the most relevant to the current task:

  
+ ride_id;
+ started_at;
+ ended_at;
+ start_station_name;
+ start_station_id;
+ end_station_name;
+ start_lat;
+ start_lng;
+ end_lat;
+ end_lng;
+ member_casual.


The main questions that need to be answered to better describe customer behavior:
   
1. What is the average travel time for both customer types?
2. What is the average trip distance for both customer types?
3. Which station is used most often as a destination?
4. On which dates did casual customers take the most trips?


In [7]:
select count(member_casual) as total_members
from voltaic-mantra-364014.cyclisticdb.trip_data
where member_casual = "member"

Unnamed: 0,total_members
0,61148


In [8]:
select count(member_casual) as total_casuals
from voltaic-mantra-364014.cyclisticdb.trip_data
where member_casual = "casual"

Unnamed: 0,total_casuals
0,23628


### *Q*.1) What is the average travel time for both customer types?

In [9]:
select avg(duration_minutes) avg_time_travel_members from
(
  select started_at,ended_at,
  datetime_diff(ended_at,started_at, minute) as duration_minutes
  from voltaic-mantra-364014.cyclisticdb.trip_data
  where member_casual = "member"
) 

Unnamed: 0,avg_time_travel_members
0,21.467047


In [10]:
select avg(duration_minutes) avg_time_travel_casuals from
(
  select started_at,ended_at,
  datetime_diff(ended_at,started_at, minute) as duration_minutes
  from voltaic-mantra-364014.cyclisticdb.trip_data
  where member_casual = "casual"
) 

Unnamed: 0,avg_time_travel_casuals
0,73.070637


In [11]:
select distinct member_casual,
                start_station_name,
                end_station_name,
      datetime_diff(ended_at,started_at, minute) as duration_minutes,
       st_distance(st_geogpoint(start_lng, start_lat), st_geogpoint(end_lng,end_lat)) as distance_in_meters,
  from voltaic-mantra-364014.cyclisticdb.trip_data limit 3000

  -- Some distances are zero because the destination station was the same as the departure station.

Unnamed: 0,member_casual,start_station_name,end_station_name,duration_minutes,distance_in_meters
0,casual,Walsh Park,California Ave & Francis Pl (Temp),14,2470.976669
1,casual,Walsh Park,California Ave & Francis Pl (Temp),13,2470.976669
2,casual,Walsh Park,California Ave & Francis Pl (Temp),15,2470.976669
3,casual,Walsh Park,California Ave & Francis Pl (Temp),16,2470.976669
4,casual,Walsh Park,California Ave & North Ave,18,2460.949688
...,...,...,...,...,...
2995,member,State St & 33rd St,Clinton St & Lake St,26,5812.867766
2996,member,State St & 33rd St,Clinton St & Lake St,28,5812.867766
2997,member,State St & 33rd St,Clinton St & Lake St,27,5812.867766
2998,member,State St & 33rd St,Wabash Ave & 16th St,19,2857.714100


### Trends

* The average travel time for members was about 21 minutes.
* The average travel time for casual riders was about 73 minutes.
* Some distances are zero because the destination station was the same as the departure station.
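
An average can be pulled upward by a few very long rides, so it may help to compare the mean with the median. The sketch below is only an illustration of that check: it uses the ten rows shown in the preview of rides with a missing end station, which include a 25-hour casual ride, so its numbers are not representative of the full dataset. The helper names `duration_minutes` and `summarize` are hypothetical.

```python
from datetime import datetime
from statistics import mean, median

# (member_casual, started_at, ended_at) copied from the null end-station preview.
rides = [
    ("casual", "2020-04-30 15:21:00", "2020-04-30 15:32:00"),
    ("member", "2020-04-30 18:18:00", "2020-04-30 18:26:00"),
    ("member", "2020-04-16 14:33:00", "2020-04-16 14:34:00"),
    ("casual", "2020-04-28 10:54:00", "2020-04-28 13:36:00"),
    ("casual", "2020-04-08 13:58:00", "2020-04-08 23:38:00"),
    ("member", "2020-04-30 20:47:00", "2020-04-30 21:19:00"),
    ("member", "2020-04-21 07:29:00", "2020-04-21 07:50:00"),
    ("member", "2020-04-29 16:54:00", "2020-04-29 17:01:00"),
    ("casual", "2020-04-20 08:55:00", "2020-04-21 09:55:00"),
    ("member", "2020-04-08 06:04:00", "2020-04-08 06:19:00"),
]


def duration_minutes(started_at, ended_at):
    """Ride duration in minutes between two timestamp strings."""
    fmt = "%Y-%m-%d %H:%M:%S"
    delta = datetime.strptime(ended_at, fmt) - datetime.strptime(started_at, fmt)
    return delta.total_seconds() / 60


def summarize(rides):
    """Return {rider_type: (mean_minutes, median_minutes)}."""
    by_type = {}
    for rider, start, end in rides:
        by_type.setdefault(rider, []).append(duration_minutes(start, end))
    return {rider: (mean(d), median(d)) for rider, d in by_type.items()}


for rider, (avg, med) in summarize(rides).items():
    print(rider, "mean:", avg, "median:", med)
```

In this small sample the casual mean is well above the casual median, which is the pattern to look for before reading too much into an average.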

In [12]:
from lets_plot import * 
ggplot() + \
geom_bar(aes(x="member_casual", y="duration_minutes", color="member_casual", fill="member_casual"), data=df_13, sampling="none" if df_13.size < 50 else sampling_pick(n=50), stat="identity") + \
ggtitle("Chart")  + \
ggsize(300, 200)

### *Q*.2) What is the average trip distance for both customer types?

In [13]:
select  member_casual,start_lat,end_lat,start_lng,end_lng from voltaic-mantra-364014.cyclisticdb.trip_data limit 20

Unnamed: 0,member_casual,start_lat,end_lat,start_lng,end_lng
0,casual,41.9146,41.9185,-87.668,-87.6974
1,casual,41.9146,41.9185,-87.668,-87.6974
2,casual,41.9146,41.9185,-87.668,-87.6974
3,casual,41.9146,41.9185,-87.668,-87.6974
4,casual,41.9146,41.9185,-87.668,-87.6974
5,casual,41.9146,41.9104,-87.668,-87.6972
6,casual,41.9146,41.9104,-87.668,-87.6972
7,member,41.9146,41.8967,-87.668,-87.6357
8,member,41.9146,41.8967,-87.668,-87.6357
9,member,41.9146,41.8967,-87.668,-87.6357


In [14]:
select avg(distance_in_meters) avg_distance_members from
(
SELECT distinct member_casual,
    st_distance(st_geogpoint(start_lng, start_lat), st_geogpoint(end_lng,end_lat)) as distance_in_meters,
    from voltaic-mantra-364014.cyclisticdb.trip_data
    where member_casual = "member"
) 

Unnamed: 0,avg_distance_members
0,3015.641291


In [15]:
select avg(distance_in_meters) as avg_distance_casuals from
(
SELECT distinct member_casual,
    st_distance(st_geogpoint(start_lng, start_lat), st_geogpoint(end_lng,end_lat)) as distance_in_meters,
    from voltaic-mantra-364014.cyclisticdb.trip_data
    where member_casual ="casual"
) 

Unnamed: 0,avg_distance_casuals
0,3015.451225


### Trends

* The average distance for members was 3015.64 meters.
* The average distance for casual riders was 3015.45 meters.
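
The distances above are straight-line distances between the start and end coordinates, not the route actually ridden. As a rough cross-check, the haversine formula on a sphere gives a similar figure. This is a sketch under stated assumptions: the Earth radius constant is a commonly used mean value and may differ from the one BigQuery uses internally, and `haversine_m` is an illustrative helper rather than a reimplementation of `st_distance`.

```python
import math

# Mean Earth radius in meters (assumption; BigQuery's constant may differ slightly).
EARTH_RADIUS_M = 6_371_008.8


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters between two (lat, lng) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Walsh Park -> California Ave & Francis Pl (Temp), coordinates from the data preview.
# The notebook output for this pair is about 2471 meters.
print(round(haversine_m(41.9146, -87.668, 41.9185, -87.6974), 1))

# A ride that starts and ends at the same station has zero distance.
print(haversine_m(41.8964, -87.661, 41.8964, -87.661))
```

Because round trips collapse to zero with this measure, distance alone may understate how far casual riders actually travel.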

In [16]:
from lets_plot import * 
ggplot() + \
geom_point(aes(x="member_casual", y="distance_in_meters"), data=df_13, sampling="none" if df_13.size < 2500 else sampling_systematic(n=2500)) + \
ggtitle("Chart")  + \
ggsize(300, 200)

### *Q*.3) Which station is used most often as a destination?

In [17]:
select 
    member_casual,
    started_at,
    ended_at,
    start_station_name,
    start_station_id,
    start_lat,
    start_lng,
    

    -- fall back to the start station values when the end station values are missing
    coalesce(end_station_name, start_station_name) as end_station,
    coalesce(end_lat, start_lat) as end_latitude,
    coalesce(end_lng, start_lng) as end_longitude,
    -- flag rows that have no end station id
    CASE
        WHEN end_station_id is null THEN 'no Id'
        else null
    END as end_station_wthout_id
from voltaic-mantra-364014.cyclisticdb.trip_data where end_station_name is null limit 20

-- Data View.

Unnamed: 0,member_casual,started_at,ended_at,start_station_name,start_station_id,start_lat,start_lng,end_station,end_latitude,end_longitude,end_station_wthout_id
0,casual,2020-04-01 16:16:00,2020-04-01 17:19:00,Eckhart Park,86,41.8964,-87.661,Eckhart Park,41.8964,-87.661,no Id
1,member,2020-04-17 19:50:00,2020-04-17 20:09:00,Wells St & Elm St,182,41.9032,-87.6343,Wells St & Elm St,41.9032,-87.6343,no Id
2,member,2020-04-23 19:34:00,2020-04-23 19:40:00,Daley Center Plaza,81,41.8842,-87.6296,Daley Center Plaza,41.8842,-87.6296,no Id
3,member,2020-04-14 08:46:00,2020-04-14 12:07:00,Orleans St & Elm St,23,41.9029,-87.6377,Orleans St & Elm St,41.9029,-87.6377,no Id
4,member,2020-04-10 11:54:00,2020-04-10 12:02:00,Wells St & Huron St,53,41.8947,-87.6344,Wells St & Huron St,41.8947,-87.6344,no Id
5,member,2020-04-03 08:26:00,2020-04-03 11:41:00,Canal St & Monroe St,191,41.8817,-87.6395,Canal St & Monroe St,41.8817,-87.6395,no Id
6,member,2020-04-16 14:33:00,2020-04-16 14:34:00,McClurg Ct & Erie St,142,41.8945,-87.6179,McClurg Ct & Erie St,41.8945,-87.6179,no Id
7,member,2020-04-16 13:58:00,2020-04-16 14:33:00,McClurg Ct & Erie St,142,41.8945,-87.6179,McClurg Ct & Erie St,41.8945,-87.6179,no Id
8,casual,2020-04-09 15:33:00,2020-04-09 16:34:00,Morgan Ave & 14th Pl,137,41.8624,-87.6511,Morgan Ave & 14th Pl,41.8624,-87.6511,no Id
9,member,2020-04-07 10:01:00,2020-04-07 10:27:00,Morgan Ave & 14th Pl,137,41.8624,-87.6511,Morgan Ave & 14th Pl,41.8624,-87.6511,no Id


In [18]:
select end_station_name, count(end_station_name) as number_end_rides_station 
from voltaic-mantra-364014.cyclisticdb.trip_data 
where member_casual = "member" group by end_station_name order by number_end_rides_station desc limit 10

Unnamed: 0,end_station_name,number_end_rides_station
0,Clark St & Elm St,675
1,St. Clair St & Erie St,616
2,Dearborn St & Erie St,587
3,Broadway & Barry Ave,511
4,Desplaines St & Kinzie St,508
5,Wabash Ave & Roosevelt Rd,489
6,Larrabee St & Webster Ave,446
7,Wells St & Concord Ln,435
8,Clark St & Armitage Ave,432
9,Wabash Ave & Grand Ave,427


In [19]:
select end_station_name, count(end_station_name) as number_end_rides_station 
from voltaic-mantra-364014.cyclisticdb.trip_data 
where member_casual = "casual" group by end_station_name order by number_end_rides_station desc limit 10


Unnamed: 0,end_station_name,number_end_rides_station
0,Clark St & Elm St,218
1,Dearborn St & Erie St,198
2,Wells St & Huron St,183
3,Stockton Dr & Wrightwood Ave,180
4,Wabash Ave & Grand Ave,179
5,Indiana Ave & Roosevelt Rd,178
6,Ashland Ave & Division St,177
7,Dearborn Pkwy & Delaware Pl,176
8,Wells St & Elm St,172
9,Sheffield Ave & Waveland Ave,170


In [20]:
select end_station_name, count(end_station_name) as number_end_rides_station 
from voltaic-mantra-364014.cyclisticdb.trip_data 
group by end_station_name order by number_end_rides_station desc limit 10

Unnamed: 0,end_station_name,number_end_rides_station
0,Clark St & Elm St,893
1,Dearborn St & Erie St,785
2,St. Clair St & Erie St,695
3,Desplaines St & Kinzie St,678
4,Broadway & Barry Ave,675
5,Wabash Ave & Roosevelt Rd,643
6,Larrabee St & Webster Ave,612
7,Wabash Ave & Grand Ave,606
8,Clark St & Armitage Ave,595
9,Wells St & Concord Ln,567


### Trends

* Members ended 675 rides at Clark St & Elm St.
* Casual riders ended 218 rides at Clark St & Elm St.
* Clark St & Elm St was the most used destination for both groups (893 rides in total).

In [21]:
from lets_plot import * 
ggplot() + \
geom_bar(aes(x="end_station_name", y="number_end_rides_station", color="number_end_rides_station", fill="number_end_rides_station"), data=df_16, sampling="none" if df_16.size < 50 else sampling_pick(n=50), stat="identity") + \
ggtitle("Chart")  + \
ggsize(300, 200)

### *Q*.4) On which dates did casual customers take the most trips?

In [22]:
select member_casual, started_at, ended_at from voltaic-mantra-364014.cyclisticdb.trip_data limit 10

-- Data View.

Unnamed: 0,member_casual,started_at,ended_at
0,casual,2020-04-26 13:05:00,2020-04-26 13:19:00
1,casual,2020-04-27 13:03:00,2020-04-27 13:16:00
2,casual,2020-04-05 13:08:00,2020-04-05 13:23:00
3,casual,2020-04-18 13:26:00,2020-04-18 13:42:00
4,casual,2020-04-25 13:32:00,2020-04-25 13:46:00
5,casual,2020-04-11 15:05:00,2020-04-11 15:23:00
6,casual,2020-04-11 15:05:00,2020-04-11 15:23:00
7,member,2020-04-01 07:13:00,2020-04-01 07:27:00
8,member,2020-04-02 07:17:00,2020-04-02 07:30:00
9,member,2020-04-03 07:19:00,2020-04-03 07:32:00


In [23]:
 select distinct member_casual, started_at,
  FORMAT_DATETIME('%A', CAST(started_at AS DATETIME)) as weekday,
  count(started_at) as num_started_travels, 
  from voltaic-mantra-364014.cyclisticdb.trip_data where member_casual = "casual" group by member_casual, started_at order by num_started_travels desc

Unnamed: 0,member_casual,started_at,weekday,num_started_travels
0,casual,2020-04-26 14:43:00,Sunday,16
1,casual,2020-04-26 12:58:00,Sunday,15
2,casual,2020-04-19 13:26:00,Sunday,14
3,casual,2020-04-26 15:22:00,Sunday,14
4,casual,2020-04-07 17:36:00,Tuesday,14
...,...,...,...,...
11847,casual,2020-04-03 11:47:00,Friday,1
11848,casual,2020-04-14 16:27:00,Tuesday,1
11849,casual,2020-04-03 18:44:00,Friday,1
11850,casual,2020-04-21 08:21:00,Tuesday,1


### Trends
* Weekends and Tuesdays were the busiest days for casual customers.
* Casual customers mostly use the shared bikes on non-working days.
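
The query above groups by the full `started_at` timestamp, so each row counts rides that began in the same minute. Grouping by calendar date or by weekday is one way to answer the question more directly. The sketch below assumes that approach and uses only the casual start times visible in the data preview, so the counts are illustrative; `rides_per_weekday` and `rides_per_date` are hypothetical helper names.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# started_at values of the casual rides shown in the data preview.
casual_starts = [
    "2020-04-26 13:05:00",
    "2020-04-27 13:03:00",
    "2020-04-05 13:08:00",
    "2020-04-18 13:26:00",
    "2020-04-25 13:32:00",
    "2020-04-11 15:05:00",
    "2020-04-11 15:05:00",
]


def _parse(ts):
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")


def rides_per_weekday(starts):
    """Count rides per weekday name (Monday is index 0 in datetime.weekday())."""
    return Counter(WEEKDAYS[_parse(ts).weekday()] for ts in starts)


def rides_per_date(starts):
    """Count rides per calendar date (YYYY-MM-DD)."""
    return Counter(_parse(ts).date().isoformat() for ts in starts)


print(rides_per_weekday(casual_starts).most_common())
print(rides_per_date(casual_starts).most_common(1))
```

An explicit weekday list is used instead of locale-dependent formatting so the day names stay in English regardless of where the code runs.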

In [24]:
from lets_plot import * 
ggplot() + \
geom_point(aes(x="weekday", y="num_started_travels", color="weekday"), data=df_19, sampling="none" if df_19.size < 2500 else sampling_systematic(n=2500)) + \
ggtitle("Chart")  + \
ggsize(300, 200)

# Analysis

The main questions:

1. How do annual members and casual riders use Cyclistic bikes differently?
2. Why would casual riders buy Cyclistic annual memberships?
3. How can Cyclistic use digital media to influence casual riders to become members?

#### Descriptive:

An analysis of Cyclistic ride behavior for "casual" and "member" customers showed that the two groups behave similarly in terms of average distance, start station, and end station. However, "casual" customers have an average travel time of seventy-three minutes, about fifty-two minutes longer than the twenty-one-minute average for "member" customers. This suggests that "casual" customers use Cyclistic bike-share mainly for leisure, while "member" customers use it to commute to work.

#### Prescriptive:

Based on this analysis, the following recommendations address the business problem:

* Offer pricing advantages or promotions that encourage casual riders to become members. The two groups behave similarly, but casual riders take longer rides and appear to use the bikes more for leisure, a pattern for which a membership offers a pricing advantage over single-ride passes. Develop an app dashboard or invoice summary that shows casual riders how much they could save by choosing a membership;
  
* Build a strong e-mail marketing campaign aimed at casual riders, and run outdoor advertising at the Clark St & Elm St station, which is the most used destination among casual riders;
  
* Use digital media to promote memberships on cycling and leisure platforms that casual riders prefer; this could support the business goal effectively.