# Scenario: Analyzing Toronto Bikeshare Data

## Objective
To understand the usage patterns of the Toronto Bikeshare system and identify the most popular days for biking based on trip duration and the number of bike rides.

## Data Loading
The data is loaded from the Excel file `2016_Bike_Share_Toronto_Ridership_Q4.xlsx` into a DataFrame called `TObike`.

## Initial Analysis
1. **Display the first few rows of the DataFrame** to understand its structure.
2. **Display information about the DataFrame**, such as data types and missing values.

## Specific Analysis
1. **Determine the day of the week with the longest total trip duration**.
2. **Determine the day of the week with the highest number of bike rides**.

## Further Steps
1. **Perform data cleaning and preprocessing** if necessary (e.g., handling missing values, converting data types).
2. **Visualize the findings** using appropriate plots (e.g., bar charts, line graphs).
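
The cleaning step can be sketched on a small stand-in frame. The rows below are made up and only mimic the columns reported by `TObike.info()`. Whether to drop or label the two trips with a missing station name is a judgment call that the notebook does not settle, so both options are shown.

```python
import pandas as pd

# Toy data with the same column names as TObike; all values are invented.
sample = pd.DataFrame({
    "trip_id": [1, 2, 3],
    "trip_start_time": ["2016-10-01 00:00:00", "2016-10-01 00:05:00", "2016-10-02 08:30:00"],
    "trip_duration_seconds": [394, 533, 1557],
    "from_station_name": ["Station A", None, "Station C"],
    "to_station_name": ["Station B", "Station A", None],
    "user_type": ["Casual", "Member", "Casual"],
})

# Count missing values per column before deciding how to handle them.
missing = sample.isna().sum()
print(missing)

# Option 1: drop trips that lack a start or end station.
cleaned = sample.dropna(subset=["from_station_name", "to_station_name"])

# Option 2: keep every trip and label the unknown stations instead.
labeled = sample.fillna({"from_station_name": "Unknown", "to_station_name": "Unknown"})

# Convert data types: timestamps to datetime64, the rider type to a category.
labeled["trip_start_time"] = pd.to_datetime(labeled["trip_start_time"])
labeled["user_type"] = labeled["user_type"].astype("category")
print(labeled.dtypes)
```

For the weekday questions in this notebook, neither option changes the result much, since only the start time and the duration are used.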

### What day of the week do Torontonians bike the most? Measured by the total trip duration summed over each day of the week

In [1]:
import numpy as np
import pandas as pd
TObike = pd.read_excel('2016_Bike_Share_Toronto_Ridership_Q4.xlsx')
TObike.head(5)

Unnamed: 0,trip_id,trip_start_time,trip_stop_time,trip_duration_seconds,from_station_name,to_station_name,user_type
0,462305,2016-01-10 00:00:00,2016-01-10 00:07:00,394,Queens Quay W / Dan Leckie Way,Fort York Blvd / Garrison Rd,Casual
1,462306,2016-01-10 00:00:00,2016-01-10 00:09:00,533,Sherbourne St / Wellesley St,Edward St / Yonge St,Member
2,462307,2016-01-10 00:00:00,2016-01-10 00:07:00,383,Queens Quay W / Dan Leckie Way,Fort York Blvd / Garrison Rd,Casual
3,462308,2016-01-10 00:01:00,2016-01-10 00:27:00,1557,Cherry St / Distillery Ln,Fort York Blvd / Capreol Crt,Casual
4,462309,2016-01-10 00:01:00,2016-01-10 00:27:00,1547,Cherry St / Distillery Ln,Fort York Blvd / Capreol Crt,Casual


In [2]:
TObike.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 217569 entries, 0 to 217568
Data columns (total 7 columns):
 #   Column                 Non-Null Count   Dtype 
---  ------                 --------------   ----- 
 0   trip_id                217569 non-null  int64 
 1   trip_start_time        217569 non-null  object
 2   trip_stop_time         217569 non-null  object
 3   trip_duration_seconds  217569 non-null  int64 
 4   from_station_name      217567 non-null  object
 5   to_station_name        217567 non-null  object
 6   user_type              217569 non-null  object
dtypes: int64(2), object(5)
memory usage: 11.6+ MB


In [3]:
# trip_start_time should be a datetime64[ns] column,
# the dtype pandas uses to store date and time data.

# As the info() output above shows, it is currently object (strings), so we convert it with pd.to_datetime().

In [4]:
TObike['trip_start_time']= pd.to_datetime(TObike['trip_start_time'])

In [5]:
TObike.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 217569 entries, 0 to 217568
Data columns (total 7 columns):
 #   Column                 Non-Null Count   Dtype         
---  ------                 --------------   -----         
 0   trip_id                217569 non-null  int64         
 1   trip_start_time        217569 non-null  datetime64[ns]
 2   trip_stop_time         217569 non-null  object        
 3   trip_duration_seconds  217569 non-null  int64         
 4   from_station_name      217567 non-null  object        
 5   to_station_name        217567 non-null  object        
 6   user_type              217569 non-null  object        
dtypes: datetime64[ns](1), int64(2), object(4)
memory usage: 11.6+ MB
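
The file name refers to Q4 2016, yet the first rows display 10 January 2016. One possible explanation is that the raw text stores dates day-first; this is an assumption that the notebook does not confirm. The sketch below uses hypothetical raw strings (the real text format is not shown here) to illustrate how the two readings give different weekdays, and how a range check can catch the problem.

```python
import pandas as pd

# Hypothetical raw strings; the actual text format in the Excel file is not shown in the notebook.
raw = pd.Series(["1/10/2016 0:00", "1/10/2016 0:01"])

# Reading 1: day first, so "1/10/2016" means 1 October 2016.
day_first = pd.to_datetime(raw, format="%d/%m/%Y %H:%M")

# Reading 2: month first, so "1/10/2016" means 10 January 2016.
month_first = pd.to_datetime(raw, format="%m/%d/%Y %H:%M")

print(day_first.dt.day_name().iloc[0])
print(month_first.dt.day_name().iloc[0])

# Sanity check for a Q4 file: every start time should fall in October to December.
in_q4 = day_first.dt.month.isin([10, 11, 12]).all()
print(in_q4)
```

If a check like `in_q4` fails on the real column, the weekday labels derived from it would be shifted, so it is worth running before grouping by weekday.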


In [6]:
# We don't need to convert trip_stop_time, because the trip duration is already given.

# To find the busiest day of the week for Bikeshare Toronto, we only need the day of the week.
# Once the column is datetime64, the .dt accessor provides .dt.dayofweek (numeric, Monday=0 ... Sunday=6)
# and .dt.day_name() (the full day name; it replaced the older .dt.weekday_name attribute).

TObike['weekday']= TObike['trip_start_time'].dt.day_name()
TObike.head(5)

Unnamed: 0,trip_id,trip_start_time,trip_stop_time,trip_duration_seconds,from_station_name,to_station_name,user_type,weekday
0,462305,2016-01-10 00:00:00,2016-01-10 00:07:00,394,Queens Quay W / Dan Leckie Way,Fort York Blvd / Garrison Rd,Casual,Sunday
1,462306,2016-01-10 00:00:00,2016-01-10 00:09:00,533,Sherbourne St / Wellesley St,Edward St / Yonge St,Member,Sunday
2,462307,2016-01-10 00:00:00,2016-01-10 00:07:00,383,Queens Quay W / Dan Leckie Way,Fort York Blvd / Garrison Rd,Casual,Sunday
3,462308,2016-01-10 00:01:00,2016-01-10 00:27:00,1557,Cherry St / Distillery Ln,Fort York Blvd / Capreol Crt,Casual,Sunday
4,462309,2016-01-10 00:01:00,2016-01-10 00:27:00,1547,Cherry St / Distillery Ln,Fort York Blvd / Capreol Crt,Casual,Sunday
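
As a small illustration of the two accessors mentioned in the cell above, the sketch below applies them to three made-up timestamps. It also shows one optional way to keep weekdays in calendar order by using an ordered categorical; the names `stamps` and `week_order` are chosen here for illustration only.

```python
import pandas as pd

# Three invented timestamps: a Sunday, a Monday, and a Saturday.
stamps = pd.Series(pd.to_datetime([
    "2016-01-10 00:00:00",
    "2016-01-11 08:15:00",
    "2016-01-16 17:30:00",
]))

# Numeric day of the week: Monday=0 ... Sunday=6.
numbers = stamps.dt.dayofweek

# Full English day name.
names = stamps.dt.day_name()

# An ordered categorical keeps the days in calendar order instead of alphabetical order.
week_order = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
ordered = pd.Categorical(names, categories=week_order, ordered=True)

# sort_index() on a categorical index follows the category order given above.
counts = pd.Series(ordered).value_counts().sort_index()
print(counts)
```

Days with no trips still appear in `counts` with a value of 0, which can be convenient when plotting a full week.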


### What is the day of the week with the longest total duration of all trips?

In [7]:
TObike.groupby('weekday')['trip_duration_seconds'].sum().sort_values(ascending=False)

weekday
Sunday       25490178
Friday       24531608
Monday       24355459
Wednesday    24092014
Tuesday      23752117
Thursday     22613646
Saturday     21499246
Name: trip_duration_seconds, dtype: int64
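
A total per weekday mixes two effects: how many rides were taken and how long each one lasted. As a minimal sketch on invented data, named aggregation can return the total, the ride count, and the mean duration in one pass. The output column names `total_seconds`, `rides`, and `mean_seconds` are chosen here for illustration.

```python
import pandas as pd

# Invented trips: two long Sunday rides and three short Friday rides.
trips = pd.DataFrame({
    "weekday": ["Sunday", "Sunday", "Friday", "Friday", "Friday"],
    "trip_duration_seconds": [1500, 900, 400, 500, 600],
})

# One groupby pass that yields total duration, number of rides, and mean duration.
summary = (
    trips.groupby("weekday")["trip_duration_seconds"]
    .agg(total_seconds="sum", rides="count", mean_seconds="mean")
    .sort_values("total_seconds", ascending=False)
)
print(summary)
```

In this toy example Sunday has the larger total even though Friday has more rides, which is the same kind of split the two questions in this notebook are probing.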

### What is the day of the week with the most bike rides?

In [8]:
TObike['weekday'].value_counts()

Friday       34101
Tuesday      33763
Monday       32913
Thursday     32069
Sunday       30750
Wednesday    29759
Saturday     24214
Name: weekday, dtype: int64
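
The two results can be combined and charted. The sketch below copies the totals and counts printed above into a small frame, derives the mean trip length per weekday, and defines a hypothetical helper, `plot_weekday_summary`, that draws two bar charts. It assumes matplotlib is installed; the import sits inside the helper so the table can be built without a plotting backend.

```python
import pandas as pd

week_order = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Total trip duration in seconds per weekday, copied from the output of In [7].
totals = pd.Series({
    "Sunday": 25490178, "Friday": 24531608, "Monday": 24355459, "Wednesday": 24092014,
    "Tuesday": 23752117, "Thursday": 22613646, "Saturday": 21499246,
})

# Number of rides per weekday, copied from the output of In [8].
rides = pd.Series({
    "Friday": 34101, "Tuesday": 33763, "Monday": 32913, "Thursday": 32069,
    "Sunday": 30750, "Wednesday": 29759, "Saturday": 24214,
})

# Align both series on the weekday and put the rows in calendar order.
summary = pd.DataFrame({"total_seconds": totals, "rides": rides}).reindex(week_order)
summary["mean_seconds"] = summary["total_seconds"] / summary["rides"]
print(summary.round(1))


def plot_weekday_summary(frame):
    """Draw ride counts and total trip hours per weekday as two bar charts."""
    import matplotlib.pyplot as plt  # imported here so the table above works without matplotlib

    fig, axes = plt.subplots(1, 2, figsize=(10, 4))
    frame["rides"].plot.bar(ax=axes[0], title="Rides per weekday")
    (frame["total_seconds"] / 3600).plot.bar(ax=axes[1], title="Total trip hours per weekday")
    fig.tight_layout()
    return fig
```

Calling `plot_weekday_summary(summary)` returns a matplotlib figure. Taken at face value, the printed figures put Sunday first by total duration and Friday first by ride count, so the answer to "which day do Torontonians bike the most" depends on which measure is used.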