## 1. Problem statement

**This analysis assumes the following hypothetical scenario:**

"California accounts for 14.5% of the nation’s bicycle riders. A total of 645 people were the casualty of the accidents in San Francisco County in 2014". Bicyclists over 20 years old made up the large majority of the victims, and 74% of them were male [1]. In recent years, there has been a sharp increase in bike accidents in the Bay Area, with more fatalities in crashes involving cars, motorcycles, bicycles and pedestrians [2]. The government of the City and County of San Francisco Bay Area wants to launch a bike safety campaign with the promotional slogan "Safer Biking Together". The objective of the campaign is to raise awareness among bikers so that they follow road safety rules and accidents are reduced. As part of the campaign, electronic advertisement boards will be used across the city for part of the day. To run a targeted campaign successfully, the right strategy needs to be set so that the largest number of people can be reached.

This analysis explores various aspects of the bike sharing data and makes recommendations that will form the basis for the campaign.

## Data Source: https://s3.amazonaws.com/fordgobike-data/index.html

In [1]:
# import all packages and set plots to be embedded inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sb
import datetime as dt
import networkx as nx

In [2]:
# load in the dataset into pandas dataframes

df_bike_1 = pd.read_csv('C:\\Users\\raz37388\\Desktop\\Udacity_assignment\\Assignment 4\\201801-fordgobike-tripdata.csv')
df_bike_2 = pd.read_csv('C:\\Users\\raz37388\\Desktop\\Udacity_assignment\\Assignment 4\\201802-fordgobike-tripdata.csv')
df_bike_3 = pd.read_csv('C:\\Users\\raz37388\\Desktop\\Udacity_assignment\\Assignment 4\\201803-fordgobike-tripdata.csv')
df_bike_4 = pd.read_csv('C:\\Users\\raz37388\\Desktop\\Udacity_assignment\\Assignment 4\\201804-fordgobike-tripdata.csv')
df_bike_5 = pd.read_csv('C:\\Users\\raz37388\\Desktop\\Udacity_assignment\\Assignment 4\\201805-fordgobike-tripdata.csv')
df_bike_6 = pd.read_csv('C:\\Users\\raz37388\\Desktop\\Udacity_assignment\\Assignment 4\\201806-fordgobike-tripdata.csv')
df_bike_7 = pd.read_csv('C:\\Users\\raz37388\\Desktop\\Udacity_assignment\\Assignment 4\\201807-fordgobike-tripdata.csv')
df_bike_8 = pd.read_csv('C:\\Users\\raz37388\\Desktop\\Udacity_assignment\\Assignment 4\\201808-fordgobike-tripdata.csv')
df_bike_9 = pd.read_csv('C:\\Users\\raz37388\\Desktop\\Udacity_assignment\\Assignment 4\\201809-fordgobike-tripdata.csv')
df_bike_10 = pd.read_csv('C:\\Users\\raz37388\\Desktop\\Udacity_assignment\\Assignment 4\\201810-fordgobike-tripdata.csv')

# combining the datasets
df_2018 = pd.concat([df_bike_1,df_bike_2, df_bike_3,df_bike_4, df_bike_5, df_bike_6, df_bike_7, df_bike_8, df_bike_9, df_bike_10])

# show first three rows
df_2018.head(3)

Unnamed: 0,duration_sec,start_time,end_time,start_station_id,start_station_name,start_station_latitude,start_station_longitude,end_station_id,end_station_name,end_station_latitude,end_station_longitude,bike_id,user_type,member_birth_year,member_gender,bike_share_for_all_trip
0,75284,2018-01-31 22:52:35.2390,2018-02-01 19:47:19.8240,120.0,Mission Dolores Park,37.76142,-122.426435,285.0,Webster St at O'Farrell St,37.783521,-122.431158,2765,Subscriber,1986.0,Male,No
1,85422,2018-01-31 16:13:34.3510,2018-02-01 15:57:17.3100,15.0,San Francisco Ferry Building (Harry Bridges Pl...,37.795392,-122.394203,15.0,San Francisco Ferry Building (Harry Bridges Pl...,37.795392,-122.394203,2815,Customer,,,No
2,71576,2018-01-31 14:23:55.8890,2018-02-01 10:16:52.1160,304.0,Jackson St at 5th St,37.348759,-121.894798,296.0,5th St at Virginia St,37.325998,-121.87712,3039,Customer,1996.0,Male,No
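
The ten monthly reads above follow one file-naming pattern, so they could be collapsed into a loop. The following is a minimal sketch, assuming all files sit in one folder. The names `load_months` and `data_dir` are hypothetical, and the demo writes two tiny stand-in CSV files to a temporary folder so that the snippet runs on its own. Passing `ignore_index=True` gives the combined frame a fresh row index instead of repeating each file's own row labels.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_months(data_dir, year, months):
    """Read the monthly trip files for `year` and stack them into one frame."""
    frames = []
    for month in months:
        # File names follow the pattern used above, e.g. 201801-fordgobike-tripdata.csv
        path = Path(data_dir) / f"{year}{month:02d}-fordgobike-tripdata.csv"
        frames.append(pd.read_csv(path))
    # ignore_index=True renumbers the rows 0..n-1 across all files
    return pd.concat(frames, ignore_index=True)


# Demo with two small stand-in files (not the real trip data)
with tempfile.TemporaryDirectory() as tmp:
    for month in (1, 2):
        pd.DataFrame({"duration_sec": [100 * month, 200 * month]}).to_csv(
            Path(tmp) / f"2018{month:02d}-fordgobike-tripdata.csv", index=False
        )
    demo = load_months(tmp, 2018, range(1, 3))

print(demo)
```

With the real data, the call would look like `load_months(folder, 2018, range(1, 11))`, where `folder` is wherever the CSV files are stored.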


In [3]:
# Columns of the dataset and data type 
df_2018.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 1598223 entries, 0 to 201457
Data columns (total 16 columns):
duration_sec               1598223 non-null int64
start_time                 1598223 non-null object
end_time                   1598223 non-null object
start_station_id           1587128 non-null float64
start_station_name         1587128 non-null object
start_station_latitude     1598223 non-null float64
start_station_longitude    1598223 non-null float64
end_station_id             1587128 non-null float64
end_station_name           1587128 non-null object
end_station_latitude       1598223 non-null float64
end_station_longitude      1598223 non-null float64
bike_id                    1598223 non-null int64
user_type                  1598223 non-null object
member_birth_year          1497614 non-null float64
member_gender              1497965 non-null object
bike_share_for_all_trip    1598223 non-null object
dtypes: float64(7), int64(2), object(7)
memory usage: 207.3+ MB


**1. Note: The start_time and end_time columns are stored as strings. To analyze time, we need to convert them to datetime format.**

In [4]:
#converting the start and end time into date time format
df_2018['start_time']= pd.to_datetime(df_2018.start_time)
df_2018['end_time']= pd.to_datetime(df_2018.end_time)

# Creating the month category- will return the month of the year
df_2018['month'] = df_2018['start_time'].dt.strftime('%b')

In [5]:
# Show the months of 2018 dataset
df_2018.month.unique()

array(['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep',
       'Oct'], dtype=object)

**2. Note: The 2018 dataset has no data for November and December. To obtain a dataset covering all twelve calendar months, we will combine the 2018 data with the November and December data from 2017.**
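
The gap can also be found programmatically rather than by reading the array. This is a small sketch: `months_present` is a stand-in for `df_2018.month`, filled with the ten months reported above, and `ALL_MONTHS` is a hypothetical helper list.

```python
import pandas as pd

# Stand-in for df_2018.month: only the months reported above
months_present = pd.Series(['Jan', 'Feb', 'Mar', 'Apr', 'May',
                            'Jun', 'Jul', 'Aug', 'Sep', 'Oct'])

# Abbreviations in calendar order, matching the '%b' format used above
ALL_MONTHS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
              'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

present = set(months_present.unique())
missing = [m for m in ALL_MONTHS if m not in present]
print(missing)  # → ['Nov', 'Dec']
```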

In [6]:
#reading the 2017 data
df_2017 = pd.read_csv('C:\\Users\\raz37388\\Desktop\\Udacity_assignment\\Assignment 4\\2017-fordgobike-tripdata.csv')

In [7]:
#See first three rows
df_2017.head(3)

Unnamed: 0,duration_sec,start_time,end_time,start_station_id,start_station_name,start_station_latitude,start_station_longitude,end_station_id,end_station_name,end_station_latitude,end_station_longitude,bike_id,user_type,member_birth_year,member_gender
0,80110,2017-12-31 16:57:39.6540,2018-01-01 15:12:50.2450,74,Laguna St at Hayes St,37.776435,-122.426244,43,San Francisco Public Library (Grove St at Hyde...,37.778768,-122.415929,96,Customer,1987.0,Male
1,78800,2017-12-31 15:56:34.8420,2018-01-01 13:49:55.6170,284,Yerba Buena Center for the Arts (Howard St at ...,37.784872,-122.400876,96,Dolores St at 15th St,37.76621,-122.426614,88,Customer,1965.0,Female
2,45768,2017-12-31 22:45:48.4110,2018-01-01 11:28:36.8830,245,Downtown Berkeley BART,37.870348,-122.267764,245,Downtown Berkeley BART,37.870348,-122.267764,1094,Customer,,


In [8]:
#Columns and datatype of the dataset
df_2017.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 519700 entries, 0 to 519699
Data columns (total 15 columns):
duration_sec               519700 non-null int64
start_time                 519700 non-null object
end_time                   519700 non-null object
start_station_id           519700 non-null int64
start_station_name         519700 non-null object
start_station_latitude     519700 non-null float64
start_station_longitude    519700 non-null float64
end_station_id             519700 non-null int64
end_station_name           519700 non-null object
end_station_latitude       519700 non-null float64
end_station_longitude      519700 non-null float64
bike_id                    519700 non-null int64
user_type                  519700 non-null object
member_birth_year          453159 non-null float64
member_gender              453238 non-null object
dtypes: float64(5), int64(4), object(6)
memory usage: 59.5+ MB


**3. Note: In the 2017 dataset, the time columns are also stored as strings and need to be converted to datetime format.**

In [9]:
#converting the start and end time into date time format
df_2017['start_time']= pd.to_datetime(df_2017.start_time)
df_2017['end_time']= pd.to_datetime(df_2017.end_time)

# Creating the month category- will return the month of each 2017 trip
df_2017['month'] = df_2017['start_time'].dt.strftime('%b')

In [10]:
# See the last three rows 
df_2017.tail(3)

Unnamed: 0,duration_sec,start_time,end_time,start_station_id,start_station_name,start_station_latitude,start_station_longitude,end_station_id,end_station_name,end_station_latitude,end_station_longitude,bike_id,user_type,member_birth_year,member_gender,month
519697,424,2017-06-28 09:47:36.347,2017-06-28 09:54:41.187,21,Montgomery St BART Station (Market St at 2nd St),37.789625,-122.400811,48,2nd St at S Park St,37.782411,-122.392706,240,Subscriber,1985.0,Female,Jun
519698,366,2017-06-28 09:47:41.664,2017-06-28 09:53:47.715,58,Market St at 10th St,37.776619,-122.417385,59,S Van Ness Ave at Market St,37.774814,-122.418954,669,Subscriber,1981.0,Male,Jun
519699,188,2017-06-28 09:49:46.377,2017-06-28 09:52:55.338,25,Howard St at 2nd St,37.787522,-122.397405,48,2nd St at S Park St,37.782411,-122.392706,117,Subscriber,1984.0,Male,Jun


In [11]:
# See the number of rows and columns of the dataset
df_2017.shape

(519700, 16)

In [12]:
#Months of 2017
df_2017.month.unique()

array(['Dec', 'Nov', 'Oct', 'Sep', 'Aug', 'Jul', 'Jun'], dtype=object)

In [13]:
#Extracting Nov and Dec of 2017
df_2017_nov = df_2017.query('month == "Nov"')
df_2017_dec = df_2017.query('month == "Dec"')

**4. Note: Since we are not sure whether bike sharing contributes to accidents, and the 2017 data does not contain the bike sharing information (the bike_share_for_all_trip column), we will drop it from the 2018 data for convenience.**

In [14]:
# Getting rid of the bike sharing information as this is not something very interesting for our analysis
df_2018 = df_2018.drop(['bike_share_for_all_trip'], axis=1)

In [15]:
#combine the 2017 November and December data with the 2018 data to obtain a dataset covering all twelve months for the San Francisco Bay Area
df = pd.concat([df_2018,df_2017_nov, df_2017_dec])
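
A toy example shows why it helps to align the columns before concatenating. It is a sketch with made-up values: `demo_2018` and `demo_2017` are hypothetical stand-ins for the two frames. When one frame has a column the other lacks, `pd.concat` keeps the column and fills the missing rows with NaN.

```python
import pandas as pd

# Tiny stand-ins for the 2018 and 2017 frames (values are made up)
demo_2018 = pd.DataFrame({'duration_sec': [600], 'bike_share_for_all_trip': ['No']})
demo_2017 = pd.DataFrame({'duration_sec': [420]})

# Columns present in the 2018 frame but not in the 2017 frame
only_in_2018 = set(demo_2018.columns) - set(demo_2017.columns)
print(only_in_2018)

# Without a drop, concat keeps the extra column and fills the 2017 row with NaN
combined = pd.concat([demo_2018, demo_2017], ignore_index=True)
n_missing = combined['bike_share_for_all_trip'].isnull().sum()
print(n_missing)
```

Those NaN values matter later, because a plain `dropna()` would then discard every 2017 row.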

In [16]:
#checking if the data set contains all the 12 months
df.month.unique()

array(['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep',
       'Oct', 'Nov', 'Dec'], dtype=object)

In [17]:
#re-checking the months in the merged dataset; the bike_share column was dropped above, so the 2017 rows cannot be checked for NaN here
df.month.unique()

array(['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep',
       'Oct', 'Nov', 'Dec'], dtype=object)

In [18]:
#checking if the Nov data is in the dataframe- show first three rows
df.query("month=='Nov'").month.head(3)

83695    Nov
84835    Nov
86540    Nov
Name: month, dtype: object

In [19]:
#checking if the Dec data is in the dataframe- show first three rows
df.query("month=='Dec'").month.head(3)

0    Dec
1    Dec
2    Dec
Name: month, dtype: object

In [20]:
# Columns and data type of the merged dataset
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 1780374 entries, 0 to 86546
Data columns (total 16 columns):
duration_sec               int64
start_time                 datetime64[ns]
end_time                   datetime64[ns]
start_station_id           float64
start_station_name         object
start_station_latitude     float64
start_station_longitude    float64
end_station_id             float64
end_station_name           object
end_station_latitude       float64
end_station_longitude      float64
bike_id                    int64
user_type                  object
member_birth_year          float64
member_gender              object
month                      object
dtypes: datetime64[ns](2), float64(7), int64(2), object(5)
memory usage: 230.9+ MB


In [21]:
# structure of the dataset
df.shape

(1780374, 16)

In [22]:
#Find duplicates in the dataset
df[df.duplicated()]

Unnamed: 0,duration_sec,start_time,end_time,start_station_id,start_station_name,start_station_latitude,start_station_longitude,end_station_id,end_station_name,end_station_latitude,end_station_longitude,bike_id,user_type,member_birth_year,member_gender,month


In [23]:
#Find duplicates -should return False since there is no duplicate
df.duplicated().any()

False

In [24]:
#Finding the null values
df.isnull().sum()

duration_sec                    0
start_time                      0
end_time                        0
start_station_id            11095
start_station_name          11095
start_station_latitude          0
start_station_longitude         0
end_station_id              11095
end_station_name            11095
end_station_latitude            0
end_station_longitude           0
bike_id                         0
user_type                       0
member_birth_year          118434
member_gender              118024
month                           0
dtype: int64

In [25]:
#drop the null values
df.dropna(inplace=True)

In [26]:
#Checking if there are any remaining null values- should return 0 for every feature
df.isnull().sum()

duration_sec               0
start_time                 0
end_time                   0
start_station_id           0
start_station_name         0
start_station_latitude     0
start_station_longitude    0
end_station_id             0
end_station_name           0
end_station_latitude       0
end_station_longitude      0
bike_id                    0
user_type                  0
member_birth_year          0
member_gender              0
month                      0
dtype: int64

In [27]:
#Shape of the dataset
df.shape

(1651156, 16)

In [28]:
#Show first three rows
df.head(3)

Unnamed: 0,duration_sec,start_time,end_time,start_station_id,start_station_name,start_station_latitude,start_station_longitude,end_station_id,end_station_name,end_station_latitude,end_station_longitude,bike_id,user_type,member_birth_year,member_gender,month
0,75284,2018-01-31 22:52:35.239,2018-02-01 19:47:19.824,120.0,Mission Dolores Park,37.76142,-122.426435,285.0,Webster St at O'Farrell St,37.783521,-122.431158,2765,Subscriber,1986.0,Male,Jan
2,71576,2018-01-31 14:23:55.889,2018-02-01 10:16:52.116,304.0,Jackson St at 5th St,37.348759,-121.894798,296.0,5th St at Virginia St,37.325998,-121.87712,3039,Customer,1996.0,Male,Jan
4,39966,2018-01-31 19:52:24.667,2018-02-01 06:58:31.053,74.0,Laguna St at Hayes St,37.776435,-122.426244,19.0,Post St at Kearny St,37.788975,-122.403452,617,Subscriber,1991.0,Male,Jan


In [29]:
# Extracting Day from the start-time
df['day'] = df['start_time'].dt.strftime('%a')

# finding the year from the start date
df['year'] = df['start_time'].dt.year

#finding starting month number- will return a value from 1-12
df['str_month_no'] = df['start_time'].dt.strftime('%m').astype(int)

#extracting hour- will return a value from 0 to 23 
df['start_hour'] = df['start_time'].dt.hour.astype(int)
df['end_hour'] = df['end_time'].dt.hour.astype(int)

#convert the duration of the journey to minutes
df['duration_min'] = df['duration_sec']/60

In [30]:
#see first three rows to observe the changes
df.head(3)

Unnamed: 0,duration_sec,start_time,end_time,start_station_id,start_station_name,start_station_latitude,start_station_longitude,end_station_id,end_station_name,end_station_latitude,...,user_type,member_birth_year,member_gender,month,day,year,str_month_no,start_hour,end_hour,duration_min
0,75284,2018-01-31 22:52:35.239,2018-02-01 19:47:19.824,120.0,Mission Dolores Park,37.76142,-122.426435,285.0,Webster St at O'Farrell St,37.783521,...,Subscriber,1986.0,Male,Jan,Wed,2018,1,22,19,1254.733333
2,71576,2018-01-31 14:23:55.889,2018-02-01 10:16:52.116,304.0,Jackson St at 5th St,37.348759,-121.894798,296.0,5th St at Virginia St,37.325998,...,Customer,1996.0,Male,Jan,Wed,2018,1,14,10,1192.933333
4,39966,2018-01-31 19:52:24.667,2018-02-01 06:58:31.053,74.0,Laguna St at Hayes St,37.776435,-122.426244,19.0,Post St at Kearny St,37.788975,...,Subscriber,1991.0,Male,Jan,Wed,2018,1,19,6,666.1


In [31]:
# A function to find whether the day is weekend or not
def find_weekend (df_column):
    
    """ This function takes one value of the day column (e.g. 'Sat') as input and returns whether it is a weekend or not.
    Saturday and Sunday will be marked as 'Weekend' while all other days will be 'Weekday'"""
    
    if df_column == 'Sat':
        return 'Weekend'
    elif df_column == 'Sun':
        return 'Weekend'
    else:
        return 'Weekday'

In [32]:
# Applying the find_weekend function to the dataframe
df['day_type'] = df['day'].apply(find_weekend)
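
The same flag can be computed for the whole column at once, which is typically faster than a row-by-row `apply` on a dataset of this size. This is a sketch of an alternative, not the method used in this notebook. `start_time` is a small stand-in for `df['start_time']`.

```python
import numpy as np
import pandas as pd

# Small stand-in for df['start_time'] (a Wednesday, a Saturday and a Sunday)
start_time = pd.Series(pd.to_datetime(['2018-01-31', '2018-02-03', '2018-02-04']))

# dayofweek counts Monday as 0, so Saturday and Sunday are 5 and 6
day_type = np.where(start_time.dt.dayofweek >= 5, 'Weekend', 'Weekday')
print(day_type)
```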

In [33]:
#Calculating the rider's age in the year of the journey
df['age'] = df['year']-df['member_birth_year'].astype(int)

**5. Note:** According to source [3], the San Francisco Bay Area has four seasons in the year. Hence, the season information will now be derived from the dataset.

**Season-1.Summer**: June to August

**Season-2.Autumn**: September to November

**Season-3.Winter**: December to February

**Season-4.Spring**: March to May


In [34]:
def find_season (df_column):
    
    """ This function takes the month number column of a dataframe as an input and determine the season."""
    
    if df_column >5 and df_column <9:
        return 'Summer'
    
    elif df_column >8 and df_column <12:
        return 'Autumn'
    
    elif df_column >2 and df_column <6:
        return 'Spring'

    else:
        return 'Winter'

In [35]:
# Applying the find_season function to the dataframe
df['season']= df['str_month_no'].apply(find_season)

# Find the unique seasons- should return the four seasons
df['season'].unique()

array(['Winter', 'Spring', 'Summer', 'Autumn'], dtype=object)
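
The month-to-season rule can also be written as a lookup table, which keeps the four ranges listed above visible in one place. This is a sketch of an alternative: `SEASON_BY_MONTH` is a hypothetical name and `month_no` is a stand-in for `df['str_month_no']`.

```python
import pandas as pd

# Month number -> season, following the four seasons listed above
SEASON_BY_MONTH = {
    12: 'Winter', 1: 'Winter', 2: 'Winter',
    3: 'Spring', 4: 'Spring', 5: 'Spring',
    6: 'Summer', 7: 'Summer', 8: 'Summer',
    9: 'Autumn', 10: 'Autumn', 11: 'Autumn',
}

month_no = pd.Series([1, 4, 7, 10, 12])  # stand-in for df['str_month_no']
season = month_no.map(SEASON_BY_MONTH)
print(season.tolist())  # → ['Winter', 'Spring', 'Summer', 'Autumn', 'Winter']
```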

In [36]:
#see first three rows to observe the changes of day_type and age
df.head(3)

Unnamed: 0,duration_sec,start_time,end_time,start_station_id,start_station_name,start_station_latitude,start_station_longitude,end_station_id,end_station_name,end_station_latitude,...,month,day,year,str_month_no,start_hour,end_hour,duration_min,day_type,age,season
0,75284,2018-01-31 22:52:35.239,2018-02-01 19:47:19.824,120.0,Mission Dolores Park,37.76142,-122.426435,285.0,Webster St at O'Farrell St,37.783521,...,Jan,Wed,2018,1,22,19,1254.733333,Weekday,32,Winter
2,71576,2018-01-31 14:23:55.889,2018-02-01 10:16:52.116,304.0,Jackson St at 5th St,37.348759,-121.894798,296.0,5th St at Virginia St,37.325998,...,Jan,Wed,2018,1,14,10,1192.933333,Weekday,22,Winter
4,39966,2018-01-31 19:52:24.667,2018-02-01 06:58:31.053,74.0,Laguna St at Hayes St,37.776435,-122.426244,19.0,Post St at Kearny St,37.788975,...,Jan,Wed,2018,1,19,6,666.1,Weekday,27,Winter


**6. Note: We need to calculate the distance travelled, which can be obtained from the longitude and latitude of the stations. For that, we need to import a new library. The code to calculate the distance is taken from [4].**


In [37]:
# Note: vincenty was removed in geopy 2.0; newer versions provide geopy.distance.geodesic instead
from geopy.distance import vincenty
def distance_calc (row):
    """Calculate the distance between the starting and the ending station from their
    latitude and longitude. The result in meters is divided by 1000 to return kilometers."""
    start = (row['start_station_latitude'], row['start_station_longitude'])
    stop = (row['end_station_latitude'], row['end_station_longitude'])

    return vincenty(start, stop).meters/1000

In [38]:
#apply the function to each row of the dataframe to calculate the distance
df['distance'] = df.apply(distance_calc, axis=1)
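
Calling a Python function once per row is slow on roughly 1.65 million rows. If an approximation is acceptable, the haversine formula can be evaluated on whole columns with NumPy. This is a different technique from the ellipsoidal calculation used above: it treats the Earth as a sphere, so results can differ by a fraction of a percent. The names `haversine_km` and `EARTH_RADIUS_KM` are hypothetical, and the two trips are taken from the sample rows shown earlier.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of the spherical model


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between coordinates given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


# Two trips from the rows shown earlier: one between stations, one round trip
trips = pd.DataFrame({
    'start_station_latitude': [37.76142, 37.795392],
    'start_station_longitude': [-122.426435, -122.394203],
    'end_station_latitude': [37.783521, 37.795392],
    'end_station_longitude': [-122.431158, -122.394203],
})

trips['distance'] = haversine_km(
    trips['start_station_latitude'], trips['start_station_longitude'],
    trips['end_station_latitude'], trips['end_station_longitude'],
)
print(trips['distance'])
```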

In [39]:
#Checking rows with 0 values for distance.
df.query("distance==0").head(3)

Unnamed: 0,duration_sec,start_time,end_time,start_station_id,start_station_name,start_station_latitude,start_station_longitude,end_station_id,end_station_name,end_station_latitude,...,day,year,str_month_no,start_hour,end_hour,duration_min,day_type,age,season,distance
97,738,2018-01-31 21:48:35.704,2018-01-31 22:00:54.703,93.0,4th St at Mission Bay Blvd S,37.770407,-122.391198,93.0,4th St at Mission Bay Blvd S,37.770407,...,Wed,2018,1,21,22,12.3,Weekday,33,Winter,0.0
230,2023,2018-01-31 20:25:05.189,2018-01-31 20:58:48.830,311.0,Paseo De San Antonio at 2nd St,37.333798,-121.886943,311.0,Paseo De San Antonio at 2nd St,37.333798,...,Wed,2018,1,20,20,33.716667,Weekday,25,Winter,0.0
239,947,2018-01-31 20:39:07.038,2018-01-31 20:54:54.608,259.0,Addison St at Fourth St,37.866249,-122.299371,259.0,Addison St at Fourth St,37.866249,...,Wed,2018,1,20,20,15.783333,Weekday,59,Winter,0.0


In [40]:
#Number of rows with 0 distance values
df.query("distance==0").count().distance

39654

**7. Note: We can see there are rows with 0 distance. We will exclude these rows whenever distance is analyzed.**
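
In the sample rows above, the zero-distance trips start and end at the same station. One way to exclude them only where distance matters is to keep the full frame and work with a filtered view. This is a sketch on made-up values, and `df_with_distance` is a hypothetical name.

```python
import pandas as pd

# Stand-in frame with one round trip (distance 0) and two point-to-point trips
demo = pd.DataFrame({'distance': [0.0, 2.49, 1.10],
                     'duration_min': [12.3, 20.0, 8.5]})

# Keep the full frame for duration analysis; use the filtered view for distance
df_with_distance = demo.query('distance > 0')
print(len(demo), len(df_with_distance))  # → 3 2
```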

In [41]:
#shape of the dataset
df.shape

(1651156, 26)

In [42]:
#data type information and non-null values
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 1651156 entries, 0 to 86546
Data columns (total 26 columns):
duration_sec               1651156 non-null int64
start_time                 1651156 non-null datetime64[ns]
end_time                   1651156 non-null datetime64[ns]
start_station_id           1651156 non-null float64
start_station_name         1651156 non-null object
start_station_latitude     1651156 non-null float64
start_station_longitude    1651156 non-null float64
end_station_id             1651156 non-null float64
end_station_name           1651156 non-null object
end_station_latitude       1651156 non-null float64
end_station_longitude      1651156 non-null float64
bike_id                    1651156 non-null int64
user_type                  1651156 non-null object
member_birth_year          1651156 non-null float64
member_gender              1651156 non-null object
month                      1651156 non-null object
day                        1651156 non-null object
y

In [43]:
# Store the dataset
%store df

Stored 'df' (DataFrame)


## 2. Dataset Overview


### A. What is the structure of this dataset?

The data consists of biking information for 1.65 million trips from Nov 2017 to Oct 2018. From the start time, several time-related features have been extracted, such as the hour, the day, the month, and whether the day falls on a weekend, and the trip duration has been converted to minutes. The distance of each journey has also been calculated from the longitude and latitude. At this point, there are 26 features in the dataset, of which 8 are string values, as can be seen from the summary information.

The qualitative variables describe the start and end station names, the gender of the rider, the user type (Customer or Subscriber), and the month when the journey started. The longitude and latitude of the starting and ending stations are stored as numeric values.


### B. What is/are the main feature(s) of interest in the dataset?

I'm most interested in figuring out the main characteristics of the bikers, and the variation between user types, age groups and gender groups across various times of the year. I will investigate their riding patterns in terms of the distance and duration of each journey to find the important features.
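
Comparing age groups requires turning the numeric age into categories first. This is a minimal sketch using `pd.cut`. The bin edges and labels are hypothetical choices, not values taken from this analysis, and `ages` holds four ages from the sample rows shown earlier.

```python
import pandas as pd

ages = pd.Series([22, 27, 32, 59])  # ages taken from the sample rows above

# Hypothetical bin edges and labels; the right edge of each bin is inclusive
bins = [0, 20, 30, 40, 50, 60, 120]
labels = ['<=20', '21-30', '31-40', '41-50', '51-60', '>60']
age_group = pd.cut(ages, bins=bins, labels=labels)
print(age_group.tolist())  # → ['21-30', '21-30', '31-40', '51-60']
```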



### C. What features in the dataset do you think will help support your investigation into your feature(s) of interest?

Age, distance, duration, and the starting and ending stations are the most interesting features of the dataset for me.

To perform the analysis, I will break down the age, distance and duration data by gender (Male, Female or Other), age group, and user type (Customer or Subscriber). Finally, I will find the busiest biking routes from the starting and ending stations. Based on the analysis, I will make specific recommendations for the targeted campaign.
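
One way to find the busiest routes is to count trips per pair of start and end stations. This is a sketch on a four-row stand-in frame: the station names come from the sample rows shown earlier, but the trips themselves are made up.

```python
import pandas as pd

# Stand-in trips; station names are taken from the sample rows above
trips = pd.DataFrame({
    'start_station_name': ['Howard St at 2nd St', 'Howard St at 2nd St',
                           'Market St at 10th St', 'Howard St at 2nd St'],
    'end_station_name': ['2nd St at S Park St', '2nd St at S Park St',
                         'S Van Ness Ave at Market St', 'Market St at 10th St'],
})

# Count trips per (start, end) pair and list the most frequent pairs first
route_counts = (
    trips.groupby(['start_station_name', 'end_station_name'])
         .size()
         .sort_values(ascending=False)
)
print(route_counts.head(3))
```

Treating A to B and B to A as separate routes is a design choice. Sorting each pair before grouping would merge the two directions.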

### D. Univariate, Bivariate and Multivariate Exploration

Instead of exploring the univariate, bivariate and multivariate relationships all at once, I will analyze univariate variables such as user type, age, gender, duration, and starting and ending stations in each section of the analysis. From time to time, I will show relationships between the variables, such as the age of each gender group, the age of the users, and the distance and duration per user group on various days, to explore the bivariate and multivariate relations. In my opinion, this makes the story more coherent and interesting.


**8. Note: After this preliminary wrangling, the dataset is ready to be explored. However, we will continue to wrangle the dataset as needed.**

## Ref:

[1] Bicycle Accident Statistics, available at: https://bicycleinjurylawfirm.com/california-bicycle-accident-statistics/

[2] Bay Area’s dangerous roads: Fatal crashes up 43 percent from 2010 to 2016, available at: https://goo.gl/uWtbdk

[3] San Francisco Weather Guide, available at: https://www.studentflights.com.au/destinations/san-francisco/weather

[4] Calculate Distance Between Two Points, available at: https://stackoverflow.com/questions/44446862/calculate-distance-between-latitude-and-longitude-in-dataframe