# US Bikeshare Data
Over the past decade, bicycle-sharing systems have been growing in number and popularity in cities across the world. Bicycle-sharing systems allow users to rent bicycles on a very short-term basis for a price. This allows people to borrow a bike from point A and return it at point B, though they can also return it to the same location if they'd like to just go for a ride. Regardless, each bike can serve several users per day.

In this project, I will use data provided by Motivate, a bike share system provider for many major cities in the United States, to uncover bike share usage patterns. I will compare the system usage between three large cities: Chicago, New York City, and Washington, DC.

In [30]:
#Import libraries
import numpy as np
import pandas as pd
import time

I will compute the following statistics from the data:
1. Popular times of travel
[Start Time]
   - Most common month
   - Most common day of week
   - Most common hour of day 
   
   
2. Popular stations and trip
[Start Time / End Time / Trip Duration / Start Station / End Station]
   - Most common start station
   - Most common end station
   - Most common trip from start to end (i.e., most frequent combination of start station and end station)
   
   
3. Trip duration
[Start Time / End Time / Trip Duration]
   - Total travel time
   - Average travel time
   
   
4. User info
[User Type / Gender / Birth Year]
   - Counts of each user type
   - Counts of each gender (only available for NYC and Chicago)
   - Earliest, most recent, and most common year of birth (only available for NYC and Chicago)
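
As a minimal sketch of how the "most common" statistics above can be computed, the cell below builds a tiny in-memory table and takes the mode of each time component. The three sample rows and the timestamp format are assumptions made for illustration; they are not taken from the real CSV files.

```python
import pandas as pd

# Hypothetical sample rows (assumed format), not real trip data.
sample = pd.DataFrame({
    "Start Time": [
        "2017-01-02 09:07:57",
        "2017-01-09 09:15:00",
        "2017-03-05 14:30:00",
    ]
})
start = pd.to_datetime(sample["Start Time"])

# mode() returns a Series because ties are possible; [0] takes the first value.
most_common_month = start.dt.month.mode()[0]
most_common_day = start.dt.day_name().mode()[0]
most_common_hour = start.dt.hour.mode()[0]

print(most_common_month, most_common_day, most_common_hour)
```

Note that when two values tie, `mode()` returns both in sorted order, so `[0]` picks the smallest one rather than signalling the tie.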

In [31]:
# Data of cities
CITY_DATA = { 'chicago': 'chicago.csv',
              'new york city': 'new_york_city.csv',
              'washington': 'washington.csv' }

In [32]:
def get_filters():

    print("Hello! Let's explore some US bikeshare data!")
    # TO DO: get user input for city (chicago, new york city, washington).
    while True:
        city = input("Please enter a city from the list ( chicago / new york city / washington ) ").lower()
        if city in CITY_DATA:
            break
        else:
            print("Enter the correct city from the list ( You can copy and paste it )")

    # TO DO: get user input for month (all, january, february, ... , june)
    while True:
        month = input("Please enter a month from the list ( jan / feb / mar / apr / may / jun / all ) ").lower()
        months = ['jan', 'feb', 'mar', 'apr', 'may', 'jun', 'all']
        if month in months:
            break
        else:
            print("Enter the correct month from the list ( You can copy and paste it )")

    # TO DO: get user input for day of week (all, monday, tuesday, ... sunday)
    while True:
        day = input("Please enter a day from the list ( saturday / sunday / monday / tuesday / wednesday / thursday / friday / all ) ").lower()
        days = ['saturday', 'sunday', 'monday', 'tuesday', 'wednesday', 'thursday', 'friday', 'all']
        if day in days:
            break
        else:
            print("Enter the correct day from the list ( You can copy and paste it )")
    print('-'*30)
    return city, month, day
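
The three input loops in `get_filters` follow the same pattern, so they could be folded into one helper. The sketch below is one possible way to do that; `prompt_choice` and its `read` parameter are hypothetical names introduced here, and the demo uses scripted replies instead of real keyboard input so the cell runs unattended.

```python
def prompt_choice(prompt, choices, read=input):
    """Keep asking until the reply (stripped, lower-cased) is one of choices."""
    while True:
        reply = read(prompt).strip().lower()
        if reply in choices:
            return reply
        print("Please enter one of:", " / ".join(choices))

# Demo with scripted replies: the first is rejected, the second is accepted.
scripted = iter(["Boston", " Chicago "])
city = prompt_choice(
    "City? ",
    ["chicago", "new york city", "washington"],
    read=lambda _prompt: next(scripted),
)
print(city)
```

Passing the reader in as a parameter keeps the default interactive behaviour while making the validation logic easy to test.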

In [33]:
#Loading data of City, Month and Day of trip
def load_data(city, month, day):
    
    # load data file into a dataframe
    df = pd.read_csv(CITY_DATA[city])

    # convert the Start Time column to datetime
    df['Start Time'] = pd.to_datetime(df['Start Time'])

    # extract month and day of week from Start Time to create new columns
    df['month'] = df['Start Time'].dt.month
    df['day_of_week'] = df['Start Time'].dt.day_name()
    df['hour'] = df['Start Time'].dt.hour

    # filter by month if applicable
    if month != 'all':
        # use the index of the months list to get the corresponding int
        months = ['jan', 'feb', 'mar', 'apr', 'may', 'jun']
        month = months.index(month) + 1
        
        # filter by month to create the new dataframe
        df = df[df['month'] == month]

    # filter by day of week if applicable
    if day != 'all':
        
        # filter by day of week to create the new dataframe
        df = df[df['day_of_week'] == day.title()]
    return df
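
To check the filtering logic without reading a CSV file, the same month and day filters can be applied to a small in-memory table. This is only a sketch: `filter_trips` and the three sample rows are assumptions for illustration, and the function builds a boolean mask instead of adding helper columns as `load_data` does.

```python
import pandas as pd

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun"]

def filter_trips(df, month="all", day="all"):
    """Return the rows of df that match the month abbreviation and day name."""
    start = pd.to_datetime(df["Start Time"])
    mask = pd.Series(True, index=df.index)
    if month != "all":
        # Position in MONTHS plus one gives the calendar month number.
        mask &= start.dt.month == MONTHS.index(month) + 1
    if day != "all":
        # day_name() returns capitalised names such as "Sunday".
        mask &= start.dt.day_name() == day.title()
    return df[mask]

# Hypothetical sample rows: a Sunday in March, a Monday in March, a Sunday in April.
trips = pd.DataFrame({
    "Start Time": ["2017-03-05 14:30:00", "2017-03-06 08:00:00", "2017-04-02 10:00:00"],
})
march_sundays = filter_trips(trips, month="mar", day="sunday")
print(march_sundays)
```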

In [34]:
def time_stats(df):

    print('\nCalculating The Most Frequent Times of Travel...\n')
    start_time = time.time()

    # TO DO: display the most common month
    common_month = df['month'].mode()[0]
    months = ['January', 'February', 'March', 'April', 'May', 'June']
    print("Most common month: ", months[common_month-1])

    # TO DO: display the most common day of week
    common_day = df['day_of_week'].mode()[0]
    print("Most common day of week: ",common_day)

    # TO DO: display the most common start hour
    common_hour = df['hour'].mode()[0]
    print("Most Frequent Start Hour: ", common_hour)

    print("\nThis took %s sec." % (time.time() - start_time))
    print('-'*30)

In [35]:
def station_stats(df):
    
    print('\nCalculating The Most Popular Stations and Trip...\n')
    start_time = time.time()

    # TO DO: display most commonly used start station
    MostCommonlyUsedStart = df["Start Station"].mode()[0]
    print("Most commonly used start station: ",MostCommonlyUsedStart)

    # TO DO: display most commonly used end station
    MostCommonlyUsedEnd = df["End Station"].mode()[0]
    print("Most commonly used end station: ",MostCommonlyUsedEnd)

    # TO DO: display most frequent combination of start station and end station trip
    df["Start to End Station"] = "( " + df["Start Station"] + " )" + " to " + "( " + df["End Station"] + " )"
    MostCombinationTrip = df["Start to End Station"].mode()[0]
    print("Most frequent combination of start station and end station trip: ",MostCombinationTrip)
    
    print("\nThis took %s seconds." % (time.time() - start_time))
    print('-'*30)
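
An alternative way to find the most frequent trip, sketched below, is to count (start, end) pairs with `groupby` rather than concatenating the two station names into a new column. The station names here are made up for illustration.

```python
import pandas as pd

# Hypothetical stations, not taken from the real data.
trips = pd.DataFrame({
    "Start Station": ["A St", "A St", "B Ave", "A St"],
    "End Station": ["B Ave", "B Ave", "A St", "C Rd"],
})

# size() counts the rows in each (start, end) group.
pair_counts = trips.groupby(["Start Station", "End Station"]).size()

# idxmax() on a two-level index returns the (start, end) tuple of the largest count.
top_start, top_end = pair_counts.idxmax()
top_count = pair_counts.max()

print(f"( {top_start} ) to ( {top_end} ): {top_count} trips")
```

This leaves the original DataFrame unchanged, so no extra column shows up later when raw rows are displayed.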


In [36]:
def trip_duration_stats(df):

    print('\nCalculating Trip Duration...\n')
    start_time = time.time()
    
    #Conversion of seconds
    secondsDay = 60*60*24
    secondsHour = 60*60
    secondsMinute = 60
    
    # TO DO: display total travel time
    TotalTravelTime = df['Trip Duration'].sum()
    days = TotalTravelTime // secondsDay
    hours = (TotalTravelTime- (days * secondsDay)) // secondsHour
    minutes = (TotalTravelTime - (days * secondsDay) - (hours * secondsHour)) // secondsMinute
    seconds = TotalTravelTime - (days * secondsDay) - (hours * secondsHour) - (minutes * secondsMinute)
    
    print("Total Travel Time: \n%.2f days\n%.2f hours\n%.2f minutes\n%.2f seconds\n"%(days, hours, minutes, seconds))

    # TO DO: display mean travel time
    AverageTravelTime = df['Trip Duration'].mean()
    days = AverageTravelTime // secondsDay
    hours = (AverageTravelTime- (days * secondsDay)) // secondsHour
    minutes = (AverageTravelTime - (days * secondsDay) - (hours * secondsHour)) // secondsMinute
    seconds = AverageTravelTime - (days * secondsDay) - (hours * secondsHour) - (minutes * secondsMinute)
    
    print("Average Travel Time: \n%.2f days\n%.2f hours\n%.2f minutes\n%.2f seconds\n"%(days, hours, minutes, seconds))

    print("\nThis took %s seconds." % (time.time() - start_time))
    print('-'*30)
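
The repeated subtraction used to split a duration into days, hours, minutes and seconds can also be written with the built-in `divmod`. The helper below is a sketch under the assumption that whole seconds are enough; `split_duration` is a hypothetical name, and any fractional part (as in an average) is truncated.

```python
def split_duration(total_seconds):
    """Split a number of seconds into (days, hours, minutes, seconds)."""
    total_seconds = int(total_seconds)  # drops any fractional seconds
    minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    days, hours = divmod(hours, 24)
    return days, hours, minutes, seconds

# 1 day + 2 hours + 3 minutes + 4 seconds = 93784 seconds
days, hours, minutes, seconds = split_duration(93784)
print(f"{days} days, {hours} hours, {minutes} minutes, {seconds} seconds")
```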

In [37]:
def user_stats(df):

    print('\nCalculating User Stats...\n')
    start_time = time.time()

    # TO DO: Display counts of user types
    UserTypes = df["User Type"].value_counts()
    print("Counts of user types:\n",UserTypes)
    
    # TO DO: Display counts of gender
    if 'Gender' in (df.columns):   
        Gender = df["Gender"].value_counts()
        print("\nCounts of gender:\n",Gender)

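    # TO DO: Display earliest, most recent, and most common year of birth
    if 'Birth Year' in (df.columns):
        # Drop missing values so the results can be converted to int
        BirthYear = df["Birth Year"].dropna()
        if not BirthYear.empty:
            print("\nEarliest Birth Year:\n", int(BirthYear.min()))
            print("\nMost recent Birth Year:\n", int(BirthYear.max()))
            print("\nMost common Birth Year:\n", int(BirthYear.mode()[0]))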

    print("\nThis took %s seconds." % (time.time() - start_time))
    print('-'*30)


In [38]:
def display_data(df):
    # TO DO: Display 5 rows of data
    ViewData = input('\nWould you like to view 5 rows of individual trip data? Enter yes or no\n').lower()
    start_loc = 0
    # Keep asking until the answer is either yes or no
    while ViewData not in ("yes", "no"):
        ViewData = input('\nPlease Enter yes or no\n').lower()
    # Show 5 rows at a time until the user declines or the data runs out
    while ViewData == "yes" and start_loc < len(df):
        print(df.iloc[start_loc:(start_loc+5)])
        start_loc += 5
        ViewData = input("Do you wish to continue?: ").lower()

In [39]:
def main():
    while True:
        city, month, day = get_filters()
        df = load_data(city, month, day)

        time_stats(df)
        station_stats(df)
        trip_duration_stats(df)
        user_stats(df)
        display_data(df)

        restart = input('\nWould you like to try again? Enter yes or no.\n')
        if restart.lower() != 'yes':
            break


if __name__ == "__main__":
    main()

Hello! Let's explore some US bikeshare data!
Please enter a city form the list ( chicago / new york city / washington ) washington
Please enter a month form the list ( jan / feb / mar / apr / may / jun / all ) mar
Please enter a day form the list ( saturday / sunday / monday / tuesday / wednesday / thursday / friday / all ) sunday
------------------------------

Calculating The Most Frequent Times of Travel...

Most common month:  March
Most common day of week:  Sunday
Most Frequent Start Hour:  14

This took 0.009994029998779297 sec.
------------------------------

Calculating The Most Popular Stations and Trip...

Most commonly used start station:  Jefferson Dr & 14th St SW
Most commonly used end station:  Jefferson Dr & 14th St SW
Most frequent combination of start station and end station trip:  ( Jefferson Dr & 14th St SW ) to ( Jefferson Dr & 14th St SW )

This took 0.02000117301940918 seconds.
------------------------------

Calculating Trip Duration...

Total Travel Time: 
49.00