# Load and Filter the Dataset

This task is somewhat larger: you choose a dataset to load, then filter it by a specified month and day of week. In the quiz below, you'll implement the load_data() function, which you can use directly in your project. There are four steps:

1. Load the dataset for the specified city. Index the global CITY_DATA dictionary object to get the corresponding filename for the given city name.

2. Create month and day_of_week columns. Convert the "Start Time" column to datetime with pd.to_datetime(), then extract the month number and weekday name into separate columns from the converted values.

3. Filter by month. Since the month parameter is given as the name of the month, you'll need to first convert this to the corresponding month number. Then, select rows of the dataframe that have the specified month and reassign this as the new dataframe.

4. Filter by day of week. Select rows of the dataframe that have the specified day of week and reassign this as the new dataframe. (Note: The solution below stores lowercase weekday names in the day_of_week column, so the day parameter must also be lowercase to match. Normalize it with the lower() method if needed.)
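
Steps 2 through 4 can be tried on a small table before touching the real files. The following is a minimal sketch, assuming a hypothetical in-memory `trips` DataFrame in place of a city CSV; the sample rows and the `month_number` variable are made up for illustration, and weekday names are lowercased to match the output shown later in this notebook.

```python
import pandas as pd

# Hypothetical in-memory sample standing in for a city CSV file.
trips = pd.DataFrame({
    "Start Time": [
        "2017-01-06 09:15:00",  # Friday in January
        "2017-03-03 07:55:00",  # Friday in March
        "2017-03-04 10:30:00",  # Saturday in March
        "2017-03-24 13:06:00",  # Friday in March
    ],
    "Trip Duration": [300, 113, 900, 247],
})

# Step 2: parse the timestamps, then derive the month number and weekday name.
trips["Start Time"] = pd.to_datetime(trips["Start Time"])
trips["month"] = trips["Start Time"].dt.month
trips["day_of_week"] = trips["Start Time"].dt.day_name().str.lower()

# Step 3: convert the month name to its number (january -> 1) and filter.
months = ["january", "february", "march", "april", "may", "june"]
month_number = months.index("march") + 1
trips = trips[trips["month"] == month_number]

# Step 4: keep only the rows for the requested weekday.
trips = trips[trips["day_of_week"] == "friday"]

print(trips[["Start Time", "month", "day_of_week"]])
```

Only the two March Fridays remain after both filters. Reassigning the filtered result to the same variable, as the steps describe, keeps each filter simple and lets them be applied in sequence.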

In [1]:
import pandas as pd

In [2]:
CITY_DATA = { 'chicago': 'chicago.csv',
              'new york city': 'new_york_city.csv',
              'washington': 'washington.csv' }

In [3]:
def load_data(city, month, day):
    """
    Loads data for the specified city and filters by month and day if applicable.

    Args:
        (str) city - name of the city to analyze
        (str) month - name of the month to filter by, or "all" to apply no month filter
        (str) day - name of the day of week to filter by, or "all" to apply no day filter
    Returns:
        df - pandas DataFrame containing city data filtered by month and day
    """
    
    # load data file into a dataframe
    df = pd.read_csv(CITY_DATA[city])

    # convert the Start Time column to datetime
    df['Start Time'] = pd.to_datetime(df['Start Time'])

    # extract month number and lowercase weekday name from Start Time
    df['month'] = df['Start Time'].dt.month
    df['day_of_week'] = df['Start Time'].dt.day_name().str.lower()

    # filter by month if applicable
    if month != 'all':
        # use the index of the months list to get the corresponding int
        months = ['january', 'february', 'march', 'april', 'may', 'june']
        month = months.index(month.lower()) + 1

        # filter by month to create the new dataframe
        df = df[df['month'] == month]

    # filter by day of week if applicable
    if day != 'all':
        # filter by day of week to create the new dataframe
        df = df[df['day_of_week'] == day.lower()]
    
    return df
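
load_data() raises a KeyError for a city that is not in CITY_DATA and a ValueError for a month that is not in its months list, so it can help to clean up user input before calling it. The following is a rough sketch, assuming a hypothetical `normalize_filters()` helper that is not part of the original quiz; the `MONTHS` and `DAYS` lists are also introduced here for illustration.

```python
# Same keys as the CITY_DATA dictionary defined earlier in this notebook.
CITY_DATA = {'chicago': 'chicago.csv',
             'new york city': 'new_york_city.csv',
             'washington': 'washington.csv'}

# Hypothetical lookup lists used only for validation.
MONTHS = ['january', 'february', 'march', 'april', 'may', 'june']
DAYS = ['monday', 'tuesday', 'wednesday', 'thursday',
        'friday', 'saturday', 'sunday']


def normalize_filters(city, month, day):
    """Hypothetical helper: strip, lowercase, and validate the raw inputs."""
    city, month, day = (value.strip().lower() for value in (city, month, day))
    if city not in CITY_DATA:
        raise ValueError(f"Unknown city: {city!r}")
    if month != 'all' and month not in MONTHS:
        raise ValueError(f"Unknown month: {month!r}")
    if day != 'all' and day not in DAYS:
        raise ValueError(f"Unknown day: {day!r}")
    return city, month, day


print(normalize_filters(' Chicago', 'March', 'FRIDAY'))
# prints ('chicago', 'march', 'friday')
```

The returned tuple can then be unpacked straight into load_data(), for example `load_data(*normalize_filters(' Chicago', 'March', 'FRIDAY'))`.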

In [4]:
df = load_data('chicago', 'march', 'friday')

In [5]:
df

| Unnamed: 0 | Start Time | End Time | Trip Duration | Start Station | End Station | User Type | Gender | Birth Year | month | day_of_week |
|---|---|---|---|---|---|---|---|---|---|---|
| 40 | 2017-03-24 13:06:00 | 24/03/2017 13:10 | 247 | Broadway & Berwyn Ave | Clark St & Berwyn Ave | Subscriber | Female | 1961.0 | 3 | friday |
| 59 | 2017-03-03 07:55:00 | 03/03/2017 7:57 | 113 | Clark St & Chicago Ave | Wells St & Huron St | Subscriber | Male | 1981.0 | 3 | friday |
| 68 | 2017-03-17 12:14:00 | 17/03/2017 12:22 | 468 | Dearborn Pkwy & Delaware Pl | State St & Randolph St | Subscriber | Female | 1984.0 | 3 | friday |
| 83 | 2017-03-24 14:15:00 | 24/03/2017 14:27 | 681 | Sheridan Rd & Lawrence Ave | Broadway & Thorndale Ave | Subscriber | Male | 1984.0 | 3 | friday |
| 126 | 2017-03-24 12:39:00 | 24/03/2017 12:52 | 772 | Michigan Ave & Oak St | Cannon Dr & Fullerton Ave | Subscriber | Male | 1993.0 | 3 | friday |
| 224 | 2017-03-31 19:11:00 | 31/03/2017 19:18 | 461 | Damen Ave & Cortland St | Damen Ave & Pierce Ave | Subscriber | Male | 1989.0 | 3 | friday |
| 290 | 2017-03-24 10:55:00 | 24/03/2017 11:01 | 334 | Wacker Dr & Washington St | LaSalle St & Jackson Blvd | Subscriber | Male | 1961.0 | 3 | friday |
| 343 | 2017-03-17 17:51:00 | 17/03/2017 18:00 | 525 | Milwaukee Ave & Grand Ave | State St & Pearson St | Subscriber | Male | 1989.0 | 3 | friday |
| 348 | 2017-03-31 07:47:00 | 31/03/2017 7:55 | 504 | Canal St & Madison St | Wabash Ave & Adams St | Subscriber | Male | 1953.0 | 3 | friday |
