# CSMAD21 - Applied Data Science with Python - Practical 3

## Pandas

Follow the instructions to complete each of these tasks. This set of exercises focuses on working with Python's Pandas library.

The relevant materials for these exercises is lectures 5 and 6 (Pandas).

This is not assessed but will help you gain practical experience for the exam and coursework.

You will need to download some of the csv data set files from the module Blackboard page and place them in the same folder as this notebook. Run the cell below to load all of the necessary Python modules.

**Questions marked with a * are extra challenging**

In [3]:
import pandas as pd
import requests
import numpy as np
from pandas import json_normalize

## 1. Diamonds example data

1.1. Read in the diamonds csv file to a pandas data frame. Use pandas to find how many diamonds have carat greater than 3.5.

In [4]:
diamonds = pd.read_csv('Datasets/diamonds.csv')
diamonds.loc[diamonds['carat']> 3.5]

Unnamed: 0,carat,cut,color,clarity,depth,table,price,x,y,z
23644,3.65,Fair,H,I1,67.1,53.0,11668,9.53,9.48,6.38
25998,4.01,Premium,I,I1,61.0,61.0,15223,10.14,10.1,6.17
25999,4.01,Premium,J,I1,62.5,62.0,15223,10.02,9.94,6.24
26444,4.0,Very Good,I,I1,63.3,58.0,15984,10.01,9.94,6.31
26534,3.67,Premium,I,I1,62.4,56.0,16193,9.86,9.81,6.13
27130,4.13,Fair,H,I1,64.8,61.0,17329,10.0,9.85,6.43
27415,5.01,Fair,J,I1,65.5,59.0,18018,10.74,10.54,6.98
27630,4.5,Fair,J,I1,65.8,58.0,18531,10.23,10.16,6.72
27679,3.51,Premium,J,VS2,62.5,59.0,18701,9.66,9.63,6.03


<hr style="border:2px solid black"> </hr>

1.2. Create a series of the price of all of the diamonds that have carat greater than 3.5.

In [5]:
diamonds.loc[diamonds['carat']> 3.5, 'price']

23644    11668
25998    15223
25999    15223
26444    15984
26534    16193
27130    17329
27415    18018
27630    18531
27679    18701
Name: price, dtype: int64

<hr style="border:2px solid black"> </hr>

1.3. For ideal cut diamonds whose price is greater than 10000, find the number of diamonds having each clarity.

In [6]:
diamonds.loc[diamonds['price']> 10000].groupby(diamonds['clarity'])['carat'].count()
diamonds.loc[(diamonds['price']> 10000)&(diamonds['cut'] == 'Ideal')].groupby(diamonds['clarity'])['carat'].count()


clarity
I1        8
IF       80
SI1     344
SI2     318
VS1     280
VS2     351
VVS1    132
VVS2    257
Name: carat, dtype: int64

<hr style="border:2px solid black"> </hr>

1.4 Write a Pandas program to create a new 'price/carat' Series (use bracket notation to define the Series name) of the diamonds DataFrame and fill series with the price per carat of all diamonds.

In [91]:
diamonds['price/carat'] = round(diamonds['price'] / diamonds['carat'], 2) 

<hr style="border:2px solid black"> </hr>

## 2. Vancouver street trees data

2.1. Load the Vancouver street trees data provided on Blackboard. What is the most common genus of tree?

In [8]:
vancouver = pd.read_csv('Datasets/StreetTrees_MountPleasant.csv')
vancouver.head()
vancouver['GENUS_NAME'].value_counts()
vancouver['GENUS_NAME'].value_counts().head()

ACER        1621
PRUNUS      1251
PYRUS        387
TILIA        299
CARPINUS     297
Name: GENUS_NAME, dtype: int64

<hr style="border:2px solid black"> </hr>

2.2. Find the mean diameter of trees with height range ID 9.

In [9]:
vancouver.loc[vancouver['HEIGHT_RANGE_ID'] == 9, 'DIAMETER'].mean()

30.183333333333334

<hr style="border:2px solid black"> </hr>

2.3. Get a Series of HEIGHT_RANGE_ID's for only the trees with a diameters greater than the average diameter of all trees, and drop any NaN values.

In [89]:
vancouver['HEIGHT_RANGE_ID'].where((vancouver['DIAMETER'] < vancouver['DIAMETER'].mean())).dropna()

0       2.0
1       1.0
2       1.0
3       1.0
4       1.0
       ... 
6169    1.0
6170    1.0
6171    1.0
6172    1.0
6174    1.0
Name: HEIGHT_RANGE_ID, Length: 3633, dtype: float64

<hr style="border:2px solid black"> </hr>

2.4. Produce a pandas data frame giving the maximum and minimum height range id on each street.

In [30]:
#diamonds.groupby(['cut','clarity'])[['price', 'carat']].mean()
vancouver.groupby(['STD_STREET'])[['HEIGHT_RANGE_ID']].max()
vancouver.groupby(['STD_STREET'])[['HEIGHT_RANGE_ID']].min()
vancouver.groupby('STD_STREET').agg({'HEIGHT_RANGE_ID': ['min', 'max']}).head()

Unnamed: 0_level_0,HEIGHT_RANGE_ID,HEIGHT_RANGE_ID
Unnamed: 0_level_1,min,max
STD_STREET,Unnamed: 1_level_2,Unnamed: 2_level_2
ALBERTA ST,0,8
ATHLETES WAY,2,4
BRUNSWICK ST,1,6
CAMBIE ST,1,7
CAROLINA ST,1,6



<hr style="border:2px solid black"> </hr>

## 3. Iris flower example data

3.1. Load the iris.csv flower data. Add two extra columns to the data frame giving the ratio of sepal length over width and petal length over width.



In [100]:
iris = pd.read_csv('Datasets/iris.csv')
iris['Sepal.Ratio']=iris['Sepal.Length']/iris['Sepal.Width']
iris['Petal.Ratio']=iris['Petal.Length']/iris['Petal.Width']
iris.head()

Unnamed: 0,Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Species,Sepal.Ratio,Petal.Ratio
1,5.1,3.5,1.4,0.2,setosa,1.457143,7.0
2,4.9,3.0,1.4,0.2,setosa,1.633333,7.0
3,4.7,3.2,1.3,0.2,setosa,1.46875,6.5
4,4.6,3.1,1.5,0.2,setosa,1.483871,7.5
5,5.0,3.6,1.4,0.2,setosa,1.388889,7.0


<hr style="border:2px solid black"> </hr>

3.2. Calculate the mean of the ratio between sepal length and width for each species.

In [101]:
iris['Sepal.Ratio'].mean(), iris['Petal.Ratio'].mean()

(1.953680870534934, 4.310499757025204)

<hr style="border:2px solid black"> </hr>


3.3. Perform a data discovery on the dataset.
- How many classes are?
- What is the distribution of the classes?
- What are the characteristic of the data in general/per class?
You can use methods like unique and describe.


In [102]:
iris.head()
iris.Species.unique()

array(['setosa', 'versicolor', 'virginica'], dtype=object)

In [103]:
iris.Species.value_counts()

virginica     50
setosa        50
versicolor    50
Name: Species, dtype: int64

In [104]:
iris.describe()

Unnamed: 0,Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Sepal.Ratio,Petal.Ratio
count,150.0,150.0,150.0,150.0,150.0,150.0
mean,5.843333,3.057333,3.758,1.199333,1.953681,4.3105
std,0.828066,0.435866,1.765298,0.762238,0.40048,2.489648
min,4.3,2.0,1.0,0.1,1.268293,2.125
25%,5.1,2.8,1.6,0.3,1.546188,2.802381
50%,5.8,3.0,4.35,1.3,2.032292,3.3
75%,6.4,3.3,5.1,1.8,2.22491,4.666667
max,7.9,4.4,6.9,2.5,2.961538,15.0


In [105]:
iris.loc[iris['Species'] == 'setosa'].describe()

Unnamed: 0,Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Sepal.Ratio,Petal.Ratio
count,50.0,50.0,50.0,50.0,50.0,50.0
mean,5.006,3.428,1.462,0.246,1.470188,6.908
std,0.35249,0.379064,0.173664,0.105386,0.11875,2.854545
min,4.3,2.3,1.0,0.1,1.268293,2.666667
25%,4.8,3.2,1.4,0.2,1.385684,4.6875
50%,5.0,3.4,1.5,0.2,1.463063,7.0
75%,5.2,3.675,1.575,0.3,1.541444,7.5
max,5.8,4.4,1.9,0.6,1.956522,15.0


In [106]:
iris.loc[iris['Species'] == 'virginica'].describe()

Unnamed: 0,Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Sepal.Ratio,Petal.Ratio
count,50.0,50.0,50.0,50.0,50.0,50.0
mean,6.588,2.974,5.552,2.026,2.230453,2.780662
std,0.63588,0.322497,0.551895,0.27465,0.246992,0.407367
min,4.9,2.2,4.5,1.4,1.823529,2.125
25%,6.225,2.8,5.1,1.8,2.031771,2.511364
50%,6.5,3.0,5.55,2.0,2.16954,2.666667
75%,6.9,3.175,5.875,2.3,2.342949,3.055556
max,7.9,3.8,6.9,2.5,2.961538,4.0


In [107]:
iris.loc[iris['Species'] == 'versicolor'].describe()

Unnamed: 0,Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Sepal.Ratio,Petal.Ratio
count,50.0,50.0,50.0,50.0,50.0,50.0
mean,5.936,2.77,4.26,1.326,2.160402,3.242837
std,0.516171,0.313798,0.469911,0.197753,0.228658,0.312456
min,4.9,2.0,3.0,1.0,1.764706,2.666667
25%,5.6,2.525,4.0,1.2,2.033929,3.016667
50%,5.9,2.8,4.35,1.3,2.16129,3.240385
75%,6.3,3.0,4.6,1.5,2.232692,3.417582
max,7.0,3.4,5.1,1.8,2.818182,4.1


<hr style="border:2px solid black"> </hr>


3.4. For flowers whos petal ratio is greater than 3cm, find the mean sepal ratio for each species.

You can use methods you have already used in the previoius tasks to complete this task.

In [118]:
iris.loc[iris['Petal.Ratio']> 3].groupby(['Species'])['Sepal.Ratio'].mean()

Species
setosa        1.470188
versicolor    2.160402
virginica     2.230453
Name: Sepal.Ratio, dtype: float64

<hr style="border:2px solid black"> </hr>

## 4 Philadelphia bike share live data

4.1 Complete the code below to load a JSON live feed for a Philadelphia bike share program into a pandas data frame. It may help to look at the JSON data in a visual inspector. One way of doing this is to open the url given in Firefox. Once you have loaded the data, look at the head of the data frame, and list all of the columns.

(I've had to add in some header data to the request, as the server rejects all requests without a user agent string)

*You can use the pandas function json_normalize that was imported at the start of the notebook, but you need to pass it a suitable part of the JSON data.* 

*The indego_bikes_data object returned by requests.get() can be converted to a Python data structure using the json() method*

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.json_normalize.html

In [119]:
indego_bikes_url = ("https://www.rideindego.com/stations/json/")
headers = {'User-Agent': "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:63.0) Gecko/20100101 Firefox/63.0"}
indego_bikes_data = requests.get(indego_bikes_url,headers=headers)


In [120]:
##To display the data you can use the following line
indego_bikes_data.json()


{'features': [{'geometry': {'coordinates': [-75.16374, 39.95378],
    'type': 'Point'},
   'properties': {'id': 3004,
    'name': 'Municipal Services Building Plaza',
    'coordinates': [-75.16374, 39.95378],
    'totalDocks': 30,
    'docksAvailable': 29,
    'bikesAvailable': 1,
    'classicBikesAvailable': 0,
    'smartBikesAvailable': 0,
    'electricBikesAvailable': 1,
    'rewardBikesAvailable': 1,
    'rewardDocksAvailable': 29,
    'kioskStatus': 'FullService',
    'kioskPublicStatus': 'Active',
    'kioskConnectionStatus': 'Active',
    'kioskType': 1,
    'addressStreet': '1401 John F. Kennedy Blvd.',
    'addressCity': 'Philadelphia',
    'addressState': 'PA',
    'addressZipCode': '19102',
    'bikes': [{'dockNumber': 25,
      'isElectric': True,
      'isAvailable': True,
      'battery': 0.8}],
    'closeTime': None,
    'eventEnd': None,
    'eventStart': None,
    'isEventBased': False,
    'isVirtual': False,
    'kioskId': 3004,
    'notes': None,
    'openTime': Non

In [123]:
bikes = json_normalize(indego_bikes_data.json()['features'])
bikes.head()

Unnamed: 0,type,geometry.coordinates,geometry.type,properties.id,properties.name,properties.coordinates,properties.totalDocks,properties.docksAvailable,properties.bikesAvailable,properties.classicBikesAvailable,...,properties.isEventBased,properties.isVirtual,properties.kioskId,properties.notes,properties.openTime,properties.publicText,properties.timeZone,properties.trikesAvailable,properties.latitude,properties.longitude
0,Feature,"[-75.16374, 39.95378]",Point,3004,Municipal Services Building Plaza,"[-75.16374, 39.95378]",30,29,1,0,...,False,False,3004,,,,,0,39.95378,-75.16374
1,Feature,"[-75.14403, 39.94733]",Point,3005,"Welcome Park, NPS","[-75.14403, 39.94733]",13,9,4,2,...,False,False,3005,,,,,0,39.94733,-75.14403
2,Feature,"[-75.20311, 39.9522]",Point,3006,40th & Spruce,"[-75.20311, 39.9522]",17,6,10,5,...,False,False,3006,,,,,0,39.9522,-75.20311
3,Feature,"[-75.15993, 39.94517]",Point,3007,"11th & Pine, Kahn Park","[-75.15993, 39.94517]",20,15,4,4,...,False,False,3007,,,,,0,39.94517,-75.15993
4,Feature,"[-75.15114, 39.97944]",Point,3008,Temple University Station,"[-75.15114, 39.97944]",19,14,5,2,...,False,False,3008,,,,,0,39.97944,-75.15114


<hr style="border:2px solid black"> </hr>

4.2. Is there any street with more than one bike station?

In [124]:
bikes.groupby("properties.addressStreet").count()['properties.bikesAvailable'].max()

1

<hr style="border:2px solid black"> </hr>

4.3. Use pandas to count the total number of available docks in each zip code, producing a Series of zipcodes and available dock counts.

*You can use the pandas method sum() on a grouby object to add the all values in a particular group*.

In [125]:
bikes.groupby("properties.addressZipCode")['properties.docksAvailable'].sum()
bikes.groupby("properties.addressZipCode")[['properties.docksAvailable']].sum()

Unnamed: 0_level_0,properties.docksAvailable
properties.addressZipCode,Unnamed: 1_level_1
19102,64
19103,175
19104,269
19106,129
19107,79
19112,60
19121,115
19122,170
19123,95
19125,36


<hr style="border:2px solid black"> </hr> 

__4.4*__. Using pandas, find the difference between the minimum and maximum number of available bikes at docks within each zip code.

In [126]:
##Option 1
bikes.groupby("properties.addressZipCode").agg({'properties.rewardDocksAvailable': ['min', 'max', np.ptp]})
bikes.groupby("properties.addressZipCode")['properties.rewardDocksAvailable'].agg(['min', 'max', np.ptp])

##Option 2
min_max_df = bikes.groupby("properties.addressZipCode")['properties.rewardDocksAvailable'].agg(['min', 'max'])
min_max_df['diff'] = min_max_df['max']-min_max_df['min']
min_max_df

Unnamed: 0_level_0,min,max,diff
properties.addressZipCode,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
19102,1,29,28
19103,0,24,24
19104,0,32,32
19106,1,23,22
19107,1,15,14
19112,12,19,7
19121,12,18,6
19122,5,19,14
19123,5,15,10
19125,6,25,19


__4.5*__. Write Python code using pandas to determine the postal code with the highest median of docks available.

In [127]:
bikes.groupby("properties.addressZipCode")['properties.rewardDocksAvailable'].agg(['median']).sort_values('median',ascending=False).head(3)

Unnamed: 0_level_0,median
properties.addressZipCode,Unnamed: 1_level_1
19130,16.0
19149,16.0
19131,15.0


<hr style="border:2px solid black"> </hr> 

## 5 Bikes Dataset

Load the bikes.csv file into a pandas data frame. Using the DataFrame method isnull(), you can produce a DataFrame where each value is either True if the value is  missing, or False if it is present.


5.1. Produce a count of the number of missing values in each column in the DataFrame.

*Tip - the sum() method treats True and False values as 1 and 0 respectively https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sum.html*

In [87]:
bike = pd.read_csv('Datasets/bikes.csv')
bike.isnull().sum()

Rental ID                    0
Start Time                   0
End Time                     0
Bike ID                      0
Duration in Seconds          0
Start Station ID             0
Start Station Name           0
End Station ID               0
End Station Name             0
User Type                    0
Member Gender           185554
Member Birthday Year    180953
dtype: int64

<hr style="border:2px solid black"> </hr> 

5.2. Think of a sensible way of removing the missing values and use this to create a new DataFrame with no missing values. You can use the copy() method to duplicate a DataFrame before modifying it.

In [88]:
bike2 = bike.dropna().copy()
bike2.isnull().sum()

Rental ID               0
Start Time              0
End Time                0
Bike ID                 0
Duration in Seconds     0
Start Station ID        0
Start Station Name      0
End Station ID          0
End Station Name        0
User Type               0
Member Gender           0
Member Birthday Year    0
dtype: int64

<hr style="border:2px solid black"> </hr> 

5.3. Use the describe() method to calculate statistics of the columns in the data frame. Is there anything strange about the values in a column?

In [89]:
bike2.describe()

Unnamed: 0,Rental ID,Bike ID,Duration in Seconds,Start Station ID,End Station ID,Member Birthday Year
count,922608.0,922608.0,922608.0,922608.0,922608.0,922608.0
mean,22818210.0,3410.908376,981.1843,204.408119,205.009409,1983.954518
std,374795.6,1912.395388,7969.447,155.093432,154.434312,10.788886
min,22178530.0,1.0,61.0,2.0,2.0,1759.0
25%,22492250.0,1748.0,388.0,81.0,81.0,1979.0
50%,22809840.0,3491.0,646.0,176.0,176.0,1987.0
75%,23142550.0,5100.0,1097.0,291.0,293.0,1992.0
max,23479390.0,6471.0,4439590.0,664.0,664.0,2014.0


<hr style="border:2px solid black"> </hr> 

5.5. Convert the Start Time and End Time columns to pandas datetime objects. You can do this using the pd.to_datetime method on those columns.

Create a new column in the data frame that gives the day of the week the journey was started on. You can extract the day of the week from a datetime object using the .dayofweek attribue, and use the apply method of a column in a DataFrame to apply a function to each value in the column.

In [95]:
bike2['Start Time'] = pd.to_datetime(bike2['Start Time'])
bike2['End Time'] = pd.to_datetime(bike2['End Time'])
bike2.head()

Unnamed: 0,Rental ID,Start Time,End Time,Bike ID,Duration in Seconds,Start Station ID,Start Station Name,End Station ID,End Station Name,User Type,Member Gender,Member Birthday Year,Age
0,22178529,2019-04-01 00:02:22,2019-04-01 00:09:48,6251,446.0,81,Daley Center Plaza,56,Desplaines St & Kinzie St,Subscriber,Male,1975.0,45.0
1,22178530,2019-04-01 00:03:02,2019-04-01 00:20:30,6226,1048.0,317,Wood St & Taylor St,59,Wabash Ave & Roosevelt Rd,Subscriber,Female,1984.0,36.0
3,22178531,2019-04-01 00:11:07,2019-04-01 00:15:19,5649,252.0,283,LaSalle St & Jackson Blvd,174,Canal St & Madison St,Subscriber,Male,1990.0,30.0
4,22178532,2019-04-01 00:13:01,2019-04-01 00:18:58,4151,357.0,26,McClurg Ct & Illinois St,133,Kingsbury St & Kinzie St,Subscriber,Male,1993.0,27.0
5,22178533,2019-04-01 00:19:26,2019-04-01 00:36:13,3270,1007.0,202,Halsted St & 18th St,129,Blue Island Ave & 18th St,Subscriber,Male,1992.0,28.0


<hr style="border:2px solid black"> </hr> 

5.6. Create a new column in the data frame giving the (approximate) age in years of the user for each journey. 

In [96]:
bike2['Age'] = 2020 - bike2['Member Birthday Year']
bike2.head()

Unnamed: 0,Rental ID,Start Time,End Time,Bike ID,Duration in Seconds,Start Station ID,Start Station Name,End Station ID,End Station Name,User Type,Member Gender,Member Birthday Year,Age
0,22178529,2019-04-01 00:02:22,2019-04-01 00:09:48,6251,446.0,81,Daley Center Plaza,56,Desplaines St & Kinzie St,Subscriber,Male,1975.0,45.0
1,22178530,2019-04-01 00:03:02,2019-04-01 00:20:30,6226,1048.0,317,Wood St & Taylor St,59,Wabash Ave & Roosevelt Rd,Subscriber,Female,1984.0,36.0
3,22178531,2019-04-01 00:11:07,2019-04-01 00:15:19,5649,252.0,283,LaSalle St & Jackson Blvd,174,Canal St & Madison St,Subscriber,Male,1990.0,30.0
4,22178532,2019-04-01 00:13:01,2019-04-01 00:18:58,4151,357.0,26,McClurg Ct & Illinois St,133,Kingsbury St & Kinzie St,Subscriber,Male,1993.0,27.0
5,22178533,2019-04-01 00:19:26,2019-04-01 00:36:13,3270,1007.0,202,Halsted St & 18th St,129,Blue Island Ave & 18th St,Subscriber,Male,1992.0,28.0


<hr style="border:2px solid black"> </hr> 

5.7. Investigate the numbers of journeys starting at each hour of the day. You can use the hour attribute of a pandas datetime object to extract the hour of the day from the starting times.

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.dt.hour.html

In [110]:
bike2['Start Hour'] = bike2['Start Time'].dt.hour
bike2.groupby(bike2['Start Hour'])['Rental ID'].count().sort_values(ascending=False)

Start Hour
17    123528
16     91683
8      81987
18     80356
7      63627
15     52860
19     52258
12     46294
13     44693
14     43109
9      41003
11     40290
10     33052
20     32004
6      28306
21     22527
22     15080
23      9173
5       8993
0       4829
1       2673
4       1668
2       1537
3       1078
Name: Rental ID, dtype: int64

<hr style="border:2px solid black"> </hr> 

__5.8*__. Use the pandas cut() function to create a new column in the data frame that assigns an age range of the user for each journey. Use this new column to visualise the relationship between age group and duration of journeys.

You can find documentation on the cut function here - https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.cut.html

In [113]:
bins = [0, 2, 18, 35, 65, np.inf]
names = ['<2', '2-18', '18-35', '35-65', '65+']

bike2['AgeRange'] = pd.cut(bike2['Age'], bins, labels=names)
bike2

Unnamed: 0,Rental ID,Start Time,End Time,Bike ID,Duration in Seconds,Start Station ID,Start Station Name,End Station ID,End Station Name,User Type,Member Gender,Member Birthday Year,Age,Start Hour,AgeRange
0,22178529,2019-04-01 00:02:22,2019-04-01 00:09:48,6251,446.0,81,Daley Center Plaza,56,Desplaines St & Kinzie St,Subscriber,Male,1975.0,45.0,0,35-65
1,22178530,2019-04-01 00:03:02,2019-04-01 00:20:30,6226,1048.0,317,Wood St & Taylor St,59,Wabash Ave & Roosevelt Rd,Subscriber,Female,1984.0,36.0,0,35-65
3,22178531,2019-04-01 00:11:07,2019-04-01 00:15:19,5649,252.0,283,LaSalle St & Jackson Blvd,174,Canal St & Madison St,Subscriber,Male,1990.0,30.0,0,18-35
4,22178532,2019-04-01 00:13:01,2019-04-01 00:18:58,4151,357.0,26,McClurg Ct & Illinois St,133,Kingsbury St & Kinzie St,Subscriber,Male,1993.0,27.0,0,18-35
5,22178533,2019-04-01 00:19:26,2019-04-01 00:36:13,3270,1007.0,202,Halsted St & 18th St,129,Blue Island Ave & 18th St,Subscriber,Male,1992.0,28.0,0,18-35
6,22178534,2019-04-01 00:19:39,2019-04-01 00:23:56,3123,257.0,420,Ellis Ave & 55th St,426,Ellis Ave & 60th St,Subscriber,Male,1999.0,21.0,0,18-35
7,22178535,2019-04-01 00:26:33,2019-04-01 00:35:41,6418,548.0,503,Drake Ave & Fullerton Ave,500,Central Park Ave & Elbridge Ave,Subscriber,Male,1969.0,51.0,0,35-65
8,22178536,2019-04-01 00:29:48,2019-04-01 00:36:11,4513,383.0,260,Kedzie Ave & Milwaukee Ave,499,Kosciuszko Park,Subscriber,Male,1991.0,29.0,0,18-35
10,22178539,2019-04-01 00:36:20,2019-04-01 00:41:17,4666,297.0,304,Broadway & Waveland Ave,232,Pine Grove Ave & Waveland Ave,Subscriber,Male,1975.0,45.0,0,35-65
11,22178540,2019-04-01 00:58:38,2019-04-01 01:04:43,3735,365.0,37,Dearborn St & Adams St,337,Clark St & Chicago Ave,Subscriber,Male,1991.0,29.0,0,18-35
