# <h1><center>Austin MetroBike Data Analysis</center></h1>

## Introduction
The City of Austin provides public access to Austin MetroBike data. Austin MetroBike is a service that lets people rent a bike from one kiosk and ride it to another kiosk in the city. The data is available through two APIs: the first describes Austin MetroBike trips, and the second describes Austin MetroBike kiosks.
<br>
<br>
Let's explore the dataset to learn what a new user can expect when renting an Austin MetroBike.

Trip data source: https://data.austintexas.gov/Transportation-and-Mobility/Austin-MetroBike-Trips/tyfh-5r8s
<br>
Kiosk data source: https://data.austintexas.gov/Transportation-and-Mobility/Austin-MetroBike-Kiosk-Locations/qd73-bsdg

## Table of Contents:
* [Extracting Data](#first-bullet)
* [Cleaning Data](#second-bullet)
* [Visualization and Analysis](#third-bullet)

## Extracting Data from the API <a class="anchor" id="first-bullet"></a>

In [1]:
# Importing the library needed to pull data from each API
import requests

In [31]:
# Requesting data from the Austin MetroBike Trips and Austin MetroBike Kiosks APIs
austin_metrobike_trips = requests.get("https://data.austintexas.gov/resource/tyfh-5r8s.json")
print(austin_metrobike_trips.status_code)
print(austin_metrobike_trips.json())

austin_metrobike_kiosks = requests.get("https://data.austintexas.gov/resource/qd73-bsdg.json")

200
[{'trip_id': '9900285854', 'membership_type': 'Annual (San Antonio B-cycle)', 'bicycle_id': '207', 'checkout_date': '2014-10-26T00:00:00.000', 'checkout_time': '13:12:00', 'checkout_kiosk_id': '2537', 'checkout_kiosk': 'West & 6th St.', 'return_kiosk_id': '2707', 'return_kiosk': 'Rainey St @ Cummings', 'trip_duration_minutes': '76', 'month': '10', 'year': '2014'}, {'trip_id': '9900285855', 'membership_type': '24-Hour Kiosk (Austin B-cycle)', 'bicycle_id': '969', 'checkout_date': '2014-10-26T00:00:00.000', 'checkout_time': '13:12:00', 'checkout_kiosk_id': '2498', 'checkout_kiosk': 'Convention Center / 4th St. @ MetroRail', 'return_kiosk_id': '2566', 'return_kiosk': 'Pfluger Bridge @ W 2nd Street', 'trip_duration_minutes': '58', 'month': '10', 'year': '2014'}, {'trip_id': '9900285856', 'membership_type': 'Annual Membership (Austin B-cycle)', 'bicycle_id': '214', 'checkout_date': '2014-10-26T00:00:00.000', 'checkout_time': '13:12:00', 'checkout_kiosk_id': '2537', 'checkout_kiosk': 'We
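A single request like the one above returns only one page of rows. Socrata endpoints accept `$limit` and `$offset` query parameters for paging, so a larger sample can be collected by requesting successive pages. The sketch below is one way to do that; `fetch_all_rows` and `fetch_trips_page` are hypothetical helper names, and the page size of 1000 is an assumption rather than something this notebook sets.

```python
def fetch_all_rows(fetch_page, page_size=1000, max_rows=None):
    """Collect rows page by page, advancing $offset until a short page arrives."""
    rows = []
    offset = 0
    while True:
        page = fetch_page({"$limit": page_size, "$offset": offset})
        rows.extend(page)
        # A page shorter than page_size means there is nothing left to fetch.
        if len(page) < page_size:
            break
        # Stop early once the optional row cap has been reached.
        if max_rows is not None and len(rows) >= max_rows:
            break
        offset += page_size
    return rows if max_rows is None else rows[:max_rows]


def fetch_trips_page(params):
    """Fetch one page of trips (hypothetical helper; needs network access)."""
    # Imported here so the paging helper above can be used without requests.
    import requests

    response = requests.get(
        "https://data.austintexas.gov/resource/tyfh-5r8s.json",
        params=params,
        timeout=30,
    )
    response.raise_for_status()
    return response.json()
```

For example, `fetch_all_rows(fetch_trips_page, max_rows=5000)` would return a list of up to 5000 trip dictionaries that can be passed to `pd.DataFrame` in the same way as the single response.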

In [45]:
import pandas as pd

# Converting the Austin MetroBike Trips data from json into a Pandas dataframe
df_trips = pd.DataFrame.from_dict(austin_metrobike_trips.json())

df_trips.head(n=3)

Unnamed: 0,trip_id,membership_type,bicycle_id,checkout_date,checkout_time,checkout_kiosk_id,checkout_kiosk,return_kiosk_id,return_kiosk,trip_duration_minutes,month,year
0,9900285854,Annual (San Antonio B-cycle),207,2014-10-26T00:00:00.000,13:12:00,2537,West & 6th St.,2707,Rainey St @ Cummings,76,10,2014
1,9900285855,24-Hour Kiosk (Austin B-cycle),969,2014-10-26T00:00:00.000,13:12:00,2498,Convention Center / 4th St. @ MetroRail,2566,Pfluger Bridge @ W 2nd Street,58,10,2014
2,9900285856,Annual Membership (Austin B-cycle),214,2014-10-26T00:00:00.000,13:12:00,2537,West & 6th St.,2496,8th & Congress,8,10,2014


In [61]:
# Converting the Austin MetroBike Kiosks data from json into a Pandas dataframe
df_kiosks = pd.DataFrame.from_dict(austin_metrobike_kiosks.json())

# Parsing latitude and longitude from the kiosk dataframe location column into separate columns
kiosk_latitude = []
kiosk_longitude = []
for a in df_kiosks['location']:
    kiosk_latitude.append(a['latitude'])
    kiosk_longitude.append(a['longitude'])
    
df_kiosks['latitude'] = kiosk_latitude
df_kiosks['longitude'] = kiosk_longitude

df_kiosks.head(n=3)

Unnamed: 0,kiosk_id,kiosk_name,kiosk_status,location,address,city_asset_num,property_type,number_of_docks,power_type,footprint_length_feet,footprint_width_feet,council_district,modified_date,:@computed_region_a3it_2a2z,:@computed_region_8spj_utxs,notes,alt_name,latitude,longitude
0,2823,Capital Metro HQ - East 5th at Broadway,active,"{'latitude': '30.2563', 'longitude': '-97.71007'}",2910 E 5th St,16684,undetermined_parking,13,solar,40,5,3,2021-01-04T00:00:00.000,2857,3,,,30.2563,-97.71007
1,3291,11th & San Jacinto,active,"{'latitude': '30.27193', 'longitude': '-97.738...",310 E. 11th St.,32503,sidewalk,11,solar,35,5,1,2021-01-04T00:00:00.000,2856,1,,,30.27193,-97.73854
2,3292,East 4th & Chicon,active,"{'latitude': '30.25987', 'longitude': '-97.723...",1819 East 4th St.,32516,undetermined_parking,9,solar,30,5,3,2021-01-04T00:00:00.000,2857,3,,,30.25987,-97.72373
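The loop above assumes every kiosk has a `location` dictionary and leaves the coordinates as strings. As an alternative sketch, pandas can pull a key out of each dictionary with `.str.get` and convert the result to numbers with `pd.to_numeric`. The small `kiosks` frame below is a toy stand-in for `df_kiosks`, and its third row (a kiosk with no location) is a made-up case included only to show the missing-value behaviour.

```python
import pandas as pd

# Toy stand-in for df_kiosks; the real frame comes from the kiosk API response.
kiosks = pd.DataFrame({
    "kiosk_id": ["2823", "3291", "9999"],
    "location": [
        {"latitude": "30.2563", "longitude": "-97.71007"},
        {"latitude": "30.27193", "longitude": "-97.73854"},
        None,  # hypothetical kiosk with no location recorded
    ],
})

# .str.get looks up a key in each dictionary and yields NaN for missing entries.
kiosks["latitude"] = pd.to_numeric(kiosks["location"].str.get("latitude"))
kiosks["longitude"] = pd.to_numeric(kiosks["location"].str.get("longitude"))
```

Storing the coordinates as floats does not change the later distance step, which already wraps each value in `float(...)` before testing for NaN.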


## Cleaning Data <a class="anchor" id="second-bullet"></a>


In [None]:
# Merging the latitude and longitude data into the trips dataframe based on kiosk ID
df_trips_locs = df_trips.merge(df_kiosks, how='left', left_on='checkout_kiosk_id', right_on='kiosk_id')
df_trips_locs = df_trips_locs.rename(columns={"latitude": "checkout_kiosk_latitude", "longitude": "checkout_kiosk_longitude"})

df_trips_locs = df_trips_locs.merge(df_kiosks, how='left', left_on='return_kiosk_id', right_on='kiosk_id')
df_trips_locs = df_trips_locs.rename(columns={"latitude": "return_kiosk_latitude", "longitude": "return_kiosk_longitude"})
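A left merge keeps every trip, so trips whose kiosk ID has no match in the kiosk table end up with missing coordinates. One way to see how many trips are affected is to pass `indicator=True` to `merge`, and `validate="many_to_one"` guards against duplicate kiosk IDs silently multiplying rows. The sketch below uses two toy frames in place of `df_trips` and `df_kiosks`; the third trip, with no kiosk ID, is an illustrative case.

```python
import pandas as pd

# Toy stand-ins for df_trips and df_kiosks.
trips = pd.DataFrame({
    "trip_id": ["1", "2", "3"],
    "checkout_kiosk_id": ["2537", "2498", None],
})
kiosks = pd.DataFrame({
    "kiosk_id": ["2537", "2498"],
    "latitude": ["30.27041", "30.264327"],
    "longitude": ["-97.75046", "-97.736446"],
})

merged = trips.merge(
    kiosks,
    how="left",
    left_on="checkout_kiosk_id",
    right_on="kiosk_id",
    validate="many_to_one",  # raise if a kiosk_id appears more than once
    indicator=True,          # add a _merge column describing each row's match
)

# Rows marked "left_only" are trips with no matching kiosk.
unmatched = int((merged["_merge"] == "left_only").sum())
print("Trips without a matching kiosk:", unmatched)
```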

In [109]:
# Count the NaNs left in the checkout latitude column after the merge
def nan_checker(obj):
    print("Number of NaNs:", obj.isna().sum())
nan_checker(df_trips_locs['checkout_kiosk_latitude'])
type(df_trips_locs['checkout_kiosk_latitude'][3])

Number of NaNs: 75


float

In [117]:
# Using geopy to estimate the straight-line (geodesic) distance between the checkout and return kiosks
import geopy.distance
import math

checkout_lat = df_trips_locs['checkout_kiosk_latitude']
checkout_long = df_trips_locs['checkout_kiosk_longitude']
return_lat = df_trips_locs['return_kiosk_latitude']
return_long = df_trips_locs['return_kiosk_longitude']

dist = []
for (a,b,c,d) in zip(checkout_lat,checkout_long,return_lat,return_long):
    # If there are nans in the location data, return a nan into the distance data
    if math.isnan(float(a)) or math.isnan(float(b)) or math.isnan(float(c)) or math.isnan(float(d)):
        dist.append(float("nan"))
    # Use the geopy distance module to calculate the distance between two kiosks
    else:
        dist.append(geopy.distance.geodesic([a,b],[c,d]).miles)

# Assign calculated distance values to a dataframe column
df_trips_locs['distance'] = dist
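The value computed here is the straight-line distance between the two kiosks, so it is a lower bound on how far the rider actually travelled. As a sanity check that does not depend on geopy, the haversine formula gives a similar estimate by treating the Earth as a sphere. The sketch below is an independent cross-check, not the method used in this notebook; `haversine_miles` is a hypothetical helper and the radius constant is an approximation.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles; returns NaN if any coordinate is NaN."""
    lat1, lon1, lat2, lon2 = (float(v) for v in (lat1, lon1, lat2, lon2))
    if any(math.isnan(v) for v in (lat1, lon1, lat2, lon2)):
        return float("nan")
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# Coordinates of the first trip: West & 6th St. to Rainey St @ Cummings.
print(haversine_miles("30.27041", "-97.75046", "30.255906", "-97.739949"))
```

At distances this short the spherical estimate should land close to the geodesic one, so a large disagreement between the two would point to swapped or malformed coordinates.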


In [123]:
from geopy.geocoders import Nominatim

# Initialize the Nominatim geocoder
geolocator = Nominatim(user_agent="geoapiExercises")

check_zip = []
return_zip = []
for (a,b,c,d) in zip(checkout_lat,checkout_long,return_lat,return_long):
    # If there are nans in the location data, return a nan into the zipcode data
    if math.isnan(float(a)) or math.isnan(float(b)) or math.isnan(float(c)) or math.isnan(float(d)):
        check_zip.append(float("nan"))
        return_zip.append(float("nan"))
    # Reverse-geocode each kiosk; this stores the full address result rather than only the zipcode
    else:
        check_zip.append(geolocator.reverse([a,b]))
        return_zip.append(geolocator.reverse([c,d]))

# Assign the reverse-geocoded results to dataframe columns
df_trips_locs['checkout_zip'] = check_zip
df_trips_locs['return_zip'] = return_zip
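The `checkout_zip` and `return_zip` columns hold whole reverse-geocoding results, not bare zipcodes. The sketch below shows one way to reduce a result to its postcode and to look up each distinct coordinate pair only once, since many trips share the same kiosks. It assumes the result object exposes a `raw` dictionary with an `address` entry containing a `postcode` key, which is how Nominatim results usually look through geopy but is not confirmed by this notebook. `postcode_from_location` and `zips_for_coords` are hypothetical helpers, and the stand-in result with postcode `"78701"` is made up for the demonstration.

```python
from types import SimpleNamespace


def postcode_from_location(location):
    """Return the postcode from a reverse-geocoding result, or NaN if absent."""
    if location is None:  # the geocoder found nothing for these coordinates
        return float("nan")
    address = location.raw.get("address", {})
    return address.get("postcode", float("nan"))


def zips_for_coords(coords, reverse):
    """Reverse-geocode each distinct (lat, lon) tuple once and map it to a postcode."""
    cache = {}
    for pair in coords:
        if pair not in cache:
            cache[pair] = postcode_from_location(reverse(pair))
    return [cache[pair] for pair in coords]


# Stand-in for geolocator.reverse so the example runs without network access.
def fake_reverse(pair):
    return SimpleNamespace(raw={"address": {"postcode": "78701"}})


coords = [("30.27041", "-97.75046"), ("30.2698", "-97.74186"), ("30.27041", "-97.75046")]
print(zips_for_coords(coords, fake_reverse))
```

With the real geocoder, `reverse` would be `geolocator.reverse`. Looking up each kiosk once instead of each trip also keeps the number of requests to the public Nominatim service small.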

In [119]:
df_trips_locs.to_csv('Austin Metrobike Data.csv', index=False, header=False)

In [124]:
df_trips_locs

Unnamed: 0,trip_id,membership_type,bicycle_id,checkout_date,checkout_time,checkout_kiosk_id,checkout_kiosk,return_kiosk_id,return_kiosk,trip_duration_minutes,...,modified_date_y,:@computed_region_a3it_2a2z_y,:@computed_region_8spj_utxs_y,notes_y,alt_name_y,return_kiosk_latitude,return_kiosk_longitude,distance,checkout_zip,return_zip
0,9900285854,Annual (San Antonio B-cycle),207,2014-10-26T00:00:00.000,13:12:00,2537,West & 6th St.,2707,Rainey St @ Cummings,76,...,2022-03-04T10:38:00.000,2856,9,parkland at ROW/easement,,30.255906,-97.739949,1.180333,"(603, West Avenue, West Sixth, Austin, Travis ...","(Rainey/Cummings, Lady Bird Lake Hike and Bike..."
1,9900285855,24-Hour Kiosk (Austin B-cycle),969,2014-10-26T00:00:00.000,13:12:00,2498,Convention Center / 4th St. @ MetroRail,2566,Pfluger Bridge @ W 2nd Street,58,...,2021-01-04T00:00:00.000,2858,9,adjacent to parkland,,30.26717,-97.75484,1.117139,"(4th/Sabine, East 4th Street, Rainey Street Hi...","(Austin MetroBike Station, Electric Drive, Sea..."
2,9900285856,Annual Membership (Austin B-cycle),214,2014-10-26T00:00:00.000,13:12:00,2537,West & 6th St.,2496,8th & Congress,8,...,2021-01-04T00:00:00.000,2856,9,double sided,,30.2698,-97.74186,0.515915,"(603, West Avenue, West Sixth, Austin, Travis ...","(Chipotle, East 8th Street, Downtown, Austin, ..."
3,9900285857,24-Hour Kiosk (Austin B-cycle),745,2014-10-26T00:00:00.000,13:12:00,,Zilker Park at Barton Springs & William Barton...,,Zilker Park at Barton Springs & William Barton...,28,...,,,,,,,,,,
4,9900285858,24-Hour Kiosk (Austin B-cycle),164,2014-10-26T00:00:00.000,13:12:00,2538,Bullock Museum @ Congress & MLK,,Convention Center/ 3rd & Trinity,15,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
995,9900286829,Semester Membership (Austin B-cycle),866,2014-10-28T00:00:00.000,10:12:00,2496,8th & Congress,2575,Riverside @ S. Lamar,14,...,2021-01-04T00:00:00.000,2859,5,parkland at ROW/easement,,30.26446,-97.75665,0.957783,"(Chipotle, East 8th Street, Downtown, Austin, ...","(1300, West Riverside Drive, Seaholm, Austin, ..."
996,9900286830,Annual Membership (Austin B-cycle),895,2014-10-28T00:00:00.000,11:12:00,2552,3rd & West,2495,4th & Congress,6,...,2021-01-04T00:00:00.000,2856,9,,,30.26634,-97.74378,0.495237,"(Austin MetroBike Station, West 3rd Street, Se...","(J&S Coppel Building, West 4th Street, Warehou..."
997,9900286831,Annual Membership (Austin B-cycle),329,2014-10-28T00:00:00.000,11:12:00,2499,City Hall / Lavaca & 2nd,2501,5th & Bowie,4,...,2021-01-04T00:00:00.000,2856,9,,,30.2696,-97.75332,0.513875,"(Austin City Hall, Lavaca Street, Warehouse Di...","(Austin B-cycle Station, Bowie Street, Seaholm..."
998,9900286832,24-Hour Kiosk (Austin B-cycle),78,2014-10-28T00:00:00.000,11:12:00,2570,South Congress & Academy,2569,East 11th St. & San Marcos,41,...,2021-01-04T00:00:00.000,2857,1,,,30.26968,-97.73074,1.603984,"(South Congress Avenue, South River City, Aust...","(East 11th/San Marcos, East 11th Street, Centr..."


## Visualization and Analysis <a class="anchor" id="third-bullet"></a>
