# Minilab 3 - Mapping TAZ travel times

In this lab we will play with some data from the Metropolitan Transportation Commission on travel time from one Traffic Analysis Zone (TAZ) to another. We will use a mapping tool called folium to create a graphical representation of travel times throughout the area.


In [None]:
import folium
import json
from datascience import *
import pandas as pd

## The datasets
### MTC travel skims
The Metropolitan Transportation Commission (MTC) is the regional transportation planning organization for the Bay Area. They host a database with average travel time, cost, and distance from each traffic analysis zone (TAZ) to all other TAZs in the Bay Area. The files have data for driving alone, carpooling, walking to transit, driving to transit, walking, and biking.

We have pre-processed the data from the morning commute to include only TAZs around San Francisco, Oakland, and Berkeley. The files with inter-TAZ travel time, travel cost, and travel distance are saved at the following file paths: 'data/sf_oak_TimeSkims_AM.csv', 'data/sf_oak_CostSkims_AM.csv', and 'data/sf_oak_DistanceSkims_AM.csv'.

More info on the dataset can be found here - http://analytics.mtc.ca.gov/foswiki/Main/SimpleSkims. 
The descriptions of the columns in the data set are shown below:

    orig    Origin transportation analysis zone (see shape file)
    dest    Destination transportation analysis zone (see shape file)
    da      Door-to-door time for the drive alone travel mode (i.e. single occupant private automobile)
    daToll  Door-to-door time for the drive alone value toll travel mode (those willing to pay to use an HOT lane)
    s2      Door-to-door time for the shared ride 2 travel mode (i.e. double occupant private automobile)
    s2Toll  Door-to-door time for the shared ride 2 value toll travel mode (those willing to pay to use an HOT lane)
    s3      Door-to-door time for the shared ride 3+ travel mode (i.e. three-or-more occupants traveling in a private vehicle)
    s3Toll  Door-to-door time for the shared ride 3+ value toll travel mode (those willing to pay to use an HOT lane, if scenario policy dictates they must pay)
    walk    Door-to-door time for walking
    bike    Door-to-door time for bicycling
    wTrnW   Door-to-door time for walk to transit to walk paths
    dTrnW   Door-to-door time for drive to transit to walk paths
    wTrnD   Door-to-door time for walk to transit to drive paths (returning home on a park-and-ride tour)

(The raw data with all Bay Area TAZs can be found at https://mtcdrive.app.box.com/2015-03-116)

### Bay Area TAZ geometry data
GeoJSON is a format used for encoding a variety of geographic data structures. We have saved a GeoJSON file with the Traffic Analysis Zone (TAZ) polygons for the TAZs in the San Francisco, Oakland, and Berkeley region. We will use a mapping package called folium to map the TAZs.
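
To make the structure concrete, here is a minimal sketch of a GeoJSON FeatureCollection with two zones. The coordinates and zone ids are made up for illustration; only the `TAZ1454` property name mirrors the lab's file, and the real file may carry additional properties.

```python
import json

# Hypothetical two-zone FeatureCollection. The coordinates are invented;
# only the 'TAZ1454' property name is taken from the lab's GeoJSON file.
sample_geojson = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"TAZ1454": 10},
            "geometry": {
                "type": "Polygon",
                # One linear ring of [longitude, latitude] pairs. The first
                # and last positions are identical, which closes the ring.
                "coordinates": [[
                    [-122.40, 37.79], [-122.39, 37.79],
                    [-122.39, 37.80], [-122.40, 37.80],
                    [-122.40, 37.79],
                ]],
            },
        },
        {
            "type": "Feature",
            "properties": {"TAZ1454": 11},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[
                    [-122.39, 37.79], [-122.38, 37.79],
                    [-122.38, 37.80], [-122.39, 37.80],
                    [-122.39, 37.79],
                ]],
            },
        },
    ],
}

# Serialize the dict to a string, then list the zone ids it contains.
sample_str = json.dumps(sample_geojson)
taz_ids = [f["properties"]["TAZ1454"] for f in sample_geojson["features"]]
print(taz_ids)
```

Note that GeoJSON positions are ordered longitude first, then latitude, which is the reverse of the `location=[lat, lon]` order used when centering a map.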

### Read the data
Normally in the Data 8 class we read datasets into Tables, but for this lab we are using a tool that requires the data to be in a pandas DataFrame. DataFrame functionality is **very** similar to Table functionality.

In [None]:
# Read the timeskims datafile
data = pd.read_csv('data/sf_oak_TimeSkims_AM.csv')
# Show the data type of each column.
# Uncomment the next line to print the first 5 rows of the data table instead.
# data.head()
data.dtypes

In [None]:
# Read the GeoJSON file and save the data as a string.
# json.load() parses the file into a Python dict; json.dumps() dumps that dict to a string.
with open('data/SF_Oak_TAZs.geojson') as f:
    geojson_data = json.load(f)
geojson_str = json.dumps(geojson_data)

# The geojson string contains properties of the TAZ as well as lat-lon coordinates defining 
# the polygon boundary for each TAZ. Here is a printout of what the geojson looks like.
geojson_str

## Mapping the TAZs
We will use a mapping package called folium to map the TAZs.



In [None]:
# Create a folium map object. Set the width, and height of the map image 
# as well as start zoom level and center location
taz_map = folium.Map(width=650, height=500, zoom_start=11,
                     location=[37.8062449,-122.371983])

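# Add the TAZ GeoJSON data layer to the map using the geo_json() method.
# (geo_json() and create_map() belong to early folium releases; newer releases
# use folium.GeoJson / folium.Choropleth and Map.save() instead.)
taz_map.geo_json(geo_str=geojson_str)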

# Create the map. 
taz_map.create_map('map1.html')

#to show the map in-line
taz_map

## Travel time from one TAZ to all others


In [None]:
#choose an origin TAZ
origin_taz_id=10

# save a new data frame that contains only rows with the origin specified above 
data_origin = data[data['orig']==origin_taz_id]

Folium allows us to color map elements based on data. We will again use folium's geo_json() method, but now we will pass additional arguments to color the TAZs based on travel time.
We will use the following inputs to the geo_json() method:

    data: Pandas DataFrame or Series, default None
        Data to bind to the GeoJSON.
    columns: the columns of the dataframe that we will use to color the data
    key_on: string, default None
        Variable in the GeoJSON file to bind the data to. Must always
        start with 'feature' and be in JavaScript object notation.
        Ex: 'feature.id' or 'feature.properties.statename'. 
    threshold_scale: list, default None
        Data range for D3 threshold scale. Defaults to the following range
        of quantiles: [0, 0.5, 0.75, 0.85, 0.9], rounded to the nearest
        order-of-magnitude integer. Ex: 270 rounds to 200, 5600 to 6000.
    fill_color: string, default 'blue'
        Area fill color. Can pass a hex code, color name, or if you are
        binding data, one of the following color brewer palettes:
        'BuGn', 'BuPu', 'GnBu', 'OrRd', 'PuBu', 'PuBuGn', 'PuRd', 'RdPu',
        'YlGn', 'YlGnBu', 'YlOrBr', and 'YlOrRd'.
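
As a rough illustration of what `key_on` does, the sketch below follows a dotted path into a single feature and uses the result to look up a value. The helper `resolve_key_on`, the sample feature, and the travel times are all hypothetical; folium does this binding internally, and this is not its actual implementation.

```python
def resolve_key_on(feature, key_on):
    """Follow a dotted path such as 'feature.properties.TAZ1454' into a feature dict.

    Hypothetical helper for illustration only; it is not part of folium.
    """
    parts = key_on.split(".")
    if parts[0] != "feature":
        raise ValueError("key_on must start with 'feature'")
    value = feature
    for part in parts[1:]:
        value = value[part]
    return value


# Made-up feature and made-up bike travel times keyed by destination TAZ id.
feature = {
    "type": "Feature",
    "id": "a",
    "properties": {"TAZ1454": 10},
    "geometry": None,
}
bike_time_by_dest = {10: 0.0, 11: 12.5}

# The id found in the feature is the key used to fetch the bound data value.
taz_id = resolve_key_on(feature, "feature.properties.TAZ1454")
bound_value = bike_time_by_dest[taz_id]
print(taz_id, bound_value)
```

In the lab's call, `columns=['dest', 'bike']` plays the role of `bike_time_by_dest`: the first column holds the key that must match the `key_on` value, and the second holds the number used for coloring.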


In [None]:
traveltime_map = folium.Map(width=650, height=500, zoom_start=11,
                            location=[37.8062449,-122.371983])

traveltime_map.geo_json(geo_str=geojson_str, data=data_origin,
                        columns=['dest', 'bike'],
                        threshold_scale=[10,20,30,40,50],
                        key_on='feature.properties.TAZ1454',
                        fill_color='PuBu', fill_opacity=.8)

traveltime_map.create_map('map2.html')
traveltime_map



## Color the origin TAZ red

In [None]:
# find origin_taz_id and color it red
for i in range(len(geojson_data['features'])):
    if (geojson_data['features'][i]['properties']['TAZ1454'] == origin_taz_id):
        origin_feature  = geojson_data['features'][i]

# Create a new geojson layer called origin_data. This geojson will only have one feature -
# the origin feature. We will replace the origin_data['features'] with a list containing 
# only the origin TAZ feature

with open('data/SF_Oak_TAZs.geojson') as f:
    origin_data = json.load(f)
origin_data['features']=[origin_feature]
origin_str = json.dumps(origin_data)

traveltime_map.geo_json(geo_str=origin_str,fill_color='Red',
                       fill_opacity = .8)

traveltime_map.create_map('map3.html')
traveltime_map

# Task 1 - Modify the map
 <li>Change the map color scheme.
 <li>Choose an East Bay TAZ as the origin TAZ (e.g. TAZ 1035) and generate a new map with travel times from this TAZ.
 <li>The map is currently displaying bike travel time from one TAZ to another. Change the travel mode to walk-transit-walk (wTrnW).
 <li>Change the color threshold values so that the color transitions happen at appropriate intervals
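
When choosing thresholds, it can help to count how many destinations land in each color class. The sketch below assumes the threshold scale follows D3's convention, where n thresholds define n + 1 classes and a value equal to a threshold falls in the upper class. The travel times are made up, and `color_bin` is a hypothetical helper.

```python
from bisect import bisect_right
from collections import Counter

thresholds = [10, 20, 30, 40, 50]
travel_times = [4.2, 10.0, 18.7, 33.1, 61.0]  # made-up values, in minutes


def color_bin(value, thresholds):
    """Return the index of the color class for value (0 = below the first threshold)."""
    return bisect_right(thresholds, value)


bins = [color_bin(t, thresholds) for t in travel_times]
counts = Counter(bins)

# Print how many values fall in each of the len(thresholds) + 1 classes.
for class_index in range(len(thresholds) + 1):
    print(class_index, counts.get(class_index, 0))
```

If most destinations fall into one or two classes, the map will look nearly uniform, which is a sign the thresholds should be spread differently.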

# Task 2 - The relationship between spatial proximity and travel time
For each of the following modes: drive, walk to transit, drive to transit, bike, and walk, do you think the straight-line distance from one TAZ to another is a good indicator of travel time? Explain why or why not.
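
If you want numbers to reason with, straight-line (great-circle) distance between two points can be estimated with the haversine formula. This is a minimal sketch: the two coordinates are approximate city-center locations chosen for illustration, not TAZ centroids from the lab's data, and `haversine_km` is a hypothetical helper.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Approximate, illustrative locations (latitude, longitude).
san_francisco = (37.7749, -122.4194)
oakland = (37.8044, -122.2712)

distance_km = haversine_km(*san_francisco, *oakland)
print(round(distance_km, 1))
```

Comparing a figure like this with the skim values for the same pair of places shows how much the network (bridges, transit lines, the bay itself) stretches the trip relative to the straight line.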



In [None]:
# Answer here

# Task 3 - Color all unreachable TAZs grey
For the datasets provided, MTC follows a convention of setting travel time to -999 in the data table if they have determined that a mode is not feasible for getting from an origin TAZ to a destination TAZ. For example, a bike trip from San Francisco to Oakland is not possible because bikes are not allowed across the Bay Bridge. Similarly, some walk distances are determined to be too far to make the trip on foot.

Color all TAZs with a travel time of -999 grey. Hint - use a procedure similar to the procedure used to color the origin red.


# Task 3 Solution

## Step 1. Find ids of destinations to color grey
The code below finds the destination ids in one line. After that, we will walk through exactly what this line is doing.

In [None]:
# Find items in the data_origin dataframe that have a travel time by bike of -999
# Below is a one-line solution to do this:
grey_dest_ids  = data_origin[data_origin['bike'] == -999]['dest']

In [None]:
#BUT, let's walk through it
# Let's look at the data_origin table again
# head() prints the first 5 rows of the dataframe
data_origin.head() 

In [None]:
# to get a list of true/false for whether bike travel time = -999
data_origin['bike'] == -999

In [None]:
#grab the rows from the data_origin table where travel time by bike =-999. 
data_origin[data_origin['bike'] == -999]

# Note that all bike travel times are -999 in the table printed below

In [None]:
# Now we want a list of the destination ids where bike travel time = -999
data_origin[data_origin['bike'] == -999]['dest']

# the left column of the table printed below is the index from the initial data table. 
# The right column is the destination id column

In [None]:
# we assign this to a variable called grey_dest_ids
grey_dest_ids  = data_origin[data_origin['bike'] == -999]['dest']

Now we have a list of destination ids for destinations that are unreachable from the origin. 
## Step 2. Find the grey_dest_ids in the geojson file

In [None]:
# First start with an empty list of grey_features. We will append grey features 
# to it as we find them
grey_features = []

#loop through the destination ids
for dest_id in grey_dest_ids:
    
    # The code below is looping through the geojson features and finding features 
    # where the TAZ1454 property is equal to dest_id. Once it finds it, we append
    # it to the grey_features list. (Note the 'TAZ1454' property contains the TAZ id. 
    # 'TAZ1454' refers to the fact that in total there are 1454 TAZs in the Bay Area,
    # and they are assigned ids from 1 to 1454)
    for i in range(len(geojson_data['features'])):
        if (geojson_data['features'][i]['properties']['TAZ1454'] == dest_id):
            grey_features.append(geojson_data['features'][i]) 

# Let's take a look at what grey features looks like
grey_features
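
The nested loop above scans every feature once per destination id. An alternative sketch, shown here on made-up stand-in data rather than the lab's files, builds a dictionary from TAZ id to feature once and then looks each id up directly. The names `features`, `unreachable_ids`, and `features_by_id` are hypothetical.

```python
# Made-up stand-ins for geojson_data['features'] and grey_dest_ids.
features = [
    {"type": "Feature", "properties": {"TAZ1454": 10}, "geometry": None},
    {"type": "Feature", "properties": {"TAZ1454": 11}, "geometry": None},
    {"type": "Feature", "properties": {"TAZ1454": 12}, "geometry": None},
]
unreachable_ids = [12, 10, 99]  # 99 has no matching feature

# Build the index once: TAZ id -> feature.
features_by_id = {f["properties"]["TAZ1454"]: f for f in features}

# Look up each id, skipping ids that are missing from the GeoJSON.
grey = [features_by_id[i] for i in unreachable_ids if i in features_by_id]
print([f["properties"]["TAZ1454"] for f in grey])
```

Both approaches select the same features; the dictionary version mainly pays off when the number of zones or destination ids grows.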

## Step 3. Create a new Geojson object. Replace the features data with the grey_features list we just created

In [None]:
# Create a new geojson layer called grey_data. We will replace the features in this
# geojson with the list of features we just created above.
with open('data/SF_Oak_TAZs.geojson') as f:
    grey_data = json.load(f)
grey_data['features']=grey_features
grey_str = json.dumps(grey_data)

#Now we add this layer to our map. Set the fill color to Grey
traveltime_map.geo_json(geo_str=grey_str,fill_color='Grey',
                       fill_opacity = .8)

# And create the map:
traveltime_map.create_map('map4.html')
traveltime_map

# Task 4 - Map travel cost from an origin 
We have been visualizing travel time from one TAZ to another. As mentioned above, MTC also provides info on travel cost and distance from one TAZ to another by mode. Create a folium map of travel cost from an origin TAZ to all others, using the same procedure as outlined above.

# Task 4 solution
To do this we would run all of the same code as above, but at the very beginning, 
set data = pd.read_csv('data/sf_oak_CostSkims_AM.csv') so that we use the cost
skims file rather than the time skims file.