# OD Flow Analysis

This notebook uses the Puget Sound Regional Council (PSRC) [Household Travel Survey (HTS)](https://www.psrc.org/our-work/household-travel-survey-program) to build an origin-destination (OD) matrix at the census tract level.

The resulting flows can then be visualized.

The analysis uses the trip-level data ([download here](https://household-travel-survey-psregcncl.hub.arcgis.com/datasets/22d91ae217be41f58ebac0844ac5d60d_0/explore)).

In [4]:
# libraries
import numpy as np
import pandas as pd
import censusdata

In [43]:
# read in data
trips_df = pd.read_csv("../land-use-travel-patterns/data/Household_Travel_Survey_Trips.csv")
trips_df_2019 = trips_df[trips_df.survey_year == 2019]


## Create OD Matrix

- consider using 1 year of data (2019), then applying weights?
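
If weights are applied, one possible approach is to sum a per-trip expansion weight for each origin-destination pair instead of counting trips. The sketch below uses toy data; the column names `origin`, `destination`, and `trip_weight` are assumptions for illustration and are not taken from the PSRC trip file.

```python
import pandas as pd

# Toy trips with a hypothetical per-trip expansion weight column
toy_trips = pd.DataFrame({
    "origin": [1, 1, 2, 2],
    "destination": [2, 2, 1, 2],
    "trip_weight": [10.0, 5.5, 3.0, 1.5],  # assumed column name, not from the survey file
})

# Sum the weights per (origin, destination) pair instead of counting rows
weighted_od = pd.pivot_table(
    toy_trips,
    values="trip_weight",
    index="origin",
    columns="destination",
    aggfunc="sum",
    fill_value=0,
)
print(weighted_od)
```

Each cell then estimates the number of trips in the population rather than in the sample, provided the weight column really is a trip-level expansion factor.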

### Determine census tracts in PSRC extent (King, Kitsap, Pierce, Snohomish) and assign to matrix index

In [16]:
# borrowing code from labs...thanks eric :D

def get_census_data(state, county, year=2019):
    """Return one row per census tract in a county: tract code, name, state FIPS, county FIPS."""

    # Download the data from the ACS 5-year estimates (2019 by default)
    data = censusdata.download('acs5', year,
                               censusdata.censusgeo([('state', state), ('county', county), ('tract', '*')]),
                               ['B01003_001E'])  # arbitrary variable (total population); only the tract IDs are needed

    # Extract the geography fields from the censusgeo objects stored in the index
    data['Name'] = data.index.to_series().apply(lambda x: x.name)
    data['SummaryLevel'] = data.index.to_series().apply(lambda x: x.sumlevel())
    data['State'] = data.index.to_series().apply(lambda x: x.geo[0][1])
    data['County'] = data.index.to_series().apply(lambda x: x.geo[1][1])
    data['Tract'] = data.index.to_series().apply(lambda x: x.geo[2][1])
    data.reset_index(drop=True, inplace=True)
    data = data[['Tract','Name', 'State', 'County']]
    return data


In [57]:
# Define the state and counties for the PSRC region
state_fips = '53'  # FIPS code for Washington
psrc_county_fips = ['033', '035', '053', '061']  # FIPS codes for King, Kitsap, Pierce, Snohomish counties

# download the tracts for each county and stack them into one table
psrc_census_tracts_df = pd.concat([get_census_data(state_fips, county_fips) for county_fips in psrc_county_fips])

psrc_census_tracts_df['tractid'] = pd.to_numeric(psrc_census_tracts_df['State']+psrc_census_tracts_df['County']+psrc_census_tracts_df['Tract'], errors='coerce')

In [58]:
# assign each tractid to unique number from 1-776 (easier to wrap my head around in a matrix)
psrc_census_tracts_df['tract_alias'] = np.arange(1, len(psrc_census_tracts_df) + 1)

In [59]:
psrc_census_tracts_df

Unnamed: 0,Tract,Name,State,County,tractid,tract_alias
0,020700,"Census Tract 207, King County, Washington",53,033,53033020700,1
1,020800,"Census Tract 208, King County, Washington",53,033,53033020800,2
2,020900,"Census Tract 209, King County, Washington",53,033,53033020900,3
3,021000,"Census Tract 210, King County, Washington",53,033,53033021000,4
4,022001,"Census Tract 220.01, King County, Washington",53,033,53033022001,5
...,...,...,...,...,...,...
146,052503,"Census Tract 525.03, Snohomish County, Washington",53,061,53061052503,772
147,052604,"Census Tract 526.04, Snohomish County, Washington",53,061,53061052604,773
148,052606,"Census Tract 526.06, Snohomish County, Washington",53,061,53061052606,774
149,052903,"Census Tract 529.03, Snohomish County, Washington",53,061,53061052903,775
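
The `tractid` values above follow the standard tract GEOID layout: a 2-digit state FIPS code, a 3-digit county FIPS code, and a 6-digit tract code. A minimal sketch of that composition, using the first row of the table; `build_tract_geoid` is a hypothetical helper, not part of `censusdata`.

```python
def build_tract_geoid(state: str, county: str, tract: str) -> int:
    """Concatenate zero-padded FIPS parts into an 11-digit tract GEOID."""
    geoid = state.zfill(2) + county.zfill(3) + tract.zfill(6)
    if len(geoid) != 11 or not geoid.isdigit():
        raise ValueError(f"unexpected GEOID parts: {state!r}, {county!r}, {tract!r}")
    # Converting to int is safe for Washington (53); a state code with a
    # leading zero would lose that zero, so strings are safer in general.
    return int(geoid)


print(build_tract_geoid("53", "033", "020700"))  # 53033020700
```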


### Format PSRC data 
- match tracts to matrix index
- form matrix

In [89]:
# relevant columns
trips_df_tracts = trips_df_2019[["trip_id", "o_tract10", "d_tract10"]]

In [90]:
# use census tracts as "lookup table" to assign alias 
trips_df_tracts = trips_df_tracts.merge(psrc_census_tracts_df[["tractid", "tract_alias"]], left_on='o_tract10', right_on='tractid', 
                                        how='left').rename(columns={'tract_alias':'o_tract_alias'}).drop(columns='tractid')
trips_df_tracts = trips_df_tracts.merge(psrc_census_tracts_df[["tractid", "tract_alias"]], left_on='d_tract10', right_on='tractid', 
                                        how='left').rename(columns={'tract_alias':'d_tract_alias'}).drop(columns='tractid')

In [91]:
trips_df_tracts

Unnamed: 0,trip_id,o_tract10,d_tract10,o_tract_alias,d_tract_alias
0,19100000101027,5.303301e+10,5.303301e+10,328.0,187.0
1,19100014501002,5.303301e+10,5.303301e+10,236.0,325.0
2,19100033902004,5.303301e+10,5.303300e+10,182.0,28.0
3,19100003101001,5.303302e+10,5.303301e+10,341.0,120.0
4,19100003101002,5.303301e+10,5.303302e+10,120.0,341.0
...,...,...,...,...,...
72850,19201842503005,5.306105e+10,5.306105e+10,676.0,706.0
72851,19201842503006,5.306105e+10,5.303302e+10,706.0,240.0
72852,19201842503007,5.303302e+10,5.306105e+10,240.0,706.0
72853,19201842503008,5.306105e+10,5.303302e+10,706.0,78.0
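
The aliases print as floats (for example `328.0`), which typically means that some trips found no match in the lookup table and the left merge filled those rows with NaN. The sketch below, on toy data, shows one way such rows could be counted with `indicator=True` and how the aliases could be kept as integers; the toy IDs are illustrative.

```python
import pandas as pd

# Toy lookup table and trips; the second trip starts in a tract missing from the lookup
lookup = pd.DataFrame({"tractid": [53033020700, 53033020800], "tract_alias": [1, 2]})
toy_trips = pd.DataFrame({
    "trip_id": [101, 102, 103],
    "o_tract10": [53033020700, 53999000100, 53033020800],
})

merged = toy_trips.merge(lookup, left_on="o_tract10", right_on="tractid",
                         how="left", indicator=True)

# Rows flagged "left_only" found no alias in the lookup table
n_unmatched = int((merged["_merge"] == "left_only").sum())
print(n_unmatched)

# The nullable integer dtype keeps aliases as integers while still allowing missing values
merged["tract_alias"] = merged["tract_alias"].astype("Int64")
print(merged[["trip_id", "tract_alias"]])
```

Unmatched trips are silently dropped from the pivot table, so it is worth knowing how many there are before interpreting the matrix.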


In [101]:
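# count trips per (origin, destination) pair and add them onto a full grid of zeros;
# fill_value=0 keeps tracts with no observed trips at 0 instead of NaN
tract_aliases = np.arange(1, len(psrc_census_tracts_df) + 1)
od_matrix = pd.DataFrame(0, index=tract_aliases, columns=tract_aliases).add(
    pd.pivot_table(data=trips_df_tracts, values="trip_id", index='o_tract_alias', columns='d_tract_alias',
                   fill_value=0, aggfunc='count'), fill_value=0)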

In [106]:
od_matrix

Unnamed: 0,1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,...,767.0,768.0,769.0,770.0,771.0,772.0,773.0,774.0,775.0,776.0
1.0,8.0,0.0,5.0,2.0,0.0,1.0,0.0,0.0,0.0,0.0,...,0.0,2.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2.0,2.0,0.0,2.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3.0,4.0,2.0,30.0,3.0,0.0,2.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0
4.0,4.0,0.0,7.0,5.0,0.0,0.0,1.0,0.0,0.0,0.0,...,0.0,2.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5.0,0.0,0.0,0.0,0.0,2.0,0.0,0.0,2.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
772.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,44.0,20.0,0.0,0.0,0.0
773.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,19.0,2.0,0.0,0.0,0.0
774.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0
775.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,3.0,0.0
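
A square matrix over every tract can also be built by cross-tabulating the aliases and reindexing onto the full alias list, which gives integer counts and all-zero rows and columns for tracts with no trips. This is a sketch on toy data, offered as an alternative rather than the method used above; the four-tract region is an assumption for illustration.

```python
import numpy as np
import pandas as pd

# Toy trips: two from tract 1 to tract 3, one from tract 3 to tract 1
toy = pd.DataFrame({"o_tract_alias": [1, 1, 3], "d_tract_alias": [3, 3, 1]})
all_aliases = np.arange(1, 5)  # pretend the region has 4 tracts

toy_od = (
    pd.crosstab(toy["o_tract_alias"], toy["d_tract_alias"])
    .reindex(index=all_aliases, columns=all_aliases, fill_value=0)
)
print(toy_od)

# Row sums are trips produced by each tract; column sums are trips attracted
print(toy_od.sum(axis=1))
print(toy_od.sum(axis=0))
```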


## Now this is ready to be visualized with scikit-mobility, movingpandas, etc.!
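
Flow-mapping tools generally expect a long table with one row per origin-destination pair rather than a square matrix. A minimal pandas sketch of that reshaping, using the top-left corner of the matrix above as toy data; the column names `origin`, `destination`, and `flow` are illustrative and should be adjusted to whatever the chosen library requires.

```python
import pandas as pd

# Toy 3x3 OD matrix (rows are origins, columns are destinations)
toy_od = pd.DataFrame([[8, 0, 5], [2, 0, 2], [4, 2, 30]],
                      index=[1, 2, 3], columns=[1, 2, 3])

# Move the origin index into a column, then unpivot the destination columns
flows = (
    toy_od.rename_axis(index="origin")
    .reset_index()
    .melt(id_vars="origin", var_name="destination", value_name="flow")
)

# Drop empty pairs so that only observed flows remain
flows = flows[flows["flow"] > 0].reset_index(drop=True)
print(flows)
```

The tract aliases can be mapped back to `tractid` through `psrc_census_tracts_df` when the flows need to be joined to tract geometries.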