# Measuring Access to Healthcare in the IE

### Software

To do a full-stack assessment of accessibility we need to link together a few different pieces of software:

- [`urbanaccess`]() builds a routable, multimodal transportation network. It consumes data from (1) OpenStreetMap and (2) GTFS and combines them into a network in which the distance between nodes is measured in travel time
- [`pandana`]() finds the shortest path through the network between all pairs of nodes using a fast underlying C++ library, and can optionally create distance-weighted sums of resources, a simple measure of accessibility that accounts for space but not competition. `pandana` consumes (1) the network generated by `urbanaccess`, (2) a set of origins, and (3) a set of destinations. We will use `pandana` to create the shortest-path travel cost matrix relating origins to destinations
- [`access`]() creates measures of accessibility that account for distance traveled and include various weighting schemes to discount for competition. `access` consumes geodataframes representing (1) supply and (2) demand, and (3) a travel cost matrix. In our case, (1) is equivalent to the destination set above, (2) is a set of population counts, and (3) is the output from `pandana`
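
To make the difference between the last two bullets concrete, here is a minimal sketch in plain Python with made-up numbers. The first function sums the supply reachable within a travel-time threshold (the simplest, flat weighting) and ignores competition. The second computes a two-step floating catchment area (2SFCA) ratio, used here only as one example of a competition-aware measure. The tract and clinic names, the 30-minute threshold, and both function names are assumptions for illustration; none of this is the `pandana` or `access` API.

```python
# Toy travel cost matrix in minutes: cost[origin][destination]
cost = {
    "tract_a": {"clinic_1": 10, "clinic_2": 40},
    "tract_b": {"clinic_1": 20, "clinic_2": 15},
}
supply = {"clinic_1": 4, "clinic_2": 2}        # e.g. doctors per clinic
demand = {"tract_a": 1000, "tract_b": 3000}    # e.g. residents per tract


def reachable_supply(origin, max_minutes=30):
    """Sum supply reachable within max_minutes, ignoring competition."""
    return sum(
        supply[dest]
        for dest, minutes in cost[origin].items()
        if minutes <= max_minutes
    )


def two_step_fca(max_minutes=30):
    """Discount each destination's supply by the demand competing for it."""
    # Step 1: supply-to-demand ratio within each destination's catchment
    ratio = {}
    for dest, units in supply.items():
        competing = sum(
            demand[origin]
            for origin in cost
            if cost[origin][dest] <= max_minutes
        )
        ratio[dest] = units / competing if competing else 0.0
    # Step 2: sum the ratios of the destinations each origin can reach
    return {
        origin: sum(
            r for dest, r in ratio.items() if cost[origin][dest] <= max_minutes
        )
        for origin in cost
    }


simple = {origin: reachable_supply(origin) for origin in cost}
competitive = two_step_fca()
print(simple)
print(competitive)
```

In the toy data both tracts can reach `clinic_1`, so its four doctors are shared across 4,000 residents in the second measure, while the first measure counts them in full for each tract.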

### Data

- OSM
    - `urbanaccess` (and `osmnet`) can download data from OSM in the format they need using a function called `network_from_bbox`. This is convenient, but can be very time-consuming depending on the size of the network
    - I have stored pre-built metropolitan-scale networks (extending 8km beyond the metro border, so supporting queries up to that range) in [our quilt bucket](https://open.quiltdata.com/b/spatial-ucr/packages/osm/metro_networks_8k)
- GTFS
    - [transitfeeds](https://transitfeeds.com/p/riverside-transit-agency/531) is a good place to get up-to-date GTFS data, but it can be hard to ensure you've included every relevant transit agency serving the study area (and impossible to know whether there's another agency that hasn't yet posted its data there)

In [3]:
import quilt3
import pandas as pd
import numpy as np
import geopandas as gpd
import urbanaccess as ua
import access
import pandana as pdna

If you get "ImportError: cannot import name 'vincenty' from 'geopy.distance'" you need to downgrade geopy to version 1.9, e.g.

```
pip install geopy==1.9
```

(I sent [a fix](https://github.com/UDST/osmnet/pull/21) that has been merged, but we are still waiting on a new release of the `osmnet` package)

#### Creating a multimodal network

Download OSM data from our quilt bucket

In [2]:
p = quilt3.Package.browse("osm/metro_networks_8k", "s3://spatial-ucr")

In [3]:
p['40140.h5'].fetch(dest="../data/")

Copying: 100%|██████████| 42.1M/42.1M [00:03<00:00, 12.7MB/s]  


PackageEntry('file:///Users/knaaptime/projects/healthacc/data/')

In [15]:
# read in the pre-saved OSM network using pandana

osm_network = pdna.Network.from_hdf5("../data/40140.h5")

Now we read in GTFS data using urbanaccess. From here, we're essentially following the tutorial from https://github.com/UDST/urbanaccess/blob/dev/demo/simple_example.ipynb

Two things to note:
- First, `ua` uses a deprecated pandas method (`as_matrix` instead of `values`), so you need to either fix that code in `network.py` or install the version from my fork.
- Second, `ua` makes some nontraditional design decisions. Rather than expose something like a Network class, it keeps track of a global network object in `ua.network.ua_network`. Thus, most of the functions modify the global network in place (and return that same object) rather than returning a new one
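
The second point is easier to see with a toy version of the pattern. The sketch below is not `urbanaccess` code; `ToyNetwork`, `toy_network`, and `add_transit` are hypothetical names that only mimic a module-level object being mutated in place.

```python
class ToyNetwork:
    """Stand-in for a network container with edge tables as attributes."""

    def __init__(self):
        self.transit_edges = []
        self.osm_edges = []


# Module-level singleton, analogous in spirit to `ua.network.ua_network`
toy_network = ToyNetwork()


def add_transit(edges):
    """Mutate the shared object in place and return it for convenience."""
    toy_network.transit_edges.extend(edges)
    return toy_network


result = add_transit([("stop_1", "stop_2", 1.0)])

# The return value and the global are the same object, so holding a
# reference to either one is enough to inspect the current state.
print(result is toy_network)
print(len(toy_network.transit_edges))
```

This matches what the cell outputs in this notebook show: separate calls display an `urbanaccess_network` object at the same memory address. In the toy version, calling `add_transit` a second time appends to the existing list rather than starting fresh, which is the kind of side effect to watch for with any shared global.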

In [5]:
#ua.gtfsfeed_to_df

In [4]:
# read in GTFS data from the dir extracted from the zip

loaded_feeds = ua.gtfsfeed_to_df(gtfsfeed_path="/Users/knaaptime/projects/healthacc/data/rside_gtfs/",
                                           validation=False,
                                           verbose=True,
                                           append_definitions=False)

GTFS text file header whitespace check completed. Took 0.06 seconds
--------------------------------
Processing GTFS feed: 
The unique agency id: riverside_transit_agency was generated using the name of the agency in the agency.txt file.
Unique agency id operation complete. Took 0.01 seconds
Unique GTFS feed id operation complete. Took 0.00 seconds
Appended route type to stops
Appended route type to stop_times
--------------------------------
Successfully converted ['departure_time'] to seconds past midnight and appended new columns to stop_times. Took 0.24 seconds
1 GTFS feed file(s) successfully read as dataframes:
     
     Took 0.58 seconds


In [5]:
# transform the raw gtfs data into a transit network

ua.create_transit_net(gtfsfeeds_dfs=loaded_feeds,
                                   day='monday',
                                   timerange=['07:00:00', '10:00:00'],
                                   calendar_dates_lookup=None)

Using calendar to extract service_ids to select trips.
1 service_ids were extracted from calendar
966 trip(s) 51.58 percent of 1,873 total trip records were found in calendar for GTFS feed(s): ['']
NOTE: If you expected more trips to have been extracted and your GTFS feed(s) have a calendar_dates file, consider utilizing the calendar_dates_lookup parameter in order to add additional trips based on information inside of calendar_dates. This should only be done if you know the corresponding GTFS feed is using calendar_dates instead of calendar to specify service_ids. When in doubt do not use the calendar_dates_lookup parameter.
966 of 1,873 total trips were extracted representing calendar day: monday. Took 0.02 seconds
There are no departure time records missing from trips following monday schedule. There are no records to interpolate.
Difference between stop times has been successfully calculated. Took 0.19 seconds
Stop times from 07:00:00 to 10:00:00 successfully selected 8,613 records

<urbanaccess.network.urbanaccess_network at 0x7ffca0c8a5d0>

Here, we create a reference to the global network so that it's easier to inspect

In [6]:
urbanaccess_net = ua.ua_network

In [7]:
urbanaccess_net.transit_edges

Unnamed: 0,node_id_from,node_id_to,weight,unique_agency_id,unique_trip_id,sequence,id,route_type,unique_route_id,net_type
0,749_riverside_transit_agency,714_riverside_transit_agency,1.000000,riverside_transit_agency,534170020_riverside_transit_agency,1,534170020_riverside_transit_agency_1,3,16_riverside_transit_agency,transit
1,714_riverside_transit_agency,771_riverside_transit_agency,0.533333,riverside_transit_agency,534170020_riverside_transit_agency,2,534170020_riverside_transit_agency_2,3,16_riverside_transit_agency,transit
2,771_riverside_transit_agency,715_riverside_transit_agency,0.433333,riverside_transit_agency,534170020_riverside_transit_agency,3,534170020_riverside_transit_agency_3,3,16_riverside_transit_agency,transit
3,715_riverside_transit_agency,717_riverside_transit_agency,0.816667,riverside_transit_agency,534170020_riverside_transit_agency,4,534170020_riverside_transit_agency_4,3,16_riverside_transit_agency,transit
4,717_riverside_transit_agency,772_riverside_transit_agency,0.633333,riverside_transit_agency,534170020_riverside_transit_agency,5,534170020_riverside_transit_agency_5,3,16_riverside_transit_agency,transit
...,...,...,...,...,...,...,...,...,...,...
8380,7_riverside_transit_agency,549_riverside_transit_agency,11.483333,riverside_transit_agency,538597020_riverside_transit_agency,6,538597020_riverside_transit_agency_6,3,204_riverside_transit_agency,transit
8381,549_riverside_transit_agency,550_riverside_transit_agency,3.516667,riverside_transit_agency,538597020_riverside_transit_agency,7,538597020_riverside_transit_agency_7,3,204_riverside_transit_agency,transit
8382,1037_riverside_transit_agency,1100_riverside_transit_agency,0.400000,riverside_transit_agency,538603020_riverside_transit_agency,1,538603020_riverside_transit_agency_1,3,204_riverside_transit_agency,transit
8383,1100_riverside_transit_agency,1038_riverside_transit_agency,17.000000,riverside_transit_agency,538603020_riverside_transit_agency,2,538603020_riverside_transit_agency_2,3,204_riverside_transit_agency,transit
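
The `weight` values above look like travel times between consecutive stops expressed in minutes (0.533333 would be 32 seconds). Below is a minimal sketch of that calculation, assuming minutes as the unit and using invented departure times for a single trip. `to_seconds` and `edge_weights` are hypothetical helpers, not `urbanaccess` functions.

```python
def to_seconds(hhmmss):
    """Convert a GTFS 'HH:MM:SS' time to seconds past midnight."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def edge_weights(stop_times):
    """Return (from_stop, to_stop, minutes) for consecutive stops on one trip."""
    edges = []
    for (stop_a, time_a), (stop_b, time_b) in zip(stop_times, stop_times[1:]):
        minutes = (to_seconds(time_b) - to_seconds(time_a)) / 60
        edges.append((stop_a, stop_b, minutes))
    return edges


# Invented departure times, ordered by stop sequence
trip = [("749", "07:00:00"), ("714", "07:01:00"), ("771", "07:01:32")]
edges = edge_weights(trip)
for edge in edges:
    print(edge)
```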


Now we create a `ua` version of the OSM network we read in earlier using pandana. Pandana and urbanaccess share a lot of underlying code, but their objects aren't interchangeable. There's probably a way to read in the OSM data using `ua` instead of `pandana`, but this is the path of least resistance

In [16]:
ua_osm = ua.create_osm_net(osm_edges=osm_network.edges_df,
                              osm_nodes=osm_network.nodes_df,
                              travel_speed_mph=3)

Created OSM network with travel time impedance using a travel speed of 3 MPH. Took 0.02 seconds
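
The travel time impedance here is edge length divided by walking speed. A rough sketch of that conversion follows, assuming edge lengths are stored in meters and the resulting weight is in minutes (both are assumptions about units, and `walk_minutes` is a hypothetical helper rather than part of `urbanaccess`).

```python
METERS_PER_MILE = 1609.344


def walk_minutes(distance_m, speed_mph=3):
    """Minutes needed to cover distance_m meters at speed_mph miles per hour."""
    meters_per_minute = speed_mph * METERS_PER_MILE / 60
    return distance_m / meters_per_minute


# At 3 mph a pedestrian covers about 80 meters per minute
print(round(walk_minutes(80.4672), 3))
print(round(walk_minutes(1000), 2))
```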


Now all we have to do is integrate the networks and save the result as an h5 file

In [18]:
ua.integrate_network(urbanaccess_network=urbanaccess_net,
                             headways=False)

Loaded UrbanAccess network components comprised of:
     Transit: 2,415 nodes and 8,385 edges;
     OSM: 486,514 nodes and 742,113 edges
Connector edges between the OSM and transit network nodes successfully completed. Took 1.51 seconds
Edge and node tables formatted for Pandana with integer node ids: id_int, to_int, and from_int. Took 3.19 seconds
Network edge and node network integration completed successfully resulting in a total of 488,929 nodes and 755,328 edges:
     Transit: 2,415 nodes 8,385 edges;
     OSM: 486,514 nodes 742,113 edges; and
     OSM/Transit connector: 4,830 edges.


<urbanaccess.network.urbanaccess_network at 0x7ffca0c8a5d0>
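
The counts in this log are internally consistent, and they suggest that each transit stop receives two directed connector edges, one in each direction. A quick check using only the numbers printed above:

```python
transit_nodes, transit_edges = 2_415, 8_385
osm_nodes, osm_edges = 486_514, 742_113
connector_edges = 4_830

# Two connectors per transit node: one towards OSM, one back
assert connector_edges == 2 * transit_nodes

total_nodes = transit_nodes + osm_nodes
total_edges = transit_edges + osm_edges + connector_edges
print(total_nodes, total_edges)
```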

In [20]:
urbanaccess_net.transit_edges

Unnamed: 0,from,to,weight,unique_agency_id,unique_trip_id,sequence,id,route_type,unique_route_id,net_type
0,749_riverside_transit_agency,714_riverside_transit_agency,1.000000,riverside_transit_agency,534170020_riverside_transit_agency,1,534170020_riverside_transit_agency_1,3,16_riverside_transit_agency,transit
1,714_riverside_transit_agency,771_riverside_transit_agency,0.533333,riverside_transit_agency,534170020_riverside_transit_agency,2,534170020_riverside_transit_agency_2,3,16_riverside_transit_agency,transit
2,771_riverside_transit_agency,715_riverside_transit_agency,0.433333,riverside_transit_agency,534170020_riverside_transit_agency,3,534170020_riverside_transit_agency_3,3,16_riverside_transit_agency,transit
3,715_riverside_transit_agency,717_riverside_transit_agency,0.816667,riverside_transit_agency,534170020_riverside_transit_agency,4,534170020_riverside_transit_agency_4,3,16_riverside_transit_agency,transit
4,717_riverside_transit_agency,772_riverside_transit_agency,0.633333,riverside_transit_agency,534170020_riverside_transit_agency,5,534170020_riverside_transit_agency_5,3,16_riverside_transit_agency,transit
...,...,...,...,...,...,...,...,...,...,...
8380,7_riverside_transit_agency,549_riverside_transit_agency,11.483333,riverside_transit_agency,538597020_riverside_transit_agency,6,538597020_riverside_transit_agency_6,3,204_riverside_transit_agency,transit
8381,549_riverside_transit_agency,550_riverside_transit_agency,3.516667,riverside_transit_agency,538597020_riverside_transit_agency,7,538597020_riverside_transit_agency_7,3,204_riverside_transit_agency,transit
8382,1037_riverside_transit_agency,1100_riverside_transit_agency,0.400000,riverside_transit_agency,538603020_riverside_transit_agency,1,538603020_riverside_transit_agency_1,3,204_riverside_transit_agency,transit
8383,1100_riverside_transit_agency,1038_riverside_transit_agency,17.000000,riverside_transit_agency,538603020_riverside_transit_agency,2,538603020_riverside_transit_agency_2,3,204_riverside_transit_agency,transit


In [21]:
# I think ua prepends a default `data` folder inside the cwd to the filename (see the log output), so you need to move up *two* levels
ua.save_network(urbanaccess_network=urbanaccess_net,
                        filename='../../data/combined_net.h5',
                        overwrite_key = True)

Using existing data/../../data/combined_net.h5 hdf5 store.
Existing edges overwritten in data/../../data/combined_net.h5 hdf5 store.
Using existing data/../../data/combined_net.h5 hdf5 store.
nodes saved in data/../../data/combined_net.h5 hdf5 store.


In [33]:
urbanaccess_net.net_edges

Unnamed: 0,from,to,weight,unique_agency_id,unique_trip_id,sequence,edge_id,route_type,unique_route_id,net_type,distance,from_int,to_int
0,749_riverside_transit_agency,714_riverside_transit_agency,1.000000,riverside_transit_agency,534170020_riverside_transit_agency,1.0,534170020_riverside_transit_agency_1,3.0,16_riverside_transit_agency,transit,,2232.0,1395.0
1,714_riverside_transit_agency,771_riverside_transit_agency,0.533333,riverside_transit_agency,534170020_riverside_transit_agency,2.0,534170020_riverside_transit_agency_2,3.0,16_riverside_transit_agency,transit,,1395.0,102.0
2,771_riverside_transit_agency,715_riverside_transit_agency,0.433333,riverside_transit_agency,534170020_riverside_transit_agency,3.0,534170020_riverside_transit_agency_3,3.0,16_riverside_transit_agency,transit,,102.0,2022.0
3,715_riverside_transit_agency,717_riverside_transit_agency,0.816667,riverside_transit_agency,534170020_riverside_transit_agency,4.0,534170020_riverside_transit_agency_4,3.0,16_riverside_transit_agency,transit,,2022.0,85.0
4,717_riverside_transit_agency,772_riverside_transit_agency,0.633333,riverside_transit_agency,534170020_riverside_transit_agency,5.0,534170020_riverside_transit_agency_5,3.0,16_riverside_transit_agency,transit,,85.0,56.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...
755323,3656016333,298_riverside_transit_agency,0.216157,,,,,,,osm to transit,,,2413.0
755324,845_riverside_transit_agency,54307124,0.334064,,,,,,,transit to osm,,2414.0,
755325,54307124,845_riverside_transit_agency,0.334064,,,,,,,osm to transit,,,2414.0
755326,2144_riverside_transit_agency,412218965,0.484858,,,,,,,transit to osm,,2415.0,


In [35]:
combined_net = pdna.Network(urbanaccess_net.net_nodes["x"],
                               urbanaccess_net.net_nodes["y"],
                               urbanaccess_net.net_edges["from_int"],
                               urbanaccess_net.net_edges["to_int"],
                               urbanaccess_net.net_edges[["weight"]],
                               twoway=False)


ValueError: Buffer dtype mismatch, expected 'long' but got 'double'
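
This error is what one would expect if the node id columns reach pandana as floats. In the `net_edges` output above, `from_int` and `to_int` print as values like `2232.0` and some rows are empty, and a column containing missing values cannot be held as a plain NumPy integer. Below is a small diagnostic sketch with an invented three-row table (the column names match the output above, the values are only illustrative). Whether dropping or repairing the incomplete rows is the right fix depends on why the ids are missing.

```python
import numpy as np
import pandas as pd

# Invented edges: the second row has no integer id for its origin node
edges = pd.DataFrame(
    {
        "from_int": [2232.0, np.nan, 85.0],
        "to_int": [1395.0, 2413.0, 56.0],
        "weight": [1.0, 0.216157, 0.633333],
    }
)

# Missing values force the id column to float64
print(edges["from_int"].dtype)

# Count the rows that would need attention before building a network
incomplete = edges[["from_int", "to_int"]].isna().any(axis=1)
print(int(incomplete.sum()), "of", len(edges), "edges lack an integer id")

# Only complete rows can be cast to plain int64
complete = edges.loc[~incomplete].astype({"from_int": "int64", "to_int": "int64"})
print(complete.dtypes)
```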

In [25]:
urbanaccess_net.

Unnamed: 0,from,to,weight
0,749_riverside_transit_agency,714_riverside_transit_agency,1.000000
1,714_riverside_transit_agency,771_riverside_transit_agency,0.533333
2,771_riverside_transit_agency,715_riverside_transit_agency,0.433333
3,715_riverside_transit_agency,717_riverside_transit_agency,0.816667
4,717_riverside_transit_agency,772_riverside_transit_agency,0.633333
...,...,...,...
8380,7_riverside_transit_agency,549_riverside_transit_agency,11.483333
8381,549_riverside_transit_agency,550_riverside_transit_agency,3.516667
8382,1037_riverside_transit_agency,1100_riverside_transit_agency,0.400000
8383,1100_riverside_transit_agency,1038_riverside_transit_agency,17.000000


In [14]:
h = pd.HDFStore('../data/40140.h5')

In [6]:
h['two_way'][0]

True

In [37]:
twoway = pd.Series({0: True})

In [47]:
h.get('two_way')

True

In [42]:
pdna.Network.from_hdf5('../data/combined_net.h5')

KeyError: 'No object named two_way in the file'
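
The error indicates that `from_hdf5` looks for a `two_way` entry that the saved file does not contain. pandana's loader appears to read four keys (`nodes`, `edges`, `two_way`, `impedance_names`); treat that list as an assumption to verify against the installed version. The hypothetical helper below reports which of them are absent. It works on anything that supports `in`, such as an open `pd.HDFStore` or, as here, a plain dict standing in for one.

```python
# Assumed key list; check against the installed pandana version
EXPECTED_KEYS = ("nodes", "edges", "two_way", "impedance_names")


def missing_keys(store, expected=EXPECTED_KEYS):
    """Return the expected keys that are not present in the store."""
    return [key for key in expected if key not in store]


# A dict standing in for a store that holds only nodes and edges
fake_store = {"nodes": object(), "edges": object()}
print(missing_keys(fake_store))
```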

In [None]:
# read in the pre-saved OSM network using pandana

osm_network = pdna.Network.from_hdf5("../data/40140.h5")

In [25]:
pdna.Network(h.nodes['x'], h.nodes['y'], h.edges['from'], h.edges['to'], h.edges[['distance']])

<pandana.network.Network at 0x7fc732910b90>

In [29]:
urbanaccess_net.net_edges

Unnamed: 0,from,to,weight,unique_agency_id,unique_trip_id,sequence,edge_id,route_type,unique_route_id,net_type,distance,from_int,to_int
0,749_riverside_transit_agency,714_riverside_transit_agency,1.000000,riverside_transit_agency,534170020_riverside_transit_agency,1.0,534170020_riverside_transit_agency_1,3.0,16_riverside_transit_agency,transit,,2232,1395
1,714_riverside_transit_agency,771_riverside_transit_agency,0.533333,riverside_transit_agency,534170020_riverside_transit_agency,2.0,534170020_riverside_transit_agency_2,3.0,16_riverside_transit_agency,transit,,1395,102
2,771_riverside_transit_agency,715_riverside_transit_agency,0.433333,riverside_transit_agency,534170020_riverside_transit_agency,3.0,534170020_riverside_transit_agency_3,3.0,16_riverside_transit_agency,transit,,102,2022
3,715_riverside_transit_agency,717_riverside_transit_agency,0.816667,riverside_transit_agency,534170020_riverside_transit_agency,4.0,534170020_riverside_transit_agency_4,3.0,16_riverside_transit_agency,transit,,2022,85
4,717_riverside_transit_agency,772_riverside_transit_agency,0.633333,riverside_transit_agency,534170020_riverside_transit_agency,5.0,534170020_riverside_transit_agency_5,3.0,16_riverside_transit_agency,transit,,85,56
...,...,...,...,...,...,...,...,...,...,...,...,...,...
755323,3656016333,298_riverside_transit_agency,0.216157,,,,,,,osm to transit,,,2413
755324,845_riverside_transit_agency,54307124,0.334064,,,,,,,transit to osm,,2414,
755325,54307124,845_riverside_transit_agency,0.334064,,,,,,,osm to transit,,,2414
755326,2144_riverside_transit_agency,412218965,0.484858,,,,,,,transit to osm,,2415,


In [30]:
urbanaccess_net.net_nodes["x"].dtype

dtype('float64')

In [31]:
urbanaccess_net.net_nodes["y"].dtype

dtype('float64')

In [32]:
urbanaccess_net.net_edges["from_int"].dtype

Int64Dtype()

In [33]:
urbanaccess_net.net_edges["to_int"].dtype

Int64Dtype()

In [34]:
urbanaccess_net.net_edges["weight"].dtype

dtype('float64')

In [35]:
h.edges['distance'].dtype

dtype('float64')