This analysis quantifies transit accessibility across several dimensions: travel mode (walking, driving, public transit); institution type (e.g. private nonprofit, public four-year, public two-year); and campus type (branch, main, or all campuses of a given institution). Accessibility is further characterized according to demographic attributes available in Census data, such as income, educational attainment, and race.
To determine the accessibility of a given campus, we begin by generating isochrones for each of the travel modes examined. An isochrone is a polygon representing all areas reachable by a typical traveler in a given amount of travel time, using the campus as a starting point. We use a travel time of 30 minutes for isochrones across all modes. Each mode requires a different isochrone creation method.
Walking and transit isochrones are generated using the OpenTripPlanner software package (OTP); specifically, the Docker image of the project published by geospatial analysis firm Urbica. OTP is mature software that is used in a variety of commercial contexts. Primarily designed to facilitate multimodal trip planning by users of public transit systems, OTP accounts for walking time between stops, transfers between lines, and numerous other parameters, with reasonable defaults for each. It is also capable of generating isochrones for walking and transit modes that take these various factors into account.
OTP requires two inputs to answer trip planning queries about a given area: schedule information for the transit system; and data about the area road network. Most large American transit agencies publish their schedule information electronically using the General Transit Feed Specification (GTFS), which includes information about transit stops (including their location), lines, and schedules. The Metropolitan Transit Authority of Harris County (“METRO Houston”) is the Houston area’s primary transit agency, operating bus and light rail lines serving more than 9,000 stops. METRO Houston makes its schedule data freely available in the GTFS format on a regular basis. To avoid the effects of temporary service reductions associated with the COVID-19 emergency, we employ a GTFS snapshot of schedule data published by METRO Houston on January 10, 2020. For the road network, we extract a portion of OpenStreetMap using the osmium tool (osmium extract --strategy smart --bbox -97.2,27.84,-93.75,32.5 -O -o houston.osm.pbf input.osm.pbf).
Once the data is loaded into an instance of OTP, we query the instance API for walking and transit isochrones for each campus in our dataset, using 4:30pm CST on Wednesday, January 15, 2020 as a start time. This time was chosen to represent a typical pre-COVID weekday commute.
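
As an illustration of what such a query might look like, the following minimal sketch builds an isochrone request URL in the style of the OTP 1.x REST API. The host, the router name "default", the date and time string formats, and the sample coordinates are assumptions for illustration and are not taken from our pipeline; the parameter names (fromPlace, mode, date, time, cutoffSec) follow the OTP 1.x isochrone endpoint as we understand it.

```python
from urllib.parse import urlencode


def otp_isochrone_url(base_url, lat, lon, modes, date, time, cutoff_sec=1800):
    """Build an OTP 1.x-style isochrone URL for a single origin point.

    cutoff_sec=1800 corresponds to the 30-minute travel window.
    The router id "default" is an assumption about the OTP deployment.
    """
    params = {
        "fromPlace": f"{lat},{lon}",   # OTP expects latitude first
        "mode": ",".join(modes),       # e.g. WALK or WALK,TRANSIT
        "date": date,
        "time": time,
        "cutoffSec": cutoff_sec,
    }
    return f"{base_url}/otp/routers/default/isochrone?{urlencode(params)}"


# Hypothetical campus location; not a point from the dataset.
url = otp_isochrone_url(
    "http://localhost:8080", 29.72, -95.34,
    modes=["WALK", "TRANSIT"], date="2020-01-15", time="4:30pm",
)
print(url)
```

Requesting the walking isochrone would use modes=["WALK"] with the same origin and start time.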
Generating accurate driving isochrones requires not only road network geometry but also an empirical model of observed speeds that reflects road congestion and other factors. Only driving isochrones require this additional context: transit agency schedules, as embodied in GTFS files, already account for such conditions, and walking is typically unaffected by traffic. Obtaining or building a road dataset containing an accurate speed model is a significant undertaking. We instead employ a commercial isochrone API from a vendor that has already built such a model. The Mapbox platform is employed by hundreds of millions of users in a typical month, and its directions service is well-regarded. The same algorithmic model powers its isochrone capability, which we query to generate a 30-minute driving isochrone for each campus in our dataset.
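
A request of this kind can be sketched as follows. This is a minimal illustration, assuming the public Mapbox Isochrone API URL layout (profile, then longitude,latitude, with a contours_minutes parameter); the choice of the mapbox/driving profile, the placeholder token, and the sample coordinates are assumptions rather than details of our pipeline.

```python
from urllib.parse import urlencode


def mapbox_isochrone_url(lon, lat, token, minutes=30, profile="mapbox/driving"):
    """Build a Mapbox Isochrone API URL for one origin point.

    Note that Mapbox expects longitude first, unlike OTP.
    """
    query = urlencode({
        "contours_minutes": minutes,  # 30-minute travel window
        "polygons": "true",           # ask for polygons rather than linestrings
        "access_token": token,
    })
    return f"https://api.mapbox.com/isochrone/v1/{profile}/{lon},{lat}?{query}"


# Hypothetical campus location and placeholder token.
drive_url = mapbox_isochrone_url(-95.34, 29.72, token="YOUR_MAPBOX_TOKEN")
print(drive_url)
```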
All isochrones and accompanying metadata are then loaded into PostGIS, an open source spatial database that facilitates geospatial analysis.
With isochrones in hand, the task of assessing the population within 30 minutes of a given campus at first seems solved. Visual inspection of the data quickly reveals a problem: Census tract data, which we rely upon to understand the population’s characteristics, is far coarser than our isochrones. In many cases, an isochrone’s periphery reaches only accessible commercial areas that are unlikely to house many of the residents of a given tract. In other cases, it covers parks or other undeveloped areas.
We require a means of improving the spatial resolution of Census data. Fortunately, the Texas Legislature’s 1979 "Peveto Bill" tax reform implemented a system of appraisal districts that provides a path forward. Texas counties maintain geospatial records of taxable parcels of land and information about each parcel’s tax status. The tax data includes a statewide classification scheme designating the manner in which the parcel is used, e.g. residential, commercial, agricultural, or industrial. In some cases the parcel and tax datasets exist in a single unified file; more typically, they exist as independent files that must be joined using data formatting rules specific to the county. Through access to open data portals and a series of public records requests, we obtained parcel and tax data for Austin, Brazoria, Brazos, Chambers, Colorado, Fort Bend, Galveston, Grimes, Harris, Liberty, Matagorda, Montgomery, San Jacinto, Walker, Waller, Washington, and Wharton counties. We then constructed Python scripts to join the datasets in a manner that revealed the land use classification for each parcel. Although the land classification scheme is uniform across Texas, its application is not. Based upon visual comparison with satellite data, we filter parcels to a residential set by using all state codes beginning with A or B, as well as the codes F1M, M1 and M3, which account for some mobile homes. We make an exception for Washington County, where we use the codes A, B and M to reflect the idiosyncratic patterns of land code application we observe there.
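
The residential filter described above can be sketched as a small predicate. This is an illustrative sketch rather than the project's actual script: the function name and argument names are hypothetical, and it assumes that the Washington County rule matches codes by prefix (A, B or M), which the text does not state explicitly.

```python
RESIDENTIAL_PREFIXES = ("A", "B")
RESIDENTIAL_EXACT = {"F1M", "M1", "M3"}   # some mobile homes
WASHINGTON_PREFIXES = ("A", "B", "M")     # assumption: prefix match


def is_residential(state_code, county):
    """Return True if a parcel's state land use code counts as residential."""
    code = (state_code or "").strip().upper()
    if not code:
        return False  # missing codes are treated as non-residential
    if county.strip().lower() == "washington":
        return code.startswith(WASHINGTON_PREFIXES)
    return code.startswith(RESIDENTIAL_PREFIXES) or code in RESIDENTIAL_EXACT
```

For example, is_residential("A1", "Harris") is true, while is_residential("F1", "Harris") is false because only the exact code F1M is included from the F series.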
We can now identify the subset of a given Census tract devoted to residential use. For each residential parcel, we identify the Census tract in which it lies, calculate the parcel’s land area, and assign it a pro rata portion of the tract’s population equal to the parcel’s share of the total residential parcel land area in the tract.
An example serves to illustrate this method. A hypothetical Census tract occupies 100 square meters. All of this area is divided into parcels. 50 square meters are occupied by parcels with a residential land use code. That residential portion is divided into four parcels: one parcel of 20 square meters and three parcels of 10 square meters each. Census data indicates that 20 people live in this tiny tract, 5 of whom are Black and 15 of whom are white. In this scenario, the large parcel will be assigned 2 Black people and 6 white people and each of the three smaller parcels will be assigned 1 Black person and 3 white people. Note that our method includes no requirement that a tract’s population be assigned to parcels in whole-person increments. Fractional persons may be assigned to parcels and typically are: if, in the above example, the population of the Census tract were halved, the smallest parcels would each be assigned 0.5 Black people and 1.5 white people.
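
The worked example can be reproduced with a short sketch of the pro rata rule. The function and variable names are hypothetical, and the code assumes that parcel areas and tract counts have already been grouped by tract.

```python
def allocate_population(parcel_areas, tract_counts):
    """Split a tract's population across its residential parcels by land area.

    parcel_areas: {parcel_id: residential land area}
    tract_counts: {group_name: number of people in the tract}
    Returns {parcel_id: {group_name: fractional persons}}.
    """
    total_area = sum(parcel_areas.values())
    if total_area <= 0:
        # No residential area to allocate to; assign nobody.
        return {pid: {g: 0.0 for g in tract_counts} for pid in parcel_areas}
    return {
        pid: {g: count * area / total_area for g, count in tract_counts.items()}
        for pid, area in parcel_areas.items()
    }


# The hypothetical tract above: 50 square meters of residential parcels.
areas = {"large": 20, "small_1": 10, "small_2": 10, "small_3": 10}
full = allocate_population(areas, {"black": 5, "white": 15})
half = allocate_population(areas, {"black": 2.5, "white": 7.5})
print(full["large"], full["small_1"], half["small_1"])
```

In symbols, a parcel of area a in a tract with total residential area A receives n * a / A persons from a group of size n, so the allocations across a tract's parcels sum back to the tract total.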
Accessibility analysis is then performed by calculating the centroid of each parcel and, for each isochrone, determining the set of parcel centroids that lie within it. The assigned population associated with each of those parcels is then summed for the isochrone and summary statistics are generated.
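
The logic of this step can be illustrated in plain Python. This is only a sketch of the idea, not the PostGIS-based analysis itself: it assumes parcel centroids are already computed, treats an isochrone as a single ring without holes in planar coordinates, and uses a standard ray-casting test. All names are hypothetical.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: is (x, y) inside the polygon given by ring?"""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can cross the ray.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def population_within(isochrone_ring, parcels):
    """Sum assigned population over parcels whose centroid is in the isochrone.

    parcels: iterable of ((x, y) centroid, {group_name: fractional persons}).
    """
    totals = {}
    for (cx, cy), counts in parcels:
        if point_in_polygon(cx, cy, isochrone_ring):
            for group, value in counts.items():
                totals[group] = totals.get(group, 0.0) + value
    return totals


# Toy data: a square isochrone and three parcels, one of them outside.
square = [(0, 0), (10, 0), (10, 10), (0, 10)]
toy_parcels = [
    ((5, 5), {"black": 2.0, "white": 6.0}),
    ((15, 5), {"black": 1.0, "white": 3.0}),
    ((1, 9), {"black": 1.0, "white": 3.0}),
]
totals = population_within(square, toy_parcels)
print(totals)
```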
Limitations:
- The method by which we assign Census tract persons to parcels is clearly imperfect: it does not account for multistory residences or other obvious variations in housing density. Nor can it account for the actual distribution of a tract’s population and their traits, which will never be perfectly homogeneous. However, we believe it substantially improves the accuracy of our accessibility analysis.
- Some neighboring counties offer limited transit services that may expand the reach of METRO Houston accessibility. Without GTFS data, we cannot account for them.
- Our analysis creates isochrones from a single geographic point for each campus. For some large campuses, this may not account for cross-campus travel that consumes a meaningful portion of the 30-minute travel window.
Source code and isochrones are available at https://github.com/sbma44/houston-parcels