In [1]:
import geopandas as gpd
import pandas as pd
import os, time

In [2]:
analysis_crs = "EPSG:26910"

## README
(Asana task: https://app.asana.com/1/11860278793487/project/1159042832247728/task/1209932157759733?focus=true)
* Step 1: prepare the Overburdened Communities spatial data. The raw data consists of scattered shapes and does not cover the entire region. To be compatible with downstream scripts, this step dissolves the raw shapes by county and fills the remainder of each county as non-Overburdened Communities.
* Step 2: assign travel model roadway links to Overburdened (and non-Overburdened) shapes via spatial overlay. If a link intersects more than one shape instead of falling within a single shape (e.g. it crosses a county boundary or an Overburdened Communities boundary), all link-shape pairs are kept, and the share of link length in each shape is calculated. This step calls the generic script `\travel-model-one-master\utilities\cube-to-shapefile\correspond_link_to_TAZ.py`.
* Step 3: join links with emission rates from the EMFAC model. Emission rates can vary by county, speed, and time-of-day. In PBA50, different emission rates were assumed for CARE Communities (the predecessor of Overburdened Communities) versus non-CARE Communities. In PBA50+, the same emission rates were assumed for Overburdened and non-Overburdened Communities. This step calls the generic script `join_networklinks_to_emissionrates.py`.
* Step 4 (not included in this notebook): the output of Step 3 is loaded into a Tableau dashboard to calculate emissions from VMT. Links that intersect multiple shapes are joined to multiple emission rates; VMT is assumed to be distributed evenly along each link, so the VMT assigned to each shape is proportional to the length share.
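
The length-share logic in Steps 2 and 4 can be illustrated with a small, self-contained sketch. It is not the code of `correspond_link_to_TAZ.py` or of the Tableau dashboard. The sample links, the `link_vmt` values, and the helper names are made up for illustration, and the sketch assumes the share is a link's length inside a shape divided by the sum of that link's piece lengths.

```python
from collections import defaultdict

# Hypothetical link-shape pairs, shaped like the Step 2 crosswalk:
# one row per (link, shape) with the link's length inside that shape.
# Link (1, 2) crosses from non-overburdened Alameda ("001")
# into the overburdened part ("001_Overburdened").
pairs = [
    {"A": 1, "B": 2, "COUNTY_OBC": "001", "linkOBC_mi": 0.75},
    {"A": 1, "B": 2, "COUNTY_OBC": "001_Overburdened", "linkOBC_mi": 0.25},
    {"A": 3, "B": 4, "COUNTY_OBC": "013", "linkOBC_mi": 0.5},
]

# Hypothetical daily VMT per link, keyed by (A, B).
link_vmt = {(1, 2): 1000.0, (3, 4): 400.0}


def add_length_share(pairs):
    """Return copies of the rows with linkOBC_share = piece length / total length of the link."""
    total_mi = defaultdict(float)
    for row in pairs:
        total_mi[(row["A"], row["B"])] += row["linkOBC_mi"]
    return [
        dict(row, linkOBC_share=row["linkOBC_mi"] / total_mi[(row["A"], row["B"])])
        for row in pairs
    ]


def vmt_by_shape(pairs_with_share, link_vmt):
    """Split each link's VMT across shapes in proportion to the length share."""
    vmt = defaultdict(float)
    for row in pairs_with_share:
        vmt[row["COUNTY_OBC"]] += link_vmt[(row["A"], row["B"])] * row["linkOBC_share"]
    return dict(vmt)


shares = add_length_share(pairs)
print(vmt_by_shape(shares, link_vmt))
```

Because the shares of each link sum to one, the VMT summed over shapes equals the VMT summed over links, which is a useful check on the real crosswalk as well.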

### step 1: prepare Overburdened Communities spatial data

In [3]:
# source data

# Bay Area counties shapefile (unclipped version to be consistent with the Overburdened Communities shapefile)
BayareaCounties_DIR = r"M:\Data\GIS layers\Counties\bay_counties.shp"
# Overburdened Communities shapefile
Overburden_DIR = r"M:\Application\PBA50Plus_Data_Processing\OverburdenedCommunities_analysis\PBA50plus\spatial_data\baaqmd_overburdened_communities_2022.geojson"


In [4]:
# load overburdened communities shapefile and dissolve to county
Overburden_gpd = gpd.read_file(Overburden_DIR)
display(Overburden_gpd.head())

print('crs: {}'.format(Overburden_gpd.crs))
Overburden_gpd = Overburden_gpd.to_crs(analysis_crs)

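# take an explicit copy so the edit to 'County' below does not write to a slice of Overburden_gpd
Overburden_gpd_dissolve = Overburden_gpd[['County', 'Acres', 'geometry']].copy()
# strip the ' County' suffix so the names match LOOKUP_COUNTY below
Overburden_gpd_dissolve['County'] = Overburden_gpd_dissolve['County'].str.replace(' County', '', regex=False)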
Overburden_gpd_dissolve = Overburden_gpd_dissolve.dissolve(by='County', aggfunc='sum')
Overburden_gpd_dissolve = Overburden_gpd_dissolve.reset_index()
# add a field 'COUNTY_OBC' to identify whether overburdened
LOOKUP_COUNTY = pd.DataFrame({
    "county_census": ["001", "013", "041", "055", "075", "081", "085", "095", "097"],
    "County": ["Alameda", "Contra Costa", "Marin", "Napa", "San Francisco", "San Mateo", "Santa Clara", "Solano", "Sonoma"]
})
Overburden_gpd_dissolve = Overburden_gpd_dissolve.merge(LOOKUP_COUNTY, on=['County'], how='left')
Overburden_gpd_dissolve['COUNTY_OBC'] = Overburden_gpd_dissolve['county_census']+'_Overburdened'
display(Overburden_gpd_dissolve)

Unnamed: 0,County,Acres,MERGE_SRC,Tract,ZIP,Population,CIscore,CIscoreP,Ozone,Ozone_Pctl,...,Elderly_65,Hispanic,White,African_Am,Native_Ame,Asian_Amer,Pacific_Is,Other_Mult,ApproxLoc,geometry
0,Alameda County,3602.121227,CES4_BAAQMD_Top30_1000ft_buffer,,,,,,,,...,,,,,,,,,,"POLYGON ((579304.309 4169399.300, 579313.518 4..."
1,Alameda County,421.181924,CES4_BAAQMD_Top30_1000ft_buffer,,,,,,,,...,,,,,,,,,,"POLYGON ((580858.161 4170932.665, 580866.229 4..."
2,Alameda County,0.052156,CES4_BAAQMD_Top30_1000ft_buffer,,,,,,,,...,,,,,,,,,,"POLYGON ((571463.586 4176385.968, 571607.846 4..."
3,Alameda County,0.017112,CES4_BAAQMD_Top30_1000ft_buffer,,,,,,,,...,,,,,,,,,,"POLYGON ((572294.970 4176583.994, 572304.904 4..."
4,Alameda County,0.200896,CES4_BAAQMD_Top30_1000ft_buffer,,,,,,,,...,,,,,,,,,,"POLYGON ((571668.158 4176502.620, 571750.271 4..."


crs: epsg:26910


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  super().__setitem__(key, value)


Unnamed: 0,County,geometry,Acres,county_census,COUNTY_OBC
0,Alameda,"MULTIPOLYGON (((558910.259 4182286.290, 558859...",43199.272887,1,001_Overburdened
1,Contra Costa,"MULTIPOLYGON (((560357.571 4194632.540, 560040...",80776.79577,13,013_Overburdened
2,Marin,"MULTIPOLYGON (((550858.885 4189510.442, 550852...",2325.832268,41,041_Overburdened
3,Napa,"MULTIPOLYGON (((562034.943 4223568.718, 562050...",950.730762,55,055_Overburdened
4,San Francisco,"MULTIPOLYGON (((552467.560 4173620.357, 552467...",6552.00821,75,075_Overburdened
5,San Mateo,"MULTIPOLYGON (((560936.560 4158466.149, 560946...",10221.136205,81,081_Overburdened
6,Santa Clara,"MULTIPOLYGON (((603282.131 4128689.939, 603289...",39853.841743,85,085_Overburdened
7,Solano,"MULTIPOLYGON (((555398.858 4221425.380, 552820...",183916.887402,95,095_Overburdened
8,Sonoma,"MULTIPOLYGON (((552191.567 4222399.545, 552176...",8053.161606,97,097_Overburdened


In [5]:
# load Bay Area counties shapefile
counties_gpd = gpd.read_file(BayareaCounties_DIR)
display(counties_gpd)

print('crs: {}'.format(counties_gpd.crs))
counties_gpd = counties_gpd.to_crs(analysis_crs)

# get areas outside of the overburdened communities
non_Overburdened = gpd.overlay(counties_gpd, Overburden_gpd_dissolve, how='difference')
non_Overburdened.rename(columns={'NAME': 'County'}, inplace=True)
# use the 3-digit county id for field 'COUNTY_OBC' to be consistent with a subsequent script
non_Overburdened['COUNTY_OBC'] = non_Overburdened['GEOID'].apply(lambda x: x[2:])
display(non_Overburdened)

# put together
Overburden_all_gpd = pd.concat([Overburden_gpd_dissolve, non_Overburdened], ignore_index=True)

# write out the file for QAQC
Overburden_all_gpd.drop(columns=['GEOID'], inplace=True)
Overburden_all_gpd_DIR = r"M:\Application\PBA50Plus_Data_Processing\OverburdenedCommunities_analysis\PBA50plus\spatial_data\overburdened_communities_all.shp"
Overburden_all_gpd.to_file(Overburden_all_gpd_DIR)


Unnamed: 0,GEOID,NAME,geometry
0,6097,Sonoma,"POLYGON ((505677.166 4240651.574, 505674.283 4..."
1,6075,San Francisco,"MULTIPOLYGON (((548043.614 4188095.410, 548789..."
2,6041,Marin,"POLYGON ((518783.065 4193293.177, 518725.963 4..."
3,6055,Napa,"POLYGON ((546614.872 4284199.845, 546615.561 4..."
4,6095,Solano,"POLYGON ((581757.266 4241280.771, 581729.692 4..."
5,6013,Contra Costa,"POLYGON ((564382.937 4195443.795, 564368.696 4..."
6,6085,Santa Clara,"POLYGON ((584827.591 4117542.151, 584830.219 4..."
7,6001,Alameda,"POLYGON ((563387.415 4173575.695, 563307.232 4..."
8,6081,San Mateo,"POLYGON ((536452.983 4160134.450, 536480.048 4..."


crs: epsg:26910


  non_Overburdened = gpd.overlay(counties_gpd, Overburden_gpd_dissolve, how='difference')


Unnamed: 0,GEOID,County,geometry,COUNTY_OBC
0,6097,Sonoma,"MULTIPOLYGON (((505651.739 4240632.916, 505607...",97
1,6075,San Francisco,"MULTIPOLYGON (((550807.711 4189452.701, 550872...",75
2,6041,Marin,"POLYGON ((518621.121 4193745.487, 518549.565 4...",41
3,6055,Napa,"MULTIPOLYGON (((554297.731 4223589.890, 554287...",55
4,6095,Solano,"MULTIPOLYGON (((622951.713 4241465.567, 622922...",95
5,6013,Contra Costa,"MULTIPOLYGON (((564323.298 4195414.144, 564313...",13
6,6085,Santa Clara,"MULTIPOLYGON (((584837.832 4117591.517, 584840...",85
7,6001,Alameda,"MULTIPOLYGON (((563002.964 4174282.866, 562920...",1
8,6081,San Mateo,"MULTIPOLYGON (((536723.529 4161443.601, 536806...",81


  Overburden_all_gpd.to_file(Overburden_all_gpd_DIR)
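
The `COUNTY_OBC` identifiers built in step 1 take two forms: the 3-digit county code for non-overburdened areas, and the same code with an `_Overburdened` suffix for overburdened areas. The sketch below shows that convention in isolation. The helper names are hypothetical, and it assumes `GEOID` is a 5-character string such as `"06001"` (2-digit state code followed by a 3-digit county code), which is what the `x[2:]` slice above implies.

```python
from typing import Tuple

OVERBURDENED_SUFFIX = "_Overburdened"


def county_obc_label(geoid: str, overburdened: bool) -> str:
    """Build a COUNTY_OBC id from a state+county GEOID string (assumed 5 digits)."""
    if len(geoid) != 5 or not geoid.isdigit():
        raise ValueError(f"expected a 5-digit GEOID string, got {geoid!r}")
    county = geoid[2:]  # drop the 2-digit state prefix, keep leading zeros
    return county + OVERBURDENED_SUFFIX if overburdened else county


def parse_county_obc(label: str) -> Tuple[str, bool]:
    """Split a COUNTY_OBC id back into (county code, is_overburdened)."""
    if label.endswith(OVERBURDENED_SUFFIX):
        return label[: -len(OVERBURDENED_SUFFIX)], True
    return label, False


print(county_obc_label("06001", overburdened=True))
print(county_obc_label("06097", overburdened=False))
print(parse_county_obc("013_Overburdened"))
```

Keeping the codes as strings preserves leading zeros, so `"001"` and `"001_Overburdened"` still share the same county prefix when they are joined to county-level data later.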


### step 2: create link to Overburdened Communities crosswalk

In [6]:
# 2023 base year
%run X:\travel-model-one-master\utilities\geographies\create_geography_overlays\correspond_link_to_TAZ.py "M:\Application\Model One\RTP2025\INPUT_DEVELOPMENT\Networks\BlueprintNetworks_v35\net_2023_Baseline\shapefiles\network_links.shp" "M:\Application\Model One\RTP2025\INPUT_DEVELOPMENT\Networks\BlueprintNetworks_v35\net_2023_Baseline\shapefiles\link_to_COUNTY_Overburdened.csv" --shapefile "M:\Application\PBA50Plus_Data_Processing\OverburdenedCommunities_analysis\PBA50plus\spatial_data\overburdened_communities_all.shp" --shp_id COUNTY_OBC --linkshp_mi linkOBC_mi --linkshp_share linkOBC_share
# 2050 No Project
%run X:\travel-model-one-master\utilities\geographies\create_geography_overlays\correspond_link_to_TAZ.py "M:\Application\Model One\RTP2025\INPUT_DEVELOPMENT\Networks\BlueprintNetworks_v35\net_2050_Baseline\shapefiles\network_links.shp" "M:\Application\Model One\RTP2025\INPUT_DEVELOPMENT\Networks\BlueprintNetworks_v35\net_2050_Baseline\shapefiles\link_to_COUNTY_Overburdened.csv" --shapefile "M:\Application\PBA50Plus_Data_Processing\OverburdenedCommunities_analysis\PBA50plus\spatial_data\overburdened_communities_all.shp" --shp_id COUNTY_OBC --linkshp_mi linkOBC_mi --linkshp_share linkOBC_share
# 2050 Blueprint and EIR Alternative 2
%run X:\travel-model-one-master\utilities\geographies\create_geography_overlays\correspond_link_to_TAZ.py "M:\Application\Model One\RTP2025\INPUT_DEVELOPMENT\Networks\BlueprintNetworks_v35\net_2050_Blueprint\shapefiles\network_links.shp" "M:\Application\Model One\RTP2025\INPUT_DEVELOPMENT\Networks\BlueprintNetworks_v35\net_2050_Blueprint\shapefiles\link_to_COUNTY_Overburdened.csv" --shapefile "M:\Application\PBA50Plus_Data_Processing\OverburdenedCommunities_analysis\PBA50plus\spatial_data\overburdened_communities_all.shp" --shp_id COUNTY_OBC --linkshp_mi linkOBC_mi --linkshp_share linkOBC_share
# 2050 EIR Alternative 1
%run X:\travel-model-one-master\utilities\geographies\create_geography_overlays\correspond_link_to_TAZ.py "M:\Application\Model One\RTP2025\INPUT_DEVELOPMENT\Networks\BlueprintNetworks_v36\net_2050_Alt1\shapefiles\network_links.shp" "M:\Application\Model One\RTP2025\INPUT_DEVELOPMENT\Networks\BlueprintNetworks_v36\net_2050_Alt1\shapefiles\link_to_COUNTY_Overburdened.csv" --shapefile "M:\Application\PBA50Plus_Data_Processing\OverburdenedCommunities_analysis\PBA50plus\spatial_data\overburdened_communities_all.shp" --shp_id COUNTY_OBC --linkshp_mi linkOBC_mi --linkshp_share linkOBC_share


Read 33,953 links from M:\Application\Model One\RTP2025\INPUT_DEVELOPMENT\Networks\BlueprintNetworks_v35\net_2023_Baseline\shapefiles\network_links.shp
network_links.crs=epsg:26910
network_links.crs=EPSG:26910
sum(network_links[link_mi]) = 4,614.15
Read 18 links from M:\Application\PBA50Plus_Data_Processing\OverburdenedCommunities_analysis\PBA50plus\spatial_data\overburdened_communities_all.shp
After intersecting links with shapefile, have 35,959 rows
check_df for intersect failure: check_df._merge.value_counts()=_merge
both          35959
left_only        60
right_only        0
Name: count, dtype: int64
unjoined_df len=60 type=<class 'geopandas.geodataframe.GeoDataFrame'>
           A      B  DISTANCE  SPDCLASS  CAPCLASS  LANES  TSIN  GL  USE  \
2839    1459   8625      0.01        56        56      7     1  10    1   
2843    1462  11918      0.01        56        56      7     1  10    1   
2845    1463  11896      0.01        56        56      7     1  10    1   
2851    1466   412
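
The four `%run` commands in this step differ only in the network version folder and the scenario folder. One way to avoid repeating the long paths is to build the argument list from those two values, as sketched below. The paths and flags are copied from the commands above; the `SCENARIOS` list and the `build_args` helper are hypothetical names introduced for illustration.

```python
from pathlib import PureWindowsPath

SCRIPT = PureWindowsPath(
    r"X:\travel-model-one-master\utilities\geographies\create_geography_overlays\correspond_link_to_TAZ.py"
)
NETWORKS_ROOT = PureWindowsPath(
    r"M:\Application\Model One\RTP2025\INPUT_DEVELOPMENT\Networks"
)
OBC_SHAPEFILE = PureWindowsPath(
    r"M:\Application\PBA50Plus_Data_Processing\OverburdenedCommunities_analysis\PBA50plus\spatial_data\overburdened_communities_all.shp"
)

# (network version folder, scenario folder), as used in the %run commands above
SCENARIOS = [
    ("BlueprintNetworks_v35", "net_2023_Baseline"),   # 2023 base year
    ("BlueprintNetworks_v35", "net_2050_Baseline"),   # 2050 No Project
    ("BlueprintNetworks_v35", "net_2050_Blueprint"),  # 2050 Blueprint and EIR Alternative 2
    ("BlueprintNetworks_v36", "net_2050_Alt1"),       # 2050 EIR Alternative 1
]


def build_args(version: str, scenario: str) -> list:
    """Return the script path followed by its arguments for one scenario."""
    shp_dir = NETWORKS_ROOT / version / scenario / "shapefiles"
    return [
        str(SCRIPT),
        str(shp_dir / "network_links.shp"),
        str(shp_dir / "link_to_COUNTY_Overburdened.csv"),
        "--shapefile", str(OBC_SHAPEFILE),
        "--shp_id", "COUNTY_OBC",
        "--linkshp_mi", "linkOBC_mi",
        "--linkshp_share", "linkOBC_share",
    ]


for version, scenario in SCENARIOS:
    print(build_args(version, scenario))
```

Outside a notebook, each list could be passed to `subprocess.run([sys.executable, *args], check=True)`; that way of invoking the script is an assumption, not something this notebook does.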

### step 3: join links to emission rates

In [None]:
%run join_networklinks_to_emissionrates.py RTP2025 OverBurdened --emfac_version EMFAC2017