# Locations of files

**Truck OD; NHTS Zones shapefile:**  https://nhts.ornl.gov/od  
  - NHTS Zone shapefile direct link:  https://nhts.ornl.gov/od/assets/data/NextGen_OD_Zone_ESRI_11152022.zip
  - Truck OD data direct link (also has data dictionary):  https://nhts.ornl.gov/od/assets/data/2020_Truck_OD_Annual_v2.zip

**NHTS Zones Shapefile:**  https://www.fhwa.dot.gov/policyinformation/analysisframework/04.cfm  
  - **NOTE:**  geopandas cannot read most of the files below directly from the downloaded zip.  See the code samples below for how to read them in.
  - Direct link to Version 1 OD Zone shapefile:  https://www.fhwa.dot.gov/policyinformation/analysisframework/zip/NextGen_NHTS_Shapefile_v3.zip  
    - The difference between this file and the NHTS Zone shapefile above seems to be the capitalization of the field names.  See code below.
  - Direct link to Version 2 OD Zone shapefile:  https://www.fhwa.dot.gov/policyinformation/analysisframework/zip/Version_2_zone_ESRI.zip  
  - Direct link to Version 1 Zone to County shapefile: https://www.fhwa.dot.gov/policyinformation/analysisframework/zip/Revised_County_fhwa583_UMD.zip
  - Direct link to Version 2 Zone to County shapefile: https://www.fhwa.dot.gov/policyinformation/analysisframework/zip/County_version_2_ESRI.zip
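
Because the shapefiles usually sit in a folder inside the downloaded zip, it can help to list the archive members before building the `zip://...!...` path that the cells below pass to `gpd.read_file`. The following is a minimal sketch using only the standard library; `shapefile_uris_in_zip` is a hypothetical helper name, not something from geopandas, and it assumes the archive contains at least one `.shp` member.

```python
import zipfile


def shapefile_uris_in_zip(zip_path: str) -> list[str]:
    """Return one 'zip://<archive>!<member>' string per .shp file found in the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        # Keep only the .shp members; the .dbf/.shx/.prj sidecars are found next to them
        members = [name for name in archive.namelist() if name.lower().endswith(".shp")]
    return [f"zip://{zip_path}!{member}" for member in sorted(members)]


# Hypothetical usage (assumes the zip has already been downloaded into data_directory):
# uris = shapefile_uris_in_zip(data_directory + "NextGen_NHTS_Shapefile_v3.zip")
# gdf = gpd.read_file(uris[0])
```

Pointing at the `.shp` member itself rather than its folder is a choice made for this sketch; the cells below point at the folder instead.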

**FAF5:** https://faf.ornl.gov/faf5/ (see also https://www.bts.gov/faf)
  - Direct link to MS Access database for 2018-2022:  https://faf.ornl.gov/faf5/data/download_files/FAF5.5.1_2018-2022_access.zip
  - Direct link to MS Access database for 2017-2025:  https://faf.ornl.gov/faf5/data/download_files/FAF5.5.1_HiLoForecasts_access.zip

**FAF Highway Network**: main page is:  https://ops.fhwa.dot.gov/freight/freight_analysis/faf/
  -  Direct link to links and nodes:  https://ops.fhwa.dot.gov/freight/freight_analysis/faf/faf_highway_assignment_results/FAF5_Model_Highway_Network.zip
    - **NOTE:** Reading in the nodes and the links is a bit tricky.  See the sample code below.
  - Data dictionary (not very useful) and hyperlinks to the FAF network data:  
https://data-usdot.opendata.arcgis.com/datasets/usdot::freight-analysis-framework-faf5-regions/about  

**Census Economic data:**
  - Commodity Flow Survey data (not yet used, but it might help to determine intermodal/container flows): https://www.census.gov/data/datasets/2017/econ/cfs/historical-datasets.html  
  - Economic data used:  https://www2.census.gov/programs-surveys/economic-census/data/2017/sector00/EC1700BASIC.zip

**FIPS counties and their names:**  https://transition.fcc.gov/oet/info/maps/census/fips/fips.txt  

**County data**
  - CBSA to counties, although it seems to map only 1915 counties to CBSAs:  https://www2.census.gov/programs-surveys/metro-micro/geographies/reference-files/2023/delineation-files/list1_2023.xlsx  

  - US counties shapefile:  https://www.census.gov/cgi-bin/geo/shapefiles/index.php?year=2022&layergroup=Counties+%28and+equivalent%29 (click the download button), or browse the files directly at https://www2.census.gov/geo/tiger/TIGER2017/COUNTY/
    - This also has the county lat/long but I use the census data above instead:  https://gist.github.com/russellsamora/12be4f9f574e92413ea3f92ce1bc58e6  

In [36]:
import geopandas as gpd
import pandas as pd
import shapely
import numpy as np

pd.options.display.max_columns = None
pd_max_colwidth_original = pd.options.display.max_colwidth
# pd.options.display.max_colwidth = None
gpd.options.io_engine = "pyogrio"

In [2]:
data_directory = r'../../Truck_Areas/'

## Read in shapefiles for FAF zones and CBSA zones

In [23]:
# County to FAF zones, version one.  main page https://www.fhwa.dot.gov/policyinformation/analysisframework/04.cfm

shapefile_path = "zip://" + data_directory + "Revised_County_fhwa583_UMD.zip/Revised_County"
print(shapefile_path)
version_1_zone_by_county_esri_gdf = gpd.read_file(shapefile_path)
version_1_zone_by_county_esri_gdf

zip://../../Truck_Areas/Revised_County_fhwa583_UMD.zip/Revised_County


Unnamed: 0,ID,AREA,ID1,AREA1,CBSA,LSAD,NOT_MSABUT,COUNTY,COUNTY_FIP,STATE_ABB,STATEFP,CBSAFP2,CBSAFP2_ST,geometry
0,1,905.13,1,905.13,"Eufaula, AL-GA",M2,,Barbour,005,AL,01,RAL3,RAL3_AL,"POLYGON ((-85.74825 31.61805, -85.74803 31.619..."
1,2,673.48,2,673.48,"Troy, AL",M2,,Pike,109,AL,01,RAL3,RAL3_AL,"POLYGON ((-85.65767 31.88028, -85.65958 31.879..."
2,3,647.65,3,647.65,"Columbus, GA-AL",M1,,Russell,113,AL,01,17980,17980_AL,"POLYGON ((-85.05603 32.06305, -85.10498 32.062..."
3,5,653.15,5,653.15,,RS,,Bradley,011,AR,05,RAR2,RAR2_AR,"POLYGON ((-92.33084 33.70781, -92.28527 33.705..."
4,6,598.96,6,598.96,"Pine Bluff, AR",M1,,Cleveland,025,AR,05,38220,38220_AR,"POLYGON ((-91.97584 33.70441, -92.00634 33.704..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3137,3140,494.49,3229,494.49,"Raleigh, NC",M1,,Franklin,069,NC,37,39580,39580_NC,"POLYGON ((-78.25597 35.81812, -78.25649 35.820..."
3138,3141,916.27,3230,916.27,"Midland, TX",M1,,Martin,317,TX,48,33260,33260_TX,"POLYGON ((-102.21104 32.52324, -102.20270 32.5..."
3139,3142,376.55,3231,376.55,"Parkersburg-Vienna, WV",M1,,Wood,107,WV,54,37620,37620_WV,"POLYGON ((-81.74725 39.09538, -81.74545 39.098..."
3140,4,4516.88,4,4516.88,,RS,,Aleutians West,016,AK,02,RAK1,RAK1_AK,"MULTIPOLYGON (((-179.14511 51.26567, -179.1489..."
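
The table above keeps the state and county FIPS codes in separate columns (`STATEFP` and `COUNTY_FIP`). To join against county-level sources that use the combined five-digit code, the two can be concatenated after zero-padding. This is a minimal sketch; `county_geoid` is a hypothetical helper name, and it assumes the codes are either strings or integers that may have lost their leading zeros.

```python
from typing import Union


def county_geoid(state_fips: Union[str, int], county_fips: Union[str, int]) -> str:
    """Combine a 2-digit state FIPS and a 3-digit county FIPS into a 5-digit county code."""
    state = str(state_fips).zfill(2)    # restore leading zeros, e.g. 1 -> "01"
    county = str(county_fips).zfill(3)  # restore leading zeros, e.g. 5 -> "005"
    if len(state) != 2 or len(county) != 3 or not (state + county).isdigit():
        raise ValueError(f"unexpected FIPS codes: {state_fips!r}, {county_fips!r}")
    return state + county


# Barbour County, AL from the first row of the table above
print(county_geoid("01", "005"))  # prints 01005
```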


In [25]:
# FAF zones shapefiles, version one.  main page https://www.fhwa.dot.gov/policyinformation/analysisframework/04.cfm

shapefile_path = "zip://" + data_directory + "NextGen_NHTS_Shapefile_v3.zip!NextGen_NHTS_Shapefile"
print(shapefile_path)
NextGen_OD_Zone_ESRI_v3_gdf = gpd.read_file(shapefile_path)
NextGen_OD_Zone_ESRI_v3_gdf

zip://../../Truck_Areas/NextGen_NHTS_Shapefile_v3.zip!NextGen_NHTS_Shapefile


Unnamed: 0,OBJECTID,Shape_Leng,STATEFP,STATE_ABB,Zone_ID,Zone_Name,Shape_Area,CBSAFP2,geometry
0,1,3.818555,48,TX,10180_TX,"Abilene, TX",0.684971,10180,"POLYGON ((-99.11429 32.51481, -99.11405 32.500..."
1,2,2.326638,39,OH,10420_OH,"Akron, OH",0.256752,10420,"POLYGON ((-81.00332 41.34786, -81.00319 41.347..."
2,3,4.623962,13,GA,10500_GA,"Albany, GA",0.481920,10500,"POLYGON ((-83.80272 31.80358, -83.79873 31.801..."
3,4,4.740245,41,OR,10540_OR,"Albany, OR",0.675868,10540,"POLYGON ((-121.80001 44.68342, -121.80093 44.6..."
4,5,5.057234,36,NY,10580_NY,"Albany-Schenectady-Troy, NY",0.820146,10580,"POLYGON ((-73.27429 42.94365, -73.27439 42.942..."
...,...,...,...,...,...,...,...,...,...
578,579,8.490674,54,WV,RWV2_WV,WV-NonMSA areas (NW),1.445256,RWV2,"POLYGON ((-80.51934 39.72140, -80.45793 39.721..."
579,580,7.434668,54,WV,RWV3_WV,WV-NonMSA areas (S),0.839708,RWV3,"POLYGON ((-81.93251 38.02536, -81.96155 38.006..."
580,581,19.886006,56,WY,RWY1_WY,WY-NonMSA areas (E),10.316506,RWY1,"POLYGON ((-104.05770 44.99743, -104.05591 44.8..."
581,582,16.301682,56,WY,RWY2_WY,WY-NonMSA areas (NW),9.289463,RWY2,"POLYGON ((-108.24943 44.99946, -108.24934 44.9..."


In [22]:
# FAF zones shapefiles, version one.  main page https://nhts.ornl.gov/od

shapefile_path = "zip://" + data_directory + "NextGen_OD_Zone_ESRI_11152022.zip!NextGen_OD_Zone_ESRI_11152022"
print(shapefile_path)
NextGen_OD_Zone_ESRI_11152022_gdf = gpd.read_file(shapefile_path)
NextGen_OD_Zone_ESRI_11152022_gdf

zip://../../Truck_Areas/NextGen_OD_Zone_ESRI_11152022.zip!NextGen_OD_Zone_ESRI_11152022


Unnamed: 0,OBJECTID,shape_leng,statefp,state_abb,zone_id,zone_name,shape_area,CBSAFP2,geometry
0,1,3.818555,48,TX,10180_TX,"Abilene, TX",0.684971,10180,"POLYGON ((-99.11429 32.51481, -99.11405 32.500..."
1,2,2.326638,39,OH,10420_OH,"Akron, OH",0.256752,10420,"POLYGON ((-81.00332 41.34786, -81.00319 41.347..."
2,3,4.623962,13,GA,10500_GA,"Albany, GA",0.481920,10500,"POLYGON ((-83.80272 31.80358, -83.79873 31.801..."
3,4,4.740245,41,OR,10540_OR,"Albany, OR",0.675868,10540,"POLYGON ((-121.80001 44.68342, -121.80093 44.6..."
4,5,5.057234,36,NY,10580_NY,"Albany-Schenectady-Troy, NY",0.820146,10580,"POLYGON ((-73.27429 42.94365, -73.27439 42.942..."
...,...,...,...,...,...,...,...,...,...
578,579,8.490674,54,WV,RWV2_WV,WV-NonMSA areas (NW),1.445256,RWV2,"POLYGON ((-80.51934 39.72140, -80.45793 39.721..."
579,580,7.434668,54,WV,RWV3_WV,WV-NonMSA areas (S),0.839708,RWV3,"POLYGON ((-81.93251 38.02536, -81.96155 38.006..."
580,581,19.886006,56,WY,RWY1_WY,WY-NonMSA areas (E),10.316506,RWY1,"POLYGON ((-104.05770 44.99743, -104.05591 44.8..."
581,582,16.301682,56,WY,RWY2_WY,WY-NonMSA areas (NW),9.289463,RWY2,"POLYGON ((-108.24943 44.99946, -108.24934 44.9..."
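
The two zone shapefiles above appear to differ only in the capitalization of their field names. One way to make them interchangeable is to lowercase every column name except `geometry`. The sketch below works on plain lists of column names so it runs without the data; `lowercase_rename_map` is a hypothetical helper name, and the column lists are copied from the two outputs above.

```python
def lowercase_rename_map(columns: list[str], keep: tuple[str, ...] = ("geometry",)) -> dict[str, str]:
    """Map each column name to its lowercase form, leaving the names in `keep` untouched."""
    mapping = {name: name.lower() for name in columns if name not in keep}
    # Guard against two columns collapsing into the same lowercase name
    if len(set(mapping.values())) != len(mapping):
        raise ValueError("lowercasing would create duplicate column names")
    return mapping


fhwa_columns = ["OBJECTID", "Shape_Leng", "STATEFP", "STATE_ABB", "Zone_ID",
                "Zone_Name", "Shape_Area", "CBSAFP2", "geometry"]
ornl_columns = ["OBJECTID", "shape_leng", "statefp", "state_abb", "zone_id",
                "zone_name", "shape_area", "CBSAFP2", "geometry"]

same_fields = (sorted(lowercase_rename_map(fhwa_columns).values())
               == sorted(lowercase_rename_map(ornl_columns).values()))
print(same_fields)  # prints True
```

On the actual data frames the mapping could be applied with `gdf.rename(columns=lowercase_rename_map(list(gdf.columns)))`.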


In [17]:
# FAF zones shapefiles, version two.  main page https://www.fhwa.dot.gov/policyinformation/analysisframework/04.cfm

shapefile_path = data_directory + "Version_2_zone_ESRI.zip"
print(shapefile_path)
version_2_zone_esri_gdf = gpd.read_file(shapefile_path)
version_2_zone_esri_gdf

../../Truck_Areas/Version_2_zone_ESRI.zip


Unnamed: 0,ID,AREA,V2ZONEID,geometry
0,1,2758.429688,10180_TX,"POLYGON ((-100.15191 32.08264, -100.14955 32.2..."
1,2,922.926575,10420_OH,"POLYGON ((-81.39169 41.34827, -81.36784 41.347..."
2,3,1958.024414,10500_GA,"POLYGON ((-84.54265 31.07903, -84.53710 31.255..."
3,4,2306.622070,10540_OR,"POLYGON ((-121.79943 44.25828, -121.96013 44.2..."
4,5,2873.361816,10580_NY,"POLYGON ((-74.26331 42.79653, -74.25644 42.811..."
...,...,...,...,...
588,589,5358.561523,RWV2_WV,"POLYGON ((-79.89554 39.29958, -79.92606 39.288..."
589,590,3175.730713,RWV3_WV,"POLYGON ((-80.80626 37.86888, -80.75141 37.835..."
590,591,36045.851562,RWY1_WY,"POLYGON ((-105.27824 41.65666, -105.27874 41.0..."
591,592,31974.337891,RWY2_WY,"POLYGON ((-109.79848 45.00219, -109.75073 45.0..."
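
The version two file has 592 rows while the version one files have 582, so the zone IDs do not line up one to one. A quick way to see which IDs differ is a set comparison. This is a sketch on toy data: `compare_id_sets` is a hypothetical helper name, and `99999_ZZ` is a made-up ID used only to show the output shape.

```python
def compare_id_sets(v1_ids: list[str], v2_ids: list[str]) -> dict[str, list[str]]:
    """Report which zone IDs appear only in version one, only in version two, or in both."""
    v1, v2 = set(v1_ids), set(v2_ids)
    return {
        "only_in_v1": sorted(v1 - v2),
        "only_in_v2": sorted(v2 - v1),
        "in_both": sorted(v1 & v2),
    }


# Toy input; 99999_ZZ is a made-up ID for illustration
result = compare_id_sets(["10180_TX", "10420_OH"], ["10180_TX", "10420_OH", "99999_ZZ"])
print(result["only_in_v2"])  # prints ['99999_ZZ']

# Hypothetical usage with the data frames read in above:
# compare_id_sets(list(NextGen_OD_Zone_ESRI_v3_gdf.Zone_ID), list(version_2_zone_esri_gdf.V2ZONEID))
```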


## Highway network

main page is:  https://ops.fhwa.dot.gov/freight/freight_analysis/faf/

In [44]:
# NOTE:  there are a lot of duplicates in the nodes, for unknown reasons

faf_nodes_df = gpd.read_file(r'zip://' + data_directory + 'FAF5_Model_Highway_Network.zip!Networks/Geodatabase Format/FAF5Network.gdb/a0000000a.gdbtable')
print(f'Number of nodes prior to dropping duplicates: {len(faf_nodes_df)}')
faf_nodes_df.drop_duplicates(inplace=True)
print(f'Number of nodes after dropping duplicates: {len(faf_nodes_df)}')

Number of nodes prior to dropping duplicates: 974788
Number of nodes after dropping duplicates: 348495
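
To look into why so many node rows are duplicated, it can help to count how often each coordinate pair occurs before dropping anything. The sketch below does this on toy data with the standard library. `duplicate_coordinate_counts` is a hypothetical helper name. Unlike `drop_duplicates`, which compares whole rows, it compares only the coordinates.

```python
from collections import Counter


def duplicate_coordinate_counts(points: list[tuple[float, float]]) -> dict[tuple[float, float], int]:
    """Return the (lat, long) pairs that occur more than once, with their counts."""
    counts = Counter(points)
    return {point: n for point, n in counts.items() if n > 1}


# Toy coordinates, not taken from the FAF network
sample = [(31.0, -85.0), (31.0, -85.0), (32.5, -86.1), (31.0, -85.0)]
print(duplicate_coordinate_counts(sample))  # prints {(31.0, -85.0): 3}

# Hypothetical usage, assuming every node geometry is a single point:
# points = [(geom.coords[0][1], geom.coords[0][0]) for geom in faf_nodes_df.geometry]
```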


In [45]:
# Read in the links from the same geodatabase

faf_links_df = gpd.read_file(r'zip://' + data_directory + 'FAF5_Model_Highway_Network.zip!Networks/Geodatabase Format/FAF5Network.gdb/a00000009.gdbtable')
print(f'Number of links: {len(faf_links_df)}')


Number of links: 487394


In [46]:
# Get rid of HI and AK.  Most nodes do not have state information, but all the links do.
# So drop the HI and AK links here, then find all the nodes that are not used by any
# remaining link and eliminate them.

# Note:  after testing, exact matching on the lat/longs is sufficient; there is no need to test for 'almost equality'

faf_links_df.drop(faf_links_df[faf_links_df.STATE.isin(["HI", "AK"])].index, inplace=True)
print(f'Number of links after dropping HI and AK: {len(faf_links_df)}')

Number of links after dropping HI and AK: 484446


In [47]:
# match up links to nodes

def build_node_to_link_structures() -> tuple[list[int], list[int], dict[tuple[float, float], int]]:
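    '''Match each link end point to a node by exact (lat, long) coordinates.
    Returns the from-node indices, the to-node indices (-1 where no node matches) and the (lat, long) -> node index lookup.
    Note:  As a side effect, this resets the indices of faf_links_df and faf_nodes_df and adds FROM_NODE_IDX and TO_NODE_IDX to faf_links_df'''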
    
    faf_links_df.reset_index(drop=True, inplace=True)
    faf_nodes_df.reset_index(drop=True, inplace=True)
    _node_lat_long_to_idx = {(row.geometry.coords[0][1], row.geometry.coords[0][0]): row.Index for row in faf_nodes_df.itertuples(index=True)}
    _from_node_idx: list[int] = list()
    _to_node_idx: list[int] = list()
    links_with_no_valid_nodes: list[tuple[int, str, str, str, int, int]] = list()
    for row in faf_links_df.itertuples(index=True):
        # Flag any link whose geometry has more than one part; only the first part is used below
        if shapely.get_num_geometries(row.geometry) > 1:
            print(row)
        linestring = shapely.get_geometry(row.geometry, 0)
        # Nodes are keyed by (lat, long), so swap the (x, y) order of the coordinates
        one_end_point = (linestring.coords[0][1], linestring.coords[0][0])
        other_end_point = (linestring.coords[-1][1], linestring.coords[-1][0])
        # -1 marks an end point with no matching node
        one_end_point_idx = _node_lat_long_to_idx.get(one_end_point, -1)
        other_end_point_idx = _node_lat_long_to_idx.get(other_end_point, -1)
        _from_node_idx.append(one_end_point_idx)
        _to_node_idx.append(other_end_point_idx)
        
        if one_end_point_idx == -1 or other_end_point_idx == -1:
            links_with_no_valid_nodes.append((row.ID, row.Country, row.STATE, row.Road_Name, one_end_point_idx, other_end_point_idx))
        
    faf_links_df['FROM_NODE_IDX'] = _from_node_idx
    faf_links_df['TO_NODE_IDX'] = _to_node_idx
    print(f'Links with no valid nodes: {links_with_no_valid_nodes}')
    return _from_node_idx, _to_node_idx, _node_lat_long_to_idx
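
The function above relies on link end points matching node coordinates exactly. If that ever stopped being true (for example with a differently processed copy of the network), one fallback would be to round the coordinates before using them as dictionary keys. This is a sketch on toy data, not what the notebook does; `build_point_index` and `lookup_point` are hypothetical names. Rounding can still miss two nearby points that fall on opposite sides of a rounding boundary.

```python
from typing import Optional

Point = tuple[float, float]


def _key(point: Point, decimals: Optional[int]) -> Point:
    """Exact key when decimals is None, otherwise a key rounded to that many decimals."""
    if decimals is None:
        return point
    return (round(point[0], decimals), round(point[1], decimals))


def build_point_index(points: list[Point], decimals: Optional[int] = None) -> dict[Point, int]:
    """Map each (lat, long) key to the position of the first point that produced it."""
    index: dict[Point, int] = {}
    for idx, point in enumerate(points):
        index.setdefault(_key(point, decimals), idx)
    return index


def lookup_point(index: dict[Point, int], point: Point, decimals: Optional[int] = None) -> int:
    """Return the node position for a point, or -1 when there is no match."""
    return index.get(_key(point, decimals), -1)


# Toy nodes and a link end point that is off by a tiny amount
nodes = [(32.51481, -99.11429), (41.34786, -81.00332)]
end_point = (32.514810000001, -99.11429)

exact_idx = lookup_point(build_point_index(nodes), end_point)
rounded_idx = lookup_point(build_point_index(nodes, decimals=6), end_point, decimals=6)
print(exact_idx, rounded_idx)  # prints -1 0
```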

In [49]:
from_node_idx, to_node_idx, node_lat_long_to_idx = build_node_to_link_structures()

node_idx_used_by_links = set(from_node_idx) | set(to_node_idx)
node_idx_not_used_by_links = set(node_lat_long_to_idx.values()) - node_idx_used_by_links
print(f'Number of nodes not used by links (typically nodes in HI and AK): {len(node_idx_not_used_by_links)}')

# Drop the nodes not used by the links.  These are the HI and AK nodes
faf_nodes_df.drop(list(node_idx_not_used_by_links), inplace=True)

# rebuild the structures, also resets the indices
from_node_idx, to_node_idx, node_lat_long_to_idx = build_node_to_link_structures()
print(f'Final node count: {len(faf_nodes_df)}')


Links with no valid nodes: []
Number of nodes not used by links (typically nodes in HI and AK): 2325
Links with no valid nodes: []
Final node count: 346170
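
The pruning step above (drop the nodes no link refers to, then renumber) can be illustrated on toy data without geopandas. This is a sketch under simplifying assumptions: nodes are a plain list, links are `(from_idx, to_idx)` pairs with no `-1` entries, and `prune_unused_nodes` is a hypothetical name. It remaps the indices directly instead of rebuilding the coordinate lookup as the notebook does.

```python
def prune_unused_nodes(nodes: list[str], links: list[tuple[int, int]]) -> tuple[list[str], list[tuple[int, int]]]:
    """Drop nodes that no link refers to and renumber the link end points to match."""
    used = {idx for link in links for idx in link}
    kept = [i for i in range(len(nodes)) if i in used]
    # Old position -> new position after the unused nodes are removed
    old_to_new = {old: new for new, old in enumerate(kept)}
    new_nodes = [nodes[i] for i in kept]
    new_links = [(old_to_new[a], old_to_new[b]) for a, b in links]
    return new_nodes, new_links


# Toy network: node "B" is not used by any link
pruned_nodes, pruned_links = prune_unused_nodes(["A", "B", "C", "D"], [(0, 2), (2, 3)])
print(pruned_nodes)  # prints ['A', 'C', 'D']
print(pruned_links)  # prints [(0, 1), (1, 2)]
```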
