The National Bridge Inventory (NBI) represents bridge data submitted annually to FHWA by the States, Federal agencies, and Tribal governments. The data conforms to the Recording and Coding Guide for the Structure Inventory and Appraisal of the Nation's Bridges. Each data set is submitted in the spring and may be corrected or updated throughout the year. The data is considered final and is published on this website at the end of each calendar year. [Source: Federal Highway Administration]
The Python script downloads CSV and zip files directly from the FHWA website. This feature ensures that all transformations to the dataset are accounted for.
The nbi-csv-json-converter divides the task into two parts: download and conversion. Each part is run by a separate script. The conversion script also performs the cross-checks, item checks, and validations mentioned by FHWA, and populates MongoDB with the resulting JSON objects.
This crawler creates a local copy of NBI files. This prevents unnecessary requests to the server during the processing stage.
# Start NBI File Download for all years
python3 Downloadv1.py
# Decompress Zip files and convert CSV files to JSON
python3 ProcessMain.py
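The local-copy idea can be sketched as a download-if-missing helper. This is only an illustration under stated assumptions, not the code in Downloadv1.py: the function name `download_if_missing` and its `fetch` parameter are hypothetical, and no real FHWA URL is assumed.

```python
import os
from urllib.request import urlretrieve


def download_if_missing(url, dest_path, fetch=urlretrieve):
    """Fetch url into dest_path unless a local copy already exists.

    Returns True if a download happened, False if the local copy was reused.
    The fetch argument is a hypothetical hook so the helper can be tried
    without network access.
    """
    if os.path.exists(dest_path):
        return False  # local copy found, so no request is sent to the server
    dest_dir = os.path.dirname(dest_path)
    if dest_dir:
        os.makedirs(dest_dir, exist_ok=True)
    fetch(url, dest_path)
    return True
```

With a helper like this, re-running the processing stage would only read files already present on disk.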
Downloadv1.py contains two global lists that configure the states and years for which files are to be downloaded. By default, the two lists include all states and years.
states = ['NE','AL','AK',...]
years = [1992,1993,1994,...]
Downloadv1.py output includes the following items:
NBIDATA folder. This folder includes all bridge inspection files for all states and years in the configuration lists. The files are renamed following this convention: XXYYYY.txt. XX represents the two-character state code and YYYY represents the year of reporting into NBI.
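The naming convention can be sketched in a few lines of Python. This is a minimal illustration, assuming XX is the state abbreviation used in the states list (for example NE); the helper names `nbi_filename` and `parse_nbi_filename` are hypothetical and do not come from Downloadv1.py.

```python
import os
import re
from itertools import product

states = ['NE', 'AL', 'AK']  # shortened example list
years = [1992, 1993]         # shortened example list


def nbi_filename(state, year):
    """Build a file name following the XXYYYY.txt convention."""
    return f"{state}{year}.txt"


def parse_nbi_filename(name):
    """Split an XXYYYY.txt name into (state, year); return None if it does not match."""
    match = re.fullmatch(r"([A-Z]{2})(\d{4})\.txt", name)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


# One expected path per configured (state, year) pair inside the NBIDATA folder.
expected_paths = [
    os.path.join("NBIDATA", nbi_filename(state, year))
    for state, year in product(states, years)
]
```

Parsing the name back into its parts is one way a later processing step could recover the state and year of each file without opening it.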
ProcessMain.py contains two global lists that configure the states and years for which files are to be processed. To limit processing time, the states list by default includes only Nebraska ['NE']. The years list by default includes all years.
states = ['NE']
years = [1992,1993,1994,...]
To connect to a MongoDB instance, dbConnect.txt is required. You may create the file manually and add the standard URI connection scheme given below:
mongodb://[username:password@]host1[:port1][,host2[:port2],...[,hostN[:portN]]][/[database][?options]]
To learn more about this scheme, see the MongoDB documentation on the connection string URI format.
Use the following URI to connect to a MongoDB instance installed on the local machine:
mongodb://localhost:27017
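Reading the URI from dbConnect.txt might look like the sketch below. How ProcessMain.py actually loads the file is not shown here, so the function name `read_mongo_uri` and the choice to take the first non-empty line are assumptions made for illustration.

```python
from urllib.parse import urlparse


def read_mongo_uri(path="dbConnect.txt"):
    """Return the connection URI stored on the first non-empty line of path."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            uri = line.strip()
            if uri:
                break
        else:
            # The loop finished without finding any non-empty line.
            raise ValueError(f"{path} is empty")
    scheme = urlparse(uri).scheme
    if scheme not in ("mongodb", "mongodb+srv"):
        raise ValueError(f"unexpected URI scheme: {scheme!r}")
    return uri
```

If pymongo is the driver in use (an assumption), the returned string could be passed directly to `pymongo.MongoClient(uri)`.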
ProcessMain.py output includes the following items:
missinggeo.txt: This text file includes every structure number and year for which the geo coordinates are invalid.
Summary.txt: This text file includes a basic summary of the processing run.
mergedNBI.json: This text file includes the JSON objects converted from CSV rows.
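The relationship between the CSV rows, the JSON objects, and the missing-coordinate list can be sketched as follows. This is not the logic of ProcessMain.py: the column names `STRUCTURE_NUMBER`, `LAT`, and `LONG`, the helper names, and the blank-or-zero coordinate check are all hypothetical placeholders, and the validations mentioned by FHWA are not reproduced here.

```python
import csv
import io
import json


def has_value(text):
    """Placeholder check: a coordinate counts as present if it is non-blank and not all zeros."""
    return bool(text) and text.strip("0 ") != ""


def convert(csv_text, year):
    """Return (json_lines, missing_geo) for the text of one CSV file."""
    json_lines, missing_geo = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = dict(row, year=year)  # tag each object with its reporting year
        if not (has_value(row.get("LAT")) and has_value(row.get("LONG"))):
            # Analogous to an entry in missinggeo.txt: structure number and year.
            missing_geo.append((row.get("STRUCTURE_NUMBER"), year))
        json_lines.append(json.dumps(record))
    return json_lines, missing_geo
```

Each string in `json_lines` corresponds to one CSV row, which mirrors the one-object-per-row content described for mergedNBI.json.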