ETL.py and model.py
This project consists of an ETL pipeline and a modeling portion. Jupyter notebooks for each of the individual ETL processes, as well as for the modeling, are retained under the "notebooks" directory for reference. These notebooks were up to date as of the publishing of this project, but they should be treated as reference only and will not be maintained.
- Install the libraries listed in "requirements.txt"
- Create or connect to a PostgreSQL database with the PostGIS extension
- Create "global_vars.py" with the following local parameters:
'''
localhost = {
    "host": "xxxxx",
    "database": "disaster_db",
    "user": "xxxxxx",
    "password": "xxxxxx",
}

DATA_PATHS = {
    # download shapefiles from the FEMA NRI link below
    "nri_shapefile": "path",
    # download the latest community resilience .csv from the Census link below
    "census_resilience": "path",
    # local directory where files are stored
    "extract_dir": "path",
}
'''
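Before running the pipeline, it can help to sanity-check that "global_vars.py" defines everything the ETL steps expect. A minimal sketch (the validation helper below is hypothetical and not part of this project; the required key names mirror the config shown above):

```python
# Hypothetical pre-flight check for a global_vars.py-style config.
# Key names come from the README's example; the helper itself is an assumption.
REQUIRED_DB_KEYS = {"host", "database", "user", "password"}
REQUIRED_PATH_KEYS = {"nri_shapefile", "census_resilience", "extract_dir"}

def validate_config(localhost: dict, data_paths: dict) -> list:
    """Return a sorted list of missing keys; an empty list means the config is complete."""
    missing = [f"localhost.{k}" for k in REQUIRED_DB_KEYS - localhost.keys()]
    missing += [f"DATA_PATHS.{k}" for k in REQUIRED_PATH_KEYS - data_paths.keys()]
    return sorted(missing)

# Example config shaped like the snippet above (placeholder values).
localhost = {"host": "xxxxx", "database": "disaster_db", "user": "xxxxxx", "password": "xxxxxx"}
DATA_PATHS = {"nri_shapefile": "path", "census_resilience": "path", "extract_dir": "path"}
print(validate_config(localhost, DATA_PATHS))  # prints [] when nothing is missing
```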
Data sources:
- FEMA NRI: https://hazards.fema.gov/nri/data-resources (download NRI_Shapefile_Counties.zip)
- U.S. Census Bureau Community Resilience Estimates: https://www.census.gov/programs-surveys/community-resilience-estimates/data/datasets.html
- NOAA Storm Events: https://www.ncei.noaa.gov/stormevents/ftp.jsp (FTP access only)
The CSV files are accessible through FTP:
ftp://ftp.ncei.noaa.gov/pub/data/swdi/stormevents/csvfiles/
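The Storm Events files can be fetched programmatically with Python's ftplib. A minimal sketch, assuming the bulk "details" archives in that directory end in ".csv.gz" (the filter below is illustrative; see the README linked below for the authoritative naming convention):

```python
from ftplib import FTP

def details_csvs(filenames):
    """Filter an FTP directory listing down to the bulk 'details' CSV archives."""
    return sorted(f for f in filenames if "details" in f and f.endswith(".csv.gz"))

if __name__ == "__main__":
    # Server and path are from the README above; anonymous login.
    ftp = FTP("ftp.ncei.noaa.gov")
    ftp.login()
    ftp.cwd("/pub/data/swdi/stormevents/csvfiles/")
    latest = details_csvs(ftp.nlst())[-1]  # filenames embed the year, so name sort approximates recency
    with open(latest, "wb") as fh:
        ftp.retrbinary(f"RETR {latest}", fh.write)
    ftp.quit()
```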
Detailed information about the fields/columns:
ftp://ftp.ncei.noaa.gov/pub/data/swdi/stormevents/csvfiles/Storm-Data-Bulk-csv-Format.pdf
Documentation on the file naming convention:
ftp://ftp.ncei.noaa.gov/pub/data/swdi/stormevents/csvfiles/README