Loan-level analysis of Fannie Mae and Freddie Mac data


Create a PostgreSQL database with loan-level data from Fannie Mae and Freddie Mac

Scripts used in support of this post: Mortgages Are About Math: Open-Source Loan-Level Analysis of Fannie and Freddie

Usage

  1. Make sure you have PostgreSQL installed locally. If you want to use R, install it too
  2. Download data from Fannie Mae and/or Freddie Mac and unzip all files into a directory with fannie/ and freddie/ subdirectories
  3. Make sure to update the proper /path/to/ paths in initialize_database.sh, create_loans_and_supporting_tables.sql, and load_all_loans_script.sh
  4. Run ./initialize_database.sh, which creates a Postgres database called agency-loan-level, creates some tables, and imports supporting data, including FHFA home price data and Freddie Mac mortgage rate data
  5. Run ./db_scripts/load_all_loans.sh to import the data files. This can take a very long time (~2 days), so consider loading the data in chunks. The full database takes up around 215 GB on disk
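
Because the load step is slow and the finished database is large, it can help to sanity-check the setup before starting. The following is a minimal sketch and not part of this repo: the function names `check_data_layout` and `has_free_space_gb` are hypothetical helpers, and the check assumes only what the steps above state (a data directory containing fannie/ and/or freddie/ subdirectories, and roughly 215 GB of disk needed).

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helpers; these are illustrative and not shipped with the repo.

# Succeeds if the data directory contains at least one of the fannie/ or
# freddie/ subdirectories described in step 2. Notes any that are absent.
check_data_layout() {
  local data_dir="$1"
  local found=0
  local sub
  for sub in fannie freddie; do
    if [ -d "$data_dir/$sub" ]; then
      found=$((found + 1))
    else
      echo "note: $data_dir/$sub not found" >&2
    fi
  done
  [ "$found" -gt 0 ]
}

# Succeeds if the filesystem holding the given directory has at least
# the requested number of gigabytes available.
has_free_space_gb() {
  local dir="$1"
  local required_gb="$2"
  local free_kb
  # df -P gives POSIX-format output: one header line, then one data line
  # whose fourth column is the available space in 1024-byte blocks (-k).
  free_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
  [ "$free_kb" -ge $((required_gb * 1024 * 1024)) ]
}
```

For example, `check_data_layout /path/to/data && has_free_space_gb /path/to/postgres/data 215` would confirm both conditions before you kick off the import. Both paths here are placeholders; the second should point at wherever your Postgres data directory lives.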

Analysis

The analysis/ folder has additional SQL and R scripts used to analyze the data. See the full post for more.