Managers at a real estate investment firm seek the top 5 zip codes to invest in. This analysis seeks to identify those top 5 zip codes based on size, return on investment, and Sharpe ratio.
The firm requires insight on the best zip codes to invest in. The following time series analysis identifies the top 5 zip codes based on ROI. These top 5 are then modeled to forecast prices over a 3-year investment horizon. Each forecasted zip code is evaluated on a risk-adjusted basis to provide a recommendation on where to invest.
The analysis uses Zillow.com data covering 14,723 zip codes in the United States. The data includes monthly housing prices for each zip code, along with each zip code's size rank.
Modeling criteria:
- Size Rank: the zip code must be in the top 50 by size rank. Larger size accounts for qualitative factors such as job opportunities, population, and other drivers of demand that fall outside the scope of this model.
- Return on Investment (ROI): measures return relative to the cost of the investment.
- Sharpe Ratio: an investment's Sharpe ratio reveals its return per unit of risk. It allows investors to evaluate returns on a risk-adjusted basis.
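The two quantitative criteria can be sketched in a few lines of Python. This is a minimal illustration, not the notebook's actual code: the function names, the zero risk-free rate, and the annualization by the square root of 12 (for monthly data) are all assumptions.

```python
from statistics import mean, stdev


def roi(start_price, end_price):
    """Return on investment: gain relative to the purchase cost."""
    return (end_price - start_price) / start_price


def monthly_returns(prices):
    """Simple period-over-period returns from a list of prices."""
    return [curr / prev - 1 for prev, curr in zip(prices, prices[1:])]


def sharpe_ratio(prices, risk_free_rate=0.0, periods_per_year=12):
    """Annualized Sharpe ratio: mean excess return per unit of volatility.

    Assumes an annual risk-free rate (default 0) and monthly prices.
    """
    excess = [r - risk_free_rate / periods_per_year for r in monthly_returns(prices)]
    return mean(excess) / stdev(excess) * periods_per_year ** 0.5
```

A higher Sharpe ratio means more return for each unit of price volatility, which is why two regions with similar ROI can still rank differently.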
I sampled the top 50 zip codes according to size rank, then calculated ROI in each zip code over the historical period.
This revealed the top 5 zip codes:
“Region 1” → 66126: Washington DC
“Region 2” → 66133: Washington DC
“Region 3” → 62037: Kings County, New York
“Region 4” → 96027: Long Beach/Anaheim, California
“Region 5” → 62040: Kings County, New York
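The selection step described above (filter to the top 50 by size rank, then rank by historical ROI) could look roughly like the sketch below. The record layout and the name `top_by_roi` are hypothetical; the notebook may well use a pandas DataFrame instead.

```python
def top_by_roi(regions, size_cutoff=50, n=5):
    """Keep regions within the size-rank cutoff, then return the n with the highest ROI.

    `regions` is a list of dicts with the (assumed) keys
    'region', 'size_rank', and 'prices' (oldest price first).
    """
    def historical_roi(record):
        first, last = record["prices"][0], record["prices"][-1]
        return (last - first) / first

    candidates = [r for r in regions if r["size_rank"] <= size_cutoff]
    ranked = sorted(candidates, key=historical_roi, reverse=True)
    return [r["region"] for r in ranked[:n]]


# Made-up records purely for illustration
sample = [
    {"region": "A", "size_rank": 1, "prices": [100, 150]},
    {"region": "B", "size_rank": 2, "prices": [100, 300]},
    {"region": "C", "size_rank": 60, "prices": [100, 900]},  # outside the size cutoff
    {"region": "D", "size_rank": 3, "prices": [200, 220]},
]
print(top_by_roi(sample, n=2))
```

Region "C" has the highest ROI in the sample but is excluded because its size rank falls outside the cutoff, which mirrors how the size filter is applied before ROI is considered.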
Auto-ARIMA performs a stepwise search for the best parameters of an ARIMA model. It was applied to each of the 5 identified regions, and each resulting model then forecasted a 3-year investment horizon.
Returning to our modeling criteria, each region's return on investment and Sharpe ratio were calculated and evaluated.
Based on the forecasted ROI and Sharpe ratio, I recommend that the firm:
- Invest in regions 1, 2, 3, and 5.
- Do not invest in region 4.
- Regions 1, 2, 3, and 5 are forecast to deliver ROI with an efficient risk-adjusted return. Region 4 is not, and it also showed the lowest return on investment over the forecasted period.
The regions selected share certain characteristics that yield an attractive risk-adjusted return. Each zip code is in a Tier 1 or Tier 2 city. These cities (New York, Los Angeles, Washington DC) share the following characteristics:
- Healthy employment and strong economies, anchored by the above-average opportunity in these high-tier cities. For example, regions 1 and 2 neighbor Capitol Hill, and regions 1, 2, 3, and 5 all have unemployment rates at or below 6%.
- Proximity to amenities such as parks, entertainment, or transit.
- Population growth fueling housing demand.
Therefore, I also recommend that future investment candidates be screened for these characteristics. Together, they create strong demand that results in efficient ROI. Further analysis that uses employment, income growth, and population growth may reveal other lucrative zip codes.
Real estate prices are not just a function of time, but a relationship between supply and demand not fully captured by this analysis. Including exogenous variables such as interest rates, housing supply, or population growth would likely yield a better forecast. However, this falls outside the scope of this project.
Sampling the top 50 zip codes by size rank during EDA partially captures the qualitative factors that the model does not include.
├── README.md <- The top-level README for reviewers of this project
├── time-series <- Narrative documentation of analysis in Jupyter notebook
├── TS_presentation.pdf <- PDF version of project presentation