yehcolum/ZillowTimeSeries

 
 


Overview


Managers at a real estate investment firm seek the top 5 zip codes to invest in. This analysis seeks to identify those top 5 zip codes based on size, return on investment, and Sharpe ratio.

Business Problem

The firm needs insight into the best zip codes to invest in. The following time series analysis identifies the top 5 zip codes based on historical ROI. These 5 zip codes are then modeled to forecast prices over a 3-year investment horizon. Each forecasted zip code is evaluated on a risk-adjusted basis to produce a recommendation on where to invest.

The Data

The analysis uses Zillow.com data covering 14,723 zip codes in the United States. The data includes monthly housing prices for each zip code, along with its size rank.

Modeling and results

Modeling criteria:

Size Rank: the zip code must rank in the top 50 by size.
Larger size serves as a proxy for qualitative factors such as job opportunities, population, and other drivers of demand that fall outside the scope of this model.

Return on Investment
Measures the return relative to the cost of the investment.

Sharpe Ratio
An investment's Sharpe ratio measures its return per unit of risk. It allows investors to evaluate returns on a risk-adjusted basis.
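
The two return criteria above can be sketched in a few lines of Python. This is an illustrative sketch, not the notebook's code: the function names, the zero risk-free rate, and the annualization by the square root of 12 (for monthly data) are all assumptions.

```python
from statistics import mean, stdev


def roi(prices):
    """Return on investment: gain relative to the initial cost."""
    return (prices[-1] - prices[0]) / prices[0]


def monthly_returns(prices):
    """Simple period-over-period returns from a price series."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]


def sharpe_ratio(prices, risk_free_rate=0.0, periods_per_year=12):
    """Annualized Sharpe ratio from monthly prices.

    risk_free_rate is an annual rate; 0.0 is an assumption made here
    for illustration, not a value taken from the analysis.
    """
    rf_per_period = risk_free_rate / periods_per_year
    excess = [r - rf_per_period for r in monthly_returns(prices)]
    # Mean excess return per unit of volatility, scaled to a yearly figure.
    return mean(excess) / stdev(excess) * periods_per_year ** 0.5


# Toy price series (made-up numbers, not Zillow data).
prices = [100.0, 102.0, 101.0, 105.0, 110.0]
print(f"ROI: {roi(prices):.2%}")
print(f"Sharpe ratio: {sharpe_ratio(prices):.2f}")
```

A higher Sharpe ratio means more return for each unit of price volatility, which is why two regions with the same ROI can still rank differently.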



I sampled the top 50 zip codes by size rank, then calculated the ROI of each zip code over the historical period.
This revealed the top 5 zip codes:
“Region 1” → 66126: Washington DC
“Region 2” → 66133: Washington DC
“Region 3” → 62037: Kings County, New York
“Region 4” → 96027: Long Beach/Anaheim, California
“Region 5” → 62040: Kings County, New York
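
The selection step above (filter by size rank, then rank by historical ROI) might look like the following sketch. The record layout and the field names `region_id`, `size_rank`, and `prices` are hypothetical, and the rows are toy values rather than Zillow data.

```python
def historical_roi(prices):
    """ROI over the full historical period: (last - first) / first."""
    return (prices[-1] - prices[0]) / prices[0]


def select_top_regions(rows, max_size_rank=50, top_n=5):
    """Keep regions within the size-rank cutoff, then return the top_n by ROI."""
    candidates = [r for r in rows if r["size_rank"] <= max_size_rank]
    ranked = sorted(candidates,
                    key=lambda r: historical_roi(r["prices"]),
                    reverse=True)
    return ranked[:top_n]


# Toy rows with hypothetical field names and made-up prices.
rows = [
    {"region_id": "A", "size_rank": 1, "prices": [100.0, 150.0]},
    {"region_id": "B", "size_rank": 2, "prices": [200.0, 220.0]},
    {"region_id": "C", "size_rank": 60, "prices": [100.0, 300.0]},  # outside the top 50
    {"region_id": "D", "size_rank": 3, "prices": [100.0, 180.0]},
]

top = select_top_regions(rows, top_n=2)
print([r["region_id"] for r in top])
```

Region C has the highest ROI in the toy data but is excluded because it falls outside the size-rank cutoff, which mirrors the role of the size criterion in the analysis.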

AutoArima Forecasting

The autoarima package performs a stepwise search for the best parameters of an ARIMA model. Autoarima was applied to each of the 5 identified regions. Each fitted model then produced a forecast over the 3-year investment horizon:

[Figures: five forecast screenshots, one per region]
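
As a rough sketch of the forecasting step described above, the snippet below uses `auto_arima` from the third-party `pmdarima` library. Whether the notebook used this library with these arguments is an assumption; `forecast_region` and `HORIZON_MONTHS` are hypothetical names, and the non-seasonal setting is chosen here only for illustration.

```python
HORIZON_MONTHS = 3 * 12  # 3-year investment horizon on monthly data


def forecast_region(prices, horizon=HORIZON_MONTHS):
    """Fit an ARIMA model by stepwise search and forecast `horizon` months.

    `prices` is a sequence of monthly prices for one region.
    Returns the selected (p, d, q) order, the point forecasts, and
    the confidence intervals for each forecasted month.
    """
    # Imported here so the module loads even when pmdarima is not installed.
    import pmdarima as pm

    model = pm.auto_arima(
        prices,
        seasonal=False,        # assumption: no seasonal terms
        stepwise=True,         # stepwise search instead of a full grid
        suppress_warnings=True,
        error_action="ignore",
    )
    forecast, conf_int = model.predict(n_periods=horizon, return_conf_int=True)
    return model.order, forecast, conf_int
```

Calling `forecast_region(series)` once per region would yield one forecast per zip code; the last forecasted price can then be compared with the last observed price to estimate ROI over the horizon.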

Interpretation

Returning to the modeling criteria, each region's forecasted return on investment and Sharpe ratio were calculated and evaluated.

[Figures: two screenshots of the ROI and Sharpe ratio results by region]

Recommendations

Based on the forecasted ROI and Sharpe ratio, I recommend that the firm:

  1. Invest in regions 1, 2, 3, and 5.

  2. Do not invest in region 4.

Regions 1, 2, 3, and 5 are forecast to deliver ROI with an efficient risk-adjusted return. Region 4 does not provide an efficient risk-adjusted return, and it also showed the lowest return on investment over the forecasted period.

The regions selected share certain characteristics that yield an attractive risk-adjusted return. Each zip code is in a Tier 1 or Tier 2 city. These cities (New York, Los Angeles, Washington DC) share the following characteristics:

  1. Healthy employment and a strong economy, anchored by the above-average opportunity in these high-tier cities. Regions 1, 2, 3, and 5 all have unemployment rates at or below 6%, and regions 1 and 2 neighbor Capitol Hill.

  2. Proximity to amenities such as parks, entertainment, or transit.

  3. Population growth fueling housing demand.

Therefore, I also recommend that future investment candidates be screened for these characteristics. Together, they create strong demand that results in efficient ROI. Further analysis that uses employment, income growth, and population growth may reveal other lucrative zip codes.

Limitations

Real estate prices are not just a function of time. They also reflect a relationship between supply and demand that this analysis does not fully capture. Including exogenous variables such as interest rates, housing supply, or population growth would likely yield a better forecast. However, this falls outside the scope of this project.

Sampling the top 50 zip codes by size rank during EDA partially captures qualitative factors that the model does not include.

Repository Structure

├── README.md <- The top-level README for reviewers of this project
├── time-series <- Narrative documentation of analysis in Jupyter notebook
├── TS_presentation.pdf <- PDF version of project presentation


Languages

  • Jupyter Notebook 100.0%