# Estimating Colorado’s Housing Shortfall

Greg Totten (Colorado State Demography Office)

In [None]:
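library(tidyverse)
library(ipumsr) # get_sample_info(), define_extract_micro(), read_ipums_micro()
library(gt)     # gt() tables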

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.4     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

## Defining The Shortfall

In order to estimate the total housing shortfall in Colorado, we must first define the metrics by which we are going to assess the number of housing units the state might be short.

How these metrics are defined can have a significant impact on the resulting analysis. To demonstrate this, let’s begin with two highly stylized scenarios that bound our estimates:

1.  Every person currently residing in the state lives in a permanent housing unit. This scenario yields a relatively low estimate of the number of units needed, since all that is required is enough units to house the state’s unhoused population. The estimate of needed housing units might simply be the estimated size of that population. It could be reduced further by broadening the definition of housing to include any shelter, such as tents or vehicles, which would likely bring the estimate much lower.
2.  Every current US resident who would like to live in Colorado may do so, for free. In this scenario we would expect a very high estimate: while some people might still opt not to reside in our beautiful state (perhaps they do not particularly like sun), we could reasonably expect many Americans, perhaps hundreds of millions, to choose to live here at no cost.

Between these two bounds lies a range of scenarios that might indicate the number of housing units needed, depending on the objective we are trying to meet and the assumptions about housing preferences that underlie it. In this paper we examine a variety of methods, based primarily on studies by other researchers, and apply each to Colorado to estimate the state’s housing shortfall under that method. The result is not a single point estimate of the total housing shortfall but a range of estimates that planners and policy makers can use at their discretion, based on the reasonableness and applicability of each method. We also aim to provide a clear, concise explanation of each method, the objective it is attempting to solve for, and the meaning of the estimate within that context.

## Data

Data comes primarily from the most recent American Community Survey (“ACS”) one-year estimates for Colorado and from the Colorado State Demography Office (“SDO”). One-year ACS estimates are used because the state’s population is large enough to support them. Applying similar methodologies to smaller geographies (such as counties) may require 5-year estimates instead. Additionally, some ways of deriving estimates, such as analyzing Public Use Microdata Sample (“PUMS”) data, may not be possible at all geography levels. As such, many of the methods presented here may only be applicable at the state level.

## Examples

### Harvard Joint Center for Housing Studies Blog

One resource that compares four relatively recent national studies of the housing shortfall is a January 2024 blog entry from the Harvard Joint Center for Housing Studies (“JCHS”) ([McCue and Huang 2024](#ref-mccue2024)). Each study uses a different methodology and produces estimates covering different years. The four studies are:

1.  [National Association of Home Builders (NAHB) 2021](https://www.freddiemac.com/research/insight/20210507-housing-supply)
2.  [Freddie Mac 2020](https://www.freddiemac.com/research/insight/20210507-housing-supply)
3.  [National Association of Realtors (NAR) 2021](https://www.nar.realtor/advocacy/housing-is-critical-infrastructure)
4.  [National Low Income Housing Coalition (NLIHC)](https://nlihc.org/gap)

The following sections will provide estimates of the housing shortage in Colorado based on each of these study methodologies.

### National Association of Home Builders

The NAHB study estimates the national housing shortage by comparing ACS vacancy rates in the current year with their long-run average. This resulted in an estimated shortage of 1.5 million units.
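
As a rough illustration of the vacancy-gap idea, the sketch below converts the difference between a long-run average vacancy rate and the current rate into a number of units. The figures are made up, and the formula (gap in rates multiplied by the current housing stock) is a simplifying assumption for illustration, not necessarily the exact calculation NAHB used.

```r
# Hypothetical vacancy-rate series; all values are illustrative only
vacancy <- data.frame(
  year          = 2019:2023,
  vacancy_rate  = c(0.10, 0.09, 0.08, 0.08, 0.07),
  housing_units = c(2400000, 2430000, 2460000, 2490000, 2520000)
)

# Long-run average over the available years
long_run_avg <- mean(vacancy$vacancy_rate)

# Most recent year in the series
current <- vacancy[vacancy$year == max(vacancy$year), ]

# Assumed shortfall definition: units needed to close the gap between the
# current vacancy rate and the long-run average, holding the stock fixed
shortfall_units <- (long_run_avg - current$vacancy_rate) * current$housing_units

shortfall_units
```

A positive value means the current vacancy rate sits below its long-run average. Holding the stock fixed slightly understates the units required, since newly added vacant units also enlarge the denominator.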

The first step is to create a time series of vacancy rates for the state with ACS data. This data is accessed from the IPUMS USA database ([Ruggles et al. 2024](#ref-ruggles2024a)) using the `ipumsr` package in R ([Greg Freedman Ellis, Derek Burk, and Finn Roberts 2024](#ref-ipumsr)).

In [None]:
# Keep IPUMS USA sample names of the form "us20YYa" (e.g. "us2023a")
acs_samples <- get_sample_info("usa") |>
  filter(str_detect(name, pattern = "us20\\d{2}a")) |>
  pull(name)

ipums_dir <- "data/ipums_raw/"
ipums_ddi <- "usa_00075.xml"
file_loc <- paste0(ipums_dir, ipums_ddi)

# Request and download a new extract only when no cached copy exists.
# download_extract() returns the path to the downloaded DDI codebook.
if (!file.exists(file_loc)) {
  file_loc <- define_extract_micro(
    collection = "usa",
    description = "ACS 1 year samples in Colorado of vacancy variables",
    samples = acs_samples,
    variables = list(
      var_spec("STATEFIP", case_selections = "08"), # Colorado only
      "COUNTYFIP",
      "PUMA",
      "GQ",
      "OWNERSHP",
      "VACANCY"
    ),
    data_structure = "household_only"
  ) |>
    submit_extract() |>
    wait_for_extract() |>
    download_extract(download_dir = ipums_dir)
}

# Read the DDI in both cases so `acs_ddi` is available to later cells
acs_ddi <- read_ipums_ddi(file_loc)
acs_00_23 <- read_ipums_micro(acs_ddi)

Use of data from IPUMS USA is subject to conditions including that users should cite the data appropriately. Use command `ipums_conditions()` for more details.

Vacancy rates are calculated based on both the `OWNERSHP` and `VACANCY` variables. Additionally the `GQ` variable is used to filter out group quarters.
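
A minimal sketch of that calculation on toy data follows. The table `toy_acs` and its values are hypothetical stand-ins for `acs_00_23`. The sketch assumes that `VACANCY` code 0 marks occupied units, that `GQ` codes 3 and 4 mark group quarters, and that `YEAR` and the household weight `HHWT` are present in the extract. Check these assumptions against the value labels in the DDI before relying on them.

```r
library(dplyr)

# Toy household-level records standing in for acs_00_23 (values are made up)
toy_acs <- tibble::tibble(
  YEAR     = c(2022, 2022, 2022, 2022, 2023, 2023, 2023, 2023),
  GQ       = c(1, 1, 0, 3, 1, 0, 0, 1),
  OWNERSHP = c(1, 2, 0, 0, 1, 0, 0, 2),
  VACANCY  = c(0, 0, 1, 0, 0, 2, 4, 0),
  HHWT     = c(100, 100, 50, 20, 100, 10, 20, 70)
)

vacancy_by_year <- toy_acs |>
  filter(!GQ %in% c(3, 4)) |>         # assumed group-quarters codes
  mutate(is_vacant = VACANCY != 0) |> # assumed: 0 = N/A (occupied unit)
  summarise(
    total_units  = sum(HHWT),            # weighted count of housing units
    vacant_units = sum(HHWT[is_vacant]), # weighted count of vacant units
    .by = YEAR
  ) |>
  mutate(vacancy_rate = vacant_units / total_units)

vacancy_by_year
```

Weighting by `HHWT` gives estimates of housing units rather than counts of sample records. Rental and homeowner vacancy rates could be derived the same way by also splitting on `OWNERSHP` and the specific `VACANCY` categories.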

In [None]:
# Variable metadata (including value labels) from the extract's DDI
acs_info <- ipums_var_info(acs_ddi)

# One column per variable, one row per "value_label" pair
own_tbl <- acs_info %>%
  filter(var_name %in% c("OWNERSHP", "VACANCY", "GQ")) %>%
  select(var_name, val_labels) %>%
  unnest(val_labels) %>%
  unite(val_lbl, c(val, lbl), sep = "_") %>%
  group_by(var_name) %>%
  mutate(row = row_number()) %>% # row index within each variable, for pivoting
  pivot_wider(names_from = var_name,
              values_from = val_lbl) %>%
  select(-row)

own_tbl_gt <- own_tbl %>%
  gt()

own_tbl_gt

In [None]:
acs_00_23 |> count(OWNERSHP)

# A tibble: 3 × 2
  OWNERSHP                              n
  <int+lbl>                         <int>
1 0 [N/A]                           82030
2 1 [Owned or being bought (loan)] 369311
3 2 [Rented]                       143978

Greg Freedman Ellis, Derek Burk, and Finn Roberts. 2024. *ipumsr: An R Interface for Downloading, Reading, and Handling IPUMS Data*. <https://tech.popdata.org/ipumsr/>.

McCue, Daniel, and Sophie Huang. 2024. “Estimating the National Housing Shortfall \| Joint Center for Housing Studies.” <https://www.jchs.harvard.edu/blog/estimating-national-housing-shortfall>.

Ruggles, Steven, Sarah Flood, Matthew Sobek, Daniel Backman, Annie Chen, Grace Cooper, Stephanie Richards, Renae Rodgers, and Megan Schouweiler. 2024. “IPUMS USA: Version 15.0.” Minneapolis, MN: IPUMS. <https://doi.org/10.18128/D010.V15.0>.