Skip to content
Permalink
Branch: master
Find file Copy path
Find file Copy path
Fetching contributors…
Cannot retrieve contributors at this time
1087 lines (879 sloc) 65.3 KB
---
title: On Vancouver population projections
author: Jens von Bergmann
date: '2019-08-01'
slug: on-vancouver-population-projections
categories:
- cancensus
- cmhc
- Vancouver
- CANSIM
tags: []
description: "A closer look at the the Regional Growth Strategy and population projections"
featured: ''
images: ["https://doodles.mountainmath.ca/posts/2019-08-01-on-vancouver-population-projections_files/figure-html/projections_vs_actual-1.png"]
featuredalt: ""
featuredpath: ""
linktitle: ''
type: "post"
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(
echo = FALSE,
message = FALSE,
warning = FALSE,
fig.width=8,
cache=TRUE
)
library(tidyverse)
library(cancensus)
library(cansim)
library(statcanXtabs)
library(cmhc)
```
Metro Vancouver's population is growing. For planning purposes we want to understand how our population will be growing. For that we need projections.
Here we need to carefully distinguish two related but distinct types of population projections.
* projections of population demand, and
* projections of population growth.
Projecting demand, or even just estimating current population demand, is complex. Demand is a function of a variety of factors, most importantly jobs, and amenities, as well as home prices and rents (in relation to incomes and wealth). Jobs and amenities, relative to other areas, determine the attractiveness of a region to in-migration (and the dissuasion to out-migration). Natural growth (birth minus deaths) is only a small factor in Metro Vancouver's growth.
This *attractiveness* of a region sets an abstract upper limit for population demand. Actual population growth up to the abstract upper limit depends on the amount of available housing (relating to population demand via *households*). This is where we have to remember that demand is a function of price, and the abstract upper limit on population demand is built on the assumption that people can find housing at an acceptable price. In reality, the population demand is curbed through market prices for market housing, and waitlists for non-market housing. This is how population demand and population growth meet.
For planning purposes, defining demand through current market prices is tautological and not very useful. After all, population projections are often used to limit the amount of permitted housing. Which leaves us with the question: What is the cost of housing we should estimate population demand at?
For market housing, a good, albeit somewhat abstract, answer is [Glaeser's *Minimum Profitable Production Cost* (MPPC)](http://realestate.wharton.upenn.edu/wp-content/uploads/2017/03/802.pdf), that is the cost to add housing to a given market, which depends on construction cost (which will rise with density and could use some refinement), augmented by municipal service charges and possibly other charges, for example charges aimed to maintain livability, the cost of land (usually taken to be 20% of construction cost and service charges), and a developer profit. While the MPPC in Vancouver is high, the price of market housing is even higher than that. In the media this is often described as elevated *land values*. Land values of a housing unit is the residual value that buyers are willing to pay above what it costs to construct the unit (including development charges). That differential is negotiated through supply and demand of market housing.
Demand for non-market housing is, by definition, not negotiated through price. The distribution of non-market housing is done via eligibility criteria and waitlists. Demand can be approximated by the length of the waitlist, although that will generally be a substantial under-estimate in real life situations where wait list tend to be so long that many eligible households don't even bother applying.
## A look at population demand in Vancouver and Calgary
It is safe to say that Vancouver's amenity value is comparatively high, but it is hard to quantify that. Quantifying the job market is much easier. For comparison we choose Calgary, also a city with high amenity value, although probably not quite as high even though it often features high in livability surveys. However, at this time Calgary facing a very different jobs market from Vancouver.
```{r}
unemployment_data <- get_cansim("14-10-0294") %>%
normalize_cansim_values(factors = TRUE) %>%
filter(Statistics=="Estimate",
`Data type`=="Seasonally adjusted",
`Labour force characteristics`=="Unemployment rate") %>%
select(Date,GeoUID,GEO,VALUE) %>%
mutate(name=sub(",.+$","",GEO))
employment_data <- get_cansim("14-10-0294") %>%
normalize_cansim_values(factors = TRUE) %>%
filter(Statistics=="Estimate",
`Data type`=="Seasonally adjusted",
`Labour force characteristics`=="Employment") %>%
select(Date,GeoUID,GEO,VALUE) %>%
mutate(name=sub(",.+$","",GEO))
employment_data2 <- get_cansim("14-10-0294") %>%
normalize_cansim_values(factors = TRUE) %>%
filter(Statistics=="Estimate",
`Data type`=="Unadjusted",
`Labour force characteristics`=="Employment") %>%
select(Date,GeoUID,GEO,VALUE) %>%
mutate(name=sub(",.+$","",GEO))
last_unemployment_for <- function(n){
unemployment_data %>% filter(name==n,Date==last(Date)) %>% pull(VALUE)
}
job_vacancy_data <- get_cansim("14-10-0325") %>%
normalize_cansim_values(factors = TRUE) %>%
mutate(name=sub(",.+$","",GEO)) %>%
filter(Statistics=="Job vacancy rate")
job_vacancy_numbers <- get_cansim("14-10-0325") %>%
normalize_cansim_values(factors = TRUE) %>%
mutate(name=sub(",.+$","",GEO)) %>%
filter(Statistics=="Job vacancies")
last_job_vacancy_for <- function(data=job_vacancy_data,n) {
data %>% filter(name==n,Date==last(Date))%>% pull(VALUE)
}
metrics <- c("Population","Dwellings","Employment")
get_2011_base_projections <- function(){
projections_data<-tabulizer::extract_tables("http://www.metrovancouver.org/services/regional-planning/PlanningPublications/TableA1-PopDwelUnitEmpProjforMVSubregMuni.pdf",method="lattice")[[1]] %>%
as_tibble() %>%
slice(2:n())
metrics %>%
lapply(function(m){
i=match(m,metrics)
indices=c(1,seq(4*i-2,4*i+1))
projections_data %>%
select(indices) %>%
set_names((.)[1,]) %>%
slice(2:n()) %>%
gather(key="Year",value="Value",-1) %>%
mutate(Metric=m)
}) %>%
bind_rows %>%
mutate(Value=as.integer(gsub(",","",Value)),
Year=as.integer(Year))
}
rental_vacancy_yyc <-get_cmhc(cmhc_timeseries_params(
table_id = cmhc_table_list["Rms Vacancy Rate Time Series"],
geography_id = cmhc_geography_list$Calgary)) %>%
mutate(Year=substring(X1,1,4),
Month=10) %>%
mutate(Date=as.Date(paste0("01-",Month,"-",Year),format="%d-%m-%Y")) %>%
mutate_at(c("Bachelor","1 Bedroom", "2 Bedroom","3 Bedroom +","Total"),function(d){as.numeric(d)/100})
rental_condo_vacancy_yyc <-get_cmhc(cmhc_timeseries_params(
table_id = cmhc_table_list["Srms Vacancy Rate Time Series"],
geography_id = cmhc_geography_list$Calgary)) %>%
mutate(Year=substring(X1,1,4),
Month=10) %>%
mutate(Date=as.Date(paste0("01-",Month,"-",Year),format="%d-%m-%Y")) %>%
mutate_at(c("3-19 Units","20-49 Units","50-99 Units","100+ Units","Total"),function(d){as.numeric(d)/100})
current_vacancy_rate_yyc <- rental_vacancy_yyc %>%
filter(Date==max(Date)) %>%
pull(Total)
current_condo_vacancy_rate_yyc <- rental_condo_vacancy_yyc %>%
filter(Date==max(Date)) %>%
pull(Total)
rental_vacancy_yvr <-get_cmhc(cmhc_timeseries_params(
table_id = cmhc_table_list["Rms Vacancy Rate Time Series"],
geography_id = cmhc_geography_list$Vancouver)) %>%
mutate(Year=substring(X1,1,4),
Month=10) %>%
mutate(Date=as.Date(paste0("01-",Month,"-",Year),format="%d-%m-%Y")) %>%
mutate_at(c("Bachelor","1 Bedroom", "2 Bedroom","3 Bedroom +","Total"),function(d){as.numeric(d)/100})
rental_condo_vacancy_yvr <-get_cmhc(cmhc_timeseries_params(
table_id = cmhc_table_list["Srms Vacancy Rate Time Series"],
geography_id = cmhc_geography_list$Vancouver)) %>%
mutate(Year=substring(X1,1,4),
Month=10) %>%
mutate(Date=as.Date(paste0("01-",Month,"-",Year),format="%d-%m-%Y")) %>%
mutate_at(c("3-19 Units","20-49 Units","50-99 Units","100+ Units","Total"),function(d){as.numeric(d)/100})
current_vacancy_rate_yvr <- rental_vacancy_yvr %>%
filter(Date==max(Date)) %>%
pull(Total)
current_condo_vacancy_rate_yvr <- rental_condo_vacancy_yvr %>%
filter(Date==max(Date)) %>%
pull(Total)
starts_yyc <- get_cmhc(cmhc_timeseries_params(
table_id = cmhc_table_list["Scss Starts Time Series"],
geography_id = cmhc_geography_list$Calgary)) %>%
mutate(Year=substring(X1,5,9),
Month=substring(X1,1,3)) %>%
mutate(Date=as.Date(paste0("01 ",X1),format="%d %b %Y"))
average_10y_6_months_starts <- starts_yyc %>%
filter(Year>2008,Year<=2018,Month %in% c("Jan", "Feb", "Mar", "Apr", "May", "Jun")) %>%
pull(Apartment) %>%
mean
last_6_months_starts <- starts_yyc %>%
filter(Year==2019,Month %in% c("Jan", "Feb", "Mar", "Apr", "May", "Jun")) %>%
pull(Apartment) %>%
mean
completions_yyc <- get_cmhc(cmhc_timeseries_params(
table_id = cmhc_table_list["Scss Completions Time Series"],
geography_id = cmhc_geography_list$Calgary)) %>%
mutate(Year=substring(X1,5,9),
Month=substring(X1,1,3)) %>%
mutate(Date=as.Date(paste0("01 ",X1),format="%d %b %Y"))
average_10y_6_monts_completions <- completions_yyc %>%
filter(Year>2008,Year<=2018,Month %in% c("Jan", "Feb", "Mar", "Apr", "May", "Jun")) %>%
pull(Apartment) %>%
mean
last_6_months_completions <- completions_yyc %>%
filter(Year==2019,Month %in% c("Jan", "Feb", "Mar", "Apr", "May", "Jun")) %>%
pull(Apartment) %>%
mean
construction_yyc <- get_cmhc(cmhc_timeseries_params(
table_id = cmhc_table_list["Scss Under Construction Time Series"],
geography_id = cmhc_geography_list$Calgary)) %>%
mutate(Year=substring(X1,5,9),
Month=substring(X1,1,3)) %>%
mutate(Date=as.Date(paste0("01 ",X1),format="%d %b %Y"))
average_10y_6_months_construction <- construction_yyc %>%
filter(Year>2008,Year<=2018,Month %in% c("Jan", "Feb", "Mar", "Apr", "May", "Jun")) %>%
pull(Apartment) %>%
mean
last_6_months_construction <- construction_yyc %>%
filter(Year==2019,Month %in% c("Jan", "Feb", "Mar", "Apr", "May", "Jun")) %>%
pull(Apartment) %>%
mean
starts_yvr <- get_cmhc(cmhc_timeseries_params(
table_id = cmhc_table_list["Scss Starts Time Series"],
geography_id = cmhc_geography_list$Vancouver)) %>%
mutate(Year=substring(X1,5,9),
Month=substring(X1,1,3)) %>%
mutate(Date=as.Date(paste0("01 ",X1),format="%d %b %Y"))
average_10y_6_months_starts_yvr <- starts_yvr %>%
filter(Year>2008,Year<=2018,Month %in% c("Jan", "Feb", "Mar", "Apr", "May", "Jun")) %>%
pull(Apartment) %>%
mean
last_6_months_starts_yvr <- starts_yvr %>%
filter(Year==2019,Month %in% c("Jan", "Feb", "Mar", "Apr", "May", "Jun")) %>%
pull(Apartment) %>%
mean
completions_yvr <- get_cmhc(cmhc_timeseries_params(
table_id = cmhc_table_list["Scss Completions Time Series"],
geography_id = cmhc_geography_list$Vancouver)) %>%
mutate(Year=substring(X1,5,9),
Month=substring(X1,1,3)) %>%
mutate(Date=as.Date(paste0("01 ",X1),format="%d %b %Y"))
average_10y_6_monts_completions_yvr <- completions_yvr %>%
filter(Year>2008,Year<=2018,Month %in% c("Jan", "Feb", "Mar", "Apr", "May", "Jun")) %>%
pull(Apartment) %>%
mean
last_6_months_completions_yvr <- completions_yvr %>%
filter(Year==2019,Month %in% c("Jan", "Feb", "Mar", "Apr", "May", "Jun")) %>%
pull(Apartment) %>%
mean
construction_yvr <- get_cmhc(cmhc_timeseries_params(
table_id = cmhc_table_list["Scss Under Construction Time Series"],
geography_id = cmhc_geography_list$Vancouver)) %>%
mutate(Year=substring(X1,5,9),
Month=substring(X1,1,3)) %>%
mutate(Date=as.Date(paste0("01 ",X1),format="%d %b %Y"))
average_10y_6_months_construction_yvr <- construction_yvr %>%
filter(Year>2008,Year<=2018,Month %in% c("Jan", "Feb", "Mar", "Apr", "May", "Jun")) %>%
pull(Apartment) %>%
mean
last_6_months_construction_yvr <- construction_yvr %>%
filter(Year==2019,Month %in% c("Jan", "Feb", "Mar", "Apr", "May", "Jun")) %>%
pull(Apartment) %>%
mean
starts_ratio <- last_6_months_starts/average_10y_6_months_starts-1
completions_ratio <- last_6_months_completions/average_10y_6_monts_completions-1
construction_ratio <- last_6_months_construction/average_10y_6_months_construction-1
starts_ratio_yvr <- last_6_months_starts_yvr/average_10y_6_months_starts_yvr-1
completions_ratio_yvr <- last_6_months_completions_yvr/average_10y_6_monts_completions_yvr-1
construction_ratio_yvr <- last_6_months_construction_yvr/average_10y_6_months_construction_yvr-1
percent1 <- function(x)scales::percent(x,accuracy = 0.1)
```
Calgary has a healthy existing housing supply to accommodate mild population growth with a primary market rental vacancy rate of `r percent1(current_vacancy_rate_yyc)` and a private condo rental vacancy rate of `r percent1(current_condo_vacancy_rate_yyc)`. At the same time Calgary is continuing to add new housing units at a good pace with apartment building starts for the first six months of 2019 only `r percent1(-starts_ratio)` below the corresponding 10 year average and completions and apartment units under construction up by `r percent1(completions_ratio)` and `r percent1(construction_ratio)`, respectively.
Economic indicators in Calgary are slowly improving, but current economic conditions show a relatively high unemployment rate of `r percent1(last_unemployment_for(n="Calgary"))`, above the Canadian average of `r percent1(last_unemployment_for(n="Canada"))`. At the same time, Calgary has a job vacancy rate of `r percent1(last_job_vacancy_for(n="Calgary"))`, below the Canadian average of `r percent1(last_job_vacancy_for(n="Canada"))`. That means that unless economic conditions change we expect that on balance people to move away from Calgary over the long term.
This means that Calgary's housing market likely has a bit of a buffer at this point so that population demand and population growth are roughly the same right now, which simplifies things.
In Vancouver on the other hand, population demand and population growth are quite different. Vancouver's unemployment rate sits at `r percent1(last_unemployment_for(n="Vancouver"))` while the job vacancy rate of the economic region is `r percent1(last_job_vacancy_for(n="Lower Mainland-Southwest"))`. The current population can't fill the existing jobs, and with current primary and secondary market rental vacancy rates at an anemic `r percent1(current_vacancy_rate_yvr)` and `r percent1(current_condo_vacancy_rate_yvr)`, respectively, there is no way for people wanting to fill those jobs to move into the region. Current population demand is significantly higher than population growth. Dwelling growth becomes a hard cap on population growth.
```{r}
job_differntial_yvr <- (last_job_vacancy_for(n="Lower Mainland-Southwest") - last_job_vacancy_for(n="Canada"))
jobs_backlog <-last_job_vacancy_for(job_vacancy_numbers,"Lower Mainland-Southwest") * job_differntial_yvr / last_job_vacancy_for(n="Lower Mainland-Southwest")
dwelling_backlog <- jobs_backlog /1.3
dwelling_backlog_months <- dwelling_backlog / last_6_months_completions_yvr
```
The silver lining in this desperate situation is that apartment building starts in the first half of 2019 is a whopping `r percent1(starts_ratio_yvr)` above the corresponding 10 year average and completions and apartment units under construction are up by `r percent1(completions_ratio_yvr)` and `r percent1(construction_ratio_yvr)`, respectively. One should note however that at current rates at fairly high completions it would take about `r round(dwelling_backlog_months)` months to complete the number of dwelling units needed to make room for the `r scales::comma(jobs_backlog)` workers (at a healthy average rate of 1.3 workers per dwelling unit, and completely ignoring the need for moving vacancies) to bring Vancouver's job vacancy rate down to the Canadian average. And that's not accounting for jobs created through continued economic growth or knock-on effects of the economic activity associated with adding these workers and filling the jobs. Thus it will take a while to clear the backlog of undersupply, let alone catch up with latent demand due to delayed household formation, overcrowded households and bring the rental vacancy rate up to healthy levels.
In clearing the backlog, Vancouver's prices and rents will have to adjust to meet this demand, with recent demand side measures like the City of Vancouver Empty Homes Tax and the vacancy tax component of the provincial Speculation and Vacancy Tax helping ensure that properties get occupied and thus aiding the process of price and rent adjustments.
This means that at least for the shorter term future Vancouver will have to arrange itself with a significant gap between population demand and population growth, and the associated market selection mechanism sorting out who gets to come/stay and who gets pushed or shut out of market housing, and our non-market mechanisms selecting how gets to occupy non-market housing units.
## Planning for growth
In our highly regulated building environment, population projections (and dwelling projections) are turned into a high-stakes game. In Vancouver, we rely on the **Metro Vancouver Regional Growth Strategy (RGS)** to set population and dwelling growth targets, and distribute the growth across the municipalities within the region. Those targets are then used as a guideline for how much new construction to permit.
At the metro level, there are three possible outcomes:
1. The RGS overestimates population demand and we permit more housing than needed.
2. The RGS correctly estimates population demand and we permit exactly as much housing as needed.
3. The RGS underestimates population demand and/or we permit too little housing.
In the first case, developers will simply stop building excess housing since providing housing in excess of demand becomes unprofitable fast. The second case essentially never happens, projections are always wrong. The third case is the case of a housing shortage, which leads to people getting shut out of the region, price growth for ownership and rental housing, and a dampening of the economy due to labour shortages.
The costs for under-estimating growth are high (while the cost to over-estimate growth is low). Time to take a more careful look at the RGS.
## Benchmarking the RGS
To illustrate this we can look at the (now dated) [population projections by the Metro Vancouver Regional District, (now just called Metro Vancouver) from 2010](https://vancouver.ca/files/cov/metro-vancouver-regional-growth-strategy.pdf).
```{r}
metro_van_projections <- tibble(Population=c(2195000,2780000,3129000,3400000),
Dwellings=c(848000,1130000,1307000,1422000),
Employment=c(1158000,1448000,1622000,1753000),
Year=c(2006,2021,2031,2041)) %>%
group_by(Year) %>%
gather(key="Metric",value="Value",metrics)
```
```{r}
metro_van_rates <- metro_van_projections %>%
group_by(Metric) %>%
spread(key=Year,value=Value) %>%
mutate(`2006-2021`=(`2021`/`2006`)^(1/15),
`2021-2031`=(`2031`/`2021`)^(1/10),
`2031-2041`=(`2041`/`2031`)^(1/10)) %>%
# mutate(`2006-2021`=(`2021`-`2006`)/15,
# `2021-2031`=(`2031`-`2021`)/10,
# `2031-2041`=(`2041`-`2031`)/10) %>%
select(Metric,`2006-2021`,`2021-2031`,`2031-2041`)
metro_van_rates %>%
gather(key="Period",value="Annual growth rate",c("2006-2021","2021-2031","2031-2041")) %>%
ggplot(aes(x=Metric,y=`Annual growth rate`-1,fill=Period)) +
geom_bar(stat="identity",position="dodge") +
scale_fill_brewer(palette = "Set2") +
scale_y_continuous(labels=scales::percent) +
theme_light() +
labs(title="Metro Vancouver projections",
y="Annual growth rate",
x="",caption="Metro Vancouver Regional Growth Strategy July 29, 2011 Version")
```
The projections call for a slowing of the growth in the region over the years. In the beginning, dwelling growth exceeds population growth to account for shrinking household size, but curiously the model has almost identical dwelling and population growth rates for 2031 to 2041. Unfortunately there is no further explanation how the model was derived, but this seems to assume that household size stabilizes by the 2031-2041 period and all latent demand has been dealt with. Employment is projected to rise at a slightly slower rate than population, probably reflecting the shrinking portion of the population in the work force as the share of seniors continues to increase and the entry time in the labour market continues to get delayed due to increasing levels of education.
Long term projections are hard, so let's focus on the short term until 2021. And interpolate values for the 5 year intervals given by census years that allow us to check how the projections held up. For simplicity, we will assume the growth rates to be constant within each interval, the exact way we interpolate does not make much of the difference for what follows. Projections are always bound to be different from actual numbers anyway, what this post focuses on is the relative difference between the three metrics.
```{r echo=FALSE}
interpolation_years=c(2011,2016)
interpolate_variable <- function(mvp,...) {
approx(x=mvp$Year,y=mvp %>% pull(...),xout=interpolation_years)$y
}
rates <- metro_van_rates %>%
select(Metric,`2006-2021`) %>%
spread(key=Metric,value=`2006-2021`)
metro_van_detail <- metro_van_projections %>%
filter(Year %in% c(2006,2021)) %>%
group_by(Year) %>%
spread(key="Metric",value="Value") %>%
bind_rows(tibble(Year=2011,
Population=filter(.,Year==2006)$Population*rates$Population^5,
Dwellings=filter(.,Year==2006)$Dwellings*rates$Dwellings^5,
Employment=filter(.,Year==2006)$Employment*rates$Employment^5)) %>%
bind_rows(tibble(Year=2016,
Population=filter(.,Year==2011)$Population*rates$Population^5,
Dwellings=filter(.,Year==2011)$Dwellings*rates$Dwellings^5,
Employment=filter(.,Year==2011)$Employment*rates$Employment^5)) %>%
arrange(Year)
plot_data <- metro_van_detail %>%
group_by(Year) %>%
gather(key="Metric",value="Value",c("Population","Dwellings","Employment"))
years_color=set_names(RColorBrewer::brewer.pal(4,"Dark2"),plot_data$Year %>% unique)
# ggplot(plot_data,aes(x=Metric,y=Value,fill=factor(Year))) +
# geom_bar(stat="identity",position = "dodge") +
# theme_light() +
# scale_y_continuous(labels=scales::comma) +
# #scale_fill_manual(values=years_color) +
# labs(title="Metro Vancouver projections",
# subtitle="(assuming constant growth rate between 2006 and 2021)",fill="Year",x="",y="",
# caption="Metro Vancouver Regional Growth Strategy July 29, 2011 Version")
```
The Regional Growth Strategy was updated in 2017, and it is a worthwhile exercise to check how the updated projections compare.
```{r fig.width=9}
metro_van_region_projections <- get_2011_base_projections()
metro_van_projections2 <- metro_van_region_projections %>%
filter(MUNICIPALITY=="Metro Vancouver Total")
metro_van_rates2 <- metro_van_projections2 %>%
group_by(Metric) %>%
spread(key=Year,value=Value) %>%
mutate(`2011-2021`=(`2021`/`2011`)^(1/10),
`2021-2031`=(`2031`/`2021`)^(1/10),
`2031-2041`=(`2041`/`2031`)^(1/10)) %>%
# mutate(`2006-2021`=(`2021`-`2006`)/15,
# `2021-2031`=(`2031`-`2021`)/10,
# `2031-2041`=(`2041`-`2031`)/10) %>%
select(Metric,`2011-2021`,`2021-2031`,`2031-2041`)
type_colors <- set_names(RColorBrewer::brewer.pal(length(metrics),"Dark2"),c("Projection","Updated","Actual"))
type_colors["Original"]=type_colors[["Projection"]]
type_colors["Census"]=type_colors[["Actual"]]
metro_van_rates2 %>%
gather(key="Period",value="Annual growth rate",c("2011-2021","2021-2031","2031-2041")) %>%
mutate(Type="Updated") %>%
bind_rows(metro_van_rates %>%
gather(key="Period",value="Annual growth rate",c("2006-2021","2021-2031","2031-2041")) %>%
mutate(Type="Original")) %>%
mutate(Period=case_when(grepl("2021$",Period)~"Before 2021",TRUE~Period)) %>%
mutate(Period=factor(Period,levels=c("Before 2021","2021-2031","2031-2041"))) %>%
ggplot(aes(x=Metric,y=`Annual growth rate`-1,fill=Type)) +
geom_bar(stat="identity",position="dodge") +
scale_fill_manual(values=type_colors) +
facet_wrap("Period") +
scale_y_continuous(labels=scales::percent) +
theme_light() +
#theme(legend.position = "bottom") +
labs(title="Metro Vancouver projections",
y="Annual growth rate",fill="RGS",
x="",caption="Metro Vancouver Regional Growth Strategy July 28, 2017 Version")
```
We see that the updated projections sport slightly higher growth rates, and also remove the assumption of a stop to shrinking household size for the 2031-2041 period.
```{r eval=FALSE, include=FALSE}
bc_projections <- read_csv(here::here("data/Population_Projections.csv")) %>%
filter(X1==933)
comparing_population_projections <-
metro_van_projections %>% filter(Metric=="Population") %>%
rename(Original=Value) %>%
select(Year,Original) %>%
full_join(metro_van_projections2 %>% filter(Metric=="Population") %>%
rename(Updated=Value) %>%
select(Year,Updated),by="Year") %>%
full_join(bc_projections %>% select(Year,Total) %>%
rename(BCStats=Total),by="Year") %>%
filter(Year>=2006) %>%
gather(key="Type",value="Value",c("Original","Updated","BCStats"),factor_key = TRUE) %>%
filter(!is.na(Value))
ggplot(comparing_population_projections,aes(x=Year,y=Value,fill=Type)) +
geom_bar(stat="identity",position="dodge")
```
## How to measure population, dwellings and employment
```{r}
pop_data <- get_cansim("17-10-0135") %>%
normalize_cansim_values(factors=TRUE) %>%
filter(GEO=="Vancouver (CMA), British Columbia",
`Age group`=="All ages",
Sex=="Both sexes") %>%
select(GEO,REF_DATE,VALUE) %>%
rename(Year=REF_DATE)
census_years <- c(2006,2011,2016)
census_counts <- census_years %>% lapply(function(year){
get_census(paste0("CA",as.character(year) %>% substr(3,4)),region=list(CMA="59933")) %>% mutate(Year=year)
}) %>% bind_rows
calculate_growth_rates <- function(data,Year1,Year2){
data %>%
filter(Year %in% c(Year1,Year2)) %>%
spread(key=Year,value=Value) %>%
mutate(!!paste0(Year1,"-",Year2):=(!!as.name(Year2)/!!as.name(Year1))^(1/(as.integer(Year2)-as.integer(Year1)))) %>%
select(Metric,!!paste0(Year1,"-",Year2)) %>%
spread(key=Metric,value=!!paste0(Year1,"-",Year2))
}
rates2 <- metro_van_projections2 %>%
select(Year,Metric,Value) %>%
calculate_growth_rates("2011","2021")
metro_van_detail2 <- metro_van_projections2 %>%
select(Year,Metric,Value) %>%
filter(Year %in% c(2011,2021)) %>%
group_by(Year) %>%
spread(key="Metric",value="Value") %>%
bind_rows(tibble(Year=2016,
Population=filter(.,Year==2011)$Population*rates2$Population^5,
Dwellings=filter(.,Year==2011)$Dwellings*rates2$Dwellings^5,
Employment=filter(.,Year==2011)$Employment*rates2$Employment^5)) %>%
arrange(Year)
metro_van_enriched <- metro_van_detail %>%
group_by(Year) %>%
gather(key="Metric",value="Value",c("Population","Dwellings","Employment")) %>%
filter(Year %in% census_years) %>%
mutate(Type="Projection") %>%
bind_rows((.) %>% filter(Year==2006) %>%
mutate(Type="Updated"),
census_counts %>%
select(Year,Dwellings) %>%
mutate(Type="Actual",Metric="Dwellings") %>%
rename(Value=Dwellings),
employment_data %>%
filter(name=="Vancouver",strftime(Date,"%m")=="07") %>%
mutate(Year=strftime(Date,"%Y") %>% as.integer) %>%
filter(Year %in% census_years) %>%
select(Year,VALUE) %>%
rename(Value=VALUE) %>%
mutate(Type="Actual",Metric="Employment"),
pop_data %>%
select(Year,VALUE) %>%
filter(Year %in% census_years) %>%
mutate(Year=as.integer(Year)) %>%
rename(Value=VALUE) %>%
mutate(Type="Actual",Metric="Population"),
metro_van_detail2 %>%
gather(key=Metric,value=Value,metrics) %>%
mutate(Type="Updated"))
pow_data_2006x <- get_sqlite_xtab("97-561-XCB2006011",url="https://www12.statcan.gc.ca/census-recensement/2006/dp-pd/tbt/OpenDataDownload.cfm?PID=90656",format = "xml") %>%
filter(Sex=="Total - Sex",
Geography=="Greater Vancouver")
pow_data_2006 <- get_sqlite_xtab("97-561-XCB2006014",url="https://www12.statcan.gc.ca/census-recensement/2006/dp-pd/tbt/OpenDataDownload.cfm?PID=95838",format="xml") %>%
filter(Sex=="Total - Sex",
`Age groups`=="Total - Age groups",
Geography=="Greater Vancouver",
`Employment income groups`=="Total - Employment income groups") %>%
pull(Value)
pow_data_2016 <- get_sqlite_xtab("98-400-X2016089",url="https://www12.statcan.gc.ca/census-recensement/2016/dp-pd/dt-td/CompDataDownload.cfm?LANG=E&PID=111627&OFT=CSV") %>%
filter(GEO_NAME=="Canada",
`Immigrant status`=="Total - Immigrant status",
`Language used most often at work`=="Total - Language used most often at work",
`Language spoken most often at home`=="Total - Language spoken most often at home",
`Place of work census division`=="Greater Vancouver, B.C.") %>%
rename(Value=`Dim: Used regularly at work - Other language(s) (9): Member ID: [1]: Total - Other language(s) used regularly at work`) %>%
pull(Value)
vectors_2006 <- search_census_vectors("Total employed labour force 15 years and over by place of work status","CA06","Total") %>%
bind_rows(child_census_vectors(.))
pows_data_2006 <- get_census("CA06",regions=list(CMA="59933"),vectors=vectors_2006$vector,labels="short") %>%
cancensusHelpers::detail_labels()
pow_2006_total=pow_data_2006+pows_data_2006$`No fixed workplace address`
vectors_2016 <- search_census_vectors("Total - Place of work status for the employed labour force aged 15 years and over in private households ","CA16","Total") %>%
bind_rows(child_census_vectors(.))
pows_data_2016 <- get_census("CA16",regions=list(CMA="59933"),vectors=vectors_2016$vector,labels="short") %>%
cancensusHelpers::detail_labels()
pow_2016_total=pow_data_2016+pows_data_2016$`No fixed workplace address`
plot_data <- metro_van_enriched %>%
group_by(Metric,Type) %>%
spread(key=Year,value=Value) %>%
mutate(`2006-2011`=(`2011`/`2006`)^(1/5),
`2011-2016`=(`2016`/`2011`)^(1/5),
`2006-2016`=(`2016`/`2006`)^(1/10)) %>%
select(Metric,`2006-2011`,`2011-2016`,`2006-2016`) %>%
gather(key="Period",value="Value",c(`2006-2011`,`2011-2016`,`2006-2016`),factor_key = TRUE) %>%
ungroup %>%
mutate(Metric=factor(Metric,levels=metrics),
Type=factor(Type,levels=c("Projection","Updated","Actual")))
metric_colors <- set_names(RColorBrewer::brewer.pal(length(metrics),"Set2"),metrics)
```
To understand population demand we need to at least measure population, dwellings and jobs/employment.
The question how to measure these metrics is surprisingly less straight-forward than it seems. Ideally we want high-frequency data so we can closely monitor how our region is changing. Annual data is probably good enough for that. StatCan has a number of annual tables at the CMA level that can inform on employment and population growth, although the employment tables are based on place of residence and not place of work. Which makes a difference when there are changes in net commute patterns for Metro Vancouver. The census also counts population and employment, although one needs to adjust for the census undercounts.
StatCan does not have good tables on the number of dwelling units, although CHSP data is starting to use administrative data to count residential properties. Folding in unit counts for each property gets us closer to dwelling units, but this still fails to account for some informal dwellings, like non-authorized secondary suites. BC Stats has annual estimates of dwelling units that can also be used. The census has (private) dwelling counts, but they generally under-count as the census inevitably misses dwelling units, as we also need to account for collective dwellings.
Finer geography data, so data for municipalities within Metro Vancouver, are harder to get from the annual StatCan tables, although BC Stats has population estimates (and projections) at the municipal level, and custom tabulations are available at least for the larger municipalities. Census data is available on municipal and sub-municipal geographies, but the numbers still need to be adjusted for undercounts and net commute flows.
Sorting out these details takes more effort than what we are willing to spent on this post. We will focus on growth rates and differences over given time periods, what's most important is to choose consistent metrics for this. Differences stemming from the particular choice of metric tend to divide out this way, and we will ignore related issues for this post as they are smaller than the effects we are interested in.
The RGS is unfortunately a bit vague on how exactly the base estimates were derived, stating that dwelling counts (just private or private and collective?) are based on the census, and population and employment (not jobs) estimates are derived from the census and were adjusted for census undercounts (probably via a custom-tabulation at the sub-metro level?).
## Comparing metrics
Population counts at the metropolitan level are easily available from StatCan table 17-10-0135 that is already adjusted for census undercounts, but is anchored on July 1st, almost two months after the census. For Metro Vancouver, that came out at `r scales::comma(filter(pop_data,Year=="2006")$VALUE)`, quite close to the `r scales::comma(filter(metro_van_projections,Year==2006,Metric=="Population")$Value)` from the Regional Growth Strategy.
Next up is the number of dwellings, those *should* be straight-up census numbers, but the RGS had `r scales::comma(filter(metro_van_projections,Year==2006,Metric=="Dwellings")$Value)` dwelling units and the 2006 census counted `r scales::comma(filter(census_counts,Year==2006)$Dwellings)` private dwellings, not including collective dwellings. It's unclear to me why these differ so dramatically.
Employment counts are harder than they should be. Moreover, employment is probably not the most relevant metric, it would be better to count jobs. Which is even harder. As the RGS employment estimates are based on the census, we can query the employment variable by place of work geography. We get a total employment of `r scales::comma(pow_2006_total)` people in Metro Vancouver, compared to `r scales::comma(filter(metro_van_projections,Year==2006,Metric=="Employment")$Value)` from the RGS.
The match is reasonably good, although our census numbers don't include the census undercount yet. Undercounts are only available for the place of residence, not the place of work geography, but that difference will have a negligible effect. An alternative is to use Labour Force Survey data that is available annually, but only considers place of residence instead of place of work. For 2006 that estimate comes out as `r scales::comma(employment_data2 %>% filter(name=="Vancouver",Date==as.Date("2006-05-01")) %>% pull(VALUE))` for the (unadjusted) employment in Metro Vancouver in May 2006.
To move forward, we will focus on the growth of our measures rather than the absolute numbers, as differences in sources and methods tend to divide out that way. As this is something that should be monitored on an ongoing basis we will be relying on the annual StatCan tables for population and employment estimates, rather than the census estimates.
However, when we need finer regional breakup we are confined to census estimates.
## Checking in with 2011 data
The 2011 Census data gives us the first reality check of the data. We use the baseline 2011 estimates from the updated Metro Vancouver Regional Growth Strategy from 2017 as another point of comparison.
We compare the annual growth rates for the 2006-2011 timeframe as projected by Metro Vancouver in 2011 to the original 2006 base line and updated 2011 base lines, to the "actual" change based on Census (for private dwellings) and annual StatCan tables (for population and employment). Calling these "actual" is a bit ambitious, as we are glossing over details as we have explained above, but we will run with this for now.
```{r fig.width=9}
ggplot(plot_data %>% filter(Period=="2006-2011"),aes(x=Type,y=Value-1,fill=Type)) +
geom_bar(stat="identity",position="dodge") +
facet_wrap("Metric") +
theme_light() +
scale_fill_manual(values=type_colors) +
scale_y_continuous(labels=scales::percent) +
labs(title="Metro Vancouver Projections vs updated vs actual 2006-2011", x="",y="Annual growth rate",fill="",
caption="Metro Vancouver Regional Growth Strategy July 29, 2011 and July 28, 2017 Versions,\nStatCan Census 2006, 2011, 2016, Tables 17-10-0135, 14-10-0294")
```
What this shows is that population came in roughly as projected, whereas employment fell a bit short (which should not be surprising considering financial crisis hit in that time window). What's a real head-scratcher is the change in dwellings, where projections came a bit short of the actual, and the two Regional Growth Strategy baseline differ significantly. This points to a change in methods in the RGS.
We can also look at how our population, dwelling and employment estimates line up with the 2006 and 2011 baseline estimates from Metro Vancouver.
```{r}
metro_van_enriched %>%
filter(Year %in% c("2006","2011")) %>%
group_by(Year) %>%
spread(key=Type,value=Value) %>%
mutate(`2006 baseline vs actual`=Projection/Actual,
`2011 baseline vs actual`=Updated/Actual) %>%
gather(key="Type",value="Value",c("2006 baseline vs actual","2011 baseline vs actual")) %>%
ungroup %>%
filter((Year==2011 & grepl("2011",Type))| (Year==2006 & grepl("2006",Type))) %>%
ggplot(aes(x=factor(Year),y=Value-1,fill=Type)) +
geom_bar(stat="identity",position="dodge") +
facet_wrap("Metric") +
theme_light() +
scale_y_continuous(labels=scales::percent) +
labs(title="Metro Vancouver Projections vs Actual", x="",y="Deviation",fill="",
caption="Metro Vancouver Regional Growth Strategy July 29, 2011 and July 28, 2017 Versions,\nStatCan Census 2006, 2011, 2016, Tables 17-10-0135, 14-10-0294")
```
This shows that the Metro Vancouver Regional Growth Strategy did a decent job at projecting jobs and population, but came in systematically short of their dwelling targets. The smaller differences aren't concerning and expected when using different metrics, but it would be prudent to better understand the apparent dwelling shortfall in the RGS. It would be good to better understand exactly how Metro Vancouver derives their estimates and projections.
## Checking in with 2016 data
The 2016 census gives us another opportunity to check the projections against the data. We can use the projections from the original and the updated Growth Strategy and compare them to our estimates based on Census data and annual tables.
```{r projections_vs_actual}
pd <- plot_data %>%
filter(Period != "2006-2011") %>%
filter(Period=="2006-2016" & Type=="Projection" | Period=="2011-2016" & Type=="Updated" | Type=="Actual") %>%
mutate(Type=recode(Type,"Projection"="Original"))
ggplot(pd,aes(x=fct_rev(Period),y=Value-1,fill=Type)) +
geom_bar(stat="identity",position="dodge") +
facet_wrap("Metric") +
scale_fill_manual(values=type_colors) +
theme_light() +
scale_y_continuous(labels=scales::percent) +
labs(title="Metro Vancouver Projections vs updated vs actual", x="",y="Annual growth rate",fill="",
caption="Metro Vancouver Regional Growth Strategy July 29, 2011 and July 28, 2017 Versions,\nStatCan Census 2006, 2011, 2016, Tables 17-10-0135, 14-10-0294")
```
We see that the population grew faster than projected off the 2006 baseline, but a bit slower than projected off the 2011 baseline. Dwellings consistently grew slower than projected while employment consistently outperformed projections.
Another interesting takeaway from the data is to just look at the trend in our "Actual" data.
```{r}
pd <- plot_data %>%
filter(Type=="Actual",Period %in% c("2006-2011","2011-2016"))
ggplot(pd,aes(x=Period,y=Value-1,fill=Type)) +
geom_bar(stat="identity",position="dodge") +
facet_wrap("Metric") +
scale_fill_manual(values=type_colors,guide=FALSE) +
theme_light() +
scale_y_continuous(labels=scales::percent) +
labs(title="Metro Vancouver historical growth", x="",y="Annual growth rate",fill="",
caption="StatCan Census 2006, 2011, 2016, Tables 17-10-0135, 14-10-0294")
```
This shows that dwelling growth slowed somewhat after 2011, while population growth slightly increased and employment growth increased dramatically, recovering from the slower growth during the economic slowdown around 2008.
## Regional breakdown
Another aspect of the Metro Vancouver Regional Growth Strategy is the regional allocation of the growth. Here we are working exclusively with the updated Growth Strategy. For this we can't get away with using annual population and employment data any more but have no choice but go with census data. This means we have to worry about census undercounts, but this is where things get tricky. Those estimates aren't easily available at the municipal level, and one would expect significant regional variation that simply using metro estimates isn't particularly informative. BC Stats has custom population estimates at the municipal level that include the census undercount, and StatCan probably has custom estimates as well. We will skip over this detail, the difference in census coverage rates between the census years should not impact our results by much.
Moreover, we can't lazily assume that place of residence and place of work (roughly) coincides like we did when looking at the Metro level, we will have to be careful about how we count employment. Again, a custom tabulation would be helpful for this, but we will make due with adjusting employment numbers by net commuter flow between municipalities, recognizing that we are missing flows between municipalities with fewer than 20 commuters.
```{r}
metro_van_region_change <- metro_van_region_projections %>%
bind_rows((.) %>%
filter(MUNICIPALITY %in% c("Metro Vancouver Total","Vancouver")) %>%
group_by(Year,Metric) %>%
spread(key=MUNICIPALITY,value=Value) %>%
mutate(Value=`Metro Vancouver Total`-Vancouver,
MUNICIPALITY="Rest of Metro Vancouver") %>%
select(MUNICIPALITY,Metric,Year,Value)) %>%
group_by(MUNICIPALITY,Metric) %>%
spread(key=Year,value=Value) %>%
mutate(`2011-2021`=(`2021`/`2011`)^(1/10),
`2021-2031`=(`2031`/`2021`)^(1/10),
`2031-2041`=(`2041`/`2031`)^(1/10),
`2011-2041`=(`2041`/`2011`)^(1/30)) %>%
gather(key="Period",value="Value",c(`2011-2021`,`2021-2031`,`2031-2041`,`2011-2041`)) %>%
mutate(`2016`=`2011`*Value^5)
```
```{r}
vancouver_regions_data <- list_census_regions("CA16") %>%
filter(level=="CSD",CMA_UID=="59933")
vancouver_regions <- vancouver_regions_data %>%
pull(region)
origin_commute_data <- get_sqlite_xtab("98-400-X2016325","https://www12.statcan.gc.ca/census-recensement/2016/dp-pd/dt-td/CompDataDownload.cfm?LANG=E&PID=111332&OFT=CSV") %>%
rename(GeoUID=`GEO_CODE (POR)`) %>%
filter(GeoUID %in% vancouver_regions) %>%
group_by(GeoUID) %>%
summarise(Home=sum(`Dim: Sex (3): Member ID: [1]: Total - Sex`)) %>%
collect()
destination_commute_data <- get_sqlite_xtab("98-400-X2016325","https://www12.statcan.gc.ca/census-recensement/2016/dp-pd/dt-td/CompDataDownload.cfm?LANG=E&PID=111332&OFT=CSV") %>%
rename(GeoUID=`GEO_CODE (POW)`) %>%
filter(GeoUID %in% vancouver_regions) %>%
group_by(GeoUID) %>%
summarise(Work=sum(`Dim: Sex (3): Member ID: [1]: Total - Sex`)) %>%
collect()
origin_commute_data_2011 <- get_sqlite_xtab("99-012-X2011032","https://www12.statcan.gc.ca/nhs-enm/2011/dp-pd/dt-td/OpenDataDownload.cfm?PID=106036",format = 'xml') %>%
filter(Sex=="Total - Sex") %>%
filter(GeoUID %in% vancouver_regions) %>%
group_by(GeoUID) %>%
summarise(Home=sum(Value)) %>%
collect()
destination_commute_data_2011 <- get_sqlite_xtab("99-012-X2011032","https://www12.statcan.gc.ca/nhs-enm/2011/dp-pd/dt-td/OpenDataDownload.cfm?PID=106036",format = 'xml') %>%
filter(Sex=="Total - Sex") %>%
mutate(GeoUID=`Geography POW ID`) %>%
filter(GeoUID %in% vancouver_regions) %>%
group_by(GeoUID) %>%
summarise(Work=sum(Value)) %>%
collect()
net_commute_data <- full_join(origin_commute_data,destination_commute_data,by="GeoUID") %>%
mutate(net=Work-Home)
net_commute_data_2011 <- full_join(origin_commute_data_2011,destination_commute_data_2011,by="GeoUID") %>%
mutate(net=Work-Home)
add_geo_uids <- function(data){
data %>%
left_join(vancouver_regions_data %>%
mutate(name=recode(name,"Greater Vancouver A"="Electoral Area A")) %>%
mutate(name=case_when(name=="Langley" & municipal_status=="CY" ~ "Langley City",
name=="Langley" & municipal_status=="DM" ~ "Langley Township",
name=="North Vancouver" & municipal_status=="CY" ~ "North Vancouver City",
name=="North Vancouver" & municipal_status=="DM" ~ "North Vancouver District",
name=="Tsawwassen" ~ "Tsawwassen First Nation",
TRUE ~ name)) %>%
rename(GeoUID=region) %>%
select(name,GeoUID),
by=c("MUNICIPALITY"="name"))
}
metro_van_region_employment <- metro_van_region_change %>%
ungroup %>%
filter(Metric=="Employment",Period=="2011-2021") %>%
select(MUNICIPALITY, `2011`,`2016`,Value) %>%
rename(Employment_2016=`2016`) %>%
rename(Employment_2011=`2011`) %>%
add_geo_uids
vectors_2016 <- search_census_vectors("Total - Place of work status for the employed labour force aged 15 years and over in private households ","CA16","Total") %>%
bind_rows(child_census_vectors(.))
vectors_2011 <- search_census_vectors("Total employed population aged 15 years and over by place of work status","CA11","Total") %>%
bind_rows(child_census_vectors(.))
pow_region_data_2011 <- get_census("CA11",regions=list(CMA="59933"),
vectors=vectors_2011$vector,labels="short",level = "CSD") %>%
cancensusHelpers::detail_labels() %>%
left_join(net_commute_data,by="GeoUID") %>%
rename(employment_base=`Total employed population aged 15 years and over by place of work status`) %>%
mutate(employment=employment_base+net) %>%
left_join(metro_van_region_employment,by="GeoUID") %>%
mutate(`Projection to actual`=Employment_2011 /employment) %>%
#mutate(`Projection to actual 2`=Employment /employment_base) %>%
filter(!is.na(`Projection to actual`)) #%>%
#gather(key="Type",value="Value",c("Projection to actual","Projection to actual 2"))
pow_region_data_2016 <- get_census("CA16",regions=list(CMA="59933"),
vectors=vectors_2016$vector,labels="short",level = "CSD") %>%
cancensusHelpers::detail_labels() %>%
left_join(net_commute_data,by="GeoUID") %>%
rename(employment_base=`Total - Place of work status for the employed labour force aged 15 years and over in private households - 25% sample data`) %>%
mutate(employment=employment_base+net) %>%
left_join(metro_van_region_employment,by="GeoUID") %>%
mutate(`Projection to actual`=Employment_2016 /employment) %>%
#mutate(`Projection to actual 2`=Employment /employment_base) %>%
filter(!is.na(`Projection to actual`)) #%>%
#gather(key="Type",value="Value",c("Projection to actual","Projection to actual 2"))
pow_region_data <- bind_rows(
pow_region_data_2016 %>%
select(GeoUID,employment,Population,Dwellings) %>%
rename(Employment=employment) %>%
mutate(Year="2016",Type="Census") %>%
gather(key="Metric",value="Value",metrics),
pow_region_data_2011 %>%
select(GeoUID,employment,Population,Dwellings) %>%
rename(Employment=employment) %>%
mutate(Year="2011",Type="Census") %>%
gather(key="Metric",value="Value",metrics)
)
geo_uid_names <- pow_region_data_2016 %>%
select(GeoUID,`Region Name`) %>%
unique %>%
filter(GeoUID %in% (metro_van_region_change %>%
add_geo_uids() %>%
filter(Metric=="Population",`2016`>5000) %>%
pull(GeoUID) %>%
na.omit))
pow_region_change_data <- pow_region_data %>%
group_by(GeoUID,Metric) %>%
spread(key=Year,value=Value) %>%
mutate(Value=(`2016`/`2011`)^(1/5),Period="2011-2016")
pow_combined_data <- pow_region_change_data %>%
bind_rows(metro_van_region_change %>%
add_geo_uids() %>%
ungroup %>%
select(GeoUID,Metric,Period,Value,`2011`,`2016`) %>%
filter(Period=="2011-2021") %>%
mutate(Period=recode(Period,"2011-2021"="2011-2016")) %>%
mutate(Type="Projection")) %>%
left_join(geo_uid_names) %>%
filter(!is.na(`Region Name`))
pow_change_data <- left_join(pow_region_data_2016 %>%
select(GeoUID,`Region Name`,Employment_2016) %>%
rename(`Employment 2016`=Employment_2016),
pow_region_data_2011 %>%
select(GeoUID,Employment_2011) %>%
rename(`Employment 2011`=Employment_2011),
by="GeoUID") %>%
mutate(`Actual change`=`Employment 2016`/`Employment 2011`) %>%
left_join(metro_van_region_employment %>% rename(`Projected change`=Value),by="GeoUID") %>%
gather(key="Type",value="Value",c("Actual change","Projected change"))
pow_plot_data <- bind_rows(
pow_region_data_2016 %>%
select(GeoUID,employment) %>%
rename(`Employment`=employment) %>%
mutate(Year="2016",Type="Actual"),
pow_region_data_2011 %>%
select(GeoUID,employment) %>%
rename(`Employment`=employment) %>%
mutate(Year="2011",Type="Actual"),
metro_van_region_employment %>%
rename(`2011`=Employment_2011,`2016`=Employment_2016) %>%
gather(key="Year",value="Employment",c("2011","2016")) %>%
select(GeoUID,Year,Employment) %>%
mutate(Type="Projection")) %>%
left_join(pow_region_data_2016 %>%
select(GeoUID,`Region Name`,Population,`Employment_2011`, `Employment_2016`),
by="GeoUID") %>%
filter(`Employment_2011`>10000) %>%
select(GeoUID,`Region Name`,Type,Year,Employment) %>%
group_by(GeoUID,Type) %>%
spread(key=Year,value=Employment)
# ggplot(pow_plot_data,aes(y=`Region Name`,color=Type)) +
# ggalt::geom_dumbbell(aes(x=`2011`,xend=`2016`),position="dodge") +
# theme_light()
ggplot(pow_combined_data %>% filter(Period=="2011-2016"),aes(x=`Region Name`,y=Value-1,fill=Type)) +
geom_bar(stat="identity",position="dodge") +
theme_light() +
coord_flip() +
facet_wrap("Metric") +
scale_fill_manual(values=type_colors) +
scale_y_continuous(labels=scales::percent) +
theme(legend.position = "bottom") +
labs(title="Annual rate of change 2011-2016",fill="",x="",y="",
caption="Metro Vancouver Regional Growth Strategy July 28, 2017 Versions,\nStatCan Census 2011, 2016")
```
Again, projections are just projections, and our census numbers do not account for census undercounts, although most of these will divide out when taking differences. And our employment estimates, especially for smaller municipalities, should be taken with caution because our commute flow data dropped flows below 20 commuters. Thus it makes more sense to focus on the more populous municipalities.
```{r}
large_regions <- vancouver_regions_data %>%
filter(pop>=50000) %>%
arrange(-pop) %>%
rename(GeoUID=region) %>%
select(GeoUID,pop)
ggplot(pow_combined_data %>%
right_join(large_regions,by="GeoUID") %>%
mutate(`Region Name`=as.character(.data$`Region Name`)) %>%
filter(Period=="2011-2016") %>%
arrange(pop) %>%
mutate(`Region Name`=factor(`Region Name`,unique((.)$`Region Name`))),
aes(x=`Region Name`,y=Value-1,fill=Type)) +
geom_bar(stat="identity",position="dodge") +
theme_light() +
coord_flip() +
scale_fill_manual(values=type_colors) +
theme(legend.position = "bottom") +
facet_wrap("Metric") +
scale_y_continuous(labels=scales::percent) +
labs(title="Annual rate of change 2011-2016",fill="",x="Municipalities with at least 50k people in 2016",y="",
caption="Metro Vancouver Regional Growth Strategy July 28, 2017 Versions,\nStatCan Census 2011, 2016")
```
The City of North Vancouver is the only municipality that exceeded its projections on all three metrics.
The City of Vancouver, Surrey, New Westminister and Burnaby all exceeded their employment projections, while at the same time falling short of their dwelling (and their population) projections. All of the other larger municipalities, with the exception of Maple Ridge, came in below their projections on all three metrics.
Unfortunately Metro Vancouver did not detail how they arrived at their projections, so it is difficult to read too much into this discrepancy. But it provides a good opportunity to re-evaluate the metrics and assumptions they are based on. What we do see is a very good correspondence between dwelling and population growth, as is to be expected.
Of particular interest is the relationship between these metrics, in particular the one between dwellings and population. It can be useful to break down population change into population change due to change in dwelling units, population change due to change in household size and population change due to change in the rate of unoccupied dwelling units [as we have explained before](https://doodles.mountainmath.ca/blog/2017/02/10/2016-census-data/), with each component [mapped out across Canada](https://censusmapper.ca/maps/988?index=0#9/49.2158/-123.3023).
```{r}
# pow_combined_data %>%
# filter(Type=="Projection") %>%
# select(Metric,Value,`Region Name`) %>%
# spread(key=Metric,value=Value) %>%
# mutate(pop_dw=Population/Dwellings,
# emp_pop=Employment/Population) %>%
# gather(key="Ratio",value="Value",c("pop_dw","emp_pop")) %>%
# ggplot(aes(x=`Region Name`,y=Value-1,fill=Ratio)) +
# geom_bar(stat="identity",position="dodge") +
# coord_flip()
pow_combined_data %>%
filter(Metric!="Employment") %>%
select(GeoUID,Metric,Type,Value,`Region Name`) %>%
group_by(GeoUID,`Region Name`,Type) %>%
spread(key=Metric,value=Value) %>%
mutate(Value=Population/Dwellings) %>%
ggplot(aes(x=`Region Name`,y=Value-1,fill=Type)) +
geom_bar(stat="identity",position="dodge") +
scale_y_continuous(limits = c(-0.026,0.006), breaks=c(-0.02,-0.01,0),
labels=c(-0.02,-0.01,0) +1) +
coord_flip() +
theme_light() +
labs(title="Population growth rate vs dwelling growth rate 2011-2016",x="",y="",
fill="",
caption="Metro Vancouver Regional Growth Strategy July 28, 2017 Versions,\nStatCan Census 2011, 2016")
```
Generally we expect population to grow slower than dwellings due to shrinking household size, but this is not always the case as the data shows. For example, household size increase in the City of North Vancouver, and the increase was large just enough to make up for a slight increase in dwellings not occupied by usual residents. Maple Ridge and the District of Langley on the other hand experienced a slight drop in household size, but this was more than made up for by a drop in dwellings not occupied by usual residents. New Westminster saw both, an increase in household size and a drop in dwellings not occupied by usual residents, which boosted the ratio. Our interactive map makes it easy to explore the reasons for these census ratios [in the case of the population in private households only](https://censusmapper.ca/maps/988?index=0#9/49.2158/-123.3023) as well as [in a more complex visualization for the entire population](https://censusmapper.ca/maps/596#9/49.2526/-123.1437) (which slightly changes the interpretation for Maple Ridge, presumably due to an increase in collective dwellings).
We should point out that for Electoral A there has been a significant number of reclassification from collective to private dwellings within parts of UBC student housing that register as unoccupied by usual residents in the census, as well as significant new construction, both of which add to "unoccupied" dwellings and drag down the ratios from the census shown in the graph above.
To quantify the relationship between population and dwelling growth rates we can fit a simple linear model to the RGS targets, as well as to growth rates derived from census data.
```{r}
pop_dw_data <- pow_combined_data %>%
filter(Metric!="Employment") %>%
select(`Region Name`,Type,Metric,Value) %>%
spread(key=Metric,value=Value) %>%
left_join(
pow_combined_data %>%
filter(Metric=="Dwellings") %>%
ungroup() %>%
select(GeoUID,`2011`,Type) %>%
rename(`Base Dwellings`=`2011`),by=c("GeoUID","Type")) %>%
left_join(
pow_combined_data %>%
filter(Metric=="Population") %>%
ungroup() %>%
select(GeoUID,`2011`,Type) %>%
rename(`Base Population`=`2011`),by=c("GeoUID","Type"))
model1 = lm(Population~Dwellings,data=pop_dw_data %>% filter(Type=="Census"))
model2 = lm(Population~Dwellings,data=pop_dw_data %>% filter(Type=="Projection"))
ggplot(pop_dw_data,aes(x=Dwellings,y=Population,color=Type)) +
geom_point() +
geom_smooth(method="glm") +
theme_light() +
# ggrepel::geom_text_repel(aes(label=`Region Name`),data=pop_dw_data %>%
# filter(`Base Population`>=100000,
# Type=="Census"),min.segment.length = 0,show.legend =FALSE) +
labs(title="Population growth rate vs dwelling growth rate 2011-2016",
x="Annual dwelling growth",y="Annual population growth",
colour="",
caption="Metro Vancouver Regional Growth Strategy July 28, 2017 Versions,\nStatCan Census 2011, 2016")
```
Census data gives a better fit than to the RGS projections, with an adjusted *R<sup>2</sup>* of `r round(summary(model1)$adj.r.squared,3)` for census data and `r round(summary(model2)$adj.r.squared,3)` for the RGS. This points to room for refinement. One should be able to improve the model fit by accounting for collective dwellings, as well as number of units completed in the months before census day for each of the census years, [which has a noticeable impact on the number of unoccupied units as we have explained before](https://doodles.mountainmath.ca/blog/2017/04/03/joyce-collingwood/). A more detailed understanding of the number of bedrooms of the added (and lost) dwelling units, as well as tenure, is [a great predictor of household size](https://doodles.mountainmath.ca/blog/2019/04/20/a-bedroom-is-a-bedroom/), information on the mix of bedrooms should give another significant boost to improve the model fit. Lastly, we need to recognize that change in wealth is another measure, as change in wealth is generally associated with consumption of more housing (and shrinking household size).
The takeaway is that the relationship between new dwelling units and population growth is very predictable, and in our supply constrained market we find that population growth is constrained by dwelling growth.
## Planning for the region
Dwelling growth in our region is highly regulated. The Regional Growth Strategy is only a guiding document, but it is often used as a plan that allocates growth to the region overall and distributes it within the region.
We have already seen that there are significant latent population pressures, partially due to dwelling growth targets set out by the Growth Strategy not being met, and partially by these targets likely being too low. Here we want to focus on how the Growth Strategy is distributing that growth, and how that might relate to people's preferences.
```{r eval=FALSE, include=FALSE}
ggplot(metro_van_region_change %>%
ungroup %>%
filter(Period=="2011-2021",!grepl("Tsawwassen|Rest of",MUNICIPALITY)) %>%
mutate(MUNICIPALITY=factor(MUNICIPALITY,levels=metro_van_region_change %>%
filter(Metric=="Population",Period=="2011-2021") %>%
arrange(`2011`) %>%
pull(MUNICIPALITY))),
aes(x=MUNICIPALITY,y=Value-1,fill=Metric)) +
geom_bar(stat="identity",position="dodge") +
coord_flip() +
scale_fill_manual(values=metric_colors) +
scale_y_continuous(labels=scales::percent) +
theme_light() +
labs(x="",y="Projected annual growth",fill="",
title="Regional breakdown of projected growth 2011-2021",
caption="Metro Vancouver Regional Growth Strategy July 28, 2017 Version")
```
We have already seen above that the projections assume stronger growth in some of the outlying regions than in some of the central regions. To bring this out more, we compare the projected growth in the City of Vancouver to the rest of Metro Vancouver.
```{r}
ggplot(metro_van_region_change %>%
ungroup %>%
filter(Period=="2011-2021",grepl("^Vancouver$|Rest of",MUNICIPALITY)) %>%
mutate(MUNICIPALITY=factor(MUNICIPALITY,levels=metro_van_region_change %>%
filter(Metric=="Population",Period=="2011-2021") %>%
arrange(-`2011`) %>%
pull(MUNICIPALITY))),
aes(x=MUNICIPALITY,y=Value-1,fill=Metric)) +
geom_bar(stat="identity",position="dodge") +
coord_flip() +
scale_fill_manual(values=metric_colors) +
scale_y_continuous(labels=scales::percent) +
theme_light() +
labs(x="",y="Projected annual growth",fill="",
title="Regional breakdown of projected growth 2011-2021",
caption="Metro Vancouver Regional Growth Strategy July 28, 2017 Version")
```
The Regional Growth Strategy is projecting that the City of Vancouver will grow slower than the rest of the region. This makes sense if we assume that, on balance, people would rather live further out than they do right now. But I all evidence points toward the opposite being true, that is that people would, on balance, rather live in the City of Vancouver than in more outlying municipalities. Allocating growth in direct opposition to people's preferences serves to drive up prices and rents in the central areas, and to drive low income people out. Which is [exactly what we have seen when looking at income data](https://doodles.mountainmath.ca/blog/2017/09/26/evolution-of-the-income-distribution/) where incomes have rising much faster in the City of Vancouver than in Metro Vancouver overall, with the higher income brackets growing faster and low income brackets dropping faster in the City of Vancouver.
## Upshot
The last two sections illustrate two main dangers of an overly restrictive planning regimen.
* When planners don't manage to hit their dwelling targets, or set their dwelling targets too low, we end up with higher population pressures and hit labour market constraints.
* When planners don't manage to allocate the growth within the region according to people's preferences, it will create artificial price gradients and push people out of their preferred areas and lengthen commutes.
Both of these outcomes have strong negative consequences for individuals, as well as the for the region overall. This is why it is very important that planners pay close attention to early warning signals of dwelling shortages as well as within-region location preferences.
More transparent models would also be helpful, especially given how high the stakes are when the dwelling targets are set too low. It would be helpful if planners, maybe in conjunction with economists, could explain better what consequences different planning scenarios may likely have for the region, so the politicians and the general public can make better decisions on how the region should grow.
In our highly regulated environment it is prudent to develop better metrics that can be updated regularly and that help inform of region-wide housing shortages, as well as more localized housing shortages within the region. Recently [Nathan Lauster](https://twitter.com/LausterNa) and I set out to [draw up an outline of how such metrics could be constructed](https://doodles.mountainmath.ca/blog/2019/06/12/simple-metrics-for-deciding-if-you-have-enough-housing/), a task we are planning on following up on with more concrete metrics and code that implement them.
As usual, the code for the analysis is [available on GitHub](https://github.com/mountainMath/doodles/blob/master/content/posts/2019-08-01-on-vancouver-population-projections.Rmarkdown).
<!-- ## Growth without projections -->
<!-- It should be noted that there is an alternative model how to do this, one can simply let the market respond flexibly to population pressures and build as many dwelling units as it wants. Sometimes the market will build too many dwelling units in locations that people don't really want to live, resulting in lower prices and possibly financial losses for the developer. And building activity in that location will slow down. Sometimes the market will build too few units in a location and prices will rise, resulting in higher than expected gains for developers and building activity in that location will increase. In this model, the role of municipal and regional planning would be to facilitate the activity of developers by ensuring that adequate services are provided, that enough park space and other public services are available to accommodate the growth. This model greatly simplifies the planning process, as it only needs to respond to development activity and does not need to rely on projections. -->
You can’t perform that action at this time.