# New housing starts and completions in Canada
*June 15, 2022*

Let's take a look at some housing data in Canada (always a hot topic). Specifically, I want to see new builds and completions across the country.

We start by importing pandas for working with the data, a few modules for downloading and unzipping files (so we can pull the latest data directly from StatsCan), and datawrappergraphics for publishing the charts.

In [4]:
import pandas as pd
from zipfile import ZipFile
from io import BytesIO
import requests
import datawrappergraphics

Now we download the data and pull the CSV out of the zip file.

In [5]:
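# Download the zipped CSV of the table from StatsCan.
r = requests.get("https://www150.statcan.gc.ca/n1/en/tbl/csv/34100143-eng.zip?st=XKh0NRc8")
r.raise_for_status()  # stop here if the download failed
# Open the first file in the archive and read it as a CSV.
files = ZipFile(BytesIO(r.content))
file = files.open(files.namelist()[0])
raw = pd.read_csv(file, encoding="utf-8")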

raw.head(5)



,REF_DATE,GEO,DGUID,Housing estimates,Type of unit,UOM,UOM_ID,SCALAR_FACTOR,SCALAR_ID,VECTOR,COORDINATE,VALUE,STATUS,SYMBOL,TERMINATED,DECIMALS
0,1948-01,Canada,2016A000011124,Housing starts,Total units,Units,300,units,0,v729949,1.1.1,1456.0,,,,0
1,1948-01,Canada,2016A000011124,Housing under construction,Total units,Units,300,units,0,v730976,1.2.1,29510.0,,,,0
2,1948-01,Canada,2016A000011124,Housing completions,Total units,Units,300,units,0,v731911,1.3.1,3196.0,,,,0
3,1948-01,Atlantic provinces,2016A00011,Housing starts,Total units,Units,300,units,0,v729950,2.1.1,53.0,,,,0
4,1948-01,Atlantic provinces,2016A00011,Housing under construction,Total units,Units,300,units,0,v730977,2.2.1,,x,,,0
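
The download-and-unzip step above can be wrapped in a small helper, which makes it easier to reuse for other StatsCan tables. This is only a sketch: `read_first_csv_from_zip` and `fetch_table` are hypothetical names, not part of any library, and the 60-second timeout is an assumption.

```python
from io import BytesIO
from zipfile import ZipFile

import pandas as pd


def read_first_csv_from_zip(content: bytes) -> pd.DataFrame:
    """Read the first file inside a zip archive (given as bytes) into a DataFrame."""
    with ZipFile(BytesIO(content)) as archive:
        first_name = archive.namelist()[0]
        with archive.open(first_name) as handle:
            return pd.read_csv(handle, encoding="utf-8")


def fetch_table(url: str) -> pd.DataFrame:
    """Download a zipped CSV and parse it (hypothetical helper)."""
    import requests  # imported here so the parsing helper also works offline

    response = requests.get(url, timeout=60)  # timeout value is an assumption
    response.raise_for_status()  # fail loudly rather than parse an error page
    return read_first_csv_from_zip(response.content)
```

Keeping the parsing separate from the download means the parsing half can be tried on any zip file already on disk, without hitting the StatsCan server.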


### New starts

Let's look at the Canada-wide numbers, beginning with housing starts for all types of units (single-family detached, apartments, etc.).

In [47]:
canada = (raw
          .loc[(raw["GEO"] == "Canada") &
               (raw["Housing estimates"] == "Housing starts") &
               (raw["Type of unit"] == "Total units"), :]
          .pivot(columns="GEO", index="REF_DATE", values="VALUE")
          .reset_index()
          )

canada.tail(5)

GEO,REF_DATE,Canada
888,2022-01,13388.0
889,2022-02,15453.0
890,2022-03,16099.0
891,2022-04,20775.0
892,2022-05,22850.0


Because this data is not seasonally adjusted, and far more builds start in the summer months than in the winter ones, let's compare the same month, May, in each year.

In [43]:
canada["REF_DATE"] = pd.to_datetime(canada["REF_DATE"])
may = canada[canada["REF_DATE"].dt.month == 5]

may.tail(5)

GEO,REF_DATE,Canada
844,2018-05-01,15985.0
856,2019-05-01,16409.0
868,2020-05-01,16014.0
880,2021-05-01,22098.0
892,2022-05-01,22850.0
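
Comparing the same month across years sidesteps the seasonal swing, and it also makes a year-over-year change easy to compute. The sketch below re-enters the five May values shown above by hand and uses pandas' `pct_change`; the `may_sample` frame and the `yoy_pct` column name are my own, not part of the notebook.

```python
import pandas as pd

# The five May values shown in the output above, re-entered by hand.
may_sample = pd.DataFrame({
    "REF_DATE": pd.to_datetime(["2018-05", "2019-05", "2020-05", "2021-05", "2022-05"]),
    "Canada": [15985.0, 16409.0, 16014.0, 22098.0, 22850.0],
})

# Percentage change from the previous May; the first row has nothing to compare against.
may_sample["yoy_pct"] = (may_sample["Canada"].pct_change() * 100).round(1)

print(may_sample)
```

The same call should work on the full `may` dataframe, since it has one row per year.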


Now we'll send that dataframe to datawrapper!

In [36]:
(datawrappergraphics.Chart("NrVAt")
    .data(may)
    .head("New housing <b>starts</b> in May in Canada since 1950")
    .publish()
    .show()
 )

INFO:root:SUCCESS: Data added to chart.
INFO:root:SUCCESS: Metadata updated.
INFO:root:SUCCESS: Chart head added.
INFO:root:SUCCESS: Chart published!


Let's also see where the most recent month ranks, all-time.

In [54]:
rank = canada.sort_values("Canada", ascending=False)

rank["rank"] = range(1, len(canada)+1)

rank = rank.set_index("rank")

rank.head(10)

rank,REF_DATE,Canada
1,2021-11,25004.0
2,1987-05,24846.0
3,2021-06,23573.0
4,1987-06,23433.0
5,1976-06,23301.0
6,1975-10,23181.0
7,1970-10,23161.0
8,2022-05,22850.0
9,1976-05,22799.0
10,1972-10,22592.0


May 2022 ranks 8th all-time, and it is not far behind the record month: 22,850 starts against 25,004 in November 2021, or about 9% fewer.
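
An alternative to sorting and numbering the rows by hand is pandas' `Series.rank`, which assigns the rank directly and handles ties. The sketch below uses only the ten values from the table above, entered by hand, so the ranks match the full data only because these are the top ten. The `top` frame and the `pct_below_record` column name are my own.

```python
import pandas as pd

# Top ten months from the ranking table above, re-entered by hand.
top = pd.DataFrame({
    "REF_DATE": ["2021-11", "1987-05", "2021-06", "1987-06", "1976-06",
                 "1975-10", "1970-10", "2022-05", "1976-05", "1972-10"],
    "Canada": [25004.0, 24846.0, 23573.0, 23433.0, 23301.0,
               23181.0, 23161.0, 22850.0, 22799.0, 22592.0],
})

# method="min" gives tied months the same (best) rank.
top["rank"] = top["Canada"].rank(ascending=False, method="min").astype(int)

# How far each month sits below the record, in percent.
record = top["Canada"].max()
top["pct_below_record"] = ((record - top["Canada"]) / record * 100).round(1)

latest = top.loc[top["REF_DATE"] == "2022-05"].iloc[0]
print(latest["rank"], latest["pct_below_record"])
```

Unlike the `range`-based numbering, `rank` does not depend on the row order, so it can be added to the dataframe without sorting it first.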

### Completions

Now we'll do the same, but for completions rather than starts.

In [44]:
completions = (raw
               .loc[(raw["GEO"] == "Canada") &
                    (raw["Housing estimates"] == "Housing completions") &
                    (raw["Type of unit"] == "Total units"), :]
                .pivot(columns="GEO", index="REF_DATE", values="VALUE")
                .reset_index()
                )

completions.tail(5)

GEO,REF_DATE,Canada
888,2022-01,14218.0
889,2022-02,13624.0
890,2022-03,16411.0
891,2022-04,14721.0
892,2022-05,17761.0
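
Since the starts and completions cells differ only in the `Housing estimates` label, the filter-and-pivot step could be factored into one function. A minimal sketch, assuming the column names shown in the raw table above; `national_series` is a hypothetical name, and `sample` is a tiny stand-in built from the first rows of the raw output.

```python
import pandas as pd


def national_series(raw: pd.DataFrame, estimate: str,
                    geo: str = "Canada", unit: str = "Total units") -> pd.DataFrame:
    """Return one row per REF_DATE with the VALUE for the chosen estimate."""
    mask = (
        (raw["GEO"] == geo)
        & (raw["Housing estimates"] == estimate)
        & (raw["Type of unit"] == unit)
    )
    return (
        raw.loc[mask]
        .pivot(columns="GEO", index="REF_DATE", values="VALUE")
        .reset_index()
    )


# Tiny stand-in for the StatsCan table, built from the first rows shown earlier.
sample = pd.DataFrame({
    "REF_DATE": ["1948-01"] * 3,
    "GEO": ["Canada"] * 3,
    "Housing estimates": ["Housing starts", "Housing under construction",
                          "Housing completions"],
    "Type of unit": ["Total units"] * 3,
    "VALUE": [1456.0, 29510.0, 3196.0],
})

print(national_series(sample, "Housing completions"))
```

With the real data, `national_series(raw, "Housing starts")` and `national_series(raw, "Housing completions")` should reproduce the two dataframes built in this notebook.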


And again, we keep only the May figure for each year.

In [45]:
completions["REF_DATE"] = pd.to_datetime(completions["REF_DATE"])
completions = completions[completions["REF_DATE"].dt.month == 5]

completions.tail(5)

GEO,REF_DATE,Canada
844,2018-05-01,16696.0
856,2019-05-01,14020.0
868,2020-05-01,14797.0
880,2021-05-01,18035.0
892,2022-05-01,17761.0


In [38]:
(datawrappergraphics.Chart("Z2ag8")
    .data(completions)
    .head("New housing <span style='color:#C42127; font-weight:bold'>completions</span> in May in Canada since 1950")
    .publish()
    .show()
 )

INFO:root:SUCCESS: Data added to chart.
INFO:root:SUCCESS: Metadata updated.
INFO:root:SUCCESS: Chart head added.
INFO:root:SUCCESS: Chart published!


That's all for now. It would also be interesting to look at this data for other regions in Canada to see where housing starts or completions are surging or faltering.

\-30\-