A small side project (from this already-side-project!) to look at the lowest recorded temperatures in the area. 
I am looking at installing a high-efficiency heat pump that only works when temperatures are above -13F (-25C). 
Below that, the heat pump stops working and the house has no heat at the worst possible time. 
So the obvious question is: How often does it get below (or near) that temperature?

I suspect this information is available online somewhere, but I've got all this tooling right here, so what the heck.
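
Before digging in, a quick sanity check that the two cutoff figures quoted above really are the same temperature. This is just a sketch; the helper names `fahrenheit_to_celsius` and `celsius_to_fahrenheit` are made up for this check and are not part of `download_historical_data`.

```python
def fahrenheit_to_celsius(deg_f: float) -> float:
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32) * 5 / 9


def celsius_to_fahrenheit(deg_c: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return deg_c * 9 / 5 + 32


# Heat pump cutoff as quoted in the text above
CUTOFF_F = -13
cutoff_c = fahrenheit_to_celsius(CUTOFF_F)

print(f"{CUTOFF_F}F is {cutoff_c:.1f}C")
print(f"{cutoff_c:.1f}C is {celsius_to_fahrenheit(cutoff_c):.1f}F")
```

The rest of the notebook works in Celsius, so -25 is the number to watch for.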

In [None]:
import download_historical_data as dl
import os 
import matplotlib.pyplot as plt
import pandas as pd

plt.style.use("default")  # alternative "ggplot"

HISTORICAL_DATA_DIR = os.path.abspath("./historical_data")
WEATHER_DATA_DIR = os.path.join(HISTORICAL_DATA_DIR, "weather_station_data")
ANALYSIS_DATA_DIR = os.path.abspath("./analysis_data/")
LOW_TEMP_DATA_DIR = os.path.join(ANALYSIS_DATA_DIR, "low_temps")

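# Parents are listed before children; exist_ok makes re-running this cell harmless.
# (data_dir rather than dir, so the builtin dir() is not shadowed.)
for data_dir in [HISTORICAL_DATA_DIR, WEATHER_DATA_DIR, ANALYSIS_DATA_DIR, LOW_TEMP_DATA_DIR]:
    os.makedirs(data_dir, exist_ok=True)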


## Denver-area stations only
WEATHER_STATION_IDS = [
    "USC00053005",  # Ft Collins
    "USC00050848",  # Boulder
    "USC00055984",  # Northglenn
    "USC00058995",  # Wheat Ridge
    "USC00054762",  # Lakewood
    "USW00023062"   # Stapleton
]

In [None]:
# Uncomment to force re-download of source data
# Data files are saved locally so you only need to re-download to get new/different data
#dl.download_ghcnd_historical_data(WEATHER_DATA_DIR, WEATHER_STATION_IDS)

In [None]:
temp_df = dl.read_weather_data(WEATHER_DATA_DIR, WEATHER_STATION_IDS, earliest_date=None)
temp_df.head()

In [None]:
# Only the daily minimums matter here, so drop every *_tmax column in one call
tmax_columns = [c for c in temp_df.columns if c.endswith("_tmax")]
temp_df.drop(columns=tmax_columns, inplace=True)

renames = {
    "USC00053005_tmin": "FtCollins",
    "USC00050848_tmin": "Boulder",
    "USC00055984_tmin": "Northglenn",
    "USC00058995_tmin": "WheatRidge",
    "USC00054762_tmin": "Lakewood",
    "USW00023062_tmin": "Stapleton"
}
temp_df.rename(columns=renames, inplace=True)
temp_df.head()

In [None]:
# Reshape from one column per station to one row per (date, station) pair
stacked = temp_df.stack()
stacked.index.set_names(names="station", level=1, inplace=True)
stacked = stacked.to_frame(name="celsius")
stacked["fahrenheit"] = (stacked["celsius"] * 1.8) + 32
stacked.head()

In [None]:
## Let's get rid of data before 2005
stacked = stacked.loc(axis=0)['2005-01-01':]
stacked.head(10)

In [None]:
# Sort by date, latest->earliest
stacked.sort_index(axis=0, level=0, ascending=False, inplace=True)
stacked.head(10)

In [None]:
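# Sort by minimum temperature, warmest first; the stable sort keeps the date order for ties
# NOTE: this name shadows the builtin sorted() for the rest of the notebook
sorted = stacked.sort_values(by="celsius", ascending=False, kind="stable")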
sorted.head()

In [None]:
# On how many days did at least one station get down to our cutoff temp of -25C?
very_low_days = sorted[sorted["celsius"] <= -25]
# Group on index level 0 (the date); len() of a groupby is the number of groups, i.e. distinct days
grouped = very_low_days.groupby(level=0)
len(grouped)

In [None]:
very_low_days.head(25)

In [None]:
## How many days get _close_ to our cutoff?
# Note: low_days holds one row per station-day; len(grouped_low) counts distinct days
low_days = sorted[sorted["celsius"] < -20]
grouped_low = low_days.groupby(level=0)
len(grouped_low)

In [None]:
low_days.head(20)
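
The day versus station-day distinction noted above is easy to get wrong, so here is a minimal sketch of the difference. The readings below are made up for illustration (they are not real station data), and the frame only mimics the (date, station) index used in this notebook.

```python
import pandas as pd

# Made-up readings, one row per (date, station) pair
index = pd.MultiIndex.from_tuples(
    [
        ("2010-01-07", "Boulder"),
        ("2010-01-07", "Lakewood"),
        ("2010-01-08", "Boulder"),
        ("2012-12-20", "Stapleton"),
        ("2012-12-21", "Stapleton"),
    ],
    names=["date", "station"],
)
demo = pd.DataFrame({"celsius": [-22.0, -21.5, -24.0, -26.0, -12.0]}, index=index)

near_cutoff = demo[demo["celsius"] < -20]

# Every row is one station on one day
station_days = len(near_cutoff)

# Collapse stations: a day counts once no matter how many stations were cold
distinct_days = near_cutoff.index.get_level_values("date").nunique()

print(f"station-days: {station_days}, distinct days: {distinct_days}")
```

For the heat pump question the distinct-day count is the one that matters, since a single cold night shows up at several stations at once.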

In [None]:
low_days.hist()

OK, so there are some cold days. How cold does it get here? That is, what temperature would a heat pump have to support to stay reliable through the lows already on record?

The obvious way to answer that is to look at the coldest temperatures recorded recently. 

In [None]:
## Record-breaking coldest days
rev_sorted = stacked.sort_values(by="celsius", ascending=True, kind="stable")
rev_sorted.head(20)
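
A complementary view, sketched here under some assumptions: instead of one long list of coldest readings, take the single coldest reading per calendar year across all stations, then see how many years dipped to the cutoff. The data below is invented for illustration, and the sketch assumes the date level of the index is a real datetime type; I haven't run this against the station data above.

```python
import pandas as pd

# Invented readings (not real records), indexed by (date, station)
dates = pd.to_datetime(
    ["2010-01-07", "2010-01-07", "2012-12-20", "2012-12-21", "2015-02-03"]
)
stations = ["Boulder", "Stapleton", "Boulder", "Stapleton", "Boulder"]
index = pd.MultiIndex.from_arrays([dates, stations], names=["date", "station"])
demo = pd.DataFrame({"celsius": [-27.0, -25.5, -24.0, -28.5, -19.0]}, index=index)

CUTOFF_C = -25  # heat pump cutoff from the top of the notebook

# Calendar year of each row, used as the grouping key
years = demo.index.get_level_values("date").year.rename("year")

# Coldest reading at any station in each year
coldest_per_year = demo["celsius"].groupby(years).min()

# Number of years with at least one reading at or below the cutoff
years_at_or_below_cutoff = int((coldest_per_year <= CUTOFF_C).sum())

print(coldest_per_year)
print(f"{years_at_or_below_cutoff} of {len(coldest_per_year)} years reached {CUTOFF_C}C")
```

Grouping by calendar year splits a single winter across two years, so a cold snap around New Year's could be counted twice; grouping by winter season would avoid that at the cost of a slightly fiddlier key.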