# Download Environment Canada Daily Data

You can skip the download steps in this notebook and get the data directly from this Dropbox link:

[a500_data.zip](https://www.dropbox.com/s/1bganh60983pges/a500_pandas_data.zip?dl=0)

Unzip the archive to create the folder `data` in the same pandas folder as this notebook.
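If you prefer to unzip from Python rather than from a file manager, the standard library `zipfile` module can do it. This is a minimal sketch: `extract_archive` is a hypothetical helper, and the archive name in the usage comment assumes the file was saved as `a500_pandas_data.zip` next to this notebook.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, target_dir):
    """Extract every member of zip_path into target_dir and return the member names."""
    target_dir = Path(target_dir)
    # Create the target folder if it does not exist yet
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
        return archive.namelist()


# Usage (assumed file name, uncomment after downloading the archive):
# extract_archive("a500_pandas_data.zip", ".")
```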

In [1]:
from pathlib import Path

import pandas as pd
import requests

## Set the context for this notebook

Importing the `context` module checks whether
`data/processed` and `data/raw` exist and complains if
it can't find them.

In [2]:
import context

in context.py, setting root_dir to /Users/phil/repos
******************************
context imported. Front of path:
/Users/phil/repos
/Users/phil/repos/pandas_yvr
******************************
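The source of `context.py` is not shown in this notebook. As a rough illustration of the kind of check described above, here is a minimal sketch; `check_data_dirs` is a hypothetical function, not the module's actual code, and it assumes the folders live under `<root>/data`.

```python
from pathlib import Path


def check_data_dirs(root_dir):
    """Return (raw_dir, processed_dir) under root_dir/data, or raise if either is missing."""
    data_dir = Path(root_dir) / "data"
    raw_dir = data_dir / "raw"
    processed_dir = data_dir / "processed"
    # Collect every expected folder that is not present on disk
    missing = [str(d) for d in (raw_dir, processed_dir) if not d.is_dir()]
    if missing:
        raise FileNotFoundError(f"missing data folders: {', '.join(missing)}")
    return raw_dir, processed_dir
```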



## Station Inventory

* Instructions: copy and paste this URL into a browser to open the FTP directory: ftp://client_climate@ftp.tor.ec.gc.ca/Pub/Get_More_Data_Plus_de_donnees/

* To get the station inventory (a 1.3 MB CSV file), copy and paste this URL into
  a browser (`%20` is the URL encoding of a space character):

  ftp://client_climate@ftp.tor.ec.gc.ca/Pub/Get_More_Data_Plus_de_donnees/Station%20Inventory%20EN.csv

  We have saved a copy in `data/Station Inventory EN.csv`
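The `%20` in the URL is the percent-encoding of a space. As a small illustration, the standard library `urllib.parse` module converts between the encoded name in the URL and the file name saved on disk:

```python
from urllib.parse import quote, unquote

encoded_name = "Station%20Inventory%20EN.csv"

# Decode the percent-encoded name into the plain file name
decoded_name = unquote(encoded_name)
print(decoded_name)

# Encoding the plain name again gives back the form used in the URL
print(quote(decoded_name))
```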

In [3]:
#
# adjacent string literals inside parentheses are joined into one string,
# and the whitespace (blanks, tabs, newlines) between them is ignored,
# so we can split a long string up like this:
#
f = (
    "ftp://client_climate@ftp.tor.ec.gc.ca/Pub/"
    "Get_More_Data_Plus_de_donnees/Station%20Inventory%20EN.csv"
)

In [5]:
inventory_file = context.data_dir / "Station Inventory EN.csv"
inventory = pd.read_csv(inventory_file, skiprows=3)

# Rename some of the columns to more convenient labels
cols_dict = {
    "TC ID": "Airport Code",
    "Station ID": "Env Canada ID",
    "Latitude (Decimal Degrees)": "Latitude (deg)",
    "Longitude (Decimal Degrees)": "Longitude (deg)",
}
inventory = inventory.rename(columns=cols_dict)

print(inventory.shape)
inventory.head()

(8756, 19)


Unnamed: 0,Name,Province,Climate ID,Env Canada ID,WMO ID,Airport Code,Latitude (deg),Longitude (deg),Latitude,Longitude,Elevation (m),First Year,Last Year,HLY First Year,HLY Last Year,DLY First Year,DLY Last Year,MLY First Year,MLY Last Year
0,ACTIVE PASS,BRITISH COLUMBIA,1010066,14,,,48.87,-123.28,485200000,-1231700000,4.0,1984,1996,,,1984.0,1996.0,1984.0,1996.0
1,ALBERT HEAD,BRITISH COLUMBIA,1010235,15,,,48.4,-123.48,482400000,-1232900000,17.0,1971,1995,,,1971.0,1995.0,1971.0,1995.0
2,BAMBERTON OCEAN CEMENT,BRITISH COLUMBIA,1010595,16,,,48.58,-123.52,483500000,-1233100000,85.3,1961,1980,,,1961.0,1980.0,1961.0,1980.0
3,BEAR CREEK,BRITISH COLUMBIA,1010720,17,,,48.5,-124.0,483000000,-1240000000,350.5,1910,1971,,,1910.0,1971.0,1910.0,1971.0
4,BEAVER LAKE,BRITISH COLUMBIA,1010774,18,,,48.5,-123.35,483000000,-1232100000,61.0,1894,1952,,,1894.0,1952.0,1894.0,1952.0
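To see what `skiprows=3` and `rename` do in the cell above, here is a small self-contained sketch. The in-memory file is a made-up stand-in for the real inventory: its three preamble lines are placeholders, and only three of the columns are included.

```python
import io

import pandas as pd

# Hypothetical stand-in for the inventory file: three preamble lines, then the header row
sample = io.StringIO(
    "Preamble line one (made up)\n"
    "Preamble line two (made up)\n"
    "Preamble line three (made up)\n"
    "Name,Station ID,TC ID\n"
    "ACTIVE PASS,14,\n"
    "VANCOUVER INTL A,51442,YVR\n"
)

# skiprows=3 drops the preamble so the fourth line is read as the header
demo = pd.read_csv(sample, skiprows=3)

# rename only touches the columns listed in the dictionary
demo = demo.rename(columns={"TC ID": "Airport Code", "Station ID": "Env Canada ID"})
print(demo.columns.tolist())
```

Stations without an airport code, such as ACTIVE PASS, get a missing value (`NaN`) in the `Airport Code` column.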


## Info for Selected Station

To download data for the Vancouver Airport station (airport code YVR), we need the ID codes that Environment Canada uses for this station. Here is how we find the numerical codes for 'YVR'.
Note that the ID changed from 889 to 51442 in 2013.

In [6]:
station = "YVR"

# Extract the inventory row(s) corresponding to this station
station_info = inventory[inventory["Airport Code"] == station]
station_info

Unnamed: 0,Name,Province,Climate ID,Env Canada ID,WMO ID,Airport Code,Latitude (deg),Longitude (deg),Latitude,Longitude,Elevation (m),First Year,Last Year,HLY First Year,HLY Last Year,DLY First Year,DLY Last Year,MLY First Year,MLY Last Year
996,VANCOUVER INTL A,BRITISH COLUMBIA,1108395,51442,71892.0,YVR,49.19,-123.18,491141000,-1231102000,4.3,2013,2019,2013.0,2019.0,2013.0,2019.0,,
1008,VANCOUVER INT'L A,BRITISH COLUMBIA,1108447,889,,YVR,49.2,-123.18,491142000,-1231055000,4.3,1937,2013,1953.0,2013.0,1937.0,2013.0,1937.0,2013.0
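Rather than reading the IDs off the table by eye, they can be pulled out of the rows programmatically. The sketch below rebuilds the two YVR rows by hand, keeping only the columns it needs, so that it runs on its own; `station_rows` and `daily_years` are illustrative names.

```python
import pandas as pd

# The two YVR rows from the table above, reduced to the columns used here
station_rows = pd.DataFrame(
    {
        "Name": ["VANCOUVER INTL A", "VANCOUVER INT'L A"],
        "Env Canada ID": [51442, 889],
        "Airport Code": ["YVR", "YVR"],
        "DLY First Year": [2013.0, 1937.0],
        "DLY Last Year": [2019.0, 2013.0],
    }
)

# Sort chronologically and map each ID to its (first, last) year of daily data
daily_years = {
    int(row["Env Canada ID"]): (int(row["DLY First Year"]), int(row["DLY Last Year"]))
    for _, row in station_rows.sort_values("DLY First Year").iterrows()
}
print(daily_years)
```

Both IDs list 2013, which is why that year is downloaded once for each station ID.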


### Download Data

First, define a function to download the CSV data using the Environment Canada API:

In [7]:
def download_daily_raw(env_canada_id, year, savefile="test.csv", verbose=True):
    """Download CSV file of daily data for selected station and year"""

    # URL endpoint and query parameters
    url_endpoint = "http://climate.weather.gc.ca/climate_data/bulk_data_e.html"
    params = {
        "format": "csv",
        "stationID": env_canada_id,
        "Year": year,
        "Month": "01",
        "Day": "01",
        "timeframe": "2",
        "submit": " Download Data",
    }

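    # Send GET request and stop on an HTTP error status,
    # so that an error page is not saved as if it were CSV data
    response = requests.get(url_endpoint, params=params)
    response.raise_for_status()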

    # Download CSV file
    if verbose:
        print(f"Saving to {savefile}")
    with open(savefile, "wb") as f:
        f.write(response.content)

    return None
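To inspect the request without sending it, the query string can be built by hand with the standard library. This is a sketch of approximately the URL that `requests` assembles from `url_endpoint` and `params`; the station ID and year are filled in with the values for the first early-period file.

```python
from urllib.parse import urlencode

url_endpoint = "http://climate.weather.gc.ca/climate_data/bulk_data_e.html"
params = {
    "format": "csv",
    "stationID": 889,
    "Year": 1937,
    "Month": "01",
    "Day": "01",
    "timeframe": "2",
    "submit": " Download Data",
}

# urlencode joins the key=value pairs with "&" and encodes spaces as "+"
full_url = f"{url_endpoint}?{urlencode(params)}"
print(full_url)
```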

*Note: The code below uses [f-strings](https://realpython.com/python-f-strings/) to substitute variable values into a string*

In [8]:
# Early data (1937 to mid 2013)
stn_id_early = 889  # station id for YVR airport
years_early = range(1937, 2014)

for year in years_early:
    savefile = context.raw_dir / Path(f"weather_daily_{station}_{stn_id_early}_{year}.csv")
    download_daily_raw(stn_id_early, year, savefile=savefile)

Saving to /Users/phil/repos/pandas_yvr/data/raw/weather_daily_YVR_889_1937.csv
Saving to /Users/phil/repos/pandas_yvr/data/raw/weather_daily_YVR_889_1938.csv
Saving to /Users/phil/repos/pandas_yvr/data/raw/weather_daily_YVR_889_1939.csv
Saving to /Users/phil/repos/pandas_yvr/data/raw/weather_daily_YVR_889_1940.csv
Saving to /Users/phil/repos/pandas_yvr/data/raw/weather_daily_YVR_889_1941.csv
Saving to /Users/phil/repos/pandas_yvr/data/raw/weather_daily_YVR_889_1942.csv
Saving to /Users/phil/repos/pandas_yvr/data/raw/weather_daily_YVR_889_1943.csv
Saving to /Users/phil/repos/pandas_yvr/data/raw/weather_daily_YVR_889_1944.csv
Saving to /Users/phil/repos/pandas_yvr/data/raw/weather_daily_YVR_889_1945.csv
Saving to /Users/phil/repos/pandas_yvr/data/raw/weather_daily_YVR_889_1946.csv
Saving to /Users/phil/repos/pandas_yvr/data/raw/weather_daily_YVR_889_1947.csv
Saving to /Users/phil/repos/pandas_yvr/data/raw/weather_daily_YVR_889_1948.csv
Saving to /Users/phil/repos/pandas_yvr/data/raw/weat

In [10]:
# Recent data (mid 2013 to 2020)
stn_id_recent = 51442
years_recent = range(2013, 2021)

for year in years_recent:
    savefile = (context.raw_dir / 
                Path(f"weather_daily_{station}_{stn_id_recent}_{year}.csv"))
    download_daily_raw(stn_id_recent, year, savefile=savefile)

Saving to /Users/phil/repos/pandas_yvr/data/raw/weather_daily_YVR_51442_2013.csv
Saving to /Users/phil/repos/pandas_yvr/data/raw/weather_daily_YVR_51442_2014.csv
Saving to /Users/phil/repos/pandas_yvr/data/raw/weather_daily_YVR_51442_2015.csv
Saving to /Users/phil/repos/pandas_yvr/data/raw/weather_daily_YVR_51442_2016.csv
Saving to /Users/phil/repos/pandas_yvr/data/raw/weather_daily_YVR_51442_2017.csv
Saving to /Users/phil/repos/pandas_yvr/data/raw/weather_daily_YVR_51442_2018.csv
Saving to /Users/phil/repos/pandas_yvr/data/raw/weather_daily_YVR_51442_2019.csv
Saving to /Users/phil/repos/pandas_yvr/data/raw/weather_daily_YVR_51442_2020.csv
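As a quick check after both loops have run, the expected file names can be regenerated and compared with what is on disk. This is a sketch that assumes the raw folder is `data/raw` relative to the working directory (in the notebook it would be `context.raw_dir`); `periods`, `expected_files` and `missing` are illustrative names.

```python
from pathlib import Path

raw_dir = Path("data/raw")  # assumption: stands in for context.raw_dir
station = "YVR"

# Station ID -> years requested in the two download loops
periods = {889: range(1937, 2014), 51442: range(2013, 2021)}

expected_files = [
    raw_dir / f"weather_daily_{station}_{stn_id}_{year}.csv"
    for stn_id, years in periods.items()
    for year in years
]

# Any path listed here was not written by the download loops
missing = [path for path in expected_files if not path.exists()]
print(f"{len(expected_files)} files expected, {len(missing)} missing")
```

The year 2013 appears twice in the list, once under each station ID.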
