![Callysto.ca Banner](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-top.jpg?raw=true)

<a href="https://hub.callysto.ca/jupyter/hub/user-redirect/git-pull?repo=https%3A%2F%2Fgithub.com%2Fcallysto%2Fcurriculum-notebooks&branch=master&subPath=Mathematics/StatisticsProject/AccessingData/weather-hourly-canada.ipynb&depth=1" target="_parent"><img src="https://raw.githubusercontent.com/callysto/curriculum-notebooks/master/open-in-callysto-button.svg?sanitize=true" width="123" height="24" alt="Open in Callysto"/></a>

# Weather in Canada

[The National Office of Climate Services at Environment and Climate Change Canada](https://www.canada.ca/en/environment-climate-change/services/climate-change/canadian-centre-climate-services.html) has a dataset of weather measurements at monthly, daily, or even hourly intervals.

## List of Weather Stations

Let's start by importing and mapping the locations of weather stations in Canada from the [Government of Canada Open Data Portal](https://open.canada.ca/data/en/dataset/9764d6c6-3044-450c-ac5a-383cedbfef17).

In [None]:
import pandas as pd
import requests
stations = pd.read_csv('https://dd.weather.gc.ca/observations/doc/swob-xml_station_list.csv')
print('Mapping {} weather stations.'.format(len(stations)))
import folium
from folium.plugins import MarkerCluster
latitude = stations['Latitude'].mean()
longitude = stations['Longitude'].mean()
station_map = folium.Map(location=[latitude,longitude], zoom_start=3)
marker_cluster = MarkerCluster()
for row in stations.itertuples():
    marker_cluster.add_child(folium.Marker(location=[row.Latitude,row.Longitude], popup=row.Name))
station_map.add_child(marker_cluster)
station_map

Since measuring instruments can change and weather stations can be relocated, the data have been [homogenized](https://climate-change.canada.ca/climate-data/#/adjusted-station-data) to account for these factors. Let's map weather stations with homogenized data.

In [None]:
hstations = pd.read_csv('https://dd.weather.gc.ca/climate/observations/climate_station_list.csv')
print('Mapping {} weather stations.'.format(len(hstations)))
hstation_map = folium.Map(location=[hstations['Latitude'].mean(),hstations['Longitude'].mean()], zoom_start=3)
marker_cluster = MarkerCluster()
for row in hstations.itertuples():
    marker_cluster.add_child(folium.Marker(location=[row.Latitude,row.Longitude], popup=row._1+', '+row.Province+', '+row._6)) # the name is the first column after the index, so we use _1
hstation_map.add_child(marker_cluster)
hstation_map

At the time of writing, there are five weather stations in the data set that have 0 values for both `Latitude` and `Longitude`. Unless they have since been corrected, you will see them on the map at (0, 0), in the Atlantic Ocean off the coast of West Africa.
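
If those zero-coordinate markers get in the way, one option is to drop them before mapping. Here is a minimal sketch that uses a small made-up table in place of `hstations`; the station names and coordinates are invented for illustration, and only the `Latitude` and `Longitude` column names come from the data above.

```python
import pandas as pd

# Hypothetical stand-in for hstations; these rows are invented, not real stations
sample = pd.DataFrame({
    'Station Name': ['STATION A', 'STATION B', 'STATION C'],
    'Latitude': [53.5, 0.0, 49.3],
    'Longitude': [-113.5, 0.0, -123.1],
})

# Keep only rows where the coordinates are not both zero
has_location = ~((sample['Latitude'] == 0) & (sample['Longitude'] == 0))
located = sample[has_location]
print('Kept {} of {} stations.'.format(len(located), len(sample)))
```

The same boolean mask should work on `hstations` itself, assuming its coordinate columns are numeric.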

### Filtering Station Data

To see how many weather stations there are in a province, you can filter the data.

In [None]:
hstations[hstations['Province']=='ALBERTA']
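
The filter above returns the matching rows. To get an actual count, you could wrap it in `len()`, or use `value_counts()` to count every province at once. The sketch below uses a small invented table rather than the real station list, so the numbers are only illustrative.

```python
import pandas as pd

# Hypothetical stand-in for hstations; these rows are invented for illustration
sample = pd.DataFrame({
    'Station Name': ['STATION A', 'STATION B', 'STATION C'],
    'Province': ['ALBERTA', 'ALBERTA', 'ONTARIO'],
})

# Number of stations in one province
alberta_count = len(sample[sample['Province'] == 'ALBERTA'])
print(alberta_count)

# Number of stations in every province, largest first
per_province = sample['Province'].value_counts()
print(per_province)
```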

You can also find just the stations that are still collecting hourly weather data, here taken to mean those whose last year of hourly records is after 2022.

In [None]:
hstations[hstations['HLY Last Year']>2022]

To check out the data available for a particular station, you can filter by `Station Name` or `Climate ID`.

In [None]:
hstations[hstations['Climate ID']=='3020035']
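
Filtering by `Station Name` works in a similar way. If you only know part of a name, a partial, case-insensitive match with `str.contains` may be more convenient than an exact comparison. This is a sketch on an invented table; the names and IDs below are placeholders, not real stations.

```python
import pandas as pd

# Hypothetical stand-in for hstations; names and IDs are placeholders
sample = pd.DataFrame({
    'Station Name': ['EDMONTON EXAMPLE', 'CALGARY EXAMPLE', 'EDMONTON SOUTH EXAMPLE'],
    'Climate ID': ['0000001', '0000002', '0000003'],
})

# Case-insensitive partial match; na=False treats missing names as non-matches
matches = sample[sample['Station Name'].str.contains('edmonton', case=False, na=False)]
print(matches)
```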

## Weather Data

Once you have selected a station, you can then import a year's worth of daily weather data for that station.

In [None]:
station_id = '3020035'
year = '2022'

provinces = {'ALBERTA':'AB','BRITISH COLUMBIA':'BC','MANITOBA':'MB','NEW BRUNSWICK':'NB','NEWFOUNDLAND':'NL','NORTHWEST TERRITORIES':'NT','NOVA SCOTIA':'NS','NUNAVUT':'NU','ONTARIO':'ON','PRINCE EDWARD ISLAND':'PE','QUEBEC':'QC','SASKATCHEWAN':'SK','YUKON TERRITORY':'YT'}
province = provinces[hstations[hstations['Climate ID']==station_id]['Province'].values[0]] # look up the province of the selected station
weather = pd.DataFrame()
for month in range(1,13):
    month = str(month).zfill(2) # two-digit month, e.g. '01'
    url = 'https://dd.weather.gc.ca/climate/observations/daily/csv/'+province+'/climate_daily_'+province+'_'+station_id+'_'+year+'-'+month+'_P1D.csv'
    try:
        monthly_weather = pd.read_csv(url, encoding='ISO-8859-1')
        weather = pd.concat([weather,monthly_weather])
    except Exception: # e.g. the file for that month could not be downloaded
        print('No data for '+month)
weather
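
The URL pattern used in the loop above can also be wrapped in a small helper, which makes it easier to check a single month's address before downloading anything. The function name `daily_url` is hypothetical, and the pattern is simply copied from the cell above rather than taken from official documentation.

```python
def daily_url(province, station_id, year, month):
    """Build the monthly CSV URL using the same pattern as the loop above.

    province is a two-letter code such as 'AB', station_id and year are
    strings, and month is an integer from 1 to 12.
    """
    base = 'https://dd.weather.gc.ca/climate/observations/daily/csv/'
    return '{base}{prov}/climate_daily_{prov}_{sid}_{year}-{month:02d}_P1D.csv'.format(
        base=base, prov=province, sid=station_id, year=year, month=month)

# Build (but do not download) the address for March 2022
print(daily_url('AB', '3020035', '2022', 3))
```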

[![Callysto.ca License](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-bottom.jpg?raw=true)](https://github.com/callysto/curriculum-notebooks/blob/master/LICENSE.md)