<a href="https://colab.research.google.com/github/annvrowan/GEOV181/blob/main/GEOV181_Weather-Climate.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# GEOV181: Weather and Climate

In this exercise we will use publicly available data to describe the current weather in Bergen, to examine how weather patterns vary over time and space, and to consider the uncertainties associated with our descriptions of the weather.


### How does this exercise work?

Similar to the previous module, this exercise is a [Jupyter](https://jupyter.org/) notebook. A Jupyter notebook contains a mix of text blocks and [Python](https://www.python.org/) code blocks that you can run yourself. The text blocks are used to explain the data analysis workflow of this exercise. The data analysis itself is performed in blocks of Python code. The Python code blocks contain a mix of actual code and comments, which start with `#` and which describe what each line of code below does.

We will run this notebook in [Google Colab](https://colab.research.google.com/). Google Colab gives us the opportunity to run this notebook online and in a web browser, which is the easiest way to complete this exercise.

Note: If you prefer not to use Google Colab, you can also run this notebook on your own computer. This requires installing Python and Jupyter on your own machine. The easiest way to get things running is to install the Python distribution [Anaconda](https://www.anaconda.com/download/) (notice the odd preference for snake names in the Python world!). This may require a bit more fiddling, but your instructors for this exercise are happy to help out.

## Weather data from Norges Meteorologisk Institutt (MET Norway)
This exercise makes use of free meteorological data collected by MET Norway and provided under a Creative Commons BY 4.0 licence:

<a rel="license" href="http://creativecommons.org/licenses/by/4.0/"><img alt="Creative Commons-lisens" style="border-width:0" src="https://i.creativecommons.org/l/by/4.0/88x31.png" /></a>

This means that we can use these data as long as we acknowledge their source by adding a comment such as:
*Credit should be given to The Norwegian Meteorological Institute, shortened “MET Norway”, as the source of data, for example, by including in the text: "Data from The Norwegian Meteorological Institute" or "Based on data from MET Norway".*

## Exercise 1: Collecting weather observations
To access data from MET Norway we use their Frost API data service. To use this service you first need to create a user for yourself by entering your email address in the box named 'Create a user' here:
https://frost.met.no/howto.html
This generates a unique client ID that you can then use to access the data that you want.

In [None]:
# Libraries needed (requests and pandas are third-party packages, not part of the Python standard library)
import requests
import pandas as pd

# Insert your own client ID here
client_id = '0ed7796b-057a-4134-8a5c-190293abfe41'

# client_id = '<INSERT CLIENT ID HERE>'

The next block of code retrieves data from Frost using the `requests.get` function. Its first argument is the endpoint we are going to get data from, in this case the observations endpoint. The second argument is a dictionary containing the parameters that we need to define in order to retrieve data from this endpoint: `sources` (the weather stations), `elements` (the measured quantities) and `referencetime` (the time period).

In [None]:
# Define endpoint and parameters
endpoint = 'https://frost.met.no/observations/v0.jsonld'
parameters = {
    'sources': 'SN18700,SN90450',
    'elements': 'mean(air_temperature P1D),sum(precipitation_amount P1D),mean(wind_speed P1D)',
    'referencetime': '2010-04-01/2010-04-03',
}
# Issue an HTTP GET request
r = requests.get(endpoint, parameters, auth=(client_id,''))
# Extract JSON data
json = r.json()
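
To see what the request above looks like as a web address, the parameters dictionary can be encoded into a query string by hand. The sketch below uses only Python's standard library (`urllib.parse`) as a stand-in for the encoding step that `requests` performs for you; it does not contact Frost and needs no client ID. The variable names `url` and `decoded` are chosen here for illustration only.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Same endpoint and parameters as in the request above
endpoint = 'https://frost.met.no/observations/v0.jsonld'
parameters = {
    'sources': 'SN18700,SN90450',
    'elements': 'mean(air_temperature P1D),sum(precipitation_amount P1D),mean(wind_speed P1D)',
    'referencetime': '2010-04-01/2010-04-03',
}

# Build the full URL by hand: special characters such as spaces,
# commas and brackets are percent-encoded so they are safe in a URL
url = endpoint + '?' + urlencode(parameters)
print(url)

# Decode the query string again to check that nothing was lost on the way
decoded = {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}
print(decoded == parameters)
```

Printing the URL is a quick way to check for typos in station IDs or element names before sending a request.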

Once we have extracted the JSON from the request, we want to make sure that we actually got some data. This block will output detailed error information.

In [None]:
# Check if the request worked, print out any errors
if r.status_code == 200:
    data = json['data']
    print('Data retrieved from frost.met.no!')
else:
    print('Error! Returned status code %s' % r.status_code)
    print('Message: %s' % json['error']['message'])
    print('Reason: %s' % json['error']['reason'])

Data retrieved from frost.met.no!


If the above block returned an error, then the rest cannot be run. If it printed out a success, then we are all set.
Below, we use the pandas library to insert the observation data into a table format. This is useful for doing all kinds of analysis on the returned data. If you only want to print or save the data, then pandas won't be necessary, and you can simply loop over the elements in the data variable.
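
As a minimal sketch of the pandas-free option, the loop below walks through a small hand-written sample that mimics the nesting this notebook relies on (`sourceId`, `referenceTime` and a list of `observations`). The name `sample_data` and the choice of fields are assumptions made for illustration. The values are copied from the preview table in this notebook, and a real response contains more fields per observation.

```python
# Hand-written sample with the same nesting as the 'data' variable
# (illustrative only; a real Frost response has more fields)
sample_data = [
    {
        'sourceId': 'SN90450:0',
        'referenceTime': '2010-04-02T00:00:00.000Z',
        'observations': [
            {'elementId': 'mean(air_temperature P1D)', 'value': 6.1, 'unit': 'degC'},
            {'elementId': 'sum(precipitation_amount P1D)', 'value': 0.0, 'unit': 'mm'},
        ],
    },
]

# Each item is one station at one reference time;
# each item holds a list with one or more observations
lines = []
for item in sample_data:
    for obs in item['observations']:
        lines.append(
            f"{item['sourceId']} {item['referenceTime']} "
            f"{obs['elementId']}: {obs['value']} {obs['unit']}"
        )

for line in lines:
    print(line)
```

To use this on the real response, replace `sample_data` with `data` in the outer loop.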
Note: the block below is tailored to the observations endpoint, and changes would need to be made to accommodate other endpoints.

In [None]:
# Build a DataFrame with all of the observations in a table format
rows = []
for item in data:
    # One small table per station and reference time
    row = pd.DataFrame(item['observations'])
    row['referenceTime'] = item['referenceTime']
    row['sourceId'] = item['sourceId']
    rows.append(row)
# Concatenate once so that every station and day is kept, not only the last one
df = pd.concat(rows)

df = df.reset_index()

You can run the line below to get a preview of the DataFrame table.

In [None]:
df.head()

| | index | elementId | value | unit | level | timeOffset | timeResolution | timeSeriesId | performanceCategory | exposureCategory | qualityCode | referenceTime | sourceId |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | mean(air_temperature P1D) | 6.1 | degC | {'levelType': 'height_above_ground', 'unit': '... | PT0H | P1D | 0 | C | 2 | 2 | 2010-04-02T00:00:00.000Z | SN90450:0 |
| 1 | 1 | mean(air_temperature P1D) | 3.4 | degC | {'levelType': 'height_above_ground', 'unit': '... | PT6H | P1D | 0 | C | 2 | 2 | 2010-04-02T00:00:00.000Z | SN90450:0 |
| 2 | 2 | sum(precipitation_amount P1D) | 0.0 | mm | | PT18H | P1D | 0 | C | 2 | 2 | 2010-04-02T00:00:00.000Z | SN90450:0 |
| 3 | 3 | sum(precipitation_amount P1D) | 0.0 | mm | | PT6H | P1D | 0 | C | 2 | 2 | 2010-04-02T00:00:00.000Z | SN90450:0 |
| 4 | 4 | mean(wind_speed P1D) | 3.9 | m/s | {'levelType': 'height_above_ground', 'unit': '... | PT0H | P1D | 0 | C | 2 | 2 | 2010-04-02T00:00:00.000Z | SN90450:0 |


To make a shorter and more readable table, you can use the code below:

In [None]:
# Only these columns will be kept
columns = ['sourceId','referenceTime','elementId','value','unit','timeOffset']
df2 = df[columns].copy()
# Convert the time value to something Python understands
df2['referenceTime'] = pd.to_datetime(df2['referenceTime'])
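
To illustrate what this conversion does, here is a sketch that parses one of the timestamp strings using only Python's built-in `datetime` module. It is a stand-in for illustration, not what `pd.to_datetime` does internally. The format string and the names `raw` and `parsed` are assumptions chosen for this example.

```python
from datetime import datetime

# A referenceTime string as it appears in the data
raw = '2010-04-02T00:00:00.000Z'

# Describe the layout of the string; the trailing 'Z' means UTC and is read by %z
parsed = datetime.strptime(raw, '%Y-%m-%dT%H:%M:%S.%f%z')

# The result is a date-time object rather than plain text,
# so its parts can be read directly and used in calculations
print(parsed)
print(parsed.year, parsed.month, parsed.day)
print(parsed.utcoffset())
```

Once the times are real date-time values rather than text, they can be sorted, subtracted from each other, and used as a plot axis.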

In [None]:
# Preview the result
df2.head()

| | sourceId | referenceTime | elementId | value | unit | timeOffset |
|---|---|---|---|---|---|---|
| 0 | SN90450:0 | 2010-04-02 00:00:00+00:00 | mean(air_temperature P1D) | 6.1 | degC | PT0H |
| 1 | SN90450:0 | 2010-04-02 00:00:00+00:00 | mean(air_temperature P1D) | 3.4 | degC | PT6H |
| 2 | SN90450:0 | 2010-04-02 00:00:00+00:00 | sum(precipitation_amount P1D) | 0.0 | mm | PT18H |
| 3 | SN90450:0 | 2010-04-02 00:00:00+00:00 | sum(precipitation_amount P1D) | 0.0 | mm | PT6H |
| 4 | SN90450:0 | 2010-04-02 00:00:00+00:00 | mean(wind_speed P1D) | 3.9 | m/s | PT0H |
