# DMI API Tutorial

This tutorial gives an introduction on how to use the Danish Meteorological Institute's (DMI) API to download historical weather data. The API documentation can be found [here](https://confluence.govcloud.dk/display/FDAPI).

The tutorial uses the Python programming language and is in the format of a Jupyter Notebook. The notebook can be downloaded and run locally, allowing new users to download data immediately. 

## Part 1: Retrieving data
Part 1 of this tutorial will show how to request data and convert it to a table format. Part 2 will deal with how to request specific data and more advanced data handling.


In [2]:
# Import necessary libraries
import requests # library for making HTTP requests
import pandas as pd # library for data analysis
# following command allows figures to be shown inline
%matplotlib inline

In order to access the API it is necessary to create a user and obtain an api-key. This key grants permission to retrieve data and allows DMI to generate usage statistics.

A guide to creating a user profile and getting an api-key can be found [here](https://confluence.govcloud.dk/pages/viewpage.action?pageId=26476690).



In [3]:
api_key = 'd3182f8b-381a-4e7c-968d-7b12759ce5c3' # insert your own key between the '' signs
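A key that is hard-coded in a notebook is easy to share by accident. As a minimal sketch, the key could instead be read from an environment variable. The variable name `DMI_API_KEY` and the helper `load_api_key` are assumptions made for this illustration, not something prescribed by DMI.

```python
import os

def load_api_key(env_var='DMI_API_KEY'):
    """Return the api-key stored in an environment variable (hypothetical name)."""
    key = os.environ.get(env_var, '').strip()
    if not key:
        raise RuntimeError(f'Set the {env_var} environment variable to your api-key')
    return key
```

With the variable set in the shell that starts Jupyter, the cell above could then read `api_key = load_api_key()`.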

An easy test to see if your api-key works is to paste the API URL, followed by a question mark and your api-key, into your browser, e.g.: https://dmigw.govcloud.dk/v2/metObs/collections/observation/items?api-key=111111xx-1x11-11xx-1111-1x1111111xx1 (the key in this example is a placeholder and will not work for you).

This should return a page of data in the browser window.
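The same test URL can also be assembled in Python. The sketch below uses only the standard library, the endpoint from the request cell of this tutorial, and the non-working placeholder key from the text. The helper name `build_test_url` is illustrative.

```python
from urllib.parse import urlencode

# Endpoint used by the request cell in this tutorial
base_url = 'https://dmigw.govcloud.dk/v2/metObs/collections/observation/items'
# Placeholder key from the text; replace it with your own key
placeholder_key = '111111xx-1x11-11xx-1111-1x1111111xx1'

def build_test_url(base, key):
    """Join the endpoint and the api-key query parameter with a question mark."""
    return base + '?' + urlencode({'api-key': key})

test_url = build_test_url(base_url, placeholder_key)
print(test_url)
```

Pasting the printed URL (with a valid key) into a browser is equivalent to the manual test described above.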
<br><br>

In the following code block, data is retrieved using the *requests.get* function. Further information on REST APIs and HTTP request methods can be found [here](https://restfulapi.net/http-methods/).


In [10]:
url = 'https://dmigw.govcloud.dk/v2/metObs/collections/observation/items' # url for the current api version
r = requests.get(url, params={'stationId': '06074', 'api-key': api_key}) # Issue an HTTP GET request for station 06074
print(r, r.url)

<Response [200]> https://dmigw.govcloud.dk/v2/metObs/collections/observation/items?stationId=06074&api-key=d3182f8b-381a-4e7c-968d-7b12759ce5c3


The [response status code](https://en.wikipedia.org/wiki/List_of_HTTP_status_codes) indicates whether the request was successful or not. A 200 code means that the retrieval was successful. 
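As a small illustration of how status codes are read, the sketch below uses the standard library's `http.HTTPStatus` to look up the reason phrase of a code and to classify it as success (2xx) or failure. The helper name `explain_status` is an assumption for this example.

```python
from http import HTTPStatus

def explain_status(code):
    """Return (is_success, reason phrase) for an HTTP status code."""
    status = HTTPStatus(code)
    return 200 <= code < 300, status.phrase

for code in (200, 400, 403, 404):
    ok, phrase = explain_status(code)
    print(code, phrase, 'success' if ok else 'failure')
```

With the response object `r` from the cell above, `r.status_code` holds the numeric code, and `r.raise_for_status()` raises an exception for 4xx and 5xx responses.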
<br/><br/>




Next, we extract the JSON data from the returned response object. [JSON](https://restfulapi.net/introduction-to-json/) is a human-readable format for data exchange.


In [17]:
json = r.json() # Extract JSON data
print(json) # Print the full JSON response

{'type': 'FeatureCollection', 'features': [{'geometry': {'coordinates': [10.1353, 56.0803], 'type': 'Point'}, 'id': '05b95f7e-91df-8f3b-30b5-f79cd51736c6', 'type': 'Feature', 'properties': {'created': '2021-11-09T18:47:22.955234Z', 'observed': '2021-11-09T18:50:00Z', 'parameterId': 'cloud_height', 'stationId': '06074', 'value': 150.0}}, {'geometry': {'coordinates': [10.1353, 56.0803], 'type': 'Point'}, 'id': '173ba231-c3d4-c0f8-aff8-509377a2c5d9', 'type': 'Feature', 'properties': {'created': '2021-11-09T18:47:22.941159Z', 'observed': '2021-11-09T18:50:00Z', 'parameterId': 'humidity', 'stationId': '06074', 'value': 97.0}}, {'geometry': {'coordinates': [10.1353, 56.0803], 'type': 'Point'}, 'id': '19bfc212-199b-77ab-50cc-3edb46e35a55', 'type': 'Feature', 'properties': {'created': '2021-11-09T18:47:22.956673Z', 'observed': '2021-11-09T18:50:00Z', 'parameterId': 'temp_dew', 'stationId': '06074', 'value': 9.8}}, {'geometry': {'coordinates': [10.1353, 56.0803], 'type': 'Point'}, 'id': '4286f2
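The response printed above is a GeoJSON `FeatureCollection`: a list of features, each holding a `geometry` and a `properties` dictionary. As a minimal sketch of how this nesting can be walked, the block below flattens two features copied from that output into plain dictionaries. The helper name `features_to_rows` and the `lon`/`lat` keys are illustrative choices.

```python
# Two features copied from the response printed above
sample = {
    'type': 'FeatureCollection',
    'features': [
        {'geometry': {'coordinates': [10.1353, 56.0803], 'type': 'Point'},
         'id': '05b95f7e-91df-8f3b-30b5-f79cd51736c6',
         'type': 'Feature',
         'properties': {'created': '2021-11-09T18:47:22.955234Z',
                        'observed': '2021-11-09T18:50:00Z',
                        'parameterId': 'cloud_height',
                        'stationId': '06074',
                        'value': 150.0}},
        {'geometry': {'coordinates': [10.1353, 56.0803], 'type': 'Point'},
         'id': '173ba231-c3d4-c0f8-aff8-509377a2c5d9',
         'type': 'Feature',
         'properties': {'created': '2021-11-09T18:47:22.941159Z',
                        'observed': '2021-11-09T18:50:00Z',
                        'parameterId': 'humidity',
                        'stationId': '06074',
                        'value': 97.0}},
    ],
}

def features_to_rows(collection):
    """Flatten each feature into one dict: its properties plus longitude and latitude."""
    rows = []
    for feature in collection['features']:
        lon, lat = feature['geometry']['coordinates'][:2]  # GeoJSON order is longitude, latitude
        rows.append({**feature['properties'], 'lon': lon, 'lat': lat})
    return rows

rows = features_to_rows(sample)
print(rows[0]['parameterId'], rows[0]['value'])
```

A list of flat dictionaries like `rows` has one entry per observation, which is the shape a table expects.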

Furthermore, the JSON object can be converted to a convenient table (DataFrame) using the Pandas library. The observations are stored in the list under the *features* key, and the fields of each observation are stored under *properties*, so the DataFrame is built from these dictionaries rather than from the top-level object.

In [27]:
rows = [feature['properties'] for feature in json['features']] # One dictionary per observation
df = pd.DataFrame(rows) # Convert the list of observations to a Pandas DataFrame
print(df.head()) # Print the first five rows of the DataFrame

From the snippet of data printed earlier it can be seen that the timestamps (*created* and *observed*) are ISO 8601 strings in UTC, e.g. 2021-11-09T18:50:00Z.

<br/>


The timestamp strings can be converted to datetime objects using the Pandas *to_datetime* function.

In [None]:
df['time'] = pd.to_datetime(df['observed'], utc=True) # Parse the ISO 8601 strings into timezone-aware datetimes
print(df['time'].head()) # Print the first five timestamps
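For readers who want to see what such a conversion involves without Pandas, here is a minimal sketch using only the standard library. It parses an ISO 8601 string like the `observed` values in the response printed earlier and also expresses it as microseconds since 1 January 1970 (Unix time). The helper names are illustrative.

```python
from datetime import datetime, timezone

def iso_to_datetime(text):
    """Parse an ISO 8601 UTC string; the trailing 'Z' is rewritten for older Python versions."""
    return datetime.fromisoformat(text.replace('Z', '+00:00'))

def datetime_to_unix_us(moment):
    """Microseconds since 1970-01-01 00:00:00 UTC for a timezone-aware datetime."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    delta = moment - epoch
    return (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds

observed = iso_to_datetime('2021-11-09T18:50:00Z')
print(observed.isoformat(), datetime_to_unix_us(observed))
```

Integer arithmetic on the `timedelta` fields is used instead of floating-point seconds so that no microsecond precision is lost.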

<br/>
Last, we will generate a list of all the parameters available.

In [None]:
parameter_ids = df['parameterId'].unique() # Generate a list of unique parameter ids
print(parameter_ids) # Print all unique parameter ids
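Beyond listing the unique ids, it is often useful to know how many observations each parameter has. The sketch below shows the idea with the standard library on a short, made-up list of ids. The first three names appear in the response printed earlier, and the repeated entry is added purely for illustration.

```python
from collections import Counter

# Made-up column of parameter ids; 'humidity' is repeated on purpose
parameter_column = ['cloud_height', 'humidity', 'temp_dew', 'humidity']

counts = Counter(parameter_column)  # Number of observations per parameter id
unique_ids = sorted(counts)         # Alphabetical list of unique parameter ids

print(unique_ids)
print(counts.most_common(1))
```

On the DataFrame from the cells above, `df['parameterId'].value_counts()` gives the same kind of count.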

<br/><br/>

## Part 2: Requesting specific data

The above example was heavily simplified to illustrate how the API can be accessed. For most applications the user wants to specify query criteria, such as:
1. Meteorological stations (e.g. 04320, 06074, etc.)
2. Parameters (e.g. wind_speed, humidity, etc.)
3. Time frame (to and from time)
4. Limit (maximum number of observations)


In [None]:
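import datetime as dt # standard library module for dates and times

# Start and end time are given in UTC and sent to the API as one ISO 8601 interval
end_time = dt.datetime.now(dt.timezone.utc) # End time is defined as the current time
start_time = dt.datetime(2020, 1, 10, tzinfo=dt.timezone.utc) # Start time is defined as a specific date

def datetime_to_iso(t):
    '''Convert a timezone-aware datetime object to a UTC string such as 2020-01-10T00:00:00Z'''
    return t.astimezone(dt.timezone.utc).strftime('%Y-%m-%dT%H:%M:%SZ')

url = 'https://dmigw.govcloud.dk/v2/metObs/collections/observation/items' # base url without a query string

# Specify query parameters
# Assumption: the v2 API takes the time frame as a single 'datetime' interval ("start/end")
# rather than separate 'from' and 'to' Unix timestamps; check the API documentation
params = {'api-key' : api_key,
          'datetime' : datetime_to_iso(start_time) + '/' + datetime_to_iso(end_time),
          'stationId' : '06188',
          #'parameterId' : 'temp_mean_past1h',
          'limit' : '100000',
          }


r = requests.get(url, params=params) # Submit GET request based on url and query parameters
print(r, r.url) # Print request status and url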

N.B.: the parameterId was commented out above, which results in all the parameters being included. As of the time of writing it was only possible to request one or all parameters. The same is true for the stations. The *limit* parameter is the maximum number of observations to download; it should generally be set to a large value so that it does not truncate the result.
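Because a single request covers either one station or all of them, data for a handful of stations can be collected by issuing one request per station. The sketch below only prepares the query dictionaries and sends nothing over the network. The helper name `params_per_station` and the key string are placeholders for illustration, and the station ids are the examples listed above.

```python
def params_per_station(station_ids, base_params):
    """Return one query dictionary per station, leaving base_params untouched."""
    return [{**base_params, 'stationId': station_id} for station_id in station_ids]

# Shared settings; 'your-api-key' is a placeholder
base = {'api-key': 'your-api-key', 'limit': '100000'}
queries = params_per_station(['04320', '06074'], base)

for query in queries:
    print(query['stationId'], query['limit'])
```

Each dictionary could then be passed as `params` to `requests.get`, and the resulting tables combined with `pd.concat`.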


If the request was successful, the variable *r* now contains the requested data as JSON. Next, the JSON object is extracted and converted to a Pandas DataFrame as previously shown.

A new column named *time* is created, which holds the observation times as *datetime* objects. Also, the unused columns are deleted.

In [None]:
json = r.json() # Extract JSON object
df = pd.DataFrame([feature['properties'] for feature in json['features']]) # One row per observation

df['time'] = pd.to_datetime(df['observed'], utc=True) # Convert the observation time strings to datetimes

df = df.drop(['created', 'observed'], axis=1) # Delete unused columns

df.index = df['time'] # Set the time as the index

print(df.head()) # Print the first five rows

Since the table includes multiple parameters from station *06188*, it is convenient to format the table such that the index is time and each column represents a unique parameter. A simple method for doing this is to set a multi-index and then unstack, as shown below.

Lastly, the data is visualized.

In [None]:
# Remove repeated (time, parameter) rows, then move the parameter ids from rows to columns
df2 = df.reset_index(drop=True).drop_duplicates(subset=['time', 'parameterId']).set_index(['time', 'parameterId'])['value'].unstack(level=-1)

plot_params = ['wind_speed', 'humidity', 'temp_dry'] # Choosing which parameters to plot

# Generate plot of data
ax = df2[plot_params].interpolate().plot(figsize=(12,7), legend=False, fontsize=12, subplots=True)
ax[0].set_ylabel('Wind speed [m/s]', size=12)
ax[1].set_ylabel('Humidity [%]', size=12)
ax[2].set_ylabel(r'Air temperature [$^\circ$C]', size=12) # Raw string keeps the backslash for the degree symbol
ax[2].set_xlabel('', size=12)
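The reshaping done by `unstack` in the cell above can be illustrated without Pandas. The plain-Python sketch below turns long-format records (one row per time and parameter) into a wide structure with one entry per time. The records are made up for illustration and the helper name `long_to_wide` is an assumption.

```python
from collections import defaultdict

# Made-up long-format records: one row per (time, parameter) pair
records = [
    {'time': '2020-01-10T00:00:00Z', 'parameterId': 'humidity', 'value': 97.0},
    {'time': '2020-01-10T00:00:00Z', 'parameterId': 'temp_dry', 'value': 5.1},
    {'time': '2020-01-10T00:10:00Z', 'parameterId': 'humidity', 'value': 96.0},
]

def long_to_wide(rows):
    """Group rows by time so that each parameter becomes its own key (column)."""
    wide = defaultdict(dict)
    for row in rows:
        wide[row['time']][row['parameterId']] = row['value']
    return dict(wide)

table = long_to_wide(records)
for time, values in table.items():
    print(time, values)
```

The second timestamp has no `temp_dry` entry. In the Pandas version such gaps appear as missing values, which is why the plotting cell calls `interpolate()` before drawing the lines.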


Useful links:
1. [Station numbers](https://confluence.govcloud.dk/display/FDAPI/Stations)
2. [Parameters](https://confluence.govcloud.dk/display/FDAPI/Parameters)
3. [Codes](https://confluence.govcloud.dk/display/FDAPI/Codes)
4. [FAQ](https://confluence.govcloud.dk/display/FDAPI/FAQ)
5. [Terms of use](https://confluence.govcloud.dk/display/FDAPI/Terms+of+Use)
6. [Operational status](https://confluence.govcloud.dk/display/FDAPI/Operational+Status+of+API)
7. [API uptime](http://status.govcloud.dk/)
8. [Contact & support](https://confluence.govcloud.dk/pages/viewpage.action?pageId=26476715)
9. [User creation](https://confluence.govcloud.dk/pages/viewpage.action?pageId=26476690)


[The Norwegian Meteorological Institute's historical weather data API](https://frost.met.no/)

[The Swedish Meteorological and Hydrological Institute's (SMHI) Open Data API](https://opendata.smhi.se/apidocs/metobs/)

*Updated on 27 January 2020 by Adam R. Jensen*