# Data Engineering Assignment v3
## Goal

The assignment is meant to show us how you think, which technologies you are familiar with, and how you approach structural choices. Delivering a perfect project is *not* the expectation.

The goal is to build an API that returns the estimated precipitation (rain) at any location in the Netherlands for a given moment in the past.

To do so, you will need to extract precipitation data from 34 weather stations located across the Netherlands and apply an interpolation algorithm to estimate the precipitation at the given location.


## Steps

- Extract data from external source(s)
- Clean and reformat data
- Set up the interpolation model for a specific day with the extracted data (model functions provided)
- Create an API to query the precipitation for a specific location at a given time


In [2]:
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
import requests
from lib.kriging import Precipitation

## 1. Load weather station positions and ids

The station properties are stored in the file 'weatherstations-NL.csv'. It contains the name, id and location of each weather station.  
- [ ] Load the data into a pandas DataFrame
- [ ] Visualize the locations on a map. All stations are situated in the Netherlands

In [3]:
station_info = pd.read_csv('./data/weatherstations-NL.csv',header=None)
station_info.rename(columns={0: 'lat', 1: 'long', 2: 'id', 3: 'Name'}, inplace=True)
# Strip the hemisphere suffix and convert the coordinates to floats
station_info['lat'] = station_info['lat'].str.replace('N', '').astype(float)
station_info['long'] = station_info['long'].str.replace('E', '').astype(float)
station_info.head()

| | lat | long | id | Name |
|---|---|---|---|---|
| 0 | 52.18 | 4.42 | 210 | Valkenburg |
| 1 | 52.27 | 6.88 | 290 | Twenthe |
| 2 | 52.93 | 4.78 | 235 | De_Kooy |
| 3 | 51.45 | 3.6 | 310 | Vlissingen |
| 4 | 52.32 | 4.78 | 240 | Schiphol |


In [4]:
import os 
import folium
from folium import plugins
import rioxarray as rxr
import earthpy as et
import earthpy.spatial as es

m = folium.Map(location=[station_info.lat[0], station_info.long[0]], 
               tiles = 'Stamen Terrain')
# Add one marker per station, addressing the columns by name
for _, row in station_info.iterrows():
    folium.Marker(
        location=[row['lat'], row['long']],
        popup=row['Name'],
        icon=folium.Icon()
    ).add_to(m)
m

## 2. Extract station data

The hourly weather data can be retrieved from the KNMI, the Dutch meteorological institute.

The API URL is `https://www.daggegevens.knmi.nl/klimatologie/uurgegevens`.
Request the vars: 'PRCP'
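
As a rough illustration of such a request, the sketch below only builds the form payload for one day. The parameter names (`stns`, `vars`, `start`, `end`, `fmt`), the colon-separated station list and the `YYYYMMDDHH` timestamps with hours running from 01 to 24 are assumptions about the KNMI service and should be checked against its documentation. `build_knmi_payload` is a hypothetical helper, not part of the assignment.

```python
from datetime import date

KNMI_URL = "https://www.daggegevens.knmi.nl/klimatologie/uurgegevens"


def build_knmi_payload(station_ids, day, variables="PRCP"):
    """Build the form fields for one full day of hourly data.

    Assumed conventions (verify against the KNMI documentation):
    station ids are joined with ':' and hours run from 01 to 24.
    """
    ymd = day.strftime("%Y%m%d")
    return {
        "stns": ":".join(str(s) for s in station_ids),
        "vars": variables,
        "start": ymd + "01",  # first hourly slot of the day
        "end": ymd + "24",    # last hourly slot of the day
        "fmt": "json",        # assumed output format switch
    }


payload = build_knmi_payload([210, 290], date(2023, 1, 15))
print(payload)
```

Under these assumptions the payload could be sent with `requests.post(KNMI_URL, data=payload)`, after which the response still has to be parsed into a DataFrame.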

Of all variables, only the column 'RH' is important: the total precipitation in that hour. Note that the unit is 0.1 mm, so 10 represents 1 mm. The value -1 represents values < 0.05 mm and can be translated into 0.5.
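
The unit handling above can be sketched as a small helper. It assumes that the replacement value 0.5 is meant in the raw 0.1 mm unit, so a -1 reading becomes 0.05 mm. `rh_to_mm` is a hypothetical name.

```python
def rh_to_mm(raw, trace_value=0.5):
    """Convert a raw RH reading (unit: 0.1 mm) to millimetres.

    A raw value of -1 marks precipitation below 0.05 mm and is replaced
    by `trace_value` (assumed to be in the raw 0.1 mm unit) first.
    """
    if raw == -1:
        raw = trace_value
    return raw / 10.0


print(rh_to_mm(10))  # 10 raw units are 1 mm
print(rh_to_mm(-1))  # trace amount
```

On a DataFrame the same function could be applied with `df['RH'].map(rh_to_mm)`.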

- [ ] Extract the precipitation data from the KNMI API for all stations and store it in a pandas DataFrame.
- [ ] Merge the precipitation data with the weather station info from point 1.
- [ ] Reshape the inputs to match the format of the interpolation function.
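
The merge and reshape steps could look roughly like the sketch below. The column name `STN` for the station id in the KNMI data and the target shapes (`X` as an `(n, 2)` array of `[lat, long]`, `y` as an `(n,)` array) are assumptions. The actual format expected by the `Precipitation` class should be checked in `lib/kriging`. The two small frames are made-up sample data.

```python
import numpy as np
import pandas as pd


def build_model_inputs(stations, readings):
    """Join readings with station coordinates and split into X and y."""
    # Assumed: readings carry the station id in 'STN', stations in 'id'
    merged = readings.merge(stations, left_on="STN", right_on="id", how="inner")
    X = merged[["lat", "long"]].to_numpy(dtype=float)  # assumed shape (n, 2)
    y = merged["RH"].to_numpy(dtype=float)             # assumed shape (n,)
    return X, y


# Made-up sample data for one hour
stations = pd.DataFrame({
    "lat": [52.18, 52.27],
    "long": [4.42, 6.88],
    "id": [210, 290],
    "Name": ["Valkenburg", "Twenthe"],
})
readings = pd.DataFrame({"STN": [210, 290], "RH": [12, 0]})

X, y = build_model_inputs(stations, readings)
print(X.shape, y.shape)
```

An inner join silently drops readings from stations that are missing in the station file, which may or may not be the desired behaviour.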

## 3. Use data to predict precipitation at a given location
- [ ] Fit the reshaped data to the model (Precipitation)
- [ ] Output predicted precipitation at the requested location.

In [None]:
# X contains lat and lon coordinates, y contains the precipitation values. For the required format, check
# the Precipitation class in lib/kriging

precip = Precipitation()
precip.fit(X, y)

In [None]:
precip.predict(X)
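
To sanity-check the model output, a much simpler interpolator can serve as a baseline. The sketch below uses inverse distance weighting, which is not the provided kriging model, and it measures distance directly in degrees, a crude simplification. `idw_predict` and the sample values are hypothetical.

```python
import math


def idw_predict(stations, values, target, power=2.0):
    """Inverse distance weighted estimate at `target` = (lat, lon)."""
    num = 0.0
    den = 0.0
    for (lat, lon), value in zip(stations, values):
        dist = math.hypot(lat - target[0], lon - target[1])
        if dist == 0.0:
            # The target coincides with a station: return its reading
            return float(value)
        weight = 1.0 / dist ** power
        num += weight * value
        den += weight
    return num / den


# Two made-up stations at equal distance from the target
estimate = idw_predict([(52.0, 4.0), (52.0, 6.0)], [1.0, 3.0], (52.0, 5.0))
print(estimate)
```

Large differences between such a baseline and the kriging prediction are a hint to re-check the input format and units.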

## 4. Create the API

Use the procedure developed so far to create an API that, given a time and a location, returns the precipitation level.  
The structure and technology of the API are entirely your choice, e.g. `fastapi`.
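
Whatever framework is chosen, the endpoint has to validate its inputs before calling the model. The framework-independent sketch below shows one way to do that. The parameter names (`lat`, `lon`, `time`), the ISO 8601 time format, the treatment of naive timestamps as UTC and the approximate bounding box for the Netherlands are all assumptions.

```python
from datetime import datetime, timezone

# Approximate bounding box of the Netherlands (assumed values)
NL_BOUNDS = {"lat": (50.7, 53.6), "lon": (3.3, 7.3)}


def parse_query(params, now=None):
    """Validate raw query parameters and return (lat, lon, when)."""
    now = now or datetime.now(timezone.utc)
    lat = float(params["lat"])
    lon = float(params["lon"])
    when = datetime.fromisoformat(params["time"])
    if when.tzinfo is None:
        # Assumed: timestamps without an offset are in UTC
        when = when.replace(tzinfo=timezone.utc)
    lat_min, lat_max = NL_BOUNDS["lat"]
    lon_min, lon_max = NL_BOUNDS["lon"]
    if not (lat_min <= lat <= lat_max and lon_min <= lon <= lon_max):
        raise ValueError("location is outside the Netherlands")
    if when > now:
        raise ValueError("time must be in the past")
    return lat, lon, when


query = {"lat": "52.1", "lon": "5.2", "time": "2023-01-15T12:00:00"}
print(parse_query(query))
```

In a `fastapi` application, a function like this could be called from the route handler, with a `ValueError` translated into a 4xx response.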