<img src="https://raw.githubusercontent.com/OpenEnergyPlatform/academy/develop/docs/data/img/OEP_logo_2_no_text.svg" alt="OpenEnergy Platform" height="75" width="75" align="left"/>

# Create a (Geo)Dataframe from OEP Data and export it as geopackage

Repository: https://github.com/OpenEnergyPlatform/tutorial <br>
Please report bugs and improvements here: https://github.com/OpenEnergyPlatform/examples/issues <br>
How to get started with Jupyter Notebooks can be found here: https://realpython.com/jupyter-notebook-introduction/ <br>


license: [**GNU Affero General Public License Version 3 (AGPL-3.0)**](https://github.com/openego/data_processing/blob/master/LICENSE)<br> 
copyright: **Reiner Lemoine Institut** <br>
authors: **TuPhanRLI, christian-rli** <br>

## Introduction

This tutorial gives you an overview of the [**OpenEnergy Platform**](https://openenergy-platform.org/) and how you can work with the **RESTful-HTTP** API in Python to access geodata. <br>
The full API documentation can be found on [Read the Docs](https://oep-data-interface.readthedocs.io/en/latest/api/how_to.html).

To run this entire notebook you need some Python packages. Install them all from the `requirements.txt` file by running `pip install -r requirements.txt`. Note the colored info blocks used throughout the notebook:

<br>
<div class="alert alert-block alert-danger">
This is important information!
</div>
<div class="alert alert-block alert-info">
This is information!
</div>
<div class="alert alert-block alert-success">
This is your task!
</div>

## Content

1 Select data <br>
2 Make a pandas dataframe <br>
3 Plot a dataframe (geo plot)<br>
4 Save data 

In [14]:
# pip install requests pandas missingno geopandas shapely matplotlib

import os
import getpass

import requests
import pandas as pd
import missingno
import geopandas as gpd
from shapely import wkb
import matplotlib.pyplot as plt

## 1. Select data

This step selects the table `osm_deu_point_windpower` from the `openstreetmap` schema on the OEP: https://openenergy-platform.org/dataedit/view/openstreetmap

To address other tables, change the `schema` and `table` variables in the cell below.

In [15]:
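# NEVER commit your token to a repository.
# Read the token from an environment variable,
# or ask the user for it if the variable is not set.
# Note: the GET request in the next cell does not send this token.
token = os.environ.get("OEP_API_TOKEN") or getpass.getpass('Token:')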

In [16]:
# select data
schema = 'openstreetmap'
table = 'osm_deu_point_windpower'
oep_url = 'openenergy-platform.org'
requested_data = requests.get(f'https://{oep_url}/api/v0/schema/{schema}/tables/{table}/rows')
requested_data.status_code

<div class="alert alert-block alert-info">
<b>Response [200]</b> Data selected successfully! <br>
<b>Response [404]</b> The table doesn't exist!
</div>
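
The URL pattern and the two status codes above can be sketched offline as follows. This is a minimal illustration, not part of the OEP client: `rows_url` and `describe_status` are hypothetical helper names, and only the two status codes listed in the info block are interpreted.

```python
from urllib.parse import quote

OEP_HOST = "openenergy-platform.org"


def rows_url(schema: str, table: str, host: str = OEP_HOST) -> str:
    """Build the rows endpoint URL in the same shape as the request above."""
    return f"https://{host}/api/v0/schema/{quote(schema)}/tables/{quote(table)}/rows"


def describe_status(status_code: int) -> str:
    """Translate the status codes from the info block into a short message."""
    if status_code == 200:
        return "data selected successfully"
    if status_code == 404:
        return "table does not exist"
    return f"unexpected status {status_code}"


url = rows_url("openstreetmap", "osm_deu_point_windpower")
print(url)
print(describe_status(404))
```

With `requests`, `describe_status(requested_data.status_code)` would give the same message for the live response.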

## 2. Make a pandas dataframe

The API returns data in JSON format. To work with it more flexibly, we'll convert it to a pandas dataframe.
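
As a small offline illustration of that conversion, the sketch below assumes the response body is a JSON list of row objects. The payload, its column names and the name `sample_df` are made up for this example and are not taken from the OEP table.

```python
import json

import pandas as pd

# Hypothetical payload: two made-up rows shaped like a JSON list of objects.
payload = '[{"id": 1, "name": "a", "geom": null}, {"id": 2, "name": "b", "geom": null}]'

rows = json.loads(payload)      # list of dicts, one dict per row
sample_df = pd.DataFrame(rows)  # dict keys become the column names

print(sample_df.shape)
print(list(sample_df.columns))
```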


In [None]:
# Create a dataframe from the JSON response
df = pd.DataFrame(requested_data.json())

<div class="alert alert-block alert-success">
Let's take a look at our data!
</div>

In [None]:
# Show metadata for a specific dataframe.
df.info()

In [None]:
# Show the first rows of the df dataframe as a table.
df.head()

In [None]:
# Bar chart of how many non-missing values each column contains
missingno.bar(df, color='tab:blue');

## 3. Plot a dataframe (geo plot)

<div class="alert alert-block alert-success">
Geoinformation can come in different representations. Two common ones are `well-known text` (WKT) and `well-known binary` (WKB), and we can convert between them. To apply a change to every entry in a pandas column, we can use the column's `apply` function.
</div>
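
To make the binary representation less opaque, here is a minimal sketch that decodes a hex-encoded 2D point by hand with the standard library. It handles plain WKB and the PostGIS EWKB variant with an embedded SRID; whether the OEP table uses plain WKB or EWKB is an assumption to check against your data. `decode_wkb_point` is a hypothetical helper, and the sample values are made up. In practice `shapely.wkb.loads` does this for all geometry types.

```python
import struct


def decode_wkb_point(hex_string: str):
    """Decode a 2D point from hex-encoded WKB or EWKB; returns (srid, x, y)."""
    data = bytes.fromhex(hex_string)
    # First byte is the byte order: 1 = little endian, 0 = big endian.
    order = "<" if data[0] == 1 else ">"
    (geom_type,) = struct.unpack_from(order + "I", data, 1)
    offset = 5
    srid = None
    # EWKB flags an embedded SRID with bit 0x20000000 in the type field.
    if geom_type & 0x20000000:
        (srid,) = struct.unpack_from(order + "I", data, offset)
        offset += 4
    has_z_or_m = geom_type & 0xC0000000
    base_type = geom_type & 0x0FFFFFFF
    if base_type != 1 or has_z_or_m:
        raise ValueError("only 2D points are handled in this sketch")
    x, y = struct.unpack_from(order + "2d", data, offset)
    return srid, x, y


# Build two sample encodings of the made-up point (1.0, 2.0).
plain_hex = struct.pack("<BIdd", 1, 1, 1.0, 2.0).hex()
ewkb_hex = struct.pack("<BIIdd", 1, 0x20000001, 4326, 1.0, 2.0).hex()

print(plain_hex, decode_wkb_point(plain_hex))
print(ewkb_hex, decode_wkb_point(ewkb_hex))
```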

In [None]:
# Show the first entries of the geom column (still hex-encoded WKB)
df.geom.head()

In [None]:
# Convert the hex-encoded WKB strings in the geom column to shapely geometries
df['geom'] = df['geom'].apply(lambda x: wkb.loads(x, hex=True))

The data of this table is encoded in the coordinate reference system UTM Zone 33 North. 

In [None]:
# Show the geom column again, now holding shapely geometries
df.geom.head()

<div class="alert alert-block alert-success">
Finally, let's plot our data!
</div>

* The `crs` parameter depends on your data source and location, so change it accordingly.
* The following lines list two possible values for the `crs` variable:
* WGS84 Latitude/Longitude: "EPSG:4326"
* UTM Zone 33 North: "EPSG:32633"
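
If you need the data in the other coordinate system, geopandas can reproject it with `to_crs`. The sketch below uses one made-up point on the central meridian of UTM zone 33N (easting 500000 m), which corresponds to 15° east in WGS84; the names `sample` and `sample_wgs84` are illustrative only.

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical point, given in UTM Zone 33 North coordinates (metres).
sample = gpd.GeoDataFrame(
    {"name": ["example"]},
    geometry=[Point(500000, 5800000)],
    crs="EPSG:32633",
)

# Reproject to WGS84 longitude/latitude (degrees).
sample_wgs84 = sample.to_crs("EPSG:4326")

print(sample_wgs84.crs.to_epsg())
print(sample_wgs84.geometry.iloc[0])
```

Setting `crs=` when constructing a GeoDataFrame only labels the coordinates; `to_crs` is what actually transforms them.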

In [None]:
# geo plot data
# Use an authority string; the {'init': 'epsg:...'} dict form is deprecated in pyproj 2+.
crs = 'EPSG:32633'
gdf = gpd.GeoDataFrame(
    df,                # specify your dataframe
    crs=crs,           # this is your coordinate system
    geometry=df.geom,  # the geometry column we converted above
)
base1 = gdf.plot(color='white', edgecolor='black', figsize=(16, 16))
gdf.plot(ax=base1, color='tab:blue')
plt.show()

In [None]:
# Show metadata for a specific (geo)dataframe.
gdf.info()

## 4. Save data

GeoDataFrames provide a `to_file` method that writes the data to different file formats. In the following we'll store the data as GeoJSON and as GeoPackage.

In [None]:
# Save the geometries as GeoJSON and GeoPackage files in the current working directory.
# gdf.geometry selects only the geometry column, so the other attributes are not written.
gdf.geometry.to_file("output_example_geo_json.geojson", driver='GeoJSON')
gdf.geometry.to_file("output_example_geopackage.gpkg", layer='data_example', driver="GPKG")