<img src="http://openenergy-platform.org/static/OEP_logo_2_no_text.svg" alt="OpenEnergy Platform" height="100" width="100"  align="left"/>

# OpenEnergyPlatform
<br><br>

## Usage of OpenEnergyPlatform oem2orm tool (using the oedialect)
Repository: https://github.com/openego/oedialect <br>
Documentation: http://oep-data-interface.readthedocs.io/en/latest/api/how_to.html

Please report bugs and improvements here: https://github.com/OpenEnergyPlatform/oedialect/issues <br>

In [None]:
__copyright__ = "Reiner Lemoine Institut"
__license__   = "GNU Affero General Public License Version 3 (AGPL-3.0)"
__url__       = "https://github.com/openego/data_processing/blob/master/LICENSE"
__author__    = "jh-RLI, christian-rli"

## Tutorial: Creating sql tables, reading spatial-data, uploading to the oedb

Takeaways:
- How to create a table on the OEP from an oemetadata file
- How to read spatial data (from .gpkg files) in Python
- How to upload this data to the OEP using the OEP-API and the oedialect

### Overview

First we set up the environment with all dependencies and provide the credentials to connect to the OEP. Then we set up our SQL tables using the oemetadata (oem) format in v1.4.0. The metadata strings should be reviewed first to avoid unsupported data types and other inconsistencies; unreviewed metadata of our own can lead to errors in the following steps. We use the oem2orm package to create SQLAlchemy tables derived from the oemetadata files. We then create the tables on the OEP using the OEP API, which uses SQLAlchemy and the oedialect. After that, you should always check that the tables exist and were created properly. If everything looks fine, we proceed to the next step: we import our spatial data into a geopandas dataframe in Python and then upload the data using the oedialect again. Geopandas provides all input and output functionality to do so. In this tutorial we focus on reading spatial data from .gpkg files.

Have fun!

## 0. Setup

You need to be signed up to the [OEP](https://openenergy-platform.org/user/register) to access your username and API Token.

To run this Jupyter Notebook you need to create an execution environment that contains all packages listed in the [requirements.txt](https://github.com/OpenEnergyPlatform/data-preprocessing/blob/feature/oep-upload-oem2orm/data-import/oep-upload/tutorials/requirements.txt) file. Keep in mind that you have to select this environment as the kernel in Jupyter Notebook (you may need to add a [new kernel](https://ipython.readthedocs.io/en/stable/install/kernel_install.html) first).

As described in the oem2orm [usage](https://github.com/OpenEnergyPlatform/data-preprocessing/blob/feature/oep-upload-oem2orm/data-import/oep-upload/README.md) details, it is best practice to clone the data-preprocessing repository from [GitHub](https://github.com/OpenEnergyPlatform/data-preprocessing), as we only want to upload data that has been properly reviewed. The reviewed data can be found in the master branch. If you want, you may still use your own data with this example, but be sure to delete your tables afterwards. The oem2orm tool also requires the use of [Open Energy Metadata (oem)](https://github.com/OpenEnergyPlatform/oemetadata/blob/develop/oemetadata/v140/template.json) in v1.4.0 or lower.

In [1]:
import requests
import json
import geopandas as gpd
import getpass
import sqlalchemy as sa
from sqlalchemy.orm import sessionmaker
import oedialect
from oem2orm import oep_oedialect_oem2orm as oem2orm

If you want to see detailed runtime information on oem2orm functions or if errors occur, you can activate the logger with this simple setup function.

In [2]:
oem2orm.setup_logger()

Display logging information[Yes] or [No]:Yes
logging activated


### Connection to OEP

In order to upload data to the OEP we first need to connect to it, using our OEP user name and token.

Note: You can view your token on your OEP profile page after logging in. 

The setup_db_connection function will prompt for the user credentials and return the DB namedtuple, which is used for all database interactions. DB contains the SQLAlchemy engine and metadata object. Since the OEP is the default database for oem2orm functions, we don't need to pass parameters to the function in this example.
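
As a rough picture of what such a return value looks like, the sketch below builds a stand-in namedtuple with two fields. The field names `engine` and `metadata` are an assumption based on the description above, and the placeholder strings only stand in for the real SQLAlchemy objects.

```python
from collections import namedtuple

# Hypothetical stand-in for the namedtuple returned by setup_db_connection().
# The field names are assumed from the description above.
DB = namedtuple("DB", ["engine", "metadata"])

# Placeholder strings instead of a real SQLAlchemy engine and MetaData object
db_sketch = DB(engine="<sqlalchemy engine>", metadata="<sqlalchemy metadata>")

# Fields can be read by name or unpacked like a regular tuple
print(db_sketch.engine)
engine_part, metadata_part = db_sketch
```

Reading the fields by name keeps later cells readable, since it is clear which of the two objects is handed to a function.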

In [3]:
# Default parameters: setup_db_connection(engine="postgresql+oedialect", host="openenergy-platform.org")
db = oem2orm.setup_db_connection()

Enter OEP-username:Jonas.Huber


## 1. Creating sql tables from oemetadata

The oemetadata format is a standardised json file format and required for all data uploaded to the OEP. It includes the data model and the used data types. This allows us to derive the necessary tables in sqlalchemy from it.
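
To illustrate the idea of deriving a table from metadata, here is a minimal sketch that turns a heavily shortened metadata string into a `CREATE TABLE` statement and runs it against an in-memory SQLite database. It is not how oem2orm works internally. The table name, the `create_statement` helper and the assumed layout `resources` → `schema` → `fields` are for illustration only.

```python
import json
import sqlite3

# Hypothetical, heavily shortened metadata string; real oemetadata files
# contain many more keys. The layout resources -> schema -> fields is assumed.
oem_string = """
{
  "resources": [
    {
      "name": "model_draft.example_table",
      "schema": {
        "fields": [
          {"name": "id", "type": "bigint"},
          {"name": "year", "type": "integer"},
          {"name": "value", "type": "float"}
        ],
        "primaryKey": ["id"]
      }
    }
  ]
}
"""


def create_statement(resource):
    """Build a CREATE TABLE statement from one metadata resource."""
    # Drop the schema prefix because SQLite has no schemas
    table_name = resource["name"].split(".")[-1]
    columns = ", ".join(
        '"{}" {}'.format(field["name"], field["type"])
        for field in resource["schema"]["fields"]
    )
    primary_key = ", ".join(
        '"{}"'.format(key) for key in resource["schema"]["primaryKey"]
    )
    return 'CREATE TABLE "{}" ({}, PRIMARY KEY ({}))'.format(
        table_name, columns, primary_key
    )


metadata = json.loads(oem_string)
statement = create_statement(metadata["resources"][0])

# Run the statement against an in-memory SQLite database to confirm it is valid SQL
conn = sqlite3.connect(":memory:")
conn.execute(statement)
column_names = [row[1] for row in conn.execute('PRAGMA table_info("example_table")')]
conn.close()
print(column_names)
```

The point of the sketch is that column names and data types come from the metadata alone, which is why unsupported data types in the metadata surface as errors during table creation.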

### Provide an oemetadata file / Data Input

In order to create the table(s) we first need to tell Python where to find our oemetadata file. To do this we place the oem file in v1.4.0 in the folder "upload-example-metadata" in the current directory (the path of this Jupyter notebook) or provide a path to our oemetadata folder. oem2orm will process all files located in that folder.

In [4]:
# folder structure like ".../tutorials/upload-example-metadata", store your oem files here
metadata_folder = oem2orm.select_oem_dir(oem_folder_name="upload-example-metadata")

### Setup a Table

The collect_ordered_tables_from_oem function collects all metadata files in a folder, retrieves the SQLAlchemy ORM objects and returns them. The tables are ordered by foreign key.
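
Ordering by foreign key means that a table is only created after the tables it references. The following sketch shows the general idea with the standard library's `graphlib` (Python 3.9 or newer). The table names are hypothetical, and this is not a description of how oem2orm orders tables internally.

```python
from graphlib import TopologicalSorter

# Hypothetical tables, each mapped to the set of tables its foreign keys reference
foreign_keys = {
    "measurement": {"region", "scenario"},
    "region": set(),
    "scenario": {"region"},
}

# static_order() yields every table after the tables it depends on
creation_order = list(TopologicalSorter(foreign_keys).static_order())
print(creation_order)
```

Dropping tables would walk the same order in reverse, so that no table is removed while another one still references it.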

<span style='background:lightblue'>
INFO: The red output fields represent information that is printed by the logger. This does not mean
that an error has occurred.
</span>

In [5]:
# iterates over the provided oem files and retrieves the data models as ORM (objects)
ordered_orm = oem2orm.collect_ordered_tables_from_oem(db, metadata_folder)

INFO:[Table('oep_metadata_table_example_v140_JH', MetaData(bind=Engine(postgresql+oedialect://Jonas.Huber:***@openenergy-platform.org)), Column('id', BIGINT(), table=<oep_metadata_table_example_v140_JH>, primary_key=True, nullable=False), Column('year', INTEGER(), table=<oep_metadata_table_example_v140_JH>), Column('value', FLOAT(), table=<oep_metadata_table_example_v140_JH>), Column('geom', Geometry(spatial_index=False), table=<oep_metadata_table_example_v140_JH>), schema='model_draft')]


### Create the new Table

Now we can use the function create_tables() from oem2orm to create all of the Table objects we just collected in the database.

In [6]:
oem2orm.create_tables(db, ordered_orm)

### Check if the table exists

We use the OEP API to check if the table exists. The API returns a status code: 200 means the table exists, 404 means it does not exist.

In [7]:
oep_url = "http://openenergy-platform.org"
schema = "model_draft"
table= "oep_metadata_table_example_v140_JH"

result = requests.get(oep_url + '/api/v0/schema/' + schema + '/tables/' + table + '/columns')
json_result = result.json()
result.status_code

if result.status_code == 200:
    print(table + " exists")
else:
    print("Table does not exist")

# json_result


oep_metadata_table_example_v140_JH exists
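
The cell above reports every response other than 200 as a missing table. A slightly stricter variant separates 404 from other codes, such as server errors. The sketch below only builds the URL and interprets a status code, so it runs without a network connection. The helper names `columns_url` and `describe_status` and the table name are hypothetical.

```python
def columns_url(base_url, schema, table):
    """Build the API URL that lists the columns of a table."""
    return "{}/api/v0/schema/{}/tables/{}/columns".format(base_url, schema, table)


def describe_status(status_code, table):
    """Translate the HTTP status code of the check into a message."""
    if status_code == 200:
        return table + " exists"
    if status_code == 404:
        return table + " does not exist"
    # Any other code (e.g. a server error) says nothing about the table
    return "Unexpected status code {}".format(status_code)


url = columns_url("https://openenergy-platform.org", "model_draft", "example_table")
print(url)
print(describe_status(404, "example_table"))
```

In the notebook, the status code would come from `requests.get(url).status_code`.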


### Add the metadata to your table

We use the API again and add the OEMetadata to the table. 

We accomplish this in 3 steps:
1. Load metadata from file to python
2. Send the metadata to the Table via the API
    2.1 We must verify the interaction with the OEP-API by providing the token again 
3. Check if the metadata exists on the table

In [8]:
# Load metadata from file to python    
meta_json = oem2orm.prepare_md_for_api_action(table, metadata_folder)

In [9]:
import os
import getpass


# Verify the interaction with the OEP-API by providing the token again 
token = oem2orm.setUserToken()
        
# Send the metadata to the Table via the API
oep_url = "https://openenergy-platform.org"
schema = "model_draft"
table= "oep_metadata_table_example_v140_JH"
data = meta_json
headers = {'Authorization': 'Token %s' % token, 'Accept': 'application/json', 'Content-Type': 'application/json'}

api_create_metadata = requests.post(oep_url +
            "/api/v0/schema/{schema}/tables/{table}/meta/".format(
                schema=schema, table=table
            ),
    
            # data=meta_json,
            data=data,
            headers=headers)

print(api_create_metadata.json())
print(api_create_metadata.status_code)


{'reason': 'Metadata could not be parsed'}
400
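
The response does not say why parsing failed. One thing that can be ruled out locally is that the payload is not valid JSON. The sketch below assumes the payload is a string; the helper name `check_metadata_string` and the example strings are hypothetical. It does not reproduce the validation performed by the OEP, so a payload that passes this check may still be rejected.

```python
import json


def check_metadata_string(metadata_string):
    """Return the parsed metadata, or raise ValueError with the parser message."""
    try:
        parsed = json.loads(metadata_string)
    except json.JSONDecodeError as error:
        raise ValueError("Metadata is not valid JSON: {}".format(error)) from error
    if not isinstance(parsed, dict):
        raise ValueError("Metadata must be a JSON object at the top level")
    return parsed


# Hypothetical example strings
good = check_metadata_string('{"name": "example_table", "resources": []}')
try:
    # Single quotes are not valid JSON
    check_metadata_string("{'name': 'example_table'}")
    bad_message = None
except ValueError as error:
    bad_message = str(error)
print(bad_message)
```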


In [10]:
# Check if the metadata exists on the table

result=requests.get(oep_url + "/api/v0/schema/{schema}/tables/{table}/meta/".format(schema=schema, table=table))
json_result = result.json()

if result.status_code == 200 and json_result:
    print("The OEmetadata is created on the table")
else:
    print("Something's gone wrong. The metadata is still empty.")
    print(json_result)

Something's gone wrong. The metadata is still empty.
{}


### 1.1 Delete SQL tables 

If you don't want to upload data in the next step, you should delete your tables from the database.

In [11]:
oem2orm.delete_tables(db, ordered_orm)

Please confirm that you would like to drop the following tables:
  0. model_draft.oep_metadata_table_example_v140_JH
Please confirm with either of the choices below:
- yes
- no
- the indexes to drop in the format 0, 2, 3, 5
Please type the choice completely as there is no default choice.Yes


ConnectionException: Answer contains no JSON: <Response [500]>

## 2. Reading spatial-data and 3. uploading it to the oedb

Geopandas offers functionality for spatial data. Its read_file() function accepts several source formats, e.g. .gpkg and .geojson. The function can also import data from a URL that provides it.

For more information see: https://geopandas.org/io.html
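
A GeoPackage file is an SQLite database, and its `gpkg_contents` table lists the layers it contains. When a file holds several layers, the layer names can therefore be looked up with the standard library before choosing one to read. The sketch below builds a tiny stand-in file that only mimics that one table; it is not a complete GeoPackage, and the layer names and the helper `list_feature_layers` are hypothetical.

```python
import os
import sqlite3
import tempfile


def list_feature_layers(gpkg_path):
    """Return the names of the vector layers registered in a GeoPackage."""
    conn = sqlite3.connect(gpkg_path)
    try:
        rows = conn.execute(
            "SELECT table_name FROM gpkg_contents "
            "WHERE data_type = 'features' ORDER BY table_name"
        ).fetchall()
    finally:
        conn.close()
    return [row[0] for row in rows]


# Build a tiny stand-in file that only mimics the gpkg_contents table.
# It is not a complete GeoPackage and the layer names are hypothetical.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.gpkg")
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE gpkg_contents (table_name TEXT, data_type TEXT)")
    conn.executemany(
        "INSERT INTO gpkg_contents VALUES (?, ?)",
        [("districts", "features"), ("elevation", "tiles"), ("plants", "features")],
    )
    conn.commit()
    conn.close()
    layers = list_feature_layers(path)

print(layers)
```

One of the returned names can then be passed as the `layer` argument of `read_file()`.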

### Read the data

In [None]:
example_df = gpd.read_file('../data/TemplateData.csv', layer='')

Looking at the first three lines of our dataframe:

In [None]:
example_df[:3]

### Insert (upload) data into Table
 
Uploading the information from our DataFrame is now done with a single command. With if_exists='replace', every upload deletes the content of the table and refills it with the new values. If you change 'replace' to 'append', the new entries are added to the existing ones. (Connecting and uploading may take a minute.)
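
The difference between the two modes can be tried out locally. The sketch below mimics them with plain `sqlite3` on an in-memory database. It is a stand-in for illustration, not the pandas implementation, and the table `example` and the helper `write_rows` are hypothetical.

```python
import sqlite3


def write_rows(conn, rows, if_exists):
    """Mimic the two modes for a fixed, hypothetical table 'example'."""
    if if_exists == "replace":
        # Discard the table and everything in it, then start fresh
        conn.execute("DROP TABLE IF EXISTS example")
    elif if_exists != "append":
        raise ValueError("if_exists must be 'replace' or 'append'")
    conn.execute("CREATE TABLE IF NOT EXISTS example (year INTEGER, value REAL)")
    conn.executemany("INSERT INTO example VALUES (?, ?)", rows)
    conn.commit()


def count_rows(conn):
    """Return the number of rows currently stored in 'example'."""
    return conn.execute("SELECT COUNT(*) FROM example").fetchone()[0]


conn = sqlite3.connect(":memory:")
write_rows(conn, [(2019, 1.0), (2020, 2.0)], if_exists="replace")
write_rows(conn, [(2021, 3.0)], if_exists="append")
rows_after_append = count_rows(conn)    # the earlier rows are kept
write_rows(conn, [(2022, 4.0)], if_exists="replace")
rows_after_replace = count_rows(conn)   # only the newest row remains
conn.close()
print(rows_after_append, rows_after_replace)
```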

In [None]:
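# Assumption: the engine is available as db.engine on the DB namedtuple
Session = sessionmaker(bind=db.engine)
session = Session()
try:
    # 'table' and 'schema' were defined when checking that the table exists
    example_df.to_sql(table, db.engine, schema=schema, if_exists='replace')
    print('Inserted to ' + table)
except Exception as e:
    session.rollback()
    # Print before re-raising; code after a bare raise is never reached
    print('Insert incomplete!')
    raise
finally:
    session.close()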

You can also insert data manually into the table.

In [None]:
# Assumption: the first collected table is the example table created above
ExampleTable = ordered_orm[0]

# Assumption: the engine is available as db.engine on the DB namedtuple
Session = sessionmaker(bind=db.engine)
session = Session()
try:
    # The keys must match the column names of the table
    insert_statement = ExampleTable.insert().values(
        [
            dict(year=2020, value=200),
            dict(year=1999, value=120),
            dict(year=1998, value=1100)
        ]
    )
    session.execute(insert_statement)
    session.commit()
    print('Insert successful!')
except Exception as e:
    session.rollback()
    # Print before re-raising; code after a bare raise is never reached
    print('Insert incomplete!')
    raise
finally:
    session.close()

## 4. Retrieving data and verifying that the tables exist and the data were uploaded successfully

### Select from Table

Now we can query our table to see if the data arrived.

In [None]:
# Assumptions: db.engine holds the engine, ordered_orm[0] is the example table
Session = sessionmaker(bind=db.engine)
session = Session()
print(session.query(ordered_orm[0]).all())
session.close()

### Storing Query Result in DataFrame
We can write the results of the query back into a DataFrame, where it's easier to handle.

In [None]:
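import pandas as pd

# Assumptions: db.engine holds the engine, ordered_orm[0] is the example table
Session = sessionmaker(bind=db.engine)
session = Session()
df = pd.DataFrame(session.query(ordered_orm[0]).all())
session.close()
df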

In [None]:
pip show sqlalchemy

In [None]:
pip show oedialect