# Power Plants in Germany

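This notebook covers German power plants. It downloads the power plant list from the Bundesnetzagentur (BNetzA) and augments it with additional information, in particular from the power plant list of the Umweltbundesamt (UBA).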

## Table of Contents
* [Power Plants in Germany](#Power-Plants-in-Germany)
* [License](#License)
* [Prepare the environment](#Prepare-the-environment)
* [Specify the source URLs:](#Specify-the-source-URLs:)
* [Define functions](#Define-functions)
* [Downloads](#Downloads)
	* [Download the BNetzA power plant list](#Download-the-BNetzA-power-plant-list)
	* [Download the Uba Plant list](#Download-the-Uba-Plant-list)
* [Translate contents](#Translate-contents)
	* [Columns](#Columns)
	* [Fuel types](#Fuel-types)
	* [Power plant status](#Power-plant-status)
	* [CHP Capability](#CHP-Capability)
	* [EEG](#EEG)
	* [UBA Columns](#UBA-Columns)
* [Process data](#Process-data)
	* [Set index to the BNetzA power plant ID](#Set-index-to-the-BNetzA-power-plant-ID)
	* [Merge data from UBA List](#Merge-data-from-UBA-List)
		* [case 1-1](#case-1-1)
		* [case n-1](#case-n-1)
		* [case 1-n](#case-1-n)
		* [Merge into plantlist](#Merge-into-plantlist)
	* [Delete fuels not in focus](#Delete-fuels-not-in-focus)
	* [Add Columns for shutdown and retrofit](#Add-Columns-for-shutdown-and-retrofit)
	* [Convert input colums to usable data types](#Convert-input-colums-to-usable-data-types)
	* [Identify generation technology](#Identify-generation-technology)
		* [Process technology information from UBA list](#Process-technology-information-from-UBA-list)
		* [Identify generation technology based on BNetzA information](#Identify-generation-technology-based-on-BNetzA-information)
	* [Add efficiency data](#Add-efficiency-data)
		* [Efficiencies from research](#Efficiencies-from-research)
			* [Import data](#Import-data)
			* [Plot efficiencies by year of commissioning](#Plot-efficiencies-by-year-of-commissioning)
			* [Determine least-squares approximation based on researched data (planned)](#Determine-least-squares-approximation-based-on-researched-data-%28planned%29)
			* [Apply efficiency approximation from least squares approximation (planned)](#Apply-efficiency-approximation-from-least-squares-approximation-%28planned%29)
		* [Efficiencies from literature](#Efficiencies-from-literature)
			* [Import data](#Import-data)
			* [Apply efficiency approximation from literature](#Apply-efficiency-approximation-from-literature)
	* [Add geodata](#Add-geodata)
* [Define final output](#Define-final-output)
* [Documenting the data package (meta data)](#Documenting-the-data-package-%28meta-data%29)
* [Write the results to file](#Write-the-results-to-file)


# License

- This notebook is published under the LICENSENAME

# Prepare the environment

In [1]:
import urllib.request
import csv
import pandas as pd
import numpy as np
import posixpath
import urllib.parse
import datetime  
import re
import os.path
import yaml  # http://pyyaml.org/, pip install pyyaml, conda install pyyaml
import json
import subprocess
import sqlite3

from bokeh.charts import Scatter, show
from bokeh.io import output_notebook
output_notebook()
%matplotlib inline
import logging
logger = logging.getLogger('notebook')
logger.setLevel('INFO')
nb_root_logger = logging.getLogger()
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s',
                              datefmt='%d %b %Y %H:%M:%S')
nb_root_logger.handlers[0].setFormatter(formatter)

# Create download and output folders if they do not exist
# (os.makedirs also creates the intermediate folders)
os.makedirs('downloads/', exist_ok=True)
os.makedirs('output/datapackage_powerplants_germany/original_data/', exist_ok=True)

# Specify the source URLs:

In [2]:
# BNetzA Power plant list
url_bnetza = 'http://www.bundesnetzagentur.de/SharedDocs/Downloads/DE/Sachgebiete/Energie/Unternehmen_Institutionen/Versorgungssicherheit/Erzeugungskapazitaeten/Kraftwerksliste/Kraftwerksliste_CSV.csv?__blob=publicationFile&v=5'

# UBA Power plant list
url_uba = 'http://www.umweltbundesamt.de/sites/default/files/medien/376/dokumente/kraftwerke_in_deutschland_ab_100_megawatt_elektrischer_leistung_2015_09.xls'

# Define functions

This section defines functions that are used multiple times within this notebook.

In [3]:
def downloadandcache(url):
    """Download a file into the folder 'downloads' (at most once per day),
    store a copy with the original data of the data package
    and return the local filepath."""
    path = urllib.parse.urlsplit(url).path
    filename = posixpath.basename(path)
    now = datetime.datetime.now()
    datestring = str(now.year)+"-"+str(now.month)+"-"+str(now.day)
    filepath = "downloads/"+datestring+"-"+filename
    filepath_original_data = "output/datapackage_powerplants_germany/original_data/"+filename
    
    # Check if today's file exists, otherwise download it
    if not os.path.exists(filepath):
        print("Downloading file", filename)
        urllib.request.urlretrieve(url, filepath)
        urllib.request.urlretrieve(url, filepath_original_data)
    else:
        print("Using local file from", filepath)
    filepath = './'+filepath
    return filepath
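
The naming scheme of the cache can be illustrated without any download. The following is a minimal sketch; `cache_path` and its `zero_padded` option are hypothetical helpers for illustration and are not part of the notebook. It shows that the query string of the URL is ignored and that the date prefix is not zero-padded, which matches file names such as `2016-4-14-Kraftwerksliste_CSV.csv`.

```python
import datetime
import posixpath
import urllib.parse


def cache_path(url, today, zero_padded=False):
    """Return the dated cache path for a URL without downloading anything.

    Hypothetical helper that mirrors the naming logic of downloadandcache().
    """
    # urlsplit() separates the path from the query string ('?__blob=...')
    filename = posixpath.basename(urllib.parse.urlsplit(url).path)
    if zero_padded:
        # ISO format (YYYY-MM-DD) sorts chronologically in a file listing
        datestring = today.isoformat()
    else:
        # Same format as in downloadandcache(): no leading zeros
        datestring = "{}-{}-{}".format(today.year, today.month, today.day)
    return "downloads/" + datestring + "-" + filename


url_bnetza = ('http://www.bundesnetzagentur.de/SharedDocs/Downloads/DE/Sachgebiete/'
              'Energie/Unternehmen_Institutionen/Versorgungssicherheit/'
              'Erzeugungskapazitaeten/Kraftwerksliste/Kraftwerksliste_CSV.csv'
              '?__blob=publicationFile&v=5')
day = datetime.date(2016, 4, 14)

print(cache_path(url_bnetza, day))
print(cache_path(url_bnetza, day, zero_padded=True))
```

A zero-padded date is only a suggestion; switching to it would change the names of files that are already cached.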


# Downloads

## Download the BNetzA power plant list

This section downloads the BNetzA power plant list and converts it to a pandas DataFrame.

In [4]:
bnetza_data_filepath=(downloadandcache(url_bnetza))
plantlist=pd.read_csv(bnetza_data_filepath, 
               skiprows=9,
               sep=';',  # CSV field separator, default is ','
               thousands='.',  # Thousands separator, default is ','
               decimal=',',  # Decimal separator, default is '.'
               encoding='cp1252')
plantlist.head()

Using local file from downloads/2016-4-14-Kraftwerksliste_CSV.csv


Unnamed: 0,Kraftwerksnummer Bundesnetzagentur,Unternehmen,Kraftwerksname,PLZ (Standort Kraftwerk),Ort (Standort Kraftwerk),Straße und Hausnummer (Standort Kraftwerk),Bundesland,Blockname,Aufnahme der kommerziellen Stromerzeugung der derzeit in Betrieb befindlichen Erzeugungseinheit (Jahr),Kraftwerksstatus (in Betrieb/ vorläufig stillgelegt/ saisonale Konservierung Reservekraftwerk/ Sonderfall),Energieträger,"Spezifizierung ""Mehrere Energieträger"" und ""Sonstige Energieträger"" - Hauptbrennstoff","Spezifizierung ""Mehrere Energieträger"" - Zusatz- / Ersatzbrennstoffe",Auswertung Energieträger (Zuordnung zu einem Hauptenergieträger bei Mehreren Energieträgern),Vergütungsfähig nach EEG (ja/nein),Wärmeauskopplung (KWK) (ja/nein),Netto-Nennleistung (elektrische Wirkleistung) in MW,Bezeichnung Verknüpfungspunkt (Schaltanlage) mit dem Stromnetz der Allgemeinen Versorgung gemäß Netzbetreiber,Netz- oder Umspannebene des Anschlusses in kV,Name Stromnetzbetreiber
0,BNA0001,,,52074,Aachen,,Nordrhein-Westfalen,,1997,in Betrieb,Windenergie (Onshore-Anlage),,,Windenergie (Onshore-Anlage),Ja,,15.0,,MS,INFRAWEST GmbH
1,BNA0002,ecoJoule construct GmbH,,28832,Achim,,Niedersachsen,,2002,in Betrieb,Windenergie (Onshore-Anlage),,,Windenergie (Onshore-Anlage),Ja,,13.3,,HS/MS,EWE NETZ GmbH
2,BNA0003,Sendenhorster Windenergie GmbH & Co. KG,,59229,Ahlen,,Nordrhein-Westfalen,,2003,in Betrieb,Windenergie (Onshore-Anlage),,,Windenergie (Onshore-Anlage),Ja,,15.0,,MS,Stadtwerke Ahlen Netz GmbH
3,BNA0004,Windpark Ahlerstedt GmbH & Co. KG,,21702,Ahlerstedt,,Niedersachsen,,1999,in Betrieb,Windenergie (Onshore-Anlage),,,Windenergie (Onshore-Anlage),Ja,,39.6,,HS/MS,EWE NETZ GmbH
4,BNA0005,Vattenfall Europe Generation AG,Ahrensfelde,16356,Ahrensfelde,Lindenberger Weg,Brandenburg,GT A,1990,in Betrieb,Erdgas,,,Erdgas,Nein,Nein,37.5,Malchow,110,50 Hertz Transmission GmbH
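
The effect of the `sep`, `thousands` and `decimal` arguments can be checked on a small in-memory sample. This is a sketch under assumptions: the second data row (`BNA9999`) is made up for illustration and does not occur in the BNetzA list.

```python
import io

import pandas as pd

# Two-column excerpt in the German CSV convention:
# ';' separates fields, ',' is the decimal mark, '.' groups thousands.
sample = io.StringIO(
    "Kraftwerksnummer Bundesnetzagentur;Netto-Nennleistung (elektrische Wirkleistung) in MW\n"
    "BNA0001;15,0\n"
    "BNA9999;1.234,5\n"  # hypothetical row to show the thousands separator
)

df = pd.read_csv(sample, sep=';', thousands='.', decimal=',')
capacities = df['Netto-Nennleistung (elektrische Wirkleistung) in MW'].tolist()
print(capacities)
```

Without these arguments the capacity column would be read as text, and later sums over capacities would fail or concatenate strings.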


## Download the Uba Plant list

In [5]:
uba_data_filepath=(downloadandcache(url_uba))
plantlist_uba=pd.read_excel(uba_data_filepath,
                           skiprows=9
                           )
plantlist_uba.head()

Using local file from downloads/2016-4-14-kraftwerke_in_deutschland_ab_100_megawatt_elektrischer_leistung_2015_09.xls


Unnamed: 0,Kraftwerksname / Standort,Betreiber,Bundesland,Standort-PLZ,Kraftwerksstandort,Elektrische Bruttoleistung (MW),Fernwärme-leistung (MW),Inbetriebnahme (ggf. Ertüchtigung),Anlagenart,Primärenergieträger
0,Ahrensfelde A bis D,Vattenfall Europe GmbH,BB,16356.0,Ahrensfelde,152.0,,1990,GT,Erdgas
1,Albbruck-Dogern,Rheinkraftwerk Albbruck-Dogern AG,BW,79774.0,Albbruck,108.9,,1933 / 2009,LWK,Wasser
2,"Altbach/Deizisau GT A-C, E",EnBW Kraftwerke AG,BW,73776.0,Altbach,305.0,,1971-1997,GT,Erdgas
3,Altbach/Deizisau HKW 1,EnBW Kraftwerke AG,BW,73776.0,Altbach,476.0,280.0,1985 (2006),HKW,Steinkohle
4,Altbach/Deizisau HKW 2,EnBW Kraftwerke AG,BW,73776.0,Altbach,379.0,280.0,1997 (2012),HKW (DT),Steinkohle


# Translate contents

## Columns

The following dictionary maps the original German column names to short English names and is used to rename the columns. Some of the original names contain line breaks (`\n`), so the keys have to match them exactly.

Original Name|Translation
-|-
Kraftwerksnummer Bundesnetzagentur|id
Unternehmen|company
Kraftwerksname|name
PLZ\n(Standort Kraftwerk)|postcode
Ort\n(Standort Kraftwerk)|city
Straße und Hausnummer (Standort Kraftwerk)|street
Bundesland|state
Blockname|block
Aufnahme der kommerziellen Stromerzeugung der derzeit in Betrieb befindlichen Erzeugungseinheit\n(Jahr)|commissioned
Kraftwerksstatus \n(in Betrieb/\nvorläufig stillgelegt/\nsaisonale Konservierung\nReservekraftwerk/\nSonderfall)|status
Energieträger|fuel_basis
Spezifizierung "Mehrere Energieträger" und "Sonstige Energieträger" - Hauptbrennstoff|fuel_multiple1
Spezifizierung "Mehrere Energieträger" - Zusatz- / Ersatzbrennstoffe|fuel_multiple2
Auswertung\nEnergieträger (Zuordnung zu einem Hauptenergieträger bei Mehreren Energieträgern)|fuel
Vergütungsfähig nach EEG\n(ja/nein)|eeg
Wärmeauskopplung (KWK)\n(ja/nein)|chp
Netto-Nennleistung (elektrische Wirkleistung) in MW|capacity
Bezeichnung Verknüpfungspunkt (Schaltanlage) mit dem Stromnetz der Allgemeinen Versorgung gemäß Netzbetreiber|network_node
Netz- oder Umspannebene des Anschlusses in kV|voltage
Name Stromnetzbetreiber|network_operator

In [6]:
dict_columns = { 'Kraftwerksnummer Bundesnetzagentur':'id',
            'Unternehmen':'company',
            'Kraftwerksname':'name',
            'PLZ\n(Standort Kraftwerk)':'postcode',
            'Ort\n(Standort Kraftwerk)':'city',
            'Straße und Hausnummer (Standort Kraftwerk)':'street',
            'Bundesland':'state',
            'Blockname':'block',
            'Aufnahme der kommerziellen Stromerzeugung der derzeit in Betrieb befindlichen Erzeugungseinheit\n(Jahr)':'commissioned',
            'Kraftwerksstatus \n(in Betrieb/\nvorläufig stillgelegt/\nsaisonale Konservierung\nReservekraftwerk/\nSonderfall)':'status',
            'Energieträger':'fuel_basis',
            'Spezifizierung "Mehrere Energieträger" und "Sonstige Energieträger" - Hauptbrennstoff':'fuel_multiple1',
            'Spezifizierung "Mehrere Energieträger" - Zusatz- / Ersatzbrennstoffe':'fuel_multiple2',
            'Auswertung\nEnergieträger (Zuordnung zu einem Hauptenergieträger bei Mehreren Energieträgern)':'fuel',
            'Vergütungsfähig nach EEG\n(ja/nein)':'eeg',
            'Wärmeauskopplung (KWK)\n(ja/nein)':'chp',
            'Netto-Nennleistung (elektrische Wirkleistung) in MW':'capacity',
            'Bezeichnung Verknüpfungspunkt (Schaltanlage) mit dem Stromnetz der Allgemeinen Versorgung gemäß Netzbetreiber':'network_node',
            'Netz- oder Umspannebene des Anschlusses in kV':'voltage',
            'Name Stromnetzbetreiber':'network_operator',
            'Kraftwerksname / Standort': 'uba_name',
            'Betreiber ': 'uba_company',
            'Standort-PLZ': 'uba_postcode',
            'Kraftwerksstandort': 'uba_city',
            'Elektrische Bruttoleistung (MW)': 'uba_capacity',
            'Fernwärme-leistung (MW)': 'uba_chp_capacity',
            'Inbetriebnahme  (ggf. Ertüchtigung)': 'uba_commissioned',
            'Anlagenart': 'uba_technology',
            'Primärenergieträger': 'uba_fuel',
          }
plantlist.rename(columns=dict_columns, inplace=True)

# Check if all columns have been translated
for columnnames in plantlist.columns:
    if not columnnames in dict_columns.values():
        logger.error("Untranslated column: "+ columnnames)
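
Because the keys must match the raw headers exactly, a line break, a trailing space (`'Betreiber '`) or a double space in the source file decides whether a column is translated. One possible way to make the lookup more tolerant is to normalise whitespace on both sides first. The sketch below assumes this is acceptable for the data at hand; `normalize_header` and `translate_headers` are hypothetical helpers, not part of the notebook.

```python
def normalize_header(name):
    """Collapse line breaks and repeated spaces and strip the ends."""
    return ' '.join(str(name).split())


def translate_headers(headers, mapping):
    """Translate headers with a whitespace-tolerant lookup.

    Returns the translated list and the headers that had no translation.
    """
    normalized = {normalize_header(key): value for key, value in mapping.items()}
    translated, missing = [], []
    for header in headers:
        key = normalize_header(header)
        if key in normalized:
            translated.append(normalized[key])
        else:
            translated.append(header)
            missing.append(header)
    return translated, missing


mapping = {'PLZ\n(Standort Kraftwerk)': 'postcode', 'Betreiber ': 'uba_company'}
headers = ['PLZ (Standort Kraftwerk)', 'Betreiber', 'Unbekannte Spalte']
translated, missing = translate_headers(headers, mapping)
print(translated, missing)
```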

## Fuel types

In [7]:
dict_fuels = {'Steinkohle':'coal',
              'Erdgas':'gas',
              'Braunkohle':'lignite',
              'Kernenergie':'uranium',
              'Pumpspeicher':'pumped_storage',
              'Biomasse':'biomass',
              'Mineralölprodukte':'oil',
              'Laufwasser':'run_of_river',
              'Sonstige Energieträger\n(nicht erneuerbar) ':'other_non_renewable',
              'Abfall':'waste',
              'Speicherwasser (ohne Pumpspeicher)':'reservoir',
              'Unbekannter Energieträger\n(nicht erneuerbar)':'unknown_non_renewable',
              'Mehrere Energieträger\n(nicht erneuerbar)':'multiple_non_renewable',
              'Deponiegas':'gas_landfill',
              'Windenergie (Onshore-Anlage)':'wind_onshore',
              'Windenergie (Offshore-Anlage)':'wind_offshore',
              'Solare Strahlungsenergie':'solar',
              'Klärgas':'gas_sewage',
              'Geothermie':'geothermal',
              'Grubengas':'gas_mine'
                        }
plantlist["fuel"].replace(dict_fuels, inplace=True)
plantlist["fuel"].unique()

# Check if all fuels have been translated
for fuelnames in plantlist["fuel"].unique():
    if fuelnames not in dict_fuels.values():
        # str() avoids a TypeError if the value is missing (NaN)
        logger.error("Untranslated fuel: "+ str(fuelnames))

## Power plant status

In [8]:
dict_plantstatus ={
'in Betrieb':'operating',
'vorläufig stillgelegt':'shutdown_temporary',
'Sonderfall':'special_case',
'saisonale Konservierung':'seasonal_conservation',
'Reservekraftwerk':'reserve',
'Endgültig Stillgelegt 2011':'shutdown_2011',
'Endgültig Stillgelegt 2012':'shutdown_2012',
'Endgültig Stillgelegt 2013':'shutdown_2013',
'Endgültig Stillgelegt 2014':'shutdown_2014',
'Endgültig Stillgelegt 2015':'shutdown_2015',
'Endgültig stillgelegt 2015':'shutdown_2015',
}
plantlist["status"].replace(dict_plantstatus, inplace=True)
plantlist["status"].unique()

# Check if all plant statuses have been translated
for statusnames in plantlist["status"].unique():
    if statusnames not in dict_plantstatus.values():
        logger.error("Untranslated plant status: "+ str(statusnames))

## CHP Capability

In [9]:
dict_yesno ={
'Nein':'no',
'nein':'no',
'Ja':'yes',
'ja':'yes',    
}
plantlist["chp"].replace(dict_yesno, inplace=True)
plantlist["chp"].unique()

# Check if all CHP values have been translated (missing values are ignored)
for chpnames in plantlist["chp"].unique():
    if (chpnames not in dict_yesno.values()) and (str(chpnames) != "nan"):
        logger.error("Untranslated chp capability: " + str(chpnames))

## EEG

In [10]:
plantlist["eeg"].replace(dict_yesno, inplace=True)
plantlist["eeg"].unique()

# Check if all EEG values have been translated (missing values are ignored)
for eegnames in plantlist["eeg"].unique():
    if (eegnames not in dict_yesno.values()) and (str(eegnames) != "nan"):
        logger.error("Untranslated EEG type: " + str(eegnames))
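
The four checks for fuel, status, CHP and EEG follow the same pattern: every value that is neither a translation target nor missing is reported. This pattern could be factored into one function. The sketch below is only a suggestion; `untranslated` is a hypothetical helper, and the sample value `'Ja '` (with a trailing space) is made up to show a failed translation.

```python
import math


def untranslated(values, mapping):
    """Return all values that are neither translation targets nor missing."""
    targets = set(mapping.values())
    leftovers = []
    for value in values:
        # Missing entries appear as float NaN in pandas and are skipped
        if isinstance(value, float) and math.isnan(value):
            continue
        if value not in targets:
            leftovers.append(value)
    return leftovers


dict_yesno = {'Nein': 'no', 'nein': 'no', 'Ja': 'yes', 'ja': 'yes'}
values_after_replace = ['yes', 'no', float('nan'), 'Ja ']
leftovers = untranslated(values_after_replace, dict_yesno)
print(leftovers)
```

In the notebook, such a function would be called with `plantlist["chp"].unique()` and the result passed to `logger.error`.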

## UBA Columns

The columns of the UBA list are translated in the same way, using a separate dictionary.

In [11]:
dict_uba_columns = {
            'Kraftwerksname / Standort': 'uba_name',
            'Betreiber ': 'uba_company',
            'Standort-PLZ': 'uba_postcode',
            'Kraftwerksstandort': 'uba_city',
            'Elektrische Bruttoleistung (MW)': 'uba_capacity',
            'Fernwärme-leistung (MW)': 'uba_chp_capacity',
            'Inbetriebnahme  (ggf. Ertüchtigung)': 'uba_commissioned',
            'Anlagenart': 'uba_technology',
            'Primärenergieträger': 'uba_fuel',
            'Bundesland':'uba_state',
          }
plantlist_uba.rename(columns=dict_uba_columns, inplace=True)

# Check if all columns have been translated
for columnnames in plantlist_uba.columns:
    if not columnnames in dict_uba_columns.values():
        logger.error("Untranslated column: "+ columnnames)
        
# Prepare for matching: build a key from the UBA plant name and fuel
plantlist_uba['uba_id_string'] = plantlist_uba['uba_name'] + '_' + plantlist_uba['uba_fuel']

# Process data

## Set index to the BNetzA power plant ID

In [12]:
# Set Index to Kraftwerksnummer_Bundesnetzagentur
plantlist['bnetza_id'] = plantlist['id']
plantlist = plantlist.set_index('id')
plantlist.head()

id,company,name,postcode,city,street,state,block,commissioned,status,fuel_basis,fuel_multiple1,fuel_multiple2,fuel,eeg,chp,capacity,network_node,voltage,network_operator,bnetza_id
BNA0001,,,52074,Aachen,,Nordrhein-Westfalen,,1997,operating,Windenergie (Onshore-Anlage),,,wind_onshore,yes,,15.0,,MS,INFRAWEST GmbH,BNA0001
BNA0002,ecoJoule construct GmbH,,28832,Achim,,Niedersachsen,,2002,operating,Windenergie (Onshore-Anlage),,,wind_onshore,yes,,13.3,,HS/MS,EWE NETZ GmbH,BNA0002
BNA0003,Sendenhorster Windenergie GmbH & Co. KG,,59229,Ahlen,,Nordrhein-Westfalen,,2003,operating,Windenergie (Onshore-Anlage),,,wind_onshore,yes,,15.0,,MS,Stadtwerke Ahlen Netz GmbH,BNA0003
BNA0004,Windpark Ahlerstedt GmbH & Co. KG,,21702,Ahlerstedt,,Niedersachsen,,1999,operating,Windenergie (Onshore-Anlage),,,wind_onshore,yes,,39.6,,HS/MS,EWE NETZ GmbH,BNA0004
BNA0005,Vattenfall Europe Generation AG,Ahrensfelde,16356,Ahrensfelde,Lindenberger Weg,Brandenburg,GT A,1990,operating,Erdgas,,,gas,no,no,37.5,Malchow,110,50 Hertz Transmission GmbH,BNA0005


## Merge data from UBA List

In [13]:
# Import the list that matches BNetzA IDs to entries of the UBA list
matchinglist=pd.read_csv('inputs/matching_bnetza_uba.csv', 
               skiprows=0,
               sep=';',  # CSV field separator, default is ','
               thousands='.',  # Thousands separator, default is ','
               decimal=',',  # Decimal separator, default is '.'
               encoding='cp1252')
matchinglist['uba_id_string'] = matchinglist['uba_match_name'] + '_' + matchinglist['uba_match_fuel']
matchinglist.head()

Unnamed: 0,ID BNetzA,uba_match_name,uba_match_fuel,uba_id_string
0,BNA0005,Ahrensfelde A bis D,Erdgas,Ahrensfelde A bis D_Erdgas
1,BNA0006,Ahrensfelde A bis D,Erdgas,Ahrensfelde A bis D_Erdgas
2,BNA0007,Ahrensfelde A bis D,Erdgas,Ahrensfelde A bis D_Erdgas
3,BNA0008,Ahrensfelde A bis D,Erdgas,Ahrensfelde A bis D_Erdgas
4,BNA0010a,Albbruck-Dogern,Wasser,Albbruck-Dogern_Wasser


### case 1-1

In [14]:
#Check for cases:
#1-1 One BNetzA ID to one UBA-ID

match1t1 = matchinglist[(matchinglist.duplicated(subset=['uba_id_string'], keep=False) == False) & (matchinglist.duplicated(subset=['ID BNetzA'], keep=False)== False)]
match1t1 = pd.merge(match1t1, plantlist_uba, left_on='uba_id_string', right_on='uba_id_string', how='left')
match1t1 = match1t1.set_index('ID BNetzA')
match1t1.head()

#Add comment
match1t1['merge_comment'] = 'List matching type: Single UBA power plant assigned to single BNetzA power plant'
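
The three cases are distinguished by checking, for every row of the matching list, whether its BNetzA ID and its UBA key occur more than once. A minimal sketch of this classification follows; the row `BNA_EXAMPLE` / `Example plant_Erdgas` is hypothetical, and the `case` column is introduced for illustration only.

```python
import pandas as pd

matching = pd.DataFrame({
    'ID BNetzA': ['BNA0005', 'BNA0006', 'BNA1019', 'BNA1019', 'BNA_EXAMPLE'],
    'uba_id_string': ['Ahrensfelde A bis D_Erdgas', 'Ahrensfelde A bis D_Erdgas',
                      'Wehr 1_Wasser', 'Wehr 2_Wasser', 'Example plant_Erdgas'],
})

# keep=False marks every member of a duplicate group, not only the repeats
dup_uba = matching.duplicated(subset=['uba_id_string'], keep=False)
dup_bnetza = matching.duplicated(subset=['ID BNetzA'], keep=False)

matching['case'] = 'n-n'  # both sides duplicated: covered by none of the three cases
matching.loc[~dup_uba & ~dup_bnetza, 'case'] = '1-1'
matching.loc[dup_uba & ~dup_bnetza, 'case'] = 'n-1'
matching.loc[~dup_uba & dup_bnetza, 'case'] = '1-n'

cases = matching['case'].tolist()
print(cases)
```

Rows where both the BNetzA ID and the UBA key are duplicated would stay in the `n-n` group and would not be picked up by any of the three filters.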

### case n-1

In [15]:
# Match multiple BNetza IDs to One UBA ID
# Matching structure: 
# bnetza_id uba_id
# 1         1
# 2         1
# 3         1
# 4         2
# 5         2

# Get relevant entries from the matchinglist and merge the corresponding UBA Data to the list.
matchnt1= matchinglist[(matchinglist.duplicated(subset=['uba_id_string'], keep=False) == True) & (matchinglist.duplicated(subset=['ID BNetzA'], keep=False)== False)]
matchnt1 = pd.merge(matchnt1, plantlist_uba, left_on='uba_id_string', right_on='uba_id_string', how='left')
matchnt1 = matchnt1.set_index('ID BNetzA')



# Import BNetzA Capacities and CHP criterion into matchnt1 dataframe
plantlist_capacities = pd.DataFrame(plantlist[['capacity','chp']])
plantlist_capacities=plantlist_capacities.rename(columns = {'capacity':'capacity_bnetza'})
plantlist_capacities=plantlist_capacities.rename(columns = {'chp':'chp_bnetza'})
#print(plantlist_capacities)
matchnt1 = pd.merge(matchnt1, plantlist_capacities, left_index=True, right_index=True, how='left')

# Get sum of BNetzA capacities for each UBA index and merge into matchnt1 dataframe
plantlist_uba_capacitysum = pd.DataFrame(matchnt1.groupby('uba_id_string').sum()['capacity_bnetza'])
plantlist_uba_capacitysum=plantlist_uba_capacitysum.rename(columns ={'capacity_bnetza':'capacity_bnetza_aggregate'})
#matchnt1 = pd.merge(matchnt1, plantlist_uba_capacitysum, left_index=True,right_index=True, how='left')
matchnt1 = pd.merge(matchnt1, plantlist_uba_capacitysum, left_on='uba_id_string', right_index=True, how='left')
#print(matchnt1)


# Scale UBA capacities based on BNetzA data
matchnt1['uba_capacity_scaled'] = matchnt1['uba_capacity'] * matchnt1['capacity_bnetza']/matchnt1['capacity_bnetza_aggregate']


# Determine sum of capacities with chp capability and add to matchnt1
plantlist_uba_chp_capacities = matchnt1[(matchnt1['chp_bnetza'] == 'yes')]
plantlist_uba_chp_capacitysum = pd.DataFrame(plantlist_uba_chp_capacities.groupby('uba_id_string').sum()[['capacity_bnetza']])
plantlist_uba_chp_capacitysum = plantlist_uba_chp_capacitysum.rename(columns = {'capacity_bnetza':'capacity_bnetza_with_chp'})
# The sums are indexed by uba_id_string, so merge on that column and not on the BNetzA ID index
matchnt1 = pd.merge(matchnt1, plantlist_uba_chp_capacitysum, left_on='uba_id_string', right_index=True, how='left')

matchnt1['uba_chp_capacity_scaled'] = matchnt1['uba_chp_capacity'] * matchnt1['capacity_bnetza']/matchnt1['capacity_bnetza_with_chp']

# Change column names for merge later on
matchnt1['uba_chp_capacity_original'] = matchnt1['uba_chp_capacity']
matchnt1['uba_chp_capacity'] = matchnt1['uba_chp_capacity_scaled']
matchnt1['uba_capacity_original'] = matchnt1['uba_capacity']
matchnt1['uba_capacity'] = matchnt1['uba_capacity_scaled']

#Add comment
matchnt1['merge_comment'] = 'List matching type: UBA capacity distributed proportionally to multiple BNetzA power plants'

# Drop columns not needed anymore
#colsToDrop = ['capacity_bnetza', 'chp_bnetza','capacity_bnetza_with_chp', 'capacity_bnetza_aggregate', 'uba_chp_capacity_scaled', 'uba_capacity_scaled']
#matchnt1 = matchnt1.drop(colsToDrop, axis=1)
matchnt1.head()

Unnamed: 0,uba_match_name,uba_match_fuel,uba_id_string,uba_name,uba_company,uba_state,uba_postcode,uba_city,uba_capacity,uba_chp_capacity,...,uba_fuel,capacity_bnetza,chp_bnetza,capacity_bnetza_aggregate,uba_capacity_scaled,capacity_bnetza_with_chp,uba_chp_capacity_scaled,uba_chp_capacity_original,uba_capacity_original,merge_comment
BNA0005,Ahrensfelde A bis D,Erdgas,Ahrensfelde A bis D_Erdgas,Ahrensfelde A bis D,Vattenfall Europe GmbH,BB,16356.0,Ahrensfelde,38.0,,...,Erdgas,37.5,no,150.0,38.0,,,,152.0,List matching type: UBA capacity distributed p...
BNA0006,Ahrensfelde A bis D,Erdgas,Ahrensfelde A bis D_Erdgas,Ahrensfelde A bis D,Vattenfall Europe GmbH,BB,16356.0,Ahrensfelde,38.0,,...,Erdgas,37.5,no,150.0,38.0,,,,152.0,List matching type: UBA capacity distributed p...
BNA0007,Ahrensfelde A bis D,Erdgas,Ahrensfelde A bis D_Erdgas,Ahrensfelde A bis D,Vattenfall Europe GmbH,BB,16356.0,Ahrensfelde,38.0,,...,Erdgas,37.5,no,150.0,38.0,,,,152.0,List matching type: UBA capacity distributed p...
BNA0008,Ahrensfelde A bis D,Erdgas,Ahrensfelde A bis D_Erdgas,Ahrensfelde A bis D,Vattenfall Europe GmbH,BB,16356.0,Ahrensfelde,38.0,,...,Erdgas,37.5,no,150.0,38.0,,,,152.0,List matching type: UBA capacity distributed p...
BNA0010a,Albbruck-Dogern,Wasser,Albbruck-Dogern_Wasser,Albbruck-Dogern,Rheinkraftwerk Albbruck-Dogern AG,BW,79774.0,Albbruck,83.647826,,...,Wasser,79.5,no,103.5,83.647826,,,,108.9,List matching type: UBA capacity distributed p...
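
The scaling in the n-1 case distributes one UBA capacity over several BNetzA units in proportion to their BNetzA capacities: scaled capacity = UBA capacity × unit capacity / sum of unit capacities. The sketch below reproduces the Ahrensfelde values from the table above in plain Python; `scale_by_share` is a hypothetical helper and not part of the notebook.

```python
def scale_by_share(uba_capacity, bnetza_capacities):
    """Split one UBA capacity proportionally to the BNetzA unit capacities."""
    total = sum(bnetza_capacities.values())
    return {unit: uba_capacity * capacity / total
            for unit, capacity in bnetza_capacities.items()}


# Four BNetzA units with 37.5 MW each are matched to one UBA entry with 152 MW
ahrensfelde = {'BNA0005': 37.5, 'BNA0006': 37.5, 'BNA0007': 37.5, 'BNA0008': 37.5}
scaled = scale_by_share(152.0, ahrensfelde)
print(scaled)
```

Each unit receives 152 × 37.5 / 150 = 38 MW, and the scaled values add up to the UBA capacity again.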


### case 1-n

In [16]:
# 1-n case: multiple UBA entries are matched to a single BNetzA ID
# The resulting DataFrame is called "match1tn"
# Matching structure: 
# bnetza_id uba_id
# 1         1
# 1         2
# 1         3
# 2         4
# 2         5

# Get relevant entries from the matchinglist and merge the corresponding UBA Data to the list.
match1tn= matchinglist[(matchinglist.duplicated(subset=['ID BNetzA'], keep=False) == True) & (matchinglist.duplicated(subset=['uba_id_string'], keep=False)== False)]
match1tn = pd.merge(match1tn, plantlist_uba, left_on='uba_id_string', right_on='uba_id_string', how='left')
match1tn = match1tn.set_index('ID BNetzA')
match1tn.head()

# Import BNetzA Capacities and CHP criterion into match1tn dataframe
plantlist_capacities = pd.DataFrame(plantlist[['capacity','chp']])
plantlist_capacities=plantlist_capacities.rename(columns = {'capacity':'capacity_bnetza'})
plantlist_capacities=plantlist_capacities.rename(columns = {'chp':'chp_bnetza'})
#plantlist_capacities.head()
#print(plantlist_capacities)
match1tn = pd.merge(match1tn, plantlist_capacities, left_index=True, right_index=True, how='left')
match1tn.index.names=['ID BNetzA']
match1tn.head()

# Get sum of UBA capacities for each BNetzA index and merge into match1tn dataframe
plantlist_bnetza_capacitysum = pd.DataFrame(match1tn.groupby(match1tn.index).sum()['uba_capacity'])
plantlist_bnetza_capacitysum=plantlist_bnetza_capacitysum.rename(columns ={'uba_capacity':'uba_capacity_aggregate'})
#print(plantlist_uba_capacitysum)
match1tn = pd.merge(match1tn, plantlist_bnetza_capacitysum, left_index=True, right_index=True, how='left')

# Get sum of UBA CHP capacities for each BNetzA index and merge into match1tn dataframe
plantlist_bnetza_chp_capacitysum = pd.DataFrame(match1tn.groupby(match1tn.index).sum()['uba_chp_capacity'])
plantlist_bnetza_chp_capacitysum=plantlist_bnetza_chp_capacitysum.rename(columns ={'uba_chp_capacity':'uba_chp_capacity_aggregate'})
match1tn = pd.merge(match1tn, plantlist_bnetza_chp_capacitysum, left_index=True, right_index=True, how='left')

# Get UBA Technology for each BNetzA Index and merge into match1tn dataframe 
## Option 1: Take all technologies and merge them
#match1tn['uba_technology_aggregate'] = pd.DataFrame(match1tn.groupby(match1tn.index).transform(lambda x: ', '.join(x))['uba_technology'])
## Option 2 (currently preferred): Take technology with highest occurrence
match1tn['uba_technology_aggregate'] = pd.DataFrame(match1tn.groupby(match1tn.index)['uba_technology'].agg(lambda x:x.value_counts().index[0]))

# Get UBA Plant name
# Select the column first so that the join is only applied to the plant names
match1tn['uba_name_aggregate'] = match1tn.groupby(match1tn.index)['uba_name'].transform(lambda x: ', '.join(x))

# Get UBA company name
match1tn['uba_company_aggregate'] = pd.DataFrame(match1tn.groupby(match1tn.index)['uba_company'].agg(lambda x:x.value_counts().index[0]))

# Change column names for merge later on
match1tn = match1tn.rename(columns={'uba_chp_capacity':'uba_chp_capacity_original','uba_capacity':'uba_capacity_original',
                                   'uba_chp_capacity_aggregate':'uba_chp_capacity','uba_capacity_aggregate':'uba_capacity'})
#match1tn['uba_chp_capacity_original'] = match1tn['uba_chp_capacity_aggregate']
#match1tn['uba_capacity_original'] = match1tn['uba_capacity_aggregate']

#Add comment
match1tn['merge_comment'] = 'List matching type: Multiple UBA capacities aggregated to single BNetzA power plant'

# Drop columns not needed anymore
colsToDrop = ['capacity_bnetza', 'chp_bnetza']
match1tn = match1tn.drop(colsToDrop, axis=1)

# Drop duplicate rows and keep first entry
match1tn = match1tn.reset_index().drop_duplicates(subset='ID BNetzA', keep='first').set_index('ID BNetzA')

match1tn.head()

ID BNetzA,uba_match_name,uba_match_fuel,uba_id_string,uba_name,uba_company,uba_state,uba_postcode,uba_city,uba_capacity_original,uba_chp_capacity_original,uba_commissioned,uba_technology,uba_fuel,uba_capacity,uba_chp_capacity,uba_technology_aggregate,uba_name_aggregate,uba_company_aggregate,merge_comment
BNA0073,Berlin-Mitte HKW GT 1,Erdgas,Berlin-Mitte HKW GT 1_Erdgas,Berlin-Mitte HKW GT 1,Vattenfall Europe GmbH,BE,10179.0,Berlin,178.0,1210.0,1997,HKW / GuD,Erdgas,468.0,1210.0,HKW / GuD,"Berlin-Mitte HKW GT 1, Berlin-Mitte HKW GT 2, ...",Vattenfall Europe GmbH,List matching type: Multiple UBA capacities ag...
BNA0606,Emsland D (Lingen) DT,Erdgas,Emsland D (Lingen) DT_Erdgas,Emsland D (Lingen) DT,RWE Power AG,NI,49808.0,Lingen,326.0,50.0,2010,GuD,Erdgas,888.0,50.0,GuD,"Emsland D (Lingen) DT, Emsland D (Lingen) GT 1...",RWE Power AG,List matching type: Multiple UBA capacities ag...
BNA1019,Wehr 1,Wasser,Wehr 1_Wasser,Wehr 1,Schluchseewerk AG,BW,79664.0,Wehr,248.0,,1976,PSW,Wasser,992.0,,PSW,"Wehr 1, Wehr 2, Wehr 3, Wehr 4",Schluchseewerk AG,List matching type: Multiple UBA capacities ag...
,Gambsheim / Rhein 1-4,Wasser,Gambsheim / Rhein 1-4_Wasser,Gambsheim / Rhein 1-4,EnBW / EDF,BW,77866.0,Gambsheim,108.0,,1974,LWK,Wasser,,,,Gambsheim / Rhein 1-4,,List matching type: Multiple UBA capacities ag...
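
In the 1-n case, several UBA entries are condensed into one row per BNetzA ID: capacities are summed, names are joined, and the most frequent technology is kept. The sketch below shows this aggregation in plain Python for BNA1019. Only the values for Wehr 1 and the aggregate of 992 MW appear in the table above; the 248 MW for Wehr 2 to Wehr 4 are an assumption consistent with that sum, and `aggregate_uba` is a hypothetical helper.

```python
from collections import Counter


def aggregate_uba(rows):
    """Condense several UBA entries into the values used for one BNetzA plant."""
    technologies = Counter(row['uba_technology'] for row in rows)
    return {
        'uba_capacity': sum(row['uba_capacity'] for row in rows),
        # most_common(1) returns the technology with the highest occurrence
        'uba_technology_aggregate': technologies.most_common(1)[0][0],
        'uba_name_aggregate': ', '.join(row['uba_name'] for row in rows),
    }


wehr = [
    {'uba_name': 'Wehr 1', 'uba_capacity': 248.0, 'uba_technology': 'PSW'},
    {'uba_name': 'Wehr 2', 'uba_capacity': 248.0, 'uba_technology': 'PSW'},  # assumed
    {'uba_name': 'Wehr 3', 'uba_capacity': 248.0, 'uba_technology': 'PSW'},  # assumed
    {'uba_name': 'Wehr 4', 'uba_capacity': 248.0, 'uba_technology': 'PSW'},  # assumed
]
aggregated = aggregate_uba(wehr)
print(aggregated)
```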


### Merge into plantlist

In [17]:
# Combine the three UBA match DataFrames (1-1, n-1 and 1-n) into one
plantlist_uba_for_merge = pd.concat([match1t1, matchnt1, match1tn])

# Merge plantlist_uba_for_merge into the plantlist
plantlist = pd.merge(plantlist, plantlist_uba_for_merge, left_index=True, right_index=True, how='left')

In [18]:
plantlist.loc[['BNA0073']]

Unnamed: 0,company,name,postcode,city,street,state,block,commissioned,status,fuel_basis,...,uba_fuel,uba_id_string,uba_match_fuel,uba_match_name,uba_name,uba_name_aggregate,uba_postcode,uba_state,uba_technology,uba_technology_aggregate
BNA0073,Vattenfall Europe Wärme AG,Mitte,10179,Berlin,Köpenicker Straße 60,Berlin,GuD Mitte,1996,operating,Erdgas,...,Erdgas,Berlin-Mitte HKW GT 1_Erdgas,Erdgas,Berlin-Mitte HKW GT 1,Berlin-Mitte HKW GT 1,"Berlin-Mitte HKW GT 1, Berlin-Mitte HKW GT 2, ...",10179.0,BE,HKW / GuD,HKW / GuD


## Delete fuels not in focus

In [19]:
# Delete unwanted fuels
plantlist = plantlist[plantlist.fuel != 'solar']
plantlist = plantlist[plantlist.fuel != 'wind_onshore']
plantlist = plantlist[plantlist.fuel != 'wind_offshore']

# Delete placeholder values
plantlist = plantlist[plantlist.company != 'EEG-Anlagen < 10 MW']
plantlist = plantlist[plantlist.company != 'Nicht-EEG-Anlagen < 10 MW']
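
The same filter can be written with two lists and `isin`, which keeps the excluded fuels and placeholder names in one place. This is a sketch on a made-up four-row frame, not a replacement for the cell above; the company names `A`, `B` and `C` are placeholders for illustration.

```python
import pandas as pd

plants = pd.DataFrame({
    'fuel': ['solar', 'gas', 'wind_onshore', 'lignite'],
    'company': ['A', 'EEG-Anlagen < 10 MW', 'B', 'C'],
})

excluded_fuels = ['solar', 'wind_onshore', 'wind_offshore']
placeholder_companies = ['EEG-Anlagen < 10 MW', 'Nicht-EEG-Anlagen < 10 MW']

# Keep rows whose fuel is in focus and whose company is not a placeholder
keep = ~plants['fuel'].isin(excluded_fuels) & ~plants['company'].isin(placeholder_companies)
filtered = plants[keep]
remaining_fuels = filtered['fuel'].tolist()
print(remaining_fuels)
```

Rows with a missing company are kept by both variants, since a missing value is neither equal to a placeholder nor contained in the list.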

## Add Columns for shutdown and retrofit

In [20]:
# Helper: first regex group of `pattern` in `column`, converted to a number (NaN if no match)
def extract_year(column, pattern):
    return pd.to_numeric(plantlist[column].str.extract(pattern, expand=False), errors='coerce')

# Extract the shutdown year from the status text and normalise the status
plantlist['shutdown'] = extract_year('status', r'[\w].+(\d\d\d\d)')
plantlist.loc[plantlist['shutdown'] > 0, 'status'] = 'shutdown'

# Fill retrofit data column
# Identify retrofit dates in UBA list: a year directly preceded by "(", "." or "+"
plantlist['retrofit'] = extract_year('uba_commissioned', r'[(.+](\d\d\d\d)')

# Split multiple commissioning dates as listed in UBA
plantlist['uba_commissioned_1'] = extract_year('uba_commissioned', r'(\d\d\d\d)')
plantlist['uba_commissioned_2'] = extract_year('uba_commissioned', r'[\w].+(\d\d\d\d).+[\w]')
plantlist['uba_commissioned_3'] = extract_year('uba_commissioned', r'[\w].+(\d\d\d\d)')

# Clear UBA commissioning years that merely repeat the retrofit year
for column in ['uba_commissioned_1', 'uba_commissioned_2', 'uba_commissioned_3']:
    plantlist.loc[plantlist['retrofit'] == plantlist[column], column] = float('nan')

# Split multiple commissioning dates as listed in BNetzA
plantlist['commissioned_1'] = extract_year('commissioned', r'(\d\d\d\d)')
plantlist['commissioned_2'] = extract_year('commissioned', r'[\w].+(\d\d\d\d).+[\w]')
plantlist['commissioned_3'] = extract_year('commissioned', r'[\w].+(\d\d\d\d)')

# Show plants that have been shut down
plantlist[plantlist['status']=='shutdown']

#plantlist.to_excel('power_plants_germany_tech.xlsx', sheet_name='output')

Unnamed: 0,company,name,postcode,city,street,state,block,commissioned,status,fuel_basis,...,uba_technology,uba_technology_aggregate,shutdown,retrofit,uba_commissioned_1,uba_commissioned_2,uba_commissioned_3,commissioned_1,commissioned_2,commissioned_3
BNA0011,Papierfabrik Albbruck GmbH,Papierfabrik Albbruck,79774,Albbruck,,Baden-Württemberg,,2009.0,shutdown,Mehrere Energieträger,...,,,2012.0,,,,,2009.0,,
BNA0059a,Volkswagen AG,HKW Kassel,34225,Baunatal,,Hessen,Turbine 1,1961.0,shutdown,Erdgas,...,,,2013.0,,,,,1961.0,,
BNA0099,Gemeinschaftskraftwerk Veltheim GmbH,Gasturbinenkraftwerk Bielefeld Ummeln,33649,Bielefeld,,Nordrhein-Westfalen,GT Ummeln,1975.0,shutdown,Erdgas,...,,,2015.0,,,,,1975.0,,
BNA0118,Energie- und Wasserversorgung Bonn/Rhein-Sieg ...,Heizkraftwerk Süd,53121,Bonn,,Nordrhein-Westfalen,Heizkraftwerk Süd,1969.0,shutdown,Erdgas,...,,,2012.0,,,,,1969.0,,
BNA0143,swb Erzeugung GmbH & Co. KG,KW Mittelsbüren,28237,Bremen,Auf den Delben 35,Bremen,Block 3,1974.0,shutdown,Mehrere Energieträger,...,,,2013.0,,,,,1974.0,,
BNA0187,E.ON Kraftwerke GmbH,Datteln,45711,Datteln,,Nordrhein-Westfalen,1,1964.0,shutdown,Steinkohle,...,,,2013.0,,,,,1964.0,,
BNA0188,E.ON Kraftwerke GmbH,Datteln,45711,Datteln,,Nordrhein-Westfalen,2,1964.0,shutdown,Steinkohle,...,,,2013.0,,,,,1964.0,,
BNA0189,E.ON Kraftwerke GmbH,Datteln,45711,Datteln,,Nordrhein-Westfalen,3,1969.0,shutdown,Steinkohle,...,,,2013.0,,,,,1969.0,,
BNA0203,E.ON Kraftwerke GmbH,Knepper,44357,Dortmund,,Nordrhein-Westfalen,C,1971.0,shutdown,Steinkohle,...,,,2014.0,,,,,1971.0,,
BNA0212,Stadtwerke Duisburg AG,HKW II/B,47053,Duisburg,Zirkelstraße,Nordrhein-Westfalen,HKW II/B,1965.0,shutdown,Mehrere Energieträger,...,,,2012.0,,,,,1965.0,,
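
The year patterns used in this cell are easiest to understand on a few strings. The following is a minimal sketch using Python's `re` module directly; pandas' `str.extract` likewise returns the first group of the first match. The helper `first_group` and the sample strings are made up for illustration and are not taken from the BNetzA or UBA lists.

```python
import re

def first_group(pattern, text):
    """Return group 1 of the first match of `pattern` in `text`, or None."""
    match = re.search(pattern, text)
    return match.group(1) if match else None

SHUTDOWN = r'[\w].+(\d\d\d\d)'   # last four-digit year that follows some leading text
RETROFIT = r'[(.+](\d\d\d\d)'    # year directly preceded by "(", "." or "+"
FIRST_YEAR = r'(\d\d\d\d)'       # first four-digit year in the string

# Hypothetical inputs, for illustration only
status_with_year = 'shutdown 2012'
status_plain = 'operating'
uba_commissioned = '1965 (2008)'

shutdown_year = first_group(SHUTDOWN, status_with_year)
no_shutdown = first_group(SHUTDOWN, status_plain)
retrofit_year = first_group(RETROFIT, uba_commissioned)
commissioning_year = first_group(FIRST_YEAR, uba_commissioned)

print(shutdown_year, no_shutdown, retrofit_year, commissioning_year)
```

Because `.+` is greedy, the shutdown pattern picks the last year in the text, whereas the plain `(\d\d\d\d)` pattern picks the first one.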


## Convert input colums to usable data types

In [21]:
plantlist['capacity_float'] = pd.to_numeric(plantlist['capacity'],errors='coerce')
plantlist['commissioned_float'] = pd.to_numeric(plantlist[['commissioned','commissioned_1','commissioned_2','commissioned_3']].max(axis=1),errors='coerce')
plantlist['retrofit_float'] = pd.to_numeric(plantlist['retrofit'],errors='coerce')
plantlist.head()

Unnamed: 0,company,name,postcode,city,street,state,block,commissioned,status,fuel_basis,...,retrofit,uba_commissioned_1,uba_commissioned_2,uba_commissioned_3,commissioned_1,commissioned_2,commissioned_3,capacity_float,commissioned_float,retrofit_float
BNA0005,Vattenfall Europe Generation AG,Ahrensfelde,16356,Ahrensfelde,Lindenberger Weg,Brandenburg,GT A,1990,operating,Erdgas,...,,,,,1990.0,,,37.5,1990.0,
BNA0006,Vattenfall Europe Generation AG,Ahrensfelde,16356,Ahrensfelde,Lindenberger Weg,Brandenburg,GT B,1990,operating,Erdgas,...,,,,,1990.0,,,37.5,1990.0,
BNA0007,Vattenfall Europe Generation AG,Ahrensfelde,16356,Ahrensfelde,Lindenberger Weg,Brandenburg,GT C,1990,operating,Erdgas,...,,,,,1990.0,,,37.5,1990.0,
BNA0008,Vattenfall Europe Generation AG,Ahrensfelde,16356,Ahrensfelde,Lindenberger Weg,Brandenburg,GT D,1990,operating,Erdgas,...,,,,,1990.0,,,37.5,1990.0,
BNA0010a,Rheinkraftwerk Albbruck- Dogern AG,RADAG,79774,Albbruck,Kraftwerkstrasse 34,Baden-Württemberg,,1933,operating,Laufwasser,...,,1933.0,,2009.0,1933.0,,,79.5,1933.0,


## Identify generation technology

### Process technology information from UBA list

In [22]:
# Split uba_technology information into technology (GT, CC,...) and type (HKW, IKW, ...)
plantlist['technology'] = plantlist['uba_technology']
plantlist['type'] = plantlist['uba_technology']

# Translate technologies
dict_technology = {
            'GT': 'GT',
            'GuD': 'CC',
            'DKW': 'ST',
            'LWK': 'ROR',
            'PSW': 'PSP',
            'DWR': 'ST', #Pressurized water reactor
            'G/AK': 'GT', #GT with heat recovery
            'SWR': 'ST', #boiling water reactor
            'SWK': 'SPP', #storage power plant
            'SSA': '', #bus bar
            'HKW (DT)': 'ST',
            'HKW / GuD': 'CC',
            'GuD / HKW': 'CC',
            'IKW / GuD': 'CC',
            'IKW /GuD': 'CC',
            'HKW / SSA': '',
            'IKW / SSA': '',
            'HKW': '',
            'IKW': '',
            'IKW / HKW': ''
}
plantlist["technology"] = plantlist["technology"].replace(dict_technology)

# Check if all technologies have been translated
for technology in plantlist["technology"].unique():
    if technology not in dict_technology.values() and str(technology) != "nan":
        logger.error("Untranslated technology: " + str(technology))

# Translate types
dict_type = {
            'HKW': 'CHP', #combined heat and power plant
            'HKW (DT)': 'CHP',
            'IKW': 'IPP', #industrial power plant
            'HKW / GuD': 'CHP',
            'GuD / HKW': 'CHP',
            'IKW / GuD': 'IPP',
            'IKW /GuD': 'IPP',
            'IKW / SSA': 'IPP',
            'HKW / SSA': 'CHP',
            'IKW / HKW': 'CHP',
            'GT': '',
            'GuD': '',
            'DKW': '',
            'LWK': '',
            'PSW': '',
            'DWR': '', #Pressurized water reactor
            'G/AK': 'CHP', #GT with heat recovery
            'SWR': '', #boiling water reactor
            'SWK': '', #storage power plant
            'SSA': '', #bus bar
}
plantlist["type"] = plantlist["type"].replace(dict_type)

# Check if all types have been translated
for plant_type in plantlist["type"].unique():
    if plant_type not in dict_type.values() and str(plant_type) != "nan":
        logger.error("Untranslated type: " + str(plant_type))

#plantlist.head()
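
The two dictionaries split a single UBA label into two pieces of information: the generation technology and the plant type. A minimal sketch of that idea in plain Python, using only a small subset of the mappings above. The helper `translate` and the label `'XYZ'` are hypothetical; the helper mimics how `replace` leaves labels without a mapping unchanged.

```python
# Subset of the mappings defined in the cell above
technology_map = {'GT': 'GT', 'GuD': 'CC', 'HKW / GuD': 'CC', 'HKW': '', 'IKW': ''}
type_map = {'GT': '', 'GuD': '', 'HKW / GuD': 'CHP', 'HKW': 'CHP', 'IKW': 'IPP'}

def translate(label, mapping):
    """Look up `label`; labels without a mapping are passed through unchanged."""
    return mapping.get(label, label)

labels = ['HKW / GuD', 'GT', 'IKW', 'XYZ']  # 'XYZ' is a made-up, unmapped label
technologies = [translate(label, technology_map) for label in labels]
types = [translate(label, type_map) for label in labels]

# Anything that is not one of the translated values would be logged as untranslated
untranslated = [t for t in technologies if t not in technology_map.values()]

print(technologies, types, untranslated)
```

An empty string marks labels that carry no information for the respective column, for example `'IKW'` says nothing about the technology.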

### Identify generation technology based on BNetzA information

In [23]:
# Set technology based on fuels
# Rows whose technology has not been identified yet
def technology_missing():
    return (plantlist['technology'] == '') | plantlist['technology'].isnull()

# Rows whose plant name or block name contains any of the given keywords
def name_or_block_contains(*keywords):
    pattern = '|'.join(keywords)
    return (plantlist['name'].str.contains(pattern, na=False) |
            plantlist['block'].str.contains(pattern, na=False))

dict_fuel_technology = {
    'uranium': 'ST',
    'lignite': 'ST',
    'coal': 'ST',
    'run_of_river': 'ROR',
    'pumped_storage': 'PSP',
    'reservoir': 'RES',
}
for fuel, technology in dict_fuel_technology.items():
    plantlist.loc[(plantlist['fuel'] == fuel) & technology_missing(), 'technology'] = technology

# Set technology based on name and block information (e.g. combined-cycle, gas turbine).
# Each rule only fills rows that are still unidentified, so the order of the rules matters.
# A further restriction to gas- and oil-fired plants is currently not applied.
## Define technology CC as combination of GT and DT
plantlist.loc[name_or_block_contains('GT') & name_or_block_contains('DT') & technology_missing(),
              'technology'] = 'CC'
## Define technology CC if specified as GuD
plantlist.loc[name_or_block_contains('GuD', 'GUD') & technology_missing(), 'technology'] = 'CC'
## Define technology GT
plantlist.loc[name_or_block_contains('GT', 'Gasturbine') & technology_missing(), 'technology'] = 'GT'
## Define technology ST
plantlist.loc[name_or_block_contains('DT', 'Dampfturbine', 'Dampfkraftwerk', 'DKW') & technology_missing(),
              'technology'] = 'ST'
## Define technology CB
plantlist.loc[name_or_block_contains('motor', 'Motor') & technology_missing(), 'technology'] = 'CB'

# Set technology ST for all technologies which could not be identified
plantlist.loc[technology_missing(), 'technology'] = 'ST'

#plantlist.to_excel('power_plants_germany_tech.xlsx', sheet_name='output')
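
Because every rule in this cell only touches plants that are still unidentified, the order of the rules decides the result: a plant mentioning both GT and DT becomes CC before the plain GT rule is reached. The following sketch restates the name-based rules for a single plant in plain Python. The function `identify_technology` and the plant names are hypothetical and serve only to illustrate the precedence.

```python
def identify_technology(name, block=''):
    """Apply the name-based rules in order; the first matching rule wins."""
    text = name + ' ' + block  # a keyword may appear in either field

    def has(*keywords):
        return any(keyword in text for keyword in keywords)

    if has('GT') and has('DT'):
        return 'CC'  # gas turbine plus steam turbine: combined cycle
    if has('GuD', 'GUD'):
        return 'CC'
    if has('GT', 'Gasturbine'):
        return 'GT'
    if has('DT', 'Dampfturbine', 'Dampfkraftwerk', 'DKW'):
        return 'ST'
    if has('motor', 'Motor'):
        return 'CB'
    return 'ST'  # fallback for everything that could not be identified

# Made-up plant and block names
examples = {
    ('Musterwerk', 'GT 1'): identify_technology('Musterwerk', 'GT 1'),
    ('Musterwerk', 'GT/DT'): identify_technology('Musterwerk', 'GT/DT'),
    ('GuD Mitte', ''): identify_technology('GuD Mitte'),
    ('Motorenanlage', ''): identify_technology('Motorenanlage'),
    ('Musterwerk', ''): identify_technology('Musterwerk'),
}
print(examples)
```

The matching is case-sensitive, which is why both `GuD` and `GUD` as well as `motor` and `Motor` are listed separately.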

## Add country code

In [24]:
# Add country Code
plantlist["country_code"] = plantlist["state"]
dict_state_country = {
    'Brandenburg' : 'DE',
    'Baden-Württemberg' : 'DE',
    'Niedersachsen' : 'DE',
    'Bayern' : 'DE',
    'Mecklenburg-Vorpommern' : 'DE',
    'Sachsen-Anhalt' : 'DE',
    'Hessen' : 'DE',
    'Nordrhein-Westfalen' : 'DE',
    'Berlin' : 'DE',
    'Saarland' : 'DE',
    'Thüringen' : 'DE',
    'Sachsen' : 'DE',
    'Bremen' : 'DE',
    'Schleswig-Holstein' : 'DE',
    'Hamburg' : 'DE',
    'Rheinland-Pfalz' : 'DE',
    'Österreich' : 'AT',
    'Luxemburg' : 'LU',
    'Schweiz' : 'CH',
}
plantlist["country_code"] = plantlist["country_code"].replace(dict_state_country)

# Check if all states have been translated
for country_code in plantlist["country_code"].unique():
    if country_code not in dict_state_country.values() and str(country_code) != "nan":
        logger.error("Untranslated state: " + str(country_code))

## Add efficiency data

### Efficiencies from research

#### Import data

In [25]:
# Efficiencies
data_efficiencies_bnetza = pd.read_csv('inputs/input_efficiency_de.csv',
                                       sep=';',  # CSV field separator, default is ','
                                       decimal='.',  # Decimal separator, default is '.'
                                       encoding='cp1252')
data_efficiencies_bnetza = data_efficiencies_bnetza.set_index('id')
data_efficiencies_bnetza['efficiency_net'] = pd.to_numeric(data_efficiencies_bnetza['efficiency_net'],errors='coerce')
# Keep only entries with a usable net efficiency
data_efficiencies_bnetza = data_efficiencies_bnetza.dropna(subset=['efficiency_net'])

# Add the researched efficiencies to the plant list (left join on the BNetzA ID)
plantlist = pd.merge(plantlist, data_efficiencies_bnetza, left_index=True, right_index=True, how='left')
plantlist.head()

Unnamed: 0,company,name,postcode,city,street,state,block,commissioned,status,fuel_basis,...,retrofit_float,technology,type,country_code,efficiency_net,efficiency_gross,efficiency_comment,date,efficiency_source,source_type
BNA0005,Vattenfall Europe Generation AG,Ahrensfelde,16356,Ahrensfelde,Lindenberger Weg,Brandenburg,GT A,1990,operating,Erdgas,...,,GT,,DE,0.31,,,30.10.2015,https://www.ffe.de/download/berichte/Endberich...,A
BNA0006,Vattenfall Europe Generation AG,Ahrensfelde,16356,Ahrensfelde,Lindenberger Weg,Brandenburg,GT B,1990,operating,Erdgas,...,,GT,,DE,0.31,,,30.10.2015,https://www.ffe.de/download/berichte/Endberich...,A
BNA0007,Vattenfall Europe Generation AG,Ahrensfelde,16356,Ahrensfelde,Lindenberger Weg,Brandenburg,GT C,1990,operating,Erdgas,...,,GT,,DE,0.31,,,30.10.2015,https://www.ffe.de/download/berichte/Endberich...,A
BNA0008,Vattenfall Europe Generation AG,Ahrensfelde,16356,Ahrensfelde,Lindenberger Weg,Brandenburg,GT D,1990,operating,Erdgas,...,,GT,,DE,0.31,,,30.10.2015,https://www.ffe.de/download/berichte/Endberich...,A
BNA0010a,Rheinkraftwerk Albbruck- Dogern AG,RADAG,79774,Albbruck,Kraftwerkstrasse 34,Baden-Württemberg,,1933,operating,Laufwasser,...,,ROR,,DE,,,,,,


#### Plot efficiencies by year of commissioning

In [26]:
plantlist_for_efficiency_analysis = plantlist
plantlist_for_efficiency_analysis = plantlist_for_efficiency_analysis.dropna(subset=['efficiency_net'])
fuel_for_plot = ['lignite', 'coal', 'oil', 'gas']
plantlist_for_efficiency_analysis = plantlist_for_efficiency_analysis[plantlist_for_efficiency_analysis.fuel.isin(fuel_for_plot)]
plot_efficiency_type = Scatter(plantlist_for_efficiency_analysis, 
                              notebook=True, 
                              x='commissioned_float', 
                              y='efficiency_net',
                              color='fuel', 
                              title='Efficiency vs commissioning year', 
                              xlabel='Year', 
                              ylabel='Efficiency',
                              legend="top_left",
                              height=700,
                              width=1000,
                             )
show(plot_efficiency_type)


#### Determine least-squares approximation based on researched data (planned)

In [27]:
#import statsmodels.api as sm
#from statsmodels.formula.api import ols
#import matplotlib.pyplot as plt

#olslist = {}
#for fuelnames in plantlist["fuel"].unique():
#    plantlist_for_efficiency_analysis = plantlist[(plantlist.fuel==fuelnames) & (plantlist.efficiency_net.notnull()==True)]
#    if len(plantlist_for_efficiency_analysis.index)>=4:
#        efficiencyestimate = ols("efficiency_net  ~  commissioned_float + chp +uba_technology ", plantlist_for_efficiency_analysis).fit()
#        olslist[fuelnames]=efficiencyestimate
#        print(efficiencyestimate.summary())

        
#        fig, ax = plt.subplots()
#        fig = sm.graphics.plot_fit(efficiencyestimate, 'commissioned_float',  ax=ax)
#        plt.ylabel("Efficiency")
#        plt.xlabel("Commissioned")
#        plt.title(fuelnames)
#        plt.legend(['Data', 'Fitted model'], loc=2)
#        plt.show()

#### Apply efficiency approximation from least squares approximation (planned)

In [28]:
#Planned

### Efficiencies from literature

Jonas Egerer, Clemens Gerbaulet, Richard Ihlenburg, Friedrich Kunz, Benjamin Reinhard, Christian von Hirschhausen, Alexander Weber, Jens Weibezahn (2014): **Electricity Sector Data for Policy-Relevant Modeling: Data Documentation and Applications to the German and European Electricity Markets**. DIW Data Documentation 72, Berlin, Germany.

#### Import data

In [29]:
data_efficiencies_literature = pd.read_csv('inputs/input_efficiency_literature_by_fuel_technology.csv',
                                           sep=',',  # CSV field separator, default is ','
                                           decimal='.',  # Decimal separator, default is '.'
                                           encoding='utf8')
data_efficiencies_literature['technology'] = data_efficiencies_literature['technology'].str.upper()
data_efficiencies_literature = data_efficiencies_literature.set_index(['fuel','technology'])
data_efficiencies_literature

Unnamed: 0_level_0,Unnamed: 1_level_0,efficiency_intercept,efficiency_slope
fuel,technology,Unnamed: 2_level_1,Unnamed: 3_level_1
biomass,ST,0.38,0.0
coal,ST,-4.575,0.0025
gas,CB,-2.358,0.0014
gas,CC,-8.46,0.0045
gas,GT,-4.82,0.0026
gas,HP,0.95,0.0
gas,IC,0.38,0.0
gas,ST,-1.815,0.0011
lignite,ST,-4.4,0.0024
oil,CB,-2.358,0.0014


#### Apply efficiency approximation from literature

In [30]:
plantlist = plantlist.join(data_efficiencies_literature,on=['fuel','technology'])
plantlist['efficiency_literature'] = plantlist['efficiency_intercept']+plantlist['efficiency_slope']*plantlist[['commissioned_float','retrofit_float']].max(axis=1)
plantlist

Unnamed: 0,company,name,postcode,city,street,state,block,commissioned,status,fuel_basis,...,country_code,efficiency_net,efficiency_gross,efficiency_comment,date,efficiency_source,source_type,efficiency_intercept,efficiency_slope,efficiency_literature
BNA0005,Vattenfall Europe Generation AG,Ahrensfelde,16356,Ahrensfelde,Lindenberger Weg,Brandenburg,GT A,1990,operating,Erdgas,...,DE,0.3100,,,30.10.2015,https://www.ffe.de/download/berichte/Endberich...,A,-4.820,0.0026,0.3540
BNA0006,Vattenfall Europe Generation AG,Ahrensfelde,16356,Ahrensfelde,Lindenberger Weg,Brandenburg,GT B,1990,operating,Erdgas,...,DE,0.3100,,,30.10.2015,https://www.ffe.de/download/berichte/Endberich...,A,-4.820,0.0026,0.3540
BNA0007,Vattenfall Europe Generation AG,Ahrensfelde,16356,Ahrensfelde,Lindenberger Weg,Brandenburg,GT C,1990,operating,Erdgas,...,DE,0.3100,,,30.10.2015,https://www.ffe.de/download/berichte/Endberich...,A,-4.820,0.0026,0.3540
BNA0008,Vattenfall Europe Generation AG,Ahrensfelde,16356,Ahrensfelde,Lindenberger Weg,Brandenburg,GT D,1990,operating,Erdgas,...,DE,0.3100,,,30.10.2015,https://www.ffe.de/download/berichte/Endberich...,A,-4.820,0.0026,0.3540
BNA0010a,Rheinkraftwerk Albbruck- Dogern AG,RADAG,79774,Albbruck,Kraftwerkstrasse 34,Baden-Württemberg,,1933,operating,Laufwasser,...,DE,,,,,,,,,
BNA0010b,Rheinkraftwerk Albbruck- Dogern AG,WKW,79804,Dogern,Zollstrasse,Baden-Württemberg,,2009,operating,Laufwasser,...,DE,,,,,,,,,
BNA0011,Papierfabrik Albbruck GmbH,Papierfabrik Albbruck,79774,Albbruck,,Baden-Württemberg,,2009,shutdown,Mehrere Energieträger,...,DE,,,,,,,-4.575,0.0025,0.4475
BNA0012a,Sappi Alfeld GmbH,Werkskraftwerk Sappi Alfeld,31061,Alfeld,Mühlenmasch 1,Niedersachsen,Turbine 5,1988,operating,Biomasse,...,DE,,,,,,,0.380,0.0000,0.3800
BNA0012b,Sappi Alfeld GmbH,Werkskraftwerk Sappi Alfeld,31061,Alfeld,Mühlenmarsch 1,Niedersachsen,Gaskraftwerk,1947,operating,Erdgas,...,DE,,,,,,,-1.815,0.0011,0.3267
BNA0012c,Sappi Alfeld GmbH,Werkskraftwerk Sappi Alfeld,31061,Alfeld,Mühlenmarsch 1,Niedersachsen,Wasserturbine,1912,operating,Laufwasser,...,DE,,,,,,,,,
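
The literature estimate is a linear function of the year: `efficiency_literature = efficiency_intercept + efficiency_slope * year`, where the year is the later of the commissioning and retrofit years. A minimal sketch that reproduces values from the table above in plain Python; the function `estimate_efficiency` is a hypothetical helper, and the coefficients are copied from the literature table.

```python
# (fuel, technology) -> (intercept, slope), copied from the literature table above
coefficients = {
    ('gas', 'GT'): (-4.82, 0.0026),
    ('coal', 'ST'): (-4.575, 0.0025),
    ('biomass', 'ST'): (0.38, 0.0),
}

def estimate_efficiency(fuel, technology, commissioned, retrofit=None):
    """Linear estimate based on the later of commissioning and retrofit year."""
    key = (fuel, technology)
    if key not in coefficients:
        return None  # no literature value for this fuel/technology combination
    intercept, slope = coefficients[key]
    year = commissioned if retrofit is None else max(commissioned, retrofit)
    return intercept + slope * year

gas_turbine_1990 = estimate_efficiency('gas', 'GT', 1990)    # BNA0005: about 0.354
coal_2009 = estimate_efficiency('coal', 'ST', 2009)          # BNA0011: about 0.4475
run_of_river = estimate_efficiency('run_of_river', 'ROR', 1933)  # no coefficients

print(round(gas_turbine_1990, 4), round(coal_2009, 4), run_of_river)
```

Plants whose fuel and technology combination is missing from the literature table, such as run-of-river plants, receive no estimate; in the notebook they show up as empty cells after the join.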


## Add geodata

In [31]:
data_plant_locations = pd.read_csv('inputs/input_plant_locations_de.csv',
                                   sep=';',  # CSV field separator, default is ','
                                   decimal='.',  # Decimal separator, default is '.'
                                   encoding='cp1252')

data_plant_locations = data_plant_locations.set_index('id')

data_plant_locations['lat'] = pd.to_numeric(data_plant_locations['lat'], errors='coerce')
data_plant_locations['lon'] = pd.to_numeric(data_plant_locations['lon'], errors='coerce')

plantlist = pd.merge(plantlist, data_plant_locations, left_index=True, right_index=True, how='left')
plantlist.head()


Unnamed: 0,company,name,postcode,city,street,state,block,commissioned,status,fuel_basis,...,efficiency_comment,date,efficiency_source,source_type,efficiency_intercept,efficiency_slope,efficiency_literature,lat,lon,location_checked
BNA0005,Vattenfall Europe Generation AG,Ahrensfelde,16356,Ahrensfelde,Lindenberger Weg,Brandenburg,GT A,1990,operating,Erdgas,...,,30.10.2015,https://www.ffe.de/download/berichte/Endberich...,A,-4.82,0.0026,0.354,52.5895,13.558652,15.12.2015
BNA0006,Vattenfall Europe Generation AG,Ahrensfelde,16356,Ahrensfelde,Lindenberger Weg,Brandenburg,GT B,1990,operating,Erdgas,...,,30.10.2015,https://www.ffe.de/download/berichte/Endberich...,A,-4.82,0.0026,0.354,52.5895,13.558652,15.12.2015
BNA0007,Vattenfall Europe Generation AG,Ahrensfelde,16356,Ahrensfelde,Lindenberger Weg,Brandenburg,GT C,1990,operating,Erdgas,...,,30.10.2015,https://www.ffe.de/download/berichte/Endberich...,A,-4.82,0.0026,0.354,52.5895,13.558652,15.12.2015
BNA0008,Vattenfall Europe Generation AG,Ahrensfelde,16356,Ahrensfelde,Lindenberger Weg,Brandenburg,GT D,1990,operating,Erdgas,...,,30.10.2015,https://www.ffe.de/download/berichte/Endberich...,A,-4.82,0.0026,0.354,52.5895,13.558652,15.12.2015
BNA0010a,Rheinkraftwerk Albbruck- Dogern AG,RADAG,79774,Albbruck,Kraftwerkstrasse 34,Baden-Württemberg,,1933,operating,Laufwasser,...,,,,,,,,47.58629,8.13207,15.12.2015


# Define final output

In [32]:
# Merge uba_name_aggregate and uba_name
plantlist['uba_name_aggregate'] = plantlist['uba_name_aggregate'].fillna(plantlist['uba_name'])

# Drop columns not relevant for output
colsToDrop = ['bnetza_id',
              'capacity',
              'uba_name',
              'uba_capacity_original',
              'uba_chp_capacity_original',
              'uba_city', 
              'uba_commissioned', 
              'uba_company', 
              'uba_company_aggregate', 
              'uba_fuel', 
              'uba_postcode', 
              'uba_state', 
              'uba_technology', 
              'uba_technology_aggregate', 
              'retrofit',
              'uba_commissioned_1', 
              'uba_commissioned_2', 
              'uba_commissioned_3', 
              'commissioned_1', 
              'commissioned_2', 
              'commissioned_3', 
              'fuel_basis', 
              'fuel_multiple1', 
              'fuel_multiple2',
              'efficiency_gross',
              'efficiency_intercept',
              'efficiency_slope',
              'source_type',
              'date',
              'location_checked',
             ]
plantlist = plantlist.drop(colsToDrop, axis=1)

# Rename columns
plantlist = plantlist.rename(columns={'commissioned':'commissioned_original', 
                                      'commissioned_float':'commissioned', 
                                      'retrofit_float':'retrofit', 
                                      'capacity_float':'capacity',
                                      'uba_capacity':'capacity_uba', 
                                      'uba_chp_capacity':'chp_capacity_uba', 
                                      'efficiency_net':'efficiency_data', 
                                      'efficiency_literature':'efficiency_estimate', 
                                      'uba_name_aggregate':'name_uba'})

# Sort columns
columns_sorted = [
                 'country_code',
                 'company',
                 'name',
                 'postcode',
                 'city',
                 'street',
                 'state',
                 'block',
                 'commissioned_original',
                 'commissioned',
                 'retrofit',
                 'shutdown',
                 'status',
                 'fuel',
                 'technology',
                 'type',
                 'eeg',
                 'chp',
                 'capacity',
                 'capacity_uba',
                 'chp_capacity_uba',
                 'merge_comment',
                 'efficiency_data',
                 'efficiency_estimate',
                 'efficiency_source',
                 'network_node',
                 'voltage',
                 'network_operator',
                 'name_uba',
                 'lat',
                 'lon',
                 'comment']
plantlist = plantlist.reindex(columns=columns_sorted)

plantlist.head()


Unnamed: 0,country_code,company,name,postcode,city,street,state,block,commissioned_original,commissioned,...,efficiency_data,efficiency_estimate,efficiency_source,network_node,voltage,network_operator,name_uba,lat,lon,comment
BNA0005,DE,Vattenfall Europe Generation AG,Ahrensfelde,16356,Ahrensfelde,Lindenberger Weg,Brandenburg,GT A,1990,1990.0,...,0.31,0.354,https://www.ffe.de/download/berichte/Endberich...,Malchow,110,50 Hertz Transmission GmbH,Ahrensfelde A bis D,52.5895,13.558652,
BNA0006,DE,Vattenfall Europe Generation AG,Ahrensfelde,16356,Ahrensfelde,Lindenberger Weg,Brandenburg,GT B,1990,1990.0,...,0.31,0.354,https://www.ffe.de/download/berichte/Endberich...,Malchow,110,50 Hertz Transmission GmbH,Ahrensfelde A bis D,52.5895,13.558652,
BNA0007,DE,Vattenfall Europe Generation AG,Ahrensfelde,16356,Ahrensfelde,Lindenberger Weg,Brandenburg,GT C,1990,1990.0,...,0.31,0.354,https://www.ffe.de/download/berichte/Endberich...,Malchow,110,50 Hertz Transmission GmbH,Ahrensfelde A bis D,52.5895,13.558652,
BNA0008,DE,Vattenfall Europe Generation AG,Ahrensfelde,16356,Ahrensfelde,Lindenberger Weg,Brandenburg,GT D,1990,1990.0,...,0.31,0.354,https://www.ffe.de/download/berichte/Endberich...,Malchow,110,50 Hertz Transmission GmbH,Ahrensfelde A bis D,52.5895,13.558652,
BNA0010a,DE,Rheinkraftwerk Albbruck- Dogern AG,RADAG,79774,Albbruck,Kraftwerkstrasse 34,Baden-Württemberg,,1933,1933.0,...,,,,Tiengen,110,Amprion GmbH,Albbruck-Dogern,47.58629,8.13207,


# Documenting the data package (meta data)

We document the data package's metadata in JSON format, as proposed by the Open Knowledge Foundation. See the Frictionless Data project by OKFN (http://data.okfn.org/) and the Data Package specifications (http://dataprotocols.org/data-packages/) for more details.

In order to keep the notebook more readable, we first formulate the metadata in the human-readable YAML format using a multi-line string. We then parse the string into a Python dictionary and save that to disk as a JSON file.
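
The second half of this workflow, saving the dictionary as JSON, can be sketched with the standard library alone. The sketch below assumes the YAML string has already been parsed into a dictionary (for example with PyYAML's `yaml.safe_load`); `metadata_dict` is a hand-written stand-in with only a few keys, and the file name `datapackage.json` is assumed here as the usual name of a Data Package descriptor.

```python
import json
import os
import tempfile

# Stand-in for the dictionary obtained by parsing the YAML string (hypothetical subset)
metadata_dict = {
    'name': 'opsd-power-plants-germany',
    'version': '2016-04-14',
    'resources': [{'path': 'power_plants_germany.csv', 'format': 'csv'}],
}

# Write the descriptor; a temporary directory keeps the sketch free of side effects
output_dir = tempfile.mkdtemp()
descriptor_path = os.path.join(output_dir, 'datapackage.json')
with open(descriptor_path, 'w', encoding='utf-8') as f:
    json.dump(metadata_dict, f, indent=4, ensure_ascii=False)

# Read the file back to confirm that nothing was lost in the round trip
with open(descriptor_path, encoding='utf-8') as f:
    restored = json.load(f)
```

Keeping `version` as a quoted string matters: an unquoted YAML date would typically be parsed into a date object, which `json` cannot serialize without a custom encoder.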

In [33]:
# Here we define meta data of the resulting data package.
# The meta data follows the specification at:
# http://dataprotocols.org/data-packages/

metadata = """

name: opsd-power-plants-germany
title: List of power plants in Germany.
description: This dataset contains an augmented and corrected power plant list based on the power plant list provided by the BNetzA.
version: "2016-04-14"
keywords: [power plants,germany]
geographical-scope: Germany
opsd-changes-to-last-version: Fix errors, add column country_code

resources:
    - path: power_plants_germany.csv
      format: csv
      mediatype: text/csv
      schema:  # Schema according to: http://dataprotocols.org/json-table-schema/        
        fields:
            - name: id
              description: Power plant ID based on the ID provided in the BNetzA-list.
              type: string
            - name: country_code
              description: Two-letter ISO code
              type: string  
            - name: company
              description: Company name
              type: string
            - name: name
              description: Power plant name
              type: string
              format: default
            - name: postcode
              description: Postcode
              type: string
              format: default
            - name: city
              description: City
              type: string
              format: default
            - name: street
              description: Street
              type: string
              format: default
            - name: state
              description: State
              type: string
              format: default
            - name: block
              description: Power plant block 
              type: string
              format: default
            - name: commissioned_original
              description: Year of commissioning (raw data)
              type: string
              format: default
            - name: commissioned
              description: Year of commissioning 
              type: integer
              format: default
            - name: retrofit
              description: Year of modernization according to UBA data
              type: integer
              format: default
            - name: shutdown
              description: Year of decommissioning
              type: integer
              format: default
            - name: status
              description: Power plant status
              type: string
              format: default
            - name: fuel
              description: Used fuel or energy source
              type: string
              format: default
            - name: technology
              description: Power plant technology or kind
              type: string
              format: default
            - name: type
              description: Purpose of the produced power
              type: string
              format: default
            - name: eeg
              description: Status of being entitled to a remuneration
              type: boolean
              format: default
            - name: chp
              description: Status of being able to supply heat
              type: boolean
              format: default
            - name: capacity
              description: Power capacity
              type: number
              format: default
            - name: capacity_uba
              description: Power capacity according to UBA data
              type: number
              format: default
            - name: chp_capacity_uba
              description: Heat capacity according to UBA data
              type: number
              format: default
            - name: merge_comment
              description: Comment on BNetzA - UBA merge
              type: string
              format: default              
            - name: efficiency_data
              description: Ratio of power output to energy input
              type: number
              format: default
            - name: efficiency_estimate
              description: Estimated ratio of power output to energy input
              type: number
              format: default
            - name: efficiency_source
              description: Source of efficiency data
              type: string
              format: default
            - name: network_node
              description: Connection point to the electricity grid 
              type: string
              format: default
            - name: voltage
              description: Grid or transformation level of the network node
              type: string
              format: default
            - name: network_operator
              description: Network operator of the grid or transformation level
              type: string
              format: default
            - name: name_uba
              description: Power plant name according to UBA data
              type: string
              format: default
            - name: lat
              description: Precise geographic coordinates - latitude
              type: number
              format: default
            - name: lon
              description: Precise geographic coordinates - longitude
              type: number
              format: default
            - name: comment
              description: Further comments
              type: string
              format: default
    - path: power_plants_germany.xlsx
      format: xlsx
      mediatype: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet    

licenses:
    - url: http://example.com/license/url/here
      name: License Name Here
      version: 1.0
      id: license-id-from-open

sources:
    - name: BNetzA Kraftwerksliste
      web: http://www.bundesnetzagentur.de/DE/Sachgebiete/ElektrizitaetundGas/Unternehmen_Institutionen/Versorgungssicherheit/Erzeugungskapazitaeten/Kraftwerksliste/kraftwerksliste-node.html
    - name: Umweltbundesamt Datenbank Kraftwerke in Deutschland
      web: http://www.umweltbundesamt.de/dokument/datenbank-kraftwerke-in-deutschland

maintainers:
    - name: Clemens Gerbaulet
      email: cfg@wip.tu-berlin.de
      web: http://open-power-system-data.org/

views:
    # You can put hints here about which kinds of graphs or maps make sense to display your data. This makes the
    # Data Package Viewer at http://data.okfn.org/tools/view automatically display visualizations of your data.
    # See http://data.okfn.org/doc/data-package#views for more details.    

# extend your datapackage.json with attributes that are not
# part of the data package spec
# you can add your own attributes to a datapackage.json, too

openpowersystemdata-enable-listing: True  # This is just an example we don't actually make use of yet.

opsd-jupyter-notebook-url: https://github.com/Open-Power-System-Data/datapackage_power_plants/blob/master/Power_Plants_DE.ipynb

"""

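# safe_load parses plain YAML without constructing arbitrary Python objects;
# recent PyYAML versions also reject yaml.load() without an explicit Loader.
metadata = yaml.safe_load(metadata)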

datapackage_json = json.dumps(metadata, indent=4, separators=(',', ': '))
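The metadata above lists one schema field per output column, so the two can drift apart when columns are added or renamed. The following is a minimal sketch of a consistency check, not part of the original notebook. It assumes the parsed metadata follows the `resources` → `schema` → `fields` nesting; the helper names `schema_field_names` and `compare_fields` and the shortened example data are hypothetical.

```python
import json


def schema_field_names(metadata):
    """Collect the field names of every resource that declares a schema."""
    names = []
    for resource in metadata.get("resources") or []:
        schema = resource.get("schema") or {}
        for field in schema.get("fields") or []:
            names.append(field["name"])
    return names


def compare_fields(metadata, columns):
    """Return (declared but not in data, in data but not declared)."""
    declared = set(schema_field_names(metadata))
    actual = set(columns)
    return sorted(declared - actual), sorted(actual - declared)


# Hypothetical, shortened stand-in for the parsed metadata dictionary
example_metadata = {
    "resources": [
        {
            "path": "power_plants_germany.csv",
            "schema": {
                "fields": [
                    {"name": "capacity", "type": "number"},
                    {"name": "fuel", "type": "string"},
                    {"name": "lat", "type": "number"},
                ]
            },
        },
        # Resources without a schema (such as the xlsx file) are skipped
        {"path": "power_plants_germany.xlsx", "format": "xlsx"},
    ]
}
# Hypothetical column list standing in for the DataFrame columns
example_columns = ["capacity", "fuel", "lon"]

missing_in_data, missing_in_metadata = compare_fields(example_metadata, example_columns)
print(json.dumps(
    {"missing_in_data": missing_in_data, "missing_in_metadata": missing_in_metadata},
    indent=4,
))
```

In the notebook, such a check could be called with the parsed `metadata` and `plantlist.reset_index().columns`, since the index column is written to the CSV as well.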

# Write the results to file

Write the outputs

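output_path = 'output/datapackage_powerplants_germany/'
output_path2 = 'output/datapackage_powerplants_germany'

# Create the output directory if it does not exist yet, so the writes below do not fail
if not os.path.isdir(output_path2):
    os.makedirs(output_path2)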

# Write the results to a CSV file
plantlist.to_csv(output_path+'power_plants_germany.csv', encoding='utf-8')

# Write the results to an Excel file
plantlist.to_excel(output_path+'power_plants_germany.xlsx', sheet_name='output')

# Write the results to an SQLite database.
# The table name is a plain identifier; only the database file name carries the output path.
conn = sqlite3.connect(output_path+'power_plants_germany.sqlite')
plantlist.to_sql('power_plants_germany', conn, if_exists="replace")
conn.close()

# Write the metadata to datapackage.json
with open(os.path.join(output_path, 'datapackage.json'), 'w') as f:
    f.write(datapackage_json)

# Set this string to this notebook's filename!
nb_filename = 'download_and_process.ipynb'

# Save a copy of the notebook as markdown, to serve as the package README file.
# `ipython nbconvert` is deprecated; `jupyter nbconvert` is the current entry point.
subprocess.call(['jupyter', 'nbconvert', '--to', 'markdown', nb_filename])
path_readme = os.path.join(output_path2, 'README.md')
# Remove a stale README if one exists, then move the converted file into place
try:
    os.remove(path_readme)
except OSError:
    pass
os.rename(nb_filename.replace('.ipynb', '.md'), path_readme)
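After writing the outputs, it can be useful to confirm that the SQLite file actually contains the expected table and rows. The sketch below is an illustration rather than part of the original workflow: the helper `table_row_counts` is hypothetical, and the demo runs against a throwaway database with made-up rows instead of the real output file.

```python
import os
import sqlite3
import tempfile


def table_row_counts(db_path):
    """Return {table_name: row_count} for every table in an SQLite file."""
    conn = sqlite3.connect(db_path)
    try:
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'")]
        counts = {}
        for name in names:
            # Quote the identifier so that unusual table names (for example
            # ones containing slashes) can still be queried
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(
                "SELECT COUNT(*) FROM " + quoted).fetchone()[0]
        return counts
    finally:
        conn.close()


# Demo on a temporary database with two made-up rows
demo_dir = tempfile.mkdtemp()
demo_db = os.path.join(demo_dir, "demo.sqlite")
demo_conn = sqlite3.connect(demo_db)
demo_conn.execute('CREATE TABLE "power_plants_germany" (id TEXT, capacity REAL)')
demo_conn.executemany(
    'INSERT INTO "power_plants_germany" VALUES (?, ?)',
    [("A", 10.0), ("B", 20.5)],
)
demo_conn.commit()
demo_conn.close()

demo_counts = table_row_counts(demo_db)
print(demo_counts)
```

Pointing the helper at the `.sqlite` file in the output directory would list whatever tables were written there, which makes an unexpected table name easy to spot.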

