#### Overall Information Dataset EnergiNet (Energy Data DK)
This dataset originates from EnergiNet’s “Regulating and Balance Power, Overall Data,” available at EnergiNet Data Service. It contains key metrics from the energy market related to regulating and balancing power, with data spanning from 2021 to 2023.

The dataset includes the following fields:

1. Hour UTC & DK: Timestamp fields in UTC and Danish time, capturing full hours. Formats vary for CSV, XLSX, JSON, or API.
2. Price Area: Defines energy zones (e.g., DK1, DK2, SE, NO).
3. mFRR Balances: Records mFRR activations for upward, downward, and special balancing in MWh.
4. Imbalance MWh & Prices: Captures system imbalance in MWh and related prices in EUR and DKK per MWh.
5. Balancing Power Prices: Up/down regulation prices in EUR and DKK. DKK values are calculated from exchange rates.

Formatting varies by data type, and all fields follow energy market standards.
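
The relationship between the UTC and Danish-time hour fields can be illustrated with a minimal sketch. It assumes pandas and uses made-up timestamps around the 2022 autumn clock change; the variable names are illustrative and not part of the dataset. It suggests why joining on naive local time needs care: one local hour appears twice once the time zone information is dropped.

```python
import pandas as pd

# Hypothetical UTC timestamps spanning the autumn 2022 clock change (assumption)
hour_utc = pd.Series(pd.date_range("2022-10-29 23:00", periods=4, freq="h", tz="UTC"))

# Convert to Danish local time; the CET/CEST switch is handled by the tz database
hour_dk = hour_utc.dt.tz_convert("Europe/Copenhagen")

# Dropping the zone information yields naive local timestamps
hour_dk_naive = hour_dk.dt.tz_localize(None)

# Count local hours that occur more than once after dropping the zone
n_repeated = int(hour_dk_naive.duplicated().sum())
print(hour_dk_naive.tolist(), n_repeated)
```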


#### EnergiNet
Timespan: 2022-01-01 to 2022-12-31  
Tariff zone: DK2, where Bornholm is located.  
Prices: balancing prices, in euros.  
No additional research to fill data gaps was necessary. Negative prices are neglected (set to zero).
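
A minimal sketch of neglecting negative prices by setting them to zero. It assumes a DataFrame that already carries the renamed EUR price columns; the values are invented for illustration, and `clip` is one possible way to do this rather than the only one.

```python
import pandas as pd

# Invented sample values (assumption), using the renamed price columns
prices = pd.DataFrame({
    "BalancingPriceUpEUR": [46.6, -5.0, 0.0],
    "BalancingPriceDownEUR": [46.6, -12.3, -0.5],
})

price_cols = ["BalancingPriceUpEUR", "BalancingPriceDownEUR"]

# Replace every negative price with zero; non-negative values are unchanged
prices[price_cols] = prices[price_cols].clip(lower=0)
print(prices)
```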

#### Nordpool
Timespan: 2022-01-01 to 2022-12-31  
Day-ahead prices; negative ones are neglected (set to zero).  
Three days were missing and were replaced with data from the ENTSO-E Transparency Platform (details at the link below).

Link ENTSOE: https://transparency.entsoe.eu/transmission-domain/r2/dayAheadPrices/show?name=&defaultValue=false&viewType=TABLE&areaType=BZN&atch=false&dateTime.dateTime=30.10.2022+00:00|CET|DAY&biddingZone.values=CTY|10Y1001A1001A65H!BZN|10YDK-2--------M&resolution.values=PT60M&dateTime.timezone=CET_CEST&dateTime.timezone_input=CET+(UTC+1)+/+CEST+(UTC+2)
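
The gap-filling step described above could be sketched as follows. This is only an illustration under stated assumptions: the helper names `find_missing_hours` and `fill_from_fallback` are hypothetical, the sample values are invented, and the fallback frame stands in for data exported manually from the ENTSO-E platform. Note that the check only detects gaps between the first and last timestamp present.

```python
import pandas as pd

def find_missing_hours(df, time_col="HourDK"):
    """Return the hourly timestamps absent between the first and last entry."""
    full_range = pd.date_range(df[time_col].min(), df[time_col].max(), freq="h")
    return full_range.difference(pd.DatetimeIndex(df[time_col]))

def fill_from_fallback(df, fallback, time_col="HourDK"):
    """Append fallback rows for the missing hours and restore time order."""
    missing = find_missing_hours(df, time_col)
    patch = fallback[fallback[time_col].isin(missing)]
    combined = pd.concat([df, patch], ignore_index=True)
    return combined.sort_values(time_col).reset_index(drop=True)

# Invented example: the 02:00 hour is missing from the primary source
primary = pd.DataFrame({
    "HourDK": pd.to_datetime(["2022-01-01 00:00", "2022-01-01 01:00", "2022-01-01 03:00"]),
    "DA_PriceEUR": [46.60, 41.33, 44.37],
})
fallback = pd.DataFrame({
    "HourDK": pd.to_datetime(["2022-01-01 02:00"]),
    "DA_PriceEUR": [42.18],
})

missing = find_missing_hours(primary)
filled = fill_from_fallback(primary, fallback)
print(missing, filled, sep="\n")
```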



In [4]:
import pandas as pd
import numpy as np



class DataManagement():
    
    
    def __init__(self, filename, Area, dropColumns):
        self.filename = filename
        if(Area=='DK2'):
            #GitHub-Path
            self.df = pd.read_csv("../Data assignment 1/"+str(filename)+".csv", delimiter=',')
            #self.df = pd.read_csv("./data/raw/"+str(filename)+".csv", delimiter=',')
        else:
            #GitHub-Path
            self.df = pd.read_csv("../Data assignment 1/"+str(filename)+".csv", delimiter=';')
            #self.df = pd.read_csv("./data/raw/"+str(filename)+".csv", delimiter=';')
        
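        # Replace decimal commas with points, but only in text columns:
        # the .str accessor raises an AttributeError on numeric columns
        for column_name in self.df.select_dtypes(include=['object', 'string']).columns:
            self.df[column_name] = self.df[column_name].str.replace(',', '.')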
        
        
        if(Area == 'DK2'):
            # Copy the filtered rows so later in-place edits act on an independent frame
            self.df = self.df[self.df["PriceArea | PriceArea | 804696"]==Area].copy()
            self.cleanDataSetEnergiNet(dropColumns)
        elif(Area == 'Nordpool'):
            self.cleanNordpool(dropColumns)
        
        
        
        
    def cleanDataSetEnergiNet(self, dropColumns):
        self.df.rename(columns={'HourDK | HourDK | 804695':'HourDK'}, inplace=True)
        self.df['HourDK'] = pd.to_datetime(self.df['HourDK'])
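        # Keep only 2022; copy to avoid SettingWithCopyWarning on the in-place edits below
        in_2022 = (self.df['HourDK'] >= '2022-01-01 00:00:00') & (self.df['HourDK'] < '2023-01-01 00:00:00')
        self.df = self.df[in_2022].copy()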
        self.df.drop(dropColumns,axis=1,inplace=True)
        self.df.reset_index(drop=True, inplace=True)
        self.df.index = self.df.index + 1
        
        self.df.columns = self.df.columns.str.strip()
        self.df.rename(columns={'BalancingPowerPriceUpEUR | BalancingPowerPriceUpEUR | 804718':'BalancingPriceUpEUR', 'BalancingPowerPriceDownEUR | BalancingPowerPriceDownEUR | 804720':'BalancingPriceDownEUR'}, inplace=True)
        
        self.df['BalancingPriceUpEUR'] = pd.to_numeric(self.df['BalancingPriceUpEUR'])
        self.df['BalancingPriceDownEUR'] = pd.to_numeric(self.df['BalancingPriceDownEUR'])
        self.df.loc[self.df['BalancingPriceUpEUR'] < 0, 'BalancingPriceUpEUR'] = 0
        self.df.loc[self.df['BalancingPriceDownEUR'] < 0, 'BalancingPriceDownEUR'] = 0
        
        
        #self.df.to_csv("./data/processed/BalancingPrices.csv")
        
    def cleanNordpool(self, dropColumns):
        self.df.drop(dropColumns,axis=1,inplace=True)
        self.df.rename(columns={'ts':'HourDK'}, inplace=True)
        self.df['HourDK'] = pd.to_datetime(self.df['HourDK'])
        self.df.rename(columns={'Nordpool Elspot Prices - hourly price DK-DK2 EUR/MWh | 9F7J/00/00/Nordpool/DK2/hourly_spot_eur | 3038':'DA_PriceEUR'}, inplace=True)
        self.df['DA_PriceEUR'] = pd.to_numeric(self.df['DA_PriceEUR'])
        self.df.loc[self.df['DA_PriceEUR'] < 0, 'DA_PriceEUR'] = 0
        
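        # Record the hours missing from the series so gaps can be inspected after loading
        full_range = pd.date_range(start=self.df['HourDK'].min(), end=self.df['HourDK'].max(), freq='h')
        self.missing_timestamps = full_range.difference(pd.DatetimeIndex(self.df['HourDK']))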
        
        
        #self.df.to_csv("./data/processed/DA_Prices.csv")
        
    
    
        
        

In [6]:

dropCol = ['ts', 
           'HourUTC | HourUTC | 804694', 
           'ImbalancePriceDKK | ImbalancePriceDKK | 804717',
           'BalancingPowerPriceUpDKK | BalancingPowerPriceUpDKK | 804719',
           'BalancingPowerPriceDownDKK | BalancingPowerPriceDownDKK | 804721',
           'mFRRDownActBal | mFRRDownActBal | 804724',
           'PriceArea | PriceArea | 804696',
           'mFRRUpActSpec | mFRRUpActSpec | 804713',
           'mFRRDownActSpec | mFRRDownActSpec | 804714',
           'mFRRUpActBal | mFRRUpActBal | 804722',
           'ImbalanceMWh | ImbalanceMWh | 804715',
           'ImbalancePriceEUR | ImbalancePriceEUR | 804716'
           ]
EnergiNet = DataManagement("EnergiNet_2022","DK2",dropCol)

dropCol = np.array([
    'Nordpool Elspot Prices - daily average price DK-DK2 DKK/MWh | 9F7J/00/00/Nordpool/DK2/daily_average_spot_dkk | 4821',
    'Nordpool Elspot Prices - daily average price DK-DK2 EUR/MWh | 9F7J/00/00/Nordpool/DK2/daily_average_spot_eur | 4822',
    'Nordpool Elspot Prices - hourly price DK-DK2 DKK/MWh | 9F7J/00/00/Nordpool/DK2/hourly_spot_dkk | 4820'
])
Nordpool = DataManagement("Nordpool_2022","Nordpool",dropCol)

prices_merge_df = pd.merge(EnergiNet.df, Nordpool.df, how='outer', on="HourDK")

prices_merge_df = prices_merge_df[prices_merge_df['HourDK']>=pd.to_datetime('2022-01-01 00:00:00')]

#GitHub-Path
prices_merge_df.to_csv('../Data assignment 1/prices_merged_df_output.csv', index=False)
#prices_merge_df.to_csv('./data/processed/prices_merged_df_output.csv', index=False)
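
An outer merge on `HourDK` keeps hours that exist in only one source, leaving the other source's columns empty. A minimal sketch of how such rows might be spotted, assuming two small invented frames with the same column names as above; the `indicator=True` option of `pd.merge` adds a `_merge` column naming the origin of each row.

```python
import pandas as pd

# Invented sample frames (assumption): the day-ahead series lacks the 02:00 hour
balancing = pd.DataFrame({
    "HourDK": pd.to_datetime(["2022-01-01 00:00", "2022-01-01 01:00", "2022-01-01 02:00"]),
    "BalancingPriceUpEUR": [46.60, 41.33, 42.18],
})
day_ahead = pd.DataFrame({
    "HourDK": pd.to_datetime(["2022-01-01 00:00", "2022-01-01 01:00"]),
    "DA_PriceEUR": [46.60, 41.33],
})

# indicator=True adds a "_merge" column: "both", "left_only" or "right_only"
checked = pd.merge(balancing, day_ahead, how="outer", on="HourDK", indicator=True)

# Rows present in only one of the two sources
unmatched = checked[checked["_merge"] != "both"]
print(unmatched)
```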




In [7]:
prices_merge_df

Unnamed: 0,HourDK,BalancingPriceUpEUR,BalancingPriceDownEUR,DA_PriceEUR
24,2022-01-01 00:00:00,46.599880,46.599880,46.60
25,2022-01-01 01:00:00,41.329926,41.329926,41.33
26,2022-01-01 02:00:00,42.179790,42.179790,42.18
27,2022-01-01 03:00:00,44.370335,44.370335,44.37
28,2022-01-01 04:00:00,37.669601,37.669601,37.67
...,...,...,...,...
8779,2022-12-31 19:00:00,15.080000,15.080000,15.08
8780,2022-12-31 20:00:00,11.570000,1.340000,11.57
8781,2022-12-31 21:00:00,14.890000,0.000000,14.89
8782,2022-12-31 22:00:00,9.940000,0.000000,9.94
