# Electriticy Generation in Denmark by Source

> **Note the following:** 
> 1. This is *not* meant to be an example of an actual **data analysis project**, just an example of how to structure such a project.
> 1. Remember the general advice on structuring and commenting your code
> 1. The `dataproject.py` file includes a function which can be used multiple times in this notebook.

Imports and set magics:

In [10]:
!pip install requests



In [12]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import ipywidgets as widgets
import requests

# autoreload modules when code is run
%load_ext autoreload
%autoreload 2


The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


# Read and clean data

Import your data, either through an API or manually, and load it. 

In [13]:
#Import data
import dataproject as datp

## Explore each data set

In order to be able to **explore the raw data**, you may provide **static** and **interactive plots** to show important developments 

In [14]:
df = datp.df

In [15]:
#Check for duplicates
mylist = df['HourDK'].values.tolist()

dup = {x for x in mylist if mylist.count(x) > 2}
print(dup)
#To count the number of list elements that were duplicated, you can run
print(len(dup))

{'2022-10-30T02:00:00'}
1


In [17]:
# Determine the indicies
df.index[df['HourDK'] == '2022-10-30T02:00:00']

# Delete two rows
try: 
    df = df.drop(labels=[3018, 3019], axis=0)
except:
  print("Already removed")


Already removed


In [22]:
# We create a sub dataframe
df2 = df[['HourDK', 'PriceArea', 'OffshoreWindLt100MW_MWh', 'OffshoreWindGe100MW_MWh', 'OnshoreWindLt50kW_MWh', 'OnshoreWindGe50kW_MWh', 'HydroPowerMWh', 'SolarPowerLt10kW_MWh', 'SolarPowerGe10Lt40kW_MWh', 'SolarPowerGe40kW_MWh', 'SolarPowerSelfConMWh', 'UnknownProdMWh', 'GrossConsumptionMWh']]

#Create new variables
df2['Offshore_MWh']=df2['OffshoreWindLt100MW_MWh']+df2['OffshoreWindLt100MW_MWh']
df2['Onshore_MWh']=df2['OnshoreWindLt50kW_MWh']+df2['OnshoreWindGe50kW_MWh']
df2['Solar_MWh']=df2['SolarPowerLt10kW_MWh']+df2['SolarPowerGe10Lt40kW_MWh']+df2['SolarPowerGe40kW_MWh']+df2['SolarPowerSelfConMWh']

#Group DK1 and DK2 to get total generation and consumption for DK
df2 = df2.groupby([df2['HourDK']]).sum()

df2

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df2['Offshore_MWh']=df2['OffshoreWindLt100MW_MWh']+df2['OffshoreWindLt100MW_MWh']
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df2['Onshore_MWh']=df2['OnshoreWindLt50kW_MWh']+df2['OnshoreWindGe50kW_MWh']
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df2['Solar_MWh']=df2['SolarPowerLt10kW_MWh']+d

Unnamed: 0_level_0,OffshoreWindLt100MW_MWh,OffshoreWindGe100MW_MWh,OnshoreWindLt50kW_MWh,OnshoreWindGe50kW_MWh,HydroPowerMWh,SolarPowerLt10kW_MWh,SolarPowerGe10Lt40kW_MWh,SolarPowerGe40kW_MWh,SolarPowerSelfConMWh,UnknownProdMWh,GrossConsumptionMWh,Offshore_MWh,Onshore_MWh,Solar_MWh
HourDK,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1
2022-01-01T00:00:00,144.484727,1747.386902,4.813434,1273.360534,1.637731,0.056808,0.009113,0.031290,0.0,5.171680,3611.806640,288.969454,1278.173968,0.097211
2022-01-01T01:00:00,150.110846,1751.599793,4.744580,1340.005585,1.641966,0.054143,0.009111,0.027780,0.0,4.867611,3600.315186,300.221692,1344.750165,0.091034
2022-01-01T02:00:00,130.517198,1796.572266,4.011871,1224.645172,1.637933,0.051622,0.008281,0.028540,0.0,5.019980,3490.401734,261.034396,1228.657043,0.088443
2022-01-01T03:00:00,136.758549,1628.485839,3.190033,1049.801178,1.639639,0.056474,0.008878,0.028670,0.0,4.862911,3316.040405,273.517098,1052.991211,0.094022
2022-01-01T04:00:00,141.342327,1317.700196,2.662985,1026.876068,1.634412,0.048473,0.009171,0.028560,0.0,3.409950,3311.823486,282.684654,1029.539053,0.086204
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2022-12-31T19:00:00,42.523418,827.019367,2.018027,722.054291,2.873119,0.170019,0.019504,0.040236,0.0,6.819271,4228.176148,85.046836,724.072318,0.229759
2022-12-31T20:00:00,28.448331,766.496630,1.441737,598.075409,2.875437,0.142202,0.015289,0.039440,0.0,6.603209,4090.847900,56.896662,599.517146,0.196931
2022-12-31T21:00:00,15.186549,862.758726,1.235196,498.366211,2.872020,0.117390,0.013630,0.034980,0.0,6.619489,4113.606934,30.373098,499.601407,0.166000
2022-12-31T22:00:00,13.415196,820.323399,1.323105,452.326248,2.870221,0.118744,0.009642,0.033489,0.0,6.300558,4028.385132,26.830392,453.649353,0.161875


# Analysis

In [28]:
sumstat = df2.agg(
    {
        "GrossConsumptionMWh": ["count", "min", "max", "mean"],
        "Offshore_MWh": ["count", "min", "max", "mean"],
        "Onshore_MWh": ["count", "min", "max", "mean"],
        "Solar_MWh": ["count", "min", "max", "mean"],
        "HydroPowerMWh": ["count", "min", "max", "mean"],
    
    }
)

sumstat.transpose()

Unnamed: 0,count,min,max,mean
GrossConsumptionMWh,17518.0,920.682495,4010.411865,2025.40888
Offshore_MWh,17518.0,0.0,491.13205,113.705805
Onshore_MWh,17518.0,0.004884,3443.214388,586.860044
Solar_MWh,17518.0,0.016974,1235.489007,125.731658
HydroPowerMWh,17518.0,0.0,4.823965,0.853172


We make a descriptive analysis of the danish energy production of wind-, solar-, and waterpower. This is shown in the tabular above.
we have grouped the data so that the data sums up all producers of wind, solar and water energy in their own categories. Thus, it is not clear whether it is a large or small producer thereby there is a large variation in the production in the observations. 

We find that the on shore wind parks are the largest contributors of energy and hydropower is the smallest contributor of green energy.


# Conclusion

ADD CONCISE CONLUSION.