# Lecture 4: Seasonal Storage Notebook

This notebook explores the large-scale integration of renewable sources into our energy system, the resulting challenges for balancing supply and demand, and how seasonal storage might help.

In [None]:
import numpy as np
import pandas as pd
import plotly.graph_objects as go

In [None]:
pd.options.plotting.backend = "plotly"
template = "plotly_white"
# template = "plotly_dark"

## Data
We again use the [Open Power System Dataset](https://data.open-power-system-data.org/time_series) for our toy model. First we load the dataset and keep only the power demand and renewable generation profiles for Germany (at the federal level).

In [None]:
data = pd.read_csv("../data/time_series_60min_singleindex_filtered.csv", index_col=0, parse_dates=True)
data.index = data.index.tz_convert("Europe/Berlin") # convert timestamps from UTC to local time

In [None]:
# function to filter and clean the dataset
def tso_dataframe(profile, tso):
    wind_columns = [
        f"{tso}_wind_onshore_generation_actual",
        f"{tso}_wind_offshore_generation_actual",
    ]
    wind_columns = profile.columns.intersection(wind_columns)   # some TSO regions have no offshore wind
    df_wind = profile[wind_columns].sum(axis=1)
    df_wind = df_wind.rename("wind")

    df_solar = profile[f"{tso}_solar_generation_actual"]
    df_solar = df_solar.rename("solar")

    df_load = profile[f"{tso}_load_actual_entsoe_transparency"]
    df_load = df_load.rename("load")

    df = pd.concat([df_load, df_solar, df_wind], axis=1) # join load, solar and wind data
    return df

In [None]:
df_DE = tso_dataframe(data, "DE")

In [None]:
df_DE.head()

## Residual load with increasing renewable penetration

To achieve its decarbonization goals, Germany, like many countries around the world, plans to keep integrating renewable power sources into the grid. While roughly a third of today's electricity consumption is covered by solar and wind, [the German government is targeting](https://www.iea.org/reports/germany-2020) 65% of electricity from renewable sources by 2030 and over 80% by 2050. In the long term, a 100% share would be desirable.

Let's explore how increasing today's installed renewable capacity affects the balance of energy supply and demand. Assuming a self-sufficient system (no imports or exports), we try to answer: by what factor must renewables grow compared with today?

### Residual load

Just as in the previous notebook, let's take a look at the residual load $P_{res}$ for Germany in 2019. This gives us insight into how far renewable technologies are currently integrated into our system. By 2019, Germany had installed around 30 GW of solar PV and about 45 GW of wind power, while its peak load reached over 76 GW.
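
Written out, the residual load is the demand that remains after subtracting solar and wind generation. The symbols $P_{load}$, $P_{solar}$ and $P_{wind}$ are notation assumed here for the `load`, `solar` and `wind` columns of the dataframe:

$$P_{res}(t) = P_{load}(t) - P_{solar}(t) - P_{wind}(t)$$

A positive $P_{res}$ is demand that other sources must cover; a negative $P_{res}$ means that renewable generation exceeds the demand in that hour.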

In [None]:
max_load = df_DE["load"].max() / 1e3
max_solar = df_DE["solar"].max() / 1e3
max_wind = df_DE["wind"].max() / 1e3
share = (df_DE["solar"].sum() + df_DE["wind"].sum()) / df_DE["load"].sum() * 100

print("Germany 2019:")
print("-------------")
print(f"Peak load:                {max_load:.2f} GW")
print(f"Peak solar generation:    {max_solar:.2f} GW") # maximum of the generation profile, not the installed capacity
print(f"Peak wind generation:     {max_wind:.2f} GW")
print(f"Share of renewables:      {share:.2f} %") # Share of renewables in gross electricity consumption

In [None]:
def residual(df):
    df["residual"] = df["load"] - df["solar"] - df["wind"]
    return df

In [None]:
def plot_residual_curves(df, **kwargs):
    fig = go.Figure()
    fig.update_layout(xaxis_title="Time [h]", yaxis_title="Power [MW]", **kwargs)
    fig.add_trace(go.Scatter(x=np.arange(8760), y=np.zeros(8760), line={"color": "grey", "dash": "dash"}, opacity=0.7, showlegend=False, name=0))
    for name, residual in df.items():
        aggregated_residual = residual.sort_values(ascending=False).values
        time_hours = np.arange(aggregated_residual.size)
        fig.add_trace(go.Scatter(x=time_hours, y=aggregated_residual, name=name))
    return fig


In [None]:
df_DE = residual(df_DE)

In [None]:
plot_residual_curves(df_DE[["residual"]], title="Residual load duration curve (Germany, 2019)", width=700, height=600, template=template)

### Residual load with increasing renewables

Now let's quantify how the residual load shifts under different expansion scenarios.
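
As a minimal sketch of the idea, the toy example below scales a combined renewable profile by a factor and recomputes the residual load. The three-hour profiles and the names `load`, `renewables` and `residuals` are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical three-hour profiles in MW (not from the dataset)
load = pd.Series([60.0, 70.0, 50.0])
renewables = pd.Series([10.0, 30.0, 20.0])  # solar + wind combined

# Residual load for each scale factor: load minus the scaled renewable generation
residuals = {scale: load - scale * renewables for scale in (1, 2, 3)}

for scale, residual in residuals.items():
    print(f"scale x{scale}: {residual.tolist()}")
```

With a factor of 3, two of the three hours turn negative in this toy data, which is the kind of surplus the duration curves make visible.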

In [None]:
# function to re-calculate the residual with increasing wind and PV penetration, represented by scale factors
def scale_renewables(df, scales):
    df_out = df[["residual"]]
    for scale in scales:
        df_scaled = df.copy()
        df_scaled["solar"] *= scale
        df_scaled["wind"]  *= scale
        residual(df_scaled)
        df_scaled_res = df_scaled["residual"].rename(f"residual: scale x{scale}")
        df_out = pd.concat([df_out, df_scaled_res], axis=1)

    return df_out

In [None]:
df_scaled = scale_renewables(df_DE, np.arange(2, 6))

In [None]:
plot_residual_curves(df_scaled, title="Residual load duration curve with increasing renewable penetration", width=800, height=600, template=template)

### Generation/consumption balance

The residual load duration curves show a growing share of *"negative residuals"* as renewables increase. The following section shows what this means in terms of the deficit/surplus energy balance.
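
The split into deficit and surplus can be sketched on a toy series. The values and the names `deficit_mwh` and `surplus_mwh` are hypothetical; with hourly sampling, summing MW values gives MWh.

```python
import pandas as pd

# Hypothetical hourly residual load in MW (positive: deficit, negative: surplus)
residual = pd.Series([500.0, -200.0, 300.0, -400.0, 0.0])

# With hourly sampling, summing MW values gives energy in MWh
deficit_mwh = residual.clip(lower=0).sum()   # demand not covered by renewables
surplus_mwh = -residual.clip(upper=0).sum()  # renewable generation exceeding demand

print(f"Deficit: {deficit_mwh:.0f} MWh, surplus: {surplus_mwh:.0f} MWh")
```

The difference between the two numbers equals the sum of the residual load, so a system can have a positive energy balance overall and still show large surpluses in individual hours.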

In [None]:
df_scaled = scale_renewables(df_DE, np.arange(1.5, 6.0, 0.5))

In [None]:
pd.concat([
    df_scaled.sum().rename("Energy Balance [TWh]") / 1e6, 
    df_scaled.min().rename("Min. Power [GW]") / 1e3,
    df_scaled.max().rename("Max. Power [GW]") / 1e3
    ], 
    axis=1
)

In [None]:
# build a dataframe with columns "demand" & "surplus" that splits the residual
# into the energy deficit (unmet demand) and the renewable surplus per scenario
def energy_balance(df_scaled):
    deficit = list()
    surplus = list()
    names = list()
    for name, residual_load in df_scaled.items():
        p = residual_load.loc[residual_load > 0].sum() / 1e6 # MWh -> TWh
        n = residual_load.loc[residual_load < 0].sum() / 1e6 # negative values: surplus
        deficit.append(p)
        surplus.append(n)
        names.append(name)

    return pd.DataFrame(data={
        "demand": deficit,
        "surplus": surplus,
    }, index = names
    )

In [None]:
energy_balance(df_scaled).plot.bar(template=template, labels={"value": "Energy [TWh]", "index": "Scenario"}, title="Energy balance with increasing renewable penetration")

In the reference year 2019, the residual load is `> 0` at all times, so no surplus occurs. With an increasing scale factor for wind and PV, the surplus of renewable supply grows and the demand deficit shrinks. But even in the most aggressive scenario with `scale = 5.5`, renewables alone do not manage to cover the demand at all times, despite a huge excess of generation for most of the year.



## Balancing residual load through large-scale storage

We want to use energy storage to provide a 100 % supply from renewable energy sources. Let's estimate how much more renewable capacity is needed (compared to the base scenario `"2019"`) and how large the storage capacity must be to meet this requirement.
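
As a rough sketch, assuming a lossless storage: renewables cover the demand over the whole year once their annual energy equals the annual load. The scale factor $s$ (a symbol introduced here for illustration) then follows from the energy sums of the `load`, `solar` and `wind` profiles:

$$s = \frac{\sum_t P_{load}(t)}{\sum_t \left( P_{solar}(t) + P_{wind}(t) \right)}$$

If solar and wind covered about a third of the annual demand, this would give $s \approx 3$. Storage losses push the required factor higher.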

In [None]:
# First, set surplus production equal to residual energy demand (storage would have to be 100% efficient).
res_frac = (df_DE["solar"].sum() + df_DE["wind"].sum()) / df_DE["load"].sum()
scale = 1 / res_frac
print(f"Required scale factor for 100% renewables: {scale:.2f}")

### Assuming an *"ideal"* storage system with no losses
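
A minimal sketch of the sizing idea, using made-up numbers: a lossless storage absorbs the negative residual load, its energy level is the running sum of the storage power, and the required capacity is the span between the highest and lowest level. The six-hour profile below is hypothetical and chosen so that it sums to zero.

```python
import numpy as np

# Hypothetical residual load in MW over six hours; it sums to zero,
# so a lossless storage ends at the level where it started
residual = np.array([-300.0, -100.0, 200.0, 400.0, -250.0, 50.0])

storage_power = -residual                # positive: charging, negative: discharging
energy_level = np.cumsum(storage_power)  # MWh, relative to the unknown initial level

# The storage must span the distance between the highest and lowest level
required_capacity = energy_level.max() - energy_level.min()
print(f"Required capacity: {required_capacity:.0f} MWh")
```

The initial level does not matter for the capacity, only for where the storage has to start so that it never runs empty.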

In [None]:
# scale renewables for a 100 % renewable supply
df_ideal_storage = df_DE.copy()
df_ideal_storage["solar"] *= scale
df_ideal_storage["wind"]  *= scale
residual(df_ideal_storage)

# derive storage need for balancing the residual load
storage_power = -df_ideal_storage["residual"]  # MW, positive: charging
energy_level = storage_power.cumsum() * 1e6    # MWh -> Wh (hourly sampling)

In [None]:
storage_power.plot(template=template, labels={"value": "Storage Power in MW"})

In [None]:
energy_level.plot(template=template, labels={"value": "Energy in Wh"})

In [None]:
ideal_storage_cap = (energy_level.max() - energy_level.min()) * 1e-12 # Wh -> TWh
print(f"Ideal storage capacity in TWh: {ideal_storage_cap:.2f}")

### Simulating a *"real"* storage with limited capacity, power and losses 
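
Before simulating a whole year, a single time step of such a "bucket" storage can be sketched as follows. The function `storage_step` and its arguments are hypothetical names for illustration. It assumes hourly steps (so MW and MWh are interchangeable per step), limits only the charge power, and applies the efficiencies on the way in and out.

```python
def storage_step(soe, residual, capacity, max_charge, eta_in, eta_out):
    """Advance the storage by one hour; returns (new_soe, grid_power).

    residual > 0 means a deficit (discharge), residual < 0 a surplus (charge).
    grid_power > 0 is power delivered to the grid, < 0 power taken from it.
    """
    if residual > 0:
        wanted = -residual / eta_out                  # energy drawn from the storage
    else:
        wanted = min(-residual * eta_in, max_charge)  # energy stored, power-limited
    new_soe = min(max(soe + wanted, 0.0), capacity)   # respect empty/full limits
    delta = new_soe - soe
    # translate the change in stored energy back to the grid side
    grid_power = -delta * eta_out if delta < 0 else -delta / eta_in
    return new_soe, grid_power


# Discharge: covering 30 MW for one hour costs 60 MWh of stored energy at eta_out = 0.5
print(storage_step(100.0, 30.0, 100.0, 40.0, 0.5, 0.5))
# Charge: a 100 MW surplus is limited to 40 MW on the storage side
print(storage_step(40.0, -100.0, 100.0, 40.0, 0.5, 0.5))
```

When the storage runs empty or full within a step, `grid_power` is smaller than the residual load, which is how unmet demand and curtailed surplus show up in the simulation.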

In [None]:
# simulate the operation with a fixed storage capacity and power, and charge/discharge efficiency (eta_in, eta_out)
# - which fraction of the electricity demand can we cover with this type of storage?
def simulate_bucket_storage(df, capacity, max_p_stor, eta_in, eta_out):
    # float columns, so the values written in the loop keep their dtype
    df["energy"] = 0.0   # state of energy [MWh]
    df["power_DC"] = 0.0 # power to charge/discharge the storage (storage side)
    df["power_AC"] = 0.0 # power from the grid perspective: accounts for losses

    soe_old = capacity * 0.5 # initial state of energy
    p_stor = 0
    for i, row in df.iterrows():
        if row.residual > 0:
            # discharge (note: only the charge power is limited by max_p_stor)
            p_stor = row.residual / eta_out
        else:
            # charge
            p_stor = max(row.residual * eta_in, -max_p_stor) # limit charge power
        soe_new = soe_old - p_stor * 1            # power to energy for 1 hour sampling
        soe_new = min(max(soe_new, 0), capacity)  # limit storage energy level
        df.loc[i, "power_DC"] = soe_new - soe_old # real power considering SOE limits
        if row.residual > 0:
            power_ac = (soe_new - soe_old) * eta_out
        else:
            power_ac = (soe_new - soe_old) / eta_in
        df.loc[i, "power_AC"] = power_ac # grid-side power (positive: charging)
        df.loc[i, "energy"] = soe_new
        soe_old = soe_new

    return df

In [None]:
scale = 3.7 # oversize the renewables to compensate for losses
df_PtG_storage = df_DE.copy()
df_PtG_storage["solar"] *= scale
df_PtG_storage["wind"]  *= scale
residual(df_PtG_storage)

df_PtG_storage = simulate_bucket_storage(df_PtG_storage, 20e6, 80e3, 0.6, 0.6)

In [None]:
df_PtG_storage["power_DC"].plot(template=template, labels={"value": "Power [MW]"})

In [None]:
df_PtG_storage["energy"].plot(template=template, labels={"value": "Energy [MWh]"})

In [None]:
# Analysis of the power and of the share of energy that was curtailed or could not be stored
df_PtG_storage["storage power"] = -df_PtG_storage["power_AC"]
df_PtG_storage[["residual", "storage power"]].plot(template=template, labels={"value": "Power [MW]"})

In [None]:
fulfillment = (df_PtG_storage["storage power"] / df_PtG_storage["residual"]).sum() / df_PtG_storage.shape[0]
print(f"Fulfillment factor: {fulfillment*100:.2f} %")

In [None]:
deficit = df_PtG_storage.loc[df_PtG_storage["residual"] > 0, "residual"].sum()
surplus = df_PtG_storage.loc[df_PtG_storage["residual"] < 0, "residual"].abs().sum()
balance = deficit / surplus
print(f"Balance demand/surplus: {balance*100:.2f} %")

## Analysis of storage operations and considerations for hybrid storage solutions

As discussed in this lecture, there is no one-size-fits-all energy storage solution. Instead, the optimum results from a combination of different technologies, each with its own advantages and shortcomings: specific capacity/power costs, efficiency and power ramp dynamics.

To emulate the combination of different storage technologies, we assume that each one covers one of three time scales:
* Long-term (week-average)
* Mid-term (day-average)
* Short-term (real-time)

What are the new capacity requirements? And what is the characteristic operation for each technology?
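
The decomposition into time scales can be sketched on a synthetic signal. The example assumes trailing rolling means over 7 days and 1 day; the signal itself (a slow swing plus a daily cycle) and the names `long_term`, `mid_term` and `short_term` are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical storage energy level: a slow swing plus a daily cycle (4 weeks, hourly)
n_hours = 24 * 28
index = pd.date_range("2019-01-01", periods=n_hours, freq=pd.Timedelta(hours=1))
hours = np.arange(n_hours)
energy = pd.Series(
    10 * np.sin(2 * np.pi * hours / n_hours) + np.sin(2 * np.pi * hours / 24),
    index=index,
)

# Long-term share: trailing week average of the energy level
long_term = energy.rolling(pd.Timedelta(days=7)).mean()
# Mid-term share: trailing day average of what the long-term storage leaves over
mid_term = (energy - long_term).rolling(pd.Timedelta(days=1)).mean()
# Short-term share: whatever remains after the two averages
short_term = energy - long_term - mid_term

# By construction the three components add up to the original signal
assert np.allclose(long_term + mid_term + short_term, energy)

components = pd.concat({"long": long_term, "mid": mid_term, "short": short_term}, axis=1)
print(components.abs().max())  # rough indication of the energy each tier must hold
```

Because each tier only sees what the slower tier leaves over, the sum of the three always reproduces the original energy level, whatever window lengths are chosen.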

In [None]:
# We create a copy of the residual load data and store the rolling average
df_tech = df_ideal_storage[["residual"]].copy()
df_tech["energy"] = df_tech["residual"].cumsum() * 1e-6

In [None]:
df_tech["energy_week_roll"] = df_tech["energy"].rolling(pd.Timedelta(days=7)).mean()
df_tech[["energy", "energy_week_roll"]].plot(template=template, labels={"value": "Energy in TWh"})

In [None]:
# the rolling mean is computed and stored for different averaging periods
df_tech["energy_delta_week_roll"] = df_tech["energy"] - df_tech["energy_week_roll"]
# df_tech[["energy", "energy_delta_week_roll"]].plot(template=template)

df_tech["energy_day_roll"] = df_tech["energy_delta_week_roll"].rolling(pd.Timedelta(days=1)).mean()
df_tech["energy_delta_day_roll"] = df_tech["energy_delta_week_roll"] - df_tech["energy_day_roll"]
# df_tech[["energy", "energy_delta_week_roll", "energy_delta_day_roll"]].plot(template=template)

df_tech["energy_hour_roll"] = df_tech["energy"] - (df_tech["energy_day_roll"] + df_tech["energy_week_roll"])
# df_tech["energy_hour_roll"].plot(template=template)

In [None]:
df_tech["energy week + day"] = df_tech["energy_week_roll"] + df_tech["energy_day_roll"] # + df_tech["energy_hour_roll"]
df_tech[["energy", "energy week + day"]].plot(template=template)

In [None]:
df_tech[["energy_hour_roll", "energy_day_roll", "energy_week_roll"]].plot(
    template=template, 
    labels={"value": "Energy in TWh"}
)

In [None]:
# Energy need
df_tech[["energy", "energy_hour_roll", "energy_day_roll", "energy_week_roll"]].abs().max()

In [None]:
# Energy throughput
df_tech[["energy", "energy_hour_roll", "energy_day_roll", "energy_week_roll"]].diff().abs().sum()

In [None]:
# Power max
df_tech[["energy", "energy_hour_roll", "energy_day_roll", "energy_week_roll"]].diff().abs().max()

In [None]:
# power time-series
df_tech[["energy", "energy_hour_roll", "energy_day_roll", "energy_week_roll"]].diff().plot(template=template, labels={"value": "Power [TW]"})

In [None]:
# Analysis of the energy levels of the hybrid storage
df_tech[["energy_week_roll", "energy_day_roll", "energy_hour_roll"]].plot.hist(labels={"value": "Energy [TWh]"}).update_layout(barmode='overlay').update_traces(opacity=0.75)


In [None]:
# Analysis of the storage power
df_tech[["energy_week_roll", "energy_day_roll", "energy_hour_roll"]].diff().plot.hist(labels={"value": "Power [TW]"}).update_layout(barmode='overlay').update_traces(opacity=0.75)

In [None]:
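# function to group each run of consecutive non-zero values under its own tag;
# zeros get tag 0 (a non-zero run before the first zero also gets tag 0)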
def group_cluster(series):
    df = series.to_frame()
    counter = 0
    first_zero = True
    df["tag"] = 0

    for i, v in series.items():
        if v == 0:
            t = 0
            if first_zero:
                counter += 1
                first_zero = False
        else:
            t = counter
            first_zero = True

        df.loc[i, "tag"] = t

    return df.groupby("tag")

In [None]:
groups = dict()
for name, series in df_tech[["energy_week_roll", "energy_day_roll", "energy_hour_roll"]].diff().items():
    df = pd.Series([max(v, 0) for v in series], name=name)
    groups[name] = group_cluster(df)

In [None]:
# amount of energy
sums = pd.DataFrame()
for name, group in groups.items():
    group_sum = group.sum()[name]
    group_sum = group_sum.loc[group_sum.index != 0]
    group_sum = group_sum.sort_values(ascending=False).reset_index(drop=True)
    # group_sum = group_sum.reset_index(drop=True)
    sums = pd.concat([sums, group_sum], axis=1)

In [None]:
sums.plot.bar(facet_row="variable", height=600, labels={"value": "Energy [TWh]", "index": "counts"})

In [None]:
# time duration
durations = pd.DataFrame()
for name, group in groups.items():
    group_count = group.count()[name]
    group_count = group_count.loc[group_count.index != 0]
    group_count = group_count.sort_values(ascending=False).reset_index(drop=True)
    # group_count = group_count.reset_index(drop=True)
    durations = pd.concat([durations, group_count], axis=1)

In [None]:
durations.plot.bar(facet_row="variable", height=600, labels={"value": "Duration [h]", "index": "counts"})