In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import os
import math

In [None]:

files = []
for file in os.listdir('data'):
    if file.endswith('.csv'):
        files.append(file)
print(files)

In [None]:
df = pd.read_csv('data/' + files[0])

In [None]:
df

In [None]:
plt.scatter(x=df['longitude'], y=df['latitude'])
plt.show()

In [None]:
import geopandas as gpd
from shapely.geometry import Point
path_to_germany = "./data/vg2500_geo84/vg2500_bld.shp"
germany_gdf = gpd.read_file(path_to_germany)

In [None]:
germany_gdf.plot()

In [None]:
print("CRS for Germany shapefile:", germany_gdf.crs)

In [None]:
geometry = [Point(xy) for xy in zip(df.longitude, df.latitude)]
geo_df = gpd.GeoDataFrame(df, geometry=geometry)

In [None]:
geo_df.set_crs(germany_gdf.crs, inplace=True)

In [None]:
fig, ax = plt.subplots()
germany_gdf.plot(ax=ax, color='lightgrey')

geo_df.plot(ax=ax, marker='o', color='red', markersize=5)

plt.show()

## Convert datetimes (column time)

In [None]:
df["timestamps"] = pd.to_datetime(df["time"])
df["epoch_time"] = (df['timestamps'] - pd.Timestamp("1970-01-01")) // pd.Timedelta('1s')
df = df.drop(columns=["timestamps"])
print("DataFrame with Epoch Time:")
df

In [None]:
df["windspeed"] = np.sqrt(df['u10']**2 + df['v10']**2)

In [None]:
df

In [None]:
df.groupby(["forecast_origin"]).agg({"windspeed": ["sum", "std"]})

In [None]:
df.columns

- **longitude, latitude**: Geographic longitude and latitude of the measurement or forecast points.

- **forecast_origin**: Date and time at which the weather forecast was issued.

- **time**: Date and time for which the weather forecast is valid.

- **cdir**: Probably "Clear-sky direct solar radiation at surface" (the ECMWF short name), i.e. the direct solar radiation that would reach the surface under cloud-free conditions. It is not a wind direction.

- **z**: Usually the geopotential (in ECMWF data, in m²/s²). Dividing by standard gravity gives the geopotential height, i.e. the height above sea level expressed relative to the geopotential rather than as geometric distance.

- **msl**: "Mean Sea Level Pressure", the air pressure reduced to mean sea level.

- **blh**: "Boundary Layer Height", the height of the atmospheric boundary layer, which matters for many meteorological processes.

- **tcc**: "Total Cloud Cover", the total cloud cover, expressed as a percentage or fraction.

- **u10, v10**: Wind components (u and v) at 10 meters height. "u" is the west-east (eastward) component and "v" the south-north (northward) component.

- **t2m**: "Temperature at 2 meters", the air temperature 2 meters above the ground.

- **ssr**: "Surface Solar Radiation", the solar energy reaching the Earth's surface. In ECMWF naming this is the net value, i.e. after subtracting the reflected part.

- **tsr**: "Top of Atmosphere Solar Radiation", the solar radiation at the top of the atmosphere. In ECMWF naming this is the net value (incoming minus outgoing).

- **sund**: Probably "Sunshine Duration", the length of time with sunshine within the accumulation period.

- **tp**: "Total Precipitation", the total amount of precipitation over a given period.

- **fsr**: Probably "Forecast Surface Roughness" (the ECMWF short name), the aerodynamic roughness length of the surface, which affects near-surface wind. It is likely not a solar radiation forecast.

- **u100, v100**: Like u10 and v10, but for the wind components at 100 meters height.
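
As a small illustration of how the u and v components combine, the sketch below derives wind speed and meteorological direction from made-up component values. The function name `wind_speed_and_direction` and the sample numbers are assumptions for illustration only and are not taken from the dataset.

```python
import numpy as np

def wind_speed_and_direction(u, v):
    """Return wind speed and meteorological direction (degrees the wind blows FROM)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    speed = np.hypot(u, v)
    # Convention: 0 deg = wind from the north, 90 deg = wind from the east
    direction = (270.0 - np.degrees(np.arctan2(v, u))) % 360.0
    return speed, direction

# Made-up (u, v) pairs: towards north-east, towards south, towards west
speed, direction = wind_speed_and_direction([3.0, 0.0, -4.0], [4.0, -5.0, 0.0])
print(speed, direction)
```

The same function could be applied to `df['u100']` and `df['v100']` to obtain the wind at 100 meters height.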


## Let's plot the wind speed throughout the year

In [None]:
import plotly.graph_objects as go


In [None]:
fig = go.Figure()
x_axis = df["time"]

fig.add_trace(go.Scatter(x=x_axis, y=df.windspeed,
                    mode='lines',
                    name='windspeed'))
fig.add_trace(go.Scatter(x=x_axis, y=df.ssr,
                    mode='lines',
                    name='solar'))

fig.show()





## Nothing linear
- More wind does not necessarily mean more energy: turbines have a capacity limit
- Same for solar: a limit is reached at around 25 degrees (?), after which output decreases, I think
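
To make the non-linearity concrete, here is a minimal sketch of an idealized turbine power curve. The cut-in, rated, and cut-out speeds and the rated power are hypothetical placeholder values, not taken from any turbine behind the data, and the function name `turbine_power` is likewise an assumption.

```python
import numpy as np

def turbine_power(v, cut_in=3.0, rated_speed=12.0, cut_out=25.0, rated_power=3.0):
    """Idealized power curve in MW: cubic ramp between cut-in and rated speed,
    constant at rated power up to cut-out, zero everywhere else."""
    v = np.asarray(v, dtype=float)
    ramp = rated_power * (v**3 - cut_in**3) / (rated_speed**3 - cut_in**3)
    power = np.where((v >= cut_in) & (v < rated_speed), ramp, 0.0)
    power = np.where((v >= rated_speed) & (v <= cut_out), rated_power, power)
    return power

speeds = np.array([2.0, 8.0, 12.0, 20.0, 30.0])  # m/s, made-up values
power = turbine_power(speeds)
print(power)
```

Above the rated speed the output stays flat, and above the cut-out speed it drops to zero, so a linear model on wind speed alone would miss both effects.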

# Our task
Predict wind and solar for the next hour based on
- Do we choose the variables ourselves? But are we given the current state?
    - I.e. time of day, current wind, our location, month, what the wind was like before?
- Are we allowed to enrich the data ourselves? Weather data w.r.t. sun, re
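
One way to turn the candidate inputs listed above (time of day, month, current wind, earlier wind) into a supervised learning table is sketched below. It uses synthetic hourly data. The helper `make_features` and the column names `hour`, `month`, `windspeed_lag1`, and `target_next_hour` are illustrative assumptions rather than a fixed design.

```python
import numpy as np
import pandas as pd

def make_features(series):
    """Build a feature table for one-hour-ahead prediction from an hourly series."""
    feats = pd.DataFrame({"windspeed": series})
    feats["hour"] = series.index.hour
    feats["month"] = series.index.month
    feats["windspeed_lag1"] = series.shift(1)      # value one hour earlier
    feats["target_next_hour"] = series.shift(-1)   # value we want to predict
    return feats.dropna()                          # first and last row are incomplete

# Synthetic stand-in for an hourly wind speed series
rng = np.random.default_rng(0)
index = pd.date_range("2020-01-01", periods=48, freq="h")
wind = pd.Series(rng.uniform(0, 15, size=len(index)), index=index)

features = make_features(wind)
print(features.head())
```

Using `shift` keeps the target strictly in the future relative to every feature, which avoids leaking next-hour information into the inputs.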

----------

In [None]:
import seaborn as sns
from statsmodels.tsa.seasonal import seasonal_decompose

In [None]:
df_solar = df[["time", "ssr"]].copy()  # copy so the assignments below do not act on a view of df
df_solar["time"] = pd.to_datetime(df_solar["time"])
df_solar["month_year"] = df_solar["time"].dt.strftime("%m-%Y")
df_solar = df_solar.sort_values(by="time")
df_solar = df_solar.drop(columns=["time"])


In [None]:
df_solar[df_solar["month_year"]=="06-2020"]

In [None]:
result = seasonal_decompose(x= df_solar["ssr"], model='additive', period=12)
result.plot()
plt.suptitle('Solar radiation')
plt.tight_layout()
plt.show()

In [None]:
df_prices = pd.read_csv('data/' + files[1], sep=';')
df_realized_supply = pd.read_csv('data/' + files[2], sep=';')
df_installed_cp = pd.read_csv('data/' + files[3], sep=';')
df_realized_demand = pd.read_csv('data/' + files[4], sep=';')

In [None]:
# Synthetic monthly series (sine wave plus noise); note that this overwrites df_solar
dates = pd.date_range(start='1/1/2019', periods=3*12, freq='M')
radiation = np.sin(2 * np.pi * np.arange(1, len(dates)+1) / 12) * 1000 + np.random.normal(0, 100, len(dates))
df_solar = pd.DataFrame({'Date': dates, 'ssr': radiation})
df_solar.set_index('Date', inplace=True)

# Seasonal decomposition
result = seasonal_decompose(df_solar['ssr'], model='additive', period=12)
result.plot()
plt.suptitle('Solar Radiation')
plt.tight_layout()
plt.show()

In [None]:
df_realized_supply

In [None]:
df_realized_supply["timestamps"] = pd.to_datetime(df_realized_supply["Date from"])
fig = go.Figure()
x_axis = df_realized_supply["timestamps"]

fig.add_trace(go.Scatter(x=x_axis, y=df_realized_supply["Photovoltaic [MW]"],
                    mode='lines',
                    name='solar'))

fig.show()

In [None]:
df_realized_supply = pd.read_csv('data/' + files[2], sep=';')
df_realized_supply["time"]= pd.to_datetime(df_realized_supply["Date from"])
df_realized_supply['month_year'] = df_realized_supply['time'].dt.strftime('%Y-%m')
# drop everything besides PhotoVoltaic

df_realized_supply.columns

In [None]:
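def preprocess_ssr(value):
    """Parse a number string like '1.234,56' into a float, dropping the decimals.

    Assumes '.' is the thousands separator and ',' the decimal separator.
    """
    value = value.split(',')[0]     # keep only the integer part
    value = value.replace('.', '')  # remove thousands separators
    return float(value)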


In [None]:
df_realized_supply["Photovoltaic [MW]"].unique()

In [None]:

df_realized_supply["Photovoltaic [MW]"].unique()

df_realized_supply["Photovoltaic [MW]"] = df_realized_supply["Photovoltaic [MW]"].apply(preprocess_ssr)
df_realized_supply["Wind Onshore [MW]"] = df_realized_supply["Wind Onshore [MW]"].apply(preprocess_ssr)


In [None]:
df_realized_supply["Photovoltaic [MW]"].unique()

In [None]:

monthly_avg_df = df_realized_supply.groupby('month_year').agg({"Photovoltaic [MW]": "mean", "Wind Onshore [MW]": "mean"}).reset_index()
monthly_avg_df.sort_values(by="month_year", inplace=True)
monthly_avg_df

In [None]:
fig = go.Figure()
x_axis = monthly_avg_df["month_year"]

fig.add_trace(go.Scatter(x=x_axis, y=monthly_avg_df["Photovoltaic [MW]"],
                    mode='lines',
                    name='solar'))

fig.add_trace(go.Scatter(x=x_axis, y=monthly_avg_df["Wind Onshore [MW]"],
                    mode='lines',
                    name='wind'))

fig.show()

# Kalman Filter

## Initial Estimate

$$\hat{\mathbf{x}}_{0,0}, \quad \mathbf{P}_{0,0}$$

### Extrapolate ("Predict")

1. Extrapolate the state:
   
   $$ \hat{\mathbf{x}}_{n+1,n} = \mathbf{F} \hat{\mathbf{x}}_{n,n} + \mathbf{G} \mathbf{u}_n $$
   

2. Extrapolate uncertainty:
   
   $$\mathbf{P}_{n+1,n} = \mathbf{F} \mathbf{P}_{n,n} \mathbf{F}^T + \mathbf{Q}$$

<br>
We extrapolate the state estimate at time n and its uncertainty to the next time step. These predictions are corrected later, once the next measurement is available.
<br>
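
The two extrapolation equations translate almost directly into NumPy. The sketch below is a minimal illustration: the function name `kf_predict` and the constant-velocity example model (state = position and velocity, time step 1) are assumptions chosen for demonstration, not a model for the wind or solar data.

```python
import numpy as np

def kf_predict(x, P, F, Q, G=None, u=None):
    """Extrapolate the state x and its covariance P one step ahead."""
    x_pred = F @ x
    if G is not None and u is not None:
        x_pred = x_pred + G @ u      # optional control input
    P_pred = F @ P @ F.T + Q         # uncertainty grows by the process noise Q
    return x_pred, P_pred

# Hypothetical constant-velocity model: state = [position, velocity], dt = 1
F = np.array([[1.0, 1.0],
              [0.0, 1.0]])
Q = 0.01 * np.eye(2)
x0 = np.array([0.0, 2.0])
P0 = np.eye(2)

x_pred, P_pred = kf_predict(x0, P0, F, Q)
print(x_pred)
print(P_pred)
```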

## Measurement Update ("Correct")
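Once the measurement for the predicted time step is available, we correct the prediction with it. In the equations below this step is called n, so $\hat{\mathbf{x}}_{n,n-1}$ and $\mathbf{P}_{n,n-1}$ are the predictions made at the previous step.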

1. Compute the Kalman Gain:

   $$\mathbf{K}_n = \mathbf{P}_{n,n-1} \mathbf{H}^T (\mathbf{H} \mathbf{P}_{n,n-1} \mathbf{H}^T + \mathbf{R}_n)^{-1}$$


2. Update estimate with measurement:
   
   $$\hat{\mathbf{x}}_{n,n} = \hat{\mathbf{x}}_{n,n-1} + \mathbf{K}_n (\mathbf{z}_n - \mathbf{H} \hat{\mathbf{x}}_{n,n-1})$$
   

3. Update the estimate uncertainty:

   $$\mathbf{P}_{n,n} = (\mathbf{I} - \mathbf{K}_n \mathbf{H}) \mathbf{P}_{n,n-1} (\mathbf{I} - \mathbf{K}_n \mathbf{H})^T + \mathbf{K}_n \mathbf{R}_n \mathbf{K}_n^T$$ 
   or, in the simplified form that holds for the optimal Kalman gain:
   
   $$ \mathbf{P}_{n,n} = (\mathbf{I} - \mathbf{K}_n \mathbf{H}) \mathbf{P}_{n,n-1}$$
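
The three correction equations can be sketched in NumPy as follows. The function name `kf_update` and the example numbers (a two-dimensional state of which only the first component is measured) are assumptions for illustration. The covariance update uses the longer of the two forms given above.

```python
import numpy as np

def kf_update(x_pred, P_pred, z, H, R):
    """Correct the predicted state with measurement z."""
    S = H @ P_pred @ H.T + R                   # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)        # Kalman gain
    x = x_pred + K @ (z - H @ x_pred)          # state update
    I_KH = np.eye(P_pred.shape[0]) - K @ H
    P = I_KH @ P_pred @ I_KH.T + K @ R @ K.T   # covariance update (long form)
    return x, P, K

# Hypothetical example values
x_pred = np.array([2.0, 2.0])
P_pred = np.array([[2.0, 1.0],
                   [1.0, 1.0]])
H = np.array([[1.0, 0.0]])   # only the first state component is observed
R = np.array([[2.0]])
z = np.array([3.0])

x, P, K = kf_update(x_pred, P_pred, z, H, R)
print(x)
print(P)
```

Because `K` here is the optimal gain, the result agrees with the simplified form `(I - K H) P_pred`. The long form is often preferred in practice because it is less sensitive to rounding errors.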
   




In the scalar case, the Kalman gain is simply the variance of the estimate divided by the sum of the variance of the estimate and the variance of the measurement. A gain close to zero means that the measurement uncertainty is high relative to the estimate uncertainty, so we give most of the weight to the estimate and only a small weight to the measurement. In the opposite case (gain close to one), we trust the measurement and give it more weight.
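
As a worked illustration with made-up numbers, take a scalar state with $\mathbf{H} = 1$, and write $p_{n,n-1}$ for the predicted estimate variance and $r_n$ for the measurement variance (both symbols are introduced here only for this example):

$$ K_n = \frac{p_{n,n-1}}{p_{n,n-1} + r_n} $$

With $p_{n,n-1} = 4$ and $r_n = 1$ we get $K_n = 4/5 = 0.8$, so the update moves 80% of the way towards the measurement. With $p_{n,n-1} = 1$ and $r_n = 4$ we get $K_n = 1/5 = 0.2$, so the estimate mostly stays where the prediction put it.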

Once we have the Kalman gain, we can update our estimate with the measurement. We can also update the uncertainty of the estimate, which we had only extrapolated in the prediction step. The form of the covariance update follows from a longer derivation, which in outline goes like this. We start from the state update equation. Because that update is linear in the prediction and the measurement, we can propagate their covariances through it to obtain the covariance of the updated estimate. We then choose the gain that minimizes the variance of the estimate (roughly equivalent to minimizing the MSE loss; less variance means we are more certain about our estimate). Setting the derivative with respect to $\mathbf{K}_n$ to zero and simplifying yields the equations above.

After doing all these steps, we repeat the process. Our new estimates become our prior estimates for the next time step. There, we again extrapolate the state and uncertainty and update our estimates with the measurement. 
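
The repeated predict/correct cycle can be sketched end to end for a scalar state. This is a minimal illustration assuming a random-walk model (F = 1, H = 1) with hand-picked noise variances `q` and `r`. The function name `scalar_kalman` and the synthetic measurements are assumptions, not a model fitted to the wind or solar data.

```python
import numpy as np

def scalar_kalman(measurements, x0, p0, q, r):
    """Run a scalar Kalman filter with F = H = 1 and return the filtered estimates."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: a random walk keeps the state and inflates its variance by q
        x_pred, p_pred = x, p + q
        # Correct: blend prediction and measurement using the Kalman gain
        k = p_pred / (p_pred + r)
        x = x_pred + k * (z - x_pred)
        p = (1.0 - k) * p_pred
        estimates.append(x)
    return np.array(estimates), p

# Synthetic data: noisy measurements of a constant true value
rng = np.random.default_rng(42)
true_value = 10.0
measurements = true_value + rng.normal(0.0, 2.0, size=200)

estimates, final_p = scalar_kalman(measurements, x0=0.0, p0=100.0, q=1e-4, r=4.0)
print(estimates[-1], final_p)
```

The large initial variance `p0` expresses that the starting guess `x0` is not trusted, so the first few measurements pull the estimate quickly towards the data before the gain settles down.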

# Problem
- I have no idea how to construct F, let alone Q
- What does the linear model for predicting the state look like?
- What does my state even look like? Does it contain u100, v100, wind onshore + offshore? In theory I do not have onshore + offshore, since that is what I would get out of the linear model from u100 and v100

## Regression analysis ??

$$ \text{Power (W)} = \frac{1}{2} \, \rho \, A \, v^3 $$




- Power = power carried by the wind in watts
- ρ (rho, a Greek letter) = density of the air in kg/m³
- A = cross-sectional area the wind passes through (e.g. the rotor swept area) in m²
- v = velocity of the wind in m/s
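
The formula can be evaluated directly. The sketch below assumes a circular area and uses example values (sea-level air density of about 1.225 kg/m³ and a hypothetical 50 m rotor radius). The function name `wind_power_watts` is an assumption for illustration.

```python
import math

def wind_power_watts(air_density, rotor_radius, wind_speed):
    """Kinetic power (W) of wind passing through a circular area of the given radius."""
    area = math.pi * rotor_radius**2
    return 0.5 * air_density * area * wind_speed**3

# Assumed example values: rho = 1.225 kg/m^3, radius = 50 m
p_10 = wind_power_watts(1.225, 50.0, 10.0)   # wind at 10 m/s
p_20 = wind_power_watts(1.225, 50.0, 20.0)   # wind at 20 m/s
print(p_10, p_20, p_20 / p_10)
```

Doubling the wind speed multiplies the available power by eight because of the cubic term. This is the power contained in the wind; a real turbine extracts only part of it (at most 16/27, the Betz limit), so a regression on v³ would still need a scaling factor.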

