## Importing the Dependencies

In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

## Problem Statement
Company XY introduced solar panels and accompanying batteries to become less dependent on the main grid and thereby reduce its overall energy dependence and CO2 emissions. The first phase of the project was completed successfully with the installation of the solar panels and batteries. The goal of the second phase is to analyze the company's consumption and to reduce the use of main-grid electricity to zero by using ML to predict the next window in which electricity from the main grid will be needed. Current analysis shows that energy is still being drawn from the main grid. The ML model should therefore predict when the next main-grid usage will occur so that the energy management team can put measures in place to prevent it, for example by reducing energy consumption during that period.

nicht ELWOG = Consumption from battery
Energie-Graz_Lf_ERZ = Consumption from main grid
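
The "next window" described in the problem statement can be sketched as follows, assuming an hourly forecast of main-grid consumption is already available as a pandas Series. The function name `next_grid_window`, the `threshold` parameter, and the sample values are hypothetical and only illustrate the idea.

```python
import numpy as np
import pandas as pd

def next_grid_window(forecast_kwh, threshold=0.0):
    """Return (start, end) of the first consecutive run above threshold, or None."""
    mask = (forecast_kwh > threshold).to_numpy()
    hits = np.flatnonzero(mask)
    if hits.size == 0:
        return None  # no grid usage predicted
    start = end = int(hits[0])
    # Extend the window while the following hours also exceed the threshold
    while end + 1 < len(mask) and mask[end + 1]:
        end += 1
    return forecast_kwh.index[start], forecast_kwh.index[end]

# Illustrative forecast values only (kWh per hour), not taken from the data set
hours = pd.date_range("2022-05-01 00:00", periods=6, freq="60min")
forecast = pd.Series([0.0, 0.0, 0.4, 0.7, 0.0, 0.2], index=hours)
window = next_grid_window(forecast)
print(window)
```

The energy management team would then act on the returned start and end timestamps; a threshold slightly above zero may be more practical if the forecast is noisy.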




## Data Import
Import the data from the energy data files.

In [29]:
df = pd.read_csv("Energiedaten/EV_AT0081000801000000000000000307612_Mai_2022.csv", delimiter=";")
sun_df = pd.read_csv("Energiedaten/HISTALP_AT_GRA_SU1_2022_2023-2.csv", delimiter=";")

new_columns = ["KA", "Energie-Erzeuger", "Von", "Bis", "Kwh"]

df.columns = new_columns
df = df.reset_index(drop=True)
df = df.drop("KA", axis=1)

df['Kwh'] = df['Kwh'].str.replace(',', '.').astype(float)

df['Von'] = pd.to_datetime(df['Von'], format="%d.%m.%Y %H:%M")
df['Bis'] = pd.to_datetime(df['Bis'], format="%d.%m.%Y %H:%M")

df.set_index('Von', inplace=True)

grid_data = df[df['Energie-Erzeuger'] == 'Energie-Graz_Lf_ERZ']

FileNotFoundError: [Errno 2] No such file or directory: 'Energiedaten/EV_AT0081000801000000000000000307612_Mai_2022.csv'

In [22]:
sun_df

Unnamed: 0,year,jan,feb,mar,apr,may,jun,jul,aug,sep,oct,nov,dec,mar-may,jun-aug,sep-nov,dec-feb,apr-sep,oct-mar,jan-dec
0,2022,162,180,237,182,219,237,293,209,158,186,98,76,638,739,442,309,1298,999999,2237
1,2023,72,161,999999,999999,999999,999999,999999,999999,999999,999999,999999,999999,999999,999999,999999,999999,999999,999999,999999
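
The value 999999 in the table above appears to mark missing measurements; this is an assumption, since the source file's documentation is not shown here. Under that assumption, a minimal sketch of turning the marker into `NaN` so that it cannot distort sums or averages (the `sample` frame is a shortened, illustrative copy of the table):

```python
import numpy as np
import pandas as pd

MISSING = 999999  # assumed missing-value marker

# Shortened, illustrative copy of the sunshine table
sample = pd.DataFrame({
    "year": [2022, 2023],
    "jan": [162, 72],
    "feb": [180, 161],
    "mar": [237, 999999],
})

# Replace the marker everywhere with NaN
cleaned = sample.replace(MISSING, np.nan)
print(cleaned)
```

The same effect can be had at load time by passing `na_values=[999999]` to `pd.read_csv`.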


In [12]:
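# Reshape from wide (one column per month) to long (one row per year and month)
sun_df = sun_df.melt(id_vars='year', var_name='month', value_name='sun_hours')
month_map = {'jan': 1, 'feb': 2, 'mar': 3, 'apr': 4, 'may': 5, 'jun': 6, 'jul': 7, 'aug': 8, 'sep': 9, 'oct': 10, 'nov': 11, 'dec': 12}
sun_df.dropna(subset=['month'], inplace=True)
# Keep 2022 only; .copy() avoids chained-assignment warnings on the lines below
sun_df = sun_df[sun_df["year"] != 2023].copy()
# Non-month columns (the index column and seasonal aggregates) map to NaN and are dropped
sun_df['month'] = sun_df['month'].map(month_map)
sun_df.dropna(subset=['month'], inplace=True)
sun_df['date'] = pd.to_datetime(sun_df['year'].astype(str) + '-' + sun_df['month'].astype(int).astype(str) + '-01')

# Attach the monthly sun hours to each reading.
# Merging on keys replaces the 'Von' index with a default integer index.
df = df.merge(sun_df, left_on=df.index.to_period('M'), right_on=sun_df['date'].dt.to_period('M'), how='left')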


In [21]:
df["Kwh"].max()

1.5525

## Data Analysis
Initial decision on which models/algorithms are going to be used as potential candidates for the final model. Elaborate on why these models were chosen.

In [64]:
#LSTM



## Data Preprocessing
Prepare the data for modeling: resample the grid consumption to hourly totals and scale it to the range [0, 1].

In [24]:
from sklearn.preprocessing import MinMaxScaler

grid_data_hourly = grid_data['Kwh'].resample('H').sum()

# Reshape data for scaling
data = np.array(grid_data_hourly).reshape(-1, 1)

scaler = MinMaxScaler(feature_range=(0, 1))
scaled_data = scaler.fit_transform(data)
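
Fitting the scaler on the full series lets the minimum and maximum of the later test period influence the training data. A minimal sketch of the alternative, assuming the same chronological 80/20 split that is used further down: fit on the training part only and reuse that fit for the test part. The names `values` and `scaler_demo` and the numbers are illustrative.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Illustrative hourly values (kWh), not taken from the data set
values = np.array([0.0, 0.2, 0.4, 0.1, 0.3, 0.8, 0.5, 0.6, 1.2, 0.9]).reshape(-1, 1)

# Chronological split first, scaling second
split = int(len(values) * 0.80)
train_raw, test_raw = values[:split], values[split:]

scaler_demo = MinMaxScaler(feature_range=(0, 1))
train_scaled = scaler_demo.fit_transform(train_raw)  # min/max learned from training data only
test_scaled = scaler_demo.transform(test_raw)        # same min/max applied to test data

print(train_scaled.max(), test_scaled.max())
```

Test values may then fall outside [0, 1], which is expected: it reflects consumption levels the model did not see during training.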

## Training and Building Models
Use at least two different models: one from deep learning (LSTM) and one from classical ML (an ensemble method).
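
For the classical ensemble candidate, a minimal sketch using scikit-learn's `RandomForestRegressor` on lagged values could look like this. It is an assumption about how the second model might be set up, not part of the original notebook; the synthetic `series` stands in for the hourly grid consumption, and `look_back_rf` is a hypothetical name for the number of lags.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for hourly consumption: a daily cycle plus noise
rng = np.random.default_rng(0)
hours = np.arange(24 * 20)
series = 0.5 + 0.4 * np.sin(2 * np.pi * hours / 24) + 0.05 * rng.standard_normal(hours.size)

# Each row holds the previous `look_back_rf` hours; the target is the next hour
look_back_rf = 10
X_rf = np.array([series[i:i + look_back_rf] for i in range(len(series) - look_back_rf)])
y_rf = series[look_back_rf:]

# Chronological 80/20 split, no shuffling
split = int(len(X_rf) * 0.80)
rf = RandomForestRegressor(n_estimators=100, random_state=0)
rf.fit(X_rf[:split], y_rf[:split])

rf_pred = rf.predict(X_rf[split:])
rf_rmse = float(np.sqrt(np.mean((y_rf[split:] - rf_pred) ** 2)))
print(f"Random forest test RMSE: {rf_rmse:.3f}")
```

Tree ensembles do not need scaled inputs, so the raw kWh values can be used directly and the RMSE is already in the original unit.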

In [25]:
train_size = int(len(scaled_data) * 0.80)
test_size = len(scaled_data) - train_size

train, test = scaled_data[0:train_size,:], scaled_data[train_size:len(scaled_data),:]

def create_dataset(dataset, look_back=1):
    X, Y = [], []
    for i in range(len(dataset) - look_back - 1):
        a = dataset[i:(i + look_back), 0]
        X.append(a)
        Y.append(dataset[i + look_back, 0])
    return np.array(X), np.array(Y)

look_back = 10
X_train, Y_train = create_dataset(train, look_back)
X_test, Y_test = create_dataset(test, look_back)

# reshape input to [samples, time steps, features]; here: 1 time step with look_back features
X_train = np.reshape(X_train, (X_train.shape[0], 1, X_train.shape[1]))
X_test = np.reshape(X_test, (X_test.shape[0], 1, X_test.shape[1]))
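
To make the windowing concrete, here is a small worked example on a toy series. The helper `make_windows` is hypothetical; unlike `create_dataset` above, its loop bound omits the extra `- 1`, so it also keeps the last available window.

```python
import numpy as np

def make_windows(series, look_back):
    """Each run of `look_back` values is one input; the value after it is the target."""
    X, y = [], []
    for i in range(len(series) - look_back):
        X.append(series[i:i + look_back])
        y.append(series[i + look_back])
    return np.array(X), np.array(y)

toy = np.arange(6, dtype=float)  # 0, 1, 2, 3, 4, 5
X_toy, y_toy = make_windows(toy, look_back=3)

print(X_toy)  # rows: [0 1 2], [1 2 3], [2 3 4]
print(y_toy)  # targets: 3, 4, 5
```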


In [26]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM

model = Sequential()
model.add(LSTM(4, input_shape=(1, look_back)))
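# Dense(20) emits 20 values per sample although each target is a single value;
# the evaluation below reads only column 0. Dense(1) is the conventional choice.
model.add(Dense(20))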
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X_train, Y_train, epochs=150, batch_size=1, verbose=2)


Epoch 1/150
584/584 - 1s - loss: 0.0164 - 715ms/epoch - 1ms/step
Epoch 2/150
584/584 - 0s - loss: 0.0149 - 225ms/epoch - 386us/step
Epoch 3/150
584/584 - 0s - loss: 0.0116 - 217ms/epoch - 372us/step
Epoch 4/150
584/584 - 0s - loss: 0.0079 - 218ms/epoch - 374us/step
Epoch 5/150
584/584 - 0s - loss: 0.0058 - 219ms/epoch - 376us/step
Epoch 6/150
584/584 - 0s - loss: 0.0046 - 220ms/epoch - 376us/step
Epoch 7/150
584/584 - 0s - loss: 0.0039 - 217ms/epoch - 372us/step
Epoch 8/150
584/584 - 0s - loss: 0.0035 - 216ms/epoch - 370us/step
Epoch 9/150
584/584 - 0s - loss: 0.0033 - 218ms/epoch - 374us/step
Epoch 10/150
584/584 - 0s - loss: 0.0032 - 224ms/epoch - 383us/step
Epoch 11/150
584/584 - 0s - loss: 0.0030 - 217ms/epoch - 372us/step
Epoch 12/150
584/584 - 0s - loss: 0.0030 - 218ms/epoch - 373us/step
Epoch 13/150
584/584 - 0s - loss: 0.0029 - 217ms/epoch - 372us/step
Epoch 14/150
584/584 - 0s - loss: 0.0028 - 218ms/epoch - 373us/step
Epoch 15/150
584/584 - 0s - loss: 0.0028 - 217ms/epoch - 37...
...
Epoch 122/150
584/584 - 0s - loss: 0.0023 - 217ms/epoch - 372us/step
Epoch 123/150
584/584 - 0s - loss: 0.0024 - 216ms/epoch - 370us/step
Epoch 124/150
584/584 - 0s - loss: 0.0023 - 216ms/epoch - 370us/step
Epoch 125/150
584/584 - 0s - loss: 0.0023 - 224ms/epoch - 383us/step
Epoch 126/150
584/584 - 0s - loss: 0.0023 - 219ms/epoch - 374us/step
Epoch 127/150
584/584 - 0s - loss: 0.0023 - 217ms/epoch - 371us/step
Epoch 128/150
584/584 - 0s - loss: 0.0023 - 219ms/epoch - 376us/step
Epoch 129/150
584/584 - 0s - loss: 0.0023 - 231ms/epoch - 396us/step
Epoch 130/150
584/584 - 0s - loss: 0.0023 - 219ms/epoch - 376us/step
Epoch 131/150
584/584 - 0s - loss: 0.0023 - 217ms/epoch - 372us/step
Epoch 132/150
584/584 - 0s - loss: 0.0023 - 220ms/epoch - 377us/step
Epoch 133/150
584/584 - 0s - loss: 0.0023 - 229ms/epoch - 391us/step
Epoch 134/150
584/584 - 0s - loss: 0.0023 - 219ms/epoch - 376us/step
Epoch 135/150
584/584 - 0s - loss: 0.0023 - 217ms/epoch - 371us/step
Epoch 136/150
584/584 - 0s - loss: ...

<keras.src.callbacks.History at 0x28c80ee50>

## Evaluation and selection of the model
Here, multiple models should be evaluated and one finally selected.


In [27]:
# Make predictions
train_predict = model.predict(X_train)
test_predict = model.predict(X_test)

# invert predictions back to original scale
train_predict = scaler.inverse_transform(train_predict)
Y_train = scaler.inverse_transform([Y_train])
test_predict = scaler.inverse_transform(test_predict)
Y_test = scaler.inverse_transform([Y_test])


import math
from sklearn.metrics import mean_squared_error

train_score = math.sqrt(mean_squared_error(Y_train[0], train_predict[:,0]))
print('Train Score: %.2f RMSE' % (train_score))

test_score = math.sqrt(mean_squared_error(Y_test[0], test_predict[:,0]))
print('Test Score: %.2f RMSE' % (test_score))


Train Score: 0.26 RMSE
Test Score: 0.44 RMSE
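
An RMSE figure is easier to judge next to a naive reference. One common choice is a persistence forecast, which predicts that each hour equals the previous one. The sketch below uses illustrative numbers and the hypothetical helper `rmse`; it is not a result from this notebook.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Illustrative hourly consumption (kWh)
actual = np.array([0.0, 0.5, 1.0, 0.5, 0.0])

# Persistence: the forecast for hour t is the observed value at hour t-1
persistence = actual[:-1]
persistence_rmse = rmse(actual[1:], persistence)
print(f"Persistence RMSE: {persistence_rmse:.2f}")
```

A candidate model is only worth selecting if its test RMSE is clearly below that of the persistence forecast computed on the same test period.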


## Conclusion and reflection
Please reflect on the project and provide the conclusion

## Test the model with some data
Create your own new data representing energy consumption and test the model.
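
A minimal sketch of such a test, assuming the model expects the last `look_back` hourly readings scaled with the fitted scaler and shaped as [samples, time steps, features] as in the training cells. The helper `predict_next_hour`, the stand-in `mean_model`, and the sample readings are hypothetical.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

def predict_next_hour(predict_fn, fitted_scaler, recent_kwh):
    """Scale recent hourly readings, run the model, and return the forecast in kWh."""
    recent = np.asarray(recent_kwh, dtype=float).reshape(-1, 1)
    scaled = fitted_scaler.transform(recent)
    window = scaled.reshape(1, 1, -1)  # 1 sample, 1 time step, look_back features
    # Keep only the first output value, matching the evaluation that reads column 0
    scaled_pred = np.asarray(predict_fn(window)).reshape(-1, 1)[:1]
    return float(fitted_scaler.inverse_transform(scaled_pred)[0, 0])

# Stand-ins so the sketch runs on its own
demo_scaler = MinMaxScaler().fit(np.array([[0.0], [2.0]]))

def mean_model(window):
    """Placeholder for a trained model: predicts the mean of the window."""
    return window.mean(axis=2)

new_readings = [0.0, 1.0] * 5  # ten made-up hourly values in kWh
forecast_kwh = predict_next_hour(mean_model, demo_scaler, new_readings)
print(f"Forecast for the next hour: {forecast_kwh:.2f} kWh")
```

With the Keras model trained above, `predict_fn` would be `model.predict` and `fitted_scaler` the `scaler` from the preprocessing cell.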