#A little bit more about the structure of our Neural Network

tf.keras.layers.LSTM is a type of layer in a neural network that is commonly used for processing sequential data, such as time-series data like stock prices or weather patterns. LSTM stands for Long Short-Term Memory, and it helps the neural network remember important information from past data points and use that information to make predictions about future data points.

units is a parameter that specifies how many LSTM units should be used in the layer. Each LSTM unit can be thought of as a small "brain" that helps the neural network process the sequential data.

return_sequences is a parameter that specifies whether or not the LSTM layer should return sequences of data as output. In some cases, it may be helpful to return sequences if you want to stack multiple LSTM layers together.

input_shape is a parameter that specifies the shape of the input data to the layer. In this case, we're telling the layer that the input data will have X_train.shape[1] time steps (or "windows") and 1 feature per time step.

tf.keras.layers.Dropout is a layer that helps prevent overfitting in the neural network by randomly dropping out (or "turning off") some of the units in the layer during training. This forces the other units to learn more robust features and prevents the network from memorizing the training data too closely.

tf.keras.layers.Dense is a standard fully connected layer in a neural network. It takes in the output from the previous LSTM layer and produces a single output value as the final prediction.

#Key Points



*   Our Model works on predicting whats going to happen the next day (but can be changed by changing the window size 24 = 1 Day)

*   Our Model is versatile as proven above it can work with predicting multiple factors 



In [1]:
! pip install opendatasets
import opendatasets as od
od.download('https://www.kaggle.com/datasets/sbonelondhlazi/sa-electricity-historical-data/download?datasetVersionNumber=1')

Please provide your Kaggle credentials to download this dataset. Learn more: http://bit.ly/kaggle-creds
Your Kaggle username: Your Kaggle username: Your Kaggle username: Your Kaggle Key: Your Kaggle Key: Your Kaggle Key: Your Kaggle Key: 

In [None]:
import pandas as pd 

df = pd.read_csv('/Users/dslearner23/Downloads/EskomData.csv')
df

Unnamed: 0,Date Time Hour Beginning,Original Res Forecast before Lockdown,Residual Forecast,RSA Contracted Forecast,Dispatchable Generation,Residual Demand,RSA Contracted Demand,International Exports,International Imports,Thermal Generation,...,Total RE Installed Capacity,Installed Eskom Capacity,Total PCLF,Total UCLF,Total OCLF,Total UCLF+OCLF,Non Comm Sentout,Drakensberg Gen Unit Hours,Palmiet Gen Unit Hours,Ingula Gen Unit Hours
2018-04-01 12:00:00 AM,,19904.967,20367.066,20237.0,20237.0,20722.058,1215.902,1120.0,19444.0,931.0,...,44546.0,3987.472,8028.710,275.907,8304.0,617.0,383.0,81.8,36.9,30.41
2018-04-01 01:00:00 AM,,19553.899,19988.733,19744.0,19744.0,20188.493,1203.474,1106.0,19297.0,930.0,...,44546.0,3987.472,7727.302,244.907,7972.0,209.0,388.0,83.0,38.5,32.85
2018-04-01 02:00:00 AM,,19314.284,19731.239,19631.0,19631.0,20019.603,1177.571,1117.0,19165.0,931.0,...,44546.0,3987.472,7704.704,193.727,7898.0,431.0,388.0,83.8,40.3,35.60
2018-04-01 03:00:00 AM,,19342.679,19753.554,19731.0,19731.0,20079.454,1184.312,1118.0,19279.0,930.0,...,44546.0,3990.072,7702.868,187.000,7889.0,868.0,389.0,85.0,42.0,37.76
2018-04-01 04:00:00 AM,,19538.890,19988.365,19890.0,19890.0,20237.490,1197.271,1108.0,19369.0,930.0,...,44546.0,3990.472,7685.115,187.000,7872.0,115.0,385.0,85.8,43.7,40.32
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2023-03-31 07:00:00 PM,,27746.667,29314.632,,,,,,,,...,,,,,,,,,,
2023-03-31 08:00:00 PM,,26101.656,27605.215,,,,,,,,...,,,,,,,,,,
2023-03-31 09:00:00 PM,,24646.570,26030.785,,,,,,,,...,,,,,,,,,,
2023-03-31 10:00:00 PM,,23384.907,24597.115,,,,,,,,...,,,,,,,,,,


#Option 1:

Predictive Maintenance: Implement predictive maintenance using machine learning algorithms to optimize energy usage and avoid unplanned downtime of power plants, transformers, and other energy infrastructure. By predicting potential equipment failures 

In [None]:
import tensorflow as tf
import pandas as pd
import numpy as np
from sklearn.preprocessing import MinMaxScaler

data = df

# Preprocess data
data = data.dropna()
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_data = scaler.fit_transform(data)

# Prepare data for LSTM
window_size = 24 # number of hours in a day
X, Y = [], []
for i in range(window_size, len(scaled_data)):
    X.append(scaled_data[i-window_size:i, :])
    Y.append(scaled_data[i, 0])
X, Y = np.array(X), np.array(Y)

# Split data into training and testing sets
train_size = int(len(X) * 0.7)
X_train, Y_train = X[:train_size], Y[:train_size]
X_test, Y_test = X[train_size:], Y[train_size:]

# Build LSTM model
model = tf.keras.models.Sequential([
    tf.keras.layers.LSTM(units=50, return_sequences=True, input_shape=(X_train.shape[1], X_train.shape[2])),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.LSTM(units=50, return_sequences=True),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.LSTM(units=50),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(units=42) # For all columns
])

# Compile model
model.compile(optimizer="adam", loss="mean_squared_error")

# Train model
model.fit(X_train, Y_train, epochs=20, batch_size=32)



Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0x7f61deed6490>

In [None]:
# Evaluate model
loss = model.evaluate(X_test, Y_test)
print("Test loss:", loss)


Test loss: 0.0029276800341904163


#Option 2

Renewable Energy Optimization: Use LSTM to optimize renewable energy sources such as solar panels and wind turbines based on data from the "Total RE Installed Capacity" column. LSTM can help predict weather patterns and adjust the performance of renewable energy sources accordingly.

In [None]:
import tensorflow as tf
import pandas as pd
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Load data from csv file
data = pd.read_csv("sa-electricity-historical-data/ESK2033.csv", usecols=["Total RE Installed Capacity"])

# Preprocess data
data = data.dropna()
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_data = scaler.fit_transform(data)

# Prepare data for LSTM
window_size = 24 # number of hours in a day
X, Y = [], []
for i in range(window_size, len(scaled_data)):
    X.append(scaled_data[i-window_size:i, 0])
    Y.append(scaled_data[i, 0])
X, Y = np.array(X), np.array(Y)
X = np.reshape(X, (X.shape[0], X.shape[1], 1))

# Split data into training and testing sets
train_size = int(len(X) * 0.7)
X_train, Y_train = X[:train_size], Y[:train_size]
X_test, Y_test = X[train_size:], Y[train_size:]

model = tf.keras.models.Sequential([
tf.keras.layers.LSTM(units=50, return_sequences=True, input_shape=(X_train.shape[1], 1)),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.LSTM(units=50, return_sequences=True),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.LSTM(units=50),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.Dense(units=1)
])

# Compile model
model.compile(optimizer="adam", loss="mean_squared_error")

model.fit(X_train, Y_train, epochs=5, batch_size=32)


Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7f617823c220>

In [None]:
# Evaluate model
loss = model.evaluate(X_test, Y_test)
print("Test loss:", loss)

Test loss: 0.0004819944442715496


#Option 3:

Smart Grid Implementation: Use LSTM to monitor energy usage in real-time using data from the "Residual Demand" and "RSA Contracted Demand" columns. By training an LSTM model on past energy demand and supply data, you can predict future demand and supply and adjust energy supply accordingly.

In [None]:
import tensorflow as tf
import pandas as pd
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Load data from csv file
data = pd.read_csv("sa-electricity-historical-data/ESK2033.csv", usecols=[ "Residual Demand","RSA Contracted Demand"])

# Preprocess data
data = data.dropna()
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_data = scaler.fit_transform(data)

# Prepare data for LSTM
window_size = 24 # number of hours in a day
X, Y = [], []
for i in range(window_size, len(scaled_data)):
    X.append(scaled_data[i-window_size:i, 0])
    Y.append(scaled_data[i, 0])
X, Y = np.array(X), np.array(Y)
X = np.reshape(X, (X.shape[0], X.shape[1], 1))

# Split data into training and testing sets
train_size = int(len(X) * 0.7)
X_train, Y_train = X[:train_size], Y[:train_size]
X_test, Y_test = X[train_size:], Y[train_size:]

model = tf.keras.models.Sequential([
tf.keras.layers.LSTM(units=50, return_sequences=True, input_shape=(X_train.shape[1], 1)),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.LSTM(units=50, return_sequences=True),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.LSTM(units=50),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.Dense(units=2) # Changed to 2 cause its 2 outputs for our 2 Demand Factors
])

# Compile model
model.compile(optimizer="adam", loss="mean_squared_error")

model.fit(X_train, Y_train, epochs=10, batch_size=32)


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f61de65fa30>

In [None]:
# Evaluate model
loss = model.evaluate(X_test, Y_test)
print("Test loss:", loss)

Test loss: 0.0007812557741999626


#Option 4

Thermal Generation Optimization: You can use LSTM to optimize thermal generation sources based on data from the "Thermal Generation" column. LSTM can help predict future energy demand and adjust the performance of thermal generation sources accordingly.

In [None]:
import tensorflow as tf
import pandas as pd
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Load data from csv file
data = pd.read_csv("sa-electricity-historical-data/ESK2033.csv", usecols=[ "Thermal Generation"])

# Preprocess data
data = data.dropna()
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_data = scaler.fit_transform(data)

# Prepare data for LSTM
window_size = 24 # number of hours in a day
X, Y = [], []
for i in range(window_size, len(scaled_data)):
    X.append(scaled_data[i-window_size:i, 0])
    Y.append(scaled_data[i, 0])
X, Y = np.array(X), np.array(Y)
X = np.reshape(X, (X.shape[0], X.shape[1], 1))

# Split data into training and testing sets
train_size = int(len(X) * 0.7)
X_train, Y_train = X[:train_size], Y[:train_size]
X_test, Y_test = X[train_size:], Y[train_size:]

model = tf.keras.models.Sequential([
tf.keras.layers.LSTM(units=50, return_sequences=True, input_shape=(X_train.shape[1], 1)),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.LSTM(units=50, return_sequences=True),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.LSTM(units=50),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.Dense(units=1)
])

# Compile model
model.compile(optimizer="adam", loss="mean_squared_error")

model.fit(X_train, Y_train, epochs=5, batch_size=32)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7f6178464c40>

In [None]:

predictions = model.predict(X_test)
rmse = np.sqrt(np.mean((predictions.flatten() - Y_test.flatten())**2))
print("Root Mean Squared Error:", rmse)

Root Mean Squared Error: 0.017496673583946357


In [None]:
model.save('thermal_prediction.h5')

#API Integration



*   API's are one of the fastest way to get data so by turning our model into a API it will be able to help us track our thermal predictions easily and be able to easily integrate the model into an app 
*   This will also help ESKOM as an entity to make their job predicting easy


*   This is one of the many ways we can implement our machine learning models into the structure of ESKOM in order to mitigate loadshedding and other factors that are affecting the organisation





In [None]:
! pip install fastapi gunicorn unicorn

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting fastapi
  Downloading fastapi-0.95.0-py3-none-any.whl (57 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m57.1/57.1 KB[0m [31m3.2 MB/s[0m eta [36m0:00:00[0m
[?25hCollecting gunicorn
  Downloading gunicorn-20.1.0-py3-none-any.whl (79 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m79.5/79.5 KB[0m [31m12.3 MB/s[0m eta [36m0:00:00[0m
[?25hCollecting unicorn
  Downloading unicorn-2.0.1.post1-py2.py3-none-manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (16.1 MB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m16.1/16.1 MB[0m [31m77.2 MB/s[0m eta [36m0:00:00[0m
Collecting starlette<0.27.0,>=0.26.1
  Downloading starlette-0.26.1-py3-none-any.whl (66 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m66.9/66.9 KB[0m [31m8.7 MB/s[0m eta [36m0:00:00[0m
Collecting anyio<5,>=3.4.

In [None]:
from fastapi import FastAPI
from pydantic import BaseModel
import tensorflow as tf
import numpy as np

app = FastAPI()

# Load trained model
model = tf.keras.models.load_model('thermal_prediction.h5')

# Define request body model
class RequestBody(BaseModel):
    data: list

# Define prediction function
def make_predictions(data):
    # Preprocess data as needed (e.g. scaling, reshaping)
    scaled_data = scaler.transform(data)
    X = np.array([scaled_data])
    # Make predictions using loaded model
    predictions = model.predict(X)
    # Postprocess predictions as needed (e.g. inverse scaling)
    return predictions

# Define route to handle incoming requests
@app.post('/predict')
def predict(request: RequestBody):
    # Get request data
    data = request.data
    # Make predictions using defined function
    predictions = make_predictions(data)
    # Return predictions as JSON response
    return {'predictions': predictions.tolist()}




In [None]:
import requests
import json

# Define endpoint URL
endpoint_url = 'http://localhost:8000/predict'

# Define input data for prediction
input_data = [1, 2, 3, 4, 5]

# Define request body
request_body = {'data': input_data}

# Send POST request to endpoint with request body
response = requests.post(endpoint_url, json=request_body)

# Print response text and content
print(response.text)
print(response.content)

# Parse JSON response and extract predictions
response_data = json.loads(response.text)
predictions = response_data['predictions']

# Print predictions
print(predictions)

ConnectionError: ignored