# Deep Learning Notebook
### MGTA 611 Business Applications of Artificial Intelligence
### Sarah Mansoor
### March 27, 2023

### Table of Contents
* Data preparation
* Simple Neural Network
* Recurrent Neural Network (RNN)
* LSTM Neural Network
* Convolutional Neural Network (CNN)
* Traditional vs Deep Learning solutions

The notebook starts by preparing the data: the MISO offers data is loaded from BigQuery, then cleaned by renaming columns and removing duplicates and outliers. The cleaned data is split into training and test sets and passed to several models: a simple neural network, a recurrent neural network (RNN), an LSTM neural network, and a convolutional neural network (CNN). The notebook concludes with a comparison of traditional data science methods and deep learning solutions.

## Data Preparation
### Loading Data from BigQuery to Notebook

In [1]:
# Import bigquery from the google cloud library
# Import service account from the google oauth2 library
from google.cloud import bigquery
from google.oauth2 import service_account

In [2]:
# Initialize credentials using the service account json key file
credentials = service_account.Credentials.from_service_account_file(
    'misoelect-16349cd8bba4.json')
project_id = 'misoelect'
# Initialize the BigQuery client with the credentials and project ID
client = bigquery.Client(credentials=credentials, project=project_id)

#### MISO Offers Table

The first step is to load the MISO Offers table from BigQuery. I limited the query to 1,555,000 rows because the models would take too long to run on every 2016 record.

In [100]:
# Load the miso_offers table from the misodb in misoelect's bigquery
miso_offers = "misoelect.misodb.miso_offers"
# Query for the table using only the year 2016
query_job = client.query("""
SELECT * 
FROM `misoelect.misodb.miso_offers`
WHERE
((Region = 'Central') OR (Region = 'South') OR (Region = 'North'))
AND
(BeginningTimeEST BETWEEN '2016-01-01 00:00:00' AND '2016-12-31 12:00:00')
LIMIT 1555000
""")
# Wait for the job to complete.
results = query_job.result() 
# Create dataframe from results
miso_offers_df = results.to_dataframe()

In [101]:
miso_offers_df.head()

Unnamed: 0,Region,OwnerCode,UnitCode,UnitType,BeginningTimeEST,EndTimeEST,EconomicMax,EconomicMin,EmergencyMax,EmergencyMin,...,MW6,Price7,MW7,Price8,MW8,Price9,MW9,Price10,MW10,Slope
0,Central,122062517,2968,4,2016-06-03 01:00:00,2016-06-03 02:00:00,495.0,255.0,495.0,255.0,...,343.0,22.25,373.0,22.62,411.0,22.98,448.0,23.49,501.0,0
1,Central,122062517,2968,4,2016-06-03 03:00:00,2016-06-03 04:00:00,495.0,255.0,495.0,255.0,...,343.0,22.25,373.0,22.62,411.0,22.98,448.0,23.49,501.0,0
2,Central,122062517,2968,4,2016-06-03 10:00:00,2016-06-03 11:00:00,495.0,255.0,495.0,255.0,...,343.0,22.25,373.0,22.62,411.0,22.98,448.0,23.49,501.0,0
3,Central,122062517,2968,4,2016-06-03 12:00:00,2016-06-03 13:00:00,495.0,255.0,495.0,255.0,...,343.0,22.25,373.0,22.62,411.0,22.98,448.0,23.49,501.0,0
4,Central,122062517,2968,4,2016-06-03 13:00:00,2016-06-03 14:00:00,495.0,255.0,495.0,255.0,...,343.0,22.25,373.0,22.62,411.0,22.98,448.0,23.49,501.0,0


### Cleaning Data

To clean the data, I will begin by renaming the columns to improve their consistency and clarity. Following this, I will remove several columns such as CurtailmentOfferPrice and TargetMWReduction, as they contain only NA values. I will also remove UnitType, OwnerCode, Economic Flag, Unit Available flag, Slope, and Emergency Flag, as these are no longer relevant according to the Miso Offers report. Next, I will remove any rows containing NA values and set the Beginning time as the index for time series modeling. Finally, I will eliminate any outliers in the data.

In [108]:
miso_offers_df.rename(columns={'Price1': 'Price1Offers', 'MW1':'MW1Offers', 
                               'Price2': 'Price2Offers', 'MW2':'MW2Offers', 
                               'Price3': 'Price3Offers', 'MW3':'MW3Offers',  
                               'Price4': 'Price4Offers', 'MW4':'MW4Offers', 
                               'Price5': 'Price5Offers', 'MW5':'MW5Offers', 
                               'Price6': 'Price6Offers', 'MW6':'MW6Offers', 
                               'Price7': 'Price7Offers', 'MW7':'MW7Offers', 
                               'Price8': 'Price8Offers', 'MW8':'MW8Offers', 
                               'Price9': 'Price9Offers', 'MW9':'MW9Offers',
                               'Price10': 'Price10Offers', 'MW10':'MW10Offers',
                               'LMP': 'LMPOffers', 'MW': 'MWOffers'}, inplace=True)
miso_offers_df

Unnamed: 0,Region,UnitCode,BeginningTimeEST,EndTimeEST,EconomicMax,EconomicMin,EmergencyMax,EmergencyMin,MustRunFlag,SelfScheduledMW,...,Price6Offers,MW6Offers,Price7Offers,MW7Offers,Price8Offers,MW8Offers,Price9Offers,MW9Offers,Price10Offers,MW10Offers
0,1,2968,2016-06-03 01:00:00,2016-06-03 02:00:00,495.0,255.0,495.0,255.0,1,0.0,...,21.96,343.0,22.25,373.0,22.62,411.0,22.98,448.0,23.49,501.0
1,1,2968,2016-06-03 03:00:00,2016-06-03 04:00:00,495.0,255.0,495.0,255.0,1,0.0,...,21.96,343.0,22.25,373.0,22.62,411.0,22.98,448.0,23.49,501.0
2,1,2968,2016-06-03 10:00:00,2016-06-03 11:00:00,495.0,255.0,495.0,255.0,1,0.0,...,21.96,343.0,22.25,373.0,22.62,411.0,22.98,448.0,23.49,501.0
3,1,2968,2016-06-03 12:00:00,2016-06-03 13:00:00,495.0,255.0,495.0,255.0,1,0.0,...,21.96,343.0,22.25,373.0,22.62,411.0,22.98,448.0,23.49,501.0
4,1,2968,2016-06-03 13:00:00,2016-06-03 14:00:00,495.0,255.0,495.0,255.0,1,0.0,...,21.96,343.0,22.25,373.0,22.62,411.0,22.98,448.0,23.49,501.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1554995,1,6118,2016-04-12 16:00:00,2016-04-12 17:00:00,12.6,0.0,12.6,0.0,1,0.0,...,,,,,,,,,,
1554996,1,6118,2016-04-12 19:00:00,2016-04-12 20:00:00,47.2,0.0,47.2,0.0,1,0.0,...,,,,,,,,,,
1554997,1,6118,2016-04-12 20:00:00,2016-04-12 21:00:00,61.6,0.0,61.6,0.0,1,0.0,...,,,,,,,,,,
1554998,1,6118,2016-04-12 21:00:00,2016-04-12 22:00:00,70.1,0.0,70.1,0.0,1,0.0,...,,,,,,,,,,


In [102]:
# Import pandas library
import pandas as pd

In [103]:
# remove CurtailmentOfferPrice, TargetMWReduction
# remove UnitType, OwnerCode, Economic Flag, Unit Available flag, Slope,
# and Emergency Flag 
miso_offers_df = miso_offers_df.drop(columns=['CurtailmentOfferPrice', 'TargetMWReduction', 
                                              'UnitType', 'OwnerCode', 'EconomicFlag', 
                                              'UnitAvailableFlag', 'EmergencyFlag', 'Slope'])

In [109]:
# drop NAs
miso_offers_df = miso_offers_df.dropna(subset=['EconomicMax', 'EconomicMin', 
                                               'EmergencyMax', 'EmergencyMin',
                                               'Price1Offers', 'MW1Offers', 'Price2Offers', 'MW2Offers', 
                                               'Price3Offers', 'MW3Offers', 'Price4Offers', 'MW4Offers', 
                                               'Price5Offers', 'MW5Offers', 'Price6Offers', 'MW6Offers',
                                               'Price7Offers', 'MW7Offers', 'Price8Offers', 'MW8Offers', 
                                               'Price9Offers', 'MW9Offers', 'Price10Offers', 'MW10Offers', 
                                               'LMPOffers', 'MWOffers', 'SelfScheduledMW',
                                               'MustRunFlag', 'UnitCode'])

In [110]:
# remove duplicates
miso_offers_df = miso_offers_df.drop_duplicates()

In [111]:
# Convert Region to Categorical
import pandas as pd

# Convert the Region column to categorical
miso_offers_df['Region'] = pd.Categorical(miso_offers_df['Region'], 
                                          categories=['Central', 'South', 'North'], ordered=True)
miso_offers_df['Region'] = miso_offers_df['Region'].cat.codes + 1

In [112]:
miso_offers_df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 89747 entries, 0 to 1554991
Data columns (total 32 columns):
 #   Column            Non-Null Count  Dtype         
---  ------            --------------  -----         
 0   Region            89747 non-null  int8          
 1   UnitCode          89747 non-null  object        
 2   BeginningTimeEST  89747 non-null  datetime64[ns]
 3   EndTimeEST        89747 non-null  datetime64[ns]
 4   EconomicMax       89747 non-null  float64       
 5   EconomicMin       89747 non-null  float64       
 6   EmergencyMax      89747 non-null  float64       
 7   EmergencyMin      89747 non-null  float64       
 8   MustRunFlag       89747 non-null  Int64         
 9   SelfScheduledMW   89747 non-null  float64       
 10  MWOffers          89747 non-null  float64       
 11  LMPOffers         89747 non-null  float64       
 12  Price1Offers      89747 non-null  float64       
 13  MW1Offers         89747 non-null  float64       
 14  Price2Offers      89

In [None]:
# Convert the "BeginningTimeEST" column to a datetime type
miso_offers_df["BeginningTimeEST"] = pd.to_datetime(miso_offers_df["BeginningTimeEST"])

# Get the minimum and maximum date from the column
min_date = miso_offers_df["BeginningTimeEST"].min()
max_date = miso_offers_df["BeginningTimeEST"].max()

# Print the date range
print("Date range: {} to {}".format(min_date.date(), max_date.date()))

#### Set date-time as index

In [113]:
miso_offers_df.set_index('BeginningTimeEST', inplace=True)

In [173]:
miso_offers_df.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 89747 entries, 2016-06-03 01:00:00 to 2016-04-12 21:00:00
Data columns (total 31 columns):
 #   Column           Non-Null Count  Dtype         
---  ------           --------------  -----         
 0   Region           89747 non-null  int8          
 1   UnitCode         89747 non-null  object        
 2   EndTimeEST       89747 non-null  datetime64[ns]
 3   EconomicMax      89747 non-null  float64       
 4   EconomicMin      89747 non-null  float64       
 5   EmergencyMax     89747 non-null  float64       
 6   EmergencyMin     89747 non-null  float64       
 7   MustRunFlag      89747 non-null  Int64         
 8   SelfScheduledMW  89747 non-null  float64       
 9   MWOffers         89747 non-null  float64       
 10  LMPOffers        89747 non-null  float64       
 11  Price1Offers     89747 non-null  float64       
 12  MW1Offers        89747 non-null  float64       
 13  Price2Offers     89747 non-null  float64       
 14  MW2

#### Remove Outliers

In [158]:
from sklearn.ensemble import IsolationForest

miso_offers_if = miso_offers_df[['EconomicMax', 'EconomicMin', 'Region',
                                               'EmergencyMax', 'EmergencyMin',
                                               'Price1Offers', 'MW1Offers', 'Price2Offers', 'MW2Offers', 
                                               'Price3Offers', 'MW3Offers', 'Price4Offers', 'MW4Offers', 
                                               'Price5Offers', 'MW5Offers', 'Price6Offers', 'MW6Offers',
                                               'Price7Offers', 'MW7Offers', 'Price8Offers', 'MW8Offers', 
                                               'Price9Offers', 'MW9Offers', 'Price10Offers', 'MW10Offers', 
                                               'LMPOffers', 'MWOffers', 'SelfScheduledMW',
                                               'MustRunFlag', 'UnitCode']]

#Train an Isolation Forest
model = IsolationForest(contamination = 0.1)
model.fit(miso_offers_if)

#Predict anomalies
predictions = model.predict(miso_offers_if)

# Get the indices of the rows with positive value in predictions
positive_indices = [i for i, x in enumerate(predictions) if x>0]

# Subset the dataframe using the indices
df_trimmed = miso_offers_if.iloc[positive_indices]

miso_offers_if.shape



(89747, 30)

In [159]:
df_trimmed.shape

(80773, 30)

A total of 8,974 rows were identified as outliers in the data through the use of Isolation Forests and were subsequently removed.
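
As an illustration of the step above, here is a minimal sketch of Isolation Forest filtering on synthetic data. The array `X`, the mix of typical and extreme rows, and the fixed `random_state` are assumptions made for the example and are not part of the notebook. A boolean mask is used instead of a list of positions.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Synthetic stand-in for the offers table: 190 typical rows plus 10 extreme rows.
typical = rng.normal(loc=0.0, scale=1.0, size=(190, 3))
extreme = rng.normal(loc=8.0, scale=1.0, size=(10, 3))
X = np.vstack([typical, extreme])

# contamination=0.1 asks the forest to flag roughly 10% of rows as outliers.
# random_state is fixed here only to make the sketch reproducible.
forest = IsolationForest(contamination=0.1, random_state=0)
labels = forest.fit_predict(X)  # +1 for inliers, -1 for outliers

# Keep the inliers with a boolean mask.
inlier_mask = labels == 1
X_trimmed = X[inlier_mask]

print(X.shape, X_trimmed.shape)
```

Because `contamination` fixes the share of rows flagged, about 10% of the rows are removed whether or not they are truly anomalous, which is consistent with 8,974 of 89,747 rows being dropped above.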

#### Splitting Dataset

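In this section, I will split the dataset into a training set and a test set and scale both with a Min-Max scaler. The split is positional: the first 60% of the rows form the training set and the remaining 40% form the test set. The scaler is fitted on the training set only and then applied to the test set.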

In [115]:
from sklearn import datasets, linear_model
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
import numpy as np

In [189]:
# Split the outlier-free data (df_trimmed) into training and test sets
train_size = int(len(df_trimmed) * 0.6)
train, test = df_trimmed.iloc[:train_size], df_trimmed.iloc[train_size:]

# Scale the data as a numpy array
scaler = MinMaxScaler()
train_sc = scaler.fit_transform(train)
test_sc = scaler.transform(test)

# Convert the scaled data back to a DataFrame with datetime index
train_sc_df = pd.DataFrame(train_sc, columns=train.columns, index=train.index)
test_sc_df = pd.DataFrame(test_sc, columns=test.columns, index=test.index)


In [191]:
train_sc_df.head

<bound method NDFrame.head of                      EconomicMax  EconomicMin  Region  EmergencyMax  \
BeginningTimeEST                                                      
2016-06-03 01:00:00     0.793269     0.415987     0.0      0.760369   
2016-06-03 03:00:00     0.793269     0.415987     0.0      0.760369   
2016-06-03 10:00:00     0.793269     0.415987     0.0      0.760369   
2016-06-03 12:00:00     0.793269     0.415987     0.0      0.760369   
2016-06-03 13:00:00     0.793269     0.415987     0.0      0.760369   
...                          ...          ...     ...           ...   
2016-12-26 19:00:00     0.145833     0.089723     0.0      0.152074   
2016-12-26 02:00:00     0.145833     0.089723     0.0      0.152074   
2016-12-26 03:00:00     0.145833     0.089723     0.0      0.152074   
2016-12-26 07:00:00     0.145833     0.089723     0.0      0.152074   
2016-12-26 08:00:00     0.145833     0.089723     0.0      0.152074   

                     EmergencyMin  Price1Offer

In [222]:
# Here I am splitting the scaled dataset into X and Y dataframes
X_train = train_sc_df[['Region',
                                'EconomicMax','EconomicMin','EmergencyMax',
                                'EmergencyMin','MustRunFlag','SelfScheduledMW','MWOffers',
                                'Price1Offers','MW1Offers','Price2Offers','MW2Offers','Price3Offers','MW3Offers','Price4Offers',
                                'MW4Offers','Price5Offers','MW5Offers','Price6Offers','MW6Offers','Price7Offers','MW7Offers',
                                'Price8Offers','MW8Offers','Price9Offers','MW9Offers','Price10Offers','MW10Offers', 'UnitCode']]
y_train = train_sc_df['LMPOffers']
print(X_train.shape, y_train.shape)

(48463, 29) (48463,)


In [223]:
# Here I am splitting the scaled dataset into X and Y dataframes
X_test = test_sc_df[['Region',
                                'EconomicMax','EconomicMin','EmergencyMax',
                                'EmergencyMin','MustRunFlag','SelfScheduledMW','MWOffers',
                                'Price1Offers','MW1Offers','Price2Offers','MW2Offers','Price3Offers','MW3Offers','Price4Offers',
                                'MW4Offers','Price5Offers','MW5Offers','Price6Offers','MW6Offers','Price7Offers','MW7Offers',
                                'Price8Offers','MW8Offers','Price9Offers','MW9Offers','Price10Offers','MW10Offers', 'UnitCode']]
y_test = test_sc_df['LMPOffers']
print(X_test.shape, y_test.shape)

(32310, 29) (32310,)


The training set has 48,463 rows and the test set has 32,310 rows.
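
The datetime index shown by `miso_offers_df.info()` runs from 2016-06-03 to 2016-04-12, so the rows are not in time order and a positional split is not a chronological one. The sketch below shows one possible way to make the split chronological by sorting the index first. The small `frame` and its column names are hypothetical stand-ins, not the notebook's data.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical frame with a daily index, standing in for the offers data.
index = pd.date_range("2016-01-01", periods=10, freq="D")
frame = pd.DataFrame(
    {"feature": np.arange(10.0), "target": np.arange(10.0) * 2}, index=index
)
frame = frame.sample(frac=1.0, random_state=0)  # scramble the row order

# Sort by timestamp so that a positional split is also a chronological split.
frame = frame.sort_index()

train_size = int(len(frame) * 0.6)
train, test = frame.iloc[:train_size], frame.iloc[train_size:]

# Fit the scaler on the training rows only, then reuse it on the test rows.
scaler = MinMaxScaler()
train_sc = pd.DataFrame(scaler.fit_transform(train), columns=train.columns, index=train.index)
test_sc = pd.DataFrame(scaler.transform(test), columns=test.columns, index=test.index)

print(train.index.max(), "<", test.index.min())
```

Because the scaler only sees the training rows, scaled test values can fall outside the [0, 1] range when the test period contains values beyond the training minimum or maximum.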

#### Models

##### Simple Neural Network Model

In this section, I will enhance the previous neural network model by incorporating more data and increasing the number of epochs. The previous model was trained and tested on 4,327 and 2,886 rows respectively and achieved a training accuracy of 0.337 and a testing accuracy of 0.292. For the models in this notebook, the reported "accuracy" is the R² score computed with `r2_score`. The goal of this model is to improve upon the previous results.

The model uses the ReLU activation function, which is applied to the output of each hidden-layer neuron. ReLU is simple and cheap to compute, and it lets the network learn nonlinear relationships between the input and the output. The weights are updated during training by the Adam optimizer, a stochastic gradient descent method that maintains an adaptive learning rate for each parameter, which makes it efficient for large datasets and complex models. Adam often speeds up training and can improve the overall performance of the network.
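
To make the two ideas concrete, here is a minimal NumPy sketch of ReLU and of a single Adam update, applied to a toy one-parameter problem. The function names `relu` and `adam_step` and the toy objective are illustrative assumptions; Keras implements both internally, and this is not its code.

```python
import numpy as np

def relu(z):
    """ReLU: pass positive values through, clamp negatives to zero."""
    return np.maximum(0.0, z)

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for parameters w given their gradient at step t (t >= 1)."""
    m = beta1 * m + (1.0 - beta1) * grad        # running mean of gradients
    v = beta2 * v + (1.0 - beta2) * grad ** 2   # running mean of squared gradients
    m_hat = m / (1.0 - beta1 ** t)              # bias correction for early steps
    v_hat = v / (1.0 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps) # per-parameter scaled step
    return w, m, v

# Toy problem: minimise f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w = np.array([0.0])
m = np.zeros_like(w)
v = np.zeros_like(w)
for t in range(1, 501):
    grad = 2.0 * (w - 3.0)
    w, m, v = adam_step(w, grad, m, v, t, lr=0.1)

print(relu(np.array([-1.0, 0.0, 2.0])), w)
```

The division by the square root of `v_hat` is what gives each parameter its own effective learning rate: parameters with consistently large gradients take proportionally smaller steps.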

In [179]:
import tensorflow as tf
from tensorflow.keras import models
from tensorflow.keras import layers

def neural_network(X, y):
    # Define the model
    model = models.Sequential()
    # Size the input layer from the data (X_train has 29 feature columns)
    model.add(layers.Dense(50, activation='relu', input_shape=(X.shape[1],)))
    model.add(layers.Dense(32, activation='relu'))
    model.add(layers.Dense(16, activation='relu'))
    model.add(layers.Dense(32, activation='relu'))

    # Output layer
    model.add(layers.Dense(1))

    # Compile the model
    model.compile(loss='mean_squared_error', 
                  optimizer='adam', 
                  metrics=['accuracy'])

    # Fit the model to the data
    history = model.fit(X, y, epochs=200)

    return model

basic_model = neural_network(X_train, y_train)

Epoch 1/200
Epoch 2/200
...
Epoch 200/200


In [180]:
from sklearn.metrics import mean_squared_error, r2_score

# Make predictions
train_preds = basic_model.predict(X_train)
test_preds = basic_model.predict(X_test)

train_r2 = r2_score(y_train, train_preds)
test_r2 = r2_score(y_test, test_preds)

print(f"Simple Neural Network Model Train accuracy: {train_r2}")
print(f"Simple Neural Network Model Test accuracy: {test_r2}")

Simple Neural Network Model Train accuracy: 0.44120630450457465
Simple Neural Network Model Test accuracy: 0.4384352074929001


The simple neural network model in this section achieved a training R² of 0.441 and a testing R² of 0.438 (printed as "accuracy" in the output above). This is a notable improvement over the previous notebook's model, which reported a training accuracy of 0.337 and a testing accuracy of 0.292. The larger dataset may have contributed to this improvement: the training set has 48,463 rows and the test set has 32,310 rows, more than 10 times the amount of data in the previous notebook. This model was also trained for 200 epochs, compared to 150 epochs for the previous model.
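
The score reported above comes from `r2_score`, which measures the share of variance in the target explained by the predictions, not classification accuracy. A small hand-computed sketch of the same formula, on made-up numbers, is shown below; the helper name `r2` is hypothetical.

```python
import numpy as np

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation around the mean
    return 1.0 - ss_res / ss_tot

y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

score = r2(y_true, y_pred)
baseline = r2(y_true, [np.mean(y_true)] * len(y_true))  # always predict the mean

print(round(score, 4), baseline)
```

A model that always predicts the mean of the target scores 0, so a value of about 0.44 means the network explains a little under half of the variance in the scaled LMP values.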

##### RNN

This section applies a Recurrent Neural Network to the data. A Recurrent Neural Network works with sequential data, such as time-series data like in this notebook. Unlike traditional feedforward neural networks, RNNs have loops that allow information to persist from one step of the sequence to the next. This means that RNNs are able to capture temporal dependencies and make predictions based on past events.

An RNN processes input sequences one element at a time and maintains a hidden state that represents the network's "memory" of the previous inputs. At each step, the input is combined with the previous hidden state to produce a new hidden state and an output. The new hidden state is then carried forward to the next step, allowing the network's predictions to depend on the entire input sequence seen so far.
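
The recurrence described above can be sketched in a few lines of NumPy. This is a simplified illustration with randomly initialised weights, not the Keras `SimpleRNN` implementation; the names `simple_rnn_forward`, `W_x`, `W_h`, and `b` are assumptions made for the example.

```python
import numpy as np

def simple_rnn_forward(x_seq, W_x, W_h, b):
    """Run a tanh RNN over a sequence and return the hidden state at every step."""
    h = np.zeros(W_h.shape[0])  # initial hidden state: no memory yet
    states = []
    for x_t in x_seq:
        # Combine the current input with the previous hidden state.
        h = np.tanh(W_x @ x_t + W_h @ h + b)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
n_steps, n_features, n_units = 5, 1, 4

x_seq = rng.normal(size=(n_steps, n_features))
W_x = rng.normal(scale=0.5, size=(n_units, n_features))
W_h = rng.normal(scale=0.5, size=(n_units, n_units))
b = np.zeros(n_units)

states = simple_rnn_forward(x_seq, W_x, W_h, b)
print(states.shape)
```

Only the last row of `states` is passed on when a recurrent layer is used without `return_sequences=True`; with it, every row is passed to the next layer.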

This RNN model includes 1 input SimpleRNN layer, 2 hidden SimpleRNN layers and 1 output layer. 

In [181]:
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.layers import SimpleRNN, Dense, Dropout

def rnn(X_train, y_train, X_test, y_test):
    
    # Define the model
    model = tf.keras.Sequential()
    model.add(layers.SimpleRNN(units=64, return_sequences=True, input_shape=(X_train.shape[1], 1)))
    model.add(layers.SimpleRNN(units=32, return_sequences=True))
    model.add(layers.SimpleRNN(units=16))
    model.add(layers.Dense(units=1))

    # Compile the model
    model.compile(optimizer='adam', loss='mean_squared_error', metrics=['accuracy'])

    # Fit the model to the data
    history = model.fit(X_train, y_train, epochs=200, batch_size=32, validation_data=(X_test, y_test))

    return model, history

In [182]:
rnn_model = rnn(X_train, y_train, X_test, y_test)

Epoch 1/200
Epoch 2/200
...
Epoch 176/200
...

In [183]:
from sklearn.metrics import mean_squared_error, r2_score

# Make predictions
train_preds = rnn_model[0].predict(X_train)
test_preds = rnn_model[0].predict(X_test)

train_r2 = r2_score(y_train, train_preds)
test_r2 = r2_score(y_test, test_preds)

print(f"RNN Model Train accuracy: {train_r2}")
print(f"RNN Model Test accuracy: {test_r2}")

RNN Model Train accuracy: 0.3865505539629056
RNN Model Test accuracy: 0.38378839422502886


This RNN model had a training accuracy of 0.387 and a testing accuracy of 0.384, an improvement over the previous data science notebook, where the training accuracy was 0.337 and the testing accuracy was 0.292. It is, however, below the simple neural network in this notebook (0.441 and 0.438). Note that the model receives each row as a sequence of 29 single-feature steps (`input_shape=(X_train.shape[1], 1)`), so the recurrence runs across the feature columns of one offer rather than across consecutive hours. The result therefore says little about temporal patterns in the data. There is still room for improvement, and the RNN architecture may need to be modified for better performance.

The subsequent RNN model implements a single input layer and a single output layer, utilizing Adam as the optimizer and training for 150 epochs.

In [184]:
import tensorflow as tf
from tensorflow.keras import layers
import numpy as np

def rnn(X_train, y_train, X_test, y_test):

    # Define the model
    model = tf.keras.Sequential()
    model.add(layers.SimpleRNN(units=64, input_shape=(X_train.shape[1], 1)))
    model.add(layers.Dense(units=1))

    # Compile the model
    model.compile(optimizer='adam', loss='mean_squared_error', metrics=['accuracy'])

    # Fit the model to the data
    history = model.fit(X_train, y_train, epochs=150, batch_size=32)

    return model, history

In [185]:
rnn_model2 = rnn(X_train, y_train, X_test, y_test)

Epoch 1/150
Epoch 2/150
...
Epoch 150/150


In [186]:
from sklearn.metrics import mean_squared_error, r2_score

# Make predictions
train_preds = rnn_model2[0].predict(X_train)
test_preds = rnn_model2[0].predict(X_test)

train_r2 = r2_score(y_train, train_preds)
test_r2 = r2_score(y_test, test_preds)

print(f"RNN Model Train accuracy: {train_r2}")
print(f"RNN Model Test accuracy: {test_r2}")

RNN Model Train accuracy: 0.4032004192129188
RNN Model Test accuracy: 0.40122144720200714


This RNN model has a training accuracy of 0.403 and a testing accuracy of 0.401, an improvement over the previous RNN model (0.387 and 0.384). Since this model uses a single recurrent layer, the result suggests that the simpler architecture, with fewer parameters to learn, was easier to train on this data. Both RNN models show nearly identical training and testing scores, so the gain is more likely due to easier optimization than to reduced overfitting.
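
Both RNN models are built with `input_shape=(X_train.shape[1], 1)`, which feeds the feature columns of a single row as the sequence. A different framing, which this notebook does not use, is to build windows of consecutive rows so that the recurrence runs over time. The sketch below assumes the rows are already sorted by time and belong to a single series; the helper name `make_windows` is hypothetical.

```python
import numpy as np

def make_windows(features, target, n_steps):
    """Turn row-wise data into (samples, n_steps, n_features) windows.

    Each window holds n_steps consecutive rows, and its label is the
    target value of the row immediately after the window.
    """
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    X, y = [], []
    for start in range(len(features) - n_steps):
        X.append(features[start:start + n_steps])
        y.append(target[start + n_steps])
    return np.array(X), np.array(y)

# Toy data: 10 time-ordered rows with 2 features each.
features = np.arange(20.0).reshape(10, 2)
target = np.arange(10.0)

X_win, y_win = make_windows(features, target, n_steps=3)
print(X_win.shape, y_win.shape)
```

With this layout a recurrent layer would take `input_shape=(n_steps, n_features)`. For the offers data, windows would likely need to be built per `UnitCode` so that a window never mixes rows from different generating units.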

##### LSTM Model

This section includes an LSTM neural network model with an input LSTM layer of 50 units, a dropout layer, and one output layer. It is compiled with mean squared error loss and the Adam optimizer and run for 200 epochs.

Long Short-Term Memory (LSTM) is a type of recurrent neural network that is designed to handle the vanishing gradient problem in traditional RNNs. LSTM networks are used for sequence prediction problems and have the ability to maintain long-term dependencies in the input data. They achieve this by using a memory cell, which allows the network to selectively remember or forget information based on the input. 
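
The gating mechanism can be illustrated with a single LSTM step in NumPy. This is a simplified sketch: the gate ordering inside the stacked weight matrix `W` is a choice made for this example and does not match the internal layout Keras uses. The demonstration sets the weights to zero and uses biases that open the forget gate and close the input gate, so the cell state is carried through almost unchanged.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step. W has shape (4 * n_units, n_features + n_units)."""
    n = h_prev.shape[0]
    z = W @ np.concatenate([x_t, h_prev]) + b
    f = sigmoid(z[0 * n:1 * n])   # forget gate: how much old memory to keep
    i = sigmoid(z[1 * n:2 * n])   # input gate: how much new information to write
    o = sigmoid(z[2 * n:3 * n])   # output gate: how much memory to expose
    g = np.tanh(z[3 * n:4 * n])   # candidate values to write
    c = f * c_prev + i * g        # updated memory cell
    h = o * np.tanh(c)            # updated hidden state
    return h, c

n_features, n_units = 1, 2
W = np.zeros((4 * n_units, n_features + n_units))
b = np.concatenate([
    np.full(n_units, 10.0),   # forget gate wide open
    np.full(n_units, -10.0),  # input gate nearly closed
    np.zeros(n_units),        # output gate at 0.5
    np.zeros(n_units),        # candidate values at 0
])

c_prev = np.array([1.0, -2.0])
h, c = lstm_step(np.array([0.7]), np.zeros(n_units), c_prev, W, b)
print(c)
```

Because the cell state is updated additively and scaled by a gate close to 1, information (and gradients) can pass across many steps without shrinking the way they do in a plain RNN.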

An LSTM can be useful in predicting the locational marginal price (LMP) because it can capture long-term dependencies and patterns in time-series data. Including a dropout layer in the LSTM model can help prevent overfitting and improve generalization to new data.
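
As a rough illustration of what a dropout layer does, the sketch below implements "inverted" dropout in NumPy: during training a random subset of activations is zeroed and the rest are scaled up, and at inference the input passes through unchanged. The function name `dropout` and its arguments are assumptions for this example, not the Keras API.

```python
import numpy as np

def dropout(x, rate, training, rng):
    """Zero a fraction `rate` of activations during training and rescale the rest."""
    if not training or rate == 0.0:
        return x
    keep = rng.random(x.shape) >= rate  # True for the units that survive
    return x * keep / (1.0 - rate)      # rescale so the expected value is unchanged

rng = np.random.default_rng(0)
activations = np.ones(10)

dropped = dropout(activations, rate=0.2, training=True, rng=rng)
unchanged = dropout(activations, rate=0.2, training=False, rng=rng)

print(dropped)
print(unchanged)
```

With `rate=0.2`, as in the model in this section, each surviving activation is multiplied by 1.25 during training.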

In [228]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, Dropout

def build_lstm_model(X_train, y_train, dropout_rate=0.2):
    model = Sequential()
    
    # Add LSTM layer with 50 units; each row is fed as X_train.shape[1] steps of 1 feature
    model.add(LSTM(50, input_shape=(X_train.shape[1], 1)))
    
    # Add dropout layer to prevent overfitting
    model.add(Dropout(dropout_rate))
    
    # Add output layer with single output value
    model.add(Dense(1))
    
    # Compile the model
    model.compile(loss='mean_squared_error', optimizer='adam')
    
    # Fit the model to the data
    model.fit(X_train, y_train, epochs=200, batch_size=16, verbose=2)
    
    return model

# The builder has its own name so the trained model does not overwrite the function
lstm_model = build_lstm_model(X_train, y_train, dropout_rate=0.2)

Epoch 1/200
3029/3029 - 58s - loss: 0.0080 - 58s/epoch - 19ms/step
Epoch 2/200
3029/3029 - 51s - loss: 0.0063 - 51s/epoch - 17ms/step
Epoch 3/200
3029/3029 - 48s - loss: 0.0061 - 48s/epoch - 16ms/step
Epoch 4/200
3029/3029 - 49s - loss: 0.0059 - 49s/epoch - 16ms/step
Epoch 5/200
3029/3029 - 65s - loss: 0.0057 - 65s/epoch - 22ms/step
Epoch 6/200
3029/3029 - 48s - loss: 0.0057 - 48s/epoch - 16ms/step
Epoch 7/200
3029/3029 - 51s - loss: 0.0056 - 51s/epoch - 17ms/step
Epoch 8/200
3029/3029 - 49s - loss: 0.0054 - 49s/epoch - 16ms/step
Epoch 9/200
3029/3029 - 47s - loss: 0.0051 - 47s/epoch - 15ms/step
Epoch 10/200
3029/3029 - 44s - loss: 0.0049 - 44s/epoch - 15ms/step
Epoch 11/200
3029/3029 - 43s - loss: 0.0047 - 43s/epoch - 14ms/step
Epoch 12/200
3029/3029 - 47s - loss: 0.0046 - 47s/epoch - 15ms/step
Epoch 13/200
3029/3029 - 52s - loss: 0.0045 - 52s/epoch - 17ms/step
Epoch 14/200
3029/3029 - 50s - loss: 0.0044 - 50s/epoch - 17ms/step
Epoch 15/200
3029/3029 - 46s - loss: 0.0043 - 46s/epoch -

Epoch 122/200
3029/3029 - 43s - loss: 0.0034 - 43s/epoch - 14ms/step
Epoch 123/200
3029/3029 - 43s - loss: 0.0034 - 43s/epoch - 14ms/step
Epoch 124/200
3029/3029 - 43s - loss: 0.0034 - 43s/epoch - 14ms/step
Epoch 125/200
3029/3029 - 51s - loss: 0.0034 - 51s/epoch - 17ms/step
Epoch 126/200
3029/3029 - 51s - loss: 0.0034 - 51s/epoch - 17ms/step
Epoch 127/200
3029/3029 - 50s - loss: 0.0034 - 50s/epoch - 17ms/step
Epoch 128/200
3029/3029 - 53s - loss: 0.0034 - 53s/epoch - 17ms/step
Epoch 129/200
3029/3029 - 52s - loss: 0.0034 - 52s/epoch - 17ms/step
Epoch 130/200
3029/3029 - 45s - loss: 0.0034 - 45s/epoch - 15ms/step
Epoch 131/200
3029/3029 - 58s - loss: 0.0034 - 58s/epoch - 19ms/step
Epoch 132/200
3029/3029 - 54s - loss: 0.0034 - 54s/epoch - 18ms/step
Epoch 133/200
3029/3029 - 44s - loss: 0.0034 - 44s/epoch - 14ms/step
Epoch 134/200
3029/3029 - 53s - loss: 0.0034 - 53s/epoch - 18ms/step
Epoch 135/200
3029/3029 - 44s - loss: 0.0034 - 44s/epoch - 14ms/step
Epoch 136/200
3029/3029 - 42s - lo

In [231]:
from sklearn.metrics import mean_squared_error, r2_score

# Make predictions
train_preds = lstm_model.predict(X_train)
test_preds = lstm_model.predict(X_test)

train_r2 = r2_score(y_train, train_preds)
test_r2 = r2_score(y_test, test_preds)

print(f"LSTM Model Train accuracy: {train_r2}")
print(f"LSTM Model Test accuracy: {test_r2}")

LSTM Model Train accuracy: 0.4552504215593729
LSTM Model Test accuracy: 0.4206856353355606


The LSTM model had a training accuracy of 0.455 and a testing accuracy of 0.421. This is better than both RNN models (testing accuracy of 0.384 and 0.401) and is the highest training score so far, but its testing accuracy is slightly below that of the simple neural network (0.438). The gap between training and testing scores is also the largest of the models so far, which may indicate mild overfitting despite the dropout layer. The LSTM's ability to remember past inputs makes it a natural candidate for time-series forecasting of the locational marginal price, where past prices are likely to be good predictors of future prices, although the results here show only a modest gain over the simpler models.

#### Convolutional Neural Network

In this section, I will use a convolutional neural network for my time-series data. A convolutional neural network (CNN) is a deep learning model best known for image classification, but it can also be applied to time-series data. A CNN slides a set of learned filters over the input, and each filter responds to a particular local feature. For time-series data, these filters can pick up patterns or trends over short windows of time. CNNs can be especially useful for long sequences, where traditional RNN models may struggle with vanishing gradients. For predicting the local marginal price (LMP), a CNN can be useful because it can capture temporal patterns such as seasonality or trend while also identifying local features within each time series.
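As a rough illustration of how a single filter detects a feature, the sketch below slides one hand-written kernel over a short series. The series and the `rise_detector` kernel are hypothetical; in a real `Conv1D` layer the kernel weights are learned during training, and the layer applies many filters at once.

```python
def conv1d_valid(series, kernel):
    """Slide `kernel` over `series` with stride 1 and no padding."""
    k = len(kernel)
    return [
        sum(series[i + j] * kernel[j] for j in range(k))
        for i in range(len(series) - k + 1)
    ]


def relu(values):
    """ReLU activation: negative responses are set to zero."""
    return [max(0.0, v) for v in values]


# Hypothetical price series with a jump up and a jump back down
series = [1.0, 1.0, 1.0, 4.0, 4.0, 4.0, 1.0, 1.0]

# A kernel that responds when the value two steps ahead is higher
rise_detector = [-1.0, 0.0, 1.0]

response = relu(conv1d_valid(series, rise_detector))
print(response)
```

The response is non-zero only around the upward jump, and the output is shorter than the input by `kernel_size - 1` positions. This is the same sliding-window operation that a 1D convolutional layer with `kernel_size=3` applies, once per filter.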

The first CNN model passes the input through a convolutional layer, a max pooling layer, a flatten layer, a fully connected layer, and finally an output layer. Together, these layers let the model learn features in the time-series data and use them to make predictions on future data points. The model is compiled with the binary cross-entropy loss and the Adam optimizer, which adjusts the weights and biases of the network based on the error between predicted and actual values. Note that binary cross-entropy is normally a classification loss. Here it is applied to a continuous price target, which is why the `accuracy` metric in the training log stays near zero; the R² score computed afterwards is the meaningful measure.
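One caveat about this loss choice can be shown with a small plain-Python sketch. It assumes the target is a continuous value scaled into the range 0 to 1 (the sample values are hypothetical), and it assumes that the Keras `accuracy` metric behaves like binary accuracy under this loss, that is, it thresholds predictions at 0.5 and checks for exact equality with the target.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Binary cross-entropy for a single prediction, clipped for stability."""
    p = min(max(y_pred, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1.0 - y_true) * math.log(1.0 - p))


def thresholded_accuracy(y_true, y_pred, threshold=0.5):
    """Share of predictions that equal the target after thresholding."""
    hits = [float(p > threshold) == t for t, p in zip(y_true, y_pred)]
    return sum(hits) / len(hits)


# Hypothetical scaled prices, for illustration only
y_true = [0.12, 0.30, 0.47, 0.08]
perfect = list(y_true)  # a model that predicts every price exactly

bce_floor = sum(binary_crossentropy(t, p) for t, p in zip(y_true, perfect)) / len(y_true)
mae_floor = sum(abs(t - p) for t, p in zip(y_true, perfect)) / len(y_true)
acc = thresholded_accuracy(y_true, perfect)

print(f"BCE with perfect predictions: {bce_floor:.3f}")
print(f"MAE with perfect predictions: {mae_floor:.3f}")
print(f"Thresholded accuracy:         {acc:.3f}")
```

Even with perfect predictions, the cross-entropy does not reach zero on continuous targets and the thresholded accuracy is zero, because a thresholded prediction of 0 or 1 never equals a fractional price. Under these assumptions, a loss that barely moves and an accuracy near zero do not by themselves mean that the model failed to learn.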

In [210]:
from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, Flatten, Dense

conv_model = Sequential()

# Add a 1D convolutional layer with 32 filters, kernel size of 3 and ReLU activation
conv_model.add(Conv1D(filters=32, kernel_size=3, activation='relu', input_shape=(X_train.shape[1], 1)))

# Add a max pooling layer
conv_model.add(MaxPooling1D(pool_size=2))

# Flatten the output
conv_model.add(Flatten())

# Add a fully connected layer with 100 units and ReLU activation
conv_model.add(Dense(100, activation='relu'))

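# Add the output layer; the sigmoid keeps predictions in (0, 1),
# which only suits a target that has been scaled to that range
conv_model.add(Dense(1, activation='sigmoid'))

# Compile the model with binary cross-entropy loss and Adam optimizer
# Note: the target is a continuous price, so the 'accuracy' metric is not meaningful here
conv_model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])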

# Train the model
conv_model.fit(X_train, y_train, epochs=200, batch_size=32, validation_data=(X_test, y_test), verbose=2)


Epoch 1/200
1515/1515 - 9s - loss: 0.6912 - accuracy: 4.1269e-05 - val_loss: 0.6907 - val_accuracy: 0.0000e+00 - 9s/epoch - 6ms/step
Epoch 2/200
1515/1515 - 6s - loss: 0.6900 - accuracy: 4.1269e-05 - val_loss: 0.6902 - val_accuracy: 0.0000e+00 - 6s/epoch - 4ms/step
Epoch 3/200
1515/1515 - 13s - loss: 0.6894 - accuracy: 4.1269e-05 - val_loss: 0.6896 - val_accuracy: 0.0000e+00 - 13s/epoch - 9ms/step
Epoch 4/200
1515/1515 - 13s - loss: 0.6891 - accuracy: 4.1269e-05 - val_loss: 0.6895 - val_accuracy: 0.0000e+00 - 13s/epoch - 8ms/step
Epoch 5/200
1515/1515 - 18s - loss: 0.6889 - accuracy: 4.1269e-05 - val_loss: 0.6893 - val_accuracy: 0.0000e+00 - 18s/epoch - 12ms/step
Epoch 6/200
1515/1515 - 19s - loss: 0.6888 - accuracy: 2.0634e-05 - val_loss: 0.6894 - val_accuracy: 0.0000e+00 - 19s/epoch - 13ms/step
Epoch 7/200
1515/1515 - 11s - loss: 0.6887 - accuracy: 2.0634e-05 - val_loss: 0.6890 - val_accuracy: 0.0000e+00 - 11s/epoch - 7ms/step
Epoch 8/200
1515/1515 - 6s - loss: 0.6887 - accuracy: 2.0

Epoch 63/200
1515/1515 - 6s - loss: 0.6879 - accuracy: 2.0634e-05 - val_loss: 0.6884 - val_accuracy: 0.0000e+00 - 6s/epoch - 4ms/step
Epoch 64/200
1515/1515 - 5s - loss: 0.6879 - accuracy: 4.1269e-05 - val_loss: 0.6884 - val_accuracy: 0.0000e+00 - 5s/epoch - 4ms/step
Epoch 65/200
1515/1515 - 6s - loss: 0.6879 - accuracy: 4.1269e-05 - val_loss: 0.6884 - val_accuracy: 0.0000e+00 - 6s/epoch - 4ms/step
Epoch 66/200
1515/1515 - 6s - loss: 0.6879 - accuracy: 4.1269e-05 - val_loss: 0.6885 - val_accuracy: 0.0000e+00 - 6s/epoch - 4ms/step
Epoch 67/200
1515/1515 - 6s - loss: 0.6879 - accuracy: 4.1269e-05 - val_loss: 0.6885 - val_accuracy: 0.0000e+00 - 6s/epoch - 4ms/step
Epoch 68/200
1515/1515 - 6s - loss: 0.6879 - accuracy: 4.1269e-05 - val_loss: 0.6885 - val_accuracy: 0.0000e+00 - 6s/epoch - 4ms/step
Epoch 69/200
1515/1515 - 6s - loss: 0.6879 - accuracy: 4.1269e-05 - val_loss: 0.6885 - val_accuracy: 0.0000e+00 - 6s/epoch - 4ms/step
Epoch 70/200
1515/1515 - 5s - loss: 0.6879 - accuracy: 2.0634e

Epoch 124/200
1515/1515 - 6s - loss: 0.6877 - accuracy: 4.1269e-05 - val_loss: 0.6884 - val_accuracy: 0.0000e+00 - 6s/epoch - 4ms/step
Epoch 125/200
1515/1515 - 6s - loss: 0.6877 - accuracy: 4.1269e-05 - val_loss: 0.6884 - val_accuracy: 0.0000e+00 - 6s/epoch - 4ms/step
Epoch 126/200
1515/1515 - 5s - loss: 0.6877 - accuracy: 4.1269e-05 - val_loss: 0.6883 - val_accuracy: 0.0000e+00 - 5s/epoch - 4ms/step
Epoch 127/200
1515/1515 - 6s - loss: 0.6877 - accuracy: 4.1269e-05 - val_loss: 0.6884 - val_accuracy: 0.0000e+00 - 6s/epoch - 4ms/step
Epoch 128/200
1515/1515 - 6s - loss: 0.6877 - accuracy: 4.1269e-05 - val_loss: 0.6883 - val_accuracy: 0.0000e+00 - 6s/epoch - 4ms/step
Epoch 129/200
1515/1515 - 6s - loss: 0.6877 - accuracy: 4.1269e-05 - val_loss: 0.6886 - val_accuracy: 0.0000e+00 - 6s/epoch - 4ms/step
Epoch 130/200
1515/1515 - 6s - loss: 0.6877 - accuracy: 4.1269e-05 - val_loss: 0.6884 - val_accuracy: 0.0000e+00 - 6s/epoch - 4ms/step
Epoch 131/200
1515/1515 - 6s - loss: 0.6877 - accuracy:

Epoch 185/200
1515/1515 - 6s - loss: 0.6877 - accuracy: 4.1269e-05 - val_loss: 0.6883 - val_accuracy: 0.0000e+00 - 6s/epoch - 4ms/step
Epoch 186/200
1515/1515 - 6s - loss: 0.6877 - accuracy: 4.1269e-05 - val_loss: 0.6885 - val_accuracy: 0.0000e+00 - 6s/epoch - 4ms/step
Epoch 187/200
1515/1515 - 5s - loss: 0.6876 - accuracy: 4.1269e-05 - val_loss: 0.6882 - val_accuracy: 0.0000e+00 - 5s/epoch - 4ms/step
Epoch 188/200
1515/1515 - 6s - loss: 0.6877 - accuracy: 4.1269e-05 - val_loss: 0.6883 - val_accuracy: 0.0000e+00 - 6s/epoch - 4ms/step
Epoch 189/200
1515/1515 - 6s - loss: 0.6877 - accuracy: 4.1269e-05 - val_loss: 0.6883 - val_accuracy: 0.0000e+00 - 6s/epoch - 4ms/step
Epoch 190/200
1515/1515 - 5s - loss: 0.6876 - accuracy: 4.1269e-05 - val_loss: 0.6883 - val_accuracy: 0.0000e+00 - 5s/epoch - 4ms/step
Epoch 191/200
1515/1515 - 6s - loss: 0.6876 - accuracy: 4.1269e-05 - val_loss: 0.6882 - val_accuracy: 0.0000e+00 - 6s/epoch - 4ms/step
Epoch 192/200
1515/1515 - 5s - loss: 0.6877 - accuracy:

<keras.callbacks.History at 0x7fde1553ff40>

In [211]:
from sklearn.metrics import mean_squared_error, r2_score

# Make predictions
train_preds = conv_model.predict(X_train)
test_preds = conv_model.predict(X_test)

train_r2 = r2_score(y_train, train_preds)
test_r2 = r2_score(y_test, test_preds)

print(f"Convolutional Neural Network Train accuracy: {train_r2}")
print(f"Convolutional Neural Network Test accuracy: {test_r2}")

Convolutional Neural Network Train accuracy: 0.45787733848172296
Convolutional Neural Network Test accuracy: 0.42529556994447715


This CNN model reached an R² of 0.458 on the training set and 0.425 on the test set. This is an improvement over both RNN models seen previously and is marginally above the LSTM model. The result suggests that the CNN is at least as well suited to predicting the local marginal price as the recurrent models. The convolutional layer identifies relevant local features in the time-series data, while the max pooling layer reduces the dimensionality of the data, leaving the fully connected layer with fewer inputs to combine into a prediction. The model was trained with the binary cross-entropy loss and the Adam optimizer, but their individual contribution to this result was not tested separately.

The next CNN model has a convolutional input layer, a max pooling layer, a second (hidden) convolutional layer, a second max pooling layer, a flatten layer, a densely connected layer, and finally an output layer. The model is compiled with the MAE loss and the Adam optimizer and, like the first CNN, trained for 200 epochs. Compared with the first CNN, it takes a different approach to the time-series data. The additional layers give the model the capacity to capture more complex patterns and features, and MAE is a regression loss that directly penalizes the absolute difference between predicted and actual prices instead of treating the output as a class probability.

In [214]:
from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, Flatten, Dense

# Define the model
conv_model2 = Sequential()
conv_model2.add(Conv1D(filters=64, kernel_size=3, activation='relu', input_shape=(X_train.shape[1], 1)))
conv_model2.add(MaxPooling1D(pool_size=2))
conv_model2.add(Conv1D(filters=64, kernel_size=3, activation='relu'))
conv_model2.add(MaxPooling1D(pool_size=2))
conv_model2.add(Flatten())
conv_model2.add(Dense(100, activation='relu'))
conv_model2.add(Dense(1))

# Compile the model
conv_model2.compile(loss='mae', optimizer='adam')

# Fit the model to the data
history = conv_model2.fit(X_train, y_train, epochs=200, batch_size=72, validation_data=(X_test, y_test), verbose=2, shuffle=False)

Epoch 1/200
674/674 - 7s - loss: 0.0696 - val_loss: 0.0602 - 7s/epoch - 10ms/step
Epoch 2/200
674/674 - 5s - loss: 0.0586 - val_loss: 0.0524 - 5s/epoch - 7ms/step
Epoch 3/200
674/674 - 6s - loss: 0.0572 - val_loss: 0.0525 - 6s/epoch - 8ms/step
Epoch 4/200
674/674 - 7s - loss: 0.0557 - val_loss: 0.0538 - 7s/epoch - 10ms/step
Epoch 5/200
674/674 - 5s - loss: 0.0548 - val_loss: 0.0524 - 5s/epoch - 8ms/step
Epoch 6/200
674/674 - 5s - loss: 0.0538 - val_loss: 0.0523 - 5s/epoch - 8ms/step
Epoch 7/200
674/674 - 6s - loss: 0.0537 - val_loss: 0.0507 - 6s/epoch - 8ms/step
Epoch 8/200
674/674 - 5s - loss: 0.0527 - val_loss: 0.0491 - 5s/epoch - 8ms/step
Epoch 9/200
674/674 - 5s - loss: 0.0527 - val_loss: 0.0501 - 5s/epoch - 8ms/step
Epoch 10/200
674/674 - 5s - loss: 0.0522 - val_loss: 0.0503 - 5s/epoch - 7ms/step
Epoch 11/200
674/674 - 5s - loss: 0.0519 - val_loss: 0.0495 - 5s/epoch - 7ms/step
Epoch 12/200
674/674 - 5s - loss: 0.0516 - val_loss: 0.0498 - 5s/epoch - 7ms/step
Epoch 13/200
674/674 - 

Epoch 101/200
674/674 - 5s - loss: 0.0434 - val_loss: 0.0437 - 5s/epoch - 7ms/step
Epoch 102/200
674/674 - 5s - loss: 0.0433 - val_loss: 0.0435 - 5s/epoch - 7ms/step
Epoch 103/200
674/674 - 5s - loss: 0.0433 - val_loss: 0.0438 - 5s/epoch - 7ms/step
Epoch 104/200
674/674 - 5s - loss: 0.0433 - val_loss: 0.0435 - 5s/epoch - 7ms/step
Epoch 105/200
674/674 - 4s - loss: 0.0433 - val_loss: 0.0434 - 4s/epoch - 7ms/step
Epoch 106/200
674/674 - 5s - loss: 0.0433 - val_loss: 0.0433 - 5s/epoch - 7ms/step
Epoch 107/200
674/674 - 5s - loss: 0.0433 - val_loss: 0.0433 - 5s/epoch - 7ms/step
Epoch 108/200
674/674 - 5s - loss: 0.0434 - val_loss: 0.0437 - 5s/epoch - 7ms/step
Epoch 109/200
674/674 - 5s - loss: 0.0433 - val_loss: 0.0437 - 5s/epoch - 8ms/step
Epoch 110/200
674/674 - 5s - loss: 0.0433 - val_loss: 0.0442 - 5s/epoch - 7ms/step
Epoch 111/200
674/674 - 5s - loss: 0.0433 - val_loss: 0.0434 - 5s/epoch - 7ms/step
Epoch 112/200
674/674 - 5s - loss: 0.0432 - val_loss: 0.0436 - 5s/epoch - 7ms/step
Epoc

Epoch 200/200
674/674 - 5s - loss: 0.0425 - val_loss: 0.0437 - 5s/epoch - 7ms/step


In [215]:
from sklearn.metrics import mean_squared_error, r2_score

# Make predictions
train_preds = conv_model2.predict(X_train)
test_preds = conv_model2.predict(X_test)

train_r2 = r2_score(y_train, train_preds)
test_r2 = r2_score(y_test, test_preds)

print(f"Convolutional Neural Network Train accuracy: {train_r2}")
print(f"Convolutional Neural Network Test accuracy: {test_r2}")

Convolutional Neural Network Train accuracy: 0.3602555851602608
Convolutional Neural Network Test accuracy: 0.33263842229571494


This CNN model reached an R² of 0.360 on the training set and 0.333 on the test set, so the previous convolutional model performed better. The previous model had a total of 5 layers, whereas this one has 7: it adds a second convolutional layer and a second max pooling layer. It is possible that the extra max pooling layer discarded important information or reduced the resolution of the features being learned, resulting in a lower score. However, the two models also differ in loss function (MAE versus binary cross-entropy), output activation (linear versus sigmoid), number of filters (64 versus 32), batch size (72 versus 32), and shuffling (`shuffle=False` here), so the drop cannot be attributed to the pooling layer alone.
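To get a feel for how much resolution the second pooling layer removes, the sketch below traces the sequence length through both architectures. The value of `n_features` is hypothetical, since `X_train.shape[1]` is not shown in this section. The formulas assume the Keras defaults used above: no padding and stride 1 for `Conv1D`, and a stride equal to `pool_size` for `MaxPooling1D`.

```python
def conv1d_len(n, kernel_size=3):
    """Output length of a Conv1D layer with no padding and stride 1."""
    return n - kernel_size + 1


def pool1d_len(n, pool_size=2):
    """Output length of a MaxPooling1D layer with stride equal to pool_size."""
    return n // pool_size


n_features = 20  # hypothetical stand-in for X_train.shape[1]

# First CNN: Conv1D(32 filters) -> MaxPooling1D -> Flatten
steps_model1 = pool1d_len(conv1d_len(n_features))
flat_model1 = steps_model1 * 32

# Second CNN: Conv1D(64) -> MaxPooling1D -> Conv1D(64) -> MaxPooling1D -> Flatten
steps_model2 = pool1d_len(conv1d_len(pool1d_len(conv1d_len(n_features))))
flat_model2 = steps_model2 * 64

print(f"Model 1: {steps_model1} time steps, {flat_model1} values after Flatten")
print(f"Model 2: {steps_model2} time steps, {flat_model2} values after Flatten")
```

With this assumed input length, the second model keeps only a third as many positions along the time axis as the first, even though it uses twice as many filters. The shorter the real input, the more severe this loss of temporal resolution becomes.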

Overall, the first convolutional model's R² of 0.46 on the training set and 0.43 on the test set is the highest among the deep learning models in this notebook, which shows that convolutional and max pooling layers can be useful for predicting the local marginal price. These layers are designed to extract features from the input data, which suits time-series data like the local marginal price. The convolutional layers can detect patterns in the time series, such as trends and seasonality, while the max pooling layers reduce the dimensionality of the data and can improve the model's ability to generalize to new data.

#### Comparing Machine Learning Models with Artificial Intelligence Models

In [232]:
import pandas as pd

# create a list of dictionaries containing the model names and accuracy values
model_data = [
    {'Model': 'Linear Regression', 'Training Accuracy': 0.122, 'Testing Accuracy': 0.132},
    {'Model': 'Random Forest', 'Training Accuracy': 0.454, 'Testing Accuracy': 0.374},
    {'Model': 'Decision Tree', 'Training Accuracy': 0.457, 'Testing Accuracy': 0.368},
    {'Model': 'Decision Tree Feature Selection', 'Training Accuracy': 0.357, 'Testing Accuracy': 0.322},
    {'Model': 'Simple Neural Network (DS Notebook)', 'Training Accuracy': 0.337, 'Testing Accuracy': 0.292},
    {'Model': 'Simple Neural Network (DL Notebook)', 'Training Accuracy': 0.441, 'Testing Accuracy': 0.338},
    {'Model': 'RNN Model 1', 'Training Accuracy': 0.386, 'Testing Accuracy': 0.383},
    {'Model': 'RNN Model 2', 'Training Accuracy': 0.403, 'Testing Accuracy': 0.401},
    {'Model': 'LSTM Model', 'Training Accuracy': 0.455, 'Testing Accuracy': 0.421},
    {'Model': 'CNN Model 1', 'Training Accuracy': 0.458, 'Testing Accuracy': 0.425},
    {'Model': 'CNN Model 2', 'Training Accuracy': 0.36, 'Testing Accuracy': 0.333}
]

# create a DataFrame from the list of dictionaries
df = pd.DataFrame(model_data)

# set the index to be the Model column
df.set_index('Model', inplace=True)

# display the DataFrame
print(df)

                                     Training Accuracy  Testing Accuracy
Model                                                                   
Linear Regression                                0.122             0.132
Random Forest                                    0.454             0.374
Decision Tree                                    0.457             0.368
Decision Tree Feature Selection                  0.357             0.322
Simple Neural Network (DS Notebook)              0.337             0.292
Simple Neural Network (DL Notebook)              0.441             0.338
RNN Model 1                                      0.386             0.383
RNN Model 2                                      0.403             0.401
LSTM Model                                       0.455             0.421
CNN Model 1                                      0.458             0.425
CNN Model 2                                      0.360             0.333


Both the traditional machine learning models and the neural network models used across the two notebooks can be applied to time-series analysis, but they approach the problem differently.

Machine learning algorithms typically involve feature engineering, where the user selects and transforms relevant variables to create input features for the model. These features are then used to train a machine learning model to predict future values of the time series.

Neural networks can perform feature extraction automatically from the raw time series data without the need for explicit feature engineering. This can be particularly useful when dealing with complex time series data that may have non-linear relationships between the input and output variables.

From the table above, the LSTM and the first convolutional neural network are nearly tied for the best result in predicting the Local Marginal Price: their training scores and their testing scores each differ by less than 0.005.

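Some of the traditional machine learning models come close to the LSTM and CNN models. The decision tree, for example, has a similar training score (0.457), but its testing score (0.368) is noticeably lower, which suggests that it generalizes less well to unseen data.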
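The comparison between the tree-based models and the recurrent models is easier to see when the gap between training and testing scores is computed directly. The sketch below does this for four rows of the table; the `scores` dictionary simply restates those rows, and the choice of models is mine, for illustration.

```python
# (training score, testing score) for a few rows of the table above
scores = {
    "Random Forest": (0.454, 0.374),
    "Decision Tree": (0.457, 0.368),
    "RNN Model 2": (0.403, 0.401),
    "LSTM Model": (0.455, 0.421),
}

# A large positive gap means the model fits the training data
# much better than it fits unseen data
gaps = {name: round(train - test, 3) for name, (train, test) in scores.items()}

# Print the models from largest to smallest gap
for name, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<15} train-test gap: {gap:.3f}")
```

The two tree-based models lose roughly 0.08 to 0.09 between training and testing, while the LSTM loses about 0.03 and the second RNN almost nothing. On this evidence, the recurrent models appear to overfit less, although a single train/test split is not enough to be certain.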

These results suggest that both the LSTM and the CNN can capture patterns and dependencies in the time-series data that are useful for predicting the Local Marginal Price. The LSTM model is specifically designed for sequential data and can remember long-term dependencies, while the CNN model extracts features and patterns from the data using its convolutional and pooling layers.