# **Introduction:**

This file performs a grid search to determine a baseline for the best hyperparameters for an ANFIS model for use in multi-robot task allocation, trained by regression on FIS-generated data. The goal of designing this ANFIS is to compare its performance against that of an alternative model to determine which better approximates the FIS. The comparison uses the coefficient of determination $R^{2}$, root mean square error (RMSE), and mean absolute error (MAE).
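
For reference, the three comparison metrics can be written out directly. The following is a minimal NumPy sketch; the helper name `regression_metrics` and the toy values are illustrative assumptions and are not part of the notebook, which relies on the Keras metrics instead.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, and MAE for two 1-D arrays of equal length."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred

    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)

    return {
        'r2': float(1.0 - ss_res / ss_tot),
        'rmse': float(np.sqrt(np.mean(residuals ** 2))),
        'mae': float(np.mean(np.abs(residuals))),
    }

# toy values for illustration only, not taken from the data set:
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
print(metrics)
```

A higher $R^{2}$ and lower RMSE and MAE indicate a closer approximation of the target.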

**Date Created: 3/2/2025**

**Date Modified: 4/2/2025**

# **Import Packages:**

This section imports all necessary packages for the ANFIS implementation.

In [10]:
# import packages:
import numpy as np
import pandas as pd
from itertools import product
import time 
import json
import os
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import Input, Model, constraints
from tensorflow.keras.optimizers import Adam
from keras.layers import Layer
from keras.callbacks import EarlyStopping
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import r2_score

# **Define Model Class:**

This section defines the classes that make up the constituent layers of the ANFIS.
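
As orientation before the class definitions, the five layers can be sketched as a single NumPy forward pass. This is a simplified illustration that assumes Gaussian membership functions only and omits the small epsilon terms used in the Keras layers; the function name `anfis_forward` and the toy parameters are hypothetical.

```python
import numpy as np
from itertools import product

def anfis_forward(x, means, stds, consequents):
    """Forward pass of a first-order Sugeno ANFIS with Gaussian MFs.

    x:           (batch, num_inputs)
    means, stds: (num_inputs, num_mfs)
    consequents: (num_mfs ** num_inputs, num_inputs + 1), last column is the bias
    """
    num_inputs, num_mfs = means.shape

    # layer 1: membership values, shape (batch, num_inputs, num_mfs)
    mu = np.exp(-0.5 * ((x[:, :, None] - means[None]) / stds[None]) ** 2)

    # layer 2: one rule per combination of MFs; firing strength is the product
    rules = list(product(range(num_mfs), repeat=num_inputs))
    firing = np.stack(
        [np.prod([mu[:, i, m] for i, m in enumerate(rule)], axis=0) for rule in rules],
        axis=1,
    )

    # layer 3: normalize the firing strengths so they sum to one per sample
    normalized = firing / firing.sum(axis=1, keepdims=True)

    # layer 4: linear consequent of each rule, shape (batch, num_rules)
    x_bias = np.concatenate([x, np.ones((x.shape[0], 1))], axis=1)
    rule_outputs = normalized * (x_bias @ consequents.T)

    # layer 5: sum over rules, shape (batch, 1)
    return rule_outputs.sum(axis=1, keepdims=True)

# toy parameters: 3 inputs, 3 MFs per input, bias-only consequents equal to 2
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
means = np.tile(np.array([-1.0, 0.0, 1.0]), (3, 1))
stds = np.ones((3, 3))
consequents = np.zeros((27, 4))
consequents[:, -1] = 2.0
y = anfis_forward(x, means, stds, consequents)
```

Because the normalized firing strengths sum to one, bias-only consequents with the same value return that value for every sample, which makes a convenient sanity check.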

In [11]:
# need to define a constraint for training the parameters:
class OrderedConstraint(constraints.Constraint):
    # constructor:
    def __init__(self):
        pass

    # call function for constraint:
    def __call__(self, W):
        return tf.sort(W, axis = 2)

# first layer -> membership layer:
class MF_Layer(Layer): 
    # constructor:
    def __init__(self, num_inputs, num_mfs, mf_type, **kwargs):
        super(MF_Layer, self).__init__(**kwargs)
        self.num_inputs = num_inputs
        self.num_mfs = num_mfs

        # check if string passed:
        if not isinstance(mf_type, str):
            raise TypeError('Only strings are permitted to be passed')
        
        # check if a recognized membership function was passed (exact match, not substring):
        if mf_type not in ('Smoothed Triangular', 'Gaussian', 'Generalized Bell'):
            raise ValueError('Unrecognized MF passed to function')
        
        # assign mf type to object, which will determine the number of parameters generated:
        if mf_type == 'Gaussian':
            self.mf_type = 'Gaussian'
            self.num_antecedents = 2
            self.constraints = None
            self.init_max = 50.0
            self.init_min = 0.0
        elif mf_type == 'Smoothed Triangular':
            self.mf_type = 'Smoothed Triangular'
            self.num_antecedents = 3
            self.constraints = OrderedConstraint()
            self.init_max = 50.0
            self.init_min = 0.0
        elif mf_type == 'Generalized Bell':
            self.mf_type = 'Generalized Bell'
            self.num_antecedents = 3
            self.constraints = None
            self.init_max = 50.0
            self.init_min = 1.0

        # need to initialize antecedent parameters
        self.mf_params = self.add_weight(
            shape = (self.num_inputs, self.num_mfs, self.num_antecedents),             
            initializer= tf.keras.initializers.RandomUniform(self.init_min, self.init_max),
            trainable = True,
            name = 'Antecedent_Params',
            constraint = self.constraints
        )

    # custom setting of weights:
    def set_weights(self, params):
        # this function is used to set weights based on what a user provides
        # user must provide weights as a np.array of shape (num_inputs, num_mfs, num_antecedents)

        if params.shape != (self.num_inputs, self.num_mfs, self.num_antecedents):
            raise ValueError(f'Parameters provided are not of correct shape, expected ({self.num_inputs}, {self.num_mfs}, {self.num_antecedents})')

        # assign in place so that mf_params remains a trainable variable:
        self.mf_params.assign(params)

    # function call:
    def call(self, inputs):
        # need to initialize the membership values:
        membership_values = []

        # for every input:
        for i in range(self.num_inputs):
            # get the memberships for that input:
            input_mf_params = self.mf_params[i]

            # need to now compute the fuzzified value for each membership function:
            fuzzified_values = []

            # for every membership function:
            for j in range(self.num_mfs):

                # if gaussian:
                if self.mf_type == 'Gaussian':
                    # define parameters:
                    mean = input_mf_params[j, 0]  # mean of the gaussian
                    std = input_mf_params[j, 1]   # standard deviation of the gaussian

                    # compute output:
                    output = tf.exp(-0.5 * tf.square((inputs[:, i] - mean) / (std + 1e-6)))
                    fuzzified_values.append(output)

                # if smoothed triangular:
                elif self.mf_type == 'Smoothed Triangular':
                    # define parameters
                    a = input_mf_params[j, 0]   # a parameter
                    b = input_mf_params[j, 1]   # b parameter
                    c = input_mf_params[j, 2]   # c parameter

                    # smoothing factor beta:
                    beta = 100.0

                    # check if we are on the edges:
                    is_left_edge = tf.equal(a, b)
                    is_right_edge = tf.equal(b, c)

                    # compute softplus-based smoothed triangular membership function:
                    left = tf.nn.softplus(beta * (inputs[:, i] - a)) / (tf.nn.softplus(beta * (b - a)) + 1e-6)
                    right = tf.nn.softplus(beta * (c - inputs[:, i])) / (tf.nn.softplus(beta * (c - b)) + 1e-6)

                    # deal with edge case:
                    left = tf.where((inputs[:, i] == a) & is_left_edge, 1.0, left)
                    right = tf.where((inputs[:, i] == c) & is_right_edge, 1.0, right)

                    # compute output:
                    output = tf.maximum(0.0, tf.minimum(left, right))
                    fuzzified_values.append(output)

                # if generalized bell:
                elif self.mf_type == 'Generalized Bell':
                    # define parameters
                    a = input_mf_params[j, 0]
                    b = input_mf_params[j, 1]
                    c = input_mf_params[j, 2]

                    # clamp b:
                    b = tf.clip_by_value(b, 1e-6, 5.0)

                    # compute output:
                    output = 1 / (1 + tf.abs((inputs[:, i] - c) / (a + 1e-6)) ** (2 * b))
                    fuzzified_values.append(output)
            
            # need to now stack the mf values for that given input:
            membership_values.append(tf.stack(fuzzified_values, axis = -1))

        # stack everything and return:
        return tf.stack(membership_values, axis = 1)
    
# second layer -> firing strength layer:
class FS_Layer(Layer):
    # constructor:
    def __init__(self, num_inputs, num_mfs, **kwargs):
        super(FS_Layer, self).__init__(**kwargs)
        self.num_inputs = num_inputs
        self.num_mfs = num_mfs
        self.num_rules = num_mfs ** num_inputs

    # call function:
    def call(self, membership_values):
        # this layer accepts the membership values, which have shape (batch_size, num_inputs, num_mfs):
        batch_size = tf.shape(membership_values)[0]

        # initialize the firing strengths:
        firing_strengths = tf.ones((batch_size, self.num_rules), dtype = tf.float32)

        # generate all the rule combinations:
        rules = list(product(range(self.num_mfs), repeat = self.num_inputs))    # example [(0, 0, 0), (0, 0, 1), ...]

        # need to check each input, each mf combination, and multiply their values together:
        for rule_index, combination in enumerate(rules):
            # print(f'combination: {combination}')
            rule_strength = tf.ones((batch_size, ), dtype = tf.float32)

            # for every input and membership function:
            for input_index, mf_index in enumerate(combination):
                # print(f'input: {input_index + 1} | mf: {mf_index + 1}')

                # correctly extract the fuzzified values based on the combination index:
                rule_strength *= membership_values[:, input_index, mf_index] + 1e-6
            
            # update the firing strengths:
            rule_strength = tf.expand_dims(rule_strength, axis = -1)  # shape: (batch_size, 1)
            firing_strengths = tf.concat(
                [firing_strengths[:, :rule_index], rule_strength, firing_strengths[:, rule_index + 1:]],
                axis = 1,
            )
            # print(f'firing strength: {firing_strengths}')

        return firing_strengths
    
# third layer -> normalization layer:
class NM_Layer(Layer):
    # constructor:
    def __init__(self, num_inputs, num_mfs, **kwargs):
        super(NM_Layer, self).__init__(**kwargs)
        self.num_inputs = num_inputs
        self.num_mfs = num_mfs

    # call function:
    def call(self, firing_strengths):
        # this function accepts inputs of size (batch_size, num_rules).
        # need to first get the total firing strength:
        total_firing_strength = tf.reduce_sum(firing_strengths, axis = 1, keepdims = True)
        
        # can now normalize the firing strengths:
        normalized_strengths = firing_strengths / (total_firing_strength + 1e-10)   # add a buffer in case the total firing strength is zero

        return normalized_strengths
    
# fourth layer -> consequent layer:
class CN_Layer(Layer):
    # constructor: 
    def __init__(self, num_inputs, num_mfs, **kwargs):
        super(CN_Layer, self).__init__(**kwargs)
        self.num_inputs = num_inputs
        self.num_mfs = num_mfs
        self.num_rules = num_mfs ** num_inputs

        # need to initialize the consequent parameters:
        self.consequent_params = self.add_weight(
            shape = (self.num_rules, self.num_inputs + 1),
            initializer = tf.keras.initializers.RandomUniform(-1.0, 1.0, seed = 1234),
            trainable = True,
            name = 'Consequent_Params'
        )

    # this function is used for manually setting the consequent parameters:
    def set_cons(self, params):
        # this function accepts parameters as an array of size (num_rules, num_inputs + 1):
        if params.shape != (self.num_rules, self.num_inputs + 1):
            raise ValueError(f'Parameters provided are not of correct shape, expected ({self.num_rules}, {self.num_inputs + 1})')
        
        # assign parameters in place so that consequent_params remains a trainable variable:
        self.consequent_params.assign(params)

    # call function:
    def call(self, input_list):
        # unpack inputs from list:
        normalized_strengths, inputs = input_list

        # get the batch size:
        batch_size = tf.shape(normalized_strengths)[0]

        # the output is given by the multiplication of the inputs with the consequent weights,
        # such as: o_k = w_bar_k * (x_1 * p_k + x_2 * q_k + x_3 * r_k + ... + s_k)
        # can therefore extend the inputs to be (batch_size, num_inputs + bias) for ease of multiplication:
        inputs_with_bias = tf.concat([inputs, tf.ones((batch_size, 1), dtype = tf.float32)], axis = -1)

        # need to now reshape the normalized strengths to be of size (batch_size, num_rules, 1)
        # this effectively flips it into a 'column vector' of sorts, where each individual value is now vertically aligned
        normalized_strengths = tf.reshape(normalized_strengths, (batch_size, self.num_rules, 1))

        # get the consequent parameters, which have shape (num_rules, num_inputs + 1):
        consequent_params = self.consequent_params

        # expand inputs with bias to match the rule axis: (batch_size, num_rules, num_inputs + 1)
        inputs_with_bias_expanded = tf.expand_dims(inputs_with_bias, axis = 1)

        # calculate the consequent for each rule
        consequents = tf.reduce_sum(normalized_strengths * inputs_with_bias_expanded * consequent_params, axis = 2)

        return consequents

# fifth layer -> output layer:
class O_Layer(Layer):
    # constructor:
    def __init__(self, num_inputs, num_mfs, **kwargs):
        super(O_Layer, self).__init__(**kwargs)
        self.num_inputs = num_inputs
        self.num_output = num_mfs

    # call function:
    def call(self, consequents):
        output = tf.reduce_sum(consequents, axis = 1, keepdims = True)
        return output


# **Data Importation:**

This section imports and processes the data for use in the ANFIS.

In [12]:
# import data from csv as pandas dataframe:
data = pd.read_csv('V3_Data.csv')
print('\nData loaded successfully')


Data loaded successfully


Split into X and Y:

In [13]:
# perform split:
x_data = data.drop(columns = 'Suitability').astype('float32').values
y_data = data['Suitability'].astype('float32').values

Split data into train, validation, and testing:

In [14]:
# split the data using train_test_split:
x_train, x_filler, y_train, y_filler = train_test_split(x_data, y_data, test_size = 0.2)
x_val, x_test, y_val, y_test = train_test_split(x_filler, y_filler, test_size = 0.5)

# get the split results:
print(f'Training examples have shape: {x_train.shape}')
print(f'Validation examples have shape: {x_val.shape}')
print(f'Testing examples have shape: {x_test.shape}\n')

Training examples have shape: (8000, 3)
Validation examples have shape: (1000, 3)
Testing examples have shape: (1000, 3)
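
The split above is not seeded, so the membership of each set changes between runs. `train_test_split` accepts a `random_state` argument for this purpose. As an illustration of the same 80/10/10 idea, the sketch below uses a seeded NumPy permutation instead; the helper name `seeded_split`, the seed value, and the demo arrays are assumptions and do not come from the notebook.

```python
import numpy as np

def seeded_split(x, y, seed=42, train_frac=0.8):
    """Shuffle once with a fixed seed, then cut into train / validation / test."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))

    # 80 % for training, the remainder split evenly between validation and test
    n_train = int(train_frac * len(x))
    n_val = (len(x) - n_train) // 2
    train, val, test = np.split(order, [n_train, n_train + n_val])

    return (x[train], y[train]), (x[val], y[val]), (x[test], y[test])

# small stand-in arrays with the same layout as x_data and y_data:
x_demo = np.arange(30, dtype='float32').reshape(10, 3)
y_demo = np.arange(10, dtype='float32')
(train_x, train_y), (val_x, val_y), (test_x, test_y) = seeded_split(x_demo, y_demo)
print(train_x.shape, val_x.shape, test_x.shape)
```

Calling the helper again with the same seed returns identical sets, which keeps metrics comparable across repeated sweeps.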



Scale the data:

In [15]:
# define a scaler:
scaler = StandardScaler()

# scale each set:
x_train = scaler.fit_transform(x_train)
x_val = scaler.transform(x_val)
x_test = scaler.transform(x_test)

# **Model Exploration:**

Within this section a function is defined for instantiating models using the Keras API, for use in performing a hyperparameter search to determine the best combination of hyperparameters. The hyperparameters that are being considered are:

* number of epochs
* batch size
* learning rate
* membership function type
* number of membership functions
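
The membership function type and count drive the model size, because the rule base grows as `num_mfs ** num_inputs`. The sketch below tallies rules and trainable parameters from the layer shapes defined earlier, using the search values from the next cell. The helper `parameter_counts` is a hypothetical name introduced for illustration.

```python
from itertools import product

NUM_INPUTS = 3

# antecedent parameters per membership function, as set in MF_Layer:
ANTECEDENTS = {'Smoothed Triangular': 3, 'Gaussian': 2, 'Generalized Bell': 3}

def parameter_counts(mf_type, num_mfs, num_inputs=NUM_INPUTS):
    """Count rules and trainable parameters for one configuration."""
    num_rules = num_mfs ** num_inputs
    premise = num_inputs * num_mfs * ANTECEDENTS[mf_type]   # MF_Layer weights
    consequent = num_rules * (num_inputs + 1)                # CN_Layer weights
    return {
        'rules': num_rules,
        'premise': premise,
        'consequent': consequent,
        'total': premise + consequent,
    }

# full grid: epochs x batch sizes x learning rates x MF types x MF counts
grid = list(product([250, 500], [32, 64, 128], [0.0001, 0.0005, 0.001], ANTECEDENTS, [3, 4, 5]))
print(f'{len(grid)} combinations')

for mf_type, num_mfs in product(ANTECEDENTS, [3, 4, 5]):
    print(mf_type, num_mfs, parameter_counts(mf_type, num_mfs))
```

With three inputs, moving from 3 to 5 membership functions raises the rule count from 27 to 125, so the consequent parameters dominate the total.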

In [16]:
# define parameter values to be explored:
num_epochs = [250, 500]
batch_sizes = [32, 64, 128]
learning_rates = [0.0001, 0.0005, 0.001]
membership_functions = ['Smoothed Triangular', 'Gaussian', 'Generalized Bell']
num_mfs = [3, 4, 5]

### DEBUGGING ###
# num_epochs = [500]
# batch_sizes = [32]
# learning_rates = [0.0001]
# membership_functions = ['Smoothed Triangular']
# num_mfs = [3]

combinations = len(num_epochs) * len(batch_sizes) * len(learning_rates) * len(membership_functions) * len(num_mfs)

LOSS_FUNCTION = 'mse'
METRICS = ['mae', keras.metrics.RootMeanSquaredError(), keras.metrics.R2Score()]

Define a model generation function:

In [17]:
# define function:
def BuildAnfis(input_shape, num_inputs, num_mfs, mf_type, rate):
    # define the inputs:
    inputs = Input(shape = input_shape)

    # add the custom layers:
    membership_layer = MF_Layer(num_inputs = num_inputs, num_mfs = num_mfs, mf_type = mf_type)(inputs)
    firing_layer = FS_Layer(num_inputs = num_inputs, num_mfs = num_mfs)(membership_layer)
    normalization_layer = NM_Layer(num_inputs = num_inputs, num_mfs = num_mfs)(firing_layer)
    consequent_layer = CN_Layer(num_inputs = num_inputs, num_mfs = num_mfs)([normalization_layer, inputs])
    output_layer = O_Layer(num_inputs = num_inputs, num_mfs = num_mfs)(consequent_layer)

    # compile the model:
    model = Model(inputs = inputs, outputs = output_layer)
    model.compile(optimizer = Adam(learning_rate = rate), 
                  loss = LOSS_FUNCTION, 
                  metrics = METRICS)
    
    return model

Now we must perform the grid search. For each combination of parameters, this process entails:

* Creating a directory to save the search results in
* Creating a model using the aforementioned ***BuildAnfis()*** function
* Saving the parameters used in the creation of the model within a dictionary called ***training_params***
* Training the model, saving the results into a dictionary called ***training_results***
* Combining the training parameters with the training results into a JSON dump

In [18]:
# begin grid search:
j = 1
for epochs in num_epochs:
    for batch in batch_sizes:
        for rate in learning_rates:
            for mf in membership_functions:
                for num in num_mfs:
                    # update user:
                    print(f'examining model {j}/{combinations}', end = '\r')
                    j += 1

                    # make a directory to save into:
                    output_dir = os.path.join(os.getcwd(), 'anfis_search_results', f'{epochs}_{batch}_{rate}_{mf}_{num}')
                    os.makedirs(output_dir, exist_ok = True)

                    # build the model:
                    tf.keras.backend.clear_session()
                    model = BuildAnfis(input_shape = (3, ),
                                       num_inputs = 3,
                                       num_mfs = num,
                                       mf_type = mf,
                                       rate = rate)
                    
                    # save the training parameters into a dictionary:
                    training_params = {
                        'mf_type' : mf,
                        'num_mfs' : num,
                        'learning_rate' : rate,
                        'batch_size' : batch,
                        'num_epochs' : epochs
                    }

                    # train the model:
                    train_start = time.time()
                    history = model.fit(x_train, y_train,
                                        epochs = epochs,
                                        batch_size = batch,
                                        validation_data = (x_val, y_val),
                                        verbose = 0)
                    train_time = time.time() - train_start

                    # store training results:
                    training_results = {}
                    for i in history.history.keys():
                        training_results[i] = history.history[i][-1]
                    training_results['train_time'] = train_time

                    # save both results to a directory:
                    params_path = os.path.join(output_dir, "params_results.json")
                    with open(params_path, "w") as f:
                        json.dump({'parameters': training_params, 'results': training_results}, f, indent = 4)               
                    

examining model 162/162
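
The five nested loops above can be flattened with `itertools.product`, and a long sweep can be made resumable by skipping combinations whose result file already exists. The following is a sketch under stated assumptions: `run_grid` and `fake_train` are hypothetical names, and `fake_train` stands in for building and fitting the ANFIS so that the example runs without TensorFlow.

```python
import json
import os
import tempfile
from itertools import product

def run_grid(search_space, train_fn, root_dir):
    """Run train_fn once per combination, skipping combinations already on disk."""
    keys = list(search_space)
    completed = 0

    for values in product(*(search_space[k] for k in keys)):
        params = dict(zip(keys, values))
        run_dir = os.path.join(root_dir, '_'.join(str(v) for v in values))
        result_path = os.path.join(run_dir, 'params_results.json')

        # a finished run leaves its JSON behind, so it is not repeated:
        if os.path.exists(result_path):
            continue

        os.makedirs(run_dir, exist_ok=True)
        results = train_fn(**params)
        with open(result_path, 'w') as f:
            json.dump({'parameters': params, 'results': results}, f, indent=4)
        completed += 1

    return completed

# hypothetical stand-in for model construction and model.fit():
def fake_train(num_epochs, batch_size):
    return {'loss': 1.0 / (num_epochs * batch_size)}

space = {'num_epochs': [250, 500], 'batch_size': [32, 64, 128]}
with tempfile.TemporaryDirectory() as tmp:
    first_pass = run_grid(space, fake_train, tmp)
    second_pass = run_grid(space, fake_train, tmp)

print(first_pass, second_pass)
```

The second pass does no work because every result file is already present, which is the behavior wanted when a sweep is interrupted and restarted.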

# **Examine Hyperparameter Search Results:**

This section examines the data collected during the hyperparameter grid search. For each combination of hyperparameters, the training parameters and training results were saved as separate dictionaries and written together to a single JSON file. The cell below iterates through each test folder and gathers the results into a Pandas DataFrame for further analysis:

In [26]:
# initialize results list:
results = []
grid_search_folder = os.path.join(os.getcwd(), "anfis_search_results")

for folder in os.listdir(grid_search_folder):
    folder_path = os.path.join(grid_search_folder, folder)

    if os.path.isdir(folder_path):
        params_file = os.path.join(folder_path, 'params_results.json')

        if os.path.exists(params_file):
            with open(params_file, 'r') as f:
                data = json.load(f)

                # flatten the JSON:
                extracted_data = {
                    # training parameters:
                    'MF_type'       : data['parameters']['mf_type'],
                    'MF_num'        : data['parameters']['num_mfs'],
                    'learning_rate' : data['parameters']['learning_rate'],
                    'batch_size'    : data['parameters']['batch_size'],
                    'num_epochs'    : data['parameters']['num_epochs'],

                    # training results:
                    'train_MSE'     : data['results']['loss'],
                    'train_MAE'     : data['results']['mae'],
                    'train_RMSE'    : data['results']['root_mean_squared_error'],
                    'train_R2'      : data['results']['r2_score'],
                    'val_MSE'       : data['results']['val_loss'],
                    'val_MAE'       : data['results']['val_mae'],
                    'val_RMSE'      : data['results']['val_root_mean_squared_error'],
                    'val_R2'        : data['results']['val_r2_score'],
                    'training_time' : data['results']['train_time']
                }

                results.append(extracted_data)

# turn the results into a dataframe:
results_df = pd.DataFrame(results)

# insert an identifier for models:
results_df.insert(0, 'model_name', [f'model {index + 1}' for index in range(len(results_df))])

# save consolidated data into a CSV file:
results_df.to_csv('consolidated_results.csv', index = False)

We now need to determine the best hyperparameter combination from these results, which have been consolidated into a single DataFrame. The DataFrame is sorted by each metric in turn. The metrics examined are:

* Training MSE (loss)
* Validation MSE (loss)
* Validation MAE
* Validation RMSE
* Validation $R^{2}$

Sort the consolidated results by the lowest training loss:

In [27]:
results_df.sort_values(by = 'train_MSE', ascending = True).head(15)

Unnamed: 0,model_name,MF_type,MF_num,learning_rate,batch_size,num_epochs,train_MSE,train_MAE,train_RMSE,train_R2,val_MSE,val_MAE,val_RMSE,val_R2,training_time
130,model 131,Generalized Bell,4,0.001,32,500,0.090078,0.240629,0.300129,0.963234,0.087986,0.237183,0.296624,0.964361,222.335216
92,model 93,Gaussian,5,0.0005,128,500,0.090378,0.244659,0.300629,0.963111,0.08797,0.240941,0.296597,0.964367,826.748628
126,model 127,Gaussian,3,0.001,32,500,0.091951,0.246258,0.303235,0.962469,0.087073,0.237976,0.295082,0.964731,115.826656
94,model 95,Generalized Bell,4,0.0005,128,500,0.092126,0.243156,0.303522,0.962398,0.089283,0.238489,0.298802,0.963836,302.328224
20,model 21,Gaussian,5,0.001,128,250,0.093806,0.248178,0.306278,0.961712,0.090685,0.242661,0.301139,0.963268,245.70169
41,model 42,Generalized Bell,5,0.0005,32,250,0.0942,0.248071,0.30692,0.961551,0.091909,0.244778,0.303165,0.962772,200.25141
158,model 159,Generalized Bell,5,0.001,64,500,0.094888,0.248418,0.308039,0.961271,0.091104,0.24193,0.301835,0.963098,603.285423
104,model 105,Generalized Bell,5,0.001,128,500,0.095019,0.248234,0.308252,0.961217,0.092485,0.244383,0.304114,0.962538,950.39757
50,model 51,Generalized Bell,5,0.001,32,250,0.095119,0.249238,0.308414,0.961176,0.091451,0.24239,0.302409,0.962957,199.234537
124,model 125,Smoothed Triangular,4,0.0005,32,500,0.09532,0.250122,0.308739,0.961094,0.092609,0.24585,0.304318,0.962488,239.801952


Sort the consolidated results by the lowest validation loss:

In [28]:
results_df.sort_values(by = 'val_MSE', ascending = True).head(15)

Unnamed: 0,model_name,MF_type,MF_num,learning_rate,batch_size,num_epochs,train_MSE,train_MAE,train_RMSE,train_R2,val_MSE,val_MAE,val_RMSE,val_R2,training_time
126,model 127,Gaussian,3,0.001,32,500,0.091951,0.246258,0.303235,0.962469,0.087073,0.237976,0.295082,0.964731,115.826656
92,model 93,Gaussian,5,0.0005,128,500,0.090378,0.244659,0.300629,0.963111,0.08797,0.240941,0.296597,0.964367,826.748628
130,model 131,Generalized Bell,4,0.001,32,500,0.090078,0.240629,0.300129,0.963234,0.087986,0.237183,0.296624,0.964361,222.335216
94,model 95,Generalized Bell,4,0.0005,128,500,0.092126,0.243156,0.303522,0.962398,0.089283,0.238489,0.298802,0.963836,302.328224
110,model 111,Gaussian,5,0.0001,32,500,0.095936,0.251464,0.309735,0.960843,0.090448,0.242283,0.300745,0.963364,390.827667
20,model 21,Gaussian,5,0.001,128,250,0.093806,0.248178,0.306278,0.961712,0.090685,0.242661,0.301139,0.963268,245.70169
158,model 159,Generalized Bell,5,0.001,64,500,0.094888,0.248418,0.308039,0.961271,0.091104,0.24193,0.301835,0.963098,603.285423
50,model 51,Generalized Bell,5,0.001,32,250,0.095119,0.249238,0.308414,0.961176,0.091451,0.24239,0.302409,0.962957,199.234537
32,model 33,Generalized Bell,5,0.0001,32,250,0.096348,0.249676,0.3104,0.960674,0.091605,0.242811,0.302664,0.962895,201.651433
41,model 42,Generalized Bell,5,0.0005,32,250,0.0942,0.248071,0.30692,0.961551,0.091909,0.244778,0.303165,0.962772,200.25141


Sort the consolidated results by the lowest training MAE (the cell below sorts on `train_MAE`; sorting on `val_MAE` instead would rank by validation MAE):

In [29]:
results_df.sort_values(by = 'train_MAE', ascending = True).head(15)

Unnamed: 0,model_name,MF_type,MF_num,learning_rate,batch_size,num_epochs,train_MSE,train_MAE,train_RMSE,train_R2,val_MSE,val_MAE,val_RMSE,val_R2,training_time
130,model 131,Generalized Bell,4,0.001,32,500,0.090078,0.240629,0.300129,0.963234,0.087986,0.237183,0.296624,0.964361,222.335216
94,model 95,Generalized Bell,4,0.0005,128,500,0.092126,0.243156,0.303522,0.962398,0.089283,0.238489,0.298802,0.963836,302.328224
92,model 93,Gaussian,5,0.0005,128,500,0.090378,0.244659,0.300629,0.963111,0.08797,0.240941,0.296597,0.964367,826.748628
126,model 127,Gaussian,3,0.001,32,500,0.091951,0.246258,0.303235,0.962469,0.087073,0.237976,0.295082,0.964731,115.826656
41,model 42,Generalized Bell,5,0.0005,32,250,0.0942,0.248071,0.30692,0.961551,0.091909,0.244778,0.303165,0.962772,200.25141
20,model 21,Gaussian,5,0.001,128,250,0.093806,0.248178,0.306278,0.961712,0.090685,0.242661,0.301139,0.963268,245.70169
104,model 105,Generalized Bell,5,0.001,128,500,0.095019,0.248234,0.308252,0.961217,0.092485,0.244383,0.304114,0.962538,950.39757
158,model 159,Generalized Bell,5,0.001,64,500,0.094888,0.248418,0.308039,0.961271,0.091104,0.24193,0.301835,0.963098,603.285423
13,model 14,Generalized Bell,4,0.0005,128,250,0.095382,0.249063,0.30884,0.961069,0.092014,0.243012,0.303338,0.962729,105.267374
50,model 51,Generalized Bell,5,0.001,32,250,0.095119,0.249238,0.308414,0.961176,0.091451,0.24239,0.302409,0.962957,199.234537


Sort the consolidated results by the lowest validation RMSE:

In [30]:
results_df.sort_values(by = 'val_RMSE', ascending = True).head(15)

Unnamed: 0,model_name,MF_type,MF_num,learning_rate,batch_size,num_epochs,train_MSE,train_MAE,train_RMSE,train_R2,val_MSE,val_MAE,val_RMSE,val_R2,training_time
126,model 127,Gaussian,3,0.001,32,500,0.091951,0.246258,0.303235,0.962469,0.087073,0.237976,0.295082,0.964731,115.826656
92,model 93,Gaussian,5,0.0005,128,500,0.090378,0.244659,0.300629,0.963111,0.08797,0.240941,0.296597,0.964367,826.748628
130,model 131,Generalized Bell,4,0.001,32,500,0.090078,0.240629,0.300129,0.963234,0.087986,0.237183,0.296624,0.964361,222.335216
94,model 95,Generalized Bell,4,0.0005,128,500,0.092126,0.243156,0.303522,0.962398,0.089283,0.238489,0.298802,0.963836,302.328224
110,model 111,Gaussian,5,0.0001,32,500,0.095936,0.251464,0.309735,0.960843,0.090448,0.242283,0.300745,0.963364,390.827667
20,model 21,Gaussian,5,0.001,128,250,0.093806,0.248178,0.306278,0.961712,0.090685,0.242661,0.301139,0.963268,245.70169
158,model 159,Generalized Bell,5,0.001,64,500,0.094888,0.248418,0.308039,0.961271,0.091104,0.24193,0.301835,0.963098,603.285423
50,model 51,Generalized Bell,5,0.001,32,250,0.095119,0.249238,0.308414,0.961176,0.091451,0.24239,0.302409,0.962957,199.234537
32,model 33,Generalized Bell,5,0.0001,32,250,0.096348,0.249676,0.3104,0.960674,0.091605,0.242811,0.302664,0.962895,201.651433
41,model 42,Generalized Bell,5,0.0005,32,250,0.0942,0.248071,0.30692,0.961551,0.091909,0.244778,0.303165,0.962772,200.25141


Sort the consolidated results by the highest validation $R^{2}$:

In [32]:
results_df.sort_values(by = 'val_R2', ascending = False).head(15)

Unnamed: 0,model_name,MF_type,MF_num,learning_rate,batch_size,num_epochs,train_MSE,train_MAE,train_RMSE,train_R2,val_MSE,val_MAE,val_RMSE,val_R2,training_time
126,model 127,Gaussian,3,0.001,32,500,0.091951,0.246258,0.303235,0.962469,0.087073,0.237976,0.295082,0.964731,115.826656
92,model 93,Gaussian,5,0.0005,128,500,0.090378,0.244659,0.300629,0.963111,0.08797,0.240941,0.296597,0.964367,826.748628
130,model 131,Generalized Bell,4,0.001,32,500,0.090078,0.240629,0.300129,0.963234,0.087986,0.237183,0.296624,0.964361,222.335216
94,model 95,Generalized Bell,4,0.0005,128,500,0.092126,0.243156,0.303522,0.962398,0.089283,0.238489,0.298802,0.963836,302.328224
110,model 111,Gaussian,5,0.0001,32,500,0.095936,0.251464,0.309735,0.960843,0.090448,0.242283,0.300745,0.963364,390.827667
20,model 21,Gaussian,5,0.001,128,250,0.093806,0.248178,0.306278,0.961712,0.090685,0.242661,0.301139,0.963268,245.70169
158,model 159,Generalized Bell,5,0.001,64,500,0.094888,0.248418,0.308039,0.961271,0.091104,0.24193,0.301835,0.963098,603.285423
50,model 51,Generalized Bell,5,0.001,32,250,0.095119,0.249238,0.308414,0.961176,0.091451,0.24239,0.302409,0.962957,199.234537
32,model 33,Generalized Bell,5,0.0001,32,250,0.096348,0.249676,0.3104,0.960674,0.091605,0.242811,0.302664,0.962895,201.651433
41,model 42,Generalized Bell,5,0.0005,32,250,0.0942,0.248071,0.30692,0.961551,0.091909,0.244778,0.303165,0.962772,200.25141


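Sort the consolidated results by the lowest training loss, validation loss, validation MAE, validation RMSE, and highest validation $R^{2}$. Because a multi-column sort is lexicographic, the ordering is determined by the training loss, and the remaining columns only break ties: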

In [34]:
results_df.sort_values(by = ['train_MSE', 'val_MSE', 'val_MAE', 'val_RMSE', 'val_R2'], ascending = [True, True, True, True, False]).head(3)

Unnamed: 0,model_name,MF_type,MF_num,learning_rate,batch_size,num_epochs,train_MSE,train_MAE,train_RMSE,train_R2,val_MSE,val_MAE,val_RMSE,val_R2,training_time
130,model 131,Generalized Bell,4,0.001,32,500,0.090078,0.240629,0.300129,0.963234,0.087986,0.237183,0.296624,0.964361,222.335216
92,model 93,Gaussian,5,0.0005,128,500,0.090378,0.244659,0.300629,0.963111,0.08797,0.240941,0.296597,0.964367,826.748628
126,model 127,Gaussian,3,0.001,32,500,0.091951,0.246258,0.303235,0.962469,0.087073,0.237976,0.295082,0.964731,115.826656
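
Since a multi-column sort gives priority to its first key, one alternative for weighing all five metrics equally is to rank each metric separately and average the ranks. The sketch below is an assumption about how this could be done, not the method used in this notebook; `add_mean_rank` is a hypothetical helper, and the three rows are copied from the table above.

```python
import pandas as pd

def add_mean_rank(df, lower_is_better, higher_is_better):
    """Rank every metric separately, then sort by the average rank."""
    ranks = pd.DataFrame(index=df.index)
    for col in lower_is_better:
        ranks[col] = df[col].rank(ascending=True, method='min')
    for col in higher_is_better:
        ranks[col] = df[col].rank(ascending=False, method='min')

    out = df.copy()
    out['mean_rank'] = ranks.mean(axis=1)
    return out.sort_values('mean_rank')

# the three rows shown above, restricted to the metrics being compared:
top3 = pd.DataFrame({
    'model_name': ['model 131', 'model 93', 'model 127'],
    'train_MSE': [0.090078, 0.090378, 0.091951],
    'val_MSE': [0.087986, 0.08797, 0.087073],
    'val_MAE': [0.237183, 0.240941, 0.237976],
    'val_RMSE': [0.296624, 0.296597, 0.295082],
    'val_R2': [0.964361, 0.964367, 0.964731],
})

ranked = add_mean_rank(
    top3,
    lower_is_better=['train_MSE', 'val_MSE', 'val_MAE', 'val_RMSE'],
    higher_is_better=['val_R2'],
)
print(ranked[['model_name', 'mean_rank']])
```

The ranks here are computed among these three rows only; applying the same helper to the full `results_df` would rank all 162 models.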


# **Results and Conclusions:**

This section will be written after the hyperparameter sweep has been re-run so that it includes $R^{2}$.