# About
Hyperparameter optimization is required to get the most out of your machine learning models.

Hyperparameters are points of choice or configuration that allow a machine learning model to be customized for a specific task or dataset.

Parameters are different from hyperparameters. Parameters are learned automatically; hyperparameters are set manually to help guide the learning process.

Choosing a hyperparameter grid is probably the most difficult part of hyperparameter tuning: it's nearly impossible ahead of time to say which values of hyperparameters will work well and the optimal settings will depend on the dataset. Moreover, the hyperparameters have complex interactions with each other which means that just tuning one at a time doesn't work because when we start changing other hyperparameters that will affect the one we just tuned!

! https://practicaldatascience.co.uk/machine-learning/how-to-use-model-selection-and-hyperparameter-tuning

# Libraries

In [1]:
%run "../../main_global.ipynb"

Connection with MySQL database is ready!


In [2]:
#pip install xgboost

In [3]:
#pip install pytictoc

In [4]:
import numpy as np
import pandas as pd

import os
os.getcwd()

# Save trained models
import joblib

# Data
from sklearn.model_selection import train_test_split
from sklearn.utils.multiclass import type_of_target

# Hypertuning tools
from sklearn.model_selection import KFold
from sklearn.model_selection import RandomizedSearchCV

# Metrics
from sklearn.metrics import SCORERS

# Nonlinear models
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn import svm
from sklearn.gaussian_process import GaussianProcessRegressor

# Ensemble models
from sklearn.ensemble import RandomForestRegressor
from sklearn.ensemble import BaggingRegressor
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.ensemble import GradientBoostingRegressor
from xgboost import XGBRegressor

# Random seed
from numpy.random import seed
seed(101)

In [5]:
from pytictoc import TicToc
t = TicToc() #create instance of class
s = t

In [36]:
# Metrics
from sklearn.metrics import accuracy_score
from sklearn.metrics import mean_squared_error
from sklearn.metrics import mean_absolute_error
from sklearn.metrics import mean_absolute_percentage_error
from sklearn.metrics import r2_score
from sklearn.metrics import max_error

# User-Defined Functions

In [6]:
class multivariate_samples(object):
    """
    Sequential processing of data to obtain time series.
    
    Activities:
    - initial_df: Read SQL dataset for specific station number.
    - samples_creation: Creation of samples array.
    """

    def __init__(self, table_name, target, cols = '*', where = ""):
        """
        Input:
        * station_number: Database station number to process
        """
        self.table_name = table_name
        self.cols = cols
        self.where = where
        self.target = target
        
    def initial_df(self):
        # Read raw dataset components from SQL database
        sql_df = qdata("Select {} from {} {}".format(self.cols, self.table_name, self.where))
        
        if self.cols == '*':
            col_names = [i[0] for i in qdata("show columns from {}".format(self.table_name))]
        else: 
            col_names = self.cols.split(', ')

        # Create dataframe
        df = pd.DataFrame(sql_df)
        df.columns = col_names

        # Set `datetime` column as dataframe index
        df = df.set_index('datetime')
        df.sort_index(inplace=True)
        
        # Save temporary array with unmodified target information
        target_arr = df[self.target]
        
        # Data normalization
        df=(df-df.min())/(df.max()-df.min())
        df = df.fillna(0)
        df[self.target] = target_arr

        # Overview
        return df
    
    def samples_creation(self, n_steps, target_name):
        """
        Transformation of Dataframe object into numpy.ndarray objects (input, output)
        """
        
        # Rearrangin dataset to place target as last column
        df = self.initial_df()
        
        target_col = df[target_name]

        df = df.loc[:, df.columns != target_name]
        df[target_name] = target_col     
        
        arr = df.to_numpy()
        del target_col
        
        # Creating samples
        tmp = list(reversed(range(len(arr)+1)))
        tmp = tmp[:-n_steps][::-1]
        tmp = pd.DataFrame(tmp).reset_index(drop = False)
        tmp.columns = ["index", "end_ix"]
        
        # Create empty lists 
        X, y = list(), list()

        for i, end_ix in zip(tmp["index"], tmp["end_ix"]):
            
            # Gather input and output parts of the pattern
            seq_x, seq_y = arr[i:end_ix, :-1], arr[end_ix-1, -1]
            X.append(seq_x)
            y.append(seq_y)        
        
        return np.array(X), np.array(y)

In [7]:
def hyper_tuning(name, model, space, X, y):
    # The searching algorithm includes a “cv” argument that allows:
    # a) An integer number of folds to be specified, e.g. 5
    #cross_val = 5
    # b) A configured cross-validation object.
    kfold = KFold(n_splits=3, shuffle=False)

    # The scoring metric must be maximizing, meaning better models result in larger scores.
    scoring_metric = 'neg_mean_squared_error'

    # Search for best hyperparameters
    grid = RandomizedSearchCV(estimator=model, 
                              param_distributions=search_space, 
                              cv=kfold, 
                              n_iter=100,
                              scoring=scoring_metric)

    result = grid.fit(X_test, y_test)
    
    # Save the trained model
    filename = 'ml_trained_models/{}.sav'.format(name)
    joblib.dump(result, filename)

    return result

In [8]:
# Evaluate a single model
def single_model_evaluation(X_test, y_test, name):
    # Load the trained model
    filename = 'ml_trained_models/{}.sav'.format(name)
    model = joblib.load(filename)

    # make predictions
    y_prediction = model.predict(X_test)
    
    metrics = dict()
    # evaluate predictions
    metrics["RMSE"] = mean_squared_error(y_test, y_prediction, squared=False)
    metrics["MAE"] = mean_absolute_error(y_test, y_prediction)
    metrics["MAPE (%)"] = mean_absolute_percentage_error(y_test, y_prediction) *100
    metrics["R^2 (%)"] = r2_score(y_test, y_prediction) * 100
    metrics["Max Error"] = max_error(y_test, y_prediction)    
    
    return metrics

# Data

## Sample preparation

In [9]:
sql_table = "sima_station_MVI_MICE_CE"
target = "pm25"

# Define columns of interest from sql table
#     Select all columns:
column = "datetime, co, no, no2, o3, pm10, pm25, prs, rh, so2, sr, tout, wdr, wsr"
# We remove NOx because it has high correlation with NO. Also rainf because it barely has any information

#column = "*"
#     Select specific columns:
#column = "datetime, co, no, no2, nox, o3, pm10, pm25, prs, rainf, rh, so2, sr, tout, wdr, wsr "

# Filter data with WHERE command
sql_where = "where datetime >=\'2021-04-17 23:00:00\'"
#"where datetime > \'2020-04-20\'"

# Initialize class to create multivariate samples:
multi_ts = multivariate_samples(sql_table, target, column, sql_where)

# Datasets can't be trained with sample batches by default. So parameter is 1.
X, y = multi_ts.samples_creation(1, target)

# Training and test datasets are prepared, avoiding shuffling because it is a time series.
X_train, X_test, y_train, y_test = train_test_split(X[:,0,:], y, test_size = 0.30, shuffle= False)

In [10]:
type_of_target(y_train)

'continuous'

# Hyperparameter tuning

## Objective function

In [11]:
#sorted(SCORERS.keys())

# Random Search
RandomizedSearchCV for random search evaluates models for a given hyperparameter vector using cross-validation, hence the “CV” suffix of each class name.

It requires two arguments. 
1. The first is the model that you are optimizing. This is an instance of the model with values of hyperparameters set that you want to optimize. 
2. The second is the search space. This is defined as a dictionary where the names are the hyperparameter arguments to the model and the values are discrete values or a distribution of values to sample in the case of a random search.

## K-Nearest Neighbors
KNeighborsRegressor()

In [12]:
# Select an algorithm
model = KNeighborsRegressor()
model.get_params()

{'algorithm': 'auto',
 'leaf_size': 30,
 'metric': 'minkowski',
 'metric_params': None,
 'n_jobs': None,
 'n_neighbors': 5,
 'p': 2,
 'weights': 'uniform'}

In [13]:
# define search space
search_space = [{
    'n_neighbors': list(range(1,10)),
    'weights': list(['uniform', 'distance']),
    'algorithm': list(['auto', 'ball_tree', 'kd_tree', 'brute']),
    'leaf_size': list(range(15, 45)),
    'p': list([1,2]),
    'metric': list(['euclidean', 'manhattan','chebyshev', 'minkowski']),
    # The search can be made parallel using various if not all of your CPU cores 
    # We can set it to -1 to automatically use all of the cores in the system.
    'n_jobs': list([-1])
}]

In [14]:
t.tic()
result_KNN = hyper_tuning("KNN", model, search_space, X_train, y_train)
t.toc(restart=True)
# Get the results
print(result_KNN.best_score_)
print("")
print(result_KNN.best_estimator_)
print("")
print(result_KNN.best_params_)

Elapsed time is 12.963648 seconds.
-133.3859987640618

KNeighborsRegressor(algorithm='kd_tree', leaf_size=18, n_jobs=-1, n_neighbors=9,
                    p=1, weights='distance')

{'weights': 'distance', 'p': 1, 'n_neighbors': 9, 'n_jobs': -1, 'metric': 'minkowski', 'leaf_size': 18, 'algorithm': 'kd_tree'}


## Classification and Regression Tree
DecisionTreeRegressor()

In [15]:
# Select an algorithm
model = DecisionTreeRegressor()
model.get_params()

{'ccp_alpha': 0.0,
 'criterion': 'squared_error',
 'max_depth': None,
 'max_features': None,
 'max_leaf_nodes': None,
 'min_impurity_decrease': 0.0,
 'min_samples_leaf': 1,
 'min_samples_split': 2,
 'min_weight_fraction_leaf': 0.0,
 'random_state': None,
 'splitter': 'best'}

In [16]:
# define search space
search_space = [{
    'criterion': list(['squared_error', 'friedman_mse', 'absolute_error', 'poisson'])
    , 'splitter': list(['best', 'random'])
    , 'max_depth': list(range(1,10))
    , 'min_samples_split': list(range(2,10))
    , 'min_samples_leaf': list(range(1,10))
    , 'min_weight_fraction_leaf': list(np.linspace(0.0,0.5))
}]

In [17]:
t.tic()
result_DTR = hyper_tuning("DecisionTrees", model, search_space, X_train, y_train)
t.toc(restart=True)

# Get the results
print(result_DTR.best_score_)
print("")
print(result_DTR.best_estimator_)
print("")
print(result_DTR.best_params_)

Elapsed time is 2.908043 seconds.
-170.69338613567842

DecisionTreeRegressor(criterion='friedman_mse', max_depth=9, min_samples_leaf=9,
                      min_samples_split=7,
                      min_weight_fraction_leaf=0.02040816326530612)

{'splitter': 'best', 'min_weight_fraction_leaf': 0.02040816326530612, 'min_samples_split': 7, 'min_samples_leaf': 9, 'max_depth': 9, 'criterion': 'friedman_mse'}


## Support Vector Regression - Polynomial
svm.SVR(kernel='poly')

In [18]:
# Select an algorithm
model = svm.SVR()
model.get_params()

{'C': 1.0,
 'cache_size': 200,
 'coef0': 0.0,
 'degree': 3,
 'epsilon': 0.1,
 'gamma': 'scale',
 'kernel': 'rbf',
 'max_iter': -1,
 'shrinking': True,
 'tol': 0.001,
 'verbose': False}

In [19]:
# define search space
search_space = [{
    'kernel': list(['poly'])
    # `degree` is a parameter used when kernel is set to ‘poly’.
    , 'degree': list([0, 2, 3, 4, 5, 6])
    # Gamma is a parameter for non linear hyperplanes. 
    # The higher the gamma value it tries to exactly fit the training data set
    , 'gamma' : list([0.1, 1, 10, 100])
    # C is the penalty parameter of the error term. 
    # It controls the trade off between smooth decision boundary and classifying the training points correctly.
    , 'C': list([0.1, 1, 10, 100, 1000])
}]

In [20]:
if(False):
    t.tic()
    result_SVM_poly = hyper_tuning("SVR_Poly", model, search_space, X_train, y_train)
    t.toc(restart=True)

    # Get the results
    print(result_SVM_poly.best_score_)
    print("")
    print(result_SVM_poly.best_estimator_)
    print("")
    print(result_SVM_poly.best_params_)

## Support Vector Regression - RBF
svm.SVR(kernel='rbf')

In [21]:
# Select an algorithm
model = svm.SVR()
model.get_params()

{'C': 1.0,
 'cache_size': 200,
 'coef0': 0.0,
 'degree': 3,
 'epsilon': 0.1,
 'gamma': 'scale',
 'kernel': 'rbf',
 'max_iter': -1,
 'shrinking': True,
 'tol': 0.001,
 'verbose': False}

In [22]:
# define search space
search_space = [{
    'kernel': list(['rbf'])
    # Gamma is a parameter for non linear hyperplanes. 
    # The higher the gamma value it tries to exactly fit the training data set
    , 'gamma' : list([0.1, 1, 10, 100])
    # C is the penalty parameter of the error term. 
    # It controls the trade off between smooth decision boundary and classifying the training points correctly.
    , 'C': list([0.1, 1, 10, 100, 1000])
}]

In [23]:
t.tic()
result_SVM_RBF = hyper_tuning("SVR_RBF", model, search_space, X_train, y_train)
t.toc(restart=True)

# Get the results
print(result_SVM_RBF.best_score_)
print("")
print(result_SVM_RBF.best_estimator_)
print("")
print(result_SVM_RBF.best_params_)



Elapsed time is 29.747079 seconds.
-96.31969565267705

SVR(C=1000, gamma=0.1)

{'kernel': 'rbf', 'gamma': 0.1, 'C': 1000}


## Support Vector Regression - Linear
svm.SVR(kernel='linear')

In [24]:
# Select an algorithm
model = svm.SVR()
model.get_params()

{'C': 1.0,
 'cache_size': 200,
 'coef0': 0.0,
 'degree': 3,
 'epsilon': 0.1,
 'gamma': 'scale',
 'kernel': 'rbf',
 'max_iter': -1,
 'shrinking': True,
 'tol': 0.001,
 'verbose': False}

In [25]:
# define search space
search_space = [{
    'kernel': list(['linear'])
    # Gamma is a parameter for non linear hyperplanes. 
    # The higher the gamma value it tries to exactly fit the training data set
    , 'gamma' : list([0.1, 1, 10, 100])
    # C is the penalty parameter of the error term. 
    # It controls the trade off between smooth decision boundary and classifying the training points correctly.
    , 'C': list([0.1, 1, 10, 100, 1000])
}]

In [26]:
t.tic()
result_SVM_Linear = hyper_tuning("SVR_Linear", model, search_space, X_train, y_train)
t.toc(restart=True)

# Get the results
print(result_SVM_Linear.best_score_)
print("")
print(result_SVM_Linear.best_estimator_)
print("")
print(result_SVM_Linear.best_params_)



Elapsed time is 13.959796 seconds.
-100.8383024821169

SVR(C=100, gamma=0.1, kernel='linear')

{'kernel': 'linear', 'gamma': 0.1, 'C': 100}


## Random Forest
RandomForestRegressor()

In [27]:
# Select an algorithm
model = RandomForestRegressor()
model.get_params()

{'bootstrap': True,
 'ccp_alpha': 0.0,
 'criterion': 'squared_error',
 'max_depth': None,
 'max_features': 'auto',
 'max_leaf_nodes': None,
 'max_samples': None,
 'min_impurity_decrease': 0.0,
 'min_samples_leaf': 1,
 'min_samples_split': 2,
 'min_weight_fraction_leaf': 0.0,
 'n_estimators': 100,
 'n_jobs': None,
 'oob_score': False,
 'random_state': None,
 'verbose': 0,
 'warm_start': False}

In [28]:
# define search space
search_space = [{
    # `n_estimators` represents the number of trees in the forest. 
    # Usually the higher the number of trees the better to learn the data. It is also computationally expensive.
    'n_estimators': list([100, 200, 300, 400, 500])
    # `max_depth` represents the depth of each tree in the forest. 
    # The deeper the tree, the more splits it has and it captures more information about the data.
    , 'max_depth': list(np.linspace(1, 32, 32, endpoint=True))
    # `min_samples_split` represents the minimum number of samples required to split an internal node. 
    , 'min_samples_split': list([2, 3, 4, 5, 6, 7, 8, 9, 10]) # list(np.linspace(1, 1, 10, endpoint=True))
    # `min_samples_leaf` The minimum number of samples required to be at a leaf node.
    #, 'min_samples_leafs': list([1, 2, 4])
    # `max_features`: Represents the number of features to consider when looking for the best split.
    , 'max_features': list(range(1,X_train.shape[1]))
}]

In [29]:
t.tic()
result_RF = hyper_tuning("RandomForest", model, search_space, X_train, y_train)
t.toc(restart=True)

# Get the results 
print(result_RF.best_score_)
print("")
print(result_RF.best_estimator_)
print("")
print(result_RF.best_params_)

Elapsed time is 329.859395 seconds.
-109.9435418724932

RandomForestRegressor(max_depth=22.0, max_features=4, n_estimators=400)

{'n_estimators': 400, 'min_samples_split': 2, 'max_features': 4, 'max_depth': 22.0}


## Extra-trees regressor
ExtraTreesRegressor()

In [30]:
# Select an algorithm
model = ExtraTreesRegressor()
model.get_params()

{'bootstrap': False,
 'ccp_alpha': 0.0,
 'criterion': 'squared_error',
 'max_depth': None,
 'max_features': 'auto',
 'max_leaf_nodes': None,
 'max_samples': None,
 'min_impurity_decrease': 0.0,
 'min_samples_leaf': 1,
 'min_samples_split': 2,
 'min_weight_fraction_leaf': 0.0,
 'n_estimators': 100,
 'n_jobs': None,
 'oob_score': False,
 'random_state': None,
 'verbose': 0,
 'warm_start': False}

In [31]:
# define search space
search_space = [{
    # `n_estimators` represents the number of trees in the forest. 
    # Usually the higher the number of trees the better to learn the data. It is also computationally expensive.
    'n_estimators': list([1, 2, 4, 8, 16, 32, 64, 100, 200])
    , 'criterion': ['squared_error']
    # `max_depth` represents the depth of each tree in the forest. 
    # The deeper the tree, the more splits it has and it captures more information about the data.
    , 'max_depth': list(np.linspace(1, 32, 32, endpoint=True))
    # `min_samples_split` represents the minimum number of samples required to split an internal node. 
    , 'min_samples_split': list([2, 3, 4, 5, 6, 7, 8, 9, 10]) # list(np.linspace(1, 1, 10, endpoint=True))
    # `min_samples_leaf` The minimum number of samples required to be at a leaf node.
    #, 'min_samples_leafs': list(np.linspace(0.1, 0.5, 5, endpoint=True))
    # `max_features`: Represents the number of features to consider when looking for the best split.
    , 'max_features': list(range(1,X_train.shape[1]))

}]

In [32]:
t.tic()
result_ETR = hyper_tuning("ExtraTrees", model, search_space, X_train, y_train)
t.toc(restart=True)

# Get the results
print(result_ETR.best_score_)
print("")
print(result_ETR.best_estimator_)
print("")
print(result_ETR.best_params_)

Elapsed time is 24.816417 seconds.
-103.267230170774

ExtraTreesRegressor(max_depth=18.0, max_features=11, min_samples_split=5,
                    n_estimators=64)

{'n_estimators': 64, 'min_samples_split': 5, 'max_features': 11, 'max_depth': 18.0, 'criterion': 'squared_error'}


## XG Boost 
XGBRegressor()

In [33]:
# Select an algorithm
model = XGBRegressor()
model.get_params()

{'objective': 'reg:squarederror',
 'base_score': None,
 'booster': None,
 'callbacks': None,
 'colsample_bylevel': None,
 'colsample_bynode': None,
 'colsample_bytree': None,
 'early_stopping_rounds': None,
 'enable_categorical': False,
 'eval_metric': None,
 'gamma': None,
 'gpu_id': None,
 'grow_policy': None,
 'importance_type': None,
 'interaction_constraints': None,
 'learning_rate': None,
 'max_bin': None,
 'max_cat_to_onehot': None,
 'max_delta_step': None,
 'max_depth': None,
 'max_leaves': None,
 'min_child_weight': None,
 'missing': nan,
 'monotone_constraints': None,
 'n_estimators': 100,
 'n_jobs': None,
 'num_parallel_tree': None,
 'predictor': None,
 'random_state': None,
 'reg_alpha': None,
 'reg_lambda': None,
 'sampling_method': None,
 'scale_pos_weight': None,
 'subsample': None,
 'tree_method': None,
 'validate_parameters': None,
 'verbosity': None}

In [34]:
# define search space
search_space = [{
    'max_depth': [3, 5, 6, 10, 15, 20]
    , 'learning_rate': [0.01, 0.1, 0.2, 0.3]
    , 'subsample': np.arange(0.5, 1.0, 0.1)
    , 'colsample_bytree': np.arange(0.4, 1.0, 0.1)
    , 'colsample_bylevel': np.arange(0.4, 1.0, 0.1)
    , 'n_estimators': [100, 500, 1000, 1500, 2000]
}]

In [35]:
t.tic()
result_XGB = hyper_tuning("XGBoost", model, search_space, X_train, y_train)
t.toc(restart=True)

# Get the results
print(result_XGB.best_score_)
print("")
print(result_XGB.best_estimator_)
print("")
print(result_XGB.best_params_)

Elapsed time is 542.755983 seconds.
-97.98322713932491

XGBRegressor(base_score=0.5, booster='gbtree', callbacks=None,
             colsample_bylevel=0.6, colsample_bynode=1, colsample_bytree=0.6,
             early_stopping_rounds=None, enable_categorical=False,
             eval_metric=None, gamma=0, gpu_id=-1, grow_policy='depthwise',
             importance_type=None, interaction_constraints='',
             learning_rate=0.01, max_bin=256, max_cat_to_onehot=4,
             max_delta_step=0, max_depth=6, max_leaves=0, min_child_weight=1,
             missing=nan, monotone_constraints='()', n_estimators=1000,
             n_jobs=0, num_parallel_tree=1, predictor='auto', random_state=0,
             reg_alpha=0, reg_lambda=1, ...)

{'subsample': 0.5, 'n_estimators': 1000, 'max_depth': 6, 'learning_rate': 0.01, 'colsample_bytree': 0.6, 'colsample_bylevel': 0.6}


# Loading and evaluating models

In [37]:
# Load the trained model
filename = 'ml_trained_models/{}.sav'.format("KNN")
model = joblib.load(filename)

# make predictions
y_prediction = model.predict(X_test)

metrics = dict()
# evaluate predictions
metrics["RMSE"] = mean_squared_error(y_test, y_prediction, squared=False)
metrics["MAE"] = mean_absolute_error(y_test, y_prediction)
metrics["MAPE (%)"] = mean_absolute_percentage_error(y_test, y_prediction) *100
metrics["R^2 (%)"] = r2_score(y_test, y_prediction) * 100
metrics["Max Error"] = max_error(y_test, y_prediction)    

print(y_prediction)
metrics

[ 9.   12.   10.   ... 22.25 25.19 27.87]


{'RMSE': 0.0, 'MAE': 0.0, 'MAPE (%)': 0.0, 'R^2 (%)': 100.0, 'Max Error': 0.0}

In [38]:
# Evaluate a dict of models {name:object}, returns {name:score}
def multiple_model_evaluation(X_test, y_test, models_list):
    metrics_df = pd.DataFrame()
    
    for name in models_list:
        # evaluate the model
        s.tic()
        tmp_df = pd.DataFrame(single_model_evaluation(X_test, y_test, name), index=[0])
        tmp_df.insert(0, "Model Name", name, True)
        tmp_df.insert(0, "Type", "ML", True)
        metrics_df = metrics_df.append(tmp_df)
        print("> {}.".format(name))
        s.toc(restart=True)
        
    return metrics_df.reset_index(drop = True)

In [39]:
# get model list
models_list = ["KNN", "DecisionTrees", "SVR_RBF", "SVR_Linear", "RandomForest", "ExtraTrees", "XGBoost"]

# evaluate models
t.tic() #Start timer
results = multiple_model_evaluation(X_test, y_test, models_list)
t.toc() #Time elapsed since t.tic()

results

> KNN.
Elapsed time is 0.220636 seconds.
> DecisionTrees.
Elapsed time is 0.012958 seconds.
> SVR_RBF.
Elapsed time is 0.369502 seconds.
> SVR_Linear.
Elapsed time is 0.165148 seconds.
> RandomForest.
Elapsed time is 0.317994 seconds.
> ExtraTrees.
Elapsed time is 0.044833 seconds.
> XGBoost.
Elapsed time is 0.035049 seconds.
Elapsed time is 0.000370 seconds.


Unnamed: 0,Type,Model Name,RMSE,MAE,MAPE (%),R^2 (%),Max Error
0,ML,KNN,0.0,0.0,0.0,100.0,0.0
1,ML,DecisionTrees,9.96084,7.083602,56.385265,63.942364,68.625079
2,ML,SVR_RBF,8.489852,5.640337,29.947322,73.805777,80.739253
3,ML,SVR_Linear,9.628586,6.628575,26.804631,66.307727,80.389328
4,ML,RandomForest,2.806154,1.919072,17.787861,97.138272,27.241561
5,ML,ExtraTrees,1.871815,1.21141,7.609728,98.726699,28.946576
6,ML,XGBoost,3.656935,2.70164,29.296188,95.13996,26.219849


# Sources:
## Main 
https://practicaldatascience.co.uk/machine-learning/how-to-use-model-selection-and-hyperparameter-tuning


* sklearn.model_selection.RandomizedSearchCV
    - https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.RandomizedSearchCV.html 
    - https://scikit-learn.org/stable/modules/grid_search.html?highlight=randomsearchcv
* sklearn.model_selection.KFold
    - https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.KFold.html
    - https://machinelearningmastery.com/k-fold-cross-validation/


## Models
* KNN
    - https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsRegressor.html
    - https://scikit-learn.org/stable/modules/generated/sklearn.metrics.DistanceMetric.html#sklearn.metrics.DistanceMetric
* DecisionTreeRegressor()
    - https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeRegressor.html
* pmdarima
    - https://towardsdatascience.com/efficient-time-series-using-pythons-pmdarima-library-f6825407b7f0
* SVM
    - https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVR.html?highlight=svm%20svr%20kernel%20poly
    - https://medium.com/all-things-ai/in-depth-parameter-tuning-for-svc-758215394769
* Random Forest
    - https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html
    - https://medium.com/all-things-ai/in-depth-parameter-tuning-for-random-forest-d67bb7e920d
* XGBoost
    - https://towardsdatascience.com/xgboost-fine-tune-and-optimize-your-model-23d996fab663
    
* Gaussian NB
    - https://scikit-learn.org/stable/modules/generated/sklearn.naive_bayes.GaussianNB.html
    - https://medium.com/analytics-vidhya/how-to-improve-naive-bayes-9fa698e14cba
    - https://www.analyticsvidhya.com/blog/2021/01/gaussian-naive-bayes-with-hyperpameter-tuning/
    
## Metrics
* Metrics and scoring: quantifying the quality of predictions
    - https://scikit-learn.org/stable/modules/model_evaluation.html
    - https://openclassrooms.com/en/courses/6401081-improve-the-performance-of-a-machine-learning-model/6539936-improve-your-feature-selection