### Hyperparameter_Tuning

Random and grid search are tuning techniques commonly used when it comes to searching and setting hyperparameter that the model may achieve much better fitting to the data. They search different sets of values defined within a search space for model hyperparameters and return a subset that results in the best performing model on a given dataset. 

Both hyperparameter optimizations would be as much valuable as in experimenting machine learning, however they are not without limitations. Grid search is known for "brute-force" approach citing to permutations applied to each pairs of values of each space group when creating collections of hyperparameters. When a search space grows in length or size, a computer would unavoidably need to test the sheer number of permutations with an exhaustive numbers of iterations to be executed. Finding the trade-off between the time required to run all of the permutations and the search capability of the optimization is a must. An alternative to grid search, random search, simultaneously test the effect of different hyperparameter subsets at the same time using random sampling of each hyperparameter group. Due to its random behaviour, its explorations on the hyperparameter space is limited. In consequence, in order to fill up the gap so that the search can arrive at a global minima, a large number of iterations would be necessary. 

HyperOpt is in finding the optimum sets of hyperparameter based on Bayes Theorem. A brief description 


Due to stochastic nature of HyperOpt, results may vary each time when running this procedure as their differences in numerical precision are much likely to be seen. 


In [38]:
import numpy as np
import pandas as pd 
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.linear_model import Lasso, Ridge, RidgeCV
from sklearn.model_selection import cross_val_score
from statsmodels.tsa.arima.model import ARIMA

plt.style.use('fivethirtyeight')

In [2]:
from math import sqrt

from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

from sklearn.metrics import explained_variance_score, mean_absolute_error, median_absolute_error, mean_squared_error, r2_score
import hyperopt
from hyperopt import fmin, hp, tpe, Trials, space_eval, STATUS_OK
from hyperopt.pyll import scope as ho_scope
from hyperopt.pyll.stochastic import sample as ho_sample
from functools import partial

### Load and Prepare Data

In [4]:
### This inferred freq is 'MS' or Month Start after successfully getting this data loaded
data = pd.read_csv(
        "Internet_sales_UK_preprocessed.csv",
        parse_dates=["date"],
        index_col=["date"],
    )

int_col = list(data.select_dtypes("int").columns)
float_col = list(data.select_dtypes("float").columns)
data[int_col] = data[int_col].astype('int16')
data[float_col] = data[float_col].astype('float32')

data['Log_KPC4'] = np.log(data['KPC4'])
data['Log_KPB8'] = np.log(data['KPB8'])

kpc4_log_diff = data['Log_KPC4'].diff()
kpc4_log_diff = kpc4_log_diff.dropna()
kpb8_log_diff = data['Log_KPB8'].diff()
kpb8_log_diff = kpb8_log_diff.dropna()
    

In [6]:
train = data.loc[data.index < "2019-01-01"]
test = data.loc[data.index >= "2019-01-01"]

# the target variable
y_train = train["KPC4"].copy()
y_test = test["KPC4"].copy()

# remove raw time series from predictors set
X_train = train.drop(['KPC4','KPB8','KPB8_lag_1', 'KPB8_lag_3',
                    'KPB8_lag_6', 'KPB8_lag_12',
                    'KPB8_window_3_mean', 'KPB8_window_3_std',
                            'KPB8_window_6_mean', 'KPB8_window_6_std'], axis=1)
X_test = test.drop(['KPC4','KPB8','KPB8_lag_1', 'KPB8_lag_3',
                    'KPB8_lag_6', 'KPB8_lag_12',
                    'KPB8_window_3_mean', 'KPB8_window_3_std',
                    'KPB8_window_6_mean', 'KPB8_window_6_std'], axis=1)    

### Setting Up HyperOpt Experiment

In [29]:
search_space = {   
        'alpha': hp.uniform("alpha", 0.1, 0.9),
        'tol': hp.uniform("tol", 0.0001, 0.1),
        'fit_intercept': hp.choice('fit_intercept', [True, False]),
        'selection': hp.choice('selection', ["cyclic", "random"])
}

In [31]:
def objective(params):
    params = {
                'alpha': params['alpha'], 
                'tol': params['tol'],
                'fit_intercept': params['fit_intercept'],
                'selection': params['selection']
             }

    lr = Lasso(**params)
    score = cross_val_score(lr, X_train, y_train,
            scoring="r2", cv=5).mean()

    return {'loss': score, 'status': STATUS_OK }

If we just want to define search spaces available for different models to explore and test (and so we would have to rearrange a function objective), below shown modified codes that probably easier for others to find a model with an optimized hyperparameters in seamless way.

In [None]:
### This cell is for personal testing only. This is not used througout this section 
search_space = hp.choice('classifier_type', [
    {
        'type': 'lasso',
        'alpha': hp.uniform("alpha", 0.1, 0.9),
        'tol': hp.uniform("tol", 0.0001, 0.001)
    },
    {
        'type': 'ridge',
        'alphas': hp.uniform("alphas", 0.0001, 0.1),
        'gcv_mode': hp.choice('gcv_mode', ["auto", "svd", "eigen"]),
        'fit_intercept': hp.choice('fit_intercept', [True, False])
    },
])

def objective(params):

    regression_type = params['type']
    del params['type']
    if regression_type == 'lasso':
        lr = Lasso(**params)
    elif regression_type == 'ridge':
        lr = RidgeCV(**params)
    else:
        return 0

    score = cross_val_score(lr, X_train, y_train, scoring="r2", cv=5).mean()

    # Because fmin() tries to minimize the objective, this function must return the negative accuracy. 
    return {'loss': score, 'status': STATUS_OK}

### Running HyperOpt Experiment

#### Lasso

In [32]:
best_result_lasso = fmin(
                fn=objective, 
                space=search_space,
                algo=tpe.suggest,
                trials=Trials(),
                max_evals=100)

100%|█████████████████████████████████████████████| 100/100 [00:03<00:00, 25.29trial/s, best loss: -13.527628648149411]


In [34]:
hyperopt.space_eval(search_space, best_result_lasso)

{'alpha': 0.5827499074887039,
 'fit_intercept': False,
 'selection': 'random',
 'tol': 0.06623517909765843}

#### Ridge

In [53]:
search_space = {   
        'alpha': hp.uniform("alpha", 0.0001, 0.1),
        'solver': hp.choice('solver', ["svd", "cholesky", "lsqr", "sparse_cg", "sag", "saga", "auto"]),
        'tol': hp.uniform("tol", 0.0001, 0.1),
        'fit_intercept': hp.choice('fit_intercept', [True, False]),
}

In [54]:
def objective(params):
    params = {
                'alpha': params['alpha'], 
                'solver': params['solver'],
                'tol': params['tol'],
                'fit_intercept': params['fit_intercept']
             }

    lr = Ridge(**params)
    score = cross_val_score(lr, X_train, y_train,
            scoring="r2", cv=5).mean()

    return {'loss': score, 'status': STATUS_OK }

In [55]:
best_result_ridge = fmin(
                fn=objective, 
                space=search_space,
                algo=tpe.suggest,
                trials=Trials(),
                max_evals=100)

  6%|██▉                                             | 6/100 [00:00<00:03, 29.60trial/s, best loss: 0.8586935407269204]

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T



  9%|████▎                                           | 9/100 [00:00<00:03, 28.47trial/s, best loss: 0.7811918087739963]

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T



 18%|████████▍                                      | 18/100 [00:00<00:03, 26.16trial/s, best loss: 0.7811918087739963]

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T



 24%|███████████▌                                    | 24/100 [00:00<00:03, 23.23trial/s, best loss: 0.760691295422701]

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=Tru

 30%|██████████████▍                                 | 30/100 [00:01<00:03, 19.72trial/s, best loss: 0.725202594358995]

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T



 45%|█████████████████████▏                         | 45/100 [00:01<00:02, 24.59trial/s, best loss: 0.6778398822411266]

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T



 66%|██████████████████████████████▎               | 66/100 [00:02<00:01, 24.22trial/s, best loss: -3.8113374663402504]

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T



 78%|███████████████████████████████████▉          | 78/100 [00:03<00:00, 25.04trial/s, best loss: -3.8113374663402504]

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T



 87%|████████████████████████████████████████      | 87/100 [00:03<00:00, 25.76trial/s, best loss: -3.8113374663402504]

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T



 96%|████████████████████████████████████████████▏ | 96/100 [00:03<00:00, 24.88trial/s, best loss: -3.8113374663402504]

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T

  return linalg.solve(A, Xy, sym_pos=True, overwrite_a=True).T



100%|█████████████████████████████████████████████| 100/100 [00:04<00:00, 24.15trial/s, best loss: -3.8113374663402504]


In [56]:
hyperopt.space_eval(search_space, best_result_ridge)

{'alpha': 0.05240704639688512,
 'fit_intercept': False,
 'solver': 'sag',
 'tol': 0.09896748769827081}