# Applied Machine Learning
## Grid search to determine the best training parameters
- Author: Lorien Pratt
- Copyright: Quantellia LLC 2019.  All Rights Reserved

Grid search trains many models, one per combination of candidate settings, to find which combination produces the best result after a limited number of epochs. This assumes that early performance is a good proxy for final learning performance, which may or may not be true.

Grid search explores multiple network architectures (number of layers, number of hidden units per layer) and other learning parameters, such as regularization strength and input dropout.
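
To make the idea concrete, here is a minimal base R sketch of a grid search, independent of H2O. The function `toy_validation_error` is a hypothetical stand-in for "train a model and return its validation error"; it is not part of this lesson's data or of any library, and the candidate values are chosen only for illustration.

```r
# Hypothetical stand-in for "train a model with these settings and
# return its validation error". A real grid search would train a model here.
toy_validation_error <- function(hidden_units, l2) {
    (hidden_units - 10)^2 / 100 + (l2 - 0.01)^2 * 1e4
}

# Enumerate every combination of the candidate settings: 3 x 3 = 9 rows
grid <- expand.grid(hidden_units = c(1, 5, 10), l2 = c(0, 0.01, 0.001))

# Score each combination
grid$error <- mapply(toy_validation_error, grid$hidden_units, grid$l2)

# Sort ascending so that the best combination comes first
grid <- grid[order(grid$error), ]
best <- grid[1, ]
print(best)
```

The cost grows multiplicatively with each added parameter, which is why the grid is usually run for a limited number of epochs rather than to convergence.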

## Setup

In [12]:
# Set up to be able to invoke R from inside this Python 2 notebook
#%load_ext rpy2.ipython
#import rpy2.rinterface

##### Load and initialize the H2O library, which we will use to do the grid search
Note that this will generate a lot of warnings. These are expected: they are notifications, not errors.

In [13]:
require(h2o)
h2o.init()
h2o.no_progress() # Turns off progress bars, which don't display well in Jupyter

 Connection successful!

R is connected to the H2O cluster: 
    H2O cluster uptime:         23 hours 4 seconds 
    H2O cluster timezone:       Etc/UTC 
    H2O data parsing timezone:  UTC 
    H2O cluster version:        3.26.0.10 
    H2O cluster version age:    4 days  
    H2O cluster name:           H2O_started_from_R_jupyter_cif027 
    H2O cluster total nodes:    1 
    H2O cluster total memory:   3.09 GB 
    H2O cluster total cores:    4 
    H2O cluster allowed cores:  4 
    H2O cluster healthy:        TRUE 
    H2O Connection ip:          localhost 
    H2O Connection port:        54321 
    H2O Connection proxy:       NA 
    H2O Internal Security:      FALSE 
    H2O API Extensions:         Amazon S3, XGBoost, Algos, AutoML, Core V3, TargetEncoder, Core V4 
    R Version:                  R version 3.6.1 (2019-07-05) 



Set up my initials for file names

In [14]:
my_initials<-"nm"

Build the names of the training, test, and backtest files that we created in the Prepare Data lesson

In [15]:
train_filename<-paste0("data/",my_initials,"_train_auto.csv"); print( train_filename )
test_filename<-paste0("data/",my_initials,"_test_auto.csv"); print( test_filename )
backtest_filename<-paste0("data/",my_initials,"_backtest_auto.csv"); print( backtest_filename )

[1] "data/nm_train_auto.csv"
[1] "data/nm_test_auto.csv"
[1] "data/nm_backtest_auto.csv"


Read in the training and test files, converting them to h2o's internal "hex" format along the way.

In [None]:
train_hex <- h2o.importFile(train_filename, parse = TRUE, header = TRUE, 
                            sep = "", col.names = NULL, col.types = NULL, na.strings = NULL)
test_hex <- h2o.importFile(test_filename, parse = TRUE, header = TRUE, 
                           sep = "", col.names = NULL, col.types = NULL, na.strings = NULL)

Tell the grid search which of the columns are predictors.  First, let's look at the top of the dataset again to remind us of the structure...

In [6]:
head(train_hex)

| car.name | cylinders | displacement | horsepower | weight | acceleration | model.year | origin | mpg |
|---|---|---|---|---|---|---|---|---|
| `<fct>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` |
| chevroelt chevelle malibu | 6 | 250 | 105 | 3897 | 18.5 | 75 | 1 | 16.0 |
| oldsmobile cutlass salon brougham | 8 | 260 | 90 | 3420 | 22.2 | 79 | 1 | 23.9 |
| plymouth reliant | 4 | 135 | 84 | 2385 | 12.9 | 81 | 1 | 30.0 |
| audi	5000 | 5 | 131 | 103 | 2830 | 15.9 | 78 | 2 | 20.3 |
| oldsmobile omega brougham | 6 | 173 | 115 | 2700 | 12.9 | 79 | 1 | 26.8 |
| plymouth fury iii | 8 | 318 | 150 | 4096 | 13.0 | 71 | 1 | 14.0 |


Set the predictor columns and check that they're the right ones

In [7]:
predictors <- c(2,3,4,5,6,7,8)
names(train_hex)[predictors]

Tell the model training which of the columns is the target column (in this case, the very last column, mpg)

In [8]:
targetcol<-ncol(train_hex)
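
Hard-coded column positions break silently if the column order changes. As a hedged alternative, the predictors can be derived from column names. The sketch below uses a plain character vector copied from the `head(train_hex)` output; in the notebook, `names(train_hex)` would supply it. The variable names `col_names`, `excluded`, `predictor_names`, and `predictor_idx` are illustrative and not part of the original lesson.

```r
# Column names as shown in the head(train_hex) output
col_names <- c("car.name", "cylinders", "displacement", "horsepower",
               "weight", "acceleration", "model.year", "origin", "mpg")

target_name <- "mpg"
# The car name is an identifier and mpg is the target, so neither is a predictor
excluded <- c("car.name", target_name)

# Everything else is a predictor
predictor_names <- setdiff(col_names, excluded)

# Recover the column positions, for comparison with c(2,3,4,5,6,7,8)
predictor_idx <- match(predictor_names, col_names)

print(predictor_names)
print(predictor_idx)
```

This yields the same seven columns as the index vector used in the cell that sets `predictors`.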

Create a set of grid search *hyperparameters*. These are the alternative structures and settings we'll try, to see which combination
produces the best results after running the specified number of epochs

In [9]:
hyper_params <- list(
    hidden=list(1, 5, 10, c(5,5), c(10,10,10)),
    l1=c(0, .01, .00001),
    l2=c(0, .01, 0.001, .00001),
    input_dropout_ratio=c(0, .01, .0001),
    epochs=c(100)
)
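
Because the grid is the Cartesian product of these lists, the number of models to train is the product of their lengths. A quick base R check, repeating the same `hyper_params` definition so that it runs on its own (`n_models` is an illustrative name):

```r
# Same hyperparameter lists as in the cell that defines hyper_params
hyper_params <- list(
    hidden = list(1, 5, 10, c(5, 5), c(10, 10, 10)),
    l1 = c(0, .01, .00001),
    l2 = c(0, .01, 0.001, .00001),
    input_dropout_ratio = c(0, .01, .0001),
    epochs = c(100)
)

# One model per combination: 5 * 3 * 4 * 3 * 1
n_models <- prod(lengths(hyper_params))
print(n_models)
```

Running this check before launching the search gives a rough idea of how long the run will take and shows how quickly the grid grows when another value is added.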

Run the grid search with these parameters. The grid contains 5 × 3 × 4 × 3 × 1 = 180 combinations, so this can take a little while, during which there will be no feedback.

In [10]:
grid_result <- h2o.grid(
    algorithm="deeplearning",
    x=predictors,
    y=targetcol,
    grid_id="grid_2", # Can't be reused; consider incrementing on subsequent runs. TBD: try kernel restart for this instead
    training_frame=train_hex,
    validation_frame=test_hex,
    quiet_mode=FALSE,
    export_weights_and_biases=TRUE,
    activation="Tanh",
    autoencoder=FALSE,
    ignore_const_cols=FALSE,
    standardize=FALSE,
    train_samples_per_iteration=0,
    adaptive_rate=FALSE, # Manually tuned learning rate
    classification_stop = -1, # Disable automatic stopping
    regression_stop = -1, # Disable automatic stopping
    stopping_rounds = 0, # Don't stop automatically
    hyper_params = hyper_params
)

Display the grid search results

In [11]:
h2o.getGrid("grid_2", sort_by="mse", decreasing=FALSE)

H2O Grid Details

Grid ID: grid_2 
Used hyper parameters: 
  -  epochs 
  -  hidden 
  -  input_dropout_ratio 
  -  l1 
  -  l2 
Number of models: 180 
Number of failed models: 0 

Hyper-Parameter Search Summary: ordered by increasing mse
  epochs       hidden input_dropout_ratio     l1   l2       model_ids
1  100.0 [10, 10, 10]                0.01    0.0  0.0 grid_2_model_10
2  100.0         [10]                 0.0   0.01  0.0 grid_2_model_18
3  100.0 [10, 10, 10]              1.0E-4   0.01  0.0 grid_2_model_30
4  100.0       [5, 5]              1.0E-4 1.0E-5 0.01 grid_2_model_89
5  100.0 [10, 10, 10]                 0.0   0.01 0.01 grid_2_model_65
                 mse
1   58.6383415372254
2 58.887268785677044
3 58.893496403324285
4  58.95553127839519
5  60.11445319547099

---
    epochs hidden input_dropout_ratio     l1     l2        model_ids
175  100.0   [10]                0.01 1.0E-5 1.0E-5 grid_2_model_173
176  100.0    [5]                0.01 1.0E-5   0.01  grid_2_model_82
177