## Course - Hyperparameter Tuning in Python

# Module 2 Grid Search, 3 Random Search

## Introducing Grid Search

### Build Grid Search functions
In data science it is a great idea to try building algorithms, models and processes 'from scratch' so you can really understand what is happening at a deeper level. Of course there are great packages and libraries for this work (and we will get to that very soon!) but building from scratch will give you a great edge in your data science work.

In this exercise, you will create a function to take in 2 hyperparameters, build models and return results. You will use this function in a future exercise.

The X_train, X_test, y_train and y_test datasets are available.

Instructions

Build a function that takes two parameters called learning_rate and max_depth for the learning rate and maximum depth.
Add capability in the function to build a GBM model and fit it to the data with the input hyperparameters.
Have the function return the results of that model and the chosen hyperparameters (learning_rate and max_depth).

In [None]:
# Imports needed by the function
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score

# Create the function
def gbm_grid_search(learning_rate, max_depth):

    # Create the model
    model = GradientBoostingClassifier(learning_rate=learning_rate, max_depth=max_depth)

    # Use the model to make predictions
    predictions = model.fit(X_train, y_train).predict(X_test)

    # Return the hyperparameters and score
    return [learning_rate, max_depth, accuracy_score(y_test, predictions)]

### Iteratively tune multiple hyperparameters
In this exercise, you will build on the function you previously created to take in 2 hyperparameters, build a model and return the results. You will now use that to loop through some values and then extend this function and loop with another hyperparameter.

The function gbm_grid_search(learning_rate, max_depth) is available in this exercise.

If you need to remind yourself of the function, you can run print_func(), which has been created for you.

In [None]:
# Create the relevant lists
results_list = []
learn_rate_list = [0.01, 0.1, 0.5]
max_depth_list = [2, 4, 6]

# Create the for loop
for learn_rate in learn_rate_list:
    for max_depth in max_depth_list:
        results_list.append(gbm_grid_search(learn_rate, max_depth))

# Print the results
print(results_list)
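
A raw list of lists is hard to compare by eye. The sketch below shows one way to load the loop's output into a pandas DataFrame and sort it by accuracy. It is a minimal illustration, not the course's solution: the data comes from scikit-learn's make_classification as a synthetic stand-in for the course datasets, and random_state is fixed only to make the run repeatable.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the course data (an assumption of this sketch)
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

def gbm_grid_search(learning_rate, max_depth):
    # Fit one GBM and return [learning_rate, max_depth, test accuracy]
    model = GradientBoostingClassifier(
        learning_rate=learning_rate, max_depth=max_depth, random_state=0
    )
    predictions = model.fit(X_train, y_train).predict(X_test)
    return [learning_rate, max_depth, accuracy_score(y_test, predictions)]

# Same 3 x 3 grid as in the exercise
results_list = [
    gbm_grid_search(learn_rate, max_depth)
    for learn_rate in [0.01, 0.1, 0.5]
    for max_depth in [2, 4, 6]
]

# One row per model, best accuracy first
results_df = pd.DataFrame(
    results_list, columns=["learning_rate", "max_depth", "accuracy"]
)
results_df = results_df.sort_values("accuracy", ascending=False).reset_index(drop=True)
print(results_df)
```

The top row of results_df is then the best-performing square of this small grid.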

Extend your loop to call gbm_grid_search_extended (available in your console), testing the values [0.4, 0.6] for the subsample hyperparameter, and print the results. max_depth_list and learn_rate_list are available in your environment.

In [None]:
results_list = []

# Create the new list to test
subsample_list = [0.4, 0.6]

for learn_rate in learn_rate_list:
    for max_depth in max_depth_list:

        # Extend the for loop
        for subsample in subsample_list:

            # Extend the results to include the new hyperparameter
            results_list.append(gbm_grid_search_extended(learn_rate, max_depth, subsample))

# Print results
print(results_list)
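
The definition of gbm_grid_search_extended is not shown in these notes. The sketch below is one plausible version, assuming it simply adds subsample as a third argument and returns it alongside the other values. The synthetic data and the use of itertools.product in place of three nested loops are also choices made for this illustration.

```python
from itertools import product

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the course data (an assumption of this sketch)
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

def gbm_grid_search_extended(learning_rate, max_depth, subsample):
    # Hypothetical extension of gbm_grid_search with a third hyperparameter
    model = GradientBoostingClassifier(
        learning_rate=learning_rate,
        max_depth=max_depth,
        subsample=subsample,
        random_state=0,
    )
    predictions = model.fit(X_train, y_train).predict(X_test)
    return [learning_rate, max_depth, subsample, accuracy_score(y_test, predictions)]

learn_rate_list = [0.01, 0.1, 0.5]
max_depth_list = [2, 4, 6]
subsample_list = [0.4, 0.6]

# product() yields every (learn_rate, max_depth, subsample) triple,
# which is what the three nested for loops iterate over
results_list = [
    gbm_grid_search_extended(learn_rate, max_depth, subsample)
    for learn_rate, max_depth, subsample in product(
        learn_rate_list, max_depth_list, subsample_list
    )
]
print(len(results_list))  # 3 * 3 * 2 = 18 models
```

Each extra hyperparameter multiplies the number of models to fit, which is why a hand-rolled grid search becomes expensive quickly.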

### GridSearchCV with Scikit Learn
The GridSearchCV class from Scikit Learn provides many useful features to assist with efficiently undertaking a grid search. You will now put your learning into practice by creating a GridSearchCV object with certain parameters.

The desired options are:

A Random Forest Estimator, with the split criterion as 'entropy'
5-fold cross validation
The hyperparameters max_depth (2, 4, 8, 15) and max_features ('auto' vs 'sqrt')
Use roc_auc to score the models
Use 4 cores for processing in parallel
Ensure you refit the best model and return training scores

You will have available X_train, X_test, y_train & y_test datasets.

Instructions

Create a Random Forest estimator as specified in the context above.
Create a parameter grid as specified in the context above.
Create a GridSearchCV object as outlined in the context above.

In [None]:
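from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Create a Random Forest Classifier with specified criterion
rf_class = RandomForestClassifier(criterion='entropy')

# Create the parameter grid
# Note: max_features='auto' was removed in scikit-learn 1.3 (for classifiers
# it meant 'sqrt'); on newer versions swap it for another valid option
param_grid = {"max_depth": [2, 4, 8, 15], "max_features": ["auto", "sqrt"]}

# Create a GridSearchCV object
grid_rf_class = GridSearchCV(
    estimator=rf_class,
    param_grid=param_grid,
    scoring='roc_auc',
    n_jobs=4,
    cv=5,
    refit=True,
    return_train_score=True)
print(grid_rf_class)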
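
Before fitting, it can help to know how much work the grid implies. The sketch below uses scikit-learn's ParameterGrid to list the candidate combinations for the grid above and count the cross-validation fits. The variable names candidates, n_folds and total_fits are introduced here for illustration only.

```python
from sklearn.model_selection import ParameterGrid

# Same grid as in the exercise; ParameterGrid only enumerates the values
# and does not validate them against the estimator
param_grid = {"max_depth": [2, 4, 8, 15], "max_features": ["auto", "sqrt"]}

candidates = list(ParameterGrid(param_grid))
n_folds = 5

# Each candidate is fitted once per fold during the search
total_fits = len(candidates) * n_folds

print(len(candidates))  # 4 * 2 = 8 candidates
print(total_fits)       # 8 * 5 = 40 cross-validation fits
for params in candidates:
    print(params)
```

With refit=True, the search fits the best candidate once more on the whole training set after the 40 cross-validation fits.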

### Exploring the grid search results
You will now explore the cv_results_ property of the GridSearchCV object defined in the video. This is a dictionary that we can read into a pandas DataFrame and contains a lot of useful information about the grid search we just undertook.

A reminder of the different column types in this property:

time_ columns
param_ columns (one for each hyperparameter) and the singular params column (with all hyperparameter settings)
a train_score column for each cv fold including the mean_train_score and std_train_score columns
a test_score column for each cv fold including the mean_test_score and std_test_score columns
a rank_test_score column with a number from 1 to n (number of iterations) ranking the rows based on their mean_test_score

Instructions

Read the cv_results_ property of the grid_rf_class GridSearchCV object into a data frame & print the whole thing out to inspect.
Extract & print the singular column containing a dictionary of all hyperparameters used in each iteration of the grid search.
Extract & print the row that had the best mean test score by indexing using the rank_test_score column.

In [None]:
import pandas as pd

# Read the cv_results property into a dataframe & print it out
cv_results_df = pd.DataFrame(grid_rf_class.cv_results_)
print(cv_results_df)

# Extract and print the column with a dictionary of hyperparameters used
column = cv_results_df.loc[:, ["params"]]
print(column)

# Extract and print the row that had the best mean test score
best_row = cv_results_df[cv_results_df['rank_test_score'] == 1 ]
print(best_row)
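
To make the column groups listed above concrete, the sketch below runs a deliberately small grid search on synthetic data and sorts the cv_results_ column names into those groups. The data, the tiny grid and the 3-fold setting are assumptions chosen to keep the run fast. They are not the course's grid_rf_class.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Small synthetic problem (an assumption of this sketch)
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

grid = GridSearchCV(
    estimator=RandomForestClassifier(n_estimators=10, random_state=0),
    param_grid={"max_depth": [2, 4], "max_features": ["sqrt", "log2"]},
    scoring="roc_auc",
    cv=3,
    return_train_score=True,
)
grid.fit(X, y)
cv_results_df = pd.DataFrame(grid.cv_results_)

# Group the column names by type
columns = list(cv_results_df.columns)
time_cols = [c for c in columns if c.endswith("_time")]
param_cols = [c for c in columns if c.startswith("param")]
train_cols = [c for c in columns if "train_score" in c]
test_cols = [c for c in columns if "test_score" in c and not c.startswith("rank")]

print(time_cols)
print(param_cols)
print(train_cols)
print(test_cols)
print(cv_results_df.shape[0])  # one row per candidate: 2 * 2 = 4
```

With 3 folds, the train and test groups each hold three per-split columns plus a mean and a standard deviation column.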

### Analyzing the best results
At the end of the day, we primarily care about the best performing 'square' in a grid search. Luckily, Scikit Learn's GridSearchCV objects have a number of attributes that provide key information on just the best square (or row in cv_results_).

Three properties you will explore are:

best_score_ – The score (here ROC_AUC) from the best-performing square.
best_index_ – The index of the row in cv_results_ containing information on the best-performing square.
best_params_ – A dictionary of the parameters that gave the best score, for example {'max_depth': 10}

The grid search object grid_rf_class is available.

A dataframe (cv_results_df) has been created from the cv_results_ for you on line 6. This will help you index into the results.

Instructions

Extract and print out the ROC_AUC score from the best performing square in grid_rf_class.
Create a variable from the best-performing row by indexing into cv_results_df.
Create a variable, best_n_estimators by extracting the n_estimators parameter from the best-performing square in grid_rf_class and print it out.

In [None]:
# Print out the ROC_AUC score from the best-performing square
best_score = grid_rf_class.best_score_
print(best_score)

# Create a variable from the row related to the best-performing square
cv_results_df = pd.DataFrame(grid_rf_class.cv_results_)
best_row = cv_results_df.loc[[grid_rf_class.best_index_]]
print(best_row)

# Get the n_estimators parameter from the best-performing square and print
# (raises KeyError unless n_estimators was one of the hyperparameters in the searched grid)
best_n_estimators = grid_rf_class.best_params_["n_estimators"]
print(best_n_estimators)
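
The three attributes are different views of the same row in cv_results_. The sketch below checks that on a small grid search over synthetic data. The grid includes n_estimators so that best_params_["n_estimators"] exists. The data and grid values are assumptions made for this illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Small synthetic problem (an assumption of this sketch)
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

grid = GridSearchCV(
    estimator=RandomForestClassifier(random_state=0),
    param_grid={"max_depth": [2, 4], "n_estimators": [10, 20]},
    scoring="roc_auc",
    cv=3,
)
grid.fit(X, y)

results = grid.cv_results_
best_idx = grid.best_index_

# best_score_ is the mean test score stored at best_index_
score_matches = grid.best_score_ == results["mean_test_score"][best_idx]

# best_params_ is the params dictionary stored at best_index_
params_match = grid.best_params_ == results["params"][best_idx]

# The best row is the one ranked first
rank_of_best = results["rank_test_score"][best_idx]

# Individual hyperparameters are read by key
best_n_estimators = grid.best_params_["n_estimators"]

print(score_matches, params_match, rank_of_best, best_n_estimators)
```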

### Randomly Sample Hyperparameters
To undertake a random search, we firstly need to undertake a random sampling of our hyperparameter space.

In this exercise, you will first create some lists of hyperparameters that can be zipped up to a list of lists. Then you will randomly sample hyperparameter combinations in preparation for running a random search.

You will use just the hyperparameters learning_rate and min_samples_leaf of the GBM algorithm to keep the example illustrative and not overly complicated.

Instructions

Create a list of 200 values for the learning_rate hyperparameter between 0.01 and 1.5 and assign to the list learn_rate_list.
Create a list of values between 10 and 40 inclusive for the hyperparameter min_samples_leaf and assign to the list min_samples_list.
Combine these lists into a list of lists to sample from.
Randomly sample 250 models from these hyperparameter combinations and print the result.

In [None]:
import numpy as np
from itertools import product

# Create a list of values for the learning_rate hyperparameter
learn_rate_list = list(np.linspace(0.01,1.5,200))

# Create a list of values for the min_samples_leaf hyperparameter
min_samples_list = list(range(10,41))

# Combination list
combinations_list = [list(x) for x in product(learn_rate_list, min_samples_list)]

# Sample hyperparameter combinations for a random search.
random_combinations_index = np.random.choice(range(0, len(combinations_list)), 250, replace=False)
combinations_random_chosen = [combinations_list[x] for x in random_combinations_index]

# Print the result
print(combinations_random_chosen)
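
Two things are worth checking about this sample: how small it is relative to the full space, and whether it can be reproduced. The sketch below repeats the sampling with a seeded NumPy Generator. The seed value and the names rng and coverage are illustrative choices, not part of the exercise.

```python
from itertools import product

import numpy as np

learn_rate_list = list(np.linspace(0.01, 1.5, 200))
min_samples_list = list(range(10, 41))
combinations_list = [list(x) for x in product(learn_rate_list, min_samples_list)]

# A seeded generator gives the same sample on every run
rng = np.random.default_rng(seed=42)
chosen_idx = rng.choice(len(combinations_list), size=250, replace=False)
combinations_random_chosen = [combinations_list[i] for i in chosen_idx]

# Fraction of the search space that the random sample touches
coverage = len(combinations_random_chosen) / len(combinations_list)

print(len(combinations_list))  # 200 * 31 = 6200 combinations
print(f"{coverage:.1%}")
```

Because replace=False, no combination is drawn twice, so all 250 sampled models are distinct.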

### Randomly Search with Random Forest
To solidify your knowledge of random sampling, let's try a similar exercise but using different hyperparameters and a different algorithm.

As before, create some lists of hyperparameters that can be zipped up to a list of lists. You will use the hyperparameters criterion, max_depth and max_features of the random forest algorithm. Then you will randomly sample hyperparameter combinations in preparation for running a random search.

You will use a different sampling function in this task, random.sample().

Instructions

Create lists of the values 'gini' and 'entropy' for criterion & "auto", "sqrt", "log2", None for max_features.
Create a list of values between 3 and 55 inclusive for the hyperparameter max_depth and assign to the list max_depth_list. Remember that range(N, M) covers the values from N to M-1.
Combine these lists into a list of lists to sample from using product().
Randomly sample 150 models from the combined list and print the result.


In [None]:
import random
from itertools import product

# Create lists for criterion and max_features
criterion_list = ['gini', 'entropy']
max_feature_list = ["auto", "sqrt", "log2", None]

# Create a list of values for the max_depth hyperparameter
max_depth_list = list(range(3, 56))

# Combination list
combinations_list = [list(x) for x in product(criterion_list, max_feature_list, max_depth_list)]

# Sample hyperparameter combinations for a random search
combinations_random_chosen = random.sample(combinations_list, 150)

# Print the result
print(combinations_random_chosen)
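
random.sample() draws without replacement, so it can never return the same combination twice and cannot return more items than the population holds. The sketch below illustrates both points with a seeded random.Random instance. The seed and the names rng, sample and oversample_error are illustrative.

```python
import random
from itertools import product

criterion_list = ["gini", "entropy"]
max_feature_list = ["auto", "sqrt", "log2", None]
max_depth_list = list(range(3, 56))

combinations_list = [
    list(x) for x in product(criterion_list, max_feature_list, max_depth_list)
]
print(len(combinations_list))  # 2 * 4 * 53 = 424 combinations

# Seeded generator so the sample is repeatable
rng = random.Random(0)
sample = rng.sample(combinations_list, 150)

# Sampling is without replacement: every drawn combination is distinct
n_unique = len({tuple(c) for c in sample})
print(n_unique)  # 150

# Asking for more items than exist raises ValueError
try:
    rng.sample(combinations_list, len(combinations_list) + 1)
    oversample_error = None
except ValueError as err:
    oversample_error = err
print(type(oversample_error).__name__)
```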

### Visualizing a Random Search
Visualizing the search space of random search allows you to easily see the coverage of this technique and therefore allows you to see the effect of your sampling on the search space.

In this exercise you will use several different samples of hyperparameter combinations and produce visualizations of the search space.

The function sample_and_visualize_hyperparameters() takes a single argument (number of combinations to sample) and then randomly samples hyperparameter combinations, just like you did in the last exercise! The function will then visualize the combinations.

If you want to see the function definition, you can use Python's handy inspect library, like so:

print(inspect.getsource(sample_and_visualize_hyperparameters))

Instructions

Confirm how many possible hyperparameter combinations there are in combinations_list by assigning to the variable number_combs and print this out.
Sample and visualize 50, 500 and 1500 combinations. You will use a loop for succinctness. What do you notice about the visualization?
Now sample and visualize the entire set of combinations. You have already made a variable to assist with this. What does this look like?

In [None]:
# Confirm how many hyperparameter combinations & print
number_combs = len(combinations_list)
print(number_combs)

# Sample and visualise specified combinations
for x in [50, 500,1500]:
    sample_and_visualize_hyperparameters(x)
    
# Sample all the hyperparameter combinations & visualise
sample_and_visualize_hyperparameters(number_combs)
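
The course supplies sample_and_visualize_hyperparameters() and its source is not reproduced here. As a rough idea of what such a helper could look like, the sketch below samples combinations and draws them as a scatter plot with matplotlib. The function name sketch_sample_and_visualize, the seed argument and the learn_rate / min_samples_leaf search space are all assumptions. This is not the course's implementation.

```python
import random
from itertools import product

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

# Assumed search space, built the same way as in the earlier sampling exercise
learn_rate_list = list(np.linspace(0.01, 1.5, 200))
min_samples_list = list(range(10, 41))
combinations_list = [list(x) for x in product(learn_rate_list, min_samples_list)]

def sketch_sample_and_visualize(n_samples, seed=0):
    # Draw n_samples distinct combinations and plot them as points
    chosen = random.Random(seed).sample(combinations_list, n_samples)
    rates, leaves = zip(*chosen)
    fig, ax = plt.subplots()
    ax.scatter(rates, leaves, s=5)
    ax.set_xlabel("learn_rate")
    ax.set_ylabel("min_samples_leaf")
    ax.set_title(f"{n_samples} sampled combinations")
    return fig, chosen

number_combs = len(combinations_list)
for n in [50, 500, 1500, number_combs]:
    fig, chosen = sketch_sample_and_visualize(n)
    print(n, len(chosen))
    plt.close(fig)
```

Sampling every combination fills the plot completely, which is the same set of points an exhaustive grid search would visit.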

### The RandomizedSearchCV Object
Just like the GridSearchCV class from Scikit Learn, RandomizedSearchCV provides many useful features to assist with efficiently undertaking a random search. You're going to create a RandomizedSearchCV object, making the small adjustment needed from the GridSearchCV object.

The desired options are:

A default Gradient Boosting Classifier Estimator
5-fold cross validation
Use accuracy to score the models
Use 4 cores for processing in parallel
Ensure you refit the best model and return training scores
Randomly sample 10 models
The hyperparameter grid should be for learning_rate (150 values between 0.1 and 2) and min_samples_leaf (all values between and including 20 and 64).

You will have available X_train & y_train datasets.

Instructions

Create a parameter grid as specified in the context above.
Create a RandomizedSearchCV object as outlined in the context above.
Fit the RandomizedSearchCV object to the training data.
Print the values chosen by the modeling process for both hyperparameters.

In [None]:
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV

# Create the parameter grid
param_grid = {'learning_rate': np.linspace(0.1, 2, 150), 'min_samples_leaf': list(range(20, 65))}

# Create a random search object
random_GBM_class = RandomizedSearchCV(
    estimator=GradientBoostingClassifier(),
    param_distributions=param_grid,
    n_iter=10,
    scoring='accuracy',
    n_jobs=4,
    cv=5,
    refit=True,
    return_train_score=True)

# Fit to the training data
random_GBM_class.fit(X_train, y_train)

# Print the values used for both hyperparameters
print(random_GBM_class.cv_results_['param_learning_rate'])
print(random_GBM_class.cv_results_['param_min_samples_leaf'])
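
To preview which combinations a random search might try without fitting any models, scikit-learn's ParameterSampler can draw from the same grid. The sketch below is an illustration only. The random_state value is an arbitrary choice, and the draws will not match those of an unseeded RandomizedSearchCV.

```python
import numpy as np
from sklearn.model_selection import ParameterSampler

# Same grid as in the exercise
param_grid = {
    "learning_rate": np.linspace(0.1, 2, 150),
    "min_samples_leaf": list(range(20, 65)),
}

# Draw 10 combinations, mirroring n_iter=10
sampled = list(ParameterSampler(param_grid, n_iter=10, random_state=0))

print(len(sampled))  # 10 combinations out of 150 * 45 = 6750
for params in sampled:
    print(params)
```

Passing the same random_state to RandomizedSearchCV is a common way to make a search repeatable between runs.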

### RandomizedSearchCV in Scikit Learn
Let's practice building a RandomizedSearchCV object using Scikit Learn.

The hyperparameter grid should be for max_depth (all values between and including 5 and 25) and max_features ('auto' and 'sqrt').

The desired options for the RandomizedSearchCV object are:

A RandomForestClassifier Estimator with n_estimators of 80.
3-fold cross validation (cv)
Use roc_auc to score the models
Use 4 cores for processing in parallel (n_jobs)
Ensure you refit the best model and return training scores
Only sample 5 models for efficiency (n_iter)

X_train & y_train datasets are loaded for you.

Remember, the chosen hyperparameters are found in cv_results_, with one column per hyperparameter. For example, the column for the hyperparameter criterion would be param_criterion.

Instructions

Create a hyperparameter grid as specified in the context above.
Create a RandomizedSearchCV object as outlined in the context above.
Fit the RandomizedSearchCV object to the training data.
Index into the cv_results_ object to print the values chosen by the modeling process for both hyperparameters (max_depth and max_features).

In [None]:
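from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Create the parameter grid
# Note: 'auto' was removed from max_features in scikit-learn 1.3 (for classifiers it meant 'sqrt')
param_grid = {'max_depth': list(range(5, 26)), 'max_features': ['auto', 'sqrt']}

# Create a random search object
random_rf_class = RandomizedSearchCV(
    estimator=RandomForestClassifier(n_estimators=80),
    param_distributions=param_grid,
    n_iter=5,
    scoring='roc_auc', n_jobs=4, cv=3,
    refit=True, return_train_score=True)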

# Fit to the training data
random_rf_class.fit(X_train, y_train)

# Print the values used for both hyperparameters
print(random_rf_class.cv_results_['param_max_depth'])
print(random_rf_class.cv_results_['param_max_features'])
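
Printing the two param_ columns separately makes it hard to see which values go together and how each pair scored. The sketch below runs a scaled-down random search on synthetic data and shows the sampled hyperparameters next to their mean test score in one DataFrame. The data, the smaller n_estimators, the 'sqrt' / 'log2' options and random_state are assumptions made so the example runs quickly on any recent scikit-learn version.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Small synthetic problem (an assumption of this sketch)
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

search = RandomizedSearchCV(
    estimator=RandomForestClassifier(n_estimators=10, random_state=0),
    param_distributions={
        "max_depth": list(range(5, 26)),
        "max_features": ["sqrt", "log2"],
    },
    n_iter=5,
    scoring="roc_auc",
    cv=3,
    random_state=0,
)
search.fit(X, y)

# One row per sampled model, best-ranked first
summary = pd.DataFrame(search.cv_results_)[
    ["param_max_depth", "param_max_features", "mean_test_score", "rank_test_score"]
].sort_values("rank_test_score")
print(summary)
```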

## Comparing Grid and Random Search

### Grid and Random Search Side by Side
Visualizing the search space of random and grid search together allows you to easily see the coverage that each technique has and therefore brings to life their specific advantages and disadvantages.

In this exercise, you will sample hyperparameter combinations in a grid search way as well as a random search way, then plot these to see the difference.

You will have available:

combinations_list which is a list of combinations of learn_rate and min_samples_leaf for this algorithm
The function visualize_search() which will make your hyperparameter combinations into X and Y coordinates and plot both grid and random search combinations on the same graph. It takes as input two lists of hyperparameter combinations.

In [None]:
# Sample grid coordinates
grid_combinations_chosen = combinations_list[0:300]

# Create a list of sample indexes
sample_indexes = list(range(0,len(combinations_list)))

# Randomly sample 300 indexes
random_indexes = np.random.choice(sample_indexes, 300, replace=False)

# Use indexes to create random sample
random_combinations_chosen = [combinations_list[index] for index in random_indexes]

# Call the function to produce the visualization
visualize_search(grid_combinations_chosen, random_combinations_chosen)
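
The difference the plot shows can also be measured directly. The sketch below counts how many distinct learning rates each sample touches. It assumes combinations_list was built with product() from 200 learning rates and 31 min_samples_leaf values, as in the earlier sampling exercise. The list available in this exercise may differ.

```python
from itertools import product

import numpy as np

# Assumed construction of combinations_list (see the earlier exercise)
learn_rate_list = list(np.linspace(0.01, 1.5, 200))
min_samples_list = list(range(10, 41))
combinations_list = [list(x) for x in product(learn_rate_list, min_samples_list)]

# "Grid" sample: the first 300 combinations in order
grid_combinations_chosen = combinations_list[0:300]

# Random sample: 300 combinations drawn without replacement
rng = np.random.default_rng(0)
random_indexes = rng.choice(len(combinations_list), 300, replace=False)
random_combinations_chosen = [combinations_list[i] for i in random_indexes]

# Distinct learning rates covered by each sample
grid_rates = {rate for rate, _ in grid_combinations_chosen}
random_rates = {rate for rate, _ in random_combinations_chosen}

print(len(grid_rates))  # 10: the ordered slice only reaches the lowest rates
print(len(random_rates))
```

With the same budget of 300 models, the ordered slice exhausts every min_samples_leaf value for a handful of learning rates, while the random sample spreads across the whole range.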