# Decision Trees

The objective of this notebook is to build and assess a prediction model on a trustworthy and personally engaging dataset. Data preparation begins with removing unimportant variables and setting up a 70/30 split between training and testing. Missing values are handled either by removing rows or columns or by imputation, depending on the column type. A new column is added through feature engineering, and a structured column transformation applies suitable transformers to the binary, categorical, and numeric features. Modelling starts with a baseline model that uses a Dummy Classifier. Next, hyperparameter combinations for the DecisionTreeClassifier are sampled and compared using RandomizedSearchCV. The final evaluation uses classification measures to compare the baseline model with the best-performing model.

In [2]:
# Common imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


np.random.seed(42)


This cell sets up the environment for data manipulation and visualization. It imports the key libraries: NumPy for numerical calculations, Pandas for data processing, and Matplotlib for plotting. The line `np.random.seed(42)` seeds NumPy's global random number generator, so steps that rely on it produce the same results on every run.
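As a minimal sketch of what seeding provides (the array size of five is arbitrary and not part of the notebook), resetting the same seed replays the same sequence of random draws:

```python
import numpy as np

# Seed the global NumPy generator, then draw five numbers
np.random.seed(42)
first_draw = np.random.rand(5)

# Re-seeding with the same value replays the same sequence
np.random.seed(42)
second_draw = np.random.rand(5)

print(np.array_equal(first_draw, second_draw))
```

scikit-learn utilities that receive no explicit `random_state` fall back on this global generator, so the seed also influences them. Passing `random_state` directly to each call is usually the more robust choice.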

# Get the data

In [113]:
# We will predict the "Exited" column in the data set:

bankchurn = pd.read_csv(r'C:\Users\vadla\Downloads\BankChurn\BankChurn.csv')
bankchurn.head()

Unnamed: 0,id,CustomerId,Surname,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
0,0,15674932,Okwudilichukwu,668,France,Male,33.0,3,0.0,2,1.0,0.0,181449.97,0
1,1,15749177,Okwudiliolisa,627,France,Male,33.0,1,0.0,2,1.0,1.0,49503.5,0
2,2,15694510,Hsueh,678,France,Male,40.0,10,0.0,2,1.0,0.0,184866.69,0
3,3,15741417,Kao,581,France,Male,34.0,2,148882.54,1,1.0,1.0,84560.88,0
4,4,15766172,Chiemenam,716,Spain,Male,33.0,5,0.0,2,1.0,1.0,15068.83,0


The 'pd.read_csv()' function reads the CSV file into a DataFrame called 'bankchurn'. The 'head()' method then shows the first five rows, giving a quick look at the structure and contents of the dataset. Inspecting the data and its format like this is standard practice before any further processing or analysis.

# Split the data into train and test

In [114]:
from sklearn.model_selection import train_test_split

train_set, test_set = train_test_split(bankchurn, test_size=0.3)

This cell uses the scikit-learn function 'train_test_split' to split the 'bankchurn' dataset into two subsets: 'train_set' for model training and 'test_set' for performance evaluation. The function shuffles the rows at random, setting aside 30% for the test set and keeping the remaining 70% for training. Because no random_state is passed, the shuffle relies on the global NumPy seed set earlier. This separation matters because a model must be evaluated on data it did not see during training. The resulting 'train_set' and 'test_set' are used in all of the following steps.
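The notebook's split is neither stratified nor given its own seed. As a hedged sketch of an alternative, the example below uses a small toy DataFrame (not the bank churn data) to show how the `random_state` and `stratify` arguments of `train_test_split` fix the shuffle and preserve the class ratio in both subsets:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy stand-in for the churn data: 100 rows, 20% positive class
toy = pd.DataFrame({
    "feature": np.arange(100),
    "Exited": [1] * 20 + [0] * 80,
})

toy_train, toy_test = train_test_split(
    toy,
    test_size=0.3,
    random_state=42,         # fixes the shuffle independently of the global seed
    stratify=toy["Exited"],  # keeps the class ratio the same in both subsets
)

print(toy_train.shape, toy_test.shape)
print(toy_train["Exited"].mean(), toy_test["Exited"].mean())
```

Stratification matters most when the target is imbalanced, as it is here, because a purely random split can shift the class ratio between the two subsets.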

## Check the missing values

In [115]:
train_set.isna().sum()

id                 0
CustomerId         0
Surname            0
CreditScore        0
Geography          0
Gender             0
Age                0
Tenure             0
Balance            0
NumOfProducts      0
HasCrCard          0
IsActiveMember     0
EstimatedSalary    0
Exited             0
dtype: int64

This cell checks the 'train_set' DataFrame for missing values. The 'isna()' method flags each missing entry, and 'sum()' then counts the flags per column. Every column reports 0, so the training set has no missing values.

In [116]:
test_set.isna().sum()

id                 0
CustomerId         0
Surname            0
CreditScore        0
Geography          0
Gender             0
Age                0
Tenure             0
Balance            0
NumOfProducts      0
HasCrCard          0
IsActiveMember     0
EstimatedSalary    0
Exited             0
dtype: int64

The same check is applied to the 'test_set' DataFrame: 'isna()' flags the missing entries and 'sum()' counts them per column. Every column again reports 0, so the testing set is complete as well.
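Since the real data has no gaps, the behaviour of this check is easier to see on a small example. The sketch below uses a made-up three-row DataFrame (the values are illustrative only) with one missing number and one missing string:

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with one missing value in two of the columns
toy = pd.DataFrame({
    "Age": [33.0, np.nan, 40.0],
    "Geography": ["France", "Spain", None],
    "Balance": [0.0, 148882.54, 0.0],
})

# isna() marks missing cells as True; sum() counts the True values per column
missing_counts = toy.isna().sum()
print(missing_counts)
```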

# Data Prep

In [117]:
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import OneHotEncoder

from sklearn.preprocessing import FunctionTransformer

This cell imports the scikit-learn modules needed to build a data preprocessing pipeline: ColumnTransformer, which applies different transformers to different columns; Pipeline, which chains processing steps in sequence; SimpleImputer, which fills in missing values; StandardScaler, which scales numeric features; and OneHotEncoder, which encodes categorical features. FunctionTransformer additionally lets the pipeline wrap a user-defined function as a custom transformer. Together, these tools provide a flexible preprocessing pipeline that can handle a range of data preparation needs.

## Drop the unnecessary columns

In [118]:
# We drop the following columns because they are identifiers, not useful predictors

train = train_set.drop(['id', 'CustomerId','Surname'], axis=1)
test = test_set.drop(['id', 'CustomerId','Surname'], axis=1)

## Separating the target variable

In [119]:
train_y = train[['Exited']]
test_y = test[['Exited']]

train_inputs = train.drop(['Exited'], axis=1)
test_inputs = test.drop(['Exited'], axis=1)

This code organizes the data for a machine learning model by separating the target variable, 'Exited', from the input features in both the training and testing sets. The train_y and test_y DataFrames contain only the 'Exited' column, which is the target the model will predict. The train_inputs and test_inputs DataFrames are created by dropping the 'Exited' column, and they hold the input features the model will use to make predictions. Keeping inputs and target apart in this way makes training and assessing the prediction model easier.

## Feature Engineering

In [120]:
def new_col(df):
    
    #Create a copy so that we don't overwrite the existing dataframe
    df1 = df.copy()

    # Conditions
    conditions = [
        (df1['CreditScore'] < 650),
        (df1['CreditScore'] >= 650),
    ]

    # Choices: 0 = not valued, 1 = valued
    choices = [0, 1]

    # Applying the conditions and choices to the DataFrame
    df1['creditscore_class'] = np.select(conditions, choices)

    return df1[['creditscore_class']]
    # You can use this to check whether the calculation is made correctly:
#     return df1    

The new_col function creates an engineered column from a DataFrame. It first copies the input DataFrame so that the original is not modified. It then derives a new column, 'creditscore_class', from the 'CreditScore' column: the value is 0 if the credit score is less than 650, and 1 if it is greater than or equal to 650. The function returns a DataFrame containing only the new 'creditscore_class' column.
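Because the rule has only two outcomes, the same column could also be written with `np.where` instead of `np.select`. The sketch below is an equivalent formulation on a toy DataFrame with hypothetical credit scores, not a replacement for the function used in the pipeline:

```python
import numpy as np
import pandas as pd

# Hypothetical credit scores on both sides of the 650 threshold
toy = pd.DataFrame({"CreditScore": [581, 649, 650, 716]})

# Two-way rule: 0 below the threshold, 1 at or above it
toy["creditscore_class"] = np.where(toy["CreditScore"] >= 650, 1, 0)
print(toy)
```

`np.select` remains the better fit if more than two score bands are added later.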

In [121]:
#Let's test the new function:

# Send the train set to the function we created
new_col(train_set)

Unnamed: 0,creditscore_class
76047,0
57212,0
159972,1
7950,0
65073,0
...,...
119879,1
103694,1
131932,1
146867,1


##  Identify the numerical and categorical columns

In [122]:
train_inputs.dtypes

CreditScore          int64
Geography           object
Gender              object
Age                float64
Tenure               int64
Balance            float64
NumOfProducts        int64
HasCrCard          float64
IsActiveMember     float64
EstimatedSalary    float64
dtype: object

It reveals the data types of each column in the 'train_inputs' DataFrame. It provides insights into the nature of the data, aiding in preprocessing decisions, as machine learning algorithms often require specific data representations for optimal performance.

In [123]:
# Identify the numerical columns
numeric_columns = train_inputs.select_dtypes(include=[np.number]).columns.to_list()

# Identify the categorical columns
categorical_columns = train_inputs.select_dtypes('object').columns.to_list()

In [124]:
# Identify the binary columns so we can pass them through without transforming
binary_columns = ['HasCrCard', 'IsActiveMember']

In [125]:
# Be careful: numerical columns already includes the binary columns,
# So, we need to remove the binary columns from numerical columns.

for col in binary_columns:
    numeric_columns.remove(col)

In [126]:
binary_columns

['HasCrCard', 'IsActiveMember']

In [127]:
numeric_columns

['CreditScore', 'Age', 'Tenure', 'Balance', 'NumOfProducts', 'EstimatedSalary']

In [128]:
categorical_columns

['Geography', 'Gender']

In [129]:
feat_eng_columns = ['CreditScore']

In [130]:
train_inputs.dtypes

CreditScore          int64
Geography           object
Gender              object
Age                float64
Tenure               int64
Balance            float64
NumOfProducts        int64
HasCrCard          float64
IsActiveMember     float64
EstimatedSalary    float64
dtype: object

# Pipeline

In [131]:
numeric_transformer = Pipeline(steps=[
                ('imputer', SimpleImputer(strategy='median')),
                ('scaler', StandardScaler())])

In [132]:
categorical_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='constant', fill_value='unknown')),
    ('onehot', OneHotEncoder(handle_unknown='ignore'))])

In [133]:
binary_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='most_frequent'))])

In [134]:
my_new_column = Pipeline(steps=[('my_new_column', FunctionTransformer(new_col)),
                               ('scaler', StandardScaler())])

In [135]:
preprocessor = ColumnTransformer([
        ('num', numeric_transformer, numeric_columns),
        ('cat', categorical_transformer, categorical_columns),
        ('binary', binary_transformer, binary_columns),
        ('trans', my_new_column, feat_eng_columns)],
        remainder='passthrough')

# remainder='passthrough' is optional. You don't have to use it.

Here, a complete data preparation pipeline is constructed, with separate transformers for numeric, categorical, binary, and feature-engineered columns. For numeric columns, missing values are imputed with the median and the values are then scaled with StandardScaler. For categorical columns, missing values are filled in with "unknown" and the columns are then one-hot encoded. Binary columns are imputed with the most frequent value. The custom pipeline "my_new_column" applies the user-defined function "new_col" and then scales its output. Finally, the "preprocessor" uses ColumnTransformer to combine all four transformers and route each group of columns to the right one.
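To make the column routing concrete, here is a reduced sketch with only a numeric and a categorical branch, applied to a four-row toy DataFrame (the data and the name `toy_preprocessor` are illustrative, not part of the notebook). One numeric column plus three one-hot columns should give four output columns:

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy data: one numeric column with a gap, one categorical column
toy = pd.DataFrame({
    "Age": [33.0, np.nan, 40.0, 34.0],
    "Geography": ["France", "Spain", "France", "Germany"],
})

toy_preprocessor = ColumnTransformer([
    # Numeric branch: median imputation followed by standard scaling
    ("num", Pipeline([("imputer", SimpleImputer(strategy="median")),
                      ("scaler", StandardScaler())]), ["Age"]),
    # Categorical branch: one column per category
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["Geography"]),
])

toy_x = toy_preprocessor.fit_transform(toy)
print(toy_x.shape)
```

The same counting explains the 14 columns of the real output: 6 numeric, 3 for Geography, 2 for Gender, 2 binary, and 1 engineered column.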

# Transform: fit_transform() for TRAIN

In [136]:
#Fit and transform the train data
train_x = preprocessor.fit_transform(train_inputs)

train_x

array([[-1.65957673,  0.77436888,  0.34734527, ...,  1.        ,
         0.        , -1.09509054],
       [-0.47299662,  0.43624006, -0.00888708, ...,  0.        ,
         1.        , -1.09509054],
       [ 1.88767327, -0.57814643, -0.00888708, ...,  1.        ,
         0.        ,  0.91316651],
       ...,
       [ 1.20070584,  0.09811123,  0.70357761, ...,  1.        ,
         0.        ,  0.91316651],
       [ 0.35136429,  1.11249771, -0.36511942, ...,  1.        ,
         1.        ,  0.91316651],
       [ 0.12653859, -0.35272721, -1.43381644, ...,  0.        ,
         0.        ,  0.91316651]])

In [137]:
train_x.shape

(115523, 14)

# Transform: transform() for TEST

In [138]:
# Transform the test data
test_x = preprocessor.transform(test_inputs)

test_x

array([[ 1.7627701 , -0.80356565,  1.05980995, ...,  1.        ,
         0.        ,  0.91316651],
       [-0.34809345, -1.3671137 ,  0.34734527, ...,  1.        ,
         0.        , -1.09509054],
       [-1.14747373,  0.09811123,  0.70357761, ...,  1.        ,
         0.        , -1.09509054],
       ...,
       [-0.18571933, -1.3671137 ,  0.34734527, ...,  1.        ,
         0.        , -1.09509054],
       [ 1.21319616,  0.21082084,  0.34734527, ...,  1.        ,
         0.        ,  0.91316651],
       [ 0.26393207,  0.66165927,  1.41604229, ...,  1.        ,
         0.        ,  0.91316651]])

In [139]:
test_x.shape

(49511, 14)

# Baseline

In [140]:
from sklearn.dummy import DummyClassifier

dummy_clf = DummyClassifier(strategy="most_frequent")

dummy_clf.fit(train_x, train_y)

In [141]:
from sklearn.metrics import accuracy_score

In [142]:
#Baseline Train Accuracy
dummy_train_pred = dummy_clf.predict(train_x)

baseline_train_acc = accuracy_score(train_y, dummy_train_pred)

print('Baseline Train Accuracy: {}' .format(baseline_train_acc))

Baseline Train Accuracy: 0.7875487998060993


In [143]:
#Baseline Test Accuracy
dummy_test_pred = dummy_clf.predict(test_x)

baseline_test_acc = accuracy_score(test_y, dummy_test_pred)

print('Baseline Test Accuracy: {}' .format(baseline_test_acc))

Baseline Test Accuracy: 0.7903900143402476


Here, a baseline model is created with scikit-learn's Dummy Classifier, configured to always predict the most frequent class in the training data. The model is fitted on train_x and train_y using the fit method, and its performance is then evaluated on the training and testing datasets with scikit-learn's accuracy_score function, which measures the proportion of correct predictions. The baseline training accuracy is approximately 78.75%, and the baseline testing accuracy is around 79.04%. These scores serve as benchmarks: a more complex model is only useful if it performs clearly better.
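The baseline accuracy is simply the share of the majority class. A small sketch on made-up labels (8 of 10 in class 0, chosen only for illustration) shows the two numbers coincide:

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score

# Toy labels: 8 of 10 belong to class 0
toy_y = np.array([0] * 8 + [1] * 2)
toy_X = np.zeros((10, 1))  # the dummy model ignores the features

dummy = DummyClassifier(strategy="most_frequent").fit(toy_X, toy_y)
toy_acc = accuracy_score(toy_y, dummy.predict(toy_X))

# Accuracy of the dummy model vs. share of the majority class
print(toy_acc, (toy_y == 0).mean())
```

Read this way, the roughly 79% baseline above says that about 79% of customers in the data did not exit.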

# Decision Tree


In [144]:
from sklearn.tree import DecisionTreeClassifier 

tree_clf = DecisionTreeClassifier(max_depth=5)

tree_clf.fit(train_x, train_y)

In this code, scikit-learn is used to create a Decision Tree Classifier. The max_depth=5 setting limits the tree to five levels of splits, which restricts its complexity. Limiting the depth reduces the risk of overfitting, which helps the model generalize to new, unseen data. The classifier is trained with the fit method, using train_x for the features and train_y for the target. Decision trees are a supervised learning algorithm that makes predictions by repeatedly splitting the feature space on feature values. Setting a maximum depth balances capturing patterns in the training data against building an overly complicated tree.
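To illustrate what the depth limit does, the sketch below grows an unrestricted tree and a depth-capped tree on synthetic data from `make_classification` (an assumption for illustration, not the churn data) and compares their size:

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data set
X, y = make_classification(n_samples=500, n_features=8, random_state=42)

unrestricted = DecisionTreeClassifier(random_state=42).fit(X, y)
capped = DecisionTreeClassifier(max_depth=5, random_state=42).fit(X, y)

# Depth and number of leaves of each fitted tree
print(unrestricted.get_depth(), unrestricted.get_n_leaves())
print(capped.get_depth(), capped.get_n_leaves())
```

A tree of depth 5 can have at most 2^5 = 32 leaves, which is what keeps it from memorizing individual training rows.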

## Accuracy

In [145]:
from sklearn.metrics import accuracy_score

In [146]:
#Train accuracy:
train_y_pred = tree_clf.predict(train_x)

print(accuracy_score(train_y, train_y_pred))

0.8572751746405478


In [147]:
#Test accuracy:
test_y_pred = tree_clf.predict(test_x)

print(accuracy_score(test_y, test_y_pred))

0.857667992971259


This code evaluates the Decision Tree Classifier trained above, computing the training and testing accuracies with the scikit-learn accuracy_score function. The training accuracy, around 85.73%, is the percentage of correct predictions on the training dataset. The testing accuracy, around 85.77%, shows how accurate the model is on unseen data. The two values are nearly identical, which suggests the tree is not overfitting, and both are clearly above the baseline of about 79%. Accuracy alone does not give a complete picture, however, so other metrics must also be taken into account.

## Confusion Matrix

In [148]:
from sklearn.metrics import confusion_matrix

#Test confusion matrix
confusion_matrix(test_y, test_y_pred)

array([[37097,  2036],
       [ 5011,  5367]], dtype=int64)

True Positives (TP): 5367 cases were correctly predicted as positive.
True Negatives (TN): 37097 cases were correctly predicted as negative.
False Positives (FP): 2036 cases were predicted as positive but were actually negative.
False Negatives (FN): 5011 cases were predicted as negative but were actually positive.
This matrix breaks the model's performance down into the number of correct and incorrect predictions for each class.
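In scikit-learn's layout, rows are actual classes and columns are predicted classes, so a 2x2 matrix reads as [[TN, FP], [FN, TP]]. A short sketch using the counts reported above unpacks the four cells and recovers the test accuracy:

```python
import numpy as np

# Test-set confusion matrix reported above (rows: actual, columns: predicted)
cm = np.array([[37097, 2036],
               [5011, 5367]])

# ravel() flattens row by row, giving TN, FP, FN, TP in that order
tn, fp, fn, tp = cm.ravel()

# Accuracy is the share of correct predictions among all cases
accuracy = (tp + tn) / cm.sum()
print(tn, fp, fn, tp, round(accuracy, 4))
```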

## Feature Importance

In [149]:
np.round(tree_clf.feature_importances_,3)

array([0.   , 0.436, 0.   , 0.03 , 0.383, 0.   , 0.   , 0.033, 0.   ,
       0.   , 0.003, 0.   , 0.115, 0.   ])

In [150]:
np.round(tree_clf.feature_importances_,3)[-1]

0.0

This code retrieves the feature importances from the trained Decision Tree Classifier and rounds them to three decimal places with NumPy. Each entry in the array quantifies how much the corresponding feature contributes to the model's splits, and the entries sum to 1. The last feature, the engineered creditscore_class column, has an importance of 0.0, meaning this tree never splits on it. Knowing the importances helps determine which features have a substantial effect on the model's predictions and can guide feature selection or engineering.
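A bare array of importances is hard to read without column names. The sketch below shows one way to pair importances with names using a pandas Series; it uses synthetic data and hypothetical feature names `f0` to `f3`, since the notebook's preprocessed array carries no names:

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with hypothetical column names
X, y = make_classification(n_samples=300, n_features=4, n_informative=2,
                           n_redundant=0, random_state=42)
toy_X = pd.DataFrame(X, columns=["f0", "f1", "f2", "f3"])

toy_tree = DecisionTreeClassifier(max_depth=3, random_state=42).fit(toy_X, y)

# Label each importance with its column name and sort from most to least important
importances = pd.Series(toy_tree.feature_importances_, index=toy_X.columns)
print(importances.sort_values(ascending=False).round(3))
```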

## More Regularization

In [151]:
tree_clf2 = DecisionTreeClassifier(min_samples_leaf = 10, max_depth=5)

tree_clf2.fit(train_x, train_y)

In [152]:
#Train accuracy:
train_y_pred = tree_clf2.predict(train_x)

print(accuracy_score(train_y, train_y_pred))

0.8572492057858608


In [153]:
#Test accuracy:
test_y_pred = tree_clf2.predict(test_x)

print(accuracy_score(test_y, test_y_pred))

0.8577083880349822


In [154]:
#Test confusion matrix
confusion_matrix(test_y, test_y_pred)

array([[37097,  2036],
       [ 5009,  5369]], dtype=int64)

This code trains a second Decision Tree Classifier, tree_clf2, with min_samples_leaf=10 and max_depth=5. These hyperparameters set the minimum number of samples a leaf node must contain and the maximum depth of the tree. The training accuracy is about 85.72% and the testing accuracy is around 85.77%, almost identical to the first tree, so the extra regularization changes very little here. The confusion matrix for the testing data is a 2x2 array with the counts of true negatives (37097), false positives (2036), false negatives (5009), and true positives (5369).

## Prediction probabilities

In [155]:
# Select a single observation (row 50 of the test set)

random = test_x[50:51]
random

array([[ 0.42630619,  0.88707849, -1.0775841 , -0.88649192,  0.81457671,
        -0.15502069,  0.        ,  0.        ,  1.        ,  1.        ,
         0.        ,  1.        ,  1.        ,  0.91316651]])

In [156]:
# Predict the class probabilities for the observation

tree_clf2.predict_proba(random)

array([[0.94337812, 0.05662188]])

In [157]:
# Round the probability values

np.round(tree_clf2.predict_proba(random),2)

array([[0.94, 0.06]])

This code selects a single observation from the testing dataset (row 50 of test_x), represented as an array of preprocessed feature values such as the scaled credit score and balance. The trained Decision Tree Classifier tree_clf2 then estimates the probability that this observation belongs to each class. The output array (0.94337812, 0.05662188) shows that the model assigns a probability of about 94% to the first class (not exited) and about 6% to the second class (exited). For a decision tree, these probabilities are the class fractions among the training samples in the leaf that the observation falls into. In the final step, the probabilities are rounded to two decimal places for readability, yielding (0.94, 0.06).
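The leaf-fraction interpretation can be checked on a tiny hand-made example. In the sketch below (toy data, not the churn set), the only possible split separates feature value 1 from value 2, and the left leaf holds three samples of class 0 and one of class 1:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Feature value 1: three class-0 samples and one class-1 sample
# Feature value 2: four class-1 samples
X = np.array([[1], [1], [1], [1], [2], [2], [2], [2]])
y = np.array([0, 0, 0, 1, 1, 1, 1, 1])

stump = DecisionTreeClassifier(max_depth=1, random_state=0).fit(X, y)

# Probabilities for a new point that lands in the left leaf
proba = stump.predict_proba([[1]])
print(proba)
```

The left leaf yields probabilities of 0.75 and 0.25, exactly the class fractions of its four training samples.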

## Change to entropy

In [158]:
tree_clf3 = DecisionTreeClassifier(max_depth=5, criterion='entropy')
tree_clf3.fit(train_x, train_y)

In [159]:
#Train accuracy:
train_y_pred = tree_clf3.predict(train_x)

print(accuracy_score(train_y, train_y_pred))

0.8572665183556522


In [160]:
#Test accuracy:
test_y_pred = tree_clf3.predict(test_x)

print(accuracy_score(test_y, test_y_pred))

0.8577083880349822


In [161]:
#Test confusion matrix
confusion_matrix(test_y, test_y_pred)

array([[37097,  2036],
       [ 5009,  5369]], dtype=int64)

This code trains a third Decision Tree Classifier, tree_clf3, with a maximum depth of 5 and 'entropy' as the split criterion. The 'entropy' criterion picks each split by information gain, that is, by the reduction in uncertainty about the class. The training accuracy is approximately 85.73% and the testing accuracy is around 85.77%. The confusion matrix for the testing data matches the one from tree_clf2, so switching the criterion from the default 'gini' makes no practical difference at this depth.
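For intuition on why the two criteria behave so similarly, the sketch below computes Gini impurity and entropy for a few class distributions. The helper names `gini` and `entropy` are illustrative, and the 0.79/0.21 distribution roughly mirrors the class balance implied by the baseline accuracy:

```python
import numpy as np

def gini(p):
    """Gini impurity of a class-probability vector."""
    p = np.asarray(p, dtype=float)
    return float(1.0 - np.sum(p ** 2))

def entropy(p):
    """Shannon entropy in bits; zero-probability classes contribute nothing."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(np.sum(p * np.log2(1.0 / p)))

# Evenly mixed node, a node resembling the overall class balance, and a pure node
for dist in ([0.5, 0.5], [0.79, 0.21], [1.0, 0.0]):
    print(dist, round(gini(dist), 4), round(entropy(dist), 4))
```

Both measures are largest for an evenly mixed node and zero for a pure node, so they tend to rank candidate splits in a similar order.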

# Randomized Grid Search


In [162]:
from sklearn.model_selection import RandomizedSearchCV
from scipy.stats import randint

param_grid = {'max_depth': randint(low=5, high=20), 
              'min_samples_leaf': randint(low=5, high=20)}

tree_gs = RandomizedSearchCV(DecisionTreeClassifier(), param_grid, 
                             n_iter=15, cv=5, verbose=1,
                             scoring='accuracy',
                             return_train_score=True)

tree_gs.fit(train_x, train_y)

Fitting 5 folds for each of 15 candidates, totalling 75 fits


In [163]:
cvres = tree_gs.cv_results_

for mean_score, params in zip(cvres["mean_test_score"], cvres["params"]):
    print(mean_score, params)

0.8620794090239607 {'max_depth': 7, 'min_samples_leaf': 6}
0.8568943163241878 {'max_depth': 10, 'min_samples_leaf': 6}
0.8621573168618992 {'max_depth': 7, 'min_samples_leaf': 17}
0.844351356732273 {'max_depth': 15, 'min_samples_leaf': 8}
0.8449746096946316 {'max_depth': 14, 'min_samples_leaf': 7}
0.8530768843428861 {'max_depth': 14, 'min_samples_leaf': 19}
0.8426114462399463 {'max_depth': 16, 'min_samples_leaf': 8}
0.8515014351710036 {'max_depth': 15, 'min_samples_leaf': 16}
0.8609281303257861 {'max_depth': 6, 'min_samples_leaf': 17}
0.8524622672839458 {'max_depth': 13, 'min_samples_leaf': 13}
0.8571539823915989 {'max_depth': 5, 'min_samples_leaf': 18}
0.8571539823915989 {'max_depth': 5, 'min_samples_leaf': 12}
0.8449226676391831 {'max_depth': 15, 'min_samples_leaf': 9}
0.8516745747313991 {'max_depth': 16, 'min_samples_leaf': 19}
0.8501770292761996 {'max_depth': 16, 'min_samples_leaf': 15}


In [164]:
#Find the best parameter set
tree_gs.best_params_

{'max_depth': 7, 'min_samples_leaf': 17}

In [165]:
tree_gs.best_estimator_

In [166]:
#Train accuracy:
train_y_pred = tree_gs.best_estimator_.predict(train_x)

print(accuracy_score(train_y, train_y_pred))

0.8644598911039361


In [167]:
#Test accuracy:
test_y_pred = tree_gs.best_estimator_.predict(test_x)

print(accuracy_score(test_y, test_y_pred))

0.864696734059098


In [168]:
#Test confusion matrix
confusion_matrix(test_y, test_y_pred)

array([[37164,  1969],
       [ 4730,  5648]], dtype=int64)

The purpose of this code is to tune the hyperparameters of a Decision Tree Classifier using randomized search with cross-validation (RandomizedSearchCV). The hyperparameters considered are "max_depth" and "min_samples_leaf", each sampled from the integers 5 to 19. The search evaluates 15 sampled parameter combinations with 5-fold cross-validation, for 75 fits in total. After fitting, the mean cross-validation score is printed for each parameter setting. The search yielded {'max_depth': 7, 'min_samples_leaf': 17} as the best parameter set. The best estimator, a Decision Tree Classifier refit with these parameters, is then evaluated on both the training and testing datasets.
Results:
Training Accuracy: Approximately 86.45%
Testing Accuracy: Around 86.47%
Testing Confusion Matrix:
True Positive (TP): 5648
True Negative (TN): 37164
False Positive (FP): 1969
False Negative (FN): 4730
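The search in the notebook has no seed of its own, so the sampled candidates depend on the global NumPy seed. As a hedged sketch on synthetic data (smaller ranges and fewer iterations than the notebook, chosen only to keep the example fast), passing `random_state` to `RandomizedSearchCV` makes the sampled candidates repeatable:

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data set
X, y = make_classification(n_samples=400, n_features=8, random_state=42)

param_distributions = {
    "max_depth": randint(low=2, high=10),         # samples integers 2..9
    "min_samples_leaf": randint(low=5, high=20),  # samples integers 5..19
}

search = RandomizedSearchCV(
    DecisionTreeClassifier(random_state=42),
    param_distributions,
    n_iter=5, cv=3, scoring="accuracy",
    random_state=42,  # makes the sampled candidates repeatable
)
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```

Note that `randint` excludes its upper bound, which is why the ranges in the notebook stop at 19 rather than 20.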

Now, let's calculate precision, recall, and F1 score:

Precision: Precision measures the accuracy of positive predictions. It is the ratio of true positives to the sum of true positives and false positives.

Precision = TP/(TP + FP)

Recall (Sensitivity): Recall gauges the model's ability to capture all relevant instances. It is the ratio of true positives to the sum of true positives and false negatives.

Recall = TP/(TP + FN)

F1 Score: F1 score is the harmonic mean of precision and recall, providing a balanced metric between the two.

F1 Score = 2 * (Precision * Recall) / (Precision + Recall)

Now, let's compute these metrics for the given confusion matrix:
Precision = 5648/ (5648 + 1969) = 0.74149928
Recall = 5648 / (5648 + 4730) = 0.54422817
F1 Score = 2 * (0.74149928 * 0.54422817) / (0.74149928 + 0.54422817) = 2 * (0.40354480 / 1.28572745) = 0.62773
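A quick way to double-check this kind of arithmetic is to compute the three metrics directly from the confusion-matrix counts of the tuned model. This is a plain-Python sketch using the counts reported above:

```python
# Counts from the tuned model's test confusion matrix
tp, tn, fp, fn = 5648, 37164, 1969, 4730

precision = tp / (tp + fp)
recall = tp / (tp + fn)

# Harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)

print(round(precision, 4), round(recall, 4), round(f1, 4))
```

The F1 score can equivalently be written as 2TP / (2TP + FP + FN), which gives the same value.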

Precision shows that about 74% of the customers the model flags as exiting actually exit, while recall shows that it captures only about 54% of the customers who exit. The F1 Score balances these two metrics, and a higher value signifies a better balance between precision and recall. Accuracy and F1 Score are both useful, but because roughly 79% of customers belong to the majority class, the F1 Score gives a more balanced evaluation than accuracy alone. Compared with the baseline test accuracy of about 79.04%, the model with hyperparameters {'max_depth': 7, 'min_samples_leaf': 17} reaches about 86.47% and is suggested as the better-performing model.