## 17 Apr Boosting - 2 | Gradient Boost 

Q1. What is Gradient Boosting Regression?

Ans: 
    
    Gradient Boosting Regression is a machine learning technique that combines many weak prediction models, usually shallow decision trees, into a single strong model. It is a form of boosting in which each new weak learner is fit to the residual errors of the ensemble built so far.

    In Gradient Boosting Regression, the algorithm starts from an initial prediction for the entire dataset, typically a constant such as the mean of the target values. The residual errors are then calculated by subtracting the predicted values from the actual values.

    The next step is to train a new base learner, usually a small decision tree, on the residual errors. Its output, typically scaled by a learning rate, is added to the current predictions. This process is repeated iteratively, with each new base learner trained on the residuals left by all previous learners combined.

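    The key idea behind Gradient Boosting Regression is to minimize a loss function, most commonly the mean squared error (MSE) between the predicted and actual values, by gradient descent in function space. For squared error, the negative gradient of the loss with respect to the current predictions is exactly the residual, so in each iteration the algorithm fits the base learner that best approximates the current residuals.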
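
To make the link between residuals and gradients explicit, here is a short derivation assuming the squared-error loss. The symbols $F_m$ (ensemble after $m$ rounds), $h_m$ (the $m$-th base learner), $r_{i,m}$ (pseudo-residual), and $\nu$ (learning rate) are notation introduced for this sketch, not taken from the text above.

```latex
\begin{align}
L\bigl(y_i, F(x_i)\bigr) &= \tfrac{1}{2}\bigl(y_i - F(x_i)\bigr)^2 \\
r_{i,m} = -\left.\frac{\partial L\bigl(y_i, F(x_i)\bigr)}{\partial F(x_i)}\right|_{F = F_{m-1}} &= y_i - F_{m-1}(x_i) \\
F_m(x) &= F_{m-1}(x) + \nu \, h_m(x)
\end{align}
```

Here $h_m$ is fit to the pairs $(x_i, r_{i,m})$, and $F_0$ is commonly taken to be the mean of the targets. With a different loss, the pseudo-residuals $r_{i,m}$ are no longer the plain residuals, but the update rule keeps the same form.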

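    One benefit of Gradient Boosting Regression with tree base learners is that it works with features on different scales without normalization. Categorical features can also be used, although most implementations (including scikit-learn's GradientBoostingRegressor) require them to be encoded numerically first. With a small learning rate and shallow trees it usually generalizes better than a single deep decision tree, but it can still overfit if too many trees are added, so the number of trees, the learning rate, and the tree depth should be tuned.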
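
One way to see how the number of trees affects generalization is to track the error after every boosting stage. The sketch below is an illustration only: it assumes a synthetic dataset from `make_friedman1` rather than the notebook's data, and the hyperparameter values are arbitrary choices. It relies on `GradientBoostingRegressor.staged_predict`, which yields the ensemble's predictions after each stage.

```python
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic regression data (an assumption for illustration, not the notebook's data)
X_syn, y_syn = make_friedman1(n_samples=300, noise=1.0, random_state=0)
Xs_train, Xs_test, ys_train, ys_test = train_test_split(
    X_syn, y_syn, test_size=0.3, random_state=0
)

staged_model = GradientBoostingRegressor(
    n_estimators=200, learning_rate=0.1, max_depth=3, random_state=0
)
staged_model.fit(Xs_train, ys_train)

# MSE after each boosting stage, on the training and the held-out data
train_mse = [mean_squared_error(ys_train, p) for p in staged_model.staged_predict(Xs_train)]
test_mse = [mean_squared_error(ys_test, p) for p in staged_model.staged_predict(Xs_test)]

# Number of trees at which the held-out error is lowest
best_n = int(np.argmin(test_mse)) + 1
print(f"lowest test MSE at {best_n} trees")
```

Training error keeps falling as trees are added, while held-out error typically flattens or rises after some point; the gap between the two curves is a rough indicator of overfitting.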

Q2. Implement a simple gradient boosting algorithm from scratch using Python and NumPy. Use a 
simple regression problem as an example and train the model on a small dataset. Evaluate the model's 
performance using metrics such as mean squared error and R-squared.

In [1]:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt 

In [2]:
#Load the data
from sklearn.datasets import load_iris

In [3]:
data = load_iris()

In [4]:
data

{'data': array([[5.1, 3.5, 1.4, 0.2],
        [4.9, 3. , 1.4, 0.2],
        [4.7, 3.2, 1.3, 0.2],
        [4.6, 3.1, 1.5, 0.2],
        [5. , 3.6, 1.4, 0.2],
        [5.4, 3.9, 1.7, 0.4],
        [4.6, 3.4, 1.4, 0.3],
        [5. , 3.4, 1.5, 0.2],
        [4.4, 2.9, 1.4, 0.2],
        [4.9, 3.1, 1.5, 0.1],
        [5.4, 3.7, 1.5, 0.2],
        [4.8, 3.4, 1.6, 0.2],
        [4.8, 3. , 1.4, 0.1],
        [4.3, 3. , 1.1, 0.1],
        [5.8, 4. , 1.2, 0.2],
        [5.7, 4.4, 1.5, 0.4],
        [5.4, 3.9, 1.3, 0.4],
        [5.1, 3.5, 1.4, 0.3],
        [5.7, 3.8, 1.7, 0.3],
        [5.1, 3.8, 1.5, 0.3],
        [5.4, 3.4, 1.7, 0.2],
        [5.1, 3.7, 1.5, 0.4],
        [4.6, 3.6, 1. , 0.2],
        [5.1, 3.3, 1.7, 0.5],
        [4.8, 3.4, 1.9, 0.2],
        [5. , 3. , 1.6, 0.2],
        [5. , 3.4, 1.6, 0.4],
        [5.2, 3.5, 1.5, 0.2],
        [5.2, 3.4, 1.4, 0.2],
        [4.7, 3.2, 1.6, 0.2],
        [4.8, 3.1, 1.6, 0.2],
        [5.4, 3.4, 1.5, 0.4],
        [5.2, 4.1, 1.5, 0.1],
  

In [5]:
#Independent Variable 

X = pd.DataFrame(data['data'])

In [6]:
X.head()

     0    1    2    3
0  5.1  3.5  1.4  0.2
1  4.9  3.0  1.4  0.2
2  4.7  3.2  1.3  0.2
3  4.6  3.1  1.5  0.2
4  5.0  3.6  1.4  0.2


In [7]:
#check null values in X 

X.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   0       150 non-null    float64
 1   1       150 non-null    float64
 2   2       150 non-null    float64
 3   3       150 non-null    float64
dtypes: float64(4)
memory usage: 4.8 KB


In [8]:
#Dependent Variable 

y = pd.DataFrame(data['target'])

In [9]:
y.head()

   0
0  0
1  0
2  0
3  0
4  0


In [10]:
#check null values in y 

y.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 1 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   0       150 non-null    int64
dtypes: int64(1)
memory usage: 1.3 KB


In [11]:
# train Test Split

In [12]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y , test_size=0.30, random_state=42)

In [13]:
X_train.count() 

0    105
1    105
2    105
3    105
dtype: int64

In [14]:
X_test.count()

0    45
1    45
2    45
3    45
dtype: int64

In [15]:
# Apply Gradient Boosting 

from sklearn.ensemble import GradientBoostingRegressor

In [16]:
grdb_reg = GradientBoostingRegressor()

In [17]:
# Pass y as a 1-D array; a one-column DataFrame triggers scikit-learn's DataConversionWarning
grdb_reg.fit(X_train, y_train.values.ravel())


In [18]:
y_pred = grdb_reg.predict(X_test)

In [19]:
from sklearn.metrics import mean_squared_error, r2_score, confusion_matrix, classification_report

In [20]:
mean_squared_error(y_test, y_pred)

0.00823438456116209

In [21]:
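# r2_score expects (y_true, y_pred) and is not symmetric in its arguments.
# The value recorded below came from the swapped call r2_score(y_pred, y_test),
# so rerunning with the correct order may give a slightly different score.
r2_score(y_test, y_pred)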

0.9876148640838254
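
The cells above use scikit-learn's `GradientBoostingRegressor`, whereas the question asks for a from-scratch version. Below is a minimal sketch using NumPy only. It makes several assumptions that are not in the notebook: the weak learner is a one-split decision stump, the loss is squared error, and the data is a small synthetic sine curve. All function and variable names (`fit_stump`, `fit_gbr`, `predict_gbr`, and so on) are hypothetical helpers written for this illustration.

```python
import numpy as np

def fit_stump(X, residuals):
    """Find the one-split stump (feature, threshold, left value, right value)
    that minimizes the squared error on the residuals."""
    best, best_err = None, np.inf
    for j in range(X.shape[1]):
        values = np.unique(X[:, j])
        # Candidate thresholds: midpoints between consecutive distinct values
        for t in (values[:-1] + values[1:]) / 2.0:
            left = X[:, j] <= t
            left_val = residuals[left].mean()
            right_val = residuals[~left].mean()
            err = (((residuals[left] - left_val) ** 2).sum()
                   + ((residuals[~left] - right_val) ** 2).sum())
            if err < best_err:
                best, best_err = (j, t, left_val, right_val), err
    return best

def predict_stump(stump, X):
    j, t, left_val, right_val = stump
    return np.where(X[:, j] <= t, left_val, right_val)

def fit_gbr(X, y, n_estimators=100, learning_rate=0.1):
    """Gradient boosting for squared error: each stump is fit to the current residuals."""
    init = y.mean()                      # initial constant prediction
    pred = np.full(len(y), init)
    stumps = []
    for _ in range(n_estimators):
        stump = fit_stump(X, y - pred)   # residual = negative gradient of squared error
        if stump is None:                # no valid split (all features constant)
            break
        pred = pred + learning_rate * predict_stump(stump, X)
        stumps.append(stump)
    return init, learning_rate, stumps

def predict_gbr(model, X):
    init, learning_rate, stumps = model
    pred = np.full(X.shape[0], init)
    for stump in stumps:
        pred = pred + learning_rate * predict_stump(stump, X)
    return pred

# Small synthetic regression problem (assumed for illustration): y = sin(x) + noise
rng = np.random.default_rng(0)
X_all = rng.uniform(-3.0, 3.0, size=(200, 1))
y_all = np.sin(X_all[:, 0]) + rng.normal(0.0, 0.1, size=200)
X_tr, X_te, y_tr, y_te = X_all[:150], X_all[150:], y_all[:150], y_all[150:]

scratch_model = fit_gbr(X_tr, y_tr, n_estimators=100, learning_rate=0.1)
pred_te = predict_gbr(scratch_model, X_te)

# Evaluation metrics computed by hand
mse_scratch = np.mean((y_te - pred_te) ** 2)
r2_scratch = 1.0 - np.sum((y_te - pred_te) ** 2) / np.sum((y_te - y_te.mean()) ** 2)
print(f"MSE: {mse_scratch:.4f}  R^2: {r2_scratch:.4f}")
```

A continuous target is used here because the iris `target` column holds class labels (0, 1, 2); treating it as a regression target works mechanically, but a classifier would be the more natural choice for that data.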

Q3. Experiment with different hyperparameters such as learning rate, number of trees, and tree depth to 
optimise the performance of the model. Use grid search or random search to find the best 
hyperparameters.

In [22]:
grid = {'n_estimators': [10, 50, 100, 500],
        'learning_rate' : [0.0001, 0.001, 0.01, 0.1, 1.0],
        'max_depth' : [3, 7, 9] 
        }
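
The size of this grid determines how many models the search has to fit. As a quick check, `ParameterGrid` from scikit-learn can enumerate the combinations; the sketch below repeats the grid so that it runs on its own and assumes 5-fold cross-validation, which is the `GridSearchCV` default in recent scikit-learn versions. The names `param_grid_check`, `n_candidates`, and `n_folds` are introduced only for this illustration.

```python
from sklearn.model_selection import ParameterGrid

# Same grid as above, repeated so the snippet is self-contained
param_grid_check = {
    'n_estimators': [10, 50, 100, 500],
    'learning_rate': [0.0001, 0.001, 0.01, 0.1, 1.0],
    'max_depth': [3, 7, 9],
}

n_candidates = len(ParameterGrid(param_grid_check))  # 4 * 5 * 3 combinations
n_folds = 5                                          # assumed default cv
print(n_candidates, n_candidates * n_folds)          # 60 300
```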


In [23]:
gb_reg = GradientBoostingRegressor(random_state=42)

In [24]:
from sklearn.model_selection import GridSearchCV

grid_search = GridSearchCV(estimator=gb_reg, 
                           param_grid=grid, 
                           verbose=4)
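
The question also mentions random search. As an alternative to the exhaustive grid, a minimal sketch with `RandomizedSearchCV` is shown below. The sampling ranges, `n_iter=10`, and the variable names are assumptions chosen for illustration, not values from the notebook; the data loading and split are repeated so that the snippet runs on its own.

```python
from scipy.stats import loguniform, randint
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# load_iris(return_X_y=True) returns y as a 1-D array, so no reshaping is needed
X_rs, y_rs = load_iris(return_X_y=True)
X_rs_train, X_rs_test, y_rs_train, y_rs_test = train_test_split(
    X_rs, y_rs, test_size=0.30, random_state=42
)

# Distributions to sample from (assumed ranges); randint's upper bound is exclusive
param_distributions = {
    "n_estimators": randint(50, 300),
    "learning_rate": loguniform(0.01, 0.3),
    "max_depth": randint(2, 6),
}

random_search = RandomizedSearchCV(
    estimator=GradientBoostingRegressor(random_state=42),
    param_distributions=param_distributions,
    n_iter=10,          # number of sampled parameter settings
    cv=5,
    random_state=42,
)
random_search.fit(X_rs_train, y_rs_train)

print(random_search.best_params_)
print(random_search.best_score_)   # mean cross-validated R^2 of the best setting
```

Random search evaluates a fixed number of sampled settings instead of every combination, so its cost is set by `n_iter` rather than by the size of the grid.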

In [25]:
# Pass y as a 1-D array; a one-column DataFrame triggers scikit-learn's DataConversionWarning
grid_search.fit(X_train, y_train.values.ravel())


Fitting 5 folds for each of 60 candidates, totalling 300 fits
[CV 1/5] END learning_rate=0.0001, max_depth=3, n_estimators=10;, score=-0.053 total time=   0.0s
[CV 2/5] END learning_rate=0.0001, max_depth=3, n_estimators=10;, score=0.002 total time=   0.0s
[CV 3/5] END learning_rate=0.0001, max_depth=3, n_estimators=10;, score=-0.005 total time=   0.0s
[CV 4/5] END learning_rate=0.0001, max_depth=3, n_estimators=10;, score=0.002 total time=   0.0s
[CV 5/5] END learning_rate=0.0001, max_depth=3, n_estimators=10;, score=-0.138 total time=   0.0s
[CV 1/5] END learning_rate=0.0001, max_depth=3, n_estimators=50;, score=-0.045 total time=   0.0s
[CV 2/5] END learning_rate=0.0001, max_depth=3, n_estimators=50;, score=0.009 total time=   0.0s
[CV 3/5] END learning_rate=0.0001, max_depth=3, n_estimators=50;, score=0.003 total time=   0.0s


[CV 4/5] END learning_rate=0.0001, max_depth=3, n_estimators=50;, score=0.009 total time=   0.0s
[CV 5/5] END learning_rate=0.0001, max_depth=3, n_estimators=50;, score=-0.130 total time=   0.0s
[CV 1/5] END learning_rate=0.0001, max_depth=3, n_estimators=100;, score=-0.036 total time=   0.1s
[CV 2/5] END learning_rate=0.0001, max_depth=3, n_estimators=100;, score=0.019 total time=   0.1s


[CV 3/5] END learning_rate=0.0001, max_depth=3, n_estimators=100;, score=0.012 total time=   0.1s
[CV 4/5] END learning_rate=0.0001, max_depth=3, n_estimators=100;, score=0.019 total time=   0.1s
[CV 5/5] END learning_rate=0.0001, max_depth=3, n_estimators=100;, score=-0.120 total time=   0.1s
[CV 1/5] END learning_rate=0.0001, max_depth=3, n_estimators=500;, score=0.039 total time=   0.3s


[CV 2/5] END learning_rate=0.0001, max_depth=3, n_estimators=500;, score=0.092 total time=   0.3s


[CV 3/5] END learning_rate=0.0001, max_depth=3, n_estimators=500;, score=0.083 total time=   0.3s


[CV 4/5] END learning_rate=0.0001, max_depth=3, n_estimators=500;, score=0.091 total time=   0.3s


[CV 5/5] END learning_rate=0.0001, max_depth=3, n_estimators=500;, score=-0.045 total time=   0.3s
[CV 1/5] END learning_rate=0.0001, max_depth=7, n_estimators=10;, score=-0.053 total time=   0.0s
[CV 2/5] END learning_rate=0.0001, max_depth=7, n_estimators=10;, score=0.002 total time=   0.0s
[CV 3/5] END learning_rate=0.0001, max_depth=7, n_estimators=10;, score=-0.005 total time=   0.0s
[CV 4/5] END learning_rate=0.0001, max_depth=7, n_estimators=10;, score=0.002 total time=   0.0s
[CV 5/5] END learning_rate=0.0001, max_depth=7, n_estimators=10;, score=-0.138 total time=   0.0s
[CV 1/5] END learning_rate=0.0001, max_depth=7, n_estimators=50;, score=-0.045 total time=   0.0s
[CV 2/5] END learning_rate=0.0001, max_depth=7, n_estimators=50;, score=0.009 total time=   0.0s
[CV 3/5] END learning_rate=0.0001, max_depth=7, n_estimators=50;, score=0.003 total time=   0.0s
[CV 4/5] END learning_rate=0.0001, max_depth=7, n_estimators=50;, score=0.010 total time=   0.0s


[CV 5/5] END learning_rate=0.0001, max_depth=7, n_estimators=50;, score=-0.130 total time=   0.0s
[CV 1/5] END learning_rate=0.0001, max_depth=7, n_estimators=100;, score=-0.035 total time=   0.1s
[CV 2/5] END learning_rate=0.0001, max_depth=7, n_estimators=100;, score=0.018 total time=   0.1s
[CV 3/5] END learning_rate=0.0001, max_depth=7, n_estimators=100;, score=0.012 total time=   0.1s


[CV 4/5] END learning_rate=0.0001, max_depth=7, n_estimators=100;, score=0.020 total time=   0.1s
[CV 5/5] END learning_rate=0.0001, max_depth=7, n_estimators=100;, score=-0.120 total time=   0.1s


[CV 1/5] END learning_rate=0.0001, max_depth=7, n_estimators=500;, score=0.039 total time=   0.3s


[CV 2/5] END learning_rate=0.0001, max_depth=7, n_estimators=500;, score=0.088 total time=   0.3s


[CV 3/5] END learning_rate=0.0001, max_depth=7, n_estimators=500;, score=0.083 total time=   0.3s


[CV 4/5] END learning_rate=0.0001, max_depth=7, n_estimators=500;, score=0.095 total time=   0.3s


[CV 5/5] END learning_rate=0.0001, max_depth=7, n_estimators=500;, score=-0.047 total time=   0.3s
[CV 1/5] END learning_rate=0.0001, max_depth=9, n_estimators=10;, score=-0.053 total time=   0.0s
[CV 2/5] END learning_rate=0.0001, max_depth=9, n_estimators=10;, score=0.002 total time=   0.0s
[CV 3/5] END learning_rate=0.0001, max_depth=9, n_estimators=10;, score=-0.005 total time=   0.0s
[CV 4/5] END learning_rate=0.0001, max_depth=9, n_estimators=10;, score=0.002 total time=   0.0s
[CV 5/5] END learning_rate=0.0001, max_depth=9, n_estimators=10;, score=-0.138 total time=   0.0s
[CV 1/5] END learning_rate=0.0001, max_depth=9, n_estimators=50;, score=-0.045 total time=   0.0s
[CV 2/5] END learning_rate=0.0001, max_depth=9, n_estimators=50;, score=0.009 total time=   0.0s
[CV 3/5] END learning_rate=0.0001, max_depth=9, n_estimators=50;, score=0.003 total time=   0.0s
[CV 4/5] END learning_rate=0.0001, max_depth=9, n_estimators=50;, score=0.010 total time=   0.0s


[CV 5/5] END learning_rate=0.0001, max_depth=9, n_estimators=50;, score=-0.130 total time=   0.0s
[CV 1/5] END learning_rate=0.0001, max_depth=9, n_estimators=100;, score=-0.035 total time=   0.1s
[CV 2/5] END learning_rate=0.0001, max_depth=9, n_estimators=100;, score=0.018 total time=   0.1s
[CV 3/5] END learning_rate=0.0001, max_depth=9, n_estimators=100;, score=0.012 total time=   0.1s


[CV 4/5] END learning_rate=0.0001, max_depth=9, n_estimators=100;, score=0.020 total time=   0.1s
[CV 5/5] END learning_rate=0.0001, max_depth=9, n_estimators=100;, score=-0.120 total time=   0.1s


[CV 1/5] END learning_rate=0.0001, max_depth=9, n_estimators=500;, score=0.039 total time=   0.3s


[CV 2/5] END learning_rate=0.0001, max_depth=9, n_estimators=500;, score=0.089 total time=   0.3s


[CV 3/5] END learning_rate=0.0001, max_depth=9, n_estimators=500;, score=0.083 total time=   0.3s


[CV 4/5] END learning_rate=0.0001, max_depth=9, n_estimators=500;, score=0.095 total time=   0.3s


[CV 5/5] END learning_rate=0.0001, max_depth=9, n_estimators=500;, score=-0.047 total time=   0.3s
[CV 1/5] END learning_rate=0.001, max_depth=3, n_estimators=10;, score=-0.036 total time=   0.0s
[CV 2/5] END learning_rate=0.001, max_depth=3, n_estimators=10;, score=0.019 total time=   0.0s
[CV 3/5] END learning_rate=0.001, max_depth=3, n_estimators=10;, score=0.012 total time=   0.0s
[CV 4/5] END learning_rate=0.001, max_depth=3, n_estimators=10;, score=0.019 total time=   0.0s
[CV 5/5] END learning_rate=0.001, max_depth=3, n_estimators=10;, score=-0.120 total time=   0.0s
[CV 1/5] END learning_rate=0.001, max_depth=3, n_estimators=50;, score=0.039 total time=   0.0s
[CV 2/5] END learning_rate=0.001, max_depth=3, n_estimators=50;, score=0.092 total time=   0.0s
[CV 3/5] END learning_rate=0.001, max_depth=3, n_estimators=50;, score=0.083 total time=   0.0s
[CV 4/5] END learning_rate=0.001, max_depth=3, n_estimators=50;, score=0.091 total time=   0.0s


[CV 5/5] END learning_rate=0.001, max_depth=3, n_estimators=50;, score=-0.045 total time=   0.0s
[CV 1/5] END learning_rate=0.001, max_depth=3, n_estimators=100;, score=0.126 total time=   0.1s
[CV 2/5] END learning_rate=0.001, max_depth=3, n_estimators=100;, score=0.175 total time=   0.1s
[CV 3/5] END learning_rate=0.001, max_depth=3, n_estimators=100;, score=0.163 total time=   0.1s


[CV 4/5] END learning_rate=0.001, max_depth=3, n_estimators=100;, score=0.173 total time=   0.1s
[CV 5/5] END learning_rate=0.001, max_depth=3, n_estimators=100;, score=0.040 total time=   0.1s


[CV 1/5] END learning_rate=0.001, max_depth=3, n_estimators=500;, score=0.587 total time=   0.3s


[CV 2/5] END learning_rate=0.001, max_depth=3, n_estimators=500;, score=0.602 total time=   0.3s


[CV 3/5] END learning_rate=0.001, max_depth=3, n_estimators=500;, score=0.575 total time=   0.3s


[CV 4/5] END learning_rate=0.001, max_depth=3, n_estimators=500;, score=0.611 total time=   0.3s


[CV 5/5] END learning_rate=0.001, max_depth=3, n_estimators=500;, score=0.499 total time=   0.3s
[CV 1/5] END learning_rate=0.001, max_depth=7, n_estimators=10;, score=-0.035 total time=   0.0s
[CV 2/5] END learning_rate=0.001, max_depth=7, n_estimators=10;, score=0.018 total time=   0.0s
[CV 3/5] END learning_rate=0.001, max_depth=7, n_estimators=10;, score=0.012 total time=   0.0s
[CV 4/5] END learning_rate=0.001, max_depth=7, n_estimators=10;, score=0.020 total time=   0.0s
[CV 5/5] END learning_rate=0.001, max_depth=7, n_estimators=10;, score=-0.120 total time=   0.0s
[CV 1/5] END learning_rate=0.001, max_depth=7, n_estimators=50;, score=0.040 total time=   0.0s
[CV 2/5] END learning_rate=0.001, max_depth=7, n_estimators=50;, score=0.089 total time=   0.0s
[CV 3/5] END learning_rate=0.001, max_depth=7, n_estimators=50;, score=0.083 total time=   0.0s
[CV 4/5] END learning_rate=0.001, max_depth=7, n_estimators=50;, score=0.095 total time=   0.0s


[CV 5/5] END learning_rate=0.001, max_depth=7, n_estimators=50;, score=-0.047 total time=   0.0s
[CV 1/5] END learning_rate=0.001, max_depth=7, n_estimators=100;, score=0.125 total time=   0.1s
[CV 2/5] END learning_rate=0.001, max_depth=7, n_estimators=100;, score=0.172 total time=   0.1s
[CV 3/5] END learning_rate=0.001, max_depth=7, n_estimators=100;, score=0.163 total time=   0.1s


[CV 4/5] END learning_rate=0.001, max_depth=7, n_estimators=100;, score=0.181 total time=   0.1s
[CV 5/5] END learning_rate=0.001, max_depth=7, n_estimators=100;, score=0.038 total time=   0.1s


[CV 1/5] END learning_rate=0.001, max_depth=7, n_estimators=500;, score=0.573 total time=   0.3s


[CV 2/5] END learning_rate=0.001, max_depth=7, n_estimators=500;, score=0.590 total time=   0.3s


[CV 3/5] END learning_rate=0.001, max_depth=7, n_estimators=500;, score=0.575 total time=   0.3s


[CV 4/5] END learning_rate=0.001, max_depth=7, n_estimators=500;, score=0.632 total time=   0.3s


[CV 5/5] END learning_rate=0.001, max_depth=7, n_estimators=500;, score=0.478 total time=   0.3s
[CV 1/5] END learning_rate=0.001, max_depth=9, n_estimators=10;, score=-0.035 total time=   0.0s
[CV 2/5] END learning_rate=0.001, max_depth=9, n_estimators=10;, score=0.018 total time=   0.0s
[CV 3/5] END learning_rate=0.001, max_depth=9, n_estimators=10;, score=0.012 total time=   0.0s
[CV 4/5] END learning_rate=0.001, max_depth=9, n_estimators=10;, score=0.020 total time=   0.0s
[CV 5/5] END learning_rate=0.001, max_depth=9, n_estimators=10;, score=-0.120 total time=   0.0s
[CV 1/5] END learning_rate=0.001, max_depth=9, n_estimators=50;, score=0.040 total time=   0.0s
[CV 2/5] END learning_rate=0.001, max_depth=9, n_estimators=50;, score=0.089 total time=   0.0s
[CV 3/5] END learning_rate=0.001, max_depth=9, n_estimators=50;, score=0.083 total time=   0.0s
[CV 4/5] END learning_rate=0.001, max_depth=9, n_estimators=50;, score=0.095 total time=   0.0s


[CV 5/5] END learning_rate=0.001, max_depth=9, n_estimators=50;, score=-0.047 total time=   0.0s
[CV 1/5] END learning_rate=0.001, max_depth=9, n_estimators=100;, score=0.125 total time=   0.1s
[CV 2/5] END learning_rate=0.001, max_depth=9, n_estimators=100;, score=0.172 total time=   0.1s
[CV 3/5] END learning_rate=0.001, max_depth=9, n_estimators=100;, score=0.163 total time=   0.1s


[CV 4/5] END learning_rate=0.001, max_depth=9, n_estimators=100;, score=0.181 total time=   0.1s
[CV 5/5] END learning_rate=0.001, max_depth=9, n_estimators=100;, score=0.038 total time=   0.1s


[CV 1/5] END learning_rate=0.001, max_depth=9, n_estimators=500;, score=0.573 total time=   0.3s


[CV 2/5] END learning_rate=0.001, max_depth=9, n_estimators=500;, score=0.589 total time=   0.3s


[CV 3/5] END learning_rate=0.001, max_depth=9, n_estimators=500;, score=0.575 total time=   0.3s


[CV 4/5] END learning_rate=0.001, max_depth=9, n_estimators=500;, score=0.632 total time=   0.3s


[CV 5/5] END learning_rate=0.001, max_depth=9, n_estimators=500;, score=0.478 total time=   0.3s
[CV 1/5] END learning_rate=0.01, max_depth=3, n_estimators=10;, score=0.126 total time=   0.0s
[CV 2/5] END learning_rate=0.01, max_depth=3, n_estimators=10;, score=0.175 total time=   0.0s
[CV 3/5] END learning_rate=0.01, max_depth=3, n_estimators=10;, score=0.164 total time=   0.0s
[CV 4/5] END learning_rate=0.01, max_depth=3, n_estimators=10;, score=0.174 total time=   0.0s
[CV 5/5] END learning_rate=0.01, max_depth=3, n_estimators=10;, score=0.041 total time=   0.0s
[CV 1/5] END learning_rate=0.01, max_depth=3, n_estimators=50;, score=0.589 total time=   0.0s
[CV 2/5] END learning_rate=0.01, max_depth=3, n_estimators=50;, score=0.603 total time=   0.0s
[CV 3/5] END learning_rate=0.01, max_depth=3, n_estimators=50;, score=0.576 total time=   0.0s
[CV 4/5] END learning_rate=0.01, max_depth=3, n_estimators=50;, score=0.612 total time=   0.0s


[CV 5/5] END learning_rate=0.01, max_depth=3, n_estimators=50;, score=0.501 total time=   0.0s
[CV 1/5] END learning_rate=0.01, max_depth=3, n_estimators=100;, score=0.830 total time=   0.1s
[CV 2/5] END learning_rate=0.01, max_depth=3, n_estimators=100;, score=0.808 total time=   0.1s
[CV 3/5] END learning_rate=0.01, max_depth=3, n_estimators=100;, score=0.773 total time=   0.1s


[CV 4/5] END learning_rate=0.01, max_depth=3, n_estimators=100;, score=0.839 total time=   0.1s
[CV 5/5] END learning_rate=0.01, max_depth=3, n_estimators=100;, score=0.752 total time=   0.1s


[CV 1/5] END learning_rate=0.01, max_depth=3, n_estimators=500;, score=0.952 total time=   0.3s


[CV 2/5] END learning_rate=0.01, max_depth=3, n_estimators=500;, score=0.903 total time=   0.3s


[CV 3/5] END learning_rate=0.01, max_depth=3, n_estimators=500;, score=0.834 total time=   0.3s


[CV 4/5] END learning_rate=0.01, max_depth=3, n_estimators=500;, score=0.991 total time=   0.3s


[CV 5/5] END learning_rate=0.01, max_depth=3, n_estimators=500;, score=0.846 total time=   0.3s
[CV 1/5] END learning_rate=0.01, max_depth=7, n_estimators=10;, score=0.126 total time=   0.0s
[CV 2/5] END learning_rate=0.01, max_depth=7, n_estimators=10;, score=0.161 total time=   0.0s
[CV 3/5] END learning_rate=0.01, max_depth=7, n_estimators=10;, score=0.164 total time=   0.0s
[CV 4/5] END learning_rate=0.01, max_depth=7, n_estimators=10;, score=0.182 total time=   0.0s
[CV 5/5] END learning_rate=0.01, max_depth=7, n_estimators=10;, score=0.038 total time=   0.0s
[CV 1/5] END learning_rate=0.01, max_depth=7, n_estimators=50;, score=0.574 total time=   0.0s
[CV 2/5] END learning_rate=0.01, max_depth=7, n_estimators=50;, score=0.591 total time=   0.0s
[CV 3/5] END learning_rate=0.01, max_depth=7, n_estimators=50;, score=0.576 total time=   0.0s
[CV 4/5] END learning_rate=0.01, max_depth=7, n_estimators=50;, score=0.634 total time=   0.0s


[CV 5/5] END learning_rate=0.01, max_depth=7, n_estimators=50;, score=0.480 total time=   0.0s
[CV 1/5] END learning_rate=0.01, max_depth=7, n_estimators=100;, score=0.803 total time=   0.1s
[CV 2/5] END learning_rate=0.01, max_depth=7, n_estimators=100;, score=0.801 total time=   0.1s
[CV 3/5] END learning_rate=0.01, max_depth=7, n_estimators=100;, score=0.773 total time=   0.1s


[CV 4/5] END learning_rate=0.01, max_depth=7, n_estimators=100;, score=0.866 total time=   0.1s
[CV 5/5] END learning_rate=0.01, max_depth=7, n_estimators=100;, score=0.706 total time=   0.1s


[CV 1/5] END learning_rate=0.01, max_depth=7, n_estimators=500;, score=0.936 total time=   0.3s


[CV 2/5] END learning_rate=0.01, max_depth=7, n_estimators=500;, score=0.909 total time=   0.3s


[CV 3/5] END learning_rate=0.01, max_depth=7, n_estimators=500;, score=0.849 total time=   0.3s


[CV 4/5] END learning_rate=0.01, max_depth=7, n_estimators=500;, score=1.000 total time=   0.3s


[CV 5/5] END learning_rate=0.01, max_depth=7, n_estimators=500;, score=0.837 total time=   0.3s
[CV 1/5] END learning_rate=0.01, max_depth=9, n_estimators=10;, score=0.126 total time=   0.0s
[CV 2/5] END learning_rate=0.01, max_depth=9, n_estimators=10;, score=0.161 total time=   0.0s
[CV 3/5] END learning_rate=0.01, max_depth=9, n_estimators=10;, score=0.164 total time=   0.0s
[CV 4/5] END learning_rate=0.01, max_depth=9, n_estimators=10;, score=0.182 total time=   0.0s
[CV 5/5] END learning_rate=0.01, max_depth=9, n_estimators=10;, score=0.038 total time=   0.0s
[CV 1/5] END learning_rate=0.01, max_depth=9, n_estimators=50;, score=0.574 total time=   0.0s
[CV 2/5] END learning_rate=0.01, max_depth=9, n_estimators=50;, score=0.591 total time=   0.0s
[CV 3/5] END learning_rate=0.01, max_depth=9, n_estimators=50;, score=0.576 total time=   0.0s
[CV 4/5] END learning_rate=0.01, max_depth=9, n_estimators=50;, score=0.634 total time=   0.0s


[CV 5/5] END learning_rate=0.01, max_depth=9, n_estimators=50;, score=0.480 total time=   0.0s
[CV 1/5] END learning_rate=0.01, max_depth=9, n_estimators=100;, score=0.803 total time=   0.1s
[CV 2/5] END learning_rate=0.01, max_depth=9, n_estimators=100;, score=0.801 total time=   0.1s
[CV 3/5] END learning_rate=0.01, max_depth=9, n_estimators=100;, score=0.773 total time=   0.1s


[CV 4/5] END learning_rate=0.01, max_depth=9, n_estimators=100;, score=0.866 total time=   0.1s
[CV 5/5] END learning_rate=0.01, max_depth=9, n_estimators=100;, score=0.706 total time=   0.1s


[CV 1/5] END learning_rate=0.01, max_depth=9, n_estimators=500;, score=0.936 total time=   0.3s


[CV 2/5] END learning_rate=0.01, max_depth=9, n_estimators=500;, score=0.909 total time=   0.3s


[CV 3/5] END learning_rate=0.01, max_depth=9, n_estimators=500;, score=0.849 total time=   0.3s


[CV 4/5] END learning_rate=0.01, max_depth=9, n_estimators=500;, score=1.000 total time=   0.3s


[CV 5/5] END learning_rate=0.01, max_depth=9, n_estimators=500;, score=0.837 total time=   0.3s
[CV 1/5] END learning_rate=0.1, max_depth=3, n_estimators=10;, score=0.843 total time=   0.0s
[CV 2/5] END learning_rate=0.1, max_depth=3, n_estimators=10;, score=0.821 total time=   0.0s
[CV 3/5] END learning_rate=0.1, max_depth=3, n_estimators=10;, score=0.782 total time=   0.0s
[CV 4/5] END learning_rate=0.1, max_depth=3, n_estimators=10;, score=0.850 total time=   0.0s
[CV 5/5] END learning_rate=0.1, max_depth=3, n_estimators=10;, score=0.754 total time=   0.0s
[CV 1/5] END learning_rate=0.1, max_depth=3, n_estimators=50;, score=0.954 total time=   0.0s
[CV 2/5] END learning_rate=0.1, max_depth=3, n_estimators=50;, score=0.904 total time=   0.0s
[CV 3/5] END learning_rate=0.1, max_depth=3, n_estimators=50;, score=0.840 total time=   0.0s
[CV 4/5] END learning_rate=0.1, max_depth=3, n_estimators=50;, score=0.992 total time=   0.0s


[CV 5/5] END learning_rate=0.1, max_depth=3, n_estimators=50;, score=0.845 total time=   0.0s
[CV 1/5] END learning_rate=0.1, max_depth=3, n_estimators=100;, score=0.951 total time=   0.1s
[CV 2/5] END learning_rate=0.1, max_depth=3, n_estimators=100;, score=0.895 total time=   0.1s
[CV 3/5] END learning_rate=0.1, max_depth=3, n_estimators=100;, score=0.836 total time=   0.1s


[CV 4/5] END learning_rate=0.1, max_depth=3, n_estimators=100;, score=0.993 total time=   0.1s
[CV 5/5] END learning_rate=0.1, max_depth=3, n_estimators=100;, score=0.844 total time=   0.1s


[CV 1/5] END learning_rate=0.1, max_depth=3, n_estimators=500;, score=0.949 total time=   0.3s


[CV 2/5] END learning_rate=0.1, max_depth=3, n_estimators=500;, score=0.893 total time=   0.3s


[CV 3/5] END learning_rate=0.1, max_depth=3, n_estimators=500;, score=0.836 total time=   0.3s


[CV 4/5] END learning_rate=0.1, max_depth=3, n_estimators=500;, score=0.992 total time=   0.3s


[CV 5/5] END learning_rate=0.1, max_depth=3, n_estimators=500;, score=0.841 total time=   0.3s
[CV 1/5] END learning_rate=0.1, max_depth=7, n_estimators=10;, score=0.819 total time=   0.0s
[CV 2/5] END learning_rate=0.1, max_depth=7, n_estimators=10;, score=0.800 total time=   0.0s
[CV 3/5] END learning_rate=0.1, max_depth=7, n_estimators=10;, score=0.784 total time=   0.0s
[CV 4/5] END learning_rate=0.1, max_depth=7, n_estimators=10;, score=0.878 total time=   0.0s
[CV 5/5] END learning_rate=0.1, max_depth=7, n_estimators=10;, score=0.718 total time=   0.0s
[CV 1/5] END learning_rate=0.1, max_depth=7, n_estimators=50;, score=0.937 total time=   0.0s
[CV 2/5] END learning_rate=0.1, max_depth=7, n_estimators=50;, score=0.906 total time=   0.0s
[CV 3/5] END learning_rate=0.1, max_depth=7, n_estimators=50;, score=0.849 total time=   0.0s
[CV 4/5] END learning_rate=0.1, max_depth=7, n_estimators=50;, score=1.000 total time=   0.0s


[CV 5/5] END learning_rate=0.1, max_depth=7, n_estimators=50;, score=0.837 total time=   0.0s
[CV 1/5] END learning_rate=0.1, max_depth=7, n_estimators=100;, score=0.936 total time=   0.1s
[CV 2/5] END learning_rate=0.1, max_depth=7, n_estimators=100;, score=0.906 total time=   0.1s
[CV 3/5] END learning_rate=0.1, max_depth=7, n_estimators=100;, score=0.848 total time=   0.1s


[CV 4/5] END learning_rate=0.1, max_depth=7, n_estimators=100;, score=1.000 total time=   0.1s
[CV 5/5] END learning_rate=0.1, max_depth=7, n_estimators=100;, score=0.837 total time=   0.1s


[CV 1/5] END learning_rate=0.1, max_depth=7, n_estimators=500;, score=0.936 total time=   0.3s


[CV 2/5] END learning_rate=0.1, max_depth=7, n_estimators=500;, score=0.906 total time=   0.3s
[CV 3/5] END learning_rate=0.1, max_depth=7, n_estimators=500;, score=0.848 total time=   0.3s


[CV 4/5] END learning_rate=0.1, max_depth=7, n_estimators=500;, score=1.000 total time=   0.2s


[CV 5/5] END learning_rate=0.1, max_depth=7, n_estimators=500;, score=0.837 total time=   0.3s
[CV 1/5] END learning_rate=0.1, max_depth=9, n_estimators=10;, score=0.819 total time=   0.0s
[CV 2/5] END learning_rate=0.1, max_depth=9, n_estimators=10;, score=0.800 total time=   0.0s
[CV 3/5] END learning_rate=0.1, max_depth=9, n_estimators=10;, score=0.784 total time=   0.0s
[CV 4/5] END learning_rate=0.1, max_depth=9, n_estimators=10;, score=0.878 total time=   0.0s
[CV 5/5] END learning_rate=0.1, max_depth=9, n_estimators=10;, score=0.718 total time=   0.0s
[CV 1/5] END learning_rate=0.1, max_depth=9, n_estimators=50;, score=0.937 total time=   0.0s
[CV 2/5] END learning_rate=0.1, max_depth=9, n_estimators=50;, score=0.906 total time=   0.0s
[CV 3/5] END learning_rate=0.1, max_depth=9, n_estimators=50;, score=0.849 total time=   0.0s
[CV 4/5] END learning_rate=0.1, max_depth=9, n_estimators=50;, score=1.000 total time=   0.0s


[CV 5/5] END learning_rate=0.1, max_depth=9, n_estimators=50;, score=0.837 total time=   0.0s
[CV 1/5] END learning_rate=0.1, max_depth=9, n_estimators=100;, score=0.936 total time=   0.1s
[CV 2/5] END learning_rate=0.1, max_depth=9, n_estimators=100;, score=0.906 total time=   0.1s
[CV 3/5] END learning_rate=0.1, max_depth=9, n_estimators=100;, score=0.848 total time=   0.1s


[CV 4/5] END learning_rate=0.1, max_depth=9, n_estimators=100;, score=1.000 total time=   0.1s
[CV 5/5] END learning_rate=0.1, max_depth=9, n_estimators=100;, score=0.837 total time=   0.1s


[CV 1/5] END learning_rate=0.1, max_depth=9, n_estimators=500;, score=0.936 total time=   0.3s
[CV 2/5] END learning_rate=0.1, max_depth=9, n_estimators=500;, score=0.906 total time=   0.3s


[CV 3/5] END learning_rate=0.1, max_depth=9, n_estimators=500;, score=0.848 total time=   0.2s
[CV 4/5] END learning_rate=0.1, max_depth=9, n_estimators=500;, score=1.000 total time=   0.3s


[CV 5/5] END learning_rate=0.1, max_depth=9, n_estimators=500;, score=0.837 total time=   0.2s
[CV 1/5] END learning_rate=1.0, max_depth=3, n_estimators=10;, score=0.924 total time=   0.0s
[CV 2/5] END learning_rate=1.0, max_depth=3, n_estimators=10;, score=0.836 total time=   0.0s
[CV 3/5] END learning_rate=1.0, max_depth=3, n_estimators=10;, score=0.809 total time=   0.0s
[CV 4/5] END learning_rate=1.0, max_depth=3, n_estimators=10;, score=0.989 total time=   0.0s
[CV 5/5] END learning_rate=1.0, max_depth=3, n_estimators=10;, score=0.756 total time=   0.0s
[CV 1/5] END learning_rate=1.0, max_depth=3, n_estimators=50;, score=0.923 total time=   0.0s
[CV 2/5] END learning_rate=1.0, max_depth=3, n_estimators=50;, score=0.835 total time=   0.0s
[CV 3/5] END learning_rate=1.0, max_depth=3, n_estimators=50;, score=0.809 total time=   0.0s
[CV 4/5] END learning_rate=1.0, max_depth=3, n_estimators=50;, score=0.989 total time=   0.0s


[CV 5/5] END learning_rate=1.0, max_depth=3, n_estimators=50;, score=0.756 total time=   0.0s
[CV 1/5] END learning_rate=1.0, max_depth=3, n_estimators=100;, score=0.923 total time=   0.1s
[CV 2/5] END learning_rate=1.0, max_depth=3, n_estimators=100;, score=0.835 total time=   0.1s
[CV 3/5] END learning_rate=1.0, max_depth=3, n_estimators=100;, score=0.809 total time=   0.1s


[CV 4/5] END learning_rate=1.0, max_depth=3, n_estimators=100;, score=0.989 total time=   0.1s
[CV 5/5] END learning_rate=1.0, max_depth=3, n_estimators=100;, score=0.756 total time=   0.1s


[CV 1/5] END learning_rate=1.0, max_depth=3, n_estimators=500;, score=0.923 total time=   0.3s
[CV 2/5] END learning_rate=1.0, max_depth=3, n_estimators=500;, score=0.835 total time=   0.3s


[CV 3/5] END learning_rate=1.0, max_depth=3, n_estimators=500;, score=0.809 total time=   0.2s
[CV 4/5] END learning_rate=1.0, max_depth=3, n_estimators=500;, score=0.989 total time=   0.2s


[CV 5/5] END learning_rate=1.0, max_depth=3, n_estimators=500;, score=0.756 total time=   0.2s
[CV 1/5] END learning_rate=1.0, max_depth=7, n_estimators=10;, score=0.928 total time=   0.0s
[CV 2/5] END learning_rate=1.0, max_depth=7, n_estimators=10;, score=0.691 total time=   0.0s
[CV 3/5] END learning_rate=1.0, max_depth=7, n_estimators=10;, score=0.812 total time=   0.0s
[CV 4/5] END learning_rate=1.0, max_depth=7, n_estimators=10;, score=1.000 total time=   0.0s
[CV 5/5] END learning_rate=1.0, max_depth=7, n_estimators=10;, score=0.837 total time=   0.0s
[CV 1/5] END learning_rate=1.0, max_depth=7, n_estimators=50;, score=0.928 total time=   0.0s
[CV 2/5] END learning_rate=1.0, max_depth=7, n_estimators=50;, score=0.691 total time=   0.0s
[CV 3/5] END learning_rate=1.0, max_depth=7, n_estimators=50;, score=0.812 total time=   0.0s
[CV 4/5] END learning_rate=1.0, max_depth=7, n_estimators=50;, score=1.000 total time=   0.0s
[CV 5/5] END learning_rate=1.0, max_depth=7, n_estimators=5

[CV 1/5] END learning_rate=1.0, max_depth=7, n_estimators=100;, score=0.928 total time=   0.1s
[CV 2/5] END learning_rate=1.0, max_depth=7, n_estimators=100;, score=0.691 total time=   0.1s
[CV 3/5] END learning_rate=1.0, max_depth=7, n_estimators=100;, score=0.812 total time=   0.1s
[CV 4/5] END learning_rate=1.0, max_depth=7, n_estimators=100;, score=1.000 total time=   0.1s
[CV 5/5] END learning_rate=1.0, max_depth=7, n_estimators=100;, score=0.837 total time=   0.1s


[CV 1/5] END learning_rate=1.0, max_depth=7, n_estimators=500;, score=0.928 total time=   0.2s


[CV 2/5] END learning_rate=1.0, max_depth=7, n_estimators=500;, score=0.691 total time=   0.2s


[CV 3/5] END learning_rate=1.0, max_depth=7, n_estimators=500;, score=0.812 total time=   0.2s
[CV 4/5] END learning_rate=1.0, max_depth=7, n_estimators=500;, score=1.000 total time=   0.2s


[CV 5/5] END learning_rate=1.0, max_depth=7, n_estimators=500;, score=0.837 total time=   0.2s
[CV 1/5] END learning_rate=1.0, max_depth=9, n_estimators=10;, score=0.928 total time=   0.0s
[CV 2/5] END learning_rate=1.0, max_depth=9, n_estimators=10;, score=0.691 total time=   0.0s
[CV 3/5] END learning_rate=1.0, max_depth=9, n_estimators=10;, score=0.812 total time=   0.0s
[CV 4/5] END learning_rate=1.0, max_depth=9, n_estimators=10;, score=1.000 total time=   0.0s
[CV 5/5] END learning_rate=1.0, max_depth=9, n_estimators=10;, score=0.837 total time=   0.0s
[CV 1/5] END learning_rate=1.0, max_depth=9, n_estimators=50;, score=0.928 total time=   0.0s
[CV 2/5] END learning_rate=1.0, max_depth=9, n_estimators=50;, score=0.691 total time=   0.0s
[CV 3/5] END learning_rate=1.0, max_depth=9, n_estimators=50;, score=0.812 total time=   0.0s
[CV 4/5] END learning_rate=1.0, max_depth=9, n_estimators=50;, score=1.000 total time=   0.0s
[CV 5/5] END learning_rate=1.0, max_depth=9, n_estimators=5

[CV 1/5] END learning_rate=1.0, max_depth=9, n_estimators=100;, score=0.928 total time=   0.1s
[CV 2/5] END learning_rate=1.0, max_depth=9, n_estimators=100;, score=0.691 total time=   0.1s
[CV 3/5] END learning_rate=1.0, max_depth=9, n_estimators=100;, score=0.812 total time=   0.1s
[CV 4/5] END learning_rate=1.0, max_depth=9, n_estimators=100;, score=1.000 total time=   0.1s


[CV 5/5] END learning_rate=1.0, max_depth=9, n_estimators=100;, score=0.837 total time=   0.1s
[CV 1/5] END learning_rate=1.0, max_depth=9, n_estimators=500;, score=0.928 total time=   0.3s


[CV 2/5] END learning_rate=1.0, max_depth=9, n_estimators=500;, score=0.691 total time=   0.2s


[CV 3/5] END learning_rate=1.0, max_depth=9, n_estimators=500;, score=0.812 total time=   0.2s
[CV 4/5] END learning_rate=1.0, max_depth=9, n_estimators=500;, score=1.000 total time=   0.3s


[CV 5/5] END learning_rate=1.0, max_depth=9, n_estimators=500;, score=0.837 total time=   0.2s


In [27]:
y_pred = grid_search.predict(X_test)

In [28]:
grid_search.best_params_

{'learning_rate': 0.1, 'max_depth': 3, 'n_estimators': 50}

In [29]:
mean_squared_error(y_test, y_pred)

0.005178208777694715

In [30]:
r2_score(y_test, y_pred)  # expects (y_true, y_pred); the value below came from the swapped order

0.992325087454757

Q4. What is a weak learner in Gradient Boosting?

Ans:
    
    In Gradient Boosting, a weak learner is a model that performs only slightly better than a trivial baseline: random guessing for classification, or predicting the mean for regression. It is also called a base learner or a weak classifier/regressor.

    A weak learner in Gradient Boosting is usually a simple model, such as a decision tree with a small number of nodes or a linear regression model with few features. The idea behind using a weak learner is that, by combining the outputs of many weak learners, we can create a strong predictor that generalizes well to new data.

    The concept of a weak learner is central to Gradient Boosting, as it is used to build the ensemble of models that make up the final predictor. In each iteration of the algorithm, a new weak learner is trained on the residual errors of the previous weak learners. The output of the weak learner is then combined with the outputs of the previous weak learners to update the predictions.

    By iteratively adding weak learners to the ensemble, Gradient Boosting is able to create a strong predictor that can accurately model complex relationships between the input features and the target variable.
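
To make the idea of a weak learner concrete, here is a minimal sketch of a one-threshold classifier (a "decision stump") in plain Python. The function name `fit_threshold_classifier` and the toy labels are made up for illustration and do not come from the notebook.

```python
# Hypothetical example: a one-threshold classifier ("decision stump") on toy data.

def fit_threshold_classifier(x, labels):
    """Try every midpoint threshold; predict 1 when x > threshold."""
    best_threshold, best_accuracy = None, -1.0
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        # Count the points whose predicted class matches the label.
        correct = sum((xi > threshold) == bool(label) for xi, label in zip(x, labels))
        accuracy = correct / len(x)
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = threshold, accuracy
    return best_threshold, best_accuracy


x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
labels = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]  # made-up labels, not separable by one threshold

threshold, accuracy = fit_threshold_classifier(x, labels)
# Baseline: always predict the most common class.
majority_accuracy = max(labels.count(0), labels.count(1)) / len(labels)
print(threshold, accuracy, majority_accuracy)
```

On this toy data the stump beats the majority-class baseline but cannot classify every point correctly, which is the sense in which it is "weak". Boosting relies on combining many such models rather than on any single one being accurate.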

Q5. What is the intuition behind the Gradient Boosting algorithm?

Ans: 
    
    The intuition behind the Gradient Boosting algorithm is to add simple models (weak learners) to an ensemble one at a time, so that each new model corrects the errors made by the models before it. Combining many weak learners in this way produces a strong learner that can make accurate predictions.

    The algorithm starts with an initial prediction model. In practice this is often just a constant, such as the mean of the target for squared-error loss, although a simple model such as a shallow decision tree can also be used. The initial model is fit on the entire training set, and its output is the starting point for the ensemble.

    In each subsequent iteration, a new weak learner is trained on the residual errors of the previous model. The weak learner is designed to capture the patterns in the residual errors that the previous model could not capture. The output of the weak learner is then added to the predictions of the previous model to update the overall prediction.

    The key idea behind Gradient Boosting is to minimize the loss function (e.g., mean squared error) between the predicted values and the actual values. Each iteration of the algorithm is focused on reducing the loss function by finding the weak learner that best fits the residual errors.

    The algorithm keeps adding weak learners until a stopping criterion is met, such as a fixed number of iterations or no further improvement in the loss. The final prediction is the initial prediction plus the sum of the weak learners' outputs, each scaled by the learning rate.
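
The effect of the repeated small corrections can be seen in an idealised sketch. It assumes, unrealistically, that each "weak learner" reproduces the current residuals exactly, so the only thing slowing convergence is the learning rate. The targets and the learning rate of 0.1 are made-up values.

```python
# Idealised illustration: the "weak learner" here is assumed to reproduce
# the residuals exactly, which real weak learners do not.

y = [3.0, -1.0, 4.0, 1.0, 5.0]          # made-up targets
learning_rate = 0.1

preds = [sum(y) / len(y)] * len(y)      # start from the mean of the targets
losses = []
for _ in range(20):
    residuals = [yi - pi for yi, pi in zip(y, preds)]
    # Record the mean squared error before applying this round's update.
    losses.append(sum(r ** 2 for r in residuals) / len(y))
    # Additive update: move each prediction a small step toward its target.
    preds = [pi + learning_rate * r for pi, r in zip(preds, residuals)]

print([round(loss, 4) for loss in losses[:5]])
```

Under this assumption each round shrinks every residual by a factor of `1 - learning_rate`, so the mean squared error falls by a factor of `(1 - learning_rate) ** 2` per round. Real weak learners fit the residuals only approximately, so the decrease is slower and less regular.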

Q6. How does Gradient Boosting algorithm build an ensemble of weak learners?

Ans: 


    1. Initialize the model: The algorithm starts with an initial prediction, which is often a constant such as the mean of the target, although a simple model like a shallow decision tree can also be used. It is fit on the entire training set, and its output is the starting point for the ensemble.

    2. Compute the residual errors: For each data point in the training set, the residual is computed as the actual target value minus the current prediction. These residuals become the target variable for the next weak learner.

    3. Train the weak learner: A new weak learner is trained on the residual errors from the previous step. The weak learner can be any model that can approximate the residual errors, but typically decision trees with a small number of nodes or linear regression models are used.

    4. Add the weak learner to the ensemble: The output of the new weak learner is added to the outputs of the previous learners to update the overall prediction. The output of the weak learner is scaled by a learning rate parameter, which controls the contribution of each weak learner to the final prediction.

    5. Repeat: Steps 2-4 are repeated for a fixed number of iterations, or until the loss function reaches a desired level of accuracy. At each iteration, a new weak learner is trained on the residual errors of the previous learners, and its output is added to the ensemble to improve the overall prediction.

    The final prediction is the initial prediction plus the weighted sum of the outputs of all the weak learners in the ensemble. With a constant learning rate every learner receives the same weight; the order in which learners were added affects what each one was trained on, not its weight.
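
The steps above can be sketched in plain Python for a one-feature regression problem with squared-error loss. This is a minimal illustration, not the notebook's implementation: the helper names (`fit_stump`, `predict_stump`, `fit_gradient_boosting`, `predict`, `mse`) and the toy data are assumptions, and the weak learner is a depth-1 tree found by brute-force search.

```python
# Illustrative sketch only: 1-D inputs, squared-error loss, depth-1 trees.
# All function names and the data below are hypothetical.

def fit_stump(x, targets):
    """Return (threshold, left_value, right_value) minimising squared error."""
    best = None
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [t for xi, t in zip(x, targets) if xi <= threshold]
        right = [t for xi, t in zip(x, targets) if xi > threshold]
        left_value = sum(left) / len(left)
        right_value = sum(right) / len(right)
        sse = (sum((t - left_value) ** 2 for t in left)
               + sum((t - right_value) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_value, right_value)
    return best[1:]


def predict_stump(stump, xi):
    threshold, left_value, right_value = stump
    return left_value if xi <= threshold else right_value


def fit_gradient_boosting(x, y, n_estimators=50, learning_rate=0.1):
    # Step 1: initialise with a constant prediction (the mean of y).
    init = sum(y) / len(y)
    preds = [init] * len(y)
    stumps = []
    for _ in range(n_estimators):
        # Step 2: residual = actual - current prediction.
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        # Step 3: train a weak learner on the residuals.
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        # Step 4: add its output, scaled by the learning rate.
        preds = [pi + learning_rate * predict_stump(stump, xi)
                 for xi, pi in zip(x, preds)]
    return {"init": init, "learning_rate": learning_rate, "stumps": stumps}


def predict(model, xi):
    boost = sum(predict_stump(s, xi) for s in model["stumps"])
    return model["init"] + model["learning_rate"] * boost


def mse(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]   # made-up feature values
y = [1.2, 1.9, 3.1, 3.9, 5.2, 5.8, 7.1, 8.0]   # made-up targets

model = fit_gradient_boosting(x, y)
baseline_mse = mse(y, [model["init"]] * len(y))
boosted_mse = mse(y, [predict(model, xi) for xi in x])
print(f"baseline MSE: {baseline_mse:.4f}, boosted MSE: {boosted_mse:.4f}")
```

Both errors here are measured on the training data, so the comparison only shows that the ensemble fits better than the constant starting point. It says nothing about generalisation, which needs a held-out set as in the grid search above.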

Q7. What are the steps involved in constructing the mathematical intuition of Gradient Boosting algorithm?

Ans: 


    1. Define the loss function: The first step is to define a differentiable loss function that measures the error between the predicted values and the actual values. For regression problems, the most common loss function is mean squared error (MSE), while for classification problems, binary cross-entropy or categorical cross-entropy loss functions can be used.

    2. Initialize the model: The algorithm starts with an initial prediction F_0(x), where x is the input feature vector. F_0 is commonly the constant that minimizes the loss (the mean of the targets for MSE), although a simple model like a decision tree or a linear regression model can also be used.

    3. Compute the negative gradient of the loss function: For each training example, the negative gradient of the loss with respect to the previous model's output is computed: r_ij = -∂L(y_i, F(x_i))/∂F(x_i), evaluated at F = F_{j-1}. Here i is the index of the training example and j is the index of the weak learner. These values are often called pseudo-residuals.

    4. Train the weak learner: A new weak learner is trained on the negative gradients computed in the previous step. The weak learner is trained to approximate the negative gradients as closely as possible. The output of the weak learner is denoted as h_j(x).

    5. Update the ensemble: The output of the weak learner is scaled by a learning rate parameter (α) and added to the output of the previous model to update the overall prediction. The updated prediction is denoted as F_j(x) = F_{j-1}(x) + α * h_j(x).

    6. Repeat: Steps 3-5 are repeated for a fixed number of iterations, or until the loss function reaches a desired level of accuracy. At each iteration, a new weak learner is trained to approximate the negative gradients of the loss function, and its output is added to the ensemble to improve the overall prediction.
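
For squared-error loss, the negative gradient in step 3 reduces to the ordinary residual, which is why "fitting the residuals" and "fitting the negative gradient" describe the same procedure in that case. A short derivation, assuming the loss is written with a factor of 1/2 (a common convention that is not stated above) and using the same symbols as the steps:

```latex
L\bigl(y_i, F(x_i)\bigr) = \tfrac{1}{2}\,\bigl(y_i - F(x_i)\bigr)^2
\quad\Longrightarrow\quad
r_{ij}
  = -\left[\frac{\partial L\bigl(y_i, F(x_i)\bigr)}{\partial F(x_i)}\right]_{F = F_{j-1}}
  = y_i - F_{j-1}(x_i)

h_j(x_i) \approx r_{ij},
\qquad
F_j(x) = F_{j-1}(x) + \alpha\, h_j(x)
```

For other differentiable losses the negative gradient is no longer the plain residual, but the update rule in step 5 stays the same.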