## Bagging Regressor

![Bagging](https://www.researchgate.net/publication/369343824/figure/fig4/AS:11431281127720553@1679147347645/Bagging-regression-model.png)

The Bagging Regressor is an ensemble learning method that combines the predictions of multiple base regressors to improve the overall performance of the model. It applies bagging (short for bootstrap aggregating) to regression tasks.

Here's how the Bagging Regressor works:

1. **Bootstrap Sampling**:
   - Multiple subsets of the original training dataset are created by random sampling with replacement (bootstrap sampling).
   - Each subset is of the same size as the original dataset.
   - Some instances may appear multiple times in a subset, while others may not appear at all.

2. **Base Regressor Training**:
   - A base regressor (e.g., decision tree regressor, linear regressor, etc.) is trained on each bootstrap sample independently.
   - Each base regressor is exposed to slightly different variations of the original dataset due to the randomness introduced by bootstrap sampling.

3. **Aggregation**:
   - Once all base regressors are trained, their predictions are aggregated to make the final prediction.
   - For regression tasks, the most common approach is to average the predictions of all base regressors.
   - Alternatively, weighted averaging or other aggregation methods can be used based on specific requirements.
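
The three steps above can be sketched from scratch with NumPy. This is a minimal illustration, not scikit-learn's implementation: the synthetic data, the least-squares base regressor, and the helper names `fit_linear` and `predict_linear` are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): y = 3*x + 2 plus noise.
n_samples = 200
X = rng.uniform(-1.0, 1.0, size=(n_samples, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(scale=0.5, size=n_samples)


def fit_linear(X, y):
    """Hypothetical base regressor: least-squares fit with an intercept."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Predict with the coefficients returned by fit_linear."""
    A = np.column_stack([X, np.ones(len(X))])
    return A @ coef


n_estimators = 25
models = []
for _ in range(n_estimators):
    # Step 1: bootstrap sample of the same size, drawn with replacement.
    idx = rng.integers(0, n_samples, size=n_samples)
    # Step 2: train one base regressor on that sample.
    models.append(fit_linear(X[idx], y[idx]))

# Step 3: aggregate by averaging the individual predictions.
X_new = np.array([[0.0], [0.5]])
all_preds = np.stack([predict_linear(coef, X_new) for coef in models])
bagged_pred = all_preds.mean(axis=0)
print(bagged_pred)
```

Each row of `all_preds` holds one base regressor's predictions; the bagged prediction is their column-wise mean.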

The key idea behind Bagging Regressor is to reduce the variance of the model by combining predictions from multiple models trained on slightly different subsets of the data. This helps to smooth out irregularities and noise in the training data, leading to better generalization performance on unseen data.

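Bagging Regressor is particularly effective when the base regressors are unstable, meaning that small changes in the training data produce large changes in the fitted model, as is often the case for fully grown decision trees. Training each base regressor on a different bootstrap sample partly decorrelates their errors, so averaging them reduces the ensemble's tendency to overfit the training data.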
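
The variance argument can be made concrete with a standard result, stated here under simplifying assumptions that are not part of the text above: $B$ base regressors with predictions $\hat f_b(x)$, each with the same variance $\sigma^2$ and the same pairwise correlation $\rho$.

$$
\operatorname{Var}\!\left(\frac{1}{B}\sum_{b=1}^{B}\hat f_b(x)\right)
= \frac{1}{B^2}\left(B\,\sigma^2 + B(B-1)\,\rho\,\sigma^2\right)
= \rho\,\sigma^2 + \frac{1-\rho}{B}\,\sigma^2
$$

The second term shrinks as $B$ grows, while the first term remains. Under these assumptions, adding more estimators helps only up to a point, and lowering the correlation between base regressors lowers the floor.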

In scikit-learn, you can use the `BaggingRegressor` class to create a Bagging Regressor model. It provides flexibility in choosing the base regressor and hyperparameters such as the number of base regressors (`n_estimators`), the size of each bootstrap sample (`max_samples`), and other settings.

### Bagging Regressor hyperparameters

```
class sklearn.ensemble.BaggingRegressor(estimator=None, n_estimators=10, *, max_samples=1.0, max_features=1.0, bootstrap=True, bootstrap_features=False, oob_score=False, warm_start=False, n_jobs=None, random_state=None, verbose=0)
```

1. **estimator**:
   - This parameter specifies the base regressor (or estimator) to be used for training each individual model within the ensemble.
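   - It is set to `None` by default, which means that a decision tree regressor (`DecisionTreeRegressor`) is used as the base estimator. You can pass any other scikit-learn regressor instead.
   - In scikit-learn releases before 1.2 this parameter is named `base_estimator`; the old name was deprecated in 1.2 and removed in 1.4. The notebook cells below use the older name.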

2. **n_estimators**:
   - This parameter defines the number of base regressors (or estimators) to be used in the ensemble.
   - It is set to 10 by default. Adding estimators usually stabilizes the averaged prediction, with diminishing returns, and increases training and prediction cost roughly in proportion.

3. **max_samples**:
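   - This parameter controls the number of samples drawn from the training data to train each base regressor. A float is interpreted as a fraction of the training set, an integer as an absolute count.
   - It is set to 1.0 by default, which means that each base regressor draws as many samples as the training set contains. With `bootstrap=True` these are drawn with replacement, so a sample of that size does not contain every training row: on average it holds about 63% of the distinct rows, with the rest being repeats. Set it below 1.0 to train each base regressor on a smaller sample.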

4. **max_features**:
   - This parameter controls the number of features drawn to train each base regressor. A float is interpreted as a fraction of the features, an integer as an absolute count.
   - It is set to 1.0 by default, which means that all features are used. Set it below 1.0 to train each base regressor on a random subset of the features.

5. **bootstrap**:
   - This parameter specifies whether bootstrap samples should be used when training the base regressors.
   - It is set to `True` by default, meaning that samples are drawn with replacement. If set to `False`, samples are drawn without replacement; combined with `max_samples=1.0`, every base regressor then sees the full training set exactly once.

6. **bootstrap_features**:
   - This parameter specifies whether features are drawn with replacement when selecting the feature subset for each base regressor.
   - It is set to `False` by default, meaning that features are drawn without replacement; combined with `max_features=1.0`, every base regressor sees all features.

7. **oob_score**:
   - This parameter specifies whether to use out-of-bag samples to estimate the generalization performance of the ensemble.
   - It is set to `False` by default. If set to `True`, each training row is predicted using only the base regressors whose bootstrap sample did not include it, and the resulting R^2 score is stored in the `oob_score_` attribute. This requires `bootstrap=True`.

8. **warm_start**:
   - This parameter specifies whether to reuse the estimators fitted by the previous call to `fit()` and add more estimators to the ensemble.
   - It is set to `False` by default, so every call to `fit()` builds a new ensemble. If set to `True`, a later call to `fit()` keeps the existing estimators and fits only the additional ones, so `n_estimators` must be increased between calls for anything new to be trained.

9. **n_jobs**:
   - This parameter specifies the number of jobs to run in parallel for both fitting and predicting.
   - It is set to `None` by default, which means one job unless the call runs inside a `joblib.parallel_backend` context. Set it to a positive integer to use that many workers, or to `-1` to use all available processors.

10. **random_state**:
    - This parameter specifies the random seed for random number generation.
    - It is set to `None` by default, meaning that the bootstrap samples (and therefore the results) can differ from run to run. Set it to an integer to make the results reproducible.

11. **verbose**:
    - This parameter controls the verbosity of the output during fitting.
    - It is set to 0 by default, which means that no output will be generated. You can set it to a positive integer value to display progress messages during fitting.

These are the hyperparameters available in the `BaggingRegressor` class in scikit-learn, each controlling different aspects of the ensemble learning process. Adjusting these hyperparameters can help you fine-tune the performance of the Bagging Regressor for your specific regression task.
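
As a hedged illustration of two of these settings, the sketch below exercises `oob_score` and `warm_start` on synthetic data. The dataset and the variable names are assumptions for the example and are not taken from the notebook that follows. The base estimator is passed positionally because its keyword name differs between scikit-learn releases.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data (an assumption; not the notebook's dataset).
X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)

# Out-of-bag scoring: each row is scored only by estimators that did not
# see it in their bootstrap sample. Requires bootstrap=True.
oob_model = BaggingRegressor(
    DecisionTreeRegressor(),  # positional: works with either keyword name
    n_estimators=50,
    bootstrap=True,
    oob_score=True,
    random_state=0,
)
oob_model.fit(X, y)
print("OOB R^2:", oob_model.oob_score_)

# warm_start: a second fit() keeps the existing estimators and only trains
# the extra ones, so n_estimators has to be raised between the two calls.
ws_model = BaggingRegressor(
    DecisionTreeRegressor(), n_estimators=10, warm_start=True, random_state=0
)
ws_model.fit(X, y)
n_before = len(ws_model.estimators_)
ws_model.set_params(n_estimators=20)
ws_model.fit(X, y)
n_after = len(ws_model.estimators_)
print("estimators:", n_before, "->", n_after)
```

The two options are shown on separate models because scikit-learn does not allow out-of-bag scoring together with `warm_start=True`.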

In [1]:
import warnings
warnings.filterwarnings('ignore')

from sklearn import datasets

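# Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this cell needs an older scikit-learn release to run.
boston = datasets.load_boston()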
X_boston, Y_boston = boston.data, boston.target
print('Dataset features names : '+ str(boston.feature_names))
print('Dataset features size : '+ str(boston.data.shape))
print('Dataset target size : '+ str(boston.target.shape))

Dataset features names : ['CRIM' 'ZN' 'INDUS' 'CHAS' 'NOX' 'RM' 'AGE' 'DIS' 'RAD' 'TAX' 'PTRATIO'
 'B' 'LSTAT']
Dataset features size : (506, 13)
Dataset target size : (506,)


In [2]:
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import r2_score

In [3]:
from sklearn.model_selection import train_test_split

X_train, X_test, Y_train, Y_test = train_test_split(X_boston, Y_boston , train_size=0.80, test_size=0.20, random_state=123)
print('Train/Test Sets Sizes : ',X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)

Train/Test Sets Sizes :  (404, 13) (102, 13) (404,) (102,)


In [4]:
lr = LinearRegression()
dt = DecisionTreeRegressor()
knn = KNeighborsRegressor()

In [5]:
lr.fit(X_train,Y_train)
dt.fit(X_train,Y_train)
knn.fit(X_train,Y_train)

In [6]:
y_pred1 = lr.predict(X_test)
y_pred2 = dt.predict(X_test)
y_pred3 = knn.predict(X_test)

In [7]:
print("R^2 score for LR",r2_score(Y_test,y_pred1))
print("R^2 score for DT",r2_score(Y_test,y_pred2))
print("R^2 score for KNN",r2_score(Y_test,y_pred3))

R^2 score for LR 0.6592466510354119
R^2 score for DT 0.46064407885473546
R^2 score for KNN 0.5475962186976784


#### Applying Bagging

In [8]:
from sklearn.ensemble import BaggingRegressor

bag_regressor = BaggingRegressor(random_state=1)
bag_regressor.fit(X_train, Y_train)

In [9]:
y_pred = bag_regressor.predict(X_test)

In [10]:
print("R^2 score for Bagging",r2_score(Y_test,y_pred))

R^2 score for Bagging 0.8184644795411804


In [11]:
print('Training R^2 :', bag_regressor.score(X_train, Y_train))
print('Test R^2 :', bag_regressor.score(X_test, Y_test))

Training R^2 : 0.9799359879973576
Test R^2 : 0.8184644795411804


In [12]:
%%time

n_samples = boston.data.shape[0]
n_features = boston.data.shape[1]

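# Note: `base_estimator` was renamed to `estimator` in scikit-learn 1.2
# (the old name was removed in 1.4); use the name your installed version expects.
params = {'base_estimator': [None, LinearRegression(), KNeighborsRegressor()],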
          'n_estimators': [20,50,100],
          'max_samples': [0.5,1.0],
          'max_features': [0.5,1.0],
          'bootstrap': [True, False],
          'bootstrap_features': [True, False]}

bagging_regressor_grid = GridSearchCV(BaggingRegressor(random_state=1, n_jobs=-1), param_grid =params, cv=3, n_jobs=-1, verbose=1)
bagging_regressor_grid.fit(X_train, Y_train)

print('Train R^2 Score : %.3f'%bagging_regressor_grid.best_estimator_.score(X_train, Y_train))
print('Test R^2 Score : %.3f'%bagging_regressor_grid.best_estimator_.score(X_test, Y_test))
print('Best R^2 Score Through Grid Search : %.3f'%bagging_regressor_grid.best_score_)
print('Best Parameters : ',bagging_regressor_grid.best_params_)

Fitting 3 folds for each of 144 candidates, totalling 432 fits
Train R^2 Score : 0.983
Test R^2 Score : 0.805
Best R^2 Score Through Grid Search : 0.871
Best Parameters :  {'base_estimator': None, 'bootstrap': True, 'bootstrap_features': False, 'max_features': 1.0, 'max_samples': 1.0, 'n_estimators': 50}
CPU times: total: 672 ms
Wall time: 11.9 s
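
The grid above uses the `base_estimator` key, which only exists in older scikit-learn releases. One possible way to write a grid that runs on either side of the rename is to look the key up at runtime. The sketch below assumes synthetic data and a much smaller grid than the notebook's, and `estimator_key` is a name introduced here for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV

# Synthetic data (an assumption; not the notebook's dataset).
X, y = make_regression(n_samples=200, n_features=8, noise=5.0, random_state=0)

# Pick whichever keyword the installed scikit-learn version exposes.
param_names = BaggingRegressor().get_params()
estimator_key = "estimator" if "estimator" in param_names else "base_estimator"

params = {
    estimator_key: [None, LinearRegression()],  # None -> default decision tree
    "n_estimators": [10, 20],
    "max_features": [0.5, 1.0],
}

# 8 candidates x 3 folds; the default scoring for a regressor is R^2.
grid = GridSearchCV(BaggingRegressor(random_state=1), param_grid=params, cv=3)
grid.fit(X, y)
print("Best CV R^2 :", grid.best_score_)
print("Best params :", grid.best_params_)
```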


In [20]:
# Note: on scikit-learn 1.2 and later, pass `estimator=None` instead of `base_estimator=None`.
bag_regressor1 = BaggingRegressor(base_estimator = None,
                                  n_estimators = 50,
                                  max_samples = 1.0,
                                  max_features = 1.0,
                                  bootstrap =  True, 
                                  bootstrap_features = False,   
                                  random_state=1)

bag_regressor1.fit(X_train, Y_train)

In [21]:
y_pred4 = bag_regressor1.predict(X_test)
r2_score(Y_test, y_pred4)

0.805126651796481

In [23]:
bag_regressor1.estimators_samples_[0]

array([132, 220, 266, 279,  82, 249, 300, 369, 365, 292,  74, 244, 316,
       363, 258, 191, 103, 328,  65, 306,  53, 158, 228,  60,  13, 291,
       331, 291,  83, 209, 376,  69,  33, 369, 345,  86, 303, 359,  37,
       230, 387, 344, 382, 262, 129, 131, 395, 100, 267, 138, 322,  26,
       221, 277, 185, 204, 250,  28,  86, 122, 178, 309,  18, 116, 286,
        12,  77, 177, 282, 223,  21, 393, 128, 211, 396, 183,  42, 146,
       104, 383, 184, 124, 173, 336,   6, 163, 306, 111, 350,  55, 181,
       241, 159,  33, 347, 154,  17, 179, 170, 237,  32, 148, 320, 168,
       269, 358,  62, 297,  60, 216, 120, 310, 216, 106, 361, 386,  93,
       276,  75, 342, 253, 270, 334, 203, 106,  37, 206,  92, 100, 336,
       316, 266, 142, 202,  38,  85, 258, 103, 377, 331, 240, 194, 145,
       131, 238,  22, 375,  24,  57,  26, 384, 372, 170, 380, 301, 200,
       295,  44, 381, 216,  78, 274, 285, 170,   7, 264, 136, 387,  36,
        83, 133, 387, 321, 181, 336,  96, 235, 110,  13,  65, 35
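
The index array above contains repeated values (291 and 387, for example), which is what sampling with replacement produces. A small NumPy sketch, independent of the fitted model and using an arbitrary seed chosen here, shows how many distinct rows one bootstrap sample of this size typically covers:

```python
import numpy as np

rng = np.random.default_rng(1)
n_train = 404  # size of the training split reported earlier

# One bootstrap sample: n_train row indices drawn with replacement.
idx = rng.integers(0, n_train, size=n_train)

unique_fraction = np.unique(idx).size / n_train
oob_fraction = 1.0 - unique_fraction
expected_fraction = 1.0 - (1.0 - 1.0 / n_train) ** n_train

print(f"distinct rows in the sample : {unique_fraction:.3f}")
print(f"rows left out-of-bag        : {oob_fraction:.3f}")
print(f"expected distinct fraction  : {expected_fraction:.3f}")
```

The rows that a given estimator never saw are the ones used for its out-of-bag predictions when `oob_score=True`.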

In [25]:
from sklearn.model_selection import cross_val_score

cross_val_score(bag_regressor1, X_boston, Y_boston, cv=5, scoring='r2')

array([0.76171358, 0.86362648, 0.72362671, 0.45534378, 0.18969469])
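
The fold scores above vary widely. With an integer `cv`, `cross_val_score` splits a regression dataset into contiguous, unshuffled folds, so if the rows are ordered in some way the folds can differ systematically. Whether that explains the spread here is not established by the notebook. The sketch below shows how shuffled folds could be requested, using synthetic data as an assumption.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor
from sklearn.model_selection import KFold, cross_val_score

# Synthetic data (an assumption; not the notebook's dataset).
X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)

model = BaggingRegressor(n_estimators=50, random_state=1)

# Shuffle the rows before splitting instead of using contiguous folds.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="r2")
print(scores)
print("mean R^2:", scores.mean())
```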