## Bagging Regressor

The BaggingRegressor is an ensemble learning method provided by scikit-learn that implements bagging (bootstrap aggregating) for regression tasks. It trains multiple base regressors on different random subsets of the training data and averages their predictions to produce the final prediction. This section covers its main parameters, how it works, and how to use it.

### Key Features and Parameters:

1. **Base Estimator**:
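   - The base estimator is the regression algorithm that is cloned and trained for each member of the ensemble.
   - It can be any scikit-learn regressor, such as a decision tree, linear regression, or support vector regression. If none is given, a `DecisionTreeRegressor` is used.
   - The parameter is named `base_estimator` in older scikit-learn releases and was renamed to `estimator` in version 1.2.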

2. **n_estimators**:
   - The number of base regressors (estimators) to include in the ensemble.
   - Increasing the number of estimators typically improves performance up to a point of diminishing returns, while training and prediction cost grow with every added estimator.

3. **max_samples**:
   - The number (integer) or proportion (float) of samples to draw from the training data for each base regressor.
   - It controls the size of the sample used for training each base regressor.
   - The default, 1.0, draws as many samples as the training set contains. With bootstrap sampling, some rows then appear more than once and others not at all.

4. **max_features**:
   - The number (integer) or proportion (float) of features to draw for each base regressor.
   - It controls the size of the random subset of features used for training each base regressor.
   - If set to 1.0 (default), all features are used for each base regressor.
   - If set to a float less than 1.0, it specifies the proportion of features to draw.
   - Unlike the random forest estimators, BaggingRegressor does not accept the strings 'sqrt' or 'log2' for this parameter.

5. **bootstrap**:
   - Whether to use bootstrap sampling (with replacement) when creating the training datasets for each base regressor.
   - If set to True (default), bootstrap sampling is used.
   - If set to False, pasting (sampling without replacement) is used.

6. **bootstrap_features**:
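   - Whether features are drawn with replacement when selecting the feature subset for each base regressor.
   - If set to False (default), features are drawn without replacement.
   - If set to True, features are drawn with replacement, so the same feature can be selected more than once. The number of features drawn is still controlled by max_features.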

7. **n_jobs**:
   - The number of jobs to run in parallel for training and prediction.
   - The default is None, which means a single job unless a joblib parallel backend context says otherwise. If set to -1, all available CPU cores are used.
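
The defaults can be checked against the installed release with `get_params()`, which every scikit-learn estimator provides. This is a minimal sketch: the printed values are whatever your installed version reports, and the base-estimator key is deliberately left out because its name differs between releases.

```python
from sklearn.ensemble import BaggingRegressor

# Collect the constructor defaults of an unconfigured BaggingRegressor.
defaults = BaggingRegressor().get_params()

# Only parameter names that are stable across releases are inspected here.
for name in ("n_estimators", "max_samples", "max_features",
             "bootstrap", "bootstrap_features", "n_jobs"):
    print(f"{name} = {defaults[name]!r}")
```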

### How BaggingRegressor Works:

1. **Training**:
   - The BaggingRegressor first draws one random sample of the training data per base regressor (with replacement by default), optionally restricted to a random subset of the features.
   - Then, it trains a copy of the base regressor (specified by the base estimator parameter) independently on each sample.
   - Each base regressor therefore learns to predict the target variable from a slightly different view of the training data.

2. **Prediction Aggregation**:
   - During prediction, each base regressor makes its own individual predictions on the unseen data.
   - The BaggingRegressor aggregates the predictions of all base regressors by averaging them to obtain the final prediction.
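
The two steps above can be sketched by hand. The helper names `fit_bagged_trees` and `predict_bagged` and the synthetic data are hypothetical, introduced only for illustration; this is not scikit-learn's internal implementation, just the same idea written with NumPy and a decision tree.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def fit_bagged_trees(X, y, n_estimators=10, seed=0):
    """Train one tree per bootstrap sample (rows drawn with replacement)."""
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]
    trees = []
    for _ in range(n_estimators):
        rows = rng.integers(0, n_samples, size=n_samples)  # bootstrap indices
        trees.append(DecisionTreeRegressor(random_state=0).fit(X[rows], y[rows]))
    return trees


def predict_bagged(trees, X):
    """Aggregate by averaging the per-tree predictions."""
    return np.mean([tree.predict(X) for tree in trees], axis=0)


# Synthetic data, used only for illustration.
rng = np.random.default_rng(42)
X = rng.uniform(-3, 3, size=(200, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

trees = fit_bagged_trees(X, y, n_estimators=25)
y_hat = predict_bagged(trees, X[:5])
print(y_hat)
```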

### Advantages:

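- **Variance Reduction**: Averaging many base regressors trained on different subsets of the data reduces the variance of the prediction. This helps most with high-variance learners such as unpruned decision trees, which tend to overfit on their own.

- **Robustness**: Because each base regressor sees a different subset of the data, an individual outlier influences only some of them, which can make the ensemble less sensitive to outliers and noise.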

- **Parallelization**: BaggingRegressor supports parallel training of base regressors, allowing for efficient utilization of computational resources.
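
One way to see the effect of averaging: because squared error is convex, the mean squared error of the averaged prediction can never exceed the average mean squared error of the individual members. The sketch below checks this on synthetic data (an assumption made for illustration), using the fitted `estimators_` and `estimators_features_` attributes. Exact numbers depend on the data and the installed version.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic regression problem, used only for illustration.
X, y = make_regression(n_samples=300, n_features=8, noise=15.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = BaggingRegressor(n_estimators=30, random_state=0).fit(X_train, y_train)

# Predictions of each member on the feature columns it was trained with.
member_preds = np.array([
    est.predict(X_test[:, feats])
    for est, feats in zip(model.estimators_, model.estimators_features_)
])

member_mse = [mean_squared_error(y_test, p) for p in member_preds]
ensemble_mse = mean_squared_error(y_test, member_preds.mean(axis=0))

print(f"mean member MSE: {np.mean(member_mse):.1f}")
print(f"ensemble MSE:    {ensemble_mse:.1f}")
```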

### Example Usage:

```python
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor

# Define base regressor
base_regressor = DecisionTreeRegressor()

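# Create BaggingRegressor. The base regressor is passed positionally because the
# keyword is `base_estimator` before scikit-learn 1.2 and `estimator` from 1.2 onward.
bagging_regressor = BaggingRegressor(
    base_regressor, n_estimators=10, max_samples=0.8, max_features=0.8
)

# Train BaggingRegressor (X_train and y_train are assumed to be defined already)
bagging_regressor.fit(X_train, y_train)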

# Make predictions
y_pred = bagging_regressor.predict(X_test)
```

In this example, we create a BaggingRegressor with a base decision tree regressor and train it on the training data (X_train, y_train). We then make predictions on the test data (X_test) using the trained BaggingRegressor.
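
The snippet above assumes that X_train, y_train, and X_test already exist. A self-contained variant might look like the following, where the synthetic data from `make_regression` is an assumption standing in for a real dataset and the base regressor is passed positionally so the code does not depend on the keyword name used by a particular release.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression problem standing in for real data.
X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Each tree sees 80% of the rows and 80% of the columns.
bagging_regressor = BaggingRegressor(
    DecisionTreeRegressor(random_state=0),
    n_estimators=10, max_samples=0.8, max_features=0.8, random_state=0,
)
bagging_regressor.fit(X_train, y_train)

y_pred = bagging_regressor.predict(X_test)
print("Test R^2: %.3f" % bagging_regressor.score(X_test, y_test))
```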

In [1]:
from sklearn import datasets

# Note: load_boston was removed in scikit-learn 1.2, so this cell requires an older release.
boston = datasets.load_boston()
X_boston, Y_boston = boston.data, boston.target
print('Dataset features names : '+ str(boston.feature_names))
print('Dataset features size : '+ str(boston.data.shape))
print('Dataset target size : '+ str(boston.target.shape))

Dataset features names : ['CRIM' 'ZN' 'INDUS' 'CHAS' 'NOX' 'RM' 'AGE' 'DIS' 'RAD' 'TAX' 'PTRATIO'
 'B' 'LSTAT']
Dataset features size : (506, 13)
Dataset target size : (506,)


In [16]:
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import r2_score

In [2]:
from sklearn.model_selection import train_test_split

X_train, X_test, Y_train, Y_test = train_test_split(X_boston, Y_boston , train_size=0.80, test_size=0.20, random_state=123)
print('Train/Test Sets Sizes : ',X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)

Train/Test Sets Sizes :  (404, 13) (102, 13) (404,) (102,)


In [13]:
lr = LinearRegression()
dt = DecisionTreeRegressor()
knn = KNeighborsRegressor()

In [14]:
lr.fit(X_train,Y_train)
dt.fit(X_train,Y_train)
knn.fit(X_train,Y_train)

KNeighborsRegressor(algorithm='auto', leaf_size=30, metric='minkowski',
                    metric_params=None, n_jobs=None, n_neighbors=5, p=2,
                    weights='uniform')

In [15]:
y_pred1 = lr.predict(X_test)
y_pred2 = dt.predict(X_test)
y_pred3 = knn.predict(X_test)

In [17]:
print("R^2 score for LR",r2_score(Y_test,y_pred1))
print("R^2 score for DT",r2_score(Y_test,y_pred2))
print("R^2 score for KNN",r2_score(Y_test,y_pred3))

R^2 score for LR 0.6592466510354125
R^2 score for DT 0.4379566831493381
R^2 score for KNN 0.5475962186976784
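
The scores above follow the usual definition R^2 = 1 - SS_res / SS_tot, where SS_res is the sum of squared residuals and SS_tot is the sum of squared deviations of the targets from their mean. A small sketch on made-up numbers (the helper `r2_manual` is hypothetical) compares a hand-written version with `r2_score`:

```python
import numpy as np
from sklearn.metrics import r2_score


def r2_manual(y_true, y_pred):
    """Coefficient of determination computed from its definition."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot


# Made-up values, used only for illustration.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
print(r2_manual(y_true, y_pred), r2_score(y_true, y_pred))
```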


In [3]:
from sklearn.ensemble import BaggingRegressor

bag_regressor = BaggingRegressor(random_state=1)
bag_regressor.fit(X_train, Y_train)

BaggingRegressor(base_estimator=None, bootstrap=True, bootstrap_features=False,
                 max_features=1.0, max_samples=1.0, n_estimators=10,
                 n_jobs=None, oob_score=False, random_state=1, verbose=0,
                 warm_start=False)

In [5]:
Y_preds = bag_regressor.predict(X_test)

print('Training Coefficient of R^2 : %.3f'%bag_regressor.score(X_train, Y_train))
print('Test Coefficient of R^2 : %.3f'%bag_regressor.score(X_test, Y_test))

Training Coefficient of R^2 : 0.980
Test Coefficient of R^2 : 0.812


In [10]:
%%time

n_samples = boston.data.shape[0]
n_features = boston.data.shape[1]

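# None selects the default DecisionTreeRegressor. The key is 'base_estimator'
# before scikit-learn 1.2 and 'estimator' from 1.2 onward.
params = {'base_estimator': [None, LinearRegression(), KNeighborsRegressor()],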
          'n_estimators': [20,50,100],
          'max_samples': [0.5,1.0],
          'max_features': [0.5,1.0],
          'bootstrap': [True, False],
          'bootstrap_features': [True, False]}

bagging_regressor_grid = GridSearchCV(BaggingRegressor(random_state=1, n_jobs=-1), param_grid =params, cv=3, n_jobs=-1, verbose=1)
bagging_regressor_grid.fit(X_train, Y_train)

print('Train R^2 Score : %.3f'%bagging_regressor_grid.best_estimator_.score(X_train, Y_train))
print('Test R^2 Score : %.3f'%bagging_regressor_grid.best_estimator_.score(X_test, Y_test))
print('Best R^2 Score Through Grid Search : %.3f'%bagging_regressor_grid.best_score_)
print('Best Parameters : ',bagging_regressor_grid.best_params_)

Fitting 3 folds for each of 144 candidates, totalling 432 fits


[Parallel(n_jobs=-1)]: Using backend LokyBackend with 2 concurrent workers.
[Parallel(n_jobs=-1)]: Done  46 tasks      | elapsed:    8.0s
[Parallel(n_jobs=-1)]: Done 196 tasks      | elapsed:   32.3s
[Parallel(n_jobs=-1)]: Done 432 out of 432 | elapsed:  1.1min finished


Train R^2 Score : 0.983
Test R^2 Score : 0.802
Best R^2 Score Through Grid Search : 0.870
Best Parameters :  {'base_estimator': None, 'bootstrap': True, 'bootstrap_features': False, 'max_features': 1.0, 'max_samples': 1.0, 'n_estimators': 50}
CPU times: user 1.23 s, sys: 85 ms, total: 1.32 s
Wall time: 1min 6s
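
Because `load_boston` is no longer available in recent scikit-learn releases, the same kind of search can be sketched on a dataset that still ships with the library. The choice of `load_diabetes` and the reduced grid below are assumptions made to keep the example small and portable; the scores will differ from the Boston results above.

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import BaggingRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=123
)

# A deliberately small grid. The base-estimator key is omitted because its
# name depends on the scikit-learn release.
param_grid = {
    "n_estimators": [10, 30],
    "max_samples": [0.5, 1.0],
    "max_features": [0.5, 1.0],
}

search = GridSearchCV(BaggingRegressor(random_state=1), param_grid=param_grid, cv=3)
search.fit(X_train, y_train)

print("Best CV R^2 : %.3f" % search.best_score_)
print("Test R^2    : %.3f" % search.score(X_test, y_test))
print("Best params :", search.best_params_)
```

Note that `best_score_` is the mean cross-validated score on the training folds, so it need not match the score on the held-out test set. The same applies to the 0.870 and 0.802 figures in the output above.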
