Bagging is a way to use bootstrapping to build a stronger, more stable model from many weaker ones. [datacamp](https://www.datacamp.com/tutorial/what-bagging-in-machine-learning-a-guide-with-examples)



### What bagging is

- **Name:** “Bagging” = **B**ootstrap **agg**regat**ing**. [datacamp](https://www.datacamp.com/tutorial/what-bagging-in-machine-learning-a-guide-with-examples)
- **Idea:**  
  1. Create many different training sets using bootstrap sampling from the original data. [datacamp](https://www.datacamp.com/tutorial/what-bagging-in-machine-learning-a-guide-with-examples)
  2. Train one model on each of these sampled datasets. [datacamp](https://www.datacamp.com/tutorial/what-bagging-in-machine-learning-a-guide-with-examples)
  3. Combine all their predictions into a final prediction (by voting or averaging). [machinelearningmastery](https://machinelearningmastery.com/demystifying-ensemble-methods-boosting-bagging-and-stacking-explained/)

Each model sees a slightly different version of the data, so they don’t all make the same mistakes. When you aggregate them, random errors tend to cancel out, which **reduces variance** and makes predictions more reliable. [blog.paperspace](https://blog.paperspace.com/bagging-ensemble-methods/)
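
The "errors cancel out" claim can be made a little more concrete with a standard identity. This is a simplified sketch, not something stated in the sources above: it assumes the $B$ model predictions $\hat f_1(x), \dots, \hat f_B(x)$ all have the same variance $\sigma^2$ and the same pairwise correlation $\rho$ (both symbols are introduced here for illustration).

$$
\operatorname{Var}\!\left(\frac{1}{B}\sum_{b=1}^{B}\hat f_b(x)\right)
= \frac{1}{B^2}\Bigl(B\,\sigma^2 + B(B-1)\,\rho\,\sigma^2\Bigr)
= \rho\,\sigma^2 + \frac{1-\rho}{B}\,\sigma^2
$$

Under these assumptions, the second term shrinks toward zero as $B$ grows, while the first term stays. Averaging therefore helps most when the individual models are only weakly correlated with each other.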



### Why bagging works well with decision trees

Bagging can be used with many model types (for classification and regression), but it is especially effective with **decision trees**. [geeksforgeeks](https://www.geeksforgeeks.org/machine-learning/bagging-vs-boosting-in-machine-learning/)

- Deep decision trees are **low bias, high variance**: they can fit complex patterns but are very sensitive to the exact data they see. [blog.paperspace](https://blog.paperspace.com/bagging-ensemble-methods/)
- Bagging is a method designed to reduce **variance**. [geeksforgeeks](https://www.geeksforgeeks.org/machine-learning/bagging-vs-boosting-in-machine-learning/)
- So we deliberately allow each tree to grow deep (pure leaves, strong individual fit), then use bagging to smooth out their instability. [blog.paperspace](https://blog.paperspace.com/bagging-ensemble-methods/)

The result: a powerful ensemble that keeps the flexibility of deep trees but is much less overfitted. [bradleyboehmke.github](https://bradleyboehmke.github.io/HOML/bagging.html)
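
One rough way to see the "high variance" of deep trees is to train the same kind of model on two different resamples of the training data and count how often the two fitted models disagree on the same test points. The sketch below does this for single deep trees and for bagged trees. The synthetic dataset, the helper names `resample` and `disagreement`, the ensemble size of 25, and the seeds are all illustrative assumptions, and disagreement rate is only a proxy for variance.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, used only for illustration.
X, y = make_classification(n_samples=600, n_features=20, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
rng = np.random.default_rng(0)


def resample(X, y):
    """Draw a same-size sample with replacement (hypothetical helper)."""
    idx = rng.integers(0, len(X), size=len(X))
    return X[idx], y[idx]


def disagreement(make_model, n_pairs=5):
    """Average fraction of test points on which two refits disagree."""
    rates = []
    for _ in range(n_pairs):
        pred_a = make_model().fit(*resample(X_train, y_train)).predict(X_test)
        pred_b = make_model().fit(*resample(X_train, y_train)).predict(X_test)
        rates.append(np.mean(pred_a != pred_b))
    return float(np.mean(rates))


# Fully grown trees (scikit-learn's default has no depth limit).
tree_rate = disagreement(lambda: DecisionTreeClassifier(random_state=0))

# Bagged trees; the base model is passed positionally as the first argument.
bag_rate = disagreement(
    lambda: BaggingClassifier(DecisionTreeClassifier(), n_estimators=25, random_state=0)
)

print(f"single deep trees disagree on {tree_rate:.1%} of test points")
print(f"bagged ensembles disagree on {bag_rate:.1%} of test points")
```

On most datasets the bagged ensembles should disagree with each other less often than the single trees do, which is the instability that bagging is meant to smooth out. The exact numbers depend on the data and the seeds.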



### How bagging works step by step

Imagine you already have training data $(X_{\text{train}}, y_{\text{train}})$ and test data $(X_{\text{test}}, y_{\text{test}})$. The process:

1. **Repeat for many estimators (e.g., 1000 trees):**  
   - Draw a **bootstrap sample** from the training set (same size as the original training set, sampled with replacement). [ibm](https://www.ibm.com/think/topics/bagging)
   - Train a fresh decision tree on this sampled data. [datacamp](https://www.datacamp.com/tutorial/what-bagging-in-machine-learning-a-guide-with-examples)
   - Store this trained tree in a list of models. [blog.paperspace](https://blog.paperspace.com/bagging-ensemble-methods/)

2. **Individual model performance:**  
   - Each tree can be evaluated on the test set to get its own accuracy. Often, each deep tree already has reasonable accuracy, but its predictions are noisy: they change noticeably depending on which bootstrap sample the tree was trained on. [blog.paperspace](https://blog.paperspace.com/bagging-ensemble-methods/)

3. **Ensemble prediction:**  
   - For **classification**, each tree predicts a class for every test point. The ensemble then chooses the class that **most trees vote for** (majority vote). [machinelearningmastery](https://machinelearningmastery.com/demystifying-ensemble-methods-boosting-bagging-and-stacking-explained/)
   - For **regression**, the ensemble usually takes the **average** of all tree predictions. [machinelearningmastery](https://machinelearningmastery.com/demystifying-ensemble-methods-boosting-bagging-and-stacking-explained/)

Because many different trees all contribute, the ensemble accuracy is typically higher than the average accuracy of single trees, sometimes noticeably so, even when the single trees are already quite good. [bradleyboehmke.github](https://bradleyboehmke.github.io/HOML/bagging.html)
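
The steps above can be sketched by hand with NumPy and scikit-learn's `DecisionTreeClassifier`. This is a minimal illustration under stated assumptions, not a reference implementation: the data is synthetic, the labels are binary (0/1), and the ensemble uses 51 trees rather than 1000 so that it runs quickly and majority votes cannot tie.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data, used only for illustration.
X, y = make_classification(n_samples=600, n_features=20, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

rng = np.random.default_rng(0)
n_estimators = 51  # odd, so a 0/1 majority vote never ties
n_train = len(X_train)
trees = []

# Step 1: bootstrap sample -> fresh tree -> store it.
for _ in range(n_estimators):
    idx = rng.integers(0, n_train, size=n_train)  # same size, with replacement
    tree = DecisionTreeClassifier(random_state=0)  # fully grown by default
    tree.fit(X_train[idx], y_train[idx])
    trees.append(tree)

# Step 2: accuracy of each individual tree on the test set.
all_preds = np.array([t.predict(X_test) for t in trees])  # (n_estimators, n_test)
individual_acc = (all_preds == y_test).mean(axis=1)

# Step 3: majority vote. For 0/1 labels, predict 1 when more than half vote 1.
ensemble_pred = (all_preds.mean(axis=0) > 0.5).astype(int)
ensemble_acc = (ensemble_pred == y_test).mean()

print(f"mean single-tree accuracy: {individual_acc.mean():.3f}")
print(f"ensemble accuracy:         {ensemble_acc:.3f}")
```

For regression, the same loop would use a regression tree and replace the vote with `all_preds.mean(axis=0)`.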



### Out-of-bag (OOB) evaluation

Normally, you hold out a separate test set to estimate model performance. With bagging, you get another option for free: **out-of-bag (OOB) evaluation**. [codesignal](https://codesignal.com/learn/courses/ensembles-in-machine-learning/lessons/bagging-in-machine-learning)

Key idea:

- In each bootstrap sample, about **63%** of the distinct original training points are included, and about **37%** are **not** included (“out-of-bag”) for that particular model. (The chance that a given point is never picked in $n$ draws with replacement is $(1 - 1/n)^n$, which approaches $e^{-1} \approx 0.37$ as $n$ grows.) [ibm](https://www.ibm.com/think/topics/bagging)
- For any given data point:
  - It will be **in-bag** for some subset of trees (used to train those trees).  
  - It will be **out-of-bag** for the remaining trees (those trees have never seen that point during training). [ibm](https://www.ibm.com/think/topics/bagging)
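
The 63% / 37% split is easy to check empirically. The short sketch below draws one bootstrap sample of indices using only the standard library and compares the observed out-of-bag fraction with $(1 - 1/n)^n$ and with $e^{-1}$. The dataset size `n` and the seed are arbitrary choices for illustration.

```python
import math
import random

random.seed(0)
n = 10_000  # size of the (imaginary) training set

# One bootstrap sample: n index draws with replacement.
sample = [random.randrange(n) for _ in range(n)]

in_bag_fraction = len(set(sample)) / n  # distinct points that were drawn
oob_fraction = 1 - in_bag_fraction      # points never drawn

exact_oob = (1 - 1 / n) ** n            # probability a given point is never drawn
limit_oob = math.exp(-1)                # large-n limit

print(f"observed in-bag fraction: {in_bag_fraction:.3f}")
print(f"observed OOB fraction:    {oob_fraction:.3f}")
print(f"(1 - 1/n)^n = {exact_oob:.4f},  1/e = {limit_oob:.4f}")
```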

How OOB evaluation works:

1. For each data point, look only at the trees **where this point was out-of-bag**.  
2. Ask those trees to predict the label for this point.  
3. Aggregate those predictions (vote or average) to get an OOB prediction for that point. [ibm](https://www.ibm.com/think/topics/bagging)
4. Compare these OOB predictions to the true labels over all points to compute an overall OOB accuracy (or error). [ibm](https://www.ibm.com/think/topics/bagging)
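
The four steps can be written out by hand to see what an OOB score actually measures. The sketch below is an illustration under assumptions (synthetic binary data, 100 trees, arbitrary seeds, ties in the vote resolved toward class 0) and is not meant to reproduce any library's exact procedure.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary data, used only for illustration. No test set is held out.
X, y = make_classification(n_samples=500, n_features=20, n_informative=5, random_state=0)
n = len(X)
rng = np.random.default_rng(0)
n_estimators = 100

votes_for_1 = np.zeros(n)  # OOB votes for class 1, per point
n_oob_votes = np.zeros(n)  # number of trees for which each point was OOB

for _ in range(n_estimators):
    idx = rng.integers(0, n, size=n)  # bootstrap indices
    oob_mask = np.ones(n, dtype=bool)
    oob_mask[idx] = False             # True only for points this tree never saw
    tree = DecisionTreeClassifier(random_state=0).fit(X[idx], y[idx])

    # Steps 1-2: only trees that did not train on a point predict for it.
    votes_for_1[oob_mask] += tree.predict(X[oob_mask])
    n_oob_votes[oob_mask] += 1

# Step 3: aggregate the OOB votes (a tie at exactly 0.5 falls to class 0 here).
has_votes = n_oob_votes > 0
oob_pred = (votes_for_1[has_votes] / n_oob_votes[has_votes] > 0.5).astype(int)

# Step 4: compare with the true labels.
oob_accuracy = (oob_pred == y[has_votes]).mean()
print(f"points with at least one OOB vote: {has_votes.sum()} of {n}")
print(f"OOB accuracy: {oob_accuracy:.3f}")
```

With enough trees, every point is out-of-bag for roughly a third of them, so each OOB prediction is itself a small ensemble vote.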

Benefits:

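- You can use **all** your data for training and still get a **nearly unbiased performance estimate**, without needing a separate test set. The estimate can be slightly pessimistic, because each point is scored only by the trees that did not train on it rather than by the full ensemble. [codesignal](https://codesignal.com/learn/courses/ensembles-in-machine-learning/lessons/bagging-in-machine-learning)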
- This is particularly convenient when data is limited. [ibm](https://www.ibm.com/think/topics/bagging)

In libraries like scikit-learn, you turn this on with a parameter (e.g., `oob_score=True` for a bagging classifier), and the library stores the resulting OOB score in an attribute (e.g., `oob_score_`). [codesignal](https://codesignal.com/learn/courses/ensembles-in-machine-learning/lessons/bagging-in-machine-learning)
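
A minimal sketch of this with scikit-learn's `BaggingClassifier` follows. The synthetic dataset, the 100 estimators, and the seeds are illustrative assumptions. The base model is passed positionally because the keyword for it has been renamed between scikit-learn versions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, used only for illustration.
X, y = make_classification(n_samples=600, n_features=20, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

bag = BaggingClassifier(
    DecisionTreeClassifier(),  # base model, passed positionally
    n_estimators=100,
    oob_score=True,            # compute the OOB estimate during fit
    random_state=0,
)
bag.fit(X_train, y_train)

oob_acc = bag.oob_score_               # accuracy on out-of-bag predictions
test_acc = bag.score(X_test, y_test)   # accuracy on the held-out test set
first_preds = bag.predict(X_test[:5])  # predictions work like any single model

print(f"OOB accuracy:  {oob_acc:.3f}")
print(f"test accuracy: {test_acc:.3f}")
```

A held-out test set is kept here only so the two numbers can be compared; in practice the OOB score is most useful when no separate test set can be spared.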



### Using bagging in practice (high level)

With a typical bagging implementation (like scikit-learn's `BaggingClassifier`):

- You choose a **base estimator** (often a decision tree). [geeksforgeeks](https://www.geeksforgeeks.org/machine-learning/bagging-vs-boosting-in-machine-learning/)
- You set the **number of estimators** (how many trees/models to build). [bradleyboehmke.github](https://bradleyboehmke.github.io/HOML/bagging.html)
- You call a standard **fit** method to train, and **predict** to get predictions, just like with a single model. [codesignal](https://codesignal.com/learn/courses/ensembles-in-machine-learning/lessons/bagging-in-machine-learning)
- If you enable OOB scoring, the model automatically computes an OOB performance estimate during training. [geeksforgeeks](https://www.geeksforgeeks.org/machine-learning/bagging-vs-boosting-in-machine-learning/)

As you increase the number of estimators, the ensemble usually improves quickly at first, then plateaus: adding more models gives almost no benefit while costing more computation. For some datasets the plateau arrives after a few dozen trees; for others it takes hundreds. [bradleyboehmke.github](https://bradleyboehmke.github.io/HOML/bagging.html)
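
One way to look for this plateau is to refit the ensemble at several sizes and record the test accuracy of each. The sketch below assumes a synthetic dataset and an arbitrary list of sizes; where the curve flattens (and how smooth it is) will differ from dataset to dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, used only for illustration.
X, y = make_classification(n_samples=600, n_features=20, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

sizes = [1, 5, 10, 25, 50, 100, 200]  # illustrative ensemble sizes
scores = {}

for n in sizes:
    bag = BaggingClassifier(DecisionTreeClassifier(), n_estimators=n, random_state=0)
    bag.fit(X_train, y_train)
    scores[n] = bag.score(X_test, y_test)  # test accuracy at this ensemble size

for n in sizes:
    print(f"{n:>4} estimators -> test accuracy {scores[n]:.3f}")
```

Training cost grows roughly linearly with the number of estimators, so the smallest size on the flat part of the curve is usually a reasonable choice.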



### Connection to random forests

Researchers then asked: can we make the trees inside the bagging ensemble even more diverse and decorrelated to push performance further? One answer is **random forests**, which extend bagging by also randomizing the **features**: each split in each tree considers only a random subset of the features. Random forests keep the bootstrap + aggregation idea from bagging, then add this extra randomness to further improve robustness and accuracy. [machinelearningmastery](https://machinelearningmastery.com/demystifying-ensemble-methods-boosting-bagging-and-stacking-explained/)
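
To make the connection concrete, the sketch below fits plain bagged trees and a random forest on the same data using scikit-learn. The synthetic dataset, the 100 trees, the seeds, and the choice `max_features="sqrt"` are illustrative assumptions. Which of the two scores higher depends on the dataset, so no particular outcome is implied.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, used only for illustration.
X, y = make_classification(n_samples=600, n_features=20, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Plain bagging: every split in every tree may use any of the 20 features.
bag = BaggingClassifier(DecisionTreeClassifier(), n_estimators=100, random_state=0)
bag.fit(X_train, y_train)

# Random forest: bootstrap samples plus a random feature subset at each split.
forest = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0)
forest.fit(X_train, y_train)

bag_acc = bag.score(X_test, y_test)
forest_acc = forest.score(X_test, y_test)

print(f"bagged trees:  {bag_acc:.3f}")
print(f"random forest: {forest_acc:.3f}")
```

Setting `max_features=None` in `RandomForestClassifier` lets every split see all features, which brings the forest back close to plain bagging of trees.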