# Ensemble Techniques And Its Types-2

#### Q1. How does bagging reduce overfitting in decision trees?

Bagging reduces overfitting in decision trees mainly by reducing variance. Each tree is trained on a bootstrap sample (drawn randomly with replacement from the training data), so each tree sees a slightly different version of the data and overfits to different noise. When the trees' predictions are averaged or put to a vote, these partly independent errors tend to cancel, so the ensemble usually generalizes better to unseen data than a single deep tree.
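
To make the sampling step concrete, here is a minimal sketch of bootstrap sampling using only the standard library. The helper `bootstrap_indices` and the sample size of 1000 are illustrative assumptions, not part of any library. It shows that each tree would see only part of the rows (about 63% on average), and a different part each time.

```python
import random

def bootstrap_indices(n_samples, rng):
    """Draw n_samples row indices uniformly at random, with replacement."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]

rng = random.Random(45)   # fixed seed so the run is repeatable
n_samples = 1000          # hypothetical training-set size

# One bootstrap sample per tree in a 5-tree ensemble
samples = [bootstrap_indices(n_samples, rng) for _ in range(5)]

# Fraction of distinct rows each tree would be trained on
unique_fractions = [len(set(s)) / n_samples for s in samples]
for i, frac in enumerate(unique_fractions):
    print(f"tree {i}: {frac:.1%} of the rows appear at least once")
```

The rows a tree never sees are its "out-of-bag" rows, which is one reason the trees end up disagreeing on noisy points.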

#### Q2. What are the advantages and disadvantages of using different types of base learners in bagging?

* **Advantages:**
    * Diverse base learners can capture different aspects of the data, leading to improved ensemble performance.
    * Using various types of base learners can increase the ensemble's robustness.
* **Disadvantages:**
    * Diversity among base learners can sometimes lead to increased computational complexity.
    * Combining predictions from very different base learners may require more sophisticated aggregation methods.
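
As a rough illustration of swapping the base learner, the sketch below bags two different learners with scikit-learn and compares each against its unbagged version using 5-fold cross-validation. The choice of k-nearest neighbors as the second learner, the dataset, and the settings are assumptions made for this example only.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

x, y = load_breast_cancer(return_X_y=True)

# Two base learners with different behavior (illustrative choices)
base_learners = {
    "decision tree": DecisionTreeClassifier(random_state=45),
    "k-nearest neighbors": KNeighborsClassifier(n_neighbors=5),
}

mean_accuracy = {}
for name, base in base_learners.items():
    bagged = BaggingClassifier(base, n_estimators=10, random_state=45)
    mean_accuracy[name] = {
        "single": cross_val_score(base, x, y, cv=5).mean(),
        "bagged": cross_val_score(bagged, x, y, cv=5).mean(),
    }
    print(f"{name:<20} single={mean_accuracy[name]['single']:.4f} "
          f"bagged={mean_accuracy[name]['bagged']:.4f}")
```

Unstable learners such as unpruned trees tend to gain more from bagging than stable ones such as k-nearest neighbors, though the exact numbers depend on the dataset and settings.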

#### Q3. How does the choice of base learner affect the bias-variance tradeoff in bagging?

The choice of base learner determines how much bagging can help, because bagging mainly reduces variance and leaves bias roughly unchanged. In general:
   * If the base learner has high variance and low bias (e.g., a deep decision tree), bagging averages away much of that variance while keeping the bias low, so the ensemble usually outperforms a single model.
   * If the base learner has high bias (e.g., a shallow decision tree), bagging helps little: the ensemble inherits roughly the same bias, and there is not much variance left to remove. Reducing bias calls for a more flexible base learner or a different method such as boosting.

The impact on the bias-variance tradeoff depends on the specific characteristics of the base learner and the dataset.
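
A standard way to quantify this uses a simplifying assumption: the predictions $\hat{f}_1(x), \dots, \hat{f}_B(x)$ of the $B$ base learners at a point $x$ are identically distributed with variance $\sigma^2$ and pairwise correlation $\rho$. These symbols are introduced here for illustration. Under that assumption:

$$
\operatorname{Var}\!\left(\frac{1}{B}\sum_{b=1}^{B}\hat{f}_b(x)\right)
= \rho\,\sigma^2 + \frac{1-\rho}{B}\,\sigma^2,
\qquad
\mathbb{E}\!\left[\frac{1}{B}\sum_{b=1}^{B}\hat{f}_b(x)\right]
= \mathbb{E}\big[\hat{f}_1(x)\big].
$$

The second variance term shrinks as $B$ grows, while the expected prediction, and therefore the bias, is the same as for a single learner. Averaging helps most when $\sigma^2$ is large and $\rho$ is small, and it cannot remove the $\rho\,\sigma^2$ floor.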

#### Q4. Can bagging be used for both classification and regression tasks? How does it differ in each case?

Yes, bagging can be used for both classification and regression tasks.
* In classification, each base learner typically produces a class prediction, and the ensemble combines these predictions using majority voting.
* In regression, each base learner produces a numeric prediction, and the ensemble combines these predictions using averaging or another aggregation method.
* The primary difference lies in how the predictions are combined, but the general bagging framework remains similar for both tasks.
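
The two aggregation rules can be sketched in plain Python. The helper names and the toy predictions below are made up for illustration.

```python
from collections import Counter
from statistics import mean

def majority_vote(labels):
    """Return the most common label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]

def aggregate_classification(predictions_per_model):
    # predictions_per_model[m][i] is model m's label for sample i
    return [majority_vote(column) for column in zip(*predictions_per_model)]

def aggregate_regression(predictions_per_model):
    # predictions_per_model[m][i] is model m's numeric output for sample i
    return [mean(column) for column in zip(*predictions_per_model)]

# Three hypothetical models, three samples (classification)
class_preds = [[0, 1, 1],
               [0, 0, 1],
               [1, 1, 1]]
print(aggregate_classification(class_preds))  # [0, 1, 1]

# Three hypothetical models, two samples (regression)
reg_preds = [[2.0, 10.0],
             [4.0, 11.0],
             [3.0, 12.0]]
print(aggregate_regression(reg_preds))  # [3.0, 11.0]
```

Note that scikit-learn's `BaggingClassifier` averages predicted class probabilities when the base learner provides `predict_proba`, and falls back to voting otherwise, so its output can differ from a hard majority vote.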

#### Q5. What is the role of ensemble size in bagging? How many models should be included in the ensemble?

The ensemble size (i.e., the number of base learners or decision trees) in bagging can impact the ensemble's performance. Generally:
* Increasing the ensemble size tends to improve performance up to a point, because averaging over more base learners reduces the variance of the combined prediction.
* Beyond that point, adding more models yields diminishing returns, while training and prediction cost keep growing roughly in proportion to the number of models.

The optimal ensemble size can vary depending on the problem and dataset. It's often determined through experimentation and cross-validation.
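
One way to run that experiment is sketched below, assuming scikit-learn and its built-in breast cancer dataset. The candidate sizes and the 5-fold setting are arbitrary choices for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

x, y = load_breast_cancer(return_X_y=True)

cv_scores = {}
for n in (1, 5, 10, 25, 50):  # candidate ensemble sizes (illustrative)
    model = BaggingClassifier(
        DecisionTreeClassifier(random_state=45),
        n_estimators=n,
        random_state=45,
    )
    # Mean accuracy over 5 cross-validation folds
    cv_scores[n] = cross_val_score(model, x, y, cv=5).mean()
    print(f"n_estimators={n:>2}: mean CV accuracy = {cv_scores[n]:.4f}")
```

A reasonable rule of thumb is to pick the smallest size after which the score stops improving noticeably.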

#### Q6. Can you provide an example of a real-world application of bagging in machine learning?

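One example is medical diagnosis. The code below bags decision trees to classify breast tumors as malignant or benign using scikit-learn's built-in breast cancer dataset, and compares the ensemble's test accuracy with that of a single tree.
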
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import accuracy_score

# Load the dataset and split it into training and test sets
ds = load_breast_cancer()
x = ds.data
y = ds.target
x_train, x_test, y_train, y_test = train_test_split(x,y,test_size=0.35,random_state=45)

# Creating a model
bmodel = DecisionTreeClassifier(random_state=45) # Base Model
n_estimator = 10 
bagging_model = BaggingClassifier(bmodel, n_estimators = n_estimator, random_state=45)

# Training, prediction, and evaluation
bmodel.fit(x_train,y_train)
bagging_model.fit(x_train,y_train)
by_pred = bmodel.predict(x_test)
y_pred = bagging_model.predict(x_test)
acc = accuracy_score(y_test,by_pred)
bagg_acc = accuracy_score(y_test,y_pred)
print(f"Accuracy of Base Model: {acc:.4f}")
print(f"Accuracy of Bagging Model: {bagg_acc:.4f}")

Accuracy of Base Model: 0.9350
Accuracy of Bagging Model: 0.9500
