8)	Implement a Bagging model for the built-in wine dataset

In machine learning, prediction accuracy is a key factor in building solid and reliable models. Ensemble learning is a supervised machine-learning technique that combines multiple models into a single, more powerful model. The idea is that combining the strengths of several models yields a predictor that is more robust and less likely to overfit the data. It can be used for both classification and regression tasks.
Ensemble learning techniques fall into three main categories:
Bagging (Bootstrap Aggregating)
Boosting
Stacking (Stacked Generalization)
Like ensemble learning in general, bagging applies to both regression and classification tasks. This article discusses the bagging classifier.
Bagging Classifier
Bagging (or bootstrap aggregating) is a type of ensemble learning in which multiple base models are trained independently, in parallel, on different subsets of the training data. Each subset is generated by bootstrap sampling, in which data points are picked at random with replacement. For a bagging classifier, the final prediction is made by aggregating the predictions of all the base models using majority voting. For regression, the final prediction is the average of the predictions of all the base models, which is known as bagging regression.

Bagging helps improve accuracy and reduce overfitting, especially in models that have high variance.
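One standard way to see why averaging helps high-variance models is the variance of an average of correlated predictions. The symbols here are introduced only for illustration and are not part of the original text: B is the number of base models, each base prediction \hat f_b(x) is assumed to have the same variance \sigma^2, and any two base predictions are assumed to have the same correlation \rho.

```latex
\[
\operatorname{Var}\!\left(\frac{1}{B}\sum_{b=1}^{B}\hat f_b(x)\right)
  = \frac{1}{B^{2}}\left[B\,\sigma^{2} + B(B-1)\,\rho\,\sigma^{2}\right]
  = \rho\,\sigma^{2} + \frac{1-\rho}{B}\,\sigma^{2}
\]
```

Under these assumptions the second term shrinks as B grows, so the less correlated the base models are, the more variance the ensemble removes. Bootstrap sampling is what lowers that correlation.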
How the Bagging Classifier Works:
Bootstrap Sampling: In bootstrap sampling, ‘n’ subsets of the original training data are drawn at random with replacement. This step ensures that the base models are trained on diverse subsets of the data, since some samples may appear multiple times in a given subset while others are omitted. This diversity reduces the risk of overfitting and improves the accuracy of the model.

For example, with a training set of ten points, three bootstrap samples might look like this:
Original training dataset: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Resampled training set 1: [2, 3, 3, 5, 6, 1, 8, 10, 9, 1]
Resampled training set 2: [1, 1, 5, 6, 3, 8, 9, 10, 2, 7]
Resampled training set 3: [1, 5, 8, 9, 2, 10, 9, 7, 5, 4]
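The resampling shown above can be reproduced with a few lines of NumPy. This is a minimal sketch: the helper name `bootstrap_sample` and the seed are assumptions chosen for illustration, and the exact samples drawn will differ from the ones listed above.

```python
import numpy as np

# Seed chosen arbitrarily so the run is repeatable (an assumption, not from the text).
rng = np.random.default_rng(seed=0)
data = np.arange(1, 11)  # original training dataset: 1..10


def bootstrap_sample(values, rng):
    """Draw len(values) items uniformly at random, with replacement."""
    idx = rng.integers(0, len(values), size=len(values))
    return values[idx]


samples = [bootstrap_sample(data, rng) for _ in range(3)]
for i, sample in enumerate(samples, start=1):
    # Points that were never drawn are "out-of-bag" for this sample.
    omitted = sorted(set(data.tolist()) - set(sample.tolist()))
    print(f"Resampled training set {i}: {sample.tolist()} (omitted: {omitted})")
```

Each resampled set has the same size as the original. For large datasets a bootstrap sample contains, on average, about 63.2% (1 - 1/e) of the distinct original points.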
Base Model Training: In bagging, multiple base models are used. After bootstrap sampling, each base model is trained independently on a different bootstrapped subset of the data, using a specific learning algorithm such as decision trees, support vector machines, or neural networks. These models are often called “weak learners” because they may not be highly accurate on their own. Since each base model is trained independently on its own subset of the data, the base models can be trained in parallel, which makes training computationally efficient and less time-consuming.
Aggregation: Once all the base models are trained, they are used to make predictions on unseen data, i.e. data on which the base models were not trained. In the bagging classifier, the predicted class label for a given instance is chosen by majority voting: the class that receives the most votes is the prediction of the model.
Out-of-Bag (OOB) Evaluation: Some samples are excluded from the training subset of each base model during bootstrapping. These “out-of-bag” samples can be used to estimate the model’s performance without the need for cross-validation.
Final Prediction: After aggregating the predictions from all the base models, Bagging produces a final prediction for each instance.
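The steps above can be sketched by hand to make the mechanics visible. This is an illustrative sketch, not how `BaggingClassifier` is implemented internally: the names `trees`, `all_preds`, and `votes`, the seeds, and the choice of 10 trees are assumptions made for this example.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

rng = np.random.default_rng(42)
n_estimators = 10

# Bootstrap sampling + base model training: one tree per resampled set.
trees = []
for _ in range(n_estimators):
    idx = rng.integers(0, len(X_train), size=len(X_train))  # with replacement
    tree = DecisionTreeClassifier(random_state=0)
    tree.fit(X_train[idx], y_train[idx])
    trees.append(tree)

# Aggregation: collect every tree's prediction, shape (n_estimators, n_test).
all_preds = np.array([tree.predict(X_test) for tree in trees])

# votes[c, j] = number of trees that predicted class c for test sample j.
n_classes = len(np.unique(y))
votes = np.array([(all_preds == c).sum(axis=0) for c in range(n_classes)])

# Final prediction: the class with the most votes (ties go to the lowest label).
y_pred = votes.argmax(axis=0)
accuracy = (y_pred == y_test).mean()
print(f"Hand-rolled bagging accuracy: {accuracy:.3f}")
```

scikit-learn's `BaggingClassifier` averages the base models' predicted class probabilities when they provide `predict_proba` and falls back to voting otherwise. For fully grown decision trees the two approaches usually agree.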

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import confusion_matrix, classification_report
import seaborn as sns

# Load the wine dataset
wine = load_wine()
X, y = wine.data, wine.target

# i. Data scaling
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# ii. Training and testing of the model
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.2, random_state=42)

# iii. Create the Bagging model
base_estimator = DecisionTreeClassifier(random_state=42)
try:
    # For newer versions of scikit-learn
    model = BaggingClassifier(estimator=base_estimator, n_estimators=10, random_state=42)
except TypeError:
    # For older versions of scikit-learn
    model = BaggingClassifier(base_estimator=base_estimator, n_estimators=10, random_state=42)

model.fit(X_train, y_train)

# Make predictions
y_pred = model.predict(X_test)

# iv. Display confusion matrix and classification report
cm = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:")
print(cm)

print("\nClassification Report:")
print(classification_report(y_test, y_pred, target_names=wine.target_names))

# Visualize the confusion matrix
plt.figure(figsize=(10, 8))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=wine.target_names,
            yticklabels=wine.target_names)
plt.title('Confusion Matrix')
plt.xlabel('Predicted Label')
plt.ylabel('True Label')
plt.show()

# v. Display k-fold cross-validation score
cv_scores = cross_val_score(model, X_scaled, y, cv=5)
print("\nCross-validation scores:", cv_scores)
print("Mean CV score:", cv_scores.mean())
print("Standard deviation of CV scores:", cv_scores.std())

# Feature importance
feature_importance = np.mean([tree.feature_importances_ for tree in model.estimators_], axis=0)
feature_importance = 100.0 * (feature_importance / feature_importance.max())
sorted_idx = np.argsort(feature_importance)
pos = np.arange(sorted_idx.shape[0]) + .5

plt.figure(figsize=(12, 6))
plt.barh(pos, feature_importance[sorted_idx], align='center')
plt.yticks(pos, np.array(wine.feature_names)[sorted_idx])
plt.xlabel('Relative Importance')
plt.title('Variable Importance')
plt.show()
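The out-of-bag evaluation described earlier can also be tried with scikit-learn's `oob_score` option, as an alternative to the k-fold scores above. This is a sketch under stated assumptions: the names `oob_model` and `oob_accuracy` are introduced here, and 50 estimators is an arbitrary choice made so that every training sample is very likely to be out-of-bag for at least one tree. The base estimator is passed positionally so the call works whether the keyword is `estimator` or `base_estimator`.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_wine(return_X_y=True)

# oob_score=True scores each training sample using only the trees
# whose bootstrap sample did not contain it.
oob_model = BaggingClassifier(
    DecisionTreeClassifier(random_state=42),
    n_estimators=50,
    oob_score=True,
    random_state=42,
)
oob_model.fit(X, y)

oob_accuracy = oob_model.oob_score_
print(f"OOB accuracy estimate: {oob_accuracy:.3f}")
```

Because every sample is scored only by models that never saw it, the OOB estimate needs no separate hold-out set or extra training runs.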