# AdaBoost Classifier on the Iris Dataset

In this notebook, we apply the AdaBoost algorithm to classify the Iris dataset, using a shallow Decision Tree as the weak learner for boosting. We then evaluate the model's performance using accuracy and a confusion matrix.

In [1]:
# Import necessary libraries
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix
import seaborn as sns
import matplotlib.pyplot as plt

### Step 1: Load the Iris Dataset
The Iris dataset contains 150 samples of iris flowers, each with 4 features: sepal length, sepal width, petal length, and petal width. The samples are divided evenly among three species (setosa, versicolor, and virginica), with 50 samples each.

In [2]:
# Load dataset
iris = load_iris()
X = iris.data
y = iris.target
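
As a quick sanity check on the description above, the sketch below inspects the array shapes and the per-class counts. It reloads the data so that it runs on its own; the use of `np.bincount` and the name `class_counts` are illustrative choices, not part of the original notebook.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target

print(X.shape)             # (150, 4): 150 samples, 4 features
print(iris.feature_names)  # names of the four measurements
print(iris.target_names)   # names of the three species

# Count how many samples belong to each class label (0, 1, 2)
class_counts = np.bincount(y)
print(class_counts)        # [50 50 50]
```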

### Step 2: Split the Dataset into Training and Testing Sets
We will split the dataset into 80% for training and 20% for testing using `train_test_split`, fixing `random_state=42` so that the split is reproducible.

In [3]:
# Split dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
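
To make the effect of the split concrete, the sketch below checks the resulting sizes and the class balance of the test set. The second call, which passes `stratify=y`, is an optional variant and an assumption on our part; the notebook itself uses the unstratified split. The `_s` suffixed names are hypothetical.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Same call as in the notebook: 80% train, 20% test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(X_train.shape, X_test.shape)      # (120, 4) (30, 4)
print(np.bincount(y_test, minlength=3)) # test class counts; not guaranteed to be equal

# Optional variant (not used in the notebook): keep class proportions equal
X_train_s, X_test_s, y_train_s, y_test_s = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
stratified_counts = np.bincount(y_test_s, minlength=3)
print(stratified_counts)                # [10 10 10]
```

With only 30 test samples, a stratified split can make per-class results easier to compare, at the cost of changing which samples land in the test set.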

### Step 3: Apply AdaBoost with a Decision Tree Classifier
We will use a Decision Tree with a maximum depth of 1 (a decision stump) as the base classifier for the AdaBoost algorithm, and boost 50 of these stumps.

In [4]:
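# Weak learner: a depth-1 decision tree (decision stump)
dt = DecisionTreeClassifier(max_depth=1)

# Boosting with AdaBoost
# Note: scikit-learn 1.2 renamed `base_estimator` to `estimator` (the old name was removed in 1.4);
# on versions older than 1.2, pass base_estimator=dt instead.
ada = AdaBoostClassifier(estimator=dt, n_estimators=50, random_state=42)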
ada.fit(X_train, y_train)
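
To show roughly what boosting does with these stumps, here is a minimal hand-written sketch of the multi-class SAMME variant of AdaBoost. It is an illustration under stated assumptions, not a reproduction of scikit-learn's internals: it trains on the full dataset for simplicity, runs an arbitrary 20 rounds, and all variable names (`weights`, `alphas`, `votes`, and so on) are hypothetical.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
n_samples, n_classes = len(y), len(np.unique(y))

weights = np.full(n_samples, 1.0 / n_samples)  # start with uniform sample weights
stumps, alphas, errors = [], [], []

for _ in range(20):  # 20 rounds: an arbitrary choice for illustration
    stump = DecisionTreeClassifier(max_depth=1, random_state=0)
    stump.fit(X, y, sample_weight=weights)
    missed = stump.predict(X) != y

    # Weighted error of this round's stump (clipped to avoid log(0))
    err = np.clip(weights[missed].sum(), 1e-10, 1 - 1e-10)
    if err >= 1 - 1 / n_classes:
        break  # no better than random guessing among n_classes: stop

    # SAMME vote weight: larger when the stump is more accurate
    alpha = np.log((1 - err) / err) + np.log(n_classes - 1)

    # Up-weight the misclassified samples, then renormalize to sum to 1
    weights *= np.exp(alpha * missed)
    weights /= weights.sum()

    stumps.append(stump)
    alphas.append(alpha)
    errors.append(err)

# Final prediction: each stump votes for its predicted class with weight alpha
votes = np.zeros((n_samples, n_classes))
for stump, alpha in zip(stumps, alphas):
    votes[np.arange(n_samples), stump.predict(X)] += alpha
train_accuracy = np.mean(votes.argmax(axis=1) == y)
print(f"rounds: {len(stumps)}, training accuracy: {train_accuracy:.2f}")
```

The key idea is the reweighting step: samples the current stump gets wrong receive more weight, so the next stump concentrates on them.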

### Step 4: Predict and Evaluate the Model
After training the AdaBoost classifier, we will make predictions on the test set and evaluate the model using accuracy and a confusion matrix.

In [5]:
# Predict and evaluate
y_pred_ada = ada.predict(X_test)
accuracy_ada = accuracy_score(y_test, y_pred_ada)
print(f'AdaBoost Accuracy: {accuracy_ada:.2f}')

AdaBoost Accuracy: 0.97
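
For reference, accuracy is simply the fraction of predictions that match the true labels. The sketch below checks this on a small hand-made example; the arrays `y_true` and `y_pred` are made up for illustration and are not taken from the notebook's results.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Hypothetical labels: six samples, one of which is misclassified
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 0, 1, 2, 2, 2])

manual_accuracy = np.mean(y_true == y_pred)     # fraction of matching labels: 5 of 6
library_accuracy = accuracy_score(y_true, y_pred)

print(f"{manual_accuracy:.4f}")   # 0.8333
print(f"{library_accuracy:.4f}")  # 0.8333
```

Since the test set here holds 30 samples (20% of 150), each misclassification changes the accuracy by about 0.033, so the reported 0.97 corresponds to 29 of 30 correct predictions.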

### Step 5: Confusion Matrix
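We will visualize the confusion matrix to see how well the model performs for each class (species). Each row corresponds to a true species and each column to a predicted species, so correct predictions lie on the diagonal and misclassifications appear off the diagonal.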

In [6]:
# Confusion Matrix (rows: true species, columns: predicted species)
cm_ada = confusion_matrix(y_test, y_pred_ada)
sns.heatmap(cm_ada, annot=True, fmt='d', cmap='Blues',
            xticklabels=iris.target_names, yticklabels=iris.target_names)
plt.xlabel('Predicted species')
plt.ylabel('True species')
plt.title('Confusion Matrix for AdaBoost')
plt.show()
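
As a small aid to reading the heatmap, the sketch below builds a confusion matrix from hand-made labels and derives overall accuracy and per-class recall from it. The label arrays are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical labels: one sample of class 1 is predicted as class 2
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 0, 1, 2, 2, 2])

cm = confusion_matrix(y_true, y_pred)
print(cm)
# [[2 0 0]
#  [0 1 1]
#  [0 0 2]]

# Diagonal entries are correct predictions, so accuracy = trace / total
accuracy = np.trace(cm) / cm.sum()

# Recall per class: correct predictions divided by the row total (true count)
recall_per_class = np.diag(cm) / cm.sum(axis=1)

print(f"{accuracy:.4f}")          # 0.8333
print(recall_per_class.tolist())  # [1.0, 0.5, 1.0]
```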