# Homework #4: Random Forests and Ensembles

In this assignment, you will develop your own ensemble learning models.

Fill in the code wherever you see the comment "Your Code Goes Here", and follow all the steps in the document.

In this section, please run the provided Python code, add the code needed to complete the tasks described below, and use the results to answer the questions in the HW assignment.

# Part 1: Implementing Ensemble Methods



**Objective**:
The objective of this assignment is to explore and compare the performance of three popular ensemble learning algorithms (Random Forest, AdaBoost, and XGBoost) on the Iris dataset. You will analyze the decision boundaries of each model and discuss the differences in their performance.

**Dataset**:
You will use the Iris dataset, a classic dataset in machine learning and statistics. The Iris dataset contains 150 samples of iris flowers, each with four features: sepal length, sepal width, petal length, and petal width. The task is to classify each sample into one of three species: setosa, versicolor, or virginica. Note that the provided code keeps only the first two features (sepal length and sepal width), so the decision boundaries can be drawn in two dimensions.
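
Before training anything, it can help to confirm what the dataset looks like. The following is a minimal sketch using scikit-learn's `load_iris`; the names `X_all`, `y_all`, and `X_2d` are illustrative only and are not used elsewhere in the assignment.

```python
from sklearn.datasets import load_iris

# Load the full dataset (all four features)
iris = load_iris()
X_all, y_all = iris.data, iris.target

print(X_all.shape)         # (number of samples, number of features)
print(iris.feature_names)  # names of the four measurements
print(iris.target_names)   # names of the three species

# The assignment code slices the feature matrix down to its first two columns
X_2d = X_all[:, :2]
print(X_2d.shape)
```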

**Tasks**:


1. Train three ensemble learning models (Random Forest, AdaBoost, and XGBoost) on the training set. Use n_estimators=100 and random_state=42 for all three models.
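
All three classifiers follow the scikit-learn estimator pattern: construct the model with its hyperparameters, call `fit` on the training data, then call `score` on the test data. The sketch below illustrates that pattern with a single decision tree on synthetic data, so the ensemble code is still yours to write. The helper `fit_and_score` and the `*_demo` / `Xd_*` names are hypothetical and not part of the assignment.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (NOT the Iris dataset), three classes
X_demo, y_demo = make_classification(
    n_samples=200, n_features=4, n_informative=3, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=42,
)
Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

def fit_and_score(model, X_tr, y_tr, X_te, y_te):
    """Fit any scikit-learn-style classifier and return its test accuracy."""
    model.fit(X_tr, y_tr)
    return model.score(X_te, y_te)

# Fixing random_state makes the result reproducible across runs
tree_acc = fit_and_score(
    DecisionTreeClassifier(random_state=42), Xd_train, yd_train, Xd_test, yd_test
)
tree_acc_again = fit_and_score(
    DecisionTreeClassifier(random_state=42), Xd_train, yd_train, Xd_test, yd_test
)
print("Single tree accuracy:", tree_acc)
print("Same seed, same result:", tree_acc == tree_acc_again)
```

The ensemble classes accept `n_estimators` in the constructor in the same way that `random_state` is passed here.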


In [None]:
# Import necessary libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from xgboost import XGBClassifier
from mlxtend.plotting import plot_decision_regions

# Load the Iris dataset
iris = load_iris()
X, y = iris.data[:, :2], iris.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train Random Forest classifier - Your Code Goes Here (assign the fitted model to rf_clf)


# Train AdaBoost classifier - Your Code Goes Here (assign the fitted model to adaboost_clf)


# Train XGBoost classifier - Your Code Goes Here (assign the fitted model to xgb_clf)


# Plot decision boundaries for Random Forest
plt.figure(figsize=(12, 4))
plt.subplot(1, 3, 1)
plot_decision_regions(X_train, y_train, clf=rf_clf, legend=2)
plt.title('Random Forest Decision Boundaries')

# Plot decision boundaries for AdaBoost
plt.subplot(1, 3, 2)
plot_decision_regions(X_train, y_train, clf=adaboost_clf, legend=2)
plt.title('AdaBoost Decision Boundaries')

# Plot decision boundaries for XGBoost
plt.subplot(1, 3, 3)
plot_decision_regions(X_train, y_train, clf=xgb_clf, legend=2)
plt.title('XGBoost Decision Boundaries')

plt.show()

# Evaluate model accuracies on the testing set
rf_acc = rf_clf.score(X_test, y_test)
adaboost_acc = adaboost_clf.score(X_test, y_test)
xgb_acc = xgb_clf.score(X_test, y_test)

print("Random Forest Accuracy:", rf_acc)
print("AdaBoost Accuracy:", adaboost_acc)
print("XGBoost Accuracy:", xgb_acc)
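
A decision-region plot is, in general, produced by predicting a class for every point on a dense grid over the two feature axes and coloring the grid by the predicted class. The sketch below shows that idea without any plotting library. It uses a single decision tree as a stand-in classifier, and the grid resolution and margins are arbitrary choices. It illustrates the general technique and is not a description of how `plot_decision_regions` is implemented internally.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Same two features as the assignment: sepal length and sepal width
iris_data = load_iris()
X_iris, y_iris = iris_data.data[:, :2], iris_data.target

# Stand-in classifier (a single tree, not one of the three ensembles)
stand_in = DecisionTreeClassifier(max_depth=3, random_state=42).fit(X_iris, y_iris)

# Build a grid that covers the data with a small margin
x_min, x_max = X_iris[:, 0].min() - 1, X_iris[:, 0].max() + 1
y_min, y_max = X_iris[:, 1].min() - 1, X_iris[:, 1].max() + 1
xx, yy = np.meshgrid(
    np.linspace(x_min, x_max, 200),
    np.linspace(y_min, y_max, 200),
)

# Predict a class for every grid point, then reshape back to the grid
grid_points = np.c_[xx.ravel(), yy.ravel()]
Z = stand_in.predict(grid_points).reshape(xx.shape)
print(Z.shape)

# Z can then be drawn with, for example, plt.contourf(xx, yy, Z, alpha=0.3)
```

Regions where `Z` changes value are the decision boundaries. Jagged, axis-aligned edges are typical of tree-based models.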

In [None]:
from sklearn.metrics import precision_score, recall_score, f1_score, confusion_matrix

# Evaluate model performance on the testing set
def evaluate_model_performance(clf, X_test, y_test):
    y_pred = clf.predict(X_test)
    precision = precision_score(y_test, y_pred, average='weighted')
    recall = recall_score(y_test, y_pred, average='weighted')
    f1 = f1_score(y_test, y_pred, average='weighted')
    cm = confusion_matrix(y_test, y_pred)
    return precision, recall, f1, cm

# Evaluate model performance for Random Forest
rf_precision, rf_recall, rf_f1, rf_cm = evaluate_model_performance(rf_clf, X_test, y_test)
print("Random Forest Metrics:")
print("Precision:", rf_precision)
print("Recall:", rf_recall)
print("F1-Score:", rf_f1)
print("Confusion Matrix:\n", rf_cm)

# Evaluate model performance for AdaBoost
adaboost_precision, adaboost_recall, adaboost_f1, adaboost_cm = evaluate_model_performance(adaboost_clf, X_test, y_test)
print("\nAdaBoost Metrics:")
print("Precision:", adaboost_precision)
print("Recall:", adaboost_recall)
print("F1-Score:", adaboost_f1)
print("Confusion Matrix:\n", adaboost_cm)

# Evaluate model performance for XGBoost
xgb_precision, xgb_recall, xgb_f1, xgb_cm = evaluate_model_performance(xgb_clf, X_test, y_test)
print("\nXGBoost Metrics:")
print("Precision:", xgb_precision)
print("Recall:", xgb_recall)
print("F1-Score:", xgb_f1)
print("Confusion Matrix:\n", xgb_cm)
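
The metrics above use `average='weighted'`, which computes the metric per class and then averages the per-class values weighted by the number of true samples in each class. The sketch below checks this on a small hand-made label set; `y_true_demo` and `y_pred_demo` are made-up values for illustration only, not results from the assignment.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Made-up labels for illustration (three classes, unequal class sizes)
y_true_demo = np.array([0, 0, 0, 0, 1, 1, 2, 2, 2, 2])
y_pred_demo = np.array([0, 0, 1, 0, 1, 2, 2, 2, 2, 0])

# Per-class precision and the number of true samples per class (support)
per_class_precision = precision_score(y_true_demo, y_pred_demo, average=None)
support = np.bincount(y_true_demo)

# Weighted average computed by hand vs. scikit-learn's built-in option
manual_weighted = np.average(per_class_precision, weights=support)
sklearn_weighted = precision_score(y_true_demo, y_pred_demo, average='weighted')
print("Per-class precision:", per_class_precision)
print("Manual weighted precision:", manual_weighted)
print("sklearn weighted precision:", sklearn_weighted)

# Support-weighted recall reduces to overall accuracy
weighted_recall = recall_score(y_true_demo, y_pred_demo, average='weighted')
accuracy = accuracy_score(y_true_demo, y_pred_demo)
print("Weighted recall:", weighted_recall, "Accuracy:", accuracy)
```

Because support-weighted recall equals accuracy, the recall values printed by `evaluate_model_performance` should match the accuracies from `score` in the previous cell. Precision and F1 can differ from accuracy.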