This project demonstrates several ensemble methods in machine learning: Bagging, Random Forest, Boosting, and Stacking classifiers. It uses the scikit-learn library to implement them on the synthetic "moons" dataset, a two-feature binary classification problem.

## Project Structure
### Import Libraries
### Generate Dataset
### Train-Test Split
### Bagging Classifier
### Random Forest Classifier
### AdaBoost Classifier
### Gradient Boosting Regressor
### Stacking Classifier
### Results and Feature Importances


In [1]:
# 1. Import Libraries
import numpy as np
import pandas as pd
from sklearn.datasets import make_moons, fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.ensemble import (
    BaggingClassifier,
    RandomForestClassifier,
    AdaBoostClassifier,
    GradientBoostingRegressor,
    StackingClassifier,
)
from sklearn.tree import DecisionTreeClassifier
from sklearn.preprocessing import OrdinalEncoder
from sklearn.pipeline import make_pipeline
from sklearn.compose import make_column_transformer
from sklearn.metrics import accuracy_score

In [2]:
# 2. Generate Dataset
# Generating the moons dataset for classification
X, y = make_moons(n_samples=500, noise=0.3, random_state=42)


In [3]:
# 3. Train-Test Split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
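
The split above passes no `test_size`, so scikit-learn falls back to its default of 25% test data. A minimal check of the resulting shapes, assuming the same `make_moons` call and seed as in the cells above:

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split

# Same dataset and seed as the notebook cells above
X, y = make_moons(n_samples=500, noise=0.3, random_state=42)

# No test_size given, so the default 75/25 split applies
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

print(X_train.shape, X_test.shape)  # (375, 2) (125, 2)
```

With 125 test samples, each misclassified test point changes the reported accuracy by 0.008, which is worth keeping in mind when comparing scores that differ only in the third decimal place.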

In [6]:
# 4. Bagging Classifier
bag_clf = BaggingClassifier(
    DecisionTreeClassifier(max_depth=3),  # base learner, passed positionally (keyword is `estimator`, or `base_estimator` before scikit-learn 1.2)
    n_estimators=500,
    max_samples=0.8,
    n_jobs=-1,
    random_state=42
)

# Fit the model
bag_clf.fit(X_train, y_train)

# Bagging Classifier Score
bagging_score = bag_clf.score(X_test, y_test)
print("Bagging Classifier Score:", bagging_score)

Bagging Classifier Score: 0.904
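
Because bagging trains each tree on a bootstrap sample, the training instances a tree never saw can serve as a built-in validation set. The sketch below is one possible way to get this out-of-bag estimate by setting `oob_score=True`; it reuses the hyperparameters from the cell above, and the variable name `oob_clf` is only illustrative.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.3, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Same settings as the bagging cell, plus out-of-bag evaluation.
# oob_score requires bootstrap=True, which is the default.
oob_clf = BaggingClassifier(
    DecisionTreeClassifier(max_depth=3),
    n_estimators=500,
    max_samples=0.8,
    oob_score=True,
    random_state=42,
)
oob_clf.fit(X_train, y_train)

# Accuracy estimated only from instances each tree did not train on
print("Out-of-bag score:", oob_clf.oob_score_)
# Accuracy on the held-out test set, for comparison
print("Test score:", oob_clf.score(X_test, y_test))
```

If the out-of-bag score tracks the test score closely, it can stand in for a separate validation split during tuning.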


In [8]:
# 5. Random Forest Classifier
rnd_clf = RandomForestClassifier(n_estimators=500, max_leaf_nodes=16, n_jobs=-1, random_state=42)
rnd_clf.fit(X_train, y_train)
y_pred_rf = rnd_clf.predict(X_test)

# Random Forest Classifier Score
rf_score = accuracy_score(y_test, y_pred_rf)
print("Random Forest Classifier Score:", rf_score)


Random Forest Classifier Score: 0.912


In [10]:
# 6. AdaBoost Classifier
ada_clf = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),  # decision stump; the keyword is `base_estimator` before scikit-learn 1.2
    n_estimators=30,
    learning_rate=0.5,
    random_state=42
)

# Fit the model
ada_clf.fit(X_train, y_train)

# AdaBoost Classifier Score
ada_score = ada_clf.score(X_test, y_test)
print("AdaBoost Classifier Score:", ada_score)



AdaBoost Classifier Score: 0.904
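
The project outline lists a Gradient Boosting Regressor, but no cell for it appears here. The following is a minimal sketch of what such a step could look like, not the original author's code. It assumes a synthetic regression dataset from `make_regression` (the moons labels are a classification target), and the hyperparameters and variable names (`X_reg`, `gbrt`, and so on) are illustrative choices.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Assumption: a synthetic regression problem stands in for real data
X_reg, y_reg = make_regression(
    n_samples=500, n_features=5, noise=10.0, random_state=42
)
Xr_train, Xr_test, yr_train, yr_test = train_test_split(
    X_reg, y_reg, random_state=42
)

# Each new shallow tree is fit to the residual errors of the ensemble so far;
# learning_rate scales the contribution of each tree.
gbrt = GradientBoostingRegressor(
    max_depth=2,
    n_estimators=100,
    learning_rate=0.1,
    random_state=42,
)
gbrt.fit(Xr_train, yr_train)

# For regressors, score() returns the R^2 coefficient of determination
r2 = gbrt.score(Xr_test, yr_test)
print("Gradient Boosting Regressor R^2:", r2)
```

Unlike AdaBoost, which reweights training instances, gradient boosting fits each new predictor to the previous predictors' residuals.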


In [12]:
# 8. Stacking Classifier
stacking_clf = StackingClassifier(
    estimators=[
        ('tree', DecisionTreeClassifier(max_depth=3, random_state=42)),
        ('rf', RandomForestClassifier(n_estimators=100, random_state=42)),
        ('stump', DecisionTreeClassifier(max_depth=1, random_state=42))
    ],
    final_estimator=RandomForestClassifier(random_state=43),
    cv=5  # number of cross-validation folds
)
stacking_clf.fit(X_train, y_train)

# Stacking Classifier Score
stacking_score = stacking_clf.score(X_test, y_test)
print("Stacking Classifier Score:", stacking_score)

# 9. Feature Importances (Random Forest)
# Displaying feature importances for Random Forest
feature_importances = rnd_clf.feature_importances_
print("Random Forest Feature Importances:")
for score, name in zip(feature_importances, ["Feature 1", "Feature 2"]):
    print(round(score, 2), name)

# 10. Conclusion
print("\nConclusion: This project demonstrates various ensemble methods in cl")

Stacking Classifier Score: 0.896
Random Forest Feature Importances:
0.42 Feature 1
0.58 Feature 2

Conclusion: This project demonstrates various ensemble methods in cl
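
The importances printed above are impurity-based: they are computed from the training data and normalized to sum to 1. As a cross-check, permutation importance measures how much test accuracy drops when one feature's values are shuffled. The sketch below refits the same random forest and applies `sklearn.inspection.permutation_importance`; the `n_repeats=10` setting is an arbitrary choice for illustration.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_moons(n_samples=500, noise=0.3, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Same forest as in the Random Forest cell
rnd_clf = RandomForestClassifier(
    n_estimators=500, max_leaf_nodes=16, random_state=42
)
rnd_clf.fit(X_train, y_train)

# Impurity-based importances always sum to 1
print("Impurity-based:", rnd_clf.feature_importances_)

# Shuffle each feature in turn on the test set and record the accuracy drop
result = permutation_importance(
    rnd_clf, X_test, y_test, n_repeats=10, random_state=42
)
names = ["Feature 1", "Feature 2"]
for name, mean, std in zip(names, result.importances_mean, result.importances_std):
    print(f"{name}: mean accuracy drop {mean:.3f} (std {std:.3f})")
```

Permutation importances are not normalized, so they are read as absolute drops in accuracy and not as shares of a total.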
