BAGGING (Random Forest Example)
load_iris() – loads the Iris flower dataset (150 samples, 4 features, 3 species), a small dataset often used for testing ML models.

train_test_split() – splits data into training (80%) and testing (20%).

RandomForestClassifier() – creates a bagging model made of many decision trees, each trained on a random bootstrap sample of the training data.

fit() – teaches the model using the training data.

predict() – uses the learned model to guess labels for new data.

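accuracy_score() – computes the fraction of test labels that were predicted correctly.

Bagging = train models independently on random resamples → combine their votes.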

In [1]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Load dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split into train/test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and train the Random Forest model
rf = RandomForestClassifier(random_state=42)
rf.fit(X_train, y_train)

# Make predictions
y_pred = rf.predict(X_test)

# Evaluate accuracy
print("Bagging (Random Forest) Accuracy:", accuracy_score(y_test, y_pred))


Bagging (Random Forest) Accuracy: 1.0
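
To make the bagging idea in this section concrete, here is a minimal hand-rolled sketch: each tree is trained on a bootstrap sample (rows drawn with replacement) and the final label is a majority vote. The number of trees (25), the seeds, and the variable names are illustrative assumptions, not part of the original lab. A real random forest also samples a subset of features at each split, which this sketch leaves out.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

rng = np.random.default_rng(42)  # assumed seed, for repeatability only
n_trees = 25                     # illustrative choice
all_votes = []

for _ in range(n_trees):
    # Bootstrap sample: draw as many row indices as there are
    # training rows, with replacement
    idx = rng.integers(0, len(X_train), size=len(X_train))
    tree = DecisionTreeClassifier(random_state=0)
    tree.fit(X_train[idx], y_train[idx])
    all_votes.append(tree.predict(X_test))

votes = np.stack(all_votes)  # shape: (n_trees, n_test_samples)

# Majority vote: for each test sample, pick the most frequent label
y_pred_manual = np.array(
    [np.bincount(column, minlength=3).argmax() for column in votes.T]
)

print("Hand-rolled bagging accuracy:", accuracy_score(y_test, y_pred_manual))
```

Because every tree is trained independently of the others, the loop could run in parallel; that independence is the main contrast with boosting.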


BOOSTING (Gradient Boosting Example)
GradientBoostingClassifier() – builds trees one after another.

Each new tree tries to fix the mistakes of the previous ones.

fit() trains the boosting model step by step.

predict() gets final predictions after all corrections.

accuracy_score() evaluates performance.

Boosting = train models sequentially → each fixes errors of the last.

In [2]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score

# Load dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split into train/test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and train the Gradient Boosting model
gb = GradientBoostingClassifier(random_state=42)
gb.fit(X_train, y_train)

# Make predictions
y_pred = gb.predict(X_test)

# Evaluate accuracy
print("Boosting (Gradient Boosting) Accuracy:", accuracy_score(y_test, y_pred))


Boosting (Gradient Boosting) Accuracy: 1.0
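
One way to see the "step by step" nature of boosting is to score the ensemble after each added tree. The sketch below uses `staged_predict()`, which yields the predictions of the partial ensemble after every boosting stage; the checkpoints printed (1, 5, 20 and the last stage) are an arbitrary choice for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

gb = GradientBoostingClassifier(random_state=42)
gb.fit(X_train, y_train)

# staged_predict yields one prediction array per boosting stage,
# i.e. the output of the ensemble built so far
staged_acc = [
    accuracy_score(y_test, y_stage) for y_stage in gb.staged_predict(X_test)
]

# Print test accuracy at a few arbitrary checkpoints
for n in (1, 5, 20, len(staged_acc)):
    print(f"after {n:3d} boosting stages: accuracy = {staged_acc[n - 1]:.3f}")
```

On a dataset as easy as Iris the curve may flatten after very few stages; on harder data the same loop is a simple way to judge how many trees are actually needed.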


STACKING (Combine Models + Meta-Learner)
DecisionTreeClassifier() – first base model.

SVC() – second base model (SVM).

These models each learn differently.

LogisticRegression() – the “meta-learner” that learns from their predictions.

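StackingClassifier() – combines the base models and the meta-learner into a single ensemble; the meta-learner is trained on cross-validated predictions of the base models.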

fit() trains the whole stack.

predict() runs the base models, then lets the meta-learner turn their outputs into the final label.

accuracy_score() checks performance.

Stacking = train multiple models → feed their predictions into a final model.

In [3]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import StackingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Load dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split into train/test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Base models
base_estimators = [
    ('tree', DecisionTreeClassifier()),
    ('svc', SVC(probability=True))
]

# Meta-learner
meta_learner = LogisticRegression()

# Build stacking model
stack_model = StackingClassifier(
    estimators=base_estimators,
    final_estimator=meta_learner
)

# Train stacking model
stack_model.fit(X_train, y_train)

# Predict and evaluate
y_pred = stack_model.predict(X_test)
print("Stacking Accuracy:", accuracy_score(y_test, y_pred))


Stacking Accuracy: 1.0
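
To show what the meta-learner actually receives, the sketch below calls `transform()` on a fitted stack, which returns the base models' outputs for each sample. The `random_state` values and `max_iter=1000` are assumptions added for repeatability and are not part of the original cell.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import StackingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

stack_model = StackingClassifier(
    estimators=[
        ('tree', DecisionTreeClassifier(random_state=42)),
        ('svc', SVC(probability=True, random_state=42)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
)
stack_model.fit(X_train, y_train)

# transform() returns the inputs the meta-learner sees: here, the
# class probabilities from each base model, placed side by side
meta_features = stack_model.transform(X_test)

print("Method used per base model:", stack_model.stack_method_)
print("Meta-feature matrix shape:", meta_features.shape)
print("Meta-features for first test sample:", meta_features[0].round(2))
```

With three classes and two base models that both expose `predict_proba`, each test sample becomes a row of six probabilities, and the logistic regression learns how much to trust each model for each class.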


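The perfect accuracy (1.0) observed for bagging, boosting, and stacking reflects how small and easy this dataset is: Iris has only 150 samples, so the 20% test split contains just 30 flowers, and the three classes are well separated. The scores also come from a single fixed train/test split under controlled lab conditions. Real-world datasets, which are larger and more complex, typically yield lower accuracy, so these results show that the code works and should not be read as a ranking of the three methods.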
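
A score from a single 30-sample test set is a noisy estimate. One common way to get a more stable one is k-fold cross-validation, sketched below with `cross_val_score`. The choice of 5 folds and of the random forest as the model are assumptions for illustration; the same call works for the other two ensembles.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

rf = RandomForestClassifier(random_state=42)

# 5-fold cross-validation: every sample is used for testing exactly once,
# so the estimate does not hinge on one particular train/test split
scores = cross_val_score(rf, X, y, cv=5)

print("Per-fold accuracy:", scores.round(3))
print(f"Mean: {scores.mean():.3f}, std: {scores.std():.3f}")
```

The spread across folds gives a rough sense of how much the single-split score could vary by chance.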