### Question 1: Ensemble Techniques – Bagging vs Boosting
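**Bagging** trains multiple models independently on bootstrap samples (random subsets of the training data drawn with replacement) and averages or votes over their predictions, which mainly reduces variance.
**Boosting** trains models sequentially, with each new model concentrating on the errors of the ones before it, which mainly reduces bias.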
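
The bagging half of this comparison can be sketched by hand. The snippet below is a minimal illustration, not a replacement for `BaggingClassifier`: the choice of 25 trees, the seeds, and the variable names are assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
n_models = 25  # illustrative choice
votes = np.zeros((n_models, len(X_test)), dtype=int)

for m in range(n_models):
    # Bootstrap sample: draw row indices with replacement
    idx = rng.integers(0, len(X_train), size=len(X_train))
    # Each tree is fit independently of the others
    tree = DecisionTreeClassifier(random_state=m).fit(X_train[idx], y_train[idx])
    votes[m] = tree.predict(X_test)

# Majority vote across the independently trained trees
bagged_pred = np.array([np.bincount(col, minlength=3).argmax() for col in votes.T])
bagged_acc = (bagged_pred == y_test).mean()
print("Hand-rolled bagging accuracy:", bagged_acc)
```

Because no tree depends on another, the loop could run in parallel; a boosting loop could not, since each round needs the previous round's errors.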


### Question 2: Random Forest vs Single Decision Tree
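Random Forest reduces overfitting relative to a single decision tree by averaging many trees, each trained on a bootstrapped sample and restricted to a random subset of features at every split.
Key hyperparameters: `n_estimators` (number of trees), `max_features` (number of features considered per split).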
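
A quick way to see the difference is to cross-validate both models on the same data. This is a sketch under assumptions: the breast cancer dataset, `n_estimators=200`, and `max_features="sqrt"` are picked for illustration and are not taken from the questions above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

tree = DecisionTreeClassifier(random_state=42)
forest = RandomForestClassifier(
    n_estimators=200,      # number of trees to average
    max_features="sqrt",   # features considered at each split
    random_state=42,
)

# 5-fold cross-validated accuracy for each model
tree_scores = cross_val_score(tree, X, y, cv=5)
forest_scores = cross_val_score(forest, X, y, cv=5)

print("Single tree  : mean=%.3f std=%.3f" % (tree_scores.mean(), tree_scores.std()))
print("Random forest: mean=%.3f std=%.3f" % (forest_scores.mean(), forest_scores.std()))
```

Comparing the spread of the fold scores, not just the mean, gives a rough view of how much variance the averaging removes.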


### Question 3: Stacking in Ensemble Learning
Stacking trains a meta-model on the predictions of several base models, typically out-of-fold predictions, so that it can learn how best to combine them.
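
The mechanism can be sketched manually with `cross_val_predict`. This is an illustrative approximation of what a stacking ensemble does, assuming two base models and class probabilities as meta-features; it is not the internals of any particular library class.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

base_models = [KNeighborsClassifier(), DecisionTreeClassifier(random_state=42)]

# Out-of-fold probabilities: each training row is predicted by a model
# that did not see that row, which keeps the meta-features honest
train_meta = np.hstack([
    cross_val_predict(m, X_train, y_train, cv=5, method="predict_proba")
    for m in base_models
])

# Refit each base model on the full training set to build test meta-features
test_meta = np.hstack([
    m.fit(X_train, y_train).predict_proba(X_test) for m in base_models
])

# The meta-model learns how to weight the base models' outputs
meta_model = LogisticRegression(max_iter=1000).fit(train_meta, y_train)
stacked_acc = meta_model.score(test_meta, y_test)
print("Manual stacking accuracy:", stacked_acc)
```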


### Question 4: OOB Score in Random Forest
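The Out-of-Bag (OOB) score estimates generalization performance by predicting each training sample using only the trees whose bootstrap sample did not include it, so no separate validation set is needed.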
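
In scikit-learn this is available through the `oob_score` parameter of `RandomForestClassifier`. A minimal sketch, with the dataset and `n_estimators=200` chosen only for illustration:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

# oob_score=True requires bootstrap sampling (the default for random forests)
forest = RandomForestClassifier(
    n_estimators=200,
    oob_score=True,
    bootstrap=True,
    random_state=42,
)
forest.fit(X, y)  # the whole dataset is used; no hold-out split is needed

# Accuracy measured only on samples each tree did not train on
print("OOB score:", forest.oob_score_)
```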


### Question 5: AdaBoost vs Gradient Boosting
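AdaBoost increases the weights of misclassified samples so that the next weak learner focuses on them; Gradient Boosting fits each new learner to the negative gradient of the loss (the pseudo-residuals) of the current ensemble.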
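
One round of each update rule can be written out to make the contrast concrete. This is a simplified sketch: the toy labels, the hypothetical weak-learner predictions `h`, the squared-error loss, and the learning rate are all assumptions made for the example.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# --- AdaBoost: one reweighting round (binary labels in {-1, +1}) ---
y = np.array([1, 1, -1, -1, 1])
h = np.array([1, -1, -1, -1, 1])         # hypothetical weak-learner output; index 1 is wrong
w = np.full(len(y), 1 / len(y))          # start from uniform sample weights

err = w[h != y].sum()                    # weighted error of the weak learner
alpha = 0.5 * np.log((1 - err) / err)    # the learner's say in the final vote
w = w * np.exp(-alpha * y * h)           # up-weight mistakes, down-weight hits
w /= w.sum()                             # renormalize to a distribution
print("AdaBoost weights after one round:", w)

# --- Gradient boosting: one round with squared-error loss ---
X_reg = np.arange(10, dtype=float).reshape(-1, 1)
y_reg = X_reg.ravel() ** 2
learning_rate = 0.1

F0 = np.full_like(y_reg, y_reg.mean())   # initial constant prediction
residuals = y_reg - F0                   # negative gradient of 0.5 * (y - F)^2
stump = DecisionTreeRegressor(max_depth=1).fit(X_reg, residuals)
F1 = F0 + learning_rate * stump.predict(X_reg)

mse_before = np.mean((y_reg - F0) ** 2)
mse_after = np.mean((y_reg - F1) ** 2)
print("MSE before:", mse_before, "after:", mse_after)
```

In the AdaBoost half, the single misclassified sample ends up carrying half of the total weight, which is what forces the next learner to attend to it. In the gradient boosting half, no sample weights change; the new learner simply targets what the current prediction still gets wrong.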


### Question 6: CatBoost and Categorical Features
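CatBoost encodes categorical features with ordered target statistics: each row's encoding is computed only from the rows that precede it in a random permutation of the data, which limits target leakage and removes the need for manual one-hot or label encoding.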
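
The idea behind ordered target statistics can be sketched in plain Python. This is a simplified illustration, not CatBoost's actual implementation: the function name `ordered_target_stats`, the `prior_weight` parameter, and the toy data are hypothetical.

```python
import random

def ordered_target_stats(categories, targets, prior, prior_weight=1.0, seed=0):
    """Encode each row using only the targets of rows visited before it."""
    order = list(range(len(categories)))
    random.Random(seed).shuffle(order)   # random permutation of the rows

    sums, counts = {}, {}                # running target sum and count per category
    encoded = [0.0] * len(categories)
    for i in order:
        c = categories[i]
        s = sums.get(c, 0.0)
        n = counts.get(c, 0)
        # Smoothed mean of earlier targets; the row's own target is excluded
        encoded[i] = (s + prior_weight * prior) / (n + prior_weight)
        # Only now is this row's target added to the running statistics
        sums[c] = s + targets[i]
        counts[c] = n + 1
    return encoded

cats = ["red", "blue", "red", "green", "blue", "red"]
ys = [1, 0, 1, 0, 1, 0]
enc = ordered_target_stats(cats, ys, prior=0.5)
print(enc)
```

The first row visited in each category has no history, so it falls back to the prior; later rows in that category get progressively better-informed estimates.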


In [1]:
# Question 7: Bagging with KNN
from sklearn.ensemble import BaggingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = BaggingClassifier(KNeighborsClassifier(), n_estimators=10, random_state=42)
model.fit(X_train, y_train)
print("Bagging KNN Accuracy:", accuracy_score(y_test, model.predict(X_test)))

Bagging KNN Accuracy: 1.0


In [2]:
# Question 8: AdaBoost with Decision Tree
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

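# Depth-1 trees (decision stumps) serve as the weak learners;
# X_train, X_test, y_train, y_test come from the Question 7 cell
model = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1), n_estimators=50, random_state=42)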
model.fit(X_train, y_train)
print("AdaBoost Accuracy:", accuracy_score(y_test, model.predict(X_test)))

AdaBoost Accuracy: 0.9473684210526315


In [3]:
# Question 9: Gradient Boosting
from sklearn.ensemble import GradientBoostingClassifier

model = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, random_state=42)
model.fit(X_train, y_train)
print("Gradient Boosting Accuracy:", accuracy_score(y_test, model.predict(X_test)))

Gradient Boosting Accuracy: 1.0


In [4]:
# Question 10: Stacking Classifier
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression

estimators = [
    ('knn', KNeighborsClassifier()),
    ('dt', DecisionTreeClassifier())
]

model = StackingClassifier(estimators=estimators, final_estimator=LogisticRegression(), cv=5)
model.fit(X_train, y_train)
print("Stacking Classifier Accuracy:", accuracy_score(y_test, model.predict(X_test)))

Stacking Classifier Accuracy: 1.0
