---
### Theory Questions
---

<div style="font-family: Verdana; font-size: 18px; line-height: 1.6;">


### 1. **What is Boosting in Machine Learning?**

Boosting is an **ensemble technique** that combines multiple **weak learners** (typically decision trees) to create a strong predictive model. It works by training models sequentially, where each new model tries to correct the errors made by the previous ones.
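
To make the weak-versus-strong distinction concrete, the sketch below compares a single decision stump with 100 boosted stumps on synthetic data. The dataset, its size, and the seeds are illustrative assumptions, and the exact scores will vary with them.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data; sizes and seeds are arbitrary illustrative choices.
X, y = make_classification(n_samples=1000, n_features=20, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# One weak learner: a depth-1 tree (decision stump).
stump = DecisionTreeClassifier(max_depth=1, random_state=0).fit(X_train, y_train)

# Many stumps trained sequentially (stumps are AdaBoost's default base learner).
boosted = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

stump_acc = stump.score(X_test, y_test)
boosted_acc = boosted.score(X_test, y_test)
print(f"Single stump:   {stump_acc:.3f}")
print(f"Boosted stumps: {boosted_acc:.3f}")
```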

---

### 2. **How does Boosting differ from Bagging?**

| Feature        | Boosting                      | Bagging                         |
| -------------- | ----------------------------- | ------------------------------- |
| Training       | Sequential                    | Parallel                        |
| Focus          | Correcting previous errors    | Reducing variance via averaging |
| Example Models | AdaBoost, Gradient Boosting   | Random Forest                   |
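| Overfitting    | More sensitive to noise and outliers; needs tuning (learning rate, tree depth, number of rounds) | Less prone, since averaging reduces variance |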

---

### 3. **What is the key idea behind AdaBoost?**

The key idea in AdaBoost (Adaptive Boosting) is to **assign higher weights** to the data points that are misclassified by previous models, so that subsequent models focus more on those difficult cases.

---

### 4. **Explain the working of AdaBoost with an example**

**Steps:**

1. Start with equal weights for all training samples.
2. Train a weak learner (e.g., decision stump).
3. Increase weights of misclassified samples.
4. Train the next model on the updated weights.
5. Repeat steps 2–4 for `T` iterations.
6. Final prediction is a **weighted vote** of all weak learners.

**Example:**
If sample A is misclassified in the first round, its weight increases, so the next learner is penalized more for getting A wrong. After several rounds, the weighted vote is more likely to classify A correctly, although noisy or mislabeled samples may stay misclassified.
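
The steps above can be sketched from scratch as follows. This is a minimal version of discrete AdaBoost for labels in {-1, +1}, not scikit-learn's implementation; the dataset, the number of rounds `T = 25`, and the clipping constant are assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y01 = make_classification(n_samples=500, n_features=10, random_state=0)
y = 2 * y01 - 1                      # relabel {0, 1} as {-1, +1}

T = 25                               # number of boosting rounds (illustrative)
w = np.full(len(y), 1 / len(y))      # step 1: equal weights
stumps, alphas = [], []

for _ in range(T):
    # steps 2 and 4: fit a decision stump on the current sample weights
    stump = DecisionTreeClassifier(max_depth=1, random_state=0)
    stump.fit(X, y, sample_weight=w)
    pred = stump.predict(X)

    # weighted error, clipped to keep the logarithm finite
    err = np.clip(w[pred != y].sum(), 1e-10, 1 - 1e-10)
    alpha = 0.5 * np.log((1 - err) / err)   # this learner's vote weight

    # step 3: up-weight mistakes, down-weight correct samples, renormalize
    w *= np.exp(-alpha * y * pred)
    w /= w.sum()

    stumps.append(stump)
    alphas.append(alpha)

# step 6: final prediction is the sign of the alpha-weighted vote
votes = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
train_acc = np.mean(np.sign(votes) == y)
print(f"Training accuracy after {T} rounds: {train_acc:.3f}")
```

A learner with a lower weighted error receives a larger `alpha`, which is what makes the final vote weighted rather than a simple majority.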

---

### 5. **What is Gradient Boosting, and how is it different from AdaBoost?**

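Gradient Boosting builds models sequentially like AdaBoost, but instead of re-weighting samples, it **fits each new model to the negative gradient of the loss function** with respect to the current predictions (the pseudo-residuals). This amounts to gradient descent in function space: each new model is a step that reduces the loss.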

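**Difference:**

* **AdaBoost** minimizes exponential loss by re-weighting samples after each round.
* **Gradient Boosting** works with any differentiable loss and fits each new learner to the pseudo-residuals, which equal the ordinary residuals for squared-error loss.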
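
For squared-error loss the idea reduces to repeatedly fitting a small tree to the current residuals. The sketch below assumes synthetic regression data, depth-3 trees, 100 rounds, and a learning rate of 0.1; these are illustrative choices, and library implementations add many refinements on top.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)

learning_rate, n_rounds = 0.1, 100   # illustrative settings
F = np.full(len(y), y.mean())        # initial model: a constant prediction
initial_mse = np.mean((y - F) ** 2)
trees = []

for _ in range(n_rounds):
    # For L = 1/2 * (y - F)^2, the negative gradient w.r.t. F is the residual y - F.
    residuals = y - F
    tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, residuals)
    F += learning_rate * tree.predict(X)   # take a small step in that direction
    trees.append(tree)

final_mse = np.mean((y - F) ** 2)
print(f"Training MSE: {initial_mse:.1f} -> {final_mse:.1f}")
```

To predict on new data, the same sum would be replayed: the initial constant plus `learning_rate` times each tree's prediction.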

---

### 6. **What is the loss function in Gradient Boosting?**

Gradient Boosting accepts any differentiable loss function; the usual choice depends on the problem:

* **Regression:** Mean Squared Error (MSE)
* **Classification:** Log Loss (Binary Cross Entropy)

Whatever the loss, each subsequent learner is trained on its **negative gradient** with respect to the current predictions.
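
As a small worked example, the negative gradients for the two losses listed here can be written out directly. The function names are hypothetical helpers, the squared-error loss is taken with a factor of 1/2, and the classification case assumes labels in {0, 1} with `F` as the raw log-odds score.

```python
import numpy as np

def neg_gradient_squared_error(y, F):
    """Negative gradient of 1/2 * (y - F)^2 w.r.t. F: the plain residual."""
    return y - F

def neg_gradient_log_loss(y, F):
    """Negative gradient of binary log loss w.r.t. the raw score F
    (labels in {0, 1}): y minus the predicted probability sigmoid(F)."""
    return y - 1.0 / (1.0 + np.exp(-F))

# Regression: targets 3.0 and -1.0, current predictions 2.5 and 0.0
res_reg = neg_gradient_squared_error(np.array([3.0, -1.0]), np.array([2.5, 0.0]))
print(res_reg)   # residuals 0.5 and -1.0

# Classification: labels 1 and 0, raw scores 0.0 (probability 0.5 for both)
res_clf = neg_gradient_log_loss(np.array([1.0, 0.0]), np.array([0.0, 0.0]))
print(res_clf)   # pseudo-residuals 0.5 and -0.5
```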

---

### 7. **How does XGBoost improve over traditional Gradient Boosting?**

XGBoost (Extreme Gradient Boosting) offers:

* **Regularization** to reduce overfitting
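* **Parallelization** of split finding within each tree (the trees themselves are still built sequentially)
* **Tree pruning**: trees grow up to `max_depth`, then splits whose gain falls below `gamma` are pruned back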
* **Efficient handling of missing values**
* **Out-of-core computation** for large datasets

---

### 8. **What is the difference between XGBoost and CatBoost?**

| Feature   | XGBoost             | CatBoost                               |
| --------- | ------------------- | -------------------------------------- |
| Data Type | Categorical features usually need encoding first (native support is opt-in in recent versions) | Handles categorical features natively  |
| Speed     | Very fast           | Optimized for categorical data         |
| Accuracy  | High                | Often higher with categorical features |

---

### 9. **What are some real-world applications of Boosting techniques?**

* **Fraud detection**
* **Credit scoring**
* **Customer churn prediction**
* **Search ranking**
* **Medical diagnosis**
* **Click-through rate prediction**

---

### 10. **How does regularization help in XGBoost?**

Regularization in XGBoost:

* Controls model complexity (via L1/L2 penalties)
* Prevents overfitting
* Helps in selecting simpler trees by penalizing large weights and complex trees
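
The effect of the penalties can be seen in the closed-form leaf weight and split gain from the XGBoost formulation. The sketch below covers only the L2 penalty (`lam`) and the per-split penalty (`gamma`); the L1 term is omitted for brevity, the function names are hypothetical, and the gradient and Hessian sums are made-up numbers.

```python
def leaf_weight(G, H, lam):
    """Optimal leaf weight given the sums of gradients (G) and Hessians (H)."""
    return -G / (H + lam)

def split_gain(G_L, H_L, G_R, H_R, lam, gamma):
    """Loss reduction from splitting a leaf into left and right children."""
    def score(G, H):
        return G * G / (H + lam)
    parent = score(G_L + G_R, H_L + H_R)
    return 0.5 * (score(G_L, H_L) + score(G_R, H_R) - parent) - gamma

# Toy gradient/Hessian sums for the two children (made-up numbers)
G_L, H_L, G_R, H_R = -4.0, 2.0, 3.0, 2.0

w_plain = leaf_weight(G_L, H_L, lam=0.0)     # 4 / 2 = 2.0
w_shrunk = leaf_weight(G_L, H_L, lam=2.0)    # 4 / 4 = 1.0, pulled toward zero
gain_plain = split_gain(G_L, H_L, G_R, H_R, lam=0.0, gamma=0.0)   # 6.125
gain_reg = split_gain(G_L, H_L, G_R, H_R, lam=2.0, gamma=1.0)     # smaller

print(w_plain, w_shrunk)
print(gain_plain, round(gain_reg, 4))
```

A larger `lam` shrinks leaf weights toward zero, and a split is only worth keeping when its gain stays positive after `gamma` is subtracted.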

---

### 11. **What are some hyperparameters to tune in Gradient Boosting models?**

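Key hyperparameters (names follow the XGBoost scikit-learn API; other libraries name some of them differently):

* `n_estimators`: Number of boosting rounds
* `learning_rate`: Shrinks the contribution of each tree
* `max_depth`: Max depth of individual trees
* `min_child_weight`: Minimum sum of instance weights (Hessian) required in a child; for squared-error loss this equals the number of samples in the leaf
* `subsample`: Fraction of training rows sampled for each tree
* `colsample_bytree`: Fraction of features sampled for each tree
* `gamma`: Minimum loss reduction required to make a further split
* `reg_lambda`, `reg_alpha` (`lambda`, `alpha` in the native API): L2 and L1 regularization on leaf weights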
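
A small randomized search shows how several of these are tuned together. The sketch uses scikit-learn's `GradientBoostingClassifier`, which shares `n_estimators`, `learning_rate`, `max_depth`, and `subsample` but does not expose the XGBoost-specific names such as `min_child_weight` or `gamma`. The candidate values, dataset, and search budget are arbitrary illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=400, n_features=20, random_state=0)

# Candidate values are illustrative, not recommendations.
param_distributions = {
    "n_estimators": [50, 100],
    "learning_rate": [0.01, 0.1, 0.3],
    "max_depth": [2, 3, 4],
    "subsample": [0.6, 0.8, 1.0],
}

search = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_distributions,
    n_iter=5,            # try 5 random combinations
    cv=3,
    scoring="accuracy",
    random_state=0,
)
search.fit(X, y)

print("Best params:", search.best_params_)
print(f"Best CV accuracy: {search.best_score_:.3f}")
```

`learning_rate` and `n_estimators` trade off against each other: a smaller learning rate usually needs more rounds to reach the same fit.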

---

### 12. **What is the concept of Feature Importance in Boosting?**

Feature Importance shows **how much each feature contributes** to improving the model's performance. Boosting models measure it by:

* **Frequency of usage** in splits
* **Information gain** from splits
* **Permutation importance** after training
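
Two of these measures can be compared side by side in scikit-learn: the impurity-based `feature_importances_` attribute and `permutation_importance` computed on held-out data. The split size, seed, and number of repeats below are illustrative assumptions; split-frequency counts are not exposed by this estimator and are left out.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, random_state=0
)
model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# Impurity-based importance: accumulated split gain, normalized to sum to 1
gain_based = model.feature_importances_

# Permutation importance: drop in test score when one feature is shuffled
perm = permutation_importance(model, X_test, y_test, n_repeats=5, random_state=0)

# Show the five features with the largest permutation importance
top = perm.importances_mean.argsort()[::-1][:5]
for i in top:
    print(f"{data.feature_names[i]}: gain={gain_based[i]:.3f}, "
          f"permutation={perm.importances_mean[i]:.3f}")
```

The two rankings need not agree: impurity-based scores come from the training data, while permutation importance reflects what the model relies on for held-out predictions.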

---

### 13. **Why is CatBoost efficient for categorical data?**

CatBoost handles categorical data efficiently by:

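* Encoding categorical features internally, so no manual one-hot or label encoding is needed
* Using **ordered target statistics**: each row's category is encoded from the target values of rows that precede it in a random permutation
* Keeping a row's own target out of its encoding, which reduces target leakage and the overfitting it causes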
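
The ordered target statistic can be sketched in a few lines. This is a simplified illustration of the idea, not CatBoost's actual implementation: the function name, the smoothing weight `a`, and the toy data are assumptions, and CatBoost itself uses several permutations along with other refinements.

```python
import numpy as np

def ordered_target_stats(categories, targets, prior, a=1.0, seed=0):
    """Encode each row's category using only rows earlier in a random order."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(categories))
    sums, counts = {}, {}                  # running target sum / count per category
    encoded = np.empty(len(categories))
    for i in order:
        c = categories[i]
        s, n = sums.get(c, 0.0), counts.get(c, 0)
        # Smoothed mean of earlier targets; the row's own target is not used
        encoded[i] = (s + a * prior) / (n + a)
        sums[c] = s + targets[i]
        counts[c] = n + 1
    return encoded

cats = ["red", "blue", "red", "red", "blue", "green"]
y = np.array([1, 0, 1, 0, 1, 1])
enc = ordered_target_stats(cats, y, prior=y.mean())
print(np.round(enc, 3))
```

A category seen for the first time falls back to the prior, which is why "green" (a single row) is encoded as the overall target mean.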



---
### Practical Questions
---



### 1. **Train an AdaBoost Classifier on a sample dataset and print model accuracy**

```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
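X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Fixed seeds make the reported accuracy reproducible across runs
model = AdaBoostClassifier(n_estimators=50, random_state=42)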
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

print("Accuracy:", accuracy_score(y_test, y_pred))
```

---

### 2. **Train an AdaBoost Regressor and evaluate performance using MAE**

```python
from sklearn.ensemble import AdaBoostRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=1000, n_features=20, noise=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Fixed seeds make the reported MAE reproducible across runs
model = AdaBoostRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

print("MAE:", mean_absolute_error(y_test, y_pred))
```

---

### 3. **Train a Gradient Boosting Classifier on the Breast Cancer dataset and print feature importance**

```python
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
X, y = data.data, data.target

model = GradientBoostingClassifier()
model.fit(X, y)

importances = model.feature_importances_
for name, importance in zip(data.feature_names, importances):
    print(f"{name}: {importance:.4f}")
```

---

### 4. **Train a Gradient Boosting Regressor and evaluate using R-Squared Score**

```python
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score

model = GradientBoostingRegressor()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

print("R2 Score:", r2_score(y_test, y_pred))
```

---

### 5. **Train an XGBoost Classifier on a dataset and compare accuracy with Gradient Boosting**

```python
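from xgboost import XGBClassifier
from sklearn.ensemble import GradientBoostingClassifier

# Classifiers need class labels, so build a classification split here
# (X_train/y_train from the regression examples hold continuous targets).
Xc, yc = make_classification(n_samples=1000, n_features=20, random_state=42)
Xc_train, Xc_test, yc_train, yc_test = train_test_split(Xc, yc, test_size=0.3, random_state=42)

# use_label_encoder is deprecated in recent XGBoost releases, so it is omitted
xgb = XGBClassifier(eval_metric='logloss', random_state=42)
gbc = GradientBoostingClassifier(random_state=42)

xgb.fit(Xc_train, yc_train)
gbc.fit(Xc_train, yc_train)

xgb_acc = accuracy_score(yc_test, xgb.predict(Xc_test))
gbc_acc = accuracy_score(yc_test, gbc.predict(Xc_test))

print("XGBoost Accuracy:", xgb_acc)
print("Gradient Boosting Accuracy:", gbc_acc)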
```

---

### 6. **Train a CatBoost Classifier and evaluate using F1-Score**

```python
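from catboost import CatBoostClassifier
from sklearn.metrics import f1_score

# Binary classification data (F1 needs class labels, not regression targets)
Xc, yc = make_classification(n_samples=1000, n_features=20, random_state=42)
Xc_train, Xc_test, yc_train, yc_test = train_test_split(Xc, yc, test_size=0.3, random_state=42)

model = CatBoostClassifier(verbose=0, random_seed=42)
model.fit(Xc_train, yc_train)
y_pred = model.predict(Xc_test)

print("F1 Score:", f1_score(yc_test, y_pred))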
```

---

### 7. **Train an XGBoost Regressor and evaluate using MSE**

```python
from xgboost import XGBRegressor
from sklearn.metrics import mean_squared_error

model = XGBRegressor()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

print("MSE:", mean_squared_error(y_test, y_pred))
```

---

### 8. **Train an AdaBoost Classifier and visualize feature importance**

```python
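import matplotlib.pyplot as plt

# Classification data; the bar count must match the features the model was trained on
Xc, yc = make_classification(n_samples=1000, n_features=20, random_state=42)
Xc_train, Xc_test, yc_train, yc_test = train_test_split(Xc, yc, test_size=0.3, random_state=42)

model = AdaBoostClassifier(random_state=42)
model.fit(Xc_train, yc_train)

plt.bar(range(Xc_train.shape[1]), model.feature_importances_)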
plt.xlabel("Feature Index")
plt.ylabel("Importance")
plt.title("AdaBoost Feature Importance")
plt.show()
```

---

### 9. **Train a Gradient Boosting Regressor and plot learning curves**

```python
import matplotlib.pyplot as plt

model = GradientBoostingRegressor(n_estimators=200)
model.fit(X_train, y_train)

errors = [mean_squared_error(y_test, y_pred) for y_pred in model.staged_predict(X_test)]

plt.plot(errors)
plt.xlabel("n_estimators")
plt.ylabel("Test MSE")
plt.title("Gradient Boosting Learning Curve")
plt.show()
```

---

### 10. **Train an XGBoost Classifier and visualize feature importance**

```python
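from xgboost import plot_importance

# Classification data for the classifier
Xc, yc = make_classification(n_samples=1000, n_features=20, random_state=42)
Xc_train, Xc_test, yc_train, yc_test = train_test_split(Xc, yc, test_size=0.3, random_state=42)

# use_label_encoder is deprecated in recent XGBoost releases, so it is omitted
model = XGBClassifier(eval_metric='logloss', random_state=42)
model.fit(Xc_train, yc_train)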

plot_importance(model)
plt.show()
```

---

### 11. **Train a CatBoost Classifier and plot the confusion matrix**

```python
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

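# Classification data for the classifier
Xc, yc = make_classification(n_samples=1000, n_features=20, random_state=42)
Xc_train, Xc_test, yc_train, yc_test = train_test_split(Xc, yc, test_size=0.3, random_state=42)

model = CatBoostClassifier(verbose=0, random_seed=42)
model.fit(Xc_train, yc_train)
y_pred = model.predict(Xc_test)

cm = confusion_matrix(yc_test, y_pred)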
disp = ConfusionMatrixDisplay(confusion_matrix=cm)
disp.plot()
plt.show()
```

---

### 12. **Train an AdaBoost Classifier with different numbers of estimators and compare accuracy**

```python
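# Classification data shared by every run so the accuracies are comparable
Xc, yc = make_classification(n_samples=1000, n_features=20, random_state=42)
Xc_train, Xc_test, yc_train, yc_test = train_test_split(Xc, yc, test_size=0.3, random_state=42)

for n in [10, 50, 100, 200]:
    model = AdaBoostClassifier(n_estimators=n, random_state=42)
    model.fit(Xc_train, yc_train)
    acc = accuracy_score(yc_test, model.predict(Xc_test))
    print(f"n_estimators={n}: Accuracy={acc:.4f}")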
```

---

### 13. **Train a Gradient Boosting Classifier and visualize the ROC curve**

```python
from sklearn.metrics import roc_curve, auc

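# Binary classification data (ROC curves need class labels and scores)
Xc, yc = make_classification(n_samples=1000, n_features=20, random_state=42)
Xc_train, Xc_test, yc_train, yc_test = train_test_split(Xc, yc, test_size=0.3, random_state=42)

model = GradientBoostingClassifier(random_state=42)
model.fit(Xc_train, yc_train)
y_proba = model.predict_proba(Xc_test)[:, 1]  # probability of the positive class

fpr, tpr, _ = roc_curve(yc_test, y_proba)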
roc_auc = auc(fpr, tpr)

plt.plot(fpr, tpr, label=f'AUC = {roc_auc:.2f}')
plt.plot([0, 1], [0, 1], 'k--')
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("ROC Curve")
plt.legend()
plt.show()
```

---

### 14. **Train an XGBoost Regressor and tune the learning rate using GridSearchCV**

```python
from sklearn.model_selection import GridSearchCV

param_grid = {'learning_rate': [0.01, 0.1, 0.2, 0.3]}
model = XGBRegressor()
grid = GridSearchCV(model, param_grid, scoring='neg_mean_squared_error', cv=3)
grid.fit(X_train, y_train)

print("Best Learning Rate:", grid.best_params_)
print("Best Score (Neg MSE):", grid.best_score_)
```

---

### 15. **Train a CatBoost Classifier on an imbalanced dataset and compare performance with class weighting**

```python
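# Imbalanced binary data: roughly 90% class 0 and 10% class 1
X_imb, y_imb = make_classification(n_samples=2000, n_features=20, weights=[0.9, 0.1], random_state=42)
Xi_train, Xi_test, yi_train, yi_test = train_test_split(X_imb, y_imb, test_size=0.3, stratify=y_imb, random_state=42)

# Same model without and with extra weight on the minority class
for name, weights in [("no class weights", None), ("class_weights=[1, 5]", [1, 5])]:
    model = CatBoostClassifier(class_weights=weights, verbose=0, random_seed=42)
    model.fit(Xi_train, yi_train)
    print(f"F1 Score ({name}):", f1_score(yi_test, model.predict(Xi_test)))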
```

---

### 16. **Train an AdaBoost Classifier and analyze the effect of different learning rates**

```python
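# Classification data shared by every run so the accuracies are comparable
Xc, yc = make_classification(n_samples=1000, n_features=20, random_state=42)
Xc_train, Xc_test, yc_train, yc_test = train_test_split(Xc, yc, test_size=0.3, random_state=42)

for lr in [0.01, 0.1, 0.5, 1.0]:
    model = AdaBoostClassifier(learning_rate=lr, random_state=42)
    model.fit(Xc_train, yc_train)
    acc = accuracy_score(yc_test, model.predict(Xc_test))
    print(f"Learning Rate={lr}: Accuracy={acc:.4f}")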
```

---

### 17. **Train an XGBoost Classifier for multi-class classification and evaluate using log-loss**

```python
from sklearn.datasets import load_digits
from sklearn.metrics import log_loss

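digits = load_digits()
X, y = digits.data, digits.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, stratify=y, random_state=42)

# The number of classes is inferred from y; use_label_encoder is deprecated and omitted
model = XGBClassifier(objective='multi:softprob', eval_metric='mlogloss', random_state=42)
model.fit(X_train, y_train)
y_proba = model.predict_proba(X_test)  # one probability column per class

# Passing labels keeps the probability columns aligned with the class order
print("Log Loss:", log_loss(y_test, y_proba, labels=model.classes_))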
```
