## AdaBoost Classifier / Regressor

In [None]:
from sklearn.ensemble import AdaBoostClassifier, AdaBoostRegressor
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, mean_squared_error

# Load dataset (using iris for classification)
X, y = load_iris(return_X_y=True)

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# AdaBoost Classifier (For classification problem)
base_model = DecisionTreeClassifier(max_depth=1)
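# Note: the keyword is `estimator` in scikit-learn >= 1.2 (`base_estimator` in older releases)
ada_boost_model = AdaBoostClassifier(estimator=base_model, n_estimators=50, learning_rate=1.0, random_state=42)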
ada_boost_model.fit(X_train, y_train)

# Predict and evaluate
y_pred = ada_boost_model.predict(X_test)
print("AdaBoost Accuracy:", accuracy_score(y_test, y_pred))


For regression, replace AdaBoostClassifier with AdaBoostRegressor and use a regression dataset.
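
As a minimal sketch of the regression variant: the diabetes dataset, the tree depth of 3, and the variable names below are illustrative assumptions, not choices made in this notebook.

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import AdaBoostRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Regression dataset picked for illustration (an assumption)
X_reg, y_reg = load_diabetes(return_X_y=True)
X_reg_train, X_reg_test, y_reg_train, y_reg_test = train_test_split(
    X_reg, y_reg, test_size=0.2, random_state=42
)

# The weak learner is passed positionally, so the call does not depend on
# whether the keyword is spelled `estimator` or `base_estimator`
ada_reg = AdaBoostRegressor(
    DecisionTreeRegressor(max_depth=3),
    n_estimators=50,
    learning_rate=1.0,
    random_state=42,
)
ada_reg.fit(X_reg_train, y_reg_train)

# Predict and evaluate with mean squared error
y_reg_pred = ada_reg.predict(X_reg_test)
ada_reg_mse = mean_squared_error(y_reg_test, y_reg_pred)
print("AdaBoost Regressor MSE:", ada_reg_mse)
```

Mean squared error replaces accuracy here because the target is continuous.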

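**Tuning AdaBoost**

n_estimators: Maximum number of boosting rounds. More rounds usually improve the fit up to a point, at the cost of longer training; boosting stops early if a round fits the training data perfectly.

learning_rate: Weight applied to each weak learner's contribution. Lower values make each round count for less, which often generalizes better but usually needs more estimators, so tune it together with n_estimators.

base_estimator: The weak learner, typically a DecisionTreeClassifier with limited depth (max_depth=1, a decision stump, is the default). In scikit-learn 1.2 and later this parameter is named estimator.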
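
One way to see how learning_rate and n_estimators interact is to track test accuracy after each boosting round. The sketch below uses `staged_score` for that; the three learning rates and the 100-round budget are arbitrary values chosen for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Same data and split as the classification example above
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

staged_results = {}
for lr in (0.1, 0.5, 1.0):  # illustrative values only
    model = AdaBoostClassifier(
        DecisionTreeClassifier(max_depth=1),  # decision stump as the weak learner
        n_estimators=100,
        learning_rate=lr,
        random_state=42,
    )
    model.fit(X_train, y_train)
    # staged_score yields the test accuracy after each boosting round
    staged = list(model.staged_score(X_test, y_test))
    staged_results[lr] = staged
    print(f"learning_rate={lr}: rounds={len(staged)}, final accuracy={staged[-1]:.3f}")
```

On a dataset as small as iris the differences may be slight; the pattern is easier to see on larger or noisier data.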

## Gradient Boosting Classifier / Regressor

In [None]:
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
from sklearn.metrics import accuracy_score, mean_squared_error

# Gradient Boosting Classifier (For classification)
gb_model = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, max_depth=3, random_state=42)
gb_model.fit(X_train, y_train)

# Predict and evaluate
y_pred = gb_model.predict(X_test)
print("Gradient Boosting Accuracy:", accuracy_score(y_test, y_pred))


For regression, replace GradientBoostingClassifier with GradientBoostingRegressor and use a regression dataset.
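
A minimal sketch of that swap, assuming the diabetes dataset as the regression data (the dataset and variable names are illustrative, not part of this notebook):

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Regression dataset picked for illustration (an assumption)
X_reg, y_reg = load_diabetes(return_X_y=True)
X_reg_train, X_reg_test, y_reg_train, y_reg_test = train_test_split(
    X_reg, y_reg, test_size=0.2, random_state=42
)

# Same hyperparameters as the classifier example, applied to the regressor
gb_reg = GradientBoostingRegressor(
    n_estimators=100, learning_rate=0.1, max_depth=3, random_state=42
)
gb_reg.fit(X_reg_train, y_reg_train)

# Predict and evaluate with mean squared error
gb_reg_pred = gb_reg.predict(X_reg_test)
gb_reg_mse = mean_squared_error(y_reg_test, gb_reg_pred)
print("Gradient Boosting Regressor MSE:", gb_reg_mse)
```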

**Tuning Gradient Boosting**

n_estimators: Number of boosting stages. More stages usually help up to a point, at the cost of training time; too many can overfit when the learning rate is high.

learning_rate: Shrinks the contribution of each tree. Lower values need more estimators to reach the same fit but often generalize better.

max_depth: Maximum depth of each tree. Deeper trees capture more feature interactions but overfit more easily.

subsample: Fraction of training samples used to fit each tree. Values below 1.0 give stochastic gradient boosting, which reduces variance at the cost of some bias.

min_samples_split and min_samples_leaf: Minimum number of samples required to split an internal node and to remain in a leaf node, respectively. Larger values produce simpler trees.
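
To illustrate how these settings can be compared, the sketch below cross-validates a default model against a more constrained one. The "regularized" values are arbitrary examples rather than recommendations, and `configs` and `cv_means` are hypothetical names.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# Two hypothetical configurations: library defaults vs. a more constrained model
configs = {
    "default": {},
    "regularized": {
        "subsample": 0.8,        # each tree sees 80% of the training rows
        "max_depth": 2,          # shallower trees
        "min_samples_split": 4,  # need at least 4 samples to split a node
        "min_samples_leaf": 2,   # every leaf keeps at least 2 samples
    },
}

cv_means = {}
for name, kwargs in configs.items():
    model = GradientBoostingClassifier(
        n_estimators=100, learning_rate=0.1, random_state=42, **kwargs
    )
    # 5-fold cross-validated accuracy on the full dataset
    scores = cross_val_score(model, X, y, cv=5)
    cv_means[name] = scores.mean()
    print(f"{name}: mean CV accuracy = {cv_means[name]:.3f}")
```

Which configuration scores higher depends on the data, so treat the comparison as a template rather than a result.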

## XGBoost Classifier / Regressor

In [None]:
import xgboost as xgb
from sklearn.metrics import accuracy_score, mean_squared_error

# Prepare the data
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)

# XGBoost model (For classification)
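params = {
    'objective': 'multi:softmax',  # predict() returns class labels directly
    'num_class': 3,  # Number of classes in iris
    'max_depth': 3,
    'learning_rate': 0.1  # alias for eta
}

# With the native API, the number of trees is set by num_boost_round, not n_estimators
xgb_model = xgb.train(params, dtrain, num_boost_round=100)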

# Predict and evaluate (multi:softmax returns labels as floats, so cast to int)
y_pred = xgb_model.predict(dtest).astype(int)
print("XGBoost Accuracy:", accuracy_score(y_test, y_pred))


For regression, set the objective to 'reg:squarederror' in params, remove num_class, and use a regression dataset.

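**Tuning XGBoost**

n_estimators: Number of boosting rounds in the scikit-learn wrapper (XGBClassifier, XGBRegressor). With xgb.train, pass num_boost_round instead.

learning_rate: Step-size shrinkage applied after each boosting round (alias eta). Lower values make boosting more conservative.

max_depth: Depth of trees (higher values can lead to overfitting).

subsample: Fraction of training rows sampled for each tree (values between 0.5 and 1 are a common starting range).

colsample_bytree: Fraction of features sampled for each tree (values below 1 can reduce overfitting).

gamma: Minimum loss reduction required to make a further partition (higher values make the algorithm more conservative).

min_child_weight: Minimum sum of instance weights (Hessian) required in a child node; higher values make the algorithm more conservative. L2 regularization on leaf weights is controlled separately by lambda (reg_lambda in the scikit-learn wrapper).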

## Hyperparameter Tuning for All Models
For each model, we can use GridSearchCV to find the best hyperparameters. Here's an example for Gradient Boosting:

In [None]:
from sklearn.model_selection import GridSearchCV

# Gradient Boosting Grid Search
param_grid = {
    'n_estimators': [50, 100, 200],
    'learning_rate': [0.01, 0.1, 0.2],
    'max_depth': [3, 4, 5]
}

# Fix random_state so repeated searches give the same result
grid_search = GridSearchCV(GradientBoostingClassifier(random_state=42), param_grid, cv=5, n_jobs=-1)
grid_search.fit(X_train, y_train)

print("Best parameters:", grid_search.best_params_)
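
The same pattern carries over to the other models. As a sketch, here is a grid search over AdaBoost that also reports the cross-validated score and the held-out test accuracy; the grid values and the names `ada_grid` and `ada_search` are illustrative assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Small illustrative grid: 3 x 3 combinations, 5 folds each
ada_grid = {
    'n_estimators': [25, 50, 100],
    'learning_rate': [0.1, 0.5, 1.0],
}

ada_search = GridSearchCV(
    AdaBoostClassifier(DecisionTreeClassifier(max_depth=1), random_state=42),
    ada_grid,
    cv=5,
)
ada_search.fit(X_train, y_train)

print("Best parameters:", ada_search.best_params_)
print("Best CV accuracy:", ada_search.best_score_)
# With the default refit=True, score() uses the best model refit on all training data
ada_test_acc = ada_search.score(X_test, y_test)
print("Test accuracy:", ada_test_acc)
```

Reporting the test accuracy separately matters because best_score_ is computed on the folds used for model selection and tends to be optimistic.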


## Summary of Key Hyperparameters:

| Model | Hyperparameters | Tuning Focus |
|---|---|---|
| **AdaBoost** | n_estimators, learning_rate, estimator (base_estimator before scikit-learn 1.2) | Control boosting rounds and weak learners |
| **Gradient Boosting** | n_estimators, learning_rate, max_depth, subsample, min_samples_split | Control number of trees, learning rate, and depth of trees |
| **XGBoost** | n_estimators, learning_rate, max_depth, subsample, colsample_bytree, gamma | Regularization, depth, and feature subsampling |
