## Bagging Classifier

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Load dataset
data = load_iris()
X = data.data
y = data.target

# Split dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Define base model
base_model = DecisionTreeClassifier()

# Create Bagging Classifier
bagging_clf = BaggingClassifier(base_model, n_estimators=50)

# Train model
bagging_clf.fit(X_train, y_train)

# Predict
y_pred = bagging_clf.predict(X_test)

# Evaluate
print("Accuracy:", accuracy_score(y_test, y_pred))
```

```
Accuracy: 1.0
```


## Bagging Regressor

```python
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

# Load dataset
data = fetch_california_housing()
X = data.data
y = data.target

# Split dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Define base model
base_model = DecisionTreeRegressor()

# Create Bagging Regressor
bagging_reg = BaggingRegressor(base_model, n_estimators=50, random_state=42)

# Train model
bagging_reg.fit(X_train, y_train)

# Predict
y_pred = bagging_reg.predict(X_test)

# Evaluate
print("MSE:", mean_squared_error(y_test, y_pred))
```

```
MSE: 0.2579153056796594
```


In the classifier snippet, `BaggingClassifier` is created with `n_estimators=50`, so the ensemble consists of 50 base models.

Here’s a breakdown:

- **`BaggingClassifier(base_model, n_estimators=50)`**: This line creates a bagging ensemble of 50 decision trees. Each tree is trained on its own bootstrap sample of the training data (rows drawn with replacement). To predict, the ensemble averages the trees' predicted class probabilities and picks the most likely class, falling back to majority voting if the base model has no `predict_proba`. `BaggingRegressor` works the same way but averages the trees' numeric predictions.
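
The count of base models and their bootstrap samples can be checked on the fitted ensemble. The sketch below is a minimal, self-contained illustration that reuses the iris setup from the first cell; `random_state=0` is an assumption added here only to make the run repeatable, and the variable names are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# random_state=0 is an assumption for repeatability, not part of the original cell
bagging_clf = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50, random_state=0)
bagging_clf.fit(X_train, y_train)

# estimators_ holds one fitted base model per n_estimators
n_models = len(bagging_clf.estimators_)
print("Fitted base models:", n_models)

# estimators_samples_ lists the training rows drawn for each base model
first, second = bagging_clf.estimators_samples_[:2]
n_distinct_first = len(np.unique(first))
print("Distinct training rows seen by the first model:", n_distinct_first)
print("First two models drew identical samples:", np.array_equal(first, second))
```

Because rows are drawn with replacement, each base model typically sees only part of the training set, which is what makes the 50 trees differ from one another.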

## Voting Classifier with Multiple Algorithms

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import VotingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load dataset
data = load_iris()
X = data.data
y = data.target

# Split dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Define base models
model1 = DecisionTreeClassifier()
model2 = LogisticRegression(max_iter=1000)
model3 = SVC(probability=True)

# Create a Voting Classifier
voting_clf = VotingClassifier(estimators=[
    ('dt', model1),
    ('lr', model2),
    ('svc', model3)
], voting='soft')

# Train model
voting_clf.fit(X_train, y_train)

# Predict
y_pred = voting_clf.predict(X_test)

# Evaluate
print("Accuracy:", accuracy_score(y_test, y_pred))
```

```
Accuracy: 1.0
```
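
To make `voting='soft'` concrete, the following sketch averages the members' class probabilities by hand and compares the result with `predict()`. It is a minimal illustration under the same iris setup; the `random_state=0` arguments and the variable names are assumptions added here for repeatability. Unlike bagging, every member is fitted on the full training set, with no bootstrap resampling.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# random_state=0 is an assumption for repeatability
voting_clf = VotingClassifier(estimators=[
    ('dt', DecisionTreeClassifier(random_state=0)),
    ('lr', LogisticRegression(max_iter=1000)),
    ('svc', SVC(probability=True, random_state=0)),
], voting='soft')
voting_clf.fit(X_train, y_train)

# Soft voting by hand: average each member's predict_proba, then take the argmax
member_probas = [est.predict_proba(X_test) for est in voting_clf.estimators_]
avg_proba = np.mean(member_probas, axis=0)
manual_pred = voting_clf.classes_[np.argmax(avg_proba, axis=1)]

matches = np.array_equal(manual_pred, voting_clf.predict(X_test))
print("Manual soft vote matches predict():", matches)
```

With `voting='hard'`, the ensemble would instead take a majority vote over each member's predicted label.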


## Voting Regressor with Multiple Algorithms

```python
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.ensemble import VotingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

# Load dataset
data = fetch_california_housing()
X = data.data
y = data.target

# Split dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Define base models
model1 = DecisionTreeRegressor()
model2 = LinearRegression()
model3 = SVR()

# Create a Voting Regressor
voting_reg = VotingRegressor(estimators=[
    ('dt', model1),
    ('lr', model2),
    ('svr', model3)
])

# Train model
voting_reg.fit(X_train, y_train)

# Predict
y_pred = voting_reg.predict(X_test)

# Evaluate
print("MSE:", mean_squared_error(y_test, y_pred))
```

```
MSE: 0.4882859624408465
```


## Hyperparameter Tuning

### Grid Search with `GridSearchCV`

Use `GridSearchCV` to tune hyperparameters for both classifiers and regressors.

#### For Classification

```python
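from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Reload the classification data and base model: the regression cells above
# overwrote X_train, y_train, and base_model
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
base_model = DecisionTreeClassifier()

# Define parameter grid
param_grid = {
    'n_estimators': [50, 100],  # number of base models in the ensemble
    'max_samples': [0.8, 1.0]   # fraction of training rows drawn for each base model
}

# Create GridSearchCV (5-fold cross-validation over every grid combination)
grid_search = GridSearchCV(BaggingClassifier(base_model), param_grid, cv=5)

# Train model
grid_search.fit(X_train, y_train)

# Best parameters and score (mean cross-validated accuracy)
print("Best Parameters:", grid_search.best_params_)
print("Best Score:", grid_search.best_score_)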
```
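
After the search, the tuned ensemble can be scored on the held-out test split, since `best_score_` only reflects cross-validation on the training data. The sketch below is a minimal, self-contained illustration; `random_state=42` on the ensemble and the variable names `best_model` and `test_accuracy` are assumptions added here.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

param_grid = {'n_estimators': [50, 100], 'max_samples': [0.8, 1.0]}
grid_search = GridSearchCV(
    BaggingClassifier(DecisionTreeClassifier(), random_state=42),  # seed is an assumption
    param_grid,
    cv=5,
)
grid_search.fit(X_train, y_train)

# With the default refit=True, best_estimator_ is the best configuration
# retrained on all of X_train
best_model = grid_search.best_estimator_
test_accuracy = accuracy_score(y_test, best_model.predict(X_test))
print("Best Parameters:", grid_search.best_params_)
print("Test accuracy of tuned model:", test_accuracy)
```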

#### For Regression

```python
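from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import BaggingRegressor
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Reload the regression data and base model so this block runs on its own
X, y = fetch_california_housing(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
base_model = DecisionTreeRegressor()

# Define parameter grid
param_grid = {
    'n_estimators': [50, 100],  # number of base models in the ensemble
    'max_samples': [0.8, 1.0]   # fraction of training rows drawn for each base model
}

# Create GridSearchCV (5-fold cross-validation over every grid combination)
grid_search = GridSearchCV(BaggingRegressor(base_model), param_grid, cv=5)

# Train model
grid_search.fit(X_train, y_train)

# Best parameters and score (mean cross-validated R^2, the regressor default, not MSE)
print("Best Parameters:", grid_search.best_params_)
print("Best Score:", grid_search.best_score_)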
```
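
By default, `best_score_` for a regressor is the mean cross-validated R^2, which is not comparable to the MSE values printed earlier. One way to get an MSE-based score is to pass `scoring='neg_mean_squared_error'` and flip the sign. The sketch below assumes the bundled `load_diabetes` dataset in place of California housing, purely so that it runs quickly and without a download; `random_state=42` on the ensemble and the name `cv_mse` are also assumptions.

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import BaggingRegressor
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Assumption: a small bundled dataset stands in for California housing
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

param_grid = {'n_estimators': [50, 100], 'max_samples': [0.8, 1.0]}
grid_search = GridSearchCV(
    BaggingRegressor(DecisionTreeRegressor(), random_state=42),
    param_grid,
    cv=5,
    scoring='neg_mean_squared_error',  # scikit-learn maximizes scores, so MSE is negated
)
grid_search.fit(X_train, y_train)

# Flip the sign to recover the mean cross-validated MSE of the best configuration
cv_mse = -grid_search.best_score_
print("Best Parameters:", grid_search.best_params_)
print("Cross-validated MSE:", cv_mse)
```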

#### Prepared By,
Ahamed Basith