# Bagging
Bagging, short for bootstrap aggregating, is an ensemble method in which several base models are each trained on a different bag, so that each model sees a different random sample of the dataset. A bag is a sample drawn from the dataset with replacement, usually containing as many observations as the original dataset, so some observations appear more than once and others are left out. The final output is formed by combining the outputs of all base models.
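
To make the idea of a bag concrete, here is a minimal sketch that draws one bag from a stand-in dataset. The dataset size `n`, the seed, and the variable names are assumptions chosen for illustration only; they do not come from the example later in this section.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility

n = 10_000                      # size of a hypothetical training set
indices = np.arange(n)          # stand-in for the row indices of that set

# One bag: draw n row indices uniformly at random, with replacement
bag = rng.choice(indices, size=n, replace=True)

# Because of replacement, some rows repeat and others are left out
unique_fraction = np.unique(bag).size / n
print(f"Bag size: {bag.size}")
print(f"Fraction of distinct rows in the bag: {unique_fraction:.3f}")
```

For large `n`, the expected fraction of distinct rows in a bag approaches 1 - 1/e, roughly 63%, which is why each base model ends up seeing a noticeably different sample.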

## Algorithm:
1. Create multiple datasets from the training dataset by sampling observations with replacement

2. Run a base model on each of the created datasets independently

3. Combine the predictions of all the base models to reach the final output
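
The three steps above can be sketched by hand as follows. This is an illustrative sketch, not how any particular library implements bagging: the synthetic data from `make_regression`, the decision-tree base model, the value of `n_bags`, and the seeds are all assumptions made so the example runs without downloads.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic data, used here only so the sketch is self-contained
X, y = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

rng = np.random.default_rng(0)
n_bags = 10                    # hypothetical number of base models
n_train = X_train.shape[0]

models = []
for _ in range(n_bags):
    # Step 1: bootstrap sample of row indices, drawn with replacement
    idx = rng.integers(0, n_train, size=n_train)
    # Step 2: fit an independent base model on that bag
    tree = DecisionTreeRegressor(random_state=0)
    tree.fit(X_train[idx], y_train[idx])
    models.append(tree)

# Step 3: average the base models' predictions (regression case)
all_preds = np.stack([m.predict(X_test) for m in models])  # (n_bags, n_test)
bagged_pred = all_preds.mean(axis=0)

individual_mse = [mean_squared_error(y_test, p) for p in all_preds]
bagged_mse = mean_squared_error(y_test, bagged_pred)
print(f"Mean MSE of individual base models: {np.mean(individual_mse):.2f}")
print(f"MSE of the bagged prediction:       {bagged_mse:.2f}")
```

Averaging is the usual way to combine predictions for regression; for classification, the base models' predictions are typically combined by majority vote instead. Because squared error is convex, the MSE of the averaged prediction can never exceed the mean MSE of the individual base models.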




```python
# Importing necessary libraries
import pandas as pd
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import xgboost as xgb
from sklearn.ensemble import BaggingRegressor

# Loading the built-in California housing dataset
data = fetch_california_housing()
df = pd.DataFrame(data.data, columns=data.feature_names)  # Creating DataFrame
df["target"] = data.target  # Adding target column

# Splitting features and target variable
X = df.drop(columns=["target"])
y = df["target"]

# Splitting dataset into training and test sets (80% / 20%)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)

# Initializing the Bagging model with XGBoost as the base estimator
model = BaggingRegressor(estimator=xgb.XGBRegressor(), n_estimators=10, random_state=42)

# Training the model: each of the 10 base estimators is fit on its own bootstrap sample
model.fit(X_train, y_train)

# Predicting on test data (the base estimators' predictions are averaged)
pred = model.predict(X_test)

# Evaluating model performance using Mean Squared Error (MSE)
mse = mean_squared_error(y_test, pred)
print(f"Mean Squared Error: {mse:.4f}")
```

```
Mean Squared Error: 0.2014
```
