# XGBoost

1. XGBoost stands for eXtreme Gradient Boosting.
2. It is a popular machine learning library known for its efficiency and strong performance on classification and regression tasks, particularly with tabular data.
3. It is an implementation of gradient boosting, an ensemble learning technique that combines many weak models (usually decision trees) into a single strong model.

### The intuition behind XGBoost

1. Start with a Weak Learner

- Suppose the task is to predict house prices. XGBoost starts with a very simple model, which can be thought of as a decision tree with a single node. In this simplified picture, the initial model predicts the average price of all houses in the dataset.

2. Calculate Errors

- Then, XGBoost calculates the errors between the predicted prices and the actual prices of the houses.

3. Build a Tree to Correct Errors

- Now, XGBoost builds a new decision tree to correct these errors. It tries to find patterns in the data that the initial tree missed. This tree is fitted on the errors (residuals) from the previous predictions.

4. Update Predictions

- The predictions from the new tree, typically scaled down by a learning rate, are added to the previous predictions. The combined predictions become the model's current output and are the basis for the next round of error calculation.

5. Repeat the Process

- Steps 2-4 are repeated iteratively. Each new tree is built to correct the errors of the combined predictions of all previous trees.

6. Final Prediction

- Eventually, XGBoost combines the predictions from all the trees to make a final prediction. The final prediction is the sum of the predictions from all the trees.
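The six steps above can be sketched in plain Python. This is a minimal illustration of residual-based boosting with one-split trees (stumps) and squared error, not XGBoost's actual implementation, which also uses regularization and second-order information. The house sizes and prices are made-up values, and `fit_stump`, `boost`, and the `learning_rate` of 0.3 are hypothetical choices for this example.

```python
# Toy data (made up for illustration): house size in square meters -> price in thousands.
sizes = [50, 60, 80, 100, 120, 150]
prices = [150, 180, 240, 310, 350, 450]


def fit_stump(x, residuals):
    """Fit a one-split tree: pick the threshold on x that minimizes squared error."""
    best = None
    xs = sorted(set(x))
    for lo, hi in zip(xs, xs[1:]):
        threshold = (lo + hi) / 2
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda xi: left_mean if xi <= threshold else right_mean


def boost(x, y, n_trees=50, learning_rate=0.3):
    base = sum(y) / len(y)                       # Step 1: predict the average
    preds = [base] * len(y)
    trees, mse_history = [], []
    for _ in range(n_trees):                     # Step 5: repeat steps 2-4
        residuals = [yi - pi for yi, pi in zip(y, preds)]   # Step 2: errors
        tree = fit_stump(x, residuals)                      # Step 3: tree on errors
        preds = [pi + learning_rate * tree(xi)              # Step 4: update
                 for xi, pi in zip(x, preds)]
        trees.append(tree)
        mse_history.append(sum((yi - pi) ** 2 for yi, pi in zip(y, preds)) / len(y))

    def predict(xi):
        # Step 6: final prediction = initial guess + scaled sum of all trees
        return base + learning_rate * sum(t(xi) for t in trees)

    return predict, mse_history


predict, mse_history = boost(sizes, prices)
print("MSE after first tree:", round(mse_history[0], 2))
print("MSE after last tree: ", round(mse_history[-1], 2))
print("Predicted price for an 80 m^2 house:", round(predict(80), 1))
```

The training error shrinks as trees are added because each stump is fit to whatever the previous trees still get wrong.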



##### The "gradient" in XGBoost refers to the gradient of the loss function with respect to the model's predictions. Each new tree is fit using these gradients; for squared-error loss, the negative gradient is exactly the residual from step 2. By iteratively fitting new trees this way, XGBoost reduces the training loss with each iteration.
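As a short worked example of why residuals and gradients coincide, consider squared-error loss. The symbols here are notation introduced for this sketch: $y$ is the true value, $F_{m-1}(x)$ is the combined prediction after $m-1$ trees, $h_m$ is the new tree, and $\eta$ is the learning rate.

$$
L(y, F) = \tfrac{1}{2}\,(y - F)^2
\qquad\Longrightarrow\qquad
-\frac{\partial L}{\partial F}\bigg|_{F = F_{m-1}(x)} = y - F_{m-1}(x)
$$

$$
F_m(x) = F_{m-1}(x) + \eta\, h_m(x)
$$

The negative gradient is the residual, so fitting $h_m$ to the residuals is a gradient descent step in prediction space. XGBoost itself goes further than this simplified picture: it also uses second-order derivatives of the loss and a regularization term when building each tree.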

##### In summary, XGBoost builds an ensemble of weak learners (decision trees) in a sequential manner, where each new tree corrects the errors of the combined predictions of all previous trees. This iterative process allows XGBoost to create a powerful predictive model.

In [1]:
%pip install xgboost

Note: you may need to restart the kernel to use updated packages.


In [2]:
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier
from sklearn.metrics import accuracy_score

In [3]:
# Load the Breast Cancer dataset
data = load_breast_cancer()
X = data.data
y = data.target

In [4]:
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [5]:
# Initialize XGBoost classifier
xgb_model = XGBClassifier()

In [6]:
# Train the model
xgb_model.fit(X_train, y_train)

XGBClassifier(base_score=None, booster=None, callbacks=None,
              colsample_bylevel=None, colsample_bynode=None,
              colsample_bytree=None, device=None, early_stopping_rounds=None,
              enable_categorical=False, eval_metric=None, feature_types=None,
              gamma=None, grow_policy=None, importance_type=None,
              interaction_constraints=None, learning_rate=None, max_bin=None,
              max_cat_threshold=None, max_cat_to_onehot=None,
              max_delta_step=None, max_depth=None, max_leaves=None,
              min_child_weight=None, missing=nan, monotone_constraints=None,
              multi_strategy=None, n_estimators=None, n_jobs=None,
              num_parallel_tree=None, random_state=None, ...)

In [7]:
# Make predictions on the test set
y_pred = xgb_model.predict(X_test)

In [8]:
# Evaluate accuracy
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

Accuracy: 0.956140350877193
