## XGBoost (extreme gradient boosting)

### Gradient Boosting

Gradient boosting goes through cycles to iteratively add models to an ensemble.<br>
&nbsp; It begins by initializing the ensemble with a single model.<br>
&nbsp; Then the cycle starts:<br>
* Use the current ensemble to generate predictions for each observation in the dataset. To make a prediction, we add up the predictions from all models in the ensemble. These predictions are then used to calculate a loss function (mean squared error, for instance).
* Use the loss function to fit a new model that will be added to the ensemble. Specifically, we determine model parameters so that adding this new model to the ensemble will reduce the loss.
* Add the new model to the ensemble.
* Repeat!
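
The cycle above can be sketched in a few lines of plain Python. This is a minimal illustration and not how XGBoost is implemented: it assumes a squared-error loss, a single numeric feature, and a one-split "stump" as the weak learner. The names `fit_stump`, `predict_stump`, `mse`, and `gradient_boost` are hypothetical helpers made up for this sketch.

```python
# Minimal gradient boosting sketch for squared-error loss (pure Python).
# Assumption: one numeric feature, and each new model is a one-split stump.

def fit_stump(xs, residuals):
    """Return (threshold, left_value, right_value) with the lowest squared error."""
    best = None
    for t in sorted(set(xs))[:-1]:  # candidate split points
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    return best[1:]

def predict_stump(stump, x):
    t, lv, rv = stump
    return lv if x <= t else rv

def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)

def gradient_boost(xs, ys, n_estimators=20, learning_rate=0.5):
    base = sum(ys) / len(ys)              # initial ensemble: a single constant model
    stumps = []
    preds = [base] * len(xs)
    losses = [mse(ys, preds)]
    for _ in range(n_estimators):
        # For squared error, the residuals point in the direction that reduces the loss.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)  # fit a new model to what is still unexplained
        stumps.append(stump)              # add it to the ensemble
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
        losses.append(mse(ys, preds))
    return base, stumps, losses

# Toy data: a step from about 1 to about 3 between x=4 and x=5.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]
base, stumps, losses = gradient_boost(xs, ys)
```

Here `losses` records the training loss after each cycle, so it shows that every added model reduces the loss (or at least does not increase it) on this toy data.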

```python
# n_estimators: number of boosting cycles, i.e. the number of models in the ensemble.
# learning_rate: each model's predictions are multiplied by this factor before being
#   added in, so each tree contributes less. This allows a higher n_estimators
#   without overfitting.
# n_jobs: number of parallel threads; on larger datasets it is common to set this to
#   the number of cores on your machine.
# early_stopping_rounds: stop training once the validation score has not improved
#   for this many consecutive rounds.
# eval_set: the (X, y) pairs used to calculate the validation scores.
# verbose: whether to print the evaluation metric on the validation set each round.
#
# X_train, y_train, X_valid, y_valid are assumed to be defined earlier.
# Note: recent XGBoost releases take early_stopping_rounds in the constructor;
# older releases accepted it as a fit() argument.

from xgboost import XGBRegressor

my_model = XGBRegressor(n_estimators=1000, learning_rate=0.05, n_jobs=4,
                        early_stopping_rounds=5)
my_model.fit(X_train, y_train,
             eval_set=[(X_valid, y_valid)],
             verbose=False)
```
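
To make the meaning of `early_stopping_rounds` concrete, here is a simplified sketch of the stopping rule applied to a list of per-round validation losses. It is an illustration only, not XGBoost's internal code; `early_stopping_round` is a hypothetical helper, the scores are made-up numbers, and lower is assumed to be better.

```python
def early_stopping_round(val_scores, early_stopping_rounds):
    """Return (stop_round, best_round) for per-round validation losses."""
    best_score = float("inf")
    best_round = -1
    for i, score in enumerate(val_scores):
        if score < best_score:
            # New best validation score: remember it and keep training.
            best_score, best_round = score, i
        elif i - best_round >= early_stopping_rounds:
            # No improvement for `early_stopping_rounds` consecutive rounds.
            return i, best_round
    return len(val_scores) - 1, best_round

# Made-up validation losses: the best value is at round 2, then no improvement.
scores = [0.90, 0.70, 0.60, 0.62, 0.61, 0.63, 0.64, 0.65, 0.50]
stop_round, best_round = early_stopping_round(scores, early_stopping_rounds=5)
print(stop_round, best_round)  # 7 2
```

Training stops at round 7, five rounds after the best score at round 2, so the later value of 0.50 is never reached. This is the trade-off behind the parameter: a small value stops quickly but can miss late improvements, while a large value trains longer.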