# Introduction to XGBoost

### What is Boosting & Gradient Boosting?
* Boosting is an ensemble technique where **new models are added to correct the errors made by existing models**. Models are added sequentially until no further improvements can be made.

* Gradient boosting is an approach where each new model is trained to predict the residuals (errors) of the prior models, and the predictions of all models are then added together to make the final prediction.

* It is called gradient boosting because it **uses a gradient descent algorithm** to minimize the loss when adding new models.

* This approach supports both regression and classification predictive modeling problems.
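
The residual-fitting idea in the bullets above can be sketched in a few lines of plain Python. This is a minimal illustration, not how XGBoost is implemented: it assumes a squared-error loss (for which the negative gradient is exactly the residual), a single input feature, and one-split decision stumps as the weak learners. The function names, the toy data, and the values of `n_estimators` and `learning_rate` are all chosen here for illustration.

```python
# Minimal gradient boosting for regression with squared-error loss.
# Weak learner: a one-split decision stump on a single feature (an
# illustrative simplification; XGBoost uses deeper, regularized trees).

def fit_stump(xs, residuals):
    """Return (threshold, left_value, right_value) with the lowest squared error."""
    best = None
    # Every distinct x except the largest is a candidate threshold,
    # so both sides of the split are always non-empty.
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv = sum(left) / len(left)
        rv = sum(right) / len(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    return best[1:]

def predict_stump(stump, x):
    t, lv, rv = stump
    return lv if x <= t else rv

def fit_gradient_boosting(xs, ys, n_estimators=50, learning_rate=0.1):
    """Sequentially add stumps, each fit to the residuals of the current ensemble."""
    base = sum(ys) / len(ys)          # initial model: predict the mean
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_estimators):
        # For squared error, the negative gradient is the residual y - prediction.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Take a small step in the direction that reduces the loss.
        preds = [p + learning_rate * predict_stump(stump, x)
                 for x, p in zip(xs, preds)]
    return base, learning_rate, stumps

def predict(model, x):
    """Final prediction: base value plus the scaled sum of all stump outputs."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)

def mse(predictions, ys):
    return sum((y - p) ** 2 for p, y in zip(predictions, ys)) / len(ys)

# Toy data (hypothetical): y = x^2 on ten points.
xs = [float(i) for i in range(10)]
ys = [x * x for x in xs]

mean_y = sum(ys) / len(ys)
baseline_mse = mse([mean_y] * len(ys), ys)       # error of the mean-only model
model = fit_gradient_boosting(xs, ys)
boosted_mse = mse([predict(model, x) for x in xs], ys)
print(baseline_mse, boosted_mse)
```

Each round fits a stump to what the current ensemble still gets wrong and adds a damped version of it, so the training error shrinks as models are added. A smaller `learning_rate` generally needs more rounds to reach the same fit.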


### Why use XGBoost?
* **Execution Speed**: XGBoost is generally fast, especially when compared to other implementations of gradient boosting.
* **Model Performance**: XGBoost is often among the best-performing methods on structured or tabular datasets for classification and regression predictive modeling problems.
* **Scalability**: Supports distributed training on multiple machines, including AWS, GCE, Azure, and Yarn clusters. It can be integrated with Flink, Spark, and other cloud dataflow systems.
* The XGBoost library implements the gradient boosting decision tree algorithm.

In [22]:
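from sklearn.datasets import load_iris

# Load the iris data set (150 samples, 4 features, 3 classes)
iris = load_iris()
X, y = iris.data, iris.target
# Hold out the last 15 samples for testing. The samples are ordered by class,
# so this test set contains only the third class.
X_train, y_train = X[:135], y[:135]
X_test, y_test = X[135:], y[135:]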

In [24]:
import xgboost as xgb
from sklearn.metrics import accuracy_score

# Train a gradient boosted tree classifier: 300 shallow trees with a small learning rate
clf = xgb.XGBClassifier(max_depth=3, n_estimators=300, learning_rate=0.05)
clf.fit(X_train, y_train)
# Predict on the held-out samples and compute the accuracy
preds = clf.predict(X_test)
accuracy_score(y_test, preds)

0.93333333333333335
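
One caveat about the split used above: `load_iris` returns the samples ordered by class, so the last 15 rows all belong to a single class and the accuracy is measured on that class only. The sketch below shows one possible alternative, a shuffled and stratified split with scikit-learn's `train_test_split`. The variable names with the `_s` suffix, `test_size=15`, and `random_state=0` are illustrative choices, not part of the original notebook.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
X, y = iris.data, iris.target

# Classes present in the last 15 rows (the test set used above)
tail_classes = np.unique(y[135:])

# Shuffled split that keeps the class proportions equal in both parts
X_train_s, X_test_s, y_train_s, y_test_s = train_test_split(
    X, y, test_size=15, stratify=y, random_state=0
)
strat_classes = np.unique(y_test_s)

print(tail_classes, strat_classes)
```

The resulting `X_train_s`, `y_train_s`, `X_test_s`, and `y_test_s` can be passed to `fit`, `predict`, and `accuracy_score` in the same way as the arrays above.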

### References

* https://github.com/dmlc/xgboost/tree/master/demo#machine-learning-challenge-winning-solutions
* https://xgboost.readthedocs.io/en/latest/model.html
* http://machinelearningmastery.com/gentle-introduction-xgboost-applied-machine-learning/
* http://xgboost.readthedocs.io/en/latest/build.html
* http://homes.cs.washington.edu/~tqchen/2016/03/10/story-and-lessons-behind-the-evolution-of-xgboost.html