## XGBoost GPU support
```
git clone --recursive https://github.com/dmlc/xgboost
cd xgboost
mkdir build
cd build
cmake .. -DUSE_CUDA=ON
make -j
```
For details, see: https://xgboost.readthedocs.io/en/latest/build.html#building-with-gpu-support



In [1]:
import sys
sys.path.append('/mnt/wc/xgboost/python-package')
sys.path

['',
 '/mnt/wc/anaconda3/envs/fastai/lib/python36.zip',
 '/mnt/wc/anaconda3/envs/fastai/lib/python3.6',
 '/mnt/wc/anaconda3/envs/fastai/lib/python3.6/lib-dynload',
 '/mnt/wc/anaconda3/envs/fastai/lib/python3.6/site-packages',
 '/mnt/wc/anaconda3/envs/fastai/lib/python3.6/site-packages/IPython/extensions',
 '/root/.ipython',
 '/mnt/wc/xgboost/python-package']

In [6]:
import xgboost as xgb
import pandas as pd
import numpy as np

In [7]:
df = pd.DataFrame(np.random.rand(40000,200))
y = df.iloc[:,-1]
df = df.iloc[:,:-1]

In [8]:
# Without using the GPUs: 
m_cpu = xgb.XGBRegressor()
%time m_cpu.fit(df,y)

CPU times: user 59.3 s, sys: 52 ms, total: 59.3 s
Wall time: 59.4 s


XGBRegressor(base_score=0.5, booster='gbtree', colsample_bylevel=1,
       colsample_bytree=1, gamma=0, learning_rate=0.1, max_delta_step=0,
       max_depth=3, min_child_weight=1, missing=None, n_estimators=100,
       n_jobs=1, nthread=None, objective='reg:linear', random_state=0,
       reg_alpha=0, reg_lambda=1, scale_pos_weight=1, seed=None,
       silent=True, subsample=1)

In [9]:
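# Using the GPUs:
# Note: XGBoost 2.0+ deprecates 'gpu_hist' and 'predictor' in favour of
# tree_method='hist' together with device='cuda'.
gpu_params = {'tree_method':'gpu_hist', 
              'predictor':'gpu_predictor',
              'n_jobs': -1}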

m_gpu = xgb.XGBRegressor(**gpu_params)
%time m_gpu.fit(df,y)

CPU times: user 2.73 s, sys: 580 ms, total: 3.31 s
Wall time: 1.91 s


XGBRegressor(base_score=0.5, booster='gbtree', colsample_bylevel=1,
       colsample_bytree=1, gamma=0, learning_rate=0.1, max_delta_step=0,
       max_depth=3, min_child_weight=1, missing=None, n_estimators=100,
       n_jobs=-1, nthread=None, objective='reg:linear',
       predictor='gpu_predictor', random_state=0, reg_alpha=0,
       reg_lambda=1, scale_pos_weight=1, seed=None, silent=True,
       subsample=1, tree_method='gpu_hist')

In [10]:
m_gpu.score(df,y)

0.037861668371099944
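
For a scikit-learn-style regressor such as `XGBRegressor`, `score` returns the coefficient of determination R². The target here is a column of independent random numbers, so there is no signal to learn, and a value close to 0 is what one would expect even on the training data. A minimal sketch of how R² is computed follows; `r2_score` is a hand-written helper for illustration, not the library function.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


y_true = [0.1, 0.4, 0.5, 0.9]
mean_pred = [sum(y_true) / len(y_true)] * len(y_true)

print(r2_score(y_true, y_true))     # perfect predictions -> 1.0
print(r2_score(y_true, mean_pred))  # always predicting the mean -> 0.0
```

A model that does no better than predicting the mean scores 0, which is why the 0.038 above says more about the random data than about the GPU training itself.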

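## Boosting algorithms
I previously wrote an article on boosting, [Principles of boosting (boosting原理)](https://www.zybuluo.com/zhuanxu/note/970185). It contains a fair amount of mathematical derivation, so readers interested in the underlying math can look there.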

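The first widely used boosting algorithm was AdaBoost. Friedman then generalized AdaBoost into the general gradient boosting framework, which gives Gradient Boosting Machines (GBM): boosting is treated as a numerical optimization problem and solved in a way that resembles gradient descent.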
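
The "gradient descent" view can be sketched in a few lines. The example below assumes squared loss, a single input feature, and depth-1 trees (stumps); the helper names `fit_stump`, `gradient_boost`, and `mse` are hypothetical, and this is not how XGBoost itself is implemented (XGBoost additionally uses second-order gradient information and regularization).

```python
def fit_stump(x, residuals):
    """Find the threshold split on 1-D x that minimises squared error."""
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda xi: left_mean if xi <= threshold else right_mean


def gradient_boost(x, y, n_rounds=50, learning_rate=0.1):
    """Boosting for squared loss: each stump fits the current residuals."""
    base = sum(y) / len(y)  # initial constant prediction
    stumps = []
    pred = [base] * len(y)
    for _ in range(n_rounds):
        # For L = 1/2 * (y - F)^2 the negative gradient w.r.t. F is y - F.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        # Take a small step in the direction of the fitted negative gradient.
        pred = [pi + learning_rate * stump(xi) for xi, pi in zip(x, pred)]
    return lambda xi: base + learning_rate * sum(s(xi) for s in stumps)


def mse(model, x, y):
    return sum((yi - model(xi)) ** 2 for xi, yi in zip(x, y)) / len(y)


x = [i / 10 for i in range(20)]
y = [xi ** 2 for xi in x]
few = gradient_boost(x, y, n_rounds=5)
many = gradient_boost(x, y, n_rounds=50)
print(mse(few, x, y), mse(many, x, y))
```

Each round adds one weak learner that points along the negative gradient of the loss, and `learning_rate` plays the role of the step size, so the training error after 50 rounds is lower than after 5.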

For how boosting and bagging relate, see the figure below:
![](http://static.zybuluo.com/zhuanxu/3lryqzm33fi8i6maz3tmwebc/image_1c9ot34k5ec6okq12fu85i1nji9.png)
Boosting gives rise to GBDT, while bagging gives rise to random forests.
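
To make the bagging side concrete, here is a minimal sketch under simplifying assumptions: one input feature, depth-1 trees, and hypothetical helper names `fit_stump` and `bagging`. Each model is trained independently on a bootstrap resample and the predictions are averaged. A real random forest also samples a subset of features at each split, which is omitted here.

```python
import random


def fit_stump(x, y):
    """Least-squares threshold split on 1-D x; constant if x has one value."""
    thresholds = sorted(set(x))[:-1]
    if not thresholds:
        mean_y = sum(y) / len(y)
        return lambda xi: mean_y
    best = None
    for threshold in thresholds:
        left = [yi for xi, yi in zip(x, y) if xi <= threshold]
        right = [yi for xi, yi in zip(x, y) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((v - left_mean) ** 2 for v in left)
                 + sum((v - right_mean) ** 2 for v in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda xi: left_mean if xi <= threshold else right_mean


def bagging(x, y, n_models=25, seed=0):
    """Train each stump on its own bootstrap sample, then average them."""
    rng = random.Random(seed)
    n = len(x)
    models = []
    for _ in range(n_models):
        # Sample n indices with replacement (a bootstrap resample).
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([x[i] for i in idx], [y[i] for i in idx]))
    return lambda xi: sum(m(xi) for m in models) / len(models)


x = [i / 10 for i in range(20)]
y = [xi ** 2 for xi in x]
model = bagging(x, y)
print(model(0.0), model(1.9))
```

The key contrast with boosting is that the models here never see each other's errors: they can be trained in parallel, and the ensemble reduces variance by averaging, whereas boosting trains sequentially and each new model corrects the current ensemble.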

### A simple example: poisonous mushrooms
See the notebook [1-1 基本模型调用.ipynb](https://www.zybuluo.com/zhuanxu/note/969884) (basic model usage).

In [11]:
# Display figures inline
%matplotlib inline
%config InlineBackend.figure_format = 'retina'