
Commit 2cff735

authored
Update doc for feature constraints and n_gpus. (dmlc#4596)
* Update doc for feature constraints.
* Fix some warnings.
* Clean up doc for `n_gpus`.
1 parent 9fa29ad commit 2cff735

File tree

5 files changed

+173
-83
lines changed


doc/gpu/index.rst

+49-58
@@ -50,22 +50,25 @@ Supported parameters
 +--------------------------------+----------------------------+--------------+
 | ``gpu_id``                     |   |tick|                   |   |tick|     |
 +--------------------------------+----------------------------+--------------+
-| ``n_gpus``                     |   |cross|                  |   |tick|     |
+| ``n_gpus`` (deprecated)        |   |cross|                  |   |tick|     |
 +--------------------------------+----------------------------+--------------+
 | ``predictor``                  |   |tick|                   |   |tick|     |
 +--------------------------------+----------------------------+--------------+
 | ``grow_policy``                |   |cross|                  |   |tick|     |
 +--------------------------------+----------------------------+--------------+
 | ``monotone_constraints``       |   |cross|                  |   |tick|     |
 +--------------------------------+----------------------------+--------------+
+| ``interaction_constraints``    |   |cross|                  |   |tick|     |
++--------------------------------+----------------------------+--------------+
 | ``single_precision_histogram`` |   |cross|                  |   |tick|     |
 +--------------------------------+----------------------------+--------------+

 GPU accelerated prediction is enabled by default for the above mentioned ``tree_method`` parameters but can be switched to CPU prediction by setting ``predictor`` to ``cpu_predictor``. This could be useful if you want to conserve GPU memory. Likewise when using CPU algorithms, GPU accelerated prediction can be enabled by setting ``predictor`` to ``gpu_predictor``.

 The experimental parameter ``single_precision_histogram`` can be set to True to enable building histograms using single precision. This may improve speed, in particular on older architectures.

-The device ordinal can be selected using the ``gpu_id`` parameter, which defaults to 0.
+The device ordinal (which GPU to use if you have many of them) can be selected using the
+``gpu_id`` parameter, which defaults to 0 (the first device reported by the CUDA runtime).


 The GPU algorithms currently work with CLI, Python and R packages. See :doc:`/build` for details.
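As a hedged illustration of the parameters in the table above (not part of this commit): the sketch below assumes an XGBoost build with GPU support, and the training call is left commented out since it needs a CUDA device and a `DMatrix` built elsewhere.

```python
# Minimal parameter sketch combining the GPU options documented above.
params = {
    'tree_method': 'gpu_hist',     # GPU-accelerated histogram algorithm
    'gpu_id': 0,                   # device ordinal, defaults to 0
    'predictor': 'cpu_predictor',  # run prediction on CPU to conserve GPU memory
}
# import xgboost as xgb
# bst = xgb.train(params, dtrain)  # dtrain: an xgb.DMatrix (placeholder)
```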
@@ -80,15 +83,7 @@ The GPU algorithms currently work with CLI, Python and R packages. See :doc:`/bu

 Single Node Multi-GPU
 =====================
-.. note:: Single node multi-GPU training is deprecated. Please use distributed GPU training with one process per GPU.
-
-Multiple GPUs can be used with the ``gpu_hist`` tree method using the ``n_gpus`` parameter. which defaults to 1. If this is set to -1 all available GPUs will be used. If ``gpu_id`` is specified as non-zero, the selected gpu devices will be from ``gpu_id`` to ``gpu_id+n_gpus``, please note that ``gpu_id+n_gpus`` must be less than or equal to the number of available GPUs on your system. As with GPU vs. CPU, multi-GPU will not always be faster than a single GPU due to PCI bus bandwidth that can limit performance.
-
-.. note:: Enabling multi-GPU training
-
-  Default installation may not enable multi-GPU training. To use multiple GPUs, make sure to read :ref:`build_gpu_support`.
-XGBoost supports multi-GPU training on a single machine via specifying the `n_gpus' parameter.
-
+.. note:: Single node multi-GPU training with the ``n_gpus`` parameter is deprecated after 0.90. Please use distributed GPU training with one process per GPU.

 Multi-node Multi-GPU Training
 =============================
@@ -101,66 +96,64 @@ Objective functions
 ===================
 Most of the objective functions implemented in XGBoost can be run on GPU. Following table shows current support status.

-.. |tick| unicode:: U+2714
-.. |cross| unicode:: U+2718
-
-+-----------------+-------------+
-| Objectives      | GPU support |
-+-----------------+-------------+
-| reg:squarederror|   |tick|    |
-+-----------------+-------------+
-| reg:logistic    |   |tick|    |
-+-----------------+-------------+
-| binary:logistic |   |tick|    |
-+-----------------+-------------+
-| binary:logitraw |   |tick|    |
-+-----------------+-------------+
-| binary:hinge    |   |tick|    |
-+-----------------+-------------+
-| count:poisson   |   |tick|    |
-+-----------------+-------------+
-| reg:gamma       |   |tick|    |
-+-----------------+-------------+
-| reg:tweedie     |   |tick|    |
-+-----------------+-------------+
-| multi:softmax   |   |tick|    |
-+-----------------+-------------+
-| multi:softprob  |   |tick|    |
-+-----------------+-------------+
-| survival:cox    |   |cross|   |
-+-----------------+-------------+
-| rank:pairwise   |   |cross|   |
-+-----------------+-------------+
-| rank:ndcg       |   |cross|   |
-+-----------------+-------------+
-| rank:map        |   |cross|   |
-+-----------------+-------------+
-
-For multi-gpu support, objective functions also honor the ``n_gpus`` parameter,
-which, by default is set to 1. To disable running objectives on GPU, just set
-``n_gpus`` to 0.
++--------------------+-------------+
+| Objectives         | GPU support |
++--------------------+-------------+
+| reg:squarederror   |   |tick|    |
++--------------------+-------------+
+| reg:squaredlogerror|   |tick|    |
++--------------------+-------------+
+| reg:logistic       |   |tick|    |
++--------------------+-------------+
+| binary:logistic    |   |tick|    |
++--------------------+-------------+
+| binary:logitraw    |   |tick|    |
++--------------------+-------------+
+| binary:hinge       |   |tick|    |
++--------------------+-------------+
+| count:poisson      |   |tick|    |
++--------------------+-------------+
+| reg:gamma          |   |tick|    |
++--------------------+-------------+
+| reg:tweedie        |   |tick|    |
++--------------------+-------------+
+| multi:softmax      |   |tick|    |
++--------------------+-------------+
+| multi:softprob     |   |tick|    |
++--------------------+-------------+
+| survival:cox       |   |cross|   |
++--------------------+-------------+
+| rank:pairwise      |   |cross|   |
++--------------------+-------------+
+| rank:ndcg          |   |cross|   |
++--------------------+-------------+
+| rank:map           |   |cross|   |
++--------------------+-------------+
+
+Objectives will run on GPU if a GPU updater (``gpu_hist``) is used; otherwise they run on
+CPU by default. For unsupported objectives XGBoost falls back to the CPU implementation.

 Metric functions
 ===================
 Following table shows current support status for evaluation metrics on the GPU.

-.. |tick| unicode:: U+2714
-.. |cross| unicode:: U+2718
-
 +-----------------+-------------+
 | Metric          | GPU Support |
 +=================+=============+
 | rmse            |   |tick|    |
 +-----------------+-------------+
+| rmsle           |   |tick|    |
++-----------------+-------------+
 | mae             |   |tick|    |
 +-----------------+-------------+
 | logloss         |   |tick|    |
 +-----------------+-------------+
 | error           |   |tick|    |
 +-----------------+-------------+
-| merror          |   |cross|   |
+| merror          |   |tick|    |
 +-----------------+-------------+
-| mlogloss        |   |cross|   |
+| mlogloss        |   |tick|    |
 +-----------------+-------------+
 | auc             |   |cross|   |
 +-----------------+-------------+
@@ -181,10 +174,8 @@ Following table shows current support status for evaluation metrics on the GPU.
 | tweedie-nloglik |   |tick|    |
 +-----------------+-------------+

-As for objective functions, metrics honor the ``n_gpus`` parameter,
-which, by default is set to 1. To disable running metrics on GPU, just set
-``n_gpus`` to 0.
-
+Similar to objective functions, the default device for metrics is selected based on the
+tree updater and predictor (which is itself selected based on the tree updater).

 Benchmarks
 ==========

doc/tutorials/feature_interaction_constraint.rst

+104-4
@@ -171,7 +171,107 @@ parameter:
                 num_boost_round = 1000, evals = evallist,
                 early_stopping_rounds = 10)

-**Choice of tree construction algorithm**. To use feature interaction
-constraints, be sure to set the ``tree_method`` parameter to either ``exact``
-or ``hist``. Currently, GPU algorithms (``gpu_hist``, ``gpu_exact``) do not
-support feature interaction constraints.
+**Choice of tree construction algorithm**. To use feature interaction constraints, be sure
+to set the ``tree_method`` parameter to one of the following: ``exact``, ``hist`` or
+``gpu_hist``. Support for ``gpu_hist`` was added after version 0.90 (excluding 0.90 itself).
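A small sketch (not part of this commit) of passing a constraint together with a compatible ``tree_method``; it assumes a GPU-enabled build newer than 0.90, and the string form of the constraint follows the examples earlier in this tutorial. The training call is commented out since it needs a `DMatrix`.

```python
# Sketch: feature interaction constraints with a compatible tree method.
# The constraint allows features {0, 1} and {1, 3, 4} to interact.
params = {
    'tree_method': 'gpu_hist',  # 'exact' or 'hist' also support constraints
    'interaction_constraints': '[[0, 1], [1, 3, 4]]',
}
# import xgboost as xgb
# bst = xgb.train(params, dtrain)  # dtrain: a DMatrix with >= 5 features
```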
+
+**************
+Advanced topic
+**************
+
+The intuition behind interaction constraints is simple. Users have prior knowledge about
+the relations between different features, and encode it as constraints during model
+construction. But there are also some subtleties around specifying constraints. Take
+the constraint ``[[1, 2], [2, 3, 4]]`` as an example: the second feature appears in two
+different interaction sets, ``[1, 2]`` and ``[2, 3, 4]``, so the union set of features
+allowed to interact with ``2`` is ``{1, 3, 4}``. In the following diagram, the root splits
+at feature ``2``. Because all its descendants should be able to interact with it, at the
+second layer all 4 features are legitimate split candidates for further splitting,
+disregarding the specified constraint sets.
+
+.. plot::
+  :nofigs:
+
+  from graphviz import Source
+  source = r"""
+    digraph feature_interaction_illustration4 {
+      graph [fontname = "helvetica"];
+      node [fontname = "helvetica"];
+      edge [fontname = "helvetica"];
+      0 [label=<x<SUB><FONT POINT-SIZE="11">2</FONT></SUB>>, shape=box, color=black, fontcolor=black];
+      1 [label=<x<SUB><FONT POINT-SIZE="11">{1, 2, 3, 4}</FONT></SUB>>, shape=box];
+      2 [label=<x<SUB><FONT POINT-SIZE="11">{1, 2, 3, 4}</FONT></SUB>>, shape=box, color=black, fontcolor=black];
+      3 [label="...", shape=none];
+      4 [label="...", shape=none];
+      5 [label="...", shape=none];
+      6 [label="...", shape=none];
+      0 -> 1;
+      0 -> 2;
+      1 -> 3;
+      1 -> 4;
+      2 -> 5;
+      2 -> 6;
+    }
+  """
+  Source(source, format='png').render('../_static/feature_interaction_illustration4', view=False)
+  Source(source, format='svg').render('../_static/feature_interaction_illustration5', view=False)
+
+.. figure:: ../_static/feature_interaction_illustration4.png
+  :align: center
+  :figwidth: 80 %
+
+  ``{1, 2, 3, 4}`` represents the set of legitimate split features.
+
+This leads to some interesting implications of feature interaction constraints. Take
+``[[0, 1], [0, 1, 2], [1, 2]]`` as another example. Assuming we have only 3 available
+features in our training dataset for presentation purposes, careful readers might have
+noticed that the above constraint is the same as ``[0, 1, 2]``: no matter which feature
+is chosen for the split at the root node, all its descendants have to include every
+feature as a legitimate split candidate to avoid violating the interaction constraints.
+
+For one last example, we use ``[[0, 1], [1, 3, 4]]`` and choose feature ``0`` as the split
+for the root node. At the second layer of the built tree, ``1`` is the only legitimate
+split candidate except for ``0`` itself, since they belong to the same constraint set.
+Following the grow path of our example tree below, the node at the second layer splits at
+feature ``1``. But because ``1`` also belongs to the second constraint set ``[1, 3, 4]``,
+at the third layer the set of legitimate split candidates expands to ``{0, 1, 3, 4}`` to
+comply with its ancestors.
+
+.. plot::
+  :nofigs:
+
+  from graphviz import Source
+  source = r"""
+    digraph feature_interaction_illustration5 {
+      graph [fontname = "helvetica"];
+      node [fontname = "helvetica"];
+      edge [fontname = "helvetica"];
+      0 [label=<x<SUB><FONT POINT-SIZE="11">0</FONT></SUB>>, shape=box, color=black, fontcolor=black];
+      1 [label="...", shape=none];
+      2 [label=<x<SUB><FONT POINT-SIZE="11">1</FONT></SUB>>, shape=box, color=black, fontcolor=black];
+      3 [label=<x<SUB><FONT POINT-SIZE="11">{0, 1, 3, 4}</FONT></SUB>>, shape=box, color=black, fontcolor=black];
+      4 [label=<x<SUB><FONT POINT-SIZE="11">{0, 1, 3, 4}</FONT></SUB>>, shape=box, color=black, fontcolor=black];
+      5 [label="...", shape=none];
+      6 [label="...", shape=none];
+      7 [label="...", shape=none];
+      8 [label="...", shape=none];
+      0 -> 1;
+      0 -> 2;
+      2 -> 3;
+      2 -> 4;
+      3 -> 5;
+      3 -> 6;
+      4 -> 7;
+      4 -> 8;
+    }
+  """
+  Source(source, format='png').render('../_static/feature_interaction_illustration6', view=False)
+  Source(source, format='svg').render('../_static/feature_interaction_illustration7', view=False)
+
+
+.. figure:: ../_static/feature_interaction_illustration6.png
+  :align: center
+  :figwidth: 80 %
+
+  ``{0, 1, 3, 4}`` represents the set of legitimate split features.
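The union rule walked through above can be sketched in a few lines. This is an illustration of the semantics only, not XGBoost's actual implementation; ``allowed_features`` is a hypothetical helper.

```python
def allowed_features(constraints, path):
    """Features allowed as split candidates at a node, given the constraint
    sets and the features already used on the path from the root."""
    allowed = set()
    for s in map(set, constraints):
        # Any constraint set containing an already-used feature
        # contributes all of its members to the allowed set.
        if s & set(path):
            allowed |= s
    return allowed

# Root split at 2 with [[1, 2], [2, 3, 4]]: all of {1, 2, 3, 4} are allowed.
print(allowed_features([[1, 2], [2, 3, 4]], [2]))     # {1, 2, 3, 4}
# Root split at 0 with [[0, 1], [1, 3, 4]]: only {0, 1} at the second layer...
print(allowed_features([[0, 1], [1, 3, 4]], [0]))     # {0, 1}
# ...expanding to {0, 1, 3, 4} once feature 1 has also been used.
print(allowed_features([[0, 1], [1, 3, 4]], [0, 1]))  # {0, 1, 3, 4}
```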

python-package/xgboost/dask.py

+12-11
@@ -101,24 +101,25 @@ def _run_with_rabit(rabit_args, func, *args):
 
 
 def run(client, func, *args):
-    """
-    Launch arbitrary function on dask workers. Workers are connected by rabit, allowing
-    distributed training. The environment variable OMP_NUM_THREADS is defined on each worker
-    according to dask - this means that calls to xgb.train() will use the threads allocated by
-    dask by default, unless the user overrides the nthread parameter.
+    """Launch arbitrary function on dask workers. Workers are connected by rabit,
+    allowing distributed training. The environment variable OMP_NUM_THREADS is
+    defined on each worker according to dask - this means that calls to
+    xgb.train() will use the threads allocated by dask by default, unless the
+    user overrides the nthread parameter.
 
-    Note: Windows platforms are not officially supported. Contributions are welcome here.
+    Note: Windows platforms are not officially
+    supported. Contributions are welcome here.
 
     :param client: Dask client representing the cluster
-    :param func: Python function to be executed by each worker. Typically contains xgboost
-    training code.
+    :param func: Python function to be executed by each worker. Typically
+        contains xgboost training code.
     :param args: Arguments to be forwarded to func
     :return: Dict containing the function return value for each worker
+
     """
     if platform.system() == 'Windows':
-        logging.warning(
-            'Windows is not officially supported for dask/xgboost integration. Contributions '
-            'welcome.')
+        logging.warning('Windows is not officially supported for dask/xgboost '
+                        'integration. Contributions welcome.')
     workers = list(client.scheduler_info()['workers'].keys())
     env = client.run(_start_tracker, len(workers), workers=[workers[0]])
     rabit_args = [('%s=%s' % item).encode() for item in env[workers[0]].items()]
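A hedged usage sketch for ``run`` as documented in this docstring: it assumes a running Dask cluster, so the cluster-dependent lines are commented out, the scheduler address is a placeholder, and the worker function body stands in for real training code.

```python
def worker_fn():
    # Placeholder for per-worker xgboost training code; on a real cluster each
    # worker would build a DMatrix from its local data shard and call
    # xgb.train(), with rabit synchronizing gradients across workers.
    return 'trained'

# from dask.distributed import Client
# from xgboost.dask import run
# client = Client('scheduler-address:8786')  # hypothetical scheduler address
# results = run(client, worker_fn)           # dict keyed by worker address
print(worker_fn())
```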

python-package/xgboost/plotting.py

+6-8
@@ -184,18 +184,16 @@ def to_graphviz(booster, fmap='', num_trees=0, rankdir='UT',
     no_color : str, default '#FF0000'
         Edge color when doesn't meet the node condition.
     condition_node_params : dict (optional)
-        condition node configuration,
-        {'shape':'box',
-         'style':'filled,rounded',
-         'fillcolor':'#78bceb'
-        }
+        condition node configuration,
+        {'shape':'box',
+         'style':'filled,rounded',
+         'fillcolor':'#78bceb'}
 
     leaf_node_params : dict (optional)
         leaf node configuration
         {'shape':'box',
-         'style':'filled',
-         'fillcolor':'#e48038'
-        }
+         'style':'filled',
+         'fillcolor':'#e48038'}
 
     kwargs :
         Other keywords passed to graphviz graph_attr
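The two styling dicts documented above can be passed like this (a sketch only; the values mirror the documented defaults, and the ``to_graphviz`` call is commented out since it needs a trained booster and an installed graphviz):

```python
# Node styling dicts matching the defaults in the to_graphviz docstring.
condition_node_params = {'shape': 'box',
                         'style': 'filled,rounded',
                         'fillcolor': '#78bceb'}
leaf_node_params = {'shape': 'box',
                    'style': 'filled',
                    'fillcolor': '#e48038'}
# import xgboost as xgb
# graph = xgb.to_graphviz(booster, num_trees=0,
#                         condition_node_params=condition_node_params,
#                         leaf_node_params=leaf_node_params)
```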

python-package/xgboost/sklearn.py

+2-2
@@ -105,8 +105,8 @@ class XGBModel(XGBModelBase):
         Value in the data which needs to be present as a missing value. If
         None, defaults to np.nan.
     importance_type: string, default "gain"
-        The feature importance type for the feature_importances_ property: either "gain",
-        "weight", "cover", "total_gain" or "total_cover".
+        The feature importance type for the feature_importances\_ property:
+        either "gain", "weight", "cover", "total_gain" or "total_cover".
     \*\*kwargs : dict, optional
         Keyword arguments for XGBoost Booster object. Full documentation of parameters can
         be found here: https://github.com/dmlc/xgboost/blob/master/doc/parameter.rst.
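A sketch of selecting the importance type on the sklearn wrapper (the fit/inspect lines are commented out since they need training data and a built xgboost; `X` and `y` are placeholders):

```python
# The valid values listed in the docstring above.
IMPORTANCE_TYPES = ('gain', 'weight', 'cover', 'total_gain', 'total_cover')

# import xgboost as xgb
# model = xgb.XGBRegressor(importance_type='total_gain')
# model.fit(X, y)
# print(model.feature_importances_)  # importances computed as total gain
print(IMPORTANCE_TYPES)
```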
