diff --git a/doc/glossary.rst b/doc/glossary.rst
index ff3a3da0cbfb7..8ae39edf55c28 100644
--- a/doc/glossary.rst
+++ b/doc/glossary.rst
@@ -1177,18 +1177,19 @@ Methods
             predicted class. Columns are ordered according to
             :term:`classes_`.
         multilabel classification
-            Scikit-learn is inconsistent in its representation of multilabel
-            decision functions. Some estimators represent it like multiclass
-            multioutput, i.e. a list of 2d arrays, each with two columns. Others
-            represent it with a single 2d array, whose columns correspond to
-            the individual binary classification decisions. The latter
-            representation is ambiguously identical to the multiclass
-            classification format, though its semantics differ: it should be
-            interpreted, like in the binary case, by thresholding at 0.
-
-            TODO: `This gist
-            `_
-            highlights the use of the different formats for multilabel.
+            Scikit-learn is inconsistent in its representation of :term:`multilabel`
+            decision functions. It may be represented one of two ways:
+
+            - List of 2d arrays, each array of shape: (`n_samples`, 2), like in
+              multiclass multioutput. List is of length `n_labels`.
+
+            - Single 2d array of shape (`n_samples`, `n_labels`), with each
+              'column' in the array corresponding to the individual binary
+              classification decisions. This is identical to the
+              multiclass classification format, though its semantics differ: it
+              should be interpreted, like in the binary case, by thresholding at
+              0.
+
         multioutput classification
             A list of 2d arrays, corresponding to each multiclass decision
             function.
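
For reviewers, the two multilabel formats described in the new text can be compared with a small NumPy sketch. This is an illustration only, not code from this change: the decision values are made up, and the way the two columns of each per-label array are filled (negative score, then positive score) is an assumption chosen so that both formats yield the same predictions.

```python
import numpy as np

# Hypothetical decision values for 4 samples and 3 labels (made-up numbers).
scores = np.array([
    [ 1.2, -0.3,  0.5],
    [-0.7,  2.1, -1.4],
    [ 0.1, -0.2, -0.9],
    [-2.0,  0.4,  1.8],
])
n_samples, n_labels = scores.shape

# Single 2d array of shape (n_samples, n_labels):
# each column is an independent binary decision, thresholded at 0.
pred_from_array = (scores > 0).astype(int)

# List of n_labels arrays, each of shape (n_samples, 2).
# Assumption for this sketch: column 0 scores the negative class and
# column 1 the positive class, filled here as (-s, s).
as_list = [
    np.column_stack([-scores[:, j], scores[:, j]])
    for j in range(n_labels)
]

# In the list format, each label is decided by the larger of its two columns.
pred_from_list = np.column_stack([a.argmax(axis=1) for a in as_list])

print(pred_from_array)
print(np.array_equal(pred_from_array, pred_from_list))
```

The single-array form has the same shape as a multiclass decision function, so the caller has to know from context that each column is thresholded separately rather than taking an argmax across columns.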