DOC Clarify RobustScaler behavior with sparse input (#8858)
naoyak authored and jnothman committed Jul 29, 2017
1 parent 30a7ce9 commit f6c7080
Showing 2 changed files with 9 additions and 6 deletions.
2 changes: 1 addition & 1 deletion doc/modules/preprocessing.rst
@@ -199,7 +199,7 @@ matrices as input, as long as ``with_mean=False`` is explicitly passed
to the constructor. Otherwise a ``ValueError`` will be raised as
silently centering would break the sparsity and would often crash the
execution by allocating excessive amounts of memory unintentionally.
-:class:`RobustScaler` cannot be fited to sparse inputs, but you can use
+:class:`RobustScaler` cannot be fitted to sparse inputs, but you can use
the ``transform`` method on sparse inputs.

Note that the scalers accept both Compressed Sparse Rows and Compressed
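Reviewer note on the hunk above: the reason centering is refused on sparse input can be seen in a small stdlib-only sketch. Everything here is hypothetical illustration, not scikit-learn code: a single feature column is stored as a `{row_index: value}` dict, and the helper names `scale_sparse` and `center_dense` are made up for this note.

```python
# Hypothetical illustration, not scikit-learn code.
# One feature column stored sparsely as {row_index: value}; missing rows are zero.

def scale_sparse(col, scale):
    """Divide the stored entries by `scale`; implicit zeros stay zero."""
    return {i: v / scale for i, v in col.items()}


def center_dense(col, n_rows, center):
    """Subtract `center` from every row; each implicit zero becomes -center."""
    return [col.get(i, 0.0) - center for i in range(n_rows)]


n_rows = 8
column = {1: 4.0, 6: 2.0}  # only 2 of 8 rows are stored

scaled = scale_sparse(column, 2.0)
centered = center_dense(column, n_rows, 0.75)

stored_after_scaling = len(scaled)
nonzeros_after_centering = sum(1 for v in centered if v != 0.0)
```

Scaling keeps the two stored entries, while centering turns all eight rows into non-zero values, which is the densification the documentation warns about.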
13 changes: 8 additions & 5 deletions sklearn/preprocessing/data.py
@@ -945,9 +945,9 @@ class RobustScaler(BaseEstimator, TransformerMixin):
and the 3rd quartile (75th quantile).
Centering and scaling happen independently on each feature (or each
-sample, depending on the `axis` argument) by computing the relevant
+sample, depending on the ``axis`` argument) by computing the relevant
statistics on the samples in the training set. Median and interquartile
-range are then stored to be used on later data using the `transform`
+range are then stored to be used on later data using the ``transform``
method.
Standardization of a dataset is a common requirement for many
@@ -964,7 +964,7 @@ class RobustScaler(BaseEstimator, TransformerMixin):
----------
with_centering : boolean, True by default
If True, center the data before scaling.
-This does not work (and will raise an exception) when attempted on
+This will cause ``transform`` to raise an exception when attempted on
sparse matrices, because centering them entails building a dense
matrix which in common use cases is likely to be too large to fit in
memory.
@@ -1059,11 +1059,14 @@ def fit(self, X, y=None):
return self

def transform(self, X):
-"""Center and scale the data
+"""Center and scale the data.
+Can be called on sparse input, provided that ``RobustScaler`` has been
+fitted to dense input and ``with_centering=False``.
Parameters
----------
-X : array-like
+X : {array-like, sparse matrix}
The data used to scale along the specified axis.
"""
if self.with_centering:
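Reviewer note on the `transform` docstring above: the contract it describes (fit on dense data, then transform sparse data only when `with_centering=False`) can be modelled with a minimal stand-in. This is a sketch under stated assumptions, not the class in `sklearn/preprocessing/data.py`: `TinyRobustScaler` is a hypothetical single-feature class, sparse input is represented as a `{row_index: value}` dict, and quartiles come from `statistics.quantiles` with `method="inclusive"`.

```python
from statistics import median, quantiles


class TinyRobustScaler:
    """Hypothetical single-feature stand-in, not scikit-learn's RobustScaler."""

    def __init__(self, with_centering=True):
        self.with_centering = with_centering

    def fit(self, values):
        # Dense input only: a plain list of floats.
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        self.center_ = median(values)  # stored median
        self.scale_ = q3 - q1          # stored interquartile range
        return self

    def transform(self, values):
        if isinstance(values, dict):  # assumed sparse form: {row_index: value}
            if self.with_centering:
                # Centering would turn every implicit zero into -center_.
                raise ValueError("cannot center sparse input")
            return {i: v / self.scale_ for i, v in values.items()}
        shift = self.center_ if self.with_centering else 0.0
        return [(v - shift) / self.scale_ for v in values]


dense_train = [1.0, 2.0, 3.0, 4.0, 100.0]

# Fitted on dense data without centering: sparse transform is allowed.
scaler = TinyRobustScaler(with_centering=False).fit(dense_train)
sparse_out = scaler.transform({0: 4.0, 3: 2.0})

# Fitted with centering: sparse transform is refused.
try:
    TinyRobustScaler(with_centering=True).fit(dense_train).transform({0: 4.0})
    raised = False
except ValueError:
    raised = True
```

The outlier `100.0` does not affect the stored scale, since the interquartile range only depends on the middle of the sorted data.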
