New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+2] ENH: (minor) add quantile_low/high parameters to robust scaler #5929

Merged
merged 5 commits into from Aug 13, 2016

Conversation

Projects
None yet
6 participants
@podshumok
Contributor

podshumok commented Nov 27, 2015

No description provided.

@@ -911,6 +912,12 @@ class RobustScaler(BaseEstimator, TransformerMixin):
with_scaling : boolean, True by default
If True, scale the data to interquartile range.
quantile_low : float

This comment has been minimized.

@giorgiop

giorgiop Nov 27, 2015

Contributor

Could you please add .. versionadded:: 0.18 where needed in the docstrings?

@@ -1065,6 +1077,12 @@ def robust_scale(X, axis=0, with_centering=True, with_scaling=True, copy=True):
If True, scale the data to unit variance (or equivalently,
unit standard deviation).
quantile_low : float

This comment has been minimized.

@giorgiop

giorgiop Nov 27, 2015

Contributor

Maybe, also say what are the constraints for the floats such that the param makes sense.

iqr = q[1] - q[0]
assert_array_almost_equal(iqr, 1)

This comment has been minimized.

@giorgiop

giorgiop Nov 27, 2015

Contributor

Could you tests for extreme values of quantile_low/high? For example when quantile_low >= quantile_high and when quantile_low=0 etc. Also, does RobustScaler ever reduces to StandardScaler for any value of those parameters?

This comment has been minimized.

@podshumok

podshumok Nov 28, 2015

Contributor

Is it expected to raise at __init__ time, at set_params time or at fit time?

@giorgiop

This comment has been minimized.

Contributor

giorgiop commented Nov 27, 2015

This looks like an useful addition to me. @GaelVaroquaux any opinion?
Also, whats_new might need to be updated.

@podshumok

This comment has been minimized.

Contributor

podshumok commented Nov 30, 2015

Could you tests for extreme values of quantile_low/high? For example when quantile_low >= quantile_high and when quantile_low=0 etc. Also, does RobustScaler ever reduces to StandardScaler for any value of those parameters?

@giorgiop: Is it expected to raise at __init__ time, at set_params time or at fit time?

@giorgiop

This comment has been minimized.

Contributor

giorgiop commented Nov 30, 2015

I think the best is at fit time. You could use MinMaxScaler as inspiration.

@podshumok podshumok changed the title from add quantile_low/high parameters to robust scaler to [NEEDS REVIEW] ENH: (minor) add quantile_low/high parameters to robust scaler Nov 30, 2015

@podshumok

This comment has been minimized.

Contributor

podshumok commented Dec 1, 2015

Also, does RobustScaler ever reduces to StandardScaler for any value of those parameters?

@giorgiop
In general it is not the case. But if you know original distribution of X and sample is large enough then you can calculate quantiles for mean +- std.

@@ -911,6 +912,14 @@ class RobustScaler(BaseEstimator, TransformerMixin):
with_scaling : boolean, True by default
If True, scale the data to interquartile range.

This comment has been minimized.

@giorgiop

giorgiop Dec 2, 2015

Contributor

cosmetics: only one blank line in the docstrings, same below.

self.quantile_range[1] <= 0,
self.quantile_range[0] >= 100,
self.quantile_range[1] >= 100,
)):

This comment has been minimized.

@giorgiop

giorgiop Dec 2, 2015

Contributor

Can you do if not 0 < self.quantile_range[0] < self.quantile_range[1] < 100: instead?

@giorgiop

This comment has been minimized.

Contributor

giorgiop commented Dec 2, 2015

For the rest, LGMT.

@podshumok podshumok changed the title from [NEEDS REVIEW] ENH: (minor) add quantile_low/high parameters to robust scaler to [MRG] ENH: (minor) add quantile_low/high parameters to robust scaler Dec 2, 2015

@podshumok

This comment has been minimized.

Contributor

podshumok commented Dec 5, 2015

git blame shows some docstrings was added by @amueller, may be he will be interested in reviewing this.
hello

@@ -1,3 +1,4 @@

This comment has been minimized.

@glouppe

glouppe Feb 11, 2016

Member

Please remove this blank line

@@ -911,6 +915,12 @@ class RobustScaler(BaseEstimator, TransformerMixin):
with_scaling : boolean, True by default
If True, scale the data to interquartile range.
quantile_range : tuple (q_min, q_max), 0.0 < q_min < q_max < 100.0
Default: (25.0, 75.0) = (1st quntile, 3rd quntile) = IQR

This comment has been minimized.

@glouppe

glouppe Feb 11, 2016

Member

quntile -> quartile?

@@ -1065,6 +1083,12 @@ def robust_scale(X, axis=0, with_centering=True, with_scaling=True, copy=True):
If True, scale the data to unit variance (or equivalently,
unit standard deviation).
quantile_range : tuple (q_min, q_max), 0.0 < q_min < q_max < 100.0
Default: (25.0, 75.0) = (1st quntile, 3rd quntile) = IQR

This comment has been minimized.

@glouppe

glouppe Feb 11, 2016

Member

same here

@glouppe

This comment has been minimized.

Member

glouppe commented Feb 11, 2016

Seems like a fine addition to me. +1 for merge provided my comments are fixed.

(Sorry for the slow reply here...)

@glouppe glouppe changed the title from [MRG] ENH: (minor) add quantile_low/high parameters to robust scaler to [MRG+1] ENH: (minor) add quantile_low/high parameters to robust scaler Feb 11, 2016

@podshumok

This comment has been minimized.

Contributor

podshumok commented Jul 29, 2016

bump?

@amueller

This comment has been minimized.

Member

amueller commented Jul 29, 2016

Sorry for the slow turnaround. LGTM.

@amueller

This comment has been minimized.

Member

amueller commented Jul 29, 2016

Do you want to add an entry to whatsnew?

@amueller amueller changed the title from [MRG+1] ENH: (minor) add quantile_low/high parameters to robust scaler to [MRG+2] ENH: (minor) add quantile_low/high parameters to robust scaler Jul 29, 2016

@podshumok

This comment has been minimized.

Contributor

podshumok commented Aug 10, 2016

Sorry, didn't realize that was necessarily.
Do I have to rebase now or something?

@@ -59,6 +59,10 @@ Enhancements
generated during build. Distribution packages will still contain generated
C/C++ files. By `Arthur Mensch`_
- :class: `RobustScaler` now accepts ``quantile_range`` parameter.

This comment has been minimized.

@nelson-liu

nelson-liu Aug 10, 2016

Contributor

remove the space between class: and RobustScaler to have it link properly

@@ -59,6 +59,10 @@ Enhancements
generated during build. Distribution packages will still contain generated
C/C++ files. By `Arthur Mensch`_
- :class: `RobustScaler` now accepts ``quantile_range`` parameter.
(`#5929 <https://github.com/scikit-learn/scikit-learn/pull/5929>`_)
By `Konstantin Podshumok`_.

This comment has been minimized.

@nelson-liu

nelson-liu Aug 10, 2016

Contributor

you're creating a link here, so you should add a definition at the bottom of this file

@podshumok

This comment has been minimized.

Contributor

podshumok commented Aug 11, 2016

yep
and now we have conflicts

@nelson-liu

This comment has been minimized.

Contributor

nelson-liu commented Aug 11, 2016

Can you rebase on master?

@podshumok

This comment has been minimized.

Contributor

podshumok commented Aug 11, 2016

I'm trying but strangely after rebase I find myself in detached HEAD

@podshumok

This comment has been minimized.

Contributor

podshumok commented Aug 11, 2016

Looks like it is ok now

@agramfort agramfort merged commit f893565 into scikit-learn:master Aug 13, 2016

3 checks passed

ci/circleci Your tests passed on CircleCI!
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@agramfort

This comment has been minimized.

Member

agramfort commented Aug 13, 2016

thanks @podshumok

@podshumok podshumok deleted the podshumok:minor_enh_add_quantile_params_to_robust_scaler branch Aug 13, 2016

TomDLT added a commit to TomDLT/scikit-learn that referenced this pull request Oct 3, 2016

[MRG+2] ENH: (minor) add quantile_low/high parameters to robust scaler (
scikit-learn#5929)

* add quantile_low/high parameters to robust scaler

add versionadded tags

* robust scaler: add parameter for quantile_range, check its validity

* robust scaler: simplify params validations

fix typo in docstrings of RobustScaler

* preprocessing.data fix some lines > 80 columns

* add whatsnew for robust scaler quantile_range param

add missing url link in whatsnew
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment