[WIP] Quantile dummy #3426

Closed
RolT wants to merge 1 commit into scikit-learn:master from RolT:quantile_dummy

Conversation

6 participants
@RolT

RolT commented Jul 18, 2014

Added quantile strategy in dummy classifier.
Added quantile strategy tests in test_dummy.
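
As a quick illustration of what the strategy does, the snippet below computes the constant that a quantile-strategy dummy would memorise at fit time and then return for every sample at predict time. It only uses scipy directly; the estimator keyword (`alpha`, taken from this PR's diff and debated later in the thread) is an assumption of the example, not settled API.

```python
import numpy as np
from scipy.stats import scoreatpercentile

rng = np.random.RandomState(0)
y = rng.lognormal(size=1000)

# Constant that a DummyRegressor(strategy="quantile", alpha=0.9) would learn
# at fit time and return for every sample at predict time.
constant = scoreatpercentile(y, 90.0)
print(constant)
```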

@arjoly

Member

arjoly commented Jul 18, 2014

Thanks for the pr :-)

sklearn/dummy.py
@@ -317,9 +320,10 @@ class DummyRegressor(BaseEstimator, RegressorMixin):
True if the output at fit is 2d, else false.
"""
- def __init__(self, strategy="mean", constant=None):
+ def __init__(self, strategy="mean", constant=None, alpha=None):

@arjoly

arjoly Jul 18, 2014

Member

Could you document the new parameter and say when it's used?

@arjoly

arjoly Jul 18, 2014

Member

I would also highlight the similarity with the median strategy.

@RolT

RolT Jul 18, 2014

OK, I'll do that.

@arjoly

arjoly Jul 18, 2014

Member

What should be the default? alpha=0.5?

@arjoly

Member

arjoly commented Jul 18, 2014

for reference #3421

sklearn/dummy.py
+ elif self.strategy == "quantile":
+ if not self.alpha:
+ raise TypeError("`alpha` value has to be specified "
+ "when the quantile strategy is used.")

@arjoly

arjoly Jul 18, 2014

Member

It would be nice to have a default that doesn't raise an error. What do you think of 0.5, or None being treated as equivalent to 0.5?

Maybe the code paths for the median and quantile strategies could be merged. What do you think?
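
A sketch of what a merged median/quantile path could look like, with None treated as 0.5; the helper name and structure here are illustrative only, not the PR's actual code:

```python
import numpy as np
from scipy.stats import scoreatpercentile

def _fit_constant(y, strategy, alpha=None):
    # Illustrative only: "median" is just the quantile strategy with alpha=0.5,
    # so both strategies can share one code path.
    if strategy in ("median", "quantile"):
        if strategy == "median" or alpha is None:
            alpha = 0.5
        return scoreatpercentile(y, 100.0 * alpha)
    raise ValueError("Unknown strategy: %r" % strategy)

print(_fit_constant(np.arange(10.0), "quantile", alpha=0.25))  # 2.25
print(_fit_constant(np.arange(10.0), "median"))                # 4.5
```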

@RolT

RolT Jul 18, 2014

One reason not to do it is that median uses numpy while percentile uses scipy. The numpy median is probably faster.
The other reason would be that it's easier to do the weighted median in numpy than to implement a weighted percentile.

@arjoly

arjoly Jul 18, 2014

Member

For the weighted percentile, that's not that hard. See #3420.
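
For readers following along, here is a minimal sketch of one common weighted-percentile convention. It is only an illustration of the idea, not necessarily the approach taken in #3420:

```python
import numpy as np

def weighted_percentile(a, weights, q):
    """Smallest value of ``a`` whose cumulative weight reaches q percent of
    the total weight (illustrative helper, one of several conventions)."""
    a = np.asarray(a, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(a)
    a, weights = a[order], weights[order]
    cum_weights = np.cumsum(weights)
    target = q / 100.0 * cum_weights[-1]
    return a[np.searchsorted(cum_weights, target)]

print(weighted_percentile([1, 2, 3], [1, 1, 1], 50))   # 2.0
print(weighted_percentile([1, 2, 3], [1, 1, 10], 50))  # 3.0
```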

@arjoly

Member

arjoly commented Jul 18, 2014

Tests look good :-)

sklearn/dummy.py
@@ -283,6 +285,7 @@ def predict_log_proba(self, X):
class DummyRegressor(BaseEstimator, RegressorMixin):
+

@arjoly

arjoly Jul 18, 2014

Member

Can you remove these lines?

sklearn/dummy.py
class DummyClassifier(BaseEstimator, ClassifierMixin):
+

@arjoly

arjoly Jul 18, 2014

Member

Can you remove these lines?

@arjoly

Member

arjoly commented Jul 18, 2014

According to Travis, the axis keyword is not present in scipy==0.9. Could you backport it? There are examples of backports in sklearn.utils.fixes.
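
A minimal sketch of what such a backport might look like in sklearn/utils/fixes.py, falling back to np.apply_along_axis when the installed scipy rejects the axis keyword. This differs from the interpolation-based backport that ends up in the diff further down; it is only meant to illustrate the idea:

```python
import numpy as np
from scipy.stats import scoreatpercentile as _scoreatpercentile

try:
    # scipy >= 0.10 accepts the ``axis`` keyword.
    _scoreatpercentile(np.array([[1.0, 2.0]]), 50, axis=0)
    scoreatpercentile = _scoreatpercentile
except TypeError:
    # scipy == 0.9: emulate ``axis`` by applying the 1-D version along it.
    def scoreatpercentile(a, per, axis=None):
        if axis is None:
            return _scoreatpercentile(np.asarray(a).ravel(), per)
        return np.apply_along_axis(_scoreatpercentile, axis, a, per)
```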

@GaelVaroquaux changed the title from [MRG] Quantile dummy to [WIP] Quantile dummy on Jul 19, 2014

@GaelVaroquaux

Member

GaelVaroquaux commented Jul 19, 2014

I switched the title back to WIP until the Travis failure is fixed.

@coveralls

coveralls Jul 22, 2014

Coverage Status

Changes Unknown when pulling 8aff3fa on RolT:quantile_dummy into scikit-learn:master.


@coveralls

coveralls Jul 22, 2014

Coverage Status

Changes Unknown when pulling acf4495 on RolT:quantile_dummy into scikit-learn:master.


@coveralls

coveralls Jul 22, 2014

Coverage Status

Changes Unknown when pulling bd92588 on RolT:quantile_dummy into scikit-learn:master.


sklearn/utils/fixes.py
+ return np.add.reduce(sorted[indexer] * weights, axis=axis) / sumval
+
+ def scoreatpercentile_axis(a, per, axis):
+ return _scoreatpercentile(a, per, axis=axis)

@arjoly

arjoly Jul 22, 2014

Member

Would it be possible to have the same name as the scipy function?
This would help whenever we support the axis argument.

The whole backport could be made outside the try ... except ...

@arjoly

arjoly Jul 22, 2014

Member

This would help whenever we support the axis argument.

More precisely, this would help when we want to remove the backport.

- print(80 * '_')
- return
+ raise ValueError('Example directory %s does not have a README.txt' %
+ src_dir)

@arjoly

arjoly Jul 22, 2014

Member

Hmm, strange. I wouldn't expect this to be modified.

+Neural Networks
+-----------------------
+
+Examples concerning the :mod:`sklearn.neural_network` module.

@arjoly

arjoly Jul 22, 2014

Member

Hmm, strange. I wouldn't expect this to be modified.

@amueller

amueller Jul 22, 2014

Member

Rebase error with #3460. Can you try to rebase again, @RolT?

sklearn/dummy.py
constant : int or float or array of shape = [n_outputs]
The explicit constant as predicted by the "constant" strategy. This
parameter is useful only for the "constant" strategy.
+ alpha : float, optional.

@arjoly

arjoly Jul 22, 2014

Member

What do you think of naming this parameter quantile?
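
If the parameter were renamed as suggested, the constructor and its docstring entry might read something like the sketch below. The wording and the base class used here are hypothetical, not the PR's final text:

```python
class DummyRegressor(object):
    def __init__(self, strategy="mean", constant=None, quantile=None):
        """Hypothetical docstring fragment for the renamed parameter.

        quantile : float in [0.0, 1.0], optional (default=None)
            The quantile to predict using the "quantile" strategy; a value
            of 0.5 is equivalent to the "median" strategy.
        """
        self.strategy = strategy
        self.constant = constant
        self.quantile = quantile
```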

@arjoly

Member

arjoly commented Aug 11, 2014

Looks good, with an update in doc/modules/model_evaluation.rst about DummyRegressor.

@arjoly

Member

arjoly commented Aug 13, 2014

@RolT Do you plan to finish this pr? I am checking that this pr hasn't stalled.

@RolT

RolT commented Aug 14, 2014

I completely forgot about that PR. I'm a bit busy right now, but I'll look into it soon. Sorry about the delay.

split test_common.py into checks and test file.
move dataset generation into estimator_checks

Closes #2360. Fix tiebreaking.

some cleanups in common_test, speedup.

Make fit_transform and fit().transform() equivalent in nmf

slight speedup
some cleanup

Make everything accept lists as input.

Remove 'DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future' warnings

Remove a couple more 'using a non-integer number instead of an integer DeprecationWarning'

Added a parameter check on n_estimators in BaseEnsemble to raise error if it is not strictly positive.

ENH Make all decision functions have the same shape. Fixes SVC and GradientBoosting. Closes #1490.

ENH better default for test for SelectKBest and random projection

MAINT tree compute feature_importance by default

pep8

ENH add label ranking average precision

DOC write narrative doc for label ranking average precision

DOC FIX error + wording

TST invariance testing + handle degenerate case

flake8

FIX use np.bincount

DOC friendlier narrative documentation

ENH simplify label ranking average precision (thanks @jnothman)

ENH be backward compatible for old version of scipy

Typo

pep8

DOC remove confusing mention to mean average precision

DOC improve documentatino thanks to @vene and remove mention of relevant labels

DOC more intuition about corner case

DOC add documentation to backported function

ENH less nested code

FIX encoding issue

DOC update what's new

MAINT deprecate fit_ovr, fit_ovo, fit_ecoc, predict_ovr, predict_ovo, predict_ecoc and predict_proba_ovr

ENH + DOC set a default scorer in the multiclass module

DOC typo + not forgetting single output case

scorer: add sample_weight support

COSMIT Use explicit if/else in scorer

TST default scorers with sample_weight

DOC Update What's New

MAINT split sklearn/metrics/metrics.py

DOC improve documentation and distinguish each module

ENH add a friendly warnings before deleting the file

TST: fix tests on numpy 1.9.b2

Refactor input validation.

remove check_arrays stuff and old input validation

ENH add allowed_sparse named argument for @ogrisel

FIX classes name in OvR

DOC mention doc-building dependency on Pillow

DOC link example gallery scripts rather than inline

COSMIT

DOC show referring examples on API reference pages

DOC ensure longer underline

DOC Fix example path

DOC fix 'Return' -> 'Returns'

FIX Py3k support for out-of-core example

DOC make neural networks example appear

Also make the absence of a README fail doc compilation

DOC add links to github sourcecode in API reference

DOC fix opaque background glitch when hovering example icons

FIX better RandomizedPCA sparse deprecation

added quantile strategy to dummy classifier

replaced quantile by percentile, added docstrings

switched back to quantile

final commit

forgot to save the file

removed blank lines, added default alpha param

added backport for axis keyword in scipy.scoreatpercentile

added quantile strategy to dummy classifier

replaced quantile by percentile, added docstrings

switched back to quantile

final commit

forgot to save the file

added backport for axis keyword in scipy.scoreatpercentile

removed scoreatpercentile axis dependency in the dummy tests

typo

fixed import

fixed axis keyword mistake

changed fix function name, fixed rebase bug

removed blank lines, added default alpha param

removed blank lines, added default alpha param

removed unnecessary import

docstring merge typo

updated doc (quantile dummy regressor)
@RolT

RolT commented Sep 8, 2014

This PR is a mess; I'll do a new clean one in one step when I get some time... I can't rebase anymore.

@RolT closed this on Sep 8, 2014

@RolT deleted the RolT:quantile_dummy branch on Sep 8, 2014

@staple

Contributor

staple commented Oct 2, 2014

@RolT Just wanted to check in to see if you are still planning to work on this. In case not, I'd like to give it a try. But if you're still planning to work on it, then no worries.

@arjoly

Member

arjoly commented Oct 6, 2014

@staple I think that you can give it a try.
