[MRG] FIX/TST pass argument to ratio as callable #307

glemaitre · 2017-07-31T17:26:52Z

Reference Issue

closes #305

What does this implement/fix? Explain your changes.

Allow passing arguments to ratio when it is a function.
This can be useful when the heuristic is determined outside of the function.

Any other comments?

glemaitre · 2017-07-31T17:28:12Z

@chkoar Here it comes. The multiplier could be decided by the user and passed to the function.

codecov · 2017-07-31T17:31:07Z

Codecov Report

Merging #307 into master will increase coverage by 0.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #307      +/-   ##
==========================================
+ Coverage   98.32%   98.34%   +0.01%     
==========================================
  Files          68       68              
  Lines        3890     3918      +28     
==========================================
+ Hits         3825     3853      +28     
  Misses         65       65

Impacted Files	Coverage Δ
imblearn/utils/validation.py	`100% <100%> (ø)`	⬆️
imblearn/utils/tests/test_validation.py	`100% <100%> (ø)`	⬆️
imblearn/ensemble/balance_cascade.py	`100% <0%> (ø)`	⬆️
...prototype_selection/neighbourhood_cleaning_rule.py	`100% <0%> (ø)`	⬆️
...ampling/prototype_selection/one_sided_selection.py	`100% <0%> (ø)`	⬆️
.../under_sampling/prototype_selection/tomek_links.py	`100% <0%> (ø)`	⬆️
..._sampling/prototype_selection/tests/test_allknn.py	`100% <0%> (ø)`	⬆️
imblearn/combine/smote_tomek.py	`100% <0%> (ø)`	⬆️
imblearn/datasets/imbalance.py	`100% <0%> (ø)`	⬆️
... and 8 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 33660d4...d251db9. Read the comment docs.

massich · 2017-08-01T10:00:25Z

LGTM

chkoar · 2017-08-01T13:45:23Z

I am -1 on this. That's the point of a callable as ratio.

Design a function that accepts a y as input.
Do whatever you want.

import numpy as np

from collections import Counter
from imblearn.utils import check_ratio
from sklearn.utils.testing import assert_equal

y = np.array([1] * 50 + [2] * 100 + [3] * 25)

def ratio_func(y):
    """samples such that each class will be affected by the multiplier."""
    multiplier = {1: 1.5, 2: 1, 3: 3}
    target_stats = Counter(y)
    return {key: int(values * multiplier[key])
            for key, values in target_stats.items()}

ratio_ = check_ratio(ratio_func, y, 'over-sampling')
assert_equal(ratio_, {1: 25, 2: 0, 3: 50})
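For readers following the numbers: the expected dict in the assertion above is consistent with reading the function's return value as target class counts and subtracting the current counts from them. A minimal sketch of that arithmetic in plain Python, with no imbalanced-learn calls; the helper name `samples_to_add` is hypothetical and not part of the library:

```python
from collections import Counter

y = [1] * 50 + [2] * 100 + [3] * 25


def ratio_func(y):
    """Scale each class count by a per-class multiplier."""
    multiplier = {1: 1.5, 2: 1, 3: 3}
    target_stats = Counter(y)
    return {key: int(count * multiplier[key])
            for key, count in target_stats.items()}


def samples_to_add(targets, y):
    """Hypothetical helper: target count minus current count, per class."""
    current = Counter(y)
    return {key: targets[key] - current[key] for key in targets}


targets = ratio_func(y)              # target counts per class
extra = samples_to_add(targets, y)   # samples to generate per class
```

Under this reading, `targets` holds the desired class sizes and `extra` holds how many samples an over-sampler would have to create.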

massich · 2017-08-01T16:08:24Z

I agree that you can do whatever you want inside ratio_func, and therefore you might not need this. But at the same time, if you can do whatever you want, why wouldn't it be possible to pass parameters to such a function?

What I mean is:
A) we thought about the API the way @chkoar describes it
B) however, users would like to call it passing parameters.

In the case of a ratio function that takes some parameter, (A) could wrap the call, while (B) could do it directly. I don't see much of a problem in allowing the behavior since (A) remains perfectly valid. I might still wrap the function because it's clearer. But maybe some users can exploit the flexibility of (B) in a pipeline or elsewhere.
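The two options can be sketched side by side. This is only an illustration in plain Python: `call_ratio` is a hypothetical stand-in for whatever library code invokes the callable, and `scaled_ratio` and its `multiplier` parameter are made-up names, not part of imbalanced-learn.

```python
from collections import Counter
from functools import partial


def scaled_ratio(y, multiplier=1.0):
    """Return target counts: every class count scaled by `multiplier`."""
    return {key: int(count * multiplier)
            for key, count in Counter(y).items()}


def call_ratio(ratio, y, **kwargs):
    """Hypothetical stand-in for library code that calls a callable ratio."""
    return ratio(y, **kwargs)


y = [0] * 10 + [1] * 40

# (A) The caller wraps the parameter; the library only ever passes y.
wrapped = partial(scaled_ratio, multiplier=0.5)
result_a = call_ratio(wrapped, y)

# (B) The library forwards extra keyword arguments to the callable.
result_b = call_ratio(scaled_ratio, y, multiplier=0.5)
```

Both calls produce the same dict, which is why supporting (B) leaves (A) valid.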

glemaitre · 2017-08-01T16:25:30Z

We might want to use **kwargs in this case. The only thing is that we need to make sure it works inside a grid search.

glemaitre · 2017-08-02T17:02:00Z

@chkoar Does the change with kwargs make sense?

chkoar · 2017-08-03T08:05:27Z

Still a user has to provide a function like this one, right?

glemaitre · 2017-08-03T08:20:14Z

Yep, this is one of the possibilities. This way you allow both behaviours.

glemaitre · 2017-08-03T08:20:24Z

In addition to the previous behaviour.

glemaitre · 2017-08-04T14:18:06Z

@chkoar While working on the documentation, I came across an example where this is really useful.

The same ratio behaviour is shared between making a dataset imbalanced and using algorithms to balance it.

Therefore, this feature would be nice to have if you want to try different levels of imbalance in a dataset. Before, we could provide something like ratio=[0.1, 0.3, 0.5]. If we still want a float multiplier, we need to define a function, and more precisely one function per multiplier.

With **kwargs we can avoid those several functions in a more practical manner.

So IMHO it makes sense in this configuration, much more than in the opposite one (grid-searching the ratio for the algorithm to get the best performance).
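A rough sketch of the point about avoiding one function per multiplier, in plain Python only. The function `minority_ratio` and its `multiplier` keyword are hypothetical names chosen for illustration; how the keyword actually reaches the callable inside the library is not shown here.

```python
from collections import Counter


def minority_ratio(y, multiplier=1.0):
    """Keep every class as is, except the minority class, which is scaled."""
    counts = Counter(y)
    minority = min(counts, key=counts.get)
    targets = dict(counts)
    targets[minority] = round(counts[minority] * multiplier)
    return targets


y = [0] * 100 + [1] * 50

# One parametrised function covers every imbalance level,
# instead of defining a separate function per multiplier.
levels = {m: minority_ratio(y, multiplier=m) for m in (0.1, 0.3, 0.5)}
```

Each entry of `levels` is the target class distribution for one multiplier, so trying a new imbalance level only means adding a value to the tuple.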

glemaitre · 2017-08-09T22:26:27Z

@chkoar are you still -1?

chkoar · 2017-08-10T11:21:17Z

I am neutral

FIX/TST pass argument to ratio as callable

7d1d083

DOC Whats new entry

e1fb989

glemaitre changed the title ~~[MRG] FIX/TST pass argument to ratio as callable~~ [MRG+1] FIX/TST pass argument to ratio as callable Aug 1, 2017

glemaitre changed the title ~~[MRG+1] FIX/TST pass argument to ratio as callable~~ [RFC] FIX/TST pass argument to ratio as callable Aug 1, 2017

Use **kwargs

d251db9

glemaitre changed the title ~~[RFC] FIX/TST pass argument to ratio as callable~~ [MRG] FIX/TST pass argument to ratio as callable Aug 4, 2017

glemaitre force-pushed the master branch from 9395cbe to 333d81b Compare August 9, 2017 11:53

glemaitre force-pushed the master branch 2 times, most recently from 1b22868 to 33660d4 Compare August 11, 2017 14:43

glemaitre merged commit d4e52e5 into scikit-learn-contrib:master Aug 11, 2017
