Random state parameter doomed to fail in IterativeStratification #251

Closed
kamilc-bst opened this issue Nov 8, 2022 · 3 comments

kamilc-bst commented Nov 8, 2022

Hi there,

First of all, great job on stratification. It has already been very useful in our recent project :)

I ran into a small issue, though. I'm trying to use IterativeStratification, and one of its parameters, random_state, seems to be a trap.

This code works fine:

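import numpy as np
from skmultilearn.model_selection import IterativeStratification

test_size = 0.2

# Two folds sized 20% / 80%; random_state is left at its default (None)
stratifier = IterativeStratification(
    n_splits=2, order=2,
    sample_distribution_per_fold=[
        test_size, 1.0 - test_size],
)

# Random features and random binary multi-label targets (100 samples, 4 labels)
train_indices, test_indices = next(
    stratifier.split(
        X=np.random.random((100, 4)),
        y=(np.random.random((100, 4)) > 0.5).astype(int),
    )
)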

while this

import numpy as np
from skmultilearn.model_selection import IterativeStratification

test_size = 0.2

# Same setup as before; the only change is random_state=42,
# which raises the ValueError quoted after this snippet
stratifier = IterativeStratification(
    n_splits=2, order=2,
    sample_distribution_per_fold=[
        test_size, 1.0 - test_size],
    random_state=42,
)

train_indices, test_indices = next(
    stratifier.split(
        X=np.random.random((100, 4)),
        y=(np.random.random((100, 4)) > 0.5).astype(int),
    )
)

produces

ValueError: Setting a random_state has no effect since shuffle is False. You should leave random_state to its default (None), or set shuffle=True.

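However, shuffle is hardcoded to False in the superclass call inside IterativeStratification, so the caller cannot set shuffle=True, and any non-None random_state triggers this error.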
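
To make the mechanism concrete, here is a minimal pure-Python sketch of the pattern described above. The class names BaseKFoldSketch and StratifierSketch and the helper try_build are hypothetical stand-ins written for illustration; they are not the actual scikit-learn or scikit-multilearn source. The sketch only assumes that the base class rejects a random_state when shuffle is False, as the error message says.

```python
class BaseKFoldSketch:
    """Hypothetical stand-in for a base splitter that validates its arguments."""

    def __init__(self, n_splits, shuffle, random_state):
        # Mirrors the check implied by the error message in this issue.
        if not shuffle and random_state is not None:
            raise ValueError(
                "Setting a random_state has no effect since shuffle is False. "
                "You should leave random_state to its default (None), "
                "or set shuffle=True."
            )
        self.n_splits = n_splits
        self.shuffle = shuffle
        self.random_state = random_state


class StratifierSketch(BaseKFoldSketch):
    """Hypothetical stand-in for a subclass that hardcodes shuffle=False."""

    def __init__(self, n_splits, random_state=None):
        # shuffle is fixed here, so the caller has no way to pass shuffle=True
        # and satisfy the base-class check when random_state is set.
        super().__init__(n_splits, shuffle=False, random_state=random_state)


def try_build(random_state):
    """Return "ok" if construction succeeds, otherwise the error text."""
    try:
        StratifierSketch(n_splits=2, random_state=random_state)
        return "ok"
    except ValueError as exc:
        return f"ValueError: {exc}"
```

In this sketch, try_build(None) succeeds while try_build(42) returns the error text, which matches the behaviour reported here: the parameter is accepted by the constructor signature but can never be used.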

@kamilc-bst
Author

That seems to be related to #234.

@suchith-sixsense

#248 gives the fix for IterativeStratification.

@erikhuck

I was able to work around this using numpy.random.seed.
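
A rough sketch of what that workaround might look like is below. It assumes the installed scikit-multilearn release draws its randomness from NumPy's global generator, which this thread suggests but does not confirm, so the result should be verified on your own version. The helper name seeded_split is hypothetical, and random_state is deliberately not passed to the constructor.

```python
import numpy as np


def seeded_split(X, y, test_size=0.2, seed=42):
    """Hypothetical helper: seed NumPy's global RNG, then split once.

    Assumption: the stratifier uses NumPy's global random state internally,
    so seeding it beforehand makes repeated calls return the same split.
    """
    # Imported lazily so this module loads even without scikit-multilearn.
    from skmultilearn.model_selection import IterativeStratification

    np.random.seed(seed)  # seed the global RNG instead of passing random_state

    stratifier = IterativeStratification(
        n_splits=2, order=2,
        sample_distribution_per_fold=[test_size, 1.0 - test_size],
    )
    # Same call pattern as in the issue: take the first (train, test) pair.
    train_indices, test_indices = next(stratifier.split(X, y))
    return train_indices, test_indices
```

Calling seeded_split(X, y) twice on the same X and y and comparing the returned index arrays is a quick way to check whether the seed actually takes effect. Note that seeding the global generator also affects every other np.random call in the process.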
