New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+1] split data using _safe_split in _permutaion_test_scorer to fix error… #5697

Merged
merged 1 commit into from Dec 29, 2016

Conversation

Projects
None yet
6 participants
@equialgo
Contributor

equialgo commented Nov 2, 2015

… when using Pandas DataFrame/Series

Related to issue #5696

@raghavrv

This comment has been minimized.

Show comment
Hide comment
@raghavrv

raghavrv Nov 30, 2015

Member

thanks for fixing this!

Could you also add this fix to the model_selection module and also add a NRT to model_selection/tests/test_validation.py? :)

Member

raghavrv commented Nov 30, 2015

thanks for fixing this!

Could you also add this fix to the model_selection module and also add a NRT to model_selection/tests/test_validation.py? :)

@raghavrv

This comment has been minimized.

Show comment
Hide comment
@raghavrv

raghavrv Nov 30, 2015

Member

Also could you edit the

PR title to -
[MRG] split data using _safe_split in _permutaion_test_scorer to fix error when using Pandas DataFrame/Series

And the PR description to "Fixes #5696"

Member

raghavrv commented Nov 30, 2015

Also could you edit the

PR title to -
[MRG] split data using _safe_split in _permutaion_test_scorer to fix error when using Pandas DataFrame/Series

And the PR description to "Fixes #5696"

@amueller amueller changed the title from split data using _safe_split in _permutaion_test_scorer to fix error… to [MRG] split data using _safe_split in _permutaion_test_scorer to fix error… Dec 10, 2015

@amueller

This comment has been minimized.

Show comment
Hide comment
@amueller

amueller Dec 10, 2015

Member

yeah a test would be great, using the mock data frame or an acutal dataframe if pandas is installed.

Member

amueller commented Dec 10, 2015

yeah a test would be great, using the mock data frame or an acutal dataframe if pandas is installed.

@equialgo

This comment has been minimized.

Show comment
Hide comment
@equialgo

equialgo Jan 3, 2016

Contributor

First of all, sorry for the late reply.

I added the fix also to model_selection/_validation.py.

Moreover, I added a test_permutation_test_score_pandas() test case to model_selection/tests/test_validation.py (based on the test_cross_val_score_pandas() test case)

Contributor

equialgo commented Jan 3, 2016

First of all, sorry for the late reply.

I added the fix also to model_selection/_validation.py.

Moreover, I added a test_permutation_test_score_pandas() test case to model_selection/tests/test_validation.py (based on the test_cross_val_score_pandas() test case)

@dankessler

This comment has been minimized.

Show comment
Hide comment
@dankessler

dankessler Dec 14, 2016

Is there additional work that needs to be done before this can be merged? I'm still encountering the issue that this addresses (#5696), which I'm working around by calling .values on my series. Is it just the merge conflict that's hairy? If so, I'm happy to take a crack.

dankessler commented Dec 14, 2016

Is there additional work that needs to be done before this can be merged? I'm still encountering the issue that this addresses (#5696), which I'm working around by calling .values on my series. Is it just the merge conflict that's hairy? If so, I'm happy to take a crack.

@amueller amueller added the Bug label Dec 14, 2016

@amueller

This comment has been minimized.

Show comment
Hide comment
@amueller

amueller Dec 14, 2016

Member

Wow sorry this one has been lying around for a bit. It needs a rebase and an entry to whatsnew.rst - and reviews, though that should be reasonably quick.

Member

amueller commented Dec 14, 2016

Wow sorry this one has been lying around for a bit. It needs a rebase and an entry to whatsnew.rst - and reviews, though that should be reasonably quick.

@amueller

This comment has been minimized.

Show comment
Hide comment
@amueller

amueller Dec 14, 2016

Member

This actually fixes another issue, namely using precomputed kernels / distance matrices with permutation_score. It would be great if there was also a regression test for that.
Otherwise looks good to me. Sorry for the slow turn-around.

Member

amueller commented Dec 14, 2016

This actually fixes another issue, namely using precomputed kernels / distance matrices with permutation_score. It would be great if there was also a regression test for that.
Otherwise looks good to me. Sorry for the slow turn-around.

check_df = lambda x: isinstance(x, InputFeatureType)
check_series = lambda x: isinstance(x, TargetType)
clf = CheckingClassifier(check_X=check_df, check_y=check_series)
permutation_test_score(clf, X_df, y_ser)

This comment has been minimized.

@jnothman

jnothman Dec 14, 2016

Member

Ideally we should check that, for fixed random_state, results are identical regardless of input type.

@jnothman

jnothman Dec 14, 2016

Member

Ideally we should check that, for fixed random_state, results are identical regardless of input type.

This comment has been minimized.

@equialgo

equialgo Dec 24, 2016

Contributor

At the moment, such input tests are also not present for the other validation methods, there is only: test_cross_val_score_pandas(), test_cross_val_predict_pandas().

I wonder is that really a necessity?

@equialgo

equialgo Dec 24, 2016

Contributor

At the moment, such input tests are also not present for the other validation methods, there is only: test_cross_val_score_pandas(), test_cross_val_predict_pandas().

I wonder is that really a necessity?

@jnothman

This comment has been minimized.

Show comment
Hide comment
@jnothman

jnothman Dec 14, 2016

Member

@equialgo Sorry for the extreme delay. Let us know if you have time soon to complete the changes.

Member

jnothman commented Dec 14, 2016

@equialgo Sorry for the extreme delay. Let us know if you have time soon to complete the changes.

@equialgo

This comment has been minimized.

Show comment
Hide comment
@equialgo

equialgo Dec 24, 2016

Contributor

Hi guys, I rebased the fix it should be relatively easy to merge now (if it passes the checks).

@amueller I do not really get why this also fixes the issue using precomputed kernels / distance matrices with permutation_score. If it is trivial please elaborate, otherwise it might be better to add a new ticket for that one.

Contributor

equialgo commented Dec 24, 2016

Hi guys, I rebased the fix it should be relatively easy to merge now (if it passes the checks).

@amueller I do not really get why this also fixes the issue using precomputed kernels / distance matrices with permutation_score. If it is trivial please elaborate, otherwise it might be better to add a new ticket for that one.

@jnothman

This comment has been minimized.

Show comment
Hide comment
@jnothman

jnothman Dec 25, 2016

Member

You've done something strange with your commit history. Perhaps a rebase is in order.

Member

jnothman commented Dec 25, 2016

You've done something strange with your commit history. Perhaps a rebase is in order.

@equialgo

This comment has been minimized.

Show comment
Hide comment
@equialgo

equialgo Dec 28, 2016

Contributor

@jnothman oops, kinda messed-up there! Re-did the rebase; commit history should be okay now.

Contributor

equialgo commented Dec 28, 2016

@jnothman oops, kinda messed-up there! Re-did the rebase; commit history should be okay now.

fixing E221 multiple spaces before operator flake8 warnings (+6 squas…
…hed commits)

Squashed commits:
[94fd9f4] split data using _safe_split in _permutaion_test_scorer
[522053b] adding test case test_permutation_test_score_pandas() to check if permutation_test_score plays nice with pandas dataframe/series
[21b23ce] running test_permutation_test_score_pandas on iris data to prevent warnings.
[15a48bf] adding safe_indexing to _shuffle function
[9ea5c9e] adding test case test_permutation_test_score_pandas() to check if permutation_test_score plays nice with pandas dataframe/series
[3cf5e8f] split  data using _safe_split in _permutaion_test_scorer to fix error when using Pandas DataFrame/Series

@GaelVaroquaux GaelVaroquaux changed the title from [MRG] split data using _safe_split in _permutaion_test_scorer to fix error… to [MRG+1] split data using _safe_split in _permutaion_test_scorer to fix error… Dec 28, 2016

@GaelVaroquaux

This comment has been minimized.

Show comment
Hide comment
@GaelVaroquaux

GaelVaroquaux Dec 28, 2016

Member

+1 to merge this guy (there is a flake8 failure, but it's because of lambda expressions, and here they seem legit).

Let's try to merge this fast, it's been lying around for too long.

Member

GaelVaroquaux commented Dec 28, 2016

+1 to merge this guy (there is a flake8 failure, but it's because of lambda expressions, and here they seem legit).

Let's try to merge this fast, it's been lying around for too long.

@GaelVaroquaux

This comment has been minimized.

Show comment
Hide comment
@GaelVaroquaux

GaelVaroquaux Dec 28, 2016

Member

@raghavrv : are you +1 for merge? If so, let's merge

Member

GaelVaroquaux commented Dec 28, 2016

@raghavrv : are you +1 for merge? If so, let's merge

@raghavrv

This comment has been minimized.

Show comment
Hide comment
@raghavrv

raghavrv Dec 28, 2016

Member

I think this needs a whatsnew entry? I can send a quick PR after this gets merged if you want. Otherwise LGTM... That pep8 error needs to be added to ignore list... (#8131)

Member

raghavrv commented Dec 28, 2016

I think this needs a whatsnew entry? I can send a quick PR after this gets merged if you want. Otherwise LGTM... That pep8 error needs to be added to ignore list... (#8131)

@raghavrv

Pending whatsnew... (Feel free to merge and later add a whatsnew...)

@raghavrv raghavrv removed the Needs Review label Dec 28, 2016

@raghavrv raghavrv merged commit 986a49b into scikit-learn:master Dec 29, 2016

2 of 3 checks passed

continuous-integration/travis-ci/pr The Travis CI build failed
Details
ci/circleci Your tests passed on CircleCI!
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
@raghavrv

This comment has been minimized.

Show comment
Hide comment
@raghavrv

raghavrv Dec 29, 2016

Member

Okay... I'm merging this... Il make a whatsnew entry for this and another PR tomorrow...

Member

raghavrv commented Dec 29, 2016

Okay... I'm merging this... Il make a whatsnew entry for this and another PR tomorrow...

@raghavrv

This comment has been minimized.

Show comment
Hide comment
@raghavrv
Member

raghavrv commented Dec 29, 2016

Thanks @equialgo!

@equialgo

This comment has been minimized.

Show comment
Hide comment
@equialgo

equialgo Dec 29, 2016

Contributor

You guys thanks for merging!

Contributor

equialgo commented Dec 29, 2016

You guys thanks for merging!

raghavrv added a commit to raghavrv/scikit-learn that referenced this pull request Jan 5, 2017

FIX Split data using _safe_split in _permutaion_test_score (#5697)
Squashed commits:
[94fd9f4] split data using _safe_split in _permutaion_test_scorer
[522053b] adding test case test_permutation_test_score_pandas() to check if permutation_test_score plays nice with pandas dataframe/series
[21b23ce] running test_permutation_test_score_pandas on iris data to prevent warnings.
[15a48bf] adding safe_indexing to _shuffle function
[9ea5c9e] adding test case test_permutation_test_score_pandas() to check if permutation_test_score plays nice with pandas dataframe/series
[3cf5e8f] split  data using _safe_split in _permutaion_test_scorer to fix error when using Pandas DataFrame/Series

sergeyf added a commit to sergeyf/scikit-learn that referenced this pull request Feb 28, 2017

FIX Split data using _safe_split in _permutaion_test_score (#5697)
Squashed commits:
[94fd9f4] split data using _safe_split in _permutaion_test_scorer
[522053b] adding test case test_permutation_test_score_pandas() to check if permutation_test_score plays nice with pandas dataframe/series
[21b23ce] running test_permutation_test_score_pandas on iris data to prevent warnings.
[15a48bf] adding safe_indexing to _shuffle function
[9ea5c9e] adding test case test_permutation_test_score_pandas() to check if permutation_test_score plays nice with pandas dataframe/series
[3cf5e8f] split  data using _safe_split in _permutaion_test_scorer to fix error when using Pandas DataFrame/Series

@Przemo10 Przemo10 referenced this pull request Mar 17, 2017

Closed

update fork (#1) #8606

Sundrique added a commit to Sundrique/scikit-learn that referenced this pull request Jun 14, 2017

FIX Split data using _safe_split in _permutaion_test_score (#5697)
Squashed commits:
[94fd9f4] split data using _safe_split in _permutaion_test_scorer
[522053b] adding test case test_permutation_test_score_pandas() to check if permutation_test_score plays nice with pandas dataframe/series
[21b23ce] running test_permutation_test_score_pandas on iris data to prevent warnings.
[15a48bf] adding safe_indexing to _shuffle function
[9ea5c9e] adding test case test_permutation_test_score_pandas() to check if permutation_test_score plays nice with pandas dataframe/series
[3cf5e8f] split  data using _safe_split in _permutaion_test_scorer to fix error when using Pandas DataFrame/Series

NelleV added a commit to NelleV/scikit-learn that referenced this pull request Aug 11, 2017

FIX Split data using _safe_split in _permutaion_test_score (#5697)
Squashed commits:
[94fd9f4] split data using _safe_split in _permutaion_test_scorer
[522053b] adding test case test_permutation_test_score_pandas() to check if permutation_test_score plays nice with pandas dataframe/series
[21b23ce] running test_permutation_test_score_pandas on iris data to prevent warnings.
[15a48bf] adding safe_indexing to _shuffle function
[9ea5c9e] adding test case test_permutation_test_score_pandas() to check if permutation_test_score plays nice with pandas dataframe/series
[3cf5e8f] split  data using _safe_split in _permutaion_test_scorer to fix error when using Pandas DataFrame/Series

paulha added a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017

FIX Split data using _safe_split in _permutaion_test_score (#5697)
Squashed commits:
[94fd9f4] split data using _safe_split in _permutaion_test_scorer
[522053b] adding test case test_permutation_test_score_pandas() to check if permutation_test_score plays nice with pandas dataframe/series
[21b23ce] running test_permutation_test_score_pandas on iris data to prevent warnings.
[15a48bf] adding safe_indexing to _shuffle function
[9ea5c9e] adding test case test_permutation_test_score_pandas() to check if permutation_test_score plays nice with pandas dataframe/series
[3cf5e8f] split  data using _safe_split in _permutaion_test_scorer to fix error when using Pandas DataFrame/Series

maskani-moh added a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017

FIX Split data using _safe_split in _permutaion_test_score (#5697)
Squashed commits:
[94fd9f4] split data using _safe_split in _permutaion_test_scorer
[522053b] adding test case test_permutation_test_score_pandas() to check if permutation_test_score plays nice with pandas dataframe/series
[21b23ce] running test_permutation_test_score_pandas on iris data to prevent warnings.
[15a48bf] adding safe_indexing to _shuffle function
[9ea5c9e] adding test case test_permutation_test_score_pandas() to check if permutation_test_score plays nice with pandas dataframe/series
[3cf5e8f] split  data using _safe_split in _permutaion_test_scorer to fix error when using Pandas DataFrame/Series
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment