Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+1] Rename CV params n_{folds,iter} to n_splits #7187

Merged
merged 17 commits into from
Aug 16, 2016

Conversation

yenchenlin
Copy link
Contributor

@yenchenlin yenchenlin commented Aug 13, 2016

Since now all CV splitters have get_n_splits(), we can rename n_{folds,iter} to n_splits in order to avoid confusion.

More detailed discussion can be found in #7169 .

  • Rename n_iter to n_splits
  • Rename n_ folds to n_splits

@@ -134,7 +134,7 @@ def test_cross_validator_with_default_params():
n_unique_labels = 4
n_folds = 2
p = 2
n_iter = 10 # (the default value)
n_splits= 10 # (the default value)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space

@jnothman
Copy link
Member

One question is whether we should be doing any deprecation... I think not, in this case.

@jnothman
Copy link
Member

Test failures.

@yenchenlin
Copy link
Contributor Author

yenchenlin commented Aug 14, 2016

Oh yeah, I submit this PR to ask question:

Should I modify old splitter in sklearn/cross_validation.py?
This line seems to use old splitter, and therefore results in error.

Update: Never mind, I was trapped 👋 .

@jnothman
Copy link
Member

More test failures, though!

@jnothman
Copy link
Member

Remaining failures are uses of n_folds in doc/tutorial/statistical_inference/model_selection.rst

@yenchenlin yenchenlin changed the title [WIP] Rename CV params n_{folds,iter} to n_splits [MRG] Rename CV params n_{folds,iter} to n_splits Aug 14, 2016
@yenchenlin
Copy link
Contributor Author

@jnothman @MechCoder CI is smiling 😄 , I think it's ready to be reviewed!

@@ -128,7 +128,7 @@
# To get nice curve, we need a large number of iterations to
# reduce the variance
grid = GridSearchCV(clf, refit=False, param_grid=param_grid,
cv=ShuffleSplit(train_size=train_size, n_iter=250,
cv=ShuffleSplit(train_size=train_size, n_splits=250,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PEP8

@jnothman
Copy link
Member

jnothman commented Aug 14, 2016

Multiple PEP8 line length violations...

"train_size=None)")
ps_repr = "PredefinedSplit(test_fold=array([1, 1, 2, 2]))"

n_splits = [n_samples, comb(n_samples, p), n_folds, n_folds,
n_unique_labels, comb(n_unique_labels, p), n_iter, 2]
splits_cnts = [n_samples, comb(n_samples, p), n_splits, n_splits,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expected_n_splits or n_splits_expected might be better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that cnt, apart from being an abbreviation we should try to avoid for clarity, means the same as n. So splits_cnt means the same as n_splits and so the naming tells us little of what it contains.

Another option is renaming the new n_splits to n_splits_param

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the elaboration, it's useful for me in the future 😄

@jnothman
Copy link
Member

Otherwise LGTM

@jnothman
Copy link
Member

Please draft a change to what's new, both in the Model Selection Enhancements section and in, perhaps, API Changes.

@raghavrv
Copy link
Member

Yes this should go into model selection enhancements section. Otherwise with a whatsnew and reviews addressed this LGTM too.... Let's merge this! Thanks for the PR @yenchenlin

@@ -62,6 +62,16 @@ Model Selection Enhancements and API Changes
the corresponding parameter is not applicable. Additionally a list of all
the parameter dicts are stored at ``results_['params']``.

- **Renaming of `n_folds` and `n_iter` to `n_splits`**

The number of folds (i.e. the number of partitions of a dataset) is not
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this doesn't need so much motivation here, you just need to say something like "In the new model_selection module, some parameter names have changed: the n_folds parameter in x, y, z is now n_splits; the n_iter parameter in a, b, c is now n_splits". Please use :class: as appropriate.

@yenchenlin
Copy link
Contributor Author

Comments addressed, please have a look!

@jnothman
Copy link
Member

LGTM

@jnothman jnothman changed the title [MRG] Rename CV params n_{folds,iter} to n_splits [MRG+1] Rename CV params n_{folds,iter} to n_splits Aug 16, 2016
@@ -62,6 +62,16 @@ Model Selection Enhancements and API Changes
the corresponding parameter is not applicable. Additionally a list of all
the parameter dicts are stored at ``results_['params']``.

- **Renaming of ``n_folds`` and ``n_iter`` to ``n_splits``**

Some parameter names have changed:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could remove this line I think. You already have a heading...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this line is good.

Copy link
Member

@raghavrv raghavrv Aug 16, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought after the below suggested change, this is not needed...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps...? I'm unconvinced that this series of comments markedly contributes to the quality of the text in terms of convention, clarity or information structure.

@raghavrv
Copy link
Member

Feel free to add a +1 once the comments are addressed. Ping @jnothman for merge once done...

@raghavrv
Copy link
Member

raghavrv commented Aug 16, 2016

@jnothman Would you like to merge it as such? LGTM...

@yenchenlin
Copy link
Contributor Author

@raghavrv great thanks a lot for your review!
@jnothman comments addressed!

@agramfort
Copy link
Member

Lgtm

@jnothman merge if you are happy

@MechCoder MechCoder merged commit 42120e5 into scikit-learn:master Aug 16, 2016
@MechCoder
Copy link
Member

(Y)

@amueller
Copy link
Member

aaaand my book broke ^^ glad it's not in print yet lol. I should do that release some time soom.

@jnothman
Copy link
Member

hahahahaha. Glad you wrote it for the version where everything changes.

@amueller
Copy link
Member

Well that's why I knew I couldn't do it for 0.17. But It would be great to release 0.18 within the next month (which I think is doable).

@amueller
Copy link
Member

Shouldn't the default implementation of get_n_splits be return n_splits?

@jnothman
Copy link
Member

I think it is better explicitly implemented...?

TomDLT pushed a commit to TomDLT/scikit-learn that referenced this pull request Oct 3, 2016
* Rename n_iter to n_splits

* Fix bug

* Fix examples

* Add spaces

* Rename n_folds to n_splits

* Fix error

* Fix doc

* Fix doc

* Fix example

* Rename variables name

* PEP8

* Fix error message

* Add whats_new

* Fix test

* Fix doc

* Fix doc

* Make test clear
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants