Clean up xvalidation API #4049

Open
karlnapf opened this issue Dec 25, 2017 · 5 comments

Comments

@karlnapf
Member

karlnapf commented Dec 25, 2017

  • Let's make the splitting strategy an iterator.
  • Let's rename generate_subset_indices and generate_subset_inverse to something like get_train and get_validation, or even better, train and validation.
  • Let's get rid of the build_subsets method (and do that work in the constructor, or lazily when the first index set is requested).
  • Make the methods that return indices const, and make them return a const vector (which can then be a read-only view on some pre-allocated array).

Probably more to come. Have a look at the sklearn splitting classes, e.g. here
See also #3965
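
For illustration, the points above could be sketched roughly as follows. This is a minimal Python sketch, not Shogun's actual API: the class name `KFoldSplitting`, its constructor arguments and the contiguous fold layout are assumptions made for the example. The names `train` and `validation` come from the proposal, but which old method maps to which new name is left open here. Tuples stand in for the read-only index views, and the folds are built lazily on first access instead of through a separate build_subsets call.

```python
from typing import List, Optional, Tuple


class KFoldSplitting:
    """Hypothetical k-fold splitting strategy (illustration only)."""

    def __init__(self, num_samples: int, num_folds: int) -> None:
        if not 2 <= num_folds <= num_samples:
            raise ValueError("need 2 <= num_folds <= num_samples")
        self.num_samples = num_samples
        self.num_folds = num_folds
        # Filled on first access, so no explicit build step is needed.
        self._folds: Optional[List[Tuple[int, ...]]] = None

    def _ensure_built(self) -> List[Tuple[int, ...]]:
        """Partition 0..num_samples-1 into contiguous folds, once."""
        if self._folds is None:
            base, extra = divmod(self.num_samples, self.num_folds)
            folds, start = [], 0
            for i in range(self.num_folds):
                # The first `extra` folds get one additional sample.
                size = base + (1 if i < extra else 0)
                folds.append(tuple(range(start, start + size)))
                start += size
            self._folds = folds
        return self._folds

    def _check(self, fold: int) -> None:
        if not 0 <= fold < self.num_folds:
            raise IndexError("fold index out of range")

    def validation(self, fold: int) -> Tuple[int, ...]:
        """Indices held out in the given fold, as an immutable tuple."""
        self._check(fold)
        return self._ensure_built()[fold]

    def train(self, fold: int) -> Tuple[int, ...]:
        """All indices outside the given fold, as an immutable tuple."""
        self._check(fold)
        folds = self._ensure_built()
        return tuple(i for j, f in enumerate(folds) if j != fold for i in f)
```

With 7 samples and 3 folds, this layout gives folds of sizes 3, 2 and 2, and nothing is computed until `train` or `validation` is first called.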

@karlnapf
Member Author

Good example for time-series splitting based on simple iterator approach
https://github.com/scikit-learn/scikit-learn/blob/ea6ea815e1071bfb9de18953fea98b136a7fa8ed/sklearn/model_selection/_split.py#L770
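
A simple iterator approach to time-series splitting might look like the sketch below. It is written in the spirit of the linked sklearn code and is not a copy of it: the function name `time_series_split` and its parameters are assumptions for this example. Each step yields a training range that grows over time and a validation range that immediately follows it, so validation data is never earlier than training data.

```python
from typing import Iterator, Tuple


def time_series_split(num_samples: int, num_splits: int) -> Iterator[Tuple[range, range]]:
    """Yield (train, validation) index ranges with an expanding training window."""
    if num_splits < 1 or num_splits >= num_samples:
        raise ValueError("need 1 <= num_splits < num_samples")
    # Every validation block has the same size; leftover samples go to
    # the first training window.
    block = num_samples // (num_splits + 1)
    first_start = num_samples - num_splits * block
    for start in range(first_start, num_samples, block):
        yield range(0, start), range(start, start + block)


for train, validation in time_series_split(6, 3):
    print(list(train), list(validation))
```

Because the function is a generator, no index set is materialised until the caller asks for the next split.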

@braceletboy

braceletboy commented Jan 10, 2018

@karlnapf I wanna take this up.

@karlnapf
Member Author

Go for it

@luisffranca
Contributor

Hi @karlnapf !

By "make the splitting strategy an iterator", do you mean creating a method (like split in sklearn's example) that returns iterators over the training (and test) set indices for a given subset index? Such a method would internally call generate_subset_indices and generate_subset_inverse (under the new names you suggested). Is my understanding correct?

@karlnapf
Member Author

An iterator would be a next step, much cleaner, yes
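
As a rough sketch of what the question above describes, a `split` generator can simply walk over the fold indices and call the two index accessors. Everything here is hypothetical: `LeaveOneOut` is a stand-in strategy written for this example, and `split`, `train`, `validation` and `num_folds` are assumed names, not existing Shogun methods.

```python
from typing import Iterator, Tuple


class LeaveOneOut:
    """Hypothetical strategy: each fold holds out exactly one sample."""

    def __init__(self, num_samples: int) -> None:
        self.num_samples = num_samples
        self.num_folds = num_samples

    def train(self, fold: int) -> Tuple[int, ...]:
        return tuple(i for i in range(self.num_samples) if i != fold)

    def validation(self, fold: int) -> Tuple[int, ...]:
        return (fold,)


def split(strategy) -> Iterator[Tuple[Tuple[int, ...], Tuple[int, ...]]]:
    """Yield one (train, validation) pair per fold, computed on demand."""
    for fold in range(strategy.num_folds):
        yield strategy.train(fold), strategy.validation(fold)


for train, validation in split(LeaveOneOut(3)):
    print(train, validation)
```

The caller no longer handles subset indices at all, which is the main readability gain of the iterator form.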
