Add a LearningCurveSplitter to test sample efficiency #23817

Open
CompRhys opened this issue Jul 1, 2022 · 0 comments
Labels
Needs Decision - Include Feature, New Feature

CompRhys commented Jul 1, 2022

Describe the workflow you want to enable

In applications of machine learning, the error of a model typically follows a power law in the number of training examples. In the ML for materials science and molecules community there is a lot of interest in using such learning-curve plots to assess model performance. However, as of yet there is no standard splitting tool for making such splits. A standard splitter would improve the reproducibility and uptake of such studies.
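To make the power-law claim concrete: if the error behaves roughly like `a * n**(-b)` for `n` training examples, the points fall on a straight line in log-log space, and `b` is the slope that such plots are used to compare. A minimal sketch of that fit follows; `fit_power_law` is a hypothetical helper name, and the numbers in the usage lines are synthetic values made up for illustration, not results from any study.

```python
import math


def fit_power_law(train_sizes, errors):
    """Fit error ~ a * n**(-b) by ordinary least squares in log-log space.

    Hypothetical helper for illustration only. Returns the pair (a, b).
    """
    xs = [math.log(n) for n in train_sizes]
    ys = [math.log(e) for e in errors]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)

    # Slope and intercept of the best-fit line log(error) = intercept + slope * log(n).
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    intercept = y_mean - slope * x_mean

    # The decay exponent b is the negated slope; a is exp(intercept).
    return math.exp(intercept), -slope


# Synthetic, noise-free data generated from a known power law (illustration only).
sizes = [10, 100, 1000, 10000]
errs = [2.0 * n ** -0.5 for n in sizes]
a, b = fit_power_law(sizes, errs)
```

On real learning curves the fit is usually only meaningful over the range of sizes where the curve is actually linear in log-log space.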

Describe your proposed solution

A learning-curve splitter that returns a series of training sets of various sizes alongside a fixed test set. Kwargs would control whether the training set sizes are spaced logarithmically, how many sizes to use, whether to draw multiple splits for each training set size, etc.
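A rough sketch of what such a splitter could look like, written as a plain generator rather than a real scikit-learn splitter class. Everything here is an assumption made for illustration: the function name `learning_curve_splits` and the parameters `test_size`, `n_sizes`, `min_train`, `log_spacing`, `n_repeats` and `seed` are hypothetical and not an existing scikit-learn API. It uses only the standard library; an actual implementation would presumably return NumPy index arrays like the existing splitters do.

```python
import random


def learning_curve_splits(n_samples, test_size=0.2, n_sizes=5, min_train=10,
                          log_spacing=True, n_repeats=1, seed=0):
    """Yield (train_indices, test_indices) pairs for a learning curve.

    Hypothetical helper, not part of scikit-learn. The test set is drawn
    once and reused unchanged for every training set size.
    """
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)

    # Hold out a fixed test set; the rest is the pool for training subsets.
    n_test = int(round(test_size * n_samples))
    test_idx, pool = indices[:n_test], indices[n_test:]
    max_train = len(pool)
    if not 1 <= min_train <= max_train:
        raise ValueError("min_train must lie between 1 and the pool size")

    # Training set sizes from min_train to max_train, log or linear spacing.
    sizes = []
    for i in range(n_sizes):
        frac = i / (n_sizes - 1) if n_sizes > 1 else 1.0
        if log_spacing:
            size = min_train * (max_train / min_train) ** frac
        else:
            size = min_train + (max_train - min_train) * frac
        sizes.append(int(round(size)))
    sizes = sorted(set(sizes))  # drop duplicates caused by rounding

    for size in sizes:
        for _ in range(n_repeats):
            # Each repeat is an independent random subset of the pool.
            yield rng.sample(pool, size), list(test_idx)


# Usage: 100 samples, 20 held out, training sizes spaced between 10 and 80.
for train_idx, test_idx in learning_curve_splits(100, n_sizes=4, n_repeats=2):
    pass  # fit on train_idx, score on the fixed test_idx
```

One design choice to decide on would be whether smaller training sets should be nested inside larger ones or, as in this sketch, drawn independently for each size and repeat.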

Describe alternatives you've considered, if relevant

No response

Additional context

[image] This is an example of the type of plot used to compare models: https://www.nature.com/articles/s41467-020-18556-9

CompRhys added the Needs Triage and New Feature labels Jul 1, 2022
Micky774 added the Needs Decision - Include Feature label and removed the Needs Triage label Jul 8, 2022
2 participants