Add split value argument to ColSplitter #3737

Merged: jph00 merged 2 commits into fastai:master from colsplitter-on-value on Jul 17, 2022

Contributor

@DanteOz DanteOz commented Jul 15, 2022

Adds an optional parameter to ColSplitter that specifies which value to split on. The intention is to simplify using a k-fold validation index column with the DataBlock API. Example usage:

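import pandas as pd
from fastai.data.transforms import ColSplitter

df = pd.DataFrame({
    'a': [0,1,2,3,4,5],
    'fold': [1,2,3,1,2,3]
})
# Split on the value 3 in the 'fold' column
splits = ColSplitter('fold', 3)(df)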
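
To illustrate what splitting on a value means for k-fold validation, here is a minimal pure-Python sketch. `split_on_value` is a hypothetical helper written for this illustration, not fastai's implementation, and it assumes that rows whose fold matches the given value form the validation set while all other rows form the training set.

```python
def split_on_value(folds, on):
    """Return (train_idx, valid_idx) for a fold column.

    Hypothetical helper for illustration only, not part of fastai.
    Assumption: rows whose fold equals `on` are the validation rows.
    """
    train_idx = [i for i, f in enumerate(folds) if f != on]
    valid_idx = [i for i, f in enumerate(folds) if f == on]
    return train_idx, valid_idx


folds = [1, 2, 3, 1, 2, 3]  # same 'fold' column as in the example above

# One (train, valid) split per distinct fold value, as in k-fold validation
kfold_splits = {k: split_on_value(folds, k) for k in sorted(set(folds))}
```

Looping over the distinct fold values like this yields one split per fold, which is the pattern a value argument is meant to make easier.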

@DanteOz DanteOz marked this pull request as ready for review July 17, 2022 01:12
@DanteOz DanteOz requested a review from jph00 as a code owner July 17, 2022 01:12
Contributor Author

DanteOz commented Jul 17, 2022

@jph00 Ready for review. I also added support for splitting on a list of values, following the Discord discussion with Benjamin.
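
As a rough sketch of what splitting on a list of values could look like, the snippet below holds out several folds at once. `split_on` is a hypothetical helper, not the code from this PR, and it assumes that any row whose fold appears in the list goes to the validation set.

```python
def split_on(folds, on):
    """Return (train_idx, valid_idx); `on` is one value or a list of values.

    Hypothetical helper for illustration only, not part of fastai.
    """
    # Normalize a single value to a one-element set so both cases share a path
    targets = set(on) if isinstance(on, (list, tuple, set)) else {on}
    valid_idx = [i for i, f in enumerate(folds) if f in targets]
    train_idx = [i for i, f in enumerate(folds) if f not in targets]
    return train_idx, valid_idx


folds = [1, 2, 3, 1, 2, 3]
# Hold out folds 2 and 3 together; only fold 1 remains for training
train_idx, valid_idx = split_on(folds, [2, 3])
```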

@jph00 jph00 merged commit caf805e into fastai:master Jul 17, 2022
Member

jph00 commented Jul 17, 2022

Good stuff!

@DanteOz DanteOz deleted the colsplitter-on-value branch July 19, 2022 23:26