[MRG] final edits to the class #3
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Hi @gverbock
I made tiny edits to the docstrings and the class and added a few additional asserts and tests to the test file.
I have one final concern. I am not sure split_distrinct is working as I expect. Maybe my expectation is wrong.
For example, if
df["A"] = [ A, A, A, A, A, A, B, C, D, E]split_distinct=Falseandsplit_frac=0.5, I expect basis to be [ A, A, A, A, A] and test [ A, B, C, D, E]split_distinct=Trueandsplit_frac=0.5, I expect basis to be [ A, A, A, A, A, A] and test [ B, C, D, E]Also when split_distinct=True and the split_col is numerical, are we obtaining what we expect?
For example, if
df["A"] = [ 0, 0, 0, 0, 0, 0, 1, 2, 3, 4]split_distinct=Falseandsplit_frac=0.5, I expect basis to be [ 0, 0, 0, 0, 0] and test [0, 1, 2, 3, 4]split_distinct=Trueandsplit_frac=0.5, I expect basis to be [ 0, 0, 0, 0, 0, 0] and test [1, 2, 3, 4]Are we obtaining that? can we test? or could you tell me which one is the test for this? or maybe I am interpreting the functionality wrongly?
Sorry to ask, but I can't see it directly from the logic and I've been thinking about this and going back and forth for a while now, so my brain is burnt.
I also added comments in the test file in the original PR to elaborate more on my doubts. Could you have a look at those?
Thank you!