groups in kenstone not working because of wrong argument check #51
Hi, the function is correct, the problem was in the documentation (which I just clarified for the upcoming version). The point is that when you use the argument In your example your max. |
Hi,
Well, I understand your reasoning, but then the code is wrong. In the case of groups, the number of groups is compared with the number of selected observations in the while loop (while it should be compared with the number of selected groups). That's why, when I use groups and k = 30, I get 30 observations and not 30 groups.
See the comparison statement in the while loop (line 199) and the update in line 220.
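To make the difference between the two stopping criteria concrete, here is a minimal toy sketch. It is not the package's code: `select_groups()`, its `count` argument and the fixed visiting order are all made up for illustration, and every group is assumed to have 5 observations.

```r
# Toy sketch, NOT the package's implementation: pick whole groups in a
# fixed (hypothetical) order and stop according to one of two criteria.
select_groups <- function(group, k, count = c("observations", "groups")) {
  count <- match.arg(count)
  group <- as.factor(group)
  picked <- character(0)              # names of the groups picked so far
  for (g in levels(group)) {          # hypothetical visiting order
    n_done <- if (count == "groups") {
      length(picked)                  # compare k with selected groups
    } else {
      sum(group %in% picked)          # compare k with selected observations
    }
    if (n_done >= k) break
    picked <- c(picked, g)
  }
  which(group %in% picked)            # row indices of all picked observations
}

group <- rep(paste0("g", 1:10), each = 5)   # 10 groups of 5 observations

idx_obs <- select_groups(group, k = 4, count = "observations")
idx_grp <- select_groups(group, k = 4, count = "groups")

length(unique(group[idx_obs]))  # groups obtained when counting observations
length(unique(group[idx_grp]))  # groups obtained when counting groups
```

With the observation-based criterion the loop stops as soon as the picked groups hold at least k observations, so far fewer than k groups are returned; with the group-based criterion exactly k groups are returned.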
Please check again
Thanks
Michi
Here is a reproducible example:
Hi Michi, |
Your first comment was accurate... there is no reason why The I added the following explanations in the documentation of the function: In the argument description: In Details: Thank you very much for noticing this issue! |
Hi, Ok yes. Although I think your explanation is not sound yet. If I'm not mistaken, in your example in details it will actually return 5 samples and not 15 (see your while loop). "For example, if k = 2 and if the first sample identified belongs to with group of 5 samples and the second one belongs to a group with 10 samples, then, the total amount of samples retrieved by the function will be 15." |
I do not think so... I did my homework
OK, you are right, your example is correct -- but that's a bit special because the first two samples are selected together. For higher k, though, the behaviour is similar to what I suspected. So I'm not sure how illustrative the example is (at least for me it's confusing).
Not sure what your suspicion was about... |
Hmm. Maybe I don't understand. My understanding is that the first two groups are selected before the while loop. So they both get selected even if one group alone (!) already has more than k observations (as in your example). This is then different for the third group which would only be added if the first two groups together have less than k observations. |
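The behaviour described in this comment can be sketched with a small toy model. This is only an illustration of the description above, not the package's code: `n_selected_obs()` is a hypothetical helper, the group sizes 5 and 10 come from the documentation example, and the third size (8) is made up.

```r
# Toy model of the described behaviour, NOT the package's implementation.
sizes <- c(5, 10, 8)   # observations per group, in (assumed) selection order

n_selected_obs <- function(sizes, k) {
  n <- sum(sizes[1:2])   # the first two groups are taken before the loop
  i <- 3
  # the loop compares the number of selected observations with k
  while (n < k && i <= length(sizes)) {
    n <- n + sizes[i]
    i <- i + 1
  }
  n
}

n_selected_obs(sizes, k = 2)    # 15: both initial groups, loop never runs
n_selected_obs(sizes, k = 15)   # 15: third group is not added
n_selected_obs(sizes, k = 16)   # 23: third group is added
```

Under this reading, k = 2 reproduces the 15 samples of the documentation example, while for larger k the third group is only added once k exceeds the 15 observations already selected.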
Hi Leonardo
Unfortunately, kenstone with groups is not working correctly.
E.g., for a dataset with 100 groups * 10 samples per group = 1000 observations, it only allows selecting at most 100 observations (k = 100).
This is because, when k > 100 in the above example, it fails here in line 157:
if (k > nlevels(group)) { stop("k is larger the the number of groups/levels in 'group'") }
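A minimal sketch of how this check behaves for the dataset described above. `check_k()` is a hypothetical stand-in that only wraps the quoted condition (its message is paraphrased); it does not call the package function.

```r
# 100 groups x 10 samples per group = 1000 observations
group <- factor(rep(seq_len(100), each = 10))
length(group)    # number of observations
nlevels(group)   # number of groups

# Hypothetical wrapper around the quoted condition, for illustration only.
check_k <- function(k, group) {
  if (k > nlevels(group)) {
    stop("k is larger than the number of groups/levels in 'group'")
  }
  invisible(TRUE)
}

check_k(100, group)   # passes: k equals the number of groups

# k = 101 triggers the error even though 1000 observations are available
res <- tryCatch(check_k(101, group), error = function(e) conditionMessage(e))
res
```

So any k above the number of groups is rejected, regardless of how many observations the dataset holds.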
best regards & thanks