FPGrowth and Apriori disagreement with minsup=0 #574
Comments
Good catch! I agree with your assessment regarding practically useful vs. theoretically correct. Setting min_support to 0 will return all possible itemsets, which is probably never useful in practice, although it would be technically correct. The two options I have in mind are:
a) Disallow min_support <= 0:
raise ValueError('`min_support` must be a positive number within the interval `(0, 1]`')
b) Allow 0 in fpmax and fpgrowth, with the behavior of returning all subsets. Who knows, maybe it is useful for people who just want to quickly get the total number of itemsets (possible combinations), although it may not be an efficient way of doing that.
I would tend toward a). Would be curious to hear what you think.
In the rare case someone wants that, you could also just use a small value, i.e., 1/nrows. The only thing 0 gives you over 1/nrows is all possible subsets, which doesn't seem useful to have. Yeah, I am good with option (a). I can't see any reason anyone would need 0 rather than 1/nrows.
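For reference, option (a) and the 1/nrows workaround could look roughly like the sketch below. This is not the code from the PR; `check_min_support` and `smallest_useful_min_support` are hypothetical names used only for illustration.

```python
def check_min_support(min_support):
    """Reject thresholds outside (0, 1], as proposed in option (a).

    Hypothetical helper, not part of the library.
    """
    if not (0.0 < min_support <= 1.0):
        raise ValueError(
            "`min_support` must be a positive number "
            "within the interval `(0, 1]`"
        )
    return min_support


def smallest_useful_min_support(n_rows):
    """Lowest threshold that still requires an itemset to occur in at
    least one of the n_rows transactions (assumed helper name)."""
    return 1.0 / n_rows


# With 4 transactions, 1/nrows passes the check; 0 is rejected.
threshold = check_min_support(smallest_useful_min_support(4))
print(threshold)
```

With this check in place, anyone who wants every itemset that actually occurs can pass 1/nrows, and only the all-subsets case is ruled out.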
Sounds good, thanks for the feedback! I just added it to the existing PR at #573
Should be fine now after #573
Noticed this while fixing another bug. FPGrowth and Apriori disagree when min_support=0, and I am not sure what the answer should be.
Suppose your input itemsets are [[a], [b]]. This produces:
We should make it consistent, but which one? The second is correct in a theoretical sense (every possible subset of all items appears at least 0 times), but the first is probably more useful in a practical sense.
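To make the two candidate answers concrete, here is a brute-force sketch in plain Python. It is neither FPGrowth nor Apriori, and `frequent_itemsets` is a hypothetical helper. It only shows that a threshold of 0 admits the itemset {a, b}, which never occurs in the data, while a threshold of 1/nrows keeps only itemsets that were actually observed.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Enumerate every non-empty subset of the item universe and keep
    those whose support is >= min_support (illustrative brute force)."""
    items = sorted(set().union(*transactions))
    n_rows = len(transactions)
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            # Support = fraction of transactions containing all items in combo.
            hits = sum(set(combo) <= set(t) for t in transactions)
            support = hits / n_rows
            if support >= min_support:
                result[frozenset(combo)] = support
    return result


transactions = [["a"], ["b"]]

# Threshold 0: every subset qualifies, including {a, b} with support 0.
all_subsets = frequent_itemsets(transactions, 0)

# Threshold 1/nrows: only itemsets seen in at least one transaction.
observed = frequent_itemsets(transactions, 1 / len(transactions))

print(all_subsets)
print(observed)
```

The number of subsets grows as 2^n in the number of distinct items, so the min_support=0 result becomes impractical very quickly.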