Problems when only one class manifest in the training target #123
Comments
MLJLM uses base data types (vector of reals), so it doesn't get the levels information and will just see a vector with a single unique element. It may be made to take an optional MLJLinearModels.jl/src/fit/default.jl Line 40 in 55f22b0
gets a that would mean calling |
This here MLJLinearModels.jl/src/mlj/interface.jl Lines 57 to 81 in 55f22b0
should get the right number of classes though via the interface. Would need to figure out why the there's definitely a bug here somewhere |
It has something to do with the encoding, which is weird. In the encoding, the classes are numbered starting from -1, not 0 or 1. |
@tlienart The problem is that, for some reason I don't understand at all, the encoding of MLJLinearModels.jl/src/mlj/interface.jl Line 66 in 55f22b0
If we subsample and only see one of the two classes, then the encoded
is incorrect. The problem for me is that I really can't figure out what this

For what it's worth, a better and safer design would probably be to remove all this binary special-casing altogether, if that makes sense here. But maybe you have your reasons... |
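To make the failure mode concrete, here is a minimal sketch of how a -1/1 encoding can go wrong on a subsample. It is not the code in `interface.jl`: both encoder functions are hypothetical and written only for illustration, and the sketch assumes exactly two levels and that `CategoricalArrays` is installed.

```julia
using CategoricalArrays

y_full = categorical(["a", "b", "b", "a"])   # two levels: "a" and "b"
y_fold = y_full[[2, 3]]                      # a subsample where only "b" is observed

# Hypothetical encoder based on the level pool: level 1 -> -1, level 2 -> 1.
encode_from_levels(y) = 2 .* levelcode.(y) .- 3

# Hypothetical encoder based only on the values actually observed in y.
function encode_from_observed(y)
    seen = unique(y)
    codes = [findfirst(==(v), seen) for v in y]
    return 2 .* codes .- 3
end

encode_from_levels(y_fold)     # [1, 1]: "b" keeps the code it has in the full data
encode_from_observed(y_fold)   # [-1, -1]: "b" is mistaken for the first class
```

An encoder that consults the level pool gives the same code for "b" whether or not "a" shows up in the fold, which is the property the subsampled case needs.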
The distinction binary/multiclass is in the use of the internal representation of the vector. For binary it's more convenient to have Anyway here the problem is that I was trying to do a bit too much for the user:
Anyway, I've removed this by only recoding to -1/1 if the user explicitly specifies Binary; otherwise a Multinomial case with redundant computations is used. A test case with your example is added. Maybe not the way you'd have preferred, but I can't do much more at the moment |
In the training target below we have `length(levels(y)) == 2`, but `y` itself only exhibits one class. This is crashing `fit`. Occasionally, especially in smaller data sets, a large class may be "hidden" when we restrict to a particular fold, so this is an issue.
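The original snippet did not survive on this page. As a stand-in, here is a minimal sketch of how such a target can arise, assuming `CategoricalArrays` is installed; the data and variable names are made up for illustration and are not the example from the report.

```julia
using CategoricalArrays

y_all = categorical(["a", "b", "a", "a"])   # full target with two classes
y = y_all[[1, 3, 4]]                        # a fold that happens to contain only "a"

length(levels(y))   # 2: the level pool still remembers "b"
length(unique(y))   # 1: only one class is actually observed
```

Indexing a categorical vector keeps the full level pool, so any code that looks only at the observed values of `y` will undercount the classes on a fold like this one.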