set_threshold sometimes returns NA response for thresholds equal to 0. #452

Closed

pfistfl opened this issue Mar 6, 2020 · 5 comments

pfistfl commented Mar 6, 2020

library(mlr3)  # lrn(), tsk(), and $set_threshold() come from mlr3

learner = lrn("classif.rpart", predict_type = "prob")
t = tsk("iris")
prd = learner$train(t)$predict(t)
# thresholds of 0 for classes that also receive predicted probabilities of 0
prd$set_threshold(c(setosa = 1, versicolor = 0, virginica = 0))
prd

This happens when a predicted probability of 0 occurs together with a threshold of 0:
(1 / 0) * 0 -> NaN -> max.col() returns NA.

This leads to problems when we e.g. tune over thresholds, as it then breaks the prediction's $score() method.
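
A minimal sketch of the arithmetic described above (illustrative only, not mlr3 internals; the probability and threshold values here are made up):

prob = c(setosa = 0, versicolor = 0.3, virginica = 0.7)
threshold = c(setosa = 0, versicolor = 0.5, virginica = 0.5)
weights = prob * (1 / threshold)  # 0 * Inf gives NaN for setosa
max.col(t(as.matrix(weights)))    # the NaN row makes max.col() return NA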

mllg commented Mar 9, 2020

This is somewhat undefined behavior. The learner says probability 0, which then gets rescaled by multiplication with 1 / threshold, and that is Inf for threshold == 0. Is this now a very good label to predict or a very bad one?
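
For illustration (made-up numbers, not mlr3 code): with threshold == 0, a positive probability rescales to Inf, so that class always wins, while a probability of 0 rescales to NaN, which is neither clearly best nor clearly worst.

0.1 * (1 / 0)  # Inf -> class with threshold 0 dominates
0   * (1 / 0)  # NaN -> undefined preference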

pfistfl commented Mar 9, 2020

I understand.
Intuitively, I would have assumed that I could enforce the prediction of any level by setting its threshold to 0. That would mean a probability could never really be 0, I guess.

We need to decide and document what happens here, but returning NA just makes things harder downstream.

berndbischl commented

We had exactly the same issue(s) in mlr2. These edge cases really need to be solved for robust applicability and ROC.

It is also not that undefined? It is simply a tie, and we need/have tie handling anyway?

berndbischl commented

It is simply a tie, and we need/have tie handling anyway?

No, sorry, I didn't read that correctly.
I think this is how it is best:

Case 1:
p = c(0.8, 0.1, 0.1)
t = c(1, 0, 0)
--> preference weights (0.8, Inf, Inf)
This is now a tie between classes 2 and 3.

Case 2:
p = c(1, 0, 0)
t = c(1, 0, 0)

I think the most reasonable thing is to define this as
pref = c(1, 0, 0),
so in cases where p == 0, the weight is always 0.
(Ties could then still occur; in the example here they do not.)

Agreed?
This would NEED to be properly documented and tested.
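
A small sketch of this rule (illustrative only; the helper threshold_weights is hypothetical, not the actual mlr3 implementation): rescale by 1 / threshold, but force the weight to 0 wherever the predicted probability is 0.

threshold_weights = function(p, t) {
  w = p * (1 / t)
  w[p == 0] = 0  # probability 0 always gets weight 0, even for threshold 0
  w
}
threshold_weights(c(0.8, 0.1, 0.1), c(1, 0, 0))  # 0.8 Inf Inf -> tie between classes 2 and 3
threshold_weights(c(1, 0, 0), c(1, 0, 0))        # 1 0 0       -> no tie here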

mllg closed this as completed in 2670210 on Mar 9, 2020
mllg commented Mar 9, 2020

Agreed. Documented and tested now.
