add plot for calibration #402

Closed
giuseppec opened this issue Jul 17, 2015 · 16 comments

Comments

@giuseppec
Contributor

We should think about adding 'calibration plots' to mlr. Two packages that implement this are:
http://www.genabel.org/PredictABEL/plotCalibration.html
http://www.inside-r.org/packages/cran/caret/docs/xyplot.calibration

@berndbischl
Member

@zmjones

Would you please look at this when you are healthy again? Maybe we can simply recode the ABEL plot in ggplot? But please give us some feedback on what is in the two packages above and how you like them.

@zmjones
Contributor

zmjones commented Jul 23, 2015

Ok looked at both of them. I see no reason to limit this to binary classification except for the Hosmer-Lemeshow test. Is there a multiclass analogue? Or do we care about this at all?

@schiffner
Contributor

Thanks and welcome back!!!

Generally, we would be happy to have plots/tests for the multiclass case, too.
Off the top of my head I don't know of a multiclass version of the test. I have to look into it myself.

@zmjones
Contributor

zmjones commented Jul 23, 2015

Here is a preliminary version. Let me know what you all think when you get a chance to play with it.

@zmjones
Contributor

zmjones commented Jul 23, 2015

Also if you all know of a way to get tapply to not skip unused factor levels that would be helpful.
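
For readers following along, a minimal base-R sketch of the behaviour in question, using made-up data (the vectors `p` and `y` and the break points are assumptions for illustration, not taken from the gist). As far as base R goes, tapply keeps an unused factor level in its result but fills it with NA, whereas table reports it as 0:

```r
# Hypothetical toy data: predicted probabilities cut into three intervals,
# where the middle interval receives no observations.
p = c(0.05, 0.10, 0.90, 0.95)
y = c(0, 0, 1, 1)
bin = cut(p, breaks = c(0, 1/3, 2/3, 1), include.lowest = TRUE)

# tapply keeps the empty level but returns NA for it
frac = tapply(y, bin, mean)

# table counts the empty level as 0, so the level is never lost
counts = table(bin)

# one possible way to turn the NA of an empty bin into an explicit 0
frac.filled = ifelse(is.na(frac), 0, frac)
```

Whether an empty bin should show up as 0 or as a missing value in a calibration plot is a design choice; the replacement by 0 above is only one option.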

@schiffner
Contributor

Thank you very much, Zach.

About tapply:
My general impression is that there is potential to simplify your code.
For example in the binary case (starting with l. 65 in plotCalibration.R) you could directly
use table(y, p_bin) and then normalize it by the colSums (!= 0) without any tapply.

I played around a bit and this is my first try to simplify l. 43-73 with something that
works for binary and multiclass. Maybe you want to use/improve this:

library(mlr)

task = iris.task
# task = sonar.task

# train a learner that outputs class probabilities
lrn = makeLearner("classif.lda", predict.type = "prob")
m = train(lrn, task)
pred = predict(m, task)
# one row per observation: true class plus one probability column per class
df = cbind(truth = getPredictionTruth(pred), getPredictionProbabilities(pred, cl = getTaskClassLevels(task)))
## maybe use as.data.frame(pred) instead

# long format: one row per (observation, class) pair
df = reshape2::melt(df, id.vars = "truth")
# bin the predicted probabilities
break.points = hist(df$value, breaks = "Sturges", plot = FALSE)$breaks
df$bin = cut(df$value, break.points, include.lowest = TRUE, ordered_result = TRUE)
# per bin and class: fraction of observations whose true class is that class
out = plyr::ddply(df, "bin", function(x) {
    tab = table(x$variable, x$truth)
    s = rowSums(tab)
    freq = ifelse(s == 0, 0, diag(tab) / s)
    freq
    }
)
out

I guess to be consistent we could again separate things into data generating and plotting functions.

I have no experience with calibration curves and not yet a strong opinion about this.
Just tossing it out for discussion if we also want to have this:
There is some literature and also some R packages that use loess or other techniques to generate
smooth calibration curves, for example packages gbm and rms:
http://www.inside-r.org/packages/cran/gbm/docs/calibrate.plot
http://www.rdocumentation.org/packages/rms/functions/calibrate
The reasoning is that binning is too arbitrary and curves resulting from different break points
can look very different.
Does anyone have experience with this?
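
To make the smoothing idea concrete, here is a rough base-R sketch using stats::loess on simulated data. It is not the method used by gbm or rms; the data, the span, and the evaluation grid are assumptions chosen only for illustration:

```r
set.seed(1)
n = 500
p = runif(n)          # hypothetical predicted probabilities
y = rbinom(n, 1, p)   # outcomes simulated so that p is well calibrated

# smooth the 0/1 outcomes against the predicted probabilities
fit = loess(y ~ p, span = 0.75)

# evaluate the smoothed calibration curve on a grid inside the range of p
grid = seq(0.05, 0.95, by = 0.05)
smooth = predict(fit, newdata = data.frame(p = grid))

# loess is not constrained to [0, 1], so clip the estimates
smooth = pmin(pmax(smooth, 0), 1)
```

No break points are needed here, but the span takes over their role as a tuning parameter, so the arbitrariness is reduced rather than removed.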

@berndbischl
Member

@giuseppec
You should check this please

@berndbischl
Member

He does have experience

@zmjones
Contributor

zmjones commented Jul 24, 2015

Thanks @schiffner for the help. I thought about a smoother for the same reason. I will definitely separate into plot/generate functions and also appreciate the refactoring help.

@zmjones
Contributor

zmjones commented Jul 24, 2015

The gist is updated now. There is an optional smoother, separate plot/generate functions, and simpler code thanks to @schiffner.

If the basic design is fine then I'll make this work with survival tasks and add a ggvis version of the plot function as well. If anyone has any nominations for other features from any of the above packages I'm all ears.

@giuseppec
Contributor Author

Nice work. I have two "requests":

  1. Instead of only grouping the predictions into fixed intervals (with the cut function), we could also add an option to use equally sized groups, because some intervals can end up without observations. For example:
Z <- rnorm(10000)
table(Hmisc::cut2(Z, g = 10))
  2. We should have the possibility to add the "optimal" calibration curve, which is a line from (0,0) to (1,1), see also the red line in the example from http://www.inside-r.org/packages/cran/gbm/docs/calibrate.plot.
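
The difference behind the first request can be sketched in base R without Hmisc. This is only an illustration under assumptions (simulated normal data, ten groups, quantile-based break points standing in for cut2):

```r
set.seed(1)
z = rnorm(10000)

# equal-width intervals: the outer intervals hold very few values
width.bins = cut(z, breaks = 10)

# equally sized groups: break at the deciles instead
freq.breaks = quantile(z, probs = seq(0, 1, by = 0.1))
freq.bins = cut(z, breaks = freq.breaks, include.lowest = TRUE)

# compare the smallest and largest group under each scheme
range(table(width.bins))
range(table(freq.bins))
```

With equally sized groups every point on the calibration curve is based on the same number of predictions, at the cost of intervals with unequal width.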

Regarding the smooth calibration curve, I have to read a little bit more about it. But I think it also makes sense to include it.

@zmjones
Contributor

zmjones commented Jul 25, 2015

Ok I found cut_number which does the same thing as cut2 but is in ggplot2. Would you prefer that to be the default rather than using hist? I'll still have it so that the user can pass break points if they like.

Optimal line will be optional and true by default.

Ok sounds good on the smoother.

@giuseppec
Contributor Author

I think we can use hist as default. Another idea, which we could also implement:
In Figure 8 from http://mrvar.fdv.uni-lj.si/pub/mz/mz3.1/vuk.pdf they show "positive examples above
the graph area (on the x-axis) and negative example below the graph area, in what is called
"a rag"". This should be possible with geom_rug.
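
A rough sketch of the rug idea, written with base graphics (rug) rather than ggplot2 so that it runs without extra packages; in ggplot2 the counterpart would be geom_rug with its sides argument. The simulated data and the axis labels are assumptions for illustration:

```r
set.seed(1)
p = runif(200)          # hypothetical predicted probabilities
y = rbinom(200, 1, p)   # hypothetical binary outcomes

pdf(NULL)  # null device, so the sketch also runs non-interactively
# reference line from (0,0) to (1,1) as a stand-in for the calibration plot
plot(c(0, 1), c(0, 1), type = "l", lty = 2,
     xlab = "predicted probability", ylab = "observed frequency")
rug(p[y == 1], side = 3)  # positive examples along the top
rug(p[y == 0], side = 1)  # negative examples along the bottom
invisible(dev.off())
```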

@zmjones
Contributor

zmjones commented Jul 26, 2015

I think I got the "rag" and the binning done in an appropriate way. Lars already merged it, but if anyone has any other suggestions I am happy to work on it some more. The ggvis version is going to take a bit more work.

@zmjones
Contributor

zmjones commented Jul 27, 2015

This has now been added to the tutorial (under advanced) as well. I think the issue can be closed unless there are objections.

@larskotthoff
Member

Thanks, closing this.

5 participants