add plot for calibration #402

giuseppec · 2015-07-17T20:00:41Z

We should think about adding 'calibration plots' to mlr. Two packages that implement this are:
http://www.genabel.org/PredictABEL/plotCalibration.html
http://www.inside-r.org/packages/cran/caret/docs/xyplot.calibration

berndbischl · 2015-07-18T09:59:35Z

@zmjones

Would you please look at this when you are healthy again? Maybe we can simply recode the ABEL plot in ggplot? But give us some feedback on what is in the 2 packages above and how you like it.

zmjones · 2015-07-23T15:47:11Z

Ok looked at both of them. I see no reason to limit this to binary classification except for the Hosmer-Lemeshow test. Is there a multiclass analogue? Or do we care about this at all?

schiffner · 2015-07-23T16:44:44Z

Thanks and welcome back!!!

Generally, we would be happy to have plots/tests for the multiclass case, too.
Off the top of my head I don't know of a multiclass version of the test. I have to look into it myself.

zmjones · 2015-07-23T20:55:12Z

Here is a preliminary version. Let me know what you all think when you get a chance to play with it.

zmjones · 2015-07-23T20:56:28Z

Also if you all know of a way to get tapply to not skip unused factor levels that would be helpful.
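The thread does not show the original tapply call, so the following is only a minimal sketch on hypothetical data (the names `bin` and `hit` are made up for illustration). It shows one way to get a value for every factor level, including unused ones: tapply fills empty levels with NA, while split keeps them as empty groups that can be summarised directly.

```r
# Hypothetical data: a factor with an unused level "c".
bin = factor(c("a", "a", "b"), levels = c("a", "b", "c"))
hit = c(1, 0, 1)

# tapply keeps the unused level but fills it with NA.
by.tapply = tapply(hit, bin, sum)

# split keeps empty levels too (drop = FALSE is the default),
# and sum() of an empty vector is 0, so no NA appears.
by.split = vapply(split(hit, bin), sum, numeric(1))

print(by.tapply)
print(by.split)
```

Whether 0 or NA is the right value for an empty bin depends on what is summarised; for a proportion, NA is arguably the more honest choice.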

schiffner · 2015-07-24T10:34:17Z

Thank you very much, Zach.

About tapply:
My general impression is that there is potential to simplify your code.
For example in the binary case (starting with l. 65 in plotCalibration.R) you could directly
use table(y, p_bin) and then normalize it by the colSums (!= 0) without any tapply.
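A minimal sketch of the table-based approach for the binary case, on simulated data rather than the code from the gist. The names `y` and `p_bin` follow the comment above; the data, the ten equal-width bins, and the choice of NA for empty bins are assumptions made for illustration.

```r
set.seed(1)
# Hypothetical binary data: predicted probability of "pos" and the true labels.
p = runif(200)
y = factor(ifelse(runif(200) < p, "pos", "neg"), levels = c("neg", "pos"))

# Bin the probabilities into 10 equal-width intervals.
p_bin = cut(p, breaks = seq(0, 1, by = 0.1), include.lowest = TRUE)

# Cross-tabulate truth against bin, then normalise by the column sums.
tab = table(y, p_bin)
n = colSums(tab)
# Share of positives per bin; empty bins get NA instead of a division by zero.
frac.pos = ifelse(n == 0, NA, tab["pos", ] / n)

print(round(frac.pos, 2))
```

`prop.table(tab, margin = 2)` would give the same column-wise shares, with NaN in empty bins.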

I played around a bit and this is my first try to simplify l. 43-73 with something that
works for binary and multiclass. Maybe you want to use/improve this:

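library(mlr)

task = iris.task
# task = sonar.task

# train a learner that predicts class probabilities
lrn = makeLearner("classif.lda", predict.type = "prob")
m = train(lrn, task)
pred = predict(m, task)
# one row per observation: true class plus one probability column per class
df = cbind(truth = getPredictionTruth(pred), getPredictionProbabilities(pred, cl = getTaskClassLevels(task)))
## maybe use as.data.frame(pred) instead

# long format: one row per (observation, class) pair
df = reshape2::melt(df, id.vars = "truth")
break.points = hist(df$value, breaks = "Sturges", plot = FALSE)$breaks
# the argument is ordered_result (no trailing "s"); a misspelled name is silently ignored
df$bin = cut(df$value, break.points, include.lowest = TRUE, ordered_result = TRUE)
# per bin and class: share of observations whose true class matches that class
out = plyr::ddply(df, "bin", function(x) {
    tab = table(x$variable, x$truth)
    s = rowSums(tab)
    ifelse(s == 0, 0, diag(tab) / s)
})
out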

I guess to be consistent we could again separate things into data generating and plotting functions.

I have no experience with calibration curves and not yet a strong opinion about this.
Just tossing it out for discussion if we also want to have this:
There is some literature and also some R packages that use loess or other techniques to generate
smooth calibration curves, for example packages gbm and rms:
http://www.inside-r.org/packages/cran/gbm/docs/calibrate.plot
http://www.rdocumentation.org/packages/rms/functions/calibrate
The reasoning is that binning is too arbitrary and curves resulting from different break points
can look very different.
Does anyone have experience with this?
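To make the smoothing idea concrete, here is a minimal base-R sketch using loess on simulated data. It is not the method used by gbm or rms, just an illustration of regressing the 0/1 outcome on the predicted probability; the data and all variable names are hypothetical.

```r
set.seed(1)
# Hypothetical predictions and outcomes; the true event rate is p^2,
# so the predictions are deliberately miscalibrated.
p = runif(500)
y = rbinom(500, size = 1, prob = p^2)

# Smooth the 0/1 outcome as a function of the predicted probability.
fit = loess(y ~ p)
grid = seq(0.05, 0.95, by = 0.05)
smooth = predict(fit, newdata = data.frame(p = grid))

# Smoothed calibration curve against the diagonal of perfect calibration.
plot(grid, smooth, type = "l", xlim = c(0, 1), ylim = c(0, 1),
     xlab = "predicted probability", ylab = "smoothed observed frequency")
abline(0, 1, lty = 2)
```

The curve no longer depends on break points, but it does depend on the smoother's span, so some arbitrariness remains.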

berndbischl · 2015-07-24T11:02:49Z

@giuseppec
You should check this please

berndbischl · 2015-07-24T11:03:01Z

He does have experience

zmjones · 2015-07-24T14:07:52Z

Thanks @schiffner for the help. I thought about a smoother for the same reason. I will definitely separate into plot/generate functions and also appreciate the refactoring help.

zmjones · 2015-07-24T18:02:00Z

The gist is updated now. There is an optional smoother, separate plot/generate functions, and simpler code thanks to @schiffner.

If the basic design is fine then I'll make this work with survival tasks and add a ggvis version of the plot function as well. If anyone has any nominations for other features from any of the above packages I'm all ears.

giuseppec · 2015-07-24T19:16:38Z

Nice work. I have two "requests":

Instead of just grouping the predictions into intervals (with the cut function), we could also add an option to use equally sized groups, because otherwise there can be intervals without observations. For example:

Z <- rnorm(10000)
table(Hmisc::cut2(Z, g = 10))

We should have the possibility to add the "optimal" calibration curve, which is a line from (0,0) to (1,1), see also the red line in the example from http://www.inside-r.org/packages/cran/gbm/docs/calibrate.plot.

Regarding the smooth calibration curve, I have to read a little bit more about it. But I think it also makes sense to include it.
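The difference between the two binning schemes can be sketched in base R, without Hmisc. This is an illustration under assumed names (`equal.width`, `equal.freq`), not the implementation that was merged: equal-width bins come from cut with a bin count, equal-frequency bins from cutting at the deciles.

```r
set.seed(1)
Z = rnorm(10000)

# Equal-width bins: the counts differ a lot and the tail bins hold few values.
equal.width = table(cut(Z, breaks = 10))

# Equal-frequency bins: break at the deciles, so each bin holds about 1000 values.
breaks = quantile(Z, probs = seq(0, 1, by = 0.1))
equal.freq = table(cut(Z, breaks = breaks, include.lowest = TRUE))

print(equal.width)
print(equal.freq)
```

With heavily tied predictions the quantile breaks may not be unique, in which case cut stops with an error; the breaks would then need to be deduplicated first.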

zmjones · 2015-07-25T01:18:52Z

Ok I found cut_number, which does the same thing as cut2 but is in ggplot2. Would you prefer that to be the default rather than using hist? I'll still have it so that the user can pass break points if they like.

Optimal line will be optional and true by default.

Ok sounds good on the smoother.

giuseppec · 2015-07-25T11:12:37Z

I think we can use hist as the default. Another idea, which we could also implement:
In Figure 8 from http://mrvar.fdv.uni-lj.si/pub/mz/mz3.1/vuk.pdf they show "positive examples above
the graph area (on the x-axis) and negative examples below the graph area, in what is called
'a rag'". This should be possible with geom_rug.
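A rough sketch of how the rug could look with ggplot2, on simulated data. It is not the code that was merged; the data frame, the five equal-width bins, and the colours are assumptions. The `sides` argument of geom_rug puts positives on the top edge and negatives on the bottom edge.

```r
library(ggplot2)

set.seed(1)
# Hypothetical predicted probabilities and 0/1 outcomes.
d = data.frame(p = runif(200))
d$pos = rbinom(200, size = 1, prob = d$p)

# Binned calibration curve: mean prediction and share of positives per bin.
d$bin = cut(d$p, breaks = seq(0, 1, by = 0.2), include.lowest = TRUE)
curve = aggregate(cbind(p, pos) ~ bin, data = d, FUN = mean)

p.rug = ggplot(d, aes(x = p)) +
  geom_line(data = curve, aes(y = pos)) +
  geom_point(data = curve, aes(y = pos)) +
  # Positive examples as ticks on the top edge, negative ones on the bottom edge.
  geom_rug(data = d[d$pos == 1, ], sides = "t", colour = "darkgreen") +
  geom_rug(data = d[d$pos == 0, ], sides = "b", colour = "red") +
  # Reference line for perfect calibration.
  geom_abline(intercept = 0, slope = 1, linetype = "dashed") +
  coord_cartesian(xlim = c(0, 1), ylim = c(0, 1)) +
  labs(x = "predicted probability", y = "observed frequency")

print(p.rug)
```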

zmjones · 2015-07-26T01:43:48Z

I think I got the "rag" and the binning done in an appropriate way. Lars already merged it, but if anyone has any other suggestions I am happy to work on it some more. The ggvis version is going to take a bit more work.

zmjones · 2015-07-27T18:53:42Z

This has now been added to the tutorial (under advanced) as well. I think the issue can be closed unless there are objections.

larskotthoff · 2015-07-30T22:59:25Z

Thanks, closing this.

zmjones mentioned this issue Jul 25, 2015

adds calibration generation/plot #414

Merged

larskotthoff closed this as completed Jul 30, 2015

giuseppec mentioned this issue Apr 22, 2016

Calibration-in-the-large and calibration slope #842

Closed
