ggplot.resamples and plot.resamples when results constant #1007

Closed
c1au6i0 opened this issue Feb 14, 2019 · 5 comments

@c1au6i0 c1au6i0 commented Feb 14, 2019

Calling plot or ggplot on an object of class resamples draws no data points for a model whose resampled values (data$model$values) are constant or nearly constant.
The problem seems to be in these lines of the ggplot.resamples function (line 982):

results <- lapply(
      plotData,
      function(x, cl) {
        ttest <- try(t.test(x$value, conf.level = cl),
                     silent = TRUE)
        if (class(ttest)[1] == "htest") {
          out <- c(ttest$conf.int, ttest$estimate)
          names(out) <-
            c("LowerLimit", "UpperLimit", "Estimate")
        } else
          out <- rep(NA, 3)
        out
      }

In particular, t.test throws an error on such data instead of returning a confidence interval or a mean. An example:

t.test(rep(1,10), conf.level = 0.95)

Error in t.test.default(rep(1, 10), conf.level = 0.95) : data are essentially constant 

As a result, out becomes a vector of NA, so not even the mean is plotted.
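A minimal base-R sketch of the failure in isolation. The helper name `ci_or_na` is hypothetical; it only mirrors the logic quoted above and is not a caret function.

```r
# Hypothetical helper mirroring the quoted logic (not part of caret's API).
ci_or_na <- function(values, cl = 0.95) {
  ttest <- try(t.test(values, conf.level = cl), silent = TRUE)
  if (inherits(ttest, "htest")) {
    out <- c(ttest$conf.int, ttest$estimate)
  } else {
    # t.test() raised an error, so every summary is lost, including the mean
    out <- rep(NA_real_, 3)
  }
  names(out) <- c("LowerLimit", "UpperLimit", "Estimate")
  out
}

varying  <- ci_or_na(c(0.80, 0.85, 0.90, 0.82, 0.88))
constant <- ci_or_na(rep(1, 10))

print(varying)   # finite limits and estimate
print(constant)  # all three entries are NA
```

With constant input there is nothing left to plot, which matches the empty panel described in this report.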

I suppose a model with nearly constant results is a rare occurrence, but I noticed this because it happened to me. :)

Claudio

@c1au6i0 c1au6i0 changed the title from "ggplot.resamples and plot.resamples when results constants" to "ggplot.resamples and plot.resamples when results constant" Feb 14, 2019

@topepo topepo (Owner) commented Feb 19, 2019

So maybe return

out <- c(LowerLimit = NA_real_, UpperLimit = NA_real_, Estimate = mean(x$value, na.rm = TRUE))

instead?
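A rough sketch of how that fallback could behave, in base R only. The function name `ci_or_mean` is made up for illustration, and this is not necessarily the code that ended up in caret.

```r
# Hypothetical helper (not caret's code): the quoted logic plus the suggested fallback.
ci_or_mean <- function(values, cl = 0.95) {
  ttest <- try(t.test(values, conf.level = cl), silent = TRUE)
  if (inherits(ttest, "htest")) {
    out <- c(ttest$conf.int, ttest$estimate)
    names(out) <- c("LowerLimit", "UpperLimit", "Estimate")
  } else {
    # No interval is available, but the mean can still be reported
    out <- c(LowerLimit = NA_real_,
             UpperLimit = NA_real_,
             Estimate = mean(values, na.rm = TRUE))
  }
  out
}

res <- ci_or_mean(rep(1, 10))
print(res)
```

The limits stay NA, so no error bar can be drawn, but the estimate survives and the point itself can be plotted.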

@c1au6i0 c1au6i0 (Author) commented Feb 19, 2019

Tested, that works.

Thanks :)

@topepo topepo (Owner) commented Mar 25, 2019

Do you have a test case that I can use for a unit test?

@c1au6i0 c1au6i0 (Author) commented Mar 25, 2019

uhm,
would something like this work?

library(caret)    # train(), trainControl(), resamples()
library(mlbench)  # Sonar data
data(Sonar)
set.seed(998)

# one predictor is an exact copy of the outcome (Class),
# so the resampled performance should be constant
Sonar[, "Vx"] <- Sonar$Class

fitControl <- trainControl(## 5-fold CV, repeated 5 times
                           method = "repeatedcv",
                           number = 5,
                           repeats = 5)

gbmFit1 <- train(Class ~ ., data = Sonar,
                 method = "gbm",
                 trControl = fitControl,
                 verbose = FALSE)


resamps <- resamples(list(gbmFit1, gbmFit1))

ggplot(resamps)

@topepo topepo closed this in b91539f Mar 25, 2019

@topepo topepo (Owner) commented Mar 25, 2019

Give those changes a try. It works for the example that I cooked up.
