
replaced pROC metrics and optimized sections of code #482

Merged
merged 31 commits into topepo:master from JackStat:master on Sep 8, 2016

Conversation

JackStat
Contributor

@JackStat JackStat commented Sep 3, 2016

Sorry for all of the commits but I kept finding stuff that I wanted to polish...

x = rbind(iris[,1:4], iris[,1:4], iris[,1:4], iris[,1:4], iris[,1:4], iris[,1:4])
y = rep(iris$Species, 6)

### Revamped Function
filterVarImp(x,y)
#              setosa versicolor virginica
# Sepal.Length 0.9846     0.9326    0.9846
# Sepal.Width  0.1656     0.6636    0.6636
# Petal.Length 1.0000     1.0000    1.0000
# Petal.Width  1.0000     1.0000    1.0000

### Original Function
filterVarImp1(x,y)
#              setosa versicolor virginica
# Sepal.Length 0.9846     0.9326    0.9846
# Sepal.Width  0.1656     0.6636    0.6636
# Petal.Length 1.0000     1.0000    1.0000
# Petal.Width  1.0000     1.0000    1.0000

library(microbenchmark)
microbenchmark(
   filterVarImp(x,y)
   ,filterVarImp1(x,y)
 )

# Unit: milliseconds
#                 expr      min       lq      mean    median       uq       max neval
#   filterVarImp(x, y) 2.760626 2.848932  3.361481  2.985821  3.32962  7.639012   100
#  filterVarImp1(x, y) 9.136088 9.682239 10.970366 10.411569 12.21326 15.967631   100

@topepo
Owner

topepo commented Sep 3, 2016

Wow, this looks great. If I had known of your package I would not have written precision, recall, F1, and the others.

For the expected results, can you go back to hand-coding the expected results instead of comparing to the other function output?

Also, how do your functions expect the "event" to be listed for the two-class functions? caret assumes that the first factor level is the event but other packages (such as pROC) assume the opposite. We just went through a few issues and PRs to make sure that we get the directionality correct.
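For reference, a minimal sketch of that convention (made-up data, not from the PR; sensitivity() is caret's exported function):

# caret treats the first factor level ("yes" here) as the event of interest
obs  <- factor(c("yes", "no", "yes", "no"), levels = c("yes", "no"))
pred <- factor(c("yes", "no", "no",  "no"), levels = c("yes", "no"))

caret::sensitivity(pred, obs)
# [1] 0.5    # one of the two "yes" cases was predicted correctly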

Finally, I didn't see a function for the area under the precision-recall curve. Is that something that you will be adding?

@JackStat
Contributor Author

JackStat commented Sep 3, 2016

I recently began getting frustrated with the other metrics-type packages because they were slow or had issues computing on bigger data.

What do you want me to do here?
"For the expected results, can you go back to hand-coding the expected results instead of comparing to the other function output?"

For the directionality, my package will convert a factor to a numeric and subtract 1. I am happy to add some tests/code for this so it behaves as expected.

I opened an issue in ModelMetrics to add precision-recall curve and I might rework aaa.R so it is supported by ModelMetrics as well.

I realize now that my commits doubled up for some reason... Sorry about that

@JackStat
Contributor Author

JackStat commented Sep 3, 2016

I think I see what you are talking about with pROC. When the AUC is < .50 they automatically flip it for some reason. My package doesn't do that
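For reference, a minimal sketch of that difference (made-up data; I'm assuming default arguments in both packages):

set.seed(42)
obs  <- factor(rep(c("a", "b"), each = 50))
# a deliberately "backwards" predictor: high scores for the first class
prob <- c(runif(50, 0.4, 1), runif(50, 0, 0.6))

# pROC's default direction = "auto" is chosen from the group medians, which in
# a case like this flips what would otherwise be an AUC below 0.5
pROC::auc(obs, prob)

# ModelMetrics::auc() keeps the direction as given (first level coded as 0),
# so the same predictor reports an AUC below 0.5
ModelMetrics::auc(as.numeric(obs) - 1, prob)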

@topepo
Owner

topepo commented Sep 3, 2016

What do you want me to do here?

"For the expected results, can you go back to hand-coding the expected results instead of comparing to the other function output?"

I had expected results for log-loss where I calculated it manually:

expected1 <- log(1-eps) + log(.8) + log(.51) + log(.8) + log(.6) + log(.4)

I'd rather have expected results that are explicit rather than comparing to some other function call.
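For example, a hypothetical testthat-style check in that spirit (the eps/obs/pred values are made up here, and ModelMetrics::logLoss() is assumed as the function under test):

eps  <- 1e-15
obs  <- c(1, 1, 0, 1, 0, 0)
pred <- c(1 - eps, .8, .49, .8, .4, .4)

# the expected value is written out by hand from the log-loss definition,
# not taken from another package's output
expected <- -(log(1 - eps) + log(.8) + log(.51) + log(.8) + log(.6) + log(.6)) / 6

testthat::expect_equal(ModelMetrics::logLoss(obs, pred), expected)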

For the levels

For the directionality, my package will convert a factor to a numeric and subtract 1. I am happy to add some tests/code for this so it behaves as expected.

Here is what the package currently does (using the data from ?caret:::sensitivity)

lvs <- c("normal", "abnormal")
truth <- factor(rep(lvs, times = c(86, 258)),
                levels = rev(lvs))
pred <- factor(
               c(
                 rep(lvs, times = c(54, 32)),
                 rep(lvs, times = c(27, 231))),               
               levels = rev(lvs))

xtab <- table(pred, truth)
> str(truth)
 Factor w/ 2 levels "abnormal","normal": 2 2 2 2 2 2 2 2 2 2 ...

When it computes the sensitivity, it uses "abnormal" to define the event (most of my functions have arguments to change this). It looks like yours will do the opposite:

> table(truth, as.numeric(truth)-1)

truth        0   1
  abnormal 258   0
  normal     0  86

That's easy to handle since you could just relevel those factors. I just wanted to make sure that we define things the same way.
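For reference, a minimal sketch of that releveling (using the truth factor defined above):

# put "normal" first so that "abnormal", the event, maps to 1 under the
# as.numeric(x) - 1 convention
truth2 <- relevel(truth, ref = "normal")
table(truth2, as.numeric(truth2) - 1)
# truth2       0   1
#   normal    86   0
#   abnormal   0 258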

@topepo
Owner

topepo commented Sep 3, 2016

I think I see what you are talking about with pROC. When the AUC is < .50 they automatically flip it for some reason. My package doesn't do that

He added a direction argument that helps but yeah that was the issue.

@topepo
Owner

topepo commented Sep 3, 2016

It looks like you left twoClassSummary as-is? Did you want to replace the calls to pROC there too? Otherwise, we would need both packages.

@JackStat
Contributor Author

JackStat commented Sep 3, 2016

I moved your hand-calculated test into my package (JackStat/ModelMetrics@2341348), which is probably the best place to put it. If I need to make adjustments to give you credit in a different way, let me know.

Now I will work on the handling of the factors so they line up

@JackStat
Contributor Author

JackStat commented Sep 4, 2016

Any idea how to fix the Travis builds?

@coveralls

coveralls commented Sep 5, 2016


Coverage increased (+0.2%) to 14.761% when pulling 3466da2 on JackStat:master into fe2ed91 on topepo:master.

@JackStat
Contributor Author

JackStat commented Sep 6, 2016

Random question... Are you guys against using roxygen, or would it just be too difficult to add it in now?

@zachmayer
Collaborator

I'm personally pro-roxygen, but it'd be a lot of documentation pages to migrate over

@JackStat
Contributor Author

JackStat commented Sep 6, 2016

Ok that is why I asked. It does look like it would take a lot of time

@topepo
Owner

topepo commented Sep 6, 2016

Just an FYI... I'm generating the regression test results across the models so that I can test things with the new code in the PR. Otherwise, it looks good. I'll run the regression tests with the new code to check for issues.

Thanks

@JackStat
Contributor Author

JackStat commented Sep 6, 2016

@topepo Thanks for double checking my work. If anything pops up I am happy to zap it

@JackStat
Contributor Author

JackStat commented Sep 7, 2016

The mcc function in confusionMatrix.R has a numeric overflow problem:

set.seed(1234)
y = runif(1000)
x = runif(1000)

mcc(table(as.numeric(y < .5), as.numeric(x < .5)))
# [1] NA
# Warning message:
# In d1 * d2 * d3 * d4 : NAs produced by integer overflow

I am going to move that to rely on ModelMetrics as well
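For reference, a minimal sketch of why the overflow happens and one way around it (this is not the PR's change, which delegates to ModelMetrics instead): table() returns integer counts, and the product of the four marginal sums can exceed R's integer range, so coercing to double first avoids the NA.

mcc_from_2x2 <- function(tab) {
  # as.numeric() gives doubles, so the products below cannot overflow integers;
  # as.numeric(<2x2 table>) flattens column-wise: n11, n21, n12, n22
  n <- as.numeric(tab)
  n11 <- n[1]; n21 <- n[2]; n12 <- n[3]; n22 <- n[4]
  (n11 * n22 - n12 * n21) /
    sqrt((n11 + n12) * (n11 + n21) * (n22 + n12) * (n22 + n21))
}

mcc_from_2x2(table(as.numeric(y < .5), as.numeric(x < .5)))
# a value near zero, as expected for the independent x and y above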

@JackStat
Contributor Author

JackStat commented Sep 7, 2016

Well mcc isn't used anywhere in the codebase. Maybe it is best to remove it for now?

@coveralls

coveralls commented Sep 7, 2016


Coverage increased (+0.3%) to 14.776% when pulling 707f251 on JackStat:master into fe2ed91 on topepo:master.

@topepo
Owner

topepo commented Sep 7, 2016

It was on my list to add but had not gotten to it (in forever). You can add it back if you want.

@JackStat
Contributor Author

JackStat commented Sep 7, 2016

Were you wanting to add it to the twoClassSummary?

@topepo
Owner

topepo commented Sep 7, 2016

No, I'd probably keep it in confusionMatrix unless someone explicitly asks.


@JackStat
Contributor Author

JackStat commented Sep 7, 2016

It doesn't seem like the function was exported, so I assumed that it needed to be incorporated into another function?

I don't think you need an mcc function separately from ModelMetrics. Here is a toy example

y = as.numeric(runif(100) > .5)
x = as.numeric(runif(100) > .5)

ModelMetrics::mcc(y, x, .5)
# [1] -0.0251101

Do you want me to put the old function back?

@topepo
Owner

topepo commented Sep 7, 2016

No, I'd rather use yours. I wrote it on a plane a while back and never got to test or incorporate it into confusionMatrix.

@JackStat
Contributor Author

JackStat commented Sep 7, 2016

Looks like you are conducting all of your calculations on a confusionMatrix, which is a good idea. Right now my mcc function takes the raw inputs, so I will have to add some code to handle tables.
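For illustration, a hypothetical helper (not the PR's code) showing one way a 0/1 two-by-two table could be expanded back into the raw vectors that ModelMetrics::mcc(actual, predicted, cutoff) takes:

mcc_from_table <- function(tab, cutoff = 0.5) {
  # rows are the actual 0/1 classes, columns the predicted 0/1 classes
  actual    <- rep(as.numeric(rownames(tab)), times = rowSums(tab))
  predicted <- rep(rep(as.numeric(colnames(tab)), times = nrow(tab)),
                   times = as.vector(t(tab)))
  ModelMetrics::mcc(actual, predicted, cutoff)
}

mcc_from_table(table(y, x))   # using the y and x from the toy example above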

@topepo
Owner

topepo commented Sep 7, 2016

The regression tests are giving me results that are different from the current code. I think the issue is with this line of twoClassSummary:

rocAUC <- auc(data$y, data$pred)

pred is a factor variable. It should be one of the class probability columns such as data[, lev[1]].

Here is a test case:

library(caret)

set.seed(1)
tr_dat <- twoClassSim(500)
te_dat <- twoClassSim(500)

old_summary <- function (data, lev = NULL, model = NULL) {
  if (length(levels(data$obs)) > 2) 
    stop(paste("Your outcome has", length(levels(data$obs)), 
               "levels. The twoClassSummary() function isn't appropriate."))
  caret:::requireNamespaceQuietStop("pROC")
  if (!all(levels(data[, "pred"]) == levels(data[, "obs"]))) 
    stop("levels of observed and predicted data do not match")
  rocObject <- try(pROC::roc(data$obs, data[, lev[1]], direction = ">"), 
                   silent = TRUE)
  rocAUC <- if (class(rocObject)[1] == "try-error") 
    NA
  else rocObject$auc
  out <- c(rocAUC, sensitivity(data[, "pred"], data[, "obs"], 
                               lev[1]), specificity(data[, "pred"], data[, "obs"], lev[2]))
  names(out) <- c("ROC", "Sens", "Spec")
  out
}


set.seed(35)
old_res <- train(Class ~ ., data = tr_dat, 
                 method = "fda",
                 tuneLength = 10,
                 metric = "ROC",
                 trControl = trainControl(classProbs = TRUE,
                                          method = "LOOCV",
                                          summaryFunction = old_summary))

set.seed(35)
new_res <- train(Class ~ ., data = tr_dat, 
                 method = "fda",
                 tuneLength = 10,
                 metric = "ROC",
                 trControl = trainControl(classProbs = TRUE,
                                          method = "LOOCV",
                                          seeds = old_res$control$seeds,
                                          summaryFunction = twoClassSummary))

I get different results for all of the parameters but here is an example:

> subset(old_res$results, nprune == 5)
  degree nprune       ROC      Sens      Spec
3      1      5 0.8890885 0.8358779 0.7394958
> subset(new_res$results, nprune == 5)
  degree nprune       ROC      Sens      Spec
3      1      5 0.7876868 0.8358779 0.7394958

This code can get the same results:

> library(pROC)
> library(ModelMetrics)
> 
> five_terms_old <- subset(old_res$pred, nprune == 5)
> 
> pROC:::roc(five_terms_old$obs, five_terms_old$Class1, direction = ">")

Call:
roc.default(response = five_terms_old$obs, predictor = five_terms_old$Class1,     direction = ">")

Data: five_terms_old$Class1 in 262 controls (five_terms_old$obs Class1) > 238 cases (five_terms_old$obs Class2).
Area under the curve: 0.8891
> ModelMetrics:::auc(ifelse(five_terms_old$obs == levels(five_terms_old$obs)[2], 0, 1),
+                    five_terms_old$Class1)
[1] 0.8890885

This might do it:

> newer_summary <- function (data, lev = NULL, model = NULL){
+   lvls <- levels(data$obs)
+   if(length(lvls) > 2)
+     stop(paste("Your outcome has", length(lvls),
+                "levels. The twoClassSummary() function isn't appropriate."))
+   caret:::requireNamespaceQuietStop('ModelMetrics')
+   if (!all(levels(data[, "pred"]) == lvls))
+     stop("levels of observed and predicted data do not match")
+   rocAUC <- ModelMetrics:::auc(ifelse(data$obs == lev[2], 0, 1), data[, lvls[1]])
+   out <- c(rocAUC,
+            caret:::sensitivity(data[, "pred"], data[, "obs"], lev[1]),
+            caret:::specificity(data[, "pred"], data[, "obs"], lev[2]))
+   names(out) <- c("ROC", "Sens", "Spec")
+   out
+ }
> newer_summary(five_terms_old, lev = levels(five_terms_old$obs))
      ROC      Sens      Spec 
0.8890885 0.8358779 0.7394958

@JackStat
Contributor Author

JackStat commented Sep 7, 2016

Nice find, sir. I extended the test that I was running to catch this bug and then used your method for the fix.

@coveralls

coveralls commented Sep 7, 2016


Coverage increased (+0.3%) to 14.776% when pulling 33d8b0d on JackStat:master into fe2ed91 on topepo:master.

topepo pushed a commit that referenced this pull request Sep 8, 2016
@topepo
Owner

topepo commented Sep 8, 2016

All the regression tests pass! I'll merge now

@topepo topepo merged commit 51b6b47 into topepo:master Sep 8, 2016