
Venkatraman test improperly documented as comparing AUC #92

Closed
MetaEntropy opened this issue Feb 13, 2021 · 4 comments
Labels
bug No, it's not a feature! doc

Comments

@MetaEntropy

Describe the bug
The documentation of roc.test and the procedure output state that the test compares the Area Under the Curve (AUC) of Receiver Operating Characteristic (ROC) curves, while it actually does not.
The Venkatraman procedure tests whether two ROC curves are perfectly superimposed. If ROC curves cross, it will tend to reject the null hypothesis even if they have exactly the same AUC.

In my opinion, this test is very dangerous and should rarely be used. A brief review of literature citing Venkatraman's article on Google Scholar (https://scholar.google.com/scholar?cites=15643302621044267150) shows that it is almost always interpreted incorrectly. Software documentation may be partly responsible for these misinterpretations.

The only protection against this misinterpretation is the omission of the AUC in the output of the roc.test() procedure.
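To make the distinction concrete, here is a sketch (not part of the original report) of two markers with equal theoretical AUC but crossing ROC curves. Under the binormal model, AUC = pnorm((mu1 - mu0) / sqrt(s1^2 + s0^2)), and the parameters below give pnorm(1/sqrt(2)) ≈ 0.76 for both markers; because the second marker has unequal class variances, its ROC curve crosses the first one, so Venkatraman's test tends to reject while an AUC-based test such as DeLong's typically does not:

```r
# Sketch: equal theoretical AUC, crossing ROC curves.
library(pROC)
set.seed(1)
n <- 500
outcome <- rbinom(n, 1, 0.5)
# Marker 1: binormal with equal variances, AUC = pnorm(1/sqrt(2))
expo1 <- rnorm(n, mean = outcome * 1, sd = 1)
# Marker 2: larger separation but unequal variances, same theoretical AUC
expo2 <- rnorm(n, mean = outcome * sqrt(5 / 2),
               sd = ifelse(outcome == 1, 2, 1))
roc1 <- roc(outcome, expo1)
roc2 <- roc(outcome, expo2)
roc.test(roc1, roc2, method = "venkatraman")  # tends to reject: shapes differ
roc.test(roc1, roc2, method = "delong")       # compares AUCs: typically no difference
```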

To Reproduce
Steps to reproduce the behavior:
load pROC and display the help of roc.test
library(pROC)
help("roc.test")

The documentation states that:
"This function compares the AUC or partial AUC of two correlated (or paired) or uncorrelated (unpaired) ROC curves."
[...]
With method="venkatraman", the processing is done as described in Venkatraman and Begg (1996) (for paired ROC curves) and Venkatraman (2000) (for unpaired ROC curves) with boot.n permutation of sample ranks (with ties breaking). For consistency reasons, the same argument boot.n as in bootstrap defines the number of permutations to execute, even though no bootstrap is performed.

Then, execute:

expo1 <- rnorm(100); expo2 <- rnorm(100); outcome <- rbinom(100, 1, 0.5)
roc.test(roc(outcome, expo1), roc(outcome, expo2), method = "venkatraman")
Venkatraman's test for two paired ROC curves

data: roc(outcome, expo1) and roc(outcome, expo2)
[...]
alternative hypothesis: true difference in AUC is not equal to 0

Expected behavior
The roc.test documentation should clearly state that Venkatraman's test does not compare AUC but tests for perfect superimposition of ROC curves.
"This function compares the AUC or partial AUC of two correlated (or paired) or uncorrelated (unpaired) ROC curves, except when Venkatraman's method is used; the latter tests the alternative hypothesis that the two ROC curves are not superimposed."
[...]
With method="venkatraman", the processing is done as described in Venkatraman and Begg (1996) (for paired ROC curves) and Venkatraman (2000) (for unpaired ROC curves) with boot.n permutations of sample ranks (with ties breaking). For consistency reasons, the same argument boot.n as in bootstrap defines the number of permutations to execute, even though no bootstrap is performed. **This method does not compare the AUC of the ROC curves but tests whether the curves are non-superimposed (alternative hypothesis). If the curves cross, the AUCs may be equal while the ROC curves are not superimposed. When the ROC curves cross, even if the AUCs actually differ, the test cannot be used to compare them, because it may lead to a type III error rate close to 50%, concluding that the ROC curve with the better AUC actually has the worse one.**

Execution of these commands:

expo1 <- rnorm(100); expo2 <- rnorm(100); outcome <- rbinom(100, 1, 0.5)
roc.test(roc(outcome, expo1), roc(outcome, expo2), method = "venkatraman")

Should show, in the output:
alternative hypothesis: ROC curves are not perfectly superimposed (and may or may not have the same AUC)

It may even display a warning such as:
Warning: Venkatraman's test cannot be used to compare AUC (see documentation)

@xrobin
Owner

xrobin commented Feb 13, 2021

Hi, thanks for your report!
You are right, this is not good. I think the problem is that roc.test should not be described as 'Compare the AUC of two ROC curves' but instead simply 'Compare two ROC curves', with more details further down, clearly separated by method.
I'll see what I can do.

@xrobin
Owner

xrobin commented May 16, 2021

Interestingly, the alternative hypothesis for venkatraman should not have mentioned the AUC all along, but due to a tiny bug the message wasn't applied. This is fixed now, and the message is:

alternative hypothesis: true difference in ROC operating points is not equal to 0

Due to the way hypothesis tests are printed in R (with a built-in method), I am limited in how I can phrase the message, which must include "is not equal to 0". But I think this one is technically correct.
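The constraint comes from R's built-in print method for objects of class "htest", which composes the line as "alternative hypothesis: true <name of null.value> is not equal to <value>" for two-sided tests, so a test can only control the name of its null value, not the full sentence. A minimal sketch (the field values are illustrative, not pROC's actual internals):

```r
# Sketch of how print.htest builds the "alternative hypothesis" line.
ht <- structure(list(
  statistic   = c(E = 1.23),       # illustrative statistic
  p.value     = 0.45,              # illustrative p-value
  null.value  = c("difference in ROC operating points" = 0),
  alternative = "two.sided",
  method      = "Venkatraman's test for two paired ROC curves (sketch)",
  data.name   = "roc1 and roc2"
), class = "htest")
print(ht)
# The alternative line takes the form:
# alternative hypothesis: true difference in ROC operating points is not equal to 0
```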

I also realized that the "sensitivity" and "specificity" tests were wrong and fixed them too.

Finally, I updated the documentation of roc.test, which now states that it "compares two ROC curves" and briefly describes the different methods.

@MetaEntropy Can you please confirm that everything is properly fixed? You can install the updated version with:

devtools::install_github("xrobin/pROC@release-1.17")

Thanks!

xrobin added a commit that referenced this issue May 16, 2021
@xrobin xrobin added the bug No, it's not a feature! label May 16, 2021
@MetaEntropy
Author

MetaEntropy commented May 18, 2021 via email

xrobin added a commit that referenced this issue Aug 2, 2021
@xrobin
Owner

xrobin commented Sep 2, 2021

Version 1.18.0 of pROC was submitted to CRAN earlier and solves this issue.

@xrobin xrobin closed this as completed Sep 2, 2021