Calculating AUC/AUPRC confidence intervals #13

micdonato · 2019-11-19T18:07:00Z

Hello.

I love precrec, but every time I use it I go crazy integrating it with pROC to get confidence intervals for the AUCs (and I still haven't managed to do so for AUPRCs).

Since precrec already computes confidence bands for the curves, would it be possible to have the auc function return confidence intervals as well?


takayasaito · 2020-01-10T10:13:09Z

I checked the source code of pROC for its CI calculation and found that it uses a bootstrapping approach. By default, pROC generates 2000 bootstrap samples (resampling with replacement), so 2000 AUCs are calculated. It then simply takes the 0.025 and 0.975 quantiles of the calculated AUCs when the significance level (alpha) is 0.05.
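The percentile bootstrap described above can be sketched in base R. This is a minimal illustration, not pROC's actual code: `auc_rank` and `boot_auc_ci` are hypothetical helper names, the AUC is computed with the rank-based (Mann-Whitney) formula, and (score, label) pairs are resampled together without stratification for brevity.

```r
# Rank-based (Mann-Whitney) ROC AUC; tied scores receive average ranks.
auc_rank <- function(scores, labels) {
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  r <- rank(scores)
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Percentile bootstrap CI (hypothetical helper, not part of pROC or precrec).
# Each replicate resamples (score, label) pairs with replacement.
boot_auc_ci <- function(scores, labels, n_boot = 2000, alpha = 0.05) {
  n <- length(scores)
  aucs <- replicate(n_boot, {
    idx <- sample.int(n, n, replace = TRUE)
    auc_rank(scores[idx], labels[idx])
  })
  # Take the alpha/2 and 1 - alpha/2 quantiles of the bootstrapped AUCs.
  # na.rm guards against the rare replicate that contains a single class.
  quantile(aucs, probs = c(alpha / 2, 1 - alpha / 2), na.rm = TRUE)
}

set.seed(1)
labels <- rep(c(0, 1), each = 50)
scores <- rnorm(100, mean = labels)  # simulated: positives score higher on average
ci <- boot_auc_ci(scores, labels)
print(ci)
```

With alpha = 0.05 the two returned values are the 2.5% and 97.5% quantiles of the bootstrapped AUCs, which form the 95% interval.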

Since precrec doesn't provide bootstrapping, we can't apply the same method to calculate CIs. However, you can still use precrec to calculate a CI when you are dealing with cross-validation data, that is, multiple test sets. I added a simple helper function called auc_ci that performs the CI calculation on precrec objects.

library(precrec)

# Create sample datasets with 100 positives and 100 negatives
samps <- create_sim_samples(4, 100, 100, "all")
mdat <- mmdata(samps[["scores"]], samps[["labels"]],
               modnames = samps[["modnames"]],
               dsids = samps[["dsids"]])

# Generate an mscurve object that contains ROC and Precision-Recall curves
mmcurves <- evalmod(mdat)

# Calculate CI of AUCs
auc_ci(mmcurves)

# Calculate CI with alpha = 0.01
auc_ci(mmcurves, alpha = 0.01)

# Calculate CI with t-distribution
auc_ci(mmcurves, dtype = "t")
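For intuition, a CI over several test sets can be sketched as mean ± quantile × standard error of the per-dataset AUCs. The snippet below assumes that form and has not been checked against the internals of auc_ci; `mean_auc_ci` is a hypothetical helper and the AUC values are made up for illustration.

```r
# Hypothetical helper: CI of the mean AUC across multiple test sets.
# Assumes the interval has the form mean +/- q * sd / sqrt(n).
mean_auc_ci <- function(aucs, alpha = 0.05, dtype = c("normal", "t")) {
  dtype <- match.arg(dtype)
  n <- length(aucs)
  stopifnot(n >= 2)  # a single dataset has no between-dataset variance
  se <- sd(aucs) / sqrt(n)
  q <- if (dtype == "normal") {
    qnorm(1 - alpha / 2)
  } else {
    qt(1 - alpha / 2, df = n - 1)
  }
  m <- mean(aucs)
  c(mean = m, lower = m - q * se, upper = m + q * se)
}

# Illustrative AUCs from four test sets (not real results)
fold_aucs <- c(0.84, 0.77, 0.82, 0.81)

ci_normal <- mean_auc_ci(fold_aucs)
ci_t <- mean_auc_ci(fold_aucs, dtype = "t")
print(rbind(normal = ci_normal, t = ci_t))
```

With only a handful of test sets, the t-based interval is wider than the normal one, which is why a t-distribution option is useful for small numbers of folds.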

I have submitted precrec v0.11 to CRAN, and it is already available for several platforms. You can check the availability status here.

JanaFe · 2020-11-26T21:23:39Z

Hi, I am also trying to calculate confidence intervals for the area under the precision recall curve with R version 4.0.3.

I have a vector of scores (value range 0-100), and a vector of labels (0 or 1).
Running this code:

mdat <- mmdata(scores, labels)  
mmcurves <- evalmod(mdat)  
mm_auc_ci <- auc_ci(mmcurves, alpha=0.05, dtype='t')

Gives an error:
Error: 'curves' must contain multiple datasets.

What am I doing wrong?

takayasaito · 2020-11-27T15:43:04Z

precrec doesn't calculate confidence bands or confidence intervals for a single test set, only for cross-validation results with multiple test sets. Your example looks like a single-test-set case to me. It is of course possible to use a bootstrapping approach to simulate the results of your model on a single test set, but I don't know whether it's a good idea.

Your example

library(precrec)

# Create scores and labels
n <- 100
scores <- runif(n)*100
labels <- sample(c(0, 1), n, replace=TRUE)

# Calculate curves (single model with single dataset)
mdat <- mmdata(scores, labels)
sscurves <- evalmod(mdat)
plot(sscurves)

Resample scores r1 times

# Create bootstrapped scores
r1 <- 10
resampled_scores <- replicate(r1, sample(scores, replace=TRUE))

# Calculate curves (single model with multiple datasets)
smdat1 <- mmdata(resampled_scores, labels, modnames=rep("m1", r1), dsids=1:r1)
smcurves1 <- evalmod(smdat1)
plot(smcurves1)
auc_ci(smcurves1)

Resample labels r2 times

# Create bootstrapped labels
r2 <- 10
resampled_labels <- replicate(r2, sample(labels, replace=TRUE))

# Calculate curves (single model with multiple datasets)
smdat2 <- mmdata(replicate(r2, scores), resampled_labels, modnames=rep("m1", r2), dsids=1:r2)
smcurves2 <- evalmod(smdat2)
plot(smcurves2)
auc_ci(smcurves2)
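A variant of the two examples above, offered only as a sketch: instead of resampling scores and labels separately, resample row indices so each score stays attached to its own label, and do so within each class (a stratified bootstrap) so every replicate keeps the original class balance. `stratified_idx` is a hypothetical helper, and the precrec calls mirror the ones above but pass lists of vectors rather than matrices.

```r
set.seed(42)
n <- 100
scores <- runif(n) * 100
labels <- sample(c(0, 1), n, replace = TRUE)

# Hypothetical helper: resample indices with replacement within each class.
stratified_idx <- function(labels) {
  unlist(lapply(split(seq_along(labels), labels), function(i) {
    i[sample.int(length(i), replace = TRUE)]
  }), use.names = FALSE)
}

# Build r bootstrap replicates of paired (score, label) data
r <- 10
idx_list <- replicate(r, stratified_idx(labels), simplify = FALSE)
boot_scores <- lapply(idx_list, function(i) scores[i])
boot_labels <- lapply(idx_list, function(i) labels[i])

# Evaluate with precrec if it is installed (single model, multiple datasets)
if (requireNamespace("precrec", quietly = TRUE)) {
  bdat <- precrec::mmdata(boot_scores, boot_labels,
                          modnames = rep("m1", r), dsids = seq_len(r))
  bcurves <- precrec::evalmod(bdat)
  print(precrec::auc_ci(bcurves))
}
```

Note that with so few replicates (r = 10) the resulting interval is only a rough indication; the caveats about bootstrapping a single test set still apply.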

To assess the performance of your model accurately, it would be much better to perform cross-validation than to bootstrap the results of your model on a single test dataset (resampling scores and labels as in the examples above). I would avoid bootstrapping approaches whenever possible.

JanaFe · 2020-11-29T15:56:19Z

That helps, thanks a lot!

bblodfon · 2024-04-26T09:29:25Z

Hi @takayasaito! Happy to have found your package! I am trying to do something similar to the above (i.e. we have predictions from a single model and we do a stratified bootstrap on both labels and scores to see the variability of the PR curve) and would like a bit of your help, since you know the internal functions better than I do :)

So, how can I get the Precision-Recall data in a data.frame from an smcurves object (before plotting)? eg

library(precrec)

samps = create_sim_samples(4, 100, 100, "good_er")
mdat  = mmdata(samps[["scores"]], samps[["labels"]],
  modnames = samps[["modnames"]],
  dsids = samps[["dsids"]]
)
smcurves = evalmod(mdat, type = "rocpr")

# how can I get a `data.frame` with colnames `c(recall, precision, threshold)` for each dataset ID?
# ie a list of `data.frame`s with that info? My problem especially using `PRROC` doing the same 
# thing is that the multiplicity and number of thresholds is different so merging them is really a 
# pain :) - which I think you have solved since we can call `plot(smcurves)`!
smcurves
#> 
#>     === AUCs ===
#> 
#>      Model name Dataset ID Curve type       AUC
#>    1    good_er          1        ROC 0.8364000
#>    2    good_er          1        PRC 0.8593735
#>    3    good_er          2        ROC 0.7677000
#>    4    good_er          2        PRC 0.8169513
#>    5    good_er          3        ROC 0.8218000
#>    6    good_er          3        PRC 0.8520650
#>    7    good_er          4        ROC 0.8139000
#>    8    good_er          4        PRC 0.8528955
#> 
#> 
#>     === Input data ===
#> 
#>      Model name Dataset ID # of negatives # of positives
#>    1    good_er          1            100            100
#>    2    good_er          2            100            100
#>    3    good_er          3            100            100
#>    4    good_er          4            100            100

Created on 2024-04-26 with reprex v2.0.2

bblodfon · 2024-04-26T09:51:30Z

Ah, OK, you have it in res = precrec::evalmod(data, raw_curves = TRUE); I can extract it from there, nice.

bblodfon · 2024-04-26T13:20:39Z

So, the number of thresholds might not be equal across datasets as far as I can see (I thought x_bins controlled for that); maybe it's a bug? I have another example where there are way more unique lengths. Maybe filling the shorter vectors up with the last precision value in each respective vector makes sense? (Without breaking the 1-1 correspondence with the thresholds, I guess, if that makes sense...)

library(precrec)

samps = create_sim_samples(100, 20, 20, "good_er")
mdat  = mmdata(samps[["scores"]], samps[["labels"]],
  modnames = samps[["modnames"]],
  dsids = samps[["dsids"]]
)

# Generate an smcurve object that contains ROC and Precision-Recall curves
smcurves = evalmod(mdat, type = "rocpr", raw_curves = TRUE)
# extract precision vectors per dataset
precision = lapply(smcurves$prcs, function(obj) obj$y)
unique(unlist(lapply(precision, length)))
#> [1] 1024 1023

Created on 2024-04-26 with reprex v2.0.2

takayasaito · 2024-04-29T10:11:51Z

For the first example, you can simply call data.frame(smcurves).

data.frame(smcurves) |> head()
#      x      y      ymin      ymax modname type
#1 0.000 0.0000 0.0000000 0.0000000 good_er  ROC
#2 0.000 0.2975 0.1912348 0.4037652 good_er  ROC
#3 0.001 0.2975 0.1912348 0.4037652 good_er  ROC
#4 0.002 0.2975 0.1912348 0.4037652 good_er  ROC
#5 0.003 0.2975 0.1912348 0.4037652 good_er  ROC
#6 0.004 0.2975 0.1912348 0.4037652 good_er  ROC

Similarly, you can use data.frame to convert an AUC object to a data.frame.

auc(smcurves) |> data.frame() |> head()
#  modnames dsids curvetypes      aucs
#1  good_er     1        ROC 0.7683000
#2  good_er     1        PRC 0.8108477
#3  good_er     2        ROC 0.8287000
#4  good_er     2        PRC 0.8626605
#5  good_er     3        ROC 0.7498000
#6  good_er     3        PRC 0.7995740

takayasaito · 2024-04-29T10:29:21Z

For the second question, you can convert the object to a data frame in order to check the actual values.

library(dplyr)

precision <- data.frame(smcurves) |> 
  dplyr::filter(type == "ROC" & modname == "good_er" & dsid == 1) |>
  dplyr::select(x)

length(precision) == length(unique(precision))
# [1] TRUE
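One caveat when adapting the check above: dplyr::select() returns a data frame, and length() on a data frame counts columns rather than rows, so the comparison is TRUE for any one-column result. A minimal base R sketch with made-up values shows the difference; pulling the column out as a vector (for example with dplyr::pull(x) or `[[`) compares the values themselves.

```r
# A one-column data frame, similar in shape to the result of dplyr::select(x).
# The values are made up for illustration.
df <- data.frame(x = c(0, 0, 0.001, 0.002))

n_cols <- length(df)                 # counts columns, not rows
n_cols_unique <- length(unique(df))  # duplicate rows are dropped, column count is unchanged

# Compare the values themselves by extracting the column as a vector
x <- df[["x"]]
all_unique <- length(x) == length(unique(x))
has_dup <- anyDuplicated(x) > 0

print(c(n_cols = n_cols, n_cols_unique = n_cols_unique))
print(c(all_unique = all_unique, has_dup = has_dup))
```

Here the data-frame comparison would report equality even though the value 0 appears twice, while the vector-based check detects the duplicate.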

takayasaito self-assigned this Nov 20, 2019

takayasaito mentioned this issue May 27, 2024

how to use precrec_auc_ci to calculate 95%CI of PR curve during prediction model #20

Open
