MRMC analysis of binary data

Brandon Gallas edited this page May 1, 2023 · 2 revisions

Can iMRMC analyze binary data?

Yes! The iMRMC Java GUI can analyze binary data and estimate the reader-averaged percent correct. The key is how you format the input file. Note that this approach tricks a program designed for ROC analysis into handling binary data. As a result, the output is still labeled as AUC, even though the data have been arranged so that you are actually performing an MRMC analysis of percent correct. See below.
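To see why the trick works, here is a minimal sketch, assuming the reported AUC is the empirical (Mann-Whitney) estimate in which ties count one half. The function name `empirical_auc` and the toy data are made up for illustration; this is not iMRMC's own code. Every fake "non-disease" case is given a score of 0.5 (as described further down this page), so a correct observation (1) beats every fake case and an incorrect observation (0) loses to every fake case, and the AUC reduces to the fraction correct.

```python
def empirical_auc(disease_scores, nondisease_scores):
    """Empirical AUC: a win counts 1, a tie counts 0.5, a loss counts 0."""
    total = 0.0
    for d in disease_scores:
        for n in nondisease_scores:
            if d > n:
                total += 1.0
            elif d == n:
                total += 0.5
    return total / (len(disease_scores) * len(nondisease_scores))


# Hypothetical data for one reader in one modality:
# four real cases scored 1 (correct) or 0 (incorrect).
success = [1, 0, 1, 1]
# Five fake "non-disease" cases, all scored 0.5.
fake = [0.5] * 5

auc = empirical_auc(success, fake)
percent_correct = sum(success) / len(success)
print(auc, percent_correct)
```

Under these assumptions the two printed numbers agree, which is why the AUC reported by the GUI can be read as percent correct.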

As an alternative to tricking the iMRMC Java GUI, we have an R package ("iMRMC") that includes a function written specifically for binary data: uStat11.conditionalD. If you are comfortable with R, check out the iMRMC package, which can be downloaded from CRAN or directly from the release page of this repository.

Data

Your binary data is a set of "success" observations. Each observation corresponds to one reader evaluating one case in one modality. The reader either gets the case correct (score = 1) or not (score = 0). We need to map this data into the ROC-style input file that iMRMC expects. Please read the documentation here.

The first section of an iMRMC input file contains the study description. After the study description comes the data section, which begins with "BEGIN DATA:". In the data section we first specify the truth status of each case and then the observations. Each row has four fields separated by commas: readerID, caseID, modalityID, and score.

1. Create the rows specifying the truth state of each case

* Let the cases corresponding to your actual binary data be the ROC "disease" cases. If you have N1 cases, there will be N1 rows specifying your cases as disease cases. Each row starts with "truth" as the readerID, followed by a unique caseID for the case, then "truth" as the modalityID, and finally a score of 1 for "disease".
* Next, create 5 fake ROC "non-disease" cases. Each row starts with "truth" as the readerID, followed by a fake caseID ("fake1", "fake2", ... "fake5"), then "truth" as the modalityID, and finally a score of 0 for "non-disease".
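The truth rows described above can be sketched in a few lines of Python. This is only an illustration: the helper name `truth_rows`, its `n_fake` parameter, and the case IDs in the usage lines are hypothetical and not part of iMRMC.

```python
def truth_rows(case_ids, n_fake=5):
    """Return truth rows as (readerID, caseID, modalityID, score) tuples."""
    # Real cases are declared as ROC "disease" cases (score 1).
    rows = [("truth", case_id, "truth", 1) for case_id in case_ids]
    # Fake cases are declared as ROC "non-disease" cases (score 0).
    rows += [("truth", f"fake{i}", "truth", 0) for i in range(1, n_fake + 1)]
    return rows


# Hypothetical case IDs, printed as comma-separated rows.
for row in truth_rows(["case1", "case2", "case3"]):
    print(",".join(str(field) for field in row))
```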

2. Create the rows specifying the "success" observations

* Map each of your binary "success" observations to a row of the input file. Each row contains a readerID, a caseID, a modalityID, and the success result as the score (Correct = 1, Incorrect = 0).
* Create fake observations of the ROC "non-disease" cases. For every reader and modality combination that has actual observations, create one fake observation per fake case with a score of 0.5. If a reader does not read in a modality, don't bother creating fake data for that modality.

| Case Category | Reader decision | Binary decision |
| --- | --- | --- |
| Actual Case | Correct | 1 |
| Actual Case | Incorrect | 0 |
| Fake Case | NA | 0.5 |
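The mapping in step 2 can be sketched as follows. The function name `observation_rows` and the layout of its `results` argument (a dictionary keyed by reader and modality) are assumptions made for this example. Fake observations are only generated for reader and modality pairs that appear in `results`, which matches the rule about skipping modalities a reader did not read.

```python
def observation_rows(results, fake_case_ids):
    """Build observation rows as (readerID, caseID, modalityID, score).

    results maps (readerID, modalityID) -> {caseID: True if correct else False}.
    """
    rows = []
    for (reader, modality), by_case in results.items():
        # Real observations: Correct -> 1, Incorrect -> 0.
        for case_id, correct in by_case.items():
            rows.append((reader, case_id, modality, 1 if correct else 0))
        # Fake observations: every fake case gets a score of 0.5.
        for fake_id in fake_case_ids:
            rows.append((reader, fake_id, modality, 0.5))
    return rows


# Hypothetical study: reader1 read two cases in modalityA only.
example = {("reader1", "modalityA"): {"case1": True, "case2": False}}
for row in observation_rows(example, ["fake1", "fake2"]):
    print(",".join(str(field) for field in row))
```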

3. Example

Reading result:

| Case ID | Reader1 decision | Reader2 decision |
| --- | --- | --- |
| Actual1 | Correct | Incorrect |
| Actual2 | Incorrect | Correct |

Input file: (Add two Fake cases, assume only one modality: "modalityA")

| readerID | caseID | modalityID | score |
| --- | --- | --- | --- |
| truth | Actual1 | truth | 1 |
| truth | Actual2 | truth | 1 |
| truth | Fake1 | truth | 0 |
| truth | Fake2 | truth | 0 |
| Reader1 | Actual1 | modalityA | 1 |
| Reader1 | Actual2 | modalityA | 0 |
| Reader1 | Fake1 | modalityA | 0.5 |
| Reader1 | Fake2 | modalityA | 0.5 |
| Reader2 | Actual1 | modalityA | 0 |
| Reader2 | Actual2 | modalityA | 1 |
| Reader2 | Fake1 | modalityA | 0.5 |
| Reader2 | Fake2 | modalityA | 0.5 |
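As a rough sketch, the example rows above could be turned into comma-separated input text like this. The helper `build_input_file` and the description string are placeholders invented for illustration; what the study description section should actually contain is covered by the iMRMC documentation, not by this sketch.

```python
def build_input_file(description, truth, observations):
    """Join a description, the "BEGIN DATA:" marker, and comma-separated rows."""
    lines = [description, "BEGIN DATA:"]
    for row in truth + observations:
        lines.append(",".join(str(field) for field in row))
    return "\n".join(lines) + "\n"


real_cases = ["Actual1", "Actual2"]
fake_cases = ["Fake1", "Fake2"]
# Reading results from the example: 1 = Correct, 0 = Incorrect.
decisions = {
    "Reader1": {"Actual1": 1, "Actual2": 0},
    "Reader2": {"Actual1": 0, "Actual2": 1},
}

# Truth rows: real cases are "disease" (1), fake cases are "non-disease" (0).
truth = [("truth", c, "truth", 1) for c in real_cases]
truth += [("truth", c, "truth", 0) for c in fake_cases]

# Observation rows: real scores first, then 0.5 for each fake case.
observations = []
for reader, by_case in decisions.items():
    for case_id in real_cases:
        observations.append((reader, case_id, "modalityA", by_case[case_id]))
    for case_id in fake_cases:
        observations.append((reader, case_id, "modalityA", 0.5))

# Placeholder description text (an assumption, not a required format).
text = build_input_file("Binary data example", truth, observations)
print(text)
```

The printed text can be saved to a file and loaded in the GUI; the rows after "BEGIN DATA:" match the table above, one row per line.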