
Analyzing and sizing binary data #166

Open
anatsa1 opened this issue May 10, 2021 · 4 comments


@anatsa1

anatsa1 commented May 10, 2021

Hi,

I'm familiarizing myself with the R package and the Java GUI of iMRMC, and I have a few questions.

The data is binary with only 1 modality.

  1. I generated data with different numbers of readers, different numbers of cases, and increasing true probability to get a feel for the behavior (which is as expected). I generated the data under the assumption of independence for simplicity, but I then analyze it as if it came from an MRMC set-up with its dependencies. I have noticed that in some cases there are warning messages (e.g., that MLE should be used, or messages about the degrees of freedom). However, in some situations neither the GUI nor R provides any error/warning, and the program seems to be stuck in a loop. In R, I had to kill the R session. I attach an example of such a case. I can't understand what, in this situation, creates the problem. Any idea?
    Moreover, in R I ran a simulation in which data was generated under the same conditions/assumptions, repeated 1000 times in a loop. At some point the same problem seems to occur and R gets stuck in a loop. Since I don't understand when or why it happens, I don't know how to check the data before I call the function.

  2. With regard to the trick of adding fake data in order to use the tool for binary data: for one scenario I added 3 fake subjects and then 5 fake subjects. The results are identical. I see why the point estimate is not impacted by adding fake subjects, but I don't have an intuition for why it does not impact the SE, and hence the CI or p-value. Can you provide an intuition or refer me to a paper/presentation that discusses this?

  3. In the GUI tool, the bottom part refers to the sizing of the study:
    a. Does "paired" refer to the case of 2 modalities? If not, what is it? If yes, what should I use when I have only 1 modality?
    b. How is the effect size defined for binary data? The default in the GUI is H0 of AUC = 0.5, or equivalently P = 0.5. I was not sure whether the effect size is defined as the difference between P under the null and P under the alternative, or differently. If that is how it is defined, why is it not necessary to also give the null value? I see why this is not needed for AUC, but for binary data the sizing is not the same if the difference is 10% and P0 is 1% versus 50%.

CO.Ind.P0.9.2R.40C.xlsx

Thanks,
Anat

@brandon-gallas
Member

First of all, thanks for the feedback and sorry for the delay. I wish I could tweak and tune the software accordingly right now.

Q1 response: I don't have an answer for you now. The data file you shared indicates you are pushing the limits in terms of the number of readers. I do not recommend doing an MRMC analysis with only 2 readers. It's like trying to estimate a variance with only two numbers; it's just not a good idea. We are rewriting the software to be all native R code. Right now it calls a Java app, which makes it hard to debug. I will leave this issue open to check once that rewrite is complete.
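As a quick illustration of why 2 readers is too few, here is a toy simulation in plain Python (my own sketch, not iMRMC code; the normal reader-effect model is an assumption for illustration only):

```python
import random
import statistics

# Sketch (my own toy model, not iMRMC code): how unstable a variance
# estimate is when it comes from only 2 readers versus 10 readers.
# Reader effects are drawn from a normal distribution with true variance 1.
random.seed(0)

def variance_estimates(n_readers, n_sims=2000):
    """Sample variance of n_readers draws, repeated n_sims times."""
    return [
        statistics.variance([random.gauss(0, 1) for _ in range(n_readers)])
        for _ in range(n_sims)
    ]

spread_2 = statistics.stdev(variance_estimates(2))
spread_10 = statistics.stdev(variance_estimates(10))
# The variance estimate from 2 readers scatters far more widely around
# the true value of 1 than the estimate from 10 readers.
print(f"spread of variance estimate, 2 readers:  {spread_2:.2f}")
print(f"spread of variance estimate, 10 readers: {spread_10:.2f}")
```

In theory the standard deviation of the sample variance scales like sqrt(2/(n-1)), so going from 10 readers down to 2 roughly triples the noise in the reader-variance estimate.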

Q2 response: The fake signal-absent data only serves as a threshold for the binary signal-present success data. If you look at the components of variance for the signal-absent data, they are all zero. There is no variability from the signal-absent data. This is not a surprise since all the fake data is set to 0.5.
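The zero-variance point can be checked directly in plain Python (a sketch, not iMRMC code):

```python
import statistics

# Sketch (plain Python, not iMRMC code): the fake signal-absent scores in
# the binary-data trick are all the constant 0.5, so their sample variance
# is exactly zero no matter how many fake subjects are added. A column with
# zero variance contributes nothing to the variance components, which is
# why the SE, and hence the CI and p-value, do not change.
fake_3 = [0.5] * 3
fake_5 = [0.5] * 5
print(statistics.variance(fake_3))  # 0.0
print(statistics.variance(fake_5))  # 0.0
```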

Q3 response:
a. You are correct. The sizing section operates on the variance components from the analysis section. If you only have 1 modality in the analysis section, the sizing section will be for 1 modality. What does this mean for "Paired Readers" etc.? They should be disabled; sorry that is not the case. As such, they should all be set to "Yes" for pairing, or they could cause problems for your sizing results.
b. I think you are referring to the effect size in the sizing analysis. The bottom line is that the sizing analysis was not created for the binary analysis. That said, if you are a statistician, you have all the ingredients: the percent correct, the components of variance, and the standard error.
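To echo the point in question 3b, a textbook normal-approximation sizing formula for a single proportion shows how strongly the required size depends on P0 for a fixed difference. This is a sketch in plain Python; it is not the iMRMC sizing routine, and the alpha/power values are my own illustrative assumptions:

```python
from math import ceil, sqrt
from statistics import NormalDist

# Sketch (not the iMRMC sizing routine): one-sample normal-approximation
# sample size for testing a proportion p0 against p0 + delta.
# alpha and power below are illustrative assumptions.
z_alpha = NormalDist().inv_cdf(1 - 0.05 / 2)  # two-sided alpha = 0.05
z_beta = NormalDist().inv_cdf(0.80)           # power = 80%

def n_required(p0, delta):
    """Cases needed to detect p0 + delta against a null of p0."""
    p1 = p0 + delta
    numer = z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(p1 * (1 - p1))
    return ceil((numer / delta) ** 2)

# Same 10-point difference, very different null proportions:
print(n_required(0.50, 0.10))  # roughly 194 cases
print(n_required(0.01, 0.10))  # roughly 22 cases
```

The binomial variance p(1 - p) is largest at p = 0.5 and tiny near p = 0.01, which is exactly why a fixed 10% difference cannot be sized without also knowing P0.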

I want to congratulate you on asking excellent questions. They will motivate improvements to the R package (we are not doing any more development of the Java app). You are quite familiar with the software! I hope you will share your investigations with me. I want to know.

All the best,
Brandon

@anatsa1
Author

anatsa1 commented May 26, 2021 via email

@brandon-gallas
Member

Sorry you are having this problem. I will leave this issue open and debug the problem after we complete the transition from the Java app to native R code.

I think your simulations would be useful to justify your pivotal study size. Of course, if you want an official agency opinion, you should ask through a pre-submission: https://www.fda.gov/regulatory-information/search-fda-guidance-documents/requests-feedback-and-meetings-medical-device-submissions-q-submission-program

Brandon

@anatsa1
Author

anatsa1 commented May 29, 2021 via email
