
Size of Prediction Sets using APS Different Than Reported in RAPS Paper #8

Closed · kevinkasa opened this issue Mar 10, 2023 · 5 comments
Labels: question (Further information is requested)

@kevinkasa

Hello,

Thank you so much for providing the conformal prediction tutorial & corresponding notebooks, they are super helpful!

I had a question regarding the size of the prediction sets returned by the APS method. In the implementation provided in the notebooks, the prediction sets are far larger than those reported in your paper that introduced RAPS. The notebook implementation returns sets that average more than 200 labels, whereas the paper reports an average set size of 10.4 on ResNet-152.

I have not done an extensive evaluation of RAPS, but it seems the notebook implementation also returns slightly larger sets there (a set size of ~3).

I was wondering if you have any ideas as to what might be causing this discrepancy, and what the best way to replicate the results in the paper might be.

Also, I wasn't sure which repo this issue should be opened in, so apologies if it doesn't fit here. Thanks in advance!

@aangelopoulos
Owner

It's probably due to the lack of randomization!
None of the methods herein are the randomized versions of their respective algorithms... and APS is extremely bad without randomization.
If you randomize, you should recover roughly the results in the paper. Of course, that paper also has its own repo, but it's less friendly than this one.
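
For anyone who hits this, here is a minimal numpy sketch of what calibration-time randomization looks like for APS. It is an illustration under assumed inputs (a softmax array `cal_smx` and integer labels `cal_labels`), not the exact code from the notebooks or the RAPS repo:

```python
import numpy as np

def aps_calibration_scores(cal_smx, cal_labels, rand=True, seed=0):
    """APS conformal scores on a calibration set (illustrative sketch).

    cal_smx:    (n, K) array of softmax probabilities
    cal_labels: (n,)   array of true class indices
    """
    rng = np.random.default_rng(seed)
    n = len(cal_labels)
    # Sort classes by predicted probability (descending) and accumulate mass.
    order = np.argsort(-cal_smx, axis=1)
    sorted_p = np.take_along_axis(cal_smx, order, axis=1)
    cum_p = np.cumsum(sorted_p, axis=1)
    # Rank of the true label within each row's sorted order.
    ranks = np.where(order == cal_labels[:, None])[1]
    scores = cum_p[np.arange(n), ranks]
    if rand:
        # Randomized score: subtract a uniform fraction of the true class's
        # own mass, interpolating between the cumulative mass just below the
        # true class and the mass including it.
        scores -= rng.uniform(size=n) * sorted_p[np.arange(n), ranks]
    return scores
```

Without that subtraction, every calibration score carries the full probability mass of the true class, which pushes the conformal quantile up and inflates every downstream prediction set.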

@aangelopoulos self-assigned this Mar 11, 2023
@aangelopoulos added the question label Mar 11, 2023
@aangelopoulos
Owner

Hey @kevinkasa, have you had a chance to follow up here? Just wondering if this answers your question.

@kevinkasa
Author

Hey @aangelopoulos, thanks for the quick response! I was just slightly confused, since both your paper and the APS paper seemed to suggest that randomization should affect the sets by at most one element, so it was surprising that APS led to considerably larger sets without it. I suppose the algorithm is just super sensitive without it, then?

Was planning on trying to add randomization to the notebook implementations but haven't had a chance yet. I am trying out the other RAPS repository in the meantime as well. Thanks!
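
For reference, test-time randomization only touches the boundary class, which is why it can change a set by at most one label. A rough sketch in the same assumed notation as above (not the repo's implementation):

```python
def aps_prediction_set(smx_row, qhat, rand=True, rng=None):
    """Form one APS prediction set from a single softmax vector (sketch).

    smx_row: (K,) softmax probabilities for one test point
    qhat:    conformal quantile computed on the calibration scores
    """
    order = np.argsort(-smx_row)
    p = smx_row[order]
    cum_p = np.cumsum(p)
    # Smallest k with cum_p[k] >= qhat (clipped for floating-point edge cases).
    k = min(int(np.searchsorted(cum_p, qhat)), len(p) - 1)
    if rand and rng is not None and k > 0:
        # Drop the boundary class with probability proportional to how far
        # the cumulative mass overshoots qhat (k > 0 keeps the set nonempty).
        if rng.uniform() * p[k] < cum_p[k] - qhat:
            k -= 1
    return order[: k + 1]
```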

@aangelopoulos
Owner

Good question.

Randomization at test time only changes the set by at most one element.
Randomization during calibration has a much larger effect.
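
Concretely, with the sketch above, the two calibration thresholds can be compared directly (assuming `cal_smx` and `cal_labels` as before, and the standard split-conformal quantile level):

```python
alpha = 0.1  # target miscoverage rate
n = len(cal_labels)
level = np.ceil((n + 1) * (1 - alpha)) / n  # finite-sample-corrected level

qhat_rand = np.quantile(
    aps_calibration_scores(cal_smx, cal_labels, rand=True), level, method="higher")
qhat_det = np.quantile(
    aps_calibration_scores(cal_smx, cal_labels, rand=False), level, method="higher")
# qhat_det >= qhat_rand: deterministic scores include the full mass of the
# true class, so the threshold, and with it every test-time set, is inflated.
```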

@kevinkasa
Author

I see, thank you for the clarification!
