Example notebook doesn't work and has poor results when fixed #109

Closed · Htermotto opened this issue Oct 21, 2022 · 5 comments

@Htermotto commented Oct 21, 2022

Hi,

In using Torch XRV we started with the example notebook. However, it has multiple issues and does not run as-is. We were able to fix it, but the results from the pretrained weights seem poor on the NIH dataset. Additionally, we think there is a bug in the code that will erroneously apply sigmoid twice if apply_sigmoid=True (see link).
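
To illustrate the effect, a minimal sketch in plain PyTorch (not the library's code): a second sigmoid squashes already-normalized probabilities into roughly (0.5, 0.73), pushing everything above a 0.5 cutoff.

```python
import torch

logits = torch.tensor([-4.0, 0.0, 4.0])
once = torch.sigmoid(logits)   # proper probabilities: ~[0.018, 0.500, 0.982]
twice = torch.sigmoid(once)    # double application:   ~[0.505, 0.622, 0.727]
```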

I've opened a PR with our changes that fix the notebook here, but we would appreciate input on the model performance.

EDIT: I cleared the notebook in the PR so there wouldn't be a lot of output, but essentially what we are seeing is very low precision on NIH res224. Here are the metrics generated by running the above notebook:

[image: metrics table from the notebook, showing very low precision on NIH res224]

Thanks!

@ieee8023 (Member)

Hey, sorry for the delay, I will have a chance to look into this soon!

@Htermotto (Author)

Thanks! I'm interested to know what you find, and curious whether one of the issues is a calibration problem.

@ieee8023 (Member) commented Nov 5, 2022

Hey, I had a chance to look at this. I updated the notebook so it runs and computes more metrics. One issue was that it applied a sigmoid after the model had already applied one, as you said. I'm not sure why that was there, but it wouldn't have impacted the AUC anyway, and the AUC values were similar. Previously the samples were selected at random, so it was not possible to process the same samples again; I set the random seed this time.
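
For reproducibility, the seeding amounts to something like the following sketch (the exact call sites in the notebook may differ; SEED is a placeholder value):

```python
import random
import numpy as np
import torch

SEED = 0  # placeholder; any fixed value makes the sample selection repeatable
random.seed(SEED)
np.random.seed(SEED)
torch.manual_seed(SEED)
```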

The accuracy and F1 scores do seem bad. A fixed threshold of 0.5 appears not to be optimal for this data. The "all" model was calibrated using the test splits over all the datasets, so it is likely not perfect for these specific samples.

[Screen Shot 2022-11-05: updated metrics table]
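
For anyone who wants a per-pathology operating point instead of a fixed 0.5, one common option is to maximize Youden's J on a held-out split. A minimal sketch with scikit-learn (`y_true` and `y_score` are illustrative single-pathology arrays, not from the notebook):

```python
import numpy as np
from sklearn.metrics import roc_curve

def best_threshold(y_true, y_score):
    """Return the score threshold maximizing Youden's J (tpr - fpr)."""
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    return thresholds[np.argmax(tpr - fpr)]
```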

@Htermotto (Author)

Hi, thanks for looking into this and fixing the double-sigmoid bug!

I agree that the threshold of 0.5 is not optimal. Your code sets some custom thresholds, and we tried using those, but it didn't seem to make a huge difference. However, that could have been because the double sigmoid pushed all of the outputs up.
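
Roughly what we did, sketched with placeholder numbers (the real calibrated thresholds live in the model code, not here):

```python
import numpy as np

# probs: (n_samples, n_pathologies) sigmoid outputs from the model
probs = np.array([[0.62, 0.55],
                  [0.70, 0.51]])
# per-pathology operating points (placeholder values)
threshs = np.array([0.65, 0.52])
preds = (probs > threshs).astype(int)  # broadcasts across columns
```

If the outputs had passed through a sigmoid twice, everything would sit in a narrow band near 0.5-0.73, which would mute the effect of any threshold choice.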

@ieee8023 (Member)

Let me know if this issue is not resolved. I'll close it for now.
