Precision-recall curves #18
I assume that you're more interested in the AUC. At the moment the code is very ROC-centric and requires the curve to increase monotonically. That's not the case for PR, so a new method would have to be written. Once we have that, it should be possible to bootstrap and calculate variance, p-values, etc.

I have been working on cleaning up the current mess in my bootstrapping code, but it's still very convoluted (I need to keep the parameters used to build the ROC curve, such as partial AUC, smoothing, direction, etc., and handle both stratified and non-stratified sampling). Smoothing is out of the picture and I'd have no idea how to integrate it. I guess that makes it a pretty big rewrite. Except for §1, it's unlikely I'll have time to do it in the foreseeable future unless someone else steps in.
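The monotonicity point can be seen with a small example. This is a minimal, self-contained Python sketch (not pROC code; the function name and toy data are made up for illustration): it walks scores in descending order and records (recall, precision) pairs. Recall only ever increases, but precision can drop and then rise again, which is why the monotone-curve assumptions behind the ROC code don't carry over directly.

```python
def pr_curve(scores, labels):
    """(recall, precision) pairs at each successive score threshold.

    scores: classifier outputs; labels: 1 for positive, 0 for negative.
    Points are generated by lowering the threshold one observation at
    a time, from the highest score to the lowest.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    pos = sum(labels)
    tp = fp = 0
    points = []
    for i in order:
        if labels[i]:
            tp += 1
        else:
            fp += 1
        points.append((tp / pos, tp / (tp + fp)))
    return points

# Toy data: precision drops from 1.0 to 0.5, then climbs back to 0.75,
# so the curve is not monotone the way a ROC curve is.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 0, 1, 1, 0, 1]
curve = pr_curve(scores, labels)
```

Because of that dip-and-recover shape, simply summing trapezoids between consecutive points interpolates precision linearly, which is known to be questionable for PR curves; that's part of why a dedicated method is needed rather than reusing the ROC AUC code.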
Everything is simple for the person not doing it =]
Thanks for the assessment.
Hi,
Precision and recall were implemented in version 1.10.0. I don't know why it wasn't mentioned here.
Now regarding the
Thanks for the input :) I would've thought the approach would be the same as in ROC curve analysis: drawing a line from the "ideal" corner and gradually moving it until it finds a tangent point on the PR curve (the EER, if I'm not mistaken). This link offers a few options, and they seem legitimate from what I understand: https://stats.stackexchange.com/questions/7718/how-to-choose-a-good-operation-point-from-precision-recall-curves
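One of the simpler heuristics discussed in that thread can be sketched in a few lines of Python. This is an illustration, not pROC code: `closest_to_ideal` is a hypothetical helper that picks the curve point nearest the perfect corner (recall = precision = 1), which is a pragmatic stand-in for the tangent-line construction described above.

```python
import math

def closest_to_ideal(points):
    """Pick the (recall, precision) pair nearest the ideal corner (1, 1).

    A simple operating-point heuristic: minimise the Euclidean distance
    from each curve point to the perfect classifier at (1, 1).
    """
    return min(points, key=lambda rp: math.hypot(1 - rp[0], 1 - rp[1]))

# Toy (recall, precision) curve, made up for illustration:
curve = [(0.25, 1.0), (0.25, 0.5), (0.5, 0.667), (0.75, 0.75), (1.0, 0.667)]
best = closest_to_ideal(curve)  # (1.0, 0.667) for this toy curve
```

Note that this distance-based choice weighs recall and precision equally; the linked thread also covers alternatives such as F-measure maximisation when the two errors have different costs.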
There are a lot of legitimate things that can be done. The real question is which one(s) are worth implementing.

The Equal Error Rate (EER) is not something that pROC can do with ROC curves at the moment. It's tricky to calculate, as one needs to interpolate both sensitivity and specificity together, and it may well not correspond to an actual threshold. I am not aware that it's ever used in practice; however, if there is interest it can be done. For PR curves I don't know if the equation has a single solution, but again this can be worked out.

You can already specify the prevalence and the relative cost of misclassifications, with a formula given by Perkins and Schisterman. Please feel free to open a new feature request with the specific feature you'd like to see implemented. It would also help if you can provide some evidence that it's been used in published research, and an algorithm to calculate it.
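The joint-interpolation point can be made concrete with a short sketch. This is not pROC's implementation; the function name and the ROC points are invented for illustration. Along a straight segment between two ROC points, both sensitivity and specificity vary linearly, so the crossing of sensitivity = specificity can be found by interpolating the sign change of their difference, and it generally lands between actual thresholds, as noted above.

```python
def eer_sens(points):
    """Sensitivity at the Equal Error Rate point of a ROC curve.

    points: (specificity, sensitivity) pairs ordered from the most
    conservative threshold (spec=1, sens=0) to the most liberal one.
    Linearly interpolates on the segment where sensitivity - specificity
    changes sign. The EER itself is 1 minus the returned value.
    """
    prev_spec, prev_sens = points[0]
    for spec, sens in points[1:]:
        d_prev = prev_sens - prev_spec
        d = sens - spec
        if d_prev <= 0.0 <= d:
            # Zero crossing of (sens - spec) on this segment.
            t = 0.0 if d == d_prev else d_prev / (d_prev - d)
            return prev_sens + t * (sens - prev_sens)
        prev_spec, prev_sens = spec, sens
    return None

# Hypothetical ROC points for illustration:
roc = [(1.0, 0.0), (0.8, 0.5), (0.6, 0.7), (0.3, 0.9), (0.0, 1.0)]
value = eer_sens(roc)  # ~0.65: sensitivity == specificity here, so EER ~ 0.35
```

The interpolated point (spec = sens = 0.65) falls between the curve points (0.8, 0.5) and (0.6, 0.7), i.e. between thresholds, which is exactly what makes the EER awkward to report as a threshold-based statistic.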
Any chance that you might add these?