Currently the PRC curve returns essentially a DataFrame with three columns: x, y, and a boolean column orig_points.
Is it possible to map the non-interpolated points (orig_points = 1) to the actual score thresholds behind the resulting precision/recall measurements, along the lines of how sklearn handles it? This is sometimes needed to answer questions like 'what is the minimum threshold at which precision is >= 75%?'.
I assume a sorted increasing list of unique scores should map 1:1 to the orig_points, but this seems a bit hacky. Maybe there is a way to get it out of precrec directly?
It is not easy to use orig_points to retrieve the corresponding scores, but you can use mode = "basic" for that purpose. For instance, the following snippet shows how to get the original scores when precision is greater than or equal to 0.75.
library("precrec")
# Dataset with 10 positives and 10 negatives
data(P10N10)
# Calculate basic evaluation measures
sspoints <- evalmod(mode = "basic", scores = P10N10$scores, labels = P10N10$labels)
# Convert sspoints to data.frame
df <- data.frame(sspoints)
# Get normalized threshold values for precision >= 0.75
xs <- df[df$type == "precision" & df$y >= 0.75, "x"]
# Show scores and precision values corresponding to xs
df[df$x %in% xs & df$type %in% c("score", "precision"), ]
In the data frame of the example above, the x column contains the normalized threshold values with range [0, 1], and the y column contains the values specified in the type column.
Unlike ROC curves, precision-recall curves are not monotonic, so in some cases you may need to add one more condition, such as 'recall is greater than 0.5'.
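A minimal sketch of that extra condition, building on the snippet above. It assumes that the basic-mode data frame reports recall under the type "sensitivity", that x values are identical across measure types, and that there is a single model and dataset (as in P10N10). The helper `pick` and the `wide` table are names introduced here for illustration only; they are not part of precrec.

```r
library("precrec")

data(P10N10)
sspoints <- evalmod(mode = "basic", scores = P10N10$scores, labels = P10N10$labels)
df <- data.frame(sspoints)

# Hypothetical helper: rows of one measure, ordered by normalized threshold x
pick <- function(measure) {
  sub <- df[df$type == measure, c("x", "y")]
  sub[order(sub$x), ]
}

scr  <- pick("score")
prec <- pick("precision")
rec  <- pick("sensitivity")  # assumption: recall is labelled "sensitivity"

# One row per threshold position, with score, precision and recall side by side
wide <- data.frame(
  x         = scr$x,
  score     = scr$y,
  precision = prec$y[match(scr$x, prec$x)],
  recall    = rec$y[match(scr$x, rec$x)]
)

# Rows meeting both conditions; which() silently drops NA comparisons
ok <- which(wide$precision >= 0.75 & wide$recall > 0.5)
wide[ok, ]

# Smallest original score among the qualifying rows (if any qualify)
if (length(ok) > 0) min(wide$score[ok], na.rm = TRUE)
```

If no row satisfies both conditions, `ok` is empty and the last line returns nothing, which is usually a sign that the thresholds on precision and recall need to be relaxed.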