Computing values on ground truth does not give perfect scores #26

jhlegarreta opened this issue Mar 4, 2024 · 5 comments

@jhlegarreta

Following one of the examples in the README file, a quick test using a ground truth image as the predicted image,

import numpy as np
import seg_metrics.seg_metrics as sg

labels = [0, 1, 2]
gdth_img = np.array([[0, 0, 1], [0, 1, 2]])
metrics = sg.write_metrics(labels=labels[1:], gdth_img=gdth_img, pred_img=gdth_img)

does not give perfect scores: the Dice score, Jaccard index, precision, and recall are not 1.0. Although they are close (about 0.999), they should be exactly 1.0.
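
For reference, computing the Dice score by hand on this input gives exactly 1.0 for each label (a minimal check; mask and tp are just names introduced here):

for label in labels[1:]:
    mask = gdth_img == label
    tp = mask.sum()  # prediction equals ground truth, so every foreground pixel is a true positive
    print(label, 2 * tp / (mask.sum() + mask.sum()))  # prints 1.0 for labels 1 and 2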

@agarcia-ruiz

It seems to be related to the smooth parameter: changing it to smooth=0.0 gives a Dice of 1.0. There is some discussion about that here.

# Excerpt from the metric computations in seg_metrics: every
# denominator is padded with a small constant to avoid division by zero.
smooth = 0.001
precision = tp / (pred_sum + smooth)
recall = tp / (gdth_sum + smooth)
fpr = fp / (fp + tn + smooth)
fnr = fn / (fn + tp + smooth)
jaccard = intersection_sum / (union_sum + smooth)
dice = 2 * intersection_sum / (gdth_sum + pred_sum + smooth)
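
Plugging in the numbers for label 1 of the example above (two foreground pixels, identical masks) shows exactly where the 0.999 comes from:

tp = gdth_sum = pred_sum = 2  # label 1 occupies two pixels in both masks
print(2 * tp / (gdth_sum + pred_sum + 0.001))  # 0.99975..., reported as ~0.999
print(2 * tp / (gdth_sum + pred_sum))          # exactly 1.0 once smooth is 0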

@jhlegarreta

@agarcia-ruiz thanks for the answer. I understand that it may be useful to prevent a division by zero, but I am not convinced it should be there outside the scope of training a DL model (I have gone through the discussion you pointed to, and even there it seems that in some cases training does not work unless it is set to 0).

In this context, the pred_sum == 0 case should probably be dealt with in another way. At the very least, documenting the behavior would be worthwhile, or smooth could be added as a parameter to the method.
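
For example, exposing smooth could look like this (a hypothetical signature for illustration; smooth is not currently a parameter of write_metrics):

metrics = sg.write_metrics(labels=labels[1:],
                           gdth_img=gdth_img,
                           pred_img=gdth_img,
                           smooth=0.0)  # exact scores; the caller accepts a possible division by zero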

@Jingnan-Jia

Jingnan-Jia commented Apr 17, 2024

@jhlegarreta Thank you very much for raising this question. We actually had some discussion on this issue and decided to use 0.001 as the default smooth. But we have indeed found that many users get confused by it, especially when running our examples. So we will have another, deeper discussion and see how to solve it. One possible solution, as you mentioned, is to make it a parameter. Another is to make it 0 and raise an exception when a division-by-zero error occurs.

Anyway, we will solve it soon. If you have a better solution, please let us know.

Best,
Jingnan

@Jingnan-Jia

@jhlegarreta Another solution could be to have a default smooth value of 0, with the metric calculation wrapped in a try/except block. If the try block hits a division-by-zero exception, we reset smooth to 0.001 and recalculate the metrics (and possibly show a warning to users about this case).
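
A sketch of that idea, as a hypothetical helper for a single metric (it assumes the accumulated counts are plain Python numbers, so a zero denominator actually raises ZeroDivisionError instead of producing a NumPy warning):

import warnings

def dice_with_fallback(tp, gdth_sum, pred_sum, smooth=0.001):
    # Try the exact computation first; fall back to smoothing only
    # when both masks are empty and the denominator is zero.
    try:
        return 2 * tp / (gdth_sum + pred_sum)
    except ZeroDivisionError:
        warnings.warn(f"empty ground truth and prediction; falling back to smooth={smooth}")
        return 2 * tp / (gdth_sum + pred_sum + smooth)

With that fallback, identical non-empty inputs score exactly 1.0, and the smoothed value only appears in the degenerate empty-mask case.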

What do you think of this solution?

@jhlegarreta

@Jingnan-Jia it looks like one possible workaround; I am not sure what the best approach is. It may be worthwhile to look at how other scientific tools that compute at least the DSC handle this. In my mind, comparing the ground truth against itself should clearly yield a perfect score. Providing tests on well-known cases would probably be enlightening.
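
For instance, a regression test along these lines (a sketch; it assumes write_metrics returns a mapping from metric names to per-label value lists, which may differ between versions):

import numpy as np
import seg_metrics.seg_metrics as sg

def test_ground_truth_against_itself_is_perfect():
    gdth_img = np.array([[0, 0, 1], [0, 1, 2]])
    metrics = sg.write_metrics(labels=[1, 2], gdth_img=gdth_img, pred_img=gdth_img)
    for name in ('dice', 'jaccard', 'precision', 'recall'):
        assert all(v == 1.0 for v in metrics[name])  # exact, not ~0.999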
