Computing values on ground truth does not give perfect scores #26

jhlegarreta opened this issue Mar 4, 2024 · 5 comments

@jhlegarreta

Following one of the examples in the README file, a quick test using a ground truth image as the predicted image,

import numpy as np
import seg_metrics.seg_metrics as sg

labels = [0, 1, 2]
gdth_img = np.array([[0, 0, 1], [0, 1, 2]])
metrics = sg.write_metrics(labels=labels[1:], gdth_img=gdth_img, pred_img=gdth_img)

does not give perfect scores: the Dice score, Jaccard index, precision, and recall are not 1.0. Although they are close (about 0.999), they should be exactly 1.0.
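
For reference, computing the Dice score by hand on this input gives exactly 1.0 for each label (a minimal check; mask and tp are just names introduced here):

for label in labels[1:]:
    mask = gdth_img == label
    tp = mask.sum()  # prediction equals ground truth, so every foreground pixel is a true positive
    print(label, 2 * tp / (mask.sum() + mask.sum()))  # prints 1.0 for labels 1 and 2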

@agarcia-ruiz

It seems to be related to the smooth parameter: changing it to smooth=0.0 gives a Dice of 1.0. There is some discussion about that here.

# Excerpt from the metric computations in seg_metrics: every
# denominator is padded with a small constant to avoid division by zero.
smooth = 0.001
precision = tp / (pred_sum + smooth)
recall = tp / (gdth_sum + smooth)
fpr = fp / (fp + tn + smooth)
fnr = fn / (fn + tp + smooth)
jaccard = intersection_sum / (union_sum + smooth)
dice = 2 * intersection_sum / (gdth_sum + pred_sum + smooth)
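
Plugging in the numbers for label 1 of the example above (two foreground pixels, identical masks) shows exactly where the 0.999 comes from:

tp = gdth_sum = pred_sum = 2  # label 1 occupies two pixels in both masks
print(2 * tp / (gdth_sum + pred_sum + 0.001))  # 0.99975..., reported as ~0.999
print(2 * tp / (gdth_sum + pred_sum))          # exactly 1.0 once smooth is 0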

@jhlegarreta

@agarcia-ruiz thanks for the answer. I understand that it may be useful to prevent a division by zero, but I am not convinced it should be there outside the scope of training a DL model (I have gone through the discussion you pointed to, and even there it seems that in some cases training does not work unless it is set to 0).

In this context, the pred_sum == 0 case should probably be dealt with in another way. At the very least, documenting the behavior would be worthwhile, or smooth could be added as a parameter to the method.
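
For example, exposing smooth could look like this (a hypothetical signature for illustration; smooth is not currently a parameter of write_metrics):

metrics = sg.write_metrics(labels=labels[1:],
                           gdth_img=gdth_img,
                           pred_img=gdth_img,
                           smooth=0.0)  # exact scores; the caller accepts a possible division by zero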

@Jingnan-Jia

Jingnan-Jia commented Apr 17, 2024

@jhlegarreta Thank you very much for raising this question. We actually had some discussion on this issue and decided to use 0.001 as the default smooth. But we have indeed found that many users get confused by it, especially when running our examples. So we will have another, deeper discussion and see how to solve it. One possible solution, as you mentioned, is to make it a parameter. Another is to make it 0 and raise an exception when a division-by-zero error occurs.

Anyway, we will solve it soon. If you have a better solution, please let us know.

Best,
Jingnan

@Jingnan-Jia

@jhlegarreta Another solution could be to have a default smooth value of 0, with the metric calculation wrapped in a try/except block. If the try block hits a division-by-zero exception, we reset smooth to 0.001 and recalculate the metrics (and possibly show a warning to users about this case).
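
A sketch of that idea, as a hypothetical helper for a single metric (it assumes the accumulated counts are plain Python numbers, so a zero denominator actually raises ZeroDivisionError instead of producing a NumPy warning):

import warnings

def dice_with_fallback(tp, gdth_sum, pred_sum, smooth=0.001):
    # Try the exact computation first; fall back to smoothing only
    # when both masks are empty and the denominator is zero.
    try:
        return 2 * tp / (gdth_sum + pred_sum)
    except ZeroDivisionError:
        warnings.warn(f"empty ground truth and prediction; falling back to smooth={smooth}")
        return 2 * tp / (gdth_sum + pred_sum + smooth)

With that fallback, identical non-empty inputs score exactly 1.0, and the smoothed value only appears in the degenerate empty-mask case.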

What do you think of this solution?

@jhlegarreta

@Jingnan-Jia it looks like one possible workaround; I am not sure what the best approach is. It may be worthwhile to look at how other scientific tools that compute at least the DSC handle this. In my mind, comparing the ground truth against itself should clearly yield a perfect score. Providing tests on well-known cases would probably be enlightening.
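
For instance, a regression test along these lines (a sketch; it assumes write_metrics returns a mapping from metric names to per-label value lists, which may differ between versions):

import numpy as np
import seg_metrics.seg_metrics as sg

def test_ground_truth_against_itself_is_perfect():
    gdth_img = np.array([[0, 0, 1], [0, 1, 2]])
    metrics = sg.write_metrics(labels=[1, 2], gdth_img=gdth_img, pred_img=gdth_img)
    for name in ('dice', 'jaccard', 'precision', 'recall'):
        assert all(v == 1.0 for v in metrics[name])  # exact, not ~0.999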
