Dear authors,

The `evaluate_dice_on_tests` function passes `preds` to the `metric` function. However, the general-purpose `evaluate_model_on_tests` available in `flamby.utils` uses `y_pred`. This mismatch causes different metric values for `Fed_KiTS19` evaluation depending on the function used.

It seems like `evaluate_dice_on_tests` is the correct version. Can you please confirm?

Thanks!
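To make the mismatch concrete, here is a minimal sketch of the two call patterns as I read them; this is a paraphrase, not the actual FLamby source, and the `argmax` step is an assumption about how `preds` is derived from the raw model output:

```python
import torch

@torch.no_grad()
def dice_style_eval(model, dataloader, metric):
    # evaluate_dice_on_tests-style: score hard predictions per sample
    scores = []
    for X, y in dataloader:
        preds = model(X).argmax(dim=1)  # hard label maps (assumed)
        scores.append(metric(y, preds))
    return sum(scores) / len(scores)

@torch.no_grad()
def generic_eval(model, dataloader, metric):
    # evaluate_model_on_tests-style: forward raw outputs unchanged
    y_true, y_pred = [], []
    for X, y in dataloader:
        y_true.append(y)
        y_pred.append(model(X))  # raw logits / probabilities
    return metric(torch.cat(y_true), torch.cat(y_pred))
```

A metric that expects hard label maps will score these two paths differently, which would explain the diverging numbers.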
Hello @akash-07 !
For models working on data modalities that are too big to fit in RAM, we have functions that batch the inference, such as `evaluate_dice_on_tests`, and measure the prediction/ground-truth match at the sample level; this is also the case for Fed-LIDC. These are the functions used in the benchmark script.
I agree that it's not really clear. The metric functions also "work", but they operate patch-wise rather than on whole samples.
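A tiny illustration of why patch-wise and sample-level scores disagree (the `dice` below is a plain binary version written for the example, not the Fed-KiTS19 metric):

```python
import torch

def dice(t, p, eps=1e-8):
    inter = (t * p).sum()
    return ((2 * inter + eps) / (t.sum() + p.sum() + eps)).item()

t = torch.tensor([1., 1., 1., 1., 0., 0., 0., 0.])  # ground truth
p = torch.tensor([1., 1., 0., 0., 0., 0., 0., 0.])  # prediction

whole = dice(t, p)                                      # one whole-sample score
halves = (dice(t[:4], p[:4]) + dice(t[4:], p[4:])) / 2  # mean over two "patches"
print(whole, halves)  # ~0.67 vs ~0.83: the two aggregations differ
```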
Maybe @ErumMushtaq can provide more info?
I think most users of the repo would reach for `evaluate_model_on_tests` first. Adding a note or some documentation about which function to use for each dataset would be helpful.
Alternatively, fixing `evaluate_model_on_tests` itself seems even easier.
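In the meantime, a possible stopgap (my own sketch, assuming the Fed-KiTS19 `metric` expects hard label maps and that `evaluate_model_on_tests` hands it raw outputs shaped `(N, C, ...)`) would be to wrap the metric before passing it in:

```python
import numpy as np
from flamby.utils import evaluate_model_on_tests

def on_hard_preds(metric):
    """Wrap a metric so it sees argmax'd label maps instead of raw outputs."""
    def wrapped(y_true, y_pred):
        return metric(y_true, np.argmax(y_pred, axis=1))
    return wrapped

# Hypothetical usage; `model`, `test_dataloaders`, and `metric` come from
# the dataset's usual setup:
# results = evaluate_model_on_tests(model, test_dataloaders, on_hard_preds(metric))
```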
You are completely right about the lack of documentation on loss functions; I will open an issue about it.
However, the goal of FLamby is not to impose metrics or anything else upon the user; it is meant to be a playground for FL research.