Opinion about dataset split #14
Hi, @djskwh. First, I actually think you are right: in my opinion, the global server should not hold a global testset for evaluation. As I imagine it, in industry the side responsible for FL training cannot obtain a testset with the same data distribution as the trainset. In academia, however, the global-testset setting is used for evaluation convenience, and that seems permissible. Let us review the code in FL-bench for testing FL methods (whether traditional or personalized). Lines 227 to 252 in a3ace46
Suppose I have a global testset of size N. According to your colleague's opinion, the final accuracy of a traditional FL method should be calculated as acc = (number of correct predictions on the global testset) / N. Now suppose instead that I have two FL clients in total, whose local testsets have sizes n1 and n2 with n1 + n2 = N, and the model predicts c1 and c2 of their samples correctly. What my code calculates is the sample-weighted average acc = (c1 + c2) / (n1 + n2) = (c1 + c2) / N. So in my opinion, for traditional FL methods the result my code calculates should be the same as the result calculated with a global testset. Of course, personalized FL methods are out of scope for this discussion; they are incompatible with the global-testset setting.
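The equivalence above can be checked with a small sketch (this is illustrative code, not FL-bench's actual evaluation routine; the client counts and shard sizes are hypothetical):

```python
# Sketch: sample-weighted aggregation of per-client correct counts
# equals accuracy computed on the pooled (global) testset.

def weighted_accuracy(corrects, sizes):
    """Aggregate per-client results, weighting each client by its testset size."""
    return sum(corrects) / sum(sizes)

# Two hypothetical clients: local testset shards of 4000 and 2000 samples.
c1, n1 = 3600, 4000   # client 1: 3600/4000 = 90% local accuracy
c2, n2 = 1500, 2000   # client 2: 1500/2000 = 75% local accuracy

pooled = (c1 + c2) / (n1 + n2)                  # global-testset accuracy
aggregated = weighted_accuracy([c1, c2], [n1, n2])

assert abs(pooled - aggregated) < 1e-12
print(aggregated)  # 0.85
```

Note that this only holds when clients report correct counts (or accuracies weighted by shard size); an unweighted mean of per-client accuracies would differ whenever shard sizes differ.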
Thanks for the detailed review of your code and the explanation.
Hi, KarhouTam.
I recently discussed the dataset split for traditional FL (FedAvg, FedProx, FedDyn, etc.) with my colleague.
My colleague insisted that, to evaluate an FL algorithm, I should evaluate the model on an isolated dataset
(i.e., for MNIST, first split off ~6000 images as a test dataset for global model evaluation, and then assign the rest of the images to each client for training and evaluation).
As far as I know, the global server can't see the entire dataset because of privacy issues, right?
I think that's why your dataset creation setting also doesn't assign a test dataset for global model evaluation.
What is your opinion about this?
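For illustration, the holdout scheme described above could be sketched like this (the function name, client count, and holdout size are hypothetical, not FL-bench's actual API):

```python
# Sketch: hold out a global testset first, then partition the remaining
# samples evenly across clients. Indices stand in for actual images.
import random

def split_with_global_testset(indices, num_clients, global_test_size, seed=42):
    """Return (global_test_indices, per_client_index_lists)."""
    rng = random.Random(seed)
    shuffled = indices[:]
    rng.shuffle(shuffled)
    global_test = shuffled[:global_test_size]
    rest = shuffled[global_test_size:]
    shard = len(rest) // num_clients
    clients = [rest[i * shard:(i + 1) * shard] for i in range(num_clients)]
    return global_test, clients

# MNIST has 70000 images in total (60k train + 10k test).
indices = list(range(70000))
global_test, clients = split_with_global_testset(
    indices, num_clients=10, global_test_size=6000)
print(len(global_test), len(clients[0]))  # 6000 6400
```

This sketch splits IID for simplicity; a real FL benchmark would typically partition the client shards non-IID (e.g. by label or Dirichlet sampling).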