
Implementation of FedNova #1

Open

omarfoq opened this issue Jun 24, 2021 · 1 comment

omarfoq commented Jun 24, 2021

Hello,

I am not sure about this, but it seems there is a slight mismatch between the implementation of FedNova and its description in the paper. In the current implementation, the cumulative sum of gradients is computed without multiplying by the weights a_{i}, even when etamu is not zero (see below):

# update accumulated local updates
if 'cum_grad' not in param_state:
    param_state['cum_grad'] = torch.clone(d_p).detach()
    param_state['cum_grad'].mul_(local_lr)
else:
    param_state['cum_grad'].add_(local_lr, d_p)

p.data.add_(-local_lr, d_p)

It probably needs an extra line that multiplies d_p by a_i (a_i would also need to be tracked) before param_state['cum_grad'] is updated, roughly as in the sketch below.
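A minimal sketch of what that change could look like, mirroring the snippet above; the a_i coefficient and the way it is tracked are hypothetical here, not variables in the current implementation:

# hypothetical: weight each step's update by the coefficient a_i before accumulating it
if 'cum_grad' not in param_state:
    param_state['cum_grad'] = torch.clone(d_p).detach()
    param_state['cum_grad'].mul_(local_lr * a_i)
else:
    param_state['cum_grad'].add_(local_lr * a_i, d_p)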

Is this correct, or is there something I am missing?

Thanks in advance for your help.

nthhiep commented Dec 23, 2021

Hi.
d_p starts out as the gradient, but by the end of the step it is a composite update built from several factors: the gradient, momentum, variance (correction), the proximal term, and so on.
So param_state['cum_grad'] accumulates the step updates d_p, not the raw gradients.
Through an equivalent mathematical transformation, you can see that this accumulated update param_state['cum_grad'] equals the sum of a_i[j] * grad_step_j. See page 6 of the paper for details. A small numerical check is sketched below.
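For the plain momentum case (no proximal or variance-reduction terms), this equivalence is easy to verify numerically. The sketch below uses arbitrary example values for the momentum factor, the number of local steps, and the gradients; a_i[j] = (1 - rho^(tau - j)) / (1 - rho) is the coefficient that the momentum recursion assigns to the j-th gradient:

import torch

rho = 0.9   # momentum factor (example value)
tau = 5     # number of local steps (example value)
torch.manual_seed(0)
grads = [torch.randn(3) for _ in range(tau)]  # example per-step gradients

# Accumulate the composite step updates d_p (gradient plus momentum buffer),
# the way cum_grad does; the learning rate is omitted for clarity.
buf = torch.zeros(3)
cum_update = torch.zeros(3)
for g in grads:
    buf = rho * buf + g      # this is the step update d_p
    cum_update += buf

# Weight each raw gradient by a_i[j] and sum.
weighted_sum = sum(
    ((1 - rho ** (tau - j)) / (1 - rho)) * grads[j] for j in range(tau)
)

print(torch.allclose(cum_update, weighted_sum))  # prints True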
