
How do you implement equation (15) in your paper? #5

Closed
Dorispaopao opened this issue Aug 23, 2019 · 2 comments

Comments

@Dorispaopao

And how do you handle the gradient backpropagation in your implementation?

@XiaLiPKU
Owner

For the first question:
I implemented it in 'train.py':

EMANet/train.py, line 134 (commit 9a492d8):

```python
self.net.module.ema.mu *= momentum
```

Implementing it in the EMAU module might look cleaner. But since \mu has to be averaged over the whole batch, implementing it in the module would need a 'reduce' operation as in SyncBN. So I just put this line in 'train.py', where the \mu from all GPUs have already been gathered.
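
For context, here is a minimal sketch of the batch-level moving-average update that this line belongs to, assuming equation (15) is the momentum update \bar{\mu} \leftarrow \alpha \bar{\mu} + (1 - \alpha) \mu. The names `ema_mu` and `mu`, and all shapes, are illustrative, not the repository's exact code:

```python
import torch

# Sketch of the moving-average update described above:
# mu_bar <- momentum * mu_bar + (1 - momentum) * mu.
# `ema_mu` stands in for net.module.ema.mu; shapes are illustrative.
momentum = 0.9
ema_mu = torch.randn(1, 512, 64)  # stored bases \bar{\mu}: (1, c, k)
mu = torch.randn(16, 512, 64).mean(dim=0, keepdim=True)  # \mu averaged over the batch

with torch.no_grad():
    ema_mu *= momentum              # the quoted line from train.py
    ema_mu += mu * (1 - momentum)   # the rest of the moving average
```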

@XiaLiPKU
Owner

> And how do you handle the gradient backpropagation in your implementation?

For the second question:

I simply cut off the gradients for the A_E and A_M iterations by wrapping them in `with torch.no_grad():`.
To be honest, what happens inside the EMA iterations still lacks deep exploration. EMANet is just a first exploration of the EM + Attention mechanism, so I look forward to deeper analysis from interested followers.
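
As a rough illustration, here is a minimal sketch of running the A_E/A_M iterations under `torch.no_grad()`, with a final differentiable re-estimation step. The shapes, the softmax/normalization details, and the placement of the last step are assumptions and may differ from the actual EMAU module:

```python
import torch
import torch.nn.functional as F

# Sketch: E/M attention iterations carry no gradient, as described above.
# b = batch, c = channels, n = pixels, k = number of bases; values illustrative.
b, c, n, k = 2, 64, 32 * 32, 16
x = torch.randn(b, c, n, requires_grad=True)  # flattened feature map
mu = torch.randn(b, c, k)                     # initial bases \mu

with torch.no_grad():                         # cut gradients off the iterations
    for _ in range(3):
        z = F.softmax(torch.bmm(x.transpose(1, 2), mu), dim=2)  # A_E: (b, n, k)
        z_norm = z / (1e-6 + z.sum(dim=1, keepdim=True))
        mu = F.normalize(torch.bmm(x, z_norm), dim=1)           # A_M: (b, c, k)

# A final step outside no_grad so gradients still reach x (an assumption here,
# not necessarily how the official EMAU module arranges it).
z = F.softmax(torch.bmm(x.transpose(1, 2), mu), dim=2)
x_tilde = torch.bmm(mu, z.transpose(1, 2))    # reconstructed features: (b, c, n)
x_tilde.sum().backward()                      # gradients flow only through this last step
```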
