A question on the computation of Hessian-vector product #4

Closed
YiifeiWang opened this issue Feb 15, 2020 · 2 comments

@YiifeiWang

In the function dataloader_hv_product() of the class hessian(), lines 86-87 read:
'''
THv = [torch.randn(p.size()).to(device) for p in self.params]  # accumulate result
'''
I am wondering why this uses random initialization instead of zero initialization. (Admittedly, in the actual computation with a large number of data points, the contribution of this initialization is approximately zero.)

@htwang14

Thanks a lot to the authors for releasing the code for their excellent work. I agree with @YiifeiWang, though, that zero initialization is more appropriate here. In my own experiments with random initialization, the power iteration converges poorly and the returned top eigenvalues vary a lot between runs. Please correct me if my understanding is wrong.

@yaozhewei
Collaborator

Thanks for pointing this out. This was a mistake introduced when we cleaned up the code. It is fixed now.
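For readers following the thread, here is a minimal sketch of why the initial value of the accumulator matters. It is not the repository's code: it uses plain Python lists instead of tensors to stay dependency-free, and the names `matvec`, `averaged_hv`, `batches`, and the `init` argument are hypothetical. It assumes the accumulator sums per-batch Hessian-vector products weighted by batch size and then divides by the total number of samples. In PyTorch, zero initialization would presumably amount to using `torch.zeros` in place of `torch.randn` in the quoted line.

```python
import random


def matvec(h, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in h]


def averaged_hv(batches, v, init="zeros", seed=0):
    """Average per-batch Hessian-vector products over a dataset.

    batches: list of (H_b, n_b) pairs, where H_b is the batch-mean
             Hessian and n_b is the batch size (toy stand-ins).
    init:    "zeros" or "randn", the starting value of the accumulator.
    """
    rng = random.Random(seed)
    if init == "zeros":
        thv = [0.0] * len(v)
    else:
        # Mimics the random initialization questioned in the issue.
        thv = [rng.gauss(0.0, 1.0) for _ in v]

    num_data = 0
    for h, n in batches:
        hv = matvec(h, v)
        # Weight each batch-mean product by its batch size.
        thv = [acc + x * n for acc, x in zip(thv, hv)]
        num_data += n

    # Dividing by num_data shrinks any initial noise but does not remove it.
    return [acc / num_data for acc in thv]


# Toy dataset: two batches of 4 samples, each with Hessian diag(2, 1).
batches = [([[2.0, 0.0], [0.0, 1.0]], 4), ([[2.0, 0.0], [0.0, 1.0]], 4)]
v = [1.0, 0.0]

print(averaged_hv(batches, v, init="zeros"))  # zero init gives [2.0, 0.0]
print(averaged_hv(batches, v, init="randn", seed=1))
print(averaged_hv(batches, v, init="randn", seed=2))
```

With zero initialization the result is exactly the averaged product. With random initialization each call returns the true product plus fresh noise scaled by 1/num_data, so the result is no longer a fixed linear function of `v`. On small datasets this may contribute to the run-to-run variation in top eigenvalues reported above.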
