
Why not use autograd directly? #44

Closed
sdogsq opened this issue Mar 17, 2022 · 2 comments

Comments

@sdogsq

sdogsq commented Mar 17, 2022

Hi, Patrick,

Thanks a lot for your nice work and the detailed code comments! I have a very naive question: why not use PyTorch autograd directly for the backward pass, since the forward pass seems to be built entirely from ordinary tensor operations like addcmul? I have a couple of guesses, but I am not sure whether they are correct.

  1. A custom backward pass might be faster than relying on autograd.
  2. It can avoid storing intermediate variables and therefore save memory.

I am a newbie when it comes to custom PyTorch functions, so I would appreciate it if you could kindly share your thoughts. Thank you again!
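For illustration only, here is a minimal sketch of what "using autograd directly" could look like. This is not this library's actual code: piecewise_linear_eval is a made-up stand-in for the real forward pass, chosen just to show that autograd keeps the intermediates it needs (e.g. frac and the gather indices) alive until backward runs.

```python
import torch

def piecewise_linear_eval(xs, ys, t):
    # Hypothetical stand-in for the forward pass: evaluate a piecewise-linear
    # interpolant with knots xs and values ys at query times t.
    indices = torch.searchsorted(xs, t).clamp(1, xs.numel() - 1)
    x0, x1 = xs[indices - 1], xs[indices]
    y0, y1 = ys[indices - 1], ys[indices]
    frac = (t - x0) / (x1 - x0)
    # Fused multiply-add, as mentioned in the question: y0 + frac * (y1 - y0).
    return torch.addcmul(y0, frac, y1 - y0)

xs = torch.linspace(0.0, 1.0, 11)
ys = torch.randn(11, requires_grad=True)
t = torch.rand(100)

# Autograd records the graph and saves the tensors needed for backward
# (here frac and the gather indices) until .backward() is called.
piecewise_linear_eval(xs, ys, t).sum().backward()
print(ys.grad.shape)  # torch.Size([11])
```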

@patrick-kidger
Owner

The main reason is that it is possible to efficiently reconstruct the forward pass during the backward pass. Doing so means we don't need to hold intermediate values in memory.

This is actually a very special case of the "continuous adjoint method" sometimes used in differential equations (e.g. as popularised for neural ODEs; see also Chapter 5 of https://arxiv.org/abs/2202.02435). In our case, though, because of the piecewise linear structure, we can recompute things without suffering any numerical truncation error. (Only floating-point error, which usually isn't that bad.)
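A minimal sketch of the recompute-in-backward idea, reusing the hypothetical piecewise-linear evaluation from above rather than this library's actual implementation: only the inputs are saved for the backward pass, and the forward-pass quantities are rebuilt when the gradient is needed.

```python
import torch

class PiecewiseLinearEval(torch.autograd.Function):
    # Hypothetical illustration: save only the inputs in forward, then
    # reconstruct the forward-pass quantities inside backward.

    @staticmethod
    def forward(ctx, xs, ys, t):
        indices = torch.searchsorted(xs, t).clamp(1, xs.numel() - 1)
        frac = (t - xs[indices - 1]) / (xs[indices] - xs[indices - 1])
        out = torch.addcmul(ys[indices - 1], frac, ys[indices] - ys[indices - 1])
        ctx.save_for_backward(xs, ys, t)  # inputs only, no intermediates
        return out

    @staticmethod
    def backward(ctx, grad_out):
        xs, ys, t = ctx.saved_tensors
        # Recompute the forward pass. Because the interpolant is piecewise
        # linear, this reconstruction is exact up to floating-point error.
        indices = torch.searchsorted(xs, t).clamp(1, xs.numel() - 1)
        frac = (t - xs[indices - 1]) / (xs[indices] - xs[indices - 1])
        # out = (1 - frac) * ys[indices - 1] + frac * ys[indices]
        grad_ys = torch.zeros_like(ys)
        grad_ys.index_add_(0, indices - 1, grad_out * (1 - frac))
        grad_ys.index_add_(0, indices, grad_out * frac)
        # Gradients w.r.t. xs and t are omitted in this sketch.
        return None, grad_ys, None

xs = torch.linspace(0.0, 1.0, 11)
ys = torch.randn(11, requires_grad=True)
t = torch.rand(100)
PiecewiseLinearEval.apply(xs, ys, t).sum().backward()
print(ys.grad.shape)  # torch.Size([11])
```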

@sdogsq
Author

sdogsq commented Mar 18, 2022

Really insightful, thank you! I'll read the paper carefully.

Cheers!

@sdogsq sdogsq closed this as completed Mar 18, 2022