
[Doubt] Would iteratively setting a non-local variable as the answer result in correct gradients #67

Closed
pranv opened this issue Nov 7, 2015 · 2 comments

@pranv commented Nov 7, 2015

Suppose I had a class method like this:

def call(self, X):
    for t in range(X.shape[0]):
        self.ans = some_function(X[t], self.ans)       # some_function is a parameterized operation
        # some more computation with self.ans
        # final step has a scalar loss function

Would I get correct gradients for this whole process? That is, will the computation graph that autograd constructs store the value of self.ans at each iteration and use it?

@mattjj (Contributor) commented Nov 7, 2015

Yup!

Here's a test just to make sure I understand what you mean:

import autograd.numpy as np
from autograd import grad
from autograd.util import quick_grad_check


def some_function(x, y):
    return x + y + x*y

class A(object):
    def call(self, X):
        self.ans = np.zeros(X.shape)
        for t in range(X.shape[0]):
            # iteratively overwrite the instance attribute, as in the question
            self.ans = some_function(X[t], self.ans)

        # final step: a scalar loss
        return np.sum(self.ans**2)


a = A()
print grad(a.call)(np.random.randn(5))
quick_grad_check(a.call, np.random.randn(5))

One thing to keep in mind is that ans will come out boxed at the end:

In [1]: run issue67
[ 323.38327233  185.12997922  283.66369298  168.29411013  807.31095824]
Checking gradient of <bound method A.call of <__main__.A object at 0x10d2d9610>> at [-0.10688095  0.66919977 -0.45675244 -1.08241973 -0.91352716]
Gradient projection OK (numeric grad: -0.105982859395, analytic grad: -0.105982859699)

In [2]: print a.ans
Autograd ArrayNode with value [-1.005772 -1.005772 -1.005772 -1.005772 -1.005772] and 1 tape(s)

but its computation tape is completed and so it will act just like a regular array.
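
For instance, here's a minimal sketch (continuing the script above) of using the boxed a.ans in ordinary numpy expressions:

# the tape attached to a.ans is already finished, so these behave
# just like operations on a plain ndarray
print np.sum(a.ans)
print a.ans + 1.0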

The code also works if the updated value of self.ans gets reused in future calls to call instead of getting reset to zeros like in the example I wrote. That just means the function changes every time you call it, which autograd can handle but quick_grad_check can't (because it invokes the function multiple times to check its numerical gradient):

class A(object):
    def __init__(self, ans):
        self.ans = ans

    def call(self, X):
        # self.ans persists across calls, so each call starts from the previous state
        for t in range(X.shape[0]):
            self.ans = some_function(X[t], self.ans)

        return np.sum(self.ans**2)


a = A(5.)
print grad(a.call)(np.random.randn(3))
print grad(a.call)(np.random.randn(3))

In [1]: run issue67
[  8.91513093  60.30447614 -32.37465155]
[  2.43210897  17.48004748   5.28757384]
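
If you want something quick_grad_check can verify, one option (a sketch under the same setup; call_functional is just a hypothetical name) is to thread the state through explicitly so the function is pure:

def call_functional(X, ans=5.):
    # same recurrence, but the state is a local variable, so repeated
    # calls with the same X return the same value
    for t in range(X.shape[0]):
        ans = some_function(X[t], ans)
    return np.sum(ans**2)

X = np.random.randn(3)
print grad(call_functional)(X)
quick_grad_check(call_functional, X)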

@mattjj closed this as completed Nov 7, 2015
@mattjj added the question label Nov 7, 2015
@pranv (Author) commented Nov 8, 2015

Thanks!
