BatchAutoDiffCost: make shape consistent #7

AdamGleave · 2019-01-28T06:04:07Z

First of all, thanks for the great repo -- cleanest implementation of iLQR I've found and easily extensible for my needs.

The current implementation of BatchAutoDiffCost passes in the inputs without a batch in the first dimension for l, l_x and l_u (i.e. x.shape == (state_size,)), and inputs with a batch for l_xx, l_ux and l_uu (i.e. x.shape == (batch_size, state_size)).

Although the cost function l can be written to handle both inputs, I find it quite fiddly to get this right, and it means doing things like summing over axis=-1 since axis=1 doesn't exist in the non-batch case. In this PR I change the input to always have a batch in the first dimension (sometimes of length 1). I also redefine i to be a scalar, which I think it always will be.

anassinator

Thanks for the kind words and the contributions!

Could you give a small example of what this solves exactly? I always just used the [..., i] syntax to support both as in the examples in the README. Is that insufficient? Also, would BatchAutoDiffDynamics need to change?

anassinator · 2019-02-03T02:35:52Z

ilqr/cost.py

@@ -353,7 +353,7 @@ def __init__(self, f, state_size, action_size, **kwargs):
        # Prepare inputs.
        self._x = x = T.dvector("x")
        self._u = u = T.dvector("u")
-        self._i = i = T.dvector("i")
+        self._i = i = T.dscalar("i")


I think we should keep T.dvector here to be consistent with the rest of the Cost and Dynamics implementations. The idea was to support computing the cost of a batch of (x, u, i) pairs in one shot. We can revisit this though, but probably better in a separate PR.

Agree it seems conceptually cleaner to support evaluating at different i values. I'm not sure I can think of a concrete use case, and it does complicate implementing Cost a little, but I don't have strong feelings here. I originally made this change since l(x,u,i) failed and required a call l(x,u,[i]), but I can fix that directly.

Should terminal also be a vector if i is? I'm confused how these interact.

AdamGleave · 2019-02-03T18:25:00Z

It's possible to write cost functions in such a way that they support both batched and non-batched inputs, but it's error prone. In particular, it's easy to write a function which works for l, l_x and l_u but fails on l_xx. I tend to assume that if the function evaluates correctly on a few test inputs, then since the derivatives are computed automatically they should be correct as well. So I found this behaviour quite counterintuitive, and it effectively requires testing twice as many cases. Here's a MWE of the problem:

import numpy as np
from theano import tensor as T
from theano.printing import Print

from ilqr.cost import BatchAutoDiffCost


class DemoCost(BatchAutoDiffCost):
    def __init__(self, axis):
        def f(x, u, i, terminal):
            if terminal:
                control_cost = T.zeros_like(x[..., 0])
            else:
                control_cost = T.square(u).sum(axis=axis)

            foo = x[..., 0:2]
            state_cost = T.sqrt(T.sum(x, axis=axis))
            cost = state_cost + control_cost
            cost = control_cost
            return cost

        super().__init__(f, state_size=2, action_size=2)

def similar(x, y):
    x = abs(x - y) < 1e-8
    if np.isscalar(x):
        return x
    else:
        return x.all()


def test():
    correct_cost = DemoCost(axis=-1)  # this always works
    cost0 = None
    try:
        # This one is wrong, but can be created in both master and PR.
        cost0 = DemoCost(axis=0)
    except Exception as e:
        print('DemoCost(axis=0) failed with', e)
    cost1 = None
    try:
        # This one is correct. It cannot be created in master. It can in PR.
        cost1 = DemoCost(axis=1)
    except Exception as e:
        print('DemoCost(axis=1) failed with', e)

    x = np.array([1, 2])
    u = np.array([1, 2])

    for i, cost in enumerate([cost0, cost1]):
        print('run ', i)
        if cost is None:
            continue
        try:  # on PR branch
            print('l', similar(correct_cost.l(x, u, 0), cost.l(x, u, 0)))
        except Exception as e:  # on master branch
            print('Exception: ', e)
            print('l', similar(correct_cost.l(x, u, [0]), cost.l(x, u, [0])))
        print('l_x', similar(correct_cost.l_x(x, u, 0), cost.l_x(x, u, 0)))
        print('l_u', similar(correct_cost.l_u(x, u, 0), cost.l_u(x, u, 0)))
        print('l_xx', similar(correct_cost.l_xx(x, u, 0), cost.l_xx(x, u, 0)))
        print('l_uu', similar(correct_cost.l_uu(x, u, 0), cost.l_uu(x, u, 0)))


if __name__ == '__main__':
    test()

On my machine, on master this produces:

DemoCost(axis=1) failed with Not enough dimensions on Elemwise{sqr,no_inplace}.0 to reduce on axis 1
l True  # it appears to work...
l_x True                                                                                                                                                                                              
l_u True                                                                                                                                                                                              
l_xx False  # but is actually broken :(
l_uu True

By contrast, in the PR branch:

run  0  # this is wrong, but it's immediately obviously wrong -- it disagrees on l
l False                                                                                                                                                                                               
l_x True
l_u False                                                                                                                                                                                             
l_xx True                                                                                                                                                                                             
l_uu True                                                                                                                                                                                             
run  1  # this is correct
l True                                                                                                                                                                                                
l_x True                                                                                                                                                                                              
l_u True                                                                                                                                                                                              
l_xx True                                                                                                                                                                                             
l_uu True

anassinator · 2019-04-07T07:47:59Z

Sorry, I forgot to address this!

I understand what you mean now. For completeness' sake, I was only able to reproduce your example by removing the cost = control_cost line.

And I didn't notice there was a discrepancy with the i vector between l() and l_*(). So you can fix that after all. Would you also mind making the same change to BatchAutoDiffDynamics then so they're both consistent?

AdamGleave · 2019-04-07T23:37:53Z

You're right the cost = control_cost line needs to be removed for my example to reproduce; apologies.

I've changed self._i to be dscalar in both BatchAutoDiffCost and BatchAutoDiffDynamics. I also noticed that BatchAutoDiffDynamics had a similar issue of inconsistent # of dimensions, which I've resolved analogously to in BatchAutoDiffCost.

I believe this addresses all outstanding issues in this PR; let me know if there are any other changes you'd like me to make.

BatchAutoDiffCost: make shape consistently (batch_size, ...)

5686db3

anassinator requested changes Feb 3, 2019

View reviewed changes

AdamGleave added 2 commits February 3, 2019 10:47

Bugfix: reshape terminal cost too

fe4132a

Make i a vector again

efc8125

AdamGleave added 4 commits April 7, 2019 15:50

Merge remote-tracking branch 'anassinator/master' into batchcostshapes

a437734

Make self._i a scalar

5e0ff13

Ignore PyCharm project settings

61c6045

Make BatchAutoDiffDynamics also have consistent # of dimensions

47ec39a

anassinator approved these changes May 5, 2019

View reviewed changes

anassinator merged commit d4a0ac0 into anassinator:master May 5, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BatchAutoDiffCost: make shape consistent #7

BatchAutoDiffCost: make shape consistent #7

AdamGleave commented Jan 28, 2019

anassinator left a comment

anassinator Feb 3, 2019

AdamGleave Feb 3, 2019

AdamGleave Feb 3, 2019

AdamGleave commented Feb 3, 2019 •

edited

Loading

anassinator commented Apr 7, 2019

AdamGleave commented Apr 7, 2019

BatchAutoDiffCost: make shape consistent #7

BatchAutoDiffCost: make shape consistent #7

Conversation

AdamGleave commented Jan 28, 2019

anassinator left a comment

Choose a reason for hiding this comment

anassinator Feb 3, 2019

Choose a reason for hiding this comment

AdamGleave Feb 3, 2019

Choose a reason for hiding this comment

AdamGleave Feb 3, 2019

Choose a reason for hiding this comment

AdamGleave commented Feb 3, 2019 • edited Loading

anassinator commented Apr 7, 2019

AdamGleave commented Apr 7, 2019

AdamGleave commented Feb 3, 2019 •

edited

Loading