controllers: support affine terms in finite-horizon LQR #13272

RussTedrake · 2020-05-11T02:52:11Z

The corresponding derivation is in:
http://underactuated.mit.edu/lqr.html#finite_horizon_derivation

This is the last major piece of functionality that I hoped to support for continuous-time systems. (Richer cost specification would be the one easy thing to add if/when it's needed). But I still hope to add the discrete-time formulations before I'm completely satisfied.

This change is

RussTedrake

+@hongkai-dai for feature review, please? (This one is a bit a of a doozy... let me know if you're not up for it)

Reviewable status: LGTM missing from assignee hongkai-dai, needs platform reviewer assigned, needs at least two assigned reviewers (waiting on @hongkai-dai)

hongkai-dai

Checkpoint.

Reviewed 2 of 7 files at r1.
Reviewable status: 3 unresolved discussions, LGTM missing from assignee hongkai-dai, needs platform reviewer assigned, needs at least two assigned reviewers (waiting on @hongkai-dai and @RussTedrake)

systems/controllers/finite_horizon_linear_quadratic_regulator.h, line 75 at r1 (raw file):

A structure that contains the basic FiniteHorizonLinearQuadraticRegulator
results. The finite-horizon cost-to-go is given by (x-x0(t))'*S(t)*(x-x0(t)) +
2*(x-x0(t))'sx(t) + s0(t) and the optimal controller is given by u-u0(t) =

sx(t) could be interpreted as a scalar s multiplies with x(t). Can we use another symbol for this vector?

systems/controllers/finite_horizon_linear_quadratic_regulator.h, line 111 at r1 (raw file):

         + \int_{t_0}^{t_f} 2(x(t)-x_d(t))'N(u(t)-u_d(t)) dt \\
  \text{s.t. } \dot{x} - \dot{x}_0(t) = A(t)(x(t) - x_0(t)) + B(t)(u(t) -
u_0(t)) + c(t)

I am not sure if we should have \dot{x}_0(t) here. The nonlinear dynamics is

xdot = f(x, u)

We do a taylor expansion of the dynamics around (x0, u0), and get

xdot ≈ f(x0, u0) + df/dx * (x - x0) + df/du * (u - u0)

with A = df/dx, B = df/du, c = f(x0, u0), we should get

xdot ≈ A * (x - x0) + B * (u - u0) + c

Did I miss anything here?

systems/controllers/finite_horizon_linear_quadratic_regulator.cc, line 109 at r1 (raw file):

    const Eigen::Ref<const Eigen::MatrixXd>& A = AB.leftCols(num_states_);
    const Eigen::Ref<const Eigen::MatrixXd>& B = AB.rightCols(num_inputs_);
    const Eigen::VectorXd c = math::autoDiffToValueMatrix(autodiff_xdot0) -

I think in most cases, x0_.EvalDerivative() does not equal to xdot0 computed from f(t, x0, u0). Most of the time we get x0 from interpolating the trajectory optimization result, with no guarantee that the time derivative of the interpolated trajectory matches with xdot computed from the dynamics. With this non-zero c, we will always introduce non-zero affine and constant terms. This is probably different from the LQR controller implemented in many other places, which don't have the non-zero affine term.

RussTedrake

Reviewable status: 3 unresolved discussions, LGTM missing from assignee hongkai-dai, needs platform reviewer assigned, needs at least two assigned reviewers (waiting on @hongkai-dai)

systems/controllers/finite_horizon_linear_quadratic_regulator.h, line 75 at r1 (raw file):

Previously, hongkai-dai (Hongkai Dai) wrote…

sx(t) could be interpreted as a scalar s multiplies with x(t). Can we use another symbol for this vector?

Ok, I will update the doxygen to use sₓ, s₀, etc. (but will push this after your review; just to not spend a CI cycle on it).

systems/controllers/finite_horizon_linear_quadratic_regulator.h, line 111 at r1 (raw file):

Previously, hongkai-dai (Hongkai Dai) wrote…

I am not sure if we should have \dot{x}_0(t) here. The nonlinear dynamics is
xdot = f(x, u)
We do a taylor expansion of the dynamics around (x0, u0), and get
xdot ≈ f(x0, u0) + df/dx * (x - x0) + df/du * (u - u0)
with A = df/dx, B = df/du, c = f(x0, u0), we should get
xdot ≈ A * (x - x0) + B * (u - u0) + c
Did I miss anything here?

That is not how I'm defining c. I'm defining xbar = x-x0, and xbardot = Axbar + Bxbar + c.

systems/controllers/finite_horizon_linear_quadratic_regulator.cc, line 109 at r1 (raw file):

Previously, hongkai-dai (Hongkai Dai) wrote…

I think in most cases, x0_.EvalDerivative() does not equal to xdot0 computed from f(t, x0, u0). Most of the time we get x0 from interpolating the trajectory optimization result, with no guarantee that the time derivative of the interpolated trajectory matches with xdot computed from the dynamics. With this non-zero c, we will always introduce non-zero affine and constant terms. This is probably different from the LQR controller implemented in many other places, which don't have the non-zero affine term.

I agree. And this version is better!

hongkai-dai

First pass done. BTW, the notes in underactuated doesn't have the affine term N, will the notes be updated?

Reviewed 4 of 7 files at r1.
Reviewable status: 6 unresolved discussions, LGTM missing from assignee hongkai-dai, needs platform reviewer assigned, needs at least two assigned reviewers (waiting on @hongkai-dai and @RussTedrake)

systems/controllers/finite_horizon_linear_quadratic_regulator.h, line 111 at r1 (raw file):

Previously, RussTedrake (Russ Tedrake) wrote…

That is not how I'm defining c. I'm defining xbar = x-x0, and xbardot = Axbar + Bxbar + c.

I think c(t) is defined as c(t) = f(t, x0(t), u0(t)) in the documentation below. So we need to update the definition of c(t) in the next paragraph.

systems/controllers/finite_horizon_linear_quadratic_regulator.cc, line 159 at r1 (raw file):

    // the optimal sampling times for the fixed-degree polynomial over the
    // finite time interval. Find it and use it here.
    const std::vector<double> scaled_sample_times = {0.0, 1. / 3., 2. / 3., 1.};

By using Lagrangian interpolation with 4 points, we create a 3rd order polynomial that matches with K at the sampled time. Could you add more explanation on why choosing a 3rd order polynomial?

systems/controllers/test/finite_horizon_linear_quadratic_regulator_test.cc, line 57 at r1 (raw file):

  // Confirm that it converges to the solution.
  EXPECT_TRUE(CompareMatrices(result.S->value(t0), lqr_result.S, 1e-5));
  EXPECT_TRUE(result.sx->value(t0).isZero(1e-12));

Could you provide more explanation why sx and s0 would converge to 0 at t0 with non-zero N? In my understanding, this is because for this linear system, x0 = xd and u0 = ud.

systems/controllers/test/finite_horizon_linear_quadratic_regulator_test.cc, line 211 at r1 (raw file):

Here we will leave the coordinate system alone

Do you mean that the coordinate system x?

systems/controllers/test/finite_horizon_linear_quadratic_regulator_test.cc, line 252 at r1 (raw file):

      FiniteHorizonLinearQuadraticRegulator(sys, *context, t0, tf, Q, R,
                                            options);
  // The LQR x'Sx is the correct solution, but in the Finite Horizon version,

Could you provide more explanation on why x'Sx is the correct solution? My understanding is that because xd = x0 = 0 in options.

RussTedrake

The notes do have the N term (pushed before I opened this PR). Are you looking at the right section? http://underactuated.csail.mit.edu/lqr.html#finite_horizon_derivation .

Reviewable status: 6 unresolved discussions, LGTM missing from assignee hongkai-dai, needs platform reviewer assigned, needs at least two assigned reviewers (waiting on @hongkai-dai and @RussTedrake)

systems/controllers/finite_horizon_linear_quadratic_regulator.h, line 75 at r1 (raw file):

Previously, RussTedrake (Russ Tedrake) wrote…

Ok, I will update the doxygen to use sₓ, s₀, etc. (but will push this after your review; just to not spend a CI cycle on it).

Done.

systems/controllers/finite_horizon_linear_quadratic_regulator.h, line 111 at r1 (raw file):

Previously, hongkai-dai (Hongkai Dai) wrote…

I think c(t) is defined as c(t) = f(t, x0(t), u0(t)) in the documentation below. So we need to update the definition of c(t) in the next paragraph.

Done. Good catch.

systems/controllers/test/finite_horizon_linear_quadratic_regulator_test.cc, line 57 at r1 (raw file):

Previously, hongkai-dai (Hongkai Dai) wrote…

Could you provide more explanation why sx and s0 would converge to 0 at t0 with non-zero N? In my understanding, this is because for this linear system, x0 = xd and u0 = ud.

N is included in the infinite-horizon solution from LQR (which is in the form x'Sx). And we expect the finite-horizon solution to converge to the infinite-horizon solution as t->\inf.

systems/controllers/test/finite_horizon_linear_quadratic_regulator_test.cc, line 211 at r1 (raw file):

Previously, hongkai-dai (Hongkai Dai) wrote…

Here we will leave the coordinate system alone

Do you mean that the coordinate system x?

Done. (yes)

systems/controllers/test/finite_horizon_linear_quadratic_regulator_test.cc, line 252 at r1 (raw file):

Previously, hongkai-dai (Hongkai Dai) wrote…

Could you provide more explanation on why x'Sx is the correct solution? My understanding is that because xd = x0 = 0 in options.

Done.

RussTedrake

Reviewable status: 6 unresolved discussions, LGTM missing from assignee hongkai-dai, needs platform reviewer assigned, needs at least two assigned reviewers (waiting on @hongkai-dai)

systems/controllers/finite_horizon_linear_quadratic_regulator.cc, line 159 at r1 (raw file):

Previously, hongkai-dai (Hongkai Dai) wrote…

By using Lagrangian interpolation with 4 points, we create a 3rd order polynomial that matches with K at the sampled time. Could you add more explanation on why choosing a 3rd order polynomial?

Done.

hongkai-dai

+@sammy-tri for platform review please thanks!

Sorry I must have misread the underactuated notes. It does have N term.

Reviewed 1 of 7 files at r1, 3 of 3 files at r2.
Reviewable status: LGTM missing from assignee sammy-tri(platform) (waiting on @sammy-tri)

systems/controllers/test/finite_horizon_linear_quadratic_regulator_test.cc, line 57 at r1 (raw file):

Previously, RussTedrake (Russ Tedrake) wrote…

N is included in the infinite-horizon solution from LQR (which is in the form x'Sx). And we expect the finite-horizon solution to converge to the infinite-horizon solution as t->\inf.

I see, thanks!

sammy-tri

Reviewed 4 of 7 files at r1, 3 of 3 files at r2.
Reviewable status: 1 unresolved discussion, LGTM missing from assignee sammy-tri(platform) (waiting on @RussTedrake)

systems/controllers/finite_horizon_linear_quadratic_regulator.cc, line 296 at r2 (raw file):

    Eigen::Map<Eigen::VectorXd> S_xx(S_vectorized.data(),
                                     num_states * num_states);
    S_xx = *options.Qf;

This line appears to be triggering an assertion in debug builds:

finite_horizon_linear_quadratic_regulator_test: external/eigen/include/_usr_include_eigen3/Eigen/src/Core/DenseBase.h:257: void Eigen::DenseBase::resize(Eigen::Index, Eigen::Index) [with Derived = Eigen::Map<Eigen::Matrix<double, -1, 1>, 0, Eigen::Stride<0, 0> >; Eigen::Index = long int]: Assertion `rows == this->rows() && cols == this->cols() && "DenseBase::resize() does not actually allow one to resize."' failed.

sammy-tri

Full disclosure: I did not attempt to properly understand this.

Reviewable status: 1 unresolved discussion, LGTM missing from assignee sammy-tri(platform) (waiting on @RussTedrake)

The corresponding derivation is in: http://underactuated.mit.edu/lqr.html#finite_horizon_derivation This is the last major piece of functionality that I hoped to support for continuous-time systems. (Richer cost specification would be the one easy thing to add if/when it's needed). But I still hope to add the discrete-time formulations before I'm completely satisfied.

RussTedrake

Reviewable status: 1 unresolved discussion, LGTM missing from assignee sammy-tri(platform) (waiting on @hongkai-dai and @sammy-tri)

systems/controllers/finite_horizon_linear_quadratic_regulator.cc, line 296 at r2 (raw file):

Previously, sammy-tri (Sam Creasey) wrote…

This line appears to be triggering an assertion in debug builds:

finite_horizon_linear_quadratic_regulator_test: external/eigen/include/_usr_include_eigen3/Eigen/src/Core/DenseBase.h:257: void Eigen::DenseBase::resize(Eigen::Index, Eigen::Index) [with Derived = Eigen::Map<Eigen::Matrix<double, -1, 1>, 0, Eigen::Stride<0, 0> >; Eigen::Index = long int]: Assertion `rows == this->rows() && cols == this->cols() && "DenseBase::resize() does not actually allow one to resize."' failed.

Done. That one was real. Thanks CI!

sammy-tri

Reviewed 1 of 1 files at r3.
Reviewable status: complete! all discussions resolved, LGTM from assignees sammy-tri(platform),hongkai-dai

RussTedrake assigned hongkai-dai May 11, 2020

RussTedrake commented May 11, 2020

View reviewed changes

hongkai-dai reviewed May 11, 2020

View reviewed changes

RussTedrake commented May 11, 2020

View reviewed changes

hongkai-dai reviewed May 11, 2020

View reviewed changes

RussTedrake force-pushed the affine_tvlqr branch from 22bf0ce to e4a4577 Compare May 12, 2020 17:00

RussTedrake commented May 12, 2020

View reviewed changes

RussTedrake force-pushed the affine_tvlqr branch from e4a4577 to dd3b923 Compare May 12, 2020 17:05

RussTedrake commented May 12, 2020

View reviewed changes

hongkai-dai assigned sammy-tri May 12, 2020

hongkai-dai reviewed May 12, 2020

View reviewed changes

sammy-tri reviewed May 12, 2020

View reviewed changes

RussTedrake force-pushed the affine_tvlqr branch from dd3b923 to 60c82f1 Compare May 13, 2020 01:27

RussTedrake commented May 13, 2020

View reviewed changes

sammy-tri reviewed May 13, 2020

View reviewed changes

sammy-tri merged commit eee86f0 into RobotLocomotion:master May 13, 2020

RussTedrake deleted the affine_tvlqr branch August 4, 2020 11:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

controllers: support affine terms in finite-horizon LQR #13272

controllers: support affine terms in finite-horizon LQR #13272

RussTedrake commented May 11, 2020 •

edited by sherm1

RussTedrake left a comment

hongkai-dai left a comment

RussTedrake left a comment

hongkai-dai left a comment

RussTedrake left a comment

RussTedrake left a comment

hongkai-dai left a comment

sammy-tri left a comment

sammy-tri left a comment

RussTedrake left a comment

sammy-tri left a comment

controllers: support affine terms in finite-horizon LQR #13272

controllers: support affine terms in finite-horizon LQR #13272

Conversation

RussTedrake commented May 11, 2020 • edited by sherm1

RussTedrake left a comment

Choose a reason for hiding this comment

hongkai-dai left a comment

Choose a reason for hiding this comment

RussTedrake left a comment

Choose a reason for hiding this comment

hongkai-dai left a comment

Choose a reason for hiding this comment

RussTedrake left a comment

Choose a reason for hiding this comment

RussTedrake left a comment

Choose a reason for hiding this comment

hongkai-dai left a comment

Choose a reason for hiding this comment

sammy-tri left a comment

Choose a reason for hiding this comment

sammy-tri left a comment

Choose a reason for hiding this comment

RussTedrake left a comment

Choose a reason for hiding this comment

sammy-tri left a comment

Choose a reason for hiding this comment

RussTedrake commented May 11, 2020 •

edited by sherm1