Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

controllers: support affine terms in finite-horizon LQR #13272

Merged
merged 1 commit into from May 13, 2020

Conversation

RussTedrake
Copy link
Contributor

@RussTedrake RussTedrake commented May 11, 2020

The corresponding derivation is in:
http://underactuated.mit.edu/lqr.html#finite_horizon_derivation

This is the last major piece of functionality that I hoped to support for continuous-time systems. (Richer cost specification would be the one easy thing to add if/when it's needed). But I still hope to add the discrete-time formulations before I'm completely satisfied.


This change is Reviewable

Copy link
Contributor Author

@RussTedrake RussTedrake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+@hongkai-dai for feature review, please? (This one is a bit a of a doozy... let me know if you're not up for it)

Reviewable status: LGTM missing from assignee hongkai-dai, needs platform reviewer assigned, needs at least two assigned reviewers (waiting on @hongkai-dai)

Copy link
Contributor

@hongkai-dai hongkai-dai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checkpoint.

Reviewed 2 of 7 files at r1.
Reviewable status: 3 unresolved discussions, LGTM missing from assignee hongkai-dai, needs platform reviewer assigned, needs at least two assigned reviewers (waiting on @hongkai-dai and @RussTedrake)


systems/controllers/finite_horizon_linear_quadratic_regulator.h, line 75 at r1 (raw file):

A structure that contains the basic FiniteHorizonLinearQuadraticRegulator
results. The finite-horizon cost-to-go is given by (x-x0(t))'*S(t)*(x-x0(t)) +
2*(x-x0(t))'sx(t) + s0(t) and the optimal controller is given by u-u0(t) =

sx(t) could be interpreted as a scalar s multiplies with x(t). Can we use another symbol for this vector?


systems/controllers/finite_horizon_linear_quadratic_regulator.h, line 111 at r1 (raw file):

         + \int_{t_0}^{t_f} 2(x(t)-x_d(t))'N(u(t)-u_d(t)) dt \\
  \text{s.t. } \dot{x} - \dot{x}_0(t) = A(t)(x(t) - x_0(t)) + B(t)(u(t) -
u_0(t)) + c(t)

I am not sure if we should have \dot{x}_0(t) here. The nonlinear dynamics is

xdot = f(x, u)

We do a taylor expansion of the dynamics around (x0, u0), and get

xdot ≈ f(x0, u0) + df/dx * (x - x0) + df/du * (u - u0)

with A = df/dx, B = df/du, c = f(x0, u0), we should get

xdot ≈ A * (x - x0) + B * (u - u0) + c

Did I miss anything here?


systems/controllers/finite_horizon_linear_quadratic_regulator.cc, line 109 at r1 (raw file):

    const Eigen::Ref<const Eigen::MatrixXd>& A = AB.leftCols(num_states_);
    const Eigen::Ref<const Eigen::MatrixXd>& B = AB.rightCols(num_inputs_);
    const Eigen::VectorXd c = math::autoDiffToValueMatrix(autodiff_xdot0) -

I think in most cases, x0_.EvalDerivative() does not equal to xdot0 computed from f(t, x0, u0). Most of the time we get x0 from interpolating the trajectory optimization result, with no guarantee that the time derivative of the interpolated trajectory matches with xdot computed from the dynamics. With this non-zero c, we will always introduce non-zero affine and constant terms. This is probably different from the LQR controller implemented in many other places, which don't have the non-zero affine term.

Copy link
Contributor Author

@RussTedrake RussTedrake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 unresolved discussions, LGTM missing from assignee hongkai-dai, needs platform reviewer assigned, needs at least two assigned reviewers (waiting on @hongkai-dai)


systems/controllers/finite_horizon_linear_quadratic_regulator.h, line 75 at r1 (raw file):

Previously, hongkai-dai (Hongkai Dai) wrote…

sx(t) could be interpreted as a scalar s multiplies with x(t). Can we use another symbol for this vector?

Ok, I will update the doxygen to use sₓ, s₀, etc. (but will push this after your review; just to not spend a CI cycle on it).


systems/controllers/finite_horizon_linear_quadratic_regulator.h, line 111 at r1 (raw file):

Previously, hongkai-dai (Hongkai Dai) wrote…

I am not sure if we should have \dot{x}_0(t) here. The nonlinear dynamics is

xdot = f(x, u)

We do a taylor expansion of the dynamics around (x0, u0), and get

xdot ≈ f(x0, u0) + df/dx * (x - x0) + df/du * (u - u0)

with A = df/dx, B = df/du, c = f(x0, u0), we should get

xdot ≈ A * (x - x0) + B * (u - u0) + c

Did I miss anything here?

That is not how I'm defining c. I'm defining xbar = x-x0, and xbardot = Axbar + Bxbar + c.


systems/controllers/finite_horizon_linear_quadratic_regulator.cc, line 109 at r1 (raw file):

Previously, hongkai-dai (Hongkai Dai) wrote…

I think in most cases, x0_.EvalDerivative() does not equal to xdot0 computed from f(t, x0, u0). Most of the time we get x0 from interpolating the trajectory optimization result, with no guarantee that the time derivative of the interpolated trajectory matches with xdot computed from the dynamics. With this non-zero c, we will always introduce non-zero affine and constant terms. This is probably different from the LQR controller implemented in many other places, which don't have the non-zero affine term.

I agree. And this version is better!

Copy link
Contributor

@hongkai-dai hongkai-dai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass done. BTW, the notes in underactuated doesn't have the affine term N, will the notes be updated?

Reviewed 4 of 7 files at r1.
Reviewable status: 6 unresolved discussions, LGTM missing from assignee hongkai-dai, needs platform reviewer assigned, needs at least two assigned reviewers (waiting on @hongkai-dai and @RussTedrake)


systems/controllers/finite_horizon_linear_quadratic_regulator.h, line 111 at r1 (raw file):

Previously, RussTedrake (Russ Tedrake) wrote…

That is not how I'm defining c. I'm defining xbar = x-x0, and xbardot = Axbar + Bxbar + c.

I think c(t) is defined as c(t) = f(t, x0(t), u0(t)) in the documentation below. So we need to update the definition of c(t) in the next paragraph.


systems/controllers/finite_horizon_linear_quadratic_regulator.cc, line 159 at r1 (raw file):

    // the optimal sampling times for the fixed-degree polynomial over the
    // finite time interval. Find it and use it here.
    const std::vector<double> scaled_sample_times = {0.0, 1. / 3., 2. / 3., 1.};

By using Lagrangian interpolation with 4 points, we create a 3rd order polynomial that matches with K at the sampled time. Could you add more explanation on why choosing a 3rd order polynomial?


systems/controllers/test/finite_horizon_linear_quadratic_regulator_test.cc, line 57 at r1 (raw file):

  // Confirm that it converges to the solution.
  EXPECT_TRUE(CompareMatrices(result.S->value(t0), lqr_result.S, 1e-5));
  EXPECT_TRUE(result.sx->value(t0).isZero(1e-12));

Could you provide more explanation why sx and s0 would converge to 0 at t0 with non-zero N? In my understanding, this is because for this linear system, x0 = xd and u0 = ud.


systems/controllers/test/finite_horizon_linear_quadratic_regulator_test.cc, line 211 at r1 (raw file):

Here we will leave the coordinate system alone

Do you mean that the coordinate system x?


systems/controllers/test/finite_horizon_linear_quadratic_regulator_test.cc, line 252 at r1 (raw file):

      FiniteHorizonLinearQuadraticRegulator(sys, *context, t0, tf, Q, R,
                                            options);
  // The LQR x'Sx is the correct solution, but in the Finite Horizon version,

Could you provide more explanation on why x'Sx is the correct solution? My understanding is that because xd = x0 = 0 in options.

Copy link
Contributor Author

@RussTedrake RussTedrake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The notes do have the N term (pushed before I opened this PR). Are you looking at the right section? http://underactuated.csail.mit.edu/lqr.html#finite_horizon_derivation .

Reviewable status: 6 unresolved discussions, LGTM missing from assignee hongkai-dai, needs platform reviewer assigned, needs at least two assigned reviewers (waiting on @hongkai-dai and @RussTedrake)


systems/controllers/finite_horizon_linear_quadratic_regulator.h, line 75 at r1 (raw file):

Previously, RussTedrake (Russ Tedrake) wrote…

Ok, I will update the doxygen to use sₓ, s₀, etc. (but will push this after your review; just to not spend a CI cycle on it).

Done.


systems/controllers/finite_horizon_linear_quadratic_regulator.h, line 111 at r1 (raw file):

Previously, hongkai-dai (Hongkai Dai) wrote…

I think c(t) is defined as c(t) = f(t, x0(t), u0(t)) in the documentation below. So we need to update the definition of c(t) in the next paragraph.

Done. Good catch.


systems/controllers/test/finite_horizon_linear_quadratic_regulator_test.cc, line 57 at r1 (raw file):

Previously, hongkai-dai (Hongkai Dai) wrote…

Could you provide more explanation why sx and s0 would converge to 0 at t0 with non-zero N? In my understanding, this is because for this linear system, x0 = xd and u0 = ud.

N is included in the infinite-horizon solution from LQR (which is in the form x'Sx). And we expect the finite-horizon solution to converge to the infinite-horizon solution as t->\inf.


systems/controllers/test/finite_horizon_linear_quadratic_regulator_test.cc, line 211 at r1 (raw file):

Previously, hongkai-dai (Hongkai Dai) wrote…

Here we will leave the coordinate system alone

Do you mean that the coordinate system x?

Done. (yes)


systems/controllers/test/finite_horizon_linear_quadratic_regulator_test.cc, line 252 at r1 (raw file):

Previously, hongkai-dai (Hongkai Dai) wrote…

Could you provide more explanation on why x'Sx is the correct solution? My understanding is that because xd = x0 = 0 in options.

Done.

Copy link
Contributor Author

@RussTedrake RussTedrake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 6 unresolved discussions, LGTM missing from assignee hongkai-dai, needs platform reviewer assigned, needs at least two assigned reviewers (waiting on @hongkai-dai)


systems/controllers/finite_horizon_linear_quadratic_regulator.cc, line 159 at r1 (raw file):

Previously, hongkai-dai (Hongkai Dai) wrote…

By using Lagrangian interpolation with 4 points, we create a 3rd order polynomial that matches with K at the sampled time. Could you add more explanation on why choosing a 3rd order polynomial?

Done.

Copy link
Contributor

@hongkai-dai hongkai-dai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: +@sammy-tri for platform review please thanks!

Sorry I must have misread the underactuated notes. It does have N term.

Reviewed 1 of 7 files at r1, 3 of 3 files at r2.
Reviewable status: LGTM missing from assignee sammy-tri(platform) (waiting on @sammy-tri)


systems/controllers/test/finite_horizon_linear_quadratic_regulator_test.cc, line 57 at r1 (raw file):

Previously, RussTedrake (Russ Tedrake) wrote…

N is included in the infinite-horizon solution from LQR (which is in the form x'Sx). And we expect the finite-horizon solution to converge to the infinite-horizon solution as t->\inf.

I see, thanks!

Copy link
Contributor

@sammy-tri sammy-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 7 files at r1, 3 of 3 files at r2.
Reviewable status: 1 unresolved discussion, LGTM missing from assignee sammy-tri(platform) (waiting on @RussTedrake)


systems/controllers/finite_horizon_linear_quadratic_regulator.cc, line 296 at r2 (raw file):

    Eigen::Map<Eigen::VectorXd> S_xx(S_vectorized.data(),
                                     num_states * num_states);
    S_xx = *options.Qf;

This line appears to be triggering an assertion in debug builds:

finite_horizon_linear_quadratic_regulator_test: external/eigen/include/_usr_include_eigen3/Eigen/src/Core/DenseBase.h:257: void Eigen::DenseBase::resize(Eigen::Index, Eigen::Index) [with Derived = Eigen::Map<Eigen::Matrix<double, -1, 1>, 0, Eigen::Stride<0, 0> >; Eigen::Index = long int]: Assertion `rows == this->rows() && cols == this->cols() && "DenseBase::resize() does not actually allow one to resize."' failed.

Copy link
Contributor

@sammy-tri sammy-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Full disclosure: I did not attempt to properly understand this.

Reviewable status: 1 unresolved discussion, LGTM missing from assignee sammy-tri(platform) (waiting on @RussTedrake)

The corresponding derivation is in:
http://underactuated.mit.edu/lqr.html#finite_horizon_derivation

This is the last major piece of functionality that I hoped to support for continuous-time systems.  (Richer cost specification would be the one easy thing to add if/when it's needed).  But I still hope to add the discrete-time formulations before I'm completely satisfied.
Copy link
Contributor Author

@RussTedrake RussTedrake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 unresolved discussion, LGTM missing from assignee sammy-tri(platform) (waiting on @hongkai-dai and @sammy-tri)


systems/controllers/finite_horizon_linear_quadratic_regulator.cc, line 296 at r2 (raw file):

Previously, sammy-tri (Sam Creasey) wrote…

This line appears to be triggering an assertion in debug builds:

finite_horizon_linear_quadratic_regulator_test: external/eigen/include/_usr_include_eigen3/Eigen/src/Core/DenseBase.h:257: void Eigen::DenseBase::resize(Eigen::Index, Eigen::Index) [with Derived = Eigen::Map<Eigen::Matrix<double, -1, 1>, 0, Eigen::Stride<0, 0> >; Eigen::Index = long int]: Assertion `rows == this->rows() && cols == this->cols() && "DenseBase::resize() does not actually allow one to resize."' failed.

Done. That one was real. Thanks CI!

Copy link
Contributor

@sammy-tri sammy-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r3.
Reviewable status: :shipit: complete! all discussions resolved, LGTM from assignees sammy-tri(platform),hongkai-dai

@sammy-tri sammy-tri merged commit eee86f0 into RobotLocomotion:master May 13, 2020
@RussTedrake RussTedrake deleted the affine_tvlqr branch August 4, 2020 11:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants