New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
controllers: support affine terms in finite-horizon LQR #13272
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+@hongkai-dai for feature review, please? (This one is a bit a of a doozy... let me know if you're not up for it)
Reviewable status: LGTM missing from assignee hongkai-dai, needs platform reviewer assigned, needs at least two assigned reviewers (waiting on @hongkai-dai)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checkpoint.
Reviewed 2 of 7 files at r1.
Reviewable status: 3 unresolved discussions, LGTM missing from assignee hongkai-dai, needs platform reviewer assigned, needs at least two assigned reviewers (waiting on @hongkai-dai and @RussTedrake)
systems/controllers/finite_horizon_linear_quadratic_regulator.h, line 75 at r1 (raw file):
A structure that contains the basic FiniteHorizonLinearQuadraticRegulator results. The finite-horizon cost-to-go is given by (x-x0(t))'*S(t)*(x-x0(t)) + 2*(x-x0(t))'sx(t) + s0(t) and the optimal controller is given by u-u0(t) =
sx(t)
could be interpreted as a scalar s
multiplies with x(t)
. Can we use another symbol for this vector?
systems/controllers/finite_horizon_linear_quadratic_regulator.h, line 111 at r1 (raw file):
+ \int_{t_0}^{t_f} 2(x(t)-x_d(t))'N(u(t)-u_d(t)) dt \\ \text{s.t. } \dot{x} - \dot{x}_0(t) = A(t)(x(t) - x_0(t)) + B(t)(u(t) - u_0(t)) + c(t)
I am not sure if we should have \dot{x}_0(t)
here. The nonlinear dynamics is
xdot = f(x, u)
We do a taylor expansion of the dynamics around (x0, u0), and get
xdot ≈ f(x0, u0) + df/dx * (x - x0) + df/du * (u - u0)
with A = df/dx, B = df/du, c = f(x0, u0)
, we should get
xdot ≈ A * (x - x0) + B * (u - u0) + c
Did I miss anything here?
systems/controllers/finite_horizon_linear_quadratic_regulator.cc, line 109 at r1 (raw file):
const Eigen::Ref<const Eigen::MatrixXd>& A = AB.leftCols(num_states_); const Eigen::Ref<const Eigen::MatrixXd>& B = AB.rightCols(num_inputs_); const Eigen::VectorXd c = math::autoDiffToValueMatrix(autodiff_xdot0) -
I think in most cases, x0_.EvalDerivative()
does not equal to xdot0 computed from f(t, x0, u0). Most of the time we get x0
from interpolating the trajectory optimization result, with no guarantee that the time derivative of the interpolated trajectory matches with xdot
computed from the dynamics. With this non-zero c
, we will always introduce non-zero affine and constant terms. This is probably different from the LQR controller implemented in many other places, which don't have the non-zero affine term.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 3 unresolved discussions, LGTM missing from assignee hongkai-dai, needs platform reviewer assigned, needs at least two assigned reviewers (waiting on @hongkai-dai)
systems/controllers/finite_horizon_linear_quadratic_regulator.h, line 75 at r1 (raw file):
Previously, hongkai-dai (Hongkai Dai) wrote…
sx(t)
could be interpreted as a scalars
multiplies withx(t)
. Can we use another symbol for this vector?
Ok, I will update the doxygen to use sₓ, s₀, etc. (but will push this after your review; just to not spend a CI cycle on it).
systems/controllers/finite_horizon_linear_quadratic_regulator.h, line 111 at r1 (raw file):
Previously, hongkai-dai (Hongkai Dai) wrote…
I am not sure if we should have
\dot{x}_0(t)
here. The nonlinear dynamics isxdot = f(x, u)
We do a taylor expansion of the dynamics around (x0, u0), and get
xdot ≈ f(x0, u0) + df/dx * (x - x0) + df/du * (u - u0)
with
A = df/dx, B = df/du, c = f(x0, u0)
, we should getxdot ≈ A * (x - x0) + B * (u - u0) + c
Did I miss anything here?
That is not how I'm defining c. I'm defining xbar = x-x0, and xbardot = Axbar + Bxbar + c.
systems/controllers/finite_horizon_linear_quadratic_regulator.cc, line 109 at r1 (raw file):
Previously, hongkai-dai (Hongkai Dai) wrote…
I think in most cases,
x0_.EvalDerivative()
does not equal to xdot0 computed from f(t, x0, u0). Most of the time we getx0
from interpolating the trajectory optimization result, with no guarantee that the time derivative of the interpolated trajectory matches withxdot
computed from the dynamics. With this non-zeroc
, we will always introduce non-zero affine and constant terms. This is probably different from the LQR controller implemented in many other places, which don't have the non-zero affine term.
I agree. And this version is better!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First pass done. BTW, the notes in underactuated doesn't have the affine term N
, will the notes be updated?
Reviewed 4 of 7 files at r1.
Reviewable status: 6 unresolved discussions, LGTM missing from assignee hongkai-dai, needs platform reviewer assigned, needs at least two assigned reviewers (waiting on @hongkai-dai and @RussTedrake)
systems/controllers/finite_horizon_linear_quadratic_regulator.h, line 111 at r1 (raw file):
Previously, RussTedrake (Russ Tedrake) wrote…
That is not how I'm defining c. I'm defining xbar = x-x0, and xbardot = Axbar + Bxbar + c.
I think c(t)
is defined as c(t) = f(t, x0(t), u0(t))
in the documentation below. So we need to update the definition of c(t) in the next paragraph.
systems/controllers/finite_horizon_linear_quadratic_regulator.cc, line 159 at r1 (raw file):
// the optimal sampling times for the fixed-degree polynomial over the // finite time interval. Find it and use it here. const std::vector<double> scaled_sample_times = {0.0, 1. / 3., 2. / 3., 1.};
By using Lagrangian interpolation with 4 points, we create a 3rd order polynomial that matches with K at the sampled time. Could you add more explanation on why choosing a 3rd order polynomial?
systems/controllers/test/finite_horizon_linear_quadratic_regulator_test.cc, line 57 at r1 (raw file):
// Confirm that it converges to the solution. EXPECT_TRUE(CompareMatrices(result.S->value(t0), lqr_result.S, 1e-5)); EXPECT_TRUE(result.sx->value(t0).isZero(1e-12));
Could you provide more explanation why sx
and s0
would converge to 0 at t0
with non-zero N
? In my understanding, this is because for this linear system, x0
= xd
and u0 = ud
.
systems/controllers/test/finite_horizon_linear_quadratic_regulator_test.cc, line 211 at r1 (raw file):
Here we will leave the coordinate system alone
Do you mean that the coordinate system x
?
systems/controllers/test/finite_horizon_linear_quadratic_regulator_test.cc, line 252 at r1 (raw file):
FiniteHorizonLinearQuadraticRegulator(sys, *context, t0, tf, Q, R, options); // The LQR x'Sx is the correct solution, but in the Finite Horizon version,
Could you provide more explanation on why x'Sx
is the correct solution? My understanding is that because xd = x0 = 0
in options
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The notes do have the N term (pushed before I opened this PR). Are you looking at the right section? http://underactuated.csail.mit.edu/lqr.html#finite_horizon_derivation .
Reviewable status: 6 unresolved discussions, LGTM missing from assignee hongkai-dai, needs platform reviewer assigned, needs at least two assigned reviewers (waiting on @hongkai-dai and @RussTedrake)
systems/controllers/finite_horizon_linear_quadratic_regulator.h, line 75 at r1 (raw file):
Previously, RussTedrake (Russ Tedrake) wrote…
Ok, I will update the doxygen to use sₓ, s₀, etc. (but will push this after your review; just to not spend a CI cycle on it).
Done.
systems/controllers/finite_horizon_linear_quadratic_regulator.h, line 111 at r1 (raw file):
Previously, hongkai-dai (Hongkai Dai) wrote…
I think
c(t)
is defined asc(t) = f(t, x0(t), u0(t))
in the documentation below. So we need to update the definition of c(t) in the next paragraph.
Done. Good catch.
systems/controllers/test/finite_horizon_linear_quadratic_regulator_test.cc, line 57 at r1 (raw file):
Previously, hongkai-dai (Hongkai Dai) wrote…
Could you provide more explanation why
sx
ands0
would converge to 0 att0
with non-zeroN
? In my understanding, this is because for this linear system,x0
=xd
andu0 = ud
.
N is included in the infinite-horizon solution from LQR (which is in the form x'Sx). And we expect the finite-horizon solution to converge to the infinite-horizon solution as t->\inf.
systems/controllers/test/finite_horizon_linear_quadratic_regulator_test.cc, line 211 at r1 (raw file):
Previously, hongkai-dai (Hongkai Dai) wrote…
Here we will leave the coordinate system alone
Do you mean that the coordinate system
x
?
Done. (yes)
systems/controllers/test/finite_horizon_linear_quadratic_regulator_test.cc, line 252 at r1 (raw file):
Previously, hongkai-dai (Hongkai Dai) wrote…
Could you provide more explanation on why
x'Sx
is the correct solution? My understanding is that becausexd = x0 = 0
inoptions
.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 6 unresolved discussions, LGTM missing from assignee hongkai-dai, needs platform reviewer assigned, needs at least two assigned reviewers (waiting on @hongkai-dai)
systems/controllers/finite_horizon_linear_quadratic_regulator.cc, line 159 at r1 (raw file):
Previously, hongkai-dai (Hongkai Dai) wrote…
By using Lagrangian interpolation with 4 points, we create a 3rd order polynomial that matches with K at the sampled time. Could you add more explanation on why choosing a 3rd order polynomial?
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+@sammy-tri for platform review please thanks!
Sorry I must have misread the underactuated notes. It does have N
term.
Reviewed 1 of 7 files at r1, 3 of 3 files at r2.
Reviewable status: LGTM missing from assignee sammy-tri(platform) (waiting on @sammy-tri)
systems/controllers/test/finite_horizon_linear_quadratic_regulator_test.cc, line 57 at r1 (raw file):
Previously, RussTedrake (Russ Tedrake) wrote…
N is included in the infinite-horizon solution from LQR (which is in the form x'Sx). And we expect the finite-horizon solution to converge to the infinite-horizon solution as t->\inf.
I see, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 7 files at r1, 3 of 3 files at r2.
Reviewable status: 1 unresolved discussion, LGTM missing from assignee sammy-tri(platform) (waiting on @RussTedrake)
systems/controllers/finite_horizon_linear_quadratic_regulator.cc, line 296 at r2 (raw file):
Eigen::Map<Eigen::VectorXd> S_xx(S_vectorized.data(), num_states * num_states); S_xx = *options.Qf;
This line appears to be triggering an assertion in debug builds:
finite_horizon_linear_quadratic_regulator_test: external/eigen/include/_usr_include_eigen3/Eigen/src/Core/DenseBase.h:257: void Eigen::DenseBase::resize(Eigen::Index, Eigen::Index) [with Derived = Eigen::Map<Eigen::Matrix<double, -1, 1>, 0, Eigen::Stride<0, 0> >; Eigen::Index = long int]: Assertion `rows == this->rows() && cols == this->cols() && "DenseBase::resize() does not actually allow one to resize."' failed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Full disclosure: I did not attempt to properly understand this.
Reviewable status: 1 unresolved discussion, LGTM missing from assignee sammy-tri(platform) (waiting on @RussTedrake)
The corresponding derivation is in: http://underactuated.mit.edu/lqr.html#finite_horizon_derivation This is the last major piece of functionality that I hoped to support for continuous-time systems. (Richer cost specification would be the one easy thing to add if/when it's needed). But I still hope to add the discrete-time formulations before I'm completely satisfied.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 1 unresolved discussion, LGTM missing from assignee sammy-tri(platform) (waiting on @hongkai-dai and @sammy-tri)
systems/controllers/finite_horizon_linear_quadratic_regulator.cc, line 296 at r2 (raw file):
Previously, sammy-tri (Sam Creasey) wrote…
This line appears to be triggering an assertion in debug builds:
finite_horizon_linear_quadratic_regulator_test: external/eigen/include/_usr_include_eigen3/Eigen/src/Core/DenseBase.h:257: void Eigen::DenseBase::resize(Eigen::Index, Eigen::Index) [with Derived = Eigen::Map<Eigen::Matrix<double, -1, 1>, 0, Eigen::Stride<0, 0> >; Eigen::Index = long int]: Assertion `rows == this->rows() && cols == this->cols() && "DenseBase::resize() does not actually allow one to resize."' failed.
Done. That one was real. Thanks CI!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r3.
Reviewable status: complete! all discussions resolved, LGTM from assignees sammy-tri(platform),hongkai-dai
The corresponding derivation is in:
http://underactuated.mit.edu/lqr.html#finite_horizon_derivation
This is the last major piece of functionality that I hoped to support for continuous-time systems. (Richer cost specification would be the one easy thing to add if/when it's needed). But I still hope to add the discrete-time formulations before I'm completely satisfied.
This change is