-
Notifications
You must be signed in to change notification settings - Fork 75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Non-linear dynamics -- Trajectory Following #1
Comments
Hello! I just added support for time-varying dynamics models and cost functions in f97ffb3. I haven't tested it as thoroughly yet, so there might be issues. I think you could set up what you've outlined pretty easily as follows: import theano.tensor as T
from ilqr.dynamics import AutoDiffDynamics
dt = 0.01
x = T.dscalar("x")
u = T.dscalar("u")
t = T.dscalar("t")
x_dot = (dt * t - u) * x**2
f = T.stack([x + x_dot * dt])
dynamics = AutoDiffDynamics(f, [x], [u], t)
>>> t = 2
>>> x = [2.0]
>>> u = [1.0]
>>> dynamics.f(x, u, t)
... array([ 1.9604])
>>> dynamics.f_x(x, u, t)
... array([[ 0.9604]]) As for your cost function, I believe all you would need is to copy the Essentially, the The only issue I see from all of this is that you might need a way to shift your cost function if you are planning to use this as an MPC with the Let me know if this works for you and hope this helps! |
Hello again Thanks for your quick and helpful reply! Hopefully, I'll try out your advice tonight and let you know on my progress. I don't need MPC at the moment only planning with respect to some non-linear dynamics. If you haven't implemented it yet, it would be great to add state and control constraints as in But still great package and great work @anassinator ! |
Thanks! Yeah, I would like to add that at some point. For now, this already supports control constraints through the squashing function outlined in that paper and is used in both the pendulum and cartpole examples with the |
Hello again! I followed your advice and I got it running (not quite sure if it is correct/accurate since I was testing a simple linear model for the trajectory tracking). So I changed the following: class QRCost(Cost):
Basically Removing the assertion in the init **adding [t] to the other functions e.g. : Adding [t] to the init method raised errors which I could not handle! I called it like this: dist = np.linspace(0,T+1,T+1) """Creating a sin trajectory for the state""" cost = QRCost(Q, R, Q_terminal=Q_terminal, x_goal=x_goal, u_goal = u_goal) Now x_goal and u_goal are trajectories. If you have a better solution or if you think I made a mistake somewhere in your code, please let me know! Thanks in advance! Looking forward to your reply |
Yeah that's what I meant. This is how you can do it without commenting out the assertions BTW and to be more explicit about the sizes required. class PathQRCost(Cost):
"""Quadratic Regulator Instantaneous Cost for trajectory following."""
def __init__(self, Q, R, x_path, u_path=None, Q_terminal=None):
"""Constructs a QRCost.
Args:
Q: Quadratic state cost matrix [state_size, state_size].
R: Quadratic control cost matrix [action_size, action_size].
x_path: Goal state path [T+1, state_size].
u_path: Goal control path [T, action_size].
Q_terminal: Terminal quadratic state cost matrix
[state_size, state_size].
"""
self.Q = np.array(Q)
self.R = np.array(R)
self.x_path = np.array(x_path)
state_size = self.Q.shape[0]
action_size = self.R.shape[0]
path_length = self.x_path.shape[0]
if Q_terminal is None:
self.Q_terminal = self.Q
else:
self.Q_terminal = np.array(Q_terminal)
if u_path is None:
self.u_path = np.zeros(path_length - 1, action_size)
else:
self.u_path = np.array(u_path)
assert self.Q.shape == self.Q_terminal.shape, "Q & Q_terminal mismatch"
assert self.Q.shape[0] == self.Q.shape[1], "Q must be square"
assert self.R.shape[0] == self.R.shape[1], "R must be square"
assert state_size == self.x_path.shape[1], "Q & x_path mismatch"
assert action_size == self.u_path.shape[1], "R & u_path mismatch"
assert path_length == self.u_path.shape[0] + 1, "x_path must be 1 longer than u_path"
# Precompute some common constants.
self._Q_plus_Q_T = self.Q + self.Q.T
self._R_plus_R_T = self.R + self.R.T
self._Q_plus_Q_T_terminal = self.Q_terminal + self.Q_terminal.T
super(PathQRCost, self).__init__()
def l(self, x, u, t, terminal=False):
"""Instantaneous cost function.
Args:
x: Current state [state_size].
u: Current control [action_size]. None if terminal.
t: Current time step.
terminal: Compute terminal cost. Default: False.
Returns:
Instantaneous cost (scalar).
"""
Q = self.Q_terminal if terminal else self.Q
R = self.R
x_diff = x - self.x_path[t]
squared_x_cost = x_diff.T.dot(Q).dot(x_diff)
if terminal:
return squared_x_cost
u_diff = u - self.u_path[t]
return squared_x_cost + u_diff.T.dot(R).dot(u_diff)
def l_x(self, x, u, t, terminal=False):
"""Partial derivative of cost function with respect to x.
Args:
x: Current state [state_size].
u: Current control [action_size]. None if terminal.
t: Current time step.
terminal: Compute terminal cost. Default: False.
Returns:
dl/dx [state_size].
"""
Q_plus_Q_T = self._Q_plus_Q_T_terminal if terminal else self._Q_plus_Q_T
x_diff = x - self.x_path[t]
return x_diff.T.dot(Q_plus_Q_T)
def l_u(self, x, u, t, terminal=False):
"""Partial derivative of cost function with respect to u.
Args:
x: Current state [state_size].
u: Current control [action_size]. None if terminal.
t: Current time step.
terminal: Compute terminal cost. Default: False.
Returns:
dl/du [action_size].
"""
if terminal:
return np.zeros_like(self.u_path)
u_diff = u - self.u_path[t]
return u_diff.T.dot(self._R_plus_R_T)
def l_xx(self, x, u, t, terminal=False):
"""Second partial derivative of cost function with respect to x.
Args:
x: Current state [state_size].
u: Current control [action_size]. None if terminal.
t: Current time step.
terminal: Compute terminal cost. Default: False.
Returns:
d^2l/dx^2 [state_size, state_size].
"""
return self._Q_plus_Q_T_terminal if terminal else self._Q_plus_Q_T
def l_ux(self, x, u, t, terminal=False):
"""Second partial derivative of cost function with respect to u and x.
Args:
x: Current state [state_size].
u: Current control [action_size]. None if terminal.
t: Current time step.
terminal: Compute terminal cost. Default: False.
Returns:
d^2l/dudx [action_size, state_size].
"""
return np.zeros((self.R.shape[0], self.Q.shape[0]))
def l_uu(self, x, u, t, terminal=False):
"""Second partial derivative of cost function with respect to u.
Args:
x: Current state [state_size].
u: Current control [action_size]. None if terminal.
t: Current time step.
terminal: Compute terminal cost. Default: False.
Returns:
d^2l/du^2 [action_size, action_size].
"""
if terminal:
return np.zeros_like(self.R)
return self._R_plus_R_T
T = 100
state_size = 1
action_size = 1
Q = np.eye(state_size)
R = np.eye(action_size)
dist = np.linspace(0,T + 1,T + 1).reshape(-1, 1)
x_path = np.sin(dist)
u_path = np.zeros((T, action_size))
cost = PathQRCost(Q, R, x_path, u_path=u_path)
>>> cost.l([0], [2], 2)
... 4.8114283738935208 |
Hello I think this is it, I tested it for a class of systems linear and non-linear and it seems to work! Let me ask something else: If you constrain the input as you did with saturation can you make your constraints depend on the Theano variables e.g. x,t,u ? thanks again |
Cool! I just added it. Yeah you can saturate whatever expression you want, the iLQR solver doesn't make any assumptions about that. You just need to make sure you saturate the output fitted path accordingly as well. from ilqr.dynamics import tensor_constrain, constrain
# When building dynamics model...
u = tensor_constrain(u + x * t, min_bound, max_bound)
# When fitting...
xs, us = ilqr.fit(x0, us_init)
us = constrain(us + xs * t, min_bound, max_bound) |
Closing this for now. You can re-open if you have any more questions. |
Hello and thanks for the interesting package you've created!
I wanted to ask how can I use it for a dynamic model that depends explicitly on time
i.e x_dot = (t-u)*x^2
where u is the input, t is time and x is the state.
Moreover, can your package be used for Trajectory following?
I.e mininimize -> J = sum_i=0^N ( x(i) - x_ref(i) )^T Q ( x(i) - x_ref(i) ) + ( u(i) - u_ref(i) )^T Q ( u(i) - u_ref(i) )
Thanks in advance!
The text was updated successfully, but these errors were encountered: