# Goal of the project

The goal of this project is to control a 2D quadrotor to perform acrobatic moves. The project has four parts, in which you will build controllers of increasing complexity. The last part leads to an implementation of the iterative LQR (iLQR) algorithm.

## Instructions
Answer all the questions in the 4 parts below. You will need to submit:
1. A report (pdf format only - every other format will be rejected) answering all the questions that do not request code. DO NOT include code in the report.
2. One (or several) Jupyter notebook(s) containing all the code used to answer the questions. The notebook(s) should be runnable as is.

## 2D quadrotor

The quadrotor is depicted in the following figure <img src='quadrotor.png' width="300">
The quadrotor model is written as
$$\begin{align} 
\dot{x} &= v_x\\
m \dot{v}_x &= - (u_1 + u_2) \sin \theta \\ 
\dot{y} &= v_y\\
m \dot{v}_y &= (u_1 + u_2) \cos \theta  - m g\\
\dot{\theta} &= \omega\\
I \dot{\omega} &= r (u_1 - u_2) \end{align}$$
where $x$ is the horizontal and $y$ the vertical positions of the quadrotor and $\theta$ is its orientation with respect to the horizontal plane. $v_x$ and $v_y$ are the linear velocities and $\omega$ is the angular velocity of the robot. $u_1$ and $u_2$ are the forces produced by the rotors (our control inputs). $m$ is the quadrotor mass, $I$ its moment of inertia (a scalar), $r$ is the distance from the center of the robot frame to the propellers and $g$ is the gravity constant. To denote the entire state, we will write $z = [x, v_x, y, v_y, \theta, \omega]^T$ - we will also write $u = [u_1, u_2]^T$.

The module ```quadrotor.py``` defines useful constants (mass, length, gravity, etc) and functions to simulate and animate the quadrotor as shown below.

## Part 1 - Setting up
1. Discretize the system dynamics using the method seen in class. Write the time discretization step as $\Delta t$ and use symbols, not numbers, for the mass, etc.
2. Assume that the robot starts at an arbitrary position $x(0) = x_0$, $y(0) = y_0$ and $\theta(0) = 0$ with zero velocities. Compute $u_1^*$ and $u_2^*$ such that the robot stays at this position forever after (you may test your answer using the simulation below).
3. Analyzing the system dynamics, is it possible to move in the x direction while keeping $\theta = 0$? Explain why.
4. Analyzing the system dynamics, is it possible to have the system at rest with $\theta = \frac{\pi}{2}$ (i.e. have the quadrotor in a vertical position)? Explain why.
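
As a rough illustration of questions 1 and 2, the sketch below assumes that the discretization method is forward Euler, $z_{n+1} = z_n + \Delta t \, f(z_n, u_n)$ (the course may use a different scheme). The function name `euler_step` is hypothetical and the constants are copied from the values printed by the notebook rather than imported from `quadrotor.py`.

```python
import numpy as np

# Illustrative constants, copied from the values printed by the notebook
MASS, LENGTH, INERTIA, GRAVITY, DELTA_T = 0.6, 0.2, 0.15, 9.81, 0.01

def euler_step(z, u, dt=DELTA_T):
    """One forward-Euler step (assumed scheme), z = [x, vx, y, vy, theta, omega]."""
    x, vx, y, vy, theta, omega = z
    thrust = u[0] + u[1]
    z_dot = np.array([
        vx,
        -thrust * np.sin(theta) / MASS,
        vy,
        thrust * np.cos(theta) / MASS - GRAVITY,
        omega,
        LENGTH * (u[0] - u[1]) / INERTIA,
    ])
    return np.asarray(z, dtype=float) + dt * z_dot

# With each rotor carrying half of the weight, a state at rest should not move
u_hover = 0.5 * MASS * GRAVITY * np.ones(2)
z_rest = np.array([1.0, 0.0, 2.0, 0.0, 0.0, 0.0])
z_next = euler_step(z_rest, u_hover)
```

Comparing `z_next` with `z_rest` is a quick way to check a candidate $u^*$ before running the full simulation.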

In [1]:
%matplotlib notebook

import numpy as np
import matplotlib.pyplot as plt

import quadrotor

In [2]:
# we can get its mass, half length (r), gravity constant
print(f'm is {quadrotor.MASS}')
print(f'r is {quadrotor.LENGTH}')
print(f'I is {quadrotor.INERTIA}')
print(f'g is {quadrotor.GRAVITY}')

# we can also get the integration step used in the simulation
print(f'dt is {quadrotor.DELTA_T}')

# we can get the size of its state and control vector
print(f'number of states {quadrotor.NUMBER_STATES} and number of controls {quadrotor.NUMBER_CONTROLS}')
print('the states are indexed as follows: x, vx, y, vy, theta, omega')

m is 0.6
r is 0.2
I is 0.15
g is 9.81
dt is 0.01
number of states 6 and number of controls 2
the states are indexed as follows: x, vx, y, vy, theta, omega


In [3]:
# we can simulate the robot but we need to provide a controller of the following form
def dummy_controller(state, i):
    """
        the prototype of a controller is as follows
        state is a column vector containing the state of the robot
        i is the index corresponding to the time step in the horizon (useful to index gains K for e.g.)
        
        this controller needs to return an array of size (2,)
    """
    # ignore the state and return the constant hover control from Part 1.2:
    # each rotor carries half of the weight, u1 = u2 = m*g/2
    u_star = 0.5 * quadrotor.MASS * quadrotor.GRAVITY * np.ones(quadrotor.NUMBER_CONTROLS)
    
    return u_star



# we can now simulate for a given number of time steps - here we do 10 seconds
horizon_length = 1000
z0 = np.zeros([quadrotor.NUMBER_STATES,])
t, state, u = quadrotor.simulate(z0, dummy_controller, horizon_length, disturbance = False)

In [4]:
# we can plot the results
plt.figure(figsize=[9,6])

plt.subplot(2,3,1)
plt.plot(t, state[0,:])
plt.legend(['X'])

plt.subplot(2,3,2)
plt.plot(t, state[2,:])
plt.legend(['Y'])

plt.subplot(2,3,3)
plt.plot(t, state[4,:])
plt.legend(["theta"])

plt.subplot(2,3,4)
plt.plot(t, state[1,:])
plt.legend(['Vx'])
plt.xlabel('Time [s]')

plt.subplot(2,3,5)
plt.plot(t, state[3,:])
plt.legend(['Vy'])
plt.xlabel('Time [s]')

plt.subplot(2,3,6)
plt.plot(t, state[5,:])
plt.legend(['omega'])
plt.xlabel('Time [s]')
plt.savefig('1.1.png')

# we can also plot the control
plt.figure()
plt.plot(t[:-1], u.T)
plt.legend(['u1', 'u2'])
plt.xlabel('Time [s]')
plt.savefig('1.2.png')


In [5]:
# now we can also create an animation
quadrotor.animate_robot(state, u)

In [6]:
# we can now simulate for a given number of time steps - here we do 10 seconds
horizon_length = 1000
z0 = np.zeros([quadrotor.NUMBER_STATES,])
t, state, u = quadrotor.simulate(z0, dummy_controller, horizon_length, disturbance = True)

In [7]:
# we can plot the results
plt.figure(figsize=[9,6])

plt.subplot(2,3,1)
plt.plot(t, state[0,:])
plt.legend(['X'])

plt.subplot(2,3,2)
plt.plot(t, state[2,:])
plt.legend(['Y'])

plt.subplot(2,3,3)
plt.plot(t, state[4,:])
plt.legend(["theta"])

plt.subplot(2,3,4)
plt.plot(t, state[1,:])
plt.legend(['Vx'])
plt.xlabel('Time [s]')

plt.subplot(2,3,5)
plt.plot(t, state[3,:])
plt.legend(['Vy'])
plt.xlabel('Time [s]')

plt.subplot(2,3,6)
plt.plot(t, state[5,:])
plt.legend(['omega'])
plt.xlabel('Time [s]')
plt.savefig('1.3.png')

# we can also plot the control
plt.figure()
plt.plot(t[:-1], u.T)
plt.legend(['u1', 'u2'])
plt.xlabel('Time [s]')
plt.savefig('1.4.png')


In [8]:
# now we can also create an animation
quadrotor.animate_robot(state, u)

## Part 2 - LQR to stay in place
Now that we have $u^*$ capable of keeping the robot at rest, we can design a simple controller that ensures that the robot stays in place even when pushed around by random disturbances (e.g. due to the wind). Our task here will be to design an LQR controller that keeps the robot at a predefined position. Since the dynamics are nonlinear, we need to compute a linear approximation of them.
1. Linearize the dynamics at an arbitrary operating point $z^*$, $u^*$ and write the linearized system dynamics using the variables $\bar{z}_n = z_n - z^*$ and $\bar{u}_n = u_n - u^*$.
2. Write a function ```get_linearization(z, u)``` that returns the matrices A and B given a state $z$ and a control $u$ (use the constants defined in the ``quadrotor.py`` module).
3. Using the linearized dynamics, we can design an infinite-horizon LQR controller of the form $\bar{u} = K \bar{z}$ to stabilize the resting point. Write the equations of the controller in the original coordinates $u$ as a function of $z$.
4. Design an infinite-horizon LQR controller that stabilizes the origin $z=0$ and test it using the simulator below.
5. Explain your design in the report, including the cost function and the resulting control law. In particular, verify that it can handle perturbations by calling the ```simulate``` function with ```disturbance = True``` (when disturbance is set to ``True``, the simulator generates a random perturbation every second). Simulate your controller for 10 seconds, plot the state evolution and show the animation (include the plots in your report).
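
One way to gain confidence in an analytic linearization (question 2) is to compare it with finite differences. The sketch below assumes forward-Euler discretized dynamics; `euler_step` and `numerical_linearization` are hypothetical helpers, and the constants are copied from the values printed earlier in the notebook.

```python
import numpy as np

# Illustrative constants, copied from the values printed by the notebook
MASS, LENGTH, INERTIA, GRAVITY, DELTA_T = 0.6, 0.2, 0.15, 9.81, 0.01

def euler_step(z, u, dt=DELTA_T):
    """One forward-Euler step (assumed scheme), z = [x, vx, y, vy, theta, omega]."""
    x, vx, y, vy, theta, omega = z
    thrust = u[0] + u[1]
    z_dot = np.array([
        vx,
        -thrust * np.sin(theta) / MASS,
        vy,
        thrust * np.cos(theta) / MASS - GRAVITY,
        omega,
        LENGTH * (u[0] - u[1]) / INERTIA,
    ])
    return np.asarray(z, dtype=float) + dt * z_dot

def numerical_linearization(z, u, eps=1e-6):
    """Central finite differences of euler_step with respect to z (A) and u (B)."""
    n, m = len(z), len(u)
    A = np.zeros((n, n))
    B = np.zeros((n, m))
    for j in range(n):
        dz = np.zeros(n)
        dz[j] = eps
        A[:, j] = (euler_step(z + dz, u) - euler_step(z - dz, u)) / (2 * eps)
    for j in range(m):
        du = np.zeros(m)
        du[j] = eps
        B[:, j] = (euler_step(z, u + du) - euler_step(z, u - du)) / (2 * eps)
    return A, B

# Arbitrary test point, deliberately away from hover
z_test = np.array([0.3, -0.1, 1.2, 0.4, 0.5, -0.2])
u_test = np.array([3.5, 2.5])
A_num, B_num = numerical_linearization(z_test, u_test)
```

In the notebook, `np.allclose(A_num, A)` and `np.allclose(B_num, B)` with `A, B = get_linearization(z_test, u_test)` would then confirm that the analytic Jacobians match, provided the simulator uses the same discretization.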

In [9]:
def get_linearization(z, u):
    """
    Jacobians of the Euler-discretized dynamics evaluated at state z and control u
    returns A (6x6, derivative w.r.t. the state) and B (6x2, derivative w.r.t. the control)
    """
    m, r, I, dt = quadrotor.MASS, quadrotor.LENGTH, quadrotor.INERTIA, quadrotor.DELTA_T
    theta = z[4]
    thrust = u[0] + u[1]
    
    A = np.array([[1, dt, 0, 0, 0, 0],
                  [0, 1, 0, 0, -thrust*np.cos(theta)*dt/m, 0],
                  [0, 0, 1, dt, 0, 0],
                  [0, 0, 0, 1, -thrust*np.sin(theta)*dt/m, 0],
                  [0, 0, 0, 0, 1, dt],
                  [0, 0, 0, 0, 0, 1]])
    
    B = np.array([[0, 0],
                  [-np.sin(theta)*dt/m, -np.sin(theta)*dt/m],
                  [0, 0],
                  [np.cos(theta)*dt/m, np.cos(theta)*dt/m],
                  [0, 0],
                  [r*dt/I, -r*dt/I]])
    
    return A, B
    

In [10]:
def solve_LQR(A, B, Q, R, QN, N):
    '''
    Backward Riccati recursion for the finite-horizon LQR problem
    returns the list of cost-to-go matrices P_0..P_N and the list of gains K_0..K_{N-1} (u_n = K_n z_n)
    '''
    list_of_P = [QN]
    list_of_K = []

    # while iterating backward in time, list_of_P[i] holds P_{N-i}
    for i in range(N):
        P_next = list_of_P[i]
        K = -np.linalg.solve(B.T @ P_next @ B + R, B.T @ P_next @ A)
        list_of_K.append(K)
        list_of_P.append(Q + A.T @ P_next @ A + A.T @ P_next @ B @ K)

    # put both lists back in chronological order
    list_of_P.reverse()
    list_of_K.reverse()
    
    return list_of_P, list_of_K

In [11]:
# operating point: hover at the origin
z_star = np.zeros(quadrotor.NUMBER_STATES)
u_star = 0.5 * quadrotor.MASS * quadrotor.GRAVITY * np.ones(quadrotor.NUMBER_CONTROLS)

A, B = get_linearization(z_star, u_star)

Q = np.diag([100.0, 10.0, 100.0, 10.0, 100.0, 10.0])
R = 0.01 * np.eye(quadrotor.NUMBER_CONTROLS)

# approximate the infinite-horizon gain by running the Riccati recursion for many steps
# and keeping the gain that is furthest from the terminal time; it is computed only once
_, K_list = solve_LQR(A, B, Q, R, Q, 1000)
K_inf = K_list[0]

def origin_controller(state,i):
    # constant-gain feedback in the original coordinates: u = K (z - z*) + u*
    return K_inf @ (state - z_star) + u_star

    

In [12]:
z_star = np.array([0,0,0,0,0,0]).T
u_star = np.array([(quadrotor.MASS*9.81)/2, (quadrotor.MASS*9.81)/2]).T
A, B = get_linearization(z_star, u_star)

Q = np.array([[100.0, 0., 0., 0., 0., 0.], 
              [0., 100.0, 0., 0., 0., 0.], 
              [0., 0., 100.0, 0., 0., 0.], 
              [0., 0., 0., 100.0, 0., 0.], 
              [0., 0., 0., 0., 100.0, 0.], 
              [0., 0., 0., 0., 0., 100.0]])
    
R = np.array([[0.01, 0.],[0., 0.01]])

QN = Q
    
_, K = solve_LQR(A, B, Q, R, QN, 1)

print(np.shape(K))
print(A.shape)
print(B.shape)
print(u_star.shape)
print(z_star.shape)


(1, 2, 6)
(6, 6)
(6, 2)
(2,)
(6,)


In [13]:
# we can now simulate for a given number of time steps - here we do 10 seconds
horizon_length = 1000
z0 = np.zeros([quadrotor.NUMBER_STATES,])
t, state, u = quadrotor.simulate(z0, origin_controller, horizon_length, disturbance = False)

In [14]:
# we can plot the results
plt.figure(figsize=[9,6])

plt.subplot(2,3,1)
plt.plot(t, state[0,:])
plt.legend(['X'])

plt.subplot(2,3,2)
plt.plot(t, state[2,:])
plt.legend(['Y'])

plt.subplot(2,3,3)
plt.plot(t, state[4,:])
plt.legend(["theta"])

plt.subplot(2,3,4)
plt.plot(t, state[1,:])
plt.legend(['Vx'])
plt.xlabel('Time [s]')

plt.subplot(2,3,5)
plt.plot(t, state[3,:])
plt.legend(['Vy'])
plt.xlabel('Time [s]')

plt.subplot(2,3,6)
plt.plot(t, state[5,:])
plt.legend(['omega'])
plt.xlabel('Time [s]')
plt.savefig('2.1.png')

# we can also plot the control
plt.figure()
plt.plot(t[:-1], u.T)
plt.legend(['u1', 'u2'])
plt.xlabel('Time [s]')
plt.savefig('2.2.png')


In [15]:
# now we can also create an animation
quadrotor.animate_robot(state, u)

In [16]:
# we can also simulate with perturbations
t, state, u = quadrotor.simulate(z0, origin_controller, horizon_length, disturbance = True)

# we can plot the results
plt.figure(figsize=[9,6])

plt.subplot(2,3,1)
plt.plot(t, state[0,:])
plt.legend(['X'])

plt.subplot(2,3,2)
plt.plot(t, state[2,:])
plt.legend(['Y'])

plt.subplot(2,3,3)
plt.plot(t, state[4,:])
plt.legend(["theta"])

plt.subplot(2,3,4)
plt.plot(t, state[1,:])
plt.legend(['Vx'])
plt.xlabel('Time [s]')

plt.subplot(2,3,5)
plt.plot(t, state[3,:])
plt.legend(['Vy'])
plt.xlabel('Time [s]')

plt.subplot(2,3,6)
plt.plot(t, state[5,:])
plt.legend(['omega'])
plt.xlabel('Time [s]')
plt.savefig('2.3.png')

# we can also plot the control
plt.figure()
plt.plot(t[:-1], u.T)
plt.legend(['u1', 'u2'])
plt.xlabel('Time [s]')
plt.savefig('2.4.png')

quadrotor.animate_robot(state,u)


## Part 3 - following a trajectory using linearized dynamics
Now we want to follow a given trajectory leveraging a linearized version of the dynamics to design LQ controllers.
1. Assume that we want to follow a circle of radius 1 centered at (0,0) while keeping an orientation $\theta=0$. How does the linearization of the dynamics change along the desired trajectory? Why?
2. Design a tracking controller (using an LQ design with linear approximations) to follow this desired trajectory. Explain your design in the report. 
3. Test the tracking controller with the simulation (with and without the perturbations) and verify that you can indeed track the (x,y) trajectory very well. Are you able to also track $\theta$? (Explain) 
4. Analyze your results (including plots of the states, controls, etc). What benefits and issues do you see with this approach?
5. Is it possible to do the same thing while keeping a desired orientation of $\theta = \frac{\pi}{4}$? What might influence the results in this case?
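
The desired trajectory for question 1 can be tabulated ahead of time. The sketch below is one possible construction and not necessarily the intended one: `circle_reference` is a hypothetical helper, and filling in the desired velocities (instead of leaving them at zero) is a design choice made here for illustration.

```python
import numpy as np

def circle_reference(N, dt, radius=1.0, period=None):
    """Desired states along a circle centred at (0, 0).

    Columns are [x, vx, y, vy, theta, omega] at steps 0..N.
    By default one full lap is completed over the horizon.
    """
    if period is None:
        period = N * dt
    w = 2 * np.pi / period           # angular rate along the circle
    t = dt * np.arange(N + 1)
    z_ref = np.zeros((6, N + 1))     # theta and omega rows stay at zero
    z_ref[0] = radius * np.cos(w * t)
    z_ref[1] = -radius * w * np.sin(w * t)
    z_ref[2] = radius * np.sin(w * t)
    z_ref[3] = radius * w * np.cos(w * t)
    return z_ref

# 10 seconds at the simulator time step printed earlier (dt = 0.01)
z_ref = circle_reference(N=1000, dt=0.01)
```

The resulting array has the `dim(x) x (N+1)` layout expected by a tracking solver such as `solve_LQR_trajectory` below.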

In [17]:
def solve_LQR_trajectory(A, B, Q, R, x_bar, N):
    '''
    A, B, Q and R are the matrices defining the OC problem
    x_bar is the trajectory of desired states of size dim(x) x (N+1)
    N is the horizon length
    
    The function returns 1) a list of gains of length N and 2) a list of feedforward controls of length N
    '''
    # terminal conditions: P_N = Q and p_N = q_N = -Q x_bar_N
    list_of_P = [Q]
    list_of_p = [-Q @ x_bar[:,-1]]
    list_of_K = []
    list_of_k = []

    # backward pass: iteration i computes the quantities of time step n = N-i-1
    for i in range(N):
        P_next = list_of_P[i]
        p_next = list_of_p[i]
        H = B.T @ P_next @ B + R

        K = -np.linalg.solve(H, B.T @ P_next @ A)
        k = -np.linalg.solve(H, B.T @ p_next)
        q = -Q @ x_bar[:,N-i-1]

        list_of_K.append(K)
        list_of_k.append(k)
        list_of_P.append(Q + A.T @ P_next @ A + A.T @ P_next @ B @ K)
        list_of_p.append(q + A.T @ p_next + A.T @ P_next @ B @ k)

    # put the gains back in chronological order
    list_of_K.reverse()
    list_of_k.reverse()
    
    return list_of_K, list_of_k

In [18]:
# hover operating point used for the linearization
z_star = np.zeros(quadrotor.NUMBER_STATES)
u_star = 0.5 * quadrotor.MASS * quadrotor.GRAVITY * np.ones(quadrotor.NUMBER_CONTROLS)

A, B = get_linearization(z_star, u_star)

Q = np.diag([1e6, 50.0, 1e6, 50.0, 1e6, 50.0])
R = 0.001 * np.eye(quadrotor.NUMBER_CONTROLS)

N = 1000  # horizon_length

# desired trajectory: one lap of the unit circle over the horizon, with theta = 0
# (the desired velocities are left at zero)
angle = 2 * np.pi * np.arange(N+1) / N
x_bar = np.zeros((A.shape[0], N+1))
x_bar[0,:] = np.cos(angle)
x_bar[2,:] = np.sin(angle)

# the gains do not depend on the measured state, so they are computed once
K_circle, k_circle = solve_LQR_trajectory(A, B, Q, R, x_bar, N)

def LQ_controller(state,i):
    # state feedback plus feedforward along the reference, around the hover control
    return K_circle[i] @ state + k_circle[i] + u_star

In [19]:
# we can now simulate for a given number of time steps - here we do 10 seconds
horizon_length = 1000
z0 = np.array([1,0,0,0,0,0])
t, state, u = quadrotor.simulate(z0, LQ_controller, horizon_length, disturbance = False)

In [20]:
# we can plot the results
plt.figure(figsize=[9,6])

plt.subplot(2,3,1)
plt.plot(t, state[0,:])
plt.legend(['X'])

plt.subplot(2,3,2)
plt.plot(t, state[2,:])
plt.legend(['Y'])

plt.subplot(2,3,3)
plt.plot(t, state[4,:])
plt.legend(["theta"])

plt.subplot(2,3,4)
plt.plot(t, state[1,:])
plt.legend(['Vx'])
plt.xlabel('Time [s]')

plt.subplot(2,3,5)
plt.plot(t, state[3,:])
plt.legend(['Vy'])
plt.xlabel('Time [s]')

plt.subplot(2,3,6)
plt.plot(t, state[5,:])
plt.legend(['omega'])
plt.xlabel('Time [s]')
plt.savefig('3.1.png')

# we can also plot the control
plt.figure()
plt.plot(t[:-1], u.T)
plt.legend(['u1', 'u2'])
plt.xlabel('Time [s]')
plt.savefig('3.2.png')




In [21]:
# now we can also create an animation
quadrotor.animate_robot(state, u)

In [22]:
# we can now simulate for a given number of time steps - here we do 10 seconds
horizon_length = 1000
z0 = np.array([1,0,0,0,0,0])
t, state, u = quadrotor.simulate(z0, LQ_controller, horizon_length, disturbance = True)

In [23]:
# we can plot the results
plt.figure(figsize=[9,6])

plt.subplot(2,3,1)
plt.plot(t, state[0,:])
plt.legend(['X'])

plt.subplot(2,3,2)
plt.plot(t, state[2,:])
plt.legend(['Y'])

plt.subplot(2,3,3)
plt.plot(t, state[4,:])
plt.legend(["theta"])

plt.subplot(2,3,4)
plt.plot(t, state[1,:])
plt.legend(['Vx'])
plt.xlabel('Time [s]')

plt.subplot(2,3,5)
plt.plot(t, state[3,:])
plt.legend(['Vy'])
plt.xlabel('Time [s]')

plt.subplot(2,3,6)
plt.plot(t, state[5,:])
plt.legend(['omega'])
plt.xlabel('Time [s]')
plt.savefig('3.3.png')

# we can also plot the control
plt.figure()
plt.plot(t[:-1], u.T)
plt.legend(['u1', 'u2'])
plt.xlabel('Time [s]')
plt.savefig('3.4.png')


In [24]:
# now we can also create an animation
quadrotor.animate_robot(state, u)

In [25]:
# the linearization is still taken around hover at theta = 0,
# only the desired trajectory asks for theta = pi/4
z_star = np.zeros(quadrotor.NUMBER_STATES)
u_star = 0.5 * quadrotor.MASS * quadrotor.GRAVITY * np.ones(quadrotor.NUMBER_CONTROLS)

A, B = get_linearization(z_star, u_star)

Q = np.diag([1e6, 50.0, 1e6, 50.0, 1e6, 50.0])
R = 0.001 * np.eye(quadrotor.NUMBER_CONTROLS)

N = 1000  # horizon_length

# desired trajectory: one lap of the unit circle with a constant orientation of pi/4
angle = 2 * np.pi * np.arange(N+1) / N
z_bar = np.zeros((A.shape[0], N+1))
z_bar[0,:] = np.cos(angle)
z_bar[2,:] = np.sin(angle)
z_bar[4,:] = np.pi/4

# the gains do not depend on the measured state, so they are computed once
K_pi4, k_pi4 = solve_LQR_trajectory(A, B, Q, R, z_bar, N)

def LQpiby4_controller(state,i):
    # state feedback plus feedforward along the reference, around the hover control
    return K_pi4[i] @ state + k_pi4[i] + u_star

In [26]:
# we can now simulate for a given number of time steps - here we do 10 seconds
horizon_length = 1000
z0 = np.array([1,0,0,0,0,0])
t, state, u = quadrotor.simulate(z0, LQpiby4_controller, horizon_length, disturbance = False)

In [27]:
# we can plot the results
plt.figure(figsize=[9,6])

plt.subplot(2,3,1)
plt.plot(t, state[0,:])
plt.legend(['X'])

plt.subplot(2,3,2)
plt.plot(t, state[2,:])
plt.legend(['Y'])

plt.subplot(2,3,3)
plt.plot(t, state[4,:])
plt.legend(["theta"])

plt.subplot(2,3,4)
plt.plot(t, state[1,:])
plt.legend(['Vx'])
plt.xlabel('Time [s]')

plt.subplot(2,3,5)
plt.plot(t, state[3,:])
plt.legend(['Vy'])
plt.xlabel('Time [s]')

plt.subplot(2,3,6)
plt.plot(t, state[5,:])
plt.legend(['omega'])
plt.xlabel('Time [s]')
plt.savefig('3.5.png')

# we can also plot the control
plt.figure()
plt.plot(t[:-1], u.T)
plt.legend(['u1', 'u2'])
plt.xlabel('Time [s]')
plt.savefig('3.6.png')


In [28]:
# now we can also create an animation
quadrotor.animate_robot(state, u)

In [29]:
# we can now simulate for a given number of time steps - here we do 10 seconds
horizon_length = 1000
z0 = np.array([1,0,0,0,0,0])
t, state, u = quadrotor.simulate(z0, LQpiby4_controller, horizon_length, disturbance = True)

In [30]:
# we can plot the results
plt.figure(figsize=[9,6])

plt.subplot(2,3,1)
plt.plot(t, state[0,:])
plt.legend(['X'])

plt.subplot(2,3,2)
plt.plot(t, state[2,:])
plt.legend(['Y'])

plt.subplot(2,3,3)
plt.plot(t, state[4,:])
plt.legend(["theta"])

plt.subplot(2,3,4)
plt.plot(t, state[1,:])
plt.legend(['Vx'])
plt.xlabel('Time [s]')

plt.subplot(2,3,5)
plt.plot(t, state[3,:])
plt.legend(['Vy'])
plt.xlabel('Time [s]')

plt.subplot(2,3,6)
plt.plot(t, state[5,:])
plt.legend(['omega'])
plt.xlabel('Time [s]')
plt.savefig('3.7.png')

# we can also plot the control
plt.figure()
plt.plot(t[:-1], u.T)
plt.legend(['u1', 'u2'])
plt.xlabel('Time [s]')
plt.savefig('3.8.png')


In [31]:
# now we can also create an animation
quadrotor.animate_robot(state, u)

## Part 4 - iterative LQR
Now we would like to do more complicated motions with the robot, like a flip. In this case, we do not have a prescribed trajectory but we would like to compute a locally optimal trajectory while we optimize the controller. We will use the *iterative LQR* algorithm to solve this problem.
### Task 1 - reaching a vertical orientation
In the first task, we want the robot to reach a vertical orientation $\theta = \frac{\pi}{2}$ at the location $x=3$ and $y=3$ at time $t=5$ starting from $z_0=0$. During the rest of the motion, the robot should try and stay close to the origin. It should also try to keep its control $u$ close to the control needed to keep the robot at rest. We want to make sure the robot reaches the origin $z=0$ at the end of the movement. 
1. Find a time-varying cost function that promotes such a behavior (use only quadratic/linear terms for simplicity). Assume $T=10$ seconds.
2. Write a function ```compute_cost(z,u, horizon_length)``` that returns the cost of a trajectory z with control trajectory u (using the cost function you wrote in question 1).
3. Compute the quadratic approximation of your cost function along an arbitrary trajectory of states $z_n$ with control trajectory $u_n$ (this is not just your cost function!)
4. Write a function ```get_quadratic_approximation_cost(z, u, horizon_length)``` that returns the quadratic approximation (Hessian matrices and Jacobians) of the cost function when approximated along the trajectory z with control trajectory u.
5. Write the iLQR algorithm that solves the problem using the functions written above. DO NOT FORGET the line search step at each iteration. For the line search, start with $\alpha = 1.$ and decrease it by half when the cost does not improve (you can stop when $\alpha < 0.01$).
6. Test the algorithm using as initial guess $u$ such that the robot is at rest (using the results of Part 1.2). Analyze your results (probably you will need to "tune" your cost function), plot the initial and final state and control trajectories, show the animation. Use the simulation without perturbations for simplicity. 
7. What benefits and issues do you see with this approach?
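
The line search described in question 5 can be isolated in a small helper. This is a minimal sketch under stated assumptions, not the required implementation: `backtracking_line_search`, `rollout` and `cost_fn` are hypothetical names, where `rollout(alpha)` is assumed to simulate the system with step size `alpha` and return the new state and control trajectories.

```python
def backtracking_line_search(rollout, cost_fn, current_cost, alpha0=1.0, alpha_min=0.01):
    """Halve alpha until the rolled-out trajectory improves on current_cost.

    rollout(alpha) -> (z_new, u_new); cost_fn(z_new, u_new) -> float.
    Returns (z_new, u_new, new_cost, alpha), or None if no tested step size helps.
    """
    alpha = alpha0
    while alpha >= alpha_min:
        z_new, u_new = rollout(alpha)
        new_cost = cost_fn(z_new, u_new)
        if new_cost < current_cost:
            return z_new, u_new, new_cost, alpha
        alpha /= 2
    return None

# Toy scalar example: start at x = 1 with cost x**2 and an overshooting step of -4
rollout = lambda alpha: (1.0 + alpha * (-4.0), None)
cost_fn = lambda z, u: z ** 2
result = backtracking_line_search(rollout, cost_fn, current_cost=1.0)
```

In the toy example the full step and the half step are rejected and the quarter step is accepted. In iLQR, a `None` result is a natural stopping criterion, since no tested step size reduces the cost any further.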


In [32]:

def compute_cost(z, u, horizon_length):
    
    z_trajectory1 = np.array([0,0,0,0,0,0]).T    
    z_trajectory2 = np.array([3,0,3,0,np.pi/2,0]).T    
    z_trajectory3 = np.array([0,0,0,0,0,0]).T
    
    u_desired = np.array([(quadrotor.MASS*9.81)/2, (quadrotor.MASS*9.81)/2])
    
    cost = 0

    for i in range(horizon_length):
        
        R = np.array([[1., 0.],[0., 1.]])

        if i<350:

              
            Q = np.array([[5., 0., 0., 0., 0., 0.], 
                          [0., 2., 0., 0., 0., 0.], 
                          [0., 0., 5., 0., 0., 0.], 
                          [0., 0., 0., 2., 0., 0.], 
                          [0., 0., 0., 0., 2., 0.], 
                          [0., 0., 0., 0., 0., 2.]]) 
            
            z_desired = z_trajectory1
            
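        # note: the inequalities below are strict, so i == 350 and i == 500 match no phase
        # and fall through to the final else branch (low weights, target at the origin)
        elif i>350 and i<500: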
            

            Q = np.array([[100., 0., 0., 0., 0., 0.], 
                          [0., 2., 0., 0., 0., 0.], 
                          [0., 0., 150., 0., 0., 0.], 
                          [0., 0., 0., 2., 0., 0.], 
                          [0., 0., 0., 0., 200., 0.], 
                          [0., 0., 0., 0., 0., 2.]])
            
            z_desired = z_trajectory2
            
        elif i>500 and i<650:

            
            Q = np.array([[200., 0., 0., 0., 0., 0.], 
                          [0., 2., 0., 0., 0., 0.], 
                          [0., 0., 200., 0., 0., 0.], 
                          [0., 0., 0., 2., 0., 0.], 
                          [0., 0., 0., 0., 100., 0.], 
                          [0., 0., 0., 0., 0., 2.]])
            
            z_desired = z_trajectory3
            
        else:

            
            Q = np.array([[5., 0., 0., 0., 0., 0.], 
                          [0., 2., 0., 0., 0., 0.], 
                          [0., 0., 5., 0., 0., 0.], 
                          [0., 0., 0., 2., 0., 0.], 
                          [0., 0., 0., 0., 3., 0.], 
                          [0., 0., 0., 0., 0., 2.]])
            
            z_desired = z_trajectory3


        cost_val = np.matmul(np.transpose(z[:,i] - z_desired), np.matmul(Q, (z[:,i] - z_desired))) + np.matmul(np.transpose(u[:,i] - u_desired), np.matmul(R, (u[:,i] - u_desired)))
    
        cost += cost_val
        
   
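    # terminal cost: Q still holds the weights of the last loop iteration
    # (the final else branch when horizon_length = 1000)
    terminal_cost = np.matmul(np.transpose(z[:,-1] - z_trajectory3), np.matmul(Q, (z[:,-1] - z_trajectory3)))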
    
    cost += terminal_cost

    return cost

In [33]:
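def get_quadratic_approximation_cost(z, u, horizon_length=1000):   
    '''
    returns the linearized dynamics (A_n, B_n) and the quadratic approximation of the cost
    (Hessians Q_n, R_n and gradients q_n, r_n) along the trajectory z with control trajectory u
    
    the Hessians and gradients are those of compute_cost divided by 2;
    this common scaling does not change the minimizer
    '''
    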
    z_trajectory1 = np.array([0,0,0,0,0,0]).T
    z_trajectory2 = np.array([3,0,3,0,np.pi/2,0]).T
    z_trajectory3 = np.array([0,0,0,0,0,0]).T
    
    list_of_A = []
    list_of_B = []
    list_of_Q = []
    list_of_R = []
    list_of_q = []
    list_of_r = []
    # hover control, used as the reference for the control cost
    u_star = np.array([(quadrotor.MASS*9.81)/2, (quadrotor.MASS*9.81)/2]).T

    for i in range(horizon_length):
        
        A,B = get_linearization(z[:,i],u[:,i])
        
        if i < 350:
            
            Q = np.array([[5., 0., 0., 0., 0., 0.], 
                          [0., 2., 0., 0., 0., 0.], 
                          [0., 0., 5., 0., 0., 0.], 
                          [0., 0., 0., 2., 0., 0.], 
                          [0., 0., 0., 0., 2., 0.], 
                          [0., 0., 0., 0., 0., 2.]])
    
            R = np.array([[1., 0.],[0., 1.]])
            
            r = np.matmul(R,(u[:,i]-u_star))
            q = np.matmul(Q, (z[:,i]-z_trajectory1))
            
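        # note: the inequalities below are strict, so i == 350 and i == 500 match no phase
        # and fall through to the final else branch (low weights, target at the origin)
        elif i >350 and i < 500: 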
            
            Q = np.array([[100., 0., 0., 0., 0., 0.], 
                          [0., 2., 0., 0., 0., 0.], 
                          [0., 0., 150., 0., 0., 0.], 
                          [0., 0., 0., 2., 0., 0.], 
                          [0., 0., 0., 0., 200., 0.], 
                          [0., 0., 0., 0., 0., 2.]])
    
            R = np.array([[1., 0.],[0., 1.]])
            
            r = np.matmul(R,(u[:,i]-u_star))
            q = np.matmul(Q, (z[:,i]-z_trajectory2))
            
        elif i > 500 and i < 650:
            
            Q = np.array([[200., 0., 0., 0., 0., 0.], 
                          [0., 2., 0., 0., 0., 0.], 
                          [0., 0., 200., 0., 0., 0.], 
                          [0., 0., 0., 2., 0., 0.], 
                          [0., 0., 0., 0., 100., 0.], 
                          [0., 0., 0., 0., 0., 2.]])
    
            R = np.array([[1., 0.],[0., 1.]])
            
            r = np.matmul(R,(u[:,i]-u_star))
            q = np.matmul(Q, (z[:,i]-z_trajectory3))
            
        else:  
            
            Q = np.array([[5., 0., 0., 0., 0., 0.], 
                          [0., 2., 0., 0., 0., 0.], 
                          [0., 0., 5., 0., 0., 0.], 
                          [0., 0., 0., 2., 0., 0.], 
                          [0., 0., 0., 0., 3., 0.], 
                          [0., 0., 0., 0., 0., 2.]])
    
            R = np.array([[1., 0.],[0., 1.]])
            
            r = np.matmul(R,(u[:,i]-u_star))
            q = np.matmul(Q, (z[:,i]-z_trajectory3))
 
    
        list_of_A.append(A)
        list_of_B.append(B)
        list_of_Q.append(Q)
        list_of_R.append(R)
        list_of_q.append(q)
        list_of_r.append(r)

    # terminal cost: same weights as the last phase, target at the origin
    Q = np.array([[5., 0., 0., 0., 0., 0.], 
                  [0., 2., 0., 0., 0., 0.], 
                  [0., 0., 5., 0., 0., 0.], 
                  [0., 0., 0., 2., 0., 0.], 
                  [0., 0., 0., 0., 3., 0.], 
                  [0., 0., 0., 0., 0., 2.]])
    
    q = np.matmul(Q, (z[:,-1]-z_trajectory3))
    
    list_of_Q.append(Q)
    list_of_q.append(q)
    
        
    return list_of_A, list_of_B, list_of_Q, list_of_R, list_of_q, list_of_r


In [34]:
def solve_iLQR(A, B, Q, R, q, r, N):
    '''
    Backward pass of iLQR
    A, B, R, r are lists of length N; Q, q are lists of length N+1 (the last entry is the terminal cost)
    returns the lists of feedback gains K_n and feedforward terms k_n for n = 0..N-1
    '''
    list_of_K = []
    list_of_k = []

    # terminal conditions
    P = Q[-1]
    p = q[-1]

    # iterate backward in time, n = N-1, ..., 0, without modifying the input lists
    for n in reversed(range(N)):
        H = B[n].T @ P @ B[n] + R[n]

        K = -np.linalg.solve(H, B[n].T @ P @ A[n])
        k = -np.linalg.solve(H, B[n].T @ p + r[n])

        # the stage terms Q[n] and q[n] belong to the same time step n as A[n] and B[n]
        # (the original indexed them with N-i-2, which is shifted by one step)
        p = q[n] + A[n].T @ p + A[n].T @ P @ B[n] @ k
        P = Q[n] + A[n].T @ P @ A[n] + A[n].T @ P @ B[n] @ K

        list_of_K.append(K)
        list_of_k.append(k)
        
    # put the gains back in chronological order
    list_of_K.reverse()
    list_of_k.reverse()
    
    return list_of_K, list_of_k

In [35]:
def get_new_state_control(z, u, K, k, alpha, horizon_length):
    
    u_n = np.zeros_like(u)
    z_n = np.zeros_like(z)
    
    # the new rollout starts from the same initial state as the nominal trajectory
    z_n[:,0] = z[:,0]

    for i in range(horizon_length):
        
        u_n[:,i] = K[i]@(z_n[:,i]-z[:,i] ) + alpha*k[i] + u[:,i]

        z_n[:,i+1] = quadrotor.get_next_state(z_n[:, i], u_n[:,i])

    z_new = z_n
    u_new = u_n
    
    return z_new, u_new

In [36]:
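N = 1000
z0 = np.zeros(quadrotor.NUMBER_STATES)

# initial guess: the hover control from Part 1.2, so that the robot starts at rest
u_hover = 0.5 * quadrotor.MASS * quadrotor.GRAVITY
u = u_hover * np.ones([quadrotor.NUMBER_CONTROLS, N])

# roll out the initial guess
z = np.zeros([quadrotor.NUMBER_STATES, N+1])
z[:,0] = z0
for i in range(N):
    z[:, i+1] = quadrotor.get_next_state(z[:, i], u[:,i])
cost = compute_cost(z, u, N)

max_iterations = 50  # arbitrary cap on the number of iLQR iterations

for iteration in range(max_iterations):

    A, B, Q, R, q, r = get_quadratic_approximation_cost(z, u, N)
    K, k = solve_iLQR(A, B, Q, R, q, r, N)

    # line search: restart from alpha = 1 and halve it until the cost improves
    alpha = 1.0
    improved = False
    while alpha > 0.01:
        z_new, u_new = get_new_state_control(z, u, K, k, alpha, N)
        new_cost = compute_cost(z_new, u_new, N)
        if new_cost < cost:
            # accept the step: states and controls are updated together
            z, u, cost = z_new, u_new, new_cost
            improved = True
            break
        alpha = alpha/2

    # stop when no step size improves the cost
    if not improved:
        break

quadrotor.animate_robot(z, u)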


In [37]:
# we can plot the results
t = quadrotor.DELTA_T * np.arange(z.shape[1])  # time stamps of the N+1 states
plt.figure(figsize=[9,6])

plt.subplot(2,3,1)
plt.plot(t, z[0,:])
plt.legend(['X'])

plt.subplot(2,3,2)
plt.plot(t, z[2,:])
plt.legend(['Y'])

plt.subplot(2,3,3)
plt.plot(t, z[4,:])
plt.legend(["theta"])

plt.subplot(2,3,4)
plt.plot(t, z[1,:])
plt.legend(['Vx'])
plt.xlabel('Time [s]')

plt.subplot(2,3,5)
plt.plot(t, z[3,:])
plt.legend(['Vy'])
plt.xlabel('Time [s]')

plt.subplot(2,3,6)
plt.plot(t, z[5,:])
plt.legend(['omega'])
plt.xlabel('Time [s]')
plt.savefig('4.1.png')

# we can also plot the control
plt.figure()
plt.plot(t[:-1], u.T)
plt.legend(['u1', 'u2'])
plt.xlabel('Time [s]')
plt.savefig('4.2.png')


### Task 2 - doing a full flip
In the second task, we want the robot to do a full flip, trying to reach the upside-down state $x=1.5$, $y=3$ and $\theta = \pi$ at $t=5$ and upright state $x=3$, $y=0$ and $\theta = 2\pi$ at $T=10$.
8. Use iLQR (and a new cost function) to get the quadrotor to perform the task. Analyze your results. 
9. What benefits and issues do you see with this approach? Could you run the resulting controller on a real robot?
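
One way to encode the two waypoints is a time-indexed target that the cost function and its quadratic approximation both query, so the two cannot drift apart. The sketch below assumes a time step of 0.01 s (so that 1000 steps span $T=10$) and an illustrative time at which the target switches from the start state to the upside-down state; `flip_target`, `DT` and `t_leave` are hypothetical names and values, not part of `quadrotor.py`.

```python
import numpy as np

DT = 0.01  # assumed time step [s]; use the value defined in quadrotor.py in practice

# Waypoints in the state ordering z = [x, vx, y, vy, theta, omega]
Z_START = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
Z_FLIP = np.array([1.5, 0.0, 3.0, 0.0, np.pi, 0.0])        # upside-down at t = 5
Z_END = np.array([3.0, 0.0, 0.0, 0.0, 2.0 * np.pi, 0.0])   # upright at T = 10


def flip_target(i, t_leave=4.0, t_flip=5.0):
    """Return the desired state at step i (illustrative schedule).

    Steps before t_leave track the start state, steps before t_flip track
    the upside-down waypoint, and all later steps track the final state.
    """
    i_leave = int(round(t_leave / DT))
    i_flip = int(round(t_flip / DT))
    if i < i_leave:
        return Z_START
    if i < i_flip:
        return Z_FLIP
    return Z_END
```

Comparing integer step indices with chained `if i < ...` tests leaves no step unassigned at the boundaries between phases.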

In [38]:
def compute_flip_cost(z, u, horizon_length):
    # total cost of a trajectory: quadratic tracking cost at every step plus a terminal cost
    
    z_trajectory1 = np.array([0,0,0,0,0,0]).T    
    z_trajectory2 = np.array([1.5,0,3,0,np.pi,0]).T    
    z_trajectory3 = np.array([3,0,0,0,np.pi*2,0]).T
    
    u_desired = np.array([(quadrotor.MASS*9.81)/2, (quadrotor.MASS*9.81)/2])
    
    cost = 0

    for i in range(horizon_length):

        
        R = np.array([[1., 0.],[0., 1.]])

        if i < 400:  # same switch step as in get_quadratic_approximation_flip_cost

              
            Q = np.array([[5., 0., 0., 0., 0., 0.], 
                          [0., 2., 0., 0., 0., 0.], 
                          [0., 0., 5., 0., 0., 0.], 
                          [0., 0., 0., 2., 0., 0.], 
                          [0., 0., 0., 0., 2., 0.], 
                          [0., 0., 0., 0., 0., 2.]]) 
            
            z_desired = z_trajectory1
            
        elif i < 500:  # earlier steps are already handled by the branch above
            
            Q = np.array([[10., 0., 0., 0., 0., 0.], 
                          [0., 2., 0., 0., 0., 0.], 
                          [0., 0., 100., 0., 0., 0.], 
                          [0., 0., 0., 2., 0., 0.], 
                          [0., 0., 0., 0., 200., 0.], 
                          [0., 0., 0., 0., 0., 2.]])
            
            z_desired = z_trajectory2
            
        elif i < 650:  # includes i = 500, which the strict bounds used to skip
            
            Q = np.array([[20., 0., 0., 0., 0., 0.], 
                          [0., 2., 0., 0., 0., 0.], 
                          [0., 0., 20., 0., 0., 0.], 
                          [0., 0., 0., 2., 0., 0.], 
                          [0., 0., 0., 0., 500., 0.], 
                          [0., 0., 0., 0., 0., 2.]])
            
            z_desired = z_trajectory3
            
        else:
            
            Q = np.array([[5., 0., 0., 0., 0., 0.], 
                          [0., 2., 0., 0., 0., 0.], 
                          [0., 0., 5., 0., 0., 0.], 
                          [0., 0., 0., 2., 0., 0.], 
                          [0., 0., 0., 0., 3., 0.], 
                          [0., 0., 0., 0., 0., 2.]])
            
            z_desired = z_trajectory3


        # quadratic penalty on the state and control deviations at step i
        dz = z[:,i] - z_desired
        du = u[:,i] - u_desired
        cost_val = dz @ Q @ dz + du @ R @ du
    
        cost += cost_val
        
   
    # terminal cost on the final state, reusing the Q of the last stage
    dz_final = z[:,-1] - z_trajectory3
    terminal_cost = dz_final @ Q @ dz_final
    
    cost += terminal_cost

    return cost
    


In [39]:
def get_quadratic_approximation_flip_cost(z, u, horizon_length=1000):   
    
    z_trajectory1 = np.array([0,0,0,0,0,0]).T
    
    z_trajectory2 = np.array([1.5,0,3,0,np.pi,0]).T
    
    z_trajectory3 = np.array([3,0,0,0,np.pi*2,0]).T
    
    list_of_A = []
    list_of_B = []
    list_of_Q = []
    list_of_R = []
    list_of_q = []
    list_of_r = []
    # hover thrust per rotor, used as the reference control
    u_star = np.array([(quadrotor.MASS*9.81/2), (quadrotor.MASS*9.81/2)]).T

    for i in range(horizon_length):
        
        A,B = get_linearization(z[:,i],u[:,i])
        
        if i < 400:
            
            Q = np.array([[5., 0., 0., 0., 0., 0.], 
                          [0., 2., 0., 0., 0., 0.], 
                          [0., 0., 5., 0., 0., 0.], 
                          [0., 0., 0., 2., 0., 0.], 
                          [0., 0., 0., 0., 2., 0.], 
                          [0., 0., 0., 0., 0., 2.]])
    
            R = np.array([[1., 0.],[0., 1.]])
            
            r = np.matmul(R,(u[:,i]-u_star))
            q = np.matmul(Q, (z[:,i]-z_trajectory1))
            
        elif i < 500:  # includes i = 400, which the strict bounds used to skip
            
            Q = np.array([[10., 0., 0., 0., 0., 0.], 
                          [0., 2., 0., 0., 0., 0.], 
                          [0., 0., 100., 0., 0., 0.], 
                          [0., 0., 0., 2., 0., 0.], 
                          [0., 0., 0., 0., 200., 0.], 
                          [0., 0., 0., 0., 0., 2.]])
    
            R = np.array([[1., 0.],[0., 1.]])
            
            r = np.matmul(R,(u[:,i]-u_star))
            q = np.matmul(Q, (z[:,i]-z_trajectory2))
            
        elif i < 650:  # includes i = 500, which the strict bounds used to skip
            
            Q = np.array([[20., 0., 0., 0., 0., 0.], 
                          [0., 2., 0., 0., 0., 0.], 
                          [0., 0., 20., 0., 0., 0.], 
                          [0., 0., 0., 2., 0., 0.], 
                          [0., 0., 0., 0., 500., 0.], 
                          [0., 0., 0., 0., 0., 2.]])
    
            R = np.array([[1., 0.],[0., 1.]])
            
            r = np.matmul(R,(u[:,i]-u_star))
            q = np.matmul(Q, (z[:,i]-z_trajectory3))
            
        else:  
            
            Q = np.array([[5., 0., 0., 0., 0., 0.], 
                          [0., 2., 0., 0., 0., 0.], 
                          [0., 0., 5., 0., 0., 0.], 
                          [0., 0., 0., 2., 0., 0.], 
                          [0., 0., 0., 0., 3., 0.], 
                          [0., 0., 0., 0., 0., 2.]])
    
            R = np.array([[1., 0.],[0., 1.]])
            
            r = np.matmul(R,(u[:,i]-u_star))
            q = np.matmul(Q, (z[:,i]-z_trajectory3))
 
    
        list_of_A.append(A)
        list_of_B.append(B)
        list_of_Q.append(Q)
        list_of_R.append(R)
        list_of_q.append(q)
        list_of_r.append(r)

    # terminal weight, defined before it is used for the terminal gradient
    Q = np.array([[5., 0., 0., 0., 0., 0.], 
                  [0., 2., 0., 0., 0., 0.], 
                  [0., 0., 5., 0., 0., 0.], 
                  [0., 0., 0., 2., 0., 0.], 
                  [0., 0., 0., 0., 3., 0.], 
                  [0., 0., 0., 0., 0., 2.]])
    
    q = np.matmul(Q, (z[:,-1]-z_trajectory3))
    
    list_of_Q.append(Q)
    list_of_q.append(q)
    
        
    return list_of_A, list_of_B, list_of_Q, list_of_R, list_of_q, list_of_r
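
The linear terms `q` and `r` returned above are meant to be gradients of the stage cost, which can be checked numerically. The sketch below compares them against central finite differences for a stage cost written with a factor 1/2; `stage_cost` and `numerical_gradient` are hypothetical helpers, and the hover thrust value is a placeholder. Note that `compute_flip_cost` omits the factor 1/2, so its gradient is twice these values; that rescales the whole cost uniformly and should not change which trajectory is optimal.

```python
import numpy as np


def stage_cost(z, u, z_des, u_des, Q, R):
    """Quadratic tracking cost 0.5 * (dz' Q dz + du' R du)."""
    dz = z - z_des
    du = u - u_des
    return 0.5 * (dz @ Q @ dz + du @ R @ du)


def numerical_gradient(f, x, eps=1e-6):
    """Central finite-difference gradient of a scalar function f at x."""
    grad = np.zeros_like(x)
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        grad[j] = (f(x + step) - f(x - step)) / (2.0 * eps)
    return grad


# Weights of the second phase of the flip cost
Q = np.diag([10.0, 2.0, 100.0, 2.0, 200.0, 2.0])
R = np.eye(2)
z_des = np.array([1.5, 0.0, 3.0, 0.0, np.pi, 0.0])
u_des = np.array([2.0, 2.0])  # placeholder hover thrust (assumed value)

# Arbitrary test point
z = np.array([0.3, -0.1, 1.0, 0.2, 0.5, 0.0])
u = np.array([1.0, 3.0])

# Analytic gradients, in the same form as q and r in the function above
q_analytic = Q @ (z - z_des)
r_analytic = R @ (u - u_des)

q_numeric = numerical_gradient(lambda zz: stage_cost(zz, u, z_des, u_des, Q, R), z)
r_numeric = numerical_gradient(lambda uu: stage_cost(z, uu, z_des, u_des, Q, R), u)
```

The same helper can be pointed at the actual cost function, one time step at a time, to catch mismatches between the cost and its approximation.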


In [40]:
N = 1000
z0 = np.array([0, 0, 0, 0, 0, 0])
u = np.ones([quadrotor.NUMBER_CONTROLS, N])
z = np.zeros([quadrotor.NUMBER_STATES, N+1])

alpha = 1.0

while alpha > 0.01:
        
    z[:,0] = z0
    for i in range(N):
        z[:, i+1] = quadrotor.get_next_state(z[:, i], u[:,i])

    old_cost = compute_flip_cost(z, u, N)

    A, B, Q, R, q, r = get_quadratic_approximation_flip_cost(z, u, N)

    K, k = solve_iLQR(A, B, Q, R, q, r, N)      

    z_new, u_new = get_new_state_control(z, u, K, k, alpha, N)
    
    new_cost = compute_flip_cost(z_new, u_new, N)
    
    if new_cost < old_cost:
        # accept the step only if it lowers the cost
        u = u_new
    else:
        # reject the step and retry with a smaller step size
        alpha = alpha/2

quadrotor.animate_robot(z, u)


In [41]:
# we can plot the results
plt.figure(figsize=[9,6])

plt.subplot(2,3,1)
plt.plot(t, z[0,:])
plt.legend(['X'])

plt.subplot(2,3,2)
plt.plot(t, z[2,:])
plt.legend(['Y'])

plt.subplot(2,3,3)
plt.plot(t, z[4,:])
plt.legend(["theta"])

plt.subplot(2,3,4)
plt.plot(t, z[1,:])
plt.legend(['Vx'])
plt.xlabel('Time [s]')

plt.subplot(2,3,5)
plt.plot(t, z[3,:])
plt.legend(['Vy'])
plt.xlabel('Time [s]')

plt.subplot(2,3,6)
plt.plot(t, z[5,:])
plt.legend(['omega'])
plt.xlabel('Time [s]')
plt.savefig('4.3.png')

# we can also plot the control
plt.figure()
plt.plot(t[:-1], u.T)
plt.legend(['u1', 'u2'])
plt.xlabel('Time [s]')
plt.savefig('4.4.png')
