# Goal of the project

The goal of this project is to control a 2D quadrotor to perform acrobatic moves. There are 4 parts of the project, where you will build controllers of increasing complexity. The last part will lead to the implementation of the iterative LQR (iLQR) algorithm.

## Instructions
Answer all the questions in the 4 parts below. You will need to submit:
1. A report (pdf format only - every other format will be rejected) answering all the questions that do not request code. DO NOT include code in the report.
2. One (or several) Jupyter notebook(s) containing all the code used to answer the questions. The notebook(s) should be runnable as is.

## 2D quadrotor

The quadrotor is depicted in the following figure <img src='quadrotor.png' width="300">
The quadrotor model is written as
$$\begin{align} 
\dot{x} &= v_x\\
m \dot{v}_x &= - (u_1 + u_2) \sin \theta \\ 
\dot{y} &= v_y\\
m \dot{v}_y &= (u_1 + u_2) \cos \theta  - m g\\
\dot{\theta} &= \omega\\
I \dot{\omega} &= r (u_1 - u_2) \end{align}$$
where $x$ is the horizontal and $y$ the vertical positions of the quadrotor and $\theta$ is its orientation with respect to the horizontal plane. $v_x$ and $v_y$ are the linear velocities and $\omega$ is the angular velocity of the robot. $u_1$ and $u_2$ are the forces produced by the rotors (our control inputs). $m$ is the quadrotor mass, $I$ its moment of inertia (a scalar), $r$ is the distance from the center of the robot frame to the propellers and $g$ is the gravity constant. To denote the entire state, we will write $z = [x, v_x, y, v_y, \theta, \omega]^T$ - we will also write $u = [u_1, u_2]^T$.

The module ```quadrotor.py``` defines useful constants (mass, length, gravity, etc) and functions to simulate and animate the quadrotor as shown below.

## Part 1 - Setting up
1. Discretize the system dynamics using the method seen in class - write the time discretization step as $\Delta t$ (use symbols not numbers for the mass, etc)
2. Assume that the robot starts at an arbitrary position $x(0) = x_0$, $y(0) = y_0$ and $\theta(0) = 0$ with 0 velocities. Compute $u_1^*$ and $u_2^*$ such that the robot stays at this position forever after (you may test your answer using the simulation below).
3. Analyzing the system dynamics, is it possible to move in the x direction while keeping $\theta = 0$? Explain why.
4. Analyzing the system dynamics, is it possible to have the system at rest with $\theta = \frac{\pi}{2}$ (i.e. have the quadrotor in a vertical position)? Explain why.
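
As a rough illustration of questions 1 and 2, the sketch below applies one forward-Euler step to the model above. It assumes that the "method seen in class" is forward Euler (the notebook does not say so explicitly), and `euler_step` is a hypothetical helper rather than a function of `quadrotor.py`. The constants are the values printed by the notebook for the `quadrotor` module.

```python
import numpy as np

# Values printed in this notebook for quadrotor.MASS, LENGTH, INERTIA, GRAVITY, DELTA_T
MASS, LENGTH, INERTIA, GRAVITY, DELTA_T = 0.6, 0.2, 0.15, 9.81, 0.01


def euler_step(z, u, dt=DELTA_T):
    """One forward-Euler step z_{n+1} = z_n + dt * f(z_n, u_n) (hypothetical helper)."""
    x, vx, y, vy, theta, omega = z
    thrust = u[0] + u[1]
    z_dot = np.array([
        vx,
        -thrust * np.sin(theta) / MASS,
        vy,
        thrust * np.cos(theta) / MASS - GRAVITY,
        omega,
        LENGTH * (u[0] - u[1]) / INERTIA,
    ])
    return np.asarray(z, dtype=float) + dt * z_dot


# Candidate hover control: each rotor carries half of the weight
u_hover = 0.5 * MASS * GRAVITY * np.ones(2)

# An arbitrary rest position (x0, y0) with theta = 0 and zero velocities
z_rest = np.array([1.0, 0.0, 2.0, 0.0, 0.0, 0.0])
z_next = euler_step(z_rest, u_hover)
print("state unchanged after one step:", np.allclose(z_next, z_rest))
```

The same helper can be used to probe questions 3 and 4 numerically, for example by setting `theta` to `np.pi / 2` in `z_rest` and checking which components of the state change.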

In [22]:
%matplotlib notebook
#Setting up & import packages

import numpy as np
import matplotlib.pyplot as plt

import quadrotor

In [23]:
# we can get its mass, half length (r), gravity constant
print(f'm is {quadrotor.MASS}')
print(f'r is {quadrotor.LENGTH}')
print(f'I is {quadrotor.INERTIA}')
print(f'g is {quadrotor.GRAVITY}')

# we can also get the integration step used in the simulation
print(f'dt is {quadrotor.DELTA_T}')

# we can get the size of its state and control vector
print(f'number of states {quadrotor.NUMBER_STATES} and number of controls {quadrotor.NUMBER_CONTROLS}')
print('the states are indexed as follows: x, vx, y, vy, theta, omega')

m is 0.6
r is 0.2
I is 0.15
g is 9.81
dt is 0.01
number of states 6 and number of controls 2
the states are indexed as follows: x, vx, y, vy, theta, omega


In [24]:
# we can simulate the robot but we need to provide a controller of the following form

# Verify Part1, question 2
def part1_q2_controller(state, i):
    """
        the prototype of a controller is as follows
        state is a column vector containing the state of the robot
        i is the index corresponding to the time step in the horizon (useful to index gains K for e.g.)
        
        this controller needs to return an array of size (2,)
    """
    # The controller to stay in the initial position
    balanced_force = 0.5 * quadrotor.MASS * quadrotor.GRAVITY
    controller = np.array([balanced_force, balanced_force])
    return controller


# we can now simulate for a given number of time steps - here we do 10 seconds
horizon_length = 1000
z0 = np.zeros([quadrotor.NUMBER_STATES,])
t, state, u = quadrotor.simulate(z0, part1_q2_controller, horizon_length, disturbance = False)

In [25]:
# we can plot the results
def plot_results(t, state, u):
    plt.figure(figsize=[9,6])

    plt.subplot(2,3,1)
    plt.plot(t, state[0,:])
    plt.legend(['X'])

    plt.subplot(2,3,2)
    plt.plot(t, state[2,:])
    plt.legend(['Y'])

    plt.subplot(2,3,3)
    plt.plot(t, state[4,:])
    plt.legend(["theta"])

    plt.subplot(2,3,4)
    plt.plot(t, state[1,:])
    plt.legend(['Vx'])
    plt.xlabel('Time [s]')

    plt.subplot(2,3,5)
    plt.plot(t, state[3,:])
    plt.legend(['Vy'])
    plt.xlabel('Time [s]')

    plt.subplot(2,3,6)
    plt.plot(t, state[5,:])
    plt.legend(['omega'])
    plt.xlabel('Time [s]')

    # we can also plot the control
    plt.figure()
    plt.plot(t[:-1], u.T)
    plt.legend(['u1', 'u2'])
    plt.xlabel('Time [s]')
    
plot_results(t, state, u)


In [26]:
# now we can also create an animation
quadrotor.animate_robot(state, u)

In [27]:
# Verify Part1 question 3, we can use the same controller 
# If we give the drone a non-zero velocity in the x-direction at the beginning, it keeps moving in that direction while theta stays at 0.
z0[1] = 0.25
t, state, u = quadrotor.simulate(z0, part1_q2_controller, horizon_length, disturbance = False)
plot_results(t, state, u)

quadrotor.animate_robot(state, u)


## Part 2 - LQR to stay in place
Now that we have $u^*$ capable of keeping the robot at rest, we can design a simple controller that ensures that the robot stays in place even when pushed around by random disturbances (e.g. due to the wind). Our task here will be to design an LQR controller that keeps the robot at a predefined position. Since the dynamics are not linear, we need to compute a linear approximation of them.
1. Linearize the dynamics at an arbitrary operating point $z^*$, $u^*$ and write the linearized system dynamics using the variables $\bar{z}_n = z_n - z^*$ and $\bar{u}_n = u_n - u^*$.
2. Write a function ```get_linearization(z, u)``` that returns the matrices A and B given a state $z$ and a control $u$ (use the constants defined in the ``quadrotor.py`` module).
3. Using the linearized dynamics, we can design an infinite-horizon LQR controller of the form $\bar{u} = K \bar{z}$ to stabilize the resting point. Write the equations of the controller in the original coordinates $u$ as a function of $z$.
4. Design an infinite-horizon LQR controller that stabilizes the origin $z=0$ and test it using the simulator below.
5. Explain your intended design in the report, including the cost function and found control law. In particular, verify that it can handle perturbations by calling the ```simulate``` function with ```disturbance = True``` (when setting disturbance to ``True``, the simulator will generate a random perturbation every 1 second). Simulate your controller for 10 seconds, plot the state evolution and show the animation (include the plots in your report).
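
One possible way to obtain a constant infinite-horizon gain for questions 3 and 4 is to iterate the Riccati recursion until it stops changing. The sketch below assumes the forward-Euler linearization at hover ($\theta = 0$, $u_1 = u_2 = mg/2$) with the constants printed earlier in the notebook; `solve_infinite_horizon_lqr` and `hover_controller` are hypothetical names, and the weights are only illustrative.

```python
import numpy as np

# Constants as printed by the notebook (dt, mass, half length, inertia, gravity)
dt, m, r, I, g = 0.01, 0.6, 0.2, 0.15, 9.81

# Discrete-time linearization at hover (theta = 0, u1 = u2 = m * g / 2)
A = np.eye(6)
A[0, 1] = A[2, 3] = A[4, 5] = dt
A[1, 4] = -dt * g
B = np.zeros((6, 2))
B[3, :] = dt / m
B[5, 0], B[5, 1] = dt * r / I, -dt * r / I


def solve_infinite_horizon_lqr(A, B, Q, R, tol=1e-8, max_iter=100000):
    """Iterate the Riccati recursion until P converges; returns (P, K) with u_bar = K z_bar."""
    P = Q.copy()
    for _ in range(max_iter):
        K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P_next = Q + A.T @ P @ A + A.T @ P @ B @ K
        if np.max(np.abs(P_next - P)) < tol * max(1.0, np.max(np.abs(P))):
            return P_next, K
        P = P_next
    raise RuntimeError("Riccati iteration did not converge")


Q = 1000.0 * np.eye(6)   # illustrative state weights
R = 10.0 * np.eye(2)     # illustrative control weights
P, K = solve_infinite_horizon_lqr(A, B, Q, R)

u_star = 0.5 * m * g * np.ones(2)


def hover_controller(state, i):
    # Controller in the original coordinates: u = u* + K (z - z*), with z* = 0
    return u_star + K @ np.asarray(state)


print("largest closed-loop eigenvalue magnitude:",
      np.max(np.abs(np.linalg.eigvals(A + B @ K))))
```

A constant gain does not depend on the time index `i`, so the same controller can be simulated for any horizon length.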

In [37]:
def get_linearization(z, u):
    theta = z[4]
    dt = quadrotor.DELTA_T
    drone_mass = quadrotor.MASS
    drone_length = quadrotor.LENGTH
    drone_inertia = quadrotor.INERTIA
    
    A_mat = np.array([[1,dt,0,0,0,0],
                      [0,1,0,0,dt * -((u[0] + u[1]) * np.cos(theta) / drone_mass),0],
                      [0,0,1,dt,0,0],
                      [0,0,0,1,dt * -((u[0] + u[1]) * np.sin(theta) / drone_mass),0],
                      [0,0,0,0,1,dt],
                      [0,0,0,0,0,1]])
    
    B_mat = np.array([[0,0],
                      [dt * (-np.sin(theta) / drone_mass), dt * (-np.sin(theta) / drone_mass)],
                      [0,0],
                      [dt * (np.cos(theta) / drone_mass), dt * (np.cos(theta) / drone_mass)],
                      [0,0],
                      [dt * drone_length / drone_inertia, -dt * drone_length / drone_inertia]])
    return A_mat, B_mat

#Solve the LQR Problem
def solve_LQR(A, B, Q, R, QN, N):
    list_of_P = []
    list_of_K = []

    list_of_P.append(QN)

    for i in range(N):
        K_i = -1 * np.linalg.inv(B.transpose().dot(list_of_P[i]).dot(B) + R).dot(B.transpose()).dot(list_of_P[i]).dot(A)
        P_i = Q + A.transpose().dot(list_of_P[i]).dot(A) + A.transpose().dot(list_of_P[i]).dot(B).dot(K_i)
        list_of_K.append(K_i)
        list_of_P.append(P_i)

    return list_of_P[::-1], list_of_K[::-1]
    

#Solve infinite LQR for question 3 & 4
def get_part2_controller(z_star, u_star):
    # First we need to design the quadratic cost matrix Q and R
    Q_mat = 1000. * np.eye(6)
    R_mat = 10 * np.eye(2)
    
    # Then get A and B by linearization
    A, B = get_linearization(z_star, u_star)
    print ("A mat is: ", A)
    print ("B mat is: ", B)
    
    horizon_length = 1000
    P_mats, K_mats = solve_LQR(A, B, Q_mat, R_mat, Q_mat, horizon_length)
#     P = P_mats[0]
#     K_mat = K_mats[0]
    
    return K_mats
    
z_original = np.zeros([quadrotor.NUMBER_STATES,])
u_original = 0.5 * quadrotor.MASS * quadrotor.GRAVITY * np.ones([2,])
part2_K_mats = get_part2_controller(z_original, u_original)

# we can simulate the robot but we need to provide a controller of the following form
def part2_controller(state, i):
    """
        the prototype of a controller is as follows
        state is a column vector containing the state of the robot
        i is the index corresponding to the time step in the horizon (useful to index gains K for e.g.)
        
        this controller needs to return an array of size (2,)
    """
    K = part2_K_mats[i]
    control_unit = K.dot(state - z_original) + u_original
    return control_unit



# we can now simulate for a given number of time steps - here we do 10 seconds
horizon_length = 1000
z0 = np.zeros([quadrotor.NUMBER_STATES,])
t, state, u = quadrotor.simulate(z0, part2_controller, horizon_length, disturbance = False)

A mat is:  [[ 1.      0.01    0.      0.      0.      0.    ]
 [ 0.      1.      0.      0.     -0.0981  0.    ]
 [ 0.      0.      1.      0.01    0.      0.    ]
 [ 0.      0.      0.      1.     -0.      0.    ]
 [ 0.      0.      0.      0.      1.      0.01  ]
 [ 0.      0.      0.      0.      0.      1.    ]]
B mat is:  [[ 0.          0.        ]
 [-0.         -0.        ]
 [ 0.          0.        ]
 [ 0.01666667  0.01666667]
 [ 0.          0.        ]
 [ 0.01333333 -0.01333333]]


In [38]:
plot_results(t, state, u)

quadrotor.animate_robot(state, u)


In [39]:
t, state, u = quadrotor.simulate(z0, part2_controller, horizon_length, disturbance = True)

plot_results(t, state, u)
quadrotor.animate_robot(state, u)


## Part 3 - following a trajectory using linearized dynamics
Now we want to follow a given trajectory leveraging a linearized version of the dynamics to design LQ controllers.
1. Assume that we want to follow a circle of radius 1 centered at (0,0) while keeping an orientation $\theta=0$, how does the linearization of the dynamics change along the desired trajectory? Why?
2. Design a tracking controller (using an LQ design with linear approximations) to follow this desired trajectory. Explain your design in the report. 
3. Test the tracking controller with the simulation (with and without the perturbations) and verify that you can indeed track the (x,y) trajectory very well. Are you able to also track $\theta$? (Explain) 
4. Analyze your results (including plots of the states, controls, etc). What benefits and issues do you see with this approach?
5. Is it possible to do the same thing while keeping a desired orientation of $\theta = \frac{\pi}{4}$? What might influence the results in this case?
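
For question 1, it can help to evaluate the linearization at several points of the reference and compare the matrices. The sketch below assumes the forward-Euler linearization of the model above, with hover thrust as the nominal control; `linearize` is a hypothetical stand-in for `get_linearization`, and the constants are the values printed earlier in the notebook.

```python
import numpy as np

dt, m, r, I, g = 0.01, 0.6, 0.2, 0.15, 9.81


def linearize(z, u):
    """Jacobians of the Euler-discretized dynamics at (z, u) (hypothetical helper)."""
    theta = z[4]
    thrust = u[0] + u[1]
    A = np.eye(6)
    A[0, 1] = A[2, 3] = A[4, 5] = dt
    A[1, 4] = -dt * thrust * np.cos(theta) / m
    A[3, 4] = -dt * thrust * np.sin(theta) / m
    B = np.zeros((6, 2))
    B[1, :] = -dt * np.sin(theta) / m
    B[3, :] = dt * np.cos(theta) / m
    B[5, 0], B[5, 1] = dt * r / I, -dt * r / I
    return A, B


u_hover = 0.5 * m * g * np.ones(2)

# Two points of the unit circle with theta = 0 (positions and velocities differ)
z_a = np.array([1.0, 0.0, 0.0, 0.63, 0.0, 0.0])
z_b = np.array([0.0, -0.63, 1.0, 0.0, 0.0, 0.0])
A_a, B_a = linearize(z_a, u_hover)
A_b, B_b = linearize(z_b, u_hover)
same = np.allclose(A_a, A_b) and np.allclose(B_a, B_b)

# Same position as z_a but tilted by pi / 4
z_c = z_a.copy()
z_c[4] = np.pi / 4
A_c, B_c = linearize(z_c, u_hover)
differs = not (np.allclose(A_a, A_c) and np.allclose(B_a, B_c))

print("identical along the circle at theta = 0:", same)
print("changes when theta = pi / 4:", differs)
```

In this model the Jacobians depend only on $\theta$ and on the total thrust $u_1 + u_2$, which is what the comparison probes.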

In [31]:
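time_length = 1001
deltaT = 0.01
def get_waypoints():
    # Reference: unit circle centred at (0, 0), traversed once over the horizon with theta = 0
    waypoints_on_circle = []
    
    period = (time_length - 1) * deltaT  # duration of one revolution [s]
    omega = 2 * np.pi / period           # angular rate along the circle [rad/s]
    for i in range(time_length):
        t = i * deltaT
        # position and velocity are kept consistent: v = d(position)/dt
        z_i = np.array([np.cos(omega * t), -omega * np.sin(omega * t), np.sin(omega * t), omega * np.cos(omega * t), 0, 0])
        waypoints_on_circle.append(z_i)
        
    return waypoints_on_circle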

# Get the waypoints on the circle
part3_trajectory_points = get_waypoints()

print("part3 points: ", part3_trajectory_points[0])

part3 points:  [ 1.         -0.          0.          0.62831853  0.          0.        ]


In [33]:
def solve_LQR_trajectory(A, B, Q, R, x_bar, N): 
    '''
    A, B, Q and R are the matrices defining the OC problem
    x_bar is the trajectory of desired states of size dim(x) x (N+1)
    N is the horizon length

    The function returns 1) a list of gains of length N and 2) a list of feedforward controls of length N
    '''
    K_gains = []  # K_i

    list_of_P = [Q]  # P_i and P_N

    k_feedforward = []  # k_i

    list_of_p = []  # p_i
    x_bar_N = x_bar[N]
    qN = - Q.dot(x_bar_N)
    list_of_p.append(qN)  # p_N

    for i in range(N):
        # A B order is N - 1, N - 2, ... , 0
        K_i = -1 * np.linalg.inv(R + B[i].transpose().dot(list_of_P[i]).dot(B[i])).dot(B[i].transpose()).dot(list_of_P[i]).dot(A[i])

        P_i = Q + A[i].transpose().dot(list_of_P[i]).dot(A[i]) + A[i].transpose().dot(list_of_P[i]).dot(B[i]).dot(K_i)

        k_i = -1 * np.linalg.inv(R + B[i].transpose().dot(list_of_P[i]).dot(B[i])).dot(B[i].transpose()).dot(list_of_p[i])

        x_bar_i = x_bar[N - i - 1]
        q_i = - Q.dot(x_bar_i)
        p_i = q_i + A[i].transpose().dot(list_of_p[i]) + A[i].transpose().dot(list_of_P[i]).dot(B[i]).dot(k_i)

        K_gains.append(K_i)
        list_of_P.append(P_i)
        k_feedforward.append(k_i)
        list_of_p.append(p_i)

    return K_gains[::-1], k_feedforward[::-1]

def get_part3_controller(z_star_waypoints, u_star):
    # First we need to design the quadratic cost matrix Q and R
    Q_mat = np.array([[1000,0,0,0,0,0],
                      [0,1000,0,0,0,0],
                      [0,0,1000,0,0,0],
                      [0,0,0,1000,0,0],
                      [0,0,0,0,1000,0],
                      [0,0,0,0,0,1000]])
    R_mat = np.array([[0.1, 0],
                      [0, 0.1]])
    
    # Then get A and B by linearization
    A_mats = []
    B_mats = []
    horizon_length = time_length - 1
    
    for i in range(horizon_length):
        A, B = get_linearization(z_star_waypoints[i], u_star)
        A_mats.append(A)
        B_mats.append(B)
    
    K_mats, k_feedforward_mats = solve_LQR_trajectory(A_mats[::-1], B_mats[::-1], Q_mat, R_mat, z_star_waypoints, horizon_length)
    
    return K_mats, k_feedforward_mats

z_original = np.zeros([quadrotor.NUMBER_STATES,])
balanced_force = 0.5 * quadrotor.MASS * quadrotor.GRAVITY
u_original = np.array([balanced_force, balanced_force])
part3_K_mats, part3_k_feedforward_mats = get_part3_controller(part3_trajectory_points, u_original)

def part3_controller(state, i):
    """
        the prototype of a controller is as follows
        state is a column vector containing the state of the robot
        i is the index corresponding to the time step in the horizon (useful to index gains K for e.g.)
        
        this controller needs to return an array of size (2,)
    """
    K = part3_K_mats[i]
    k = part3_k_feedforward_mats[i]
    control_unit = K @ (state) + u_original + k
    return control_unit


# we can now simulate for a given number of time steps - here we do 10 seconds
horizon_length = 1000
z0 = np.zeros([quadrotor.NUMBER_STATES,])
# z0[0] = 1
t, state, u = quadrotor.simulate(z0, part3_controller, horizon_length, disturbance = False)

In [34]:
plot_results(t, state, u)
quadrotor.animate_robot(state, u)


In [35]:
t, state, u = quadrotor.simulate(z0, part3_controller, horizon_length, disturbance = True)

plot_results(t, state, u)
quadrotor.animate_robot(state, u)



## Part 4 - iterative LQR
Now we would like to do more complicated motions with the robot, like a flip. In this case, we do not have a prescribed trajectory but we would like to compute a locally optimal trajectory while we optimize the controller. We will use the *iterative LQR* algorithm to solve this problem.
### Task 1 - reaching a vertical orientation
In the first task, we want the robot to reach a vertical orientation $\theta = \frac{\pi}{2}$ at the location $x=3$ and $y=3$ at time $t=5$ starting from $z_0=0$. During the rest of the motion, the robot should try and stay close to the origin. It should also try to keep its control $u$ close to the control needed to keep the robot at rest. We want to make sure the robot reaches the origin $z=0$ at the end of the movement. 
1. Find a time-varying cost function that promotes such a behavior (use only quadratic/linear terms for simplicity). Assume $T=10$ seconds.
2. Write a function ```compute_cost(z,u, horizon_length)``` that returns the cost of a trajectory z with control trajectory u (using the cost function you wrote in question 1).
3. Compute the quadratic approximation of your cost function along an arbitrary trajectory of states $z_n$ with control trajectory $u_n$ (this is not just your cost function!)
4. Write a function ```get_quadratic_approximation_cost(z, u, horizon_length)``` that returns the quadratic approximation (Hessian matrices and Jacobians) of the cost function when approximated along the trajectory z with control trajectory u.
5. Write the iLQR algorithm that solves the problem using the functions written above. DO NOT FORGET the line search step at each iteration. For the line search, start with $\alpha = 1.$ and decrease it by half when the cost does not improve (you can stop when $\alpha < 0.01$).
6. Test the algorithm using as initial guess $u$ such that the robot is at rest (using the results of Part 1.2). Analyze your results (probably you will need to "tune" your cost function), plot the initial and final state and control trajectories, show the animation. Use the simulation without perturbations for simplicity. 
7. What benefits and issues do you see with this approach?
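
As one possible reading of questions 1 to 4, the sketch below builds a time-varying quadratic cost that penalizes the distance to the vertical pose at $t=5$, the distance to the origin everywhere else, and the deviation of the control from the resting control. All weights, the zero target velocities, and the names `stage_target`, `stage_cost` and `stage_gradients` are assumptions made for illustration; they are not prescribed by the assignment.

```python
import numpy as np

N = 1000                      # horizon length: T = 10 s with dt = 0.01 s
m, g = 0.6, 9.81              # constants as printed by the notebook

z_mid = np.array([3.0, 0.0, 3.0, 0.0, np.pi / 2, 0.0])  # target at t = 5 s (zero velocities assumed)
z_end = np.zeros(6)                                     # origin, used at all other times
u_rest = 0.5 * m * g * np.ones(2)                       # control that keeps the robot at rest

Q_run = np.diag([1.0, 0.1, 1.0, 0.1, 1.0, 0.1])  # mild running penalty (illustrative)
Q_mid = 1000.0 * np.eye(6)                       # strong penalty at the waypoint (illustrative)
Q_end = 1000.0 * np.eye(6)                       # strong terminal penalty (illustrative)
R = 0.1 * np.eye(2)


def stage_target(n):
    """Return (Q_n, z_target_n) for time step n (hypothetical schedule)."""
    if n == N // 2:
        return Q_mid, z_mid
    if n == N:
        return Q_end, z_end
    return Q_run, z_end


def stage_cost(z, u, n):
    """Cost of step n; the terminal step n = N has no control term."""
    Q, z_target = stage_target(n)
    dz = z - z_target
    cost = 0.5 * dz @ Q @ dz
    if n < N:
        du = u - u_rest
        cost += 0.5 * du @ R @ du
    return cost


def stage_gradients(z, u, n):
    """Gradients of the step cost; the Hessians are Q_n and R."""
    Q, z_target = stage_target(n)
    return Q @ (z - z_target), R @ (u - u_rest)


q, r_grad = stage_gradients(np.zeros(6), u_rest, N // 2)
print("state gradient at rest, evaluated at the waypoint step:", q)
```

Because the cost is quadratic, its second-order approximation along a trajectory is exact: the Hessians are the weight matrices and the gradients are the weighted deviations from the targets.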
### Task 2 - doing a full flip
In the second task, we want the robot to do a full flip, trying to reach the upside-down state $x=1.5$, $y=3$ and $\theta = \pi$ at $t=5$ and upright state $x=3$, $y=0$ and $\theta = 2\pi$ at $T=10$.
8. Use iLQR (and a new cost function) to get the quadrotor to perform the task. Analyze your results. 
9. What benefits and issues do you see with this approach? Could you run the resulting controller on a real robot?
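
When thinking about question 9, it may be useful to compare an open-loop replay of the optimized controls with a controller that also uses the feedback gains produced by the backward pass. The sketch below shows the second option on toy data; `make_tracking_controller` is a hypothetical helper, and the reference trajectory and gains are placeholders rather than iLQR results.

```python
import numpy as np


def make_tracking_controller(z_ref, u_ref, K_gains):
    """Build controller(state, i) = u_ref[i] + K_i (state - z_ref[i]) (hypothetical helper)."""
    def controller(state, i):
        return u_ref[i] + K_gains[i] @ (np.asarray(state, dtype=float) - z_ref[i])
    return controller


# Toy data standing in for an optimized trajectory and its gains
N = 3
z_ref = [np.zeros(6) for _ in range(N + 1)]
u_ref = [2.943 * np.ones(2) for _ in range(N)]
K_gains = [-0.5 * np.ones((2, 6)) for _ in range(N)]

controller = make_tracking_controller(z_ref, u_ref, K_gains)

u_on = controller(np.zeros(6), 0)                               # on the reference: feedforward only
u_off = controller(np.array([0.1, 0.0, 0.0, 0.0, 0.0, 0.0]), 0)  # off the reference: feedback correction
print(u_on, u_off)
```

The returned function has the `(state, i)` signature that `quadrotor.simulate` expects from a controller in this notebook.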

In [44]:
print(horizon_length)

1000


In [45]:
#Task 1
deltaT = 0.01

# Get initial guess
def get_part4_trajectorypoints(horizon_length):
    # generate the inital points
    trajectory_points = []
    control_forces = []
    for i in range(horizon_length + 1):
        current_time = i * deltaT
        if (current_time < 5):
            z_i = [current_time / 5 * 3, 3/5, current_time / 5 * 3, 3/5, np.pi / 10 * current_time, np.pi / 10]
        else:
            z_i = [3 - (current_time - 5) / 5 * 3, -3/5, 3 - (current_time - 5) / 5 * 3, -3/5, np.pi / 2 - np.pi / 10 * (current_time - 5), -np.pi / 10]
        
        trajectory_points.append(z_i)
        
        if i != horizon_length:
            balanced_force = 0.5 * quadrotor.MASS * quadrotor.GRAVITY
            u_i = np.array([balanced_force, balanced_force])
            control_forces.append(u_i)
    
    return trajectory_points, control_forces

z_star_0, u_star_0 = get_part4_trajectorypoints(horizon_length)
    
# z_goal_1 = np.array([3, 0.6, 3, 0.6, np.pi / 2, np.pi / 10])
# z_goal_2 = np.zeros([quadrotor.NUMBER_STATES,])
# z0 = np.zeros([quadrotor.NUMBER_STATES,])

cost_Q_mat = np.array([[100,0,0,0,0,0],
                      [0,100,0,0,0,0],
                      [0,0,100,0,0,0],
                      [0,0,0,100,0,0],
                      [0,0,0,0,100,0],
                      [0,0,0,0,0,100]])
cost_R_mat = np.array([[0.1, 0],
                      [0, 0.1]])
cost_Q_N = 1000000 * np.eye(6)

def compute_cost(z,u, horizon_length):
    total_cost = 0
    
    for i in range(horizon_length + 1):
        # accept plain Python lists as well as numpy arrays
        z_i = np.asarray(z[i], dtype=float)
        if i != horizon_length:
            # running cost: state and control terms
            u_i = np.asarray(u[i], dtype=float)
            total_cost = total_cost + 0.5 * z_i.dot(cost_Q_mat).dot(z_i) + 0.5 * u_i.dot(cost_R_mat).dot(u_i)
        else:
            # terminal cost: state term only
            total_cost = total_cost + 0.5 * z_i.dot(cost_Q_N).dot(z_i)
        
    return total_cost


z_star_0[0] is:  [0.0, 0.6, 0.0, 0.6, 0.0, 0.3141592653589793]
z_star_0[500] is:  [3.0, -0.6, 3.0, -0.6, 1.5707963267948966, -0.3141592653589793]
z_star_0[1000] is:  [0.0, -0.6, 0.0, -0.6, 0.0, -0.3141592653589793]
u_star_0[0] is:  [2.943 2.943]
u_star_0[500] is:  [2.943 2.943]
u_star_0[999] is:  [2.943 2.943]


In [None]:
def get_quadratic_approximation_cost(z, u, horizon_length): # 0, 1000
    list_of_Q_mats = []
    list_of_R_mats = []
    list_of_q_mats = []
    list_of_r_mats = []
    
    for i in range(horizon_length):
        Q = cost_Q_mat
        R = cost_R_mat
        q = cost_Q_mat.dot(z[i])
        r = cost_R_mat.dot(u[i])
        
        list_of_Q_mats.append(Q)
        list_of_R_mats.append(R)
        list_of_q_mats.append(q)
        list_of_r_mats.append(r)
        
    QN = cost_Q_N
    qN = cost_Q_N.dot(z[-1])
    return list_of_Q_mats, list_of_R_mats, list_of_q_mats, list_of_r_mats, QN, qN


def solve_iLQR(z_0, z_goal, horizon_length, max_iterations=100): # 0, 1000
    # z_goal is kept in the signature but is not used: the targets live in the cost function
    # max_iterations is an added safeguard against an endless loop
    # initialization with the initial guess (copied as arrays so the globals stay untouched)
    z_optimal = [np.asarray(z, dtype=float) for z in z_star_0]
    u_optimal = [np.asarray(u, dtype=float) for u in u_star_0]
    cost_optimal = compute_cost(z_optimal, u_optimal, horizon_length)
    
    eps = 1e-3  # stop when the relative cost improvement falls below this value
    
    for iteration in range(max_iterations):
        # get Q,R,q,r by quadratic approximation of the cost
        Q_mats, R_mats, q_mats, r_mats, QN, qN = get_quadratic_approximation_cost(z_optimal, u_optimal, horizon_length)
        # reverse Q, R, q, r in N-1, N-2, ..., 0 order
        Q_mats.reverse()
        R_mats.reverse()
        q_mats.reverse()
        r_mats.reverse()
        
        # get A, B by linearization at every step 0, ..., N-1
        A_mats = []
        B_mats = []
        for i in range(horizon_length):
            A, B = get_linearization(z_optimal[i], u_optimal[i])
            A_mats.append(A)
            B_mats.append(B)
        # reverse A, B in N-1, N-2, ..., 0 order
        A_mats.reverse()
        B_mats.reverse()
        
        # Riccati recursion, using A, B, Q, R, q, r
        K_gains = []  # K_i
        list_of_P = [QN]  # P_i and P_N
        k_feedforward = []  # k_i
        list_of_p = [qN]  # p_i
        
        for i in range(horizon_length):
            A, B = A_mats[i], B_mats[i]
            Q, R = Q_mats[i], R_mats[i]
            q, r = q_mats[i], r_mats[i]
            
            P_prev = list_of_P[i] # P_{i+1}
            p_prev = list_of_p[i] # p_{i+1}
            
            inv_term = np.linalg.inv(R + B.transpose().dot(P_prev).dot(B))
            K_i = -1 * inv_term.dot(B.transpose()).dot(P_prev).dot(A)
            P_i = Q + A.transpose().dot(P_prev).dot(A) + A.transpose().dot(P_prev).dot(B).dot(K_i)
            k_i = -1 * inv_term.dot(B.transpose().dot(p_prev) + r)
            p_i = q + A.transpose().dot(p_prev) + A.transpose().dot(P_prev).dot(B).dot(k_i)
            
            K_gains.append(K_i)
            list_of_P.append(P_i)
            k_feedforward.append(k_i)
            list_of_p.append(p_i)
        
        # reorder K, k in 0, 1, ... , N - 1 order
        K_gains.reverse()
        k_feedforward.reverse()
        
        # Forward pass with line search: halve alpha until the cost improves
        alpha = 1.0
        improved = False
        while alpha > 0.01:
            current_u = []
            current_z = [np.asarray(z_0, dtype=float)]
            for i in range(horizon_length):
                u_i = u_optimal[i] + K_gains[i].dot(current_z[i] - z_optimal[i]) + alpha * k_feedforward[i]
                current_u.append(u_i)
                current_z.append(quadrotor.get_next_state(current_z[i], u_i))
            current_cost = compute_cost(current_z, current_u, horizon_length)
            if current_cost < cost_optimal:
                improved = True
                break
            alpha = alpha / 2
        
        # no step size improved the cost: keep the current solution
        if not improved:
            break
        
        relative_change = (cost_optimal - current_cost) / cost_optimal
        z_optimal, u_optimal, cost_optimal = current_z, current_u, current_cost
        if relative_change < eps:
            break
    
    return u_optimal


In [None]:
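# run iLQR starting from rest (z_goal is not used by solve_iLQR, so None is passed)
part4_q1_u_optimal = solve_iLQR(z0, None, horizon_length)

def part4_controller(state, i):
    # open-loop replay of the optimized control sequence
    return part4_q1_u_optimal[i]

# the task statement asks for the simulation without perturbations
t, state, u = quadrotor.simulate(z0, part4_controller, horizon_length, disturbance = False)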

plot_results(t, state, u)
quadrotor.animate_robot(state, u)


In [None]:
#Task 2