# Goal of the project

The goal of this project is to control a 2D quadrotor to perform acrobatic moves. There are 4 parts of the project, where you will build controllers of increasing complexity. The last part will lead to the implementation of the iterative LQR (iLQR) algorithm.

## Instructions
Answer all the questions in the 4 parts below. You will need to submit:
1. A report (pdf format only - every other format will be rejected) answering all the questions that do not request code. DO NOT include code in the report.
2. One (or several) Jupyter notebook(s) containing all the code used to answer the questions. The notebook(s) should be runnable as is.

## 2D quadrotor

The quadrotor is depicted in the following figure <img src='quadrotor.png' width="300">
The quadrotor model is written as
$$\begin{align} 
\dot{x} &= v_x\\
m \dot{v}_x &= - (u_1 + u_2) \sin \theta \\ 
\dot{y} &= v_y\\
m \dot{v}_y &= (u_1 + u_2) \cos \theta  - m g\\
\dot{\theta} &= \omega\\
I \dot{\omega} &= r (u_1 - u_2) \end{align}$$
where $x$ is the horizontal and $y$ the vertical positions of the quadrotor and $\theta$ is its orientation with respect to the horizontal plane. $v_x$ and $v_y$ are the linear velocities and $\omega$ is the angular velocity of the robot. $u_1$ and $u_2$ are the forces produced by the rotors (our control inputs). $m$ is the quadrotor mass, $I$ its moment of inertia (a scalar), $r$ is the distance from the center of the robot frame to the propellers and $g$ is the gravity constant. To denote the entire state, we will write $z = [x, v_x, y, v_y, \theta, \omega]^T$ - we will also write $u = [u_1, u_2]^T$.

The module ```quadrotor.py``` defines useful constants (mass, length, gravity, etc) and functions to simulate and animate the quadrotor as shown below.

## Part 1 - Setting up
1. Discretize the system dynamics using the method seen in class - write the time discretization step as $\Delta t$ (use symbols not numbers for the mass, etc)
2. Assume that the robot starts at an arbitrary position $x(0) = x_0$, $y(0) = y_0$ and $\theta(0) = 0$ with 0 velocities. Compute $u_1^*$ and $u_2^*$ such that the robot stays at this position forever after (you may test your answer using the simulation below).
3. Analyzing the system dynamics, is it possible to move in the x direction while keeping $\theta = 0$? Explain why.
4. Analyzing the system dynamics, is it possible to have the system at rest with $\theta = \frac{\pi}{2}$ (i.e. have the quadrotor in a vertical position)? Explain why.

## Part 2 - LQR to stay in place
Now that we have $u^*$ capable of keeping the robot at rest, we can design a simple controller that ensures that the robot stays in place even when pushed around by random disturbances (e.g. due to the wind). Our task here will be to design a LQR controller that keeps the robot at a predefined position. Since the dynamics is not linear, we need to compute a linear approximation of it.
1. Linearize the dynamics at an arbitrary operating point $z^*$, $u^*$ and write the linearized system dynamics using the variables $\bar{z}_n = z_n - z^*$ and $\bar{u}_n = u_n - u^*$.
2. Write a function ```get_linearization(z, u)``` that returns the matrices A and B given a state $z$ and a control $u$ (use the constants defined in the ``quadrotor.py`` module).
3. Using the linearized dynamics, we can design an infinite horizon LQR controller of the form, $\hat{u} = K \bar{z}$ to stabilize the resting point. Write the equations of the controller in the original coordinates $u$ as a function of $z$.
4. Design an infinite-horizon LQR controller that stabilizes the origin $z=0$ and test it using the simulator below.
5. Explain your intended design in the report, including the cost function and found control law. In particular, verify that it can handle perturbations by calling the ```simulate``` function with ```disturbance = True``` (when setting disturbance to ``True``, the simulator will generate a random perturbation every 1 second). Simulate your controller for 10 seconds, plot the state evolution and show the animation (include the plots in your report).

## Part 3 - following a trajectory using linearized dynamics
Now we want to follow a given trajectory leveraging a linearized version of the dynamics to design LQ controllers.
1. Assume that we want to follow a circle of radius 1 centered at (0,0) while keeping an orientation $\theta=0$, how does the linearization of the dynamics change along the desired trajectory? Why?
2. Design a tracking controller (using an LQ design with linear approximations) to follow this desired trajectory. Explain your design in the report. 
3. Test the tracking controller with the simulation (with and without the perturbations) and verify that you can indeed track the (x,y) trajectory very well. Are you able to also track $\theta$? (Explain) 
4. Analyze your results (including plots of the states, controls, etc). What benefits and issues do you see with this approach?
5. Is it possible to do the same thing while keeping a desired orientation of $\theta = \frac{\pi}{4}$? What might influence the results in this case?

## Part 4 - iterative LQR
Now we would like to do more complicated motions with the robot, like a flip. In this case, we do not have a prescribed trajectory but we would like to compute a locally optimal trajectory while we optimize the controller. We will use the *iterative LQR* algorithm to solve this problem.
### Task 1 - reaching a vertical orientation
In the first task, we want the robot to reach a vertical orientation $\theta = \frac{\pi}{2}$ at the location $x=3$ and $y=3$ at time $t=5$ starting from $z_0=0$. During the rest of the motion, the robot should try and stay close to the origin. It should also try to keep its control $u$ close to the control needed to keep the robot at rest. We want to make sure the robot reaches the origin $z=0$ at the end of the movement. 
1. Find a time-varying cost function that promotes such a behavior (use only quadratic/linear terms for simplicity). Assume $T=10$ seconds.
2. Write a function ```compute_cost(z,u, horizon_length)``` that returns the cost of a trajectory z with control trajectory u (using the cost function you wrote in question 1).
3. Compute the quadratic approximation of your cost function along an arbitrary trajectory of states $z_n$ with control trajectory $u_n$ (this is not just your cost function!)
4. Write a function ```get_quadratic_approximation_cost(z, u, horizon_length)``` that returns the quadratic approximation (Hessian matrices and Jacobians) of the cost function when approximated along the trajectory z with control trajectory u.
5. Write the iLQR algorithm that solves the problem using the functions written above. DO NOT FORGET the line search step at each iteration. For the line search, start with $\alpha = 1.$ and decrease it by half when the cost does not improve (you can stop when $\alpha < 0.01$).
6. Test the algorithm using as initial guess $u$ such that the robot is at rest (using the results of Part 1.2). Analyze your results (probably you will need to "tune" your cost function), plot the initial and final state and control trajectories, show the animation. Use the simulation without perturbations for simplicity. 
7. What benefits and issues do you see with this approach?
### Task 2 - doing a full flip
In the second task, we want the robot to do a full flip, trying to reach the upside-down state $x=1.5$, $y=3$ and $\theta = \pi$ at $t=5$ and upright state $x=3$, $y=0$ and $\theta = 2\pi$ at $T=10$.
8. Use iLQR (and a new cost function) to get the quadrotor to perform the task. Analyze your results. 
9. What benefits and issues do you see with this approach? Could you run the resulting controller on a real robot?

In [1]:
%matplotlib inline 

import numpy as np
import matplotlib.pyplot as plt

import quadrotor

In [2]:
# we can get its mass, half length (r), gravity constant
print(f'm is {quadrotor.MASS}')
print(f'r is {quadrotor.LENGTH}')
print(f'I is {quadrotor.INERTIA}')
print(f'g is {quadrotor.GRAVITY}')

# we can also get the integration step used in the simulation
print(f'dt is {quadrotor.DELTA_T}')

# we can get the size of its state and control vector
print(f'number of states {quadrotor.NUMBER_STATES} and number of controls {quadrotor.NUMBER_CONTROLS}')
print('the states are indexed as follows: x, vx, y, vy, theta, omega')





m is 0.6
r is 0.2
I is 0.15
g is 9.81
dt is 0.01
number of states 6 and number of controls 2
the states are indexed as follows: x, vx, y, vy, theta, omega


In [3]:
a=1
Q2=np.array([[0.1,0,0,0,0,0],[0,0.1,0,0,0,0],[0,0,0.1,0,0,0],[0,0,0,0.1,0,0],[0,0,0,0,0.1,0],[0,0,0,0,0,0.1]])
Q1=np.array([[1000,0,0,0,0,0],[0,0.1,0,0,0,0],[0,0,1000,0,0,0],[0,0,0,0.1,0,0],[0,0,0,0,1000,0],[0,0,0,0,0,0.1]])
Q=[]
for i in range(1001):
    if (i==500):
        Q.append(Q1)
    else:
        Q.append(Q2)

R=0.01*np.eye(2)
R1=0.04*np.eye(2)

T = np.arange(0.,10.01, quadrotor.DELTA_T)

u_bar = (quadrotor.MASS * quadrotor.GRAVITY / 2) * np.ones([2, 1000])
u_bar[:,500]=[0,0] #column 500 set to 0
U = (quadrotor.MASS * quadrotor.GRAVITY / 2) * np.ones([2, 1000])
u=(quadrotor.MASS * quadrotor.GRAVITY / 2) * np.ones([2, 1000])
    


Z = np.zeros([6,1001])  #this is X
z_bar=np.zeros([6,1001])
z_bar[:,500]=[3,0,3,0,np.pi/2,0] #the position that want to go

In [4]:
def get_linearization(z, u):
        A = np.array([[1, quadrotor.DELTA_T, 0 ,0 ,0, 0],[0, 1, 0, 0, -((u[0]+u[1]) * np.cos(z[4])) * quadrotor.DELTA_T / quadrotor.MASS, 0], [0, 0, 1, quadrotor.DELTA_T, 0, 0], [0, 0, 0, 1,-((u[0]+u[1]) * np.sin(z[4])) * quadrotor.DELTA_T / quadrotor.MASS, 0], [0, 0, 0, 0, 1, quadrotor.DELTA_T], [0, 0, 0, 0, 0, 1]])
        B = np.array([[0,0],[-np.sin(z[4])*quadrotor.DELTA_T/quadrotor.MASS,-np.sin(z[4])*quadrotor.DELTA_T/quadrotor.MASS],[0,0],[np.cos(z[4])*quadrotor.DELTA_T/quadrotor.MASS, np.cos(z[4])*quadrotor.DELTA_T/quadrotor.MASS],[0,0],[quadrotor.LENGTH*quadrotor.DELTA_T/quadrotor.INERTIA, -quadrotor.LENGTH*quadrotor.DELTA_T/quadrotor.INERTIA]])
        return A, B

# def solve_ricatti_equations(A,B,Q,R,horizon_length):
    
#         P = [] #will contain the list of Ps from N to 0
#         K = [] #will contain the list of Ks from N-1 to 0
    
#         n=horizon_length
#         current_p=Q
#         P.append(current_p)
#         while n>0:
#             current_K = -1*np.linalg.inv(R+np.dot(B.T, current_p).dot(B)).dot(np.dot(B.T,current_p).dot(A))
#             K.append(current_K)
#             current_p = Q+np.dot(A.T,current_p).dot(A)+np.dot(np.dot(np.dot(A.T,current_p),B),current_K)
#             P.append(current_p)
#             n=n-1
    
#         return P[::-1],K[::-1]
def solve_ricatti_equations(Z,U,Q,R,N,q,r):
    
    
    K_gains = []
    k_feedforward = []
    P = Q[-1]
    p = q[-1].T
       
    for i in reversed(range(0,horizon_length)):
        A, B = get_linearization(Z[:,i], U[:,i])
        K = -np.linalg.inv(B.T @ P @ B + R) @ B.T @ P @ A
        K_gains.append(K)
        k = -np.linalg.inv(B.T @ P @ B + R) @ (B.T @ p + r[i].T)
        k_feedforward.append(k)
        p = q[i].T + A.T @ p + A.T @ P @ B @ k
        P = Q[i] + A.T @ P @ A + A.T @ P @ B @ K
       
    K_gains = K_gains[::-1]
    k_feedforward = k_feedforward[::-1]
    return K_gains, k_feedforward

In [5]:
# # we can simulate the robot but we need to provide a controller of the following form
# def dummy_controller(state, i):
#     """
#         the prototype of a controller is as follows
#         state is a column vector containing the state of the robot
#         i is the index corresponding to the time step in the horizon (useful to index gains K for e.g.)
        
#         this controller needs to return an array of size (2,)
#     """
#     # here we do nothing and just return some non-zero control
  
#     # A, B = get_linearization(Z, U)
#     A, B = get_linearization(Z[:,i], U)
#     A=np.array(A,dtype='float')
#     B=np.array(B,dtype='float')

#     P, K = solve_ricatti_equations(A,B,Q,R,1000)  #P list 1001, K list 1000
#     state = state.reshape(6,1)
#     # u = U + K[i] @ (state - Z)
#     u = U + K[i] @ (state - Z[:,i].reshape(6,1))#
#     u = u.reshape(2)
#     return u

# # we can now simulate for a given number of time steps - here we do 10 seconds
# horizon_length = 1000 #1000 steps, 0.01s for per step
# z0 = np.zeros([quadrotor.NUMBER_STATES,]) #array([0., 0., 0., 0., 0., 0.])
# t, state, u = quadrotor.simulate(z0, dummy_controller, horizon_length, disturbance = False)

def controller(Z,U,q,r,horizon_length,Q,R):
    """
        the prototype of a controller is as follows
        state is a column vector containing the state of the robot
        i is the index corresponding to the time step in the horizon (useful to index gains K for e.g.)
        
        this controller needs to return an array of size (2,)
    """
    # here we do nothing and just return some non-zero control
    state2=np.empty([6, horizon_length+1])
    z0 = np.zeros([quadrotor.NUMBER_STATES,])
    state2[:,0] = z0
    u2 = np.zeros([2, 1000])
    K,k = solve_ricatti_equations(Z,U,Q,R,horizon_length,q,r)
    for i in range(horizon_length):
        u2[:,i] = U[:,i] + K[i] @ (state2[:,i] - Z[:,i])+ a*k[i]
        state2[:,i+1] = quadrotor.get_next_state(state2[:,i], u2[:,i])
    
    state=state2
    u=u2
    return state,u

In [6]:
def compute_cost(Z,U,z_bar,u_bar, horizon_length):
    J=((Z[:,horizon_length]-z_bar[:,horizon_length]).T)@ Q[horizon_length] @(Z[:,horizon_length]-z_bar[:,horizon_length])
    for i in range(horizon_length):
        J=J+((Z[:,i]-z_bar[:,i]).T)@ Q[i] @(Z[:,i]-z_bar[:,i])+((U[:,i]-u_bar[:,i]).T @ R @ (U[:,i]-u_bar[:,i]))
    
    return J
    
def get_quadratic_approximation_cost(Z,U, horizon_length,z_bar,u_bar):
    q=[]
    r=[]
    J=compute_cost(Z,U,z_bar,u_bar, horizon_length)
    #J=J+0.5*((z[:,horizon_length]-Z[:,horizon_length]).T)@ Q[horizon_length] @(z[:,horizon_length]-Z[:,horizon_length])+2*(z[:,horizon_length].T-z_bar[:,horizon_length].T)@Q[horizon_length]@(z[:,horizon_length]-Z[:,horizon_length])
    for i in range(horizon_length):
        #J=J+0.5*((u[:,i]-U[:,i]).T)@ R @(u[:,i]-U[:,i])+0.5*((z[:,i]-Z[:,i]).T)@ Q[i] @(z[:,i]-Z[:,i])+2*((z[:,i]-z_bar[:,i]).T)@Q[i]@(z[:,i]-Z[:,i])+2*((u[:,i]-u_bar[:,i]).T)@R@(u[:,i]-U[:,i])    
        q1=2*(Z[:,i]-z_bar[:,i]).T@Q[i]
        q.append(q1)
        r1=2*(U[:,i]-u_bar[:,i]).T@R
        r.append(r1)
    
    return J,q,r

In [None]:
horizon_length = 1000
for j in range(0,2):
    J,q,r=get_quadratic_approximation_cost(Z,U, horizon_length,z_bar,u_bar)
    state,u = controller(Z,U,q,r,horizon_length,Q,R)

    print(J)
    J1=J
    U = u
    Z = np.zeros([quadrotor.NUMBER_STATES, horizon_length+1])
    for i in range(horizon_length):
        Z[:,i+1] = quadrotor.get_next_state(Z[:,i], U[:,i])
    j=j+1

while a>=0.01 :
    J,q,r=get_quadratic_approximation_cost(Z,U, horizon_length,z_bar,u_bar)
    state,u = controller(Z,U,q,r,horizon_length,Q,R)
    if (J>=J1):
        a=a/2
        
    
    U = u
    Z = np.zeros([quadrotor.NUMBER_STATES, horizon_length+1])
    for i in range(horizon_length):
        Z[:,i+1] = quadrotor.get_next_state(Z[:,i], U[:,i])
    J1=J
    print(J)
    print(a)

# run for 8 mins

20467.574325252343
18086465.016590465
823461508.1607354
0.5
3769722456.235524
0.25
173821691.76558027
0.25
269151876.2975881
0.125
196185945.35526678
0.125
116093283.4000904
0.125
70455848.80571565
0.125
42504480.919984825
0.125
27159924.736992482
0.125
17338459.10393974
0.125
10634259.449281784
0.125
6704017.589452875
0.125
4425642.130424856
0.125
2902581.2728577284
0.125
2011635.0362621038
0.125
1452191.0851320152
0.125
1109554.8795478495
0.125
892631.8204847166
0.125
753633.6796107248
0.125
655805.1778040319
0.125
742142.8761141081
0.0625
629523.7849523491
0.0625
592772.2179399546
0.0625
563043.6142023309
0.0625
538498.1114835424
0.0625
517579.5588361445
0.0625
498834.241366524
0.0625
481328.29642053676
0.0625
465170.33348976675
0.0625
450297.1385371335
0.0625
437493.56867828884
0.0625
426074.50199236855
0.0625
414056.98045652383
0.0625
401028.90294552315
0.0625
390696.8836541487
0.0625
378494.4275150632
0.0625
370107.1852287692
0.0625
360142.3334076242
0.0625
352124.92502110737
0.0

5011.488575911319
0.03125
4989.3340998695885
0.03125
4967.037750544026
0.03125
4944.592412865527
0.03125
4921.99231013745
0.03125
4899.2329523572325
0.03125
4876.311156444487
0.03125
4853.225120308759
0.03125
4829.974530353383
0.03125
4806.5606783434505
0.03125
4782.986559657008
0.03125
4759.256922155465
0.03125
4735.378234696675
0.03125
4711.358547842612
0.03125
4687.207227070174
0.03125
4662.934550262073
0.03125
4638.551174854505
0.03125
4614.067493397299
0.03125
4589.492906909087
0.03125
4564.83505127112
0.03125
4540.099012035348
0.03125
4515.286557698272
0.03125
4490.395412009302
0.03125
4465.418574091226
0.03125
4440.343682914067
0.03125
4415.152411449089
0.03125
4389.819866503947
0.03125
4364.313963093416
0.03125
4338.594737108781
0.03125
4312.613556724245
0.03125
4286.312191245705
0.03125
4259.62169621982
0.03125
4232.461076655776
0.03125
4204.735698670269
0.03125
4176.335438641112
0.03125
4147.132596976908
0.03125
4116.979676546585
0.03125
4085.7072603264796
0.03125
4053.122462

1719.5154236743676
0.03125
1718.0917696085883
0.03125
1716.6676651170196
0.03125
1715.2430964161217
0.03125
1713.8180496601863
0.03125
1712.3925109394022
0.03125
1710.9664662779444
0.03125
1709.5399016320514
0.03125
1708.1128028881037
0.03125
1706.685155860725
0.03125
1705.256946290861
0.03125
1703.8281598439023
0.03125
1702.3987821077353
0.03125
1700.9687985908906
0.03125
1699.5381947206029
0.03125
1698.1069558409342
0.03125
1696.6750672108521
0.03125
1695.2425140023417
0.03125
1693.809281298481
0.03125
1692.3753540915586
0.03125
1690.940717281123
0.03125
1689.5053556720977
0.03125
1688.0692539728448
0.03125
1686.6323967932237
0.03125
1685.194768642689
0.03125
1683.7563539283085
0.03125
1682.3171369528418
0.03125
1680.8771019127696
0.03125
1679.4362328963214
0.03125
1677.9945138815076
0.03125
1676.5519287341112
0.03125
1675.1084612057011
0.03125
1673.664094931609
0.03125
1672.218813428895
0.03125
1670.7726000943228
0.03125
1669.3254382022599
0.03125
1667.8773109026492
0.03125
1666.428

1008.1241271349319
0.03125
1007.5211465029722
0.03125
1006.9201749986302
0.03125
1006.3209146567225
0.03125
1005.72311510636
0.03125
1005.1265658552077
0.03125
1004.5310898361265
0.03125
1003.9365380082035
0.03125
1003.3427848384912
0.03125
1002.7497245196332
0.03125
1002.1572678024401
0.03125
1001.5653393425341
0.03125
1000.9738754767804
0.03125
1000.382822359248
0.03125
999.7921343978795
0.03125
999.2017729428909
0.03125
998.611705185878
0.03125
998.0219032353418
0.03125
997.4323433400995
0.03125
996.8430052365179
0.03125
996.2538715996377
0.03125
995.664927581389
0.03125
995.0761604218649
0.03125
994.4875591219599
0.03125
993.8991141674966
0.03125
993.3108172966262
0.03125
992.7226613035847
0.03125
992.134639873047
0.03125
991.5467474401745
0.03125
990.9589790723493
0.03125
990.3713303691108
0.03125
989.7837973774672
0.03125
989.196376520166
0.03125
988.6090645348866
0.03125
988.0218584226435
0.03125
987.4347554039902
0.03125
986.8477528818071
0.03125
986.260848409644
0.03125
985.67

In [None]:
t = np.linspace(0,10,1001)
# we can plot the results
plt.figure(figsize=[9,6])

plt.subplot(2,3,1)
plt.plot(t, state[0,:])
plt.legend(['X'])

plt.subplot(2,3,2)
plt.plot(t, state[2,:])
plt.legend(['Y'])

plt.subplot(2,3,3)
plt.plot(t, state[4,:])
plt.legend(["theta"])

plt.subplot(2,3,4)
plt.plot(t, state[1,:])
plt.legend(['Vx'])
plt.xlabel('Time [s]')

plt.subplot(2,3,5)
plt.plot(t, state[3,:])
plt.legend(['Vy'])
plt.xlabel('Time [s]')

plt.subplot(2,3,6)
plt.plot(t, state[5,:])
plt.legend(['omega'])
plt.xlabel('Time [s]')

# we can also plot the control
plt.figure()
plt.plot(t[:-1], u.T)
plt.legend(['u1', 'u2'])
plt.xlabel('Time [s]')

In [None]:
# now we can also create an animation
quadrotor.animate_robot(state, u)