# Soft Landing with an Aerial Manipulator

In this exercise, your task is to make an Aerial Manipulator land as softly and as energy-efficiently as possible. The tricky part: you have no information about your height!

## Gymnasium (formerly OpenAI Gym)

The environment is adapted from the ["Bipedal Walker"](https://www.gymlibrary.dev/environments/box2d/bipedal_walker/) example, one of the standard Gym environments.
Gym environments follow the standard systems approach: you supply an action as input, and the system returns the resulting observations.
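
The loop below is a minimal sketch of that action-in, observation-out cycle. `ToyEnv` is a hypothetical stand-in written only for illustration (it is not part of Gymnasium or `dual_arm_am`); it mimics the Gymnasium call signatures, where `reset()` returns `(observation, info)` and `step()` returns `(observation, reward, terminated, truncated, info)`. The dynamics and the reward are made up.

```python
class ToyEnv:
    """Hypothetical 1D stand-in that mimics the Gymnasium reset/step signatures."""

    def __init__(self, max_steps=5):
        self.max_steps = max_steps
        self.steps = 0
        self.state = 0.0

    def reset(self):
        self.steps = 0
        self.state = 0.0
        return [self.state], {}  # (observation, info)

    def step(self, action):
        self.state += action   # the action drives the state
        self.steps += 1
        reward = -abs(action)  # made-up cost on control effort
        terminated = False     # this toy system has no terminal state
        truncated = self.steps >= self.max_steps  # time limit reached
        return [self.state], reward, terminated, truncated, {}


def run_episode(env, policy):
    """Run one episode and return the cumulative reward."""
    obs, info = env.reset()
    total_reward = 0.0
    while True:
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total_reward += reward
        if terminated or truncated:
            return total_reward


# A constant policy: always command 0.5, whatever the observation is
total = run_episode(ToyEnv(max_steps=5), policy=lambda obs: 0.5)
```

The same `run_episode` shape should carry over to the real environment, with the constant policy replaced by your own control law.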

## Action Space

The Aerial Manipulator (AM) is a 2D bi-rotor with two articulated arms, each with two revolute joints. The action space consists of the command to each of the six motors, in the following order:

- Rotor 1
- Rotor 2
- Arm 1, Joint 1  
- Arm 1, Joint 2 
- Arm 2, Joint 1  
- Arm 2, Joint 2
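
To avoid mixing up indices, the order above can be captured in named constants. This is only a sketch: `Act` and `make_action` are hypothetical helpers introduced here, not part of the environment, and no action limits are assumed because none are stated.

```python
from enum import IntEnum


class Act(IntEnum):
    """Hypothetical index names that follow the order of the list above."""
    ROTOR_1 = 0
    ROTOR_2 = 1
    ARM1_JOINT1 = 2
    ARM1_JOINT2 = 3
    ARM2_JOINT1 = 4
    ARM2_JOINT2 = 5


def make_action(**commands):
    """Build a 6-element action list; unspecified motors get 0.0."""
    action = [0.0] * len(Act)
    for name, value in commands.items():
        action[Act[name]] = float(value)  # raises KeyError for an unknown motor name
    return action


a = make_action(ROTOR_1=0.5, ROTOR_2=0.5, ARM1_JOINT1=2.0, ARM2_JOINT1=-2.0)
```

If the environment expects a NumPy array, the list can be converted with `np.asarray(a)` before it is passed on.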

## Observation Space

The observation space consists of measurements that are commonly available from proprioceptive sensors on an AM: angular positions and speeds, linear velocities, and contact information. The order is as follows:
- Base Angular Position
- Base Angular Speed
- Base Linear Velocity (x)
- Base Linear Velocity (y)
- Arm 1, Joint 1 Position
- Arm 1, Joint 1 Speed
- Arm 1, Joint 2 Position
- Arm 1, Joint 2 Speed
- Arm 1, EE in contact (Binary)
- Arm 2, Joint 1 Position
- Arm 2, Joint 1 Speed
- Arm 2, Joint 2 Position
- Arm 2, Joint 2 Speed
- Arm 2, EE in contact (Binary)
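
A named view of the observation vector can make a controller easier to read. The sketch below assumes the 14-element layout that the `control_law` cell indexes (base angle at index 0, base angular speed at index 1, contact flags at indices 8 and 13); `Obs` and `unpack` are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class Obs(NamedTuple):
    """Hypothetical named view of the assumed 14-element observation vector."""
    theta: float       # base angular position
    theta_dot: float   # base angular speed
    x_dot: float       # base linear velocity (x)
    y_dot: float       # base linear velocity (y)
    a1_j1: float       # arm 1, joint 1 position
    a1_j1_dot: float   # arm 1, joint 1 speed
    a1_j2: float       # arm 1, joint 2 position
    a1_j2_dot: float   # arm 1, joint 2 speed
    a1_contact: float  # arm 1, binary end-effector contact flag
    a2_j1: float       # arm 2, joint 1 position
    a2_j1_dot: float   # arm 2, joint 1 speed
    a2_j2: float       # arm 2, joint 2 position
    a2_j2_dot: float   # arm 2, joint 2 speed
    a2_contact: float  # arm 2, binary end-effector contact flag


def unpack(s):
    """Convert a raw observation sequence into an Obs, checking its length."""
    if len(s) != len(Obs._fields):
        raise ValueError(f"expected {len(Obs._fields)} values, got {len(s)}")
    return Obs(*(float(x) for x in s))


obs = unpack([0.1, 0.0, 0.0, -0.3, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0])
# Without a height measurement, the contact flags are one way to notice touchdown
touchdown = bool(obs.a1_contact) or bool(obs.a2_contact)
```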

In [19]:
# In this cell we import the required libraries and set up the environment.
# You don't need to touch this.

import gymnasium as gym
import numpy as np
from dual_arm_am import DualArmAM

env = DualArmAM(render_mode='human')
env._max_episode_steps = 600
env.reset()
steps = 0
total_reward = 0

In [17]:
# This is where you do your work. By changing the action (a) you can command the AM

def control_law(s):
    ###################### Simple Heuristic: ###########################
    ####### The bi-rotor slowly descends while a constant torque  ######
    ####### is applied to the joint motors                        ######
    ####################################################################
    
    theta = s[0]
    thetadot = s[1]
    xdot = s[2]
    ydot = s[3]
    j0 = s[4]
    j0dot = s[5]
    j1 = s[6]
    j1dot = s[7]
    j2 = s[9]
    j2dot = s[10]
    j3 = s[11]
    j3dot = s[12]

    # Allocate the action vector: 2 rotors + 4 joint motors
    a = np.zeros(6)

    # First, the descent control of the bi-rotor:
    # simple PD control on attitude while maintaining a thrust
    # that is a bit less than gravity
    k_att = 1
    d_att = 0.1
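    # Differential (antisymmetric) rotor command from the attitude error,
    # so that the damping term acts on the attitude and not on the total thrust
    u_att = k_att * theta + d_att * thetadot
    a[0] = -u_att
    a[1] = u_att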

    # Add thrust to have total thrust that (almost) matches gravity
    addon = np.cos(theta) * 0.50
    a[0] += addon
    a[1] += addon

    a[2] = 2.0
    a[3] = 0.005
    a[4] = -2.0
    a[5] = -0.005
    
    return a
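
A common way to structure rotor commands like the ones above is a mixer: a common thrust term shared by both rotors plus a differential term derived from the attitude error. The helper below is a sketch under that reading. `mix_rotors`, its default gains, and the sign convention (a positive `u` raises rotor 2 and lowers rotor 1) are assumptions for illustration, not the environment's documented interface.

```python
import math


def mix_rotors(theta, theta_dot, base_thrust=0.5, k_att=1.0, d_att=0.1):
    """Return (rotor_1, rotor_2) commands from a PD attitude term and a common thrust."""
    u = k_att * theta + d_att * theta_dot   # PD term on the base attitude
    common = math.cos(theta) * base_thrust  # common thrust, scaled by cos(theta) as in the cell above
    return common - u, common + u


r1, r2 = mix_rotors(theta=0.2, theta_dot=0.0)
```

With this structure, the difference `r2 - r1` depends only on the attitude term and the sum `r1 + r2` only on the common thrust, which makes the two effects easier to tune separately.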

In [22]:
# This runs one episode of the environment and outputs the cumulative reward

# Placeholder for the first observation, since the setup cell discards
# the one returned by env.reset()
s = np.zeros(env.observation_space.shape)
while True:
    # Keep the action in a variable so it can be printed below
    a = control_law(s)
    s, r, terminated, truncated, info = env.step(a)
    total_reward += r
    if steps % 20 == 0 or terminated or truncated:
        print("\naction " + str([f"{x:+0.2f}" for x in a]))
        print(f"step {steps} total_reward {total_reward:+0.2f}")
        print("base " + str([f"{x:+0.2f}" for x in s[0:4]]))
        print("arm1 " + str([f"{x:+0.2f}" for x in s[4:9]]))
        print("arm2 " + str([f"{x:+0.2f}" for x in s[9:14]]))
    steps += 1

    if terminated or truncated:
        break

env.close()
