# Tutorial
This is the tutorial for "Tadano Crane Slewing Operation Optimization Challenge".   
In this tutorial, we will see how the simulator works, how to design and implement a control algorithm, and how to evaluate its accuracy.

## Table of Contents
1. Pre-analysis
1. Algorithm for Controlling the Crane Simulator
1. Controlling the Crane Simulator
1. Making the File for Submission
1. For Further Analysis

## 1. Pre-analysis
We will see how the crane simulator works by inputting several sequences of lever values.

### Creating the Sequences
Note that the lever value is a `float` ranging from 0.0 to 1.0.

In [None]:
sequence_1 = []
for i in range(1001):
    if i <= 100:
        sequence_1.append(0.01*i)
    elif i > 100 and i <= 200:
        sequence_1.append(2-0.01*i)
    else:
        sequence_1.append(0.0)

In [None]:
sequence_2 = []
for i in range(1001):
    if i <= 500:
        sequence_2.append(0.002*i)
    elif i > 500 and i <= 1000:
        sequence_2.append(2-0.002*i)

In [None]:
sequence_3 = []
for i in range(1001):
    if i <= 500:
        sequence_3.append(0.5)
    else:
        sequence_3.append(0.0)

In [None]:
sequence_4 = []
for i in range(1001):
    sequence_4.append(0.5)
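
The same kind of lever profile can also be built without explicit loops. The following is a minimal sketch, assuming NumPy is available; the names `sequence_1_np` and `sequence_3_np` are introduced here for illustration only and are not used elsewhere in this tutorial.

```python
import numpy as np

steps = np.arange(1001)

# Triangular profile equivalent to sequence_1:
# ramp 0 -> 1 over steps 0..100, ramp 1 -> 0 over steps 100..200, then hold 0.
sequence_1_np = np.interp(steps, [0, 100, 200, 1000], [0.0, 1.0, 0.0, 0.0])

# Step profile equivalent to sequence_3: 0.5 for steps 0..500, then 0.
sequence_3_np = np.where(steps <= 500, 0.5, 0.0)

# Every lever value must stay inside the valid range [0.0, 1.0].
assert sequence_1_np.min() >= 0.0 and sequence_1_np.max() <= 1.0
assert sequence_3_np.min() >= 0.0 and sequence_3_np.max() <= 1.0
```

Writing the profile as breakpoints makes it easy to try other ramp durations by changing only the step indices.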

### Visualizing the Sequences
We will plot the sequences against time.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
t = np.arange(1001)*0.01

In [None]:
fig, axes = plt.subplots(2, 2,sharex=True, sharey=True, figsize=(15,10))
axes[0,0].plot(t, sequence_1)
axes[0,0].set_title('sequence_1', fontsize=20)
axes[0,0].set_xlabel('time[s]', fontsize=15)
axes[1,0].plot(t, sequence_2)
axes[1,0].set_title('sequence_2', fontsize=20)
axes[1,0].set_xlabel('time[s]', fontsize=15)
axes[0,1].plot(t, sequence_3)
axes[0,1].set_title('sequence_3', fontsize=20)
axes[0,1].set_xlabel('time[s]', fontsize=15)
axes[1,1].plot(t, sequence_4)
axes[1,1].set_title('sequence_4', fontsize=20)
axes[1,1].set_xlabel('time[s]', fontsize=15)
plt.show()

### Inputting the Sequences to the Crane Simulator
We will input the sequences to the crane simulator and see the outputs.

First, we will read the configurations for the simulator.

In [None]:
import os
import json

# read the parameters
config_dir = './configs'
with open(os.path.join(config_dir, 'env_params.json')) as f:
    env_params = json.load(f)
with open(os.path.join(config_dir, 'sim_params.json')) as f:
    sim_params = json.load(f)

In [None]:
for k, v in sim_params.items():
    print(k)
    print(v)

In [None]:
def play(sequence, crane_env):
    # control result
    result = {}
    result['senkai_radian'] = []
    result['circular_displacement'] = []
    result['radial_displacement'] = []

    # get the initial observation
    observation = crane_env.reset()

    # start the simulator
    crane_env.start()
    done = False
    t = 0
    while not done:
        # step forward
        observation, reward, done, info = crane_env.step(sequence[t])
        result['senkai_radian'].append(np.deg2rad(observation['turning_encoder_angle_deg']))
        result['circular_displacement'].append(observation['hook_diff_c_m'])
        result['radial_displacement'].append(observation['hook_diff_r_m'])
        
        t += 1

        if t == len(sequence):
            done = True

    # stop the simulator
    crane_env.stop()
    
    return result

We will first instantiate the simulator.

In [None]:
import env_user

crane_env = env_user.CraneEnv(env_params, sim_params)

We will run the simulator.

In [None]:
result_1 = play(sequence_1, crane_env)
result_2 = play(sequence_2, crane_env)
result_3 = play(sequence_3, crane_env)
result_4 = play(sequence_4, crane_env)

### Visualizing the Outputs
We will plot the outputs against time.  
Since the angular velocity [rad/s] is not observable, we will compute it from 'senkai_radian' (derived from 'turning_encoder_angle_deg').

In [None]:
def visualize(result, sequence):
    # computing the angular velocity
    senkai_velocity = []
    for i in range(len(result['senkai_radian'])):
        if i == 0:
            senkai_velocity.append(0)
        else:
            senkai_velocity.append((result['senkai_radian'][i]-result['senkai_radian'][i-1])/0.01)
    result['senkai_velocity'] = senkai_velocity

    # Plotting the result
    plt.figure(figsize=(12,6))
    plt.plot(np.arange(len(result['circular_displacement']))*0.01, result['circular_displacement'], label='circular displacement [m]')
    plt.plot(np.arange(len(result['radial_displacement']))*0.01, result['radial_displacement'], label='radial displacement [m]')
    plt.plot(np.arange(len(result['senkai_radian']))*0.01, result['senkai_radian'], label="senkai angle [rad]")
    plt.plot(np.arange(len(result['senkai_velocity']))*0.01, result['senkai_velocity'], label="senkai velocity [rad/s]")
    plt.plot(np.arange(len(sequence))*0.01, sequence, label="lever sequence [0 ~ 1]")

    plt.xlabel('Time [s]', fontsize=15)
    plt.legend()
    plt.show()
    plt.close()
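
The backward difference used in `visualize` can also be written with NumPy. This is a minimal sketch, assuming the 0.01 s time step used throughout this tutorial; the function name `angular_velocity` and the toy data are illustrative only.

```python
import numpy as np

DT = 0.01  # time step [s] used in this tutorial's plots and differences

def angular_velocity(angles_rad, dt=DT):
    """Backward-difference angular velocity [rad/s]; the first sample is 0."""
    angles = np.asarray(angles_rad, dtype=float)
    if angles.size == 0:
        return np.zeros(0)
    # prepend the first angle so that the first difference is exactly zero
    return np.diff(angles, prepend=angles[0]) / dt

# toy data: an angle increasing at a constant 0.2 rad/s
toy_angles = 0.2 * DT * np.arange(5)
toy_velocity = angular_velocity(toy_angles)
```

Note that a finite difference amplifies measurement noise, so smoothing may be worth considering if the encoder signal turns out to be noisy.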

In [None]:
visualize(result_1, sequence_1)

In [None]:
visualize(result_2, sequence_2)

In [None]:
visualize(result_3, sequence_3)

In [None]:
visualize(result_4, sequence_4)

We can see that the circular displacement (load swing in the circumferential direction) and the radial displacement (load swing in the radial direction) are relatively sensitive to the lever value, so we have to increase it gradually.
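
One way to increase the lever value gradually is to limit how much it may change per step. The following is a sketch of such a rate limiter; the helper `rate_limit` and the value `max_delta=0.01` are assumptions for illustration, not part of the simulator.

```python
def rate_limit(sequence, max_delta):
    """Limit the change between consecutive lever values to +/- max_delta."""
    limited = []
    previous = 0.0  # assumption: the lever starts at rest
    for target in sequence:
        # move toward the requested value, but by at most max_delta
        change = max(-max_delta, min(max_delta, target - previous))
        # keep the result inside the valid lever range [0.0, 1.0]
        previous = min(1.0, max(0.0, previous + change))
        limited.append(previous)
    return limited

# a step command similar to sequence_3, smoothed to at most 0.01 per step
step_command = [0.5] * 100 + [0.0] * 100
smooth_command = rate_limit(step_command, max_delta=0.01)
```

The smoothed command ramps up to 0.5 over about 50 steps and ramps back down to 0 in the same way, instead of jumping.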

## 2. Algorithm for Controlling the Crane Simulator
Based on the pre-analysis above, we will design and implement a simple rule-based control algorithm.

### Implementing the Agent
We will implement the agent to control the crane simulator.  
The agent must have at least the following methods:  
- `get_model`
- `set_params`
- `policy`

`get_model` instantiates the agent and loads a model if one is used. `set_params` sets parameters such as 'senkai_target_angle', 'weight_t', 'wire_ratio', etc. `policy` receives the observations and returns the next lever value, which should range from 0 to 1.

The parameters for `set_params` are as follows.

|  name  |  type  |  unit  |  range  |  default  |  description  |
| ---- | :----: | :----: | :----: | :----: | ---- |
|  senkai_target_angle  |  float  |  deg  |  0 ~ 180  |  -  |  Target angle from the starting point  |
|  weight_t  |  float  |  t  |  0 ~ 4  |  1  |  Weight of the hanging load  |
|  wire_ratio  |  float  |  -  |  0.1 ~ 0.9  |  0.8  |  The length of the wire  |
|  init_kifuku_deg  |  float  |  deg  |  10 ~ 80  |  60  |  Initial luffing (boom raising/lowering) angle  |
|  init_yure_senkai_m  |  float  |  m  |  -1 ~ 1  |  0  |  Initial load swing in the radial direction  |
|  init_yure_kifuku_m  |  float  |  m  |  -1 ~ 1  |  0  |  Initial load swing in the circumferential direction  |
|  param_randomize_seed  |  uint  |  -  |  -  |  0  |  Random seed for the internal parameters (0: seed is not fixed)  |
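
Before passing parameters to `set_params`, it can help to check them against the ranges in the table above. The following is a minimal sketch; the helper `out_of_range_params` and the example target angle of 90 degrees are assumptions made for illustration.

```python
# (min, max) ranges copied from the table above
PARAM_RANGES = {
    'senkai_target_angle': (0.0, 180.0),
    'weight_t': (0.0, 4.0),
    'wire_ratio': (0.1, 0.9),
    'init_kifuku_deg': (10.0, 80.0),
    'init_yure_senkai_m': (-1.0, 1.0),
    'init_yure_kifuku_m': (-1.0, 1.0),
}

def out_of_range_params(params):
    """Return the names of parameters that fall outside their documented range."""
    bad = []
    for name, (low, high) in PARAM_RANGES.items():
        if name in params and not (low <= params[name] <= high):
            bad.append(name)
    return bad

example_params = {
    'senkai_target_angle': 90.0,  # arbitrary illustrative value (no default in the table)
    'weight_t': 1.0,              # remaining values are the table defaults
    'wire_ratio': 0.8,
    'init_kifuku_deg': 60.0,
    'init_yure_senkai_m': 0.0,
    'init_yure_kifuku_m': 0.0,
    'param_randomize_seed': 0,
}
problems = out_of_range_params(example_params)  # empty list when all values are valid
```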

The observations passed to `policy` are as follows.

|  name  |  type  |  unit  |  description  |
| ---- | :----: | :----: | ---- |
|  step  |  uint  |  -  |  Simulation step count  |
|  turning_lever_value  |  float  |  -  |  Turning lever input value  |
|  turning_encoder_angle_deg  |  float  |  deg  |  Turning encoder angle  |
|  turning_encoder_acquisition_time  |  float  |  ms  |  The turning encoder acquisition time  |
|  working_radius_m  |  float  |  m  |  Work radius  |
|  actual_load_t  |  float  |  t  |  Weight of the actual load  |
|  main_wire_length_m  |  float  |  m  |  The length of the wire  |
|  boom_top_x_m  |  float  |  m  |  x coordinate of the boom top  |
|  boom_top_y_m  |  float  |  m  |  y coordinate of the boom top  |
|  boom_top_z_m  |  float  |  m  |  z coordinate of the boom top  |
|  boom_top_status  |  uint  |  -  |  1 when the values were updated, 0 otherwise. The values are updated at 10 Hz.  |
|  left_turning_pressure_mpa  |  float  |  MPa  |  Left turning pressure  |
|  right_turning_pressure_mpa  |  float  |  MPa  |  Right turning pressure  |
|  tawami_diff_deg  |  float  |  deg  |  Deflection in the circumferential direction  |
|  hook_x_m  |  float  |  m  |  x coordinate of the hook  |
|  hook_y_m  |  float  |  m  |  y coordinate of the hook  |
|  hook_z_m  |  float  |  m  |  z coordinate of the hook  |
|  hook_status  |  uint  |  -  |  1 when the values were updated, 0 otherwise. The values are updated at 10 Hz.  |
|  hook_diff_c_m  |  float  |  m  |  Load swing in the circumferential direction  |
|  hook_diff_r_m  |  float  |  m  |  Load swing in the radial direction  |
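
Because the hook values are refreshed at 10 Hz, an agent may want to remember the most recently updated reading. The following is a sketch under the assumption that `hook_status` refers to the `hook_x_m`, `hook_y_m`, and `hook_z_m` values; the class `HookTracker` and the sample observations are hypothetical.

```python
class HookTracker:
    """Keep the most recent hook position that was actually updated."""

    def __init__(self):
        self.last_position = None  # (x, y, z) in metres, None until first update

    def update(self, observation):
        # hook_status == 1 means the hook values were refreshed in this step
        if observation['hook_status'] == 1:
            self.last_position = (observation['hook_x_m'],
                                  observation['hook_y_m'],
                                  observation['hook_z_m'])
        return self.last_position

# made-up observations containing only the keys used here
fresh = {'hook_status': 1, 'hook_x_m': 1.0, 'hook_y_m': 2.0, 'hook_z_m': 3.0}
stale = {'hook_status': 0, 'hook_x_m': 9.0, 'hook_y_m': 9.0, 'hook_z_m': 9.0}

tracker = HookTracker()
first = tracker.update(fresh)   # stores the fresh reading
second = tracker.update(stale)  # keeps the previously stored reading
```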

When the simulation runs, the simulator is instantiated first. Then `get_model` is called to instantiate the agent, and `set_params` is called to set the parameters. At each step, the simulator generates observations and passes them to `policy`, which returns the next lever value; that value is then passed to the simulator. This repeats until the `done` flag becomes `True`. The `done` flag becomes `True` if the lever value remains 0 for 1000 steps or the total number of steps reaches 100000.

```
# instantiate the simulator
crane_env = env_user.CraneEnv(env_params, sim_params)

# instantiate the agent
crane_agent = agent.Agent.get_model('./model')
crane_agent.set_params(external_params)

# get the initial observation
observation = crane_env.reset()

# run the simulation
done = False
while not done:
    # next lever value
    action = crane_agent.policy(observation)

    # step forward
    observation, reward, done, info = crane_env.step(action)
```
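
The termination rule described above can be sketched as follows. This is only an illustration of the stated rule, not the simulator's actual code: the class `DoneTracker` is hypothetical, and it assumes that "remains 0 for 1000 steps" means 1000 consecutive steps with a lever value of exactly 0.

```python
class DoneTracker:
    """Hypothetical helper mirroring the stated termination rule."""

    def __init__(self, zero_steps_limit=1000, max_total_steps=100000):
        self.zero_steps_limit = zero_steps_limit
        self.max_total_steps = max_total_steps
        self.zero_run = 0      # consecutive steps with lever value 0
        self.total_steps = 0   # all steps taken so far

    def update(self, lever_value):
        """Record one step and return True when the episode should end."""
        self.total_steps += 1
        self.zero_run = self.zero_run + 1 if lever_value == 0.0 else 0
        return (self.zero_run >= self.zero_steps_limit
                or self.total_steps >= self.max_total_steps)

# small limits so the behaviour is easy to follow
demo = DoneTracker(zero_steps_limit=3, max_total_steps=10)
demo_flags = [demo.update(v) for v in [0.0, 0.0, 0.5, 0.0, 0.0, 0.0]]
```

In the demo, the non-zero value at the third step resets the count, so only the last step ends the episode. In practice this means an agent should return to a lever value of 0 once it has finished, so that the simulation ends early.
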
To implement a simple rule-based algorithm, we will define some model parameters in `get_model`.  
The model parameters are as follows:
- max_steps
- max_velocity
- del_lever
- max_angle

See `policy` below for how they are used.  
Since the angular velocity [rad/s] is not observable, we will compute it from 'turning_encoder_angle_deg', which is an available observation.

In [None]:
class Agent(object):
    def __init__(self, action_range):
        self.model = None
        self.action_range = action_range
        self.step = 0
        self.observations = self.init_log()

    @classmethod
    def get_model(cls, model_path):
        action_range = (0.0, 1.0)
        agent = cls(action_range)

        # load some model
        if os.path.exists(model_path):
            with open(model_path) as f:
                agent.model = json.load(f)
        else:
            agent.model = {'max_steps': 5000,
                           'max_velocity': 0.3,
                           'del_lever': 0.01,
                           'max_angle': 0.8}

        return agent

    def set_params(self, params):
        """
        args:
          params (data type: dict)
            - 'senkai_target_angle'
            - 'weight_t'
            - 'wire_ratio'
            - 'init_kifuku_deg'
            - 'init_yure_senkai_m'
            - 'init_yure_kifuku_m'
            - 'param_randomize_seed'
        """
        self.params = params

    def policy(self, observation):
        """
        args:
          observation (data type: dict):
            - 'step'
            - 'turning_lever_value'
            - 'turning_encoder_angle_deg'
            - 'turning_encoder_acquisition_time'
            - 'working_radius_m'
            - 'actual_load_t'
            - 'main_wire_length_m'
            - 'boom_top_x_m'
            - 'boom_top_y_m'
            - 'boom_top_z_m'
            - 'boom_top_status'
            - 'left_turning_pressure_mpa'
            - 'right_turning_pressure_mpa'
            - 'tawami_diff_deg'
            - 'hook_x_m'
            - 'hook_y_m'
            - 'hook_z_m'
            - 'hook_status'
            - 'hook_diff_c_m'
            - 'hook_diff_r_m'

        returns:
          next_lever_value (data type: float, 0 <= value <= 1)
        """
        # save the observations
        if self.step > observation['step']:
            self.observations = self.init_log()
            self.step = 0
        self.observations['senkai_radian'].append(np.deg2rad(observation['turning_encoder_angle_deg']))

        # computing the angular velocity
        velocity = 0.0
        if len(self.observations['senkai_radian']) > 1:
            velocity = (self.observations['senkai_radian'][-1]-self.observations['senkai_radian'][-2])/0.01
        self.observations['senkai_velocity'].append(velocity)

        # computing the next lever value
        next_lever_value = 0.0
        if self.step <= self.model['max_steps']:
            if observation['turning_encoder_angle_deg'] < self.params['senkai_target_angle']*self.model['max_angle']:
                if self.observations['senkai_velocity'][-1] <= self.model['max_velocity']:
                    next_lever_value = observation['turning_lever_value'] + self.model['del_lever']
                else:
                    next_lever_value = observation['turning_lever_value'] - self.model['del_lever']

            else:
                if self.observations['senkai_velocity'][-1] <= 0.0:
                    next_lever_value = observation['turning_lever_value'] + self.model['del_lever']
                else:
                    next_lever_value = observation['turning_lever_value'] - self.model['del_lever']

        next_lever_value = min(max(next_lever_value, self.action_range[0]), self.action_range[1])

        self.step += 1

        return next_lever_value

    def init_log(self):
        observations = {}
        observations['senkai_radian'] = []
        observations['senkai_velocity'] = []

        return observations
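
To make the rule in `policy` easier to follow, here is a stateless restatement of the same decision logic. It is a sketch for illustration only: the function `next_lever` is not part of the required agent interface, and the input values in the examples are made up.

```python
def next_lever(lever, angle_deg, velocity, step, target_deg, model):
    """Stateless restatement of the rule in `Agent.policy` (illustrative only)."""
    if step > model['max_steps']:
        return 0.0  # after max_steps the lever is released
    if angle_deg < target_deg * model['max_angle']:
        # approach phase: keep the angular velocity near max_velocity
        speed_up = velocity <= model['max_velocity']
    else:
        # braking phase: drive the angular velocity toward zero
        speed_up = velocity <= 0.0
    delta = model['del_lever'] if speed_up else -model['del_lever']
    # clip to the valid lever range
    return min(max(lever + delta, 0.0), 1.0)

model = {'max_steps': 5000, 'max_velocity': 0.3, 'del_lever': 0.01, 'max_angle': 0.8}

# far from the target and slow: the lever is raised by del_lever
lever_a = next_lever(0.20, angle_deg=10.0, velocity=0.1, step=0, target_deg=90.0, model=model)
# past 80% of the target and still moving: the lever is lowered by del_lever
lever_b = next_lever(0.20, angle_deg=80.0, velocity=0.1, step=0, target_deg=90.0, model=model)
```

Here `max_angle` is the fraction of the target angle at which braking starts, and `del_lever` is the per-step change of the lever value.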

### Optimizing the Model Parameters
We will optimize the model parameters using Bayesian optimization (https://github.com/fmfn/BayesianOptimization).  
The parameters to be optimized are as follows:
- max_steps
- max_velocity
- del_lever
- max_angle

The optimization target is the score defined in the competition (https://signate.jp/competitions/428#evaluation). It can be computed using functions defined in "evaluate.py".

In [None]:
import time
from bayes_opt import BayesianOptimization, UtilityFunction
from evaluate import evaluation_crane

class Optimizer(object):
    def __init__(self, env, agent, evaluation_params, external_params):
        self.env = env
        self.agent = agent
        self.evaluation_params = evaluation_params
        self.external_params = external_params

    def play(self):
        # get the initial observation
        observation = self.env.reset()

        # start the simulator
        self.env.start()
        done = False
        while not done:
            # time the control
            time_start = time.perf_counter()
            action = self.agent.policy(observation)
            runtime = time.perf_counter() - time_start
            self.env.update_runtime_results(runtime)

            # step forward
            observation, reward, done, info = self.env.step(action)

        # get the control results
        result = self.env.get_control_results()

        # stop the simulator
        self.env.stop()

        return result

    def compute_target(self, result):
        flag, score = evaluation_crane(result, self.evaluation_params, self.external_params)
        if flag:
            print('Score: {}\n'.format(score))
            return score
        else:
            print('Score: {}\n'.format(0.0))
            return 0.0

    def optimize(self, pbounds, n = 10):
        # set the optimizer
        opt = BayesianOptimization(f = None,
                                   pbounds = pbounds,
                                   verbose = 2,
                                   random_state = 1)
        opt.set_gp_params(normalize_y = False)
        utility = UtilityFunction(kind="ucb", kappa=2.5, xi=0.0)

        # run the optimization
        for i in range(n):
            print('Optimization {}:\n'.format(i+1))
            # suggest the next parameters
            next_point = opt.suggest(utility)

            # set the suggested parameters
            self.agent.model = next_point

            # get the result
            result = self.play()

            # compute the score
            score = self.compute_target(result)

            # register the result for the given parameters
            opt.register(params = next_point, target = score)

        # set the best model parameters
        self.agent.model = opt.max['params']

The configurations of the simulator (environment) and the external parameters passed to the agent, such as the target angle ('senkai_target_angle'), are as follows.

In [None]:
with open(os.path.join(config_dir, 'env_params.json')) as f:
    env_params = json.load(f)
with open(os.path.join(config_dir, 'sim_params.json')) as f:
    sim_params = json.load(f)
with open(os.path.join(config_dir, 'external_params.json')) as f:
    external_params = json.load(f)
external_params.update(sim_params['state_init'])

with open(os.path.join(config_dir, 'evaluation_params.json')) as f:
    evaluation_params = json.load(f)

print('configurations:')
for k, v in sim_params.items():
    print(' ', k)
    print(' ', v)
print('\nexternal parameters:')
for k, v in external_params.items():
    print(' ', k, v)

We will set the model parameter bounds in `pbounds` and the number of iterations in `n` as follows.  
Then we will run the optimization with the `optimize` method.

In [None]:
# instantiate the agent
os.makedirs('model', exist_ok=True)
model_path = os.path.join('model', 'params.json')
action_range = (0.0, 1.0)
agent = Agent(action_range)
agent.set_params(external_params)

# instantiate the optimizer (pass the "crane_env" instantiated above, which was configured with "env_params" and "sim_params")
optimizer = Optimizer(crane_env, agent, evaluation_params, external_params)
pbounds = {'max_steps': (9000, 12000),
           'max_velocity': (0.2, 0.4),
           'del_lever': (0.0005, 0.001),
           'max_angle': (0.4, 0.9)}

# run the optimizer
n = 100
optimizer.optimize(pbounds, n)

# save the best parameters
with open(model_path, 'w', encoding='utf-8') as f:
    json.dump(agent.model, f)
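
As a baseline for judging the optimized score, a plain random search over the same kind of bounds can be useful. The sketch below is random search, not Bayesian optimization, and it uses a toy objective in place of a simulator run; `random_search`, `toy_objective`, and `toy_bounds` are hypothetical names introduced for illustration.

```python
import random

def random_search(objective, pbounds, n, seed=1):
    """Plain random search: sample uniformly inside pbounds, keep the best."""
    rng = random.Random(seed)
    best_params, best_score = None, float('-inf')
    for _ in range(n):
        params = {name: rng.uniform(low, high)
                  for name, (low, high) in pbounds.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# toy objective standing in for a full simulator run (NOT the competition score)
def toy_objective(params):
    return -(params['max_velocity'] - 0.3) ** 2

toy_bounds = {'max_velocity': (0.2, 0.4), 'del_lever': (0.0005, 0.001)}
best_params, best_score = random_search(toy_objective, toy_bounds, n=50)
```

To use it with the simulator, the objective would set `agent.model` to the sampled parameters, run one episode, and return the score, in the same way as the loop in `Optimizer.optimize`.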

## 3. Controlling the Crane Simulator
We will control the crane simulator using the optimized model and evaluate the accuracy.

### Running the Simulation
We will run the simulation and save the output.

In [None]:
def run(crane_env, crane_agent, output_path):
    # get the initial observation
    observation = crane_env.reset()
    
    # start the simulator
    crane_env.start()
    done = False
    while not done:
        # time the control
        time_start = time.perf_counter()
        action = crane_agent.policy(observation)
        runtime = time.perf_counter() - time_start
        crane_env.update_runtime_results(runtime)
        
        # step forward
        observation, reward, done, info = crane_env.step(action)

    # save the control results
    crane_env.save_control_results(output_path)

    # stop the simulator
    crane_env.stop()

The output will be saved in `output_path` defined below.

In [None]:
crane_agent = Agent.get_model(model_path)
crane_agent.set_params(external_params)
output_path = os.path.join('output', 'control_results.json')
os.makedirs('output', exist_ok=True)  # make sure the output directory exists
run(crane_env, crane_agent, output_path)

### Evaluating the Output
We will evaluate the output.

The configurations of the simulator (environment) and the external parameters passed to the agent, such as the target angle ('senkai_target_angle'), are as follows.

In [None]:
with open(os.path.join(config_dir, 'env_params.json')) as f:
    env_params = json.load(f)
with open(os.path.join(config_dir, 'sim_params.json')) as f:
    sim_params = json.load(f)
with open(os.path.join(config_dir, 'external_params.json')) as f:
    external_params = json.load(f)
external_params.update(sim_params['state_init'])
print('configurations:')
for k, v in sim_params.items():
    print(' ', k)
    print(' ', v)
print('\nexternal parameters:')
for k, v in external_params.items():
    print(' ', k, v)

We will check the score under the given conditions.

In [None]:
# read parameters
with open(os.path.join(config_dir, 'evaluation_params.json')) as f:
    params = json.load(f)
with open(os.path.join(config_dir, 'external_params.json')) as f:
    external_params = json.load(f)
print('Parameters for evaluation:')
for k, v in params.items():
    print(' ', k, v)
for k, v in external_params.items():
    print(' ', k, v)

# read the control result
with open('./output/control_results.json') as f:
    result = json.load(f)
print('\nCategories for evaluation:')
for k in result.keys():
    print(' ', k)

# evaluate the result
print('\nEvaluation results:')
flag, score = evaluation_crane(result, params, external_params)

## 4. Making the File for Submission
We will make the submission file for this algorithm.

### Edit "agent.py"

To make the submission file, we first have to put the `Agent` class implemented above into the "agent.py" file (also refer to "README.md").

### Run the Simulator with the Agent

Run the following command and make sure the implemented agent runs as expected (also refer to "README.md").

In [None]:
!python run.py

### Evaluate the Algorithm

Run the following command and make sure the implemented agent performs as expected (also refer to "README.md").

In [None]:
!python evaluate.py

### Make the Zip File

Then run the following command to create the submission file, which is a zip file (also refer to "README.md").

In [None]:
!bash submit.sh

## 5. For Further Analysis

In this tutorial, we implemented a simple rule-based algorithm and optimized its parameters.  
Since we only considered a limited set of conditions, the algorithm has to be made more robust (for example, by adding parameters).

Reinforcement learning algorithms may be able to handle this problem (or a rule-based algorithm can be combined with a reinforcement learning algorithm to improve it). To use a reinforcement learning algorithm, we have to define the reward that the environment generates at each step. In this competition, the reward is not given explicitly, so designing a proper reward that leads to a good accuracy score may be important. It can be implemented in the `step` method in "env_user.py" (which always returns 0 by default).
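
As a starting point for reward design, one option is to penalize both the remaining angle error and the load swing. The following is a sketch only: the function `shaped_reward`, its weights, and the normalization by 180 degrees are assumptions made here, not the competition's scoring rule.

```python
import math

def shaped_reward(observation, target_angle_deg, w_angle=1.0, w_swing=1.0):
    """Hypothetical reward: penalize angle error and load swing (higher is better)."""
    angle_error = abs(target_angle_deg - observation['turning_encoder_angle_deg'])
    # combined swing magnitude from the circumferential and radial components [m]
    swing = math.hypot(observation['hook_diff_c_m'], observation['hook_diff_r_m'])
    # the angle error is normalized by 180 deg, the upper bound of the target angle
    return -(w_angle * angle_error / 180.0 + w_swing * swing)

# made-up observation containing only the keys used here
sample_observation = {
    'turning_encoder_angle_deg': 45.0,
    'hook_diff_c_m': 0.3,
    'hook_diff_r_m': 0.4,
}
sample_reward = shaped_reward(sample_observation, target_angle_deg=90.0)
```

The weights `w_angle` and `w_swing` trade off reaching the target against keeping the load steady and would need tuning against the actual competition score.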

We look forward to great ideas.  
Good luck!