# Continuous Control

---

Congratulations on completing the second project of the [Deep Reinforcement Learning Nanodegree](https://www.udacity.com/course/deep-reinforcement-learning-nanodegree--nd893) program!  In this notebook, you will learn how to control an agent in a more challenging environment, where the goal is to train a creature with four arms to walk forward.  **Note that this exercise is optional!**

### 1. Start the Environment

We begin by importing the necessary packages.  If the code cell below returns an error, please revisit the project instructions to double-check that you have installed [Unity ML-Agents](https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Installation.md) and [NumPy](http://www.numpy.org/).

Environments contain **_brains_** which are responsible for deciding the actions of their associated agents. Here we check for the first brain available, and set it as the default brain we will be controlling from Python.

In [1]:
from unityagents import UnityEnvironment
import numpy as np

env = UnityEnvironment(file_name='Crawler_Linux_NoVis/Crawler.x86_64')

# get the default brain
brain_name = env.brain_names[0]
brain = env.brains[brain_name]

# reset the environment
env_info = env.reset(train_mode=True)[brain_name]

# number of agents
num_agents = len(env_info.agents)
print('Number of agents:', num_agents)

# size of each action
action_size = brain.vector_action_space_size
print('Size of each action:', action_size)

# examine the state space 
states = env_info.vector_observations
state_size = states.shape[1]
print('There are {} agents. Each observes a state with length: {}'.format(states.shape[0], state_size))
print('The state for the first agent looks like:', states[0])

INFO:unityagents:
'Academy' started successfully!
Unity Academy name: Academy
        Number of Brains: 1
        Number of External Brains : 1
        Lesson number : 0
        Reset Parameters :
		
Unity brain name: CrawlerBrain
        Number of Visual Observations (per agent): 0
        Vector Observation space type: continuous
        Vector Observation space size (per agent): 129
        Number of stacked Vector Observation: 1
        Vector Action space type: continuous
        Vector Action space size (per agent): 20
        Vector Action descriptions: , , , , , , , , , , , , , , , , , , , 


Number of agents: 12
Size of each action: 20
There are 12 agents. Each observes a state with length: 129
The state for the first agent looks like: [ 0.00000000e+00  0.00000000e+00  0.00000000e+00  2.25000000e+00
  1.00000000e+00  0.00000000e+00  1.78813934e-07  0.00000000e+00
  1.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  6.06093168e-01 -1.42857209e-01 -6.06078804e-01  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  1.33339906e+00 -1.42857209e-01
 -1.33341408e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
 -6.0609

In [2]:
#######################################################################
# Copyright (C) 2017 Shangtong Zhang(zhangshangtong.cpp@gmail.com)    #
# Permission given to modify the code as long as you keep this        #
# declaration at the top                                              #
#######################################################################

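from deep_rl import *

import pickle  # used by BaseAgent.save / BaseAgent.load
import torch
import torch.nn as nn  # used for gradient clipping in PPOAgent.step
import numpy as np
from deep_rl.utils import *
import torch.multiprocessing as mp
from collections import deque
from skimage.io import imsave
from deep_rl.network import *
from deep_rl.component import *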


class BaseAgent:
    def __init__(self, config):
        self.config = config
        self.logger = get_logger(tag=config.tag, log_level=config.log_level)
        self.task_ind = 0
        self.episode_rewards = []
        self.rewards = None
        self.episodic_return = None

    def close(self):
        close_obj(self.task)

    def save(self, filename):
        torch.save(self.network.state_dict(), '%s.model' % (filename))
        with open('%s.stats' % (filename), 'wb') as f:
            pickle.dump(self.config.state_normalizer.state_dict(), f)

    def load(self, filename):
        state_dict = torch.load('%s.model' % filename, map_location=lambda storage, loc: storage)
        self.network.load_state_dict(state_dict)
        with open('%s.stats' % (filename), 'rb') as f:
            self.config.state_normalizer.load_state_dict(pickle.load(f))

    def eval_step(self, state):
        raise NotImplementedError

    def eval_episode(self):
        env = self.config.eval_env
        state = env.reset()
        while True:
            action = self.eval_step(state)
            state, reward, done, info = env.step(action)
            ret = info[0]['episodic_return']
            if ret is not None:
                break
        return ret

    def eval_episodes(self):
        episodic_returns = []
        for ep in range(self.config.eval_episodes):
            total_rewards = self.eval_episode()
            episodic_returns.append(np.sum(total_rewards))
        self.episode_rewards = episodic_returns
        self.logger.info('steps %d, episodic_return_test %.2f(%.2f)' % (
            self.total_steps, np.mean(episodic_returns), np.std(episodic_returns) / np.sqrt(len(episodic_returns))
        ))
        self.logger.add_scalar('episodic_return_test', np.mean(episodic_returns), self.total_steps)
        return {
            'episodic_return_test': np.mean(episodic_returns),
        }

    def record_online_return(self, info, offset=0):
        if isinstance(info, dict):
            ret = info['episodic_return']
            self.rewards = info['all_rewards']
            if(self.rewards is not None):
                episode = len(self.rewards)
            if ret is not None:
                self.episodic_return = ret
#                 self.logger.add_scalar('episodic_return_train', ret, self.total_steps + offset)
#                 self.logger.info('Episode %d, steps %d, episodic_return_train %s' % (episode,self.total_steps + offset, ret))
        elif isinstance(info, tuple):
            for i, info_ in enumerate(info):
                self.record_online_return(info_, i)
        else:
            raise NotImplementedError

    def switch_task(self):
        config = self.config
        if not config.tasks:
            return
        segs = np.linspace(0, config.max_steps, len(config.tasks) + 1)
        if self.total_steps > segs[self.task_ind + 1]:
            self.task_ind += 1
            self.task = config.tasks[self.task_ind]
            self.states = self.task.reset()
            self.states = config.state_normalizer(self.states)

    def record_episode(self, dir, env):
        mkdir(dir)
        steps = 0
        state = env.reset()
        while True:
            self.record_obs(env, dir, steps)
            action = self.record_step(state)
            state, reward, done, info = env.step(action)
            ret = info[0]['episodic_return']
            steps += 1
            if ret is not None:
                break

    def record_step(self, state):
        raise NotImplementedError

    # For DMControl
    def record_obs(self, env, dir, steps):
        env = env.env.envs[0]
        obs = env.render(mode='rgb_array')
        imsave('%s/%04d.png' % (dir, steps), obs)

class PPOAgent(BaseAgent):
    def __init__(self, config):
        BaseAgent.__init__(self, config)
        self.config = config
        self.task = config.task_fn()
        self.network = config.network_fn()
        self.opt = config.optimizer_fn(self.network.parameters())
        self.total_steps = 0
        self.states = self.task.reset()
        self.states = config.state_normalizer(self.states)

    def step(self):
        config = self.config
        storage = Storage(config.rollout_length)
        states = self.states
        for _ in range(config.rollout_length):
            prediction = self.network(states)
            next_states, rewards, terminals, info = self.task.step(to_np(prediction['a']))
            self.record_online_return(info)
            rewards = config.reward_normalizer(rewards)
            next_states = config.state_normalizer(next_states)
            storage.add(prediction)
            storage.add({'r': tensor(rewards).unsqueeze(-1),
                         'm': tensor(1 - terminals).unsqueeze(-1),
                         's': tensor(states)})
            states = next_states
            self.total_steps += config.num_workers

        self.states = states
        prediction = self.network(states)
        storage.add(prediction)
        storage.placeholder()

        advantages = tensor(np.zeros((config.num_workers, 1)))
        returns = prediction['v'].detach()
        for i in reversed(range(config.rollout_length)):
            returns = storage.r[i] + config.discount * storage.m[i] * returns
            if not config.use_gae:
                advantages = returns - storage.v[i].detach()
            else:
                td_error = storage.r[i] + config.discount * storage.m[i] * storage.v[i + 1] - storage.v[i]
                advantages = advantages * config.gae_tau * config.discount * storage.m[i] + td_error
            storage.adv[i] = advantages.detach()
            storage.ret[i] = returns.detach()

        states, actions, log_probs_old, returns, advantages = storage.cat(['s', 'a', 'log_pi_a', 'ret', 'adv'])
        actions = actions.detach()
        log_probs_old = log_probs_old.detach()
        advantages = (advantages - advantages.mean()) / advantages.std()

        for _ in range(config.optimization_epochs):
            sampler = random_sample(np.arange(states.size(0)), config.mini_batch_size)
            for batch_indices in sampler:
                batch_indices = tensor(batch_indices).long()
                sampled_states = states[batch_indices]
                sampled_actions = actions[batch_indices]
                sampled_log_probs_old = log_probs_old[batch_indices]
                sampled_returns = returns[batch_indices]
                sampled_advantages = advantages[batch_indices]

                prediction = self.network(sampled_states, sampled_actions)
                ratio = (prediction['log_pi_a'] - sampled_log_probs_old).exp()
                obj = ratio * sampled_advantages
                obj_clipped = ratio.clamp(1.0 - self.config.ppo_ratio_clip,
                                          1.0 + self.config.ppo_ratio_clip) * sampled_advantages
                policy_loss = -torch.min(obj, obj_clipped).mean() - config.entropy_weight * prediction['ent'].mean()

                value_loss = 0.5 * (sampled_returns - prediction['v']).pow(2).mean()

                self.opt.zero_grad()
                (policy_loss + value_loss).backward()
                nn.utils.clip_grad_norm_(self.network.parameters(), config.gradient_clip)
                self.opt.step()
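
The advantage recursion in `PPOAgent.step` (the `use_gae` branch) is easier to follow without tensors. The following is a minimal sketch in plain Python for a single worker; the function name `gae_advantages` and the list-based inputs are illustrative assumptions and are not part of `deep_rl`.

```python
def gae_advantages(rewards, values, masks, discount=0.99, gae_tau=0.99):
    """Generalized advantage estimates for one rollout of length T.

    rewards, masks: lists of length T (mask is 0.0 where the episode ended).
    values: list of length T + 1; the last entry is the bootstrap value.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    # Walk the rollout backwards, as the agent does with reversed(range(...)).
    for i in reversed(range(len(rewards))):
        # One-step TD error; the mask drops the bootstrap term at episode ends.
        td_error = rewards[i] + discount * masks[i] * values[i + 1] - values[i]
        # Discounted sum of TD errors, cut off wherever an episode ended.
        running = running * gae_tau * discount * masks[i] + td_error
        advantages[i] = running
    return advantages
```

With `gae_tau = 1` the estimate accumulates every later TD error up to the next episode boundary, while `gae_tau = 0` keeps only the one-step TD error. The notebook's value of 0.99 sits close to the first case.
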

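The policy loss in `PPOAgent.step` is built from a clipped probability ratio. As a rough sketch for a single sample (the helper name `clipped_surrogate` and the scalar inputs are assumptions made for illustration; the agent itself works on mini-batch tensors and also subtracts an entropy bonus), the per-sample objective looks like this:

```python
import math


def clipped_surrogate(log_prob_new, log_prob_old, advantage, clip=0.2):
    """Per-sample PPO objective: the smaller of the raw and clipped terms."""
    # Probability ratio between the updated policy and the rollout policy.
    ratio = math.exp(log_prob_new - log_prob_old)
    # Restrict the ratio to [1 - clip, 1 + clip].
    clipped_ratio = min(max(ratio, 1.0 - clip), 1.0 + clip)
    # Taking the minimum removes any gain from pushing the ratio past the clip range.
    return min(ratio * advantage, clipped_ratio * advantage)
```

The agent negates the mean of this quantity over a mini-batch to obtain a loss to minimize. With a positive advantage the objective stops growing once the ratio exceeds `1 + clip`; with a negative advantage it stops growing once the ratio falls below `1 - clip`.
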
In [3]:
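import time  # used for the steps/s estimate in run_steps_custom
import torch.nn.functional as F  # provides F.leaky_relu for the network bodies


def run_steps_custom(agent):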
    config = agent.config
    agent_name = agent.__class__.__name__
    t0 = time.time()
    rewards_deque = deque(maxlen=100)
    rewards_all = []
    while True:
        rewards = agent.episodic_return
        if rewards is not None:
            rewards_deque.append(np.mean(rewards))
            rewards_all.append(np.mean(rewards))
        if config.log_interval and not agent.total_steps % config.log_interval and (rewards is not None):
            agent.logger.info(
                'Episode %d,last %d episodes, mean rewards  %.2f,  steps %d, %.2f steps/s' % (
                    len(rewards_all), len(rewards_deque), np.mean(rewards_deque),
                    agent.total_steps, config.log_interval / (time.time() - t0)))
            t0 = time.time()
#         if config.max_steps and agent.total_steps >= config.max_steps:
#             agent.close()
#             return True,rewards_deque,rewards_all
        if (rewards is not None) and np.mean(rewards_deque) > 2000:
            agent.save('./data/model-%s.bin' % (agent_name))
            agent.close()
            return True,rewards_deque,rewards_all
        # checkpoint every 200 recorded episodes
        if rewards_all and len(rewards_all) % 200 == 0:
            agent.save('./data/model-%s.bin' % (agent_name))


        agent.step()
        agent.switch_task()

class CrawlerTask():
    def __init__(self):
#         BaseTask.__init__(self)
        self.name = 'Crawler'
        self.env = env
        self.action_dim = brain.vector_action_space_size
        self.state_dim = brain.vector_observation_space_size
        self.info = {"all_rewards": None}
        # running per-agent return for the episode in progress
        self.total_rewards = np.zeros(num_agents)
        # per-agent returns of every finished episode
        self.rewards = []

    def reset(self):
        env_info = self.env.reset(train_mode=True)[brain_name]
        return np.array(env_info.vector_observations)

    def step(self, action):
        action = np.clip(action, -1, 1)
        env_info = self.env.step(action)[brain_name]
        next_state = env_info.vector_observations   # next state
        reward = env_info.rewards                   # reward
        done = env_info.local_done

        self.total_rewards += reward

        # the episode is treated as over for all agents as soon as any agent is done
        if np.any(done):
            # replace any NaN return with -5
            self.total_rewards[np.isnan(self.total_rewards)] = -5
            self.info['episodic_return'] = self.total_rewards
            self.rewards.append(self.total_rewards)
            self.info['all_rewards'] = self.rewards
            self.total_rewards = np.zeros(num_agents)
            next_state = self.reset()
        else:
            self.info['episodic_return'] = None

        return np.array(next_state), np.array(reward), np.array(done), self.info

    def seed(self, random_seed):
        return 10

def ppo_continuous():
    config = Config()
    config.num_workers = 12
    task_fn = lambda : CrawlerTask()
    config.task_fn = task_fn
    config.eval_env = task_fn()

    config.network_fn = lambda: GaussianActorCriticNet(
        config.state_dim, config.action_dim,
        actor_body=FCBody(config.state_dim, hidden_units=(128, 128), gate=F.leaky_relu),
        critic_body=FCBody(config.state_dim, hidden_units=(128, 128), gate=F.leaky_relu))
    config.optimizer_fn = lambda params: torch.optim.Adam(params, 3e-4, eps=1e-5)
    config.discount = 0.99
    config.use_gae = True
    config.gae_tau = 0.99
    config.gradient_clip = 5
    config.rollout_length = 64
    config.optimization_epochs = 4
    config.mini_batch_size = 64
    config.ppo_ratio_clip = 0.2
    config.log_interval = 4096
    config.max_steps = 1e4
    config.state_normalizer = MeanStdNormalizer()
    agent = PPOAgent(config)
#     agent.load('data/model-PPOAgent.bin')
    return run_steps_custom(agent)

success, rewards_deque, rewards_all = ppo_continuous()

[-0.19641337  0.41737028  0.34938253  0.26959499 -0.63609633  0.2204645
  0.10837027  0.28709081  0.17114817 -0.30671829 -0.14098985  0.58249474]
[ 0.60755131  1.49471516  0.47632015  0.22274144  0.38541511  1.53155367
 -1.10944254  0.41752105  0.41745688  0.40480323  0.81558031  0.7472925 ]
[ 0.58201947 -0.52773523  0.57288626  0.0171763   0.59661319  0.41554551
  0.22343093  0.50277414  0.75755628  0.38586221  0.39582461  1.07341666]
[ 3.41018115  8.35183088  2.42862418  0.89619418 -2.22016979 -4.22177318
 25.25464392 -1.91332689 -6.3528728   6.2651826   7.54477025 -6.27524793]
[-6.260912   10.12902012  2.36891005  1.77814282  1.83298554  1.01409769
 -2.48596359  4.58804535  2.10502493 -6.80437178  1.84411505  2.41982502]
[ 0.60019951  0.55409392  0.90290319 -0.77207075 -0.04835965  0.12845957
  1.00560648  0.09697373  0.66443741  0.28524918  0.50797134  1.22876603]
[ 2.99639179  3.21350279  1.03110907  0.69655858  4.96937059  2.44093511
 -1.84587835  5.57597503 -2.61564241 -1.157521

[ 2.3190091  -0.06859651 -1.00304476 -0.08991033  6.18909459  1.66981848
  1.80791792  3.0679096  -7.23242802 -5.87870916  7.28136368  6.76818724]
[ 0.05311166  0.78189377  0.78012605  2.04038757 -1.16102651  0.10123191
  2.7394474   4.57530388 -0.2510179  -0.47243024  1.17771092 -0.14401486]
[ 9.57622319  1.25146434  7.99019143  4.25478672  3.61169948 -0.32405531
  1.91212639  8.03500577  1.9740963  -0.44267718  0.50365903  3.03157594]
[  4.74849857  11.41470777  15.19779353   4.49744739   5.43910293
  -4.85446887   8.70207772  11.09987338  27.87995841   4.39890187
   2.14261924 -10.30319074]
[-3.35449009 -2.99040537  3.66748747  3.14456372  2.43175521  5.63415649
  1.24727952  0.57271244  4.15404692  2.28785595  3.75881744  2.07542717]


INFO:root:Episode 16,last 16 episodes, mean rewards  1.69,  steps 12288, 779.86 steps/s


[-0.35704861  3.06064754  3.85572865  1.49106446 -0.20953835  2.72117607
  3.87984972  1.66892342  4.05854378  7.13703822 -2.10140909  3.28416826]
[ 3.94198901 18.59222946  8.14000197  5.24857307 10.27372722  1.97213477
  2.05421915  1.8020544   0.45155883 -3.51943207 -4.12232182 -6.8299483 ]
[ 2.94357077  7.14818891 14.58573619  5.60889615  0.17257225 10.79807982
  5.94263895  4.13312351  0.59146783  1.81820241  3.10231793 -2.41732401]
[ 0.80796602 -0.78736017  0.45289179  0.50505563  0.58212414  0.18197555
  0.37268888  0.94590468  0.53844271 -0.20171325  0.41474599  0.6447629 ]
[ 0.58612946 -0.69325532  0.76050567  0.77503896  0.7386516   0.42764268
  0.06206228  0.03507912  0.2639962   0.11178687  0.16152669  0.55571352]
[-2.17237342  0.71693169 -1.58311378 -7.67627061  1.10879465  6.54086136
  6.06767564  4.02489633  1.71340657 -6.48731127 -2.36367459  0.00853607]
[ 0.68441097 -0.87881049  0.13797816  0.19225797  0.27680864 -0.42854677
  0.87901484  0.24237924  0.51569608  0.59468

INFO:root:Episode 32,last 32 episodes, mean rewards  1.94,  steps 24576, 785.87 steps/s


[ 8.25066861 11.02634133  6.64609818  3.69094972 -4.06964082  6.70174445
 -1.80090307  7.40587712  2.91658515  8.90853455  5.90056181  3.98545513]
[ 3.6616846   3.65724902  0.38717878  4.55344123  5.52898587  1.99593858
 -5.44061928  0.8710315   2.11536275 -3.91406701  3.87552355 -3.4771184 ]
[ 4.14551996  2.70878315  8.33103861  9.7419994  12.81979671  1.56977803
  5.37607182  2.84614067  4.29070402 -1.04578298 11.0119324   6.27001604]
[11.93681468 -8.80420906  0.36154375  5.81122242  6.90697872  8.70143666
  9.66093106 12.01106732  7.6898278   3.41208695  9.08616518  8.30948034]
[17.25459455  3.41629342  7.32077592  5.30533004  4.92826989  9.25868308
  7.49909119  5.03608059  7.30802402  7.30201517  3.00165814  1.69303378]
[  2.35631962  -0.15095909 -17.56725027  12.72123141  11.31059894
  11.29742551  11.25354963  15.09458282   2.84417555  -0.18408726
   1.71465837  10.28624344]
[12.37713665 11.66092786 10.20117785 10.6632689   7.26723112  7.04330733
  8.49379408  7.63317303 12.4120

INFO:root:Episode 48,last 48 episodes, mean rewards  2.87,  steps 36864, 800.13 steps/s


[-0.31046497 13.1751601  -4.65777436  2.9543534   5.26187336  6.06864762
  5.98635693  0.92143909  3.14476053  3.17229585  3.51563028  6.29894715]
[ 8.10207584  8.15353748 10.82646032  6.80777863  5.45821773  4.37459239
  6.23276047  4.4192993   3.08707082  6.50763092  8.29187675  5.439627  ]
[ 7.8831565   3.89153937 -0.66063688  3.83935822  7.41677085  3.0760897
  4.34824842  6.20798585  6.9252803  11.77692672  1.75081908  4.93587956]
[ 8.20578688  4.09480825  6.4846738   8.59164804 12.90294988  5.37147337
  9.00910643  9.39142669  5.79952296  6.79025778 13.92544693  5.8345925 ]
[12.64748001  7.80600486  5.48140827 -2.79923983  5.32402087  8.10837673
 16.26194167 10.05092481  0.28905154 11.98152257  2.09595963  2.57703684]
[ 3.98827871  3.79532172 10.98017407  5.19027596 -0.74694424 -1.97326317
  1.88016572  5.06498989  2.45790135  1.50535581  5.20811276  5.56778176]
[7.48320784e+00 3.81029246e+00 3.62336938e+00 5.42791153e+00
 1.10200520e+01 5.37852221e+00 6.98068750e+00 2.72101172e+

INFO:root:Episode 64,last 64 episodes, mean rewards  3.58,  steps 49152, 780.29 steps/s


[ 3.9790596   3.44042086  3.39011125  7.4971165   2.21724536  1.28685368
 -0.48105103  5.0216758   1.95352312  8.63331687  3.51878899  3.09889733]
[ 8.45960581 10.5004954   2.83362017  6.50950311  7.52104075 -1.75395984
  8.00015907  3.50355844 11.27114138  8.7858549   6.41889848  5.85811146]
[ 8.09871757  5.10307739 10.22162294  6.81964121  7.31936972  3.3489967
  9.01412925  8.61557937 11.21400757  6.24936547  7.58624859  9.27569818]
[19.96116668  2.69748117  5.50638769  4.86478771  6.33999836 15.55750362
 12.8721824  20.2471922  15.21119348 15.05269226  0.77304979 12.81149935]
[ 7.54249995  4.03928558  8.06659906 11.06721719 12.75298885 10.24060847
  6.15089185  7.80894238  5.08492388  8.90708638 10.40033943  7.49350896]
[ 0.99485799  1.7872496  -0.01937213  0.28846363  1.25478263  1.10801664
  0.91978675  1.50228021  1.12726744  1.60769948  1.69944324  1.60096565]
[-0.2488903   9.21327017 11.31437889  4.16628008  8.79148703  0.85414797
  0.08307458  7.84292852  2.69595777  6.771144

INFO:root:Episode 80,last 80 episodes, mean rewards  4.39,  steps 61440, 784.64 steps/s


[12.05272499  9.83759826  8.64965163  9.16846442 14.19906243 11.06799354
  4.60515537 11.86522935  4.47172432 13.7740675   7.1805573   2.20424418]
[ 2.8761958  12.82011105 12.11051533  6.83974808  2.2164352   7.57362467
  7.39416046 17.19021157  9.52333434 12.3699051  13.63404927  3.60168175]
[ 4.36831846  9.13098969  4.96799052  2.44553509  5.12919161  4.90843448
  7.05270331  3.31752143  7.29176586  4.99132214 13.1935587   8.23442038]
[ 5.72118975 13.16188081 11.89572212  3.8209265  11.17004839 12.17027764
  2.98839257  6.06866071 16.31253038  5.76898162 15.7228295  10.9717086 ]
[ 8.33766496  7.36011175  9.40847307  0.33444056  9.15055117  9.5324727
  4.88140696  5.39191632  6.02505077  7.4677769   7.34684407 14.66293044]
[1.72775074 3.26670027 3.87173345 1.62214936 1.24862547 2.28547685
 1.23181002 3.71394746 0.80497899 5.03535612 3.06107268 4.45416299]
[ 2.41253508 13.63539952 10.7691726   3.36353791  6.347064   11.85940614
 15.94470108  8.15419238 13.12920931  5.87611774 10.370923

INFO:root:Episode 96,last 96 episodes, mean rewards  4.85,  steps 73728, 781.00 steps/s


[ 8.39378862  8.48748777 10.6642063  -0.38682878  6.81913029  7.87879584
  4.96213442 11.09202505  5.61672627  9.68056924  4.69357166  6.58856928]
[ 2.18849287  8.34195938 16.27301913 13.32295383 16.93508199 16.19896663
 11.77802995 15.35661227 18.13159295 12.68070008 14.86508761 12.11086621]
[11.03055464  9.41481986  3.20601134  8.56790372 12.12015959 14.14591244
  9.8541633   5.00483162 10.53795768  4.78132564  7.77965328  2.03577164]
[10.74792127  6.11844995  8.68916212 11.32235019  7.4238497   2.60190095
  7.70082694  9.33919559  7.31211922  1.26571712 11.73572681 12.130663  ]
[ 1.76484443  1.7191876   1.04343116  1.21046483  0.94107665  1.58192462
  1.41083179  0.95884908 -0.43681349  1.2424404   2.33348106  1.35116285]
[13.88999957 19.995948   12.99425833  9.65526467  5.88010129 11.93966194
 17.21346916  8.77781044 17.55278498 13.53316564  8.81418466 10.26254481]
[14.70641811 14.50619692  7.21795305 22.04581961 16.54303895 12.71852336
 15.21513558 15.73931529  7.48072103 18.85386

INFO:root:Episode 112,last 100 episodes, mean rewards  5.82,  steps 86016, 789.22 steps/s


[19.66980884  6.84222845  9.45559629 11.62055745  6.9919676  -3.55674985
 11.6750992  11.18471397  5.8077473  11.28753457 10.40681317 11.02240882]
[6.13271789 7.6105037  6.31685526 6.64342794 5.69042193 3.68196583
 4.76750548 5.81642879 8.94474222 3.24137836 6.5373693  3.89791396]
[29.50625847 15.24869636 18.40006409 24.41721277  6.92546276 20.08771932
 19.25863016 24.27330303 34.2820954  19.71339123 19.71674727 12.44578483]
[ 2.07012087  9.42593682 12.09758286 -8.79110524  2.25168324 11.47456662
  9.07528867  6.73670983 13.31827295 13.7403978  13.65788396 13.23224965]
[ 7.09469083  3.06615863  3.07383387  3.03006433 14.41991581 12.27934914
 10.09949125  8.22194032 10.51493298  6.65635756  4.82462097 10.85464487]
[2.39258007e+01 1.06248382e+01 1.54538993e+01 2.13382637e+01
 1.22082690e+01 2.59527482e+01 2.08741794e+01 1.43550263e+01
 1.99485558e+00 1.45244774e+01 8.83655576e-03 2.05756775e+01]
[ 0.7823868   1.02056444 -0.72368919  0.84006336  1.55149787  1.44204191
  0.72861601  1.2205

INFO:root:Episode 128,last 100 episodes, mean rewards  7.01,  steps 98304, 786.97 steps/s


[ 7.89743742  4.3768073   3.09323044  2.8874992   5.53137193  1.51542738
  1.6085449  10.86903539  8.13436143  0.57216489  2.70597873  4.89743706]
[16.28739993 15.51416544  4.67909666 12.32872797 17.57590596 13.29834451
  6.06662124  9.1730444   9.08724097 18.46533644  9.45252274  9.52020972]
[22.37977311 20.18848146 19.29595373 15.56211124 18.25313499 18.27909896
 26.58351039 11.56942504 10.76329548 11.1149423  21.75670985 18.0249454 ]
[ 8.25200685  4.54261914 13.28379584  9.52920101  8.37922948  7.81906716
  8.18290205 19.56823631 19.43411509  6.3665681  20.30093247 10.71126873]
[11.94327578  8.01068912 11.99302187 13.49680126 13.61896355  8.88889957
 13.38464217  4.92891272 10.49400403 12.55368928  9.24846041  8.11011753]
[ 2.30346092  9.78502781  9.63837111  8.73293552 16.84941745 13.947013
  9.82438051 11.92383887  0.40737989 11.17713423 15.35329204 22.06203035]
[ 1.07609402  1.74654682 -0.48200065  1.66351202  1.64516835  0.36997936
  0.79951752  1.56869371  1.53298982  1.7894756

INFO:root:Episode 144,last 100 episodes, mean rewards  7.86,  steps 110592, 785.56 steps/s


[ 5.93335741 19.85474648  7.69703692 16.90757999 14.262325    8.22315431
  7.99564702 11.03971577 23.55635239 14.38211813 12.85310199 16.10473676]
[ 1.11929085  1.54409579 -0.54466345  1.29190264  2.00947954  1.05531905
  1.86571804  1.48872498  2.13282461  0.34485175  1.00638884  1.84283951]
[ 8.28054561 16.12230295  9.28120418 10.33844034  3.78515226 12.2013136
 12.87634867 -1.40611057 10.66393738 10.92859527 11.74402112  7.3179984 ]
[ 8.19917284 10.22011151  5.74083481  8.81341813  6.83171582  2.92418694
  7.18809691  2.39604262  3.21468208  6.25227549  1.98248967  3.97384361]
[ 5.91745303 10.79093729  7.92888911 10.21050559  9.90267484  5.1322365
  9.44852588 10.48852211  6.89593017 10.95855007 10.45705664  4.49310508]
[ 8.56840576 12.9922026   9.73796945 10.69917462  9.86153825  9.27968556
  6.05124317 10.26438543  4.84337523 11.3005415   8.62894506  4.05503522]
[18.8062531  13.92069989 10.42406153 13.12423508 26.06992316  1.40002348
  5.21607885  4.00922162 15.80548735 10.4441413

INFO:root:Episode 160,last 100 episodes, mean rewards  8.39,  steps 122880, 791.61 steps/s


[ 1.87635988 18.39689326 16.55056179 13.05659884 20.61339954 12.36585753
 14.12779013 24.43472533 16.9595103  15.48099577 13.70652301 10.86821991]
[ 9.90207312  6.55180542  5.13551729  6.71592649  4.17673931  8.29022132
 10.76870624 11.21049981  6.6617085   2.81481137  6.47633018  8.58898449]
[12.54814732 13.68491759 -1.75132666 16.61260583  8.74620488  6.55051116
  8.69332519  7.60255497 11.16784937 15.05511996 11.85467404 16.40337444]
[10.797237    8.78158764  6.34755271  9.75278085  5.81603556  8.220372
  8.86518736 11.84771573 13.85104396 11.39463069 11.17602117  9.23652126]
[2.02055238 0.6423321  1.57669037 1.66431152 2.0166365  0.55714899
 2.30407749 0.45249    1.6352122  1.59202228 2.76777202 3.32124955]
[ 8.001728    7.24638038 10.13345857 10.6054359   9.31186115  2.40858299
  4.7423836  13.68911766  9.65917075  3.03132972 11.20725514  9.67876263]
[10.78254209  9.14120212  7.43963893  8.15981182  5.58517632  5.44568829
  7.43476581 11.83256085 -0.1238705   3.88697154  6.4616457

INFO:root:Episode 176,last 100 episodes, mean rewards  8.75,  steps 135168, 773.95 steps/s


[20.34866154 12.00355323 25.06674062 19.40825378 17.12298841 11.15957694
 12.42305856 21.24191001 19.26877705  9.28600773 17.03640891 16.16217125]
[ 0.72027221  1.09006119  0.84674238  0.96143512  1.09327216 -0.0644917
  0.8871681   0.75919448  0.72624237  1.40171281 -0.71675237  0.15129751]
[ 7.54676196  8.36134722  7.93714314  9.12148302  6.48522094  9.43231532
 10.47538444 12.74129663  5.71590654 10.06304982  8.69740952 11.58146495]
[21.56259702 18.78887549 17.46040886 14.93471002 23.28172074 22.43767459
 19.89171399 33.67689046 26.09856986 11.62689555 22.70870358 19.53336328]
[ 8.40085906 13.92180038  6.90717687 11.78018715 15.09386589 11.16366033
 13.98267648 12.28496097 10.15127348  8.17999545 11.824096    7.60682451]
[10.44582939  7.96832582  6.03944397  7.02371035  9.76520704  7.43405173
  7.8968349   3.85393469  7.85857875 13.80334751  7.22265046 11.13036935]
[ 9.77445832 11.80009888 12.23467274 13.33871423 13.91063202 16.5648839
 12.21682471  9.01110175 12.35662096 13.4251460

INFO:root:Episode 192,last 100 episodes, mean rewards  9.05,  steps 147456, 791.24 steps/s


[ 7.08448823  8.07267672  7.99347507  8.04452758  6.96941993 17.53033872
  9.14349777  8.20682116  8.64641305  7.05102118  9.6557228   8.93301125]
[18.86551394 18.04833509 21.51640861 13.5426208  13.34194079 11.76240586
 30.54809739 27.50795448 26.1483568  19.42526564 25.33091913 22.00221268]
[12.39977596 23.0177123  21.80342416 24.81626323 30.50274975 17.97327537
 12.79551591 26.9793967  22.72254792 15.01340045 19.26480946 15.38213717]
[ 6.38072683  8.89422482  7.1442641   3.82715327 11.76422633 12.1978352
  9.88210443 11.72599258 14.70331142  4.90294732  8.49106337 10.09787038]
[ 8.75859221 12.01878094  1.03129025  8.00106935 10.74244205  7.96997047
  8.83200073  5.74446852 10.75076454 10.40854203 10.47877272 14.31091227]
[ 9.20865344  7.96957301  4.77619408 11.11641464 19.32529735 10.72383102
 17.50243138  8.90192675 12.73345365  6.39524495 12.05341128 16.0863894 ]
[ 1.12704503  1.75286798 -1.12111899  0.69101018  1.10264838  0.60494481
  1.60328914  1.6278858   1.48555113  1.911449

INFO:root:Episode 208,last 100 episodes, mean rewards  9.87,  steps 159744, 788.48 steps/s


[ 2.43429538  6.61766859 -0.33365815 14.25054032 10.01864313  7.27984716
  4.18980223 14.56734991  8.57678726  5.07112712 11.08649988  8.32033988]
[ 8.88969901  3.37895251  1.29999056 11.97104105 14.43296193  7.41145
  9.26244552 10.60639837  7.15239386  6.17503032  6.50935959 13.75569314]
[20.40913175 10.47215285  6.86343177  8.27579202 13.51215566 23.67776701
  9.07614087 13.83634839 15.51967264 11.08007943  9.30372757 20.48263711]
[ 7.6675374   4.80824514  8.7351879  11.19631878 19.8728402   9.8908422
  3.26276323 19.46327108  5.12744195 12.27215963 10.27878197  9.11610813]
[11.47729716  5.36862939  9.06796337  3.7803233   9.34596485  2.53442916
  7.32423289  8.64679825  6.53103784  6.96162928 13.47548627  8.77819959]
[14.23631476 14.951979   17.1772061  14.31278013 24.05769356 21.85555303
 16.2769549  12.10185916 17.33987639 16.23762338 17.80546593 17.48033717]
[ 9.28944068  4.94326569  5.91629864 13.58435563 18.35805288 15.502312
 11.51108036  8.79171333  6.75758032  5.33787887 14

INFO:root:Episode 224,last 100 episodes, mean rewards  10.19,  steps 172032, 790.06 steps/s


[16.48912585 25.94998076 13.66025132 25.86554237 23.352078   27.16974663
 24.51772417 16.3340382  17.79146648 16.94411386 16.08179966 16.63991195]
[ 6.09004319  8.95958316  7.86160657  6.13581114  8.78105267  4.99213342
  5.25505437  2.61104208 14.54425289  8.51396791 10.00333879  6.2330008 ]
[13.48414941 19.42422718 13.32808082 12.82665493 13.95115491  9.15766349
 10.19642385 10.1166273   6.89755923 21.75561185 16.706771   11.52533312]
[13.704252    5.35463028  7.85434778  5.67607094 10.52337882 12.9807115
 14.78487515 11.29061794  5.68102877 12.10144338  8.30967611 10.09260239]
[10.14851531 12.60875418  8.29122059  8.93750946  6.61907922  8.45984805
  6.76272229  3.92398243  7.9578673   6.02232777  7.16462175  7.31182592]
[ 8.38176611 13.41973858  7.98335248 14.00564873 11.56277272 10.88700861
 12.01452063  6.89652091  7.77043668 15.8851803   8.96137388  8.21803816]
[15.72728156 10.70855849 16.93066845 14.84561908 21.11019211 20.16557089
 13.58831714 13.39980938 15.2011836  16.748472

INFO:root:Episode 240,last 100 episodes, mean rewards  10.25,  steps 184320, 780.03 steps/s


[16.19623775 10.56097736 12.98402648 12.04419863  8.40780351 20.99027921
  8.33740584 13.37785528 14.67651179 12.42739889 23.14844374 23.57453385]
[3.80995341 0.85892614 5.31014964 6.46258744 5.20315851 5.5057849
 5.44190161 6.04304351 5.98960932 4.06334844 5.51978204 2.47295542]
[12.39183818 26.73954609 20.88259822 23.18435091 16.84408907  9.39649656
  9.07499086 17.46946726 23.22579307 21.33373487  1.78845648 15.58447903]
[24.6723931  19.81962172 12.6097551  10.44962166 12.94279708  9.08146449
  8.10892814 18.31737406  6.91401788 18.27203657 13.42412701  5.18822127]
[5.88923023 6.60400859 9.73454162 4.23916928 6.07537348 8.89536838
 8.69047895 5.22612211 3.92833493 3.08939522 7.05960609 5.8293698 ]
[ 9.92955725  7.75769225 14.15059009  8.48358506  6.94974085 16.94969507
  9.00988719 12.12394634 10.66476787 12.85307766 11.39413166 10.14532777]
[ 9.05803954  7.57266967  8.47462248 13.49096664 10.75049111 10.67425916
 12.69667657 12.72593261  7.18015906  8.90763982 10.43081772 12.273205

INFO:root:Episode 256,last 100 episodes, mean rewards  10.88,  steps 196608, 778.45 steps/s


[14.17445113 13.85951136  8.192709   14.28950697  9.00959506 12.02172945
 13.56830537  8.47853527  7.3629617  10.42358615 11.38708488  9.48496106]
[10.11218506 17.03554425 18.50484489 22.43037315 16.18953466 20.67842118
  8.88085511 11.75198489 18.80369358 22.44475324 13.47608372 15.22519431]
[ 3.05499981  6.32396499 16.25951039 10.38926729  8.46160804  9.27051528
  4.16222588  9.30807183 11.23484455 13.01903629  5.69502372  8.61028158]
[10.18105054 12.34406966  7.6220568   9.62890303 17.04254274 11.94092685
  8.37118589  5.51155663  6.02282071 13.38064616  1.75269795 13.01363169]
[19.94576973 15.92122657 26.5755437  14.79211685 22.23602236 19.88874942
 21.75453495 19.03549525  7.61834836 17.59446612 19.84084872 16.51828399]
[ 9.56435125 10.83432246  5.67967777 10.5414932   5.09833277 11.8250167
 16.16694375 11.13534467  4.63439522 10.39468482 11.29680432  6.41135881]
[ 8.03082833  8.71319507  9.8540834   8.00348761  9.74322216 11.03809585
  3.2420668  11.33093137 13.4717313   5.436818

INFO:root:Episode 272,last 100 episodes, mean rewards  11.34,  steps 208896, 802.40 steps/s


[10.80922331 12.17687082  8.34063937  0.27955953  7.59387543  8.83538555
  9.15876043  5.61944417  9.01352554  9.46219243  9.62519301  4.56608194]
[11.53691091 11.84977012 10.40604927 13.63213667  8.61954142  9.61980772
  2.06549424  6.96839709 10.11219122 11.38213564  7.87778583 12.94591418]
[ 7.88858785 11.1906579  16.85318026 10.91352065 14.64067494 19.98659428
 21.7654138  27.1196796   8.60604707 22.81066067 16.03932764 16.62854251]
[ 8.70166834 15.00627977 25.56901932 16.10127284 19.60311357 19.54793117
 13.74225746 26.40635949 19.25311786 17.80338613 15.54515258 14.80374033]
[11.98514937  7.37085564 10.55925509 12.66822049  8.16936191 13.63404681
 10.62048781  9.89441286 12.01818231 14.75999199 12.37916146 14.35479602]
[13.9019291  12.17290458 12.94481794 14.0736819  13.55668503  9.55516954
 14.10818972 12.65555626 13.96532163 18.94642404 15.73620308  5.64107397]
[12.68430472 13.08326619  8.06031129 13.73217602  6.89438639  6.89661127
 11.65092854 13.24448035  3.96819854 16.21808

INFO:root:Episode 288,last 100 episodes, mean rewards  11.20,  steps 221184, 787.70 steps/s


[16.663722   11.41503581 14.96674059 10.51172162 11.17351367 15.63693089
 14.48576571 10.70495048 11.66451076  7.20408247 13.80840719  9.96879691]
[34.44295146 19.2863349  24.34789656 24.35311427 17.37708335 15.10882706
 23.8692692  22.79598677 24.06343151 18.78053269 31.54381128 20.39132009]
[ 1.24246845  1.24233474  1.37327315  0.90662789  1.17488026  1.11681019
  1.04533945  1.50666931  0.62019876  1.13042103  0.81486448 -0.52384099]
[15.22190289 11.10996146 11.79097989 16.15253607  7.81944029 15.10598787
 11.74955972 12.52367907  5.67284954 -1.19173153 11.46037563  6.06267691]
[11.07429306 23.34656819 11.19177992 17.57957941 11.26998164 14.4086787
 25.95450675  7.61605771 16.81015842 16.92250305 17.0340936  15.04676117]
[13.92288695 21.43724063 22.78066321  6.62433238 19.96359208 13.54128296
 28.29321269 27.44467217 33.25065051 16.25019796 23.10929694 16.11331962]
[ 7.07324401 13.84328755  8.9083467   3.81205485 10.22957321  8.15773655
  8.55875807  4.40783755 13.25081312 11.686158

INFO:root:Episode 304,last 100 episodes, mean rewards  11.25,  steps 233472, 785.54 steps/s


[17.01780271 16.43616698 16.07798994 17.57519067 20.32755936 23.50239097
 21.27068846 12.86560888 14.33263332 24.17260929 23.02245336 17.60366177]
[18.75821556 15.28374122  6.74489264 16.90239981 11.0311177  12.37641537
  8.03110024 18.1114015  17.646496   16.33046773 11.7766361  10.21167037]
[15.71592534 11.49806837 16.72082711 15.72060392 18.47444852  7.38471312
 14.17868647 19.2179547  22.01463004 19.98243333 17.99639504 16.35469204]
[12.26221622 11.64708669  4.65448899 14.57057568 14.02999108  1.0218849
  8.62784431 13.14686204  9.34524    16.37001545 16.60032746 13.27587765]
[ 5.16421884 13.78146337  9.56902334 11.77263863 11.97603179 10.08293433
 12.89809006  8.78566021  7.84719924 12.13308319  3.82573565 12.09196904]
[ 8.4784532  15.65192966 13.31408534 13.87367901  6.32521211 20.15345079
 15.57122339  6.06703422 11.09175857 10.73630683 11.62939655 14.88173496]
[ 1.49170989  8.35151382 10.25990117  7.60827338 11.33424075  9.06384367
  4.7775648   6.18310058  9.42978397  5.447827

INFO:root:Episode 320,last 100 episodes, mean rewards  11.54,  steps 245760, 786.86 steps/s


[ 9.63121759 12.26343992 12.76949649 12.39268135 10.90884875 12.9952489
 11.89338865  9.40611259  4.007118    8.74965595 12.91771217  6.73907558]
[4.68081411 2.65923987 2.49193348 2.29158056 1.67326198 4.64077196
 4.73809312 4.94035909 4.17408143 1.86164204 3.59865809 3.05555047]
[ 8.49832886 12.80216835 19.76868558 13.97336013  7.86145328  7.62919361
  4.65645846 18.46349314 15.49149537  5.89745846 18.54311544 14.33770886]
[10.75645894 12.8997306  12.43284803 12.3682909  19.15355809 17.87327752
 15.42524596 21.00480155  7.14647059 13.80158965 10.92573682 10.7745878 ]
[12.08846835  7.68712047 12.98371802  8.72124041 12.18973697 12.35108916
 15.15277393  9.3006967  13.4981539  11.73921002 10.60603806 12.18464707]
[12.90373458 11.29035483 11.6899382  13.84550386  7.11163593  5.01581245
  7.73724299  7.81337962  8.8623507  16.06216351  9.62323434  8.47143441]
[14.27400553 10.84739739 19.69546049 14.24392409 10.20404703 11.54658877
 12.13007775 18.10588442 15.55113763 16.65543622 17.170556

INFO:root:Episode 336,last 100 episodes, mean rewards  12.03,  steps 258048, 784.48 steps/s


[13.14209207 17.06581745 16.07973725 22.46009309 16.03838446 21.25754713
 21.63016107 18.69789298 16.92261261  4.18413894 23.22682912  6.85711265]
[18.34617236 12.38688161  7.90818655 13.06884139 13.21297106 14.37563664
 10.34816536 11.24296876  9.27240655 -0.87166391  8.18199416 22.03018123]
[10.94480969 13.21305668 -0.52829486  3.36869939  5.91551405 15.12213481
  6.75803486  8.99986013 13.29849915 12.01781399 11.33070407 11.41052486]
[ 6.08044752 10.42961095 14.82172033  8.30925617 10.44746111  7.21813162
  7.6386428   4.96176697  9.31302577 10.52269915 15.60619104  9.12644904]
[ 8.49310529 15.49738671 12.15312334 19.83090382 17.73974047 10.00687669
 15.52247955 11.80821258 18.10090736 12.54224093 13.48466179 15.81758966]
[33.79281899  4.46393411 31.92796525 23.82312588 12.92510575 20.96630767
 13.32440026 23.21221672 26.6885659  14.21421324 21.77159053 10.7908947 ]
[23.94992423 26.46146755 18.02052937 19.77411896 18.75385958 29.99488705
 19.6600329  17.07279119 25.54448608 24.98544

INFO:root:Episode 352,last 100 episodes, mean rewards  12.33,  steps 270336, 778.41 steps/s


[11.85184608 15.73618007 11.08783024 15.30683534 11.43441999 11.29415814
 15.47552373 12.82038365  9.54024869 11.72464747 17.87879571 15.03544193]
[17.96707232 15.2697023   9.87769312 18.06432783 16.05493096  7.53370458
 11.93939091  4.77066266 16.78297182 12.36343889 13.20250314 16.05803183]
[20.84541943 23.79248081 16.82294297 10.92062847  8.51640088 19.89541118
 12.39172556 21.15591284 22.66626095 23.65995143 16.99390848 18.4842654 ]
[15.20014543  9.17066593  8.80012556  6.56857205 10.05099597 14.33679564
  5.99995161  6.62935495 13.2708261   8.85878122 12.12306384 12.01338686]
[10.94483585 22.76230371 12.95934442  9.22640025 18.76452216 11.58296436
 12.5276174  11.20429388 10.53251052 18.67789523 19.00507252 13.42169166]
[ 9.95321677  6.99129179 10.71749778  9.68239329  8.74678266 12.09480058
  8.49720744  6.79868188  6.04614826 10.13552356  8.20447613 10.43653908]
[13.43703476 13.38789438 13.14377106  7.52473413  8.63086222  9.00339951
 14.57284889  9.77230753 14.37118324  7.48985

INFO:root:Episode 368,last 100 episodes, mean rewards  12.82,  steps 282624, 786.78 steps/s


[17.80070445 28.14634649 27.7547412  16.71970557 21.43119476 16.01893459
 34.05576285 21.81114865 35.02692435 17.36528265 18.00438753 30.93188675]
[16.12823978 16.88832301 17.18728597 19.07980843 16.69066135 13.15479175
 17.38649743 15.18223368 18.95087801 21.10710623 15.32433229 11.66059274]
[14.51965048 15.36150783 12.2860444   6.20691784  6.22876679 11.98348999
 14.28302828 14.09977516 12.02044212 18.18955441 17.31743379 15.3552684 ]
[20.01557534  8.86628622 19.93490957 14.19912676 11.86062283  6.96973794
 19.62262136 12.65524824 18.00027871 18.02231302 25.59161416 15.19531413]
[18.22294825 16.51184625 13.0006877   8.85083207 20.00386472  7.97133224
 15.95264661  8.8140206  10.66495148 14.79697281  5.70392589  4.9417918 ]
[10.87718474 13.56597757 13.05202067  6.56605343 15.70945827  8.44969878
 12.90823964  7.6002156  10.37028862 15.38409486 14.74234486 12.30450155]
[ 8.23224611 13.34093255 13.7691419  11.81332548 12.61750126 13.69507043
 15.3315014  15.99724842 10.16282546  9.27734

INFO:root:Episode 384,last 100 episodes, mean rewards  13.55,  steps 294912, 785.81 steps/s


[25.21980663 13.03113689 12.82628591 18.67275628 20.19717491 31.16335802
 24.1853568  19.26712024 25.06166953 21.92779672 10.36864167 27.02285673]
[16.56814581 21.92153265 25.87226399 29.92042183 22.96761583 19.89529332
 17.78666868 17.45831391 27.33397505 20.55641568 11.86424036 20.20297519]
[10.6548159  20.9644086  14.28105573 12.11723814  8.17074102 11.15743017
  6.86511284 16.00233152 10.85005131 13.95722853  9.0769066  16.0184559 ]
[ 7.4094932  16.78874671 13.26761195  9.27609396 11.55640872 20.20657601
 19.62070052 16.36589737 14.57359087 19.9587991  19.37887087  9.984623  ]
[28.90058316 17.11677122 24.7704033  19.31744017 13.66019008 22.4606342
 22.25745484 38.93649007 22.36021048 15.87351654 25.89215721 20.30374514]
[17.94752866  7.77958953 21.69136625  9.37925273 18.11474684 10.78128575
 22.31697562 16.55725481 16.53196057 15.43705499 14.72338872  9.75042265]
[ 9.32890904  7.97300496 17.14237519 11.61465128  6.8830197   9.07900393
 15.04757446 10.82619603  8.42853382  5.559304

INFO:root:Episode 400,last 100 episodes, mean rewards  13.80,  steps 307200, 781.34 steps/s


[17.41719273 19.85954597 12.91224309 14.96434036 19.07644852 14.99349446
 22.88466158 22.10366847 13.36241446  7.76706839 11.43011211 17.46098584]
[18.39884956 14.8311377  13.68593422 15.05418768 21.37454264 14.30097731
 14.99321121 12.83295066  6.91788076  7.96721021 12.70865082  7.15489169]
[13.03923575 13.24577541 18.50670595 10.06064743 11.63278889 17.90063743
 21.17222747 20.5957209  18.24197913  8.17352845 23.45646808 16.58254616]
[20.74145114 12.95987215 21.05375423 17.83926184 15.89379657 22.46451964
 17.88112208 23.20439957 15.6492243  17.50108372  8.59644404 18.98820188]
[10.27427059 10.98333689  8.82276353 14.36538421 10.36915449  6.84373509
 18.47016333  8.95135535 11.03450928  7.44676935 12.74758912  7.73122324]
[26.75385339 22.82948385 27.19832442 24.67129865 29.03480638 21.08718158
 28.15903648 32.86512004 30.36606167 33.13418256 25.15770976 14.98901568]
[ 9.60439831 15.66408785 17.16465907 13.79328981 14.13776659 10.2296557
 19.8252074  11.09590477 13.4969157  12.580442

KeyboardInterrupt: 