# Continuous Control

---

In this notebook, you will learn how to use the Unity ML-Agents environment for the second project of the [Deep Reinforcement Learning Nanodegree](https://www.udacity.com/course/deep-reinforcement-learning-nanodegree--nd893) program.

### 1. Start the Environment

We begin by importing the necessary packages.  If the code cell below returns an error, please revisit the project instructions to double-check that you have installed [Unity ML-Agents](https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Installation.md) and [NumPy](http://www.numpy.org/).

In [1]:
from unityagents import UnityEnvironment
import numpy as np

Next, we will start the environment!  **_Before running the code cell below_**, change the `file_name` parameter to match the location of the Unity environment that you downloaded.

- **Mac**: `"path/to/Reacher.app"`
- **Windows** (x86): `"path/to/Reacher_Windows_x86/Reacher.exe"`
- **Windows** (x86_64): `"path/to/Reacher_Windows_x86_64/Reacher.exe"`
- **Linux** (x86): `"path/to/Reacher_Linux/Reacher.x86"`
- **Linux** (x86_64): `"path/to/Reacher_Linux/Reacher.x86_64"`
- **Linux** (x86, headless): `"path/to/Reacher_Linux_NoVis/Reacher.x86"`
- **Linux** (x86_64, headless): `"path/to/Reacher_Linux_NoVis/Reacher.x86_64"`

For instance, if you are using a Mac, then you downloaded `Reacher.app`.  If this file is in the same folder as the notebook, then the line below should appear as follows:
```
env = UnityEnvironment(file_name="Reacher.app")
```

In [2]:
env = UnityEnvironment(file_name='Reacher_Windows_x86_64/Reacher.exe')

INFO:unityagents:
'Academy' started successfully!
Unity Academy name: Academy
        Number of Brains: 1
        Number of External Brains : 1
        Lesson number : 0
        Reset Parameters :
		goal_speed -> 1.0
		goal_size -> 5.0
Unity brain name: ReacherBrain
        Number of Visual Observations (per agent): 0
        Vector Observation space type: continuous
        Vector Observation space size (per agent): 33
        Number of stacked Vector Observation: 1
        Vector Action space type: continuous
        Vector Action space size (per agent): 4
        Vector Action descriptions: , , , 


Environments contain **_brains_** which are responsible for deciding the actions of their associated agents. Here we check for the first brain available, and set it as the default brain we will be controlling from Python.

In [3]:
# get the default brain
brain_name = env.brain_names[0]
brain = env.brains[brain_name]

### 2. Examine the State and Action Spaces

In this environment, a double-jointed arm can move to target locations. A reward of `+0.1` is provided for each step that the agent's hand is in the goal location. Thus, the goal of your agent is to maintain its position at the target location for as many time steps as possible.

The observation space consists of `33` variables corresponding to position, rotation, velocity, and angular velocities of the arm.  Each action is a vector with four numbers, corresponding to torque applicable to two joints.  Every entry in the action vector must be a number between `-1` and `1`.
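
As a rough illustration of the action format described above, the sketch below draws one random action per agent and clips it into the allowed range. The sizes (`num_agents = 1`, `action_size = 4`) and the seeded generator `rng` are assumptions made for this example; in the notebook these values come from the environment itself.

```python
import numpy as np

# Assumed sizes for illustration only: one agent, four action entries.
num_agents, action_size = 1, 4

rng = np.random.default_rng(seed=0)                            # seeded for repeatability
raw_actions = rng.standard_normal((num_agents, action_size))  # unbounded Gaussian samples
actions = np.clip(raw_actions, -1.0, 1.0)                     # force every entry into [-1, 1]

print(actions.shape)
print(bool(actions.min() >= -1.0 and actions.max() <= 1.0))
```

Clipping a standard normal sample maps every value beyond the bounds to exactly `-1` or `1`, so the extreme torques are chosen more often than a uniform sampler would choose them.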

Run the code cell below to print some information about the environment.

In [4]:
# reset the environment
env_info = env.reset(train_mode=True)[brain_name]

# number of agents
num_agents = len(env_info.agents)
print('Number of agents:', num_agents)

# size of each action
action_size = brain.vector_action_space_size
print('Size of each action:', action_size)

# examine the state space 
states = env_info.vector_observations
state_size = states.shape[1]
print('There are {} agents. Each observes a state with length: {}'.format(states.shape[0], state_size))
print('The state for the first agent looks like:', states[0])

Number of agents: 1
Size of each action: 4
There are 1 agents. Each observes a state with length: 33
The state for the first agent looks like: [ 0.00000000e+00 -4.00000000e+00  0.00000000e+00  1.00000000e+00
 -0.00000000e+00 -0.00000000e+00 -4.37113883e-08  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00 -1.00000000e+01  0.00000000e+00
  1.00000000e+00 -0.00000000e+00 -0.00000000e+00 -4.37113883e-08
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  5.75471878e+00 -1.00000000e+00
  5.55726671e+00  0.00000000e+00  1.00000000e+00  0.00000000e+00
 -1.68164849e-01]


### 3. Take Random Actions in the Environment

In the next code cell, you will learn how to use the Python API to control the agent and receive feedback from the environment.

Once this cell is executed, you will watch how the agent performs when it selects an action at random at each time step.  A window should pop up that allows you to observe the agent as it moves through the environment.  Note that the cell only runs when `random_actions` is set to `True`.

Of course, as part of the project, you'll have to change the code so that the agent is able to use its experience to gradually choose better actions when interacting with the environment!
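
One common way for an agent to reuse its experience is a replay buffer that stores past transitions and samples them at random for learning. The sketch below is a minimal, standard-library version; the names `Experience` and `ReplayBuffer` and their parameters are hypothetical and are not taken from the `ddpg_agent` module used later in this notebook.

```python
import random
from collections import deque, namedtuple

# Hypothetical container for one transition (s, a, r, s', done).
Experience = namedtuple(
    "Experience", ["state", "action", "reward", "next_state", "done"])


class ReplayBuffer:
    """Fixed-size buffer that stores transitions and samples them uniformly."""

    def __init__(self, buffer_size, batch_size, seed=0):
        self.memory = deque(maxlen=buffer_size)  # oldest entries are dropped first
        self.batch_size = batch_size
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        """Append one transition to the buffer."""
        self.memory.append(Experience(state, action, reward, next_state, done))

    def sample(self):
        """Return a random batch of stored transitions (without replacement)."""
        return self.rng.sample(list(self.memory), k=self.batch_size)

    def __len__(self):
        return len(self.memory)


# Usage: add 8 toy transitions to a buffer that only holds the 5 most recent.
buffer = ReplayBuffer(buffer_size=5, batch_size=3)
for t in range(8):
    buffer.add(state=t, action=0.0, reward=0.1, next_state=t + 1, done=False)

batch = buffer.sample()
print(len(buffer), len(batch))
```

Sampling at random from the buffer breaks the correlation between consecutive time steps, which is the usual motivation for this design in off-policy methods.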

In [5]:
random_actions = False

In [6]:
if random_actions:
    env_info = env.reset(train_mode=False)[brain_name]     # reset the environment    
    states = env_info.vector_observations                  # get the current state (for each agent)
    scores = np.zeros(num_agents)                          # initialize the score (for each agent)
    while True:
        actions = np.random.randn(num_agents, action_size) # select an action (for each agent)
        actions = np.clip(actions, -1, 1)                  # all actions between -1 and 1
        env_info = env.step(actions)[brain_name]           # send all actions to the environment
        next_states = env_info.vector_observations         # get next state (for each agent)
        rewards = env_info.rewards                         # get reward (for each agent)
        dones = env_info.local_done                        # see if episode finished
        scores += env_info.rewards                         # update the score (for each agent)
        states = next_states                               # roll over states to next time step
        if np.any(dones):                                  # exit loop if episode finished
            break
    print('Total score (averaged over agents) this episode: {}'.format(np.mean(scores)))

When finished, you can close the environment.

In [7]:
if random_actions:
    env.close()

### 4. It's Your Turn!

Now it's your turn to train your own agent to solve the environment!  When training the agent, set `train_mode=True`, so that the line for resetting the environment looks like the following:
```python
env_info = env.reset(train_mode=True)[brain_name]
```

In [8]:
# reset the environment
env_info = env.reset(train_mode=True)[brain_name]
action_size = brain.vector_action_space_size
states = env_info.vector_observations
state_size = states.shape[1]

In [9]:
import random
import torch
import numpy as np
from collections import deque
import matplotlib.pyplot as plt
%matplotlib inline

In [10]:
from ddpg_agent import Agent
agent = Agent(state_size=state_size, action_size=action_size, random_seed=10)

In [11]:
scores_list = []
import pprint

In [12]:
agent_kwargs = {"state_size": state_size, "action_size": action_size, "random_seed": 33, }
agents = [Agent(**agent_kwargs) for _ in range(num_agents)]
assert len(agents) == num_agents
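
The training loop calls `agent.reset()` at the start of each episode, and its comment indicates that this resets exploration noise. The `ddpg_agent` module is not shown here, so the following is only a sketch of one noise process often paired with DDPG, an Ornstein-Uhlenbeck process. The class name `OUNoise` and the values of `theta` and `sigma` are illustrative assumptions, not the notebook's actual settings.

```python
import random


class OUNoise:
    """Ornstein-Uhlenbeck process: correlated noise that drifts back toward `mu`."""

    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, seed=0):
        self.mu = [mu] * size          # long-run mean of each noise dimension
        self.theta = theta             # strength of the pull back toward mu
        self.sigma = sigma             # scale of the random perturbation
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        """Set the internal state back to the mean."""
        self.state = list(self.mu)

    def sample(self):
        """Advance the process by one step and return the new state."""
        self.state = [
            x + self.theta * (m - x) + self.sigma * self.rng.gauss(0.0, 1.0)
            for x, m in zip(self.state, self.mu)
        ]
        return list(self.state)


# Usage: draw one noise vector for a 4-dimensional action, then reset.
noise = OUNoise(size=4)
first = noise.sample()
noise.reset()
print(len(first), noise.state)
```

Resetting at the start of an episode discards the drift accumulated during the previous one, so each episode begins with noise centered on `mu`.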

In [13]:
def ddpg(n_episodes=2000, max_t=700):
    scores_deque = deque(maxlen=100)
    max_score = -np.Inf
    for i_episode in range(1, n_episodes+1):
        # reset the environment for a new episode
        env_info = env.reset(train_mode=True)[brain_name]
#         state = env_info.vector_observations[0]
        states = env_info.vector_observations
        for agent in agents:
            agent.reset()  # reset each agent's exploration noise
        scores = np.zeros(num_agents) # score = 0

        for t in range(max_t):
#             action = agent.act(state, add_noise=True)  # Size of each action: 4
            actions = [agent.act(states[idx]) for idx, agent in enumerate(agents)]

            # ToDo: monitor action
#             print(action)  # [-0.0788092  -0.88971204 -0.896988   -0.89706326]
#             actions = np.clip(actions, -1, 1)                  # all actions between -1 and 1

            #             next_state, reward, done, _ = env.step(action)
#             env_info = env.step(action)[brain_name]
#             next_state = env_info.vector_observations[0]   # get the next state
#             reward = env_info.rewards[0]                   # get the reward
#             print(reward)  # 0.0, 0.0, 0.02, 0.04, 0.03, 0.0
#             done = env_info.local_done[0]
    
            env_info = env.step(actions)[brain_name]
            next_states = env_info.vector_observations  # get next state (for each agent)
            rewards = env_info.rewards  # get reward (for each agent)
            dones = env_info.local_done  # see if episode finished
            step_tuple = zip(agents, states, actions, rewards, next_states, dones)

#             agent.step(state, action, reward, next_state, done)
            for agent, s, a, r, s_, d in step_tuple:
                agent.memory.add(s, a, r, s_, d)
                if (t % 10 == 0):
                    agent.step(s, a, r, s_, d)
#                     agent.step()
#             score += reward
            scores += rewards  # update the score (for each agent)
#             if done:
#                 break
            if np.any(dones):  # exit loop if episode finished
                break
#             state = next_state
            states = next_states
        
        score = np.mean(scores)
        print(score)
        scores_deque.append(score)
        scores_list.append(score)
        
        print('\rEpisode {}\tAverage Score: {:.2f}\tScore: {:.2f}'.format(
            i_episode, np.mean(scores_deque), score), end="")
        if i_episode % 10 == 0:  # save the two local (i.e. regular) nets
            torch.save(agent.actor_local.state_dict(), 'checkpoint_actor.pth')
            torch.save(agent.critic_local.state_dict(),
                       'checkpoint_critic.pth')
            print('\rEpisode {}\tAverage Score: {:.2f}'.format(
                i_episode, np.mean(scores_deque)))
#     return scores
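
The `scores_deque = deque(maxlen=100)` in `ddpg` keeps only the most recent 100 episode scores, so `np.mean(scores_deque)` is a rolling average. The sketch below shows the same mechanism in isolation with a small window; the helper name `rolling_means` is hypothetical.

```python
from collections import deque


def rolling_means(scores, window=100):
    """Return the mean of the last `window` scores after each new score."""
    recent = deque(maxlen=window)  # silently drops the oldest score when full
    means = []
    for score in scores:
        recent.append(score)
        means.append(sum(recent) / len(recent))
    return means


means = rolling_means([0.0, 1.0, 2.0, 3.0], window=2)
print(means)  # [0.0, 0.5, 1.5, 2.5]
```

Until the window fills, the average covers fewer episodes than `window`, which is why the reported average score changes quickly in the first episodes and more slowly afterwards.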

In [None]:
ddpg()

In [None]:
print(scores_list)

In [None]:
fig = plt.figure()
ax = fig.add_subplot(111)
plt.plot(np.arange(1, len(scores_list)+1), scores_list)
plt.ylabel('Score')
plt.xlabel('Episode #')
plt.show()

### 5. Testing

In [12]:
agent.actor_local.load_state_dict(torch.load('checkpoint_actor.pth'))
agent.critic_local.load_state_dict(torch.load('checkpoint_critic.pth'))

In [14]:
for i_episode in range(10):
    env_info = env.reset(train_mode=True)[brain_name]     # reset the environment
    states = env_info.vector_observations                  # get the current state (for each agent)
    agent.reset()
    for i_step in range(1000):
        actions = agent.act(states)
        env_info = env.step(actions)[brain_name]           # send all actions to the environment
        next_states = env_info.vector_observations         # get next state (for each agent)
        states = next_states                               # roll over states to next time step
        if np.any(env_info.local_done):                    # exit loop if episode finished
            break

In [15]:
env.close()

In [None]:
0.0
Episode 1	Average Score: 0.00	Score: 0.00actions batch at 1000-th learning:
	 shape = (128, 4),
	 mean = [0.9368332  0.913111   0.7651488  0.72946566],
	  std = [0.16475068 0.20850752 0.39987826 0.4612174 ]
0.0
Episode 2	Average Score: 0.00	Score: 0.000.0
Episode 3	Average Score: 0.00	Score: 0.00actions batch at 2000-th learning:
	 shape = (128, 4),
	 mean = [0.8455676  0.9617074  0.89049196 0.6331586 ],
	  std = [0.38649136 0.17044906 0.31981197 0.54897213]
0.0
Episode 4	Average Score: 0.00	Score: 0.00actions batch at 3000-th learning:
	 shape = (128, 4),
	 mean = [0.8686924 0.8028698 0.9430816 0.6209773],
	  std = [0.36249006 0.4218102  0.23416072 0.5841726 ]
0.0
Episode 5	Average Score: 0.00	Score: 0.00actions batch at 4000-th learning:
	 shape = (128, 4),
	 mean = [0.9124473 0.805525  0.9456107 0.7491392],
	  std = [0.30162853 0.450631   0.2267814  0.4928763 ]
0.0
Episode 6	Average Score: 0.00	Score: 0.000.0
Episode 7	Average Score: 0.00	Score: 0.00actions batch at 5000-th learning:
	 shape = (128, 4),
	 mean = [0.95789474 0.91096276 0.9492406  0.7160696 ],
	  std = [0.20050935 0.31658065 0.21291286 0.5200734 ]
0.0
Episode 8	Average Score: 0.00	Score: 0.00actions batch at 6000-th learning:
	 shape = (128, 4),
	 mean = [0.93144965 0.9170963  0.9557695  0.80552197],
	  std = [0.26694497 0.289463   0.20738289 0.4439402 ]
0.0
Episode 9	Average Score: 0.00	Score: 0.000.0
Episode 10	Average Score: 0.00
actions batch at 7000-th learning:
	 shape = (128, 4),
	 mean = [0.9712112  0.9103896  0.93924403 0.8130261 ],
	  std = [0.14536116 0.30431736 0.24663185 0.42509466]
0.0
Episode 11	Average Score: 0.00	Score: 0.00actions batch at 8000-th learning:
	 shape = (128, 4),
	 mean = [0.8886721  0.8953277  0.9670292  0.75440615],
	  std = [0.36545694 0.34644124 0.18931207 0.4961262 ]
0.08999999798834324
Episode 12	Average Score: 0.01	Score: 0.090.13999999687075615
Episode 13	Average Score: 0.02	Score: 0.14actions batch at 9000-th learning:
	 shape = (128, 4),
	 mean = [0.83336747 0.78707075 0.8591153  0.83893573],
	  std = [0.42229873 0.46383512 0.38901532 0.39952454]
0.1599999964237213
Episode 14	Average Score: 0.03	Score: 0.16actions batch at 10000-th learning:
	 shape = (128, 4),
	 mean = [0.7732091  0.7065535  0.8239963  0.79108465],
	  std = [0.46272036 0.52519196 0.4181865  0.45045638]
0.3799999915063381
Episode 15	Average Score: 0.05	Score: 0.38actions batch at 11000-th learning:
	 shape = (128, 4),
	 mean = [0.71442044 0.7065757  0.7306834  0.7438079 ],
	  std = [0.49913904 0.50337416 0.5139211  0.4940978 ]
0.23999999463558197
Episode 16	Average Score: 0.06	Score: 0.240.0
Episode 17	Average Score: 0.06	Score: 0.00actions batch at 12000-th learning:
	 shape = (128, 4),
	 mean = [0.70844096 0.7734079  0.7022699  0.71278495],
	  std = [0.5108163  0.46192917 0.52968305 0.51004684]
0.8199999816715717
Episode 18	Average Score: 0.10	Score: 0.82actions batch at 13000-th learning:
	 shape = (128, 4),
	 mean = [0.57853895 0.6608934  0.5945393  0.5756223 ],
	  std = [0.5989095  0.54915553 0.59484994 0.5988235 ]
0.5199999883770943
Episode 19	Average Score: 0.12	Score: 0.520.5099999886006117
Episode 20	Average Score: 0.14
actions batch at 14000-th learning:
	 shape = (128, 4),
	 mean = [0.40545318 0.48896095 0.45047003 0.52270573],
	  std = [0.6381974  0.62225175 0.6326798  0.6053244 ]
0.8699999805539846
Episode 21	Average Score: 0.18	Score: 0.87actions batch at 15000-th learning:
	 shape = (128, 4),
	 mean = [0.41003934 0.3554643  0.54774475 0.434201  ],
	  std = [0.6365919 0.6625399 0.6010787 0.6526374]
0.8999999798834324
Episode 22	Average Score: 0.21	Score: 0.900.3999999910593033
Episode 23	Average Score: 0.22	Score: 0.40actions batch at 16000-th learning:
	 shape = (128, 4),
	 mean = [0.36016166 0.45648882 0.5633818  0.3970818 ],
	  std = [0.63898546 0.615824   0.57792854 0.63787687]
1.3099999707192183
Episode 24	Average Score: 0.26	Score: 1.31actions batch at 17000-th learning:
	 shape = (128, 4),
	 mean = [0.32128692 0.34464777 0.590331   0.2501421 ],
	  std = [0.6369484  0.63913    0.57475144 0.65760946]
0.3199999928474426
Episode 25	Average Score: 0.27	Score: 0.32actions batch at 18000-th learning:
	 shape = (128, 4),
	 mean = [0.34689972 0.3918433  0.5609776  0.3990742 ],
	  std = [0.667413   0.65251637 0.59834945 0.64529234]
0.2199999950826168
Episode 26	Average Score: 0.26	Score: 0.220.1599999964237213
Episode 27	Average Score: 0.26	Score: 0.16actions batch at 19000-th learning:
	 shape = (128, 4),
	 mean = [0.3262855  0.3289223  0.56854117 0.31748694],
	  std = [0.6430974  0.63497597 0.6042413  0.65574175]
0.08999999798834324
Episode 28	Average Score: 0.25	Score: 0.09actions batch at 20000-th learning:
	 shape = (128, 4),
	 mean = [0.30036214 0.2589074  0.63471615 0.22647826],
	  std = [0.6453207  0.6473552  0.5678511  0.65087193]
0.11999999731779099
Episode 29	Average Score: 0.25	Score: 0.120.0
Episode 30	Average Score: 0.24
actions batch at 21000-th learning:
	 shape = (128, 4),
	 mean = [0.3580498  0.32361144 0.544524   0.37792873],
	  std = [0.6284045  0.6538532  0.59553474 0.6442828 ]
0.24999999441206455
Episode 31	Average Score: 0.24	Score: 0.25actions batch at 22000-th learning:
	 shape = (128, 4),
	 mean = [0.19585313 0.16288635 0.50078875 0.16643383],
	  std = [0.6499789  0.6138219  0.6185572  0.63834804]
0.1099999975413084
Episode 32	Average Score: 0.24	Score: 0.110.8099999818950891
Episode 33	Average Score: 0.26	Score: 0.81actions batch at 23000-th learning:
	 shape = (128, 4),
	 mean = [0.40011907 0.3201105  0.6609279  0.2886256 ],
	  std = [0.6637805 0.6518069 0.5691627 0.6458452]
0.41999999061226845
Episode 34	Average Score: 0.26	Score: 0.42actions batch at 24000-th learning:
	 shape = (128, 4),
	 mean = [0.3888832  0.22682355 0.5930466  0.24551077],
	  std = [0.66120964 0.6547482  0.5845481  0.64848083]
0.08999999798834324
Episode 35	Average Score: 0.26	Score: 0.09actions batch at 25000-th learning:
	 shape = (128, 4),
	 mean = [0.27949    0.22746344 0.5409232  0.1718679 ],
	  std = [0.6493582  0.6406939  0.6271888  0.63979954]
0.6299999859184027
Episode 36	Average Score: 0.27	Score: 0.630.0
Episode 37	Average Score: 0.26	Score: 0.00actions batch at 26000-th learning:
	 shape = (128, 4),
	 mean = [0.3267429  0.3205608  0.53303653 0.28985992],
	  std = [0.6623145  0.6391742  0.6137441  0.67099977]
2.12999995239079
Episode 38	Average Score: 0.31	Score: 2.13actions batch at 27000-th learning:
	 shape = (128, 4),
	 mean = [0.45074606 0.40060338 0.60144573 0.40420246],
	  std = [0.6281709  0.64942986 0.5779819  0.64464563]
0.0
Episode 39	Average Score: 0.30	Score: 0.000.9699999783188105
Episode 40	Average Score: 0.32
actions batch at 28000-th learning:
	 shape = (128, 4),
	 mean = [0.2141951  0.26665655 0.5345362  0.1660307 ],
	  std = [0.6461623  0.6359192  0.61666155 0.61745566]
2.0199999548494816
Episode 41	Average Score: 0.36	Score: 2.02actions batch at 29000-th learning:
	 shape = (128, 4),
	 mean = [0.44543514 0.26099902 0.5913011  0.20713763],
	  std = [0.6438238 0.625896  0.5872445 0.6455962]
1.1099999751895666
Episode 42	Average Score: 0.38	Score: 1.110.0
Episode 43	Average Score: 0.37	Score: 0.00actions batch at 30000-th learning:
	 shape = (128, 4),
	 mean = [0.37584415 0.3174604  0.5286791  0.26573488],
	  std = [0.6499039 0.6528173 0.6298577 0.6684524]
0.0
Episode 44	Average Score: 0.36	Score: 0.00actions batch at 31000-th learning:
	 shape = (128, 4),
	 mean = [0.3271593  0.37439182 0.5815926  0.2483398 ],
	  std = [0.6591647  0.642501   0.580598   0.65745085]
0.6599999852478504
Episode 45	Average Score: 0.37	Score: 0.66actions batch at 32000-th learning:
	 shape = (128, 4),
	 mean = [0.39573    0.34457853 0.44652593 0.3153271 ],
	  std = [0.6435218  0.6561359  0.6278616  0.65303904]
0.7299999836832285
Episode 46	Average Score: 0.37	Score: 0.731.2299999725073576
Episode 47	Average Score: 0.39	Score: 1.23actions batch at 33000-th learning:
	 shape = (128, 4),
	 mean = [0.32668933 0.33032086 0.5087584  0.27983934],
	  std = [0.63730866 0.65227765 0.625835   0.6547617 ]
2.0999999530613422
Episode 48	Average Score: 0.43	Score: 2.10actions batch at 34000-th learning:
	 shape = (128, 4),
	 mean = [0.3514457  0.27974814 0.5114357  0.14669569],
	  std = [0.64526117 0.6667877  0.60573554 0.6333539 ]
0.7099999841302633
Episode 49	Average Score: 0.43	Score: 0.710.8899999801069498
Episode 50	Average Score: 0.44
actions batch at 35000-th learning:
	 shape = (128, 4),
	 mean = [0.40773633 0.38250494 0.5991274  0.29457507],
	  std = [0.6477609 0.6531643 0.5960358 0.6613005]
1.9199999570846558
Episode 51	Average Score: 0.47	Score: 1.92actions batch at 36000-th learning:
	 shape = (128, 4),
	 mean = [0.2960464  0.24981673 0.41959906 0.21998653],
	  std = [0.6402488  0.6435101  0.64271975 0.6591348 ]
1.7799999602138996
Episode 52	Average Score: 0.50	Score: 1.780.6999999843537807
Episode 53	Average Score: 0.50	Score: 0.70actions batch at 37000-th learning:
	 shape = (128, 4),
	 mean = [0.3032997  0.35178155 0.4724636  0.22084881],
	  std = [0.64056486 0.6387229  0.5884532  0.62289065]
1.2399999722838402
Episode 54	Average Score: 0.51	Score: 1.24actions batch at 38000-th learning:
	 shape = (128, 4),
	 mean = [0.31436372 0.34592578 0.49883503 0.31367895],
	  std = [0.6352533  0.64694756 0.61728734 0.65512097]
1.1299999747425318
Episode 55	Average Score: 0.53	Score: 1.13actions batch at 39000-th learning:
	 shape = (128, 4),
	 mean = [0.31990543 0.2579474  0.4726624  0.3280322 ],
	  std = [0.6533201  0.64481336 0.6141738  0.62988263]
1.0599999763071537
Episode 56	Average Score: 0.53	Score: 1.060.4399999901652336
Episode 57	Average Score: 0.53	Score: 0.44actions batch at 40000-th learning:
	 shape = (128, 4),
	 mean = [0.28619513 0.29745054 0.54061896 0.28867537],
	  std = [0.6587296  0.6519116  0.61958236 0.6451937 ]
0.3799999915063381
Episode 58	Average Score: 0.53	Score: 0.38actions batch at 41000-th learning:
	 shape = (128, 4),
	 mean = [0.28058076 0.23821351 0.48711628 0.254015  ],
	  std = [0.6695568  0.6523496  0.6278066  0.62979466]
1.0399999767541885
Episode 59	Average Score: 0.54	Score: 1.041.939999956637621
Episode 60	Average Score: 0.56
actions batch at 42000-th learning:
	 shape = (128, 4),
	 mean = [0.3156876  0.31369156 0.46162027 0.32394582],
	  std = [0.6553512 0.6417168 0.6113654 0.647255 ]
1.2499999720603228
Episode 61	Average Score: 0.57	Score: 1.25actions batch at 43000-th learning:
	 shape = (128, 4),
	 mean = [0.26398194 0.31954297 0.44185382 0.2388721 ],
	  std = [0.6292372  0.64580745 0.6303025  0.6304557 ]
1.1899999734014273
Episode 62	Average Score: 0.58	Score: 1.191.6599999628961086
Episode 63	Average Score: 0.60	Score: 1.66actions batch at 44000-th learning:
	 shape = (128, 4),
	 mean = [0.32405645 0.32669753 0.46806854 0.27614132],
	  std = [0.65338653 0.6270733  0.61938316 0.64884084]
0.1699999962002039
Episode 64	Average Score: 0.59	Score: 0.17actions batch at 45000-th learning:
	 shape = (128, 4),
	 mean = [0.18523872 0.24931498 0.33141384 0.22278446],
	  std = [0.6265344  0.6276133  0.617353   0.62497705]
1.1599999740719795
Episode 65	Average Score: 0.60	Score: 1.16actions batch at 46000-th learning:
	 shape = (128, 4),
	 mean = [0.25059748 0.34658512 0.35227925 0.2564374 ],
	  std = [0.6382145  0.66051376 0.6492352  0.6410386 ]
0.35999999195337296
Episode 66	Average Score: 0.60	Score: 0.360.9899999778717756
Episode 67	Average Score: 0.60	Score: 0.99actions batch at 47000-th learning:
	 shape = (128, 4),
	 mean = [0.3068998  0.39354718 0.4376912  0.27758166],
	  std = [0.6417057  0.62951905 0.62938464 0.64046586]
0.6599999852478504
Episode 68	Average Score: 0.61	Score: 0.66actions batch at 48000-th learning:
	 shape = (128, 4),
	 mean = [0.29390904 0.3549543  0.31321737 0.33977446],
	  std = [0.628543  0.626165  0.6390235 0.63027  ]
1.3999999687075615
Episode 69	Average Score: 0.62	Score: 1.400.3999999910593033
Episode 70	Average Score: 0.61
actions batch at 49000-th learning:
	 shape = (128, 4),
	 mean = [0.24720222 0.33309898 0.42166045 0.3333599 ],
	  std = [0.6373282  0.6414367  0.6382227  0.65830815]
0.5199999883770943
Episode 71	Average Score: 0.61	Score: 0.52actions batch at 50000-th learning:
	 shape = (128, 4),
	 mean = [0.22586668 0.37618202 0.4159114  0.32817337],
	  std = [0.63344634 0.63341516 0.6100769  0.6306782 ]
0.4499999899417162
Episode 72	Average Score: 0.61	Score: 0.450.42999999038875103
Episode 73	Average Score: 0.61	Score: 0.43actions batch at 51000-th learning:
	 shape = (128, 4),
	 mean = [0.27087116 0.39616984 0.36046535 0.25712138],
	  std = [0.64419013 0.6185777  0.63802934 0.62231386]
1.6599999628961086
Episode 74	Average Score: 0.62	Score: 1.66actions batch at 52000-th learning:
	 shape = (128, 4),
	 mean = [0.23184338 0.2843532  0.34665966 0.31900215],
	  std = [0.6164863  0.65429777 0.624139   0.6305414 ]
0.549999987706542
Episode 75	Average Score: 0.62	Score: 0.55actions batch at 53000-th learning:
	 shape = (128, 4),
	 mean = [0.32554615 0.33615687 0.4054471  0.31111687],
	  std = [0.626914   0.6341421  0.63215417 0.64664465]
0.0
Episode 76	Average Score: 0.61	Score: 0.000.549999987706542
Episode 77	Average Score: 0.61	Score: 0.55actions batch at 54000-th learning:
	 shape = (128, 4),
	 mean = [0.3439024  0.32374713 0.40927845 0.2739764 ],
	  std = [0.64343596 0.6336347  0.62056714 0.6273167 ]
0.3199999928474426
Episode 78	Average Score: 0.61	Score: 0.32actions batch at 55000-th learning:
	 shape = (128, 4),
	 mean = [0.15410276 0.20625116 0.3625148  0.20477235],
	  std = [0.57450515 0.61294997 0.64763904 0.6241376 ]
1.339999970048666
Episode 79	Average Score: 0.62	Score: 1.340.5399999879300594
Episode 80	Average Score: 0.62
actions batch at 56000-th learning:
	 shape = (128, 4),
	 mean = [0.3169377  0.377274   0.41856053 0.3777416 ],
	  std = [0.62097496 0.63784987 0.6363084  0.6388291 ]
0.13999999687075615
Episode 81	Average Score: 0.61	Score: 0.14actions batch at 57000-th learning:
	 shape = (128, 4),
	 mean = [0.15699337 0.39438456 0.26440656 0.221831  ],
	  std = [0.6004398  0.6352734  0.6368726  0.60976225]
1.1599999740719795
Episode 82	Average Score: 0.62	Score: 1.161.2499999720603228
Episode 83	Average Score: 0.63	Score: 1.25actions batch at 58000-th learning:
	 shape = (128, 4),
	 mean = [0.31013674 0.39928678 0.49929857 0.29757112],
	  std = [0.63507175 0.6406209  0.609958   0.626305  ]
0.6399999856948853
Episode 84	Average Score: 0.63	Score: 0.64actions batch at 59000-th learning:
	 shape = (128, 4),
	 mean = [0.32470682 0.2728946  0.3584144  0.2686361 ],
	  std = [0.6418662 0.6247969 0.6196615 0.6452995]
0.9699999783188105
Episode 85	Average Score: 0.63	Score: 0.97actions batch at 60000-th learning:
	 shape = (128, 4),
	 mean = [0.38428548 0.29084885 0.35812488 0.2106804 ],
	  std = [0.6325644  0.62802154 0.62731683 0.6271151 ]
1.2999999709427357
Episode 86	Average Score: 0.64	Score: 1.302.2599999494850636
Episode 87	Average Score: 0.66	Score: 2.26actions batch at 61000-th learning:
	 shape = (128, 4),
	 mean = [0.2177942  0.3184337  0.35938537 0.23661475],
	  std = [0.62610716 0.6343355  0.62382305 0.6402961 ]
0.6999999843537807
Episode 88	Average Score: 0.66	Score: 0.70actions batch at 62000-th learning:
	 shape = (128, 4),
	 mean = [0.2975654  0.42794082 0.4449501  0.28003472],
	  std = [0.6311578 0.6190677 0.5893335 0.6395589]
0.6799999848008156
Episode 89	Average Score: 0.66	Score: 0.680.14999999664723873
Episode 90	Average Score: 0.65
actions batch at 63000-th learning:
	 shape = (128, 4),
	 mean = [0.19271128 0.30708778 0.4503786  0.2078336 ],
	  std = [0.6357504  0.61721087 0.63147646 0.6191748 ]
0.8899999801069498
Episode 91	Average Score: 0.65	Score: 0.89actions batch at 64000-th learning:
	 shape = (128, 4),
	 mean = [0.24366833 0.3186137  0.3742056  0.21814914],
	  std = [0.6159226  0.6158625  0.61806124 0.6222638 ]
1.149999974295497
Episode 92	Average Score: 0.66	Score: 1.151.7499999608844519
Episode 93	Average Score: 0.67	Score: 1.75actions batch at 65000-th learning:
	 shape = (128, 4),
	 mean = [0.22584087 0.32033038 0.31148705 0.3135241 ],
	  std = [0.6451623 0.6109047 0.6254171 0.6379716]
0.2799999937415123
Episode 94	Average Score: 0.67	Score: 0.28actions batch at 66000-th learning:
	 shape = (128, 4),
	 mean = [0.16984813 0.3025721  0.36434013 0.2320921 ],
	  std = [0.61576885 0.6252344  0.6153418  0.62458396]
0.8099999818950891
Episode 95	Average Score: 0.67	Score: 0.81actions batch at 67000-th learning:
	 shape = (128, 4),
	 mean = [0.25487393 0.37888268 0.4203266  0.26461884],
	  std = [0.620932   0.6267157  0.61483747 0.6148122 ]
3.1699999291449785
Episode 96	Average Score: 0.69	Score: 3.171.1999999731779099
Episode 97	Average Score: 0.70	Score: 1.20actions batch at 68000-th learning:
	 shape = (128, 4),
	 mean = [0.26599765 0.3647638  0.2813592  0.2594468 ],
	  std = [0.60570174 0.62460095 0.6311952  0.6108531 ]
1.3299999702721834
Episode 98	Average Score: 0.71	Score: 1.33actions batch at 69000-th learning:
	 shape = (128, 4),
	 mean = [0.07577047 0.29213744 0.22007163 0.19142129],
	  std = [0.55967987 0.59856117 0.6094727  0.58914495]
0.6799999848008156
Episode 99	Average Score: 0.71	Score: 0.681.5699999649077654
Episode 100	Average Score: 0.71
actions batch at 70000-th learning:
	 shape = (128, 4),
	 mean = [0.23868515 0.3017905  0.3344451  0.28074837],
	  std = [0.6275234  0.6293779  0.6140343  0.63600534]
1.9699999559670687
Episode 101	Average Score: 0.73	Score: 1.97actions batch at 71000-th learning:
	 shape = (128, 4),
	 mean = [0.23604576 0.3090293  0.26743644 0.23019107],
	  std = [0.6260617  0.6152235  0.62590706 0.6401807 ]
0.9999999776482582
Episode 102	Average Score: 0.74	Score: 1.002.0199999548494816
Episode 103	Average Score: 0.76	Score: 2.02actions batch at 72000-th learning:
	 shape = (128, 4),
	 mean = [0.12179673 0.31397733 0.33659703 0.21807306],
	  std = [0.5998169  0.63570285 0.61716855 0.6202503 ]
1.3699999693781137
Episode 104	Average Score: 0.78	Score: 1.37actions batch at 73000-th learning:
	 shape = (128, 4),
	 mean = [0.25461963 0.27281457 0.36500657 0.41220143],
	  std = [0.601879  0.608701  0.6159682 0.6049978]
0.3499999921768904
Episode 105	Average Score: 0.78	Score: 0.35actions batch at 74000-th learning:
	 shape = (128, 4),
	 mean = [0.24194252 0.37055737 0.22818513 0.30198923],
	  std = [0.6251107  0.623375   0.6285125  0.62503165]
1.1599999740719795
Episode 106	Average Score: 0.79	Score: 1.161.4799999669194221
Episode 107	Average Score: 0.81	Score: 1.48actions batch at 75000-th learning:
	 shape = (128, 4),
	 mean = [0.22019587 0.2538052  0.28936785 0.22350763],
	  std = [0.6066451  0.62739277 0.6116809  0.6180243 ]
0.5099999886006117
Episode 108	Average Score: 0.81	Score: 0.51actions batch at 76000-th learning:
	 shape = (128, 4),
	 mean = [0.17239875 0.41067448 0.22138572 0.19655304],
	  std = [0.61371446 0.62614346 0.61796725 0.62645626]
0.7799999825656414
Episode 109	Average Score: 0.82	Score: 0.781.169999973848462
Episode 110	Average Score: 0.83
actions batch at 77000-th learning:
	 shape = (128, 4),
	 mean = [0.25800762 0.35359165 0.3473912  0.36042717],
	  std = [0.63636804 0.625619   0.6092698  0.62883085]
0.8099999818950891
Episode 111	Average Score: 0.84	Score: 0.81actions batch at 78000-th learning:
	 shape = (128, 4),
	 mean = [0.24102667 0.31007904 0.37693664 0.24461004],
	  std = [0.6178839  0.6136605  0.62903196 0.6343712 ]
0.3199999928474426
Episode 112	Average Score: 0.84	Score: 0.320.1599999964237213
Episode 113	Average Score: 0.84	Score: 0.16actions batch at 79000-th learning:
	 shape = (128, 4),
	 mean = [0.2705605  0.35799003 0.32694918 0.40600458],
	  std = [0.6077779  0.6264295  0.62299395 0.6299352 ]
0.0
Episode 114	Average Score: 0.84	Score: 0.00
actions batch at 80000-th learning:
	 shape = (128, 4),
	 mean = [0.2839339  0.36669168 0.33305436 0.21078889],
	  std = [0.61649877 0.6031337  0.60929424 0.6061517 ]
1.6199999637901783
Episode 115	Average Score: 0.85	Score: 1.62
actions batch at 81000-th learning:
	 shape = (128, 4),
	 mean = [0.23651129 0.3158104  0.28492996 0.2516399 ],
	  std = [0.6213135  0.6107184  0.61895734 0.6083591 ]
0.9199999794363976
Episode 116	Average Score: 0.86	Score: 0.92
0.35999999195337296
Episode 117	Average Score: 0.86	Score: 0.36
actions batch at 82000-th learning:
	 shape = (128, 4),
	 mean = [0.1504214  0.3012068  0.38779828 0.18650734],
	  std = [0.5864619  0.6328788  0.6237812  0.62014097]
0.1599999964237213
Episode 118	Average Score: 0.86	Score: 0.16
actions batch at 83000-th learning:
	 shape = (128, 4),
	 mean = [0.27062148 0.36874226 0.3647378  0.2665895 ],
	  std = [0.6046687  0.60513294 0.61960506 0.6266771 ]
1.7299999613314867
Episode 119	Average Score: 0.87	Score: 1.73
1.939999956637621
Episode 120	Average Score: 0.88
actions batch at 84000-th learning:
	 shape = (128, 4),
	 mean = [0.25273708 0.29015973 0.34882385 0.2525325 ],
	  std = [0.61487156 0.60736454 0.62411517 0.6327215 ]
0.2799999937415123
Episode 121	Average Score: 0.88	Score: 0.28
actions batch at 85000-th learning:
	 shape = (128, 4),
	 mean = [0.2761733  0.313816   0.36991388 0.23578143],
	  std = [0.63054764 0.63421243 0.6358753  0.62560457]
0.7099999841302633
Episode 122	Average Score: 0.88	Score: 0.71
1.529999965801835
Episode 123	Average Score: 0.89	Score: 1.53
actions batch at 86000-th learning:
	 shape = (128, 4),
	 mean = [0.28336418 0.32681176 0.26980773 0.2869182 ],
	  std = [0.6229108  0.6139461  0.61235535 0.6003115 ]
0.0
Episode 124	Average Score: 0.87	Score: 0.00
actions batch at 87000-th learning:
	 shape = (128, 4),
	 mean = [0.16579556 0.281383   0.20777735 0.1900434 ],
	  std = [0.5815342  0.60993433 0.58998483 0.5879882 ]
4.009999910369515
Episode 125	Average Score: 0.91	Score: 4.01
actions batch at 88000-th learning:
	 shape = (128, 4),
	 mean = [0.09505996 0.23299147 0.28371787 0.3416205 ],
	  std = [0.5656929 0.6190814 0.6052275 0.6125957]
1.5999999642372131
Episode 126	Average Score: 0.92	Score: 1.60
1.5399999655783176
Episode 127	Average Score: 0.94	Score: 1.54
actions batch at 89000-th learning:
	 shape = (128, 4),
	 mean = [0.13621303 0.28953615 0.3520427  0.2594512 ],
	  std = [0.6041344 0.6224532 0.6151047 0.6339606]
2.8399999365210533
Episode 128	Average Score: 0.97	Score: 2.84
actions batch at 90000-th learning:
	 shape = (128, 4),
	 mean = [0.13541462 0.2769566  0.3626123  0.2058708 ],
	  std = [0.5940897  0.61167413 0.61929137 0.61741483]
2.4799999445676804
Episode 129	Average Score: 0.99	Score: 2.48
2.5899999421089888
Episode 130	Average Score: 1.02
actions batch at 91000-th learning:
	 shape = (128, 4),
	 mean = [0.11230359 0.25768912 0.31967747 0.23128732],
	  std = [0.5986471  0.6093092  0.59121907 0.60971546]
1.459999967366457
Episode 131	Average Score: 1.03	Score: 1.46
actions batch at 92000-th learning:
	 shape = (128, 4),
	 mean = [0.12457738 0.3499898  0.28530505 0.25449884],
	  std = [0.5790815  0.600274   0.60026324 0.61111253]
0.29999999329447746
Episode 132	Average Score: 1.03	Score: 0.30
2.179999951273203
Episode 133	Average Score: 1.04	Score: 2.18
actions batch at 93000-th learning:
	 shape = (128, 4),
	 mean = [0.13915218 0.2510472  0.26990512 0.2044585 ],
	  std = [0.5893565  0.6079547  0.6394042  0.61736524]
2.5999999418854713
Episode 134	Average Score: 1.07	Score: 2.60
actions batch at 94000-th learning:
	 shape = (128, 4),
	 mean = [0.17589602 0.29610956 0.3271139  0.16021514],
	  std = [0.5680093  0.5943745  0.58681905 0.59520215]
0.8899999801069498
Episode 135	Average Score: 1.07	Score: 0.89
actions batch at 95000-th learning:
	 shape = (128, 4),
	 mean = [0.10731536 0.26581937 0.26158413 0.21140598],
	  std = [0.56599206 0.60864824 0.6220907  0.5951336 ]
2.4099999461323023
Episode 136	Average Score: 1.09	Score: 2.41
2.8899999354034662
Episode 137	Average Score: 1.12	Score: 2.89
actions batch at 96000-th learning:
	 shape = (128, 4),
	 mean = [0.10648315 0.35031813 0.2624931  0.287181  ],
	  std = [0.5884634 0.6329903 0.6163188 0.629652 ]
2.5899999421089888
Episode 138	Average Score: 1.12	Score: 2.59
actions batch at 97000-th learning:
	 shape = (128, 4),
	 mean = [0.19729339 0.2513015  0.2635001  0.2358474 ],
	  std = [0.5905768  0.6097635  0.63177633 0.60941696]
1.699999962002039
Episode 139	Average Score: 1.14	Score: 1.70
2.7699999380856752
Episode 140	Average Score: 1.16
actions batch at 98000-th learning:
	 shape = (128, 4),
	 mean = [0.21074016 0.40150124 0.29789037 0.13949224],
	  std = [0.6461006  0.6297786  0.642401   0.62248605]
0.0
Episode 141	Average Score: 1.14	Score: 0.00
actions batch at 99000-th learning:
	 shape = (128, 4),
	 mean = [0.02187528 0.1884748  0.30114275 0.08386669],
	  std = [0.55368125 0.5977874  0.60539705 0.56250405]
1.699999962002039
Episode 142	Average Score: 1.15	Score: 1.70
1.4199999682605267
Episode 143	Average Score: 1.16	Score: 1.42
actions batch at 100000-th learning:
	 shape = (128, 4),
	 mean = [0.03173192 0.28565452 0.2499042  0.17592303],
	  std = [0.563812   0.61312276 0.6032759  0.594532  ]
1.9699999559670687
Episode 144	Average Score: 1.18	Score: 1.97
actions batch at 101000-th learning:
	 shape = (128, 4),
	 mean = [0.14593723 0.28071687 0.38551643 0.2657659 ],
	  std = [0.59595054 0.6152023  0.6215922  0.6178747 ]
0.7599999830126762
Episode 145	Average Score: 1.18	Score: 0.76
actions batch at 102000-th learning:
	 shape = (128, 4),
	 mean = [0.11116302 0.3445767  0.26142684 0.12555613],
	  std = [0.5973634  0.6233698  0.60676414 0.5817949 ]
3.209999928250909
Episode 146	Average Score: 1.20	Score: 3.21
1.4199999682605267
Episode 147	Average Score: 1.21	Score: 1.42
actions batch at 103000-th learning:
	 shape = (128, 4),
	 mean = [0.15065104 0.2594118  0.22887576 0.18653862],
	  std = [0.591604   0.61623454 0.6211856  0.6096918 ]
2.149999951943755
Episode 148	Average Score: 1.21	Score: 2.15
actions batch at 104000-th learning:
	 shape = (128, 4),
	 mean = [0.08953416 0.26337695 0.28345528 0.25310528],
	  std = [0.57219887 0.60681766 0.6242086  0.59863776]
2.1899999510496855
Episode 149	Average Score: 1.22	Score: 2.19
3.2199999280273914
Episode 150	Average Score: 1.25
actions batch at 105000-th learning:
	 shape = (128, 4),
	 mean = [0.12615576 0.29396713 0.31973457 0.20229167],
	  std = [0.60977626 0.5840536  0.62671    0.59013134]
0.9999999776482582
Episode 151	Average Score: 1.24	Score: 1.00
actions batch at 106000-th learning:
	 shape = (128, 4),
	 mean = [0.24909438 0.3557334  0.3797126  0.23395823],
	  std = [0.61800444 0.60893834 0.6235055  0.6142282 ]
2.2799999490380287
Episode 152	Average Score: 1.24	Score: 2.28
0.5399999879300594
Episode 153	Average Score: 1.24	Score: 0.54
actions batch at 107000-th learning:
	 shape = (128, 4),
	 mean = [0.10923679 0.29631683 0.22810988 0.19846532],
	  std = [0.59031093 0.59987783 0.60285497 0.59719265]
2.0999999530613422
Episode 154	Average Score: 1.25	Score: 2.10
actions batch at 108000-th learning:
	 shape = (128, 4),
	 mean = [0.13611357 0.29410428 0.3051329  0.2672056 ],
	  std = [0.5742897  0.61453927 0.6134371  0.6162865 ]
2.419999945908785
Episode 155	Average Score: 1.26	Score: 2.42
actions batch at 109000-th learning:
	 shape = (128, 4),
	 mean = [0.13365543 0.3970812  0.34479946 0.25940284],
	  std = [0.6095884  0.6062539  0.610244   0.63757604]
1.7599999606609344
Episode 156	Average Score: 1.27	Score: 1.76
0.18999999575316906
Episode 157	Average Score: 1.27	Score: 0.19
actions batch at 110000-th learning:
	 shape = (128, 4),
	 mean = [0.13267176 0.27543214 0.29390714 0.21510877],
	  std = [0.58727485 0.5950946  0.6065138  0.6230372 ]
1.81999995931983
Episode 158	Average Score: 1.28	Score: 1.82
actions batch at 111000-th learning:
	 shape = (128, 4),
	 mean = [0.09824385 0.29184368 0.34138057 0.18446669],
	  std = [0.58264655 0.61357003 0.6104255  0.60294074]
2.849999936297536
Episode 159	Average Score: 1.30	Score: 2.85
1.2999999709427357
Episode 160	Average Score: 1.29
actions batch at 112000-th learning:
	 shape = (128, 4),
	 mean = [0.1404875  0.33853787 0.2395662  0.15861194],
	  std = [0.5795167 0.5905002 0.6040638 0.6137525]
2.8599999360740185
Episode 161	Average Score: 1.31	Score: 2.86
actions batch at 113000-th learning:
	 shape = (128, 4),
	 mean = [0.20671009 0.38742825 0.23234573 0.23290601],
	  std = [0.61916953 0.6028736  0.6027442  0.6191131 ]
3.36999992467463
Episode 162	Average Score: 1.33	Score: 3.37
2.74999993853271
Episode 163	Average Score: 1.34	Score: 2.75
actions batch at 114000-th learning:
	 shape = (128, 4),
	 mean = [0.1540212  0.36712745 0.32605308 0.28943497],
	  std = [0.61123216 0.61354387 0.6172813  0.61546457]
0.6799999848008156
Episode 164	Average Score: 1.35	Score: 0.68
actions batch at 115000-th learning:
	 shape = (128, 4),
	 mean = [0.20270193 0.38519812 0.32148552 0.2974125 ],
	  std = [0.62806576 0.6266638  0.61411726 0.61182463]
1.8299999590963125
Episode 165	Average Score: 1.35	Score: 1.83
actions batch at 116000-th learning:
	 shape = (128, 4),
	 mean = [0.08178514 0.3056395  0.30922738 0.21619003],
	  std = [0.548998  0.5972088 0.5967392 0.6221917]
3.0499999318271875
Episode 166	Average Score: 1.38	Score: 3.05
2.2399999499320984
Episode 167	Average Score: 1.39	Score: 2.24
actions batch at 117000-th learning:
	 shape = (128, 4),
	 mean = [0.18288197 0.41515976 0.3479264  0.22305147],
	  std = [0.6027419  0.61161375 0.59720427 0.6055167 ]
3.3299999255687
Episode 168	Average Score: 1.42	Score: 3.33
actions batch at 118000-th learning:
	 shape = (128, 4),
	 mean = [0.2540024  0.35380653 0.28182113 0.16740657],
	  std = [0.6120726 0.6293182 0.6186944 0.6201261]
3.609999919310212
Episode 169	Average Score: 1.44	Score: 3.61
2.869999935850501
Episode 170	Average Score: 1.47
actions batch at 119000-th learning:
	 shape = (128, 4),
	 mean = [0.08255266 0.22704807 0.27192762 0.15760519],
	  std = [0.58428454 0.6009379  0.61936015 0.6069366 ]
2.0699999537318945
Episode 171	Average Score: 1.48	Score: 2.07
actions batch at 120000-th learning:
	 shape = (128, 4),
	 mean = [0.12071066 0.3134974  0.33927107 0.18613926],
	  std = [0.5790158  0.60175335 0.6019855  0.6167397 ]
2.7899999376386404
Episode 172	Average Score: 1.50	Score: 2.79
1.959999956190586
Episode 173	Average Score: 1.52	Score: 1.96
actions batch at 121000-th learning:
	 shape = (128, 4),
	 mean = [0.16604517 0.3559894  0.26943374 0.24385646],
	  std = [0.5691454  0.60854214 0.6032122  0.610287  ]
3.9899999108165503
Episode 174	Average Score: 1.54	Score: 3.99
actions batch at 122000-th learning:
	 shape = (128, 4),
	 mean = [0.25138164 0.30194107 0.35118774 0.33829838],
	  std = [0.62294316 0.6203356  0.5983846  0.60341835]
2.9099999349564314
Episode 175	Average Score: 1.57	Score: 2.91
actions batch at 123000-th learning:
	 shape = (128, 4),
	 mean = [0.11693725 0.43061796 0.23515053 0.28635013],
	  std = [0.60694593 0.5928099  0.610879   0.6095932 ]
1.1299999747425318
Episode 176	Average Score: 1.58	Score: 1.13
5.1799998842179775
Episode 177	Average Score: 1.62	Score: 5.18
actions batch at 124000-th learning:
	 shape = (128, 4),
	 mean = [0.18535061 0.3348822  0.29630598 0.21648508],
	  std = [0.6040074  0.59531057 0.60270256 0.6019591 ]
1.6699999626725912
Episode 178	Average Score: 1.64	Score: 1.67
actions batch at 125000-th learning:
	 shape = (128, 4),
	 mean = [0.09702244 0.2908188  0.20490216 0.10899619],
	  std = [0.56026703 0.6157967  0.5970619  0.57393664]
2.5899999421089888
Episode 179	Average Score: 1.65	Score: 2.59
1.0599999763071537
Episode 180	Average Score: 1.66
actions batch at 126000-th learning:
	 shape = (128, 4),
	 mean = [0.20203838 0.42683995 0.3049933  0.30725318],
	  std = [0.6118312  0.59668607 0.6127289  0.5987963 ]
2.7599999383091927
Episode 181	Average Score: 1.68	Score: 2.76
actions batch at 127000-th learning:
	 shape = (128, 4),
	 mean = [0.23314053 0.44939822 0.34553194 0.28615522],
	  std = [0.6045825 0.604005  0.614029  0.6007067]
3.469999922439456
Episode 182	Average Score: 1.70	Score: 3.47
4.639999896287918
Episode 183	Average Score: 1.74	Score: 4.64
actions batch at 128000-th learning:
	 shape = (128, 4),
	 mean = [0.14952312 0.29622242 0.2397517  0.13217387],
	  std = [0.57783127 0.6004346  0.60505897 0.5552631 ]
5.699999872595072
Episode 184	Average Score: 1.79	Score: 5.70
actions batch at 129000-th learning:
	 shape = (128, 4),
	 mean = [0.12685412 0.23231493 0.25312185 0.1662114 ],
	  std = [0.5842642  0.5800196  0.6060306  0.60288304]
3.6199999190866947
Episode 185	Average Score: 1.82	Score: 3.62
actions batch at 130000-th learning:
	 shape = (128, 4),
	 mean = [0.12211426 0.27597618 0.25389883 0.19238122],
	  std = [0.5600649  0.58892846 0.6121865  0.5710825 ]
1.6099999640136957
Episode 186	Average Score: 1.82	Score: 1.61
3.9799999110400677
Episode 187	Average Score: 1.84	Score: 3.98
actions batch at 131000-th learning:
	 shape = (128, 4),
	 mean = [0.23589931 0.42538413 0.27282685 0.20725192],
	  std = [0.59216756 0.6267369  0.60713565 0.5971785 ]
2.9499999340623617
Episode 188	Average Score: 1.86	Score: 2.95
actions batch at 132000-th learning:
	 shape = (128, 4),
	 mean = [0.19551139 0.36214027 0.27772698 0.15303336],
	  std = [0.6119793  0.6242165  0.59335095 0.6034888 ]
3.8099999148398638
Episode 189	Average Score: 1.89	Score: 3.81
3.209999928250909
Episode 190	Average Score: 1.92
actions batch at 133000-th learning:
	 shape = (128, 4),
	 mean = [0.1772629  0.3492425  0.28143018 0.2155128 ],
	  std = [0.5935892 0.6239675 0.5807635 0.6012438]
3.159999929368496
Episode 191	Average Score: 1.94	Score: 3.16
actions batch at 134000-th learning:
	 shape = (128, 4),
	 mean = [0.17666462 0.37183288 0.25454938 0.12100638],
	  std = [0.59600484 0.5843012  0.5971482  0.57896185]
1.6699999626725912
Episode 192	Average Score: 1.95	Score: 1.67
4.159999907016754
Episode 193	Average Score: 1.97	Score: 4.16
actions batch at 135000-th learning:
	 shape = (128, 4),
	 mean = [0.15817612 0.37653512 0.34300616 0.2678368 ],
	  std = [0.6023421  0.61042523 0.62244314 0.5985452 ]
3.7199999168515205
Episode 194	Average Score: 2.01	Score: 3.72
actions batch at 136000-th learning:
	 shape = (128, 4),
	 mean = [0.20015638 0.40420738 0.26067507 0.21785116],
	  std = [0.62026924 0.59653294 0.6205207  0.62573826]
1.6699999626725912
Episode 195	Average Score: 2.02	Score: 1.67
actions batch at 137000-th learning:
	 shape = (128, 4),
	 mean = [0.13750339 0.3294235  0.20820293 0.19913846],
	  std = [0.5805379  0.60835075 0.5861199  0.58721495]
1.649999963119626
Episode 196	Average Score: 2.00	Score: 1.65
3.9799999110400677
Episode 197	Average Score: 2.03	Score: 3.98
actions batch at 138000-th learning:
	 shape = (128, 4),
	 mean = [0.15054417 0.3288161  0.2287475  0.13731946],
	  std = [0.57935447 0.62842435 0.58160734 0.6046456 ]
2.989999933168292
Episode 198	Average Score: 2.04	Score: 2.99
actions batch at 139000-th learning:
	 shape = (128, 4),
	 mean = [0.11547498 0.33391544 0.26551992 0.07638595],
	  std = [0.5828674  0.5949179  0.56765217 0.543902  ]
2.919999934732914
Episode 199	Average Score: 2.07	Score: 2.92
6.009999865666032
Episode 200	Average Score: 2.11
actions batch at 140000-th learning:
	 shape = (128, 4),
	 mean = [0.18256395 0.36805102 0.25073257 0.22529218],
	  std = [0.5844877  0.6024252  0.59365356 0.60904056]
6.179999861866236
Episode 201	Average Score: 2.15	Score: 6.18
actions batch at 141000-th learning:
	 shape = (128, 4),
	 mean = [0.150882   0.27255857 0.14326443 0.16675532],
	  std = [0.57910126 0.58097744 0.57523584 0.5722141 ]
2.5499999430030584
Episode 202	Average Score: 2.17	Score: 2.55
1.6699999626725912
Episode 203	Average Score: 2.17	Score: 1.67
actions batch at 142000-th learning:
	 shape = (128, 4),
	 mean = [0.13234636 0.2109032  0.3495944  0.09942139],
	  std = [0.59038186 0.58307815 0.59290123 0.59618294]
4.599999897181988
Episode 204	Average Score: 2.20	Score: 4.60
actions batch at 143000-th learning:
	 shape = (128, 4),
	 mean = [0.10418957 0.32013327 0.31817988 0.21009703],
	  std = [0.6022608  0.59587395 0.59000653 0.59958184]
1.2999999709427357
Episode 205	Average Score: 2.21	Score: 1.30
actions batch at 144000-th learning:
	 shape = (128, 4),
	 mean = [0.14903364 0.38517478 0.30890763 0.2686921 ],
	  std = [0.5853906  0.58852214 0.60164624 0.61031383]
1.9799999557435513
Episode 206	Average Score: 2.22	Score: 1.98
1.4299999680370092
Episode 207	Average Score: 2.21	Score: 1.43
actions batch at 145000-th learning:
	 shape = (128, 4),
	 mean = [0.19583082 0.32206082 0.25730777 0.26047677],
	  std = [0.6150248 0.5818259 0.6182377 0.604028 ]
1.5999999642372131
Episode 208	Average Score: 2.23	Score: 1.60
actions batch at 146000-th learning:
	 shape = (128, 4),
	 mean = [0.22379734 0.24581206 0.3193192  0.13192864],
	  std = [0.6085012  0.61443657 0.60132277 0.61032635]
2.4999999441206455
Episode 209	Average Score: 2.24	Score: 2.50
3.5599999204277992
Episode 210	Average Score: 2.27
actions batch at 147000-th learning:
	 shape = (128, 4),
	 mean = [0.09394779 0.31558794 0.3337825  0.25033128],
	  std = [0.5781252  0.59965587 0.57878953 0.59883875]
4.8699998911470175
Episode 211	Average Score: 2.31	Score: 4.87
actions batch at 148000-th learning:
	 shape = (128, 4),
	 mean = [0.19535927 0.31195307 0.270373   0.16314602],
	  std = [0.5953567 0.596489  0.5903577 0.5768371]
4.379999902099371
Episode 212	Average Score: 2.35	Score: 4.38
6.579999852925539
Episode 213	Average Score: 2.41	Score: 6.58
actions batch at 149000-th learning:
	 shape = (128, 4),
	 mean = [0.05909533 0.3935444  0.2547061  0.1374808 ],
	  std = [0.5518737  0.60989314 0.60271233 0.5590065 ]
3.0699999313801527
Episode 214	Average Score: 2.44	Score: 3.07
actions batch at 150000-th learning:
	 shape = (128, 4),
	 mean = [0.08557839 0.372797   0.26591974 0.08301114],
	  std = [0.5617025 0.5888781 0.5858995 0.5501024]
3.0699999313801527
Episode 215	Average Score: 2.46	Score: 3.07
actions batch at 151000-th learning:
	 shape = (128, 4),
	 mean = [0.1007649  0.3401784  0.2796254  0.18719043],
	  std = [0.54006773 0.6160914  0.586196   0.57575446]
3.8399999141693115
Episode 216	Average Score: 2.49	Score: 3.84
3.9099999126046896
Episode 217	Average Score: 2.52	Score: 3.91
actions batch at 152000-th learning:
	 shape = (128, 4),
	 mean = [0.00643751 0.34134772 0.26964337 0.2351275 ],
	  std = [0.5329825  0.5920724  0.58601826 0.5869586 ]
2.8299999367445707
Episode 218	Average Score: 2.55	Score: 2.83
actions batch at 153000-th learning:
	 shape = (128, 4),
	 mean = [0.16049674 0.3730469  0.28528106 0.22417398],
	  std = [0.5770853  0.59520525 0.6029712  0.5878228 ]
4.9399998895823956
Episode 219	Average Score: 2.58	Score: 4.94
2.539999943226576
Episode 220	Average Score: 2.59
actions batch at 154000-th learning:
	 shape = (128, 4),
	 mean = [0.12124395 0.3653108  0.250456   0.14682406],
	  std = [0.5614487 0.6014005 0.5798501 0.5760291]
3.129999930039048
Episode 221	Average Score: 2.62	Score: 3.13
actions batch at 155000-th learning:
	 shape = (128, 4),
	 mean = [0.06982757 0.30814987 0.22523302 0.11626483],
	  std = [0.53396875 0.5931176  0.5662809  0.5403968 ]
5.279999881982803
Episode 222	Average Score: 2.66	Score: 5.28
6.379999857395887
Episode 223	Average Score: 2.71	Score: 6.38
actions batch at 156000-th learning:
	 shape = (128, 4),
	 mean = [0.11541565 0.35283896 0.27080187 0.26411358],
	  std = [0.5701897  0.59192574 0.602389   0.60279304]
6.109999863430858
Episode 224	Average Score: 2.77	Score: 6.11
actions batch at 157000-th learning:
	 shape = (128, 4),
	 mean = [0.09404009 0.2980686  0.15354913 0.15463237],
	  std = [0.5676958  0.60534316 0.54325706 0.55910194]
3.9599999114871025
Episode 225	Average Score: 2.77	Score: 3.96
actions batch at 158000-th learning:
	 shape = (128, 4),
	 mean = [0.2751454  0.33247533 0.3044679  0.25572872],
	  std = [0.62080956 0.6040039  0.59432095 0.5896537 ]
4.809999892488122
Episode 226	Average Score: 2.80	Score: 4.81
5.069999886676669
Episode 227	Average Score: 2.84	Score: 5.07
actions batch at 159000-th learning:
	 shape = (128, 4),
	 mean = [0.21160461 0.4140189  0.26231033 0.25649056],
	  std = [0.6311632 0.5830778 0.6060163 0.5949267]
6.749999849125743
Episode 228	Average Score: 2.88	Score: 6.75
actions batch at 160000-th learning:
	 shape = (128, 4),
	 mean = [0.1520662  0.30612636 0.18795933 0.20193696],
	  std = [0.580226   0.61658597 0.56927544 0.5846612 ]
4.4399999007582664
Episode 229	Average Score: 2.90	Score: 4.44
5.569999875500798
Episode 230	Average Score: 2.93
actions batch at 161000-th learning:
	 shape = (128, 4),
	 mean = [0.1992189  0.33325875 0.28280398 0.21278034],
	  std = [0.59196556 0.58338314 0.5843906  0.61929286]
3.01999993249774
Episode 231	Average Score: 2.94	Score: 3.02
actions batch at 162000-th learning:
	 shape = (128, 4),
	 mean = [0.17399675 0.39379925 0.33927083 0.24188714],
	  std = [0.59692216 0.5955398  0.5941624  0.60969305]
2.6199999414384365
Episode 232	Average Score: 2.96	Score: 2.62
5.879999868571758
Episode 233	Average Score: 3.00	Score: 5.88
actions batch at 163000-th learning:
	 shape = (128, 4),
	 mean = [0.13918474 0.2693452  0.2021697  0.15177935],
	  std = [0.57108045 0.5616321  0.6006334  0.5794696 ]
4.539999898523092
Episode 234	Average Score: 3.02	Score: 4.54
actions batch at 164000-th learning:
	 shape = (128, 4),
	 mean = [0.23474321 0.3409874  0.2604702  0.2786607 ],
	  std = [0.5879455  0.616702   0.61256886 0.626509  ]
3.0499999318271875
Episode 235	Average Score: 3.04	Score: 3.05
actions batch at 165000-th learning:
	 shape = (128, 4),
	 mean = [0.25004584 0.36366358 0.31489837 0.17814182],
	  std = [0.6168087  0.604492   0.58581513 0.5603402 ]
5.709999872371554
Episode 236	Average Score: 3.08	Score: 5.71
4.679999895393848
Episode 237	Average Score: 3.09	Score: 4.68
actions batch at 166000-th learning:
	 shape = (128, 4),
	 mean = [0.16744483 0.39732754 0.31832844 0.287526  ],
	  std = [0.57627153 0.5736454  0.6012933  0.6088565 ]
3.9799999110400677
Episode 238	Average Score: 3.11	Score: 3.98
actions batch at 167000-th learning:
	 shape = (128, 4),
	 mean = [0.08651212 0.29432774 0.23066524 0.20397763],
	  std = [0.58238834 0.63097024 0.59019554 0.6042125 ]
3.4599999226629734
Episode 239	Average Score: 3.13	Score: 3.46
4.779999893158674
Episode 240	Average Score: 3.15
actions batch at 168000-th learning:
	 shape = (128, 4),
	 mean = [0.19272114 0.42773405 0.2597912  0.22048195],
	  std = [0.5786672  0.5815176  0.58504206 0.5958374 ]
7.04999984242022
Episode 241	Average Score: 3.22	Score: 7.05
actions batch at 169000-th learning:
	 shape = (128, 4),
	 mean = [0.20896898 0.38849306 0.29440826 0.239963  ],
	  std = [0.6109855  0.59820694 0.6172414  0.57256126]
2.4099999461323023
Episode 242	Average Score: 3.22	Score: 2.41
5.719999872148037
Episode 243	Average Score: 3.27	Score: 5.72
actions batch at 170000-th learning:
	 shape = (128, 4),
	 mean = [0.12348068 0.21803336 0.13934058 0.2019151 ],
	  std = [0.57781917 0.57807696 0.56152654 0.5838225 ]
4.3199999034404755
Episode 244	Average Score: 3.29	Score: 4.32
actions batch at 171000-th learning:
	 shape = (128, 4),
	 mean = [0.27436358 0.38177422 0.25880268 0.25689635],
	  std = [0.60346043 0.6108779  0.6122403  0.5855764 ]
4.6299998965114355
Episode 245	Average Score: 3.33	Score: 4.63
actions batch at 172000-th learning:
	 shape = (128, 4),
	 mean = [0.14132473 0.37546766 0.20664354 0.15593904],
	  std = [0.56612414 0.59776145 0.5874115  0.57215345]
4.309999903663993
Episode 246	Average Score: 3.34	Score: 4.31
6.319999858736992
Episode 247	Average Score: 3.39	Score: 6.32
actions batch at 173000-th learning:
	 shape = (128, 4),
	 mean = [0.26997268 0.4023402  0.31939816 0.22882561],
	  std = [0.60197264 0.601723   0.5934471  0.596031  ]
4.6299998965114355
Episode 248	Average Score: 3.41	Score: 4.63
actions batch at 174000-th learning:
	 shape = (128, 4),
	 mean = [0.1582704  0.37881988 0.20977813 0.11342543],
	  std = [0.5758477  0.56713784 0.60398364 0.5721824 ]
6.199999861419201
Episode 249	Average Score: 3.45	Score: 6.20
8.589999807998538
Episode 250	Average Score: 3.51
actions batch at 175000-th learning:
	 shape = (128, 4),
	 mean = [0.17013033 0.39894563 0.20204064 0.23302563],
	  std = [0.5938513 0.5697094 0.559678  0.5891342]
5.399999879300594
Episode 251	Average Score: 3.55	Score: 5.40
actions batch at 176000-th learning:
	 shape = (128, 4),
	 mean = [0.15155955 0.29044855 0.19091311 0.17104328],
	  std = [0.5773781  0.6107936  0.615431   0.59825015]
4.789999892935157
Episode 252	Average Score: 3.58	Score: 4.79
5.45999987795949
Episode 253	Average Score: 3.63	Score: 5.46
actions batch at 177000-th learning:
	 shape = (128, 4),
	 mean = [0.06512312 0.31542188 0.20244768 0.19008067],
	  std = [0.5307214  0.58445793 0.5813286  0.60858953]
4.899999890476465
Episode 254	Average Score: 3.65	Score: 4.90
actions batch at 178000-th learning:
	 shape = (128, 4),
	 mean = [0.13718182 0.4361815  0.19170566 0.19526348],
	  std = [0.56119716 0.5965121  0.54505146 0.59133065]
5.979999866336584
Episode 255	Average Score: 3.69	Score: 5.98
actions batch at 179000-th learning:
	 shape = (128, 4),
	 mean = [0.14180045 0.38880715 0.32241115 0.33120143],
	  std = [0.5860902  0.5712443  0.60155433 0.60160047]
7.709999827668071
Episode 256	Average Score: 3.75	Score: 7.71
8.619999807327986
Episode 257	Average Score: 3.83	Score: 8.62
actions batch at 180000-th learning:
	 shape = (128, 4),
	 mean = [0.18276031 0.32027203 0.22235693 0.14409986],
	  std = [0.5849835  0.6101123  0.57331914 0.57325953]
6.199999861419201
Episode 258	Average Score: 3.88	Score: 6.20
actions batch at 181000-th learning:
	 shape = (128, 4),
	 mean = [0.14979401 0.32870823 0.31362352 0.2259501 ],
	  std = [0.5932121  0.60998356 0.5801651  0.5865966 ]
7.179999839514494
Episode 259	Average Score: 3.92	Score: 7.18
6.549999853596091
Episode 260	Average Score: 3.97
actions batch at 182000-th learning:
	 shape = (128, 4),
	 mean = [0.20044264 0.47003537 0.26577485 0.2162478 ],
	  std = [0.5839547  0.56369907 0.58720785 0.5797636 ]
7.089999841526151
Episode 261	Average Score: 4.01	Score: 7.09
actions batch at 183000-th learning:
	 shape = (128, 4),
	 mean = [0.248225   0.38672397 0.1814699  0.2761672 ],
	  std = [0.58796346 0.5785245  0.57552713 0.60469323]
5.209999883547425
Episode 262	Average Score: 4.03	Score: 5.21
8.579999808222055
Episode 263	Average Score: 4.09	Score: 8.58
actions batch at 184000-th learning:
	 shape = (128, 4),
	 mean = [0.12855066 0.23873031 0.27199084 0.04101778],
	  std = [0.5681746  0.57639396 0.57914394 0.5562302 ]
5.469999877735972
Episode 264	Average Score: 4.14	Score: 5.47
actions batch at 185000-th learning:
	 shape = (128, 4),
	 mean = [0.21484165 0.39127097 0.2648227  0.14016202],
	  std = [0.5844741  0.57499343 0.6056815  0.5664828 ]
6.6399998515844345
Episode 265	Average Score: 4.19	Score: 6.64
actions batch at 186000-th learning:
	 shape = (128, 4),
	 mean = [0.20139039 0.2530306  0.23499304 0.11350746],
	  std = [0.57088304 0.5447998  0.58545494 0.57503384]
8.179999817162752
Episode 266	Average Score: 4.24	Score: 8.18
1.7499999608844519
Episode 267	Average Score: 4.23	Score: 1.75
actions batch at 187000-th learning:
	 shape = (128, 4),
	 mean = [0.27457458 0.42130345 0.2606981  0.21385266],
	  std = [0.59974027 0.60057336 0.58445716 0.59622043]
8.47999981045723
Episode 268	Average Score: 4.28	Score: 8.48
actions batch at 188000-th learning:
	 shape = (128, 4),
	 mean = [0.20184858 0.40512472 0.23305734 0.11568356],
	  std = [0.6025244  0.58155084 0.58251625 0.54657733]
6.319999858736992
Episode 269	Average Score: 4.31	Score: 6.32
5.449999878183007
Episode 270	Average Score: 4.34
actions batch at 189000-th learning:
	 shape = (128, 4),
	 mean = [0.08407597 0.2604824  0.16453658 0.16179028],
	  std = [0.5441361  0.5859915  0.58438534 0.5848726 ]
5.739999871701002
Episode 271	Average Score: 4.37	Score: 5.74
actions batch at 190000-th learning:
	 shape = (128, 4),
	 mean = [0.13714357 0.26542705 0.2066402  0.26399112],
	  std = [0.57184637 0.59093434 0.57844234 0.5875032 ]
8.279999814927578
Episode 272	Average Score: 4.43	Score: 8.28
6.479999855160713
Episode 273	Average Score: 4.47	Score: 6.48
actions batch at 191000-th learning:
	 shape = (128, 4),
	 mean = [0.20006612 0.27769566 0.32727072 0.20890021],
	  std = [0.5897883  0.60159886 0.56384474 0.6029755 ]
7.35999983549118
Episode 274	Average Score: 4.51	Score: 7.36
actions batch at 192000-th learning:
	 shape = (128, 4),
	 mean = [0.16315374 0.2814292  0.20566799 0.20187052],
	  std = [0.5793189  0.574907   0.568923   0.58895016]
4.949999889358878
Episode 275	Average Score: 4.53	Score: 4.95
actions batch at 193000-th learning:
	 shape = (128, 4),
	 mean = [0.17682824 0.3838923  0.23385178 0.20543995],
	  std = [0.5585566  0.57418627 0.5683816  0.56459785]
8.139999818056822
Episode 276	Average Score: 4.60	Score: 8.14
7.559999831020832
Episode 277	Average Score: 4.62	Score: 7.56
actions batch at 194000-th learning:
	 shape = (128, 4),
	 mean = [0.19004957 0.20845555 0.15820993 0.12380341],
	  std = [0.59071946 0.5413218  0.5468282  0.5576955 ]
4.559999898076057
Episode 278	Average Score: 4.65	Score: 4.56
actions batch at 195000-th learning:
	 shape = (128, 4),
	 mean = [0.20772173 0.33605534 0.2306197  0.22656246],
	  std = [0.605686  0.5878097 0.579757  0.592239 ]
5.829999869689345
Episode 279	Average Score: 4.68	Score: 5.83
6.129999862983823
Episode 280	Average Score: 4.73
actions batch at 196000-th learning:
	 shape = (128, 4),
	 mean = [0.07534403 0.3014454  0.19347627 0.1583343 ],
	  std = [0.56753457 0.58069104 0.59098077 0.5665931 ]
8.299999814480543
Episode 281	Average Score: 4.79	Score: 8.30
actions batch at 197000-th learning:
	 shape = (128, 4),
	 mean = [0.16127326 0.3255825  0.24401724 0.16700716],
	  std = [0.5738791  0.59351903 0.58951175 0.58486193]
6.679999850690365
Episode 282	Average Score: 4.82	Score: 6.68
6.489999854937196
Episode 283	Average Score: 4.84	Score: 6.49
actions batch at 198000-th learning:
	 shape = (128, 4),
	 mean = [0.15708642 0.34289593 0.25687554 0.21087605],
	  std = [0.5595862  0.58571804 0.57405066 0.56715095]
4.129999907687306
Episode 284	Average Score: 4.82	Score: 4.13
actions batch at 199000-th learning:
	 shape = (128, 4),
	 mean = [0.09152199 0.31601208 0.26589096 0.08039416],
	  std = [0.5421899 0.5740245 0.57717   0.5190925]
4.329999903216958
Episode 285	Average Score: 4.83	Score: 4.33
actions batch at 200000-th learning:
	 shape = (128, 4),
	 mean = [0.12180576 0.3238747  0.2239135  0.13423507],
	  std = [0.5747589  0.5913879  0.5736428  0.58687717]
8.109999818727374
Episode 286	Average Score: 4.90	Score: 8.11
6.839999847114086
Episode 287	Average Score: 4.93	Score: 6.84
actions batch at 201000-th learning:
	 shape = (128, 4),
	 mean = [0.19733775 0.3156617  0.21001253 0.21026017],
	  std = [0.5825772  0.58614993 0.5861035  0.5912501 ]
8.899999801069498
Episode 288	Average Score: 4.99	Score: 8.90
actions batch at 202000-th learning:
	 shape = (128, 4),
	 mean = [0.18340303 0.29468486 0.22529998 0.13775449],
	  std = [0.56967    0.5943023  0.55468994 0.55416346]
6.839999847114086
Episode 289	Average Score: 5.02	Score: 6.84
5.389999879524112
Episode 290	Average Score: 5.04
actions batch at 203000-th learning:
	 shape = (128, 4),
	 mean = [0.12172189 0.33821842 0.1695622  0.22688717],
	  std = [0.56852186 0.56444967 0.55116975 0.6074344 ]
5.339999880641699
Episode 291	Average Score: 5.06	Score: 5.34
actions batch at 204000-th learning:
	 shape = (128, 4),
	 mean = [0.02673258 0.24996947 0.2530822  0.15186928],
	  std = [0.5102697  0.58111924 0.560437   0.55702424]
5.269999882206321
Episode 292	Average Score: 5.10	Score: 5.27
4.659999895840883
Episode 293	Average Score: 5.10	Score: 4.66
actions batch at 205000-th learning:
	 shape = (128, 4),
	 mean = [0.14511964 0.34117484 0.23546556 0.13811211],
	  std = [0.5457669 0.5949936 0.5559336 0.5578695]
6.989999843761325
Episode 294	Average Score: 5.13	Score: 6.99
actions batch at 206000-th learning:
	 shape = (128, 4),
	 mean = [0.13028358 0.34796652 0.19375898 0.23146927],
	  std = [0.55844164 0.55301744 0.5534499  0.5750024 ]
8.919999800622463
Episode 295	Average Score: 5.21	Score: 8.92actions batch at 207000-th learning:
	 shape = (128, 4),
	 mean = [0.10683605 0.27610743 0.20801912 0.18514371],
	  std = [0.5607543  0.5683532  0.57676214 0.5904533 ]
5.339999880641699
Episode 296	Average Score: 5.24	Score: 5.349.319999791681767
Episode 297	Average Score: 5.30	Score: 9.32actions batch at 208000-th learning:
	 shape = (128, 4),
	 mean = [0.1369111  0.27045935 0.19345526 0.24519373],
	  std = [0.5562485  0.59234846 0.57149744 0.57540846]
10.079999774694443
Episode 298	Average Score: 5.37	Score: 10.08actions batch at 209000-th learning:
	 shape = (128, 4),
	 mean = [0.2595153  0.23427905 0.19679324 0.18703358],
	  std = [0.57594687 0.5866386  0.57061523 0.5806136 ]
5.999999865889549
Episode 299	Average Score: 5.40	Score: 6.006.07999986410141
Episode 300	Average Score: 5.40
actions batch at 210000-th learning:
	 shape = (128, 4),
	 mean = [0.23223802 0.39575365 0.13596001 0.21975484],
	  std = [0.5927929  0.5703288  0.57606155 0.567555  ]
8.119999818503857
Episode 301	Average Score: 5.42	Score: 8.12actions batch at 211000-th learning:
	 shape = (128, 4),
	 mean = [0.25136214 0.3877336  0.23505735 0.05790387],
	  std = [0.6187465  0.5776745  0.56461775 0.5430434 ]
6.679999850690365
Episode 302	Average Score: 5.46	Score: 6.688.82999980263412
Episode 303	Average Score: 5.53	Score: 8.83actions batch at 212000-th learning:
	 shape = (128, 4),
	 mean = [0.17589393 0.3817002  0.17471777 0.21841715],
	  std = [0.571588   0.5975507  0.5911217  0.58945143]
5.1799998842179775
Episode 304	Average Score: 5.54	Score: 5.18actions batch at 213000-th learning:
	 shape = (128, 4),
	 mean = [0.16817017 0.34345224 0.17742822 0.18871275],
	  std = [0.5990277  0.60113907 0.56921935 0.57275766]
6.309999858960509
Episode 305	Average Score: 5.59	Score: 6.31actions batch at 214000-th learning:
	 shape = (128, 4),
	 mean = [0.18483828 0.2999902  0.25665474 0.18515001],
	  std = [0.58930135 0.5713958  0.5741231  0.5857526 ]
7.209999838843942
Episode 306	Average Score: 5.64	Score: 7.219.069999797269702
Episode 307	Average Score: 5.71	Score: 9.07actions batch at 215000-th learning:
	 shape = (128, 4),
	 mean = [0.12609327 0.33235624 0.31153208 0.1678987 ],
	  std = [0.55313534 0.57943636 0.60256624 0.5952977 ]
6.169999862089753
Episode 308	Average Score: 5.76	Score: 6.17actions batch at 216000-th learning:
	 shape = (128, 4),
	 mean = [0.12904574 0.30234307 0.16450632 0.24338236],
	  std = [0.5430598  0.5735911  0.5301036  0.58961374]
3.1199999302625656
Episode 309	Average Score: 5.77	Score: 3.126.049999864771962
Episode 310	Average Score: 5.79
actions batch at 217000-th learning:
	 shape = (128, 4),
	 mean = [0.24841    0.28116566 0.20997491 0.17094094],
	  std = [0.60515076 0.5993118  0.5868314  0.5787484 ]
5.739999871701002
Episode 311	Average Score: 5.80	Score: 5.74actions batch at 218000-th learning:
	 shape = (128, 4),
	 mean = [0.27798873 0.44118145 0.32518953 0.22538362],
	  std = [0.59148747 0.57624716 0.58929056 0.56973606]
5.689999872818589
Episode 312	Average Score: 5.81	Score: 5.695.389999879524112
Episode 313	Average Score: 5.80	Score: 5.39actions batch at 219000-th learning:
	 shape = (128, 4),
	 mean = [0.14412922 0.25386482 0.20341612 0.18552427],
	  std = [0.56309444 0.5728589  0.59458816 0.58334726]
5.7799998708069324
Episode 314	Average Score: 5.83	Score: 5.78actions batch at 220000-th learning:
	 shape = (128, 4),
	 mean = [0.13078442 0.2699595  0.28202724 0.22294925],
	  std = [0.572891   0.56532216 0.59558463 0.59889555]
6.889999845996499
Episode 315	Average Score: 5.87	Score: 6.89actions batch at 221000-th learning:
	 shape = (128, 4),
	 mean = [0.14302377 0.35653135 0.2242349  0.24146631],
	  std = [0.53503233 0.55887866 0.5723469  0.5954786 ]
6.73999984934926
Episode 316	Average Score: 5.90	Score: 6.749.979999776929617
Episode 317	Average Score: 5.96	Score: 9.98actions batch at 222000-th learning:
	 shape = (128, 4),
	 mean = [0.20653798 0.31197375 0.16181527 0.20889245],
	  std = [0.5739654  0.57637984 0.57856846 0.56320137]
6.019999865442514
Episode 318	Average Score: 5.99	Score: 6.02actions batch at 223000-th learning:
	 shape = (128, 4),
	 mean = [0.21179236 0.26156002 0.19351451 0.18129684],
	  std = [0.57015157 0.5557184  0.55594945 0.57272536]
6.789999848231673
Episode 319	Average Score: 6.01	Score: 6.795.259999882429838
Episode 320	Average Score: 6.03
actions batch at 224000-th learning:
	 shape = (128, 4),
	 mean = [0.20507023 0.32680532 0.22571106 0.256393  ],
	  std = [0.5898887  0.58359563 0.5955011  0.5849689 ]
8.159999817609787
Episode 321	Average Score: 6.08	Score: 8.16actions batch at 225000-th learning:
	 shape = (128, 4),
	 mean = [0.21595405 0.34376067 0.2660101  0.10158434],
	  std = [0.5763607  0.6008659  0.55075294 0.55327713]
8.109999818727374
Episode 322	Average Score: 6.11	Score: 8.116.479999855160713
Episode 323	Average Score: 6.11	Score: 6.48actions batch at 226000-th learning:
	 shape = (128, 4),
	 mean = [0.1424405  0.36297336 0.2010185  0.18553346],
	  std = [0.5566226  0.553165   0.5643723  0.56220144]
7.829999824985862
Episode 324	Average Score: 6.13	Score: 7.83actions batch at 227000-th learning:
	 shape = (128, 4),
	 mean = [0.24767832 0.39254916 0.1745627  0.13706781],
	  std = [0.548724   0.58153915 0.5468746  0.58242875]
8.649999806657434
Episode 325	Average Score: 6.18	Score: 8.65actions batch at 228000-th learning:
	 shape = (128, 4),
	 mean = [0.16544959 0.29529575 0.1996479  0.23362134],
	  std = [0.55464876 0.56578225 0.5499008  0.57866615]
8.039999820291996
Episode 326	Average Score: 6.21	Score: 8.048.149999817833304
Episode 327	Average Score: 6.24	Score: 8.15actions batch at 229000-th learning:
	 shape = (128, 4),
	 mean = [0.16591232 0.235449   0.20933156 0.14110585],
	  std = [0.5941238  0.559435   0.58314294 0.5704787 ]
7.5499998312443495
Episode 328	Average Score: 6.25	Score: 7.55actions batch at 230000-th learning:
	 shape = (128, 4),
	 mean = [0.17304894 0.35608584 0.20885548 0.21918423],
	  std = [0.5697097  0.57499826 0.5602389  0.5829651 ]
7.849999824538827
Episode 329	Average Score: 6.28	Score: 7.8510.509999765083194
Episode 330	Average Score: 6.33
actions batch at 231000-th learning:
	 shape = (128, 4),
	 mean = [0.20501307 0.25039136 0.08596229 0.2465848 ],
	  std = [0.60277927 0.5371151  0.51928437 0.5675409 ]
9.379999790340662
Episode 331	Average Score: 6.40	Score: 9.38actions batch at 232000-th learning:
	 shape = (128, 4),
	 mean = [0.24688758 0.25947958 0.19138762 0.23128174],
	  std = [0.59179705 0.57711583 0.54923147 0.61563677]
3.8699999134987593
Episode 332	Average Score: 6.41	Score: 3.877.109999841079116
Episode 333	Average Score: 6.42	Score: 7.11actions batch at 233000-th learning:
	 shape = (128, 4),
	 mean = [0.23183571 0.34293777 0.26976234 0.23801771],
	  std = [0.5892587 0.5691378 0.5936904 0.5866815]
6.11999986320734
Episode 334	Average Score: 6.44	Score: 6.12actions batch at 234000-th learning:
	 shape = (128, 4),
	 mean = [0.11480732 0.29044253 0.26079255 0.21645333],
	  std = [0.57118505 0.57812387 0.57421374 0.59399956]
7.609999829903245
Episode 335	Average Score: 6.48	Score: 7.61actions batch at 235000-th learning:
	 shape = (128, 4),
	 mean = [0.13842417 0.27357206 0.23124887 0.0964088 ],
	  std = [0.5848199  0.57727516 0.562814   0.5549379 ]
7.7899998258799314
Episode 336	Average Score: 6.50	Score: 7.794.60999989695847
Episode 337	Average Score: 6.50	Score: 4.61actions batch at 236000-th learning:
	 shape = (128, 4),
	 mean = [0.19067217 0.28943527 0.1974443  0.18379211],
	  std = [0.5748737 0.5795769 0.5834784 0.6008825]
4.369999902322888
Episode 338	Average Score: 6.51	Score: 4.37actions batch at 237000-th learning:
	 shape = (128, 4),
	 mean = [0.18078975 0.35396957 0.27165487 0.23216335],
	  std = [0.57960236 0.57071805 0.570682   0.5790113 ]
8.449999811127782
Episode 339	Average Score: 6.56	Score: 8.459.149999795481563
Episode 340	Average Score: 6.60
actions batch at 238000-th learning:
	 shape = (128, 4),
	 mean = [0.09475347 0.25824994 0.1419839  0.26381794],
	  std = [0.5592448  0.5770874  0.53781885 0.6116079 ]
8.079999819397926
Episode 341	Average Score: 6.61	Score: 8.08actions batch at 239000-th learning:
	 shape = (128, 4),
	 mean = [0.21739176 0.28418136 0.2003327  0.24734075],
	  std = [0.56990504 0.5648593  0.5376442  0.5832552 ]
6.819999847561121
Episode 342	Average Score: 6.65	Score: 6.828.819999802857637
Episode 343	Average Score: 6.69	Score: 8.82actions batch at 240000-th learning:
	 shape = (128, 4),
	 mean = [0.18679152 0.3485063  0.2680898  0.23049845],
	  std = [0.5466258 0.5641854 0.5704384 0.5918045]
7.849999824538827
Episode 344	Average Score: 6.72	Score: 7.85actions batch at 241000-th learning:
	 shape = (128, 4),
	 mean = [0.17365973 0.33139563 0.24483645 0.2337661 ],
	  std = [0.56463337 0.5928665  0.56572855 0.60847825]
6.599999852478504
Episode 345	Average Score: 6.74	Score: 6.60actions batch at 242000-th learning:
	 shape = (128, 4),
	 mean = [0.10633735 0.24224605 0.20577422 0.179659  ],
	  std = [0.54141414 0.58400303 0.56159943 0.58457667]
7.5499998312443495
Episode 346	Average Score: 6.77	Score: 7.555.629999874159694
Episode 347	Average Score: 6.77	Score: 5.63actions batch at 243000-th learning:
	 shape = (128, 4),
	 mean = [0.09770389 0.20655757 0.15657282 0.21808332],
	  std = [0.54465395 0.5796156  0.5702242  0.577427  ]
8.809999803081155
Episode 348	Average Score: 6.81	Score: 8.81actions batch at 244000-th learning:
	 shape = (128, 4),
	 mean = [0.21848142 0.33237067 0.25842777 0.23158458],
	  std = [0.6008107  0.5763321  0.56899136 0.55942667]
7.179999839514494
Episode 349	Average Score: 6.82	Score: 7.187.199999839067459
Episode 350	Average Score: 6.80
actions batch at 245000-th learning:
	 shape = (128, 4),
	 mean = [0.20471019 0.29275328 0.25040638 0.17394459],
	  std = [0.5740381  0.56377923 0.5649517  0.5699711 ]
9.21999979391694
Episode 351	Average Score: 6.84	Score: 9.22actions batch at 246000-th learning:
	 shape = (128, 4),
	 mean = [0.14812037 0.329155   0.17112239 0.20482448],
	  std = [0.5314827 0.5640218 0.5581455 0.555769 ]
8.419999811798334
Episode 352	Average Score: 6.88	Score: 8.4212.709999715909362
Episode 353	Average Score: 6.95	Score: 12.71actions batch at 247000-th learning:
	 shape = (128, 4),
	 mean = [0.16260347 0.26652643 0.2484939  0.17787611],
	  std = [0.58452183 0.55980426 0.56503636 0.5969495 ]
6.4499998558312654
Episode 354	Average Score: 6.97	Score: 6.45actions batch at 248000-th learning:
	 shape = (128, 4),
	 mean = [0.24144785 0.34940648 0.2457826  0.21124789],
	  std = [0.5994087  0.56779474 0.5667853  0.5677205 ]
6.979999843984842
Episode 355	Average Score: 6.98	Score: 6.98actions batch at 249000-th learning:
	 shape = (128, 4),
	 mean = [0.16601788 0.2600989  0.21254656 0.22206616],
	  std = [0.5672112  0.57752144 0.5429027  0.5696306 ]
7.6699998285621405
Episode 356	Average Score: 6.98	Score: 7.675.549999875947833
Episode 357	Average Score: 6.94	Score: 5.55actions batch at 250000-th learning:
	 shape = (128, 4),
	 mean = [0.15930524 0.27658945 0.17586963 0.1616897 ],
	  std = [0.54720396 0.58624524 0.55510694 0.5777843 ]
6.499999854713678
Episode 358	Average Score: 6.95	Score: 6.50actions batch at 251000-th learning:
	 shape = (128, 4),
	 mean = [0.16894326 0.32630417 0.22835243 0.18466507],
	  std = [0.588554   0.59465384 0.55672485 0.5856295 ]
7.539999831467867
Episode 359	Average Score: 6.95	Score: 7.547.079999841749668
Episode 360	Average Score: 6.96
actions batch at 252000-th learning:
	 shape = (128, 4),
	 mean = [0.2110013  0.25532416 0.30897924 0.20188461],
	  std = [0.56071746 0.5646239  0.5657381  0.5736756 ]
6.3299998585134745
Episode 361	Average Score: 6.95	Score: 6.33actions batch at 253000-th learning:
	 shape = (128, 4),
	 mean = [0.14995193 0.35444444 0.17968298 0.28960934],
	  std = [0.5468165  0.5777637  0.5691942  0.59018826]
6.239999860525131
Episode 362	Average Score: 6.96	Score: 6.248.74999980442226
Episode 363	Average Score: 6.96	Score: 8.75actions batch at 254000-th learning:
	 shape = (128, 4),
	 mean = [0.13513787 0.31029665 0.20148258 0.16993944],
	  std = [0.56999385 0.59747666 0.581525   0.6128667 ]
8.409999812021852
Episode 364	Average Score: 6.99	Score: 8.41actions batch at 255000-th learning:
	 shape = (128, 4),
	 mean = [0.2392545  0.27117804 0.2359647  0.2663242 ],
	  std = [0.5811497  0.5615791  0.55556035 0.6031517 ]
6.649999851360917
Episode 365	Average Score: 6.99	Score: 6.65actions batch at 256000-th learning:
	 shape = (128, 4),
	 mean = [0.20646931 0.30833757 0.25282767 0.1411681 ],
	  std = [0.56842405 0.56188875 0.5266195  0.5590613 ]
7.139999840408564
Episode 366	Average Score: 6.98	Score: 7.148.139999818056822
Episode 367	Average Score: 7.04	Score: 8.14actions batch at 257000-th learning:
	 shape = (128, 4),
	 mean = [0.18868099 0.3171603  0.24843702 0.14117539],
	  std = [0.58897984 0.5532772  0.54889786 0.53787756]
9.759999781847
Episode 368	Average Score: 7.06	Score: 9.76actions batch at 258000-th learning:
	 shape = (128, 4),
	 mean = [0.17606504 0.24367353 0.2437774  0.20659502],
	  std = [0.58609354 0.5784432  0.5562079  0.576307  ]
7.179999839514494
Episode 369	Average Score: 7.07	Score: 7.185.909999867901206
Episode 370	Average Score: 7.07
actions batch at 259000-th learning:
	 shape = (128, 4),
	 mean = [0.21649708 0.32999593 0.2589687  0.1769002 ],
	  std = [0.56163234 0.5398411  0.5572518  0.558516  ]
5.789999870583415
Episode 371	Average Score: 7.07	Score: 5.79actions batch at 260000-th learning:
	 shape = (128, 4),
	 mean = [0.25435552 0.38063675 0.17616276 0.28833705],
	  std = [0.59709    0.55668247 0.53484356 0.5838301 ]
7.199999839067459
Episode 372	Average Score: 7.06	Score: 7.206.679999850690365
Episode 373	Average Score: 7.06	Score: 6.68actions batch at 261000-th learning:
	 shape = (128, 4),
	 mean = [0.08572362 0.2170769  0.17095068 0.1635922 ],
	  std = [0.5110591  0.5596669  0.55469215 0.5643177 ]
4.449999900534749
Episode 374	Average Score: 7.03	Score: 4.45actions batch at 262000-th learning:
	 shape = (128, 4),
	 mean = [0.19679232 0.30656534 0.21586584 0.2092322 ],
	  std = [0.5740184  0.55290353 0.57248247 0.5825294 ]
11.249999748542905
Episode 375	Average Score: 7.10	Score: 11.25actions batch at 263000-th learning:
	 shape = (128, 4),
	 mean = [0.2251568  0.28605577 0.17701007 0.2273126 ],
	  std = [0.56594074 0.5681561  0.5221972  0.5968342 ]
9.459999788552523
Episode 376	Average Score: 7.11	Score: 9.464.559999898076057
Episode 377	Average Score: 7.08	Score: 4.56actions batch at 264000-th learning:
	 shape = (128, 4),
	 mean = [0.1431757  0.38191044 0.2639444  0.19285429],
	  std = [0.58916533 0.57921904 0.5603593  0.5862874 ]
4.039999909698963
Episode 378	Average Score: 7.07	Score: 4.04actions batch at 265000-th learning:
	 shape = (128, 4),
	 mean = [0.19410494 0.259094   0.21180554 0.13638149],
	  std = [0.5790648  0.54303384 0.5515507  0.5572141 ]
7.639999829232693
Episode 379	Average Score: 7.09	Score: 7.646.339999858289957
Episode 380	Average Score: 7.09
actions batch at 266000-th learning:
	 shape = (128, 4),
	 mean = [0.14533548 0.24682589 0.16198339 0.14716116],
	  std = [0.578206   0.5476856  0.5378808  0.55627453]
6.03999986499548
Episode 381	Average Score: 7.07	Score: 6.04actions batch at 267000-th learning:
	 shape = (128, 4),
	 mean = [0.13876663 0.34922278 0.19556318 0.21658273],
	  std = [0.5683041  0.59429    0.56138766 0.58457386]
5.999999865889549
Episode 382	Average Score: 7.06	Score: 6.003.5799999199807644
Episode 383	Average Score: 7.04	Score: 3.58actions batch at 268000-th learning:
	 shape = (128, 4),
	 mean = [0.19241244 0.385566   0.25444123 0.1981732 ],
	  std = [0.5606899  0.56438786 0.5698224  0.589139  ]
5.559999875724316
Episode 384	Average Score: 7.05	Score: 5.56actions batch at 269000-th learning:
	 shape = (128, 4),
	 mean = [0.14802462 0.30198088 0.20788965 0.24215303],
	  std = [0.607734  0.5894665 0.5785163 0.6151335]
5.709999872371554
Episode 385	Average Score: 7.06	Score: 5.71actions batch at 270000-th learning:
	 shape = (128, 4),
	 mean = [0.20710893 0.30470607 0.23188925 0.2071551 ],
	  std = [0.5665961  0.5838953  0.5402552  0.59771276]
5.979999866336584
Episode 386	Average Score: 7.04	Score: 5.986.359999857842922
Episode 387	Average Score: 7.04	Score: 6.36actions batch at 271000-th learning:
	 shape = (128, 4),
	 mean = [0.12187725 0.28335482 0.259722   0.06843872],
	  std = [0.5546794  0.5765721  0.55227923 0.5386645 ]
5.329999880865216
Episode 388	Average Score: 7.00	Score: 5.33actions batch at 272000-th learning:
	 shape = (128, 4),
	 mean = [0.29423365 0.40179998 0.2840805  0.13429654],
	  std = [0.60922444 0.5525202  0.55374247 0.57865536]
14.959999665617943
Episode 389	Average Score: 7.08	Score: 14.966.819999847561121
Episode 390	Average Score: 7.10
actions batch at 273000-th learning:
	 shape = (128, 4),
	 mean = [0.13300744 0.38945732 0.19374315 0.18181913],
	  std = [0.5305142  0.56966966 0.5381032  0.5769659 ]
3.05999993160367
Episode 391	Average Score: 7.07	Score: 3.06actions batch at 274000-th learning:
	 shape = (128, 4),
	 mean = [0.24177337 0.3862497  0.22748637 0.30910888],
	  std = [0.58651346 0.5874283  0.5890468  0.6037277 ]
4.9399998895823956
Episode 392	Average Score: 7.07	Score: 4.945.839999869465828
Episode 393	Average Score: 7.08	Score: 5.84actions batch at 275000-th learning:
	 shape = (128, 4),
	 mean = [0.23141366 0.3229554  0.20767573 0.16171175],
	  std = [0.6065975 0.5883341 0.5389064 0.5728627]
2.0399999544024467
Episode 394	Average Score: 7.03	Score: 2.04actions batch at 276000-th learning:
	 shape = (128, 4),
	 mean = [0.23226576 0.36934665 0.23355111 0.22720477],
	  std = [0.60254604 0.57865185 0.5534816  0.6019229 ]
3.729999916628003
Episode 395	Average Score: 6.98	Score: 3.73actions batch at 277000-th learning:
	 shape = (128, 4),
	 mean = [0.11347845 0.27270892 0.06220837 0.17758618],
	  std = [0.540069   0.5343184  0.4936744  0.60182655]
4.83999989181757
Episode 396	Average Score: 6.98	Score: 4.845.979999866336584
Episode 397	Average Score: 6.94	Score: 5.98actions batch at 278000-th learning:
	 shape = (128, 4),
	 mean = [0.19015898 0.29951277 0.28950143 0.19768623],
	  std = [0.5813728  0.57763296 0.5791189  0.5575234 ]
5.7799998708069324
Episode 398	Average Score: 6.90	Score: 5.78actions batch at 279000-th learning:
	 shape = (128, 4),
	 mean = [0.13656421 0.2628855  0.24613668 0.15142386],
	  std = [0.55004656 0.5865597  0.57756895 0.57606643]
5.879999868571758
Episode 399	Average Score: 6.90	Score: 5.884.899999890476465
Episode 400	Average Score: 6.89
actions batch at 280000-th learning:
	 shape = (128, 4),
	 mean = [0.11988517 0.2774384  0.21607776 0.20069212],
	  std = [0.554724   0.58291477 0.5793348  0.58778477]
5.259999882429838
Episode 401	Average Score: 6.86	Score: 5.26actions batch at 281000-th learning:
	 shape = (128, 4),
	 mean = [0.16453096 0.26884755 0.18475935 0.16086967],
	  std = [0.5694268  0.5658552  0.5439897  0.56990856]
5.599999874830246
Episode 402	Average Score: 6.85	Score: 5.605.699999872595072
Episode 403	Average Score: 6.82	Score: 5.70actions batch at 282000-th learning:
	 shape = (128, 4),
	 mean = [0.26936662 0.26648057 0.17742184 0.19743572],
	  std = [0.54134864 0.5674149  0.54621077 0.59223604]
3.94999991171062
Episode 404	Average Score: 6.80	Score: 3.95actions batch at 283000-th learning:
	 shape = (128, 4),
	 mean = [0.23470296 0.27143052 0.3028121  0.07070135],
	  std = [0.5687975  0.5540324  0.54430455 0.5340998 ]
5.379999879747629
Episode 405	Average Score: 6.79	Score: 5.38actions batch at 284000-th learning:
	 shape = (128, 4),
	 mean = [0.09603649 0.2616551  0.18666442 0.16079673],
	  std = [0.53884137 0.56496596 0.5260966  0.5659876 ]
5.80999987013638
Episode 406	Average Score: 6.78	Score: 5.815.069999886676669
Episode 407	Average Score: 6.74	Score: 5.07actions batch at 285000-th learning:
	 shape = (128, 4),
	 mean = [0.04245935 0.31376362 0.16065967 0.23866436],
	  std = [0.5404275 0.5581969 0.5470473 0.5780496]
5.249999882653356
Episode 408	Average Score: 6.73	Score: 5.25actions batch at 286000-th learning:
	 shape = (128, 4),
	 mean = [0.15542467 0.4351768  0.28386053 0.22765091],
	  std = [0.5759577  0.5563436  0.57293725 0.574581  ]
9.339999791234732
Episode 409	Average Score: 6.79	Score: 9.346.839999847114086
Episode 410	Average Score: 6.80
actions batch at 287000-th learning:
	 shape = (128, 4),
	 mean = [0.1914816  0.26057595 0.24627514 0.16753513],
	  std = [0.5733625  0.5553259  0.55403626 0.57561433]
6.839999847114086
Episode 411	Average Score: 6.81	Score: 6.84actions batch at 288000-th learning:
	 shape = (128, 4),
	 mean = [0.15317999 0.30781576 0.16123368 0.17784695],
	  std = [0.57695097 0.560133   0.53505504 0.5695839 ]
4.549999898299575
Episode 412	Average Score: 6.80	Score: 4.553.5099999215453863
Episode 413	Average Score: 6.78	Score: 3.51actions batch at 289000-th learning:
	 shape = (128, 4),
	 mean = [0.29654098 0.42076975 0.27485058 0.21636389],
	  std = [0.6125226  0.57510483 0.56463856 0.6065893 ]
5.409999879077077
Episode 414	Average Score: 6.78	Score: 5.41actions batch at 290000-th learning:
	 shape = (128, 4),
	 mean = [0.20659232 0.31870627 0.2625331  0.15371051],
	  std = [0.56774473 0.5758024  0.54156196 0.58612275]
5.069999886676669
Episode 415	Average Score: 6.76	Score: 5.07actions batch at 291000-th learning:
	 shape = (128, 4),
	 mean = [0.18938065 0.30763754 0.1648852  0.2219054 ],
	  std = [0.5959631  0.59302026 0.5515762  0.57360744]
7.099999841302633
Episode 416	Average Score: 6.76	Score: 7.106.239999860525131
Episode 417	Average Score: 6.73	Score: 6.24actions batch at 292000-th learning:
	 shape = (128, 4),
	 mean = [0.21793067 0.2225677  0.28105333 0.24555422],
	  std = [0.5706124  0.5705712  0.53657836 0.5970447 ]
7.679999828338623
Episode 418	Average Score: 6.74	Score: 7.68actions batch at 293000-th learning:
	 shape = (128, 4),
	 mean = [0.21654993 0.34623128 0.25367832 0.22494525],
	  std = [0.5957578  0.59013444 0.56262976 0.59209454]
5.689999872818589
Episode 419	Average Score: 6.73	Score: 5.698.079999819397926
Episode 420	Average Score: 6.76
actions batch at 294000-th learning:
	 shape = (128, 4),
	 mean = [0.10694425 0.2970188  0.2655102  0.1798862 ],
	  std = [0.55044574 0.5501131  0.566032   0.5506435 ]
8.419999811798334
Episode 421	Average Score: 6.76	Score: 8.42actions batch at 295000-th learning:
	 shape = (128, 4),
	 mean = [0.18236974 0.31172815 0.17077765 0.22715989],
	  std = [0.5559245  0.56995    0.540489   0.59352875]
6.5199998542666435
Episode 422	Average Score: 6.75	Score: 6.527.419999834150076
Episode 423	Average Score: 6.76	Score: 7.42actions batch at 296000-th learning:
	 shape = (128, 4),
	 mean = [0.21384245 0.3108516  0.21712032 0.18734881],
	  std = [0.58692026 0.5614177  0.55560637 0.59224457]
6.34999985806644
Episode 424	Average Score: 6.74	Score: 6.35actions batch at 297000-th learning:
	 shape = (128, 4),
	 mean = [0.17526056 0.31120247 0.26672083 0.23576967],
	  std = [0.5597569 0.5685041 0.5500169 0.5991744]
5.939999867230654
Episode 425	Average Score: 6.71	Score: 5.94actions batch at 298000-th learning:
	 shape = (128, 4),
	 mean = [0.16915578 0.28706187 0.18546812 0.1700816 ],
	  std = [0.59035987 0.5725651  0.53346974 0.57424253]
9.599999785423279
Episode 426	Average Score: 6.73	Score: 9.604.399999901652336
Episode 427	Average Score: 6.69	Score: 4.40actions batch at 299000-th learning:
	 shape = (128, 4),
	 mean = [0.20991148 0.28711495 0.24648905 0.17314743],
	  std = [0.5748957  0.5864542  0.5592974  0.58353347]
4.589999897405505
Episode 428	Average Score: 6.66	Score: 4.59actions batch at 300000-th learning:
	 shape = (128, 4),
	 mean = [0.1864498  0.27796546 0.21538182 0.18157391],
	  std = [0.5763567  0.5776925  0.5652375  0.57450813]
9.259999793022871
Episode 429	Average Score: 6.68	Score: 9.265.479999877512455
Episode 430	Average Score: 6.63
actions batch at 301000-th learning:
	 shape = (128, 4),
	 mean = [0.19146216 0.40829948 0.1653572  0.12235152],
	  std = [0.55683476 0.56282055 0.5417216  0.573922  ]
3.4299999233335257
Episode 431	Average Score: 6.57	Score: 3.43actions batch at 302000-th learning:
	 shape = (128, 4),
	 mean = [0.08349871 0.24652867 0.17423606 0.15009756],
	  std = [0.5285464  0.5483061  0.53542346 0.5710117 ]
7.769999826326966
Episode 432	Average Score: 6.61	Score: 7.774.399999901652336
Episode 433	Average Score: 6.58	Score: 4.40actions batch at 303000-th learning:
	 shape = (128, 4),
	 mean = [0.23565334 0.33529642 0.2504322  0.15747832],
	  std = [0.56887084 0.554402   0.5433747  0.58172804]
6.719999849796295
Episode 434	Average Score: 6.59	Score: 6.72actions batch at 304000-th learning:
	 shape = (128, 4),
	 mean = [0.2961156  0.3236392  0.2805827  0.22860941],
	  std = [0.57837373 0.56315935 0.527329   0.6013657 ]
13.779999691992998
Episode 435	Average Score: 6.65	Score: 13.78actions batch at 305000-th learning:
	 shape = (128, 4),
	 mean = [0.19621302 0.26893327 0.27640778 0.1541332 ],
	  std = [0.572593   0.6024557  0.53785765 0.5621425 ]
6.129999862983823
Episode 436	Average Score: 6.63	Score: 6.137.419999834150076
Episode 437	Average Score: 6.66	Score: 7.42actions batch at 306000-th learning:
	 shape = (128, 4),
	 mean = [0.13998498 0.21860325 0.20744476 0.13500388],
	  std = [0.5586935  0.5626724  0.5250405  0.54345185]
7.539999831467867
Episode 438	Average Score: 6.69	Score: 7.54actions batch at 307000-th learning:
	 shape = (128, 4),
	 mean = [0.12497421 0.19731382 0.14300667 0.22391227],
	  std = [0.5455557  0.56086934 0.4812517  0.60913944]
7.62999982945621
Episode 439	Average Score: 6.68	Score: 7.639.659999784082174
Episode 440	Average Score: 6.69
actions batch at 308000-th learning:
	 shape = (128, 4),
	 mean = [0.26922837 0.35212252 0.2434353  0.2386293 ],
	  std = [0.586268   0.5474552  0.55779356 0.5728527 ]
16.019999641925097
Episode 441	Average Score: 6.77	Score: 16.02actions batch at 309000-th learning:
	 shape = (128, 4),
	 mean = [0.17692384 0.33423716 0.27058774 0.1100928 ],
	  std = [0.5690514  0.5681036  0.56306547 0.54373145]
8.109999818727374
Episode 442	Average Score: 6.78	Score: 8.115.529999876394868
Episode 443	Average Score: 6.75	Score: 5.53actions batch at 310000-th learning:
	 shape = (128, 4),
	 mean = [0.2499282  0.3434491  0.14478591 0.1315868 ],
	  std = [0.59066683 0.5633786  0.51059455 0.56725955]
11.609999740496278
Episode 444	Average Score: 6.78	Score: 11.61actions batch at 311000-th learning:
	 shape = (128, 4),
	 mean = [0.14117256 0.27830902 0.15516794 0.16172554],
	  std = [0.55269426 0.56502813 0.52804977 0.56552005]
5.8499998692423105
Episode 445	Average Score: 6.78	Score: 5.85actions batch at 312000-th learning:
	 shape = (128, 4),
	 mean = [0.36063206 0.3073965  0.26119408 0.21308331],
	  std = [0.58705425 0.5978212  0.5543867  0.59762126]
7.109999841079116
Episode 446	Average Score: 6.77	Score: 7.1110.839999757707119
Episode 447	Average Score: 6.82	Score: 10.84actions batch at 313000-th learning:
	 shape = (128, 4),
	 mean = [0.13934915 0.19122332 0.22969347 0.11605585],
	  std = [0.5614505 0.5348062 0.5635945 0.5506584]
9.029999798163772
Episode 448	Average Score: 6.83	Score: 9.03actions batch at 314000-th learning:
	 shape = (128, 4),
	 mean = [0.21053837 0.3281726  0.30015013 0.28638375],
	  std = [0.55656195 0.57405096 0.5618825  0.590227  ]
10.239999771118164
Episode 449	Average Score: 6.86	Score: 10.247.809999825432897
Episode 450	Average Score: 6.86
actions batch at 315000-th learning:
	 shape = (128, 4),
	 mean = [0.2488263  0.16062793 0.16824661 0.1720474 ],
	  std = [0.5921131  0.5288539  0.51624936 0.5871258 ]
9.109999796375632
Episode 451	Average Score: 6.86	Score: 9.11actions batch at 316000-th learning:
	 shape = (128, 4),
	 mean = [0.21034455 0.2645246  0.16926229 0.1379679 ],
	  std = [0.5741762  0.55326134 0.5243384  0.5644531 ]
7.329999836161733
Episode 452	Average Score: 6.85	Score: 7.336.3999998569488525
Episode 453	Average Score: 6.79	Score: 6.40actions batch at 317000-th learning:
	 shape = (128, 4),
	 mean = [0.2144485  0.32435912 0.22188741 0.18540324],
	  std = [0.5623675  0.55643344 0.5387318  0.57094866]
6.889999845996499
Episode 454	Average Score: 6.79	Score: 6.89actions batch at 318000-th learning:
	 shape = (128, 4),
	 mean = [0.21174581 0.3349801  0.21862563 0.13410689],
	  std = [0.5653995  0.55705947 0.5444483  0.5879171 ]
14.649999672546983
Episode 455	Average Score: 6.87	Score: 14.65actions batch at 319000-th learning:
	 shape = (128, 4),
	 mean = [0.20092222 0.2831013  0.17343727 0.12908922],
	  std = [0.59040797 0.5772902  0.5580881  0.56399   ]
5.049999887123704
Episode 456	Average Score: 6.84	Score: 5.058.469999810680747
Episode 457	Average Score: 6.87	Score: 8.47actions batch at 320000-th learning:
	 shape = (128, 4),
	 mean = [0.10353358 0.35424334 0.14812835 0.2535463 ],
	  std = [0.56080794 0.55907357 0.5098611  0.57692575]
3.469999922439456
Episode 458	Average Score: 6.84	Score: 3.47actions batch at 321000-th learning:
	 shape = (128, 4),
	 mean = [0.18924813 0.25372508 0.18774281 0.31214425],
	  std = [0.5570587  0.57019407 0.55362266 0.59487265]
16.1899996381253
Episode 459	Average Score: 6.93	Score: 16.1910.219999771565199
Episode 460	Average Score: 6.96
actions batch at 322000-th learning:
	 shape = (128, 4),
	 mean = [0.28523552 0.31199613 0.22986166 0.2973762 ],
	  std = [0.5963641  0.5880976  0.52405494 0.61001956]
7.599999830126762
Episode 461	Average Score: 6.97	Score: 7.60actions batch at 323000-th learning:
	 shape = (128, 4),
	 mean = [0.16023324 0.29790562 0.14764939 0.21936294],
	  std = [0.5340329  0.5492364  0.5154548  0.58060753]
9.17999979481101
Episode 462	Average Score: 7.00	Score: 9.188.349999813362956
Episode 463	Average Score: 7.00	Score: 8.35actions batch at 324000-th learning:
	 shape = (128, 4),
	 mean = [0.18536115 0.27553445 0.24249282 0.16207105],
	  std = [0.5622755  0.57309264 0.5442342  0.57066625]
4.449999900534749
Episode 464	Average Score: 6.96	Score: 4.45actions batch at 325000-th learning:
	 shape = (128, 4),
	 mean = [0.24097201 0.35098547 0.29666248 0.1980482 ],
	  std = [0.56731427 0.55919856 0.55506855 0.59450275]
8.59999980777502
Episode 465	Average Score: 6.98	Score: 8.60actions batch at 326000-th learning:
	 shape = (128, 4),
	 mean = [0.13977265 0.21455428 0.16166173 0.15230183],
	  std = [0.5550393  0.5910247  0.54254687 0.57665366]
10.02999977581203
Episode 466	Average Score: 7.01	Score: 10.038.219999816268682
Episode 467	Average Score: 7.01	Score: 8.22actions batch at 327000-th learning:
	 shape = (128, 4),
	 mean = [0.21156938 0.3880552  0.19137391 0.25505283],
	  std = [0.5750604  0.56440854 0.53750306 0.5853486 ]
8.189999816939235
Episode 468	Average Score: 6.99	Score: 8.19actions batch at 328000-th learning:
	 shape = (128, 4),
	 mean = [0.17735471 0.23220515 0.16650209 0.2953148 ],
	  std = [0.5759832  0.58453196 0.55241925 0.58736676]
6.439999856054783
Episode 469	Average Score: 6.98	Score: 6.449.309999791905284
Episode 470	Average Score: 7.02
actions batch at 329000-th learning:
	 shape = (128, 4),
	 mean = [0.14827506 0.15722269 0.12363351 0.19096954],
	  std = [0.5614241  0.542468   0.49677747 0.5589932 ]
9.519999787211418
Episode 471	Average Score: 7.06	Score: 9.52actions batch at 330000-th learning:
	 shape = (128, 4),
	 mean = [0.2160615  0.28058907 0.20905854 0.18367417],
	  std = [0.576664  0.5749337 0.5571512 0.5807962]
8.669999806210399
Episode 472	Average Score: 7.07	Score: 8.679.289999792352319
Episode 473	Average Score: 7.10	Score: 9.29actions batch at 331000-th learning:
	 shape = (128, 4),
	 mean = [0.14478998 0.18132366 0.19974035 0.22469349],
	  std = [0.5819793  0.545818   0.5418736  0.59418434]
4.8699998911470175
Episode 474	Average Score: 7.10	Score: 4.87actions batch at 332000-th learning:
	 shape = (128, 4),
	 mean = [0.25849113 0.22405307 0.23471832 0.22954461],
	  std = [0.55220175 0.584389   0.5325051  0.5941362 ]
7.139999840408564
Episode 475	Average Score: 7.06	Score: 7.14actions batch at 333000-th learning:
	 shape = (128, 4),
	 mean = [0.19284509 0.3700411  0.26223797 0.23163414],
	  std = [0.5754139  0.58150774 0.519815   0.5759919 ]
7.519999831914902
Episode 476	Average Score: 7.04	Score: 7.528.339999813586473
Episode 477	Average Score: 7.08	Score: 8.34actions batch at 334000-th learning:
	 shape = (128, 4),
	 mean = [0.1753163  0.29589662 0.17629018 0.2307845 ],
	  std = [0.5936292  0.5943103  0.54802924 0.5888338 ]
10.859999757260084
Episode 478	Average Score: 7.15	Score: 10.86actions batch at 335000-th learning:
	 shape = (128, 4),
	 mean = [0.18292738 0.3139793  0.20785286 0.16798699],
	  std = [0.5749928 0.5671142 0.5492898 0.5675206]
9.959999777376652
Episode 479	Average Score: 7.17	Score: 9.9610.659999761730433
Episode 480	Average Score: 7.21
actions batch at 336000-th learning:
	 shape = (128, 4),
	 mean = [0.26832178 0.29416534 0.16695975 0.34011093],
	  std = [0.6049504  0.57678926 0.53973955 0.59301835]
7.949999822303653
Episode 481	Average Score: 7.23	Score: 7.95actions batch at 337000-th learning:
	 shape = (128, 4),
	 mean = [0.14625815 0.2590765  0.24322726 0.13524838],
	  std = [0.558166  0.5771248 0.5429408 0.5774749]
10.37999976798892
Episode 482	Average Score: 7.28	Score: 10.388.999999798834324
Episode 483	Average Score: 7.33	Score: 9.00actions batch at 338000-th learning:
	 shape = (128, 4),
	 mean = [0.15408735 0.37876302 0.21736935 0.20483683],
	  std = [0.5779231  0.5759483  0.55920374 0.56617916]
11.019999753683805
Episode 484	Average Score: 7.38	Score: 11.02actions batch at 339000-th learning:
	 shape = (128, 4),
	 mean = [0.20741485 0.29894722 0.23525898 0.14473045],
	  std = [0.5537896  0.56971645 0.56701785 0.5522558 ]
8.639999806880951
Episode 485	Average Score: 7.41	Score: 8.64actions batch at 340000-th learning:
	 shape = (128, 4),
	 mean = [0.20759068 0.27272218 0.25467128 0.15345673],
	  std = [0.5895067  0.56597185 0.52567655 0.58945763]
7.829999824985862
Episode 486	Average Score: 7.43	Score: 7.838.119999818503857
Episode 487	Average Score: 7.45	Score: 8.12actions batch at 341000-th learning:
	 shape = (128, 4),
	 mean = [0.26463637 0.19702137 0.16977707 0.12618345],
	  std = [0.5765879  0.5601691  0.50396496 0.5584081 ]
9.579999785870314
Episode 488	Average Score: 7.49	Score: 9.58actions batch at 342000-th learning:
	 shape = (128, 4),
	 mean = [0.32815698 0.27606437 0.28366783 0.12942922],
	  std = [0.5934102 0.5956572 0.5189685 0.5913078]
8.029999820515513
Episode 489	Average Score: 7.42	Score: 8.0310.929999755695462
Episode 490	Average Score: 7.46
actions batch at 343000-th learning:
	 shape = (128, 4),
	 mean = [0.14576146 0.2164773  0.16968566 0.17197387],
	  std = [0.5506813  0.5562776  0.5073283  0.57166576]
7.869999824091792
Episode 491	Average Score: 7.51	Score: 7.87actions batch at 344000-th learning:
	 shape = (128, 4),
	 mean = [0.17786656 0.31556368 0.23537035 0.27343556],
	  std = [0.55807626 0.57362133 0.5740153  0.59035534]
6.369999857619405
Episode 492	Average Score: 7.53	Score: 6.379.349999791011214
Episode 493	Average Score: 7.56	Score: 9.35actions batch at 345000-th learning:
	 shape = (128, 4),
	 mean = [0.2289799  0.23672965 0.1961482  0.23246655],
	  std = [0.57202446 0.5727291  0.51866823 0.5728507 ]
6.339999858289957
Episode 494	Average Score: 7.60	Score: 6.34actions batch at 346000-th learning:
	 shape = (128, 4),
	 mean = [0.17864905 0.27114424 0.18889254 0.15430705],
	  std = [0.5651469  0.58234924 0.5321497  0.56518626]
11.859999734908342
Episode 495	Average Score: 7.69	Score: 11.86actions batch at 347000-th learning:
	 shape = (128, 4),
	 mean = [0.25604758 0.28218278 0.21218027 0.21715347],
	  std = [0.57117206 0.5517883  0.53899807 0.58906597]
10.939999755471945
Episode 496	Average Score: 7.75	Score: 10.945.119999885559082
Episode 497	Average Score: 7.74	Score: 5.12actions batch at 348000-th learning:
	 shape = (128, 4),
	 mean = [0.21918136 0.36167383 0.26026827 0.17681895],
	  std = [0.57527137 0.5856037  0.5565762  0.5837341 ]
7.7399998269975185
Episode 498	Average Score: 7.76	Score: 7.74actions batch at 349000-th learning:
	 shape = (128, 4),
	 mean = [0.18916765 0.1839657  0.21121188 0.26350978],
	  std = [0.54434925 0.54241127 0.5423638  0.59610665]
10.839999757707119
Episode 499	Average Score: 7.81	Score: 10.849.63999978452921
Episode 500	Average Score: 7.86
actions batch at 350000-th learning:
	 shape = (128, 4),
	 mean = [0.30749655 0.3936128  0.21108022 0.17095059],
	  std = [0.58203745 0.5476559  0.54580665 0.58788866]
6.779999848455191
Episode 501	Average Score: 7.87	Score: 6.78actions batch at 351000-th learning:
	 shape = (128, 4),
	 mean = [0.14748968 0.22548369 0.18555167 0.24310215],
	  std = [0.54499245 0.5585451  0.5246655  0.58407605]
6.239999860525131
Episode 502	Average Score: 7.88	Score: 6.2414.349999679252505
Episode 503	Average Score: 7.96	Score: 14.35actions batch at 352000-th learning:
	 shape = (128, 4),
	 mean = [0.18680936 0.30415204 0.29716566 0.23221032],
	  std = [0.5651783  0.5912584  0.54152524 0.5820843 ]
11.079999752342701
Episode 504	Average Score: 8.03	Score: 11.08actions batch at 353000-th learning:
	 shape = (128, 4),
	 mean = [0.08516928 0.24826457 0.1827087  0.26014352],
	  std = [0.5023469  0.55378413 0.51408356 0.60607076]
6.419999856501818
Episode 505	Average Score: 8.04	Score: 6.42actions batch at 354000-th learning:
	 shape = (128, 4),
	 mean = [0.12058713 0.27617112 0.10156346 0.2522161 ],
	  std = [0.54718864 0.5596536  0.52382153 0.60476923]
8.079999819397926
Episode 506	Average Score: 8.07	Score: 8.085.259999882429838
Episode 507	Average Score: 8.07	Score: 5.26actions batch at 355000-th learning:
	 shape = (128, 4),
	 mean = [0.19994916 0.28198808 0.31346238 0.22575344],
	  std = [0.5669159  0.58076346 0.5381433  0.5872106 ]
5.8499998692423105
Episode 508	Average Score: 8.08	Score: 5.85actions batch at 356000-th learning:
	 shape = (128, 4),
	 mean = [0.2713839  0.32626414 0.24506402 0.3180739 ],
	  std = [0.5349772  0.5614636  0.53804636 0.5934116 ]
8.819999802857637
Episode 509	Average Score: 8.07	Score: 8.829.48999978788197
Episode 510	Average Score: 8.10
actions batch at 357000-th learning:
	 shape = (128, 4),
	 mean = [0.08645805 0.22321166 0.2385441  0.22233997],
	  std = [0.5462151  0.5609077  0.54248416 0.6050573 ]
8.959999799728394
Episode 511	Average Score: 8.12	Score: 8.96actions batch at 358000-th learning:
	 shape = (128, 4),
	 mean = [0.21097676 0.23248753 0.18697456 0.17533879],
	  std = [0.55541474 0.5697666  0.511501   0.5531365 ]
9.069999797269702
Episode 512	Average Score: 8.16	Score: 9.075.829999869689345
Episode 513	Average Score: 8.19	Score: 5.83actions batch at 359000-th learning:
	 shape = (128, 4),
	 mean = [0.23412906 0.28895923 0.29134947 0.274133  ],
	  std = [0.5612805 0.5549395 0.5715403 0.5966497]
5.839999869465828
Episode 514	Average Score: 8.19	Score: 5.84actions batch at 360000-th learning:
	 shape = (128, 4),
	 mean = [0.17159678 0.20023307 0.27261356 0.22194615],
	  std = [0.5542255  0.5865517  0.548061   0.60604566]
10.14999977312982
Episode 515	Average Score: 8.24	Score: 10.15actions batch at 361000-th learning:
	 shape = (128, 4),
	 mean = [0.24729592 0.31704745 0.31344864 0.23689097],
	  std = [0.58687055 0.58349985 0.548979   0.61083204]
5.689999872818589
Episode 516	Average Score: 8.23	Score: 5.695.7799998708069324
Episode 517	Average Score: 8.22	Score: 5.78actions batch at 362000-th learning:
	 shape = (128, 4),
	 mean = [0.18045616 0.29422277 0.20878698 0.2056555 ],
	  std = [0.5689266  0.5751293  0.52051055 0.57979614]
7.839999824762344
Episode 518	Average Score: 8.22	Score: 7.84actions batch at 363000-th learning:
	 shape = (128, 4),
	 mean = [0.16347994 0.20537646 0.2215024  0.24222358],
	  std = [0.57101566 0.5655505  0.5357716  0.57967114]
9.63999978452921
Episode 519	Average Score: 8.26	Score: 9.6412.309999724850059
Episode 520	Average Score: 8.31
actions batch at 364000-th learning:
	 shape = (128, 4),
	 mean = [0.2720128  0.29996687 0.22976054 0.21582459],
	  std = [0.5860111  0.5917978  0.53085107 0.56347525]
7.149999840185046
Episode 521	Average Score: 8.29	Score: 7.15actions batch at 365000-th learning:
	 shape = (128, 4),
	 mean = [0.23661289 0.26593807 0.26427475 0.19649325],
	  std = [0.5758093 0.566394  0.5338541 0.5757875]
5.829999869689345
Episode 522	Average Score: 8.29	Score: 5.8310.599999763071537
Episode 523	Average Score: 8.32	Score: 10.60actions batch at 366000-th learning:
	 shape = (128, 4),
	 mean = [0.26157156 0.26831198 0.17385766 0.16843954],
	  std = [0.59368104 0.57425284 0.51516896 0.5809233 ]
7.419999834150076
Episode 524	Average Score: 8.33	Score: 7.42actions batch at 367000-th learning:
	 shape = (128, 4),
	 mean = [0.20167118 0.21414791 0.15915027 0.21910357],
	  std = [0.5548373  0.5670647  0.5084885  0.57958686]
7.759999826550484
Episode 525	Average Score: 8.35	Score: 7.76actions batch at 368000-th learning:
	 shape = (128, 4),
	 mean = [0.17659487 0.32337117 0.19322655 0.27936932],
	  std = [0.5400268  0.5908821  0.51074207 0.5797898 ]
4.899999890476465
Episode 526	Average Score: 8.30	Score: 4.907.959999822080135
Episode 527	Average Score: 8.34	Score: 7.96actions batch at 369000-th learning:
	 shape = (128, 4),
	 mean = [0.18038517 0.25453338 0.26405683 0.281928  ],
	  std = [0.5858605  0.5825821  0.5571838  0.61066484]
7.689999828115106
Episode 528	Average Score: 8.37	Score: 7.69actions batch at 370000-th learning:
	 shape = (128, 4),
	 mean = [0.15985405 0.25260806 0.15021066 0.18538202],
	  std = [0.56423986 0.5481346  0.5218328  0.57449526]
9.209999794140458
Episode 529	Average Score: 8.37	Score: 9.219.779999781399965
Episode 530	Average Score: 8.41
actions batch at 371000-th learning:
	 shape = (128, 4),
	 mean = [0.21132928 0.25615767 0.27734306 0.1934171 ],
	  std = [0.5957036  0.58206683 0.5433915  0.5826686 ]
10.769999759271741
Episode 531	Average Score: 8.48	Score: 10.77actions batch at 372000-th learning:
	 shape = (128, 4),
	 mean = [0.19779027 0.27696756 0.14765003 0.2535853 ],
	  std = [0.5733451 0.5764143 0.5481104 0.6003711]
9.829999780282378
Episode 532	Average Score: 8.50	Score: 9.8310.429999766871333
Episode 533	Average Score: 8.56	Score: 10.43actions batch at 373000-th learning:
	 shape = (128, 4),
	 mean = [0.261473   0.15685286 0.31844598 0.20554602],
	  std = [0.57818806 0.56812316 0.5439506  0.5743897 ]
8.649999806657434
Episode 534	Average Score: 8.58	Score: 8.65actions batch at 374000-th learning:
	 shape = (128, 4),
	 mean = [0.23160088 0.28165576 0.22140786 0.2205275 ],
	  std = [0.55168635 0.5589865  0.5407008  0.5834192 ]
12.479999721050262
Episode 535	Average Score: 8.57	Score: 12.48actions batch at 375000-th learning:
	 shape = (128, 4),
	 mean = [0.2415612  0.22477923 0.21388589 0.2470134 ],
	  std = [0.55910105 0.54121184 0.5259919  0.5718381 ]
2.6199999414384365
Episode 536	Average Score: 8.53	Score: 2.629.729999782517552
Episode 537	Average Score: 8.56	Score: 9.73actions batch at 376000-th learning:
	 shape = (128, 4),
	 mean = [0.3001408  0.28971985 0.27493852 0.26297843],
	  std = [0.5576417 0.5395084 0.5535854 0.5816234]
11.259999748319387
Episode 538	Average Score: 8.60	Score: 11.26actions batch at 377000-th learning:
	 shape = (128, 4),
	 mean = [0.16187535 0.16640696 0.22159122 0.1942373 ],
	  std = [0.5980572  0.56938094 0.5675537  0.5665491 ]
10.409999767318368
Episode 539	Average Score: 8.62	Score: 10.4110.64999976195395
Episode 540	Average Score: 8.63
actions batch at 378000-th learning:
	 shape = (128, 4),
	 mean = [0.27465913 0.18238038 0.20401785 0.25196546],
	  std = [0.578243   0.5545371  0.52233315 0.565817  ]
8.159999817609787
Episode 541	Average Score: 8.55	Score: 8.16actions batch at 379000-th learning:
	 shape = (128, 4),
	 mean = [0.18028243 0.2578011  0.20710798 0.18484402],
	  std = [0.55332935 0.5467123  0.5350279  0.5830567 ]
9.069999797269702
Episode 542	Average Score: 8.56	Score: 9.0711.65999973937869
Episode 543	Average Score: 8.63	Score: 11.66actions batch at 380000-th learning:
	 shape = (128, 4),
	 mean = [0.1693486  0.19532184 0.1970266  0.29355726],
	  std = [0.55377936 0.5541001  0.5271056  0.58395976]
9.149999795481563
Episode 544	Average Score: 8.60	Score: 9.15actions batch at 381000-th learning:
	 shape = (128, 4),
	 mean = [0.22684222 0.30243257 0.17563775 0.24653432],
	  std = [0.57617927 0.58754    0.54906934 0.59927315]
8.459999810904264
Episode 545	Average Score: 8.63	Score: 8.46actions batch at 382000-th learning:
	 shape = (128, 4),
	 mean = [0.22977585 0.29790637 0.2951303  0.15836939],
	  std = [0.5603378  0.569035   0.5367464  0.59601957]
14.139999683946371
Episode 546	Average Score: 8.70	Score: 14.149.239999793469906
Episode 547	Average Score: 8.68	Score: 9.24actions batch at 383000-th learning:
	 shape = (128, 4),
	 mean = [0.25649214 0.23235767 0.22968207 0.27474156],
	  std = [0.59417737 0.5735488  0.5036705  0.5531834 ]
8.90999980084598
Episode 548	Average Score: 8.68	Score: 8.91actions batch at 384000-th learning:
	 shape = (128, 4),
	 mean = [0.22124025 0.14225172 0.18226312 0.19976884],
	  std = [0.56282806 0.5220424  0.51894414 0.5895354 ]
10.119999773800373
Episode 549	Average Score: 8.68	Score: 10.128.729999804869294
Episode 550	Average Score: 8.69
actions batch at 385000-th learning:
	 shape = (128, 4),
	 mean = [0.22449008 0.32092893 0.20181294 0.28567404],
	  std = [0.5666401  0.5822582  0.53142285 0.6027332 ]
8.919999800622463
Episode 551	Average Score: 8.69	Score: 8.92actions batch at 386000-th learning:
	 shape = (128, 4),
	 mean = [0.16794397 0.31578672 0.18723173 0.15520303],
	  std = [0.5625349 0.5826329 0.5176457 0.5479899]
13.74999969266355
Episode 552	Average Score: 8.75	Score: 13.757.879999823868275
Episode 553	Average Score: 8.76	Score: 7.88actions batch at 387000-th learning:
	 shape = (128, 4),
	 mean = [0.17734641 0.2517629  0.18560217 0.1760774 ],
	  std = [0.53343964 0.5788179  0.5406561  0.5508558 ]
9.159999795258045
Episode 554	Average Score: 8.79	Score: 9.16actions batch at 388000-th learning:
	 shape = (128, 4),
	 mean = [0.23132868 0.24258831 0.385773   0.20916803],
	  std = [0.55736566 0.57845443 0.53241396 0.57310635]
8.469999810680747
Episode 555	Average Score: 8.73	Score: 8.47actions batch at 389000-th learning:
	 shape = (128, 4),
	 mean = [0.3139527  0.2062558  0.246846   0.25204518],
	  std = [0.5788504  0.5722036  0.50087535 0.5901205 ]
6.239999860525131
Episode 556	Average Score: 8.74	Score: 6.249.879999779164791
Episode 557	Average Score: 8.75	Score: 9.88actions batch at 390000-th learning:
	 shape = (128, 4),
	 mean = [0.21812311 0.21267286 0.20923935 0.16194336],
	  std = [0.5823802  0.56401587 0.5503289  0.54796535]
7.919999822974205
Episode 558	Average Score: 8.80	Score: 7.92actions batch at 391000-th learning:
	 shape = (128, 4),
	 mean = [0.19354011 0.18225156 0.27071095 0.25212777],
	  std = [0.55807066 0.55762625 0.5435621  0.56808203]
9.05999979749322
Episode 559	Average Score: 8.72	Score: 9.068.419999811798334
Episode 560	Average Score: 8.71
actions batch at 392000-th learning:
	 shape = (128, 4),
	 mean = [0.14084478 0.22503135 0.23256303 0.23195487],
	  std = [0.5779955  0.55163693 0.520079   0.59145814]
7.019999843090773
Episode 561	Average Score: 8.70	Score: 7.02actions batch at 393000-th learning:
	 shape = (128, 4),
	 mean = [0.3039444  0.22939742 0.2316466  0.21333933],
	  std = [0.56766057 0.56728566 0.5220971  0.57597065]
7.159999839961529
Episode 562	Average Score: 8.68	Score: 7.1611.149999750778079
Episode 563	Average Score: 8.71	Score: 11.15actions batch at 394000-th learning:
	 shape = (128, 4),
	 mean = [0.11785359 0.27766848 0.17065819 0.17755648],
	  std = [0.5252373  0.5637581  0.5203631  0.58792764]
6.559999853372574
Episode 564	Average Score: 8.73	Score: 6.56actions batch at 395000-th learning:
	 shape = (128, 4),
	 mean = [0.14583263 0.25878373 0.18867151 0.2815558 ],
	  std = [0.5657266 0.5510272 0.5328371 0.5749656]
8.039999820291996
Episode 565	Average Score: 8.72	Score: 8.04actions batch at 396000-th learning:
	 shape = (128, 4),
	 mean = [0.16132282 0.29981115 0.16073368 0.25719625],
	  std = [0.5458591 0.582632  0.5207341 0.5682608]
10.799999758601189
Episode 566	Average Score: 8.73	Score: 10.808.899999801069498
Episode 567	Average Score: 8.74	Score: 8.90actions batch at 397000-th learning:
	 shape = (128, 4),
	 mean = [0.20006351 0.24246821 0.20123845 0.20571068],
	  std = [0.5646888 0.5775476 0.5056681 0.5814922]
9.17999979481101
Episode 568	Average Score: 8.75	Score: 9.18actions batch at 398000-th learning:
	 shape = (128, 4),
	 mean = [0.07638093 0.19125183 0.15026565 0.16729653],
	  std = [0.5237221  0.5395217  0.5490962  0.58620906]
9.389999790117145
Episode 569	Average Score: 8.78	Score: 9.398.069999819621444
Episode 570	Average Score: 8.77
actions batch at 399000-th learning:
	 shape = (128, 4),
	 mean = [0.15587284 0.16434017 0.14900671 0.11181647],
	  std = [0.56458956 0.54317445 0.49706063 0.564499  ]
10.989999754354358
Episode 571	Average Score: 8.78	Score: 10.99actions batch at 400000-th learning:
	 shape = (128, 4),
	 mean = [0.22049814 0.20301093 0.2884918  0.25777382],
	  std = [0.592355  0.5866557 0.559651  0.592582 ]
13.729999693110585
Episode 572	Average Score: 8.83	Score: 13.737.62999982945621
Episode 573	Average Score: 8.81	Score: 7.63actions batch at 401000-th learning:
	 shape = (128, 4),
	 mean = [0.17571203 0.30276853 0.19647007 0.28019077],
	  std = [0.5783629  0.55116105 0.53544915 0.5669916 ]
7.099999841302633
Episode 574	Average Score: 8.84	Score: 7.10actions batch at 402000-th learning:
	 shape = (128, 4),
	 mean = [0.15331519 0.21643622 0.18450116 0.20247719],
	  std = [0.56385094 0.5714435  0.50977784 0.5645874 ]
11.089999752119184
Episode 575	Average Score: 8.88	Score: 11.09actions batch at 403000-th learning:
	 shape = (128, 4),
	 mean = [0.17088059 0.22652754 0.2288113  0.17641221],
	  std = [0.5654039  0.5514751  0.5402112  0.56776917]
9.759999781847
Episode 576	Average Score: 8.90	Score: 9.764.9899998884648085
Episode 577	Average Score: 8.87	Score: 4.99actions batch at 404000-th learning:
	 shape = (128, 4),
	 mean = [0.18111373 0.1811357  0.20767845 0.20430706],
	  std = [0.5712587 0.5509691 0.529342  0.5771785]
6.289999859407544
Episode 578	Average Score: 8.82	Score: 6.29actions batch at 405000-th learning:
	 shape = (128, 4),
	 mean = [0.24389659 0.24186917 0.28894138 0.26496613],
	  std = [0.5758272  0.5354564  0.5279313  0.59436363]
7.959999822080135
Episode 579	Average Score: 8.80	Score: 7.9611.239999748766422
Episode 580	Average Score: 8.81
actions batch at 406000-th learning:
	 shape = (128, 4),
	 mean = [0.2086608  0.21896689 0.23662567 0.24719782],
	  std = [0.5700262  0.57515275 0.54473835 0.6094571 ]
8.499999810010195
Episode 581	Average Score: 8.81	Score: 8.50actions batch at 407000-th learning:
	 shape = (128, 4),
	 mean = [0.13530298 0.25619724 0.17010881 0.29345268],
	  std = [0.54582477 0.57253754 0.50625795 0.5677807 ]
7.569999830797315
Episode 582	Average Score: 8.78	Score: 7.5711.939999733120203
Episode 583	Average Score: 8.81	Score: 11.94actions batch at 408000-th learning:
	 shape = (128, 4),
	 mean = [0.21719486 0.24720778 0.16210015 0.17655827],
	  std = [0.5793886 0.581674  0.5042278 0.5857092]
10.29999976977706
Episode 584	Average Score: 8.81	Score: 10.30actions batch at 409000-th learning:
	 shape = (128, 4),
	 mean = [0.24502394 0.19226053 0.2232714  0.17753127],
	  std = [0.5740268  0.5523578  0.54527724 0.5933305 ]
9.199999794363976
Episode 585	Average Score: 8.81	Score: 9.20actions batch at 410000-th learning:
	 shape = (128, 4),
	 mean = [0.15521571 0.1936684  0.23237413 0.3043501 ],
	  std = [0.57103294 0.5369361  0.5361439  0.58013505]
10.529999764636159
Episode 586	Average Score: 8.84	Score: 10.539.229999793693423
Episode 587	Average Score: 8.85	Score: 9.23actions batch at 411000-th learning:
	 shape = (128, 4),
	 mean = [0.10317443 0.21114025 0.15896815 0.1292269 ],
	  std = [0.5333391  0.5675033  0.49953732 0.5443099 ]
8.309999814257026
Episode 588	Average Score: 8.84	Score: 8.31actions batch at 412000-th learning:
	 shape = (128, 4),
	 mean = [0.21239196 0.20309249 0.18331437 0.15215647],
	  std = [0.60947585 0.5433174  0.51990986 0.56720257]
7.159999839961529
Episode 589	Average Score: 8.83	Score: 7.1610.589999763295054
Episode 590	Average Score: 8.82
actions batch at 413000-th learning:
	 shape = (128, 4),
	 mean = [0.18830797 0.18883617 0.30256492 0.2218918 ],
	  std = [0.5711152  0.57209885 0.5384848  0.57767314]
8.949999799951911
Episode 591	Average Score: 8.83	Score: 8.95actions batch at 414000-th learning:
	 shape = (128, 4),
	 mean = [0.19945449 0.21728247 0.16864847 0.1460541 ],
	  std = [0.5771053  0.56468296 0.51246655 0.5735154 ]
8.639999806880951
Episode 592	Average Score: 8.86	Score: 8.647.519999831914902
Episode 593	Average Score: 8.84	Score: 7.52actions batch at 415000-th learning:
	 shape = (128, 4),
	 mean = [0.19186234 0.22529186 0.16828392 0.24275294],
	  std = [0.57782626 0.57149905 0.54727787 0.58451456]
6.249999860301614
Episode 594	Average Score: 8.84	Score: 6.25actions batch at 416000-th learning:
	 shape = (128, 4),
	 mean = [0.17159405 0.16694511 0.17193155 0.14524318],
	  std = [0.5457299 0.546861  0.5077054 0.5561827]
4.9399998895823956
Episode 595	Average Score: 8.77	Score: 4.94actions batch at 417000-th learning:
	 shape = (128, 4),
	 mean = [0.13974951 0.2099133  0.17254794 0.04615323],
	  std = [0.54342836 0.55274737 0.52043074 0.5181667 ]
10.10999977402389
Episode 596	Average Score: 8.76	Score: 10.118.029999820515513
Episode 597	Average Score: 8.79	Score: 8.03actions batch at 418000-th learning:
	 shape = (128, 4),
	 mean = [0.2150444  0.23225151 0.18477835 0.24955462],
	  std = [0.5596105  0.56804293 0.5467442  0.587448  ]
11.099999751895666
Episode 598	Average Score: 8.82	Score: 11.10actions batch at 419000-th learning:
	 shape = (128, 4),
	 mean = [0.19707239 0.24011467 0.22086811 0.11246803],
	  std = [0.5564508  0.5616255  0.5347741  0.55757076]
10.479999765753746
Episode 599	Average Score: 8.82	Score: 10.488.199999816715717
Episode 600	Average Score: 8.81
actions batch at 420000-th learning:
	 shape = (128, 4),
	 mean = [0.104666   0.19184671 0.1543878  0.1618554 ],
	  std = [0.5112323  0.5516438  0.51672184 0.56290257]
8.16999981738627
Episode 601	Average Score: 8.82	Score: 8.17actions batch at 421000-th learning:
	 shape = (128, 4),
	 mean = [0.18267581 0.2890304  0.26073742 0.1515049 ],
	  std = [0.5552575  0.6000601  0.54858017 0.56264466]
7.409999834373593
Episode 602	Average Score: 8.83	Score: 7.4110.129999773576856
Episode 603	Average Score: 8.79	Score: 10.13actions batch at 422000-th learning:
	 shape = (128, 4),
	 mean = [0.13685103 0.17314433 0.2641288  0.20737608],
	  std = [0.5387416  0.56452173 0.5319257  0.60665065]
8.51999980956316
Episode 604	Average Score: 8.76	Score: 8.52actions batch at 423000-th learning:
	 shape = (128, 4),
	 mean = [0.19283815 0.3116241  0.30628413 0.24416809],
	  std = [0.5736514  0.6012619  0.5473499  0.57676095]
8.279999814927578
Episode 605	Average Score: 8.78	Score: 8.28actions batch at 424000-th learning:
	 shape = (128, 4),
	 mean = [0.19632994 0.16349372 0.20019877 0.27072775],
	  std = [0.5686343  0.59960747 0.550322   0.56976134]
11.139999751001596
Episode 606	Average Score: 8.81	Score: 11.145.589999875053763
Episode 607	Average Score: 8.82	Score: 5.59actions batch at 425000-th learning:
	 shape = (128, 4),
	 mean = [0.27221078 0.26877588 0.26048288 0.16589135],
	  std = [0.58773506 0.584662   0.52516174 0.56997705]
2.509999943897128
Episode 608	Average Score: 8.78	Score: 2.51actions batch at 426000-th learning:
	 shape = (128, 4),
	 mean = [0.10953479 0.27019837 0.17918164 0.20843703],
	  std = [0.49223885 0.5611864  0.496025   0.55973965]
5.989999866113067
Episode 609	Average Score: 8.75	Score: 5.995.709999872371554
Episode 610	Average Score: 8.72
actions batch at 427000-th learning:
	 shape = (128, 4),
	 mean = [0.17764682 0.24180199 0.2185768  0.2877502 ],
	  std = [0.53605336 0.5623535  0.52434623 0.61069053]
9.67999978363514
Episode 611	Average Score: 8.72	Score: 9.68actions batch at 428000-th learning:
	 shape = (128, 4),
	 mean = [0.1152518  0.18887177 0.20762177 0.30324423],
	  std = [0.550466  0.5696503 0.534251  0.586655 ]
8.879999801516533
Episode 612	Average Score: 8.72	Score: 8.886.69999985024333
Episode 613	Average Score: 8.73	Score: 6.70actions batch at 429000-th learning:
	 shape = (128, 4),
	 mean = [0.29920235 0.27184278 0.23280911 0.1584134 ],
	  std = [0.58349305 0.5855009  0.53606015 0.541567  ]
4.649999896064401
Episode 614	Average Score: 8.72	Score: 4.65actions batch at 430000-th learning:
	 shape = (128, 4),
	 mean = [0.27789655 0.25864437 0.19021885 0.19456626],
	  std = [0.5818392 0.6038048 0.5173317 0.5613984]
8.379999812692404
Episode 615	Average Score: 8.70	Score: 8.38actions batch at 431000-th learning:
	 shape = (128, 4),
	 mean = [0.22718515 0.19843371 0.18622635 0.24129896],
	  std = [0.561917   0.5752133  0.50738674 0.57173944]
11.649999739602208
Episode 616	Average Score: 8.76	Score: 11.658.339999813586473
Episode 617	Average Score: 8.79	Score: 8.34actions batch at 432000-th learning:
	 shape = (128, 4),
	 mean = [0.12812689 0.21111436 0.12284866 0.22283246],
	  std = [0.5462175  0.5419407  0.49167964 0.54902035]
7.709999827668071
Episode 618	Average Score: 8.78	Score: 7.71actions batch at 433000-th learning:
	 shape = (128, 4),
	 mean = [0.14546357 0.0974602  0.23899998 0.18471056],
	  std = [0.55548316 0.5502029  0.5186693  0.58736753]
7.969999821856618
Episode 619	Average Score: 8.77	Score: 7.978.629999807104468
Episode 620	Average Score: 8.73
actions batch at 434000-th learning:
	 shape = (128, 4),
	 mean = [0.23416501 0.17412306 0.19270983 0.2515802 ],
	  std = [0.57060456 0.58188254 0.52583355 0.58073884]
7.409999834373593
Episode 621	Average Score: 8.73	Score: 7.41actions batch at 435000-th learning:
	 shape = (128, 4),
	 mean = [0.32103935 0.20221427 0.28768182 0.2882578 ],
	  std = [0.5884489  0.56832176 0.53485286 0.6097355 ]
9.669999783858657
Episode 622	Average Score: 8.77	Score: 9.677.179999839514494
Episode 623	Average Score: 8.74	Score: 7.18actions batch at 436000-th learning:
	 shape = (128, 4),
	 mean = [0.1827607  0.16637273 0.18396053 0.28820515],
	  std = [0.5776374  0.5448455  0.5164452  0.60185486]
6.309999858960509
Episode 624	Average Score: 8.73	Score: 6.31actions batch at 437000-th learning:
	 shape = (128, 4),
	 mean = [0.17419878 0.21199253 0.12431522 0.13724971],
	  std = [0.5515395  0.5628717  0.48876154 0.5403997 ]
9.909999778494239
Episode 625	Average Score: 8.75	Score: 9.91actions batch at 438000-th learning:
	 shape = (128, 4),
	 mean = [0.21522588 0.19913048 0.2733662  0.2447581 ],
	  std = [0.549297   0.5662276  0.5563023  0.60915256]
12.689999716356397
Episode 626	Average Score: 8.83	Score: 12.698.669999806210399
Episode 627	Average Score: 8.83	Score: 8.67actions batch at 439000-th learning:
	 shape = (128, 4),
	 mean = [0.13602282 0.20701237 0.24546333 0.23906091],
	  std = [0.5459011  0.54927325 0.53804225 0.58792955]
10.56999976374209
Episode 628	Average Score: 8.86	Score: 10.57actions batch at 440000-th learning:
	 shape = (128, 4),
	 mean = [0.14529538 0.22105893 0.19694018 0.23542146],
	  std = [0.56179607 0.5804045  0.48021084 0.57483965]
8.159999817609787
Episode 629	Average Score: 8.85	Score: 8.1610.72999976016581
Episode 630	Average Score: 8.86
actions batch at 441000-th learning:
	 shape = (128, 4),
	 mean = [0.22275272 0.28168783 0.17892247 0.209246  ],
	  std = [0.54748523 0.5779897  0.51390725 0.5626286 ]
3.3199999257922173
Episode 631	Average Score: 8.79	Score: 3.32actions batch at 442000-th learning:
	 shape = (128, 4),
	 mean = [0.15164231 0.22197331 0.2271807  0.31658238],
	  std = [0.5295616  0.5817919  0.50320685 0.5974389 ]
9.359999790787697
Episode 632	Average Score: 8.78	Score: 9.368.539999809116125
Episode 633	Average Score: 8.76	Score: 8.54actions batch at 443000-th learning:
	 shape = (128, 4),
	 mean = [0.18791932 0.13467236 0.21688196 0.21658781],
	  std = [0.5538304  0.56175387 0.5264163  0.5695155 ]
5.799999870359898
Episode 634	Average Score: 8.73	Score: 5.80actions batch at 444000-th learning:
	 shape = (128, 4),
	 mean = [0.16644083 0.16584443 0.17473762 0.2785894 ],
	  std = [0.5877228 0.5906512 0.527242  0.6145046]
8.059999819844961
Episode 635	Average Score: 8.69	Score: 8.06actions batch at 445000-th learning:
	 shape = (128, 4),
	 mean = [0.08783721 0.24352169 0.19410674 0.16746935],
	  std = [0.5423183  0.5769583  0.52806014 0.58514017]
10.559999763965607
Episode 636	Average Score: 8.77	Score: 10.568.59999980777502
Episode 637	Average Score: 8.76	Score: 8.60actions batch at 446000-th learning:
	 shape = (128, 4),
	 mean = [0.16007221 0.10827013 0.25194615 0.1862233 ],
	  std = [0.5259245  0.53089243 0.50661325 0.5773662 ]
10.519999764859676
Episode 638	Average Score: 8.75	Score: 10.52actions batch at 447000-th learning:
	 shape = (128, 4),
	 mean = [0.24560747 0.18833591 0.21298704 0.20204328],
	  std = [0.5610315  0.58573925 0.51809824 0.5769373 ]
10.849999757483602
Episode 639	Average Score: 8.76	Score: 10.8514.879999667406082
Episode 640	Average Score: 8.80
actions batch at 448000-th learning:
	 shape = (128, 4),
	 mean = [0.23614156 0.2075055  0.28491718 0.12698638],
	  std = [0.5733245 0.5676865 0.5423905 0.5417032]
3.4399999231100082
Episode 641	Average Score: 8.75	Score: 3.44actions batch at 449000-th learning:
	 shape = (128, 4),
	 mean = [0.19546133 0.20757681 0.2628177  0.22260624],
	  std = [0.58295923 0.58766276 0.5342953  0.5911253 ]
7.62999982945621
Episode 642	Average Score: 8.74	Score: 7.6310.68999976105988
Episode 643	Average Score: 8.73	Score: 10.69actions batch at 450000-th learning:
	 shape = (128, 4),
	 mean = [0.17073397 0.2759712  0.23644623 0.15048224],
	  std = [0.5619741 0.5980685 0.5113142 0.5806215]
11.169999750331044
Episode 644	Average Score: 8.75	Score: 11.17actions batch at 451000-th learning:
	 shape = (128, 4),
	 mean = [0.16445789 0.23631784 0.193172   0.13074121],
	  std = [0.5461113  0.5825475  0.51476145 0.57766473]
11.419999744743109
Episode 645	Average Score: 8.78	Score: 11.42actions batch at 452000-th learning:
	 shape = (128, 4),
	 mean = [0.281115   0.30202714 0.26954362 0.23624556],
	  std = [0.5829967  0.5956203  0.54090464 0.5965563 ]
9.44999978877604
Episode 646	Average Score: 8.73	Score: 9.456.289999859407544
Episode 647	Average Score: 8.70	Score: 6.29actions batch at 453000-th learning:
	 shape = (128, 4),
	 mean = [0.13563694 0.2266533  0.22727898 0.24398245],
	  std = [0.53145486 0.55692756 0.5136251  0.58093554]
6.299999859184027
Episode 648	Average Score: 8.67	Score: 6.30actions batch at 454000-th learning:
	 shape = (128, 4),
	 mean = [0.16500519 0.28912997 0.23750931 0.23521066],
	  std = [0.5664435  0.56978625 0.5353596  0.5804999 ]
7.639999829232693
Episode 649	Average Score: 8.65	Score: 7.649.249999793246388
Episode 650	Average Score: 8.65
actions batch at 455000-th learning:
	 shape = (128, 4),
	 mean = [0.18798666 0.296462   0.22764765 0.19057088],
	  std = [0.5295227  0.56883013 0.48715514 0.56220955]
7.059999842196703
Episode 651	Average Score: 8.64	Score: 7.06actions batch at 456000-th learning:
	 shape = (128, 4),
	 mean = [0.15368952 0.25524867 0.19109035 0.32789826],
	  std = [0.53452533 0.58655864 0.5052293  0.558388  ]
7.459999833256006
Episode 652	Average Score: 8.57	Score: 7.4617.519999608397484
Episode 653	Average Score: 8.67	Score: 17.52actions batch at 457000-th learning:
	 shape = (128, 4),
	 mean = [0.13915846 0.23246442 0.20516567 0.22952217],
	  std = [0.5794984 0.5681768 0.5339608 0.56907  ]
8.729999804869294
Episode 654	Average Score: 8.66	Score: 8.73actions batch at 458000-th learning:
	 shape = (128, 4),
	 mean = [0.21017905 0.18010369 0.25727752 0.23599644],
	  std = [0.5812939  0.56260157 0.5035746  0.5770761 ]
8.029999820515513
Episode 655	Average Score: 8.66	Score: 8.03actions batch at 459000-th learning:
	 shape = (128, 4),
	 mean = [0.20831153 0.2466964  0.15039493 0.23843388],
	  std = [0.57616764 0.5641497  0.49671027 0.56247   ]
8.24999981559813
Episode 656	Average Score: 8.68	Score: 8.25
7.079999841749668
Episode 657	Average Score: 8.65	Score: 7.08
actions batch at 460000-th learning:
	 shape = (128, 4),
	 mean = [0.19871382 0.23362733 0.28494856 0.12628144],
	  std = [0.56185275 0.5628339  0.5273071  0.53681636]
4.969999888911843
Episode 658	Average Score: 8.62	Score: 4.97
actions batch at 461000-th learning:
	 shape = (128, 4),
	 mean = [0.17870697 0.13389891 0.17034516 0.32472882],
	  std = [0.54420865 0.55177    0.53427416 0.5856994 ]
8.739999804645777
Episode 659	Average Score: 8.62	Score: 8.74
8.51999980956316
Episode 660	Average Score: 8.62
actions batch at 462000-th learning:
	 shape = (128, 4),
	 mean = [0.16446446 0.1332284  0.22905992 0.22370191],
	  std = [0.55823284 0.5348846  0.52884054 0.5944094 ]
8.28999981470406
Episode 661	Average Score: 8.63	Score: 8.29
actions batch at 463000-th learning:
	 shape = (128, 4),
	 mean = [0.13478278 0.22915716 0.22506863 0.27999637],
	  std = [0.5558355  0.57427466 0.5016882  0.5804946 ]
10.25999977067113
Episode 662	Average Score: 8.66	Score: 10.26
13.0899997074157
Episode 663	Average Score: 8.68	Score: 13.09
actions batch at 464000-th learning:
	 shape = (128, 4),
	 mean = [0.21537858 0.22941327 0.33138213 0.27015167],
	  std = [0.54482794 0.5830307  0.5511835  0.5939915 ]
6.339999858289957
Episode 664	Average Score: 8.68	Score: 6.34
actions batch at 465000-th learning:
	 shape = (128, 4),
	 mean = [0.25892705 0.23660213 0.11881253 0.27535644],
	  std = [0.5718645  0.56137115 0.49994072 0.5934097 ]
9.249999793246388
Episode 665	Average Score: 8.69	Score: 9.25
actions batch at 466000-th learning:
	 shape = (128, 4),
	 mean = [0.24152178 0.2027602  0.31850553 0.23164453],
	  std = [0.5583886  0.56314725 0.5216182  0.5711784 ]
9.049999797716737
Episode 666	Average Score: 8.68	Score: 9.05
9.349999791011214
Episode 667	Average Score: 8.68	Score: 9.35
actions batch at 467000-th learning:
	 shape = (128, 4),
	 mean = [0.20419472 0.20625159 0.19539128 0.22273576],
	  std = [0.55939937 0.56528264 0.49979487 0.5835022 ]
11.18999974988401
Episode 668	Average Score: 8.70	Score: 11.19
actions batch at 468000-th learning:
	 shape = (128, 4),
	 mean = [0.28921095 0.22219482 0.18881945 0.06999128],
	  std = [0.5695326  0.550352   0.483902   0.53057855]
7.89999982342124
Episode 669	Average Score: 8.69	Score: 7.90
8.409999812021852
Episode 670	Average Score: 8.69
actions batch at 469000-th learning:
	 shape = (128, 4),
	 mean = [0.14465347 0.24063331 0.2468136  0.25922963],
	  std = [0.5422394  0.58987117 0.5132179  0.5656299 ]
11.019999753683805
Episode 671	Average Score: 8.69	Score: 11.02
actions batch at 470000-th learning:
	 shape = (128, 4),
	 mean = [0.3139908  0.12481824 0.18813153 0.20776251],
	  std = [0.5889596  0.57184434 0.50344324 0.58154374]
7.35999983549118
Episode 672	Average Score: 8.63	Score: 7.36
10.599999763071537
Episode 673	Average Score: 8.66	Score: 10.60
actions batch at 471000-th learning:
	 shape = (128, 4),
	 mean = [0.13672048 0.12407667 0.12740119 0.15269023],
	  std = [0.5695109  0.5382712  0.49940726 0.5458575 ]
4.929999889805913
Episode 674	Average Score: 8.63	Score: 4.93
actions batch at 472000-th learning:
	 shape = (128, 4),
	 mean = [0.15310822 0.24497968 0.20956174 0.18875712],
	  std = [0.58197397 0.5781212  0.5289173  0.54864347]
7.579999830573797
Episode 675	Average Score: 8.60	Score: 7.58
actions batch at 473000-th learning:
	 shape = (128, 4),
	 mean = [0.17427802 0.1640715  0.2076479  0.2200089 ],
	  std = [0.54921436 0.5501899  0.51438856 0.5664249 ]
6.849999846890569
Episode 676	Average Score: 8.57	Score: 6.85
7.62999982945621
Episode 677	Average Score: 8.60	Score: 7.63
actions batch at 474000-th learning:
	 shape = (128, 4),
	 mean = [0.2699368  0.21855098 0.23036596 0.20911369],
	  std = [0.57535005 0.5859869  0.5255171  0.58761656]
6.629999851807952
Episode 678	Average Score: 8.60	Score: 6.63
actions batch at 475000-th learning:
	 shape = (128, 4),
	 mean = [0.19437139 0.3447048  0.27305746 0.21052033],
	  std = [0.5850763  0.57088995 0.5411035  0.57678723]
8.32999981380999
Episode 679	Average Score: 8.60	Score: 8.33
7.719999827444553
Episode 680	Average Score: 8.57
actions batch at 476000-th learning:
	 shape = (128, 4),
	 mean = [0.1185104  0.15801065 0.21771978 0.2734977 ],
	  std = [0.50057024 0.54818326 0.5194762  0.56488925]
7.079999841749668
Episode 681	Average Score: 8.55	Score: 7.08
actions batch at 477000-th learning:
	 shape = (128, 4),
	 mean = [0.30099118 0.25856593 0.24958898 0.1834198 ],
	  std = [0.57743263 0.5690456  0.5325299  0.57760125]
8.319999814033508
Episode 682	Average Score: 8.56	Score: 8.32
8.82999980263412
Episode 683	Average Score: 8.53	Score: 8.83
actions batch at 478000-th learning:
	 shape = (128, 4),
	 mean = [0.258582   0.19020313 0.2863408  0.16859798],
	  std = [0.57152355 0.5706191  0.51574266 0.57556677]
9.869999779388309
Episode 684	Average Score: 8.53	Score: 9.87
actions batch at 479000-th learning:
	 shape = (128, 4),
	 mean = [0.10382827 0.12287927 0.21576895 0.24237527],
	  std = [0.5406721 0.54571   0.5214731 0.5761038]
8.029999820515513
Episode 685	Average Score: 8.51	Score: 8.03
actions batch at 480000-th learning:
	 shape = (128, 4),
	 mean = [0.22856565 0.20008212 0.22677065 0.28234017],
	  std = [0.589215   0.5700226  0.52614164 0.5786139 ]
7.259999837726355
Episode 686	Average Score: 8.48	Score: 7.26
8.659999806433916
Episode 687	Average Score: 8.48	Score: 8.66
actions batch at 481000-th learning:
	 shape = (128, 4),
	 mean = [0.28523424 0.26222748 0.28678972 0.20737089],
	  std = [0.5734362 0.5624189 0.5586627 0.5537294]
7.659999828785658
Episode 688	Average Score: 8.47	Score: 7.66
actions batch at 482000-th learning:
	 shape = (128, 4),
	 mean = [0.2838394  0.25629386 0.24054058 0.273739  ],
	  std = [0.57514626 0.58104855 0.52204365 0.6140273 ]
6.139999862760305
Episode 689	Average Score: 8.46	Score: 6.14
5.689999872818589
Episode 690	Average Score: 8.41
actions batch at 483000-th learning:
	 shape = (128, 4),
	 mean = [0.27386287 0.21484502 0.18854351 0.25892824],
	  std = [0.57704854 0.5705357  0.52610457 0.57947254]
7.579999830573797
Episode 691	Average Score: 8.40	Score: 7.58
actions batch at 484000-th learning:
	 shape = (128, 4),
	 mean = [0.204518   0.21287186 0.3033535  0.33929384],
	  std = [0.57257384 0.5214556  0.5626673  0.60121703]
5.989999866113067
Episode 692	Average Score: 8.37	Score: 5.99
9.309999791905284
Episode 693	Average Score: 8.39	Score: 9.31
actions batch at 485000-th learning:
	 shape = (128, 4),
	 mean = [0.16066372 0.08267278 0.16425543 0.2700543 ],
	  std = [0.5712056  0.53001624 0.5079573  0.5860708 ]
5.80999987013638
Episode 694	Average Score: 8.38	Score: 5.81
actions batch at 486000-th learning:
	 shape = (128, 4),
	 mean = [0.21702969 0.13429488 0.24577996 0.22386469],
	  std = [0.56852853 0.56518394 0.53228205 0.56501347]
6.309999858960509
Episode 695	Average Score: 8.40	Score: 6.31
actions batch at 487000-th learning:
	 shape = (128, 4),
	 mean = [0.14199166 0.2712835  0.19337536 0.20745546],
	  std = [0.5597914  0.58959997 0.50115234 0.56495607]
7.099999841302633
Episode 696	Average Score: 8.37	Score: 7.10
8.799999803304672
Episode 697	Average Score: 8.37	Score: 8.80
actions batch at 488000-th learning:
	 shape = (128, 4),
	 mean = [0.17573199 0.19421467 0.29777822 0.24833776],
	  std = [0.56471926 0.56842214 0.5457046  0.59385777]
5.979999866336584
Episode 698	Average Score: 8.32	Score: 5.98
actions batch at 489000-th learning:
	 shape = (128, 4),
	 mean = [0.18206367 0.18216987 0.2625877  0.2442673 ],
	  std = [0.5732063  0.59679896 0.5467566  0.6068043 ]
10.289999770000577
Episode 699	Average Score: 8.32	Score: 10.29
8.429999811574817
Episode 700	Average Score: 8.32
actions batch at 490000-th learning:
	 shape = (128, 4),
	 mean = [0.19382264 0.24992944 0.26784718 0.16588753],
	  std = [0.5707998  0.58837557 0.5327311  0.5498936 ]
7.259999837726355
Episode 701	Average Score: 8.31	Score: 7.26
actions batch at 491000-th learning:
	 shape = (128, 4),
	 mean = [0.15571383 0.1837785  0.19825026 0.18844955],
	  std = [0.56036085 0.5622622  0.4989568  0.5588039 ]
5.7299998719245195
Episode 702	Average Score: 8.30	Score: 5.73
8.219999816268682
Episode 703	Average Score: 8.28	Score: 8.22
actions batch at 492000-th learning:
	 shape = (128, 4),
	 mean = [0.18465106 0.2606382  0.26067218 0.285557  ],
	  std = [0.5684467  0.5863221  0.51205105 0.5831577 ]
5.7299998719245195
Episode 704	Average Score: 8.25	Score: 5.73
actions batch at 493000-th learning:
	 shape = (128, 4),
	 mean = [0.19452721 0.24921821 0.25285643 0.13161962],
	  std = [0.5622907  0.5500466  0.5201279  0.55942386]
8.269999815151095
Episode 705	Average Score: 8.25	Score: 8.27
actions batch at 494000-th learning:
	 shape = (128, 4),
	 mean = [0.2408081  0.20151505 0.30703005 0.2414702 ],
	  std = [0.57176065 0.55003893 0.53203    0.5633361 ]
6.2099998611956835
Episode 706	Average Score: 8.20	Score: 6.21
5.619999874383211
Episode 707	Average Score: 8.20	Score: 5.62
actions batch at 495000-th learning:
	 shape = (128, 4),
	 mean = [0.21431923 0.22417916 0.22890018 0.27393574],
	  std = [0.5903472  0.56468487 0.5429559  0.60332125]
6.799999848008156
Episode 708	Average Score: 8.24	Score: 6.80
actions batch at 496000-th learning:
	 shape = (128, 4),
	 mean = [0.292538   0.20862387 0.34315932 0.2423492 ],
	  std = [0.57592106 0.58559847 0.55305564 0.5901857 ]
4.2699999045580626
Episode 709	Average Score: 8.23	Score: 4.27
3.299999926239252
Episode 710	Average Score: 8.20
actions batch at 497000-th learning:
	 shape = (128, 4),
	 mean = [0.19819614 0.31806248 0.27894464 0.33412218],
	  std = [0.58041114 0.57886636 0.5333921  0.58922595]
8.569999808445573
Episode 711	Average Score: 8.19	Score: 8.57
actions batch at 498000-th learning:
	 shape = (128, 4),
	 mean = [0.1298355  0.16407044 0.1773071  0.19782396],
	  std = [0.5479902  0.53937674 0.5012786  0.56743306]
10.389999767765403
Episode 712	Average Score: 8.21	Score: 10.39
10.559999763965607
Episode 713	Average Score: 8.25	Score: 10.56
actions batch at 499000-th learning:
	 shape = (128, 4),
	 mean = [0.21112484 0.23724146 0.15164332 0.19654535],
	  std = [0.57062685 0.5716306  0.52466035 0.5749296 ]
6.909999845549464
Episode 714	Average Score: 8.27	Score: 6.91
actions batch at 500000-th learning:
	 shape = (128, 4),
	 mean = [0.27651444 0.21365216 0.24436481 0.24391976],
	  std = [0.5651264  0.57239026 0.5314123  0.5730174 ]
7.059999842196703
Episode 715	Average Score: 8.26	Score: 7.06
actions batch at 501000-th learning:
	 shape = (128, 4),
	 mean = [0.18074824 0.19010523 0.2440306  0.17965215],
	  std = [0.5399459  0.5593853  0.52065045 0.5801596 ]
4.619999896734953
Episode 716	Average Score: 8.19	Score: 4.62
10.519999764859676
Episode 717	Average Score: 8.21	Score: 10.52
actions batch at 502000-th learning:
	 shape = (128, 4),
	 mean = [0.23197144 0.09220891 0.22891913 0.12166308],
	  std = [0.57337403 0.50406283 0.51911825 0.54047674]
5.7799998708069324
Episode 718	Average Score: 8.19	Score: 5.78
actions batch at 503000-th learning:
	 shape = (128, 4),
	 mean = [0.23280968 0.16539723 0.21368648 0.20286235],
	  std = [0.5832388 0.5580598 0.5131219 0.5653513]
6.589999852702022
Episode 719	Average Score: 8.17	Score: 6.59
7.799999825656414
Episode 720	Average Score: 8.17
actions batch at 504000-th learning:
	 shape = (128, 4),
	 mean = [0.2720582  0.19479904 0.2535093  0.24315578],
	  std = [0.5915892 0.5492908 0.5175045 0.5684848]
8.639999806880951
Episode 721	Average Score: 8.18	Score: 8.64
actions batch at 505000-th learning:
	 shape = (128, 4),
	 mean = [0.2655608  0.25988263 0.23004033 0.24333361],
	  std = [0.60530543 0.5689912  0.48484132 0.5969015 ]
5.089999886229634
Episode 722	Average Score: 8.13	Score: 5.09
8.179999817162752
Episode 723	Average Score: 8.14	Score: 8.18
actions batch at 506000-th learning:
	 shape = (128, 4),
	 mean = [0.11241477 0.16315302 0.20239748 0.23119423],
	  std = [0.5122504  0.5525141  0.48611873 0.56736815]
10.449999766424298
Episode 724	Average Score: 8.18	Score: 10.45
actions batch at 507000-th learning:
	 shape = (128, 4),
	 mean = [0.10707559 0.20060636 0.20006123 0.30741942],
	  std = [0.51453143 0.5874797  0.5169212  0.60333467]
7.889999823644757
Episode 725	Average Score: 8.16	Score: 7.89
actions batch at 508000-th learning:
	 shape = (128, 4),
	 mean = [0.26654255 0.19294398 0.19689198 0.13801748],
	  std = [0.5654853 0.5565593 0.5086964 0.562842 ]
6.979999843984842
Episode 726	Average Score: 8.11	Score: 6.98
10.929999755695462
Episode 727	Average Score: 8.13	Score: 10.93
actions batch at 509000-th learning:
	 shape = (128, 4),
	 mean = [0.14740656 0.17945065 0.22960107 0.185216  ],
	  std = [0.55025065 0.5472032  0.49691412 0.58602333]
8.12999981828034
Episode 728	Average Score: 8.10	Score: 8.13
actions batch at 510000-th learning:
	 shape = (128, 4),
	 mean = [0.15634075 0.25534707 0.2639318  0.23558512],
	  std = [0.54872197 0.5801384  0.5255002  0.57796174]
6.539999853819609
Episode 729	Average Score: 8.09	Score: 6.54
7.679999828338623
Episode 730	Average Score: 8.06
actions batch at 511000-th learning:
	 shape = (128, 4),
	 mean = [0.13135453 0.1811159  0.17369437 0.16696683],
	  std = [0.5303977 0.567285  0.5151501 0.5556803]
10.719999760389328
Episode 731	Average Score: 8.13	Score: 10.72
actions batch at 512000-th learning:
	 shape = (128, 4),
	 mean = [0.11355477 0.14679764 0.15467085 0.17920245],
	  std = [0.54732865 0.535621   0.4788851  0.57122236]
4.549999898299575
Episode 732	Average Score: 8.08	Score: 4.55
10.129999773576856
Episode 733	Average Score: 8.10	Score: 10.13
actions batch at 513000-th learning:
	 shape = (128, 4),
	 mean = [0.156009   0.23393269 0.23026013 0.20202963],
	  std = [0.5391474  0.591      0.5337567  0.56386304]
9.189999794587493
Episode 734	Average Score: 8.13	Score: 9.19
actions batch at 514000-th learning:
	 shape = (128, 4),
	 mean = [0.18147883 0.20370972 0.21736003 0.2620971 ],
	  std = [0.56162757 0.5821541  0.51301813 0.5797591 ]
11.219999749213457
Episode 735	Average Score: 8.16	Score: 11.22
actions batch at 515000-th learning:
	 shape = (128, 4),
	 mean = [0.1506858  0.15681963 0.24977618 0.27463016],
	  std = [0.5412656  0.56809694 0.5303794  0.58444905]
7.289999837055802
Episode 736	Average Score: 8.13	Score: 7.29
8.999999798834324
Episode 737	Average Score: 8.14	Score: 9.00
actions batch at 516000-th learning:
	 shape = (128, 4),
	 mean = [0.14461148 0.2724026  0.08337638 0.2635635 ],
	  std = [0.53575325 0.5555811  0.46319956 0.55728835]
8.849999802187085
Episode 738	Average Score: 8.12	Score: 8.85
actions batch at 517000-th learning:
	 shape = (128, 4),
	 mean = [0.04366425 0.19079837 0.2628672  0.18250372],
	  std = [0.502833   0.57493293 0.5299815  0.572225  ]
10.659999761730433
Episode 739	Average Score: 8.12	Score: 10.66
8.689999805763364
Episode 740	Average Score: 8.06
actions batch at 518000-th learning:
	 shape = (128, 4),
	 mean = [0.25588313 0.18344913 0.19938399 0.17296281],
	  std = [0.58304995 0.5603983  0.49899238 0.5801665 ]
5.839999869465828
Episode 741	Average Score: 8.08	Score: 5.84
actions batch at 519000-th learning:
	 shape = (128, 4),
	 mean = [0.38309842 0.35565555 0.3325745  0.18214928],
	  std = [0.5808768  0.58366966 0.53416663 0.5915451 ]
10.769999759271741
Episode 742	Average Score: 8.11	Score: 10.77
8.32999981380999
Episode 743	Average Score: 8.09	Score: 8.33
actions batch at 520000-th learning:
	 shape = (128, 4),
	 mean = [0.22622801 0.15524587 0.23601624 0.20015615],
	  std = [0.5823146  0.5608736  0.53358537 0.5746921 ]
6.539999853819609
Episode 744	Average Score: 8.04	Score: 6.54
actions batch at 521000-th learning:
	 shape = (128, 4),
	 mean = [0.17067637 0.12565476 0.24349868 0.21117797],
	  std = [0.5648298  0.5413651  0.52550244 0.5523295 ]
12.38999972306192
Episode 745	Average Score: 8.05	Score: 12.39
actions batch at 522000-th learning:
	 shape = (128, 4),
	 mean = [0.16042443 0.1638419  0.17645273 0.24712725],
	  std = [0.56480217 0.5615084  0.48971614 0.5843995 ]
8.719999805092812
Episode 746	Average Score: 8.04	Score: 8.72
9.399999789893627
Episode 747	Average Score: 8.07	Score: 9.40
actions batch at 523000-th learning:
	 shape = (128, 4),
	 mean = [0.244243   0.23573722 0.17914447 0.29236916],
	  std = [0.5679143  0.56555396 0.5427629  0.58719957]
8.01999982073903
Episode 748	Average Score: 8.09	Score: 8.02
actions batch at 524000-th learning:
	 shape = (128, 4),
	 mean = [0.2857363  0.24097262 0.30078173 0.22259432],
	  std = [0.5697344  0.5565174  0.51640505 0.5860952 ]
7.169999839738011
Episode 749	Average Score: 8.09	Score: 7.17
7.1199998408555984
Episode 750	Average Score: 8.07
actions batch at 525000-th learning:
	 shape = (128, 4),
	 mean = [0.210344   0.06733917 0.22985885 0.27389708],
	  std = [0.5365106  0.50504524 0.5269636  0.588053  ]
7.689999828115106
Episode 751	Average Score: 8.07	Score: 7.69
actions batch at 526000-th learning:
	 shape = (128, 4),
	 mean = [0.2262993  0.20372538 0.24181162 0.2823822 ],
	  std = [0.554646   0.5457283  0.52188206 0.5847022 ]
8.699999805539846
Episode 752	Average Score: 8.08	Score: 8.70
5.639999873936176
Episode 753	Average Score: 7.97	Score: 5.64
actions batch at 527000-th learning:
	 shape = (128, 4),
	 mean = [0.18278983 0.17100495 0.22736211 0.2020195 ],
	  std = [0.5656455  0.54496753 0.5235469  0.5610173 ]
8.269999815151095
Episode 754	Average Score: 7.96	Score: 8.27
actions batch at 528000-th learning:
	 shape = (128, 4),
	 mean = [0.22076346 0.20295936 0.2723594  0.19961773],
	  std = [0.58124393 0.5815875  0.54327524 0.57161254]
7.919999822974205
Episode 755	Average Score: 7.96	Score: 7.92
actions batch at 529000-th learning:
	 shape = (128, 4),
	 mean = [0.17906937 0.16694988 0.22665834 0.18696111],
	  std = [0.54077846 0.5565616  0.52313817 0.5739334 ]
7.1899998392909765
Episode 756	Average Score: 7.95	Score: 7.19
10.289999770000577
Episode 757	Average Score: 7.98	Score: 10.29
actions batch at 530000-th learning:
	 shape = (128, 4),
	 mean = [0.30035812 0.14005092 0.29512522 0.27798155],
	  std = [0.5927056  0.5648909  0.50003314 0.58459944]
7.409999834373593
Episode 758	Average Score: 8.01	Score: 7.41
actions batch at 531000-th learning:
	 shape = (128, 4),
	 mean = [0.27022302 0.23370467 0.22746415 0.22830378],
	  std = [0.55210084 0.57894963 0.53637534 0.58119607]
0.6199999861419201
Episode 759	Average Score: 7.92	Score: 0.62
6.909999845549464
Episode 760	Average Score: 7.91
actions batch at 532000-th learning:
	 shape = (128, 4),
	 mean = [0.22574596 0.22249772 0.27601248 0.2524048 ],
	  std = [0.54437673 0.5703755  0.5330503  0.578137  ]
8.579999808222055
Episode 761	Average Score: 7.91	Score: 8.58
actions batch at 533000-th learning:
	 shape = (128, 4),
	 mean = [0.2578127  0.18997851 0.23120219 0.29026917],
	  std = [0.55670947 0.5452056  0.5445242  0.5687961 ]
8.389999812468886
Episode 762	Average Score: 7.89	Score: 8.39
10.549999764189124
Episode 763	Average Score: 7.87	Score: 10.55
actions batch at 534000-th learning:
	 shape = (128, 4),
	 mean = [0.18110844 0.15148191 0.32162565 0.24677503],
	  std = [0.56978196 0.55279845 0.52895784 0.5452837 ]
5.719999872148037
Episode 764	Average Score: 7.86	Score: 5.72
actions batch at 535000-th learning:
	 shape = (128, 4),
	 mean = [0.24612673 0.24050571 0.21618074 0.23242003],
	  std = [0.5729927  0.5493952  0.5257416  0.57978076]
7.769999826326966
Episode 765	Average Score: 7.85	Score: 7.77
actions batch at 536000-th learning:
	 shape = (128, 4),
	 mean = [0.18347982 0.20741446 0.2606806  0.2257188 ],
	  std = [0.5525601  0.56496614 0.5110889  0.58146805]
5.239999882876873
Episode 766	Average Score: 7.81	Score: 5.24
4.159999907016754
Episode 767	Average Score: 7.76	Score: 4.16
actions batch at 537000-th learning:
	 shape = (128, 4),
	 mean = [0.25884384 0.25811392 0.24074815 0.19816639],
	  std = [0.5626334  0.59960693 0.5193982  0.56625456]
10.969999754801393
Episode 768	Average Score: 7.75	Score: 10.97
actions batch at 538000-th learning:
	 shape = (128, 4),
	 mean = [0.28590694 0.18675964 0.21949407 0.19139314],
	  std = [0.55356145 0.5620569  0.48993143 0.58201104]
8.309999814257026
Episode 769	Average Score: 7.76	Score: 8.31
9.569999786093831
Episode 770	Average Score: 7.77
actions batch at 539000-th learning:
	 shape = (128, 4),
	 mean = [0.19951905 0.24643025 0.16688299 0.261217  ],
	  std = [0.53789026 0.5963729  0.49888197 0.5900693 ]
6.289999859407544
Episode 771	Average Score: 7.72	Score: 6.29
actions batch at 540000-th learning:
	 shape = (128, 4),
	 mean = [0.28340754 0.17827407 0.29722393 0.21458898],
	  std = [0.58566034 0.56407243 0.5368569  0.568228  ]
6.11999986320734
Episode 772	Average Score: 7.71	Score: 6.12
6.729999849572778
Episode 773	Average Score: 7.67	Score: 6.73
actions batch at 541000-th learning:
	 shape = (128, 4),
	 mean = [0.16352071 0.18267189 0.24126372 0.24161215],
	  std = [0.54535115 0.56314784 0.5520026  0.59235924]
5.749999871477485
Episode 774	Average Score: 7.68	Score: 5.75
actions batch at 542000-th learning:
	 shape = (128, 4),
	 mean = [0.19696146 0.17960854 0.21457677 0.22214542],
	  std = [0.5482908  0.5411062  0.51157147 0.57127434]
6.929999845102429
Episode 775	Average Score: 7.67	Score: 6.93
actions batch at 543000-th learning:
	 shape = (128, 4),
	 mean = [0.27020767 0.24960552 0.35607624 0.23328882],
	  std = [0.59339255 0.5890758  0.522309   0.57557386]
8.82999980263412
Episode 776	Average Score: 7.69	Score: 8.83
8.839999802410603
Episode 777	Average Score: 7.71	Score: 8.84
actions batch at 544000-th learning:
	 shape = (128, 4),
	 mean = [0.18566899 0.16484956 0.2666831  0.20182265],
	  std = [0.5493429 0.5478413 0.5316729 0.5597638]
6.3299998585134745
Episode 778	Average Score: 7.70	Score: 6.33
actions batch at 545000-th learning:
	 shape = (128, 4),
	 mean = [0.2033079  0.21907711 0.19651096 0.29648864],
	  std = [0.5904291  0.56884664 0.5249365  0.6003464 ]
9.789999781176448
Episode 779	Average Score: 7.72	Score: 9.79
6.539999853819609
Episode 780	Average Score: 7.71
actions batch at 546000-th learning:
	 shape = (128, 4),
	 mean = [0.16706526 0.18056981 0.17576346 0.14254065],
	  std = [0.5480396 0.5738843 0.5134349 0.5350927]
7.289999837055802
Episode 781	Average Score: 7.71	Score: 7.29
actions batch at 547000-th learning:
	 shape = (128, 4),
	 mean = [0.25052717 0.24340124 0.22385651 0.26873377],
	  std = [0.55644524 0.5537495  0.4997968  0.55844134]
9.48999978788197
Episode 782	Average Score: 7.72	Score: 9.49
7.569999830797315
Episode 783	Average Score: 7.71	Score: 7.57
actions batch at 548000-th learning:
	 shape = (128, 4),
	 mean = [0.24265002 0.23976539 0.211832   0.2795054 ],
	  std = [0.5850477  0.5782429  0.5200021  0.56766593]
8.299999814480543
Episode 784	Average Score: 7.69	Score: 8.30
actions batch at 549000-th learning:
	 shape = (128, 4),
	 mean = [0.16839382 0.17889102 0.26610827 0.24793538],
	  std = [0.5631894 0.5495693 0.5220167 0.5645984]
11.059999752789736
Episode 785	Average Score: 7.72	Score: 11.06
actions batch at 550000-th learning:
	 shape = (128, 4),
	 mean = [0.22168802 0.22239672 0.2278907  0.20130742],
	  std = [0.55047053 0.56207246 0.53038365 0.5807205 ]
10.969999754801393
Episode 786	Average Score: 7.76	Score: 10.97
6.379999857395887
Episode 787	Average Score: 7.74	Score: 6.38
actions batch at 551000-th learning:
	 shape = (128, 4),
	 mean = [0.22141442 0.2863856  0.27995476 0.2512062 ],
	  std = [0.55384773 0.56775427 0.5419078  0.60106534]
7.399999834597111
Episode 788	Average Score: 7.73	Score: 7.40
actions batch at 552000-th learning:
	 shape = (128, 4),
	 mean = [0.16894752 0.11978152 0.2285068  0.23814571],
	  std = [0.5576553  0.5458577  0.5180214  0.54820216]
7.289999837055802
Episode 789	Average Score: 7.74	Score: 7.29
6.869999846443534
Episode 790	Average Score: 7.76
actions batch at 553000-th learning:
	 shape = (128, 4),
	 mean = [0.17632577 0.21069172 0.17249238 0.23140419],
	  std = [0.5870823  0.579551   0.5322531  0.58036965]
7.869999824091792
Episode 791	Average Score: 7.76	Score: 7.87
actions batch at 554000-th learning:
	 shape = (128, 4),
	 mean = [0.26694995 0.3117197  0.20516498 0.21750274],
	  std = [0.5730005  0.5642157  0.5087334  0.55533004]
6.609999852254987
Episode 792	Average Score: 7.76	Score: 6.61
8.109999818727374
Episode 793	Average Score: 7.75	Score: 8.11
actions batch at 555000-th learning:
	 shape = (128, 4),
	 mean = [0.28717098 0.17749074 0.2916469  0.2966034 ],
	  std = [0.5691458 0.576248  0.5400955 0.5912106]
5.689999872818589
Episode 794	Average Score: 7.75	Score: 5.69
actions batch at 556000-th learning:
	 shape = (128, 4),
	 mean = [0.25825635 0.25752202 0.2740815  0.27435237],
	  std = [0.59102863 0.56254196 0.5449212  0.593979  ]
5.429999878630042
Episode 795	Average Score: 7.74	Score: 5.43
actions batch at 557000-th learning:
	 shape = (128, 4),
	 mean = [0.22504292 0.20852397 0.21927074 0.17114058],
	  std = [0.5556469  0.5649836  0.4911985  0.58066654]
6.049999864771962
Episode 796	Average Score: 7.73	Score: 6.05
4.7499998938292265
Episode 797	Average Score: 7.69	Score: 4.75
actions batch at 558000-th learning:
	 shape = (128, 4),
	 mean = [0.24598372 0.15306291 0.22642775 0.23017271],
	  std = [0.5804518  0.54555106 0.5174616  0.57537395]
6.029999865218997
Episode 798	Average Score: 7.69	Score: 6.03
actions batch at 559000-th learning:
	 shape = (128, 4),
	 mean = [0.26226264 0.33859187 0.24840532 0.22276396],
	  std = [0.5732809  0.5640372  0.52553207 0.5740018 ]
6.569999853149056
Episode 799	Average Score: 7.66	Score: 6.57
7.4299998339265585
Episode 800	Average Score: 7.65
actions batch at 560000-th learning:
	 shape = (128, 4),
	 mean = [0.1470882  0.19791472 0.22157353 0.10426454],
	  std = [0.5389732  0.5723828  0.5214929  0.55109185]
5.089999886229634
Episode 801	Average Score: 7.62	Score: 5.09
actions batch at 561000-th learning:
	 shape = (128, 4),
	 mean = [0.22999677 0.13717695 0.22536573 0.2378707 ],
	  std = [0.5433644  0.55530214 0.51903224 0.55569464]
5.6599998734891415
Episode 802	Average Score: 7.62	Score: 5.66
6.779999848455191
Episode 803	Average Score: 7.61	Score: 6.78
actions batch at 562000-th learning:
	 shape = (128, 4),
	 mean = [0.19475514 0.26999775 0.2626356  0.15423533],
	  std = [0.56699675 0.5446015  0.5338796  0.56635374]
7.419999834150076
Episode 804	Average Score: 7.63	Score: 7.42
actions batch at 563000-th learning:
	 shape = (128, 4),
	 mean = [0.15461618 0.14156449 0.25928673 0.1899394 ],
	  std = [0.5475878 0.5352075 0.5123875 0.5349031]
8.009999820962548
Episode 805	Average Score: 7.62	Score: 8.01
actions batch at 564000-th learning:
	 shape = (128, 4),
	 mean = [0.1904246  0.14574035 0.24643932 0.2371663 ],
	  std = [0.5618845 0.5609214 0.528155  0.5914776]
8.039999820291996
Episode 806	Average Score: 7.64	Score: 8.04
7.209999838843942
Episode 807	Average Score: 7.66	Score: 7.21
actions batch at 565000-th learning:
	 shape = (128, 4),
	 mean = [0.172383   0.20330782 0.25953284 0.27026406],
	  std = [0.49896914 0.5683597  0.5037766  0.57909757]
6.889999845996499
Episode 808	Average Score: 7.66	Score: 6.89
actions batch at 566000-th learning:
	 shape = (128, 4),
	 mean = [0.22823241 0.20866856 0.22893134 0.20752004],
	  std = [0.5671826 0.5595073 0.502298  0.5705579]
4.129999907687306
Episode 809	Average Score: 7.66	Score: 4.13
9.029999798163772
Episode 810	Average Score: 7.71
actions batch at 567000-th learning:
	 shape = (128, 4),
	 mean = [0.22473213 0.15230627 0.23469992 0.31563595],
	  std = [0.54543835 0.5480133  0.5074818  0.5817826 ]
6.5199998542666435
Episode 811	Average Score: 7.69	Score: 6.52
actions batch at 568000-th learning:
	 shape = (128, 4),
	 mean = [0.18659145 0.15323994 0.30089986 0.30908978],
	  std = [0.52748567 0.5603258  0.53119564 0.58301884]
4.0799999088048935
Episode 812	Average Score: 7.63	Score: 4.08
7.329999836161733
Episode 813	Average Score: 7.60	Score: 7.33
actions batch at 569000-th learning:
	 shape = (128, 4),
	 mean = [0.29265028 0.1983232  0.25896567 0.20596023],
	  std = [0.59268016 0.5760557  0.53879803 0.59167963]
6.2799998596310616
Episode 814	Average Score: 7.59	Score: 6.28
actions batch at 570000-th learning:
	 shape = (128, 4),
	 mean = [0.19292931 0.1547384  0.23257141 0.1452868 ],
	  std = [0.5658454  0.53999466 0.53179884 0.5779661 ]
9.499999787658453
Episode 815	Average Score: 7.62	Score: 9.50
actions batch at 571000-th learning:
	 shape = (128, 4),
	 mean = [0.29803595 0.24081379 0.25056264 0.2465683 ],
	  std = [0.5613744  0.5923648  0.51828754 0.5854523 ]
8.269999815151095
Episode 816	Average Score: 7.65	Score: 8.27
6.459999855607748
Episode 817	Average Score: 7.61	Score: 6.46
actions batch at 572000-th learning:
	 shape = (128, 4),
	 mean = [0.25507024 0.2042912  0.2302674  0.2503879 ],
	  std = [0.5592975  0.57537735 0.52117276 0.57073146]
4.329999903216958
Episode 818	Average Score: 7.60	Score: 4.33
actions batch at 573000-th learning:
	 shape = (128, 4),
	 mean = [0.1565939  0.18836562 0.29901722 0.25912395],
	  std = [0.56011206 0.54872525 0.52374667 0.58332   ]
6.439999856054783
Episode 819	Average Score: 7.60	Score: 6.44
9.71999978274107
Episode 820	Average Score: 7.62
actions batch at 574000-th learning:
	 shape = (128, 4),
	 mean = [0.27359188 0.19372542 0.22964048 0.28063446],
	  std = [0.58870864 0.5607909  0.53258413 0.570288  ]
6.609999852254987
Episode 821	Average Score: 7.59	Score: 6.61
actions batch at 575000-th learning:
	 shape = (128, 4),
	 mean = [0.1925971  0.1533559  0.12820248 0.14359748],
	  std = [0.545809   0.5493703  0.47034353 0.577029  ]
5.0599998869001865
Episode 822	Average Score: 7.59	Score: 5.06
7.00999984331429
Episode 823	Average Score: 7.58	Score: 7.01
actions batch at 576000-th learning:
	 shape = (128, 4),
	 mean = [0.33505386 0.26325238 0.25509128 0.16497527],
	  std = [0.5553113  0.558134   0.5381779  0.54836875]
6.03999986499548
Episode 824	Average Score: 7.54	Score: 6.04
actions batch at 577000-th learning:
	 shape = (128, 4),
	 mean = [0.21050176 0.21939579 0.18989319 0.26681823],
	  std = [0.56734437 0.57771987 0.51162803 0.5953859 ]
6.859999846667051
Episode 825	Average Score: 7.53	Score: 6.86
actions batch at 578000-th learning:
	 shape = (128, 4),
	 mean = [0.24662086 0.25516385 0.26308766 0.24275196],
	  std = [0.57387125 0.5554139  0.5205328  0.57031965]
6.989999843761325
Episode 826	Average Score: 7.53	Score: 6.99
5.5399998761713505
Episode 827	Average Score: 7.47	Score: 5.54
actions batch at 579000-th learning:
	 shape = (128, 4),
	 mean = [0.16960923 0.14408228 0.1664495  0.24924093],
	  std = [0.54903907 0.5523825  0.50169325 0.5492636 ]
7.129999840632081
Episode 828	Average Score: 7.46	Score: 7.13
actions batch at 580000-th learning:
	 shape = (128, 4),
	 mean = [0.21474877 0.16566265 0.1465797  0.22920907],
	  std = [0.56801474 0.55279046 0.50425977 0.5635855 ]
5.509999876841903
Episode 829	Average Score: 7.45	Score: 5.51
5.829999869689345
Episode 830	Average Score: 7.44
actions batch at 581000-th learning:
	 shape = (128, 4),
	 mean = [0.17672431 0.20691451 0.14856184 0.29729527],
	  std = [0.5372781 0.5769818 0.5140057 0.5974845]
4.669999895617366
Episode 831	Average Score: 7.38	Score: 4.67
actions batch at 582000-th learning:
	 shape = (128, 4),
	 mean = [0.20923387 0.2196842  0.28850928 0.19017243],
	  std = [0.5502798  0.56855905 0.4995438  0.5659667 ]
5.279999881982803
Episode 832	Average Score: 7.38	Score: 5.28
6.4299998562783
Episode 833	Average Score: 7.35	Score: 6.43
actions batch at 583000-th learning:
	 shape = (128, 4),
	 mean = [0.23830366 0.2679756  0.2814074  0.2792998 ],
	  std = [0.5527586 0.5699809 0.528731  0.5674695]
13.199999704957008
Episode 834	Average Score: 7.39	Score: 13.20
actions batch at 584000-th learning:
	 shape = (128, 4),
	 mean = [0.27931562 0.21768397 0.24024908 0.19261052],
	  std = [0.5597402  0.57387567 0.5029311  0.57692987]
6.109999863430858
Episode 835	Average Score: 7.33	Score: 6.11
actions batch at 585000-th learning:
	 shape = (128, 4),
	 mean = [0.20625828 0.11917555 0.16125564 0.2596892 ],
	  std = [0.5498523 0.5417032 0.4845659 0.592845 ]
7.1199998408555984
Episode 836	Average Score: 7.33	Score: 7.126.609999852254987
Episode 837	Average Score: 7.31	Score: 6.61actions batch at 586000-th learning:
	 shape = (128, 4),
	 mean = [0.17889716 0.12367769 0.17786482 0.23633046],
	  std = [0.54265064 0.5573442  0.5154243  0.56619483]
12.789999714121222
Episode 838	Average Score: 7.35	Score: 12.79actions batch at 587000-th learning:
	 shape = (128, 4),
	 mean = [0.20228477 0.17590712 0.32461673 0.23159762],
	  std = [0.5425369  0.54356885 0.52264774 0.5727014 ]
5.709999872371554
Episode 839	Average Score: 7.30	Score: 5.717.519999831914902
Episode 840	Average Score: 7.29
actions batch at 588000-th learning:
	 shape = (128, 4),
	 mean = [0.18881299 0.03710084 0.24231282 0.36099994],
	  std = [0.5877376  0.47986642 0.5594142  0.5821979 ]
7.499999832361937
Episode 841	Average Score: 7.30	Score: 7.50actions batch at 589000-th learning:
	 shape = (128, 4),
	 mean = [0.22521208 0.21692805 0.21120021 0.19533691],
	  std = [0.5631474  0.54543    0.50579536 0.57189345]
5.279999881982803
Episode 842	Average Score: 7.25	Score: 5.286.07999986410141
Episode 843	Average Score: 7.23	Score: 6.08actions batch at 590000-th learning:
	 shape = (128, 4),
	 mean = [0.23169352 0.25804457 0.17038208 0.34984082],
	  std = [0.5454127  0.58424866 0.5036002  0.5826911 ]
8.849999802187085
Episode 844	Average Score: 7.25	Score: 8.85actions batch at 591000-th learning:
	 shape = (128, 4),
	 mean = [0.18851021 0.17981993 0.09619864 0.26347426],
	  std = [0.551054   0.55241376 0.48731604 0.55681694]
7.259999837726355
Episode 845	Average Score: 7.20	Score: 7.26actions batch at 592000-th learning:
	 shape = (128, 4),
	 mean = [0.19980797 0.15632349 0.16346388 0.23606975],
	  std = [0.5381969  0.54512054 0.52759284 0.5736667 ]
7.179999839514494
Episode 846	Average Score: 7.18	Score: 7.188.16999981738627
Episode 847	Average Score: 7.17	Score: 8.17actions batch at 593000-th learning:
	 shape = (128, 4),
	 mean = [0.14393187 0.2212078  0.15395963 0.20996688],
	  std = [0.5319111  0.5545753  0.49154708 0.5626906 ]
8.309999814257026
Episode 848	Average Score: 7.17	Score: 8.31actions batch at 594000-th learning:
	 shape = (128, 4),
	 mean = [0.26154435 0.28955987 0.2171808  0.21967608],
	  std = [0.56763685 0.57504773 0.5312358  0.60149807]
8.55999980866909
Episode 849	Average Score: 7.19	Score: 8.569.17999979481101
Episode 850	Average Score: 7.21
actions batch at 595000-th learning:
	 shape = (128, 4),
	 mean = [0.23278148 0.2501747  0.20780365 0.16706415],
	  std = [0.54638284 0.5658313  0.52013564 0.55771923]
7.559999831020832
Episode 851	Average Score: 7.21	Score: 7.56actions batch at 596000-th learning:
	 shape = (128, 4),
	 mean = [0.16635634 0.21911874 0.24872819 0.21730708],
	  std = [0.5599766  0.54678535 0.53245765 0.5995359 ]
9.599999785423279
Episode 852	Average Score: 7.22	Score: 9.6010.719999760389328
Episode 853	Average Score: 7.27	Score: 10.72actions batch at 597000-th learning:
	 shape = (128, 4),
	 mean = [0.22098963 0.11019881 0.22883761 0.20683706],
	  std = [0.5602847  0.5391888  0.52352226 0.5486485 ]
5.259999882429838
Episode 854	Average Score: 7.24	Score: 5.26actions batch at 598000-th learning:
	 shape = (128, 4),
	 mean = [0.256003   0.16824883 0.15397829 0.26886964],
	  std = [0.55406153 0.5334589  0.47024935 0.5673262 ]
11.989999732002616
Episode 855	Average Score: 7.28	Score: 11.99actions batch at 599000-th learning:
	 shape = (128, 4),
	 mean = [0.20706569 0.26558086 0.24911803 0.28719947],
	  std = [0.5479887 0.5640303 0.5025237 0.58422  ]
9.94999977760017
Episode 856	Average Score: 7.30	Score: 9.958.849999802187085
Episode 857	Average Score: 7.29	Score: 8.85actions batch at 600000-th learning:
	 shape = (128, 4),
	 mean = [0.29929492 0.19580713 0.16586542 0.15230384],
	  std = [0.5772878  0.5672184  0.5128003  0.55947477]
6.839999847114086
Episode 858	Average Score: 7.28	Score: 6.84actions batch at 601000-th learning:
	 shape = (128, 4),
	 mean = [0.22601561 0.15035126 0.20918815 0.22945042],
	  std = [0.56611186 0.5505457  0.52087003 0.58203036]
8.279999814927578
Episode 859	Average Score: 7.36	Score: 8.288.279999814927578
Episode 860	Average Score: 7.37
actions batch at 602000-th learning:
	 shape = (128, 4),
	 mean = [0.26597401 0.28021753 0.26155648 0.26032358],
	  std = [0.5986475  0.5930399  0.5043535  0.59114736]
7.649999829009175
Episode 861	Average Score: 7.37	Score: 7.65actions batch at 603000-th learning:
	 shape = (128, 4),
	 mean = [0.25871843 0.175143   0.2872619  0.21095479],
	  std = [0.5510333  0.5203151  0.49998206 0.55278385]
6.849999846890569
Episode 862	Average Score: 7.35	Score: 6.855.469999877735972
Episode 863	Average Score: 7.30	Score: 5.47actions batch at 604000-th learning:
	 shape = (128, 4),
	 mean = [0.16365723 0.21417962 0.25246373 0.20560832],
	  std = [0.54994833 0.5652206  0.5118538  0.54151195]
8.679999805986881
Episode 864	Average Score: 7.33	Score: 8.68actions batch at 605000-th learning:
	 shape = (128, 4),
	 mean = [0.32079983 0.19693846 0.23041913 0.17889297],
	  std = [0.5591758  0.57508487 0.516824   0.54904795]
7.179999839514494
Episode 865	Average Score: 7.32	Score: 7.18actions batch at 606000-th learning:
	 shape = (128, 4),
	 mean = [0.3337208  0.17757204 0.22168568 0.27881908],
	  std = [0.5857026  0.56470025 0.5244869  0.57401454]
5.689999872818589
Episode 866	Average Score: 7.33	Score: 5.6910.809999758377671
Episode 867	Average Score: 7.39	Score: 10.81actions batch at 607000-th learning:
	 shape = (128, 4),
	 mean = [0.22810498 0.24069795 0.22818479 0.27243122],
	  std = [0.55322796 0.5735133  0.52192295 0.55990136]
7.819999825209379
Episode 868	Average Score: 7.36	Score: 7.82actions batch at 608000-th learning:
	 shape = (128, 4),
	 mean = [0.16513254 0.13184173 0.17250365 0.24976279],
	  std = [0.5321511  0.52187866 0.51751596 0.58289856]
6.099999863654375
Episode 869	Average Score: 7.34	Score: 6.106.4299998562783
Episode 870	Average Score: 7.31
actions batch at 609000-th learning:
	 shape = (128, 4),
	 mean = [0.30655512 0.2875862  0.2975166  0.32633683],
	  std = [0.6029453  0.588316   0.53759813 0.5853499 ]
8.999999798834324
Episode 871	Average Score: 7.34	Score: 9.00actions batch at 610000-th learning:
	 shape = (128, 4),
	 mean = [0.21674614 0.1909964  0.22619483 0.24260642],
	  std = [0.55842733 0.5657554  0.5156356  0.5748825 ]
5.289999881759286
Episode 872	Average Score: 7.33	Score: 5.295.2999998815357685
Episode 873	Average Score: 7.31	Score: 5.30actions batch at 611000-th learning:
	 shape = (128, 4),
	 mean = [0.22885671 0.20019232 0.1449348  0.21173538],
	  std = [0.5694081  0.55073994 0.49782622 0.54588884]
5.529999876394868
Episode 874	Average Score: 7.31	Score: 5.53actions batch at 612000-th learning:
	 shape = (128, 4),
	 mean = [0.2118295  0.22621176 0.19992562 0.27556488],
	  std = [0.56753546 0.57652783 0.5176533  0.5847509 ]
6.979999843984842
Episode 875	Average Score: 7.31	Score: 6.98actions batch at 613000-th learning:
	 shape = (128, 4),
	 mean = [0.25177136 0.22407311 0.2723064  0.3160759 ],
	  std = [0.5765103  0.56518465 0.55066746 0.5632347 ]
9.599999785423279
Episode 876	Average Score: 7.32	Score: 9.607.089999841526151
Episode 877	Average Score: 7.30	Score: 7.09actions batch at 614000-th learning:
	 shape = (128, 4),
	 mean = [0.1613592  0.20839947 0.15994485 0.14223084],
	  std = [0.5569003  0.54538566 0.5087476  0.549328  ]
5.279999881982803
Episode 878	Average Score: 7.29	Score: 5.28actions batch at 615000-th learning:
	 shape = (128, 4),
	 mean = [0.23190847 0.26568377 0.24426138 0.24060015],
	  std = [0.5725895  0.559205   0.53165525 0.59688765]
6.529999854043126
Episode 879	Average Score: 7.26	Score: 6.536.919999845325947
Episode 880	Average Score: 7.26
actions batch at 616000-th learning:
	 shape = (128, 4),
	 mean = [0.15829653 0.16224016 0.17096075 0.18681574],
	  std = [0.5485594  0.5498798  0.53796136 0.57155716]
5.6099998746067286
Episode 881	Average Score: 7.25	Score: 5.61actions batch at 617000-th learning:
	 shape = (128, 4),
	 mean = [0.24919108 0.16407056 0.28234532 0.2390998 ],
	  std = [0.5687998  0.56230754 0.5211533  0.58871603]
6.73999984934926
Episode 882	Average Score: 7.22	Score: 6.746.959999844431877
Episode 883	Average Score: 7.21	Score: 6.96actions batch at 618000-th learning:
	 shape = (128, 4),
	 mean = [0.16430745 0.14687742 0.13425553 0.21767524],
	  std = [0.5489231  0.53441906 0.4765842  0.53289825]
5.309999881312251
Episode 884	Average Score: 7.18	Score: 5.31actions batch at 619000-th learning:
	 shape = (128, 4),
	 mean = [0.24876171 0.26748934 0.2051166  0.18171921],
	  std = [0.5390041  0.5796384  0.51717764 0.5427204 ]
5.799999870359898
Episode 885	Average Score: 7.13	Score: 5.80actions batch at 620000-th learning:
	 shape = (128, 4),
	 mean = [0.19683738 0.22558247 0.21932974 0.25375658],
	  std = [0.5395114  0.5638181  0.5158598  0.58179927]
8.279999814927578
Episode 886	Average Score: 7.10	Score: 8.287.639999829232693
Episode 887	Average Score: 7.12	Score: 7.64actions batch at 621000-th learning:
	 shape = (128, 4),
	 mean = [0.20018575 0.13266356 0.20418444 0.23875217],
	  std = [0.571913   0.5378238  0.50921553 0.57152045]
6.779999848455191
Episode 888	Average Score: 7.11	Score: 6.78actions batch at 622000-th learning:
	 shape = (128, 4),
	 mean = [0.19388428 0.18126181 0.22442211 0.25646073],
	  std = [0.5336028  0.53412634 0.5208653  0.58735526]
8.929999800398946
Episode 889	Average Score: 7.13	Score: 8.935.859999869018793
Episode 890	Average Score: 7.12
actions batch at 623000-th learning:
	 shape = (128, 4),
	 mean = [0.3603233  0.24878909 0.29152006 0.29337803],
	  std = [0.55943394 0.56885266 0.52156144 0.5785213 ]
5.289999881759286
Episode 891	Average Score: 7.09	Score: 5.29actions batch at 624000-th learning:
	 shape = (128, 4),
	 mean = [0.17853832 0.15714401 0.21389958 0.21974163],
	  std = [0.5567611  0.5346899  0.5006341  0.57405734]
5.45999987795949
Episode 892	Average Score: 7.08	Score: 5.465.799999870359898
Episode 893	Average Score: 7.06	Score: 5.80actions batch at 625000-th learning:
	 shape = (128, 4),
	 mean = [0.24543385 0.12291767 0.3115863  0.2794514 ],
	  std = [0.5734385  0.530399   0.54098654 0.5516307 ]
8.729999804869294
Episode 894	Average Score: 7.09	Score: 8.73actions batch at 626000-th learning:
	 shape = (128, 4),
	 mean = [0.24806048 0.1892795  0.27055088 0.23967333],
	  std = [0.5644587  0.5597236  0.522772   0.57440114]
5.619999874383211
Episode 895	Average Score: 7.09	Score: 5.62actions batch at 627000-th learning:
	 shape = (128, 4),
	 mean = [0.23595928 0.1759389  0.22453603 0.2242198 ],
	  std = [0.5420124  0.542221   0.50460744 0.554447  ]
8.429999811574817
Episode 896	Average Score: 7.11	Score: 8.438.719999805092812
Episode 897	Average Score: 7.15	Score: 8.72actions batch at 628000-th learning:
	 shape = (128, 4),
	 mean = [0.27053395 0.13277106 0.30304092 0.22990677],
	  std = [0.5555658  0.52351815 0.53684366 0.54757416]
6.539999853819609
Episode 898	Average Score: 7.16	Score: 6.54actions batch at 629000-th learning:
	 shape = (128, 4),
	 mean = [0.21598995 0.16312666 0.2834115  0.21861751],
	  std = [0.5372811  0.57546955 0.5413832  0.5639478 ]
6.309999858960509
Episode 899	Average Score: 7.15	Score: 6.318.459999810904264
Episode 900	Average Score: 7.16
actions batch at 630000-th learning:
	 shape = (128, 4),
	 mean = [0.27798256 0.12462708 0.1652029  0.3264435 ],
	  std = [0.5692858  0.5420963  0.50703764 0.5740309 ]
9.36999979056418
Episode 901	Average Score: 7.21	Score: 9.37actions batch at 631000-th learning:
	 shape = (128, 4),
	 mean = [0.24442625 0.1669648  0.28071615 0.22329327],
	  std = [0.60465914 0.55845374 0.52041477 0.58276266]
9.029999798163772
Episode 902	Average Score: 7.24	Score: 9.038.429999811574817
Episode 903	Average Score: 7.26	Score: 8.43actions batch at 632000-th learning:
	 shape = (128, 4),
	 mean = [0.20894411 0.17156789 0.2893752  0.24656254],
	  std = [0.5537807  0.5423677  0.53918463 0.5805085 ]
7.339999835938215
Episode 904	Average Score: 7.26	Score: 7.34actions batch at 633000-th learning:
	 shape = (128, 4),
	 mean = [0.2365295  0.11054953 0.24793477 0.25093496],
	  std = [0.54444534 0.5370097  0.5131887  0.56910586]
6.589999852702022
Episode 905	Average Score: 7.24	Score: 6.59actions batch at 634000-th learning:
	 shape = (128, 4),
	 mean = [0.29770604 0.20138787 0.31324446 0.26384538],
	  std = [0.5331872  0.54088646 0.53100985 0.56324035]
11.739999737590551
Episode 906	Average Score: 7.28	Score: 11.747.989999821409583
Episode 907	Average Score: 7.29	Score: 7.99actions batch at 635000-th learning:
	 shape = (128, 4),
	 mean = [0.25031304 0.19566855 0.3195771  0.29312438],
	  std = [0.5430588  0.5481385  0.53629094 0.5883624 ]
6.4499998558312654
Episode 908	Average Score: 7.28	Score: 6.45actions batch at 636000-th learning:
	 shape = (128, 4),
	 mean = [0.21439146 0.28393185 0.20404604 0.23272781],
	  std = [0.54039615 0.5922308  0.50149006 0.57055676]
7.329999836161733
Episode 909	Average Score: 7.31	Score: 7.339.269999792799354
Episode 910	Average Score: 7.32
actions batch at 637000-th learning:
	 shape = (128, 4),
	 mean = [0.3443873  0.20250383 0.3025663  0.15173762],
	  std = [0.58566713 0.59608644 0.5185113  0.5502275 ]
3.3899999242275953
Episode 911	Average Score: 7.29	Score: 3.39actions batch at 638000-th learning:
	 shape = (128, 4),
	 mean = [0.27245682 0.23126458 0.2916395  0.2298709 ],
	  std = [0.56006265 0.5580435  0.510203   0.57859725]
11.489999743178487
Episode 912	Average Score: 7.36	Score: 11.497.6699998285621405
Episode 913	Average Score: 7.36	Score: 7.67actions batch at 639000-th learning:
	 shape = (128, 4),
	 mean = [0.21195017 0.17128587 0.25134748 0.13180473],
	  std = [0.563017   0.54611254 0.5259382  0.5575755 ]
8.259999815374613
Episode 914	Average Score: 7.38	Score: 8.26actions batch at 640000-th learning:
	 shape = (128, 4),
	 mean = [0.24954054 0.16040286 0.26039612 0.27163127],
	  std = [0.5751141  0.5628372  0.54199284 0.57798165]
6.9999998435378075
Episode 915	Average Score: 7.36	Score: 7.00actions batch at 641000-th learning:
	 shape = (128, 4),
	 mean = [0.2660213  0.19577165 0.14019871 0.23902023],
	  std = [0.5507336 0.5550306 0.4968434 0.5786035]
3.0999999307096004
Episode 916	Average Score: 7.31	Score: 3.106.4499998558312654
Episode 917	Average Score: 7.31	Score: 6.45actions batch at 642000-th learning:
	 shape = (128, 4),
	 mean = [0.11259277 0.1810871  0.20979239 0.28327253],
	  std = [0.5376878  0.54473066 0.5093853  0.5681115 ]
10.45999976620078
Episode 918	Average Score: 7.37	Score: 10.46actions batch at 643000-th learning:
	 shape = (128, 4),
	 mean = [0.24058333 0.1785713  0.29755914 0.34394425],
	  std = [0.5614872  0.539858   0.49185523 0.56411403]
8.59999980777502
Episode 919	Average Score: 7.39	Score: 8.608.78999980352819
Episode 920	Average Score: 7.38
actions batch at 644000-th learning:
	 shape = (128, 4),
	 mean = [0.24627128 0.20520316 0.24626106 0.27845812],
	  std = [0.55758166 0.5531142  0.5017707  0.5611826 ]
7.379999835044146
Episode 921	Average Score: 7.39	Score: 7.38actions batch at 645000-th learning:
	 shape = (128, 4),
	 mean = [0.19003196 0.17888623 0.28100654 0.29980418],
	  std = [0.56155    0.57291234 0.53790724 0.5695178 ]
6.129999862983823
Episode 922	Average Score: 7.40	Score: 6.136.539999853819609
Episode 923	Average Score: 7.39	Score: 6.54actions batch at 646000-th learning:
	 shape = (128, 4),
	 mean = [0.23750886 0.18985854 0.275019   0.24976423],
	  std = [0.5691038  0.58039725 0.51515025 0.5643803 ]
13.949999688193202
Episode 924	Average Score: 7.47	Score: 13.95actions batch at 647000-th learning:
	 shape = (128, 4),
	 mean = [0.25177002 0.17645827 0.21245195 0.2332075 ],
	  std = [0.5915311  0.5473249  0.50316656 0.57188666]
10.099999774247408
Episode 925	Average Score: 7.50	Score: 10.10actions batch at 648000-th learning:
	 shape = (128, 4),
	 mean = [0.24488364 0.15745033 0.23490411 0.1713201 ],
	  std = [0.52611756 0.5448974  0.4887607  0.5250557 ]
6.629999851807952
Episode 926	Average Score: 7.50	Score: 6.636.2099998611956835
Episode 927	Average Score: 7.51	Score: 6.21actions batch at 649000-th learning:
	 shape = (128, 4),
	 mean = [0.1327731  0.1831165  0.16798937 0.24650873],
	  std = [0.53462523 0.5801945  0.5057289  0.58167523]
8.659999806433916
Episode 928	Average Score: 7.52	Score: 8.66actions batch at 650000-th learning:
	 shape = (128, 4),
	 mean = [0.2870731  0.1857051  0.19062744 0.20265895],
	  std = [0.577795   0.53545207 0.5345462  0.5592744 ]
6.3999998569488525
Episode 929	Average Score: 7.53	Score: 6.406.289999859407544
Episode 930	Average Score: 7.54
actions batch at 651000-th learning:
	 shape = (128, 4),
	 mean = [0.2657322  0.23436432 0.27581218 0.23404925],
	  std = [0.567088  0.5752227 0.5230745 0.5606065]
9.279999792575836
Episode 931	Average Score: 7.58	Score: 9.28actions batch at 652000-th learning:
	 shape = (128, 4),
	 mean = [0.16785261 0.17704648 0.15916125 0.229203  ],
	  std = [0.5493329  0.5841644  0.50509125 0.5705532 ]
7.349999835714698
Episode 932	Average Score: 7.60	Score: 7.359.549999786540866
Episode 933	Average Score: 7.63	Score: 9.55actions batch at 653000-th learning:
	 shape = (128, 4),
	 mean = [0.19857024 0.21757221 0.28294873 0.19436245],
	  std = [0.5661645  0.55994743 0.5035788  0.5572238 ]
6.379999857395887
Episode 934	Average Score: 7.57	Score: 6.38actions batch at 654000-th learning:
	 shape = (128, 4),
	 mean = [0.14169614 0.17056003 0.1990937  0.23655169],
	  std = [0.5527182 0.5285206 0.5107187 0.5538409]
12.379999723285437
Episode 935	Average Score: 7.63	Score: 12.38actions batch at 655000-th learning:
	 shape = (128, 4),
	 mean = [0.16423783 0.18506584 0.29848814 0.27866507],
	  std = [0.5307278 0.5749695 0.5106402 0.5885502]
4.91999989002943
Episode 936	Average Score: 7.61	Score: 4.926.729999849572778
Episode 937	Average Score: 7.61	Score: 6.73actions batch at 656000-th learning:
	 shape = (128, 4),
	 mean = [0.17271993 0.19926488 0.23784947 0.2530317 ],
	  std = [0.5581081  0.5716833  0.5173463  0.57114106]
5.049999887123704
Episode 938	Average Score: 7.53	Score: 5.05actions batch at 657000-th learning:
	 shape = (128, 4),
	 mean = [0.29573235 0.2235786  0.291868   0.23365441],
	  std = [0.55343735 0.56248206 0.5395562  0.56555575]
6.499999854713678
Episode 939	Average Score: 7.54	Score: 6.506.4499998558312654
Episode 940	Average Score: 7.53
actions batch at 658000-th learning:
	 shape = (128, 4),
	 mean = [0.14931293 0.18437971 0.23018746 0.1744531 ],
	  std = [0.5485052 0.5556442 0.5250632 0.5782306]
6.129999862983823
Episode 941	Average Score: 7.51	Score: 6.13actions batch at 659000-th learning:
	 shape = (128, 4),
	 mean = [0.21146347 0.18616474 0.18355332 0.26661143],
	  std = [0.53139395 0.5318423  0.50491977 0.57795334]
7.04999984242022
Episode 942	Average Score: 7.53	Score: 7.056.679999850690365
Episode 943	Average Score: 7.54	Score: 6.68actions batch at 660000-th learning:
	 shape = (128, 4),
	 mean = [0.24677199 0.17947993 0.2933745  0.33716387],
	  std = [0.5566723  0.58186805 0.5275766  0.58854353]
9.589999785646796
Episode 944	Average Score: 7.55	Score: 9.59actions batch at 661000-th learning:
	 shape = (128, 4),
	 mean = [0.29681128 0.2243845  0.25599423 0.2775789 ],
	  std = [0.59727687 0.54539424 0.5007613  0.55906487]
7.539999831467867
Episode 945	Average Score: 7.55	Score: 7.54actions batch at 662000-th learning:
	 shape = (128, 4),
	 mean = [0.21324655 0.13447218 0.1634157  0.2356568 ],
	  std = [0.5339746  0.5517474  0.49383888 0.5582529 ]
6.479999855160713
Episode 946	Average Score: 7.54	Score: 6.486.799999848008156
Episode 947	Average Score: 7.53	Score: 6.80actions batch at 663000-th learning:
	 shape = (128, 4),
	 mean = [0.18852368 0.11323974 0.24495642 0.21552433],
	  std = [0.54430187 0.55331683 0.49371928 0.57098657]
9.699999783188105
Episode 948	Average Score: 7.54	Score: 9.70actions batch at 664000-th learning:
	 shape = (128, 4),
	 mean = [0.29686794 0.15997557 0.28133947 0.25203496],
	  std = [0.5723713  0.56174695 0.53449583 0.56615263]
6.609999852254987
Episode 949	Average Score: 7.52	Score: 6.617.609999829903245
Episode 950	Average Score: 7.51
actions batch at 665000-th learning:
	 shape = (128, 4),
	 mean = [0.2441025  0.11588486 0.30178395 0.2617923 ],
	  std = [0.5619307  0.51716685 0.5053204  0.57283723]
7.439999833703041
Episode 951	Average Score: 7.50	Score: 7.44actions batch at 666000-th learning:
	 shape = (128, 4),
	 mean = [0.2187289  0.1928902  0.2841184  0.31963423],
	  std = [0.55291635 0.56026274 0.5008456  0.5749715 ]
6.619999852031469
Episode 952	Average Score: 7.48	Score: 6.626.9999998435378075
Episode 953	Average Score: 7.44	Score: 7.00actions batch at 667000-th learning:
	 shape = (128, 4),
	 mean = [0.28295633 0.20095314 0.27626875 0.37371626],
	  std = [0.56166136 0.528952   0.52205944 0.57399696]
7.459999833256006
Episode 954	Average Score: 7.46	Score: 7.46actions batch at 668000-th learning:
	 shape = (128, 4),
	 mean = [0.26432723 0.12258842 0.2540782  0.21866214],
	  std = [0.55995363 0.5519624  0.52239317 0.5465058 ]
9.779999781399965
Episode 955	Average Score: 7.44	Score: 9.78actions batch at 669000-th learning:
	 shape = (128, 4),
	 mean = [0.16730782 0.2265282  0.1531764  0.26805136],
	  std = [0.5441718  0.58778733 0.4958374  0.57963485]
5.699999872595072
Episode 956	Average Score: 7.40	Score: 5.706.6599998511374
Episode 957	Average Score: 7.37	Score: 6.66actions batch at 670000-th learning:
	 shape = (128, 4),
	 mean = [0.15435833 0.10728789 0.268751   0.29257497],
	  std = [0.54301107 0.5299601  0.53658736 0.57886124]
12.309999724850059
Episode 958	Average Score: 7.43	Score: 12.31actions batch at 671000-th learning:
	 shape = (128, 4),
	 mean = [0.28978136 0.18048856 0.30475658 0.34556672],
	  std = [0.555576   0.53526974 0.5350459  0.591213  ]
11.389999745413661
Episode 959	Average Score: 7.46	Score: 11.398.959999799728394
Episode 960	Average Score: 7.47
actions batch at 672000-th learning:
	 shape = (128, 4),
	 mean = [0.25960028 0.25375643 0.28672943 0.15702225],
	  std = [0.5555323  0.5588848  0.53475875 0.56361693]
8.069999819621444
Episode 961	Average Score: 7.47	Score: 8.07actions batch at 673000-th learning:
	 shape = (128, 4),
	 mean = [0.24826324 0.23869288 0.268774   0.2538049 ],
	  std = [0.56613    0.57453895 0.49899298 0.5888987 ]
9.659999784082174
Episode 962	Average Score: 7.50	Score: 9.667.6199998296797276
Episode 963	Average Score: 7.52	Score: 7.62actions batch at 674000-th learning:
	 shape = (128, 4),
	 mean = [0.26083827 0.18479821 0.24018992 0.1220007 ],
	  std = [0.540047   0.57267505 0.5112492  0.536384  ]
7.749999826774001
Episode 964	Average Score: 7.51	Score: 7.75actions batch at 675000-th learning:
	 shape = (128, 4),
	 mean = [0.3225036  0.27915493 0.26448053 0.2950264 ],
	  std = [0.551577   0.55224377 0.5188463  0.58670205]
5.519999876618385
Episode 965	Average Score: 7.49	Score: 5.52actions batch at 676000-th learning:
	 shape = (128, 4),
	 mean = [0.25469214 0.20342235 0.27816933 0.25073707],
	  std = [0.5454323  0.57659364 0.5353227  0.5565831 ]
7.869999824091792
Episode 966	Average Score: 7.52	Score: 7.877.299999836832285
Episode 967	Average Score: 7.48	Score: 7.30actions batch at 677000-th learning:
	 shape = (128, 4),
	 mean = [0.24883133 0.20541637 0.2854849  0.24447843],
	  std = [0.5538514 0.5598282 0.5387282 0.580196 ]
9.969999777153134
Episode 968	Average Score: 7.50	Score: 9.97actions batch at 678000-th learning:
	 shape = (128, 4),
	 mean = [0.2318102  0.15195754 0.19099177 0.23952693],
	  std = [0.56225836 0.54849845 0.5034807  0.56275845]
7.509999832138419
Episode 969	Average Score: 7.52	Score: 7.515.509999876841903
Episode 970	Average Score: 7.51
actions batch at 679000-th learning:
	 shape = (128, 4),
	 mean = [0.30687708 0.16857506 0.279267   0.31474555],
	  std = [0.576957   0.5364736  0.553201   0.57047975]
6.839999847114086
Episode 971	Average Score: 7.49	Score: 6.84actions batch at 680000-th learning:
	 shape = (128, 4),
	 mean = [0.27010825 0.13530812 0.30514392 0.25426745],
	  std = [0.5732203  0.55580384 0.5396446  0.5831863 ]
8.489999810233712
Episode 972	Average Score: 7.52	Score: 8.499.40999978967011
Episode 973	Average Score: 7.56	Score: 9.41actions batch at 681000-th learning:
	 shape = (128, 4),
	 mean = [0.19052196 0.23419172 0.17639448 0.23301937],
	  std = [0.54471546 0.5767237  0.49401438 0.55547976]
7.909999823197722
Episode 974	Average Score: 7.58	Score: 7.91actions batch at 682000-th learning:
	 shape = (128, 4),
	 mean = [0.2273622  0.19110103 0.2581157  0.1714525 ],
	  std = [0.5484627 0.5645109 0.545022  0.5651801]
6.6399998515844345
Episode 975	Average Score: 7.58	Score: 6.64actions batch at 683000-th learning:
	 shape = (128, 4),
	 mean = [0.19216643 0.13149904 0.17174931 0.2977561 ],
	  std = [0.5591629  0.52998465 0.5178338  0.5785456 ]
0.9199999794363976
Episode 976	Average Score: 7.49	Score: 0.929.129999795928597
Episode 977	Average Score: 7.51	Score: 9.13actions batch at 684000-th learning:
	 shape = (128, 4),
	 mean = [0.2573868  0.23818229 0.27819663 0.300288  ],
	  std = [0.55567336 0.58275265 0.4988697  0.5613866 ]
7.499999832361937
Episode 978	Average Score: 7.53	Score: 7.50actions batch at 685000-th learning:
	 shape = (128, 4),
	 mean = [0.21361108 0.21932139 0.28163838 0.09633679],
	  std = [0.56576633 0.5430045  0.5129114  0.514924  ]
8.359999813139439
Episode 979	Average Score: 7.55	Score: 8.366.049999864771962
Episode 980	Average Score: 7.54
actions batch at 686000-th learning:
	 shape = (128, 4),
	 mean = [0.23712134 0.23458804 0.2952591  0.24074459],
	  std = [0.5606583  0.546552   0.54302967 0.57331085]
7.469999833032489
Episode 981	Average Score: 7.56	Score: 7.47actions batch at 687000-th learning:
	 shape = (128, 4),
	 mean = [0.20915727 0.21656679 0.20194116 0.31838667],
	  std = [0.52948713 0.5507668  0.5100302  0.5552566 ]
8.16999981738627
Episode 982	Average Score: 7.58	Score: 8.177.529999831691384
Episode 983	Average Score: 7.58	Score: 7.53actions batch at 688000-th learning:
	 shape = (128, 4),
	 mean = [0.15677552 0.15485251 0.26133934 0.24674298],
	  std = [0.5147635  0.56518734 0.5016282  0.5766195 ]
5.829999869689345
Episode 984	Average Score: 7.59	Score: 5.83actions batch at 689000-th learning:
	 shape = (128, 4),
	 mean = [0.25286052 0.10388676 0.28698292 0.27917653],
	  std = [0.5424628  0.5393815  0.53083545 0.5639119 ]
6.03999986499548
Episode 985	Average Score: 7.59	Score: 6.04actions batch at 690000-th learning:
	 shape = (128, 4),
	 mean = [0.25595158 0.22504622 0.27547434 0.19927053],
	  std = [0.57459927 0.5728989  0.5427394  0.5516526 ]
8.219999816268682
Episode 986	Average Score: 7.59	Score: 8.2211.389999745413661
Episode 987	Average Score: 7.63	Score: 11.39actions batch at 691000-th learning:
	 shape = (128, 4),
	 mean = [0.28251317 0.18830636 0.27155444 0.24460047],
	  std = [0.55595285 0.5692243  0.4893436  0.5685251 ]
5.949999867007136
Episode 988	Average Score: 7.62	Score: 5.95actions batch at 692000-th learning:
	 shape = (128, 4),
	 mean = [0.26315773 0.12121274 0.166078   0.32275775],
	  std = [0.5684436  0.53251004 0.48109877 0.57284325]
6.869999846443534
Episode 989	Average Score: 7.60	Score: 6.878.719999805092812
Episode 990	Average Score: 7.63
actions batch at 693000-th learning:
	 shape = (128, 4),
	 mean = [0.306817   0.23799737 0.26189962 0.313065  ],
	  std = [0.56833893 0.5602931  0.53223825 0.5644519 ]
9.689999783411622
Episode 991	Average Score: 7.67	Score: 9.69actions batch at 694000-th learning:
	 shape = (128, 4),
	 mean = [0.2515137  0.187342   0.24730183 0.35984164],
	  std = [0.5821546  0.5650037  0.54463273 0.5495897 ]
10.989999754354358
Episode 992	Average Score: 7.73	Score: 10.999.939999777823687
Episode 993	Average Score: 7.77	Score: 9.94actions batch at 695000-th learning:
	 shape = (128, 4),
	 mean = [0.18084677 0.13535485 0.22915427 0.2542182 ],
	  std = [0.5583065 0.5136274 0.5137378 0.5668667]
7.699999827891588
Episode 994	Average Score: 7.76	Score: 7.70actions batch at 696000-th learning:
	 shape = (128, 4),
	 mean = [0.20838979 0.23888674 0.1928874  0.23597018],
	  std = [0.56349117 0.5642023  0.5220647  0.57723564]
8.47999981045723
Episode 995	Average Score: 7.79	Score: 8.48actions batch at 697000-th learning:
	 shape = (128, 4),
	 mean = [0.1450526  0.21457554 0.21557744 0.22883841],
	  std = [0.52240956 0.56984484 0.5305275  0.5728889 ]
12.019999731332064
Episode 996	Average Score: 7.82	Score: 12.0210.199999772012234
Episode 997	Average Score: 7.84	Score: 10.20actions batch at 698000-th learning:
	 shape = (128, 4),
	 mean = [0.24721771 0.11209974 0.24018677 0.23640579],
	  std = [0.5827614 0.5229001 0.5105882 0.5706339]
5.549999875947833
Episode 998	Average Score: 7.83	Score: 5.55actions batch at 699000-th learning:
	 shape = (128, 4),
	 mean = [0.23988056 0.17808478 0.18655646 0.16081294],
	  std = [0.5335619  0.5646196  0.49432024 0.4995561 ]
11.76999973692
Episode 999	Average Score: 7.88	Score: 11.7711.319999746978283
Episode 1000	Average Score: 7.91
actions batch at 700000-th learning:
	 shape = (128, 4),
	 mean = [0.18880959 0.11937474 0.17740302 0.24111623],
	  std = [0.54927945 0.527113   0.50316644 0.5341884 ]
10.389999767765403
Episode 1001	Average Score: 7.92	Score: 10.39actions batch at 701000-th learning:
	 shape = (128, 4),
	 mean = [0.16338369 0.19582483 0.16565776 0.21197033],
	  std = [0.52936274 0.5444217  0.48261905 0.57218194]
10.33999976888299
Episode 1002	Average Score: 7.93	Score: 10.349.649999784305692
Episode 1003	Average Score: 7.95	Score: 9.65actions batch at 702000-th learning:
	 shape = (128, 4),
	 mean = [0.2671511  0.22927551 0.33485565 0.2652953 ],
	  std = [0.5597739  0.5613992  0.54286134 0.5814874 ]
14.069999685510993
Episode 1004	Average Score: 8.01	Score: 14.07actions batch at 703000-th learning:
	 shape = (128, 4),
	 mean = [0.16408952 0.20789734 0.22912014 0.28098822],
	  std = [0.5280055 0.5768059 0.5277788 0.5968535]
9.129999795928597
Episode 1005	Average Score: 8.04	Score: 9.13actions batch at 704000-th learning:
	 shape = (128, 4),
	 mean = [0.12996052 0.18316196 0.24585061 0.25187302],
	  std = [0.51622    0.5233801  0.49705756 0.5629089 ]
9.83999978005886
Episode 1006	Average Score: 8.02	Score: 9.849.459999788552523
Episode 1007	Average Score: 8.03	Score: 9.46actions batch at 705000-th learning:
	 shape = (128, 4),
	 mean = [0.2181962  0.06976034 0.21924827 0.09906186],
	  std = [0.54976684 0.49097458 0.51943624 0.5228756 ]
7.58999983035028
Episode 1008	Average Score: 8.05	Score: 7.59actions batch at 706000-th learning:
	 shape = (128, 4),
	 mean = [0.29781115 0.18622233 0.19785121 0.14080942],
	  std = [0.55140287 0.5420048  0.4874548  0.53915024]
9.21999979391694
Episode 1009	Average Score: 8.06	Score: 9.228.16999981738627
Episode 1010	Average Score: 8.05
actions batch at 707000-th learning:
	 shape = (128, 4),
	 mean = [0.2597546  0.22081454 0.29535648 0.26707944],
	  std = [0.5226223  0.5507746  0.50007474 0.5877588 ]
6.96999984420836
Episode 1011	Average Score: 8.09	Score: 6.97actions batch at 708000-th learning:
	 shape = (128, 4),
	 mean = [0.23275286 0.2119607  0.26160508 0.20148441],
	  std = [0.5640072  0.5574551  0.54735863 0.5502303 ]
1.9499999564141035
Episode 1012	Average Score: 7.99	Score: 1.9517.059999618679285
Episode 1013	Average Score: 8.09	Score: 17.06actions batch at 709000-th learning:
	 shape = (128, 4),
	 mean = [0.23676448 0.17399785 0.29964104 0.19969526],
	  std = [0.5582479  0.55451834 0.54401577 0.5655986 ]
11.589999740943313
Episode 1014	Average Score: 8.12	Score: 11.59actions batch at 710000-th learning:
	 shape = (128, 4),
	 mean = [0.36538798 0.25031868 0.33550996 0.29842153],
	  std = [0.59876305 0.56543064 0.54890716 0.575481  ]
10.829999757930636
Episode 1015	Average Score: 8.16	Score: 10.83actions batch at 711000-th learning:
	 shape = (128, 4),
	 mean = [0.3423302  0.2536719  0.23589072 0.27775267],
	  std = [0.5712479  0.5578862  0.48272312 0.57037586]
13.589999696239829
Episode 1016	Average Score: 8.26	Score: 13.598.619999807327986
Episode 1017	Average Score: 8.29	Score: 8.62actions batch at 712000-th learning:
	 shape = (128, 4),
	 mean = [0.32404232 0.23102146 0.28039488 0.26621595],
	  std = [0.5506606 0.5814385 0.521458  0.5579339]
9.439999788999557
Episode 1018	Average Score: 8.28	Score: 9.44actions batch at 713000-th learning:
	 shape = (128, 4),
	 mean = [0.24971999 0.19527984 0.3293022  0.2684101 ],
	  std = [0.557054   0.57867676 0.53657913 0.5651287 ]
8.989999799057841
Episode 1019	Average Score: 8.28	Score: 8.999.349999791011214
Episode 1020	Average Score: 8.29
actions batch at 714000-th learning:
	 shape = (128, 4),
	 mean = [0.28881156 0.19696246 0.23666939 0.22669448],
	  std = [0.555842  0.5559957 0.5289132 0.5627695]
4.399999901652336
Episode 1021	Average Score: 8.26	Score: 4.40actions batch at 715000-th learning:
	 shape = (128, 4),
	 mean = [0.2849713  0.18809569 0.3007164  0.2729096 ],
	  std = [0.5667136  0.54842055 0.55747473 0.58345985]
7.749999826774001
Episode 1022	Average Score: 8.27	Score: 7.758.549999808892608
Episode 1023	Average Score: 8.29	Score: 8.55actions batch at 716000-th learning:
	 shape = (128, 4),
	 mean = [0.2888092  0.2517498  0.28620258 0.28745878],
	  std = [0.5777814  0.5596119  0.509873   0.57742184]
7.689999828115106
Episode 1024	Average Score: 8.23	Score: 7.69actions batch at 717000-th learning:
	 shape = (128, 4),
	 mean = [0.22959428 0.31378028 0.30790797 0.33528414],
	  std = [0.5529838  0.5874529  0.5477994  0.56020796]
9.419999789446592
Episode 1025	Average Score: 8.22	Score: 9.42actions batch at 718000-th learning:
	 shape = (128, 4),
	 mean = [0.2645194  0.20259312 0.20769037 0.20191754],
	  std = [0.54016364 0.56003594 0.49614248 0.54419196]
9.289999792352319
Episode 1026	Average Score: 8.25	Score: 9.290.3899999912828207
Episode 1027	Average Score: 8.19	Score: 0.39actions batch at 719000-th learning:
	 shape = (128, 4),
	 mean = [0.24670526 0.20812607 0.27578843 0.22729193],
	  std = [0.59407616 0.5364415  0.55260915 0.56258774]
10.559999763965607
Episode 1028	Average Score: 8.21	Score: 10.56actions batch at 720000-th learning:
	 shape = (128, 4),
	 mean = [0.23138101 0.16803464 0.27509922 0.21464159],
	  std = [0.5558007  0.5309687  0.51420313 0.53596896]
8.699999805539846
Episode 1029	Average Score: 8.23	Score: 8.707.58999983035028
Episode 1030	Average Score: 8.25
actions batch at 721000-th learning:
	 shape = (128, 4),
	 mean = [0.3615631  0.18691769 0.24424697 0.34698686],
	  std = [0.5515063  0.56501174 0.51930344 0.58106124]
11.289999747648835
Episode 1031	Average Score: 8.27	Score: 11.29actions batch at 722000-th learning:
	 shape = (128, 4),
	 mean = [0.29125577 0.17957106 0.21549723 0.22783391],
	  std = [0.5328476  0.5650217  0.49007177 0.56524044]
4.789999892935157
Episode 1032	Average Score: 8.24	Score: 4.799.67999978363514
Episode 1033	Average Score: 8.24	Score: 9.68actions batch at 723000-th learning:
	 shape = (128, 4),
	 mean = [0.26890633 0.09504493 0.29865175 0.28368473],
	  std = [0.5390172  0.5397548  0.52982956 0.5933031 ]
6.819999847561121
Episode 1034	Average Score: 8.25	Score: 6.82actions batch at 724000-th learning:
	 shape = (128, 4),
	 mean = [0.17419732 0.18838622 0.2549861  0.24954358],
	  std = [0.5442663 0.5396756 0.5152402 0.560623 ]
6.299999859184027
Episode 1035	Average Score: 8.19	Score: 6.30actions batch at 725000-th learning:
	 shape = (128, 4),
	 mean = [0.22558393 0.14468816 0.32344082 0.31283274],
	  std = [0.5577138  0.53108853 0.5531282  0.5922852 ]
8.649999806657434
Episode 1036	Average Score: 8.22	Score: 8.6511.159999750554562
Episode 1037	Average Score: 8.27	Score: 11.16actions batch at 726000-th learning:
	 shape = (128, 4),
	 mean = [0.27664313 0.08595522 0.2707901  0.25158516],
	  std = [0.56206733 0.51047736 0.5482934  0.5535772 ]
5.929999867454171
Episode 1038	Average Score: 8.28	Score: 5.93actions batch at 727000-th learning:
	 shape = (128, 4),
	 mean = [0.21786231 0.17412134 0.23508546 0.2563579 ],
	  std = [0.56594014 0.53091383 0.5561107  0.54370904]
13.249999703839421
Episode 1039	Average Score: 8.34	Score: 13.259.359999790787697
Episode 1040	Average Score: 8.37
actions batch at 728000-th learning:
	 shape = (128, 4),
	 mean = [0.24321398 0.25382665 0.28683233 0.2640729 ],
	  std = [0.53923535 0.5538002  0.5200807  0.56232005]
6.619999852031469
Episode 1041	Average Score: 8.38	Score: 6.62actions batch at 729000-th learning:
	 shape = (128, 4),
	 mean = [0.2447867  0.25217023 0.27860764 0.22642067],
	  std = [0.5523633 0.5519099 0.540706  0.5556084]
8.739999804645777
Episode 1042	Average Score: 8.39	Score: 8.7410.10999977402389
Episode 1043	Average Score: 8.43	Score: 10.11actions batch at 730000-th learning:
	 shape = (128, 4),
	 mean = [0.20526281 0.24384885 0.23332149 0.11771994],
	  std = [0.570575  0.5518797 0.5047697 0.5364332]
8.569999808445573
Episode 1044	Average Score: 8.42	Score: 8.57actions batch at 731000-th learning:
	 shape = (128, 4),
	 mean = [0.26760843 0.14770441 0.25043344 0.23980324],
	  std = [0.5673638  0.5536228  0.49133715 0.5575013 ]
7.7399998269975185
Episode 1045	Average Score: 8.42	Score: 7.74actions batch at 732000-th learning:
	 shape = (128, 4),
	 mean = [0.28926784 0.20893277 0.22744471 0.30643126],
	  std = [0.5719977  0.5724087  0.50326663 0.5560845 ]
8.619999807327986
Episode 1046	Average Score: 8.44	Score: 8.6211.479999743402004
Episode 1047	Average Score: 8.49	Score: 11.48actions batch at 733000-th learning:
	 shape = (128, 4),
	 mean = [0.2841153  0.22092772 0.27060896 0.24810342],
	  std = [0.5826826  0.5645313  0.53155875 0.5776994 ]
9.969999777153134
Episode 1048	Average Score: 8.49	Score: 9.97actions batch at 734000-th learning:
	 shape = (128, 4),
	 mean = [0.2499918  0.20035368 0.23628898 0.3033065 ],
	  std = [0.56151456 0.5548019  0.51656324 0.545161  ]
10.56999976374209
Episode 1049	Average Score: 8.53	Score: 10.576.609999852254987
Episode 1050	Average Score: 8.52
actions batch at 735000-th learning:
	 shape = (128, 4),
	 mean = [0.23216027 0.20090571 0.2288519  0.27259055],
	  std = [0.53428787 0.5579983  0.48860204 0.55277103]
7.489999832585454
Episode 1051	Average Score: 8.52	Score: 7.49actions batch at 736000-th learning:
	 shape = (128, 4),
	 mean = [0.26645976 0.25994802 0.23009941 0.20365062],
	  std = [0.5433554 0.5672478 0.5173845 0.548469 ]
6.189999861642718
Episode 1052	Average Score: 8.52	Score: 6.195.159999884665012
Episode 1053	Average Score: 8.50	Score: 5.16actions batch at 737000-th learning:
	 shape = (128, 4),
	 mean = [0.13912876 0.19349435 0.2621838  0.1664214 ],
	  std = [0.5658418  0.52996093 0.4999566  0.5539422 ]
10.849999757483602
Episode 1054	Average Score: 8.53	Score: 10.85actions batch at 738000-th learning:
	 shape = (128, 4),
	 mean = [0.30326855 0.28759584 0.3360925  0.24402674],
	  std = [0.53200984 0.55483747 0.5377929  0.5671527 ]
5.219999883323908
Episode 1055	Average Score: 8.49	Score: 5.22actions batch at 739000-th learning:
	 shape = (128, 4),
	 mean = [0.18291658 0.1424024  0.20365639 0.30362552],
	  std = [0.55356634 0.5314335  0.5272535  0.5830292 ]
8.189999816939235
Episode 1056	Average Score: 8.51	Score: 8.1912.189999727532268
Episode 1057	Average Score: 8.57	Score: 12.19actions batch at 740000-th learning:
	 shape = (128, 4),
	 mean = [0.23063025 0.1880749  0.36139598 0.25076857],
	  std = [0.5662002  0.5528445  0.51379126 0.5637903 ]
8.309999814257026
Episode 1058	Average Score: 8.53	Score: 8.31actions batch at 741000-th learning:
	 shape = (128, 4),
	 mean = [0.20858474 0.16819495 0.3298666  0.1935693 ],
	  std = [0.5492677  0.53942627 0.5324864  0.5675832 ]
14.339999679476023
Episode 1059	Average Score: 8.56	Score: 14.344.489999899640679
Episode 1060	Average Score: 8.51
actions batch at 742000-th learning:
	 shape = (128, 4),
	 mean = [0.2273043  0.32809594 0.26147607 0.32222813],
	  std = [0.57013524 0.5505483  0.5668374  0.60204864]
7.729999827221036
Episode 1061	Average Score: 8.51	Score: 7.73actions batch at 743000-th learning:
	 shape = (128, 4),
	 mean = [0.23803318 0.22065283 0.2311021  0.26241672],
	  std = [0.5680329  0.53093374 0.5015533  0.54945576]
6.6599998511374
Episode 1062	Average Score: 8.48	Score: 6.668.649999806657434
Episode 1063	Average Score: 8.49	Score: 8.65actions batch at 744000-th learning:
	 shape = (128, 4),
	 mean = [0.21854986 0.18890181 0.25636536 0.21140164],
	  std = [0.5612803  0.5652404  0.50121534 0.55952466]
7.7399998269975185
Episode 1064	Average Score: 8.49	Score: 7.74actions batch at 745000-th learning:
	 shape = (128, 4),
	 mean = [0.2742424  0.1644137  0.22455594 0.2525551 ],
	  std = [0.5351266  0.5728783  0.52795833 0.52855307]
12.109999729320407
Episode 1065	Average Score: 8.55	Score: 12.11actions batch at 746000-th learning:
	 shape = (128, 4),
	 mean = [0.260576   0.20424613 0.25024566 0.20362598],
	  std = [0.5732441  0.5297975  0.50760245 0.54535407]
7.719999827444553
Episode 1066	Average Score: 8.55	Score: 7.726.38999985717237
Episode 1067	Average Score: 8.54	Score: 6.39actions batch at 747000-th learning:
	 shape = (128, 4),
	 mean = [0.3243672  0.26281577 0.27848944 0.2976736 ],
	  std = [0.5510432 0.5729867 0.5497473 0.5704898]
2.249999949708581
Episode 1068	Average Score: 8.47	Score: 2.25actions batch at 748000-th learning:
	 shape = (128, 4),
	 mean = [0.19374457 0.1660402  0.2342479  0.20910355],
	  std = [0.53347343 0.55531865 0.5108771  0.5384881 ]
13.549999697133899
Episode 1069	Average Score: 8.53	Score: 13.558.199999816715717
Episode 1070	Average Score: 8.55
actions batch at 749000-th learning:
	 shape = (128, 4),
	 mean = [0.32882398 0.14897606 0.20174699 0.24971308],
	  std = [0.5758332  0.5106999  0.49572027 0.5590406 ]
8.2099998164922
Episode 1071	Average Score: 8.57	Score: 8.21actions batch at 750000-th learning:
	 shape = (128, 4),
	 mean = [0.21733528 0.15650931 0.26588982 0.2375288 ],
	  std = [0.55122805 0.5836188  0.52181715 0.5712915 ]
9.619999784976244
Episode 1072	Average Score: 8.58	Score: 9.627.999999821186066
Episode 1073	Average Score: 8.56	Score: 8.00actions batch at 751000-th learning:
	 shape = (128, 4),
	 mean = [0.2319063  0.28502882 0.2744417  0.30508852],
	  std = [0.57938814 0.57947767 0.5058913  0.5505894 ]
9.509999787434936
Episode 1074	Average Score: 8.58	Score: 9.51actions batch at 752000-th learning:
	 shape = (128, 4),
	 mean = [0.25555894 0.13384955 0.255184   0.19990358],
	  std = [0.5741235  0.52756995 0.52574074 0.5539814 ]
8.649999806657434
Episode 1075	Average Score: 8.60	Score: 8.65actions batch at 753000-th learning:
	 shape = (128, 4),
	 mean = [0.23626186 0.19887602 0.21771064 0.2153103 ],
	  std = [0.5650212 0.5755486 0.500392  0.5595952]
7.799999825656414
Episode 1076	Average Score: 8.67	Score: 7.809.509999787434936
Episode 1077	Average Score: 8.67	Score: 9.51actions batch at 754000-th learning:
	 shape = (128, 4),
	 mean = [0.2504929  0.19849232 0.28462288 0.21672376],
	  std = [0.5301914  0.5645743  0.49962154 0.5690772 ]
3.399999924004078
Episode 1078	Average Score: 8.63	Score: 3.40actions batch at 755000-th learning:
	 shape = (128, 4),
	 mean = [0.21882823 0.181304   0.2001148  0.29035702],
	  std = [0.5331195 0.5510919 0.5141654 0.5634446]
5.009999888017774
Episode 1079	Average Score: 8.60	Score: 5.016.539999853819609
Episode 1080	Average Score: 8.60
actions batch at 756000-th learning:
	 shape = (128, 4),
	 mean = [0.27338928 0.1809863  0.2619702  0.32396922],
	  std = [0.53993493 0.5450106  0.5341675  0.5733061 ]
7.229999838396907
Episode 1081	Average Score: 8.60	Score: 7.23actions batch at 757000-th learning:
	 shape = (128, 4),
	 mean = [0.27791494 0.2878554  0.32671776 0.21875556],
	  std = [0.5820332  0.5692797  0.54563046 0.5572722 ]
7.599999830126762
Episode 1082	Average Score: 8.60	Score: 7.608.219999816268682
Episode 1083	Average Score: 8.60	Score: 8.22actions batch at 758000-th learning:
	 shape = (128, 4),
	 mean = [0.16665584 0.21308637 0.17326675 0.23361216],
	  std = [0.5391095  0.55247635 0.49613017 0.5690377 ]
9.319999791681767
Episode 1084	Average Score: 8.64	Score: 9.32actions batch at 759000-th learning:
	 shape = (128, 4),
	 mean = [0.15847878 0.2551018  0.23111844 0.22098356],
	  std = [0.53303474 0.5809182  0.4921161  0.5788922 ]
6.779999848455191
Episode 1085	Average Score: 8.64	Score: 6.78actions batch at 760000-th learning:
	 shape = (128, 4),
	 mean = [0.2498941  0.2435319  0.28189024 0.2826493 ],
	  std = [0.5396409  0.5577874  0.53437364 0.573011  ]
9.17999979481101
Episode 1086	Average Score: 8.65	Score: 9.188.589999807998538
Episode 1087	Average Score: 8.63	Score: 8.59actions batch at 761000-th learning:
	 shape = (128, 4),
	 mean = [0.26405326 0.13417952 0.24263535 0.2673328 ],
	  std = [0.5427441 0.5519341 0.5216487 0.5720577]
18.099999595433474
Episode 1088	Average Score: 8.75	Score: 18.10actions batch at 762000-th learning:
	 shape = (128, 4),
	 mean = [0.23449008 0.13145967 0.22784024 0.30864605],
	  std = [0.5649447  0.5130231  0.5220532  0.55008554]
7.199999839067459
Episode 1089	Average Score: 8.75	Score: 7.209.669999783858657
Episode 1090	Average Score: 8.76
actions batch at 763000-th learning:
	 shape = (128, 4),
	 mean = [0.2804378  0.18912666 0.25404185 0.24732022],
	  std = [0.53792363 0.5817445  0.51680684 0.545451  ]
8.699999805539846
Episode 1091	Average Score: 8.75	Score: 8.70actions batch at 764000-th learning:
	 shape = (128, 4),
	 mean = [0.24177684 0.27585602 0.2608214  0.26341167],
	  std = [0.55499136 0.55923134 0.52219474 0.5492405 ]
9.67999978363514
Episode 1092	Average Score: 8.74	Score: 9.688.949999799951911
Episode 1093	Average Score: 8.73	Score: 8.95actions batch at 765000-th learning:
	 shape = (128, 4),
	 mean = [0.24686006 0.22772148 0.219596   0.23886564],
	  std = [0.5431184  0.58044827 0.50698364 0.5566373 ]
12.81999971345067
Episode 1094	Average Score: 8.78	Score: 12.82actions batch at 766000-th learning:
	 shape = (128, 4),
	 mean = [0.3086471  0.22649543 0.24067478 0.29358765],
	  std = [0.56326306 0.52970606 0.5046092  0.57976454]
8.259999815374613
Episode 1095	Average Score: 8.78	Score: 8.26actions batch at 767000-th learning:
	 shape = (128, 4),
	 mean = [0.22316939 0.16848573 0.2503551  0.2160795 ],
	  std = [0.5693312  0.55374146 0.5271534  0.54058725]
8.739999804645777
Episode 1096	Average Score: 8.74	Score: 8.7414.609999673441052
Episode 1097	Average Score: 8.79	Score: 14.61actions batch at 768000-th learning:
	 shape = (128, 4),
	 mean = [0.32295966 0.21897301 0.29711685 0.3003933 ],
	  std = [0.5581223  0.5634063  0.50851786 0.5408026 ]
9.209999794140458
Episode 1098	Average Score: 8.82	Score: 9.21actions batch at 769000-th learning:
	 shape = (128, 4),
	 mean = [0.2976494  0.18997996 0.20991284 0.16389814],
	  std = [0.5643043  0.56824654 0.544516   0.54066354]
10.279999770224094
Episode 1099	Average Score: 8.81	Score: 10.289.779999781399965
Episode 1100	Average Score: 8.79
actions batch at 770000-th learning:
	 shape = (128, 4),
	 mean = [0.27782348 0.2504608  0.3396524  0.27407253],
	  std = [0.58639663 0.5789873  0.49185678 0.5423854 ]
7.6699998285621405
Episode 1101	Average Score: 8.77	Score: 7.67actions batch at 771000-th learning:
	 shape = (128, 4),
	 mean = [0.2192316  0.18044882 0.18485674 0.18361363],
	  std = [0.52000386 0.5492943  0.48873922 0.53078204]
8.259999815374613
Episode 1102	Average Score: 8.75	Score: 8.268.12999981828034
Episode 1103	Average Score: 8.73	Score: 8.13actions batch at 772000-th learning:
	 shape = (128, 4),
	 mean = [0.289652   0.19108535 0.29397032 0.21407242],
	  std = [0.56390935 0.54902035 0.5159879  0.57931596]
7.239999838173389
Episode 1104	Average Score: 8.66	Score: 7.24actions batch at 773000-th learning:
	 shape = (128, 4),
	 mean = [0.11451999 0.23863249 0.19807689 0.318272  ],
	  std = [0.55753005 0.56422126 0.5003731  0.5813294 ]
12.399999722838402
Episode 1105	Average Score: 8.70	Score: 12.40actions batch at 774000-th learning:
	 shape = (128, 4),
	 mean = [0.2845697  0.18463497 0.31708017 0.23165491],
	  std = [0.56717557 0.53260434 0.54399395 0.57011515]
7.729999827221036
Episode 1106	Average Score: 8.67	Score: 7.738.639999806880951
Episode 1107	Average Score: 8.67	Score: 8.64actions batch at 775000-th learning:
	 shape = (128, 4),
	 mean = [0.31677064 0.16660081 0.29435772 0.19170175],
	  std = [0.5727421  0.5311389  0.53733516 0.57702565]
8.799999803304672
Episode 1108	Average Score: 8.68	Score: 8.80actions batch at 776000-th learning:
	 shape = (128, 4),
	 mean = [0.21919349 0.13828617 0.22967142 0.2042958 ],
	  std = [0.55560786 0.540116   0.4958646  0.5375706 ]
5.749999871477485
Episode 1109	Average Score: 8.64	Score: 5.758.079999819397926
Episode 1110	Average Score: 8.64
actions batch at 777000-th learning:
	 shape = (128, 4),
	 mean = [0.2517791  0.27849972 0.3170857  0.2621024 ],
	  std = [0.5869684  0.5816403  0.5226453  0.56596804]
8.399999812245369
Episode 1111	Average Score: 8.66	Score: 8.40actions batch at 778000-th learning:
	 shape = (128, 4),
	 mean = [0.2810017  0.25614935 0.33405805 0.23209383],
	  std = [0.552912   0.5675381  0.5335296  0.52722764]
9.109999796375632
Episode 1112	Average Score: 8.73	Score: 9.117.31999983638525
Episode 1113	Average Score: 8.63	Score: 7.32actions batch at 779000-th learning:
	 shape = (128, 4),
	 mean = [0.2795485  0.19928233 0.21295896 0.30056328],
	  std = [0.5558263  0.55833364 0.5107467  0.547973  ]
7.389999834820628
Episode 1114	Average Score: 8.59	Score: 7.39actions batch at 780000-th learning:
	 shape = (128, 4),
	 mean = [0.2744931  0.14562184 0.26330164 0.20192297],
	  std = [0.5580263  0.54025155 0.51107883 0.5545135 ]
8.109999818727374
Episode 1115	Average Score: 8.56	Score: 8.11actions batch at 781000-th learning:
	 shape = (128, 4),
	 mean = [0.14797333 0.2126189  0.25967857 0.24779594],
	  std = [0.53888685 0.54308635 0.52968174 0.5811035 ]
7.759999826550484
Episode 1116	Average Score: 8.50	Score: 7.766.479999855160713
Episode 1117	Average Score: 8.48	Score: 6.48actions batch at 782000-th learning:
	 shape = (128, 4),
	 mean = [0.260999   0.21133637 0.2176637  0.2869794 ],
	  std = [0.54138774 0.5528986  0.5049515  0.554153  ]
11.669999739155173
Episode 1118	Average Score: 8.50	Score: 11.67actions batch at 783000-th learning:
	 shape = (128, 4),
	 mean = [0.29583323 0.19298528 0.29839417 0.33494148],
	  std = [0.5643753  0.54960245 0.51269764 0.6027133 ]
8.709999805316329
Episode 1119	Average Score: 8.50	Score: 8.717.179999839514494
Episode 1120	Average Score: 8.48
actions batch at 784000-th learning:
	 shape = (128, 4),
	 mean = [0.22559032 0.16771774 0.215163   0.23444498],
	  std = [0.58068687 0.5549921  0.50611186 0.54514956]
8.01999982073903
Episode 1121	Average Score: 8.52	Score: 8.02actions batch at 785000-th learning:
	 shape = (128, 4),
	 mean = [0.2915829  0.1507987  0.23234884 0.2757674 ],
	  std = [0.5591287  0.5344173  0.49836057 0.566101  ]
8.149999817833304
Episode 1122	Average Score: 8.52	Score: 8.157.379999835044146
Episode 1123	Average Score: 8.51	Score: 7.38actions batch at 786000-th learning:
	 shape = (128, 4),
	 mean = [0.24007227 0.16961257 0.19298224 0.17986023],
	  std = [0.5518282  0.52553326 0.48482105 0.55591965]
7.389999834820628
Episode 1124	Average Score: 8.51	Score: 7.39actions batch at 787000-th learning:
	 shape = (128, 4),
	 mean = [0.2223122  0.18926635 0.23325416 0.21413717],
	  std = [0.52812266 0.55319804 0.51886827 0.54234314]
2.12999995239079
Episode 1125	Average Score: 8.43	Score: 2.13actions batch at 788000-th learning:
	 shape = (128, 4),
	 mean = [0.19197825 0.21560141 0.26227432 0.29465336],
	  std = [0.54448676 0.5567547  0.51385707 0.56255215]
8.389999812468886
Episode 1126	Average Score: 8.42	Score: 8.396.559999853372574
Episode 1127	Average Score: 8.49	Score: 6.56actions batch at 789000-th learning:
	 shape = (128, 4),
	 mean = [0.15913597 0.13456622 0.23933621 0.21834497],
	  std = [0.54449815 0.5312174  0.5106513  0.5421944 ]
8.469999810680747
Episode 1128	Average Score: 8.46	Score: 8.47actions batch at 790000-th learning:
	 shape = (128, 4),
	 mean = [0.31268412 0.19864586 0.29970264 0.22539873],
	  std = [0.5568171 0.5696595 0.5331507 0.5502879]
7.919999822974205
Episode 1129	Average Score: 8.46	Score: 7.926.559999853372574
Episode 1130	Average Score: 8.45
actions batch at 791000-th learning:
	 shape = (128, 4),
	 mean = [0.265374   0.24183178 0.19317533 0.27536952],
	  std = [0.54469615 0.5705127  0.5115697  0.52108073]
12.339999724179506
Episode 1131	Average Score: 8.46	Score: 12.34actions batch at 792000-th learning:
	 shape = (128, 4),
	 mean = [0.19836888 0.16611986 0.2650803  0.22992173],
	  std = [0.5836276  0.55360126 0.52222884 0.5732637 ]
11.559999741613865
Episode 1132	Average Score: 8.52	Score: 11.5611.119999751448631
Episode 1133	Average Score: 8.54	Score: 11.12actions batch at 793000-th learning:
	 shape = (128, 4),
	 mean = [0.25414908 0.31063136 0.20530187 0.27655607],
	  std = [0.54969126 0.56415915 0.5004512  0.5696355 ]
9.589999785646796
Episode 1134	Average Score: 8.57	Score: 9.59actions batch at 794000-th learning:
	 shape = (128, 4),
	 mean = [0.21772721 0.18944007 0.2332702  0.21225528],
	  std = [0.5597993  0.5652804  0.53028697 0.5473852 ]
8.839999802410603
Episode 1135	Average Score: 8.59	Score: 8.84actions batch at 795000-th learning:
	 shape = (128, 4),
	 mean = [0.2894733  0.27029988 0.25971702 0.306081  ],
	  std = [0.5628558  0.5406938  0.5070953  0.57090896]
9.259999793022871
Episode 1136	Average Score: 8.60	Score: 9.2610.119999773800373
Episode 1137	Average Score: 8.59	Score: 10.12actions batch at 796000-th learning:
	 shape = (128, 4),
	 mean = [0.16628289 0.21254772 0.21045405 0.29506886],
	  std = [0.5478193 0.5572647 0.5287382 0.5801655]
11.289999747648835
Episode 1138	Average Score: 8.64	Score: 11.29actions batch at 797000-th learning:
	 shape = (128, 4),
	 mean = [0.24202442 0.20497653 0.30154046 0.2788228 ],
	  std = [0.5454351  0.5778443  0.52373993 0.579032  ]
9.919999778270721
Episode 1139	Average Score: 8.61	Score: 9.9211.159999750554562
Episode 1140	Average Score: 8.63
actions batch at 798000-th learning:
	 shape = (128, 4),
	 mean = [0.3567365  0.10939572 0.29300275 0.2561489 ],
	  std = [0.5631858  0.55460393 0.4973816  0.5688719 ]
11.80999973602593
Episode 1141	Average Score: 8.68	Score: 11.81actions batch at 799000-th learning:
	 shape = (128, 4),
	 mean = [0.23754948 0.21403323 0.2071973  0.2855015 ],
	  std = [0.5405468  0.56854576 0.52421993 0.5699092 ]
10.009999776259065
Episode 1142	Average Score: 8.69	Score: 10.014.649999896064401
Episode 1143	Average Score: 8.64	Score: 4.65actions batch at 800000-th learning:
	 shape = (128, 4),
	 mean = [0.25655234 0.2564103  0.17183048 0.22431554],
	  std = [0.5386704  0.5611739  0.51297915 0.5747931 ]
12.54999971948564
Episode 1144	Average Score: 8.68	Score: 12.55actions batch at 801000-th learning:
	 shape = (128, 4),
	 mean = [0.30508655 0.23253275 0.35309577 0.3339207 ],
	  std = [0.5568939  0.5718317  0.5291911  0.56562084]
12.269999725744128
Episode 1145	Average Score: 8.72	Score: 12.27actions batch at 802000-th learning:
	 shape = (128, 4),
	 mean = [0.26321167 0.27717865 0.27405426 0.27795064],
	  std = [0.57349116 0.53231084 0.5086532  0.5498562 ]
10.589999763295054
Episode 1146	Average Score: 8.74	Score: 10.597.989999821409583
Episode 1147	Average Score: 8.71	Score: 7.99actions batch at 803000-th learning:
	 shape = (128, 4),
	 mean = [0.29979962 0.26698804 0.26930353 0.2857846 ],
	  std = [0.56251657 0.5749568  0.5057736  0.59938425]
10.699999760836363
Episode 1148	Average Score: 8.71	Score: 10.70actions batch at 804000-th learning:
	 shape = (128, 4),
	 mean = [0.26655447 0.16210055 0.2573195  0.29817477],
	  std = [0.53333265 0.5459125  0.50681305 0.57767165]
10.329999769106507
Episode 1149	Average Score: 8.71	Score: 10.3313.409999700263143
Episode 1150	Average Score: 8.78
actions batch at 805000-th learning:
	 shape = (128, 4),
	 mean = [0.2177641  0.19901936 0.26581532 0.24220896],
	  std = [0.53714824 0.5429463  0.4969736  0.5421407 ]
14.449999677017331
Episode 1151	Average Score: 8.85	Score: 14.45actions batch at 806000-th learning:
	 shape = (128, 4),
	 mean = [0.1046223  0.09910186 0.19111592 0.27635804],
	  std = [0.5045892 0.5076107 0.494606  0.5766097]
12.189999727532268
Episode 1152	Average Score: 8.91	Score: 12.199.779999781399965
Episode 1153	Average Score: 8.95	Score: 9.78actions batch at 807000-th learning:
	 shape = (128, 4),
	 mean = [0.16745102 0.18268079 0.17086394 0.24652572],
	  std = [0.5206837  0.53635156 0.49847007 0.5481554 ]
10.739999759942293
Episode 1154	Average Score: 8.95	Score: 10.74actions batch at 808000-th learning:
	 shape = (128, 4),
	 mean = [0.2832604  0.26413423 0.25346118 0.22546968],
	  std = [0.5268596  0.56233805 0.5366237  0.5369384 ]
10.699999760836363
Episode 1155	Average Score: 9.01	Score: 10.70actions batch at 809000-th learning:
	 shape = (128, 4),
	 mean = [0.2455055  0.18723974 0.27191195 0.23262848],
	  std = [0.5316633  0.54348505 0.5209227  0.55245733]
10.279999770224094
Episode 1156	Average Score: 9.03	Score: 10.2812.269999725744128
Episode 1157	Average Score: 9.03	Score: 12.27actions batch at 810000-th learning:
	 shape = (128, 4),
	 mean = [0.29285496 0.28728005 0.26276258 0.28299525],
	  std = [0.56990397 0.5833245  0.5254225  0.5862049 ]
9.259999793022871
Episode 1158	Average Score: 9.04	Score: 9.26actions batch at 811000-th learning:
	 shape = (128, 4),
	 mean = [0.22803    0.15774187 0.15193564 0.2542059 ],
	  std = [0.5479882  0.5362341  0.49696246 0.546826  ]
9.429999789223075
Episode 1159	Average Score: 8.99	Score: 9.4311.839999735355377
Episode 1160	Average Score: 9.06
actions batch at 812000-th learning:
	 shape = (128, 4),
	 mean = [0.3548376  0.14235665 0.22115995 0.28004155],
	  std = [0.5786104  0.5014926  0.52325636 0.56310195]
8.78999980352819
Episode 1161	Average Score: 9.07	Score: 8.79actions batch at 813000-th learning:
	 shape = (128, 4),
	 mean = [0.29310843 0.2100758  0.25894037 0.2287593 ],
	  std = [0.58279806 0.5560591  0.5151459  0.57578856]
11.009999753907323
Episode 1162	Average Score: 9.12	Score: 11.018.16999981738627
Episode 1163	Average Score: 9.11	Score: 8.17actions batch at 814000-th learning:
	 shape = (128, 4),
	 mean = [0.35113403 0.10036633 0.27214015 0.24340543],
	  std = [0.5636733  0.5341983  0.51880527 0.55007607]
8.659999806433916
Episode 1164	Average Score: 9.12	Score: 8.66actions batch at 815000-th learning:
	 shape = (128, 4),
	 mean = [0.2310025  0.0899235  0.18533058 0.21969032],
	  std = [0.5357838  0.5098759  0.47312248 0.56463116]
7.089999841526151
Episode 1165	Average Score: 9.07	Score: 7.09actions batch at 816000-th learning:
	 shape = (128, 4),
	 mean = [0.20269313 0.08193656 0.23165128 0.2881218 ],
	  std = [0.56461686 0.49388698 0.54490674 0.57407403]
7.909999823197722
Episode 1166	Average Score: 9.07	Score: 7.915.1099998857825994
Episode 1167	Average Score: 9.06	Score: 5.11actions batch at 817000-th learning:
	 shape = (128, 4),
	 mean = [0.24979205 0.1490047  0.26394948 0.30443326],
	  std = [0.57853925 0.5210023  0.5274085  0.5541743 ]
11.819999735802412
Episode 1168	Average Score: 9.16	Score: 11.82actions batch at 818000-th learning:
	 shape = (128, 4),
	 mean = [0.32465765 0.11429212 0.28818318 0.25824752],
	  std = [0.5602398  0.54771686 0.5368309  0.56448066]
13.069999707862735
Episode 1169	Average Score: 9.15	Score: 13.0711.76999973692
Episode 1170	Average Score: 9.19
actions batch at 819000-th learning:
	 shape = (128, 4),
	 mean = [0.20358424 0.20258485 0.2311445  0.27657354],
	  std = [0.5475512  0.5547922  0.5048028  0.56825334]
14.959999665617943
Episode 1171	Average Score: 9.26	Score: 14.96actions batch at 820000-th learning:
	 shape = (128, 4),
	 mean = [0.18531309 0.21409388 0.26404995 0.20374113],
	  std = [0.53028136 0.53886247 0.5251206  0.52189   ]
10.219999771565199
Episode 1172	Average Score: 9.26	Score: 10.229.309999791905284
Episode 1173	Average Score: 9.27	Score: 9.31actions batch at 821000-th learning:
	 shape = (128, 4),
	 mean = [0.2309004  0.2083333  0.18095364 0.21266344],
	  std = [0.5482523  0.56109375 0.49571422 0.54447687]
6.869999846443534
Episode 1174	Average Score: 9.25	Score: 6.87actions batch at 822000-th learning:
	 shape = (128, 4),
	 mean = [0.25319043 0.16683576 0.18989025 0.16326609],
	  std = [0.52121615 0.54009175 0.49191964 0.5327214 ]
10.68999976105988
Episode 1175	Average Score: 9.27	Score: 10.69actions batch at 823000-th learning:
	 shape = (128, 4),
	 mean = [0.3599425  0.2891565  0.36052838 0.22956048],
	  std = [0.5578681 0.5721845 0.5150495 0.5634383]
9.17999979481101
Episode 1176	Average Score: 9.28	Score: 9.187.829999824985862
Episode 1177	Average Score: 9.27	Score: 7.83actions batch at 824000-th learning:
	 shape = (128, 4),
	 mean = [0.23034862 0.08836648 0.21710339 0.18164861],
	  std = [0.5573794  0.51220447 0.49679488 0.5523131 ]
8.619999807327986
Episode 1178	Average Score: 9.32	Score: 8.62actions batch at 825000-th learning:
	 shape = (128, 4),
	 mean = [0.24336506 0.24000014 0.2004533  0.2838799 ],
	  std = [0.5225241  0.5589428  0.47544095 0.53480977]
11.009999753907323
Episode 1179	Average Score: 9.38	Score: 11.018.2099998164922
Episode 1180	Average Score: 9.39
actions batch at 826000-th learning:
	 shape = (128, 4),
	 mean = [0.20165962 0.21340336 0.20521833 0.2130608 ],
	  std = [0.5485007  0.52751607 0.50540495 0.5289078 ]
7.719999827444553
Episode 1181	Average Score: 9.40	Score: 7.72actions batch at 827000-th learning:
	 shape = (128, 4),
	 mean = [0.24093746 0.22138302 0.21703832 0.21086897],
	  std = [0.54004014 0.5460861  0.49220222 0.52207   ]
10.56999976374209
Episode 1182	Average Score: 9.43	Score: 10.577.079999841749668
Episode 1183	Average Score: 9.42	Score: 7.08actions batch at 828000-th learning:
	 shape = (128, 4),
	 mean = [0.1983716  0.1869857  0.21446893 0.26620978],
	  std = [0.5595332  0.5366286  0.51762205 0.5415688 ]
10.939999755471945
Episode 1184	Average Score: 9.43	Score: 10.94actions batch at 829000-th learning:
	 shape = (128, 4),
	 mean = [0.32824197 0.21071179 0.3143403  0.2045697 ],
	  std = [0.5633378 0.5931948 0.5175586 0.5665213]
8.619999807327986
Episode 1185	Average Score: 9.45	Score: 8.62actions batch at 830000-th learning:
	 shape = (128, 4),
	 mean = [0.25578618 0.20782593 0.20757538 0.25921115],
	  std = [0.55870795 0.5438103  0.48526096 0.53244853]
1.959999956190586
Episode 1186	Average Score: 9.38	Score: 1.968.529999809339643
Episode 1187	Average Score: 9.38	Score: 8.53actions batch at 831000-th learning:
	 shape = (128, 4),
	 mean = [0.26938012 0.20150802 0.20126417 0.24514869],
	  std = [0.56411314 0.56968504 0.47781008 0.5847652 ]
9.479999788105488
Episode 1188	Average Score: 9.29	Score: 9.48actions batch at 832000-th learning:
	 shape = (128, 4),
	 mean = [0.24361034 0.21232454 0.20831512 0.19735557],
	  std = [0.5490372 0.5566855 0.5143562 0.5382657]
7.749999826774001
Episode 1189	Average Score: 9.30	Score: 7.757.479999832808971
Episode 1190	Average Score: 9.28
actions batch at 833000-th learning:
	 shape = (128, 4),
	 mean = [0.2480075  0.13511433 0.21107    0.2680457 ],
	  std = [0.5669866 0.5428805 0.5025895 0.5558108]
9.79999978095293
Episode 1191	Average Score: 9.29	Score: 9.80actions batch at 834000-th learning:
	 shape = (128, 4),
	 mean = [0.25009802 0.19638656 0.20072582 0.23204021],
	  std = [0.5726906  0.5699895  0.49882856 0.55001646]
6.779999848455191
Episode 1192	Average Score: 9.26	Score: 6.789.659999784082174
Episode 1193	Average Score: 9.27	Score: 9.66actions batch at 835000-th learning:
	 shape = (128, 4),
	 mean = [0.24083973 0.23486511 0.31002626 0.21788363],
	  std = [0.57500947 0.55992097 0.4935403  0.55575436]
7.04999984242022
Episode 1194	Average Score: 9.21	Score: 7.05actions batch at 836000-th learning:
	 shape = (128, 4),
	 mean = [0.29675862 0.14529228 0.32020324 0.30971318],
	  std = [0.5589083  0.5385774  0.54362    0.54716456]
8.149999817833304
Episode 1195	Average Score: 9.21	Score: 8.15actions batch at 837000-th learning:
	 shape = (128, 4),
	 mean = [0.32164085 0.27168605 0.20775253 0.24428406],
	  std = [0.5674281 0.5542589 0.511652  0.5631067]
5.899999868124723
Episode 1196	Average Score: 9.18	Score: 5.90
8.51999980956316
Episode 1197	Average Score: 9.12	Score: 8.52
actions batch at 838000-th learning:
	 shape = (128, 4),
	 mean = [0.2686978  0.1827533  0.28354383 0.24081391],
	  std = [0.5628139  0.5535243  0.5489181  0.57267296]
4.969999888911843
Episode 1198	Average Score: 9.08	Score: 4.97
actions batch at 839000-th learning:
	 shape = (128, 4),
	 mean = [0.29398164 0.28889778 0.25304604 0.27627802],
	  std = [0.58036816 0.57238424 0.5179281  0.5691472 ]
11.989999732002616
Episode 1199	Average Score: 9.09	Score: 11.99
10.64999976195395
Episode 1200	Average Score: 9.10
actions batch at 840000-th learning:
	 shape = (128, 4),
	 mean = [0.2935799  0.20222591 0.2593167  0.2663735 ],
	  std = [0.56220394 0.5456404  0.5239358  0.5636955 ]
9.579999785870314
Episode 1201	Average Score: 9.12	Score: 9.58
actions batch at 841000-th learning:
	 shape = (128, 4),
	 mean = [0.2100238  0.26548472 0.26461816 0.28757358],
	  std = [0.5277631  0.5406914  0.51852137 0.5593928 ]
6.939999844878912
Episode 1202	Average Score: 9.11	Score: 6.94
10.019999776035547
Episode 1203	Average Score: 9.13	Score: 10.02
actions batch at 842000-th learning:
	 shape = (128, 4),
	 mean = [0.19965304 0.19705336 0.17389053 0.24649803],
	  std = [0.56884223 0.55259395 0.5302519  0.5386898 ]
8.469999810680747
Episode 1204	Average Score: 9.14	Score: 8.47
actions batch at 843000-th learning:
	 shape = (128, 4),
	 mean = [0.2426265  0.15749995 0.20834853 0.14059131],
	  std = [0.55327916 0.5380247  0.51153743 0.5324244 ]
12.889999711886048
Episode 1205	Average Score: 9.14	Score: 12.89
actions batch at 844000-th learning:
	 shape = (128, 4),
	 mean = [0.3062892  0.21468496 0.24252026 0.19686365],
	  std = [0.5696868  0.5638126  0.52257633 0.57543117]
12.399999722838402
Episode 1206	Average Score: 9.19	Score: 12.40
9.929999778047204
Episode 1207	Average Score: 9.20	Score: 9.93
actions batch at 845000-th learning:
	 shape = (128, 4),
	 mean = [0.22417612 0.18407822 0.21923788 0.28499007],
	  std = [0.5517649 0.5536402 0.5156082 0.5761885]
12.069999730214477
Episode 1208	Average Score: 9.24	Score: 12.07
actions batch at 846000-th learning:
	 shape = (128, 4),
	 mean = [0.2782981  0.14020798 0.24352187 0.25358978],
	  std = [0.5623242  0.5437558  0.51789737 0.57479626]
10.509999765083194
Episode 1209	Average Score: 9.28	Score: 10.51
10.02999977581203
Episode 1210	Average Score: 9.30
actions batch at 847000-th learning:
	 shape = (128, 4),
	 mean = [0.23512456 0.22509407 0.20879994 0.2931843 ],
	  std = [0.56162804 0.56569767 0.4839026  0.58067715]
7.379999835044146
Episode 1211	Average Score: 9.29	Score: 7.38
actions batch at 848000-th learning:
	 shape = (128, 4),
	 mean = [0.29893896 0.21577479 0.2087053  0.25774038],
	  std = [0.55241156 0.54392177 0.52970403 0.5584613 ]
10.059999775141478
Episode 1212	Average Score: 9.30	Score: 10.06
10.139999773353338
Episode 1213	Average Score: 9.33	Score: 10.14
actions batch at 849000-th learning:
	 shape = (128, 4),
	 mean = [0.2429394  0.24241024 0.23632443 0.21348009],
	  std = [0.5517036  0.5550055  0.5178559  0.54346526]
13.579999696463346
Episode 1214	Average Score: 9.39	Score: 13.58
actions batch at 850000-th learning:
	 shape = (128, 4),
	 mean = [0.22698313 0.24804336 0.2603102  0.26947743],
	  std = [0.54756206 0.5723588  0.52783775 0.55279416]
8.319999814033508
Episode 1215	Average Score: 9.39	Score: 8.32
actions batch at 851000-th learning:
	 shape = (128, 4),
	 mean = [0.35832453 0.2808674  0.26376414 0.26055306],
	  std = [0.5825364  0.59166044 0.5457492  0.5655971 ]
8.779999803751707
Episode 1216	Average Score: 9.40	Score: 8.78
11.689999738708138
Episode 1217	Average Score: 9.46	Score: 11.69
actions batch at 852000-th learning:
	 shape = (128, 4),
	 mean = [0.26727098 0.26660344 0.30575892 0.2827891 ],
	  std = [0.57652605 0.5545755  0.5473677  0.6000005 ]
12.189999727532268
Episode 1218	Average Score: 9.46	Score: 12.19
actions batch at 853000-th learning:
	 shape = (128, 4),
	 mean = [0.2042752  0.19402337 0.20589554 0.21467878],
	  std = [0.5179007  0.53668    0.49027064 0.54292494]
11.459999743849039
Episode 1219	Average Score: 9.49	Score: 11.46
10.41999976709485
Episode 1220	Average Score: 9.52
actions batch at 854000-th learning:
	 shape = (128, 4),
	 mean = [0.15741128 0.11172637 0.22541575 0.27406612],
	  std = [0.5403372  0.54091406 0.531659   0.5776023 ]
11.249999748542905
Episode 1221	Average Score: 9.55	Score: 11.25
actions batch at 855000-th learning:
	 shape = (128, 4),
	 mean = [0.35587826 0.24040753 0.21502683 0.25569502],
	  std = [0.54436094 0.5660697  0.514811   0.5652373 ]
10.119999773800373
Episode 1222	Average Score: 9.57	Score: 10.12
9.739999782294035
Episode 1223	Average Score: 9.60	Score: 9.74
actions batch at 856000-th learning:
	 shape = (128, 4),
	 mean = [0.23821732 0.21736743 0.24435583 0.12294313],
	  std = [0.56936485 0.54426885 0.5243592  0.5048851 ]
11.18999974988401
Episode 1224	Average Score: 9.64	Score: 11.19
actions batch at 857000-th learning:
	 shape = (128, 4),
	 mean = [0.24772796 0.27816665 0.29507416 0.2770483 ],
	  std = [0.561209   0.5785715  0.5177855  0.55580646]
11.61999974027276
Episode 1225	Average Score: 9.73	Score: 11.62
actions batch at 858000-th learning:
	 shape = (128, 4),
	 mean = [0.27360222 0.20650026 0.27946153 0.27187195],
	  std = [0.5723131  0.5402983  0.5141464  0.55193126]
9.169999795034528
Episode 1226	Average Score: 9.74	Score: 9.17
12.049999730661511
Episode 1227	Average Score: 9.79	Score: 12.05
actions batch at 859000-th learning:
	 shape = (128, 4),
	 mean = [0.3728942  0.1716151  0.28675994 0.20327526],
	  std = [0.58933157 0.5360994  0.52540153 0.5407332 ]
11.719999738037586
Episode 1228	Average Score: 9.83	Score: 11.72
actions batch at 860000-th learning:
	 shape = (128, 4),
	 mean = [0.21366327 0.15046737 0.24295437 0.23808074],
	  std = [0.570522   0.5407855  0.5341618  0.54217774]
6.009999865666032
Episode 1229	Average Score: 9.81	Score: 6.01
11.909999733790755
Episode 1230	Average Score: 9.86
actions batch at 861000-th learning:
	 shape = (128, 4),
	 mean = [0.25077805 0.20850849 0.2204691  0.27461106],
	  std = [0.548298   0.5364279  0.50565195 0.5369538 ]
11.369999745860696
Episode 1231	Average Score: 9.85	Score: 11.37
actions batch at 862000-th learning:
	 shape = (128, 4),
	 mean = [0.19602858 0.15583146 0.2811495  0.26091853],
	  std = [0.5280623  0.55062705 0.52698946 0.5650852 ]
8.59999980777502
Episode 1232	Average Score: 9.82	Score: 8.60
13.179999705404043
Episode 1233	Average Score: 9.84	Score: 13.18
actions batch at 863000-th learning:
	 shape = (128, 4),
	 mean = [0.26749018 0.2246005  0.26789054 0.22522418],
	  std = [0.5613177  0.566355   0.53386515 0.5443205 ]
10.359999768435955
Episode 1234	Average Score: 9.85	Score: 10.36
actions batch at 864000-th learning:
	 shape = (128, 4),
	 mean = [0.18604338 0.24186973 0.23690769 0.15455145],
	  std = [0.5740747  0.55764484 0.52552885 0.5394357 ]
16.319999635219574
Episode 1235	Average Score: 9.92	Score: 16.32
actions batch at 865000-th learning:
	 shape = (128, 4),
	 mean = [0.291509   0.19586529 0.26742077 0.24761806],
	  std = [0.57816255 0.52682245 0.50742286 0.5653225 ]
10.279999770224094
Episode 1236	Average Score: 9.93	Score: 10.28
10.139999773353338
Episode 1237	Average Score: 9.93	Score: 10.14
actions batch at 866000-th learning:
	 shape = (128, 4),
	 mean = [0.33153686 0.29016796 0.33754975 0.32713792],
	  std = [0.55715925 0.55843455 0.5276794  0.55984753]
7.6199998296797276
Episode 1238	Average Score: 9.90	Score: 7.62
actions batch at 867000-th learning:
	 shape = (128, 4),
	 mean = [0.2823013  0.21112034 0.29339373 0.27290213],
	  std = [0.5601018  0.5383493  0.536908   0.56411767]
13.12999970652163
Episode 1239	Average Score: 9.93	Score: 13.13
8.319999814033508
Episode 1240	Average Score: 9.90
actions batch at 868000-th learning:
	 shape = (128, 4),
	 mean = [0.21336848 0.20155965 0.25023985 0.2196915 ],
	  std = [0.5507591  0.5425565  0.48174956 0.5605717 ]
7.729999827221036
Episode 1241	Average Score: 9.86	Score: 7.73
actions batch at 869000-th learning:
	 shape = (128, 4),
	 mean = [0.22657096 0.1842528  0.30977786 0.2872689 ],
	  std = [0.55232006 0.54824466 0.524889   0.57433814]
11.219999749213457
Episode 1242	Average Score: 9.87	Score: 11.22
8.809999803081155
Episode 1243	Average Score: 9.91	Score: 8.81
actions batch at 870000-th learning:
	 shape = (128, 4),
	 mean = [0.23981103 0.2615182  0.24637878 0.30401447],
	  std = [0.5471486  0.560475   0.49295273 0.5659699 ]
8.609999807551503
Episode 1244	Average Score: 9.87	Score: 8.61
actions batch at 871000-th learning:
	 shape = (128, 4),
	 mean = [0.2561019  0.15305717 0.18939681 0.31573194],
	  std = [0.5295854  0.5435041  0.50237286 0.56690705]
12.269999725744128
Episode 1245	Average Score: 9.87	Score: 12.27
actions batch at 872000-th learning:
	 shape = (128, 4),
	 mean = [0.2845793  0.2384845  0.26073852 0.26150045],
	  std = [0.5377007  0.5547113  0.53022    0.57912236]
8.799999803304672
Episode 1246	Average Score: 9.86	Score: 8.80
10.559999763965607
Episode 1247	Average Score: 9.88	Score: 10.56
actions batch at 873000-th learning:
	 shape = (128, 4),
	 mean = [0.2604629  0.25227836 0.29001668 0.21070965],
	  std = [0.5877327  0.561768   0.52030843 0.53639114]
11.879999734461308
Episode 1248	Average Score: 9.89	Score: 11.88
actions batch at 874000-th learning:
	 shape = (128, 4),
	 mean = [0.2817483  0.16753921 0.29038376 0.31587788],
	  std = [0.5418888  0.5273276  0.53073716 0.5583472 ]
11.639999739825726
Episode 1249	Average Score: 9.91	Score: 11.64
7.339999835938215
Episode 1250	Average Score: 9.85
actions batch at 875000-th learning:
	 shape = (128, 4),
	 mean = [0.25643283 0.32746825 0.25664347 0.28867066],
	  std = [0.558867   0.5692841  0.5297144  0.55474824]
8.919999800622463
Episode 1251	Average Score: 9.79	Score: 8.92
actions batch at 876000-th learning:
	 shape = (128, 4),
	 mean = [0.27635944 0.16357656 0.21419764 0.2240697 ],
	  std = [0.5392813  0.5414182  0.5074059  0.53603816]
7.409999834373593
Episode 1252	Average Score: 9.74	Score: 7.41
8.219999816268682
Episode 1253	Average Score: 9.73	Score: 8.22
actions batch at 877000-th learning:
	 shape = (128, 4),
	 mean = [0.2903317  0.32460654 0.25591755 0.23633884],
	  std = [0.5419189 0.5857917 0.5141304 0.5269073]
7.809999825432897
Episode 1254	Average Score: 9.70	Score: 7.81
actions batch at 878000-th learning:
	 shape = (128, 4),
	 mean = [0.3105742  0.30081782 0.3274886  0.31487083],
	  std = [0.5596293  0.5778434  0.52771527 0.575159  ]
7.609999829903245
Episode 1255	Average Score: 9.67	Score: 7.61
actions batch at 879000-th learning:
	 shape = (128, 4),
	 mean = [0.2463038  0.22612397 0.20699437 0.31141573],
	  std = [0.5606848  0.57027537 0.5287863  0.5660969 ]
7.499999832361937
Episode 1256	Average Score: 9.64	Score: 7.50
9.089999796822667
Episode 1257	Average Score: 9.61	Score: 9.09
actions batch at 880000-th learning:
	 shape = (128, 4),
	 mean = [0.3057121  0.2700897  0.22776628 0.2020964 ],
	  std = [0.57481796 0.5580848  0.52384996 0.52551275]
8.099999818950891
Episode 1258	Average Score: 9.60	Score: 8.10
actions batch at 881000-th learning:
	 shape = (128, 4),
	 mean = [0.2643615  0.2210638  0.30413762 0.277345  ],
	  std = [0.5557812  0.54297554 0.51690376 0.51553595]
16.529999630525708
Episode 1259	Average Score: 9.67	Score: 16.53
11.57999974116683
Episode 1260	Average Score: 9.66
actions batch at 882000-th learning:
	 shape = (128, 4),
	 mean = [0.21073522 0.19030009 0.1884928  0.24232475],
	  std = [0.54640746 0.5496968  0.5179903  0.5258394 ]
10.64999976195395
Episode 1261	Average Score: 9.68	Score: 10.65
actions batch at 883000-th learning:
	 shape = (128, 4),
	 mean = [0.35670465 0.29532015 0.28427202 0.32997712],
	  std = [0.55889565 0.55086964 0.51982725 0.553451  ]
6.669999850913882
Episode 1262	Average Score: 9.64	Score: 6.67
9.13999979570508
Episode 1263	Average Score: 9.65	Score: 9.14
actions batch at 884000-th learning:
	 shape = (128, 4),
	 mean = [0.34698233 0.09169941 0.34562838 0.21973419],
	  std = [0.5697122  0.50692487 0.5373917  0.53304684]
5.389999879524112
Episode 1264	Average Score: 9.62	Score: 5.39
actions batch at 885000-th learning:
	 shape = (128, 4),
	 mean = [0.25285256 0.20024633 0.19149622 0.2817295 ],
	  std = [0.5235281  0.51491207 0.5193976  0.5809046 ]
7.699999827891588
Episode 1265	Average Score: 9.62	Score: 7.70
actions batch at 886000-th learning:
	 shape = (128, 4),
	 mean = [0.20298576 0.21381839 0.24229702 0.23688297],
	  std = [0.5240185  0.55202925 0.534703   0.5628707 ]
8.059999819844961
Episode 1266	Average Score: 9.62	Score: 8.06
9.419999789446592
Episode 1267	Average Score: 9.67	Score: 9.42
actions batch at 887000-th learning:
	 shape = (128, 4),
	 mean = [0.2949148  0.17933525 0.25968647 0.30210197],
	  std = [0.557991  0.5443645 0.5253793 0.561355 ]
10.529999764636159
Episode 1268	Average Score: 9.65	Score: 10.53
actions batch at 888000-th learning:
	 shape = (128, 4),
	 mean = [0.24891268 0.20612063 0.20457739 0.34852225],
	  std = [0.5466633  0.5346866  0.5563563  0.56906253]
5.409999879077077
Episode 1269	Average Score: 9.58	Score: 5.41
10.45999976620078
Episode 1270	Average Score: 9.57
actions batch at 889000-th learning:
	 shape = (128, 4),
	 mean = [0.3235112  0.13854161 0.2696296  0.24845964],
	  std = [0.55586475 0.5318482  0.50032127 0.56574786]
8.389999812468886
Episode 1271	Average Score: 9.50	Score: 8.39
actions batch at 890000-th learning:
	 shape = (128, 4),
	 mean = [0.23204969 0.33677924 0.26306558 0.24629971],
	  std = [0.54941726 0.55222094 0.5408845  0.55103475]
8.259999815374613
Episode 1272	Average Score: 9.48	Score: 8.26
2.2799999490380287
Episode 1273	Average Score: 9.41	Score: 2.28
actions batch at 891000-th learning:
	 shape = (128, 4),
	 mean = [0.22661315 0.2752282  0.26816812 0.3390042 ],
	  std = [0.5671723  0.5676423  0.5221177  0.56565887]
7.649999829009175
Episode 1274	Average Score: 9.42	Score: 7.65
actions batch at 892000-th learning:
	 shape = (128, 4),
	 mean = [0.24189478 0.14656343 0.19277093 0.25579277],
	  std = [0.5689246  0.55152214 0.499316   0.54575825]
8.919999800622463
Episode 1275	Average Score: 9.40	Score: 8.92
actions batch at 893000-th learning:
	 shape = (128, 4),
	 mean = [0.2565916  0.24444582 0.25656325 0.25117275],
	  std = [0.58357316 0.57785416 0.53585917 0.57479036]
7.999999821186066
Episode 1276	Average Score: 9.39	Score: 8.00
14.399999678134918
Episode 1277	Average Score: 9.45	Score: 14.40
actions batch at 894000-th learning:
	 shape = (128, 4),
	 mean = [0.23955965 0.18873523 0.21474792 0.24070197],
	  std = [0.53552675 0.52434427 0.5068786  0.5458804 ]
12.449999721720815
Episode 1278	Average Score: 9.49	Score: 12.45
actions batch at 895000-th learning:
	 shape = (128, 4),
	 mean = [0.325819   0.28369972 0.30240697 0.35837808],
	  std = [0.56350243 0.55489093 0.5189938  0.55240095]
6.309999858960509
Episode 1279	Average Score: 9.44	Score: 6.31
7.299999836832285
Episode 1280	Average Score: 9.44
actions batch at 896000-th learning:
	 shape = (128, 4),
	 mean = [0.25162157 0.21151562 0.26098853 0.23005499],
	  std = [0.5774397  0.5589836  0.5170301  0.56806815]
8.769999803975224
Episode 1281	Average Score: 9.45	Score: 8.77
actions batch at 897000-th learning:
	 shape = (128, 4),
	 mean = [0.27590865 0.15576074 0.23156497 0.2986603 ],
	  std = [0.5388163  0.50778455 0.5186824  0.56743944]
8.949999799951911
Episode 1282	Average Score: 9.43	Score: 8.95
9.229999793693423
Episode 1283	Average Score: 9.45	Score: 9.23
actions batch at 898000-th learning:
	 shape = (128, 4),
	 mean = [0.20486914 0.19487329 0.23031601 0.21232046],
	  std = [0.5761548 0.5041564 0.508952  0.5374472]
7.379999835044146
Episode 1284	Average Score: 9.42	Score: 7.38
actions batch at 899000-th learning:
	 shape = (128, 4),
	 mean = [0.25386646 0.13168453 0.2924629  0.22502233],
	  std = [0.5603109 0.530342  0.5323281 0.5413694]
10.41999976709485
Episode 1285	Average Score: 9.43	Score: 10.42
actions batch at 900000-th learning:
	 shape = (128, 4),
	 mean = [0.20135662 0.23939188 0.19170736 0.16525517],
	  std = [0.5554089  0.56707084 0.5011321  0.5629354 ]
5.879999868571758
Episode 1286	Average Score: 9.47	Score: 5.88
7.8599998243153095
Episode 1287	Average Score: 9.47	Score: 7.86
actions batch at 901000-th learning:
	 shape = (128, 4),
	 mean = [0.30946234 0.2860349  0.32050583 0.23785822],
	  std = [0.569153   0.5676904  0.5299519  0.54427564]
9.569999786093831
Episode 1288	Average Score: 9.47	Score: 9.57
actions batch at 902000-th learning:
	 shape = (128, 4),
	 mean = [0.33033544 0.2715763  0.36389476 0.32169962],
	  std = [0.5915582  0.53801954 0.5339552  0.5645669 ]
9.109999796375632
Episode 1289	Average Score: 9.48	Score: 9.11
6.569999853149056
Episode 1290	Average Score: 9.47
actions batch at 903000-th learning:
	 shape = (128, 4),
	 mean = [0.28884774 0.21140434 0.23891665 0.15250438],
	  std = [0.5248353  0.5231651  0.49470085 0.54552376]
5.999999865889549
Episode 1291	Average Score: 9.43	Score: 6.00
actions batch at 904000-th learning:
	 shape = (128, 4),
	 mean = [0.28102967 0.23906331 0.2732236  0.34137684],
	  std = [0.5680726  0.54718995 0.503128   0.574038  ]
8.769999803975224
Episode 1292	Average Score: 9.45	Score: 8.77
7.849999824538827
Episode 1293	Average Score: 9.44	Score: 7.85
actions batch at 905000-th learning:
	 shape = (128, 4),
	 mean = [0.2592511  0.23229964 0.21363932 0.19228092],
	  std = [0.5140026 0.5615339 0.5269809 0.5345882]
8.259999815374613
Episode 1294	Average Score: 9.45	Score: 8.26
actions batch at 906000-th learning:
	 shape = (128, 4),
	 mean = [0.187542   0.25771648 0.25319797 0.1682218 ],
	  std = [0.55424637 0.5438962  0.51405    0.5239168 ]
9.329999791458249
Episode 1295	Average Score: 9.46	Score: 9.33
actions batch at 907000-th learning:
	 shape = (128, 4),
	 mean = [0.26707196 0.19086085 0.16530472 0.34207353],
	  std = [0.5577562  0.57467973 0.5104924  0.58007294]
10.429999766871333
Episode 1296	Average Score: 9.50	Score: 10.43
6.609999852254987
Episode 1297	Average Score: 9.49	Score: 6.61
actions batch at 908000-th learning:
	 shape = (128, 4),
	 mean = [0.26236933 0.22780858 0.21390235 0.25284833],
	  std = [0.5284501  0.5594614  0.50334734 0.53996354]
5.369999879971147
Episode 1298	Average Score: 9.49	Score: 5.37
actions batch at 909000-th learning:
	 shape = (128, 4),
	 mean = [0.23721032 0.23745704 0.2021813  0.32677907],
	  std = [0.57297355 0.55064535 0.5222631  0.57279056]
6.709999850019813
Episode 1299	Average Score: 9.44	Score: 6.71
7.849999824538827
Episode 1300	Average Score: 9.41
actions batch at 910000-th learning:
	 shape = (128, 4),
	 mean = [0.3468054  0.23510918 0.2252637  0.25745723],
	  std = [0.584388   0.5523512  0.54233146 0.55326384]
5.879999868571758
Episode 1301	Average Score: 9.37	Score: 5.88
actions batch at 911000-th learning:
	 shape = (128, 4),
	 mean = [0.23534733 0.1762259  0.30497283 0.30371153],
	  std = [0.56582254 0.54458094 0.5324707  0.58032966]
10.219999771565199
Episode 1302	Average Score: 9.40	Score: 10.22
6.959999844431877
Episode 1303	Average Score: 9.37	Score: 6.96
actions batch at 912000-th learning:
	 shape = (128, 4),
	 mean = [0.20737235 0.20617464 0.2490545  0.26843232],
	  std = [0.553078   0.54895353 0.52079105 0.54267126]
6.34999985806644
Episode 1304	Average Score: 9.35	Score: 6.35
actions batch at 913000-th learning:
	 shape = (128, 4),
	 mean = [0.2250158  0.14769147 0.1255735  0.24758385],
	  std = [0.56035584 0.5125025  0.46214035 0.5508688 ]
6.889999845996499
Episode 1305	Average Score: 9.29	Score: 6.89
actions batch at 914000-th learning:
	 shape = (128, 4),
	 mean = [0.22186707 0.16087715 0.12702145 0.16621831],
	  std = [0.5165817  0.5669885  0.47497463 0.51755565]
6.499999854713678
Episode 1306	Average Score: 9.23	Score: 6.50
16.219999637454748
Episode 1307	Average Score: 9.30	Score: 16.22
actions batch at 915000-th learning:
	 shape = (128, 4),
	 mean = [0.24101546 0.20267427 0.26806635 0.14924714],
	  std = [0.5312459  0.52817994 0.51664597 0.5214064 ]
5.22999988310039
Episode 1308	Average Score: 9.23	Score: 5.23
actions batch at 916000-th learning:
	 shape = (128, 4),
	 mean = [0.23773772 0.18855314 0.18042533 0.19403051],
	  std = [0.53618485 0.52344275 0.47544125 0.55012995]
6.5199998542666435
Episode 1309	Average Score: 9.19	Score: 6.52
7.04999984242022
Episode 1310	Average Score: 9.16
actions batch at 917000-th learning:
	 shape = (128, 4),
	 mean = [0.22471282 0.18307728 0.16704759 0.2136558 ],
	  std = [0.5574585  0.538245   0.50870454 0.54794675]
8.149999817833304
Episode 1311	Average Score: 9.17	Score: 8.15
actions batch at 918000-th learning:
	 shape = (128, 4),
	 mean = [0.2724591  0.22138391 0.31093332 0.21156101],
	  std = [0.56452686 0.5329417  0.50885814 0.5239348 ]
9.109999796375632
Episode 1312	Average Score: 9.16	Score: 9.11
9.399999789893627
Episode 1313	Average Score: 9.15	Score: 9.40
actions batch at 919000-th learning:
	 shape = (128, 4),
	 mean = [0.2567406  0.22232686 0.20987394 0.19176656],
	  std = [0.57221305 0.5446561  0.4952799  0.5679658 ]
7.459999833256006
Episode 1314	Average Score: 9.09	Score: 7.46
actions batch at 920000-th learning:
	 shape = (128, 4),
	 mean = [0.21333466 0.24405317 0.26447058 0.3228557 ],
	  std = [0.582297   0.54405606 0.51666844 0.55840886]
6.979999843984842
Episode 1315	Average Score: 9.07	Score: 6.98
actions batch at 921000-th learning:
	 shape = (128, 4),
	 mean = [0.22542976 0.2031899  0.25914872 0.23354414],
	  std = [0.5657855 0.534614  0.5388397 0.5579685]
10.189999772235751
Episode 1316	Average Score: 9.09	Score: 10.19
8.059999819844961
Episode 1317	Average Score: 9.05	Score: 8.06
actions batch at 922000-th learning:
	 shape = (128, 4),
	 mean = [0.2966486  0.229733   0.26206568 0.3044253 ],
	  std = [0.58209544 0.55031157 0.5018229  0.55511045]
6.689999850466847
Episode 1318	Average Score: 9.00	Score: 6.69
actions batch at 923000-th learning:
	 shape = (128, 4),
	 mean = [0.2836967  0.21880639 0.23492251 0.24169709],
	  std = [0.5241627  0.5329599  0.5291803  0.55201614]
9.5299997869879
Episode 1319	Average Score: 8.98	Score: 9.53
4.229999905452132
Episode 1320	Average Score: 8.92
actions batch at 924000-th learning:
	 shape = (128, 4),
	 mean = [0.24093516 0.22975323 0.28560278 0.27681822],
	  std = [0.5404058  0.54119664 0.5257456  0.5506944 ]
10.119999773800373
Episode 1321	Average Score: 8.91	Score: 10.12
actions batch at 925000-th learning:
	 shape = (128, 4),
	 mean = [0.26079062 0.18396491 0.1422555  0.30255187],
	  std = [0.5283613  0.5192991  0.46588472 0.5516576 ]
5.759999871253967
Episode 1322	Average Score: 8.86	Score: 5.76
9.899999778717756
Episode 1323	Average Score: 8.86	Score: 9.90
actions batch at 926000-th learning:
	 shape = (128, 4),
	 mean = [0.25279492 0.2447993  0.24131818 0.28178093],
	  std = [0.55543685 0.5693633  0.5109895  0.5589295 ]
8.55999980866909
Episode 1324	Average Score: 8.84	Score: 8.56
actions batch at 927000-th learning:
	 shape = (128, 4),
	 mean = [0.23188747 0.26136065 0.29007795 0.26586056],
	  std = [0.5726607  0.5318751  0.50647366 0.5937956 ]
11.299999747425318
Episode 1325	Average Score: 8.83	Score: 11.30
actions batch at 928000-th learning:
	 shape = (128, 4),
	 mean = [0.28970906 0.1876382  0.17668103 0.19885533],
	  std = [0.5532763  0.5486568  0.48761177 0.55330306]
9.899999778717756
Episode 1326	Average Score: 8.84	Score: 9.90
8.719999805092812
Episode 1327	Average Score: 8.81	Score: 8.72
actions batch at 929000-th learning:
	 shape = (128, 4),
	 mean = [0.21701035 0.1149629  0.29500717 0.3576841 ],
	  std = [0.5669925  0.5451474  0.52692324 0.5491397 ]
9.629999784752727
Episode 1328	Average Score: 8.79	Score: 9.63
actions batch at 930000-th learning:
	 shape = (128, 4),
	 mean = [0.30423886 0.178602   0.29266372 0.27365988],
	  std = [0.548604   0.54068536 0.5043444  0.5801913 ]
8.469999810680747
Episode 1329	Average Score: 8.81	Score: 8.47
11.76999973692
Episode 1330	Average Score: 8.81
actions batch at 931000-th learning:
	 shape = (128, 4),
	 mean = [0.25349033 0.23757483 0.2070541  0.27170172],
	  std = [0.5287001  0.5498625  0.48782885 0.54183334]
9.809999780729413
Episode 1331	Average Score: 8.79	Score: 9.81
actions batch at 932000-th learning:
	 shape = (128, 4),
	 mean = [0.25873026 0.23828357 0.2337237  0.2729233 ],
	  std = [0.56692475 0.56297386 0.49254948 0.5679489 ]
8.999999798834324
Episode 1332	Average Score: 8.80	Score: 9.00
9.329999791458249
Episode 1333	Average Score: 8.76	Score: 9.33
actions batch at 933000-th learning:
	 shape = (128, 4),
	 mean = [0.22693656 0.23964885 0.22495028 0.23959361],
	  std = [0.54925805 0.5358564  0.5183018  0.5447788 ]
12.949999710544944
Episode 1334	Average Score: 8.79	Score: 12.95
actions batch at 934000-th learning:
	 shape = (128, 4),
	 mean = [0.28638387 0.22942273 0.20149109 0.28745264],
	  std = [0.55319625 0.5430228  0.49401587 0.56370205]
12.659999717026949
Episode 1335	Average Score: 8.75	Score: 12.66
actions batch at 935000-th learning:
	 shape = (128, 4),
	 mean = [0.17957678 0.19131742 0.19616535 0.2732411 ],
	  std = [0.5409872  0.5350006  0.5082065  0.54024315]
14.569999674335122
Episode 1336	Average Score: 8.79	Score: 14.57
10.479999765753746
Episode 1337	Average Score: 8.80	Score: 10.48
actions batch at 936000-th learning:
	 shape = (128, 4),
	 mean = [0.23739107 0.28189966 0.25606725 0.3439594 ],
	  std = [0.5599259  0.58411646 0.53191864 0.5687234 ]
6.629999851807952
Episode 1338	Average Score: 8.79	Score: 6.63
actions batch at 937000-th learning:
	 shape = (128, 4),
	 mean = [0.26998731 0.18805827 0.2938053  0.2548766 ],
	  std = [0.54928154 0.5281338  0.50830275 0.5463273 ]
10.06999977491796
Episode 1339	Average Score: 8.76	Score: 10.07
10.409999767318368
Episode 1340	Average Score: 8.78
actions batch at 938000-th learning:
	 shape = (128, 4),
	 mean = [0.2832199  0.1782133  0.261602   0.32323584],
	  std = [0.5515602  0.5391457  0.49608314 0.5406674 ]
11.03999975323677
Episode 1341	Average Score: 8.81	Score: 11.04
actions batch at 939000-th learning:
	 shape = (128, 4),
	 mean = [0.18395199 0.21686172 0.1660869  0.23947524],
	  std = [0.55157995 0.5590592  0.49477223 0.54133356]
9.319999791681767
Episode 1342	Average Score: 8.79	Score: 9.32
10.33999976888299
Episode 1343	Average Score: 8.81	Score: 10.34
actions batch at 940000-th learning:
	 shape = (128, 4),
	 mean = [0.2241231  0.22821547 0.19955648 0.2074803 ],
	  std = [0.556525   0.5691057  0.51303315 0.54240125]
20.039999552071095
Episode 1344	Average Score: 8.92	Score: 20.04
actions batch at 941000-th learning:
	 shape = (128, 4),
	 mean = [0.22196688 0.19125153 0.2532099  0.30959082],
	  std = [0.5361658  0.54242235 0.5098384  0.55319715]
9.699999783188105
Episode 1345	Average Score: 8.89	Score: 9.70
actions batch at 942000-th learning:
	 shape = (128, 4),
	 mean = [0.26266703 0.28330222 0.28752667 0.33623013],
	  std = [0.552329   0.5841555  0.52962923 0.5464626 ]
9.63999978452921
Episode 1346	Average Score: 8.90	Score: 9.64
10.939999755471945
Episode 1347	Average Score: 8.91	Score: 10.94
actions batch at 943000-th learning:
	 shape = (128, 4),
	 mean = [0.22668423 0.21505529 0.19686812 0.18717001],
	  std = [0.54828304 0.54931116 0.51124567 0.56403905]
9.40999978967011
Episode 1348	Average Score: 8.88	Score: 9.41
actions batch at 944000-th learning:
	 shape = (128, 4),
	 mean = [0.26933482 0.2690816  0.27402472 0.29230967],
	  std = [0.57678306 0.6009116  0.54263043 0.5701273 ]
11.409999744966626
Episode 1349	Average Score: 8.88	Score: 11.41
8.299999814480543
Episode 1350	Average Score: 8.89
actions batch at 945000-th learning:
	 shape = (128, 4),
	 mean = [0.17765896 0.1991891  0.25899315 0.2828223 ],
	  std = [0.526049   0.5557981  0.50107694 0.5803387 ]
10.089999774470925
Episode 1351	Average Score: 8.90	Score: 10.09
actions batch at 946000-th learning:
	 shape = (128, 4),
	 mean = [0.26987547 0.2800791  0.16059564 0.23983851],
	  std = [0.54616356 0.5522127  0.5053542  0.5754666 ]
10.989999754354358
Episode 1352	Average Score: 8.94	Score: 10.99
9.229999793693423
Episode 1353	Average Score: 8.95	Score: 9.23
actions batch at 947000-th learning:
	 shape = (128, 4),
	 mean = [0.18118803 0.24282414 0.23191674 0.28741848],
	  std = [0.5532435  0.5195467  0.48950228 0.5843964 ]
6.839999847114086
Episode 1354	Average Score: 8.94	Score: 6.84
actions batch at 948000-th learning:
	 shape = (128, 4),
	 mean = [0.3799135  0.30304238 0.33041185 0.2527464 ],
	  std = [0.5628227  0.5600489  0.52257484 0.555312  ]
10.56999976374209
Episode 1355	Average Score: 8.97	Score: 10.57
actions batch at 949000-th learning:
	 shape = (128, 4),
	 mean = [0.36687577 0.21645582 0.27111852 0.24584673],
	  std = [0.56908584 0.54600966 0.5185966  0.55539453]
9.759999781847
Episode 1356	Average Score: 8.99	Score: 9.76
10.99999975413084
Episode 1357	Average Score: 9.01	Score: 11.00
actions batch at 950000-th learning:
	 shape = (128, 4),
	 mean = [0.2816097  0.22827585 0.18592963 0.2688341 ],
	  std = [0.5480836  0.5401021  0.48179588 0.55995053]
8.449999811127782
Episode 1358	Average Score: 9.01	Score: 8.45
actions batch at 951000-th learning:
	 shape = (128, 4),
	 mean = [0.3185884  0.26971796 0.1804517  0.25586075],
	  std = [0.5367299 0.5579415 0.4856638 0.532822 ]
9.999999776482582
Episode 1359	Average Score: 8.95	Score: 10.00
11.079999752342701
Episode 1360	Average Score: 8.94
actions batch at 952000-th learning:
	 shape = (128, 4),
	 mean = [0.2760699  0.236475   0.20035933 0.22730818],
	  std = [0.54114854 0.5569614  0.48135868 0.5522355 ]
13.43999969959259
Episode 1361	Average Score: 8.97	Score: 13.44
actions batch at 953000-th learning:
	 shape = (128, 4),
	 mean = [0.3653929  0.21419898 0.32910836 0.2790365 ],
	  std = [0.56439334 0.5571965  0.52781236 0.5211069 ]
6.839999847114086
Episode 1362	Average Score: 8.97	Score: 6.84
8.579999808222055
Episode 1363	Average Score: 8.97	Score: 8.58
actions batch at 954000-th learning:
	 shape = (128, 4),
	 mean = [0.23859334 0.15002957 0.19132915 0.23210381],
	  std = [0.5454528  0.55463487 0.5049366  0.57452875]
7.27999983727932
Episode 1364	Average Score: 8.98	Score: 7.28
actions batch at 955000-th learning:
	 shape = (128, 4),
	 mean = [0.22481087 0.24164999 0.22853167 0.23576646],
	  std = [0.54722875 0.55632484 0.50368726 0.5361883 ]
6.139999862760305
Episode 1365	Average Score: 8.97	Score: 6.14
actions batch at 956000-th learning:
	 shape = (128, 4),
	 mean = [0.28711587 0.22503902 0.22290069 0.17838748],
	  std = [0.5569354 0.5389868 0.508321  0.5337946]
6.869999846443534
Episode 1366	Average Score: 8.96	Score: 6.87
11.5399997420609
Episode 1367	Average Score: 8.98	Score: 11.54
actions batch at 957000-th learning:
	 shape = (128, 4),
	 mean = [0.2337775  0.16186349 0.2572756  0.27568275],
	  std = [0.53969586 0.53325444 0.52516234 0.53378266]
9.919999778270721
Episode 1368	Average Score: 8.97	Score: 9.92
actions batch at 958000-th learning:
	 shape = (128, 4),
	 mean = [0.28499216 0.23128408 0.18090463 0.25183588],
	  std = [0.548728  0.5259036 0.4881617 0.5770527]
9.239999793469906
Episode 1369	Average Score: 9.01	Score: 9.24
8.659999806433916
Episode 1370	Average Score: 8.99
actions batch at 959000-th learning:
	 shape = (128, 4),
	 mean = [0.1825782  0.26220334 0.11996321 0.21505535],
	  std = [0.5220952  0.55413944 0.50911033 0.542078  ]
16.819999624043703
Episode 1371	Average Score: 9.08	Score: 16.82
actions batch at 960000-th learning:
	 shape = (128, 4),
	 mean = [0.27973256 0.22788778 0.23255624 0.2709083 ],
	  std = [0.5620978  0.5549892  0.51538414 0.5558911 ]
13.379999700933695
Episode 1372	Average Score: 9.13	Score: 13.38
10.789999758824706
Episode 1373	Average Score: 9.21	Score: 10.79
actions batch at 961000-th learning:
	 shape = (128, 4),
	 mean = [0.30973643 0.23328131 0.2344278  0.34538898],
	  std = [0.5649952  0.5548759  0.54625624 0.58684164]
10.579999763518572
Episode 1374	Average Score: 9.24	Score: 10.58
actions batch at 962000-th learning:
	 shape = (128, 4),
	 mean = [0.29306746 0.12007383 0.19512692 0.2719138 ],
	  std = [0.552696   0.49757347 0.49520597 0.5795572 ]
10.079999774694443
Episode 1375	Average Score: 9.25	Score: 10.08actions batch at 963000-th learning:
	 shape = (128, 4),
	 mean = [0.27494332 0.23973285 0.30959076 0.33370206],
	  std = [0.5663586 0.5580371 0.5384857 0.5674365]
7.889999823644757
Episode 1376	Average Score: 9.25	Score: 7.8910.68999976105988
Episode 1377	Average Score: 9.22	Score: 10.69actions batch at 964000-th learning:
	 shape = (128, 4),
	 mean = [0.22708748 0.25385725 0.1812683  0.2496731 ],
	  std = [0.55689263 0.5458734  0.5156415  0.5375183 ]
9.879999779164791
Episode 1378	Average Score: 9.19	Score: 9.88actions batch at 965000-th learning:
	 shape = (128, 4),
	 mean = [0.30733427 0.2613274  0.1381647  0.20121174],
	  std = [0.5727525  0.5474637  0.49361283 0.5449815 ]
8.889999801293015
Episode 1379	Average Score: 9.22	Score: 8.897.299999836832285
Episode 1380	Average Score: 9.22
actions batch at 966000-th learning:
	 shape = (128, 4),
	 mean = [0.20745827 0.16921268 0.2584487  0.27361745],
	  std = [0.5572076  0.55898756 0.5078163  0.5519882 ]
7.519999831914902
Episode 1381	Average Score: 9.20	Score: 7.52actions batch at 967000-th learning:
	 shape = (128, 4),
	 mean = [0.19395213 0.05523591 0.1961117  0.30676177],
	  std = [0.5607275  0.48456788 0.5052644  0.5497292 ]
8.309999814257026
Episode 1382	Average Score: 9.20	Score: 8.318.16999981738627
Episode 1383	Average Score: 9.19	Score: 8.17actions batch at 968000-th learning:
	 shape = (128, 4),
	 mean = [0.28001276 0.18372463 0.25394872 0.2507916 ],
	  std = [0.5359259  0.53666526 0.5050644  0.54151696]
10.519999764859676
Episode 1384	Average Score: 9.22	Score: 10.52actions batch at 969000-th learning:
	 shape = (128, 4),
	 mean = [0.32176408 0.2797194  0.18856135 0.28990376],
	  std = [0.5515483  0.53470725 0.48149377 0.5679316 ]
10.189999772235751
Episode 1385	Average Score: 9.21	Score: 10.19actions batch at 970000-th learning:
	 shape = (128, 4),
	 mean = [0.23721762 0.28285927 0.28865036 0.25716543],
	  std = [0.561026   0.56201947 0.51987046 0.54528886]
9.629999784752727
Episode 1386	Average Score: 9.25	Score: 9.637.569999830797315
Episode 1387	Average Score: 9.25	Score: 7.57actions batch at 971000-th learning:
	 shape = (128, 4),
	 mean = [0.29507086 0.20622997 0.18776791 0.21418874],
	  std = [0.5605066  0.5418679  0.48166293 0.5526241 ]
3.569999920204282
Episode 1388	Average Score: 9.19	Score: 3.57actions batch at 972000-th learning:
	 shape = (128, 4),
	 mean = [0.33981425 0.22908452 0.1904657  0.205685  ],
	  std = [0.56509745 0.54008925 0.46649808 0.5251007 ]
11.5399997420609
Episode 1389	Average Score: 9.21	Score: 11.5410.33999976888299
Episode 1390	Average Score: 9.25
actions batch at 973000-th learning:
	 shape = (128, 4),
	 mean = [0.30815098 0.18786608 0.23175868 0.28261515],
	  std = [0.5667441  0.5453439  0.509454   0.56141067]
10.349999768659472
Episode 1391	Average Score: 9.30	Score: 10.35actions batch at 974000-th learning:
	 shape = (128, 4),
	 mean = [0.25257257 0.23051478 0.20403956 0.28932106],
	  std = [0.5903083  0.5505536  0.49082154 0.5833619 ]
9.539999786764383
Episode 1392	Average Score: 9.30	Score: 9.549.319999791681767
Episode 1393	Average Score: 9.32	Score: 9.32actions batch at 975000-th learning:
	 shape = (128, 4),
	 mean = [0.1975383  0.18972093 0.18757942 0.3222835 ],
	  std = [0.52709156 0.5276888  0.5128783  0.5424817 ]
10.25999977067113
Episode 1394	Average Score: 9.34	Score: 10.26actions batch at 976000-th learning:
	 shape = (128, 4),
	 mean = [0.25114456 0.28809172 0.23475108 0.26710364],
	  std = [0.5663527  0.5464464  0.54214984 0.5747097 ]
5.559999875724316
Episode 1395	Average Score: 9.30	Score: 5.56actions batch at 977000-th learning:
	 shape = (128, 4),
	 mean = [0.2334781  0.22703935 0.2222312  0.30704233],
	  std = [0.55893254 0.5354952  0.49459818 0.5628337 ]
10.509999765083194
Episode 1396	Average Score: 9.30	Score: 10.5110.749999759718776
Episode 1397	Average Score: 9.34	Score: 10.75actions batch at 978000-th learning:
	 shape = (128, 4),
	 mean = [0.20342799 0.17544699 0.2150079  0.2162512 ],
	  std = [0.54220235 0.53596735 0.50311947 0.5460538 ]
11.92999973334372
Episode 1398	Average Score: 9.41	Score: 11.93actions batch at 979000-th learning:
	 shape = (128, 4),
	 mean = [0.25876775 0.22216636 0.2303117  0.21522063],
	  std = [0.5440497  0.5510824  0.53720886 0.55736476]
11.159999750554562
Episode 1399	Average Score: 9.45	Score: 11.169.119999796152115
Episode 1400	Average Score: 9.46
actions batch at 980000-th learning:
	 shape = (128, 4),
	 mean = [0.20397681 0.17462878 0.22680908 0.24854119],
	  std = [0.5305982  0.5340261  0.5229196  0.54076636]
7.079999841749668
Episode 1401	Average Score: 9.48	Score: 7.08actions batch at 981000-th learning:
	 shape = (128, 4),
	 mean = [0.15363352 0.20156741 0.1610871  0.2995962 ],
	  std = [0.52770174 0.54003924 0.50212836 0.5553291 ]
10.329999769106507
Episode 1402	Average Score: 9.48	Score: 10.338.319999814033508
Episode 1403	Average Score: 9.49	Score: 8.32actions batch at 982000-th learning:
	 shape = (128, 4),
	 mean = [0.22826391 0.20937793 0.21819502 0.24382646],
	  std = [0.5665871  0.55711067 0.5137376  0.5519947 ]
9.339999791234732
Episode 1404	Average Score: 9.52	Score: 9.34actions batch at 983000-th learning:
	 shape = (128, 4),
	 mean = [0.25352138 0.27833664 0.26368722 0.24155596],
	  std = [0.5533809  0.5544727  0.49975044 0.55476904]
9.21999979391694
Episode 1405	Average Score: 9.54	Score: 9.22actions batch at 984000-th learning:
	 shape = (128, 4),
	 mean = [0.2840904  0.1481185  0.25828755 0.27652398],
	  std = [0.5458124  0.51534986 0.503155   0.56989396]
6.419999856501818
Episode 1406	Average Score: 9.54	Score: 6.4212.959999710321426
Episode 1407	Average Score: 9.51	Score: 12.96actions batch at 985000-th learning:
	 shape = (128, 4),
	 mean = [0.29753336 0.20853834 0.19540626 0.2504957 ],
	  std = [0.5189842  0.54719734 0.47888255 0.5167275 ]
10.399999767541885
Episode 1408	Average Score: 9.56	Score: 10.40actions batch at 986000-th learning:
	 shape = (128, 4),
	 mean = [0.1589467  0.15129106 0.13460629 0.22841398],
	  std = [0.5131383  0.5462099  0.47423297 0.5544498 ]
7.6199998296797276
Episode 1409	Average Score: 9.57	Score: 7.629.789999781176448
Episode 1410	Average Score: 9.60
actions batch at 987000-th learning:
	 shape = (128, 4),
	 mean = [0.2514782  0.28755048 0.3027585  0.183565  ],
	  std = [0.55911446 0.56640357 0.52553093 0.56731707]
9.83999978005886
Episode 1411	Average Score: 9.62	Score: 9.84actions batch at 988000-th learning:
	 shape = (128, 4),
	 mean = [0.25999954 0.17487472 0.23099029 0.24567251],
	  std = [0.56350803 0.49829137 0.49496576 0.5592264 ]
8.2099998164922
Episode 1412	Average Score: 9.61	Score: 8.219.569999786093831
Episode 1413	Average Score: 9.61	Score: 9.57actions batch at 989000-th learning:
	 shape = (128, 4),
	 mean = [0.24604742 0.18048385 0.23858011 0.12379097],
	  std = [0.5500389 0.5258553 0.5186184 0.5090495]
12.649999717250466
Episode 1414	Average Score: 9.66	Score: 12.65actions batch at 990000-th learning:
	 shape = (128, 4),
	 mean = [0.26660708 0.17327365 0.22874348 0.37494382],
	  std = [0.5246048  0.5285299  0.51646084 0.5565509 ]
8.179999817162752
Episode 1415	Average Score: 9.67	Score: 8.18actions batch at 991000-th learning:
	 shape = (128, 4),
	 mean = [0.22766712 0.1287495  0.22254437 0.27800122],
	  std = [0.5374041 0.5001798 0.5219309 0.5552791]
9.589999785646796
Episode 1416	Average Score: 9.67	Score: 9.598.16999981738627
Episode 1417	Average Score: 9.67	Score: 8.17actions batch at 992000-th learning:
	 shape = (128, 4),
	 mean = [0.25269878 0.116864   0.21915446 0.31176955],
	  std = [0.54731596 0.54063725 0.4774659  0.5672536 ]
8.719999805092812
Episode 1418	Average Score: 9.69	Score: 8.72actions batch at 993000-th learning:
	 shape = (128, 4),
	 mean = [0.25954846 0.24653034 0.17329447 0.21476763],
	  std = [0.53397614 0.55751574 0.49054307 0.55617326]
5.929999867454171
Episode 1419	Average Score: 9.65	Score: 5.938.979999799281359
Episode 1420	Average Score: 9.70
actions batch at 994000-th learning:
	 shape = (128, 4),
	 mean = [0.21745001 0.18736222 0.19531706 0.25439477],
	  std = [0.53093153 0.5460743  0.48333052 0.5396689 ]
6.229999860748649
Episode 1421	Average Score: 9.66	Score: 6.23actions batch at 995000-th learning:
	 shape = (128, 4),
	 mean = [0.22980288 0.19102448 0.21569213 0.25444144],
	  std = [0.5566353  0.53935313 0.47098434 0.5430079 ]
7.959999822080135
Episode 1422	Average Score: 9.68	Score: 7.968.12999981828034
Episode 1423	Average Score: 9.67	Score: 8.13actions batch at 996000-th learning:
	 shape = (128, 4),
	 mean = [0.2511989  0.18801971 0.2175414  0.27283183],
	  std = [0.54459435 0.5421458  0.50971    0.5254838 ]
11.279999747872353
Episode 1424	Average Score: 9.69	Score: 11.28actions batch at 997000-th learning:
	 shape = (128, 4),
	 mean = [0.33505943 0.13072506 0.2544993  0.24706613],
	  std = [0.54783076 0.5272244  0.48952344 0.56275344]
9.869999779388309
Episode 1425	Average Score: 9.68	Score: 9.87actions batch at 998000-th learning:
	 shape = (128, 4),
	 mean = [0.25902832 0.16984408 0.26670784 0.30580455],
	  std = [0.52427167 0.52468336 0.49162978 0.56325233]
10.87999975681305
Episode 1426	Average Score: 9.69	Score: 10.888.189999816939235
Episode 1427	Average Score: 9.68	Score: 8.19actions batch at 999000-th learning:
	 shape = (128, 4),
	 mean = [0.23930931 0.3337119  0.39717793 0.2251981 ],
	  std = [0.55033857 0.5649073  0.516174   0.569615  ]
5.889999868348241
Episode 1428	Average Score: 9.65	Score: 5.89actions batch at 1000000-th learning:
	 shape = (128, 4),
	 mean = [0.2869651  0.26113087 0.23377676 0.28475264],
	  std = [0.5343132  0.55800533 0.517993   0.5396849 ]
9.599999785423279
Episode 1429	Average Score: 9.66	Score: 9.608.059999819844961
Episode 1430	Average Score: 9.62
actions batch at 1001000-th learning:
	 shape = (128, 4),
	 mean = [0.29043084 0.2120023  0.21020262 0.2591753 ],
	  std = [0.55606997 0.5260799  0.5173716  0.55089056]
9.209999794140458
Episode 1431	Average Score: 9.62	Score: 9.21actions batch at 1002000-th learning:
	 shape = (128, 4),
	 mean = [0.33001435 0.1844344  0.20286931 0.27921784],
	  std = [0.54757243 0.5000052  0.5038283  0.55109143]
9.159999795258045
Episode 1432	Average Score: 9.62	Score: 9.168.009999820962548
Episode 1433	Average Score: 9.60	Score: 8.01actions batch at 1003000-th learning:
	 shape = (128, 4),
	 mean = [0.21122004 0.24133787 0.2562716  0.37559408],
	  std = [0.5349229  0.54345965 0.4870789  0.56265485]
9.149999795481563
Episode 1434	Average Score: 9.57	Score: 9.15actions batch at 1004000-th learning:
	 shape = (128, 4),
	 mean = [0.2637958  0.1880006  0.26741236 0.24622583],
	  std = [0.5464154  0.5250097  0.5073103  0.55794346]
7.58999983035028
Episode 1435	Average Score: 9.52	Score: 7.59actions batch at 1005000-th learning:
	 shape = (128, 4),
	 mean = [0.338705   0.27975664 0.22355439 0.28373906],
	  std = [0.5616181 0.5774419 0.5067666 0.5532565]
7.27999983727932
Episode 1436	Average Score: 9.44	Score: 7.2810.789999758824706
Episode 1437	Average Score: 9.45	Score: 10.79actions batch at 1006000-th learning:
	 shape = (128, 4),
	 mean = [0.26892868 0.33289462 0.24996333 0.2793349 ],
	  std = [0.5506899  0.5667642  0.474678   0.57424974]
8.609999807551503
Episode 1438	Average Score: 9.47	Score: 8.61actions batch at 1007000-th learning:
	 shape = (128, 4),
	 mean = [0.27151525 0.21652985 0.23183168 0.2518826 ],
	  std = [0.5430996  0.5521971  0.49211627 0.5233629 ]
10.009999776259065
Episode 1439	Average Score: 9.46	Score: 10.018.459999810904264
Episode 1440	Average Score: 9.44
actions batch at 1008000-th learning:
	 shape = (128, 4),
	 mean = [0.27109507 0.20505998 0.18294251 0.30817345],
	  std = [0.5487405  0.5239279  0.5044149  0.53926075]
7.299999836832285
Episode 1441	Average Score: 9.41	Score: 7.30actions batch at 1009000-th learning:
	 shape = (128, 4),
	 mean = [0.2596801  0.20925488 0.19750032 0.22086787],
	  std = [0.562848   0.5493013  0.47886378 0.56599194]
10.349999768659472
Episode 1442	Average Score: 9.42	Score: 10.3514.729999670758843
Episode 1443	Average Score: 9.46	Score: 14.73actions batch at 1010000-th learning:
	 shape = (128, 4),
	 mean = [0.3490926  0.2783057  0.27859563 0.2531677 ],
	  std = [0.5451075  0.5711295  0.50032175 0.5398817 ]
10.249999770894647
Episode 1444	Average Score: 9.36	Score: 10.25actions batch at 1011000-th learning:
	 shape = (128, 4),
	 mean = [0.28809845 0.18562806 0.16379048 0.3185355 ],
	  std = [0.5456424 0.5419746 0.4756763 0.5630256]
11.26999974809587
Episode 1445	Average Score: 9.38	Score: 11.27actions batch at 1012000-th learning:
	 shape = (128, 4),
	 mean = [0.2505216  0.11761129 0.26159132 0.30644986],
	  std = [0.55672497 0.52107245 0.506528   0.5696821 ]
12.889999711886048
Episode 1446	Average Score: 9.41	Score: 12.897.6199998296797276
Episode 1447	Average Score: 9.38	Score: 7.62actions batch at 1013000-th learning:
	 shape = (128, 4),
	 mean = [0.2788251  0.26072198 0.26776028 0.24914277],
	  std = [0.5392509  0.5285102  0.50749964 0.5257409 ]
11.829999735578895
Episode 1448	Average Score: 9.40	Score: 11.83actions batch at 1014000-th learning:
	 shape = (128, 4),
	 mean = [0.3561317  0.19073279 0.3360214  0.27781075],
	  std = [0.55766386 0.5194749  0.5164755  0.5618388 ]
10.079999774694443
Episode 1449	Average Score: 9.39	Score: 10.0811.219999749213457
Episode 1450	Average Score: 9.42
actions batch at 1015000-th learning:
	 shape = (128, 4),
	 mean = [0.21376944 0.21277189 0.24347234 0.16674575],
	  std = [0.5382483  0.55601305 0.51184016 0.5131639 ]
15.429999655112624
Episode 1451	Average Score: 9.47	Score: 15.43actions batch at 1016000-th learning:
	 shape = (128, 4),
	 mean = [0.34163454 0.2690025  0.27236444 0.29659972],
	  std = [0.56030685 0.5697844  0.5256373  0.5442017 ]
6.789999848231673
Episode 1452	Average Score: 9.43	Score: 6.796.649999851360917
Episode 1453	Average Score: 9.40	Score: 6.65actions batch at 1017000-th learning:
	 shape = (128, 4),
	 mean = [0.26856133 0.13177347 0.27223602 0.24720429],
	  std = [0.5573591  0.53566384 0.49706244 0.5654749 ]
0.2799999937415123
Episode 1454	Average Score: 9.34	Score: 0.28actions batch at 1018000-th learning:
	 shape = (128, 4),
	 mean = [0.2394461  0.17933168 0.18960287 0.29736593],
	  std = [0.56563    0.55933017 0.47212422 0.5342715 ]
7.039999842643738
Episode 1455	Average Score: 9.30	Score: 7.04actions batch at 1019000-th learning:
	 shape = (128, 4),
	 mean = [0.2507205  0.22817042 0.26035693 0.34860826],
	  std = [0.56493294 0.5309988  0.5054316  0.5648856 ]
7.679999828338623
Episode 1456	Average Score: 9.28	Score: 7.686.419999856501818
Episode 1457	Average Score: 9.24	Score: 6.42actions batch at 1020000-th learning:
	 shape = (128, 4),
	 mean = [0.23961544 0.17980586 0.2698242  0.22507428],
	  std = [0.54667693 0.5448043  0.5258877  0.5504263 ]
8.649999806657434
Episode 1458	Average Score: 9.24	Score: 8.65actions batch at 1021000-th learning:
	 shape = (128, 4),
	 mean = [0.2917767  0.26824957 0.25518736 0.2796342 ],
	  std = [0.57928944 0.56363195 0.48953143 0.55793494]
8.779999803751707
Episode 1459	Average Score: 9.23	Score: 8.7810.619999762624502
Episode 1460	Average Score: 9.22
actions batch at 1022000-th learning:
	 shape = (128, 4),
	 mean = [0.2516183  0.22894037 0.197533   0.38795513],
	  std = [0.5254642 0.5145689 0.4931344 0.5724726]
10.799999758601189
Episode 1461	Average Score: 9.20	Score: 10.80actions batch at 1023000-th learning:
	 shape = (128, 4),
	 mean = [0.30575487 0.18652138 0.34314865 0.22066921],
	  std = [0.5228941  0.5294607  0.4886765  0.54607797]
15.429999655112624
Episode 1462	Average Score: 9.28	Score: 15.4310.019999776035547
Episode 1463	Average Score: 9.30	Score: 10.02actions batch at 1024000-th learning:
	 shape = (128, 4),
	 mean = [0.26406127 0.20842573 0.15513831 0.27709743],
	  std = [0.5558242  0.5443068  0.49083874 0.52935505]
9.889999778941274
Episode 1464	Average Score: 9.32	Score: 9.89actions batch at 1025000-th learning:
	 shape = (128, 4),
	 mean = [0.2536041  0.14026909 0.17109713 0.29540005],
	  std = [0.5455502  0.56251204 0.49634668 0.5564419 ]
9.299999792128801
Episode 1465	Average Score: 9.35	Score: 9.30actions batch at 1026000-th learning:
	 shape = (128, 4),
	 mean = [0.30542985 0.20224702 0.26302972 0.30476475],
	  std = [0.5579514  0.5305849  0.48857418 0.53554255]
23.54999947361648
Episode 1466	Average Score: 9.52	Score: 23.5510.349999768659472
Episode 1467	Average Score: 9.51	Score: 10.35actions batch at 1027000-th learning:
	 shape = (128, 4),
	 mean = [0.25831875 0.26688612 0.2743564  0.23112935],
	  std = [0.54745936 0.56745774 0.5155567  0.5683421 ]
9.069999797269702
Episode 1468	Average Score: 9.50	Score: 9.07actions batch at 1028000-th learning:
	 shape = (128, 4),
	 mean = [0.23105073 0.21670216 0.24145843 0.25021175],
	  std = [0.55907774 0.51891166 0.48933473 0.54711044]
11.819999735802412
Episode 1469	Average Score: 9.53	Score: 11.827.649999829009175
Episode 1470	Average Score: 9.52
actions batch at 1029000-th learning:
	 shape = (128, 4),
	 mean = [0.28584233 0.13477597 0.23242581 0.30007276],
	  std = [0.54114956 0.54493415 0.5047379  0.56020504]
10.599999763071537
Episode 1471	Average Score: 9.45	Score: 10.60actions batch at 1030000-th learning:
	 shape = (128, 4),
	 mean = [0.3311445  0.25609803 0.29460117 0.2323041 ],
	  std = [0.5524324  0.51968133 0.5097766  0.5191981 ]
12.969999710097909
Episode 1472	Average Score: 9.45	Score: 12.978.679999805986881
Episode 1473	Average Score: 9.43	Score: 8.68actions batch at 1031000-th learning:
	 shape = (128, 4),
	 mean = [0.25758535 0.16231593 0.2141124  0.25685158],
	  std = [0.5497863  0.5057524  0.48435792 0.5563055 ]
15.60999965108931
Episode 1474	Average Score: 9.48	Score: 15.61actions batch at 1032000-th learning:
	 shape = (128, 4),
	 mean = [0.2685958  0.1876661  0.23830588 0.27571672],
	  std = [0.57991755 0.5356836  0.49974456 0.5442502 ]
10.169999772682786
Episode 1475	Average Score: 9.48	Score: 10.17actions batch at 1033000-th learning:
	 shape = (128, 4),
	 mean = [0.18077753 0.26082933 0.28183764 0.22788258],
	  std = [0.52428496 0.5462453  0.50325996 0.5551965 ]
10.079999774694443
Episode 1476	Average Score: 9.50	Score: 10.0811.919999733567238
Episode 1477	Average Score: 9.51	Score: 11.92actions batch at 1034000-th learning:
	 shape = (128, 4),
	 mean = [0.14204082 0.2559346  0.244799   0.3136681 ],
	  std = [0.54506046 0.5650076  0.5034838  0.53659415]
9.669999783858657
Episode 1478	Average Score: 9.51	Score: 9.67actions batch at 1035000-th learning:
	 shape = (128, 4),
	 mean = [0.3315913  0.27906087 0.34949696 0.35087904],
	  std = [0.58602625 0.5078144  0.47546905 0.5522552 ]
10.669999761506915
Episode 1479	Average Score: 9.53	Score: 10.6711.759999737143517
Episode 1480	Average Score: 9.57
actions batch at 1036000-th learning:
	 shape = (128, 4),
	 mean = [0.22543299 0.19881712 0.19403061 0.40756994],
	  std = [0.5678743  0.5215691  0.50105983 0.5524608 ]
15.519999653100967
Episode 1481	Average Score: 9.65	Score: 15.52actions batch at 1037000-th learning:
	 shape = (128, 4),
	 mean = [0.20392759 0.3109042  0.20795755 0.20906669],
	  std = [0.5510669  0.5608102  0.48379615 0.5337019 ]
10.899999756366014
Episode 1482	Average Score: 9.68	Score: 10.9014.539999675005674
Episode 1483	Average Score: 9.74	Score: 14.54actions batch at 1038000-th learning:
	 shape = (128, 4),
	 mean = [0.18571389 0.21874408 0.18014817 0.31726503],
	  std = [0.5270873  0.5328892  0.49690256 0.5536939 ]
14.659999672323465
Episode 1484	Average Score: 9.79	Score: 14.66actions batch at 1039000-th learning:
	 shape = (128, 4),
	 mean = [0.2793457  0.28494895 0.28498727 0.24619988],
	  std = [0.55094916 0.5570381  0.5324816  0.5590449 ]
16.57999962940812
Episode 1485	Average Score: 9.85	Score: 16.58actions batch at 1040000-th learning:
	 shape = (128, 4),
	 mean = [0.20221585 0.15751117 0.23973793 0.34904534],
	  std = [0.52628225 0.55427617 0.5487594  0.5502734 ]
10.629999762400985
Episode 1486	Average Score: 9.86	Score: 10.639.599999785423279
Episode 1487	Average Score: 9.88	Score: 9.60actions batch at 1041000-th learning:
	 shape = (128, 4),
	 mean = [0.28860068 0.22891952 0.21325244 0.24916953],
	  std = [0.5763455  0.55078506 0.48816758 0.56653047]
10.209999771788716
Episode 1488	Average Score: 9.95	Score: 10.21actions batch at 1042000-th learning:
	 shape = (128, 4),
	 mean = [0.32188618 0.3087387  0.22928657 0.25253922],
	  std = [0.56716394 0.55759263 0.4830044  0.57449377]
7.649999829009175
Episode 1489	Average Score: 9.91	Score: 7.658.149999817833304
Episode 1490	Average Score: 9.89
actions batch at 1043000-th learning:
	 shape = (128, 4),
	 mean = [0.25119838 0.15165673 0.23869313 0.22830018],
	  std = [0.55380046 0.52504396 0.51218706 0.54792255]
8.989999799057841
Episode 1491	Average Score: 9.87	Score: 8.99actions batch at 1044000-th learning:
	 shape = (128, 4),
	 mean = [0.35688776 0.1284411  0.24715061 0.27006394],
	  std = [0.5526832 0.5168489 0.5082513 0.5732112]
16.169999638572335
Episode 1492	Average Score: 9.94	Score: 16.1711.88999973423779
Episode 1493	Average Score: 9.96	Score: 11.89actions batch at 1045000-th learning:
	 shape = (128, 4),
	 mean = [0.2500739  0.26786262 0.279647   0.2882517 ],
	  std = [0.5565713  0.56033164 0.50990504 0.55657405]
6.6599998511374
Episode 1494	Average Score: 9.93	Score: 6.66actions batch at 1046000-th learning:
	 shape = (128, 4),
	 mean = [0.27917644 0.23192942 0.2852891  0.33857152],
	  std = [0.5540937  0.54144984 0.536673   0.5254096 ]
9.559999786317348
Episode 1495	Average Score: 9.97	Score: 9.56actions batch at 1047000-th learning:
	 shape = (128, 4),
	 mean = [0.35410792 0.19210845 0.2196059  0.27369848],
	  std = [0.56059766 0.54454064 0.4732194  0.5405354 ]
8.82999980263412
Episode 1496	Average Score: 9.95	Score: 8.8310.619999762624502
Episode 1497	Average Score: 9.95	Score: 10.62actions batch at 1048000-th learning:
	 shape = (128, 4),
	 mean = [0.27843556 0.23266114 0.15448757 0.22937518],
	  std = [0.54064065 0.57799685 0.4816596  0.5225572 ]
7.04999984242022
Episode 1498	Average Score: 9.90	Score: 7.05actions batch at 1049000-th learning:
	 shape = (128, 4),
	 mean = [0.27993426 0.26384392 0.2574133  0.34733436],
	  std = [0.54745656 0.5668216  0.49209896 0.56042063]
9.209999794140458
Episode 1499	Average Score: 9.88	Score: 9.218.809999803081155
Episode 1500	Average Score: 9.88
actions batch at 1050000-th learning:
	 shape = (128, 4),
	 mean = [0.30436692 0.19762978 0.3158415  0.29560137],
	  std = [0.5733111  0.5327793  0.50389856 0.5380404 ]
8.509999809786677
Episode 1501	Average Score: 9.89	Score: 8.51actions batch at 1051000-th learning:
	 shape = (128, 4),
	 mean = [0.18902005 0.22888535 0.1991123  0.30457926],
	  std = [0.5260183 0.5308646 0.5060029 0.5507369]
8.099999818950891
Episode 1502	Average Score: 9.87	Score: 8.107.439999833703041
Episode 1503	Average Score: 9.86	Score: 7.44actions batch at 1052000-th learning:
	 shape = (128, 4),
	 mean = [0.26443562 0.3128849  0.13335907 0.29220426],
	  std = [0.56435084 0.5602909  0.4649829  0.5599958 ]
9.759999781847
Episode 1504	Average Score: 9.87	Score: 9.76actions batch at 1053000-th learning:
	 shape = (128, 4),
	 mean = [0.29468924 0.18231408 0.26642305 0.25051734],
	  std = [0.5669963  0.5011105  0.50255245 0.5570162 ]
9.289999792352319
Episode 1505	Average Score: 9.87	Score: 9.29actions batch at 1054000-th learning:
	 shape = (128, 4),
	 mean = [0.24673456 0.17362644 0.25578147 0.23915024],
	  std = [0.57513267 0.5090061  0.51718605 0.5790134 ]
9.309999791905284
Episode 1506	Average Score: 9.89	Score: 9.3111.669999739155173
Episode 1507	Average Score: 9.88	Score: 11.67actions batch at 1055000-th learning:
	 shape = (128, 4),
	 mean = [0.23098427 0.15796138 0.26130736 0.2609206 ],
	  std = [0.53549016 0.5275068  0.4971114  0.53724194]
12.959999710321426
Episode 1508	Average Score: 9.91	Score: 12.96actions batch at 1056000-th learning:
	 shape = (128, 4),
	 mean = [0.29521412 0.23851882 0.24053518 0.26947832],
	  std = [0.5539826  0.51895636 0.48454624 0.566232  ]
11.529999742284417
Episode 1509	Average Score: 9.95	Score: 11.539.879999779164791
Episode 1510	Average Score: 9.95
actions batch at 1057000-th learning:
	 shape = (128, 4),
	 mean = [0.33882892 0.2511645  0.22049302 0.22514312],
	  std = [0.5488807  0.52421325 0.50855994 0.5452514 ]
9.379999790340662
Episode 1511	Average Score: 9.94	Score: 9.38actions batch at 1058000-th learning:
	 shape = (128, 4),
	 mean = [0.35822037 0.20496236 0.236488   0.2629781 ],
	  std = [0.56613   0.5403028 0.5128581 0.5255802]
20.579999540001154
Episode 1512	Average Score: 10.07	Score: 20.5811.399999745190144
Episode 1513	Average Score: 10.09	Score: 11.40actions batch at 1059000-th learning:
	 shape = (128, 4),
	 mean = [0.24306431 0.20564432 0.23310637 0.301732  ],
	  std = [0.5642723  0.54781014 0.50228983 0.5743277 ]
9.09999979659915
Episode 1514	Average Score: 10.05	Score: 9.10actions batch at 1060000-th learning:
	 shape = (128, 4),
	 mean = [0.27889422 0.27243683 0.2515984  0.2609929 ],
	  std = [0.5558582  0.54232585 0.4910184  0.55453986]
10.169999772682786
Episode 1515	Average Score: 10.07	Score: 10.17actions batch at 1061000-th learning:
	 shape = (128, 4),
	 mean = [0.25256553 0.15255862 0.19017771 0.20765051],
	  std = [0.5474028  0.50430423 0.5017283  0.5528585 ]
9.5299997869879
Episode 1516	Average Score: 10.07	Score: 9.539.63999978452921
Episode 1517	Average Score: 10.08	Score: 9.64actions batch at 1062000-th learning:
	 shape = (128, 4),
	 mean = [0.2912815  0.23979217 0.25341958 0.2689301 ],
	  std = [0.5466142  0.54014844 0.5002444  0.53979445]
8.309999814257026
Episode 1518	Average Score: 10.08	Score: 8.31actions batch at 1063000-th learning:
	 shape = (128, 4),
	 mean = [0.23341444 0.20120752 0.23531902 0.34515837],
	  std = [0.55471873 0.5209034  0.4901383  0.54564303]
6.109999863430858
Episode 1519	Average Score: 10.08	Score: 6.119.919999778270721
Episode 1520	Average Score: 10.09
actions batch at 1064000-th learning:
	 shape = (128, 4),
	 mean = [0.29033986 0.29093277 0.2839997  0.18821806],
	  std = [0.5295872 0.555291  0.511487  0.5416799]
7.949999822303653
Episode 1521	Average Score: 10.11	Score: 7.95actions batch at 1065000-th learning:
	 shape = (128, 4),
	 mean = [0.34789225 0.19422899 0.19423547 0.30183303],
	  std = [0.5549575 0.5354502 0.4767609 0.5392624]
7.689999828115106
Episode 1522	Average Score: 10.11	Score: 7.6911.429999744519591
Episode 1523	Average Score: 10.14	Score: 11.43actions batch at 1066000-th learning:
	 shape = (128, 4),
	 mean = [0.32741037 0.13036206 0.3381779  0.24604852],
	  std = [0.555759   0.50177556 0.51294273 0.56180245]
7.93999982252717
Episode 1524	Average Score: 10.10	Score: 7.94actions batch at 1067000-th learning:
	 shape = (128, 4),
	 mean = [0.23586482 0.13834728 0.19825165 0.3024226 ],
	  std = [0.56428623 0.49479488 0.49449962 0.5645389 ]
4.469999900087714
Episode 1525	Average Score: 10.05	Score: 4.47actions batch at 1068000-th learning:
	 shape = (128, 4),
	 mean = [0.2684194  0.15880062 0.19370389 0.22561853],
	  std = [0.54286975 0.5085829  0.47693786 0.5006316 ]
8.47999981045723
Episode 1526	Average Score: 10.03	Score: 8.4810.209999771788716
Episode 1527	Average Score: 10.05	Score: 10.21actions batch at 1069000-th learning:
	 shape = (128, 4),
	 mean = [0.22451678 0.14988051 0.19619404 0.24211243],
	  std = [0.4993641  0.5344819  0.49181995 0.53449714]
10.14999977312982
Episode 1528	Average Score: 10.09	Score: 10.15actions batch at 1070000-th learning:
	 shape = (128, 4),
	 mean = [0.2793437  0.20596875 0.26212385 0.29117653],
	  std = [0.5680353  0.5496867  0.5116389  0.54507667]
13.999999687075615
Episode 1529	Average Score: 10.13	Score: 14.008.299999814480543
Episode 1530	Average Score: 10.14
actions batch at 1071000-th learning:
	 shape = (128, 4),
	 mean = [0.24473245 0.17176314 0.2666333  0.33196333],
	  std = [0.5632997  0.5365854  0.50316054 0.5607493 ]
5.719999872148037
Episode 1531	Average Score: 10.10	Score: 5.72actions batch at 1072000-th learning:
	 shape = (128, 4),
	 mean = [0.24806353 0.2155221  0.26661247 0.26509032],
	  std = [0.55220574 0.5499086  0.5231023  0.5366255 ]
13.859999690204859
Episode 1532	Average Score: 10.15	Score: 13.868.78999980352819
Episode 1533	Average Score: 10.16	Score: 8.79actions batch at 1073000-th learning:
	 shape = (128, 4),
	 mean = [0.28430077 0.14534733 0.26391393 0.2568531 ],
	  std = [0.5309119 0.5174038 0.51742   0.5516965]
12.489999720826745
Episode 1534	Average Score: 10.19	Score: 12.49actions batch at 1074000-th learning:
	 shape = (128, 4),
	 mean = [0.25070235 0.17156702 0.23928703 0.29723102],
	  std = [0.54416907 0.53243035 0.48169935 0.5313642 ]
8.86999980174005
Episode 1535	Average Score: 10.20	Score: 8.87actions batch at 1075000-th learning:
	 shape = (128, 4),
	 mean = [0.2408483  0.17092997 0.18403053 0.22330269],
	  std = [0.52446574 0.5031816  0.48626068 0.5753589 ]
10.529999764636159
Episode 1536	Average Score: 10.23	Score: 10.539.569999786093831
Episode 1537	Average Score: 10.22	Score: 9.57actions batch at 1076000-th learning:
	 shape = (128, 4),
	 mean = [0.31212038 0.23598784 0.31007516 0.2288837 ],
	  std = [0.5780409  0.5534007  0.541402   0.53661835]
8.369999812915921
Episode 1538	Average Score: 10.22	Score: 8.37actions batch at 1077000-th learning:
	 shape = (128, 4),
	 mean = [0.2997915  0.18755879 0.27956045 0.18070549],
	  std = [0.54234207 0.556545   0.47293136 0.5425011 ]
12.749999715015292
Episode 1539	Average Score: 10.25	Score: 12.7511.059999752789736
Episode 1540	Average Score: 10.27
actions batch at 1078000-th learning:
	 shape = (128, 4),
	 mean = [0.26807845 0.14891171 0.24174091 0.2946021 ],
	  std = [0.54219055 0.4990401  0.5090497  0.58700824]
10.429999766871333
Episode 1541	Average Score: 10.30	Score: 10.43actions batch at 1079000-th learning:
	 shape = (128, 4),
	 mean = [0.27350804 0.26682582 0.32966116 0.28639987],
	  std = [0.5423082  0.5674978  0.51656115 0.51810473]
9.469999788329005
Episode 1542	Average Score: 10.30	Score: 9.4711.199999749660492
Episode 1543	Average Score: 10.26	Score: 11.20actions batch at 1080000-th learning:
	 shape = (128, 4),
	 mean = [0.33195522 0.15099329 0.25894654 0.25084355],
	  std = [0.563634  0.5434239 0.5001005 0.5429347]
13.339999701827765
Episode 1544	Average Score: 10.29	Score: 13.34actions batch at 1081000-th learning:
	 shape = (128, 4),
	 mean = [0.3798892  0.17791407 0.27951595 0.26166186],
	  std = [0.52605534 0.5384573  0.5071492  0.5281677 ]
8.729999804869294
Episode 1545	Average Score: 10.27	Score: 8.73actions batch at 1082000-th learning:
	 shape = (128, 4),
	 mean = [0.21185787 0.1684806  0.18802771 0.2776591 ],
	  std = [0.5337933  0.5395983  0.47256696 0.56721044]
11.96999973244965
Episode 1546	Average Score: 10.26	Score: 11.9712.23999972641468
Episode 1547	Average Score: 10.30	Score: 12.24actions batch at 1083000-th learning:
	 shape = (128, 4),
	 mean = [0.2307506  0.16892666 0.22451068 0.21518253],
	  std = [0.5563543  0.5110488  0.4810731  0.54252785]
10.249999770894647
Episode 1548	Average Score: 10.29	Score: 10.25actions batch at 1084000-th learning:
	 shape = (128, 4),
	 mean = [0.23214015 0.17853867 0.22933245 0.26235998],
	  std = [0.5174433 0.538587  0.506524  0.5283122]
10.619999762624502
Episode 1549	Average Score: 10.29	Score: 10.628.119999818503857
Episode 1550	Average Score: 10.26
actions batch at 1085000-th learning:
	 shape = (128, 4),
	 mean = [0.28570583 0.22034465 0.26621482 0.2807006 ],
	  std = [0.53317934 0.5518205  0.4966817  0.5492625 ]
15.469999654218554
Episode 1551	Average Score: 10.26	Score: 15.47actions batch at 1086000-th learning:
	 shape = (128, 4),
	 mean = [0.28419566 0.18295589 0.27291554 0.21815334],
	  std = [0.586126   0.53512937 0.52187204 0.5574506 ]
9.499999787658453
Episode 1552	Average Score: 10.29	Score: 9.5013.00999970920384
Episode 1553	Average Score: 10.35	Score: 13.01actions batch at 1087000-th learning:
	 shape = (128, 4),
	 mean = [0.25420606 0.25304347 0.24628627 0.3176721 ],
	  std = [0.53921884 0.53652585 0.5007288  0.54960054]
13.219999704509974
Episode 1554	Average Score: 10.48	Score: 13.22actions batch at 1088000-th learning:
	 shape = (128, 4),
	 mean = [0.3698007  0.15477993 0.28474194 0.23958802],
	  std = [0.5351853  0.52870226 0.47181422 0.5526285 ]
10.579999763518572
Episode 1555	Average Score: 10.52	Score: 10.58actions batch at 1089000-th learning:
	 shape = (128, 4),
	 mean = [0.2414273  0.17686732 0.19453199 0.34135592],
	  std = [0.55098736 0.54458076 0.52592164 0.5491141 ]
9.069999797269702
Episode 1556	Average Score: 10.53	Score: 9.0710.889999756589532
Episode 1557	Average Score: 10.58	Score: 10.89actions batch at 1090000-th learning:
	 shape = (128, 4),
	 mean = [0.30918872 0.17420633 0.32198355 0.2282716 ],
	  std = [0.5539297  0.55987597 0.5138434  0.5405453 ]
12.579999718815088
Episode 1558	Average Score: 10.62	Score: 12.58actions batch at 1091000-th learning:
	 shape = (128, 4),
	 mean = [0.39105684 0.18606827 0.28286785 0.33028406],
	  std = [0.56452656 0.5469887  0.51842576 0.55028504]
11.239999748766422
Episode 1559	Average Score: 10.64	Score: 11.2413.499999698251486
Episode 1560	Average Score: 10.67
actions batch at 1092000-th learning:
	 shape = (128, 4),
	 mean = [0.24805325 0.17548615 0.18210998 0.21850257],
	  std = [0.54581636 0.5484954  0.43799096 0.5272079 ]
11.569999741390347
Episode 1561	Average Score: 10.68	Score: 11.57actions batch at 1093000-th learning:
	 shape = (128, 4),
	 mean = [0.19363378 0.12203802 0.16064256 0.2406913 ],
	  std = [0.5519587  0.5104768  0.48246607 0.5470941 ]
11.34999974630773
Episode 1562	Average Score: 10.64	Score: 11.359.359999790787697
Episode 1563	Average Score: 10.63	Score: 9.36actions batch at 1094000-th learning:
	 shape = (128, 4),
	 mean = [0.25044113 0.1741825  0.20260294 0.3040186 ],
	  std = [0.53790754 0.5558927  0.48370317 0.53204954]
10.079999774694443
Episode 1564	Average Score: 10.63	Score: 10.08actions batch at 1095000-th learning:
	 shape = (128, 4),
	 mean = [0.25989765 0.19215538 0.24374165 0.2524005 ],
	  std = [0.5674105  0.54791844 0.51215917 0.5477573 ]
9.229999793693423
Episode 1565	Average Score: 10.63	Score: 9.23actions batch at 1096000-th learning:
	 shape = (128, 4),
	 mean = [0.23718557 0.1321112  0.19935188 0.22271615],
	  std = [0.570639   0.5229647  0.46877393 0.54885787]
10.219999771565199
Episode 1566	Average Score: 10.50	Score: 10.2210.699999760836363
Episode 1567	Average Score: 10.50	Score: 10.70actions batch at 1097000-th learning:
	 shape = (128, 4),
	 mean = [0.32448962 0.21453062 0.36226982 0.25173295],
	  std = [0.57271546 0.5896822  0.48208275 0.5260465 ]
12.969999710097909
Episode 1568	Average Score: 10.54	Score: 12.97actions batch at 1098000-th learning:
	 shape = (128, 4),
	 mean = [0.28090534 0.18371937 0.16148233 0.27453655],
	  std = [0.53958356 0.51674694 0.4901141  0.52934873]
9.979999776929617
Episode 1569	Average Score: 10.52	Score: 9.9813.869999689981341
Episode 1570	Average Score: 10.58
actions batch at 1099000-th learning:
	 shape = (128, 4),
	 mean = [0.3004845  0.21768892 0.2248059  0.26862925],
	  std = [0.5455358  0.55092573 0.49768773 0.53830343]
5.029999887570739
Episode 1571	Average Score: 10.53	Score: 5.03actions batch at 1100000-th learning:
	 shape = (128, 4),
	 mean = [0.3581467  0.21210918 0.32938102 0.20580907],
	  std = [0.5501587  0.5191909  0.49480233 0.54540586]
9.209999794140458
Episode 1572	Average Score: 10.49	Score: 9.2111.239999748766422
Episode 1573	Average Score: 10.52	Score: 11.24actions batch at 1101000-th learning:
	 shape = (128, 4),
	 mean = [0.2367871  0.19608381 0.28654876 0.24220887],
	  std = [0.5604032  0.5308302  0.50834304 0.5358956 ]
8.719999805092812
Episode 1574	Average Score: 10.45	Score: 8.72actions batch at 1102000-th learning:
	 shape = (128, 4),
	 mean = [0.31896418 0.20899335 0.30329242 0.23331386],
	  std = [0.5738163  0.5763913  0.49947342 0.53652084]
10.749999759718776
Episode 1575	Average Score: 10.45	Score: 10.75actions batch at 1103000-th learning:
	 shape = (128, 4),
	 mean = [0.21427926 0.14672112 0.12363441 0.20463076],
	  std = [0.5636513  0.5237483  0.46971714 0.5219995 ]
11.689999738708138
Episode 1576	Average Score: 10.47	Score: 11.6911.019999753683805
Episode 1577	Average Score: 10.46	Score: 11.02actions batch at 1104000-th learning:
	 shape = (128, 4),
	 mean = [0.30124983 0.23624527 0.24783249 0.2673377 ],
	  std = [0.5573387  0.5537611  0.47876352 0.5687899 ]
10.91999975591898
Episode 1578	Average Score: 10.47	Score: 10.92actions batch at 1105000-th learning:
	 shape = (128, 4),
	 mean = [0.3452221  0.1997889  0.25935182 0.33064   ],
	  std = [0.56258047 0.5508775  0.47349545 0.5666748 ]
10.349999768659472
Episode 1579	Average Score: 10.47	Score: 10.3510.869999757036567
Episode 1580	Average Score: 10.46
actions batch at 1106000-th learning:
	 shape = (128, 4),
	 mean = [0.25555903 0.26498222 0.2302722  0.30371994],
	  std = [0.5242035  0.56362355 0.47952232 0.53112113]
12.539999719709158
Episode 1581	Average Score: 10.43	Score: 12.54actions batch at 1107000-th learning:
	 shape = (128, 4),
	 mean = [0.2393642 0.1153232 0.2527697 0.2755078],
	  std = [0.55061823 0.51652974 0.4857749  0.5290666 ]
14.779999669641256
Episode 1582	Average Score: 10.47	Score: 14.789.559999786317348
Episode 1583	Average Score: 10.42	Score: 9.56actions batch at 1108000-th learning:
	 shape = (128, 4),
	 mean = [0.28005967 0.22691062 0.19767794 0.28834423],
	  std = [0.55007863 0.54513234 0.49485728 0.5427814 ]
12.659999717026949
Episode 1584	Average Score: 10.40	Score: 12.66actions batch at 1109000-th learning:
	 shape = (128, 4),
	 mean = [0.26395655 0.17802495 0.25838938 0.23465648],
	  std = [0.55583626 0.55250555 0.50848615 0.5401397 ]
9.319999791681767
Episode 1585	Average Score: 10.33	Score: 9.32actions batch at 1110000-th learning:
	 shape = (128, 4),
	 mean = [0.35023773 0.27349818 0.31818566 0.3749996 ],
	  std = [0.53923553 0.5506393  0.5088311  0.55478287]
10.019999776035547
Episode 1586	Average Score: 10.32	Score: 10.0211.799999736249447
Episode 1587	Average Score: 10.34	Score: 11.80actions batch at 1111000-th learning:
	 shape = (128, 4),
	 mean = [0.3157291  0.22908342 0.1707641  0.30743456],
	  std = [0.56660897 0.53845507 0.4849056  0.5343383 ]
9.029999798163772
Episode 1588	Average Score: 10.33	Score: 9.03actions batch at 1112000-th learning:
	 shape = (128, 4),
	 mean = [0.25226375 0.20429498 0.31945774 0.29227883],
	  std = [0.562384   0.54002273 0.49610683 0.5247509 ]
6.569999853149056
Episode 1589	Average Score: 10.32	Score: 6.579.189999794587493
Episode 1590	Average Score: 10.33
actions batch at 1113000-th learning:
	 shape = (128, 4),
	 mean = [0.29749027 0.18688032 0.31280378 0.32908455],
	  std = [0.55102783 0.5138705  0.5129344  0.5611371 ]
9.13999979570508
Episode 1591	Average Score: 10.33	Score: 9.14actions batch at 1114000-th learning:
	 shape = (128, 4),
	 mean = [0.25943157 0.18153891 0.30899185 0.3233974 ],
	  std = [0.54571927 0.5347774  0.5222998  0.5323907 ]
9.94999977760017
Episode 1592	Average Score: 10.27	Score: 9.9511.119999751448631
Episode 1593	Average Score: 10.26	Score: 11.12actions batch at 1115000-th learning:
	 shape = (128, 4),
	 mean = [0.27828252 0.24546118 0.25653452 0.28544998],
	  std = [0.5580692 0.5492752 0.5165674 0.5596824]
8.229999816045165
Episode 1594	Average Score: 10.28	Score: 8.23actions batch at 1116000-th learning:
	 shape = (128, 4),
	 mean = [0.24272332 0.266153   0.23846014 0.30239013],
	  std = [0.58291095 0.5657085  0.4923373  0.5609915 ]
8.629999807104468
Episode 1595	Average Score: 10.27	Score: 8.63actions batch at 1117000-th learning:
	 shape = (128, 4),
	 mean = [0.26142076 0.16598369 0.18840241 0.29536015],
	  std = [0.57101625 0.55757177 0.44482207 0.5554007 ]
12.099999729543924
Episode 1596	Average Score: 10.30	Score: 12.109.209999794140458
Episode 1597	Average Score: 10.29	Score: 9.21actions batch at 1118000-th learning:
	 shape = (128, 4),
	 mean = [0.299567   0.14177744 0.2992543  0.25079864],
	  std = [0.567231   0.5146716  0.52952313 0.55732656]
11.369999745860696
Episode 1598	Average Score: 10.33	Score: 11.37actions batch at 1119000-th learning:
	 shape = (128, 4),
	 mean = [0.27619234 0.20122962 0.22754598 0.29398212],
	  std = [0.57851106 0.5429053  0.49457526 0.54356277]
9.229999793693423
Episode 1599	Average Score: 10.33	Score: 9.2311.219999749213457
Episode 1600	Average Score: 10.35
actions batch at 1120000-th learning:
	 shape = (128, 4),
	 mean = [0.1637251  0.24625953 0.18357286 0.25982746],
	  std = [0.52837396 0.55759335 0.5002668  0.5022054 ]
7.869999824091792
Episode 1601	Average Score: 10.35	Score: 7.87actions batch at 1121000-th learning:
	 shape = (128, 4),
	 mean = [0.29567975 0.24273558 0.25254303 0.34313598],
	  std = [0.55610204 0.5495085  0.50265956 0.5653014 ]
10.089999774470925
Episode 1602	Average Score: 10.37	Score: 10.0910.819999758154154
Episode 1603	Average Score: 10.40	Score: 10.82actions batch at 1122000-th learning:
	 shape = (128, 4),
	 mean = [0.3154186  0.1677802  0.32450327 0.3178926 ],
	  std = [0.5712816  0.53112847 0.5165681  0.5434382 ]
10.519999764859676
Episode 1604	Average Score: 10.41	Score: 10.52actions batch at 1123000-th learning:
	 shape = (128, 4),
	 mean = [0.14503503 0.18431357 0.25701278 0.38995987],
	  std = [0.5367535  0.54585457 0.511513   0.547847  ]
11.019999753683805
Episode 1605	Average Score: 10.43	Score: 11.02actions batch at 1124000-th learning:
	 shape = (128, 4),
	 mean = [0.20224008 0.20005964 0.17367694 0.23995784],
	  std = [0.5493049  0.5151087  0.47152618 0.5407548 ]
12.979999709874392
Episode 1606	Average Score: 10.46	Score: 12.988.74999980442226
Episode 1607	Average Score: 10.43	Score: 8.75actions batch at 1125000-th learning:
	 shape = (128, 4),
	 mean = [0.34321973 0.20956986 0.31110287 0.24459699],
	  std = [0.55345446 0.55291486 0.50530684 0.5534418 ]
4.309999903663993
Episode 1608	Average Score: 10.35	Score: 4.31actions batch at 1126000-th learning:
	 shape = (128, 4),
	 mean = [0.28430736 0.13298602 0.24173298 0.26510507],
	  std = [0.54755366 0.53193384 0.49617293 0.5247456 ]
9.149999795481563
Episode 1609	Average Score: 10.32	Score: 9.159.979999776929617
Episode 1610	Average Score: 10.32
actions batch at 1127000-th learning:
	 shape = (128, 4),
	 mean = [0.27069852 0.23822956 0.23243627 0.3421215 ],
	  std = [0.53453195 0.5707858  0.5016706  0.53952634]
9.419999789446592
Episode 1611	Average Score: 10.33	Score: 9.42actions batch at 1128000-th learning:
	 shape = (128, 4),
	 mean = [0.1769979  0.1753558  0.20080468 0.2535732 ],
	  std = [0.54679227 0.54283434 0.48736873 0.53651315]
5.749999871477485
Episode 1612	Average Score: 10.18	Score: 5.757.779999826103449
Episode 1613	Average Score: 10.14	Score: 7.78actions batch at 1129000-th learning:
	 shape = (128, 4),
	 mean = [0.3643528  0.10738742 0.28856325 0.3206932 ],
	  std = [0.539427   0.5133526  0.52302647 0.5645129 ]
16.26999963633716
Episode 1614	Average Score: 10.21	Score: 16.27actions batch at 1130000-th learning:
	 shape = (128, 4),
	 mean = [0.27258688 0.10218278 0.25431424 0.3139616 ],
	  std = [0.5545041  0.50165284 0.4815003  0.5495452 ]
7.679999828338623
Episode 1615	Average Score: 10.19	Score: 7.68actions batch at 1131000-th learning:
	 shape = (128, 4),
	 mean = [0.29124743 0.20645747 0.3390875  0.2508824 ],
	  std = [0.58579594 0.54114896 0.4885959  0.5282521 ]
12.46999972127378
Episode 1616	Average Score: 10.22	Score: 12.4714.169999683275819
Episode 1617	Average Score: 10.26	Score: 14.17actions batch at 1132000-th learning:
	 shape = (128, 4),
	 mean = [0.23927023 0.15062155 0.2614016  0.23842192],
	  std = [0.56198746 0.48355353 0.50063455 0.53862464]
9.549999786540866
Episode 1618	Average Score: 10.27	Score: 9.55actions batch at 1133000-th learning:
	 shape = (128, 4),
	 mean = [0.31603554 0.13596141 0.24352849 0.25103182],
	  std = [0.55236495 0.534626   0.51402056 0.5263722 ]
16.199999637901783
Episode 1619	Average Score: 10.38	Score: 16.209.469999788329005
Episode 1620	Average Score: 10.37
actions batch at 1134000-th learning:
	 shape = (128, 4),
	 mean = [0.1636399  0.2441657  0.246147   0.30923104],
	  std = [0.53948474 0.5725566  0.5021931  0.58568966]
9.809999780729413
Episode 1621	Average Score: 10.39	Score: 9.81actions batch at 1135000-th learning:
	 shape = (128, 4),
	 mean = [0.21680427 0.17550887 0.21789776 0.27761212],
	  std = [0.5458308  0.560272   0.49199766 0.5406169 ]
10.159999772906303
Episode 1622	Average Score: 10.41	Score: 10.1611.089999752119184
Episode 1623	Average Score: 10.41	Score: 11.09actions batch at 1136000-th learning:
	 shape = (128, 4),
	 mean = [0.1606533  0.20452921 0.26136926 0.28695434],
	  std = [0.5160337 0.5260419 0.5027181 0.5408503]
5.579999875277281
Episode 1624	Average Score: 10.39	Score: 5.58actions batch at 1137000-th learning:
	 shape = (128, 4),
	 mean = [0.18929456 0.06872253 0.24802141 0.2573586 ],
	  std = [0.55398965 0.5058028  0.50307405 0.5369059 ]
13.70999969355762
Episode 1625	Average Score: 10.48	Score: 13.71actions batch at 1138000-th learning:
	 shape = (128, 4),
	 mean = [0.30255875 0.20430255 0.26203224 0.28553978],
	  std = [0.5799636  0.54152125 0.51227605 0.5631962 ]
11.439999744296074
Episode 1626	Average Score: 10.51	Score: 11.4413.149999706074595
Episode 1627	Average Score: 10.54	Score: 13.15actions batch at 1139000-th learning:
	 shape = (128, 4),
	 mean = [0.31464106 0.25225452 0.22504212 0.32376042],
	  std = [0.5638391  0.5822461  0.5247357  0.52829057]
9.389999790117145
Episode 1628	Average Score: 10.53	Score: 9.39actions batch at 1140000-th learning:
	 shape = (128, 4),
	 mean = [0.30449426 0.29625592 0.31278145 0.25386104],
	  std = [0.6122043  0.55549383 0.51155615 0.5694355 ]
10.719999760389328
Episode 1629	Average Score: 10.50	Score: 10.728.459999810904264
Episode 1630	Average Score: 10.50
actions batch at 1141000-th learning:
	 shape = (128, 4),
	 mean = [0.34879526 0.2583917  0.32096836 0.30422583],
	  std = [0.55383563 0.5468199  0.52457416 0.554803  ]
9.079999797046185
Episode 1631	Average Score: 10.53	Score: 9.08actions batch at 1142000-th learning:
	 shape = (128, 4),
	 mean = [0.304107   0.24517937 0.34150386 0.32889605],
	  std = [0.5731589  0.5425415  0.52175415 0.54714614]
10.739999759942293
Episode 1632	Average Score: 10.50	Score: 10.749.009999798610806
Episode 1633	Average Score: 10.50	Score: 9.01actions batch at 1143000-th learning:
	 shape = (128, 4),
	 mean = [0.23539314 0.19029696 0.25984627 0.34575346],
	  std = [0.5277121 0.5237507 0.5245538 0.5803237]
13.209999704733491
Episode 1634	Average Score: 10.51	Score: 13.21actions batch at 1144000-th learning:
	 shape = (128, 4),
	 mean = [0.20946264 0.16309386 0.3013697  0.34538734],
	  std = [0.54598016 0.52388173 0.4972705  0.56619483]
10.249999770894647
Episode 1635	Average Score: 10.53	Score: 10.25actions batch at 1145000-th learning:
	 shape = (128, 4),
	 mean = [0.29646888 0.13264173 0.20501022 0.27146825],
	  std = [0.5634426  0.5414195  0.46961802 0.53643906]
14.359999679028988
Episode 1636	Average Score: 10.56	Score: 14.369.329999791458249
Episode 1637	Average Score: 10.56	Score: 9.33actions batch at 1146000-th learning:
	 shape = (128, 4),
	 mean = [0.2076236  0.16525145 0.2641954  0.33298782],
	  std = [0.5430813 0.5667731 0.511382  0.5433211]
8.539999809116125
Episode 1638	Average Score: 10.56	Score: 8.54actions batch at 1147000-th learning:
	 shape = (128, 4),
	 mean = [0.23482381 0.19499601 0.23144801 0.30660856],
	  std = [0.57047844 0.52682936 0.4530504  0.5302431 ]
7.62999982945621
Episode 1639	Average Score: 10.51	Score: 7.638.589999807998538
Episode 1640	Average Score: 10.49
actions batch at 1148000-th learning:
	 shape = (128, 4),
	 mean = [0.3431259  0.20740937 0.2566451  0.34819388],
	  std = [0.573947   0.54478246 0.48628777 0.547479  ]
5.429999878630042
Episode 1641	Average Score: 10.44	Score: 5.43actions batch at 1149000-th learning:
	 shape = (128, 4),
	 mean = [0.30043954 0.16225493 0.33290365 0.32595342],
	  std = [0.56755275 0.5200528  0.51580656 0.53870887]
9.44999978877604
Episode 1642	Average Score: 10.44	Score: 9.459.129999795928597
Episode 1643	Average Score: 10.42	Score: 9.13actions batch at 1150000-th learning:
	 shape = (128, 4),
	 mean = [0.22746725 0.10614874 0.15202354 0.2776811 ],
	  std = [0.5437333  0.5160258  0.47962528 0.5495054 ]
11.589999740943313
Episode 1644	Average Score: 10.40	Score: 11.59actions batch at 1151000-th learning:
	 shape = (128, 4),
	 mean = [0.31229082 0.285814   0.2647346  0.26670253],
	  std = [0.5737988  0.5722922  0.47965595 0.5577149 ]
7.179999839514494
Episode 1645	Average Score: 10.38	Score: 7.18actions batch at 1152000-th learning:
	 shape = (128, 4),
	 mean = [0.2478059  0.19351909 0.16431822 0.15974878],
	  std = [0.54173625 0.55661064 0.47678807 0.51522064]
11.199999749660492
Episode 1646	Average Score: 10.38	Score: 11.209.109999796375632
Episode 1647	Average Score: 10.34	Score: 9.11actions batch at 1153000-th learning:
	 shape = (128, 4),
	 mean = [0.32201353 0.18899572 0.21342257 0.29603183],
	  std = [0.5725752  0.53963435 0.4983113  0.57820415]
11.169999750331044
Episode 1648	Average Score: 10.35	Score: 11.17actions batch at 1154000-th learning:
	 shape = (128, 4),
	 mean = [0.28397438 0.23625568 0.26108167 0.36661148],
	  std = [0.54830146 0.55645496 0.49142036 0.5440306 ]
7.9799998216331005
Episode 1649	Average Score: 10.33	Score: 7.988.939999800175428
Episode 1650	Average Score: 10.34
actions batch at 1155000-th learning:
	 shape = (128, 4),
	 mean = [0.24449344 0.10947558 0.17290264 0.40335807],
	  std = [0.56491506 0.51363313 0.4588638  0.5739311 ]
10.049999775364995
Episode 1651	Average Score: 10.28	Score: 10.05actions batch at 1156000-th learning:
	 shape = (128, 4),
	 mean = [0.18241428 0.10932965 0.24638246 0.1859714 ],
	  std = [0.540312  0.5267752 0.4788969 0.5368615]
6.609999852254987
Episode 1652	Average Score: 10.25	Score: 6.617.019999843090773
Episode 1653	Average Score: 10.19	Score: 7.02actions batch at 1157000-th learning:
	 shape = (128, 4),
	 mean = [0.23320305 0.1831015  0.22268392 0.27597067],
	  std = [0.5511001  0.5361185  0.47007287 0.5498133 ]
9.609999785199761
Episode 1654	Average Score: 10.16	Score: 9.61actions batch at 1158000-th learning:
	 shape = (128, 4),
	 mean = [0.25385377 0.20046505 0.25711578 0.26865792],
	  std = [0.5594611  0.5247565  0.50580245 0.5720759 ]
7.1199998408555984
Episode 1655	Average Score: 10.12	Score: 7.12actions batch at 1159000-th learning:
	 shape = (128, 4),
	 mean = [0.2203914  0.2748996  0.24772277 0.263924  ],
	  std = [0.53442556 0.57190377 0.5224007  0.53577703]
10.119999773800373
Episode 1656	Average Score: 10.13	Score: 10.126.849999846890569
Episode 1657	Average Score: 10.09	Score: 6.85actions batch at 1160000-th learning:
	 shape = (128, 4),
	 mean = [0.22493565 0.29139277 0.24831106 0.39736366],
	  std = [0.5761964  0.53679425 0.5238679  0.5388378 ]
9.299999792128801
Episode 1658	Average Score: 10.06	Score: 9.30actions batch at 1161000-th learning:
	 shape = (128, 4),
	 mean = [0.26246855 0.18118481 0.2651643  0.25274777],
	  std = [0.53931665 0.5106762  0.5235639  0.5458146 ]
6.909999845549464
Episode 1659	Average Score: 10.02	Score: 6.918.839999802410603
Episode 1660	Average Score: 9.97
actions batch at 1162000-th learning:
	 shape = (128, 4),
	 mean = [0.2924019  0.26628128 0.2218637  0.23703451],
	  std = [0.53479606 0.5699571  0.48855898 0.54650235]
8.849999802187085
Episode 1661	Average Score: 9.94	Score: 8.85actions batch at 1163000-th learning:
	 shape = (128, 4),
	 mean = [0.17883661 0.20120397 0.23120232 0.21841875],
	  std = [0.5053791  0.5557849  0.50647104 0.51202005]
11.399999745190144
Episode 1662	Average Score: 9.94	Score: 11.4010.389999767765403
Episode 1663	Average Score: 9.95	Score: 10.39actions batch at 1164000-th learning:
	 shape = (128, 4),
	 mean = [0.23886687 0.20951581 0.13666227 0.2992129 ],
	  std = [0.56089413 0.551516   0.45296866 0.5836711 ]
11.449999744072556
Episode 1664	Average Score: 9.97	Score: 11.45actions batch at 1165000-th learning:
	 shape = (128, 4),
	 mean = [0.24809878 0.2869046  0.31272534 0.35124213],
	  std = [0.55458146 0.5677591  0.4960009  0.5716523 ]
7.409999834373593
Episode 1665	Average Score: 9.95	Score: 7.41actions batch at 1166000-th learning:
	 shape = (128, 4),
	 mean = [0.27463758 0.18190017 0.29827148 0.32058364],
	  std = [0.5333754  0.5543047  0.49665582 0.56971306]
1.6199999637901783
Episode 1666	Average Score: 9.86	Score: 1.629.979999776929617
Episode 1667	Average Score: 9.85	Score: 9.98actions batch at 1167000-th learning:
	 shape = (128, 4),
	 mean = [0.33318782 0.2693866  0.28696844 0.39910367],
	  std = [0.5819802  0.5604442  0.4909204  0.55476195]
7.019999843090773
Episode 1668	Average Score: 9.80	Score: 7.02actions batch at 1168000-th learning:
	 shape = (128, 4),
	 mean = [0.27825692 0.16394378 0.32920465 0.2723213 ],
	  std = [0.56960374 0.5377331  0.49134657 0.5437812 ]
10.159999772906303
Episode 1669	Average Score: 9.80	Score: 10.166.489999854937196
Episode 1670	Average Score: 9.72
actions batch at 1169000-th learning:
	 shape = (128, 4),
	 mean = [0.31853592 0.14532715 0.23526418 0.35303384],
	  std = [0.5596138  0.54479945 0.5188445  0.55012465]
5.899999868124723
Episode 1671	Average Score: 9.73	Score: 5.90actions batch at 1170000-th learning:
	 shape = (128, 4),
	 mean = [0.26108223 0.21521446 0.22957718 0.27910948],
	  std = [0.5483717  0.5118699  0.5119168  0.53828365]
8.55999980866909
Episode 1672	Average Score: 9.73	Score: 8.565.6099998746067286
Episode 1673	Average Score: 9.67	Score: 5.61actions batch at 1171000-th learning:
	 shape = (128, 4),
	 mean = [0.29008752 0.22304592 0.21761824 0.33657023],
	  std = [0.5726975  0.5494365  0.5020338  0.54892373]
8.979999799281359
Episode 1674	Average Score: 9.67	Score: 8.98actions batch at 1172000-th learning:
	 shape = (128, 4),
	 mean = [0.2694435  0.22108874 0.2668214  0.28309613],
	  std = [0.5564625 0.5596239 0.497346  0.5693312]
2.0899999532848597
Episode 1675	Average Score: 9.59	Score: 2.09actions batch at 1173000-th learning:
	 shape = (128, 4),
	 mean = [0.19065458 0.13747405 0.20458192 0.29925615],
	  std = [0.55411017 0.5138578  0.47669512 0.519857  ]
6.4299998562783
Episode 1676	Average Score: 9.53	Score: 6.4310.849999757483602
Episode 1677	Average Score: 9.53	Score: 10.85actions batch at 1174000-th learning:
	 shape = (128, 4),
	 mean = [0.23287295 0.24091555 0.25719506 0.32488698],
	  std = [0.57926047 0.56237537 0.49582103 0.5452255 ]
6.129999862983823
Episode 1678	Average Score: 9.48	Score: 6.13actions batch at 1175000-th learning:
	 shape = (128, 4),
	 mean = [0.31975284 0.23444857 0.252956   0.35478044],
	  std = [0.55270827 0.5326329  0.5084539  0.556695  ]
7.8599998243153095
Episode 1679	Average Score: 9.46	Score: 7.866.749999849125743
Episode 1680	Average Score: 9.42
actions batch at 1176000-th learning:
	 shape = (128, 4),
	 mean = [0.32431543 0.30909088 0.25024235 0.22056672],
	  std = [0.5518929 0.5591894 0.4923666 0.5339532]
5.289999881759286
Episode 1681	Average Score: 9.34	Score: 5.29actions batch at 1177000-th learning:
	 shape = (128, 4),
	 mean = [0.31927773 0.14394501 0.26688063 0.25898504],
	  std = [0.56429297 0.5367316  0.48567578 0.5323464 ]
7.639999829232693
Episode 1682	Average Score: 9.27	Score: 7.647.699999827891588
Episode 1683	Average Score: 9.25	Score: 7.70actions batch at 1178000-th learning:
	 shape = (128, 4),
	 mean = [0.33116627 0.23599403 0.26007926 0.31827143],
	  std = [0.5865526 0.5678187 0.5307255 0.5571436]
9.579999785870314
Episode 1684	Average Score: 9.22	Score: 9.58actions batch at 1179000-th learning:
	 shape = (128, 4),
	 mean = [0.1889631  0.17866418 0.16826746 0.23978646],
	  std = [0.5293973  0.5433285  0.47656697 0.53233474]
10.509999765083194
Episode 1685	Average Score: 9.24	Score: 10.51actions batch at 1180000-th learning:
	 shape = (128, 4),
	 mean = [0.22313605 0.09078    0.17624946 0.19456393],
	  std = [0.53370106 0.52052575 0.4761672  0.5245707 ]
11.139999751001596
Episode 1686	Average Score: 9.25	Score: 11.147.0699998419731855
Episode 1687	Average Score: 9.20	Score: 7.07actions batch at 1181000-th learning:
	 shape = (128, 4),
	 mean = [0.3301207  0.25021976 0.26070628 0.20708077],
	  std = [0.6041701 0.5499394 0.5139944 0.5086539]
8.569999808445573
Episode 1688	Average Score: 9.19	Score: 8.57actions batch at 1182000-th learning:
	 shape = (128, 4),
	 mean = [0.21620414 0.21190526 0.25275046 0.26114142],
	  std = [0.5692809  0.56431836 0.5098458  0.53718835]
9.83999978005886
Episode 1689	Average Score: 9.23	Score: 9.846.379999857395887
Episode 1690	Average Score: 9.20
actions batch at 1183000-th learning:
	 shape = (128, 4),
	 mean = [0.34018365 0.1407638  0.25130662 0.26822573],
	  std = [0.5526193 0.5463617 0.509327  0.5525253]
11.65999973937869
Episode 1691	Average Score: 9.22	Score: 11.66actions batch at 1184000-th learning:
	 shape = (128, 4),
	 mean = [0.23381577 0.18782854 0.25377867 0.22211611],
	  std = [0.54241765 0.5742787  0.5126801  0.53603566]
10.229999771341681
Episode 1692	Average Score: 9.23	Score: 10.237.459999833256006
Episode 1693	Average Score: 9.19	Score: 7.46actions batch at 1185000-th learning:
	 shape = (128, 4),
	 mean = [0.3819702  0.21896751 0.29800555 0.28481165],
	  std = [0.5575502 0.552585  0.4954693 0.5519445]
6.679999850690365
Episode 1694	Average Score: 9.18	Score: 6.68actions batch at 1186000-th learning:
	 shape = (128, 4),
	 mean = [0.28631988 0.21289757 0.30001062 0.33068037],
	  std = [0.5536115 0.5747216 0.4936129 0.5390407]
8.709999805316329
Episode 1695	Average Score: 9.18	Score: 8.71actions batch at 1187000-th learning:
	 shape = (128, 4),
	 mean = [0.27910444 0.21295182 0.23451708 0.3023399 ],
	  std = [0.5810155  0.53456473 0.5031673  0.5910242 ]
7.6199998296797276
Episode 1696	Average Score: 9.13	Score: 7.6211.119999751448631
Episode 1697	Average Score: 9.15	Score: 11.12actions batch at 1188000-th learning:
	 shape = (128, 4),
	 mean = [0.25451452 0.24283531 0.18677396 0.29533705],
	  std = [0.5665587 0.5637619 0.4920905 0.556187 ]
11.469999743625522
Episode 1698	Average Score: 9.15	Score: 11.47actions batch at 1189000-th learning:
	 shape = (128, 4),
	 mean = [0.32129985 0.2420417  0.253696   0.2833394 ],
	  std = [0.55653036 0.57248354 0.5467778  0.5504519 ]
7.269999837502837
Episode 1699	Average Score: 9.13	Score: 7.2710.539999764412642
Episode 1700	Average Score: 9.13
actions batch at 1190000-th learning:
	 shape = (128, 4),
	 mean = [0.25706995 0.21761051 0.27104682 0.29440853],
	  std = [0.5409971  0.5521966  0.5087176  0.54533005]
23.399999476969242
Episode 1701	Average Score: 9.28	Score: 23.40actions batch at 1191000-th learning:
	 shape = (128, 4),
	 mean = [0.24634477 0.16624284 0.23057501 0.2339856 ],
	  std = [0.5473205 0.5178599 0.5052271 0.5575491]
10.589999763295054
Episode 1702	Average Score: 9.29	Score: 10.5910.06999977491796
Episode 1703	Average Score: 9.28	Score: 10.07actions batch at 1192000-th learning:
	 shape = (128, 4),
	 mean = [0.2807719  0.3004597  0.30150616 0.24133833],
	  std = [0.56029034 0.56294703 0.5241819  0.52312744]
10.589999763295054
Episode 1704	Average Score: 9.28	Score: 10.59actions batch at 1193000-th learning:
	 shape = (128, 4),
	 mean = [0.2540409  0.2785174  0.17505859 0.31623304],
	  std = [0.56707585 0.54582685 0.4968029  0.5368686 ]
7.1899998392909765
Episode 1705	Average Score: 9.24	Score: 7.19actions batch at 1194000-th learning:
	 shape = (128, 4),
	 mean = [0.3120911  0.22260836 0.2925539  0.2594842 ],
	  std = [0.54706085 0.54104507 0.4925704  0.5533564 ]
9.9899997767061
Episode 1706	Average Score: 9.21	Score: 9.999.249999793246388
Episode 1707	Average Score: 9.22	Score: 9.25actions batch at 1195000-th learning:
	 shape = (128, 4),
	 mean = [0.38064444 0.20106876 0.34853256 0.25962633],
	  std = [0.5743333  0.5226562  0.51957947 0.54683506]
11.049999753013253
Episode 1708	Average Score: 9.28	Score: 11.05actions batch at 1196000-th learning:
	 shape = (128, 4),
	 mean = [0.32644585 0.2266722  0.30789903 0.31633878],
	  std = [0.562502  0.5489243 0.5087811 0.5452499]
9.579999785870314
Episode 1709	Average Score: 9.29	Score: 9.589.339999791234732
Episode 1710	Average Score: 9.28
actions batch at 1197000-th learning:
	 shape = (128, 4),
	 mean = [0.31670934 0.22743425 0.23665668 0.2809215 ],
	  std = [0.5502517  0.5568572  0.5190696  0.53717685]
9.469999788329005
Episode 1711	Average Score: 9.28	Score: 9.47actions batch at 1198000-th learning:
	 shape = (128, 4),
	 mean = [0.2871546  0.16344589 0.24786241 0.31882206],
	  std = [0.5661431  0.5657634  0.48968664 0.5478662 ]
6.509999854490161
Episode 1712	Average Score: 9.29	Score: 6.515.959999866783619
Episode 1713	Average Score: 9.27	Score: 5.96actions batch at 1199000-th learning:
	 shape = (128, 4),
	 mean = [0.21329734 0.1928607  0.24152745 0.23924322],
	  std = [0.54505605 0.54362214 0.49134904 0.5568251 ]
15.539999652653933
Episode 1714	Average Score: 9.26	Score: 15.54actions batch at 1200000-th learning:
	 shape = (128, 4),
	 mean = [0.25134364 0.15502395 0.24925865 0.28608918],
	  std = [0.5612224  0.50387436 0.5497532  0.55526245]
12.929999710991979
Episode 1715	Average Score: 9.32	Score: 12.93actions batch at 1201000-th learning:
	 shape = (128, 4),
	 mean = [0.31208807 0.23695695 0.2800366  0.29021665],
	  std = [0.5651494  0.54790825 0.5026046  0.5462121 ]
6.539999853819609
Episode 1716	Average Score: 9.26	Score: 6.548.309999814257026
Episode 1717	Average Score: 9.20	Score: 8.31actions batch at 1202000-th learning:
	 shape = (128, 4),
	 mean = [0.2836512  0.10585856 0.28530747 0.2617424 ],
	  std = [0.5466156  0.51474375 0.48598936 0.5448398 ]
10.389999767765403
Episode 1718	Average Score: 9.21	Score: 10.39actions batch at 1203000-th learning:
	 shape = (128, 4),
	 mean = [0.2988808  0.13821921 0.20218189 0.25667644],
	  std = [0.5508112  0.51777506 0.4901947  0.5525697 ]
12.349999723955989
Episode 1719	Average Score: 9.17	Score: 12.356.959999844431877
Episode 1720	Average Score: 9.14
actions batch at 1204000-th learning:
	 shape = (128, 4),
	 mean = [0.35856065 0.08400116 0.24093916 0.30708444],
	  std = [0.54264814 0.501916   0.5046786  0.5456573 ]
8.899999801069498
Episode 1721	Average Score: 9.13	Score: 8.90actions batch at 1205000-th learning:
	 shape = (128, 4),
	 mean = [0.18750185 0.26454094 0.26027668 0.27611336],
	  std = [0.53329635 0.5305547  0.4777949  0.5444953 ]
8.779999803751707
Episode 1722	Average Score: 9.12	Score: 8.7810.629999762400985
Episode 1723	Average Score: 9.12	Score: 10.63actions batch at 1206000-th learning:
	 shape = (128, 4),
	 mean = [0.31711787 0.23782004 0.23805866 0.23720212],
	  std = [0.5774348  0.5499276  0.49948838 0.5111753 ]
10.25999977067113
Episode 1724	Average Score: 9.16	Score: 10.26actions batch at 1207000-th learning:
	 shape = (128, 4),
	 mean = [0.36427245 0.14468014 0.32061267 0.3058399 ],
	  std = [0.5891339  0.51554936 0.506149   0.5531902 ]
5.599999874830246
Episode 1725	Average Score: 9.08	Score: 5.60actions batch at 1208000-th learning:
	 shape = (128, 4),
	 mean = [0.3320017  0.15463305 0.24388848 0.35908934],
	  std = [0.5432906  0.5427012  0.52192384 0.5395437 ]
5.919999867677689
Episode 1726	Average Score: 9.03	Score: 5.927.029999842867255
Episode 1727	Average Score: 8.96	Score: 7.03actions batch at 1209000-th learning:
	 shape = (128, 4),
	 mean = [0.21926925 0.11395331 0.26796162 0.26742983],
	  std = [0.53331953 0.5235417  0.5082318  0.5470633 ]
9.789999781176448
Episode 1728	Average Score: 8.97	Score: 9.79actions batch at 1210000-th learning:
	 shape = (128, 4),
	 mean = [0.28537107 0.21866842 0.21087252 0.2848566 ],
	  std = [0.55660206 0.5353494  0.495332   0.54566437]
13.449999699369073
Episode 1729	Average Score: 9.00	Score: 13.459.79999978095293
Episode 1730	Average Score: 9.01
actions batch at 1211000-th learning:
	 shape = (128, 4),
	 mean = [0.29818565 0.13538304 0.2763488  0.21277647],
	  std = [0.5533167  0.5165051  0.49343115 0.5576883 ]
9.109999796375632
Episode 1731	Average Score: 9.01	Score: 9.11
actions batch at 1212000-th learning:
	 shape = (128, 4),
	 mean = [0.351782   0.16060552 0.21486327 0.24078147],
	  std = [0.56387365 0.51632214 0.50363886 0.5375909 ]
9.149999795481563
Episode 1732	Average Score: 8.99	Score: 9.15
11.739999737590551
Episode 1733	Average Score: 9.02	Score: 11.74
actions batch at 1213000-th learning:
	 shape = (128, 4),
	 mean = [0.29655382 0.13615987 0.26830924 0.24214229],
	  std = [0.5735382  0.52176803 0.46823174 0.55097604]
12.939999710768461
Episode 1734	Average Score: 9.02	Score: 12.94
actions batch at 1214000-th learning:
	 shape = (128, 4),
	 mean = [0.2218701  0.1576806  0.20537086 0.2204719 ],
	  std = [0.55336505 0.5503518  0.4895337  0.50361717]
7.9799998216331005
Episode 1735	Average Score: 9.00	Score: 7.98
actions batch at 1215000-th learning:
	 shape = (128, 4),
	 mean = [0.31427982 0.24412628 0.28053215 0.28296104],
	  std = [0.5659807 0.5578624 0.5015715 0.554517 ]
9.089999796822667
Episode 1736	Average Score: 8.94	Score: 9.09
10.599999763071537
Episode 1737	Average Score: 8.96	Score: 10.60
actions batch at 1216000-th learning:
	 shape = (128, 4),
	 mean = [0.34172755 0.2066489  0.27030236 0.20187439],
	  std = [0.54297394 0.55966854 0.47811726 0.5118456 ]
11.429999744519591
Episode 1738	Average Score: 8.98	Score: 11.43
actions batch at 1217000-th learning:
	 shape = (128, 4),
	 mean = [0.3207145  0.19841081 0.28177878 0.32492337],
	  std = [0.576315   0.5467317  0.5141194  0.58350384]
10.049999775364995
Episode 1739	Average Score: 9.01	Score: 10.05
8.399999812245369
Episode 1740	Average Score: 9.01
actions batch at 1218000-th learning:
	 shape = (128, 4),
	 mean = [0.3273892  0.23949535 0.30006677 0.35269237],
	  std = [0.5731816  0.5425794  0.52238697 0.56155705]
11.34999974630773
Episode 1741	Average Score: 9.07	Score: 11.35
actions batch at 1219000-th learning:
	 shape = (128, 4),
	 mean = [0.35327148 0.13214958 0.29739845 0.20688684],
	  std = [0.55542904 0.5355862  0.50361824 0.5272497 ]
8.279999814927578
Episode 1742	Average Score: 9.05	Score: 8.28
9.629999784752727
Episode 1743	Average Score: 9.06	Score: 9.63
actions batch at 1220000-th learning:
	 shape = (128, 4),
	 mean = [0.23317942 0.17175734 0.2550397  0.29881424],
	  std = [0.5311835  0.56387985 0.5125493  0.5061615 ]
10.709999760612845
Episode 1744	Average Score: 9.05	Score: 10.71
actions batch at 1221000-th learning:
	 shape = (128, 4),
	 mean = [0.28604758 0.22769092 0.23384044 0.2303663 ],
	  std = [0.5494482  0.5646891  0.46761736 0.5607494 ]
10.489999765530229
Episode 1745	Average Score: 9.08	Score: 10.49
actions batch at 1222000-th learning:
	 shape = (128, 4),
	 mean = [0.2446033  0.1579863  0.2517028  0.24438895],
	  std = [0.53922695 0.49498078 0.50729483 0.5333656 ]
7.8599998243153095
Episode 1746	Average Score: 9.05	Score: 7.86
9.829999780282378
Episode 1747	Average Score: 9.06	Score: 9.83
actions batch at 1223000-th learning:
	 shape = (128, 4),
	 mean = [0.17640609 0.15352064 0.22930412 0.26207024],
	  std = [0.5318837  0.5322983  0.49474803 0.5424229 ]
10.399999767541885
Episode 1748	Average Score: 9.05	Score: 10.40
actions batch at 1224000-th learning:
	 shape = (128, 4),
	 mean = [0.22814502 0.24171507 0.20807077 0.21324915],
	  std = [0.566441   0.5565522  0.49865055 0.5422198 ]
11.389999745413661
Episode 1749	Average Score: 9.08	Score: 11.39
8.55999980866909
Episode 1750	Average Score: 9.08
actions batch at 1225000-th learning:
	 shape = (128, 4),
	 mean = [0.33650884 0.30834594 0.27688438 0.30930766],
	  std = [0.5929619  0.55557805 0.49434674 0.5427491 ]
8.429999811574817
Episode 1751	Average Score: 9.06	Score: 8.43
actions batch at 1226000-th learning:
	 shape = (128, 4),
	 mean = [0.30104747 0.2450886  0.26181233 0.19955167],
	  std = [0.5436109  0.52640074 0.51465875 0.5371024 ]
6.789999848231673
Episode 1752	Average Score: 9.07	Score: 6.79
15.389999656006694
Episode 1753	Average Score: 9.15	Score: 15.39
actions batch at 1227000-th learning:
	 shape = (128, 4),
	 mean = [0.33865976 0.2619478  0.27744853 0.40123308],
	  std = [0.5704196 0.5430296 0.5186125 0.5502197]
4.559999898076057
Episode 1754	Average Score: 9.10	Score: 4.56
actions batch at 1228000-th learning:
	 shape = (128, 4),
	 mean = [0.30379346 0.211571   0.21980987 0.31379044],
	  std = [0.57286984 0.52788246 0.49046054 0.56600624]
11.099999751895666
Episode 1755	Average Score: 9.14	Score: 11.10
actions batch at 1229000-th learning:
	 shape = (128, 4),
	 mean = [0.34326383 0.24566571 0.2587066  0.22818567],
	  std = [0.5787698  0.5713413  0.48605856 0.5542979 ]
12.999999709427357
Episode 1756	Average Score: 9.17	Score: 13.00
11.03999975323677
Episode 1757	Average Score: 9.21	Score: 11.04
actions batch at 1230000-th learning:
	 shape = (128, 4),
	 mean = [0.25048795 0.16388944 0.2790747  0.32504454],
	  std = [0.59657526 0.5579339  0.5022312  0.54856026]
12.19999972730875
Episode 1758	Average Score: 9.24	Score: 12.20
actions batch at 1231000-th learning:
	 shape = (128, 4),
	 mean = [0.2601543  0.20266336 0.26293486 0.26747662],
	  std = [0.55743116 0.5377337  0.49368066 0.52933854]
9.9899997767061
Episode 1759	Average Score: 9.27	Score: 9.99
9.209999794140458
Episode 1760	Average Score: 9.27
actions batch at 1232000-th learning:
	 shape = (128, 4),
	 mean = [0.26871917 0.1928812  0.2510596  0.3049326 ],
	  std = [0.5671266  0.50912434 0.52694005 0.5528562 ]
8.499999810010195
Episode 1761	Average Score: 9.27	Score: 8.50
actions batch at 1233000-th learning:
	 shape = (128, 4),
	 mean = [0.32113314 0.22973102 0.32829508 0.22006759],
	  std = [0.5465802  0.5514821  0.51605695 0.5405842 ]
6.499999854713678
Episode 1762	Average Score: 9.22	Score: 6.50
11.979999732226133
Episode 1763	Average Score: 9.24	Score: 11.98
actions batch at 1234000-th learning:
	 shape = (128, 4),
	 mean = [0.2787752  0.20297807 0.26674205 0.29046386],
	  std = [0.5780299  0.53401935 0.5285702  0.5516786 ]
10.829999757930636
Episode 1764	Average Score: 9.23	Score: 10.83
actions batch at 1235000-th learning:
	 shape = (128, 4),
	 mean = [0.32349154 0.23002951 0.2603097  0.17565368],
	  std = [0.5642062  0.5239675  0.51033586 0.5162063 ]
9.579999785870314
Episode 1765	Average Score: 9.25	Score: 9.58
actions batch at 1236000-th learning:
	 shape = (128, 4),
	 mean = [0.16777644 0.24219693 0.19566551 0.27905723],
	  std = [0.55040026 0.56277204 0.5011996  0.53224146]
11.069999752566218
Episode 1766	Average Score: 9.35	Score: 11.07
13.279999703168869
Episode 1767	Average Score: 9.38	Score: 13.28
actions batch at 1237000-th learning:
	 shape = (128, 4),
	 mean = [0.27584538 0.18710653 0.30950055 0.27240518],
	  std = [0.566444  0.5397818 0.5142013 0.5259927]
12.519999720156193
Episode 1768	Average Score: 9.43	Score: 12.52
actions batch at 1238000-th learning:
	 shape = (128, 4),
	 mean = [0.28562182 0.20046842 0.26867586 0.24561135],
	  std = [0.562369  0.5493606 0.4940488 0.5548662]
8.109999818727374
Episode 1769	Average Score: 9.41	Score: 8.11
11.379999745637178
Episode 1770	Average Score: 9.46
actions batch at 1239000-th learning:
	 shape = (128, 4),
	 mean = [0.3248933  0.28005382 0.22995757 0.36925584],
	  std = [0.56329095 0.5593736  0.498041   0.5580249 ]
11.639999739825726
Episode 1771	Average Score: 9.52	Score: 11.64
actions batch at 1240000-th learning:
	 shape = (128, 4),
	 mean = [0.24736129 0.17038715 0.24455    0.30267888],
	  std = [0.5736352  0.53261286 0.50206465 0.5356272 ]
9.979999776929617
Episode 1772	Average Score: 9.53	Score: 9.98
12.19999972730875
Episode 1773	Average Score: 9.60	Score: 12.20
actions batch at 1241000-th learning:
	 shape = (128, 4),
	 mean = [0.3356514  0.30656978 0.2211231  0.362746  ],
	  std = [0.54670775 0.55413204 0.52512765 0.55092585]
6.729999849572778
Episode 1774	Average Score: 9.58	Score: 6.73
actions batch at 1242000-th learning:
	 shape = (128, 4),
	 mean = [0.33466226 0.30775046 0.34699288 0.2940457 ],
	  std = [0.5505232 0.5529228 0.5013805 0.5334384]
8.449999811127782
Episode 1775	Average Score: 9.64	Score: 8.45
actions batch at 1243000-th learning:
	 shape = (128, 4),
	 mean = [0.22822697 0.21621421 0.20622472 0.2322371 ],
	  std = [0.56722677 0.57595456 0.50155437 0.54243094]
11.119999751448631
Episode 1776	Average Score: 9.69	Score: 11.12
12.109999729320407
Episode 1777	Average Score: 9.70	Score: 12.11
actions batch at 1244000-th learning:
	 shape = (128, 4),
	 mean = [0.32194412 0.19024949 0.2836084  0.28897166],
	  std = [0.5592751  0.56249636 0.5064459  0.56195366]
6.869999846443534
Episode 1778	Average Score: 9.71	Score: 6.87
actions batch at 1245000-th learning:
	 shape = (128, 4),
	 mean = [0.2629782  0.15983509 0.19789976 0.18535101],
	  std = [0.55735356 0.5360913  0.46803    0.5398235 ]
8.82999980263412
Episode 1779	Average Score: 9.72	Score: 8.83
6.7599998489022255
Episode 1780	Average Score: 9.72
actions batch at 1246000-th learning:
	 shape = (128, 4),
	 mean = [0.18503995 0.20772201 0.22476617 0.26899588],
	  std = [0.547531   0.55051035 0.49391803 0.5424684 ]
8.319999814033508
Episode 1781	Average Score: 9.75	Score: 8.32
actions batch at 1247000-th learning:
	 shape = (128, 4),
	 mean = [0.28248888 0.21113844 0.24720973 0.26517352],
	  std = [0.5251616  0.5336938  0.47510356 0.56115794]
10.079999774694443
Episode 1782	Average Score: 9.77	Score: 10.08
10.889999756589532
Episode 1783	Average Score: 9.80	Score: 10.89
actions batch at 1248000-th learning:
	 shape = (128, 4),
	 mean = [0.27862838 0.19206873 0.25376293 0.32005966],
	  std = [0.56471217 0.54627264 0.4949148  0.56858486]
7.169999839738011
Episode 1784	Average Score: 9.78	Score: 7.17
actions batch at 1249000-th learning:
	 shape = (128, 4),
	 mean = [0.2597597  0.20859697 0.2903095  0.29130936],
	  std = [0.5709154  0.555265   0.51012117 0.5759215 ]
8.999999798834324
Episode 1785	Average Score: 9.77	Score: 9.00
actions batch at 1250000-th learning:
	 shape = (128, 4),
	 mean = [0.24037609 0.16882907 0.22240706 0.29403752],
	  std = [0.5705834  0.56807894 0.49497128 0.57801247]
9.119999796152115
Episode 1786	Average Score: 9.75	Score: 9.12
6.3299998585134745
Episode 1787	Average Score: 9.74	Score: 6.33
actions batch at 1251000-th learning:
	 shape = (128, 4),
	 mean = [0.2467171  0.18967684 0.21467884 0.29816356],
	  std = [0.56734645 0.5429089  0.51464164 0.54712856]
6.38999985717237
Episode 1788	Average Score: 9.72	Score: 6.39
actions batch at 1252000-th learning:
	 shape = (128, 4),
	 mean = [0.3289773  0.28252622 0.24868898 0.22143626],
	  std = [0.55443084 0.52380574 0.49440002 0.5345109 ]
11.729999737814069
Episode 1789	Average Score: 9.73	Score: 11.73
12.749999715015292
Episode 1790	Average Score: 9.80
actions batch at 1253000-th learning:
	 shape = (128, 4),
	 mean = [0.2293346  0.19616686 0.25155294 0.3114652 ],
	  std = [0.53647494 0.53483945 0.5031722  0.5312709 ]
8.189999816939235
Episode 1791	Average Score: 9.76	Score: 8.19
actions batch at 1254000-th learning:
	 shape = (128, 4),
	 mean = [0.37082833 0.21714681 0.27952975 0.22640222],
	  std = [0.56526864 0.53677195 0.5087341  0.54068255]
8.839999802410603
Episode 1792	Average Score: 9.75	Score: 8.84
8.999999798834324
Episode 1793	Average Score: 9.77	Score: 9.00
actions batch at 1255000-th learning:
	 shape = (128, 4),
	 mean = [0.2357037  0.2420303  0.23054639 0.31587768],
	  std = [0.57114494 0.560588   0.48850152 0.5479778 ]
4.8799998909235
Episode 1794	Average Score: 9.75	Score: 4.88
actions batch at 1256000-th learning:
	 shape = (128, 4),
	 mean = [0.28300372 0.25095782 0.26354426 0.2554972 ],
	  std = [0.56117874 0.55446994 0.534148   0.5384565 ]
12.669999716803432
Episode 1795	Average Score: 9.79	Score: 12.67
actions batch at 1257000-th learning:
	 shape = (128, 4),
	 mean = [0.27144837 0.23091428 0.32125813 0.27139452],
	  std = [0.59378576 0.5549493  0.52201605 0.5590795 ]
11.18999974988401
Episode 1796	Average Score: 9.82	Score: 11.19
10.209999771788716
Episode 1797	Average Score: 9.81	Score: 10.21
actions batch at 1258000-th learning:
	 shape = (128, 4),
	 mean = [0.30555987 0.25895226 0.2856526  0.25664163],
	  std = [0.57659066 0.5538956  0.51692605 0.5579397 ]
11.129999751225114
Episode 1798	Average Score: 9.81	Score: 11.13
actions batch at 1259000-th learning:
	 shape = (128, 4),
	 mean = [0.28659105 0.16400902 0.2794763  0.32260388],
	  std = [0.58459735 0.51547205 0.51027244 0.5465883 ]
8.12999981828034
Episode 1799	Average Score: 9.82	Score: 8.13
12.179999727755785
Episode 1800	Average Score: 9.84
actions batch at 1260000-th learning:
	 shape = (128, 4),
	 mean = [0.26051635 0.34376928 0.21691108 0.24933794],
	  std = [0.547264   0.5445565  0.4630656  0.51833296]
