# [DEV][PPO] Crawler

---

In this notebook, you will learn how to use the Unity ML-Agents environment for the second project of the [Deep Reinforcement Learning Nanodegree](https://www.udacity.com/course/deep-reinforcement-learning-nanodegree--nd893) program.

### 1. Start the Environment

We begin by importing the necessary packages. If the code cell below returns an error, please revisit the project instructions to double-check that you have installed [Unity ML-Agents](https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Installation.md) and [NumPy](http://www.numpy.org/). The cell also imports PyTorch and the `progressbar` package, so both need to be installed as well.

In [8]:
from unityagents import UnityEnvironment
import torch
import numpy as np

# widget bar to display progress
#!pip install progressbar
import progressbar as pb

Next, we will start the environment! **_Before running the code cell below_**, change the `file_name` parameter to match the location of the Unity environment that you downloaded. The paths listed below use the Reacher build as an example; this notebook loads the Crawler build, so substitute the name of the environment you actually downloaded.

- **Mac**: `"path/to/Reacher.app"`
- **Windows** (x86): `"path/to/Reacher_Windows_x86/Reacher.exe"`
- **Windows** (x86_64): `"path/to/Reacher_Windows_x86_64/Reacher.exe"`
- **Linux** (x86): `"path/to/Reacher_Linux/Reacher.x86"`
- **Linux** (x86_64): `"path/to/Reacher_Linux/Reacher.x86_64"`
- **Linux** (x86, headless): `"path/to/Reacher_Linux_NoVis/Reacher.x86"`
- **Linux** (x86_64, headless): `"path/to/Reacher_Linux_NoVis/Reacher.x86_64"`

For instance, if you are using a Mac and downloaded `Crawler.app`, and this file is in the same folder as the notebook, then the line below should appear as follows:
```
env = UnityEnvironment(file_name="Crawler.app")
```

In [9]:
env = UnityEnvironment(file_name='Crawler.app', worker_id=101,  no_graphics=True)
#env = UnityEnvironment(file_name='Reacher.app')

INFO:unityagents:
'Academy' started successfully!
Unity Academy name: Academy
        Number of Brains: 1
        Number of External Brains : 1
        Lesson number : 0
        Reset Parameters :
		
Unity brain name: CrawlerBrain
        Number of Visual Observations (per agent): 0
        Vector Observation space type: continuous
        Vector Observation space size (per agent): 129
        Number of stacked Vector Observation: 1
        Vector Action space type: continuous
        Vector Action space size (per agent): 20
        Vector Action descriptions: , , , , , , , , , , , , , , , , , , , 


Environments contain **_brains_** which are responsible for deciding the actions of their associated agents. Here we check for the first brain available, and set it as the default brain we will be controlling from Python.

### 2. Examine the State and Action Spaces

In this environment, a crawler (a creature with four arms and four forearms) must learn to move toward a goal direction without falling. The agent is rewarded for moving its body in the goal direction and for keeping its body aligned with that direction. Thus, the goal of your agent is to move in the goal direction as quickly as possible while staying upright.

The observation space consists of `129` variables per agent, corresponding to the position, rotation, velocity, and angular velocities of each limb, plus the acceleration and angular acceleration of the body. Each action is a vector with `20` numbers, corresponding to target rotations for the joints. Every entry in the action vector must be a number between `-1` and `1`. This build of the environment runs `12` agents in parallel.

Run the code cell below to print some information about the environment.

In [10]:
# get the default brain
brain_name = env.brain_names[0]
brain = env.brains[brain_name]

# reset the environment
env_info = env.reset(train_mode=True)[brain_name]

# name of brain
print('Name of brain:', brain_name)

# number of agents
num_agents = len(env_info.agents)
print('Number of agents:', num_agents)

# size of each action
action_size = brain.vector_action_space_size
print('Size of each action:', action_size)

# examine the state space 
states = env_info.vector_observations
state_size = states.shape[1]
print('There are {} agents. Each observes a state with length: {}'.format(states.shape[0], state_size))
print('The state for the first agent looks like:', states[0])

Name of brain: CrawlerBrain
Number of agents: 12
Size of each action: 20
There are 12 agents. Each observes a state with length: 129
The state for the first agent looks like: [ 0.00000000e+00  0.00000000e+00  0.00000000e+00  2.25000000e+00
  1.00000000e+00  0.00000000e+00  1.78813934e-07  0.00000000e+00
  1.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  6.06093168e-01 -1.42857209e-01 -6.06078804e-01  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  1.33339906e+00 -1.42857209e-01
 -1.33341408e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e
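
The cell output above reports 12 agents and an action size of 20, and every action entry must lie between `-1` and `1`. As a minimal sketch of what a valid batch of actions looks like, the snippet below draws random actions and clips them into range. The helper name `sample_random_actions` and the seed are illustrative assumptions, not part of the ML-Agents API.

```python
import numpy as np

def sample_random_actions(num_agents, action_size, rng):
    """Draw one standard-normal action vector per agent and clip it to [-1, 1]."""
    actions = rng.standard_normal((num_agents, action_size))
    return np.clip(actions, -1.0, 1.0)

rng = np.random.default_rng(0)                  # arbitrary seed, for reproducibility only
actions = sample_random_actions(12, 20, rng)    # 12 agents, 20 action entries each

print(actions.shape)                                     # (12, 20)
print(actions.min() >= -1.0 and actions.max() <= 1.0)    # True
```

Roughly a third of standard-normal draws fall outside `[-1, 1]`, so clipping pins a sizeable share of the entries to the boundary values. That is acceptable for random exploration, but worth keeping in mind when a learned policy samples its actions from a Gaussian.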

### 3. Take Random Actions in the Environment

In the next code cell, you will learn how to use the Python API to control the agent and receive feedback from the environment.

Once this cell is executed, the agent selects an action at random at each time step, and the total score for the episode is printed. Note that the environment above was created with `no_graphics=True`, so no window will pop up; remove that argument if you want to watch the agent as it moves through the environment.

Of course, as part of the project, you'll have to change the code so that the agent is able to use its experience to gradually choose better actions when interacting with the environment!

In [11]:
env_info = env.reset(train_mode=False)[brain_name]     # reset the environment    
states = env_info.vector_observations                  # get the current state (for each agent)
scores = np.zeros(num_agents)                          # initialize the score (for each agent)
while True:
    actions = np.random.randn(num_agents, action_size) # select an action (for each agent)
    actions = np.clip(actions, -1, 1)                  # all actions between -1 and 1
    env_info = env.step(actions)[brain_name]           # send all actions to the environment
    next_states = env_info.vector_observations         # get next state (for each agent)
    rewards = env_info.rewards                         # get reward (for each agent)
    dones = env_info.local_done                        # see if episode finished
    scores += env_info.rewards                         # update the score (for each agent)
    states = next_states                               # roll over states to next time step
    if np.any(dones):                                  # exit loop if episode finished
        break
print('Total score (averaged over agents) this episode: {}'.format(np.mean(scores)))
#env.close()

Total score (averaged over agents) this episode: 0.49039648139538866


### 4. It's Your Turn!

Now it's your turn to train your own agent to solve the environment! When training the agent, set `train_mode=True`, so that the line for resetting the environment looks like the following:
```python
env_info = env.reset(train_mode=True)[brain_name]
```
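
The agent trained in this notebook comes from a `PPO_Agent` class whose source is not shown here, so its internals are unknown. For orientation only, here is a generic sketch of the clipped surrogate objective that PPO maximizes. The function name `clipped_surrogate` and the default `clip_eps=0.2` are assumptions made for illustration and may differ from what `PPO_Agent` actually does.

```python
import numpy as np

def clipped_surrogate(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize)."""
    # probability ratio between the updated policy and the policy that collected the data
    ratio = np.exp(new_log_probs - old_log_probs)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # the element-wise minimum removes any incentive to push the ratio outside the clip range
    return np.mean(np.minimum(unclipped, clipped))

# toy batch of three samples: ratios are 1.0, 1.5 and 0.5
old_lp = np.log(np.array([0.5, 0.5, 0.5]))
new_lp = np.log(np.array([0.5, 0.75, 0.25]))
adv = np.array([1.0, 1.0, -1.0])

objective = clipped_surrogate(new_lp, old_lp, adv)
print(round(float(objective), 4))   # 0.4667
```

In the toy batch, the second sample's ratio of 1.5 is capped at 1.2 and the third sample's ratio of 0.5 is raised to 0.8, so the per-sample terms are 1.0, 1.2 and -0.8.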

In [12]:
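def saveTrainedModel(agent, path):
    """Save the weights of the agent's local model to `path`."""
    state_dicts = {'model': agent.model_local.state_dict()}
    torch.save(state_dicts, path)


def loadTrainedModel(agent, path):
    """Load weights written by saveTrainedModel into the agent's local model."""
    # remap tensors saved on the first GPU so they can be loaded on a CPU-only machine
    state_dicts = torch.load(path, map_location={'cuda:0': 'cpu'})
    agent.model_local.load_state_dict(state_dicts['model'])
    return agent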

In [13]:
from PPO_agent import PPO_Agent
#import PPO_util 

model_dir = 'saved_models/'
model_name = 'unity_continuous_' + str(brain_name) + '_' + str(num_agents) + '_agents.pt'

agent = PPO_Agent(env, state_size, action_size, num_agents=num_agents, seed=1234)
#agent = loadTrainedModel(agent, model_dir+model_name)

current device:  cpu
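
The training loop in the next cell stops once the mean score over the most recent 100 episodes reaches 100. The sketch below isolates that stopping rule using only the standard library. The class name `RunningScore` and its parameters are hypothetical. Unlike the loop in the notebook, it also requires the window to be full before declaring the problem solved, which is a deliberate variation.

```python
from collections import deque

class RunningScore:
    """Track the mean of the most recent `window` episode scores."""

    def __init__(self, window=100, target=100.0):
        self.scores = deque(maxlen=window)   # old scores drop out automatically
        self.target = target

    def add(self, score):
        self.scores.append(score)
        return self.mean()

    def mean(self):
        if not self.scores:
            return 0.0
        return sum(self.scores) / len(self.scores)

    def solved(self):
        # require a full window so a few lucky early episodes cannot end training
        return len(self.scores) == self.scores.maxlen and self.mean() >= self.target

# small window so the behaviour is easy to follow
tracker = RunningScore(window=3, target=2.0)
for s in [1.0, 3.0, 3.0, 3.0]:
    tracker.add(s)

print(tracker.mean())     # 3.0
print(tracker.solved())   # True
```

Because the deque has a fixed maximum length, the first score of `1.0` has already been discarded by the time the fourth score is added, so the mean reflects only the last three episodes.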


In [None]:
episode_max = 50000 # training loop max iterations
episode_reward = 0.0
mean_rewards = []
e = 0

widget = ['training loop: ', pb.Percentage(), ' ', 
          pb.Bar(), ' ', pb.ETA() ]

#widget = ['Episode: ', pb.Counter(),'/',str(episode_max),'  ',  
#          'eps reward: ', str(np.mean(episode_reward)) ,'  ',
#          'Avg score (100e): ', str(mean_rewards[-100:]) ,'  ',
#          'actor gain: ', str(np.mean(agent.actor_gain)) ,'  ',
#          'critic loss: ', str(np.mean(agent.critic_loss)) ,'  ',
#           pb.ETA(), ' ', pb.Bar(marker=pb.RotatingMarker()), '  ' ]

timer = pb.ProgressBar(widgets=widget, maxval=episode_max).start()


while e < episode_max:

    # collect trajectories
    agent.step()
    episode_reward = agent.episodic_rewards
    
    # log progress once prefetching is done and training has started
    if agent.is_training:

        # get the average reward of the parallel environments
        mean_rewards.append(np.mean(episode_reward))        
        
        if (e+1)%1==0 :
            print("Episode: {}   score: {:.2f}   Avg score (100e): {:.2f}   "
                  "actor gain: {:.2f}   critic loss: {:.2f}   steps: {}".format(e+1, np.mean(episode_reward),
                                                                         np.mean(mean_rewards[-100:]),
                                                                         np.mean(agent.actor_gain), 
                                                                         np.mean(agent.critic_loss),
                                                                         agent.t_step))
            
        if np.mean(mean_rewards[-100:]) >= 100:
            print("Average score over all agents across the last 100 episodes >= 100. Problem solved!")
            break
                
        timer.update(e)
        
        e += 1
    else:
        print('\rFetching experiences... {} '.format(len(agent.memory.memory)), end="")
        
    #update progress widget bar
    #timer.update(e+1)
    
timer.finish()

training loop:   0% |                                          | ETA:  --:--:--


Prefetch completed. Training starts! 
Number of Agents:  12
Device:  cpu


training loop:   0% |                                          | ETA:  --:--:--

Episode: 1   score: 2.09   Avg score (100e): 2.09   actor gain: -0.49   critic loss: 1.08   steps: 1


training loop:   0% |                                 | ETA:  45 days, 21:05:49

Episode: 2   score: 2.02   Avg score (100e): 2.06   actor gain: -0.49   critic loss: 0.95   steps: 2


training loop:   0% |                                  | ETA:  35 days, 7:48:22

Episode: 3   score: 2.12   Avg score (100e): 2.08   actor gain: -0.50   critic loss: 0.86   steps: 3


training loop:   0% |                                 | ETA:  32 days, 14:35:58

Episode: 4   score: 1.97   Avg score (100e): 2.05   actor gain: -0.49   critic loss: 0.81   steps: 4


training loop:   0% |                                 | ETA:  31 days, 15:25:59

Episode: 5   score: 2.03   Avg score (100e): 2.05   actor gain: -0.49   critic loss: 0.76   steps: 5


training loop:   0% |                                 | ETA:  30 days, 16:06:07

Episode: 6   score: 2.04   Avg score (100e): 2.05   actor gain: -0.49   critic loss: 0.73   steps: 6


training loop:   0% |                                  | ETA:  30 days, 2:13:04

Episode: 7   score: 2.07   Avg score (100e): 2.05   actor gain: -0.49   critic loss: 0.70   steps: 7


training loop:   0% |                                 | ETA:  29 days, 20:09:20

Episode: 8   score: 2.08   Avg score (100e): 2.05   actor gain: -0.49   critic loss: 0.68   steps: 8


training loop:   0% |                                 | ETA:  29 days, 21:22:56

Episode: 9   score: 2.08   Avg score (100e): 2.06   actor gain: -0.49   critic loss: 0.66   steps: 9


training loop:   0% |                                  | ETA:  30 days, 4:45:02

Episode: 10   score: 2.11   Avg score (100e): 2.06   actor gain: -0.49   critic loss: 0.65   steps: 10


training loop:   0% |                                 | ETA:  30 days, 15:22:55

Episode: 11   score: 2.26   Avg score (100e): 2.08   actor gain: -0.50   critic loss: 0.63   steps: 11


training loop:   0% |                                 | ETA:  30 days, 23:07:04

Episode: 12   score: 2.34   Avg score (100e): 2.10   actor gain: -0.50   critic loss: 0.62   steps: 12


training loop:   0% |                                  | ETA:  31 days, 5:49:21

Episode: 13   score: 2.34   Avg score (100e): 2.12   actor gain: -0.49   critic loss: 0.61   steps: 13


training loop:   0% |                                 | ETA:  31 days, 16:57:24

Episode: 14   score: 2.35   Avg score (100e): 2.14   actor gain: -0.49   critic loss: 0.60   steps: 14


training loop:   0% |                                 | ETA:  31 days, 18:16:17

Episode: 15   score: 2.34   Avg score (100e): 2.15   actor gain: -0.49   critic loss: 0.59   steps: 15


training loop:   0% |                                 | ETA:  31 days, 20:11:45

Episode: 16   score: 2.36   Avg score (100e): 2.16   actor gain: -0.49   critic loss: 0.58   steps: 16


training loop:   0% |                                  | ETA:  32 days, 0:45:33

Episode: 17   score: 2.38   Avg score (100e): 2.18   actor gain: -0.49   critic loss: 0.57   steps: 17


training loop:   0% |                                  | ETA:  32 days, 5:58:44

Episode: 18   score: 2.44   Avg score (100e): 2.19   actor gain: -0.49   critic loss: 0.57   steps: 18


training loop:   0% |                                 | ETA:  32 days, 13:20:49

Episode: 19   score: 2.50   Avg score (100e): 2.21   actor gain: -0.49   critic loss: 0.56   steps: 19


training loop:   0% |                                 | ETA:  32 days, 18:04:20

Episode: 20   score: 2.54   Avg score (100e): 2.22   actor gain: -0.49   critic loss: 0.56   steps: 20


training loop:   0% |                                 | ETA:  32 days, 23:28:19

Episode: 21   score: 2.62   Avg score (100e): 2.24   actor gain: -0.49   critic loss: 0.55   steps: 21


training loop:   0% |                                 | ETA:  32 days, 22:19:40

Episode: 22   score: 2.63   Avg score (100e): 2.26   actor gain: -0.49   critic loss: 0.55   steps: 22


training loop:   0% |                                 | ETA:  32 days, 22:59:11

Episode: 23   score: 2.62   Avg score (100e): 2.28   actor gain: -0.49   critic loss: 0.54   steps: 23


training loop:   0% |                                 | ETA:  32 days, 21:41:23

Episode: 24   score: 2.65   Avg score (100e): 2.29   actor gain: -0.49   critic loss: 0.54   steps: 24


training loop:   0% |                                 | ETA:  32 days, 21:32:49

Episode: 25   score: 2.67   Avg score (100e): 2.31   actor gain: -0.48   critic loss: 0.54   steps: 25


training loop:   0% |                                 | ETA:  32 days, 22:08:03

Episode: 26   score: 2.71   Avg score (100e): 2.32   actor gain: -0.48   critic loss: 0.51   steps: 26


training loop:   0% |                                  | ETA:  33 days, 2:21:06

Episode: 27   score: 2.80   Avg score (100e): 2.34   actor gain: -0.48   critic loss: 0.50   steps: 27


training loop:   0% |                                  | ETA:  33 days, 3:53:47

Episode: 28   score: 2.84   Avg score (100e): 2.36   actor gain: -0.48   critic loss: 0.49   steps: 28


training loop:   0% |                                  | ETA:  33 days, 6:59:36

Episode: 29   score: 2.88   Avg score (100e): 2.38   actor gain: -0.48   critic loss: 0.48   steps: 29


training loop:   0% |                                 | ETA:  33 days, 10:03:50

Episode: 30   score: 2.91   Avg score (100e): 2.39   actor gain: -0.48   critic loss: 0.47   steps: 30


training loop:   0% |                                 | ETA:  33 days, 11:58:29

Episode: 31   score: 2.93   Avg score (100e): 2.41   actor gain: -0.48   critic loss: 0.47   steps: 31


training loop:   0% |                                 | ETA:  33 days, 15:55:03

Episode: 32   score: 2.96   Avg score (100e): 2.43   actor gain: -0.48   critic loss: 0.47   steps: 32


training loop:   0% |                                 | ETA:  33 days, 21:51:40

Episode: 33   score: 3.00   Avg score (100e): 2.45   actor gain: -0.48   critic loss: 0.47   steps: 33


training loop:   0% |                                  | ETA:  34 days, 0:40:58

Episode: 34   score: 3.02   Avg score (100e): 2.46   actor gain: -0.48   critic loss: 0.46   steps: 34


training loop:   0% |                                 | ETA:  33 days, 23:44:10

Episode: 35   score: 3.01   Avg score (100e): 2.48   actor gain: -0.48   critic loss: 0.46   steps: 35


training loop:   0% |                                 | ETA:  33 days, 23:21:43

Episode: 36   score: 3.05   Avg score (100e): 2.49   actor gain: -0.47   critic loss: 0.46   steps: 36


training loop:   0% |                                 | ETA:  33 days, 23:31:51

Episode: 37   score: 3.09   Avg score (100e): 2.51   actor gain: -0.47   critic loss: 0.46   steps: 37


training loop:   0% |                                  | ETA:  34 days, 1:40:00

Episode: 38   score: 3.14   Avg score (100e): 2.53   actor gain: -0.47   critic loss: 0.46   steps: 38


training loop:   0% |                                  | ETA:  34 days, 1:06:21

Episode: 39   score: 3.19   Avg score (100e): 2.54   actor gain: -0.47   critic loss: 0.46   steps: 39


training loop:   0% |                                  | ETA:  34 days, 1:05:23

Episode: 40   score: 3.25   Avg score (100e): 2.56   actor gain: -0.47   critic loss: 0.46   steps: 40


training loop:   0% |                                  | ETA:  34 days, 0:51:03

Episode: 41   score: 3.29   Avg score (100e): 2.58   actor gain: -0.47   critic loss: 0.46   steps: 41


training loop:   0% |                                  | ETA:  34 days, 5:19:38

Episode: 42   score: 3.32   Avg score (100e): 2.60   actor gain: -2.32   critic loss: 0.46   steps: 42


training loop:   0% |                                  | ETA:  34 days, 4:19:24

Episode: 43   score: 3.39   Avg score (100e): 2.61   actor gain: -2.32   critic loss: 0.46   steps: 43


training loop:   0% |                                  | ETA:  34 days, 4:13:47

Episode: 44   score: 3.42   Avg score (100e): 2.63   actor gain: -2.32   critic loss: 0.45   steps: 44


training loop:   0% |                                  | ETA:  34 days, 3:06:03

Episode: 45   score: 3.42   Avg score (100e): 2.65   actor gain: -2.32   critic loss: 0.46   steps: 45


training loop:   0% |                                  | ETA:  34 days, 3:03:48

Episode: 46   score: 3.49   Avg score (100e): 2.67   actor gain: -2.32   critic loss: 0.46   steps: 46


training loop:   0% |                                  | ETA:  34 days, 2:18:54

Episode: 47   score: 3.50   Avg score (100e): 2.69   actor gain: -2.32   critic loss: 0.46   steps: 47


training loop:   0% |                                  | ETA:  34 days, 2:28:04

Episode: 48   score: 3.57   Avg score (100e): 2.70   actor gain: -2.32   critic loss: 0.46   steps: 48


training loop:   0% |                                  | ETA:  34 days, 2:27:22

Episode: 49   score: 3.62   Avg score (100e): 2.72   actor gain: -2.32   critic loss: 0.46   steps: 49


training loop:   0% |                                  | ETA:  34 days, 3:29:37

Episode: 50   score: 3.64   Avg score (100e): 2.74   actor gain: -2.32   critic loss: 0.46   steps: 50


training loop:   0% |                                  | ETA:  34 days, 4:44:36

Episode: 51   score: 3.69   Avg score (100e): 2.76   actor gain: -2.32   critic loss: 0.46   steps: 51


training loop:   0% |                                  | ETA:  34 days, 4:31:44

Episode: 52   score: 3.70   Avg score (100e): 2.78   actor gain: -2.32   critic loss: 0.46   steps: 52


training loop:   0% |                                  | ETA:  34 days, 4:17:24

Episode: 53   score: 3.73   Avg score (100e): 2.80   actor gain: -2.32   critic loss: 0.45   steps: 53


training loop:   0% |                                  | ETA:  34 days, 4:03:14

Episode: 54   score: 3.77   Avg score (100e): 2.81   actor gain: -2.32   critic loss: 0.45   steps: 54


training loop:   0% |                                  | ETA:  34 days, 3:20:56

Episode: 55   score: 3.80   Avg score (100e): 2.83   actor gain: -2.32   critic loss: 0.45   steps: 55


training loop:   0% |                                  | ETA:  34 days, 3:14:09

Episode: 56   score: 3.86   Avg score (100e): 2.85   actor gain: -2.32   critic loss: 0.45   steps: 56


training loop:   0% |                                  | ETA:  34 days, 3:22:02

Episode: 57   score: 3.89   Avg score (100e): 2.87   actor gain: -2.33   critic loss: 0.45   steps: 57


training loop:   0% |                                  | ETA:  34 days, 3:17:12

Episode: 58   score: 3.92   Avg score (100e): 2.89   actor gain: -2.32   critic loss: 0.45   steps: 58


training loop:   0% |                                  | ETA:  34 days, 3:33:39

Episode: 59   score: 3.97   Avg score (100e): 2.90   actor gain: -2.33   critic loss: 0.45   steps: 59


training loop:   0% |                                  | ETA:  34 days, 0:30:36

Episode: 60   score: 3.98   Avg score (100e): 2.92   actor gain: -2.32   critic loss: 0.45   steps: 60


training loop:   0% |                                 | ETA:  33 days, 21:49:21

Episode: 61   score: 4.00   Avg score (100e): 2.94   actor gain: -2.32   critic loss: 0.45   steps: 61


training loop:   0% |                                 | ETA:  33 days, 17:41:31

Episode: 62   score: 4.03   Avg score (100e): 2.96   actor gain: -2.32   critic loss: 0.45   steps: 62


training loop:   0% |                                 | ETA:  33 days, 14:27:35

Episode: 63   score: 4.09   Avg score (100e): 2.98   actor gain: -2.32   critic loss: 0.45   steps: 63


training loop:   0% |                                 | ETA:  33 days, 12:59:18

Episode: 64   score: 4.12   Avg score (100e): 2.99   actor gain: -2.32   critic loss: 0.44   steps: 64


training loop:   0% |                                 | ETA:  33 days, 13:05:17

Episode: 65   score: 4.16   Avg score (100e): 3.01   actor gain: -2.33   critic loss: 0.44   steps: 65


training loop:   0% |                                 | ETA:  33 days, 11:41:07

Episode: 66   score: 4.19   Avg score (100e): 3.03   actor gain: -2.33   critic loss: 0.44   steps: 66


training loop:   0% |                                  | ETA:  33 days, 9:42:23

Episode: 67   score: 4.22   Avg score (100e): 3.05   actor gain: -0.48   critic loss: 0.44   steps: 67


training loop:   0% |                                  | ETA:  33 days, 7:57:03

Episode: 68   score: 4.21   Avg score (100e): 3.06   actor gain: -0.48   critic loss: 0.44   steps: 68


training loop:   0% |                                  | ETA:  33 days, 4:40:38

Episode: 69   score: 4.24   Avg score (100e): 3.08   actor gain: -0.48   critic loss: 0.44   steps: 69


training loop:   0% |                                  | ETA:  33 days, 2:16:52

Episode: 70   score: 4.29   Avg score (100e): 3.10   actor gain: -0.48   critic loss: 0.44   steps: 70


training loop:   0% |                                  | ETA:  33 days, 0:26:22

Episode: 71   score: 4.31   Avg score (100e): 3.12   actor gain: -0.48   critic loss: 0.44   steps: 71


training loop:   0% |                                 | ETA:  32 days, 22:36:42

Episode: 72   score: 4.34   Avg score (100e): 3.13   actor gain: -0.48   critic loss: 0.44   steps: 72


training loop:   0% |                                 | ETA:  32 days, 19:58:39

Episode: 73   score: 4.35   Avg score (100e): 3.15   actor gain: -0.48   critic loss: 0.44   steps: 73


training loop:   0% |                                 | ETA:  32 days, 19:54:32

Episode: 74   score: 4.39   Avg score (100e): 3.17   actor gain: -0.48   critic loss: 0.43   steps: 74


training loop:   0% |                                 | ETA:  32 days, 17:23:42

Episode: 75   score: 4.42   Avg score (100e): 3.18   actor gain: -0.49   critic loss: 0.43   steps: 75


training loop:   0% |                                 | ETA:  32 days, 15:46:35

Episode: 76   score: 4.45   Avg score (100e): 3.20   actor gain: -0.49   critic loss: 0.43   steps: 76


training loop:   0% |                                 | ETA:  32 days, 14:15:45

Episode: 77   score: 4.46   Avg score (100e): 3.22   actor gain: -0.49   critic loss: 0.43   steps: 77


training loop:   0% |                                 | ETA:  32 days, 11:56:20

Episode: 78   score: 4.47   Avg score (100e): 3.23   actor gain: -0.49   critic loss: 0.43   steps: 78


training loop:   0% |                                  | ETA:  32 days, 9:04:08

Episode: 79   score: 4.49   Avg score (100e): 3.25   actor gain: -0.49   critic loss: 0.43   steps: 79


training loop:   0% |                                  | ETA:  32 days, 6:37:53

Episode: 80   score: 4.51   Avg score (100e): 3.26   actor gain: -0.49   critic loss: 0.43   steps: 80


training loop:   0% |                                  | ETA:  32 days, 3:59:57

Episode: 81   score: 4.54   Avg score (100e): 3.28   actor gain: -0.49   critic loss: 0.43   steps: 81


training loop:   0% |                                  | ETA:  32 days, 1:06:20

Episode: 82   score: 4.56   Avg score (100e): 3.30   actor gain: -0.49   critic loss: 0.43   steps: 82


training loop:   0% |                                 | ETA:  31 days, 23:13:45

Episode: 83   score: 4.59   Avg score (100e): 3.31   actor gain: -0.49   critic loss: 0.43   steps: 83


training loop:   0% |                                 | ETA:  31 days, 20:47:31

Episode: 84   score: 4.60   Avg score (100e): 3.33   actor gain: -0.49   critic loss: 0.43   steps: 84


training loop:   0% |                                 | ETA:  31 days, 19:53:15

Episode: 85   score: 4.63   Avg score (100e): 3.34   actor gain: -0.49   critic loss: 0.43   steps: 85


training loop:   0% |                                 | ETA:  31 days, 19:18:40

Episode: 86   score: 4.65   Avg score (100e): 3.36   actor gain: -0.49   critic loss: 0.43   steps: 86


training loop:   0% |                                 | ETA:  31 days, 16:37:32

Episode: 87   score: 4.65   Avg score (100e): 3.37   actor gain: -0.49   critic loss: 0.43   steps: 87


training loop:   0% |                                 | ETA:  31 days, 13:40:10

Episode: 88   score: 4.66   Avg score (100e): 3.39   actor gain: -0.49   critic loss: 0.43   steps: 88


training loop:   0% |                                 | ETA:  31 days, 11:40:56

Episode: 89   score: 4.69   Avg score (100e): 3.40   actor gain: -0.49   critic loss: 0.43   steps: 89


training loop:   0% |                                 | ETA:  31 days, 11:01:01

Episode: 90   score: 4.71   Avg score (100e): 3.42   actor gain: -0.49   critic loss: 0.43   steps: 90


training loop:   0% |                                  | ETA:  31 days, 9:43:21

Episode: 91   score: 4.73   Avg score (100e): 3.43   actor gain: -0.49   critic loss: 0.43   steps: 91


training loop:   0% |                                  | ETA:  31 days, 6:48:56

Episode: 92   score: 4.72   Avg score (100e): 3.44   actor gain: -0.49   critic loss: 0.42   steps: 92


training loop:   0% |                                  | ETA:  31 days, 4:18:49

Episode: 93   score: 4.75   Avg score (100e): 3.46   actor gain: -0.49   critic loss: 0.42   steps: 93


training loop:   0% |                                  | ETA:  31 days, 1:53:24

Episode: 94   score: 4.78   Avg score (100e): 3.47   actor gain: -0.49   critic loss: 0.42   steps: 94


training loop:   0% |                                 | ETA:  30 days, 23:14:29

Episode: 95   score: 4.83   Avg score (100e): 3.49   actor gain: -0.48   critic loss: 0.42   steps: 95


training loop:   0% |                                 | ETA:  30 days, 20:55:46

Episode: 96   score: 4.86   Avg score (100e): 3.50   actor gain: -0.48   critic loss: 0.42   steps: 96


training loop:   0% |                                 | ETA:  30 days, 18:52:05

Episode: 97   score: 4.89   Avg score (100e): 3.52   actor gain: -0.48   critic loss: 0.42   steps: 97


training loop:   0% |                                 | ETA:  30 days, 16:39:30

Episode: 98   score: 4.90   Avg score (100e): 3.53   actor gain: -0.48   critic loss: 0.42   steps: 98


training loop:   0% |                                 | ETA:  30 days, 14:39:20

Episode: 99   score: 4.93   Avg score (100e): 3.54   actor gain: -0.48   critic loss: 0.42   steps: 99


training loop:   0% |                                 | ETA:  30 days, 12:17:34

Episode: 100   score: 4.95   Avg score (100e): 3.56   actor gain: -0.48   critic loss: 0.42   steps: 100


training loop:   0% |                                 | ETA:  30 days, 11:56:50

Episode: 101   score: 4.96   Avg score (100e): 3.59   actor gain: -0.48   critic loss: 0.42   steps: 101


training loop:   0% |                                  | ETA:  30 days, 9:58:52

Episode: 102   score: 4.98   Avg score (100e): 3.62   actor gain: -0.48   critic loss: 0.42   steps: 102


training loop:   0% |                                  | ETA:  30 days, 7:38:46

Episode: 103   score: 4.98   Avg score (100e): 3.64   actor gain: -0.48   critic loss: 0.42   steps: 103


training loop:   0% |                                  | ETA:  30 days, 5:47:42

Episode: 104   score: 5.00   Avg score (100e): 3.67   actor gain: -0.49   critic loss: 0.42   steps: 104


training loop:   0% |                                  | ETA:  30 days, 4:08:18

Episode: 105   score: 5.04   Avg score (100e): 3.70   actor gain: -0.49   critic loss: 0.42   steps: 105


training loop:   0% |                                  | ETA:  30 days, 2:33:12

Episode: 106   score: 5.04   Avg score (100e): 3.73   actor gain: -0.49   critic loss: 0.42   steps: 106


training loop:   0% |                                  | ETA:  30 days, 2:36:20

Episode: 107   score: 5.06   Avg score (100e): 3.76   actor gain: -0.49   critic loss: 0.42   steps: 107


training loop:   0% |                                  | ETA:  30 days, 1:22:15

Episode: 108   score: 5.11   Avg score (100e): 3.79   actor gain: -0.49   critic loss: 0.42   steps: 108


training loop:   0% |                                 | ETA:  29 days, 23:42:10

Episode: 109   score: 5.13   Avg score (100e): 3.83   actor gain: -0.49   critic loss: 0.42   steps: 109


training loop:   0% |                                 | ETA:  29 days, 21:47:52

Episode: 110   score: 5.15   Avg score (100e): 3.86   actor gain: -0.49   critic loss: 0.42   steps: 110


training loop:   0% |                                 | ETA:  29 days, 20:06:43

Episode: 111   score: 5.18   Avg score (100e): 3.88   actor gain: -0.49   critic loss: 0.42   steps: 111


training loop:   0% |                                 | ETA:  29 days, 18:54:19

Episode: 112   score: 5.20   Avg score (100e): 3.91   actor gain: -0.49   critic loss: 0.42   steps: 112


training loop:   0% |                                 | ETA:  29 days, 17:19:20

Episode: 113   score: 5.22   Avg score (100e): 3.94   actor gain: -0.49   critic loss: 0.42   steps: 113


training loop:   0% |                                 | ETA:  29 days, 15:37:37

Episode: 114   score: 5.24   Avg score (100e): 3.97   actor gain: -0.49   critic loss: 0.42   steps: 114


training loop:   0% |                                 | ETA:  29 days, 13:43:01

Episode: 115   score: 5.23   Avg score (100e): 4.00   actor gain: -0.49   critic loss: 0.42   steps: 115


training loop:   0% |                                 | ETA:  29 days, 11:40:43

Episode: 116   score: 5.25   Avg score (100e): 4.03   actor gain: -0.49   critic loss: 0.42   steps: 116


training loop:   0% |                                  | ETA:  29 days, 9:58:15

Episode: 117   score: 5.26   Avg score (100e): 4.06   actor gain: -0.49   critic loss: 0.42   steps: 117


training loop:   0% |                                  | ETA:  29 days, 9:40:25

Episode: 118   score: 5.29   Avg score (100e): 4.09   actor gain: -0.49   critic loss: 0.42   steps: 118


training loop:   0% |                                 | ETA:  29 days, 10:06:03

Episode: 119   score: 5.31   Avg score (100e): 4.11   actor gain: -0.50   critic loss: 0.42   steps: 119


training loop:   0% |                                  | ETA:  29 days, 9:07:14

Episode: 120   score: 5.33   Avg score (100e): 4.14   actor gain: -0.50   critic loss: 0.42   steps: 120


training loop:   0% |                                  | ETA:  29 days, 8:40:45

Episode: 121   score: 5.34   Avg score (100e): 4.17   actor gain: -0.50   critic loss: 0.42   steps: 121


training loop:   0% |                                  | ETA:  29 days, 9:02:35

Episode: 122   score: 5.36   Avg score (100e): 4.20   actor gain: -0.50   critic loss: 0.42   steps: 122


training loop:   0% |                                  | ETA:  29 days, 7:31:58

Episode: 123   score: 5.36   Avg score (100e): 4.22   actor gain: -0.50   critic loss: 0.42   steps: 123


training loop:   0% |                                  | ETA:  29 days, 6:38:25

Episode: 124   score: 5.36   Avg score (100e): 4.25   actor gain: -0.50   critic loss: 0.42   steps: 124


training loop:   0% |                                  | ETA:  29 days, 5:39:25

Episode: 125   score: 5.38   Avg score (100e): 4.28   actor gain: -0.50   critic loss: 0.42   steps: 125


training loop:   0% |                                  | ETA:  29 days, 4:51:10

Episode: 126   score: 5.41   Avg score (100e): 4.31   actor gain: -0.50   critic loss: 0.42   steps: 126


training loop:   0% |                                  | ETA:  29 days, 4:06:03

Episode: 127   score: 5.44   Avg score (100e): 4.33   actor gain: -0.50   critic loss: 0.42   steps: 127


training loop:   0% |                                  | ETA:  29 days, 4:21:57

Episode: 128   score: 5.45   Avg score (100e): 4.36   actor gain: -0.50   critic loss: 0.42   steps: 128


training loop:   0% |                                  | ETA:  29 days, 3:33:16

Episode: 129   score: 5.50   Avg score (100e): 4.38   actor gain: -0.48   critic loss: 0.42   steps: 129


training loop:   0% |                                  | ETA:  29 days, 2:44:24

Episode: 130   score: 5.51   Avg score (100e): 4.41   actor gain: -0.48   critic loss: 0.42   steps: 130


training loop:   0% |                                  | ETA:  29 days, 1:58:03

Episode: 131   score: 5.50   Avg score (100e): 4.44   actor gain: -0.48   critic loss: 0.42   steps: 131


training loop:   0% |                                  | ETA:  29 days, 0:59:35

Episode: 132   score: 5.53   Avg score (100e): 4.46   actor gain: -0.48   critic loss: 0.42   steps: 132


training loop:   0% |                                  | ETA:  29 days, 0:01:37

Episode: 133   score: 5.57   Avg score (100e): 4.49   actor gain: -0.48   critic loss: 0.42   steps: 133


training loop:   0% |                                 | ETA:  28 days, 23:50:47

Episode: 134   score: 5.59   Avg score (100e): 4.51   actor gain: -0.48   critic loss: 0.42   steps: 134


training loop:   0% |                                  | ETA:  29 days, 0:30:30

Episode: 135   score: 5.62   Avg score (100e): 4.54   actor gain: -0.48   critic loss: 0.42   steps: 135


training loop:   0% |                                  | ETA:  29 days, 0:32:42

Episode: 136   score: 5.63   Avg score (100e): 4.56   actor gain: -0.48   critic loss: 0.42   steps: 136


training loop:   0% |                                 | ETA:  28 days, 23:56:04

Episode: 137   score: 5.69   Avg score (100e): 4.59   actor gain: -0.48   critic loss: 0.42   steps: 137


training loop:   0% |                                 | ETA:  28 days, 23:35:29

Episode: 138   score: 5.72   Avg score (100e): 4.62   actor gain: -0.48   critic loss: 0.42   steps: 138


training loop:   0% |                                  | ETA:  30 days, 1:33:51

Episode: 139   score: 5.75   Avg score (100e): 4.64   actor gain: -0.48   critic loss: 0.42   steps: 139


training loop:   0% |                                  | ETA:  30 days, 1:06:52

Episode: 140   score: 5.77   Avg score (100e): 4.67   actor gain: -0.48   critic loss: 0.42   steps: 140


training loop:   0% |                                  | ETA:  30 days, 0:24:25

Episode: 141   score: 5.81   Avg score (100e): 4.69   actor gain: -0.48   critic loss: 0.42   steps: 141


training loop:   0% |                                  | ETA:  30 days, 1:36:00

Episode: 142   score: 5.83   Avg score (100e): 4.72   actor gain: -0.48   critic loss: 0.42   steps: 142


training loop:   0% |                                  | ETA:  30 days, 4:01:14

Episode: 143   score: 5.83   Avg score (100e): 4.74   actor gain: -0.48   critic loss: 0.42   steps: 143


training loop:   0% |                                  | ETA:  30 days, 6:29:59

Episode: 144   score: 5.83   Avg score (100e): 4.77   actor gain: -0.47   critic loss: 0.42   steps: 144


training loop:   0% |                                  | ETA:  30 days, 7:28:50

Episode: 145   score: 5.83   Avg score (100e): 4.79   actor gain: -0.47   critic loss: 0.42   steps: 145


training loop:   0% |                                  | ETA:  30 days, 8:17:39

Episode: 146   score: 5.82   Avg score (100e): 4.81   actor gain: -0.47   critic loss: 0.42   steps: 146


training loop:   0% |                                  | ETA:  30 days, 9:11:20

Episode: 147   score: 5.84   Avg score (100e): 4.84   actor gain: -0.47   critic loss: 0.42   steps: 147


training loop:   0% |                                  | ETA:  30 days, 9:48:06

Episode: 148   score: 5.83   Avg score (100e): 4.86   actor gain: -0.47   critic loss: 0.42   steps: 148


training loop:   0% |                                 | ETA:  30 days, 11:04:57

Episode: 149   score: 5.84   Avg score (100e): 4.88   actor gain: -0.47   critic loss: 0.42   steps: 149


training loop:   0% |                                 | ETA:  30 days, 11:47:50

Episode: 150   score: 5.85   Avg score (100e): 4.90   actor gain: -0.47   critic loss: 0.42   steps: 150


training loop:   0% |                                 | ETA:  30 days, 12:25:54

Episode: 151   score: 5.88   Avg score (100e): 4.93   actor gain: -0.47   critic loss: 0.42   steps: 151


training loop:   0% |                                 | ETA:  30 days, 13:01:01

Episode: 152   score: 5.92   Avg score (100e): 4.95   actor gain: -0.47   critic loss: 0.42   steps: 152


training loop:   0% |                                 | ETA:  30 days, 13:38:10

Episode: 153   score: 5.95   Avg score (100e): 4.97   actor gain: -0.47   critic loss: 0.42   steps: 153


training loop:   0% |                                 | ETA:  30 days, 14:41:15

Episode: 154   score: 5.98   Avg score (100e): 4.99   actor gain: -0.47   critic loss: 0.42   steps: 154


training loop:   0% |                                 | ETA:  30 days, 16:24:53

Episode: 155   score: 6.00   Avg score (100e): 5.01   actor gain: -0.47   critic loss: 0.42   steps: 155


training loop:   0% |                                 | ETA:  30 days, 17:39:42

Episode: 156   score: 6.00   Avg score (100e): 5.04   actor gain: -0.47   critic loss: 0.42   steps: 156


training loop:   0% |                                 | ETA:  30 days, 18:24:23

Episode: 157   score: 6.03   Avg score (100e): 5.06   actor gain: -0.47   critic loss: 0.42   steps: 157


training loop:   0% |                                 | ETA:  33 days, 15:32:53

Episode: 158   score: 6.05   Avg score (100e): 5.08   actor gain: -0.47   critic loss: 0.42   steps: 158


training loop:   0% |                                 | ETA:  33 days, 17:07:06

Episode: 159   score: 6.10   Avg score (100e): 5.10   actor gain: -0.47   critic loss: 0.42   steps: 159


training loop:   0% |                                 | ETA:  33 days, 17:55:12

Episode: 160   score: 6.10   Avg score (100e): 5.12   actor gain: -0.47   critic loss: 0.42   steps: 160


training loop:   0% |                                 | ETA:  33 days, 18:35:46

Episode: 161   score: 6.12   Avg score (100e): 5.14   actor gain: -0.47   critic loss: 0.42   steps: 161


training loop:   0% |                                 | ETA:  33 days, 19:05:07

Episode: 162   score: 6.14   Avg score (100e): 5.16   actor gain: -0.47   critic loss: 0.42   steps: 162


training loop:   0% |                                 | ETA:  33 days, 18:01:57

Episode: 163   score: 6.18   Avg score (100e): 5.18   actor gain: -0.47   critic loss: 0.42   steps: 163


training loop:   0% |                                 | ETA:  33 days, 17:58:17

Episode: 164   score: 6.20   Avg score (100e): 5.21   actor gain: -0.47   critic loss: 0.42   steps: 164


training loop:   0% |                                 | ETA:  33 days, 17:45:36

Episode: 165   score: 6.20   Avg score (100e): 5.23   actor gain: -0.47   critic loss: 0.42   steps: 165


training loop:   0% |                                 | ETA:  33 days, 17:37:51

Episode: 166   score: 6.22   Avg score (100e): 5.25   actor gain: -0.47   critic loss: 0.42   steps: 166


training loop:   0% |                                 | ETA:  33 days, 17:13:20

Episode: 167   score: 6.22   Avg score (100e): 5.27   actor gain: -0.47   critic loss: 0.42   steps: 167


training loop:   0% |                                 | ETA:  33 days, 16:49:49

Episode: 168   score: 6.23   Avg score (100e): 5.29   actor gain: -0.47   critic loss: 0.42   steps: 168


training loop:   0% |                                 | ETA:  33 days, 16:20:58

Episode: 169   score: 6.24   Avg score (100e): 5.31   actor gain: -0.47   critic loss: 0.42   steps: 169


training loop:   0% |                                 | ETA:  33 days, 16:52:57

Episode: 170   score: 6.25   Avg score (100e): 5.33   actor gain: -0.47   critic loss: 0.42   steps: 170


training loop:   0% |                                 | ETA:  33 days, 17:55:11

Episode: 171   score: 6.25   Avg score (100e): 5.35   actor gain: -0.47   critic loss: 0.42   steps: 171


training loop:   0% |                                 | ETA:  33 days, 18:09:10

Episode: 172   score: 6.26   Avg score (100e): 5.36   actor gain: -0.47   critic loss: 0.42   steps: 172


training loop:   0% |                                 | ETA:  33 days, 18:31:43

Episode: 173   score: 6.28   Avg score (100e): 5.38   actor gain: -0.47   critic loss: 0.42   steps: 173


training loop:   0% |                                 | ETA:  33 days, 18:33:37

Episode: 174   score: 6.26   Avg score (100e): 5.40   actor gain: -0.47   critic loss: 0.42   steps: 174


training loop:   0% |                                 | ETA:  33 days, 18:33:14

Episode: 175   score: 6.25   Avg score (100e): 5.42   actor gain: -0.47   critic loss: 0.42   steps: 175


training loop:   0% |                                 | ETA:  33 days, 18:55:52

Episode: 176   score: 6.25   Avg score (100e): 5.44   actor gain: -0.47   critic loss: 0.42   steps: 176


training loop:   0% |                                 | ETA:  33 days, 19:05:03

Episode: 177   score: 6.27   Avg score (100e): 5.46   actor gain: -0.46   critic loss: 0.42   steps: 177


training loop:   0% |                                 | ETA:  33 days, 19:24:38

Episode: 178   score: 6.26   Avg score (100e): 5.47   actor gain: -0.46   critic loss: 0.42   steps: 178


training loop:   0% |                                 | ETA:  33 days, 19:21:03

Episode: 179   score: 6.28   Avg score (100e): 5.49   actor gain: -0.46   critic loss: 0.42   steps: 179


training loop:   0% |                                 | ETA:  33 days, 19:19:38

Episode: 180   score: 6.28   Avg score (100e): 5.51   actor gain: -0.46   critic loss: 0.42   steps: 180


training loop:   0% |                                 | ETA:  33 days, 19:26:36

Episode: 181   score: 6.30   Avg score (100e): 5.53   actor gain: -0.46   critic loss: 0.42   steps: 181


training loop:   0% |                                 | ETA:  38 days, 22:53:07

Episode: 182   score: 6.29   Avg score (100e): 5.54   actor gain: -0.47   critic loss: 0.42   steps: 182


training loop:   0% |                                 | ETA:  38 days, 22:31:47

Episode: 183   score: 6.31   Avg score (100e): 5.56   actor gain: -0.47   critic loss: 0.42   steps: 183


training loop:   0% |                                 | ETA:  38 days, 22:20:29

Episode: 184   score: 6.33   Avg score (100e): 5.58   actor gain: -0.47   critic loss: 0.42   steps: 184


training loop:   0% |                                 | ETA:  38 days, 22:01:33

Episode: 185   score: 6.34   Avg score (100e): 5.60   actor gain: -0.47   critic loss: 0.42   steps: 185


training loop:   0% |                                 | ETA:  38 days, 21:33:33

Episode: 186   score: 6.37   Avg score (100e): 5.61   actor gain: -0.46   critic loss: 0.42   steps: 186


training loop:   0% |                                 | ETA:  38 days, 20:54:28

Episode: 187   score: 6.38   Avg score (100e): 5.63   actor gain: -0.46   critic loss: 0.42   steps: 187


training loop:   0% |                                 | ETA:  38 days, 20:20:27

Episode: 188   score: 6.39   Avg score (100e): 5.65   actor gain: -0.46   critic loss: 0.42   steps: 188


training loop:   0% |                                 | ETA:  38 days, 20:02:34

Episode: 189   score: 6.43   Avg score (100e): 5.67   actor gain: -0.46   critic loss: 0.42   steps: 189


training loop:   0% |                                 | ETA:  38 days, 19:30:17

Episode: 190   score: 6.46   Avg score (100e): 5.68   actor gain: -0.46   critic loss: 0.42   steps: 190


training loop:   0% |                                 | ETA:  38 days, 18:47:38

Episode: 191   score: 6.46   Avg score (100e): 5.70   actor gain: -0.46   critic loss: 0.42   steps: 191


training loop:   0% |                                 | ETA:  38 days, 18:11:07

Episode: 192   score: 6.47   Avg score (100e): 5.72   actor gain: -0.46   critic loss: 0.42   steps: 192


training loop:   0% |                                 | ETA:  38 days, 17:28:47

Episode: 193   score: 6.49   Avg score (100e): 5.74   actor gain: -0.46   critic loss: 0.42   steps: 193


training loop:   0% |                                 | ETA:  38 days, 17:07:51

Episode: 194   score: 6.50   Avg score (100e): 5.75   actor gain: -0.46   critic loss: 0.42   steps: 194


training loop:   0% |                                 | ETA:  38 days, 16:32:02

Episode: 195   score: 6.53   Avg score (100e): 5.77   actor gain: -0.47   critic loss: 0.42   steps: 195


training loop:   0% |                                 | ETA:  38 days, 16:09:04

Episode: 196   score: 6.55   Avg score (100e): 5.79   actor gain: -0.47   critic loss: 0.42   steps: 196


training loop:   0% |                                 | ETA:  38 days, 15:30:23

Episode: 197   score: 6.56   Avg score (100e): 5.80   actor gain: -0.49   critic loss: 0.42   steps: 197


training loop:   0% |                                 | ETA:  38 days, 14:51:37

Episode: 198   score: 6.55   Avg score (100e): 5.82   actor gain: -0.49   critic loss: 0.42   steps: 198


training loop:   0% |                                 | ETA:  38 days, 14:22:32

Episode: 199   score: 6.59   Avg score (100e): 5.84   actor gain: -0.49   critic loss: 0.42   steps: 199


training loop:   0% |                                 | ETA:  38 days, 13:44:59

Episode: 200   score: 6.63   Avg score (100e): 5.85   actor gain: -0.49   critic loss: 0.42   steps: 200


training loop:   0% |                                 | ETA:  38 days, 13:17:06

Episode: 201   score: 6.65   Avg score (100e): 5.87   actor gain: -0.49   critic loss: 0.42   steps: 201


training loop:   0% |                                 | ETA:  38 days, 12:51:40

Episode: 202   score: 6.68   Avg score (100e): 5.89   actor gain: -0.49   critic loss: 0.42   steps: 202


training loop:   0% |                                 | ETA:  38 days, 12:15:04

Episode: 203   score: 6.72   Avg score (100e): 5.90   actor gain: -0.49   critic loss: 0.42   steps: 203


training loop:   0% |                                 | ETA:  38 days, 12:23:15

Episode: 204   score: 6.75   Avg score (100e): 5.92   actor gain: -0.49   critic loss: 0.42   steps: 204


training loop:   0% |                                 | ETA:  38 days, 12:06:59

Episode: 205   score: 6.79   Avg score (100e): 5.94   actor gain: -0.49   critic loss: 0.42   steps: 205


training loop:   0% |                                 | ETA:  38 days, 11:55:21

Episode: 206   score: 6.82   Avg score (100e): 5.96   actor gain: -0.49   critic loss: 0.42   steps: 206


training loop:   0% |                                 | ETA:  38 days, 11:26:00

Episode: 207   score: 6.86   Avg score (100e): 5.98   actor gain: -0.49   critic loss: 0.42   steps: 207


training loop:   0% |                                 | ETA:  38 days, 11:36:12

Episode: 208   score: 6.88   Avg score (100e): 5.99   actor gain: -0.49   critic loss: 0.42   steps: 208


training loop:   0% |                                 | ETA:  38 days, 11:17:27

Episode: 209   score: 6.92   Avg score (100e): 6.01   actor gain: -0.49   critic loss: 0.42   steps: 209


training loop:   0% |                                 | ETA:  38 days, 10:43:04

Episode: 210   score: 6.96   Avg score (100e): 6.03   actor gain: -0.49   critic loss: 0.42   steps: 210


training loop:   0% |                                 | ETA:  38 days, 10:13:11

Episode: 211   score: 6.97   Avg score (100e): 6.05   actor gain: -0.49   critic loss: 0.42   steps: 211


training loop:   0% |                                  | ETA:  38 days, 9:45:55

Episode: 212   score: 6.99   Avg score (100e): 6.07   actor gain: -0.49   critic loss: 0.42   steps: 212


training loop:   0% |                                  | ETA:  38 days, 9:21:52

Episode: 213   score: 6.98   Avg score (100e): 6.08   actor gain: -0.49   critic loss: 0.42   steps: 213


training loop:   0% |                                  | ETA:  38 days, 9:05:00

Episode: 214   score: 7.01   Avg score (100e): 6.10   actor gain: -0.49   critic loss: 0.42   steps: 214


training loop:   0% |                                  | ETA:  38 days, 8:29:00

Episode: 215   score: 7.04   Avg score (100e): 6.12   actor gain: -0.49   critic loss: 0.42   steps: 215


training loop:   0% |                                  | ETA:  38 days, 7:56:41

Episode: 216   score: 7.07   Avg score (100e): 6.14   actor gain: -0.49   critic loss: 0.42   steps: 216


training loop:   0% |                                  | ETA:  38 days, 7:29:06

Episode: 217   score: 7.10   Avg score (100e): 6.16   actor gain: -0.49   critic loss: 0.42   steps: 217


training loop:   0% |                                  | ETA:  38 days, 7:03:20

Episode: 218   score: 7.15   Avg score (100e): 6.17   actor gain: -0.49   critic loss: 0.42   steps: 218


training loop:   0% |                                  | ETA:  38 days, 6:30:33

Episode: 219   score: 7.18   Avg score (100e): 6.19   actor gain: -0.49   critic loss: 0.42   steps: 219


training loop:   0% |                                  | ETA:  38 days, 6:00:27

Episode: 220   score: 7.21   Avg score (100e): 6.21   actor gain: -0.49   critic loss: 0.42   steps: 220


training loop:   0% |                                  | ETA:  38 days, 5:33:06

Episode: 221   score: 7.24   Avg score (100e): 6.23   actor gain: -0.49   critic loss: 0.42   steps: 221


training loop:   0% |                                  | ETA:  38 days, 4:57:56

Episode: 222   score: 7.28   Avg score (100e): 6.25   actor gain: -0.46   critic loss: 0.42   steps: 222


training loop:   0% |                                  | ETA:  38 days, 4:58:40

Episode: 223   score: 7.27   Avg score (100e): 6.27   actor gain: -0.46   critic loss: 0.42   steps: 223


training loop:   0% |                                  | ETA:  38 days, 4:42:58

Episode: 224   score: 7.30   Avg score (100e): 6.29   actor gain: -0.46   critic loss: 0.42   steps: 224


training loop:   0% |                                  | ETA:  38 days, 4:34:46

Episode: 225   score: 7.36   Avg score (100e): 6.31   actor gain: -0.47   critic loss: 0.42   steps: 225


training loop:   0% |                                  | ETA:  38 days, 3:59:10

Episode: 226   score: 7.38   Avg score (100e): 6.33   actor gain: -0.47   critic loss: 0.42   steps: 226


training loop:   0% |                                  | ETA:  38 days, 3:04:47

Episode: 227   score: 7.41   Avg score (100e): 6.35   actor gain: -0.47   critic loss: 0.42   steps: 227


training loop:   0% |                                  | ETA:  38 days, 3:05:11

Episode: 228   score: 7.42   Avg score (100e): 6.37   actor gain: -0.47   critic loss: 0.42   steps: 228


training loop:   0% |                                  | ETA:  38 days, 2:56:20

Episode: 229   score: 7.46   Avg score (100e): 6.39   actor gain: -0.47   critic loss: 0.42   steps: 229


training loop:   0% |                                  | ETA:  38 days, 2:36:57

Episode: 230   score: 7.46   Avg score (100e): 6.41   actor gain: -0.47   critic loss: 0.42   steps: 230


training loop:   0% |                                  | ETA:  38 days, 2:21:01

Episode: 231   score: 7.47   Avg score (100e): 6.43   actor gain: -0.47   critic loss: 0.42   steps: 231


training loop:   0% |                                  | ETA:  38 days, 2:40:36

Episode: 232   score: 7.49   Avg score (100e): 6.45   actor gain: -0.47   critic loss: 0.42   steps: 232


training loop:   0% |                                  | ETA:  38 days, 2:53:42

Episode: 233   score: 7.53   Avg score (100e): 6.46   actor gain: -0.47   critic loss: 0.42   steps: 233


training loop:   0% |                                  | ETA:  38 days, 2:57:51

Episode: 234   score: 7.58   Avg score (100e): 6.48   actor gain: -0.47   critic loss: 0.42   steps: 234


training loop:   0% |                                  | ETA:  38 days, 2:33:17

Episode: 235   score: 7.60   Avg score (100e): 6.50   actor gain: -0.47   critic loss: 0.42   steps: 235


training loop:   0% |                                  | ETA:  38 days, 3:54:30

Episode: 236   score: 7.63   Avg score (100e): 6.52   actor gain: -0.47   critic loss: 0.42   steps: 236


training loop:   0% |                                  | ETA:  38 days, 4:25:36

Episode: 237   score: 7.67   Avg score (100e): 6.54   actor gain: -0.47   critic loss: 0.42   steps: 237


training loop:   0% |                                  | ETA:  38 days, 5:17:31

Episode: 238   score: 7.69   Avg score (100e): 6.56   actor gain: -0.47   critic loss: 0.42   steps: 238


training loop:   0% |                                  | ETA:  38 days, 5:08:02

Episode: 239   score: 7.72   Avg score (100e): 6.58   actor gain: -0.47   critic loss: 0.42   steps: 239


training loop:   0% |                                  | ETA:  38 days, 4:45:56

Episode: 240   score: 7.77   Avg score (100e): 6.60   actor gain: -0.47   critic loss: 0.42   steps: 240


training loop:   0% |                                  | ETA:  38 days, 4:42:07

Episode: 241   score: 7.79   Avg score (100e): 6.62   actor gain: -0.47   critic loss: 0.42   steps: 241


training loop:   0% |                                  | ETA:  38 days, 4:18:13

Episode: 242   score: 7.80   Avg score (100e): 6.64   actor gain: -0.47   critic loss: 0.42   steps: 242


training loop:   0% |                                  | ETA:  38 days, 3:46:52

Episode: 243   score: 7.83   Avg score (100e): 6.66   actor gain: -0.47   critic loss: 0.42   steps: 243


training loop:   0% |                                  | ETA:  38 days, 3:21:33

Episode: 244   score: 7.84   Avg score (100e): 6.68   actor gain: -0.47   critic loss: 0.42   steps: 244


training loop:   0% |                                  | ETA:  38 days, 3:04:13

Episode: 245   score: 7.88   Avg score (100e): 6.70   actor gain: -0.47   critic loss: 0.42   steps: 245


training loop:   0% |                                  | ETA:  38 days, 2:38:28

Episode: 246   score: 7.91   Avg score (100e): 6.72   actor gain: -0.47   critic loss: 0.42   steps: 246


training loop:   0% |                                  | ETA:  38 days, 2:15:09

Episode: 247   score: 7.92   Avg score (100e): 6.75   actor gain: -0.47   critic loss: 0.42   steps: 247


training loop:   0% |                                  | ETA:  38 days, 1:46:47

Episode: 248   score: 7.96   Avg score (100e): 6.77   actor gain: -0.47   critic loss: 0.42   steps: 248


training loop:   0% |                                  | ETA:  38 days, 1:16:06

Episode: 249   score: 7.98   Avg score (100e): 6.79   actor gain: -0.46   critic loss: 0.42   steps: 249


training loop:   0% |                                  | ETA:  38 days, 0:46:15

Episode: 250   score: 7.99   Avg score (100e): 6.81   actor gain: -0.47   critic loss: 0.41   steps: 250


training loop:   0% |                                  | ETA:  38 days, 0:14:02

Episode: 251   score: 8.01   Avg score (100e): 6.83   actor gain: -0.47   critic loss: 0.41   steps: 251


training loop:   0% |                                 | ETA:  37 days, 23:50:33

Episode: 252   score: 8.03   Avg score (100e): 6.85   actor gain: -0.47   critic loss: 0.41   steps: 252


training loop:   0% |                                 | ETA:  37 days, 23:34:47

Episode: 253   score: 8.06   Avg score (100e): 6.87   actor gain: -0.47   critic loss: 0.41   steps: 253


training loop:   0% |                                 | ETA:  37 days, 23:08:16

Episode: 254   score: 8.09   Avg score (100e): 6.89   actor gain: -0.46   critic loss: 0.41   steps: 254


training loop:   0% |                                 | ETA:  37 days, 22:48:52

Episode: 255   score: 8.12   Avg score (100e): 6.92   actor gain: -0.46   critic loss: 0.41   steps: 255


training loop:   0% |                                 | ETA:  37 days, 22:24:38

Episode: 256   score: 8.14   Avg score (100e): 6.94   actor gain: -0.46   critic loss: 0.41   steps: 256


training loop:   0% |                                 | ETA:  37 days, 22:02:03

Episode: 257   score: 8.16   Avg score (100e): 6.96   actor gain: -0.46   critic loss: 0.42   steps: 257


training loop:   0% |                                 | ETA:  37 days, 21:35:46

Episode: 258   score: 8.15   Avg score (100e): 6.98   actor gain: -0.46   critic loss: 0.42   steps: 258


training loop:   0% |                                 | ETA:  37 days, 21:14:01

Episode: 259   score: 8.18   Avg score (100e): 7.00   actor gain: -0.54   critic loss: 0.42   steps: 259


training loop:   0% |                                 | ETA:  37 days, 20:51:51

Episode: 260   score: 8.18   Avg score (100e): 7.02   actor gain: -0.54   critic loss: 0.42   steps: 260


training loop:   0% |                                 | ETA:  37 days, 20:33:56

Episode: 261   score: 8.21   Avg score (100e): 7.04   actor gain: -0.54   critic loss: 0.42   steps: 261


training loop:   0% |                                 | ETA:  37 days, 20:42:35

Episode: 262   score: 8.25   Avg score (100e): 7.06   actor gain: -0.54   critic loss: 0.42   steps: 262


training loop:   0% |                                 | ETA:  37 days, 20:27:19

Episode: 263   score: 8.28   Avg score (100e): 7.08   actor gain: -0.54   critic loss: 0.42   steps: 263


training loop:   0% |                                 | ETA:  37 days, 20:36:05

Episode: 264   score: 8.31   Avg score (100e): 7.10   actor gain: -0.54   critic loss: 0.42   steps: 264


training loop:   0% |                                 | ETA:  37 days, 20:22:03

Episode: 265   score: 8.33   Avg score (100e): 7.13   actor gain: -0.54   critic loss: 0.42   steps: 265


training loop:   0% |                                 | ETA:  37 days, 20:05:34

Episode: 266   score: 8.36   Avg score (100e): 7.15   actor gain: -0.54   critic loss: 0.42   steps: 266


training loop:   0% |                                 | ETA:  37 days, 19:43:55

Episode: 267   score: 8.38   Avg score (100e): 7.17   actor gain: -0.54   critic loss: 0.42   steps: 267


training loop:   0% |                                 | ETA:  37 days, 19:56:51

Episode: 268   score: 8.39   Avg score (100e): 7.19   actor gain: -0.54   critic loss: 0.42   steps: 268


training loop:   0% |                                 | ETA:  37 days, 19:51:56

Episode: 269   score: 8.42   Avg score (100e): 7.21   actor gain: -0.54   critic loss: 0.42   steps: 269


training loop:   0% |                                 | ETA:  37 days, 19:36:37

Episode: 270   score: 8.43   Avg score (100e): 7.23   actor gain: -0.54   critic loss: 0.42   steps: 270


training loop:   0% |                                 | ETA:  37 days, 19:24:58

Episode: 271   score: 8.46   Avg score (100e): 7.26   actor gain: -0.54   critic loss: 0.42   steps: 271


training loop:   0% |                                 | ETA:  37 days, 19:09:41

Episode: 272   score: 8.48   Avg score (100e): 7.28   actor gain: -0.54   critic loss: 0.42   steps: 272


training loop:   0% |                                 | ETA:  37 days, 18:51:15

Episode: 273   score: 8.51   Avg score (100e): 7.30   actor gain: -0.54   critic loss: 0.42   steps: 273


training loop:   0% |                                 | ETA:  37 days, 18:31:02

Episode: 274   score: 8.53   Avg score (100e): 7.32   actor gain: -0.54   critic loss: 0.42   steps: 274


training loop:   0% |                                 | ETA:  37 days, 18:14:43

Episode: 275   score: 8.55   Avg score (100e): 7.35   actor gain: -0.54   critic loss: 0.42   steps: 275


training loop:   0% |                                 | ETA:  37 days, 17:54:23

Episode: 276   score: 8.58   Avg score (100e): 7.37   actor gain: -0.55   critic loss: 0.42   steps: 276


training loop:   0% |                                 | ETA:  37 days, 17:29:14

Episode: 277   score: 8.62   Avg score (100e): 7.39   actor gain: -0.55   critic loss: 0.42   steps: 277


training loop:   0% |                                 | ETA:  37 days, 17:10:17

Episode: 278   score: 8.64   Avg score (100e): 7.42   actor gain: -0.55   critic loss: 0.42   steps: 278


training loop:   0% |                                 | ETA:  37 days, 16:49:00

Episode: 279   score: 8.66   Avg score (100e): 7.44   actor gain: -0.55   critic loss: 0.42   steps: 279


training loop:   0% |                                 | ETA:  37 days, 16:26:35

Episode: 280   score: 8.68   Avg score (100e): 7.46   actor gain: -0.55   critic loss: 0.42   steps: 280


training loop:   0% |                                 | ETA:  37 days, 16:05:31

Episode: 281   score: 8.69   Avg score (100e): 7.49   actor gain: -0.55   critic loss: 0.42   steps: 281


training loop:   0% |                                 | ETA:  37 days, 15:40:21

Episode: 282   score: 8.72   Avg score (100e): 7.51   actor gain: -0.55   critic loss: 0.42   steps: 282


training loop:   0% |                                 | ETA:  37 days, 15:21:03

Episode: 283   score: 8.74   Avg score (100e): 7.54   actor gain: -0.55   critic loss: 0.42   steps: 283


training loop:   0% |                                 | ETA:  37 days, 14:58:37

Episode: 284   score: 8.75   Avg score (100e): 7.56   actor gain: -0.46   critic loss: 0.42   steps: 284


training loop:   0% |                                 | ETA:  37 days, 14:41:55

Episode: 285   score: 8.77   Avg score (100e): 7.59   actor gain: -0.46   critic loss: 0.42   steps: 285


training loop:   0% |                                 | ETA:  37 days, 14:37:37

Episode: 286   score: 8.79   Avg score (100e): 7.61   actor gain: -0.47   critic loss: 0.42   steps: 286


training loop:   0% |                                 | ETA:  37 days, 14:31:10

Episode: 287   score: 8.81   Avg score (100e): 7.63   actor gain: -0.47   critic loss: 0.42   steps: 287


training loop:   0% |                                 | ETA:  37 days, 14:15:21

Episode: 288   score: 8.82   Avg score (100e): 7.66   actor gain: -0.46   critic loss: 0.42   steps: 288


training loop:   0% |                                 | ETA:  37 days, 13:57:33

Episode: 289   score: 8.87   Avg score (100e): 7.68   actor gain: -0.46   critic loss: 0.42   steps: 289


training loop:   0% |                                 | ETA:  37 days, 13:37:15

Episode: 290   score: 8.89   Avg score (100e): 7.71   actor gain: -0.47   critic loss: 0.42   steps: 290


training loop:   0% |                                 | ETA:  37 days, 13:17:57

Episode: 291   score: 8.92   Avg score (100e): 7.73   actor gain: -0.47   critic loss: 0.42   steps: 291


training loop:   0% |                                 | ETA:  37 days, 13:06:53

Episode: 292   score: 8.94   Avg score (100e): 7.76   actor gain: -0.46   critic loss: 0.41   steps: 292


training loop:   0% |                                 | ETA:  37 days, 12:50:47

Episode: 293   score: 8.97   Avg score (100e): 7.78   actor gain: -0.46   critic loss: 0.41   steps: 293


training loop:   0% |                                 | ETA:  37 days, 12:34:42

Episode: 294   score: 9.00   Avg score (100e): 7.81   actor gain: -0.48   critic loss: 0.41   steps: 294


training loop:   0% |                                 | ETA:  37 days, 12:18:44

Episode: 295   score: 9.01   Avg score (100e): 7.83   actor gain: -0.48   critic loss: 0.41   steps: 295


training loop:   0% |                                 | ETA:  37 days, 12:01:13

Episode: 296   score: 9.04   Avg score (100e): 7.86   actor gain: -0.48   critic loss: 0.41   steps: 296


training loop:   0% |                                 | ETA:  37 days, 11:42:57

Episode: 297   score: 9.05   Avg score (100e): 7.88   actor gain: -0.48   critic loss: 0.41   steps: 297


training loop:   0% |                                 | ETA:  37 days, 11:19:02

Episode: 298   score: 9.07   Avg score (100e): 7.91   actor gain: -0.48   critic loss: 0.41   steps: 298


training loop:   0% |                                 | ETA:  37 days, 11:01:48

Episode: 299   score: 9.10   Avg score (100e): 7.93   actor gain: -0.48   critic loss: 0.41   steps: 299


training loop:   0% |                                 | ETA:  37 days, 10:37:46

Episode: 300   score: 9.12   Avg score (100e): 7.96   actor gain: -0.48   critic loss: 0.41   steps: 300


training loop:   0% |                                 | ETA:  37 days, 10:46:28

Episode: 301   score: 9.13   Avg score (100e): 7.98   actor gain: -0.46   critic loss: 0.41   steps: 301


training loop:   0% |                                 | ETA:  37 days, 10:38:12

Episode: 302   score: 9.15   Avg score (100e): 8.01   actor gain: -0.46   critic loss: 0.41   steps: 302


training loop:   0% |                                 | ETA:  37 days, 10:29:21

Episode: 303   score: 9.15   Avg score (100e): 8.03   actor gain: -0.46   critic loss: 0.41   steps: 303


training loop:   0% |                                 | ETA:  37 days, 10:14:06

Episode: 304   score: 9.16   Avg score (100e): 8.05   actor gain: -0.46   critic loss: 0.41   steps: 304


training loop:   0% |                                  | ETA:  37 days, 9:56:54

Episode: 305   score: 9.18   Avg score (100e): 8.08   actor gain: -0.46   critic loss: 0.40   steps: 305


training loop:   0% |                                  | ETA:  37 days, 9:47:40

Episode: 306   score: 9.22   Avg score (100e): 8.10   actor gain: -0.46   critic loss: 0.40   steps: 306


training loop:   0% |                                  | ETA:  37 days, 9:28:33

Episode: 307   score: 9.22   Avg score (100e): 8.13   actor gain: -0.47   critic loss: 0.40   steps: 307


training loop:   0% |                                  | ETA:  37 days, 9:09:34

Episode: 308   score: 9.26   Avg score (100e): 8.15   actor gain: -0.47   critic loss: 0.40   steps: 308


training loop:   0% |                                  | ETA:  37 days, 8:55:53

Episode: 309   score: 9.26   Avg score (100e): 8.17   actor gain: -0.47   critic loss: 0.40   steps: 309


training loop:   0% |                                  | ETA:  37 days, 8:36:06

Episode: 310   score: 9.28   Avg score (100e): 8.20   actor gain: -0.47   critic loss: 0.40   steps: 310


training loop:   0% |                                  | ETA:  37 days, 8:20:30

Episode: 311   score: 9.32   Avg score (100e): 8.22   actor gain: -0.47   critic loss: 0.40   steps: 311


training loop:   0% |                                  | ETA:  37 days, 8:00:01

Episode: 312   score: 9.34   Avg score (100e): 8.24   actor gain: -0.47   critic loss: 0.40   steps: 312


training loop:   0% |                                  | ETA:  37 days, 7:42:00

Episode: 313   score: 9.38   Avg score (100e): 8.27   actor gain: -0.47   critic loss: 0.40   steps: 313


training loop:   0% |                                  | ETA:  37 days, 7:22:33

Episode: 314   score: 9.41   Avg score (100e): 8.29   actor gain: -0.47   critic loss: 0.40   steps: 314


training loop:   0% |                                  | ETA:  37 days, 7:11:21

Episode: 315   score: 9.44   Avg score (100e): 8.32   actor gain: -0.47   critic loss: 0.40   steps: 315


training loop:   0% |                                  | ETA:  37 days, 7:17:21

Episode: 316   score: 9.45   Avg score (100e): 8.34   actor gain: -0.47   critic loss: 0.40   steps: 316


training loop:   0% |                                  | ETA:  37 days, 7:10:27

Episode: 317   score: 9.45   Avg score (100e): 8.36   actor gain: -0.47   critic loss: 0.40   steps: 317


training loop:   0% |                                  | ETA:  37 days, 6:59:17

Episode: 318   score: 9.48   Avg score (100e): 8.39   actor gain: -0.47   critic loss: 0.40   steps: 318


training loop:   0% |                                  | ETA:  37 days, 7:18:26

Episode: 319   score: 9.48   Avg score (100e): 8.41   actor gain: -0.45   critic loss: 0.40   steps: 319


training loop:   0% |                                  | ETA:  37 days, 7:31:02

Episode: 320   score: 9.51   Avg score (100e): 8.43   actor gain: -0.45   critic loss: 0.40   steps: 320


training loop:   0% |                                  | ETA:  37 days, 7:51:13

Episode: 321   score: 9.53   Avg score (100e): 8.45   actor gain: -0.45   critic loss: 0.40   steps: 321


training loop:   0% |                                  | ETA:  37 days, 7:53:42

Episode: 322   score: 9.55   Avg score (100e): 8.48   actor gain: -0.45   critic loss: 0.40   steps: 322


training loop:   0% |                                  | ETA:  37 days, 7:52:37

Episode: 323   score: 9.56   Avg score (100e): 8.50   actor gain: -0.45   critic loss: 0.40   steps: 323


training loop:   0% |                                  | ETA:  37 days, 8:26:38

Episode: 324   score: 9.59   Avg score (100e): 8.52   actor gain: -0.45   critic loss: 0.40   steps: 324


training loop:   0% |                                  | ETA:  37 days, 8:27:00

Episode: 325   score: 9.61   Avg score (100e): 8.55   actor gain: -0.45   critic loss: 0.40   steps: 325


training loop:   0% |                                  | ETA:  37 days, 8:35:29

Episode: 326   score: 9.64   Avg score (100e): 8.57   actor gain: -0.45   critic loss: 0.40   steps: 326


training loop:   0% |                                  | ETA:  37 days, 8:29:11

Episode: 327   score: 9.64   Avg score (100e): 8.59   actor gain: -0.45   critic loss: 0.40   steps: 327


training loop:   0% |                                  | ETA:  37 days, 8:13:00

Episode: 328   score: 9.66   Avg score (100e): 8.61   actor gain: -0.45   critic loss: 0.40   steps: 328


training loop:   0% |                                  | ETA:  37 days, 8:00:20

Episode: 329   score: 9.70   Avg score (100e): 8.64   actor gain: -0.45   critic loss: 0.40   steps: 329


training loop:   0% |                                  | ETA:  37 days, 7:45:43

Episode: 330   score: 9.70   Avg score (100e): 8.66   actor gain: -0.45   critic loss: 0.40   steps: 330


training loop:   0% |                                  | ETA:  37 days, 7:25:26

Episode: 331   score: 9.72   Avg score (100e): 8.68   actor gain: -0.45   critic loss: 0.40   steps: 331


training loop:   0% |                                  | ETA:  37 days, 7:22:35

Episode: 332   score: 9.72   Avg score (100e): 8.70   actor gain: -0.45   critic loss: 0.41   steps: 332


training loop:   0% |                                  | ETA:  37 days, 7:40:22

Episode: 333   score: 9.75   Avg score (100e): 8.72   actor gain: -0.45   critic loss: 0.41   steps: 333


training loop:   0% |                                  | ETA:  37 days, 7:41:53

Episode: 334   score: 9.77   Avg score (100e): 8.75   actor gain: -0.45   critic loss: 0.41   steps: 334


training loop:   0% |                                  | ETA:  37 days, 7:46:12

Episode: 335   score: 9.80   Avg score (100e): 8.77   actor gain: -0.45   critic loss: 0.41   steps: 335


training loop:   0% |                                  | ETA:  37 days, 7:44:21

Episode: 336   score: 9.82   Avg score (100e): 8.79   actor gain: -0.45   critic loss: 0.41   steps: 336


training loop:   0% |                                  | ETA:  37 days, 7:36:58

Episode: 337   score: 9.86   Avg score (100e): 8.81   actor gain: -0.45   critic loss: 0.41   steps: 337


training loop:   0% |                                  | ETA:  37 days, 7:38:24

Episode: 338   score: 9.87   Avg score (100e): 8.83   actor gain: -0.44   critic loss: 0.41   steps: 338


training loop:   0% |                                  | ETA:  37 days, 7:32:24

Episode: 339   score: 9.89   Avg score (100e): 8.86   actor gain: -0.44   critic loss: 0.41   steps: 339


training loop:   0% |                                  | ETA:  37 days, 7:42:30

Episode: 340   score: 9.92   Avg score (100e): 8.88   actor gain: -0.44   critic loss: 0.41   steps: 340


training loop:   0% |                                  | ETA:  37 days, 7:40:51

Episode: 341   score: 9.95   Avg score (100e): 8.90   actor gain: -0.45   critic loss: 0.41   steps: 341


training loop:   0% |                                  | ETA:  37 days, 7:45:34

Episode: 342   score: 9.98   Avg score (100e): 8.92   actor gain: -0.45   critic loss: 0.41   steps: 342


training loop:   0% |                                  | ETA:  37 days, 7:33:25

Episode: 343   score: 10.00   Avg score (100e): 8.94   actor gain: -0.45   critic loss: 0.41   steps: 343


training loop:   0% |                                  | ETA:  37 days, 7:24:04

Episode: 344   score: 10.03   Avg score (100e): 8.96   actor gain: -0.45   critic loss: 0.41   steps: 344


training loop:   0% |                                  | ETA:  37 days, 7:07:47

Episode: 345   score: 10.04   Avg score (100e): 8.99   actor gain: -0.45   critic loss: 0.41   steps: 345


training loop:   0% |                                  | ETA:  37 days, 6:55:19

Episode: 346   score: 10.06   Avg score (100e): 9.01   actor gain: -0.45   critic loss: 0.41   steps: 346


training loop:   0% |                                  | ETA:  37 days, 6:36:10

Episode: 347   score: 10.06   Avg score (100e): 9.03   actor gain: -0.45   critic loss: 0.41   steps: 347


training loop:   0% |                                  | ETA:  37 days, 6:39:13

Episode: 348   score: 10.07   Avg score (100e): 9.05   actor gain: -0.45   critic loss: 0.41   steps: 348


training loop:   0% |                                  | ETA:  37 days, 6:45:13

Episode: 349   score: 10.09   Avg score (100e): 9.07   actor gain: -0.45   critic loss: 0.41   steps: 349


training loop:   0% |                                  | ETA:  37 days, 6:55:42

Episode: 350   score: 10.11   Avg score (100e): 9.09   actor gain: -0.45   critic loss: 0.40   steps: 350


training loop:   0% |                                  | ETA:  37 days, 6:46:55

Episode: 351   score: 10.13   Avg score (100e): 9.11   actor gain: -0.45   critic loss: 0.40   steps: 351


training loop:   0% |                                  | ETA:  37 days, 6:33:24

Episode: 352   score: 10.14   Avg score (100e): 9.13   actor gain: -0.45   critic loss: 0.40   steps: 352


training loop:   0% |                                  | ETA:  37 days, 6:22:34

Episode: 353   score: 10.16   Avg score (100e): 9.16   actor gain: -0.45   critic loss: 0.40   steps: 353


training loop:   0% |                                  | ETA:  37 days, 6:06:23

Episode: 354   score: 10.20   Avg score (100e): 9.18   actor gain: -0.45   critic loss: 0.40   steps: 354


training loop:   0% |                                  | ETA:  37 days, 5:48:27

Episode: 355   score: 10.22   Avg score (100e): 9.20   actor gain: -0.45   critic loss: 0.40   steps: 355


training loop:   0% |                                  | ETA:  37 days, 5:42:07

Episode: 356   score: 10.26   Avg score (100e): 9.22   actor gain: -0.45   critic loss: 0.40   steps: 356


training loop:   0% |                                  | ETA:  37 days, 5:51:20

Episode: 357   score: 10.27   Avg score (100e): 9.24   actor gain: -0.45   critic loss: 0.40   steps: 357


training loop:   0% |                                  | ETA:  37 days, 5:49:30

Episode: 358   score: 10.29   Avg score (100e): 9.26   actor gain: -0.45   critic loss: 0.40   steps: 358


training loop:   0% |                                  | ETA:  37 days, 5:41:10

Episode: 359   score: 10.32   Avg score (100e): 9.28   actor gain: -0.45   critic loss: 0.41   steps: 359


training loop:   0% |                                  | ETA:  37 days, 5:40:04

Episode: 360   score: 10.32   Avg score (100e): 9.30   actor gain: -0.45   critic loss: 0.41   steps: 360


training loop:   0% |                                  | ETA:  37 days, 5:24:26

Episode: 361   score: 10.34   Avg score (100e): 9.33   actor gain: -0.46   critic loss: 0.41   steps: 361


training loop:   0% |                                  | ETA:  37 days, 5:14:55

Episode: 362   score: 10.37   Avg score (100e): 9.35   actor gain: -0.46   critic loss: 0.41   steps: 362


training loop:   0% |                                  | ETA:  37 days, 5:10:28

Episode: 363   score: 10.39   Avg score (100e): 9.37   actor gain: -0.46   critic loss: 0.41   steps: 363


training loop:   0% |                                  | ETA:  37 days, 5:26:40

Episode: 364   score: 10.42   Avg score (100e): 9.39   actor gain: -0.46   critic loss: 0.41   steps: 364


training loop:   0% |                                  | ETA:  37 days, 5:47:39

Episode: 365   score: 10.43   Avg score (100e): 9.41   actor gain: -0.47   critic loss: 0.41   steps: 365


training loop:   0% |                                  | ETA:  37 days, 5:38:18

Episode: 366   score: 10.46   Avg score (100e): 9.43   actor gain: -0.46   critic loss: 0.41   steps: 366


training loop:   0% |                                  | ETA:  37 days, 5:29:22

Episode: 367   score: 10.49   Avg score (100e): 9.45   actor gain: -0.46   critic loss: 0.41   steps: 367


training loop:   0% |                                  | ETA:  37 days, 5:17:37

Episode: 368   score: 10.51   Avg score (100e): 9.47   actor gain: -0.47   critic loss: 0.41   steps: 368


training loop:   0% |                                  | ETA:  37 days, 4:56:58

Episode: 369   score: 10.51   Avg score (100e): 9.49   actor gain: -0.47   critic loss: 0.41   steps: 369


training loop:   0% |                                  | ETA:  37 days, 4:59:48

Episode: 370   score: 10.52   Avg score (100e): 9.51   actor gain: -0.47   critic loss: 0.41   steps: 370


training loop:   0% |                                  | ETA:  37 days, 4:55:41

Episode: 371   score: 10.52   Avg score (100e): 9.54   actor gain: -0.47   critic loss: 0.41   steps: 371


training loop:   0% |                                  | ETA:  37 days, 4:50:55

Episode: 372   score: 10.53   Avg score (100e): 9.56   actor gain: -0.47   critic loss: 0.41   steps: 372


training loop:   0% |                                  | ETA:  37 days, 4:46:26

Episode: 373   score: 10.54   Avg score (100e): 9.58   actor gain: -0.47   critic loss: 0.41   steps: 373


training loop:   0% |                                  | ETA:  37 days, 4:29:10

Episode: 374   score: 10.54   Avg score (100e): 9.60   actor gain: -0.47   critic loss: 0.41   steps: 374


training loop:   0% |                                  | ETA:  37 days, 4:17:20

Episode: 375   score: 10.55   Avg score (100e): 9.62   actor gain: -0.47   critic loss: 0.41   steps: 375


training loop:   0% |                                  | ETA:  37 days, 4:04:29

Episode: 376   score: 10.57   Avg score (100e): 9.64   actor gain: -0.47   critic loss: 0.41   steps: 376


training loop:   0% |                                  | ETA:  37 days, 3:52:15

Episode: 377   score: 10.57   Avg score (100e): 9.66   actor gain: -0.47   critic loss: 0.41   steps: 377


training loop:   0% |                                  | ETA:  37 days, 3:47:14

Episode: 378   score: 10.59   Avg score (100e): 9.68   actor gain: -0.47   critic loss: 0.41   steps: 378


training loop:   0% |                                  | ETA:  37 days, 3:42:49

Episode: 379   score: 10.58   Avg score (100e): 9.69   actor gain: -0.47   critic loss: 0.41   steps: 379


training loop:   0% |                                  | ETA:  37 days, 3:28:57

Episode: 380   score: 10.59   Avg score (100e): 9.71   actor gain: -0.47   critic loss: 0.41   steps: 380


training loop:   0% |                                  | ETA:  37 days, 3:17:13

Episode: 381   score: 10.59   Avg score (100e): 9.73   actor gain: -0.47   critic loss: 0.40   steps: 381


training loop:   0% |                                  | ETA:  37 days, 3:06:38

Episode: 382   score: 10.61   Avg score (100e): 9.75   actor gain: -0.48   critic loss: 0.40   steps: 382


training loop:   0% |                                  | ETA:  37 days, 2:50:55

Episode: 383   score: 10.59   Avg score (100e): 9.77   actor gain: -0.48   critic loss: 0.40   steps: 383


training loop:   0% |                                  | ETA:  37 days, 2:40:26

Episode: 384   score: 10.59   Avg score (100e): 9.79   actor gain: -0.48   critic loss: 0.40   steps: 384


training loop:   0% |                                  | ETA:  37 days, 2:31:02

Episode: 385   score: 10.61   Avg score (100e): 9.81   actor gain: -0.48   critic loss: 0.40   steps: 385


training loop:   0% |                                  | ETA:  37 days, 2:19:25

Episode: 386   score: 10.62   Avg score (100e): 9.83   actor gain: -0.47   critic loss: 0.40   steps: 386


training loop:   0% |                                  | ETA:  37 days, 2:06:41

Episode: 387   score: 10.62   Avg score (100e): 9.84   actor gain: -0.46   critic loss: 0.40   steps: 387


training loop:   0% |                                  | ETA:  37 days, 1:52:41

Episode: 388   score: 10.64   Avg score (100e): 9.86   actor gain: -0.46   critic loss: 0.40   steps: 388


training loop:   0% |                                  | ETA:  37 days, 1:56:11

Episode: 389   score: 10.67   Avg score (100e): 9.88   actor gain: -0.46   critic loss: 0.40   steps: 389


training loop:   0% |                                  | ETA:  37 days, 1:44:23

Episode: 390   score: 10.69   Avg score (100e): 9.90   actor gain: -0.45   critic loss: 0.40   steps: 390


training loop:   0% |                                  | ETA:  37 days, 1:28:17

Episode: 391   score: 10.70   Avg score (100e): 9.92   actor gain: -0.45   critic loss: 0.40   steps: 391


training loop:   0% |                                  | ETA:  37 days, 1:18:25

Episode: 392   score: 10.71   Avg score (100e): 9.93   actor gain: -0.45   critic loss: 0.40   steps: 392


training loop:   0% |                                  | ETA:  37 days, 1:08:42

Episode: 393   score: 10.73   Avg score (100e): 9.95   actor gain: -0.44   critic loss: 0.39   steps: 393


training loop:   0% |                                  | ETA:  37 days, 0:52:20

Episode: 394   score: 10.74   Avg score (100e): 9.97   actor gain: -0.45   critic loss: 0.39   steps: 394


training loop:   0% |                                  | ETA:  37 days, 0:41:47

Episode: 395   score: 10.75   Avg score (100e): 9.99   actor gain: -0.44   critic loss: 0.39   steps: 395


training loop:   0% |                                  | ETA:  37 days, 0:25:59

Episode: 396   score: 10.75   Avg score (100e): 10.00   actor gain: -0.44   critic loss: 0.39   steps: 396


training loop:   0% |                                  | ETA:  37 days, 0:14:42

Episode: 397   score: 10.75   Avg score (100e): 10.02   actor gain: -0.44   critic loss: 0.39   steps: 397


training loop:   0% |                                  | ETA:  37 days, 0:26:31

Episode: 398   score: 10.77   Avg score (100e): 10.04   actor gain: -0.44   critic loss: 0.39   steps: 398


training loop:   0% |                                  | ETA:  37 days, 0:16:06

Episode: 399   score: 10.77   Avg score (100e): 10.05   actor gain: -0.44   critic loss: 0.39   steps: 399


training loop:   0% |                                  | ETA:  37 days, 0:13:21

Episode: 400   score: 10.80   Avg score (100e): 10.07   actor gain: -0.45   critic loss: 0.39   steps: 400


training loop:   0% |                                  | ETA:  37 days, 0:08:40

Episode: 401   score: 10.80   Avg score (100e): 10.09   actor gain: -0.45   critic loss: 0.39   steps: 401


training loop:   0% |                                 | ETA:  36 days, 23:56:55

Episode: 402   score: 10.81   Avg score (100e): 10.10   actor gain: -0.45   critic loss: 0.39   steps: 402


training loop:   0% |                                 | ETA:  36 days, 23:42:34

Episode: 403   score: 10.82   Avg score (100e): 10.12   actor gain: -0.44   critic loss: 0.39   steps: 403


training loop:   0% |                                 | ETA:  36 days, 23:29:54

Episode: 404   score: 10.83   Avg score (100e): 10.14   actor gain: -0.44   critic loss: 0.39   steps: 404


training loop:   0% |                                 | ETA:  36 days, 23:18:51

Episode: 405   score: 10.83   Avg score (100e): 10.15   actor gain: -0.44   critic loss: 0.39   steps: 405


training loop:   0% |                                 | ETA:  36 days, 23:04:19

Episode: 406   score: 10.84   Avg score (100e): 10.17   actor gain: -0.44   critic loss: 0.39   steps: 406


training loop:   0% |                                 | ETA:  36 days, 22:52:22

Episode: 407   score: 10.84   Avg score (100e): 10.19   actor gain: -0.44   critic loss: 0.39   steps: 407


training loop:   0% |                                 | ETA:  36 days, 22:38:46

Episode: 408   score: 10.84   Avg score (100e): 10.20   actor gain: -0.44   critic loss: 0.39   steps: 408


training loop:   0% |                                 | ETA:  36 days, 22:25:15

Episode: 409   score: 10.87   Avg score (100e): 10.22   actor gain: -0.44   critic loss: 0.39   steps: 409


training loop:   0% |                                 | ETA:  36 days, 22:08:30

Episode: 410   score: 10.87   Avg score (100e): 10.23   actor gain: -0.44   critic loss: 0.39   steps: 410


training loop:   0% |                                 | ETA:  36 days, 21:54:03

Episode: 411   score: 10.89   Avg score (100e): 10.25   actor gain: -0.44   critic loss: 0.39   steps: 411


training loop:   0% |                                 | ETA:  36 days, 21:42:46

Episode: 412   score: 10.89   Avg score (100e): 10.26   actor gain: -0.44   critic loss: 0.39   steps: 412


training loop:   0% |                                 | ETA:  36 days, 21:29:39

Episode: 413   score: 10.90   Avg score (100e): 10.28   actor gain: -0.44   critic loss: 0.39   steps: 413


training loop:   0% |                                 | ETA:  36 days, 21:15:16

Episode: 414   score: 10.89   Avg score (100e): 10.29   actor gain: -0.44   critic loss: 0.40   steps: 414


training loop:   0% |                                 | ETA:  36 days, 21:00:34

Episode: 415   score: 10.90   Avg score (100e): 10.31   actor gain: -0.44   critic loss: 0.40   steps: 415


training loop:   0% |                                 | ETA:  36 days, 20:51:29

Episode: 416   score: 10.91   Avg score (100e): 10.32   actor gain: -0.44   critic loss: 0.40   steps: 416


training loop:   0% |                                 | ETA:  36 days, 20:44:22

Episode: 417   score: 10.90   Avg score (100e): 10.34   actor gain: -0.44   critic loss: 0.40   steps: 417


training loop:   0% |                                 | ETA:  36 days, 20:26:07

Episode: 418   score: 10.92   Avg score (100e): 10.35   actor gain: -0.44   critic loss: 0.40   steps: 418


training loop:   0% |                                 | ETA:  36 days, 20:11:38

Episode: 419   score: 10.90   Avg score (100e): 10.37   actor gain: -0.44   critic loss: 0.39   steps: 419


training loop:   0% |                                 | ETA:  36 days, 19:58:47

Episode: 420   score: 10.91   Avg score (100e): 10.38   actor gain: -0.44   critic loss: 0.39   steps: 420


training loop:   0% |                                 | ETA:  36 days, 19:47:28

Episode: 421   score: 10.92   Avg score (100e): 10.39   actor gain: -0.44   critic loss: 0.40   steps: 421


training loop:   0% |                                 | ETA:  36 days, 19:30:54

Episode: 422   score: 10.92   Avg score (100e): 10.41   actor gain: -0.44   critic loss: 0.39   steps: 422


training loop:   0% |                                 | ETA:  36 days, 19:14:21

Episode: 423   score: 10.92   Avg score (100e): 10.42   actor gain: -0.44   critic loss: 0.39   steps: 423


training loop:   0% |                                 | ETA:  36 days, 19:00:43

Episode: 424   score: 10.94   Avg score (100e): 10.44   actor gain: -0.44   critic loss: 0.39   steps: 424


training loop:   0% |                                 | ETA:  36 days, 18:49:12

Episode: 425   score: 10.95   Avg score (100e): 10.45   actor gain: -0.43   critic loss: 0.39   steps: 425


training loop:   0% |                                 | ETA:  36 days, 18:35:44

Episode: 426   score: 10.95   Avg score (100e): 10.46   actor gain: -0.43   critic loss: 0.39   steps: 426


training loop:   0% |                                 | ETA:  36 days, 18:24:32

Episode: 427   score: 10.95   Avg score (100e): 10.47   actor gain: -0.43   critic loss: 0.39   steps: 427


training loop:   0% |                                 | ETA:  36 days, 18:11:03

Episode: 428   score: 10.96   Avg score (100e): 10.49   actor gain: -0.46   critic loss: 0.39   steps: 428


training loop:   0% |                                 | ETA:  36 days, 18:05:50

Episode: 429   score: 10.97   Avg score (100e): 10.50   actor gain: -0.47   critic loss: 0.39   steps: 429


training loop:   0% |                                 | ETA:  36 days, 18:05:18

Episode: 430   score: 10.96   Avg score (100e): 10.51   actor gain: -0.46   critic loss: 0.39   steps: 430


training loop:   0% |                                 | ETA:  36 days, 17:55:33

Episode: 431   score: 10.96   Avg score (100e): 10.53   actor gain: -0.46   critic loss: 0.39   steps: 431


training loop:   0% |                                 | ETA:  36 days, 17:47:48

Episode: 432   score: 10.99   Avg score (100e): 10.54   actor gain: -0.46   critic loss: 0.39   steps: 432


training loop:   0% |                                 | ETA:  36 days, 17:37:56

Episode: 433   score: 10.98   Avg score (100e): 10.55   actor gain: -0.46   critic loss: 0.39   steps: 433


training loop:   0% |                                 | ETA:  36 days, 17:30:37

Episode: 434   score: 10.98   Avg score (100e): 10.56   actor gain: -0.46   critic loss: 0.39   steps: 434


training loop:   0% |                                 | ETA:  36 days, 17:19:06

Episode: 435   score: 10.98   Avg score (100e): 10.57   actor gain: -0.51   critic loss: 0.39   steps: 435


training loop:   0% |                                 | ETA:  36 days, 17:08:34

Episode: 436   score: 10.99   Avg score (100e): 10.59   actor gain: -0.51   critic loss: 0.39   steps: 436


training loop:   0% |                                 | ETA:  36 days, 16:56:04

Episode: 437   score: 10.98   Avg score (100e): 10.60   actor gain: -0.51   critic loss: 0.39   steps: 437


training loop:   0% |                                 | ETA:  36 days, 16:59:09

Episode: 438   score: 10.99   Avg score (100e): 10.61   actor gain: -0.51   critic loss: 0.39   steps: 438


training loop:   0% |                                 | ETA:  36 days, 16:47:40

Episode: 439   score: 10.98   Avg score (100e): 10.62   actor gain: -0.51   critic loss: 0.39   steps: 439


training loop:   0% |                                 | ETA:  36 days, 16:34:37

Episode: 440   score: 10.98   Avg score (100e): 10.63   actor gain: -0.51   critic loss: 0.39   steps: 440


training loop:   0% |                                 | ETA:  36 days, 16:25:23

Episode: 441   score: 10.98   Avg score (100e): 10.64   actor gain: -0.51   critic loss: 0.39   steps: 441


training loop:   0% |                                 | ETA:  36 days, 16:11:33

Episode: 442   score: 10.98   Avg score (100e): 10.65   actor gain: -0.51   critic loss: 0.39   steps: 442


training loop:   0% |                                 | ETA:  36 days, 15:57:41

Episode: 443   score: 10.98   Avg score (100e): 10.66   actor gain: -0.51   critic loss: 0.39   steps: 443


training loop:   0% |                                 | ETA:  36 days, 15:42:54

Episode: 444   score: 10.98   Avg score (100e): 10.67   actor gain: -0.51   critic loss: 0.39   steps: 444


training loop:   0% |                                 | ETA:  36 days, 15:27:52

Episode: 445   score: 10.98   Avg score (100e): 10.68   actor gain: -0.51   critic loss: 0.39   steps: 445


training loop:   0% |                                 | ETA:  36 days, 15:14:34

Episode: 446   score: 10.99   Avg score (100e): 10.69   actor gain: -0.51   critic loss: 0.39   steps: 446


training loop:   0% |                                 | ETA:  36 days, 14:59:43

Episode: 447   score: 10.99   Avg score (100e): 10.70   actor gain: -0.51   critic loss: 0.39   steps: 447


training loop:   0% |                                 | ETA:  36 days, 14:45:46

Episode: 448   score: 11.00   Avg score (100e): 10.71   actor gain: -0.51   critic loss: 0.39   steps: 448


training loop:   0% |                                 | ETA:  36 days, 14:36:03

Episode: 449   score: 11.00   Avg score (100e): 10.72   actor gain: -0.51   critic loss: 0.39   steps: 449


training loop:   0% |                                 | ETA:  36 days, 14:21:05

Episode: 450   score: 11.00   Avg score (100e): 10.72   actor gain: -0.51   critic loss: 0.39   steps: 450


training loop:   0% |                                 | ETA:  36 days, 14:04:36

Episode: 451   score: 11.00   Avg score (100e): 10.73   actor gain: -0.51   critic loss: 0.39   steps: 451


training loop:   0% |                                 | ETA:  36 days, 13:47:39

Episode: 452   score: 11.00   Avg score (100e): 10.74   actor gain: -0.51   critic loss: 0.39   steps: 452


training loop:   0% |                                 | ETA:  36 days, 13:33:34

Episode: 453   score: 10.99   Avg score (100e): 10.75   actor gain: -0.48   critic loss: 0.39   steps: 453


training loop:   0% |                                 | ETA:  36 days, 13:19:26

Episode: 454   score: 11.00   Avg score (100e): 10.76   actor gain: -0.48   critic loss: 0.39   steps: 454


training loop:   0% |                                 | ETA:  36 days, 13:01:26

Episode: 455   score: 11.01   Avg score (100e): 10.77   actor gain: -0.48   critic loss: 0.39   steps: 455


training loop:   0% |                                 | ETA:  36 days, 12:47:47

Episode: 456   score: 11.00   Avg score (100e): 10.77   actor gain: -0.48   critic loss: 0.39   steps: 456


training loop:   0% |                                 | ETA:  36 days, 12:34:54

Episode: 457   score: 11.00   Avg score (100e): 10.78   actor gain: -0.48   critic loss: 0.39   steps: 457


training loop:   0% |                                 | ETA:  36 days, 12:20:04

Episode: 458   score: 11.00   Avg score (100e): 10.79   actor gain: -0.48   critic loss: 0.39   steps: 458


training loop:   0% |                                 | ETA:  36 days, 12:07:28

Episode: 459   score: 11.00   Avg score (100e): 10.80   actor gain: -0.48   critic loss: 0.39   steps: 459


training loop:   0% |                                 | ETA:  36 days, 11:59:54

Episode: 460   score: 11.00   Avg score (100e): 10.80   actor gain: -0.43   critic loss: 0.39   steps: 460


training loop:   0% |                                 | ETA:  36 days, 11:44:43

Episode: 461   score: 11.00   Avg score (100e): 10.81   actor gain: -0.43   critic loss: 0.39   steps: 461


training loop:   0% |                                 | ETA:  36 days, 11:51:25

Episode: 462   score: 11.00   Avg score (100e): 10.81   actor gain: -0.43   critic loss: 0.39   steps: 462


training loop:   0% |                                 | ETA:  36 days, 11:40:15

Episode: 463   score: 11.00   Avg score (100e): 10.82   actor gain: -0.43   critic loss: 0.39   steps: 463


training loop:   0% |                                 | ETA:  36 days, 11:32:32

Episode: 464   score: 11.00   Avg score (100e): 10.83   actor gain: -0.43   critic loss: 0.39   steps: 464


training loop:   0% |                                 | ETA:  36 days, 11:25:30

Episode: 465   score: 11.01   Avg score (100e): 10.83   actor gain: -0.43   critic loss: 0.39   steps: 465


training loop:   0% |                                 | ETA:  36 days, 11:17:26

Episode: 466   score: 11.01   Avg score (100e): 10.84   actor gain: -0.43   critic loss: 0.39   steps: 466


training loop:   0% |                                 | ETA:  36 days, 11:05:46

Episode: 467   score: 11.01   Avg score (100e): 10.84   actor gain: -0.43   critic loss: 0.39   steps: 467


training loop:   0% |                                 | ETA:  36 days, 10:54:57

Episode: 468   score: 11.02   Avg score (100e): 10.85   actor gain: -0.43   critic loss: 0.39   steps: 468


training loop:   0% |                                 | ETA:  36 days, 10:46:14

Episode: 469   score: 11.02   Avg score (100e): 10.85   actor gain: -0.43   critic loss: 0.39   steps: 469


training loop:   0% |                                 | ETA:  36 days, 10:37:12

Episode: 470   score: 11.02   Avg score (100e): 10.86   actor gain: -0.43   critic loss: 0.39   steps: 470


training loop:   0% |                                 | ETA:  36 days, 10:29:21

Episode: 471   score: 11.02   Avg score (100e): 10.86   actor gain: -0.43   critic loss: 0.39   steps: 471


training loop:   0% |                                 | ETA:  36 days, 10:18:09

Episode: 472   score: 11.01   Avg score (100e): 10.87   actor gain: -0.43   critic loss: 0.39   steps: 472


training loop:   0% |                                 | ETA:  36 days, 10:06:17

Episode: 473   score: 11.02   Avg score (100e): 10.87   actor gain: -0.43   critic loss: 0.39   steps: 473


training loop:   0% |                                  | ETA:  36 days, 9:56:13

Episode: 474   score: 11.02   Avg score (100e): 10.88   actor gain: -0.43   critic loss: 0.39   steps: 474


training loop:   0% |                                  | ETA:  36 days, 9:44:47

Episode: 475   score: 11.03   Avg score (100e): 10.88   actor gain: -0.43   critic loss: 0.39   steps: 475


training loop:   0% |                                  | ETA:  36 days, 9:32:07

Episode: 476   score: 11.04   Avg score (100e): 10.89   actor gain: -0.43   critic loss: 0.39   steps: 476


training loop:   0% |                                  | ETA:  36 days, 9:24:42

Episode: 477   score: 11.04   Avg score (100e): 10.89   actor gain: -0.43   critic loss: 0.39   steps: 477


training loop:   0% |                                  | ETA:  36 days, 9:12:43

Episode: 478   score: 11.05   Avg score (100e): 10.90   actor gain: -0.43   critic loss: 0.39   steps: 478


training loop:   0% |                                  | ETA:  36 days, 9:02:47

Episode: 479   score: 11.06   Avg score (100e): 10.90   actor gain: -0.43   critic loss: 0.39   steps: 479


training loop:   0% |                                  | ETA:  36 days, 8:51:27

Episode: 480   score: 11.05   Avg score (100e): 10.91   actor gain: -0.43   critic loss: 0.39   steps: 480


training loop:   0% |                                  | ETA:  36 days, 8:41:23

Episode: 481   score: 11.04   Avg score (100e): 10.91   actor gain: -0.43   critic loss: 0.39   steps: 481


training loop:   0% |                                  | ETA:  36 days, 8:31:59

Episode: 482   score: 11.06   Avg score (100e): 10.92   actor gain: -0.43   critic loss: 0.39   steps: 482


training loop:   0% |                                  | ETA:  36 days, 8:21:28

Episode: 483   score: 11.06   Avg score (100e): 10.92   actor gain: -0.43   critic loss: 0.39   steps: 483


training loop:   0% |                                  | ETA:  36 days, 8:11:24

Episode: 484   score: 11.06   Avg score (100e): 10.92   actor gain: -0.43   critic loss: 0.39   steps: 484


training loop:   0% |                                  | ETA:  36 days, 8:03:51

Episode: 485   score: 11.06   Avg score (100e): 10.93   actor gain: -0.43   critic loss: 0.39   steps: 485


training loop:   0% |                                  | ETA:  36 days, 7:51:43

Episode: 486   score: 11.07   Avg score (100e): 10.93   actor gain: -0.43   critic loss: 0.39   steps: 486


training loop:   0% |                                  | ETA:  36 days, 7:39:08

Episode: 487   score: 11.07   Avg score (100e): 10.94   actor gain: -0.43   critic loss: 0.39   steps: 487


training loop:   0% |                                  | ETA:  36 days, 7:30:05

Episode: 488   score: 11.07   Avg score (100e): 10.94   actor gain: -0.44   critic loss: 0.39   steps: 488


training loop:   0% |                                  | ETA:  36 days, 7:22:53

Episode: 489   score: 11.07   Avg score (100e): 10.95   actor gain: -0.44   critic loss: 0.39   steps: 489


training loop:   0% |                                  | ETA:  36 days, 7:13:25

Episode: 490   score: 11.07   Avg score (100e): 10.95   actor gain: -0.44   critic loss: 0.39   steps: 490


training loop:   0% |                                  | ETA:  36 days, 7:02:01

Episode: 491   score: 11.07   Avg score (100e): 10.95   actor gain: -0.44   critic loss: 0.39   steps: 491


training loop:   0% |                                  | ETA:  36 days, 6:52:00

Episode: 492   score: 11.07   Avg score (100e): 10.96   actor gain: -0.44   critic loss: 0.39   steps: 492


training loop:   0% |                                  | ETA:  36 days, 6:40:22

Episode: 493   score: 11.07   Avg score (100e): 10.96   actor gain: -0.44   critic loss: 0.39   steps: 493


training loop:   0% |                                  | ETA:  36 days, 6:32:34

Episode: 494   score: 11.08   Avg score (100e): 10.96   actor gain: -0.44   critic loss: 0.39   steps: 494


training loop:   0% |                                  | ETA:  36 days, 6:39:50

Episode: 495   score: 11.07   Avg score (100e): 10.97   actor gain: -0.44   critic loss: 0.39   steps: 495


training loop:   0% |                                  | ETA:  36 days, 6:37:00

Episode: 496   score: 11.08   Avg score (100e): 10.97   actor gain: -0.44   critic loss: 0.39   steps: 496


training loop:   0% |                                  | ETA:  36 days, 6:33:43

Episode: 497   score: 11.08   Avg score (100e): 10.97   actor gain: -0.44   critic loss: 0.39   steps: 497


training loop:   0% |                                  | ETA:  36 days, 6:25:41

Episode: 498   score: 11.09   Avg score (100e): 10.98   actor gain: -0.43   critic loss: 0.39   steps: 498


training loop:   0% |                                  | ETA:  36 days, 6:18:30

Episode: 499   score: 11.09   Avg score (100e): 10.98   actor gain: -0.43   critic loss: 0.39   steps: 499


training loop:   0% |                                  | ETA:  36 days, 6:11:21

Episode: 500   score: 11.10   Avg score (100e): 10.98   actor gain: -0.43   critic loss: 0.39   steps: 500


training loop:   1% |                                  | ETA:  36 days, 6:02:30

Episode: 501   score: 11.10   Avg score (100e): 10.99   actor gain: -0.44   critic loss: 0.39   steps: 501


training loop:   1% |                                  | ETA:  36 days, 5:52:31

Episode: 502   score: 11.10   Avg score (100e): 10.99   actor gain: -0.44   critic loss: 0.39   steps: 502


training loop:   1% |                                  | ETA:  36 days, 5:50:51

Episode: 503   score: 11.11   Avg score (100e): 10.99   actor gain: -0.44   critic loss: 0.39   steps: 503


training loop:   1% |                                  | ETA:  36 days, 5:41:25

Episode: 504   score: 11.10   Avg score (100e): 10.99   actor gain: -0.44   critic loss: 0.39   steps: 504


training loop:   1% |                                  | ETA:  36 days, 5:33:36

Episode: 505   score: 11.10   Avg score (100e): 11.00   actor gain: -0.44   critic loss: 0.39   steps: 505


training loop:   1% |                                  | ETA:  36 days, 5:20:40

Episode: 506   score: 11.11   Avg score (100e): 11.00   actor gain: -0.43   critic loss: 0.39   steps: 506


training loop:   1% |                                  | ETA:  36 days, 5:11:15

Episode: 507   score: 11.12   Avg score (100e): 11.00   actor gain: -0.43   critic loss: 0.39   steps: 507


training loop:   1% |                                  | ETA:  36 days, 5:03:31

Episode: 508   score: 11.12   Avg score (100e): 11.01   actor gain: -0.43   critic loss: 0.39   steps: 508


training loop:   1% |                                  | ETA:  36 days, 4:52:54

Episode: 509   score: 11.12   Avg score (100e): 11.01   actor gain: -0.43   critic loss: 0.39   steps: 509


training loop:   1% |                                  | ETA:  36 days, 4:41:25

Episode: 510   score: 11.12   Avg score (100e): 11.01   actor gain: -0.43   critic loss: 0.39   steps: 510


training loop:   1% |                                  | ETA:  36 days, 4:31:03

Episode: 511   score: 11.12   Avg score (100e): 11.01   actor gain: -0.43   critic loss: 0.39   steps: 511


training loop:   1% |                                  | ETA:  36 days, 4:21:33

Episode: 512   score: 11.12   Avg score (100e): 11.02   actor gain: -0.52   critic loss: 0.39   steps: 512


training loop:   1% |                                  | ETA:  36 days, 4:09:04

Episode: 513   score: 11.12   Avg score (100e): 11.02   actor gain: -0.51   critic loss: 0.39   steps: 513


training loop:   1% |                                  | ETA:  36 days, 3:58:27

Episode: 514   score: 11.13   Avg score (100e): 11.02   actor gain: -0.51   critic loss: 0.39   steps: 514


training loop:   1% |                                  | ETA:  36 days, 3:48:57

Episode: 515   score: 11.13   Avg score (100e): 11.02   actor gain: -0.51   critic loss: 0.39   steps: 515


training loop:   1% |                                  | ETA:  36 days, 3:36:35

Episode: 516   score: 11.16   Avg score (100e): 11.02   actor gain: -0.51   critic loss: 0.39   steps: 516


training loop:   1% |                                  | ETA:  36 days, 3:27:30

Episode: 517   score: 11.18   Avg score (100e): 11.03   actor gain: -0.51   critic loss: 0.39   steps: 517


training loop:   1% |                                  | ETA:  36 days, 3:15:30

Episode: 518   score: 11.17   Avg score (100e): 11.03   actor gain: -0.51   critic loss: 0.39   steps: 518


training loop:   1% |                                  | ETA:  36 days, 3:03:16

Episode: 519   score: 11.17   Avg score (100e): 11.03   actor gain: -0.51   critic loss: 0.40   steps: 519


training loop:   1% |                                  | ETA:  36 days, 2:57:32

Episode: 520   score: 11.19   Avg score (100e): 11.04   actor gain: -0.51   critic loss: 0.40   steps: 520


training loop:   1% |                                  | ETA:  36 days, 2:57:53

Episode: 521   score: 11.21   Avg score (100e): 11.04   actor gain: -0.51   critic loss: 0.40   steps: 521


training loop:   1% |                                  | ETA:  36 days, 2:54:31

Episode: 522   score: 11.21   Avg score (100e): 11.04   actor gain: -0.51   critic loss: 0.40   steps: 522


training loop:   1% |                                  | ETA:  36 days, 2:48:55

Episode: 523   score: 11.22   Avg score (100e): 11.04   actor gain: -0.51   critic loss: 0.40   steps: 523


training loop:   1% |                                  | ETA:  36 days, 2:37:50

Episode: 524   score: 11.22   Avg score (100e): 11.05   actor gain: -0.51   critic loss: 0.40   steps: 524


training loop:   1% |                                  | ETA:  36 days, 2:31:12

Episode: 525   score: 11.23   Avg score (100e): 11.05   actor gain: -0.51   critic loss: 0.40   steps: 525


training loop:   1% |                                  | ETA:  36 days, 2:22:18

Episode: 526   score: 11.24   Avg score (100e): 11.05   actor gain: -0.51   critic loss: 0.40   steps: 526


training loop:   1% |                                  | ETA:  36 days, 2:30:32

Episode: 527   score: 11.25   Avg score (100e): 11.06   actor gain: -0.51   critic loss: 0.40   steps: 527


training loop:   1% |                                  | ETA:  36 days, 2:25:40

Episode: 528   score: 11.26   Avg score (100e): 11.06   actor gain: -0.51   critic loss: 0.40   steps: 528


training loop:   1% |                                  | ETA:  36 days, 2:19:13

Episode: 529   score: 11.28   Avg score (100e): 11.06   actor gain: -0.51   critic loss: 0.40   steps: 529


training loop:   1% |                                  | ETA:  36 days, 2:13:22

Episode: 530   score: 11.29   Avg score (100e): 11.07   actor gain: -0.51   critic loss: 0.40   steps: 530


training loop:   1% |                                  | ETA:  36 days, 2:07:02

Episode: 531   score: 11.30   Avg score (100e): 11.07   actor gain: -0.52   critic loss: 0.40   steps: 531


training loop:   1% |                                  | ETA:  36 days, 1:59:49

Episode: 532   score: 11.29   Avg score (100e): 11.07   actor gain: -0.52   critic loss: 0.40   steps: 532


training loop:   1% |                                  | ETA:  36 days, 1:51:40

Episode: 533   score: 11.30   Avg score (100e): 11.07   actor gain: -0.52   critic loss: 0.40   steps: 533


training loop:   1% |                                  | ETA:  36 days, 1:45:49

Episode: 534   score: 11.30   Avg score (100e): 11.08   actor gain: -0.52   critic loss: 0.40   steps: 534


training loop:   1% |                                  | ETA:  36 days, 1:37:41

Episode: 535   score: 11.30   Avg score (100e): 11.08   actor gain: -0.52   critic loss: 0.40   steps: 535


training loop:   1% |                                  | ETA:  36 days, 1:29:14

Episode: 536   score: 11.32   Avg score (100e): 11.08   actor gain: -0.52   critic loss: 0.40   steps: 536


training loop:   1% |                                  | ETA:  36 days, 1:27:01

Episode: 537   score: 11.31   Avg score (100e): 11.09   actor gain: -0.43   critic loss: 0.40   steps: 537


training loop:   1% |                                  | ETA:  36 days, 1:20:55

Episode: 538   score: 11.30   Avg score (100e): 11.09   actor gain: -0.43   critic loss: 0.40   steps: 538


training loop:   1% |                                  | ETA:  36 days, 1:12:38

Episode: 539   score: 11.30   Avg score (100e): 11.09   actor gain: -0.43   critic loss: 0.40   steps: 539


training loop:   1% |                                  | ETA:  36 days, 1:07:50

Episode: 540   score: 11.31   Avg score (100e): 11.10   actor gain: -0.43   critic loss: 0.40   steps: 540


training loop:   1% |                                  | ETA:  36 days, 0:59:13

Episode: 541   score: 11.32   Avg score (100e): 11.10   actor gain: -0.43   critic loss: 0.40   steps: 541


training loop:   1% |                                  | ETA:  36 days, 0:51:50

Episode: 542   score: 11.34   Avg score (100e): 11.10   actor gain: -0.43   critic loss: 0.40   steps: 542


training loop:   1% |                                  | ETA:  36 days, 0:43:49

Episode: 543   score: 11.35   Avg score (100e): 11.11   actor gain: -0.43   critic loss: 0.40   steps: 543


training loop:   1% |                                  | ETA:  36 days, 0:35:43

Episode: 544   score: 11.36   Avg score (100e): 11.11   actor gain: -0.43   critic loss: 0.40   steps: 544


training loop:   1% |                                  | ETA:  36 days, 0:31:11

Episode: 545   score: 11.36   Avg score (100e): 11.12   actor gain: -0.43   critic loss: 0.40   steps: 545


training loop:   1% |                                  | ETA:  36 days, 0:25:50

Episode: 546   score: 11.36   Avg score (100e): 11.12   actor gain: -0.43   critic loss: 0.40   steps: 546


training loop:   1% |                                  | ETA:  36 days, 0:15:38

Episode: 547   score: 11.37   Avg score (100e): 11.12   actor gain: -0.43   critic loss: 0.40   steps: 547


training loop:   1% |                                  | ETA:  36 days, 0:04:41

Episode: 548   score: 11.37   Avg score (100e): 11.13   actor gain: -0.43   critic loss: 0.40   steps: 548


training loop:   1% |                                 | ETA:  35 days, 23:55:38

Episode: 549   score: 11.38   Avg score (100e): 11.13   actor gain: -0.43   critic loss: 0.40   steps: 549


training loop:   1% |                                 | ETA:  35 days, 23:55:13

Episode: 550   score: 11.39   Avg score (100e): 11.13   actor gain: -0.43   critic loss: 0.40   steps: 550


training loop:   1% |                                 | ETA:  35 days, 23:46:49

Episode: 551   score: 11.39   Avg score (100e): 11.14   actor gain: -0.43   critic loss: 0.40   steps: 551


training loop:   1% |                                 | ETA:  35 days, 23:39:17

Episode: 552   score: 11.40   Avg score (100e): 11.14   actor gain: -0.43   critic loss: 0.40   steps: 552


training loop:   1% |                                 | ETA:  35 days, 23:33:01

Episode: 553   score: 11.40   Avg score (100e): 11.15   actor gain: -0.43   critic loss: 0.40   steps: 553


training loop:   1% |                                 | ETA:  35 days, 23:25:36

Episode: 554   score: 11.42   Avg score (100e): 11.15   actor gain: -0.43   critic loss: 0.40   steps: 554


training loop:   1% |                                 | ETA:  35 days, 23:20:56

Episode: 555   score: 11.41   Avg score (100e): 11.16   actor gain: -0.43   critic loss: 0.40   steps: 555


training loop:   1% |                                 | ETA:  35 days, 23:12:43

Episode: 556   score: 11.41   Avg score (100e): 11.16   actor gain: -0.43   critic loss: 0.40   steps: 556


training loop:   1% |                                 | ETA:  35 days, 23:09:23

Episode: 557   score: 11.41   Avg score (100e): 11.16   actor gain: -0.42   critic loss: 0.40   steps: 557


training loop:   1% |                                 | ETA:  35 days, 23:00:51

Episode: 558   score: 11.42   Avg score (100e): 11.17   actor gain: -0.42   critic loss: 0.39   steps: 558


training loop:   1% |                                 | ETA:  35 days, 23:04:00

Episode: 559   score: 11.42   Avg score (100e): 11.17   actor gain: -0.42   critic loss: 0.39   steps: 559


training loop:   1% |                                 | ETA:  35 days, 22:59:47

Episode: 560   score: 11.42   Avg score (100e): 11.18   actor gain: -0.42   critic loss: 0.39   steps: 560


training loop:   1% |                                 | ETA:  35 days, 22:53:14

Episode: 561   score: 11.42   Avg score (100e): 11.18   actor gain: -0.42   critic loss: 0.39   steps: 561


training loop:   1% |                                 | ETA:  35 days, 22:45:59

Episode: 562   score: 11.42   Avg score (100e): 11.18   actor gain: -0.43   critic loss: 0.39   steps: 562


training loop:   1% |                                 | ETA:  35 days, 22:42:44

Episode: 563   score: 11.43   Avg score (100e): 11.19   actor gain: -0.42   critic loss: 0.39   steps: 563


training loop:   1% |                                 | ETA:  35 days, 22:34:31

Episode: 564   score: 11.46   Avg score (100e): 11.19   actor gain: -0.43   critic loss: 0.39   steps: 564


training loop:   1% |                                 | ETA:  35 days, 22:28:40

Episode: 565   score: 11.46   Avg score (100e): 11.20   actor gain: -0.43   critic loss: 0.39   steps: 565


training loop:   1% |                                 | ETA:  35 days, 22:20:50

Episode: 566   score: 11.45   Avg score (100e): 11.20   actor gain: -0.43   critic loss: 0.39   steps: 566


training loop:   1% |                                 | ETA:  35 days, 22:11:19

Episode: 567   score: 11.46   Avg score (100e): 11.21   actor gain: -0.44   critic loss: 0.39   steps: 567


training loop:   1% |                                 | ETA:  35 days, 22:02:40

Episode: 568   score: 11.47   Avg score (100e): 11.21   actor gain: -0.43   critic loss: 0.39   steps: 568


training loop:   1% |                                 | ETA:  35 days, 21:54:48

Episode: 569   score: 11.47   Avg score (100e): 11.22   actor gain: -0.44   critic loss: 0.39   steps: 569


training loop:   1% |                                 | ETA:  35 days, 21:48:44

Episode: 570   score: 11.47   Avg score (100e): 11.22   actor gain: -0.44   critic loss: 0.40   steps: 570


training loop:   1% |                                 | ETA:  35 days, 21:46:51

Episode: 571   score: 11.48   Avg score (100e): 11.22   actor gain: -0.44   critic loss: 0.40   steps: 571


training loop:   1% |                                 | ETA:  35 days, 21:40:02

Episode: 572   score: 11.48   Avg score (100e): 11.23   actor gain: -0.44   critic loss: 0.40   steps: 572


training loop:   1% |                                 | ETA:  35 days, 21:31:22

Episode: 573   score: 11.49   Avg score (100e): 11.23   actor gain: -0.44   critic loss: 0.40   steps: 573


training loop:   1% |                                 | ETA:  35 days, 21:24:00

Episode: 574   score: 11.48   Avg score (100e): 11.24   actor gain: -0.43   critic loss: 0.40   steps: 574


training loop:   1% |                                 | ETA:  35 days, 21:15:31

Episode: 575   score: 11.50   Avg score (100e): 11.24   actor gain: -0.43   critic loss: 0.40   steps: 575


training loop:   1% |                                 | ETA:  35 days, 21:10:00

Episode: 576   score: 11.50   Avg score (100e): 11.25   actor gain: -0.43   critic loss: 0.40   steps: 576


training loop:   1% |                                 | ETA:  35 days, 21:00:43

Episode: 577   score: 11.51   Avg score (100e): 11.25   actor gain: -0.43   critic loss: 0.40   steps: 577


training loop:   1% |                                 | ETA:  35 days, 20:54:35

Episode: 578   score: 11.52   Avg score (100e): 11.26   actor gain: -0.43   critic loss: 0.40   steps: 578


training loop:   1% |                                 | ETA:  35 days, 20:47:42

Episode: 579   score: 11.52   Avg score (100e): 11.26   actor gain: -0.43   critic loss: 0.40   steps: 579


training loop:   1% |                                 | ETA:  35 days, 20:40:55

Episode: 580   score: 11.54   Avg score (100e): 11.27   actor gain: -0.44   critic loss: 0.40   steps: 580


training loop:   1% |                                 | ETA:  35 days, 20:34:26

Episode: 581   score: 11.54   Avg score (100e): 11.27   actor gain: -0.44   critic loss: 0.40   steps: 581


training loop:   1% |                                 | ETA:  35 days, 20:33:54

Episode: 582   score: 11.53   Avg score (100e): 11.28   actor gain: -0.44   critic loss: 0.40   steps: 582


training loop:   1% |                                 | ETA:  35 days, 20:27:37

Episode: 583   score: 11.54   Avg score (100e): 11.28   actor gain: -0.44   critic loss: 0.40   steps: 583


training loop:   1% |                                 | ETA:  35 days, 20:19:15

Episode: 584   score: 11.54   Avg score (100e): 11.29   actor gain: -0.44   critic loss: 0.40   steps: 584


training loop:   1% |                                 | ETA:  35 days, 20:12:06

Episode: 585   score: 11.54   Avg score (100e): 11.29   actor gain: -0.44   critic loss: 0.40   steps: 585


training loop:   1% |                                 | ETA:  35 days, 20:06:57

Episode: 586   score: 11.55   Avg score (100e): 11.30   actor gain: -0.44   critic loss: 0.40   steps: 586


training loop:   1% |                                 | ETA:  35 days, 20:00:07

Episode: 587   score: 11.55   Avg score (100e): 11.30   actor gain: -0.44   critic loss: 0.40   steps: 587


training loop:   1% |                                 | ETA:  35 days, 19:54:39

Episode: 588   score: 11.57   Avg score (100e): 11.31   actor gain: -0.45   critic loss: 0.40   steps: 588


training loop:   1% |                                 | ETA:  35 days, 19:47:51

Episode: 589   score: 11.58   Avg score (100e): 11.31   actor gain: -0.44   critic loss: 0.40   steps: 589


training loop:   1% |                                 | ETA:  35 days, 19:38:49

Episode: 590   score: 11.57   Avg score (100e): 11.32   actor gain: -0.45   critic loss: 0.40   steps: 590


training loop:   1% |                                 | ETA:  35 days, 19:33:29

Episode: 591   score: 11.59   Avg score (100e): 11.32   actor gain: -0.45   critic loss: 0.40   steps: 591


training loop:   1% |                                 | ETA:  35 days, 19:40:30

Episode: 592   score: 11.60   Avg score (100e): 11.33   actor gain: -0.44   critic loss: 0.40   steps: 592


training loop:   1% |                                 | ETA:  35 days, 19:35:33

Episode: 593   score: 11.61   Avg score (100e): 11.33   actor gain: -0.45   critic loss: 0.40   steps: 593


training loop:   1% |                                 | ETA:  35 days, 19:33:17

Episode: 594   score: 11.61   Avg score (100e): 11.34   actor gain: -0.44   critic loss: 0.40   steps: 594


training loop:   1% |                                 | ETA:  35 days, 19:30:16

Episode: 595   score: 11.63   Avg score (100e): 11.34   actor gain: -0.45   critic loss: 0.40   steps: 595


training loop:   1% |                                 | ETA:  35 days, 19:23:05

Episode: 596   score: 11.63   Avg score (100e): 11.35   actor gain: -0.45   critic loss: 0.40   steps: 596


training loop:   1% |                                 | ETA:  35 days, 19:19:10

Episode: 597   score: 11.63   Avg score (100e): 11.35   actor gain: -0.45   critic loss: 0.40   steps: 597


training loop:   1% |                                 | ETA:  35 days, 19:10:29

Episode: 598   score: 11.64   Avg score (100e): 11.36   actor gain: -0.45   critic loss: 0.40   steps: 598


training loop:   1% |                                 | ETA:  35 days, 19:06:41

Episode: 599   score: 11.65   Avg score (100e): 11.36   actor gain: -0.46   critic loss: 0.40   steps: 599


training loop:   1% |                                 | ETA:  35 days, 19:02:34

Episode: 600   score: 11.65   Avg score (100e): 11.37   actor gain: -0.46   critic loss: 0.40   steps: 600


training loop:   1% |                                 | ETA:  35 days, 18:54:35

Episode: 601   score: 11.66   Avg score (100e): 11.38   actor gain: -0.46   critic loss: 0.40   steps: 601


training loop:   1% |                                 | ETA:  35 days, 18:45:29

Episode: 602   score: 11.66   Avg score (100e): 11.38   actor gain: -0.47   critic loss: 0.40   steps: 602


training loop:   1% |                                 | ETA:  35 days, 18:42:29

Episode: 603   score: 11.67   Avg score (100e): 11.39   actor gain: -0.47   critic loss: 0.40   steps: 603


training loop:   1% |                                 | ETA:  35 days, 18:36:06

Episode: 604   score: 11.69   Avg score (100e): 11.39   actor gain: -0.47   critic loss: 0.40   steps: 604


training loop:   1% |                                 | ETA:  35 days, 18:28:34

Episode: 605   score: 11.70   Avg score (100e): 11.40   actor gain: -0.46   critic loss: 0.40   steps: 605


training loop:   1% |                                 | ETA:  35 days, 18:23:49

Episode: 606   score: 11.72   Avg score (100e): 11.41   actor gain: -0.46   critic loss: 0.40   steps: 606


training loop:   1% |                                 | ETA:  35 days, 18:17:16

Episode: 607   score: 11.72   Avg score (100e): 11.41   actor gain: -0.46   critic loss: 0.40   steps: 607


training loop:   1% |                                 | ETA:  35 days, 18:08:06

Episode: 608   score: 11.73   Avg score (100e): 11.42   actor gain: -0.46   critic loss: 0.40   steps: 608


training loop:   1% |                                 | ETA:  35 days, 17:59:55

Episode: 609   score: 11.75   Avg score (100e): 11.42   actor gain: -0.47   critic loss: 0.40   steps: 609


training loop:   1% |                                 | ETA:  35 days, 17:52:06

Episode: 610   score: 11.75   Avg score (100e): 11.43   actor gain: -0.46   critic loss: 0.40   steps: 610


training loop:   1% |                                 | ETA:  35 days, 17:45:45

Episode: 611   score: 11.77   Avg score (100e): 11.44   actor gain: -0.46   critic loss: 0.40   steps: 611


training loop:   1% |                                 | ETA:  35 days, 17:43:07

Episode: 612   score: 11.78   Avg score (100e): 11.44   actor gain: -0.46   critic loss: 0.40   steps: 612


training loop:   1% |                                 | ETA:  35 days, 17:35:35

Episode: 613   score: 11.77   Avg score (100e): 11.45   actor gain: -0.48   critic loss: 0.40   steps: 613


training loop:   1% |                                 | ETA:  35 days, 17:28:47

Episode: 614   score: 11.77   Avg score (100e): 11.46   actor gain: -0.47   critic loss: 0.40   steps: 614


training loop:   1% |                                 | ETA:  35 days, 17:24:37

Episode: 615   score: 11.76   Avg score (100e): 11.46   actor gain: -0.47   critic loss: 0.40   steps: 615


training loop:   1% |                                 | ETA:  35 days, 17:20:23

Episode: 616   score: 11.77   Avg score (100e): 11.47   actor gain: -0.47   critic loss: 0.40   steps: 616


training loop:   1% |                                 | ETA:  35 days, 17:13:15

Episode: 617   score: 11.78   Avg score (100e): 11.47   actor gain: -0.47   critic loss: 0.40   steps: 617


training loop:   1% |                                 | ETA:  35 days, 17:09:57

Episode: 618   score: 11.78   Avg score (100e): 11.48   actor gain: -0.46   critic loss: 0.40   steps: 618


training loop:   1% |                                 | ETA:  35 days, 17:05:24

Episode: 619   score: 11.78   Avg score (100e): 11.49   actor gain: -0.46   critic loss: 0.40   steps: 619


training loop:   1% |                                 | ETA:  35 days, 17:01:50

Episode: 620   score: 11.79   Avg score (100e): 11.49   actor gain: -0.46   critic loss: 0.40   steps: 620


training loop:   1% |                                 | ETA:  35 days, 16:57:05

Episode: 621   score: 11.80   Avg score (100e): 11.50   actor gain: -0.46   critic loss: 0.40   steps: 621


training loop:   1% |                                 | ETA:  35 days, 16:49:36

Episode: 622   score: 11.82   Avg score (100e): 11.50   actor gain: -0.46   critic loss: 0.40   steps: 622


training loop:   1% |                                 | ETA:  35 days, 16:44:47

Episode: 623   score: 11.82   Avg score (100e): 11.51   actor gain: -0.46   critic loss: 0.40   steps: 623


training loop:   1% |                                 | ETA:  35 days, 16:48:49

Episode: 624   score: 11.83   Avg score (100e): 11.52   actor gain: -0.45   critic loss: 0.40   steps: 624


training loop:   1% |                                 | ETA:  35 days, 16:45:55

Episode: 625   score: 11.84   Avg score (100e): 11.52   actor gain: -0.45   critic loss: 0.40   steps: 625


training loop:   1% |                                 | ETA:  35 days, 16:45:26

Episode: 626   score: 11.85   Avg score (100e): 11.53   actor gain: -0.44   critic loss: 0.40   steps: 626


training loop:   1% |                                 | ETA:  35 days, 16:38:10

Episode: 627   score: 11.86   Avg score (100e): 11.54   actor gain: -0.44   critic loss: 0.40   steps: 627


training loop:   1% |                                 | ETA:  35 days, 16:34:15

Episode: 628   score: 11.87   Avg score (100e): 11.54   actor gain: -0.44   critic loss: 0.40   steps: 628


training loop:   1% |                                 | ETA:  35 days, 16:29:37

Episode: 629   score: 11.87   Avg score (100e): 11.55   actor gain: -0.44   critic loss: 0.40   steps: 629


training loop:   1% |                                 | ETA:  35 days, 16:23:31

Episode: 630   score: 11.88   Avg score (100e): 11.55   actor gain: -0.45   critic loss: 0.40   steps: 630


training loop:   1% |                                 | ETA:  35 days, 16:20:29

Episode: 631   score: 11.90   Avg score (100e): 11.56   actor gain: -0.45   critic loss: 0.40   steps: 631


training loop:   1% |                                 | ETA:  35 days, 16:16:58

Episode: 632   score: 11.89   Avg score (100e): 11.57   actor gain: -0.44   critic loss: 0.40   steps: 632


training loop:   1% |                                 | ETA:  35 days, 16:10:56

Episode: 633   score: 11.90   Avg score (100e): 11.57   actor gain: -0.44   critic loss: 0.40   steps: 633


training loop:   1% |                                 | ETA:  35 days, 16:04:10

Episode: 634   score: 11.91   Avg score (100e): 11.58   actor gain: -0.44   critic loss: 0.40   steps: 634


training loop:   1% |                                 | ETA:  35 days, 15:57:38

Episode: 635   score: 11.92   Avg score (100e): 11.58   actor gain: -0.44   critic loss: 0.40   steps: 635


training loop:   1% |                                 | ETA:  35 days, 15:53:14

Episode: 636   score: 11.93   Avg score (100e): 11.59   actor gain: -0.44   critic loss: 0.40   steps: 636


training loop:   1% |                                 | ETA:  35 days, 15:49:00

Episode: 637   score: 11.94   Avg score (100e): 11.60   actor gain: -0.44   critic loss: 0.40   steps: 637


training loop:   1% |                                 | ETA:  35 days, 15:42:26

Episode: 638   score: 11.96   Avg score (100e): 11.60   actor gain: -0.42   critic loss: 0.40   steps: 638


training loop:   1% |                                 | ETA:  35 days, 15:35:56

Episode: 639   score: 11.97   Avg score (100e): 11.61   actor gain: -0.42   critic loss: 0.40   steps: 639


training loop:   1% |                                 | ETA:  35 days, 15:32:45

Episode: 640   score: 11.97   Avg score (100e): 11.62   actor gain: -0.42   critic loss: 0.40   steps: 640


training loop:   1% |                                 | ETA:  35 days, 15:27:59

Episode: 641   score: 11.97   Avg score (100e): 11.62   actor gain: -0.42   critic loss: 0.40   steps: 641


training loop:   1% |                                 | ETA:  35 days, 15:26:44

Episode: 642   score: 11.96   Avg score (100e): 11.63   actor gain: -0.42   critic loss: 0.40   steps: 642


training loop:   1% |                                 | ETA:  35 days, 15:27:00

Episode: 643   score: 11.98   Avg score (100e): 11.63   actor gain: -0.42   critic loss: 0.40   steps: 643


training loop:   1% |                                 | ETA:  35 days, 15:20:17

Episode: 644   score: 11.99   Avg score (100e): 11.64   actor gain: -0.42   critic loss: 0.41   steps: 644


training loop:   1% |                                 | ETA:  35 days, 15:13:06

Episode: 645   score: 12.00   Avg score (100e): 11.65   actor gain: -0.42   critic loss: 0.41   steps: 645


training loop:   1% |                                 | ETA:  35 days, 15:08:55

Episode: 646   score: 11.99   Avg score (100e): 11.65   actor gain: -0.42   critic loss: 0.41   steps: 646


training loop:   1% |                                 | ETA:  35 days, 15:00:24

Episode: 647   score: 12.00   Avg score (100e): 11.66   actor gain: -0.42   critic loss: 0.41   steps: 647


training loop:   1% |                                 | ETA:  35 days, 14:53:55

Episode: 648   score: 12.00   Avg score (100e): 11.67   actor gain: -0.42   critic loss: 0.41   steps: 648


training loop:   1% |                                 | ETA:  35 days, 14:50:42

Episode: 649   score: 12.01   Avg score (100e): 11.67   actor gain: -0.42   critic loss: 0.41   steps: 649


training loop:   1% |                                 | ETA:  35 days, 14:43:35

Episode: 650   score: 12.01   Avg score (100e): 11.68   actor gain: -0.42   critic loss: 0.41   steps: 650


training loop:   1% |                                 | ETA:  35 days, 14:39:21

Episode: 651   score: 12.02   Avg score (100e): 11.68   actor gain: -0.42   critic loss: 0.41   steps: 651


training loop:   1% |                                 | ETA:  35 days, 14:37:18

Episode: 652   score: 12.03   Avg score (100e): 11.69   actor gain: -0.42   critic loss: 0.41   steps: 652


training loop:   1% |                                 | ETA:  35 days, 14:31:49

Episode: 653   score: 12.04   Avg score (100e): 11.70   actor gain: -0.42   critic loss: 0.41   steps: 653


training loop:   1% |                                 | ETA:  35 days, 14:29:36

Episode: 654   score: 12.05   Avg score (100e): 11.70   actor gain: -0.42   critic loss: 0.41   steps: 654


training loop:   1% |                                 | ETA:  35 days, 14:25:13

Episode: 655   score: 12.05   Avg score (100e): 11.71   actor gain: -0.42   critic loss: 0.40   steps: 655


training loop:   1% |                                 | ETA:  35 days, 14:31:36

Episode: 656   score: 12.06   Avg score (100e): 11.72   actor gain: -0.42   critic loss: 0.40   steps: 656


training loop:   1% |                                 | ETA:  35 days, 14:29:34

Episode: 657   score: 12.08   Avg score (100e): 11.72   actor gain: -0.62   critic loss: 0.40   steps: 657


training loop:   1% |                                 | ETA:  35 days, 14:24:17

Episode: 658   score: 12.10   Avg score (100e): 11.73   actor gain: -0.63   critic loss: 0.40   steps: 658


training loop:   1% |                                 | ETA:  35 days, 14:24:55

Episode: 659   score: 12.11   Avg score (100e): 11.74   actor gain: -0.63   critic loss: 0.40   steps: 659


training loop:   1% |                                 | ETA:  35 days, 14:22:14

Episode: 660   score: 12.10   Avg score (100e): 11.74   actor gain: -0.62   critic loss: 0.40   steps: 660


training loop:   1% |                                 | ETA:  35 days, 14:20:03

Episode: 661   score: 12.10   Avg score (100e): 11.75   actor gain: -0.63   critic loss: 0.40   steps: 661


training loop:   1% |                                 | ETA:  35 days, 14:16:02

Episode: 662   score: 12.09   Avg score (100e): 11.76   actor gain: -0.63   critic loss: 0.40   steps: 662


training loop:   1% |                                 | ETA:  35 days, 14:13:49

Episode: 663   score: 12.10   Avg score (100e): 11.76   actor gain: -0.62   critic loss: 0.40   steps: 663


training loop:   1% |                                 | ETA:  35 days, 14:07:47

Episode: 664   score: 12.10   Avg score (100e): 11.77   actor gain: -0.62   critic loss: 0.40   steps: 664


training loop:   1% |                                 | ETA:  35 days, 14:01:15

Episode: 665   score: 12.08   Avg score (100e): 11.78   actor gain: -0.62   critic loss: 0.40   steps: 665


training loop:   1% |                                 | ETA:  35 days, 13:55:13

Episode: 666   score: 12.09   Avg score (100e): 11.78   actor gain: -0.62   critic loss: 0.40   steps: 666


training loop:   1% |                                 | ETA:  35 days, 13:49:19

Episode: 667   score: 12.08   Avg score (100e): 11.79   actor gain: -0.62   critic loss: 0.40   steps: 667


training loop:   1% |                                 | ETA:  35 days, 13:42:19

Episode: 668   score: 12.08   Avg score (100e): 11.80   actor gain: -0.62   critic loss: 0.40   steps: 668


training loop:   1% |                                 | ETA:  35 days, 13:38:27

Episode: 669   score: 12.08   Avg score (100e): 11.80   actor gain: -0.62   critic loss: 0.40   steps: 669


training loop:   1% |                                 | ETA:  35 days, 13:31:48

Episode: 670   score: 12.09   Avg score (100e): 11.81   actor gain: -0.62   critic loss: 0.40   steps: 670


training loop:   1% |                                 | ETA:  35 days, 13:28:41

Episode: 671   score: 12.06   Avg score (100e): 11.81   actor gain: -0.62   critic loss: 0.40   steps: 671


training loop:   1% |                                 | ETA:  35 days, 13:21:23

Episode: 672   score: 12.06   Avg score (100e): 11.82   actor gain: -0.62   critic loss: 0.40   steps: 672


training loop:   1% |                                 | ETA:  35 days, 13:15:28

Episode: 673   score: 12.05   Avg score (100e): 11.82   actor gain: -0.62   critic loss: 0.40   steps: 673


training loop:   1% |                                 | ETA:  35 days, 13:11:07

Episode: 674   score: 12.05   Avg score (100e): 11.83   actor gain: -0.62   critic loss: 0.39   steps: 674


training loop:   1% |                                 | ETA:  35 days, 13:07:23

Episode: 675   score: 12.05   Avg score (100e): 11.84   actor gain: -0.62   critic loss: 0.39   steps: 675


training loop:   1% |                                 | ETA:  35 days, 13:03:20

Episode: 676   score: 12.04   Avg score (100e): 11.84   actor gain: -0.63   critic loss: 0.39   steps: 676


training loop:   1% |                                 | ETA:  35 days, 13:02:55

Episode: 677   score: 12.04   Avg score (100e): 11.85   actor gain: -0.62   critic loss: 0.39   steps: 677


training loop:   1% |                                 | ETA:  35 days, 12:58:45

Episode: 678   score: 12.02   Avg score (100e): 11.85   actor gain: -0.62   critic loss: 0.39   steps: 678


training loop:   1% |                                 | ETA:  35 days, 12:53:18

Episode: 679   score: 12.02   Avg score (100e): 11.86   actor gain: -0.62   critic loss: 0.39   steps: 679


training loop:   1% |                                 | ETA:  35 days, 12:48:27

Episode: 680   score: 12.00   Avg score (100e): 11.86   actor gain: -0.62   critic loss: 0.39   steps: 680


training loop:   1% |                                 | ETA:  35 days, 12:44:02

Episode: 681   score: 12.01   Avg score (100e): 11.87   actor gain: -0.62   critic loss: 0.39   steps: 681


training loop:   1% |                                 | ETA:  35 days, 12:38:18

Episode: 682   score: 12.00   Avg score (100e): 11.87   actor gain: -0.41   critic loss: 0.39   steps: 682


training loop:   1% |                                 | ETA:  35 days, 12:34:12

Episode: 683   score: 11.99   Avg score (100e): 11.88   actor gain: -0.41   critic loss: 0.38   steps: 683


training loop:   1% |                                 | ETA:  35 days, 12:29:00

Episode: 684   score: 11.99   Avg score (100e): 11.88   actor gain: -0.41   critic loss: 0.38   steps: 684


training loop:   1% |                                 | ETA:  35 days, 12:25:03

Episode: 685   score: 11.99   Avg score (100e): 11.88   actor gain: -0.41   critic loss: 0.38   steps: 685


training loop:   1% |                                 | ETA:  35 days, 12:21:14

Episode: 686   score: 11.98   Avg score (100e): 11.89   actor gain: -0.43   critic loss: 0.38   steps: 686


training loop:   1% |                                 | ETA:  35 days, 12:14:44

Episode: 687   score: 11.98   Avg score (100e): 11.89   actor gain: -0.43   critic loss: 0.38   steps: 687


training loop:   1% |                                 | ETA:  35 days, 12:07:53

Episode: 688   score: 11.99   Avg score (100e): 11.90   actor gain: -0.43   critic loss: 0.38   steps: 688
np.all(done) is true! miracle!


training loop:   1% |                                 | ETA:  35 days, 12:12:53

Episode: 689   score: 11.98   Avg score (100e): 11.90   actor gain: -0.43   critic loss: 0.38   steps: 689


training loop:   1% |                                 | ETA:  35 days, 12:09:11

Episode: 690   score: 11.99   Avg score (100e): 11.91   actor gain: -0.43   critic loss: 0.38   steps: 690


training loop:   1% |                                 | ETA:  35 days, 12:11:36

Episode: 691   score: 11.99   Avg score (100e): 11.91   actor gain: -0.43   critic loss: 0.38   steps: 691


training loop:   1% |                                 | ETA:  35 days, 12:07:11

Episode: 692   score: 11.98   Avg score (100e): 11.91   actor gain: -0.44   critic loss: 0.38   steps: 692


training loop:   1% |                                 | ETA:  35 days, 12:06:41

Episode: 693   score: 11.97   Avg score (100e): 11.92   actor gain: -0.44   critic loss: 0.38   steps: 693


training loop:   1% |                                 | ETA:  35 days, 12:02:29

Episode: 694   score: 11.97   Avg score (100e): 11.92   actor gain: -0.44   critic loss: 0.38   steps: 694


training loop:   1% |                                 | ETA:  35 days, 11:57:39

Episode: 695   score: 11.96   Avg score (100e): 11.92   actor gain: -0.44   critic loss: 0.38   steps: 695


training loop:   1% |                                 | ETA:  35 days, 11:52:09

Episode: 696   score: 11.96   Avg score (100e): 11.93   actor gain: -0.44   critic loss: 0.38   steps: 696


training loop:   1% |                                 | ETA:  35 days, 11:46:45

Episode: 697   score: 11.97   Avg score (100e): 11.93   actor gain: -0.44   critic loss: 0.38   steps: 697


training loop:   1% |                                 | ETA:  35 days, 11:41:10

Episode: 698   score: 11.97   Avg score (100e): 11.93   actor gain: -0.44   critic loss: 0.38   steps: 698


training loop:   1% |                                 | ETA:  35 days, 11:35:26

Episode: 699   score: 11.98   Avg score (100e): 11.94   actor gain: -0.44   critic loss: 0.38   steps: 699


training loop:   1% |                                 | ETA:  35 days, 11:29:14

Episode: 700   score: 11.97   Avg score (100e): 11.94   actor gain: -0.44   critic loss: 0.38   steps: 700


training loop:   1% |                                 | ETA:  35 days, 11:25:36

Episode: 701   score: 11.97   Avg score (100e): 11.94   actor gain: -0.45   critic loss: 0.38   steps: 701


training loop:   1% |                                 | ETA:  35 days, 11:24:27

Episode: 702   score: 11.97   Avg score (100e): 11.95   actor gain: -0.45   critic loss: 0.38   steps: 702


training loop:   1% |                                 | ETA:  35 days, 11:23:37

Episode: 703   score: 11.98   Avg score (100e): 11.95   actor gain: -0.44   critic loss: 0.38   steps: 703


training loop:   1% |                                 | ETA:  35 days, 11:18:46

Episode: 704   score: 11.97   Avg score (100e): 11.95   actor gain: -0.44   critic loss: 0.38   steps: 704


training loop:   1% |                                 | ETA:  35 days, 11:16:59

Episode: 705   score: 11.97   Avg score (100e): 11.95   actor gain: -0.44   critic loss: 0.38   steps: 705


training loop:   1% |                                 | ETA:  35 days, 11:14:18

Episode: 706   score: 11.98   Avg score (100e): 11.96   actor gain: -0.44   critic loss: 0.38   steps: 706


training loop:   1% |                                 | ETA:  35 days, 11:09:12

Episode: 707   score: 11.98   Avg score (100e): 11.96   actor gain: -0.46   critic loss: 0.38   steps: 707


training loop:   1% |                                 | ETA:  35 days, 11:05:56

Episode: 708   score: 11.98   Avg score (100e): 11.96   actor gain: -0.46   critic loss: 0.38   steps: 708


training loop:   1% |                                 | ETA:  35 days, 11:01:10

Episode: 709   score: 11.98   Avg score (100e): 11.96   actor gain: -0.45   critic loss: 0.38   steps: 709


training loop:   1% |                                 | ETA:  35 days, 10:57:20

Episode: 710   score: 11.99   Avg score (100e): 11.97   actor gain: -0.47   critic loss: 0.38   steps: 710


training loop:   1% |                                 | ETA:  35 days, 10:54:49

Episode: 711   score: 11.98   Avg score (100e): 11.97   actor gain: -0.45   critic loss: 0.38   steps: 711


training loop:   1% |                                 | ETA:  35 days, 10:50:04

Episode: 712   score: 11.98   Avg score (100e): 11.97   actor gain: -0.45   critic loss: 0.38   steps: 712


training loop:   1% |                                 | ETA:  35 days, 10:44:40

Episode: 713   score: 11.99   Avg score (100e): 11.97   actor gain: -0.45   critic loss: 0.38   steps: 713


training loop:   1% |                                 | ETA:  35 days, 10:40:19

Episode: 714   score: 11.99   Avg score (100e): 11.98   actor gain: -0.45   critic loss: 0.38   steps: 714


training loop:   1% |                                 | ETA:  35 days, 10:34:41

Episode: 715   score: 12.00   Avg score (100e): 11.98   actor gain: -0.45   critic loss: 0.38   steps: 715


training loop:   1% |                                 | ETA:  35 days, 10:29:10

Episode: 716   score: 11.99   Avg score (100e): 11.98   actor gain: -0.45   critic loss: 0.38   steps: 716


training loop:   1% |                                 | ETA:  35 days, 10:27:09

Episode: 717   score: 11.99   Avg score (100e): 11.98   actor gain: -0.45   critic loss: 0.38   steps: 717


training loop:   1% |                                 | ETA:  35 days, 10:22:21

Episode: 718   score: 11.98   Avg score (100e): 11.98   actor gain: -0.45   critic loss: 0.38   steps: 718


training loop:   1% |                                 | ETA:  35 days, 10:16:12

Episode: 719   score: 11.98   Avg score (100e): 11.99   actor gain: -0.45   critic loss: 0.38   steps: 719


training loop:   1% |                                 | ETA:  35 days, 10:11:53

Episode: 720   score: 11.98   Avg score (100e): 11.99   actor gain: -0.45   critic loss: 0.38   steps: 720


training loop:   1% |                                 | ETA:  35 days, 10:17:18

Episode: 721   score: 11.99   Avg score (100e): 11.99   actor gain: -0.45   critic loss: 0.38   steps: 721


training loop:   1% |                                 | ETA:  35 days, 10:16:15

Episode: 722   score: 11.98   Avg score (100e): 11.99   actor gain: -0.46   critic loss: 0.38   steps: 722


training loop:   1% |                                 | ETA:  35 days, 10:15:03

Episode: 723   score: 11.98   Avg score (100e): 11.99   actor gain: -0.45   critic loss: 0.38   steps: 723


training loop:   1% |                                 | ETA:  35 days, 10:10:56

Episode: 724   score: 11.98   Avg score (100e): 11.99   actor gain: -0.46   critic loss: 0.38   steps: 724


training loop:   1% |                                 | ETA:  35 days, 10:08:45

Episode: 725   score: 11.98   Avg score (100e): 12.00   actor gain: -0.46   critic loss: 0.38   steps: 725


training loop:   1% |                                 | ETA:  35 days, 10:07:25

Episode: 726   score: 11.98   Avg score (100e): 12.00   actor gain: -0.51   critic loss: 0.38   steps: 726


training loop:   1% |                                 | ETA:  35 days, 10:03:01

Episode: 727   score: 11.97   Avg score (100e): 12.00   actor gain: -0.51   critic loss: 0.38   steps: 727
np.all(done) is true! miracle!


training loop:   1% |                                  | ETA:  35 days, 9:57:40

Episode: 728   score: 11.99   Avg score (100e): 12.00   actor gain: -0.51   critic loss: 0.38   steps: 728


training loop:   1% |                                  | ETA:  35 days, 9:53:08

Episode: 729   score: 11.97   Avg score (100e): 12.00   actor gain: -0.51   critic loss: 0.38   steps: 729


training loop:   1% |                                  | ETA:  35 days, 9:47:37

Episode: 730   score: 11.98   Avg score (100e): 12.00   actor gain: -0.51   critic loss: 0.38   steps: 730


training loop:   1% |                                  | ETA:  35 days, 9:46:40

Episode: 731   score: 11.99   Avg score (100e): 12.00   actor gain: -0.51   critic loss: 0.38   steps: 731


training loop:   1% |                                  | ETA:  35 days, 9:42:04

Episode: 732   score: 11.99   Avg score (100e): 12.00   actor gain: -0.52   critic loss: 0.38   steps: 732


training loop:   1% |                                  | ETA:  35 days, 9:37:54

Episode: 733   score: 11.99   Avg score (100e): 12.00   actor gain: -0.52   critic loss: 0.39   steps: 733


training loop:   1% |                                  | ETA:  35 days, 9:35:37

Episode: 734   score: 11.99   Avg score (100e): 12.01   actor gain: -0.52   critic loss: 0.39   steps: 734


training loop:   1% |                                  | ETA:  35 days, 9:32:28

Episode: 735   score: 11.98   Avg score (100e): 12.01   actor gain: -0.50   critic loss: 0.39   steps: 735


training loop:   1% |                                  | ETA:  35 days, 9:28:58

Episode: 736   score: 11.98   Avg score (100e): 12.01   actor gain: -0.52   critic loss: 0.39   steps: 736


training loop:   1% |                                  | ETA:  35 days, 9:26:20

Episode: 737   score: 11.99   Avg score (100e): 12.01   actor gain: -0.52   critic loss: 0.39   steps: 737


training loop:   1% |                                  | ETA:  35 days, 9:24:20

Episode: 738   score: 11.99   Avg score (100e): 12.01   actor gain: -0.52   critic loss: 0.39   steps: 738


training loop:   1% |                                  | ETA:  35 days, 9:21:57

Episode: 739   score: 12.00   Avg score (100e): 12.01   actor gain: -0.51   critic loss: 0.39   steps: 739


training loop:   1% |                                  | ETA:  35 days, 9:15:44

Episode: 740   score: 12.01   Avg score (100e): 12.01   actor gain: -0.52   critic loss: 0.39   steps: 740


training loop:   1% |                                  | ETA:  35 days, 9:12:22

Episode: 741   score: 12.01   Avg score (100e): 12.01   actor gain: -0.51   critic loss: 0.39   steps: 741


training loop:   1% |                                  | ETA:  35 days, 9:09:54

Episode: 742   score: 12.01   Avg score (100e): 12.01   actor gain: -0.52   critic loss: 0.39   steps: 742


training loop:   1% |                                  | ETA:  35 days, 9:09:23

Episode: 743   score: 12.00   Avg score (100e): 12.01   actor gain: -0.52   critic loss: 0.39   steps: 743


training loop:   1% |                                  | ETA:  35 days, 9:02:58

Episode: 744   score: 12.01   Avg score (100e): 12.01   actor gain: -0.51   critic loss: 0.39   steps: 744


training loop:   1% |                                  | ETA:  35 days, 9:00:40

Episode: 745   score: 12.01   Avg score (100e): 12.01   actor gain: -0.51   critic loss: 0.39   steps: 745


training loop:   1% |                                  | ETA:  35 days, 8:57:13

Episode: 746   score: 12.02   Avg score (100e): 12.01   actor gain: -0.51   critic loss: 0.39   steps: 746


training loop:   1% |                                  | ETA:  35 days, 8:53:19

Episode: 747   score: 12.03   Avg score (100e): 12.01   actor gain: -0.50   critic loss: 0.39   steps: 747


training loop:   1% |                                  | ETA:  35 days, 8:49:58

Episode: 748   score: 12.02   Avg score (100e): 12.01   actor gain: -0.50   critic loss: 0.39   steps: 748


training loop:   1% |                                  | ETA:  35 days, 8:46:42

Episode: 749   score: 12.03   Avg score (100e): 12.01   actor gain: -0.50   critic loss: 0.39   steps: 749


training loop:   1% |                                  | ETA:  35 days, 8:43:26

Episode: 750   score: 12.02   Avg score (100e): 12.01   actor gain: -0.50   critic loss: 0.39   steps: 750


training loop:   1% |                                  | ETA:  35 days, 8:42:03

Episode: 751   score: 12.02   Avg score (100e): 12.01   actor gain: -0.44   critic loss: 0.39   steps: 751


training loop:   1% |                                  | ETA:  35 days, 8:36:56

Episode: 752   score: 12.01   Avg score (100e): 12.01   actor gain: -0.44   critic loss: 0.39   steps: 752


training loop:   1% |                                  | ETA:  35 days, 8:42:28

Episode: 753   score: 12.01   Avg score (100e): 12.01   actor gain: -0.44   critic loss: 0.39   steps: 753


training loop:   1% |                                  | ETA:  35 days, 8:42:37

Episode: 754   score: 12.01   Avg score (100e): 12.01   actor gain: -0.44   critic loss: 0.39   steps: 754


training loop:   1% |                                  | ETA:  35 days, 8:39:05

Episode: 755   score: 12.01   Avg score (100e): 12.01   actor gain: -0.44   critic loss: 0.39   steps: 755


training loop:   1% |                                  | ETA:  35 days, 8:38:10

Episode: 756   score: 12.01   Avg score (100e): 12.01   actor gain: -0.44   critic loss: 0.39   steps: 756


training loop:   1% |                                  | ETA:  35 days, 8:34:40

Episode: 757   score: 12.00   Avg score (100e): 12.01   actor gain: -0.42   critic loss: 0.39   steps: 757


training loop:   1% |                                  | ETA:  35 days, 8:32:59

Episode: 758   score: 11.99   Avg score (100e): 12.01   actor gain: -0.42   critic loss: 0.39   steps: 758


training loop:   1% |                                  | ETA:  35 days, 8:30:29

Episode: 759   score: 12.00   Avg score (100e): 12.01   actor gain: -0.42   critic loss: 0.39   steps: 759


training loop:   1% |                                  | ETA:  35 days, 8:26:00

Episode: 760   score: 12.00   Avg score (100e): 12.00   actor gain: -0.42   critic loss: 0.39   steps: 760


training loop:   1% |                                  | ETA:  35 days, 8:22:32

Episode: 761   score: 11.99   Avg score (100e): 12.00   actor gain: -0.43   critic loss: 0.39   steps: 761


training loop:   1% |                                  | ETA:  35 days, 8:23:10

Episode: 762   score: 12.00   Avg score (100e): 12.00   actor gain: -0.43   critic loss: 0.38   steps: 762


training loop:   1% |                                  | ETA:  35 days, 8:21:20

Episode: 763   score: 12.01   Avg score (100e): 12.00   actor gain: -0.43   critic loss: 0.38   steps: 763


training loop:   1% |                                  | ETA:  35 days, 8:16:05

Episode: 764   score: 12.02   Avg score (100e): 12.00   actor gain: -0.43   critic loss: 0.38   steps: 764


training loop:   1% |                                  | ETA:  35 days, 8:13:44

Episode: 765   score: 12.02   Avg score (100e): 12.00   actor gain: -0.42   critic loss: 0.38   steps: 765


training loop:   1% |                                  | ETA:  35 days, 8:11:00

Episode: 766   score: 12.02   Avg score (100e): 12.00   actor gain: -0.43   critic loss: 0.38   steps: 766


training loop:   1% |                                  | ETA:  35 days, 8:06:10

Episode: 767   score: 12.02   Avg score (100e): 12.00   actor gain: -0.43   critic loss: 0.38   steps: 767


training loop:   1% |                                  | ETA:  35 days, 8:04:10

Episode: 768   score: 12.02   Avg score (100e): 12.00   actor gain: -0.46   critic loss: 0.38   steps: 768


training loop:   1% |                                  | ETA:  35 days, 8:00:37

Episode: 769   score: 12.02   Avg score (100e): 12.00   actor gain: -0.46   critic loss: 0.38   steps: 769


training loop:   1% |                                  | ETA:  35 days, 7:56:58

Episode: 770   score: 12.01   Avg score (100e): 12.00   actor gain: -0.47   critic loss: 0.38   steps: 770


training loop:   1% |                                  | ETA:  35 days, 7:51:56

Episode: 771   score: 12.02   Avg score (100e): 12.00   actor gain: -0.47   critic loss: 0.38   steps: 771


training loop:   1% |                                  | ETA:  35 days, 7:47:36

Episode: 772   score: 12.02   Avg score (100e): 12.00   actor gain: -0.47   critic loss: 0.38   steps: 772


training loop:   1% |                                  | ETA:  35 days, 7:43:11

Episode: 773   score: 12.02   Avg score (100e): 12.00   actor gain: -0.48   critic loss: 0.38   steps: 773


training loop:   1% |                                  | ETA:  35 days, 7:36:37

Episode: 774   score: 12.01   Avg score (100e): 12.00   actor gain: -0.48   critic loss: 0.38   steps: 774


training loop:   1% |                                  | ETA:  35 days, 7:31:20

Episode: 775   score: 12.02   Avg score (100e): 12.00   actor gain: -0.48   critic loss: 0.38   steps: 775


training loop:   1% |                                  | ETA:  35 days, 7:29:42

Episode: 776   score: 12.02   Avg score (100e): 12.00   actor gain: -0.48   critic loss: 0.38   steps: 776


training loop:   1% |                                  | ETA:  35 days, 7:25:20

Episode: 777   score: 12.04   Avg score (100e): 12.00   actor gain: -0.48   critic loss: 0.38   steps: 777


training loop:   1% |                                  | ETA:  35 days, 7:20:40

Episode: 778   score: 12.05   Avg score (100e): 12.00   actor gain: -0.48   critic loss: 0.38   steps: 778


training loop:   1% |                                  | ETA:  35 days, 7:17:16

Episode: 779   score: 12.05   Avg score (100e): 12.00   actor gain: -0.48   critic loss: 0.38   steps: 779


training loop:   1% |                                  | ETA:  35 days, 7:13:25

Episode: 780   score: 12.06   Avg score (100e): 12.00   actor gain: -0.48   critic loss: 0.38   steps: 780


training loop:   1% |                                  | ETA:  35 days, 7:09:48

Episode: 781   score: 12.06   Avg score (100e): 12.00   actor gain: -0.48   critic loss: 0.38   steps: 781


training loop:   1% |                                  | ETA:  35 days, 7:08:35

Episode: 782   score: 12.07   Avg score (100e): 12.00   actor gain: -0.48   critic loss: 0.38   steps: 782


training loop:   1% |                                  | ETA:  35 days, 7:05:14

Episode: 783   score: 12.07   Avg score (100e): 12.00   actor gain: -0.48   critic loss: 0.38   steps: 783


training loop:   1% |                                  | ETA:  35 days, 6:59:13

Episode: 784   score: 12.07   Avg score (100e): 12.00   actor gain: -0.48   critic loss: 0.38   steps: 784


training loop:   1% |                                  | ETA:  35 days, 6:54:52

Episode: 785   score: 12.07   Avg score (100e): 12.00   actor gain: -0.49   critic loss: 0.38   steps: 785


training loop:   1% |                                  | ETA:  35 days, 6:58:53

Episode: 786   score: 12.09   Avg score (100e): 12.00   actor gain: -0.47   critic loss: 0.38   steps: 786


training loop:   1% |                                  | ETA:  35 days, 6:59:56

Episode: 787   score: 12.09   Avg score (100e): 12.00   actor gain: -0.47   critic loss: 0.38   steps: 787


training loop:   1% |                                  | ETA:  35 days, 6:59:02

Episode: 788   score: 12.09   Avg score (100e): 12.00   actor gain: -0.47   critic loss: 0.39   steps: 788


training loop:   1% |                                  | ETA:  35 days, 6:57:23

Episode: 789   score: 12.09   Avg score (100e): 12.00   actor gain: -0.47   critic loss: 0.39   steps: 789


training loop:   1% |                                  | ETA:  35 days, 6:53:53

Episode: 790   score: 12.09   Avg score (100e): 12.01   actor gain: -0.47   critic loss: 0.39   steps: 790


training loop:   1% |                                  | ETA:  35 days, 6:53:47

Episode: 791   score: 12.10   Avg score (100e): 12.01   actor gain: -0.47   critic loss: 0.39   steps: 791


training loop:   1% |                                  | ETA:  35 days, 6:50:52

Episode: 792   score: 12.11   Avg score (100e): 12.01   actor gain: -6.07   critic loss: 0.38   steps: 792


training loop:   1% |                                  | ETA:  35 days, 6:49:49

Episode: 793   score: 12.12   Avg score (100e): 12.01   actor gain: -6.04   critic loss: 0.38   steps: 793


training loop:   1% |                                  | ETA:  35 days, 6:45:30

Episode: 794   score: 12.12   Avg score (100e): 12.01   actor gain: -6.03   critic loss: 0.38   steps: 794


training loop:   1% |                                  | ETA:  35 days, 6:39:26

Episode: 795   score: 12.12   Avg score (100e): 12.01   actor gain: -6.02   critic loss: 0.38   steps: 795


training loop:   1% |                                  | ETA:  35 days, 6:37:46

Episode: 796   score: 12.12   Avg score (100e): 12.01   actor gain: -6.02   critic loss: 0.38   steps: 796
np.all(done) is true! miracle!


training loop:   1% |                                  | ETA:  35 days, 6:32:43

Episode: 797   score: 12.13   Avg score (100e): 12.02   actor gain: -6.02   critic loss: 0.39   steps: 797


training loop:   1% |                                  | ETA:  35 days, 6:31:39

Episode: 798   score: 12.13   Avg score (100e): 12.02   actor gain: -6.02   critic loss: 0.38   steps: 798


training loop:   1% |                                  | ETA:  35 days, 6:29:59

Episode: 799   score: 12.14   Avg score (100e): 12.02   actor gain: -6.02   critic loss: 0.39   steps: 799


training loop:   1% |                                  | ETA:  35 days, 6:25:00

Episode: 800   score: 12.15   Avg score (100e): 12.02   actor gain: -6.02   critic loss: 0.39   steps: 800


training loop:   1% |                                  | ETA:  35 days, 6:21:47

Episode: 801   score: 12.16   Avg score (100e): 12.02   actor gain: -6.02   critic loss: 0.39   steps: 801


training loop:   1% |                                  | ETA:  35 days, 6:18:19

Episode: 802   score: 12.17   Avg score (100e): 12.02   actor gain: -6.02   critic loss: 0.39   steps: 802


training loop:   1% |                                  | ETA:  35 days, 6:13:50

Episode: 803   score: 12.19   Avg score (100e): 12.03   actor gain: -6.02   critic loss: 0.39   steps: 803


training loop:   1% |                                  | ETA:  35 days, 6:09:31

Episode: 804   score: 12.20   Avg score (100e): 12.03   actor gain: -6.02   critic loss: 0.39   steps: 804


training loop:   1% |                                  | ETA:  35 days, 6:08:35

Episode: 805   score: 12.20   Avg score (100e): 12.03   actor gain: -6.02   critic loss: 0.39   steps: 805


training loop:   1% |                                  | ETA:  35 days, 6:06:28

Episode: 806   score: 12.22   Avg score (100e): 12.03   actor gain: -6.03   critic loss: 0.39   steps: 806


training loop:   1% |                                  | ETA:  35 days, 6:05:46

Episode: 807   score: 12.22   Avg score (100e): 12.04   actor gain: -6.03   critic loss: 0.39   steps: 807


training loop:   1% |                                  | ETA:  35 days, 6:01:21

Episode: 808   score: 12.23   Avg score (100e): 12.04   actor gain: -6.03   critic loss: 0.40   steps: 808


training loop:   1% |                                  | ETA:  35 days, 5:58:01

Episode: 809   score: 12.23   Avg score (100e): 12.04   actor gain: -6.03   critic loss: 0.40   steps: 809


training loop:   1% |                                  | ETA:  35 days, 5:56:14

Episode: 810   score: 12.25   Avg score (100e): 12.04   actor gain: -6.02   critic loss: 0.40   steps: 810


training loop:   1% |                                  | ETA:  35 days, 5:52:51

Episode: 811   score: 12.24   Avg score (100e): 12.05   actor gain: -6.02   critic loss: 0.40   steps: 811
np.all(done) is true! miracle!


training loop:   1% |                                  | ETA:  35 days, 5:48:17

Episode: 812   score: 12.26   Avg score (100e): 12.05   actor gain: -6.02   critic loss: 0.40   steps: 812


training loop:   1% |                                  | ETA:  35 days, 5:44:23

Episode: 813   score: 12.27   Avg score (100e): 12.05   actor gain: -6.02   critic loss: 0.40   steps: 813


training loop:   1% |                                  | ETA:  35 days, 5:40:25

Episode: 814   score: 12.28   Avg score (100e): 12.05   actor gain: -6.02   critic loss: 0.40   steps: 814


training loop:   1% |                                  | ETA:  35 days, 5:36:46

Episode: 815   score: 12.28   Avg score (100e): 12.06   actor gain: -6.02   critic loss: 0.40   steps: 815


training loop:   1% |                                  | ETA:  35 days, 5:34:20

Episode: 816   score: 12.28   Avg score (100e): 12.06   actor gain: -6.02   critic loss: 0.41   steps: 816


training loop:   1% |                                  | ETA:  35 days, 5:30:44

Episode: 817   score: 12.29   Avg score (100e): 12.06   actor gain: -0.42   critic loss: 0.41   steps: 817


training loop:   1% |                                  | ETA:  35 days, 5:37:36

Episode: 818   score: 12.31   Avg score (100e): 12.07   actor gain: -0.42   critic loss: 0.41   steps: 818


training loop:   1% |                                  | ETA:  35 days, 5:36:11

Episode: 819   score: 12.30   Avg score (100e): 12.07   actor gain: -0.42   critic loss: 0.41   steps: 819


training loop:   1% |                                  | ETA:  35 days, 5:33:39

Episode: 820   score: 12.31   Avg score (100e): 12.07   actor gain: -0.42   critic loss: 0.41   steps: 820


training loop:   1% |                                  | ETA:  35 days, 5:37:09

Episode: 821   score: 12.32   Avg score (100e): 12.08   actor gain: -0.42   critic loss: 0.41   steps: 821


training loop:   1% |                                  | ETA:  35 days, 5:37:53

Episode: 822   score: 12.33   Avg score (100e): 12.08   actor gain: -0.42   critic loss: 0.41   steps: 822


training loop:   1% |                                  | ETA:  35 days, 5:40:07

Episode: 823   score: 12.33   Avg score (100e): 12.08   actor gain: -0.41   critic loss: 0.41   steps: 823
np.all(done) is true! miracle!


training loop:   1% |                                  | ETA:  35 days, 5:40:15

Episode: 824   score: 12.34   Avg score (100e): 12.09   actor gain: -0.41   critic loss: 0.41   steps: 824


training loop:   1% |                                  | ETA:  35 days, 5:39:16

Episode: 825   score: 12.34   Avg score (100e): 12.09   actor gain: -0.41   critic loss: 0.41   steps: 825


training loop:   1% |                                  | ETA:  35 days, 5:37:03

Episode: 826   score: 12.35   Avg score (100e): 12.09   actor gain: -0.41   critic loss: 0.41   steps: 826
np.all(done) is true! miracle!


training loop:   1% |                                  | ETA:  35 days, 5:36:06

Episode: 827   score: 12.37   Avg score (100e): 12.10   actor gain: -0.41   critic loss: 0.41   steps: 827


training loop:   1% |                                  | ETA:  35 days, 5:35:49

Episode: 828   score: 12.39   Avg score (100e): 12.10   actor gain: -0.41   critic loss: 0.41   steps: 828


training loop:   1% |                                  | ETA:  35 days, 5:34:44

Episode: 829   score: 12.39   Avg score (100e): 12.11   actor gain: -0.41   critic loss: 0.41   steps: 829


training loop:   1% |                                  | ETA:  35 days, 5:32:21

Episode: 830   score: 12.40   Avg score (100e): 12.11   actor gain: -0.41   critic loss: 0.41   steps: 830


training loop:   1% |                                  | ETA:  35 days, 5:30:37

Episode: 831   score: 12.41   Avg score (100e): 12.11   actor gain: -0.41   critic loss: 0.41   steps: 831
np.all(done) is true! miracle!


training loop:   1% |                                  | ETA:  35 days, 5:28:11

Episode: 832   score: 12.40   Avg score (100e): 12.12   actor gain: -0.40   critic loss: 0.41   steps: 832
np.all(done) is true! miracle!


training loop:   1% |                                  | ETA:  35 days, 5:23:35

Episode: 833   score: 12.40   Avg score (100e): 12.12   actor gain: -0.40   critic loss: 0.41   steps: 833


training loop:   1% |                                  | ETA:  35 days, 5:17:46

Episode: 834   score: 12.41   Avg score (100e): 12.13   actor gain: -0.40   critic loss: 0.41   steps: 834


training loop:   1% |                                  | ETA:  35 days, 5:14:59

Episode: 835   score: 12.41   Avg score (100e): 12.13   actor gain: -0.40   critic loss: 0.41   steps: 835


training loop:   1% |                                  | ETA:  35 days, 5:13:14

Episode: 836   score: 12.43   Avg score (100e): 12.14   actor gain: -0.40   critic loss: 0.41   steps: 836


training loop:   1% |                                  | ETA:  35 days, 5:10:28

Episode: 837   score: 12.43   Avg score (100e): 12.14   actor gain: -0.40   critic loss: 0.41   steps: 837


training loop:   1% |                                  | ETA:  35 days, 5:08:12

Episode: 838   score: 12.44   Avg score (100e): 12.14   actor gain: -0.40   critic loss: 0.41   steps: 838


training loop:   1% |                                  | ETA:  35 days, 5:04:15

Episode: 839   score: 12.45   Avg score (100e): 12.15   actor gain: -0.62   critic loss: 0.41   steps: 839


training loop:   1% |                                  | ETA:  35 days, 5:05:14

Episode: 840   score: 12.45   Avg score (100e): 12.15   actor gain: -0.62   critic loss: 0.41   steps: 840


training loop:   1% |                                  | ETA:  35 days, 5:02:16

Episode: 841   score: 12.46   Avg score (100e): 12.16   actor gain: -0.62   critic loss: 0.41   steps: 841


training loop:   1% |                                  | ETA:  35 days, 4:59:22

Episode: 842   score: 12.46   Avg score (100e): 12.16   actor gain: -0.62   critic loss: 0.41   steps: 842


training loop:   1% |                                  | ETA:  35 days, 4:57:21

Episode: 843   score: 12.47   Avg score (100e): 12.17   actor gain: -0.62   critic loss: 0.41   steps: 843


training loop:   1% |                                  | ETA:  35 days, 4:53:59

Episode: 844   score: 12.48   Avg score (100e): 12.17   actor gain: -0.62   critic loss: 0.41   steps: 844


training loop:   1% |                                  | ETA:  35 days, 4:51:51

Episode: 845   score: 12.50   Avg score (100e): 12.18   actor gain: -0.62   critic loss: 0.41   steps: 845


training loop:   1% |                                  | ETA:  35 days, 4:49:50

Episode: 846   score: 12.50   Avg score (100e): 12.18   actor gain: -0.62   critic loss: 0.41   steps: 846


training loop:   1% |                                  | ETA:  35 days, 4:47:06

Episode: 847   score: 12.52   Avg score (100e): 12.19   actor gain: -0.62   critic loss: 0.41   steps: 847


training loop:   1% |                                  | ETA:  35 days, 4:46:05

Episode: 848   score: 12.53   Avg score (100e): 12.19   actor gain: -0.62   critic loss: 0.41   steps: 848


training loop:   1% |                                  | ETA:  35 days, 4:44:27

Episode: 849   score: 12.54   Avg score (100e): 12.20   actor gain: -0.62   critic loss: 0.41   steps: 849


training loop:   1% |                                  | ETA:  35 days, 4:50:54

Episode: 850   score: 12.55   Avg score (100e): 12.20   actor gain: -0.62   critic loss: 0.41   steps: 850


training loop:   1% |                                  | ETA:  35 days, 4:49:07

Episode: 851   score: 12.54   Avg score (100e): 12.21   actor gain: -0.62   critic loss: 0.41   steps: 851


training loop:   1% |                                  | ETA:  35 days, 4:48:42

Episode: 852   score: 12.55   Avg score (100e): 12.21   actor gain: -0.62   critic loss: 0.41   steps: 852


training loop:   1% |                                  | ETA:  35 days, 4:48:52

Episode: 853   score: 12.57   Avg score (100e): 12.22   actor gain: -0.62   critic loss: 0.41   steps: 853


training loop:   1% |                                  | ETA:  35 days, 4:49:07

Episode: 854   score: 12.58   Avg score (100e): 12.22   actor gain: -0.62   critic loss: 0.41   steps: 854


training loop:   1% |                                  | ETA:  35 days, 4:47:30

Episode: 855   score: 12.58   Avg score (100e): 12.23   actor gain: -0.62   critic loss: 0.41   steps: 855


training loop:   1% |                                  | ETA:  35 days, 4:45:56

Episode: 856   score: 12.59   Avg score (100e): 12.24   actor gain: -0.62   critic loss: 0.41   steps: 856


training loop:   1% |                                  | ETA:  35 days, 4:45:10

Episode: 857   score: 12.60   Avg score (100e): 12.24   actor gain: -0.62   critic loss: 0.41   steps: 857


training loop:   1% |                                  | ETA:  35 days, 4:45:25

Episode: 858   score: 12.60   Avg score (100e): 12.25   actor gain: -0.62   critic loss: 0.41   steps: 858


training loop:   1% |                                  | ETA:  35 days, 4:40:07

Episode: 859   score: 12.60   Avg score (100e): 12.25   actor gain: -0.65   critic loss: 0.41   steps: 859


training loop:   1% |                                  | ETA:  35 days, 4:39:16

Episode: 860   score: 12.61   Avg score (100e): 12.26   actor gain: -0.65   critic loss: 0.41   steps: 860


training loop:   1% |                                  | ETA:  35 days, 4:39:26

Episode: 861   score: 12.63   Avg score (100e): 12.27   actor gain: -0.65   critic loss: 0.41   steps: 861


training loop:   1% |                                  | ETA:  35 days, 4:39:17

Episode: 862   score: 12.63   Avg score (100e): 12.27   actor gain: -0.65   critic loss: 0.41   steps: 862


training loop:   1% |                                  | ETA:  35 days, 4:48:32

Episode: 863   score: 12.62   Avg score (100e): 12.28   actor gain: -0.65   critic loss: 0.41   steps: 863


training loop:   1% |                                  | ETA:  35 days, 4:50:26

Episode: 864   score: 12.63   Avg score (100e): 12.28   actor gain: -0.43   critic loss: 0.41   steps: 864


training loop:   1% |                                  | ETA:  35 days, 4:50:14

Episode: 865   score: 12.63   Avg score (100e): 12.29   actor gain: -0.43   critic loss: 0.41   steps: 865


training loop:   1% |                                  | ETA:  35 days, 4:46:06

Episode: 866   score: 12.64   Avg score (100e): 12.30   actor gain: -0.43   critic loss: 0.41   steps: 866


training loop:   1% |                                  | ETA:  35 days, 4:43:43

Episode: 867   score: 12.64   Avg score (100e): 12.30   actor gain: -0.51   critic loss: 0.41   steps: 867


training loop:   1% |                                  | ETA:  35 days, 4:40:59

Episode: 868   score: 12.64   Avg score (100e): 12.31   actor gain: -0.54   critic loss: 0.41   steps: 868


training loop:   1% |                                  | ETA:  35 days, 4:38:26

Episode: 869   score: 12.66   Avg score (100e): 12.32   actor gain: -0.54   critic loss: 0.41   steps: 869
np.all(done) is true! miracle!


training loop:   1% |                                  | ETA:  35 days, 4:34:13

Episode: 870   score: 12.65   Avg score (100e): 12.32   actor gain: -0.54   critic loss: 0.41   steps: 870


training loop:   1% |                                  | ETA:  35 days, 4:31:13

Episode: 871   score: 12.66   Avg score (100e): 12.33   actor gain: -0.54   critic loss: 0.41   steps: 871


training loop:   1% |                                  | ETA:  35 days, 4:28:09

Episode: 872   score: 12.67   Avg score (100e): 12.34   actor gain: -0.55   critic loss: 0.41   steps: 872


training loop:   1% |                                  | ETA:  35 days, 4:24:23

Episode: 873   score: 12.67   Avg score (100e): 12.34   actor gain: -0.55   critic loss: 0.41   steps: 873


training loop:   1% |                                  | ETA:  35 days, 4:21:59

Episode: 874   score: 12.68   Avg score (100e): 12.35   actor gain: -0.55   critic loss: 0.41   steps: 874


training loop:   1% |                                  | ETA:  35 days, 4:17:31

Episode: 875   score: 12.68   Avg score (100e): 12.36   actor gain: -0.55   critic loss: 0.41   steps: 875


training loop:   1% |                                  | ETA:  35 days, 4:11:29

Episode: 876   score: 12.68   Avg score (100e): 12.36   actor gain: -0.54   critic loss: 0.41   steps: 876


training loop:   1% |                                  | ETA:  35 days, 4:07:49

Episode: 877   score: 12.69   Avg score (100e): 12.37   actor gain: -0.54   critic loss: 0.41   steps: 877


training loop:   1% |                                  | ETA:  35 days, 4:02:58

Episode: 878   score: 12.69   Avg score (100e): 12.37   actor gain: -0.55   critic loss: 0.41   steps: 878


training loop:   1% |                                  | ETA:  35 days, 3:59:03

Episode: 879   score: 12.70   Avg score (100e): 12.38   actor gain: -0.55   critic loss: 0.41   steps: 879


training loop:   1% |                                  | ETA:  35 days, 4:02:10

Episode: 880   score: 12.71   Avg score (100e): 12.39   actor gain: -0.55   critic loss: 0.41   steps: 880


training loop:   1% |                                  | ETA:  35 days, 4:01:38

Episode: 881   score: 12.72   Avg score (100e): 12.39   actor gain: -0.55   critic loss: 0.41   steps: 881


training loop:   1% |                                  | ETA:  35 days, 4:01:35

Episode: 882   score: 12.73   Avg score (100e): 12.40   actor gain: -0.55   critic loss: 0.41   steps: 882


training loop:   1% |                                  | ETA:  35 days, 4:07:07

Episode: 883   score: 12.73   Avg score (100e): 12.41   actor gain: -0.55   critic loss: 0.41   steps: 883


training loop:   1% |                                  | ETA:  35 days, 4:11:20

Episode: 884   score: 12.74   Avg score (100e): 12.41   actor gain: -0.52   critic loss: 0.42   steps: 884


training loop:   1% |                                  | ETA:  35 days, 4:10:52

Episode: 885   score: 12.73   Avg score (100e): 12.42   actor gain: -0.52   critic loss: 0.42   steps: 885


training loop:   1% |                                  | ETA:  35 days, 4:08:55

Episode: 886   score: 12.75   Avg score (100e): 12.43   actor gain: -0.52   critic loss: 0.42   steps: 886


training loop:   1% |                                  | ETA:  35 days, 4:10:50

Episode: 887   score: 12.76   Avg score (100e): 12.43   actor gain: -0.52   critic loss: 0.42   steps: 887


training loop:   1% |                                  | ETA:  35 days, 4:08:03

Episode: 888   score: 12.77   Avg score (100e): 12.44   actor gain: -0.52   critic loss: 0.42   steps: 888


training loop:   1% |                                  | ETA:  35 days, 4:06:58

Episode: 889   score: 12.77   Avg score (100e): 12.45   actor gain: -0.52   critic loss: 0.42   steps: 889


training loop:   1% |                                  | ETA:  35 days, 4:05:56

Episode: 890   score: 12.78   Avg score (100e): 12.45   actor gain: -0.52   critic loss: 0.42   steps: 890


training loop:   1% |                                  | ETA:  35 days, 4:07:37

Episode: 891   score: 12.80   Avg score (100e): 12.46   actor gain: -0.52   critic loss: 0.42   steps: 891


training loop:   1% |                                  | ETA:  35 days, 4:03:31

Episode: 892   score: 12.79   Avg score (100e): 12.47   actor gain: -0.44   critic loss: 0.42   steps: 892


training loop:   1% |                                  | ETA:  35 days, 4:00:32

Episode: 893   score: 12.80   Avg score (100e): 12.47   actor gain: -0.40   critic loss: 0.42   steps: 893


training loop:   1% |                                  | ETA:  35 days, 3:57:52

Episode: 894   score: 12.80   Avg score (100e): 12.48   actor gain: -0.40   critic loss: 0.42   steps: 894


training loop:   1% |                                  | ETA:  35 days, 3:53:45

Episode: 895   score: 12.81   Avg score (100e): 12.49   actor gain: -0.43   critic loss: 0.42   steps: 895


training loop:   1% |                                  | ETA:  35 days, 3:51:12

Episode: 896   score: 12.83   Avg score (100e): 12.50   actor gain: -0.43   critic loss: 0.42   steps: 896


training loop:   1% |                                  | ETA:  35 days, 3:54:59

Episode: 897   score: 12.83   Avg score (100e): 12.50   actor gain: -0.43   critic loss: 0.42   steps: 897


training loop:   1% |                                  | ETA:  35 days, 4:33:26

Episode: 898   score: 12.83   Avg score (100e): 12.51   actor gain: -0.43   critic loss: 0.42   steps: 898


training loop:   1% |                                  | ETA:  35 days, 4:39:43

Episode: 899   score: 12.83   Avg score (100e): 12.52   actor gain: -0.43   critic loss: 0.42   steps: 899


training loop:   1% |                                  | ETA:  35 days, 4:41:20

Episode: 900   score: 12.83   Avg score (100e): 12.52   actor gain: -0.43   critic loss: 0.42   steps: 900


training loop:   1% |                                  | ETA:  35 days, 4:41:27

Episode: 901   score: 12.84   Avg score (100e): 12.53   actor gain: -0.43   critic loss: 0.42   steps: 901


training loop:   1% |                                  | ETA:  35 days, 4:43:10

Episode: 902   score: 12.84   Avg score (100e): 12.54   actor gain: -0.43   critic loss: 0.42   steps: 902


training loop:   1% |                                  | ETA:  35 days, 6:43:39

Episode: 903   score: 12.84   Avg score (100e): 12.54   actor gain: -0.42   critic loss: 0.42   steps: 903


training loop:   1% |                                  | ETA:  35 days, 6:41:18

Episode: 904   score: 12.85   Avg score (100e): 12.55   actor gain: -0.42   critic loss: 0.42   steps: 904


training loop:   1% |                                  | ETA:  35 days, 6:41:06

Episode: 905   score: 12.85   Avg score (100e): 12.56   actor gain: -0.42   critic loss: 0.42   steps: 905
np.all(done) is true! miracle!


training loop:   1% |                                  | ETA:  35 days, 6:35:51

Episode: 906   score: 12.85   Avg score (100e): 12.56   actor gain: -0.42   critic loss: 0.42   steps: 906


training loop:   1% |                                  | ETA:  35 days, 6:36:49

Episode: 907   score: 12.85   Avg score (100e): 12.57   actor gain: -0.42   critic loss: 0.42   steps: 907


training loop:   1% |                                  | ETA:  36 days, 9:35:33

Episode: 908   score: 12.85   Avg score (100e): 12.58   actor gain: -0.42   critic loss: 0.42   steps: 908


training loop:   1% |                                  | ETA:  36 days, 9:31:10

Episode: 909   score: 12.86   Avg score (100e): 12.58   actor gain: -0.42   critic loss: 0.42   steps: 909


training loop:   1% |                                  | ETA:  36 days, 9:31:29

Episode: 910   score: 12.86   Avg score (100e): 12.59   actor gain: -0.42   critic loss: 0.42   steps: 910


training loop:   1% |                                  | ETA:  36 days, 9:26:03

Episode: 911   score: 12.86   Avg score (100e): 12.59   actor gain: -0.42   critic loss: 0.42   steps: 911


training loop:   1% |                                  | ETA:  36 days, 9:20:55

Episode: 912   score: 12.87   Avg score (100e): 12.60   actor gain: -0.42   critic loss: 0.42   steps: 912


training loop:   1% |                                  | ETA:  36 days, 9:17:04

Episode: 913   score: 12.87   Avg score (100e): 12.61   actor gain: -0.42   critic loss: 0.41   steps: 913


training loop:   1% |                                 | ETA:  36 days, 11:00:36

Episode: 914   score: 12.87   Avg score (100e): 12.61   actor gain: -0.43   critic loss: 0.41   steps: 914


training loop:   1% |                                 | ETA:  36 days, 11:13:41

Episode: 915   score: 12.89   Avg score (100e): 12.62   actor gain: -0.43   critic loss: 0.41   steps: 915


training loop:   1% |                                 | ETA:  36 days, 11:14:17

Episode: 916   score: 12.88   Avg score (100e): 12.62   actor gain: -0.43   critic loss: 0.41   steps: 916


training loop:   1% |                                 | ETA:  36 days, 11:12:21

Episode: 917   score: 12.88   Avg score (100e): 12.63   actor gain: -0.43   critic loss: 0.41   steps: 917


training loop:   1% |                                 | ETA:  36 days, 11:13:31

Episode: 918   score: 12.88   Avg score (100e): 12.64   actor gain: -0.44   critic loss: 0.41   steps: 918


training loop:   1% |                                 | ETA:  36 days, 11:09:04

Episode: 919   score: 12.89   Avg score (100e): 12.64   actor gain: -0.44   critic loss: 0.41   steps: 919


training loop:   1% |                                 | ETA:  36 days, 11:07:49

Episode: 920   score: 12.91   Avg score (100e): 12.65   actor gain: -0.41   critic loss: 0.41   steps: 920


training loop:   1% |                                 | ETA:  36 days, 11:12:25

Episode: 921   score: 12.90   Avg score (100e): 12.65   actor gain: -0.41   critic loss: 0.41   steps: 921


training loop:   1% |                                 | ETA:  36 days, 11:08:51

Episode: 922   score: 12.90   Avg score (100e): 12.66   actor gain: -0.41   critic loss: 0.41   steps: 922


training loop:   1% |                                 | ETA:  36 days, 11:03:36

Episode: 923   score: 12.89   Avg score (100e): 12.66   actor gain: -0.41   critic loss: 0.41   steps: 923


training loop:   1% |                                 | ETA:  36 days, 10:58:46

Episode: 924   score: 12.89   Avg score (100e): 12.67   actor gain: -0.41   critic loss: 0.41   steps: 924


training loop:   1% |                                 | ETA:  36 days, 10:52:50

Episode: 925   score: 12.91   Avg score (100e): 12.68   actor gain: -0.41   critic loss: 0.41   steps: 925


training loop:   1% |                                 | ETA:  36 days, 10:48:37

Episode: 926   score: 12.92   Avg score (100e): 12.68   actor gain: -0.41   critic loss: 0.41   steps: 926


training loop:   1% |                                 | ETA:  36 days, 16:06:58

Episode: 927   score: 12.93   Avg score (100e): 12.69   actor gain: -0.42   critic loss: 0.41   steps: 927


training loop:   1% |                                 | ETA:  36 days, 16:17:07

Episode: 928   score: 12.92   Avg score (100e): 12.69   actor gain: -0.42   critic loss: 0.41   steps: 928


training loop:   1% |                                 | ETA:  36 days, 16:29:20

Episode: 929   score: 12.93   Avg score (100e): 12.70   actor gain: -0.42   critic loss: 0.41   steps: 929


training loop:   1% |                                 | ETA:  36 days, 16:36:25

Episode: 930   score: 12.93   Avg score (100e): 12.70   actor gain: -0.41   critic loss: 0.41   steps: 930
np.all(done) is true! miracle!


training loop:   1% |                                 | ETA:  36 days, 16:31:05

Episode: 931   score: 12.93   Avg score (100e): 12.71   actor gain: -0.41   critic loss: 0.41   steps: 931


training loop:   1% |                                 | ETA:  36 days, 16:19:31

Episode: 932   score: 12.93   Avg score (100e): 12.71   actor gain: -0.41   critic loss: 0.41   steps: 932


training loop:   1% |                                 | ETA:  36 days, 16:23:04

Episode: 933   score: 12.93   Avg score (100e): 12.72   actor gain: -0.41   critic loss: 0.42   steps: 933


training loop:   1% |                                 | ETA:  36 days, 16:14:46

Episode: 934   score: 12.95   Avg score (100e): 12.72   actor gain: -0.41   critic loss: 0.42   steps: 934
np.all(done) is true! miracle!


training loop:   1% |                                 | ETA:  36 days, 16:11:22

Episode: 935   score: 12.94   Avg score (100e): 12.73   actor gain: -0.41   critic loss: 0.42   steps: 935


training loop:   1% |                                 | ETA:  36 days, 16:10:17

Episode: 936   score: 12.94   Avg score (100e): 12.74   actor gain: -0.41   critic loss: 0.42   steps: 936


training loop:   1% |                                 | ETA:  36 days, 16:03:35

Episode: 937   score: 12.95   Avg score (100e): 12.74   actor gain: -0.41   critic loss: 0.42   steps: 937


training loop:   1% |                                 | ETA:  36 days, 15:59:34

Episode: 938   score: 12.95   Avg score (100e): 12.75   actor gain: -0.41   critic loss: 0.42   steps: 938


training loop:   1% |                                 | ETA:  36 days, 16:02:45

Episode: 939   score: 12.95   Avg score (100e): 12.75   actor gain: -0.41   critic loss: 0.42   steps: 939


training loop:   1% |                                 | ETA:  36 days, 16:08:52

Episode: 940   score: 12.95   Avg score (100e): 12.76   actor gain: -0.41   critic loss: 0.42   steps: 940


training loop:   1% |                                 | ETA:  36 days, 16:44:13

Episode: 941   score: 12.95   Avg score (100e): 12.76   actor gain: -0.41   critic loss: 0.42   steps: 941


training loop:   1% |                                 | ETA:  36 days, 16:47:22

Episode: 942   score: 12.95   Avg score (100e): 12.77   actor gain: -0.41   critic loss: 0.42   steps: 942


training loop:   1% |                                 | ETA:  36 days, 16:46:43

Episode: 943   score: 12.96   Avg score (100e): 12.77   actor gain: -0.40   critic loss: 0.42   steps: 943


training loop:   1% |                                 | ETA:  36 days, 16:54:00

Episode: 944   score: 12.97   Avg score (100e): 12.77   actor gain: -0.41   critic loss: 0.42   steps: 944


training loop:   1% |                                 | ETA:  36 days, 16:56:37

Episode: 945   score: 12.98   Avg score (100e): 12.78   actor gain: -0.41   critic loss: 0.42   steps: 945


training loop:   1% |                                 | ETA:  36 days, 16:59:56

Episode: 946   score: 12.99   Avg score (100e): 12.78   actor gain: -0.41   critic loss: 0.42   steps: 946


training loop:   1% |                                 | ETA:  36 days, 17:11:59

Episode: 947   score: 13.00   Avg score (100e): 12.79   actor gain: -0.41   critic loss: 0.42   steps: 947


training loop:   1% |                                 | ETA:  36 days, 17:11:56

Episode: 948   score: 13.00   Avg score (100e): 12.79   actor gain: -0.41   critic loss: 0.42   steps: 948


training loop:   1% |                                 | ETA:  36 days, 17:14:32

Episode: 949   score: 13.00   Avg score (100e): 12.80   actor gain: -0.41   critic loss: 0.42   steps: 949


training loop:   1% |                                 | ETA:  36 days, 17:13:31

Episode: 950   score: 13.01   Avg score (100e): 12.80   actor gain: -0.41   critic loss: 0.42   steps: 950


training loop:   1% |                                 | ETA:  36 days, 17:18:23

Episode: 951   score: 13.01   Avg score (100e): 12.81   actor gain: -0.41   critic loss: 0.42   steps: 951


training loop:   1% |                                 | ETA:  36 days, 17:16:46

Episode: 952   score: 13.02   Avg score (100e): 12.81   actor gain: -0.40   critic loss: 0.42   steps: 952


training loop:   1% |                                 | ETA:  36 days, 17:17:01

Episode: 953   score: 13.02   Avg score (100e): 12.82   actor gain: -0.40   critic loss: 0.42   steps: 953


training loop:   1% |                                 | ETA:  36 days, 17:40:40

Episode: 954   score: 13.01   Avg score (100e): 12.82   actor gain: -0.40   critic loss: 0.42   steps: 954


training loop:   1% |                                 | ETA:  36 days, 17:53:19

Episode: 955   score: 13.02   Avg score (100e): 12.83   actor gain: -0.40   critic loss: 0.42   steps: 955


training loop:   1% |                                 | ETA:  36 days, 18:00:41

Episode: 956   score: 13.03   Avg score (100e): 12.83   actor gain: -0.40   critic loss: 0.42   steps: 956


training loop:   1% |                                 | ETA:  36 days, 18:06:12

Episode: 957   score: 13.03   Avg score (100e): 12.83   actor gain: -0.40   critic loss: 0.42   steps: 957


training loop:   1% |                                 | ETA:  36 days, 18:09:58

Episode: 958   score: 13.03   Avg score (100e): 12.84   actor gain: -0.40   critic loss: 0.42   steps: 958


training loop:   1% |                                 | ETA:  36 days, 18:09:19

Episode: 959   score: 13.04   Avg score (100e): 12.84   actor gain: -0.40   critic loss: 0.42   steps: 959


training loop:   1% |                                 | ETA:  36 days, 18:12:18

Episode: 960   score: 13.04   Avg score (100e): 12.85   actor gain: -0.40   critic loss: 0.42   steps: 960


training loop:   1% |                                 | ETA:  36 days, 18:21:16

Episode: 961   score: 13.05   Avg score (100e): 12.85   actor gain: -0.40   critic loss: 0.42   steps: 961


training loop:   1% |                                 | ETA:  36 days, 18:31:47

Episode: 962   score: 13.05   Avg score (100e): 12.86   actor gain: -0.40   critic loss: 0.41   steps: 962


training loop:   1% |                                 | ETA:  36 days, 18:33:38

Episode: 963   score: 13.05   Avg score (100e): 12.86   actor gain: -0.40   critic loss: 0.41   steps: 963
np.all(done) is true! miracle!


training loop:   1% |                                 | ETA:  36 days, 18:33:00

Episode: 964   score: 13.05   Avg score (100e): 12.86   actor gain: -0.40   critic loss: 0.41   steps: 964


training loop:   1% |                                 | ETA:  36 days, 18:51:39

Episode: 965   score: 13.05   Avg score (100e): 12.87   actor gain: -0.40   critic loss: 0.41   steps: 965


training loop:   1% |                                 | ETA:  36 days, 18:57:40

Episode: 966   score: 13.06   Avg score (100e): 12.87   actor gain: -0.40   critic loss: 0.41   steps: 966


training loop:   1% |                                 | ETA:  36 days, 19:04:22

Episode: 967   score: 13.06   Avg score (100e): 12.88   actor gain: -0.40   critic loss: 0.41   steps: 967
np.all(done) is true! miracle!


training loop:   1% |                                 | ETA:  36 days, 19:17:44

Episode: 968   score: 13.07   Avg score (100e): 12.88   actor gain: -0.40   critic loss: 0.41   steps: 968


training loop:   1% |                                 | ETA:  36 days, 19:51:01

Episode: 969   score: 13.07   Avg score (100e): 12.88   actor gain: -0.39   critic loss: 0.41   steps: 969


training loop:   1% |                                 | ETA:  36 days, 20:08:25

Episode: 970   score: 13.07   Avg score (100e): 12.89   actor gain: -0.39   critic loss: 0.41   steps: 970


training loop:   1% |                                 | ETA:  36 days, 20:09:45

Episode: 971   score: 13.07   Avg score (100e): 12.89   actor gain: -0.40   critic loss: 0.41   steps: 971


training loop:   1% |                                 | ETA:  36 days, 20:08:25

Episode: 972   score: 13.08   Avg score (100e): 12.90   actor gain: -0.39   critic loss: 0.41   steps: 972


training loop:   1% |                                 | ETA:  36 days, 20:10:47

Episode: 973   score: 13.07   Avg score (100e): 12.90   actor gain: -0.39   critic loss: 0.41   steps: 973


training loop:   1% |                                 | ETA:  36 days, 20:10:13

Episode: 974   score: 13.09   Avg score (100e): 12.91   actor gain: -0.39   critic loss: 0.41   steps: 974


training loop:   1% |                                 | ETA:  36 days, 20:24:13

Episode: 975   score: 13.09   Avg score (100e): 12.91   actor gain: -0.39   critic loss: 0.41   steps: 975


training loop:   1% |                                 | ETA:  36 days, 20:35:40

Episode: 976   score: 13.10   Avg score (100e): 12.91   actor gain: -0.39   critic loss: 0.41   steps: 976


training loop:   1% |                                 | ETA:  36 days, 20:34:11

Episode: 977   score: 13.10   Avg score (100e): 12.92   actor gain: -0.40   critic loss: 0.41   steps: 977


training loop:   1% |                                 | ETA:  36 days, 20:47:47

Episode: 978   score: 13.11   Avg score (100e): 12.92   actor gain: -0.39   critic loss: 0.41   steps: 978


training loop:   1% |                                 | ETA:  36 days, 20:55:24

Episode: 979   score: 13.12   Avg score (100e): 12.93   actor gain: -0.39   critic loss: 0.41   steps: 979


training loop:   1% |                                 | ETA:  36 days, 21:08:57

Episode: 980   score: 13.11   Avg score (100e): 12.93   actor gain: -0.39   critic loss: 0.41   steps: 980


training loop:   1% |                                 | ETA:  36 days, 21:13:14

Episode: 981   score: 13.12   Avg score (100e): 12.93   actor gain: -0.39   critic loss: 0.41   steps: 981


training loop:   1% |                                 | ETA:  36 days, 21:35:11

Episode: 982   score: 13.13   Avg score (100e): 12.94   actor gain: -0.39   critic loss: 0.41   steps: 982


training loop:   1% |                                 | ETA:  36 days, 21:38:09

Episode: 983   score: 13.14   Avg score (100e): 12.94   actor gain: -0.39   critic loss: 0.41   steps: 983


training loop:   1% |                                 | ETA:  36 days, 21:42:17

Episode: 984   score: 13.16   Avg score (100e): 12.95   actor gain: -0.39   critic loss: 0.41   steps: 984


training loop:   1% |                                 | ETA:  36 days, 21:38:39

Episode: 985   score: 13.16   Avg score (100e): 12.95   actor gain: -0.39   critic loss: 0.41   steps: 985


training loop:   1% |                                 | ETA:  36 days, 21:32:54

Episode: 986   score: 13.17   Avg score (100e): 12.96   actor gain: -0.39   critic loss: 0.41   steps: 986


training loop:   1% |                                 | ETA:  36 days, 21:31:16

Episode: 987   score: 13.17   Avg score (100e): 12.96   actor gain: -0.39   critic loss: 0.41   steps: 987


training loop:   1% |                                 | ETA:  36 days, 21:20:31

Episode: 988   score: 13.18   Avg score (100e): 12.96   actor gain: -0.39   critic loss: 0.42   steps: 988


training loop:   1% |                                 | ETA:  36 days, 21:09:40

Episode: 989   score: 13.19   Avg score (100e): 12.97   actor gain: -0.39   critic loss: 0.42   steps: 989


training loop:   1% |                                 | ETA:  36 days, 20:59:18

Episode: 990   score: 13.19   Avg score (100e): 12.97   actor gain: -0.53   critic loss: 0.42   steps: 990


training loop:   1% |                                 | ETA:  36 days, 21:06:19

Episode: 991   score: 13.19   Avg score (100e): 12.98   actor gain: -0.52   critic loss: 0.42   steps: 991


training loop:   1% |                                 | ETA:  36 days, 21:05:45

Episode: 992   score: 13.20   Avg score (100e): 12.98   actor gain: -0.52   critic loss: 0.42   steps: 992


training loop:   1% |                                 | ETA:  36 days, 21:06:47

Episode: 993   score: 13.21   Avg score (100e): 12.98   actor gain: -0.53   critic loss: 0.42   steps: 993


training loop:   1% |                                 | ETA:  36 days, 21:06:48

Episode: 994   score: 13.22   Avg score (100e): 12.99   actor gain: -0.53   critic loss: 0.42   steps: 994


training loop:   1% |                                 | ETA:  36 days, 21:03:59

Episode: 995   score: 13.23   Avg score (100e): 12.99   actor gain: -0.53   critic loss: 0.42   steps: 995
np.all(done) is true! miracle!


training loop:   1% |                                 | ETA:  36 days, 21:02:04

Episode: 996   score: 13.24   Avg score (100e): 13.00   actor gain: -0.52   critic loss: 0.42   steps: 996


training loop:   1% |                                 | ETA:  36 days, 20:59:29

Episode: 997   score: 13.25   Avg score (100e): 13.00   actor gain: -0.53   critic loss: 0.42   steps: 997


training loop:   1% |                                 | ETA:  36 days, 20:58:14

Episode: 998   score: 13.25   Avg score (100e): 13.00   actor gain: -0.53   critic loss: 0.42   steps: 998


training loop:   1% |                                 | ETA:  36 days, 20:57:44

Episode: 999   score: 13.24   Avg score (100e): 13.01   actor gain: -0.53   critic loss: 0.42   steps: 999


training loop:   1% |                                 | ETA:  36 days, 20:44:53

Episode: 1000   score: 13.26   Avg score (100e): 13.01   actor gain: -0.52   critic loss: 0.42   steps: 1000


training loop:   2% |                                 | ETA:  36 days, 20:27:15

Episode: 1001   score: 13.26   Avg score (100e): 13.02   actor gain: -0.52   critic loss: 0.42   steps: 1001


training loop:   2% |                                 | ETA:  36 days, 20:10:41

Episode: 1002   score: 13.26   Avg score (100e): 13.02   actor gain: -0.53   critic loss: 0.42   steps: 1002


training loop:   2% |                                 | ETA:  36 days, 19:56:48

Episode: 1003   score: 13.28   Avg score (100e): 13.03   actor gain: -0.53   critic loss: 0.42   steps: 1003


training loop:   2% |                                 | ETA:  36 days, 19:42:12

Episode: 1004   score: 13.29   Avg score (100e): 13.03   actor gain: -0.53   critic loss: 0.41   steps: 1004


training loop:   2% |                                 | ETA:  36 days, 19:35:55

Episode: 1005   score: 13.29   Avg score (100e): 13.03   actor gain: -0.53   critic loss: 0.41   steps: 1005


training loop:   2% |                                 | ETA:  36 days, 19:24:45

Episode: 1006   score: 13.30   Avg score (100e): 13.04   actor gain: -0.53   critic loss: 0.41   steps: 1006


training loop:   2% |                                 | ETA:  36 days, 19:11:04

Episode: 1007   score: 13.30   Avg score (100e): 13.04   actor gain: -0.53   critic loss: 0.41   steps: 1007
np.all(done) is true! miracle!


training loop:   2% |                                 | ETA:  36 days, 18:52:16

Episode: 1008   score: 13.30   Avg score (100e): 13.05   actor gain: -0.53   critic loss: 0.41   steps: 1008


training loop:   2% |                                 | ETA:  36 days, 18:32:56

Episode: 1009   score: 13.30   Avg score (100e): 13.05   actor gain: -0.53   critic loss: 0.41   steps: 1009


training loop:   2% |                                 | ETA:  36 days, 18:16:20

Episode: 1010   score: 13.30   Avg score (100e): 13.06   actor gain: -0.53   critic loss: 0.41   steps: 1010


training loop:   2% |                                 | ETA:  36 days, 18:14:17

Episode: 1011   score: 13.32   Avg score (100e): 13.06   actor gain: -0.53   critic loss: 0.41   steps: 1011


training loop:   2% |                                 | ETA:  36 days, 18:18:06

Episode: 1012   score: 13.32   Avg score (100e): 13.07   actor gain: -0.53   critic loss: 0.41   steps: 1012


training loop:   2% |                                 | ETA:  36 days, 18:24:54

Episode: 1013   score: 13.33   Avg score (100e): 13.07   actor gain: -0.53   critic loss: 0.41   steps: 1013


training loop:   2% |                                 | ETA:  36 days, 18:45:32

Episode: 1014   score: 13.34   Avg score (100e): 13.08   actor gain: -0.53   critic loss: 0.41   steps: 1014


training loop:   2% |                                 | ETA:  36 days, 18:52:34

Episode: 1015   score: 13.34   Avg score (100e): 13.08   actor gain: -0.40   critic loss: 0.41   steps: 1015


training loop:   2% |                                 | ETA:  36 days, 19:00:39

Episode: 1016   score: 13.34   Avg score (100e): 13.08   actor gain: -0.40   critic loss: 0.41   steps: 1016


training loop:   2% |                                 | ETA:  36 days, 19:03:19

Episode: 1017   score: 13.35   Avg score (100e): 13.09   actor gain: -0.40   critic loss: 0.41   steps: 1017


training loop:   2% |                                 | ETA:  36 days, 19:04:48

Episode: 1018   score: 13.35   Avg score (100e): 13.09   actor gain: -0.40   critic loss: 0.41   steps: 1018


training loop:   2% |                                 | ETA:  36 days, 19:03:41

Episode: 1019   score: 13.36   Avg score (100e): 13.10   actor gain: -0.40   critic loss: 0.41   steps: 1019


training loop:   2% |                                 | ETA:  36 days, 18:59:28

Episode: 1020   score: 13.36   Avg score (100e): 13.10   actor gain: -0.40   critic loss: 0.41   steps: 1020


training loop:   2% |                                 | ETA:  36 days, 19:04:11

Episode: 1021   score: 13.37   Avg score (100e): 13.11   actor gain: -0.39   critic loss: 0.41   steps: 1021
np.all(done) is true! miracle!


training loop:   2% |                                 | ETA:  36 days, 18:56:27

Episode: 1022   score: 13.38   Avg score (100e): 13.11   actor gain: -0.39   critic loss: 0.41   steps: 1022


training loop:   2% |                                 | ETA:  36 days, 18:48:57

Episode: 1023   score: 13.38   Avg score (100e): 13.12   actor gain: -0.40   critic loss: 0.41   steps: 1023


training loop:   2% |                                 | ETA:  36 days, 18:56:40

Episode: 1024   score: 13.39   Avg score (100e): 13.12   actor gain: -0.40   critic loss: 0.41   steps: 1024


training loop:   2% |                                 | ETA:  36 days, 19:00:12

Episode: 1025   score: 13.40   Avg score (100e): 13.13   actor gain: -0.40   critic loss: 0.41   steps: 1025


training loop:   2% |                                 | ETA:  36 days, 19:06:36

Episode: 1026   score: 13.41   Avg score (100e): 13.13   actor gain: -0.40   critic loss: 0.41   steps: 1026


training loop:   2% |                                 | ETA:  36 days, 19:07:43

Episode: 1027   score: 13.42   Avg score (100e): 13.14   actor gain: -0.40   critic loss: 0.41   steps: 1027


training loop:   2% |                                 | ETA:  36 days, 19:07:21

Episode: 1028   score: 13.43   Avg score (100e): 13.14   actor gain: -0.39   critic loss: 0.41   steps: 1028


training loop:   2% |                                 | ETA:  36 days, 19:15:52

Episode: 1029   score: 13.43   Avg score (100e): 13.15   actor gain: -0.40   critic loss: 0.41   steps: 1029


training loop:   2% |                                 | ETA:  36 days, 19:12:59

Episode: 1030   score: 13.43   Avg score (100e): 13.15   actor gain: -0.40   critic loss: 0.41   steps: 1030


training loop:   2% |                                 | ETA:  36 days, 19:06:12

Episode: 1031   score: 13.44   Avg score (100e): 13.16   actor gain: -0.41   critic loss: 0.41   steps: 1031


training loop:   2% |                                 | ETA:  36 days, 18:53:54

Episode: 1032   score: 13.45   Avg score (100e): 13.16   actor gain: -0.41   critic loss: 0.41   steps: 1032


training loop:   2% |                                 | ETA:  36 days, 18:46:43

Episode: 1033   score: 13.44   Avg score (100e): 13.17   actor gain: -0.41   critic loss: 0.41   steps: 1033


training loop:   2% |                                 | ETA:  36 days, 18:43:39

Episode: 1034   score: 13.44   Avg score (100e): 13.17   actor gain: -0.41   critic loss: 0.41   steps: 1034


training loop:   2% |                                 | ETA:  36 days, 18:46:25

Episode: 1035   score: 13.45   Avg score (100e): 13.18   actor gain: -0.41   critic loss: 0.41   steps: 1035


training loop:   2% |                                 | ETA:  36 days, 18:42:15

Episode: 1036   score: 13.46   Avg score (100e): 13.18   actor gain: -0.41   critic loss: 0.41   steps: 1036


training loop:   2% |                                 | ETA:  36 days, 18:43:49

Episode: 1037   score: 13.47   Avg score (100e): 13.19   actor gain: -0.41   critic loss: 0.41   steps: 1037


training loop:   2% |                                 | ETA:  36 days, 18:45:45

Episode: 1038   score: 13.47   Avg score (100e): 13.19   actor gain: -0.41   critic loss: 0.41   steps: 1038


training loop:   2% |                                 | ETA:  36 days, 18:47:09

Episode: 1039   score: 13.48   Avg score (100e): 13.20   actor gain: -0.41   critic loss: 0.41   steps: 1039


training loop:   2% |                                 | ETA:  36 days, 18:49:49

Episode: 1040   score: 13.48   Avg score (100e): 13.20   actor gain: -0.41   critic loss: 0.41   steps: 1040


training loop:   2% |                                 | ETA:  36 days, 18:50:52

Episode: 1041   score: 13.48   Avg score (100e): 13.21   actor gain: -0.41   critic loss: 0.41   steps: 1041


training loop:   2% |                                 | ETA:  36 days, 18:54:24

Episode: 1042   score: 13.48   Avg score (100e): 13.21   actor gain: -0.41   critic loss: 0.41   steps: 1042


training loop:   2% |                                 | ETA:  36 days, 19:21:14

Episode: 1043   score: 13.49   Avg score (100e): 13.22   actor gain: -0.41   critic loss: 0.41   steps: 1043


training loop:   2% |                                 | ETA:  36 days, 19:34:22

Episode: 1044   score: 13.50   Avg score (100e): 13.22   actor gain: -0.42   critic loss: 0.41   steps: 1044


training loop:   2% |                                 | ETA:  36 days, 19:31:48

Episode: 1045   score: 13.50   Avg score (100e): 13.23   actor gain: -0.42   critic loss: 0.41   steps: 1045


training loop:   2% |                                 | ETA:  36 days, 19:35:25

Episode: 1046   score: 13.52   Avg score (100e): 13.23   actor gain: -0.42   critic loss: 0.41   steps: 1046


training loop:   2% |                                 | ETA:  36 days, 19:44:29

Episode: 1047   score: 13.52   Avg score (100e): 13.24   actor gain: -0.42   critic loss: 0.41   steps: 1047
np.all(done) is true! miracle!


training loop:   2% |                                 | ETA:  36 days, 20:03:35

Episode: 1048   score: 13.53   Avg score (100e): 13.25   actor gain: -0.42   critic loss: 0.41   steps: 1048


training loop:   2% |                                 | ETA:  36 days, 20:07:31

Episode: 1049   score: 13.53   Avg score (100e): 13.25   actor gain: -0.42   critic loss: 0.41   steps: 1049


training loop:   2% |                                 | ETA:  36 days, 20:07:24

Episode: 1050   score: 13.53   Avg score (100e): 13.26   actor gain: -0.42   critic loss: 0.41   steps: 1050


training loop:   2% |                                 | ETA:  36 days, 20:07:40

Episode: 1051   score: 13.53   Avg score (100e): 13.26   actor gain: -0.42   critic loss: 0.41   steps: 1051
np.all(done) is true! miracle!


training loop:   2% |                                 | ETA:  36 days, 20:07:09

Episode: 1052   score: 13.54   Avg score (100e): 13.27   actor gain: -0.42   critic loss: 0.41   steps: 1052


training loop:   2% |                                 | ETA:  36 days, 20:05:43

Episode: 1053   score: 13.54   Avg score (100e): 13.27   actor gain: -0.41   critic loss: 0.41   steps: 1053


training loop:   2% |                                 | ETA:  36 days, 20:01:13

Episode: 1054   score: 13.55   Avg score (100e): 13.28   actor gain: -0.41   critic loss: 0.41   steps: 1054


training loop:   2% |                                 | ETA:  36 days, 19:58:50

Episode: 1055   score: 13.56   Avg score (100e): 13.28   actor gain: -0.42   critic loss: 0.41   steps: 1055


training loop:   2% |                                 | ETA:  36 days, 19:54:48

Episode: 1056   score: 13.56   Avg score (100e): 13.29   actor gain: -0.40   critic loss: 0.41   steps: 1056


training loop:   2% |                                 | ETA:  36 days, 19:53:45

Episode: 1057   score: 13.57   Avg score (100e): 13.29   actor gain: -0.40   critic loss: 0.41   steps: 1057


training loop:   2% |                                 | ETA:  36 days, 19:52:31

Episode: 1058   score: 13.57   Avg score (100e): 13.30   actor gain: -0.40   critic loss: 0.41   steps: 1058


training loop:   2% |                                 | ETA:  36 days, 20:09:14

Episode: 1059   score: 13.57   Avg score (100e): 13.30   actor gain: -0.40   critic loss: 0.41   steps: 1059


training loop:   2% |                                 | ETA:  36 days, 20:19:31

Episode: 1060   score: 13.58   Avg score (100e): 13.31   actor gain: -0.40   critic loss: 0.41   steps: 1060
np.all(done) is true! miracle!


training loop:   2% |                                 | ETA:  36 days, 20:22:11

Episode: 1061   score: 13.57   Avg score (100e): 13.31   actor gain: -0.40   critic loss: 0.41   steps: 1061


training loop:   2% |                                 | ETA:  36 days, 20:37:04

Episode: 1062   score: 13.57   Avg score (100e): 13.32   actor gain: -0.40   critic loss: 0.41   steps: 1062


training loop:   2% |                                 | ETA:  36 days, 20:42:23

Episode: 1063   score: 13.58   Avg score (100e): 13.32   actor gain: -0.40   critic loss: 0.41   steps: 1063


training loop:   2% |                                 | ETA:  36 days, 20:39:48

Episode: 1064   score: 13.59   Avg score (100e): 13.33   actor gain: -0.40   critic loss: 0.41   steps: 1064


training loop:   2% |                                 | ETA:  36 days, 20:41:11

Episode: 1065   score: 13.59   Avg score (100e): 13.34   actor gain: -0.40   critic loss: 0.41   steps: 1065


training loop:   2% |                                 | ETA:  36 days, 20:43:19

Episode: 1066   score: 13.59   Avg score (100e): 13.34   actor gain: -0.40   critic loss: 0.41   steps: 1066


training loop:   2% |                                 | ETA:  36 days, 20:45:46

Episode: 1067   score: 13.59   Avg score (100e): 13.35   actor gain: -0.40   critic loss: 0.41   steps: 1067


training loop:   2% |                                 | ETA:  36 days, 20:48:49

Episode: 1068   score: 13.60   Avg score (100e): 13.35   actor gain: -0.40   critic loss: 0.41   steps: 1068


training loop:   2% |                                 | ETA:  36 days, 20:53:04

Episode: 1069   score: 13.60   Avg score (100e): 13.36   actor gain: -0.39   critic loss: 0.41   steps: 1069


training loop:   2% |                                 | ETA:  36 days, 21:06:08

Episode: 1070   score: 13.61   Avg score (100e): 13.36   actor gain: -0.39   critic loss: 0.41   steps: 1070


training loop:   2% |                                 | ETA:  36 days, 21:09:18

Episode: 1071   score: 13.61   Avg score (100e): 13.37   actor gain: -0.39   critic loss: 0.41   steps: 1071


training loop:   2% |                                 | ETA:  36 days, 21:14:10

Episode: 1072   score: 13.62   Avg score (100e): 13.37   actor gain: -0.39   critic loss: 0.41   steps: 1072


training loop:   2% |                                 | ETA:  36 days, 21:36:59

Episode: 1073   score: 13.63   Avg score (100e): 13.38   actor gain: -0.39   critic loss: 0.41   steps: 1073


training loop:   2% |                                 | ETA:  36 days, 21:44:28

Episode: 1074   score: 13.63   Avg score (100e): 13.38   actor gain: -0.39   critic loss: 0.41   steps: 1074


training loop:   2% |                                 | ETA:  36 days, 21:40:45

Episode: 1075   score: 13.64   Avg score (100e): 13.39   actor gain: -0.39   critic loss: 0.41   steps: 1075


training loop:   2% |                                 | ETA:  36 days, 21:48:54

Episode: 1076   score: 13.65   Avg score (100e): 13.40   actor gain: -0.39   critic loss: 0.41   steps: 1076


training loop:   2% |                                 | ETA:  36 days, 21:58:06

Episode: 1077   score: 13.65   Avg score (100e): 13.40   actor gain: -0.39   critic loss: 0.41   steps: 1077


training loop:   2% |                                 | ETA:  36 days, 22:01:19

Episode: 1078   score: 13.66   Avg score (100e): 13.41   actor gain: -0.39   critic loss: 0.41   steps: 1078


training loop:   2% |                                 | ETA:  36 days, 22:01:35

Episode: 1079   score: 13.67   Avg score (100e): 13.41   actor gain: -0.39   critic loss: 0.41   steps: 1079


training loop:   2% |                                 | ETA:  36 days, 21:59:50

Episode: 1080   score: 13.68   Avg score (100e): 13.42   actor gain: -0.39   critic loss: 0.41   steps: 1080


training loop:   2% |                                 | ETA:  36 days, 22:01:27

Episode: 1081   score: 13.68   Avg score (100e): 13.42   actor gain: -0.41   critic loss: 0.41   steps: 1081


training loop:   2% |                                 | ETA:  36 days, 21:55:14

Episode: 1082   score: 13.68   Avg score (100e): 13.43   actor gain: -0.42   critic loss: 0.41   steps: 1082


training loop:   2% |                                 | ETA:  36 days, 21:54:05

Episode: 1083   score: 13.68   Avg score (100e): 13.43   actor gain: -0.43   critic loss: 0.41   steps: 1083


training loop:   2% |                                 | ETA:  36 days, 22:02:12

Episode: 1084   score: 13.68   Avg score (100e): 13.44   actor gain: -0.43   critic loss: 0.40   steps: 1084


training loop:   2% |                                 | ETA:  36 days, 22:05:44

Episode: 1085   score: 13.69   Avg score (100e): 13.44   actor gain: -0.43   critic loss: 0.40   steps: 1085


training loop:   2% |                                 | ETA:  36 days, 22:12:37

Episode: 1086   score: 13.69   Avg score (100e): 13.45   actor gain: -0.42   critic loss: 0.40   steps: 1086


training loop:   2% |                                 | ETA:  36 days, 22:16:19

Episode: 1087   score: 13.71   Avg score (100e): 13.46   actor gain: -0.42   critic loss: 0.40   steps: 1087


training loop:   2% |                                 | ETA:  36 days, 22:15:41

Episode: 1088   score: 13.71   Avg score (100e): 13.46   actor gain: -0.43   critic loss: 0.40   steps: 1088


training loop:   2% |                                 | ETA:  36 days, 22:10:40

Episode: 1089   score: 13.72   Avg score (100e): 13.47   actor gain: -1.03   critic loss: 0.40   steps: 1089


training loop:   2% |                                 | ETA:  36 days, 22:05:21

Episode: 1090   score: 13.71   Avg score (100e): 13.47   actor gain: -1.03   critic loss: 0.40   steps: 1090


training loop:   2% |                                 | ETA:  36 days, 21:59:06

Episode: 1091   score: 13.72   Avg score (100e): 13.48   actor gain: -1.03   critic loss: 0.41   steps: 1091


training loop:   2% |                                 | ETA:  36 days, 21:57:54

Episode: 1092   score: 13.72   Avg score (100e): 13.48   actor gain: -1.03   critic loss: 0.41   steps: 1092


training loop:   2% |                                 | ETA:  36 days, 21:52:51

Episode: 1093   score: 13.73   Avg score (100e): 13.49   actor gain: -1.03   critic loss: 0.40   steps: 1093


training loop:   2% |                                 | ETA:  36 days, 21:45:23

Episode: 1094   score: 13.74   Avg score (100e): 13.49   actor gain: -1.32   critic loss: 0.41   steps: 1094


training loop:   2% |                                 | ETA:  36 days, 21:52:57

Episode: 1095   score: 13.76   Avg score (100e): 13.50   actor gain: -1.31   critic loss: 0.40   steps: 1095


training loop:   2% |                                 | ETA:  36 days, 21:51:23

Episode: 1096   score: 13.76   Avg score (100e): 13.50   actor gain: -1.31   critic loss: 0.40   steps: 1096


training loop:   2% |                                 | ETA:  36 days, 22:11:26

Episode: 1097   score: 13.77   Avg score (100e): 13.51   actor gain: -1.31   critic loss: 0.40   steps: 1097


training loop:   2% |                                 | ETA:  36 days, 22:09:01

Episode: 1098   score: 13.78   Avg score (100e): 13.51   actor gain: -1.31   critic loss: 0.40   steps: 1098


training loop:   2% |                                 | ETA:  36 days, 22:06:21

Episode: 1099   score: 13.80   Avg score (100e): 13.52   actor gain: -1.31   critic loss: 0.40   steps: 1099


training loop:   2% |                                 | ETA:  36 days, 22:09:40

Episode: 1100   score: 13.81   Avg score (100e): 13.52   actor gain: -1.31   critic loss: 0.40   steps: 1100


training loop:   2% |                                 | ETA:  36 days, 22:11:03

Episode: 1101   score: 13.82   Avg score (100e): 13.53   actor gain: -1.32   critic loss: 0.40   steps: 1101


training loop:   2% |                                 | ETA:  36 days, 22:09:32

Episode: 1102   score: 13.83   Avg score (100e): 13.54   actor gain: -1.32   critic loss: 0.41   steps: 1102


training loop:   2% |                                 | ETA:  36 days, 22:15:30

Episode: 1103   score: 13.84   Avg score (100e): 13.54   actor gain: -1.32   critic loss: 0.41   steps: 1103


training loop:   2% |                                 | ETA:  36 days, 22:15:56

Episode: 1104   score: 13.85   Avg score (100e): 13.55   actor gain: -1.31   critic loss: 0.41   steps: 1104


training loop:   2% |                                 | ETA:  36 days, 22:25:59

Episode: 1105   score: 13.87   Avg score (100e): 13.55   actor gain: -1.31   critic loss: 0.41   steps: 1105


training loop:   2% |                                 | ETA:  36 days, 22:35:06

Episode: 1106   score: 13.87   Avg score (100e): 13.56   actor gain: -1.29   critic loss: 0.41   steps: 1106


training loop:   2% |                                 | ETA:  36 days, 22:33:40

Episode: 1107   score: 13.88   Avg score (100e): 13.56   actor gain: -1.28   critic loss: 0.41   steps: 1107


training loop:   2% |                                 | ETA:  36 days, 22:39:53

Episode: 1108   score: 13.90   Avg score (100e): 13.57   actor gain: -1.28   critic loss: 0.41   steps: 1108


training loop:   2% |                                 | ETA:  36 days, 22:46:54

Episode: 1109   score: 13.90   Avg score (100e): 13.58   actor gain: -1.27   critic loss: 0.41   steps: 1109


training loop:   2% |                                 | ETA:  36 days, 22:51:09

Episode: 1110   score: 13.91   Avg score (100e): 13.58   actor gain: -1.27   critic loss: 0.41   steps: 1110


training loop:   2% |                                 | ETA:  36 days, 22:52:50

Episode: 1111   score: 13.93   Avg score (100e): 13.59   actor gain: -1.28   critic loss: 0.41   steps: 1111


training loop:   2% |                                 | ETA:  36 days, 22:51:16

Episode: 1112   score: 13.94   Avg score (100e): 13.59   actor gain: -1.27   critic loss: 0.41   steps: 1112


training loop:   2% |                                 | ETA:  36 days, 22:51:33

Episode: 1113   score: 13.94   Avg score (100e): 13.60   actor gain: -1.27   critic loss: 0.41   steps: 1113


training loop:   2% |                                 | ETA:  36 days, 22:48:26

Episode: 1114   score: 13.95   Avg score (100e): 13.61   actor gain: -0.67   critic loss: 0.41   steps: 1114


training loop:   2% |                                 | ETA:  36 days, 22:48:08

Episode: 1115   score: 13.96   Avg score (100e): 13.61   actor gain: -0.67   critic loss: 0.41   steps: 1115


training loop:   2% |                                 | ETA:  36 days, 23:13:51

Episode: 1116   score: 13.98   Avg score (100e): 13.62   actor gain: -0.67   critic loss: 0.41   steps: 1116


training loop:   2% |                                 | ETA:  36 days, 23:22:43

Episode: 1117   score: 13.99   Avg score (100e): 13.63   actor gain: -0.66   critic loss: 0.41   steps: 1117


training loop:   2% |                                 | ETA:  36 days, 23:22:55

Episode: 1118   score: 14.01   Avg score (100e): 13.63   actor gain: -0.66   critic loss: 0.41   steps: 1118


training loop:   2% |                                 | ETA:  36 days, 23:34:15

Episode: 1119   score: 14.01   Avg score (100e): 13.64   actor gain: -0.38   critic loss: 0.41   steps: 1119


training loop:   2% |                                 | ETA:  36 days, 23:37:49

Episode: 1120   score: 14.03   Avg score (100e): 13.65   actor gain: -0.38   critic loss: 0.41   steps: 1120


training loop:   2% |                                 | ETA:  36 days, 23:38:22

Episode: 1121   score: 14.05   Avg score (100e): 13.65   actor gain: -0.38   critic loss: 0.41   steps: 1121


training loop:   2% |                                 | ETA:  36 days, 23:37:10

Episode: 1122   score: 14.07   Avg score (100e): 13.66   actor gain: -0.38   critic loss: 0.41   steps: 1122


training loop:   2% |                                 | ETA:  36 days, 23:34:48

Episode: 1123   score: 14.09   Avg score (100e): 13.67   actor gain: -0.38   critic loss: 0.41   steps: 1123


training loop:   2% |                                 | ETA:  36 days, 23:34:41

Episode: 1124   score: 14.11   Avg score (100e): 13.67   actor gain: -0.38   critic loss: 0.41   steps: 1124


training loop:   2% |                                 | ETA:  36 days, 23:38:56

Episode: 1125   score: 14.12   Avg score (100e): 13.68   actor gain: -0.38   critic loss: 0.41   steps: 1125


training loop:   2% |                                 | ETA:  36 days, 23:46:14

Episode: 1126   score: 14.13   Avg score (100e): 13.69   actor gain: -0.38   critic loss: 0.41   steps: 1126


training loop:   2% |                                 | ETA:  36 days, 23:52:24

Episode: 1127   score: 14.14   Avg score (100e): 13.70   actor gain: -0.38   critic loss: 0.41   steps: 1127


training loop:   2% |                                 | ETA:  36 days, 23:55:22

Episode: 1128   score: 14.16   Avg score (100e): 13.70   actor gain: -0.38   critic loss: 0.41   steps: 1128


training loop:   2% |                                 | ETA:  36 days, 23:57:48

Episode: 1129   score: 14.17   Avg score (100e): 13.71   actor gain: -0.38   critic loss: 0.41   steps: 1129


training loop:   2% |                                 | ETA:  36 days, 23:58:25

Episode: 1130   score: 14.18   Avg score (100e): 13.72   actor gain: -0.38   critic loss: 0.41   steps: 1130


training loop:   2% |                                 | ETA:  36 days, 23:55:01

Episode: 1131   score: 14.19   Avg score (100e): 13.72   actor gain: -0.38   critic loss: 0.41   steps: 1131


training loop:   2% |                                 | ETA:  36 days, 23:57:36

Episode: 1132   score: 14.20   Avg score (100e): 13.73   actor gain: -0.38   critic loss: 0.41   steps: 1132


training loop:   2% |                                 | ETA:  36 days, 23:55:38

Episode: 1133   score: 14.21   Avg score (100e): 13.74   actor gain: -0.38   critic loss: 0.41   steps: 1133


training loop:   2% |                                 | ETA:  36 days, 23:56:41

Episode: 1134   score: 14.22   Avg score (100e): 13.75   actor gain: -0.38   critic loss: 0.41   steps: 1134


training loop:   2% |                                  | ETA:  37 days, 0:05:10

Episode: 1135   score: 14.25   Avg score (100e): 13.76   actor gain: -0.38   critic loss: 0.41   steps: 1135


training loop:   2% |                                  | ETA:  37 days, 0:00:26

Episode: 1136   score: 14.25   Avg score (100e): 13.76   actor gain: -0.38   critic loss: 0.41   steps: 1136


training loop:   2% |                                  | ETA:  37 days, 0:08:23

Episode: 1137   score: 14.26   Avg score (100e): 13.77   actor gain: -0.38   critic loss: 0.41   steps: 1137


training loop:   2% |                                  | ETA:  37 days, 0:16:28

Episode: 1138   score: 14.27   Avg score (100e): 13.78   actor gain: -0.38   critic loss: 0.41   steps: 1138


training loop:   2% |                                  | ETA:  37 days, 0:25:49

Episode: 1139   score: 14.28   Avg score (100e): 13.79   actor gain: -0.38   critic loss: 0.41   steps: 1139


training loop:   2% |                                  | ETA:  37 days, 0:43:59

Episode: 1140   score: 14.30   Avg score (100e): 13.80   actor gain: -0.39   critic loss: 0.41   steps: 1140


training loop:   2% |                                  | ETA:  37 days, 0:59:12

Episode: 1141   score: 14.31   Avg score (100e): 13.80   actor gain: -0.39   critic loss: 0.41   steps: 1141


training loop:   2% |                                  | ETA:  37 days, 1:07:04

Episode: 1142   score: 14.33   Avg score (100e): 13.81   actor gain: -0.39   critic loss: 0.41   steps: 1142


training loop:   2% |                                  | ETA:  37 days, 1:08:48

Episode: 1143   score: 14.34   Avg score (100e): 13.82   actor gain: -0.40   critic loss: 0.41   steps: 1143


training loop:   2% |                                  | ETA:  37 days, 1:11:37

Episode: 1144   score: 14.36   Avg score (100e): 13.83   actor gain: -0.40   critic loss: 0.41   steps: 1144


training loop:   2% |                                  | ETA:  37 days, 1:13:05

Episode: 1145   score: 14.37   Avg score (100e): 13.84   actor gain: -0.40   critic loss: 0.41   steps: 1145


training loop:   2% |                                  | ETA:  37 days, 1:14:40

Episode: 1146   score: 14.39   Avg score (100e): 13.85   actor gain: -0.40   critic loss: 0.41   steps: 1146


training loop:   2% |                                  | ETA:  37 days, 1:15:48

Episode: 1147   score: 14.39   Avg score (100e): 13.86   actor gain: -0.40   critic loss: 0.41   steps: 1147


training loop:   2% |                                  | ETA:  37 days, 1:18:11

Episode: 1148   score: 14.41   Avg score (100e): 13.86   actor gain: -0.40   critic loss: 0.41   steps: 1148


training loop:   2% |                                  | ETA:  37 days, 1:28:57

Episode: 1149   score: 14.42   Avg score (100e): 13.87   actor gain: -0.40   critic loss: 0.41   steps: 1149


training loop:   2% |                                  | ETA:  37 days, 1:36:24

Episode: 1150   score: 14.43   Avg score (100e): 13.88   actor gain: -0.41   critic loss: 0.41   steps: 1150


training loop:   2% |                                  | ETA:  37 days, 1:36:07

Episode: 1151   score: 14.45   Avg score (100e): 13.89   actor gain: -0.42   critic loss: 0.41   steps: 1151


training loop:   2% |                                  | ETA:  37 days, 1:35:38

Episode: 1152   score: 14.46   Avg score (100e): 13.90   actor gain: -0.42   critic loss: 0.41   steps: 1152


training loop:   2% |                                  | ETA:  37 days, 1:44:39

Episode: 1153   score: 14.47   Avg score (100e): 13.91   actor gain: -0.42   critic loss: 0.41   steps: 1153


training loop:   2% |                                  | ETA:  37 days, 1:53:15

Episode: 1154   score: 14.49   Avg score (100e): 13.92   actor gain: -0.42   critic loss: 0.41   steps: 1154


training loop:   2% |                                  | ETA:  37 days, 9:38:24

Episode: 1155   score: 14.49   Avg score (100e): 13.93   actor gain: -0.42   critic loss: 0.41   steps: 1155


training loop:   2% |                                  | ETA:  37 days, 9:43:21

Episode: 1156   score: 14.51   Avg score (100e): 13.94   actor gain: -0.42   critic loss: 0.41   steps: 1156


training loop:   2% |                                  | ETA:  37 days, 9:58:51

Episode: 1157   score: 14.52   Avg score (100e): 13.95   actor gain: -0.59   critic loss: 0.41   steps: 1157


training loop:   2% |                                 | ETA:  37 days, 10:08:48

Episode: 1158   score: 14.54   Avg score (100e): 13.96   actor gain: -0.59   critic loss: 0.41   steps: 1158


training loop:   2% |                                 | ETA:  37 days, 10:08:33

Episode: 1159   score: 14.55   Avg score (100e): 13.97   actor gain: -0.60   critic loss: 0.41   steps: 1159


training loop:   2% |                                 | ETA:  37 days, 10:08:45

Episode: 1160   score: 14.56   Avg score (100e): 13.98   actor gain: -0.60   critic loss: 0.41   steps: 1160


training loop:   2% |                                 | ETA:  37 days, 10:07:53

Episode: 1161   score: 14.58   Avg score (100e): 13.99   actor gain: -0.61   critic loss: 0.41   steps: 1161


training loop:   2% |                                 | ETA:  37 days, 10:11:07

Episode: 1162   score: 14.59   Avg score (100e): 14.00   actor gain: -0.61   critic loss: 0.41   steps: 1162


training loop:   2% |                                 | ETA:  37 days, 10:10:30

Episode: 1163   score: 14.61   Avg score (100e): 14.01   actor gain: -0.60   critic loss: 0.41   steps: 1163


training loop:   2% |                                 | ETA:  37 days, 10:12:33

Episode: 1164   score: 14.62   Avg score (100e): 14.02   actor gain: -0.60   critic loss: 0.41   steps: 1164


training loop:   2% |                                 | ETA:  37 days, 10:12:24

Episode: 1165   score: 14.63   Avg score (100e): 14.03   actor gain: -0.60   critic loss: 0.41   steps: 1165


training loop:   2% |                                 | ETA:  37 days, 10:11:12

Episode: 1166   score: 14.66   Avg score (100e): 14.04   actor gain: -0.60   critic loss: 0.41   steps: 1166


training loop:   2% |                                 | ETA:  37 days, 10:09:10

Episode: 1167   score: 14.67   Avg score (100e): 14.05   actor gain: -0.60   critic loss: 0.41   steps: 1167


training loop:   2% |                                 | ETA:  37 days, 10:10:12

Episode: 1168   score: 14.67   Avg score (100e): 14.06   actor gain: -0.59   critic loss: 0.41   steps: 1168


training loop:   2% |                                 | ETA:  37 days, 10:13:19

Episode: 1169   score: 14.68   Avg score (100e): 14.07   actor gain: -0.59   critic loss: 0.41   steps: 1169


training loop:   2% |                                 | ETA:  37 days, 10:17:49

Episode: 1170   score: 14.70   Avg score (100e): 14.08   actor gain: -0.59   critic loss: 0.41   steps: 1170


training loop:   2% |                                 | ETA:  37 days, 10:17:49

Episode: 1171   score: 14.71   Avg score (100e): 14.09   actor gain: -0.59   critic loss: 0.41   steps: 1171


training loop:   2% |                                 | ETA:  37 days, 10:20:34

Episode: 1172   score: 14.72   Avg score (100e): 14.10   actor gain: -0.59   critic loss: 0.41   steps: 1172


training loop:   2% |                                 | ETA:  37 days, 10:19:21

Episode: 1173   score: 14.73   Avg score (100e): 14.12   actor gain: -0.59   critic loss: 0.41   steps: 1173


training loop:   2% |                                 | ETA:  37 days, 10:38:53

Episode: 1174   score: 14.75   Avg score (100e): 14.13   actor gain: -0.60   critic loss: 0.41   steps: 1174


training loop:   2% |                                 | ETA:  37 days, 10:46:23

Episode: 1175   score: 14.76   Avg score (100e): 14.14   actor gain: -0.59   critic loss: 0.41   steps: 1175


training loop:   2% |                                 | ETA:  37 days, 10:59:32

Episode: 1176   score: 14.78   Avg score (100e): 14.15   actor gain: -0.58   critic loss: 0.41   steps: 1176


training loop:   2% |                                 | ETA:  37 days, 11:12:03

Episode: 1177   score: 14.78   Avg score (100e): 14.16   actor gain: -0.58   critic loss: 0.41   steps: 1177


training loop:   2% |                                 | ETA:  37 days, 11:30:10

Episode: 1178   score: 14.80   Avg score (100e): 14.17   actor gain: -0.58   critic loss: 0.41   steps: 1178


training loop:   2% |                                 | ETA:  37 days, 11:37:53

Episode: 1179   score: 14.81   Avg score (100e): 14.18   actor gain: -0.58   critic loss: 0.41   steps: 1179


training loop:   2% |                                 | ETA:  37 days, 11:40:57

Episode: 1180   score: 14.82   Avg score (100e): 14.19   actor gain: -0.61   critic loss: 0.41   steps: 1180


training loop:   2% |                                 | ETA:  37 days, 11:44:13

Episode: 1181   score: 14.84   Avg score (100e): 14.21   actor gain: -0.61   critic loss: 0.41   steps: 1181


training loop:   2% |                                 | ETA:  37 days, 11:46:56

Episode: 1182   score: 14.85   Avg score (100e): 14.22   actor gain: -0.44   critic loss: 0.41   steps: 1182


training loop:   2% |                                 | ETA:  37 days, 11:50:34

Episode: 1183   score: 14.86   Avg score (100e): 14.23   actor gain: -0.44   critic loss: 0.41   steps: 1183


training loop:   2% |                                 | ETA:  37 days, 11:49:19

Episode: 1184   score: 14.87   Avg score (100e): 14.24   actor gain: -0.43   critic loss: 0.41   steps: 1184


training loop:   2% |                                 | ETA:  37 days, 11:48:03

Episode: 1185   score: 14.88   Avg score (100e): 14.25   actor gain: -0.43   critic loss: 0.41   steps: 1185


training loop:   2% |                                 | ETA:  37 days, 11:46:57

Episode: 1186   score: 14.90   Avg score (100e): 14.27   actor gain: -0.42   critic loss: 0.41   steps: 1186


training loop:   2% |                                 | ETA:  37 days, 11:48:20

Episode: 1187   score: 14.91   Avg score (100e): 14.28   actor gain: -0.42   critic loss: 0.41   steps: 1187


training loop:   2% |                                 | ETA:  37 days, 11:50:38

Episode: 1188   score: 14.92   Avg score (100e): 14.29   actor gain: -0.42   critic loss: 0.41   steps: 1188


training loop:   2% |                                 | ETA:  37 days, 11:59:17

Episode: 1189   score: 14.93   Avg score (100e): 14.30   actor gain: -0.42   critic loss: 0.41   steps: 1189


training loop:   2% |                                 | ETA:  37 days, 12:05:14

Episode: 1190   score: 14.96   Avg score (100e): 14.31   actor gain: -0.41   critic loss: 0.41   steps: 1190


training loop:   2% |                                 | ETA:  37 days, 12:26:08

Episode: 1191   score: 14.97   Avg score (100e): 14.33   actor gain: -0.41   critic loss: 0.41   steps: 1191


training loop:   2% |                                 | ETA:  37 days, 12:30:20

Episode: 1192   score: 14.99   Avg score (100e): 14.34   actor gain: -0.41   critic loss: 0.41   steps: 1192


training loop:   2% |                                 | ETA:  37 days, 12:46:19

Episode: 1193   score: 15.01   Avg score (100e): 14.35   actor gain: -0.41   critic loss: 0.41   steps: 1193


training loop:   2% |                                 | ETA:  37 days, 12:59:39

Episode: 1194   score: 15.02   Avg score (100e): 14.37   actor gain: -0.41   critic loss: 0.41   steps: 1194


training loop:   2% |                                 | ETA:  37 days, 12:59:27

Episode: 1195   score: 15.03   Avg score (100e): 14.38   actor gain: -0.41   critic loss: 0.41   steps: 1195


training loop:   2% |                                 | ETA:  37 days, 13:04:21

Episode: 1196   score: 15.04   Avg score (100e): 14.39   actor gain: -0.41   critic loss: 0.41   steps: 1196


training loop:   2% |                                 | ETA:  37 days, 13:14:29

Episode: 1197   score: 15.06   Avg score (100e): 14.40   actor gain: -0.45   critic loss: 0.41   steps: 1197


training loop:   2% |                                 | ETA:  37 days, 13:29:52

Episode: 1198   score: 15.07   Avg score (100e): 14.42   actor gain: -0.45   critic loss: 0.41   steps: 1198


training loop:   2% |                                 | ETA:  37 days, 13:37:41

Episode: 1199   score: 15.08   Avg score (100e): 14.43   actor gain: -0.44   critic loss: 0.41   steps: 1199


training loop:   2% |                                 | ETA:  37 days, 13:40:00

Episode: 1200   score: 15.09   Avg score (100e): 14.44   actor gain: -0.44   critic loss: 0.41   steps: 1200


training loop:   2% |                                 | ETA:  37 days, 13:39:13

Episode: 1201   score: 15.12   Avg score (100e): 14.46   actor gain: -0.44   critic loss: 0.41   steps: 1201


training loop:   2% |                                 | ETA:  37 days, 13:38:53

Episode: 1202   score: 15.13   Avg score (100e): 14.47   actor gain: -0.46   critic loss: 0.41   steps: 1202


training loop:   2% |                                 | ETA:  37 days, 13:39:27

Episode: 1203   score: 15.15   Avg score (100e): 14.48   actor gain: -0.46   critic loss: 0.41   steps: 1203


training loop:   2% |                                 | ETA:  37 days, 13:38:35

Episode: 1204   score: 15.16   Avg score (100e): 14.49   actor gain: -0.46   critic loss: 0.41   steps: 1204


training loop:   2% |                                 | ETA:  37 days, 13:37:39

Episode: 1205   score: 15.18   Avg score (100e): 14.51   actor gain: -0.43   critic loss: 0.41   steps: 1205


training loop:   2% |                                 | ETA:  37 days, 13:44:22

Episode: 1206   score: 15.19   Avg score (100e): 14.52   actor gain: -0.43   critic loss: 0.41   steps: 1206


training loop:   2% |                                 | ETA:  37 days, 13:44:59

Episode: 1207   score: 15.20   Avg score (100e): 14.53   actor gain: -0.43   critic loss: 0.41   steps: 1207


training loop:   2% |                                 | ETA:  37 days, 13:43:09

Episode: 1208   score: 15.21   Avg score (100e): 14.55   actor gain: -0.43   critic loss: 0.41   steps: 1208


training loop:   2% |                                 | ETA:  37 days, 13:42:03

Episode: 1209   score: 15.23   Avg score (100e): 14.56   actor gain: -0.43   critic loss: 0.41   steps: 1209


training loop:   2% |                                 | ETA:  37 days, 13:41:32

Episode: 1210   score: 15.25   Avg score (100e): 14.57   actor gain: -0.59   critic loss: 0.41   steps: 1210


training loop:   2% |                                 | ETA:  37 days, 13:39:34

Episode: 1211   score: 15.26   Avg score (100e): 14.59   actor gain: -0.59   critic loss: 0.41   steps: 1211


training loop:   2% |                                 | ETA:  37 days, 13:36:26

Episode: 1212   score: 15.27   Avg score (100e): 14.60   actor gain: -0.63   critic loss: 0.41   steps: 1212


training loop:   2% |                                 | ETA:  37 days, 13:33:10

Episode: 1213   score: 15.29   Avg score (100e): 14.61   actor gain: -0.62   critic loss: 0.41   steps: 1213


training loop:   2% |                                 | ETA:  37 days, 13:31:29

Episode: 1214   score: 15.30   Avg score (100e): 14.63   actor gain: -0.62   critic loss: 0.41   steps: 1214


training loop:   2% |                                 | ETA:  37 days, 13:26:55

Episode: 1215   score: 15.32   Avg score (100e): 14.64   actor gain: -0.62   critic loss: 0.41   steps: 1215


training loop:   2% |                                 | ETA:  37 days, 13:22:51

Episode: 1216   score: 15.33   Avg score (100e): 14.65   actor gain: -0.63   critic loss: 0.42   steps: 1216


training loop:   2% |                                 | ETA:  37 days, 13:19:17

Episode: 1217   score: 15.35   Avg score (100e): 14.67   actor gain: -0.63   critic loss: 0.42   steps: 1217


training loop:   2% |                                 | ETA:  37 days, 13:16:36

Episode: 1218   score: 15.37   Avg score (100e): 14.68   actor gain: -0.63   critic loss: 0.42   steps: 1218


training loop:   2% |                                 | ETA:  37 days, 13:10:13

Episode: 1219   score: 15.38   Avg score (100e): 14.70   actor gain: -0.63   critic loss: 0.42   steps: 1219


training loop:   2% |                                 | ETA:  37 days, 13:07:32

Episode: 1220   score: 15.39   Avg score (100e): 14.71   actor gain: -0.63   critic loss: 0.42   steps: 1220


training loop:   2% |                                 | ETA:  37 days, 13:14:46

Episode: 1221   score: 15.41   Avg score (100e): 14.72   actor gain: -0.63   critic loss: 0.42   steps: 1221


training loop:   2% |                                 | ETA:  37 days, 13:11:21

Episode: 1222   score: 15.42   Avg score (100e): 14.74   actor gain: -0.59   critic loss: 0.42   steps: 1222


training loop:   2% |                                 | ETA:  37 days, 13:08:50

Episode: 1223   score: 15.43   Avg score (100e): 14.75   actor gain: -0.59   critic loss: 0.42   steps: 1223


training loop:   2% |                                 | ETA:  37 days, 13:06:54

Episode: 1224   score: 15.44   Avg score (100e): 14.76   actor gain: -0.59   critic loss: 0.42   steps: 1224


training loop:   2% |                                 | ETA:  37 days, 13:22:01

Episode: 1225   score: 15.46   Avg score (100e): 14.78   actor gain: -0.59   critic loss: 0.42   steps: 1225


training loop:   2% |                                 | ETA:  37 days, 13:19:30

Episode: 1226   score: 15.47   Avg score (100e): 14.79   actor gain: -0.59   critic loss: 0.42   steps: 1226


training loop:   2% |                                 | ETA:  37 days, 13:18:17

Episode: 1227   score: 15.49   Avg score (100e): 14.80   actor gain: -0.57   critic loss: 0.42   steps: 1227


training loop:   2% |                                 | ETA:  37 days, 13:21:24

Episode: 1228   score: 15.51   Avg score (100e): 14.82   actor gain: -0.57   critic loss: 0.42   steps: 1228


training loop:   2% |                                 | ETA:  37 days, 13:26:14

Episode: 1229   score: 15.52   Avg score (100e): 14.83   actor gain: -0.57   critic loss: 0.42   steps: 1229


training loop:   2% |                                 | ETA:  37 days, 13:26:55

Episode: 1230   score: 15.54   Avg score (100e): 14.84   actor gain: -0.58   critic loss: 0.42   steps: 1230


training loop:   2% |                                 | ETA:  37 days, 13:27:11

Episode: 1231   score: 15.55   Avg score (100e): 14.86   actor gain: -0.58   critic loss: 0.42   steps: 1231


training loop:   2% |                                 | ETA:  37 days, 13:22:33

Episode: 1232   score: 15.58   Avg score (100e): 14.87   actor gain: -0.58   critic loss: 0.42   steps: 1232


training loop:   2% |                                 | ETA:  37 days, 13:20:10

Episode: 1233   score: 15.59   Avg score (100e): 14.88   actor gain: -0.58   critic loss: 0.42   steps: 1233


training loop:   2% |                                 | ETA:  37 days, 13:15:09

Episode: 1234   score: 15.59   Avg score (100e): 14.90   actor gain: -0.58   critic loss: 0.42   steps: 1234


training loop:   2% |                                 | ETA:  37 days, 13:15:02

Episode: 1235   score: 15.61   Avg score (100e): 14.91   actor gain: -0.42   critic loss: 0.42   steps: 1235


training loop:   2% |                                 | ETA:  37 days, 13:11:44

Episode: 1236   score: 15.62   Avg score (100e): 14.93   actor gain: -0.42   critic loss: 0.42   steps: 1236


training loop:   2% |                                 | ETA:  37 days, 13:07:51

Episode: 1237   score: 15.63   Avg score (100e): 14.94   actor gain: -0.38   critic loss: 0.42   steps: 1237


training loop:   2% |                                 | ETA:  37 days, 13:17:50

Episode: 1238   score: 15.63   Avg score (100e): 14.95   actor gain: -0.38   critic loss: 0.42   steps: 1238


training loop:   2% |                                 | ETA:  37 days, 13:14:35

Episode: 1239   score: 15.64   Avg score (100e): 14.97   actor gain: -0.38   critic loss: 0.42   steps: 1239


training loop:   2% |                                 | ETA:  37 days, 13:13:39

Episode: 1240   score: 15.65   Avg score (100e): 14.98   actor gain: -0.38   critic loss: 0.42   steps: 1240
np.all(done) is true! miracle!


training loop:   2% |                                 | ETA:  37 days, 13:12:59

Episode: 1241   score: 15.66   Avg score (100e): 14.99   actor gain: -0.38   critic loss: 0.42   steps: 1241


training loop:   2% |                                 | ETA:  37 days, 13:12:53

Episode: 1242   score: 15.67   Avg score (100e): 15.01   actor gain: -0.38   critic loss: 0.42   steps: 1242


training loop:   2% |                                 | ETA:  37 days, 13:13:58

Episode: 1243   score: 15.68   Avg score (100e): 15.02   actor gain: -0.38   critic loss: 0.42   steps: 1243


training loop:   2% |                                 | ETA:  37 days, 13:14:58

Episode: 1244   score: 15.69   Avg score (100e): 15.03   actor gain: -0.38   critic loss: 0.42   steps: 1244


training loop:   2% |                                 | ETA:  37 days, 13:21:20

Episode: 1245   score: 15.71   Avg score (100e): 15.05   actor gain: -0.38   critic loss: 0.42   steps: 1245


training loop:   2% |                                 | ETA:  37 days, 13:23:42

Episode: 1246   score: 15.72   Avg score (100e): 15.06   actor gain: -0.38   critic loss: 0.41   steps: 1246


training loop:   2% |                                 | ETA:  37 days, 13:26:51

Episode: 1247   score: 15.73   Avg score (100e): 15.07   actor gain: -0.38   critic loss: 0.41   steps: 1247


training loop:   2% |                                 | ETA:  37 days, 13:24:06

Episode: 1248   score: 15.76   Avg score (100e): 15.09   actor gain: -0.38   critic loss: 0.41   steps: 1248


training loop:   2% |                                 | ETA:  37 days, 13:18:34

Episode: 1249   score: 15.77   Avg score (100e): 15.10   actor gain: -0.38   critic loss: 0.41   steps: 1249


training loop:   2% |                                 | ETA:  37 days, 13:15:42

Episode: 1250   score: 15.78   Avg score (100e): 15.11   actor gain: -0.38   critic loss: 0.41   steps: 1250


training loop:   2% |                                 | ETA:  37 days, 13:14:57

Episode: 1251   score: 15.80   Avg score (100e): 15.13   actor gain: -0.38   critic loss: 0.41   steps: 1251


training loop:   2% |                                 | ETA:  37 days, 13:14:50

Episode: 1252   score: 15.82   Avg score (100e): 15.14   actor gain: -0.38   critic loss: 0.41   steps: 1252


training loop:   2% |                                 | ETA:  37 days, 13:16:55

Episode: 1253   score: 15.84   Avg score (100e): 15.15   actor gain: -0.38   critic loss: 0.41   steps: 1253


training loop:   2% |                                 | ETA:  37 days, 13:22:25

Episode: 1254   score: 15.85   Avg score (100e): 15.17   actor gain: -0.39   critic loss: 0.41   steps: 1254


training loop:   2% |                                 | ETA:  37 days, 13:23:07

Episode: 1255   score: 15.86   Avg score (100e): 15.18   actor gain: -0.38   critic loss: 0.41   steps: 1255


training loop:   2% |                                 | ETA:  37 days, 13:19:54

Episode: 1256   score: 15.87   Avg score (100e): 15.20   actor gain: -0.38   critic loss: 0.41   steps: 1256


training loop:   2% |                                 | ETA:  37 days, 13:24:48

Episode: 1257   score: 15.89   Avg score (100e): 15.21   actor gain: -0.38   critic loss: 0.41   steps: 1257
np.all(done) is true! miracle!


training loop:   2% |                                 | ETA:  37 days, 13:13:47

Episode: 1258   score: 15.90   Avg score (100e): 15.22   actor gain: -0.38   critic loss: 0.41   steps: 1258


training loop:   2% |                                 | ETA:  37 days, 13:04:03

Episode: 1259   score: 15.91   Avg score (100e): 15.24   actor gain: -0.38   critic loss: 0.41   steps: 1259


training loop:   2% |                                 | ETA:  37 days, 12:53:32

Episode: 1260   score: 15.92   Avg score (100e): 15.25   actor gain: -0.38   critic loss: 0.41   steps: 1260


training loop:   2% |                                 | ETA:  37 days, 12:50:16

Episode: 1261   score: 15.92   Avg score (100e): 15.26   actor gain: -0.38   critic loss: 0.41   steps: 1261


training loop:   2% |                                 | ETA:  37 days, 12:45:11

Episode: 1262   score: 15.94   Avg score (100e): 15.28   actor gain: -0.38   critic loss: 0.41   steps: 1262


training loop:   2% |                                 | ETA:  37 days, 12:39:30

Episode: 1263   score: 15.95   Avg score (100e): 15.29   actor gain: -0.38   critic loss: 0.41   steps: 1263
np.all(done) is true! miracle!


training loop:   2% |                                 | ETA:  37 days, 12:37:26

Episode: 1264   score: 15.97   Avg score (100e): 15.30   actor gain: -0.38   critic loss: 0.41   steps: 1264


training loop:   2% |                                 | ETA:  37 days, 12:43:20

Episode: 1265   score: 15.98   Avg score (100e): 15.32   actor gain: -0.38   critic loss: 0.41   steps: 1265


training loop:   2% |                                 | ETA:  37 days, 12:43:39

Episode: 1266   score: 15.99   Avg score (100e): 15.33   actor gain: -0.38   critic loss: 0.41   steps: 1266


training loop:   2% |                                 | ETA:  37 days, 12:45:59

Episode: 1267   score: 16.01   Avg score (100e): 15.34   actor gain: -0.38   critic loss: 0.41   steps: 1267


training loop:   2% |                                 | ETA:  37 days, 12:55:07

Episode: 1268   score: 16.02   Avg score (100e): 15.36   actor gain: -0.38   critic loss: 0.41   steps: 1268


training loop:   2% |                                 | ETA:  37 days, 12:52:07

Episode: 1269   score: 16.03   Avg score (100e): 15.37   actor gain: -0.38   critic loss: 0.41   steps: 1269


training loop:   2% |                                 | ETA:  37 days, 12:49:27

Episode: 1270   score: 16.04   Avg score (100e): 15.38   actor gain: -0.38   critic loss: 0.41   steps: 1270


training loop:   2% |                                 | ETA:  37 days, 12:55:32

Episode: 1271   score: 16.04   Avg score (100e): 15.40   actor gain: -0.39   critic loss: 0.41   steps: 1271


training loop:   2% |                                 | ETA:  37 days, 12:52:31

Episode: 1272   score: 16.06   Avg score (100e): 15.41   actor gain: -0.39   critic loss: 0.41   steps: 1272


training loop:   2% |                                 | ETA:  37 days, 12:50:46

Episode: 1273   score: 16.07   Avg score (100e): 15.42   actor gain: -0.39   critic loss: 0.41   steps: 1273


training loop:   2% |                                 | ETA:  37 days, 12:47:11

Episode: 1274   score: 16.09   Avg score (100e): 15.44   actor gain: -0.39   critic loss: 0.41   steps: 1274


training loop:   2% |                                 | ETA:  37 days, 12:44:04

Episode: 1275   score: 16.10   Avg score (100e): 15.45   actor gain: -0.39   critic loss: 0.41   steps: 1275


training loop:   2% |                                 | ETA:  37 days, 12:42:05

Episode: 1276   score: 16.11   Avg score (100e): 15.46   actor gain: -0.39   critic loss: 0.41   steps: 1276
np.all(done) is true! miracle!


training loop:   2% |                                 | ETA:  37 days, 12:40:17

Episode: 1277   score: 16.14   Avg score (100e): 15.48   actor gain: -0.39   critic loss: 0.41   steps: 1277


training loop:   2% |                                 | ETA:  37 days, 12:37:53

Episode: 1278   score: 16.15   Avg score (100e): 15.49   actor gain: -0.39   critic loss: 0.41   steps: 1278


training loop:   2% |                                 | ETA:  37 days, 12:33:18

Episode: 1279   score: 16.17   Avg score (100e): 15.51   actor gain: -0.38   critic loss: 0.41   steps: 1279


training loop:   2% |                                 | ETA:  37 days, 12:29:33

Episode: 1280   score: 16.18   Avg score (100e): 15.52   actor gain: -0.38   critic loss: 0.41   steps: 1280


training loop:   2% |                                 | ETA:  37 days, 12:24:33

Episode: 1281   score: 16.19   Avg score (100e): 15.53   actor gain: -0.38   critic loss: 0.41   steps: 1281


training loop:   2% |                                 | ETA:  37 days, 12:22:21

Episode: 1282   score: 16.21   Avg score (100e): 15.55   actor gain: -0.39   critic loss: 0.41   steps: 1282


training loop:   2% |                                 | ETA:  37 days, 12:19:05

Episode: 1283   score: 16.22   Avg score (100e): 15.56   actor gain: -0.42   critic loss: 0.41   steps: 1283


training loop:   2% |                                 | ETA:  37 days, 12:24:39

Episode: 1284   score: 16.24   Avg score (100e): 15.57   actor gain: -0.44   critic loss: 0.41   steps: 1284
np.all(done) is true! miracle!


training loop:   2% |                                 | ETA:  37 days, 12:22:42

Episode: 1285   score: 16.24   Avg score (100e): 15.59   actor gain: -7.29   critic loss: 0.41   steps: 1285


training loop:   2% |                                 | ETA:  37 days, 12:24:37

Episode: 1286   score: 16.26   Avg score (100e): 15.60   actor gain: -7.28   critic loss: 0.41   steps: 1286


training loop:   2% |                                 | ETA:  37 days, 12:22:19

Episode: 1287   score: 16.27   Avg score (100e): 15.61   actor gain: -7.28   critic loss: 0.41   steps: 1287


training loop:   2% |                                 | ETA:  37 days, 12:22:16

Episode: 1288   score: 16.27   Avg score (100e): 15.63   actor gain: -7.28   critic loss: 0.41   steps: 1288


training loop:   2% |                                 | ETA:  37 days, 12:20:38

Episode: 1289   score: 16.28   Avg score (100e): 15.64   actor gain: -7.28   critic loss: 0.41   steps: 1289


training loop:   2% |                                 | ETA:  37 days, 12:18:38

Episode: 1290   score: 16.30   Avg score (100e): 15.65   actor gain: -7.28   critic loss: 0.41   steps: 1290
np.all(done) is true! miracle!


training loop:   2% |                                 | ETA:  37 days, 12:33:27

Episode: 1291   score: 16.32   Avg score (100e): 15.67   actor gain: -7.28   critic loss: 0.41   steps: 1291


training loop:   2% |                                 | ETA:  37 days, 12:38:25

Episode: 1292   score: 16.32   Avg score (100e): 15.68   actor gain: -7.28   critic loss: 0.41   steps: 1292


training loop:   2% |                                 | ETA:  37 days, 12:39:55

Episode: 1293   score: 16.33   Avg score (100e): 15.69   actor gain: -7.28   critic loss: 0.41   steps: 1293


training loop:   2% |                                 | ETA:  37 days, 12:38:50

Episode: 1294   score: 16.34   Avg score (100e): 15.71   actor gain: -7.28   critic loss: 0.41   steps: 1294


training loop:   2% |                                 | ETA:  37 days, 12:38:58

Episode: 1295   score: 16.34   Avg score (100e): 15.72   actor gain: -7.28   critic loss: 0.41   steps: 1295


training loop:   2% |                                 | ETA:  37 days, 12:40:26

Episode: 1296   score: 16.35   Avg score (100e): 15.73   actor gain: -7.28   critic loss: 0.41   steps: 1296


training loop:   2% |                                 | ETA:  37 days, 12:49:56

Episode: 1297   score: 16.35   Avg score (100e): 15.75   actor gain: -7.28   critic loss: 0.41   steps: 1297


training loop:   2% |                                 | ETA:  37 days, 12:55:59

Episode: 1298   score: 16.36   Avg score (100e): 15.76   actor gain: -7.28   critic loss: 0.41   steps: 1298


training loop:   2% |                                 | ETA:  37 days, 13:00:17

Episode: 1299   score: 16.37   Avg score (100e): 15.77   actor gain: -7.28   critic loss: 0.41   steps: 1299


training loop:   2% |                                 | ETA:  37 days, 12:58:45

Episode: 1300   score: 16.37   Avg score (100e): 15.79   actor gain: -7.28   critic loss: 0.41   steps: 1300
np.all(done) is true! miracle!


training loop:   2% |                                 | ETA:  37 days, 12:55:05

Episode: 1301   score: 16.38   Avg score (100e): 15.80   actor gain: -7.28   critic loss: 0.41   steps: 1301


training loop:   2% |                                 | ETA:  37 days, 12:55:06

Episode: 1302   score: 16.39   Avg score (100e): 15.81   actor gain: -7.28   critic loss: 0.41   steps: 1302


training loop:   2% |                                 | ETA:  37 days, 13:04:56

Episode: 1303   score: 16.39   Avg score (100e): 15.82   actor gain: -7.28   critic loss: 0.41   steps: 1303


training loop:   2% |                                 | ETA:  37 days, 13:02:25

Episode: 1304   score: 16.40   Avg score (100e): 15.84   actor gain: -7.28   critic loss: 0.41   steps: 1304


training loop:   2% |                                 | ETA:  37 days, 13:02:08

Episode: 1305   score: 16.42   Avg score (100e): 15.85   actor gain: -7.28   critic loss: 0.41   steps: 1305


training loop:   2% |                                 | ETA:  37 days, 13:00:42

Episode: 1306   score: 16.42   Avg score (100e): 15.86   actor gain: -7.29   critic loss: 0.41   steps: 1306


training loop:   2% |                                 | ETA:  37 days, 12:57:41

Episode: 1307   score: 16.44   Avg score (100e): 15.87   actor gain: -7.29   critic loss: 0.41   steps: 1307


training loop:   2% |                                 | ETA:  37 days, 12:53:25

Episode: 1308   score: 16.45   Avg score (100e): 15.89   actor gain: -7.26   critic loss: 0.41   steps: 1308


training loop:   2% |                                 | ETA:  37 days, 12:50:46

Episode: 1309   score: 16.46   Avg score (100e): 15.90   actor gain: -7.24   critic loss: 0.41   steps: 1309


training loop:   2% |                                 | ETA:  37 days, 12:47:01

Episode: 1310   score: 16.46   Avg score (100e): 15.91   actor gain: -0.40   critic loss: 0.41   steps: 1310


training loop:   2% |                                 | ETA:  37 days, 12:44:31

Episode: 1311   score: 16.48   Avg score (100e): 15.92   actor gain: -0.40   critic loss: 0.41   steps: 1311


training loop:   2% |                                 | ETA:  37 days, 12:42:01

Episode: 1312   score: 16.50   Avg score (100e): 15.93   actor gain: -0.41   critic loss: 0.41   steps: 1312


training loop:   2% |                                 | ETA:  37 days, 12:49:10

Episode: 1313   score: 16.50   Avg score (100e): 15.95   actor gain: -0.41   critic loss: 0.41   steps: 1313


training loop:   2% |                                 | ETA:  37 days, 12:48:53

Episode: 1314   score: 16.51   Avg score (100e): 15.96   actor gain: -0.41   critic loss: 0.41   steps: 1314


training loop:   2% |                                 | ETA:  37 days, 12:58:27

Episode: 1315   score: 16.52   Avg score (100e): 15.97   actor gain: -0.41   critic loss: 0.41   steps: 1315


training loop:   2% |                                 | ETA:  37 days, 12:59:49

Episode: 1316   score: 16.52   Avg score (100e): 15.98   actor gain: -0.42   critic loss: 0.41   steps: 1316


training loop:   2% |                                 | ETA:  37 days, 12:58:39

Episode: 1317   score: 16.53   Avg score (100e): 15.99   actor gain: -0.42   critic loss: 0.41   steps: 1317


training loop:   2% |                                 | ETA:  37 days, 12:55:25

Episode: 1318   score: 16.55   Avg score (100e): 16.01   actor gain: -0.42   critic loss: 0.41   steps: 1318


training loop:   2% |                                 | ETA:  37 days, 12:58:12

Episode: 1319   score: 16.55   Avg score (100e): 16.02   actor gain: -0.42   critic loss: 0.41   steps: 1319


training loop:   2% |                                 | ETA:  37 days, 12:57:25

Episode: 1320   score: 16.56   Avg score (100e): 16.03   actor gain: -0.45   critic loss: 0.41   steps: 1320


training loop:   2% |                                 | ETA:  37 days, 12:55:17

Episode: 1321   score: 16.57   Avg score (100e): 16.04   actor gain: -0.46   critic loss: 0.41   steps: 1321


training loop:   2% |                                 | ETA:  37 days, 12:51:33

Episode: 1322   score: 16.59   Avg score (100e): 16.05   actor gain: -0.51   critic loss: 0.41   steps: 1322


training loop:   2% |                                 | ETA:  37 days, 12:49:34

Episode: 1323   score: 16.60   Avg score (100e): 16.06   actor gain: -0.51   critic loss: 0.41   steps: 1323


training loop:   2% |                                 | ETA:  37 days, 12:51:13

Episode: 1324   score: 16.60   Avg score (100e): 16.08   actor gain: -0.51   critic loss: 0.41   steps: 1324
np.all(done) is true! miracle!


training loop:   2% |                                 | ETA:  37 days, 12:55:45

Episode: 1325   score: 16.61   Avg score (100e): 16.09   actor gain: -0.51   critic loss: 0.41   steps: 1325


training loop:   2% |                                 | ETA:  37 days, 12:57:56

Episode: 1326   score: 16.61   Avg score (100e): 16.10   actor gain: -0.51   critic loss: 0.41   steps: 1326


training loop:   2% |                                 | ETA:  37 days, 12:54:56

Episode: 1327   score: 16.62   Avg score (100e): 16.11   actor gain: -0.51   critic loss: 0.41   steps: 1327


training loop:   2% |                                 | ETA:  37 days, 13:03:42

Episode: 1328   score: 16.63   Avg score (100e): 16.12   actor gain: -0.57   critic loss: 0.41   steps: 1328


training loop:   2% |                                 | ETA:  37 days, 13:01:24

Episode: 1329   score: 16.63   Avg score (100e): 16.13   actor gain: -0.57   critic loss: 0.41   steps: 1329


training loop:   2% |                                 | ETA:  37 days, 13:18:57

Episode: 1330   score: 16.64   Avg score (100e): 16.14   actor gain: -0.57   critic loss: 0.41   steps: 1330


training loop:   2% |                                 | ETA:  37 days, 13:25:33

Episode: 1331   score: 16.65   Avg score (100e): 16.15   actor gain: -0.56   critic loss: 0.41   steps: 1331


training loop:   2% |                                 | ETA:  37 days, 13:30:30

Episode: 1332   score: 16.66   Avg score (100e): 16.16   actor gain: -0.56   critic loss: 0.41   steps: 1332
np.all(done) is true! miracle!


training loop:   2% |                                 | ETA:  37 days, 13:37:13

Episode: 1333   score: 16.67   Avg score (100e): 16.18   actor gain: -0.56   critic loss: 0.41   steps: 1333


training loop:   2% |                                 | ETA:  37 days, 13:37:52

Episode: 1334   score: 16.69   Avg score (100e): 16.19   actor gain: -0.56   critic loss: 0.41   steps: 1334


training loop:   2% |                                 | ETA:  37 days, 13:57:42

Episode: 1335   score: 16.69   Avg score (100e): 16.20   actor gain: -0.56   critic loss: 0.41   steps: 1335


training loop:   2% |                                 | ETA:  37 days, 14:02:34

Episode: 1336   score: 16.68   Avg score (100e): 16.21   actor gain: -0.56   critic loss: 0.41   steps: 1336


training loop:   2% |                                 | ETA:  37 days, 14:02:55

Episode: 1337   score: 16.69   Avg score (100e): 16.22   actor gain: -0.55   critic loss: 0.41   steps: 1337


training loop:   2% |                                 | ETA:  37 days, 14:01:56

Episode: 1338   score: 16.70   Avg score (100e): 16.23   actor gain: -0.55   critic loss: 0.41   steps: 1338


training loop:   2% |                                 | ETA:  37 days, 14:01:19

Episode: 1339   score: 16.70   Avg score (100e): 16.24   actor gain: -0.56   critic loss: 0.41   steps: 1339


training loop:   2% |                                 | ETA:  37 days, 13:58:37

Episode: 1340   score: 16.71   Avg score (100e): 16.25   actor gain: -0.56   critic loss: 0.41   steps: 1340


training loop:   2% |                                 | ETA:  37 days, 13:59:16

Episode: 1341   score: 16.74   Avg score (100e): 16.26   actor gain: -0.56   critic loss: 0.41   steps: 1341


training loop:   2% |                                 | ETA:  37 days, 14:02:44

Episode: 1342   score: 16.75   Avg score (100e): 16.27   actor gain: -0.59   critic loss: 0.41   steps: 1342


training loop:   2% |                                 | ETA:  37 days, 14:02:51

Episode: 1343   score: 16.76   Avg score (100e): 16.28   actor gain: -0.59   critic loss: 0.41   steps: 1343


training loop:   2% |                                 | ETA:  37 days, 14:01:45

Episode: 1344   score: 16.77   Avg score (100e): 16.29   actor gain: -0.59   critic loss: 0.41   steps: 1344


training loop:   2% |                                 | ETA:  37 days, 14:03:57

Episode: 1345   score: 16.78   Avg score (100e): 16.30   actor gain: -0.56   critic loss: 0.41   steps: 1345


training loop:   2% |                                 | ETA:  37 days, 14:01:51

Episode: 1346   score: 16.78   Avg score (100e): 16.32   actor gain: -0.54   critic loss: 0.41   steps: 1346


training loop:   2% |                                 | ETA:  37 days, 14:01:11

Episode: 1347   score: 16.78   Avg score (100e): 16.33   actor gain: -0.50   critic loss: 0.41   steps: 1347


training loop:   2% |                                 | ETA:  37 days, 13:56:47

Episode: 1348   score: 16.79   Avg score (100e): 16.34   actor gain: -0.50   critic loss: 0.41   steps: 1348


training loop:   2% |                                 | ETA:  37 days, 13:57:57

Episode: 1349   score: 16.80   Avg score (100e): 16.35   actor gain: -0.50   critic loss: 0.41   steps: 1349


training loop:   2% |                                 | ETA:  37 days, 14:02:20

Episode: 1350   score: 16.81   Avg score (100e): 16.36   actor gain: -0.50   critic loss: 0.41   steps: 1350


training loop:   2% |                                 | ETA:  37 days, 14:04:38

Episode: 1351   score: 16.82   Avg score (100e): 16.37   actor gain: -0.50   critic loss: 0.41   steps: 1351


training loop:   2% |                                 | ETA:  37 days, 14:03:51

Episode: 1352   score: 16.83   Avg score (100e): 16.38   actor gain: -0.50   critic loss: 0.41   steps: 1352


training loop:   2% |                                 | ETA:  37 days, 14:01:25

Episode: 1353   score: 16.84   Avg score (100e): 16.39   actor gain: -0.44   critic loss: 0.41   steps: 1353


training loop:   2% |                                 | ETA:  37 days, 13:57:17

Episode: 1354   score: 16.85   Avg score (100e): 16.40   actor gain: -0.44   critic loss: 0.41   steps: 1354


training loop:   2% |                                 | ETA:  37 days, 13:55:17

Episode: 1355   score: 16.86   Avg score (100e): 16.41   actor gain: -0.44   critic loss: 0.41   steps: 1355


training loop:   2% |                                 | ETA:  37 days, 13:54:00

Episode: 1356   score: 16.87   Avg score (100e): 16.42   actor gain: -0.46   critic loss: 0.41   steps: 1356


training loop:   2% |                                 | ETA:  37 days, 13:51:43

Episode: 1357   score: 16.87   Avg score (100e): 16.43   actor gain: -0.47   critic loss: 0.41   steps: 1357


training loop:   2% |                                 | ETA:  37 days, 13:49:31

Episode: 1358   score: 16.88   Avg score (100e): 16.44   actor gain: -0.47   critic loss: 0.41   steps: 1358


training loop:   2% |                                 | ETA:  37 days, 13:47:17

Episode: 1359   score: 16.89   Avg score (100e): 16.45   actor gain: -0.47   critic loss: 0.41   steps: 1359


training loop:   2% |                                 | ETA:  37 days, 13:43:53

Episode: 1360   score: 16.90   Avg score (100e): 16.46   actor gain: -0.46   critic loss: 0.41   steps: 1360


training loop:   2% |                                 | ETA:  37 days, 13:41:05

Episode: 1361   score: 16.91   Avg score (100e): 16.47   actor gain: -0.46   critic loss: 0.41   steps: 1361


training loop:   2% |                                 | ETA:  37 days, 13:36:57

Episode: 1362   score: 16.92   Avg score (100e): 16.48   actor gain: -0.46   critic loss: 0.41   steps: 1362


training loop:   2% |                                 | ETA:  37 days, 13:33:49

Episode: 1363   score: 16.91   Avg score (100e): 16.49   actor gain: -0.46   critic loss: 0.41   steps: 1363


training loop:   2% |                                 | ETA:  37 days, 13:29:37

Episode: 1364   score: 16.92   Avg score (100e): 16.49   actor gain: -0.45   critic loss: 0.41   steps: 1364


training loop:   2% |                                 | ETA:  37 days, 13:27:10

Episode: 1365   score: 16.92   Avg score (100e): 16.50   actor gain: -0.44   critic loss: 0.41   steps: 1365


training loop:   2% |                                 | ETA:  37 days, 13:39:19

Episode: 1366   score: 16.93   Avg score (100e): 16.51   actor gain: -0.44   critic loss: 0.41   steps: 1366


training loop:   2% |                                 | ETA:  37 days, 13:47:27

Episode: 1367   score: 16.93   Avg score (100e): 16.52   actor gain: -0.41   critic loss: 0.41   steps: 1367


training loop:   2% |                                 | ETA:  37 days, 13:59:14

Episode: 1368   score: 16.95   Avg score (100e): 16.53   actor gain: -0.41   critic loss: 0.41   steps: 1368


training loop:   2% |                                 | ETA:  37 days, 14:00:36

Episode: 1369   score: 16.95   Avg score (100e): 16.54   actor gain: -0.41   critic loss: 0.41   steps: 1369


training loop:   2% |                                 | ETA:  37 days, 13:58:15

Episode: 1370   score: 16.96   Avg score (100e): 16.55   actor gain: -0.41   critic loss: 0.41   steps: 1370


training loop:   2% |                                 | ETA:  37 days, 13:58:23

Episode: 1371   score: 16.97   Avg score (100e): 16.56   actor gain: -0.41   critic loss: 0.41   steps: 1371


training loop:   2% |                                 | ETA:  37 days, 13:55:49

Episode: 1372   score: 16.99   Avg score (100e): 16.57   actor gain: -0.41   critic loss: 0.41   steps: 1372


training loop:   2% |                                 | ETA:  37 days, 13:52:46

Episode: 1373   score: 17.00   Avg score (100e): 16.58   actor gain: -0.41   critic loss: 0.41   steps: 1373


training loop:   2% |                                 | ETA:  37 days, 13:59:01

Episode: 1374   score: 17.00   Avg score (100e): 16.59   actor gain: -0.41   critic loss: 0.41   steps: 1374


training loop:   2% |                                 | ETA:  37 days, 13:58:51

Episode: 1375   score: 17.01   Avg score (100e): 16.60   actor gain: -0.41   critic loss: 0.41   steps: 1375


training loop:   2% |                                 | ETA:  37 days, 13:58:29

Episode: 1376   score: 17.02   Avg score (100e): 16.61   actor gain: -0.41   critic loss: 0.41   steps: 1376


training loop:   2% |                                 | ETA:  37 days, 13:55:10

Episode: 1377   score: 17.02   Avg score (100e): 16.61   actor gain: -0.41   critic loss: 0.41   steps: 1377


training loop:   2% |                                 | ETA:  37 days, 13:49:16

Episode: 1378   score: 17.02   Avg score (100e): 16.62   actor gain: -0.41   critic loss: 0.41   steps: 1378


training loop:   2% |                                 | ETA:  37 days, 13:45:08

Episode: 1379   score: 17.03   Avg score (100e): 16.63   actor gain: -0.41   critic loss: 0.41   steps: 1379
np.all(done) is true! miracle!


training loop:   2% |                                 | ETA:  37 days, 13:39:48

Episode: 1380   score: 17.04   Avg score (100e): 16.64   actor gain: -0.41   critic loss: 0.41   steps: 1380


training loop:   2% |                                 | ETA:  37 days, 13:35:54

Episode: 1381   score: 17.05   Avg score (100e): 16.65   actor gain: -0.39   critic loss: 0.41   steps: 1381


training loop:   2% |                                 | ETA:  37 days, 13:33:18

Episode: 1382   score: 17.06   Avg score (100e): 16.66   actor gain: -0.38   critic loss: 0.41   steps: 1382


training loop:   2% |                                 | ETA:  37 days, 13:30:51

Episode: 1383   score: 17.07   Avg score (100e): 16.67   actor gain: -0.38   critic loss: 0.41   steps: 1383


training loop:   2% |                                 | ETA:  37 days, 13:26:08

Episode: 1384   score: 17.08   Avg score (100e): 16.67   actor gain: -0.38   critic loss: 0.41   steps: 1384


training loop:   2% |                                 | ETA:  37 days, 13:20:31

Episode: 1385   score: 17.09   Avg score (100e): 16.68   actor gain: -0.38   critic loss: 0.41   steps: 1385
np.all(done) is true! miracle!


training loop:   2% |                                 | ETA:  37 days, 13:15:58

Episode: 1386   score: 17.10   Avg score (100e): 16.69   actor gain: -0.38   critic loss: 0.41   steps: 1386


training loop:   2% |                                 | ETA:  37 days, 13:15:41

Episode: 1387   score: 17.11   Avg score (100e): 16.70   actor gain: -0.38   critic loss: 0.41   steps: 1387


training loop:   2% |                                 | ETA:  37 days, 13:17:58

Episode: 1388   score: 17.11   Avg score (100e): 16.71   actor gain: -0.38   critic loss: 0.41   steps: 1388


In [None]:
saveTrainedModel(agent, model_dir + model_name)

In [None]:
# plot the mean reward per episode
import matplotlib.pyplot as plt
%matplotlib inline

fig, ax = plt.subplots()
ax.plot(np.arange(len(mean_rewards)), mean_rewards)
ax.set_ylabel('Score')
ax.set_xlabel('Episode #')
plt.show()
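
The training log above reports an `Avg score (100e)` column next to the per-episode score. A similar trailing average can be computed from the plotted values to smooth the curve. The following is a minimal sketch, assuming `mean_rewards` is a 1-D sequence of per-episode scores; the helper `moving_average` and the sample data `demo_rewards` are hypothetical names introduced for illustration and are not part of the notebook.

```python
import numpy as np

def moving_average(values, window=100):
    """Trailing mean over the last `window` entries (fewer near the start)."""
    values = np.asarray(values, dtype=float)
    if values.size == 0:
        return values
    # Prefix sums with a leading zero, so csum[j] - csum[i] == sum(values[i:j])
    csum = np.cumsum(np.insert(values, 0, 0.0))
    end = np.arange(1, values.size + 1)    # exclusive end index of each window
    start = np.maximum(end - window, 0)    # clamp the window at the first entry
    return (csum[end] - csum[start]) / (end - start)

# Hypothetical sample data standing in for mean_rewards
demo_rewards = [1.0, 2.0, 3.0, 4.0]
smoothed = moving_average(demo_rewards, window=2)
print(smoothed)
```

To overlay the smoothed curve on the plot, one option is to call `plt.plot(moving_average(mean_rewards))` before `plt.show()`.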

In [None]:
scores = np.zeros(num_agents)                # initialize the score (for each agent)
for _ in range(10):
    agent.step(train_mode=False)             # lower eps and train_mode=False
    episode_reward = agent.running_rewards
    scores += episode_reward                 # update the score (for each agent)
print('Total score (averaged over agents) this episode: {}'.format(np.mean(scores)))

In [None]:
env.close()