# Continuous Control

---

In this notebook, you will learn how to use the Unity ML-Agents environment for the second project of the [Deep Reinforcement Learning Nanodegree](https://www.udacity.com/course/deep-reinforcement-learning-nanodegree--nd893) program.

### 1. Start the Environment

We begin by importing the necessary packages.  If the code cell below returns an error, please revisit the project instructions to double-check that you have installed [Unity ML-Agents](https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Installation.md) and [NumPy](http://www.numpy.org/).

In [1]:
from unityagents import UnityEnvironment
import numpy as np
from collections import namedtuple, deque

In [None]:
%load_ext autoreload
%autoreload 2

In [2]:
!wandb login xyz

Successfully logged in to Weights & Biases!


wandb: Appending key for api.wandb.ai to your netrc file: C:\Users\Adam/.netrc


Next, we will start the environment!  **_Before running the code cell below_**, change the `file_name` parameter to match the location of the Unity environment that you downloaded.

- **Mac**: `"path/to/Reacher.app"`
- **Windows** (x86): `"path/to/Reacher_Windows_x86/Reacher.exe"`
- **Windows** (x86_64): `"path/to/Reacher_Windows_x86_64/Reacher.exe"`
- **Linux** (x86): `"path/to/Reacher_Linux/Reacher.x86"`
- **Linux** (x86_64): `"path/to/Reacher_Linux/Reacher.x86_64"`
- **Linux** (x86, headless): `"path/to/Reacher_Linux_NoVis/Reacher.x86"`
- **Linux** (x86_64, headless): `"path/to/Reacher_Linux_NoVis/Reacher.x86_64"`

For instance, if you are using a Mac, then you downloaded `Reacher.app`.  If this file is in the same folder as the notebook, then the line below should appear as follows:
```
env = UnityEnvironment(file_name="Reacher.app")
```

In [3]:
env = UnityEnvironment(file_name='./reacher/Reacher.exe')

INFO:unityagents:
'Academy' started successfully!
Unity Academy name: Academy
        Number of Brains: 1
        Number of External Brains : 1
        Lesson number : 0
        Reset Parameters :
		goal_speed -> 1.0
		goal_size -> 5.0
Unity brain name: ReacherBrain
        Number of Visual Observations (per agent): 0
        Vector Observation space type: continuous
        Vector Observation space size (per agent): 33
        Number of stacked Vector Observation: 1
        Vector Action space type: continuous
        Vector Action space size (per agent): 4
        Vector Action descriptions: , , , 


Environments contain **_brains_** which are responsible for deciding the actions of their associated agents. Here we check for the first brain available, and set it as the default brain we will be controlling from Python.

In [4]:
# get the default brain
brain_name = env.brain_names[0]
brain = env.brains[brain_name]

### 2. Examine the State and Action Spaces

In this environment, a double-jointed arm can move to target locations. A reward of `+0.1` is provided for each step that the agent's hand is in the goal location. Thus, the goal of your agent is to maintain its position at the target location for as many time steps as possible.

The observation space consists of `33` variables corresponding to position, rotation, velocity, and angular velocities of the arm.  Each action is a vector with four numbers, corresponding to torque applicable to two joints.  Every entry in the action vector must be a number between `-1` and `1`.

Run the code cell below to print some information about the environment.

In [5]:
# reset the environment
env_info = env.reset(train_mode=True)[brain_name]

# number of agents
num_agents = len(env_info.agents)
print('Number of agents:', num_agents)

# size of each action
action_size = brain.vector_action_space_size
print('Size of each action:', action_size)

# examine the state space 
states = env_info.vector_observations
state_size = states.shape[1]
print('There are {} agents. Each observes a state with length: {}'.format(states.shape[0], state_size))
print('The state for the first agent looks like:', states[0])

Number of agents: 20
Size of each action: 4
There are 20 agents. Each observes a state with length: 33
The state for the first agent looks like: [ 0.00000000e+00 -4.00000000e+00  0.00000000e+00  1.00000000e+00
 -0.00000000e+00 -0.00000000e+00 -4.37113883e-08  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00 -1.00000000e+01  0.00000000e+00
  1.00000000e+00 -0.00000000e+00 -0.00000000e+00 -4.37113883e-08
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  5.75471878e+00 -1.00000000e+00
  5.55726624e+00  0.00000000e+00  1.00000000e+00  0.00000000e+00
 -1.68164849e-01]


## Init AI

New run ...

Hyper parameters...

In [53]:
import wandb

BUFFER_SIZE = int(1e5)  # replay buffer size
BATCH_SIZE = 512        # minibatch size
GAMMA = 0.99            # discount factor
TAU = 1e-3              # for soft update of target parameters
LR_ACTOR = 1e-3         # learning rate of the actor 
LR_CRITIC = 1e-3        # learning rate of the critic
WEIGHT_DECAY = 0.0   # L2 weight decay
OU_SIGMA = 0.20
OU_THETA = 0.15
LEARN_INTERVAL = 6
LEARN_NUM = 2

config = {
    "BUFFER_SIZE": BUFFER_SIZE,
    "BATCH_SIZE": BATCH_SIZE,
    "GAMMA": GAMMA,
    "TAU": TAU,
    "LR_ACTOR": LR_ACTOR,
    "LR_CRITIC": LR_CRITIC,
    "WEIGHT_DECAY": WEIGHT_DECAY,
    "OU_SIGMA": OU_SIGMA,
    "OU_THETA" : OU_THETA,
    "LEARN_INTERVAL": LEARN_INTERVAL,
    "LEARN_NUM": LEARN_NUM
}
wandb.init(config=config, project="reacher_ddpg_continuous_control")

INFO:wandb.run_manager:system metrics and metadata threads started
INFO:wandb.run_manager:checking resume status, waiting at most 10 seconds
INFO:wandb.run_manager:resuming run from id: UnVuOnYxOnJ6MGY1aDN6OnJlYWNoZXJfZGRwZ19jb250aW51b3VzX2NvbnRyb2w6YWRhbV9ibHZjaw==
INFO:wandb.run_manager:upserting run before process can begin, waiting at most 10 seconds
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\config.yaml
INFO:wandb.run_manager:saving patches
INFO:wandb.run_manager:saving pip packages
INFO:wandb.run_manager:initializing streaming files api
INFO:wandb.run_manager:unblocking file change observer, beginning sync with W&B servers


W&B Run: https://app.wandb.ai/adam_blvck/reacher_ddpg_continuous_control/runs/rz0f5h3z

INFO:wandb.run_manager:shutting down system stats and metadata service
INFO:wandb.run_manager:file/dir created: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir created: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir created: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir created: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir created: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuou

Device...

In [54]:
import torch
from ddpg_agent import Agent

In [55]:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
device

device(type='cuda', index=0)

Create AI agent...

In [57]:
agent = Agent( device, state_size, num_agents, action_size, 1, \
                BUFFER_SIZE, BATCH_SIZE, GAMMA, TAU, LR_ACTOR, LR_CRITIC, WEIGHT_DECAY, \
                 LEARN_INTERVAL, LEARN_NUM, OU_SIGMA, OU_THETA)

wandb.watch(agent.actor_local)
wandb.watch(agent.critic_local)

[<wandb.wandb_torch.TorchGraph at 0x163a71b6320>]

## Train AI

In [58]:
def save_agent(agent, checkpoint_name):
    torch.save(agent.actor_local.state_dict(), checkpoint_name+'_actor.pth')
    torch.save(agent.critic_local.state_dict(), checkpoint_name+'_critic.pth')

def load_agent(agent, checkpoint_name):
    agent.actor_local.load_state_dict(torch.load(checkpoint_name+'_actor.pth'))
    agent.critic_local.load_state_dict(torch.load(checkpoint_name+'_critic.pth'))
    
def train_ddpg(n_episodes=2000, max_t=1000, checkpoint_name='checkpoint'):
    scores = []
    scores_window = deque(maxlen=100)
    max_score = np.Inf
    
    for i_episode in range(1, n_episodes+1):
        
        # reset => env, agent, score
        env_info = env.reset(train_mode=True)[brain_name]
        states = env_info.vector_observations
        agent.reset()
        score = np.zeros(num_agents)
        
        # run for maxt_ time (or when `done` is reached)
        # - Agent decides action - agent.act
        # - Environement accepts action - env.step
        # - Agent is informed about taken action - agent.step
        for t in range(max_t):
            
            # Mr AI, what action should we take?
            actions = agent.act(states)
            
            # Take the action in the environment
            env_info = env.step(actions)[brain_name]
            next_states = env_info.vector_observations
            rewards = env_info.rewards
            dones = env_info.local_done
            
            # Mr AI, if you choose to accept (for every human instance)
            agent.step(t, states, actions, rewards, next_states, dones)
            
            # Loop around & keep track of score
            states = next_states
            score += rewards
            
            if np.any(dones): # if any agent is at it's completion
                break
                
        # log
        scores_window.append(np.mean(score))
        scores.append(np.mean(score))
        
        # log avg score for this episode across agents
        wandb.log({"mean_score": np.mean(score)})
        avg_score = np.mean(scores_window)
        print('\rEpisode {}\tAverage Score: {:.2f}\tScore: {:.2f}'.format(i_episode, \
                                                                          np.mean(scores_window), \
                                                                          np.mean(score) ,\
                                                                          end=""))
        
        # log avg score in last 100 episodes
        if i_episode % 20 == 0 or avg_score > 30.0:
            save_agent(agent, checkpoint_name)
            wandb.log({"window_score": np.mean(scores_window)})
            print('\rEpisode {}\tAverage Score: {:.2f}'.format(i_episode, np.mean(scores_window)))   
        
        if avg_score >= 31.0:
            print('\nEnvironment solved in {:d} episodes!\tAverage score: {:.2f}'.format(i_episode, average_score))
            break
        
    return scores

In [59]:
scores = train_ddpg()

fig = plt.figure()
ax = fig.add_subplot(111)
plt.plot(np.arange(1, len(scores)+1), scores)
plt.ylabel('Score')
plt.xlabel('Episode #')
plt.show()

INFO:wandb.run_manager:system metrics and metadata threads started
INFO:wandb.run_manager:checking resume status, waiting at most 10 seconds
INFO:wandb.run_manager:resuming run from id: UnVuOnYxOnJ6MGY1aDN6OnJlYWNoZXJfZGRwZ19jb250aW51b3VzX2NvbnRyb2w6YWRhbV9ibHZjaw==
INFO:wandb.run_manager:upserting run before process can begin, waiting at most 10 seconds
INFO:wandb.run_manager:saving patches
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:saving pip packages
INFO:wandb.run_manager:initializing streaming files api
INFO:wandb.run_manager:unblocking file change observer, beginning sync with W&B servers
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-lear

Episode 1	Average Score: 0.83	Score: 0.83


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 2	Average Score: 0.78	Score: 0.72


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 3	Average Score: 0.73	Score: 0.65


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json


Episode 4	Average Score: 0.76	Score: 0.83


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 5	Average Score: 0.88	Score: 1.37


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 6	Average Score: 0.90	Score: 0.97


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl


Episode 7	Average Score: 0.95	Score: 1.24


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 8	Average Score: 0.99	Score: 1.26


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 9	Average Score: 1.00	Score: 1.16


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl


Episode 10	Average Score: 1.02	Score: 1.17


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json


Episode 11	Average Score: 1.10	Score: 1.95


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 12	Average Score: 1.15	Score: 1.66


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl


Episode 13	Average Score: 1.14	Score: 1.04


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 14	Average Score: 1.12	Score: 0.86


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 15	Average Score: 1.13	Score: 1.17


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 16	Average Score: 1.16	Score: 1.61


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl


Episode 17	Average Score: 1.20	Score: 1.89


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 18	Average Score: 1.24	Score: 2.00


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json


Episode 19	Average Score: 1.27	Score: 1.79


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl


Episode 20	Average Score: 1.28	Score: 1.48
Episode 20	Average Score: 1.28


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json


Episode 21	Average Score: 1.31	Score: 1.82


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 22	Average Score: 1.36	Score: 2.44


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl


Episode 23	Average Score: 1.36	Score: 1.45


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 24	Average Score: 1.38	Score: 1.84


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 25	Average Score: 1.48	Score: 3.73


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 26	Average Score: 1.55	Score: 3.38


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl


Episode 27	Average Score: 1.63	Score: 3.69


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 28	Average Score: 1.77	Score: 5.55


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 29	Average Score: 1.99	Score: 8.27


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl


Episode 30	Average Score: 2.32	Score: 11.63


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 31	Average Score: 2.74	Score: 15.49


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 32	Average Score: 3.15	Score: 16.01


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 33	Average Score: 3.58	Score: 17.14


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl


Episode 34	Average Score: 4.10	Score: 21.46


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 35	Average Score: 4.71	Score: 25.40


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json


Episode 36	Average Score: 5.28	Score: 25.02


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl


Episode 37	Average Score: 5.89	Score: 28.00


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 38	Average Score: 6.52	Score: 29.66


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 39	Average Score: 7.17	Score: 31.89


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl


Episode 40	Average Score: 7.78	Score: 31.82
Episode 40	Average Score: 7.78


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json


Episode 41	Average Score: 8.45	Score: 35.23


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 42	Average Score: 9.11	Score: 35.99


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl


Episode 43	Average Score: 9.72	Score: 35.58


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl


Episode 44	Average Score: 10.34	Score: 37.04


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 45	Average Score: 10.92	Score: 36.43


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl


Episode 46	Average Score: 11.48	Score: 36.69


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl


Episode 47	Average Score: 12.02	Score: 36.84


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 48	Average Score: 12.56	Score: 37.64


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 49	Average Score: 13.06	Score: 37.02


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 50	Average Score: 13.55	Score: 37.49


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl


Episode 51	Average Score: 14.02	Score: 37.58


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 52	Average Score: 14.47	Score: 37.68


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json


Episode 53	Average Score: 14.90	Score: 37.18


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl


Episode 54	Average Score: 15.30	Score: 36.42


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 55	Average Score: 15.69	Score: 36.80


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 56	Average Score: 16.07	Score: 37.18


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl


Episode 57	Average Score: 16.44	Score: 36.90


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 58	Average Score: 16.79	Score: 36.91


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 59	Average Score: 17.13	Score: 36.73


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl


Episode 60	Average Score: 17.47	Score: 37.42
Episode 60	Average Score: 17.47


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl


Episode 61	Average Score: 17.79	Score: 37.39


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 62	Average Score: 18.12	Score: 37.71


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl


Episode 63	Average Score: 18.43	Score: 37.97


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl


Episode 64	Average Score: 18.74	Score: 37.95


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 65	Average Score: 19.03	Score: 37.56


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 66	Average Score: 19.31	Score: 37.51


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl


Episode 67	Average Score: 19.58	Score: 37.68


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 68	Average Score: 19.84	Score: 37.18


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 69	Average Score: 20.10	Score: 38.04


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 70	Average Score: 20.36	Score: 37.77


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 71	Average Score: 20.60	Score: 37.67


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 72	Average Score: 20.83	Score: 37.14


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 73	Average Score: 21.06	Score: 37.50


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl


Episode 74	Average Score: 21.28	Score: 37.86


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 75	Average Score: 21.50	Score: 37.48


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 76	Average Score: 21.72	Score: 37.83


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl


Episode 77	Average Score: 21.93	Score: 37.89


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json


Episode 78	Average Score: 22.12	Score: 37.34


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 79	Average Score: 22.30	Score: 35.93


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 80	Average Score: 22.48	Score: 36.91
Episode 80	Average Score: 22.48


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json


Episode 81	Average Score: 22.66	Score: 37.01


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 82	Average Score: 22.83	Score: 36.88


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 83	Average Score: 23.00	Score: 36.98


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl


Episode 84	Average Score: 23.18	Score: 37.42


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 85	Average Score: 23.34	Score: 37.15


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl


Episode 86	Average Score: 23.50	Score: 36.99


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl


Episode 87	Average Score: 23.65	Score: 36.82


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 88	Average Score: 23.80	Score: 36.57


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 89	Average Score: 23.94	Score: 36.48


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl


Episode 90	Average Score: 24.08	Score: 36.48


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 91	Average Score: 24.23	Score: 37.49


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 92	Average Score: 24.37	Score: 37.65


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl


Episode 93	Average Score: 24.51	Score: 36.81


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 94	Average Score: 24.64	Score: 37.36


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 95	Average Score: 24.77	Score: 36.96


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 96	Average Score: 24.90	Score: 37.05


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl


Episode 97	Average Score: 25.03	Score: 37.17


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 98	Average Score: 25.15	Score: 37.29


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 99	Average Score: 25.27	Score: 37.20


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl


Episode 100	Average Score: 25.39	Score: 36.66
Episode 100	Average Score: 25.39


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 101	Average Score: 25.74	Score: 36.08


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 102	Average Score: 26.10	Score: 36.59


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json


Episode 103	Average Score: 26.47	Score: 37.31


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 104	Average Score: 26.83	Score: 36.83


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json


Episode 105	Average Score: 27.18	Score: 36.95


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl


Episode 106	Average Score: 27.53	Score: 35.93


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl


Episode 107	Average Score: 27.89	Score: 37.39


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 108	Average Score: 28.25	Score: 37.26


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json


Episode 109	Average Score: 28.62	Score: 37.73


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl


Episode 110	Average Score: 28.98	Score: 37.43


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 111	Average Score: 29.34	Score: 37.38


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 112	Average Score: 29.69	Score: 37.48


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 113	Average Score: 30.05	Score: 36.86
Episode 113	Average Score: 30.05


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 114	Average Score: 30.41	Score: 36.63
Episode 114	Average Score: 30.41


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 115	Average Score: 30.77	Score: 36.88
Episode 115	Average Score: 30.77


INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json


Episode 116	Average Score: 31.12	Score: 37.21
Episode 116	Average Score: 31.12


NameError: name 'average_score' is not defined

INFO:wandb.run_manager:shutting down system stats and metadata service
INFO:wandb.run_manager:stopping streaming files and file change observer
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-metadata.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-history.jsonl
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-summary.json
INFO:wandb.run_manager:file/dir modified: D:\Code\DRL_udacity\deep-reinforcement-learning\p2_continuous-control\wandb\run-20191120_144308-rz0f5h3z\wandb-events.jsonl


## Observe AI

In [None]:
load_agent(agent, 'checkpoint')

In [61]:
env_info = env.reset(train_mode=False)[brain_name]     # reset the environment    
states = env_info.vector_observations                  # get the current state (for each agent)
score = np.zeros(num_agents)                          # initialize the score (for each agent)

while True:

    # Mr AI, what action should we take?
    actions = agent.act(states)

    # Take the action in the environment
    env_info = env.step(actions)[brain_name]
    next_states = env_info.vector_observations
    rewards = env_info.rewards
    dones = env_info.local_done

    # Mr AI, here are the results of your action, and the next state
    # agent.step(states, actions, rewards, next_states, dones)

    # Loop around & keep track of score
    states = next_states
    score += rewards

    if np.any(dones): # if any agent is at it's completion
        break
print('Total score (averaged over agents) this episode: {}'.format(np.mean(score)))

Total score (averaged over agents) this episode: 37.15949916942045


In [None]:
env.close()