# Continuous Control

---

Congratulations for completing the second project of the [Deep Reinforcement Learning Nanodegree](https://www.udacity.com/course/deep-reinforcement-learning-nanodegree--nd893) program!  In this notebook, you will learn how to control an agent in a more challenging environment, where the goal is to train a creature with four arms to walk forward.  **Note that this exercise is optional!**

### 1. Start the Environment

We begin by importing the necessary packages.  If the code cell below returns an error, please revisit the project instructions to double-check that you have installed [Unity ML-Agents](https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Installation.md) and [NumPy](http://www.numpy.org/).

In [1]:
from unityagents import UnityEnvironment
import numpy as np

Next, we will start the environment!  **_Before running the code cell below_**, change the `file_name` parameter to match the location of the Unity environment that you downloaded.

- **Mac**: `"path/to/Crawler.app"`
- **Windows** (x86): `"path/to/Crawler_Windows_x86/Crawler.exe"`
- **Windows** (x86_64): `"path/to/Crawler_Windows_x86_64/Crawler.exe"`
- **Linux** (x86): `"path/to/Crawler_Linux/Crawler.x86"`
- **Linux** (x86_64): `"path/to/Crawler_Linux/Crawler.x86_64"`
- **Linux** (x86, headless): `"path/to/Crawler_Linux_NoVis/Crawler.x86"`
- **Linux** (x86_64, headless): `"path/to/Crawler_Linux_NoVis/Crawler.x86_64"`

For instance, if you are using a Mac, then you downloaded `Crawler.app`.  If this file is in the same folder as the notebook, then the line below should appear as follows:
```
env = UnityEnvironment(file_name="Crawler.app")
```

In [2]:
env = UnityEnvironment(file_name='/home/lunarpulse/Documents/DRLND/deep-reinforcement-learning/p2_continuous-control/Crawler_Linux/Crawler.x86_64')

INFO:unityagents:
'Academy' started successfully!
Unity Academy name: Academy
        Number of Brains: 1
        Number of External Brains : 1
        Lesson number : 0
        Reset Parameters :
		
Unity brain name: CrawlerBrain
        Number of Visual Observations (per agent): 0
        Vector Observation space type: continuous
        Vector Observation space size (per agent): 129
        Number of stacked Vector Observation: 1
        Vector Action space type: continuous
        Vector Action space size (per agent): 20
        Vector Action descriptions: , , , , , , , , , , , , , , , , , , , 


Environments contain **_brains_** which are responsible for deciding the actions of their associated agents. Here we check for the first brain available, and set it as the default brain we will be controlling from Python.

In [3]:
# get the default brain
brain_name = env.brain_names[0]
brain = env.brains[brain_name]

### 2. Examine the State and Action Spaces

Run the code cell below to print some information about the environment.

In [4]:
# reset the environment
env_info = env.reset(train_mode=True)[brain_name]

# number of agents
num_agents = len(env_info.agents)
print('Number of agents:', num_agents)

# size of each action
action_size = brain.vector_action_space_size
print('Size of each action:', action_size)

# examine the state space 
states = env_info.vector_observations
state_size = states.shape[1]
print('There are {} agents. Each observes a state with length: {}'.format(states.shape[0], state_size))
print('The state for the first agent looks like:', states[0])

Number of agents: 12
Size of each action: 20
There are 12 agents. Each observes a state with length: 129
The state for the first agent looks like: [ 0.00000000e+00  0.00000000e+00  0.00000000e+00  2.25000000e+00
  1.00000000e+00  0.00000000e+00  1.78813934e-07  0.00000000e+00
  1.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  6.06093168e-01 -1.42857209e-01 -6.06078804e-01  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  1.33339906e+00 -1.42857209e-01
 -1.33341408e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
 -6.0609

### 3. Take Random Actions in the Environment

In the next code cell, you will learn how to use the Python API to control the agent and receive feedback from the environment.

Once this cell is executed, you will watch the agent's performance, if it selects an action at random with each time step.  A window should pop up that allows you to observe the agent, as it moves through the environment.  

Of course, as part of the project, you'll have to change the code so that the agent is able to use its experience to gradually choose better actions when interacting with the environment!

In [5]:
env_info = env.reset(train_mode=False)[brain_name]     # reset the environment    
states = env_info.vector_observations                  # get the current state (for each agent)
scores = np.zeros(num_agents)                          # initialize the score (for each agent)
while True:
    actions = np.random.randn(num_agents, action_size) # select an action (for each agent)
    actions = np.clip(actions, -1, 1)                  # all actions between -1 and 1
    env_info = env.step(actions)[brain_name]           # send all actions to tne environment
    next_states = env_info.vector_observations         # get next state (for each agent)
    rewards = env_info.rewards                         # get reward (for each agent)
    dones = env_info.local_done                        # see if episode finished
    scores += env_info.rewards                         # update the score (for each agent)
    states = next_states                               # roll over states to next time step
    if np.any(dones):                                  # exit loop if episode finished
        break
print('Total score (averaged over agents) this episode: {}'.format(np.mean(scores)))

Total score (averaged over agents) this episode: 0.33929963087818277


When finished, you can close the environment.

In [6]:
# env.close()

### 4. It's Your Turn!

Now it's your turn to train your own agent to solve the environment!  When training the environment, set `train_mode=True`, so that the line for resetting the environment looks like the following:
```python
env_info = env.reset(train_mode=True)[brain_name]
```

In [7]:
from agent import DDPG_agent
import numpy as np
import torch

from collections import namedtuple, deque

In [8]:
agents = DDPG_agent(state_size=state_size, action_size=action_size, num_agents=num_agents, random_seed=0)
n_episodes = 1000
print_every = 10

In [9]:
def ddpg(n_episodes=2000, max_t=1000):
    scores_deque = deque(maxlen=100)
    scores = []
    for i_episode in range(1, n_episodes+1):
        env_info = env.reset(train_mode=True)[brain_name]
        state = env_info.vector_observations
        agents.reset()
        score = np.zeros(num_agents)
        for t in range(max_t):
            action = agents.act(state)
            env_info = env.step(action)[brain_name]
            next_state = env_info.vector_observations
            rewards = env_info.rewards
            dones = env_info.local_done
            agents.step(state, action, rewards, next_state, dones)
            state = next_state
            score += rewards
            if np.any(dones):
                print('\tSteps: ', t)
                break 
        scores_deque.append(np.mean(score))
        scores.append(np.mean(score))
        print('\rEpisode {}\tAverage Score: {:.2f}\tScore: {:.3f}'.format(i_episode, 
                                                                          np.mean(scores_deque), 
                                                                          np.mean(score)))
        average_score = np.mean(scores_deque)
        if i_episode % print_every == 20 or average_score > 30:
            print('\rEpisode {}\tAverage Score: {:.2f}'.format(i_episode, average_score))
            torch.save(agents.actor_local.state_dict(), 'crawler_checkpoint_actor.pth')
            torch.save(agents.critic_local.state_dict(), 'crawler_checkpoint_critic.pth') 
            if average_score > 30:
                break
    return scores

In [10]:
score = ddpg()

	Steps:  7
Episode 1	Average Score: 0.35	Score: 0.351
	Steps:  7
Episode 2	Average Score: 0.28	Score: 0.219
	Steps:  7
Episode 3	Average Score: 0.30	Score: 0.322
	Steps:  7
Episode 4	Average Score: 0.28	Score: 0.214
	Steps:  8
Episode 5	Average Score: 0.31	Score: 0.429
	Steps:  7
Episode 6	Average Score: 0.30	Score: 0.281
	Steps:  19
Episode 7	Average Score: 0.28	Score: 0.125
	Steps:  8
Episode 8	Average Score: 0.27	Score: 0.247
	Steps:  7
Episode 9	Average Score: 0.27	Score: 0.285
	Steps:  8
Episode 10	Average Score: 0.27	Score: 0.215
	Steps:  7
Episode 11	Average Score: 0.29	Score: 0.520
	Steps:  7
Episode 12	Average Score: 0.30	Score: 0.413
	Steps:  17
Episode 13	Average Score: 0.26	Score: -0.210
	Steps:  11
Episode 14	Average Score: 0.27	Score: 0.408
	Steps:  10
Episode 15	Average Score: 0.26	Score: 0.136
	Steps:  11
Episode 16	Average Score: 0.25	Score: 0.123
	Steps:  11
Episode 17	Average Score: 0.25	Score: 0.105
	Steps:  7
Episode 18	Average Score: 0.25	Score: 0.306
	Steps:  7
E

	Steps:  7
Episode 150	Average Score: 0.40	Score: 0.468
	Steps:  9
Episode 151	Average Score: 0.40	Score: 0.359
	Steps:  8
Episode 152	Average Score: 0.40	Score: 0.410
	Steps:  8
Episode 153	Average Score: 0.39	Score: 0.363
	Steps:  8
Episode 154	Average Score: 0.39	Score: 0.451
	Steps:  8
Episode 155	Average Score: 0.40	Score: 0.332
	Steps:  9
Episode 156	Average Score: 0.39	Score: 0.262
	Steps:  7
Episode 157	Average Score: 0.39	Score: 0.508
	Steps:  8
Episode 158	Average Score: 0.39	Score: 0.420
	Steps:  9
Episode 159	Average Score: 0.39	Score: 0.386
	Steps:  7
Episode 160	Average Score: 0.40	Score: 0.459
	Steps:  8
Episode 161	Average Score: 0.40	Score: 0.457
	Steps:  7
Episode 162	Average Score: 0.40	Score: 0.453
	Steps:  7
Episode 163	Average Score: 0.40	Score: 0.423
	Steps:  8
Episode 164	Average Score: 0.39	Score: 0.290
	Steps:  8
Episode 165	Average Score: 0.39	Score: 0.337
	Steps:  8
Episode 166	Average Score: 0.39	Score: 0.273
	Steps:  7
Episode 167	Average Score: 0.39	Score

	Steps:  21
Episode 297	Average Score: 0.57	Score: 0.780
	Steps:  12
Episode 298	Average Score: 0.58	Score: 0.490
	Steps:  7
Episode 299	Average Score: 0.58	Score: 0.586
	Steps:  8
Episode 300	Average Score: 0.58	Score: 0.557
	Steps:  8
Episode 301	Average Score: 0.58	Score: 0.608
	Steps:  9
Episode 302	Average Score: 0.58	Score: 0.627
	Steps:  7
Episode 303	Average Score: 0.58	Score: 0.604
	Steps:  7
Episode 304	Average Score: 0.59	Score: 0.546
	Steps:  14
Episode 305	Average Score: 0.59	Score: 0.649
	Steps:  9
Episode 306	Average Score: 0.59	Score: 0.633
	Steps:  10
Episode 307	Average Score: 0.59	Score: 0.585
	Steps:  13
Episode 308	Average Score: 0.59	Score: 0.563
	Steps:  8
Episode 309	Average Score: 0.59	Score: 0.459
	Steps:  9
Episode 310	Average Score: 0.59	Score: 0.668
	Steps:  8
Episode 311	Average Score: 0.59	Score: 0.528
	Steps:  8
Episode 312	Average Score: 0.59	Score: 0.525
	Steps:  7
Episode 313	Average Score: 0.59	Score: 0.507
	Steps:  7
Episode 314	Average Score: 0.59	

	Steps:  7
Episode 444	Average Score: 0.56	Score: 0.627
	Steps:  9
Episode 445	Average Score: 0.56	Score: 0.442
	Steps:  8
Episode 446	Average Score: 0.56	Score: 0.649
	Steps:  9
Episode 447	Average Score: 0.56	Score: 0.554
	Steps:  29
Episode 448	Average Score: 0.57	Score: 1.265
	Steps:  28
Episode 449	Average Score: 0.57	Score: -0.015
	Steps:  15
Episode 450	Average Score: 0.56	Score: -0.384
	Steps:  16
Episode 451	Average Score: 0.55	Score: 0.195
	Steps:  17
Episode 452	Average Score: 0.54	Score: -0.244
	Steps:  13
Episode 453	Average Score: 0.54	Score: 0.048
	Steps:  14
Episode 454	Average Score: 0.54	Score: 0.431
	Steps:  13
Episode 455	Average Score: 0.53	Score: -0.281
	Steps:  18
Episode 456	Average Score: 0.52	Score: -0.036
	Steps:  11
Episode 457	Average Score: 0.52	Score: 0.299
	Steps:  11
Episode 458	Average Score: 0.52	Score: 0.378
	Steps:  12
Episode 459	Average Score: 0.52	Score: 0.249
	Steps:  8
Episode 460	Average Score: 0.52	Score: 0.395
	Steps:  16
Episode 461	Average

	Steps:  14
Episode 590	Average Score: 0.52	Score: 1.224
	Steps:  8
Episode 591	Average Score: 0.52	Score: 0.598
	Steps:  10
Episode 592	Average Score: 0.52	Score: 0.797
	Steps:  9
Episode 593	Average Score: 0.52	Score: 0.769
	Steps:  11
Episode 594	Average Score: 0.52	Score: 0.749
	Steps:  9
Episode 595	Average Score: 0.53	Score: 0.778
	Steps:  10
Episode 596	Average Score: 0.53	Score: 0.457
	Steps:  8
Episode 597	Average Score: 0.53	Score: 0.593
	Steps:  7
Episode 598	Average Score: 0.53	Score: 0.605
	Steps:  7
Episode 599	Average Score: 0.53	Score: 0.514
	Steps:  8
Episode 600	Average Score: 0.53	Score: 0.629
	Steps:  10
Episode 601	Average Score: 0.53	Score: 0.656
	Steps:  7
Episode 602	Average Score: 0.53	Score: 0.412
	Steps:  13
Episode 603	Average Score: 0.53	Score: 0.980
	Steps:  11
Episode 604	Average Score: 0.53	Score: 0.473
	Steps:  9
Episode 605	Average Score: 0.53	Score: 0.667
	Steps:  10
Episode 606	Average Score: 0.53	Score: 0.520
	Steps:  10
Episode 607	Average Score: 0

	Steps:  21
Episode 735	Average Score: 0.65	Score: 0.342
	Steps:  12
Episode 736	Average Score: 0.65	Score: 0.609
	Steps:  25
Episode 737	Average Score: 0.65	Score: 0.238
	Steps:  12
Episode 738	Average Score: 0.64	Score: 0.427
	Steps:  13
Episode 739	Average Score: 0.64	Score: 0.406
	Steps:  10
Episode 740	Average Score: 0.65	Score: 0.611
	Steps:  17
Episode 741	Average Score: 0.65	Score: 0.608
	Steps:  11
Episode 742	Average Score: 0.64	Score: 0.364
	Steps:  12
Episode 743	Average Score: 0.64	Score: 0.404
	Steps:  20
Episode 744	Average Score: 0.64	Score: 0.499
	Steps:  14
Episode 745	Average Score: 0.64	Score: 0.665
	Steps:  19
Episode 746	Average Score: 0.64	Score: 0.770
	Steps:  18
Episode 747	Average Score: 0.64	Score: 0.586
	Steps:  16
Episode 748	Average Score: 0.64	Score: 0.675
	Steps:  32
Episode 749	Average Score: 0.63	Score: -0.264
	Steps:  10
Episode 750	Average Score: 0.64	Score: 0.634
	Steps:  15
Episode 751	Average Score: 0.63	Score: 0.741
	Steps:  29
Episode 752	Averag

	Steps:  9
Episode 882	Average Score: 0.63	Score: 0.590
	Steps:  7
Episode 883	Average Score: 0.62	Score: 0.600
	Steps:  10
Episode 884	Average Score: 0.62	Score: 0.619
	Steps:  6
Episode 885	Average Score: 0.62	Score: 0.425
	Steps:  10
Episode 886	Average Score: 0.62	Score: 1.001
	Steps:  9
Episode 887	Average Score: 0.63	Score: 0.692
	Steps:  7
Episode 888	Average Score: 0.63	Score: 0.622
	Steps:  7
Episode 889	Average Score: 0.62	Score: 0.560
	Steps:  7
Episode 890	Average Score: 0.62	Score: 0.608
	Steps:  8
Episode 891	Average Score: 0.62	Score: 0.743
	Steps:  6
Episode 892	Average Score: 0.62	Score: 0.459
	Steps:  7
Episode 893	Average Score: 0.61	Score: 0.622
	Steps:  8
Episode 894	Average Score: 0.61	Score: 0.717
	Steps:  9
Episode 895	Average Score: 0.62	Score: 0.659
	Steps:  14
Episode 896	Average Score: 0.62	Score: 1.166
	Steps:  7
Episode 897	Average Score: 0.62	Score: 0.710
	Steps:  8
Episode 898	Average Score: 0.62	Score: 0.729
	Steps:  7
Episode 899	Average Score: 0.62	Sc

	Steps:  22
Episode 1026	Average Score: 1.18	Score: 0.687
	Steps:  27
Episode 1027	Average Score: 1.19	Score: 1.750
	Steps:  7
Episode 1028	Average Score: 1.19	Score: 0.513
	Steps:  15
Episode 1029	Average Score: 1.19	Score: 0.922
	Steps:  22
Episode 1030	Average Score: 1.19	Score: 1.215
	Steps:  7
Episode 1031	Average Score: 1.19	Score: 0.503
	Steps:  8
Episode 1032	Average Score: 1.18	Score: 0.605
	Steps:  22
Episode 1033	Average Score: 1.19	Score: 1.222
	Steps:  17
Episode 1034	Average Score: 1.18	Score: 0.890
	Steps:  12
Episode 1035	Average Score: 1.18	Score: 1.066
	Steps:  12
Episode 1036	Average Score: 1.19	Score: 1.028
	Steps:  11
Episode 1037	Average Score: 1.18	Score: 1.141
	Steps:  10
Episode 1038	Average Score: 1.17	Score: 1.108
	Steps:  18
Episode 1039	Average Score: 1.17	Score: 1.195
	Steps:  10
Episode 1040	Average Score: 1.16	Score: 0.838
	Steps:  13
Episode 1041	Average Score: 1.16	Score: 1.177
	Steps:  13
Episode 1042	Average Score: 1.16	Score: 0.990
	Steps:  12
Episo

	Steps:  9
Episode 1169	Average Score: 1.03	Score: 1.224
	Steps:  8
Episode 1170	Average Score: 1.03	Score: 1.207
	Steps:  9
Episode 1171	Average Score: 1.03	Score: 1.264
	Steps:  11
Episode 1172	Average Score: 1.03	Score: 1.746
	Steps:  9
Episode 1173	Average Score: 1.04	Score: 1.263
	Steps:  12
Episode 1174	Average Score: 1.04	Score: 1.339
	Steps:  8
Episode 1175	Average Score: 1.04	Score: 1.530
	Steps:  9
Episode 1176	Average Score: 1.04	Score: 1.103
	Steps:  8
Episode 1177	Average Score: 1.04	Score: 1.041
	Steps:  8
Episode 1178	Average Score: 1.04	Score: 1.059
	Steps:  16
Episode 1179	Average Score: 1.06	Score: 2.653
	Steps:  11
Episode 1180	Average Score: 1.07	Score: 1.717
	Steps:  7
Episode 1181	Average Score: 1.07	Score: 0.982
	Steps:  12
Episode 1182	Average Score: 1.07	Score: 1.168
	Steps:  13
Episode 1183	Average Score: 1.08	Score: 1.412
	Steps:  9
Episode 1184	Average Score: 1.09	Score: 1.235
	Steps:  15
Episode 1185	Average Score: 1.11	Score: 2.306
	Steps:  9
Episode 1186	

	Steps:  13
Episode 1311	Average Score: 1.81	Score: 2.890
	Steps:  16
Episode 1312	Average Score: 1.82	Score: 3.083
	Steps:  16
Episode 1313	Average Score: 1.84	Score: 3.330
	Steps:  10
Episode 1314	Average Score: 1.85	Score: 2.609
	Steps:  17
Episode 1315	Average Score: 1.87	Score: 3.793
	Steps:  13
Episode 1316	Average Score: 1.89	Score: 3.902
	Steps:  16
Episode 1317	Average Score: 1.91	Score: 3.140
	Steps:  14
Episode 1318	Average Score: 1.91	Score: 2.468
	Steps:  11
Episode 1319	Average Score: 1.92	Score: 2.305
	Steps:  11
Episode 1320	Average Score: 1.94	Score: 3.047
	Steps:  13
Episode 1321	Average Score: 1.95	Score: 3.458
	Steps:  14
Episode 1322	Average Score: 1.96	Score: 2.887
	Steps:  12
Episode 1323	Average Score: 1.98	Score: 3.449
	Steps:  10
Episode 1324	Average Score: 2.00	Score: 2.495
	Steps:  15
Episode 1325	Average Score: 2.02	Score: 3.469
	Steps:  13
Episode 1326	Average Score: 2.04	Score: 2.905
	Steps:  12
Episode 1327	Average Score: 2.05	Score: 3.188
	Steps:  13
Ep

	Steps:  10
Episode 1453	Average Score: 3.20	Score: 2.559
	Steps:  9
Episode 1454	Average Score: 3.19	Score: 2.354
	Steps:  9
Episode 1455	Average Score: 3.17	Score: 2.297
	Steps:  12
Episode 1456	Average Score: 3.13	Score: 2.766
	Steps:  9
Episode 1457	Average Score: 3.11	Score: 1.818
	Steps:  9
Episode 1458	Average Score: 3.10	Score: 2.448
	Steps:  8
Episode 1459	Average Score: 3.08	Score: 1.775
	Steps:  9
Episode 1460	Average Score: 3.06	Score: 1.968
	Steps:  13
Episode 1461	Average Score: 3.06	Score: 3.356
	Steps:  11
Episode 1462	Average Score: 3.05	Score: 3.114
	Steps:  13
Episode 1463	Average Score: 3.05	Score: 3.965
	Steps:  12
Episode 1464	Average Score: 3.04	Score: 3.452
	Steps:  12
Episode 1465	Average Score: 3.03	Score: 3.239
	Steps:  10
Episode 1466	Average Score: 3.01	Score: 2.379
	Steps:  12
Episode 1467	Average Score: 3.00	Score: 3.228
	Steps:  8
Episode 1468	Average Score: 2.98	Score: 1.686
	Steps:  12
Episode 1469	Average Score: 3.00	Score: 3.347
	Steps:  10
Episode 1

	Steps:  14
Episode 1596	Average Score: 3.33	Score: 3.672
	Steps:  9
Episode 1597	Average Score: 3.33	Score: 2.240
	Steps:  14
Episode 1598	Average Score: 3.34	Score: 4.084
	Steps:  15
Episode 1599	Average Score: 3.36	Score: 4.584
	Steps:  13
Episode 1600	Average Score: 3.38	Score: 3.924
	Steps:  11
Episode 1601	Average Score: 3.39	Score: 3.299
	Steps:  15
Episode 1602	Average Score: 3.42	Score: 5.269
	Steps:  13
Episode 1603	Average Score: 3.43	Score: 3.283
	Steps:  11
Episode 1604	Average Score: 3.43	Score: 3.320
	Steps:  15
Episode 1605	Average Score: 3.45	Score: 4.599
	Steps:  11
Episode 1606	Average Score: 3.45	Score: 3.285
	Steps:  12
Episode 1607	Average Score: 3.46	Score: 3.724
	Steps:  13
Episode 1608	Average Score: 3.48	Score: 4.017
	Steps:  13
Episode 1609	Average Score: 3.50	Score: 3.977
	Steps:  12
Episode 1610	Average Score: 3.51	Score: 3.940
	Steps:  11
Episode 1611	Average Score: 3.52	Score: 2.760
	Steps:  13
Episode 1612	Average Score: 3.52	Score: 3.406
	Steps:  11
Epi

	Steps:  13
Episode 1738	Average Score: 4.68	Score: 4.633
	Steps:  13
Episode 1739	Average Score: 4.68	Score: 4.197
	Steps:  12
Episode 1740	Average Score: 4.68	Score: 4.230
	Steps:  10
Episode 1741	Average Score: 4.68	Score: 3.177
	Steps:  14
Episode 1742	Average Score: 4.70	Score: 5.162
	Steps:  13
Episode 1743	Average Score: 4.69	Score: 4.338
	Steps:  9
Episode 1744	Average Score: 4.67	Score: 2.512
	Steps:  13
Episode 1745	Average Score: 4.67	Score: 4.040
	Steps:  16
Episode 1746	Average Score: 4.69	Score: 6.125
	Steps:  13
Episode 1747	Average Score: 4.70	Score: 5.022
	Steps:  12
Episode 1748	Average Score: 4.70	Score: 4.047
	Steps:  13
Episode 1749	Average Score: 4.70	Score: 4.406
	Steps:  16
Episode 1750	Average Score: 4.73	Score: 6.578
	Steps:  14
Episode 1751	Average Score: 4.73	Score: 4.845
	Steps:  13
Episode 1752	Average Score: 4.74	Score: 4.781
	Steps:  15
Episode 1753	Average Score: 4.77	Score: 5.257
	Steps:  16
Episode 1754	Average Score: 4.78	Score: 6.554
	Steps:  14
Epi

	Steps:  17
Episode 1880	Average Score: 4.98	Score: 6.889
	Steps:  12
Episode 1881	Average Score: 4.97	Score: 4.411
	Steps:  15
Episode 1882	Average Score: 4.99	Score: 6.012
	Steps:  15
Episode 1883	Average Score: 5.01	Score: 5.862
	Steps:  12
Episode 1884	Average Score: 5.00	Score: 4.025
	Steps:  11
Episode 1885	Average Score: 5.00	Score: 3.803
	Steps:  9
Episode 1886	Average Score: 4.98	Score: 2.933
	Steps:  16
Episode 1887	Average Score: 5.00	Score: 6.497
	Steps:  15
Episode 1888	Average Score: 5.03	Score: 6.172
	Steps:  14
Episode 1889	Average Score: 5.03	Score: 5.168
	Steps:  15
Episode 1890	Average Score: 5.05	Score: 5.828
	Steps:  14
Episode 1891	Average Score: 5.05	Score: 5.752
	Steps:  19
Episode 1892	Average Score: 5.08	Score: 8.221
	Steps:  11
Episode 1893	Average Score: 5.07	Score: 3.546
	Steps:  15
Episode 1894	Average Score: 5.09	Score: 5.901
	Steps:  14
Episode 1895	Average Score: 5.09	Score: 5.429
	Steps:  12
Episode 1896	Average Score: 5.08	Score: 4.063
	Steps:  16
Epi

In [11]:
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(111)
plt.plot(np.arange(1, len(score)+1), score)
plt.ylabel('Score')
plt.xlabel('Episode #')
plt.show()

<Figure size 640x480 with 1 Axes>

In [12]:
agents.actor_local.load_state_dict(torch.load('crawler_checkpoint_actor.pth', map_location='cpu'))
agents.critic_local.load_state_dict(torch.load('crawler_checkpoint_critic.pth', map_location='cpu'))

env_info = env.reset(train_mode=False)[brain_name]        
states = env_info.vector_observations                  
scores = np.zeros(num_agents)                          

for i in range(200):
    actions = agents.act(states, add_noise=False)                    
    env_info = env.step(actions)[brain_name]        
    next_states = env_info.vector_observations        
    rewards = env_info.rewards                        
    dones = env_info.local_done                 
    scores += rewards                         
    states = next_states                              
    if np.any(dones):                              
        break

FileNotFoundError: [Errno 2] No such file or directory: 'crawler_checkpoint_actor.pth'

In [None]:
env.close()