# Navigation

---

In this notebook, you will learn how to use the Unity ML-Agents environment for the first project of the [Deep Reinforcement Learning Nanodegree](https://www.udacity.com/course/deep-reinforcement-learning-nanodegree--nd893).

### 1. Start the Environment

We begin by importing some necessary packages.  If the code cell below returns an error, please revisit the project instructions to double-check that you have installed [Unity ML-Agents](https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Installation.md) and [NumPy](http://www.numpy.org/).

In [1]:
# Only necessary when executing this notebook on the Udacity cloud
!pip -q install ../python

tensorflow 1.7.1 has requirement numpy>=1.13.3, but you'll have numpy 1.12.1 which is incompatible.
ipython 6.5.0 has requirement prompt-toolkit<2.0.0,>=1.0.15, but you'll have prompt-toolkit 3.0.5 which is incompatible.


In [2]:
from unityagents import UnityEnvironment
import numpy as np
import os
import time

Next, we will start the environment!  **_Before running the code cell below_**, change the `file_name` parameter to match the location of the Unity environment that you downloaded.

- **Mac**: `"path/to/Banana.app"`
- **Windows** (x86): `"path/to/Banana_Windows_x86/Banana.exe"`
- **Windows** (x86_64): `"path/to/Banana_Windows_x86_64/Banana.exe"`
- **Linux** (x86): `"path/to/Banana_Linux/Banana.x86"`
- **Linux** (x86_64): `"path/to/Banana_Linux/Banana.x86_64"`
- **Linux** (x86, headless): `"path/to/Banana_Linux_NoVis/Banana.x86"`
- **Linux** (x86_64, headless): `"path/to/Banana_Linux_NoVis/Banana.x86_64"`

For instance, if you are using a Mac, then you downloaded `Banana.app`.  If this file is in the same folder as the notebook, then the line below should appear as follows:
```
env = UnityEnvironment(file_name="Banana.app")
```
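
If the notebook is shared across machines, the path can also be chosen at run time. The following is a minimal sketch, not part of the project code: `banana_file_name` is a hypothetical helper, and the relative paths are simply the ones from the list above. It assumes the builds were unpacked under a common `base_dir`.

```python
import platform
import posixpath


def banana_file_name(base_dir, system=None, machine=None, headless=False):
    """Return the path of the Banana build for a platform (hypothetical helper).

    `system` and `machine` default to the values reported by the `platform`
    module; pass them explicitly to look up the path for another machine.
    """
    system = system or platform.system()
    machine = machine or platform.machine()
    is_64bit = machine.lower() in ("x86_64", "amd64")

    if system == "Darwin":
        relative = "Banana.app"
    elif system == "Windows":
        folder = "Banana_Windows_x86_64" if is_64bit else "Banana_Windows_x86"
        relative = folder + "/Banana.exe"
    elif system == "Linux":
        folder = "Banana_Linux_NoVis" if headless else "Banana_Linux"
        relative = folder + ("/Banana.x86_64" if is_64bit else "/Banana.x86")
    else:
        raise ValueError("No Banana build is listed for platform: " + system)

    # Forward slashes keep the result identical on every operating system.
    return posixpath.join(base_dir, relative)
```

The result can then be passed on as `UnityEnvironment(file_name=banana_file_name("path/to"))`.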

In [3]:
# Standalone machine
#env = UnityEnvironment( os.path.join( os.environ['HOME'], 'Python/rl/udadrl/data/Banana_Linux/Banana.x86_64' ) )

# Udacity cloud
env = UnityEnvironment( file_name="/data/Banana_Linux_NoVis/Banana.x86_64" )

INFO:unityagents:
'Academy' started successfully!
Unity Academy name: Academy
        Number of Brains: 1
        Number of External Brains : 1
        Lesson number : 0
        Reset Parameters :
		
Unity brain name: BananaBrain
        Number of Visual Observations (per agent): 0
        Vector Observation space type: continuous
        Vector Observation space size (per agent): 37
        Number of stacked Vector Observation: 1
        Vector Action space type: discrete
        Vector Action space size (per agent): 4
        Vector Action descriptions: , , , 


Environments contain **_brains_** which are responsible for deciding the actions of their associated agents. Here we check for the first brain available, and set it as the default brain we will be controlling from Python.

In [4]:
# get the default brain
brain_name = env.brain_names[0]
brain = env.brains[brain_name]

### 2. Examine the State and Action Spaces

The simulation contains a single agent that navigates a large environment.  At each time step, it has four actions at its disposal:
- `0` - walk forward 
- `1` - walk backward
- `2` - turn left
- `3` - turn right

The state space has `37` dimensions and contains the agent's velocity, along with ray-based perception of objects around the agent's forward direction.  A reward of `+1` is provided for collecting a yellow banana, and a reward of `-1` is provided for collecting a blue banana.

Run the code cell below to print some information about the environment.

In [5]:
# reset the environment
env_info = env.reset(train_mode=True)[brain_name]

# number of agents in the environment
print('Number of agents:', len(env_info.agents))

# number of actions
action_size = brain.vector_action_space_size
print('Number of actions:', action_size)

# examine the state space 
state = env_info.vector_observations[0]
print('States look like:', state)
state_size = len(state)
print('States have length:', state_size)

Number of agents: 1
Number of actions: 4
States look like: [ 1.          0.          0.          0.          0.84408134  0.          0.
  1.          0.          0.0748472   0.          1.          0.          0.
  0.25755     1.          0.          0.          0.          0.74177343
  0.          1.          0.          0.          0.25854847  0.          0.
  1.          0.          0.09355672  0.          1.          0.          0.
  0.31969345  0.          0.        ]
States have length: 37
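
For logging and debugging, it can help to give the four action indices readable names. This is a small sketch only: the `Action` enum and the `describe` helper are hypothetical names introduced here, and the index-to-meaning mapping is the one listed at the start of this section.

```python
from enum import IntEnum


class Action(IntEnum):
    """Readable names for the four discrete actions (hypothetical helper)."""
    FORWARD = 0
    BACKWARD = 1
    TURN_LEFT = 2
    TURN_RIGHT = 3


def describe(action_index):
    """Map an action index to a short label such as 'turn left'."""
    return Action(action_index).name.lower().replace("_", " ")
```

Because `Action` is an `IntEnum`, `int(Action.TURN_LEFT)` gives back the plain index `2` that the environment expects.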


### 3. Take Random Actions in the Environment

In the next code cell, you will learn how to use the Python API to control the agent and receive feedback from the environment.

Once this cell is executed, you can watch how the agent performs when it selects an action uniformly at random at each time step.  A window should pop up that allows you to observe the agent as it moves through the environment.

Of course, as part of the project, you'll have to change the code so that the agent is able to use its experience to gradually choose better actions when interacting with the environment!

In [None]:
env_info = env.reset(train_mode=False)[brain_name] # reset the environment
state = env_info.vector_observations[0]            # get the current state
score = 0                                          # initialize the score
while True:
    action = np.random.randint(action_size)        # select an action
    env_info = env.step(action)[brain_name]        # send the action to the environment
    next_state = env_info.vector_observations[0]   # get the next state
    reward = env_info.rewards[0]                   # get the reward
    done = env_info.local_done[0]                  # see if episode has finished
    score += reward                                # update the score
    state = next_state                             # roll over the state to next time step
    if done:                                       # exit loop if episode finished
        break
    
print("Score: {}".format(score))
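
One common way to move from purely random actions toward learned behaviour is epsilon-greedy selection: act randomly with probability epsilon, otherwise pick the action with the highest estimated value. The sketch below is illustrative only. `epsilon_greedy` and `decayed_epsilon` are hypothetical helpers, the action values would come from whatever model you train, and the decay constants are assumed values rather than settings taken from this project.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Index of the largest estimated action value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


def decayed_epsilon(episode, start=1.0, end=0.01, decay=0.995):
    """Exponentially decay epsilon per episode (constants are assumptions)."""
    return max(end, start * decay ** episode)
```

With `epsilon=1.0` this reproduces the uniform random policy used above; lowering epsilon over the episodes lets the agent rely increasingly on what it has learned.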

When finished, you can close the environment.

In [None]:
env.close()

### 4. It's Your Turn!

Now it's your turn to train your own agent to solve the environment!  When training the agent, set `train_mode=True`, so that the line for resetting the environment looks like the following:
```python
env_info = env.reset(train_mode=True)[brain_name]
```

In [6]:
from dql_agent import Agent

agent = Agent(state_size, action_size, 19711108)
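
The `dql_agent` module is not included in this notebook; all the notebook relies on is that the agent exposes `act(state)` and `step(state, action, reward, next_state, done)`. DQN-style agents often store each transition passed to `step` in a replay buffer and learn from random mini-batches. The sketch below shows that idea under those assumptions. `ReplayBuffer` and `Experience` are hypothetical names and may not match what `dql_agent` actually does.

```python
import random
from collections import deque, namedtuple

# One stored transition; the field names mirror the arguments of agent.step().
Experience = namedtuple(
    "Experience", ["state", "action", "reward", "next_state", "done"]
)


class ReplayBuffer:
    """Fixed-size store of transitions with uniform random sampling."""

    def __init__(self, capacity, seed=None):
        # A bounded deque silently drops the oldest transition when full.
        self.memory = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        self.memory.append(Experience(state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Return batch_size distinct transitions chosen uniformly at random."""
        return self.rng.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)
```

Sampling at random rather than replaying transitions in order breaks up the strong correlation between consecutive time steps, which is the usual motivation for this design.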

In [7]:

for epc in range(4000):

    env_info = env.reset(train_mode=True)[brain_name]  # reset the environment
    state = env_info.vector_observations[0]            # get the current state
    score = 0     # initialize the score

    start_time = time.time()

    while True:
        action = agent.act(state)        # select an action
        env_info = env.step(action)[brain_name]        # send the action to the environment
        next_state = env_info.vector_observations[0]   # get the next state
        reward = env_info.rewards[0]                   # get the reward
        done = env_info.local_done[0]                  # see if episode has finished
        agent.step(state, action, reward, next_state, done)
        score += reward                                # update the score
        state = next_state                             # roll over the state to next time step
        if done:                                       # exit loop if episode finished
            break

    stop_time = time.time()

    print("Epoch:{:>5}, Score: {:>5}, Execution time: {:.4f}".format(epc + 1, score, stop_time - start_time))


Epoch:    1, Score:   0.0, Execution time: 2.1981
Epoch:    2, Score:   1.0, Execution time: 0.9358
Epoch:    3, Score:   0.0, Execution time: 0.9293
Epoch:    4, Score:   0.0, Execution time: 0.9102
Epoch:    5, Score:  -2.0, Execution time: 0.8764
Epoch:    6, Score:   0.0, Execution time: 0.8554
Epoch:    7, Score:   1.0, Execution time: 0.8436
Epoch:    8, Score:   0.0, Execution time: 0.8527
Epoch:    9, Score:   0.0, Execution time: 0.8613
Epoch:   10, Score:   0.0, Execution time: 0.8478
Epoch:   11, Score:   0.0, Execution time: 0.8758
Epoch:   12, Score:   0.0, Execution time: 0.8522
Epoch:   13, Score:   0.0, Execution time: 0.8545
Epoch:   14, Score:  -1.0, Execution time: 0.8835
Epoch:   15, Score:   0.0, Execution time: 0.8676
Epoch:   16, Score:   0.0, Execution time: 0.8568
Epoch:   17, Score:   0.0, Execution time: 0.8541
Epoch:   18, Score:  -2.0, Execution time: 0.8890
Epoch:   19, Score:   0.0, Execution time: 0.8912
Epoch:   20, Score:   1.0, Execution time: 0.8565


Epoch:  165, Score:   0.0, Execution time: 0.9030
Epoch:  166, Score:   0.0, Execution time: 0.9054
Epoch:  167, Score:   0.0, Execution time: 0.8849
Epoch:  168, Score:   0.0, Execution time: 0.8836
Epoch:  169, Score:   0.0, Execution time: 0.8977
Epoch:  170, Score:   1.0, Execution time: 0.8986
Epoch:  171, Score:   0.0, Execution time: 0.9129
Epoch:  172, Score:  -1.0, Execution time: 0.9279
Epoch:  173, Score:   0.0, Execution time: 0.9078
Epoch:  174, Score:   0.0, Execution time: 0.9106
Epoch:  175, Score:   0.0, Execution time: 0.9131
Epoch:  176, Score:   0.0, Execution time: 0.9397
Epoch:  177, Score:  -1.0, Execution time: 0.9048
Epoch:  178, Score:  -2.0, Execution time: 0.9173
Epoch:  179, Score:  -1.0, Execution time: 0.8983
Epoch:  180, Score:   0.0, Execution time: 0.9231
Epoch:  181, Score:   0.0, Execution time: 0.9353
Epoch:  182, Score:   1.0, Execution time: 0.9083
Epoch:  183, Score:   0.0, Execution time: 0.9026
Epoch:  184, Score:   0.0, Execution time: 0.8856


Epoch:  329, Score:   0.0, Execution time: 0.9195
Epoch:  330, Score:   1.0, Execution time: 0.9860
Epoch:  331, Score:   2.0, Execution time: 0.8984
Epoch:  332, Score:  -1.0, Execution time: 0.9126
Epoch:  333, Score:   3.0, Execution time: 0.9184
Epoch:  334, Score:   2.0, Execution time: 0.9275
Epoch:  335, Score:   1.0, Execution time: 0.9203
Epoch:  336, Score:   1.0, Execution time: 0.9197
Epoch:  337, Score:   1.0, Execution time: 0.9206
Epoch:  338, Score:   0.0, Execution time: 0.9419
Epoch:  339, Score:  -1.0, Execution time: 0.9564
Epoch:  340, Score:   0.0, Execution time: 0.9287
Epoch:  341, Score:   4.0, Execution time: 0.9160
Epoch:  342, Score:   0.0, Execution time: 0.9122
Epoch:  343, Score:   0.0, Execution time: 0.9209
Epoch:  344, Score:   0.0, Execution time: 0.9457
Epoch:  345, Score:  -1.0, Execution time: 0.9349
Epoch:  346, Score:   1.0, Execution time: 0.9264
Epoch:  347, Score:   3.0, Execution time: 0.9329
Epoch:  348, Score:   0.0, Execution time: 0.9018


Epoch:  493, Score:  18.0, Execution time: 0.9555
Epoch:  494, Score:  12.0, Execution time: 0.9648
Epoch:  495, Score:   5.0, Execution time: 0.9439
Epoch:  496, Score:   9.0, Execution time: 0.9550
Epoch:  497, Score:   6.0, Execution time: 1.0065
Epoch:  498, Score:  10.0, Execution time: 0.9459
Epoch:  499, Score:   7.0, Execution time: 0.9369
Epoch:  500, Score:   7.0, Execution time: 0.9636
Epoch:  501, Score:  10.0, Execution time: 0.9675
Epoch:  502, Score:   4.0, Execution time: 0.9713
Epoch:  503, Score:  10.0, Execution time: 1.0520
Epoch:  504, Score:  11.0, Execution time: 0.9462
Epoch:  505, Score:   7.0, Execution time: 0.9414
Epoch:  506, Score:  18.0, Execution time: 0.9478
Epoch:  507, Score:  13.0, Execution time: 0.9455
Epoch:  508, Score:   6.0, Execution time: 0.9533
Epoch:  509, Score:   5.0, Execution time: 0.9695
Epoch:  510, Score:   4.0, Execution time: 0.9509
Epoch:  511, Score:   4.0, Execution time: 0.9453
Epoch:  512, Score:  10.0, Execution time: 0.9547


Epoch:  657, Score:  18.0, Execution time: 0.9954
Epoch:  658, Score:  15.0, Execution time: 0.9831
Epoch:  659, Score:   7.0, Execution time: 1.0054
Epoch:  660, Score:   3.0, Execution time: 0.9671
Epoch:  661, Score:   9.0, Execution time: 0.9877
Epoch:  662, Score:  16.0, Execution time: 0.9332
Epoch:  663, Score:  12.0, Execution time: 0.9433
Epoch:  664, Score:   6.0, Execution time: 0.9398
Epoch:  665, Score:  15.0, Execution time: 0.9535
Epoch:  666, Score:   9.0, Execution time: 0.9476
Epoch:  667, Score:  14.0, Execution time: 0.9661
Epoch:  668, Score:   7.0, Execution time: 0.9501
Epoch:  669, Score:  14.0, Execution time: 0.9529
Epoch:  670, Score:  12.0, Execution time: 0.9446
Epoch:  671, Score:  16.0, Execution time: 0.9317
Epoch:  672, Score:   8.0, Execution time: 0.9426
Epoch:  673, Score:  16.0, Execution time: 0.9423
Epoch:  674, Score:  18.0, Execution time: 0.9318
Epoch:  675, Score:  12.0, Execution time: 0.9460
Epoch:  676, Score:  16.0, Execution time: 0.9620


Epoch:  821, Score:  16.0, Execution time: 0.9947
Epoch:  822, Score:  16.0, Execution time: 1.0371
Epoch:  823, Score:   9.0, Execution time: 1.0029
Epoch:  824, Score:  14.0, Execution time: 0.9874
Epoch:  825, Score:  16.0, Execution time: 0.9978
Epoch:  826, Score:  10.0, Execution time: 1.0124
Epoch:  827, Score:  11.0, Execution time: 1.0221
Epoch:  828, Score:  15.0, Execution time: 1.0110
Epoch:  829, Score:   9.0, Execution time: 0.9897
Epoch:  830, Score:   5.0, Execution time: 0.9762
Epoch:  831, Score:  16.0, Execution time: 1.0165
Epoch:  832, Score:  15.0, Execution time: 0.9768
Epoch:  833, Score:  15.0, Execution time: 0.9813
Epoch:  834, Score:  15.0, Execution time: 0.9950
Epoch:  835, Score:  11.0, Execution time: 1.0315
Epoch:  836, Score:  21.0, Execution time: 1.0343
Epoch:  837, Score:   8.0, Execution time: 1.0098
Epoch:  838, Score:  18.0, Execution time: 1.0250
Epoch:  839, Score:  19.0, Execution time: 1.0224
Epoch:  840, Score:  15.0, Execution time: 0.9881


Epoch:  985, Score:  16.0, Execution time: 1.0422
Epoch:  986, Score:  16.0, Execution time: 1.0379
Epoch:  987, Score:   5.0, Execution time: 1.0093
Epoch:  988, Score:  15.0, Execution time: 0.9889
Epoch:  989, Score:  11.0, Execution time: 1.0078
Epoch:  990, Score:  16.0, Execution time: 0.9957
Epoch:  991, Score:   8.0, Execution time: 1.0596
Epoch:  992, Score:  17.0, Execution time: 0.9978
Epoch:  993, Score:  11.0, Execution time: 1.0312
Epoch:  994, Score:   7.0, Execution time: 1.0213
Epoch:  995, Score:  10.0, Execution time: 1.0088
Epoch:  996, Score:  13.0, Execution time: 0.9983
Epoch:  997, Score:   7.0, Execution time: 1.0416
Epoch:  998, Score:   8.0, Execution time: 0.9745
Epoch:  999, Score:   8.0, Execution time: 1.0497
Epoch: 1000, Score:  15.0, Execution time: 0.9978
Epoch: 1001, Score:  15.0, Execution time: 0.9954
Epoch: 1002, Score:   9.0, Execution time: 0.9815
Epoch: 1003, Score:  11.0, Execution time: 0.9950
Epoch: 1004, Score:  13.0, Execution time: 1.0085


Epoch: 1149, Score:   2.0, Execution time: 0.9900
Epoch: 1150, Score:  11.0, Execution time: 0.9870
Epoch: 1151, Score:  17.0, Execution time: 0.9947
Epoch: 1152, Score:   8.0, Execution time: 1.0216
Epoch: 1153, Score:  14.0, Execution time: 1.0757
Epoch: 1154, Score:  17.0, Execution time: 1.0081
Epoch: 1155, Score:  10.0, Execution time: 0.9837
Epoch: 1156, Score:  13.0, Execution time: 0.9799
Epoch: 1157, Score:  16.0, Execution time: 1.0441
Epoch: 1158, Score:  14.0, Execution time: 0.9914
Epoch: 1159, Score:  12.0, Execution time: 0.9916
Epoch: 1160, Score:  17.0, Execution time: 0.9868
Epoch: 1161, Score:  18.0, Execution time: 1.0112
Epoch: 1162, Score:  18.0, Execution time: 1.0124
Epoch: 1163, Score:  11.0, Execution time: 1.0263
Epoch: 1164, Score:  14.0, Execution time: 1.0229
Epoch: 1165, Score:  19.0, Execution time: 0.9992
Epoch: 1166, Score:  18.0, Execution time: 0.9871
Epoch: 1167, Score:  13.0, Execution time: 0.9851
Epoch: 1168, Score:  14.0, Execution time: 0.9889


Epoch: 1313, Score:  12.0, Execution time: 0.9730
Epoch: 1314, Score:  16.0, Execution time: 0.9801
Epoch: 1315, Score:  12.0, Execution time: 1.0113
Epoch: 1316, Score:  11.0, Execution time: 0.9257
Epoch: 1317, Score:  20.0, Execution time: 0.9312
Epoch: 1318, Score:  10.0, Execution time: 0.9706
Epoch: 1319, Score:  13.0, Execution time: 1.0087
Epoch: 1320, Score:  19.0, Execution time: 1.0665
Epoch: 1321, Score:  13.0, Execution time: 0.9969
Epoch: 1322, Score:   9.0, Execution time: 0.9870
Epoch: 1323, Score:  20.0, Execution time: 0.9966
Epoch: 1324, Score:  14.0, Execution time: 0.9820
Epoch: 1325, Score:  14.0, Execution time: 0.9755
Epoch: 1326, Score:  13.0, Execution time: 0.9860
Epoch: 1327, Score:  18.0, Execution time: 1.0009
Epoch: 1328, Score:  13.0, Execution time: 0.9738
Epoch: 1329, Score:  10.0, Execution time: 0.9748
Epoch: 1330, Score:  14.0, Execution time: 0.9456
Epoch: 1331, Score:  16.0, Execution time: 0.9469
Epoch: 1332, Score:  16.0, Execution time: 1.0145


Epoch: 1477, Score:  15.0, Execution time: 0.9304
Epoch: 1478, Score:  20.0, Execution time: 0.9466
Epoch: 1479, Score:   7.0, Execution time: 0.9335
Epoch: 1480, Score:  19.0, Execution time: 0.9670
Epoch: 1481, Score:  17.0, Execution time: 0.9367
Epoch: 1482, Score:  15.0, Execution time: 0.9367
Epoch: 1483, Score:   8.0, Execution time: 0.9291
Epoch: 1484, Score:  18.0, Execution time: 0.9618
Epoch: 1485, Score:  15.0, Execution time: 0.9643
Epoch: 1486, Score:   6.0, Execution time: 0.9357
Epoch: 1487, Score:  16.0, Execution time: 0.9400
Epoch: 1488, Score:  13.0, Execution time: 0.9171
Epoch: 1489, Score:   8.0, Execution time: 0.9136
Epoch: 1490, Score:  10.0, Execution time: 0.9212
Epoch: 1491, Score:  14.0, Execution time: 0.9127
Epoch: 1492, Score:  13.0, Execution time: 0.9163
Epoch: 1493, Score:   7.0, Execution time: 0.9504
Epoch: 1494, Score:  16.0, Execution time: 0.9291
Epoch: 1495, Score:  19.0, Execution time: 0.9573
Epoch: 1496, Score:  16.0, Execution time: 0.9427


Epoch: 1641, Score:  14.0, Execution time: 0.9508
Epoch: 1642, Score:  13.0, Execution time: 0.9290
Epoch: 1643, Score:  18.0, Execution time: 0.9491
Epoch: 1644, Score:  12.0, Execution time: 0.9473
Epoch: 1645, Score:  15.0, Execution time: 0.9727
Epoch: 1646, Score:  16.0, Execution time: 0.9552
Epoch: 1647, Score:  14.0, Execution time: 0.9440
Epoch: 1648, Score:  21.0, Execution time: 0.9294
Epoch: 1649, Score:  14.0, Execution time: 0.9278
Epoch: 1650, Score:  14.0, Execution time: 0.9744
Epoch: 1651, Score:  17.0, Execution time: 0.9489
Epoch: 1652, Score:  13.0, Execution time: 0.9185
Epoch: 1653, Score:  19.0, Execution time: 0.9184
Epoch: 1654, Score:  16.0, Execution time: 0.9679
Epoch: 1655, Score:  17.0, Execution time: 0.9339
Epoch: 1656, Score:  15.0, Execution time: 0.9358
Epoch: 1657, Score:  10.0, Execution time: 0.9084
Epoch: 1658, Score:  17.0, Execution time: 0.9503
Epoch: 1659, Score:  14.0, Execution time: 0.9637
Epoch: 1660, Score:   8.0, Execution time: 0.9812


Epoch: 1805, Score:  15.0, Execution time: 0.9986
Epoch: 1806, Score:  13.0, Execution time: 1.0016
Epoch: 1807, Score:   8.0, Execution time: 0.9955
Epoch: 1808, Score:  13.0, Execution time: 0.9536
Epoch: 1809, Score:  11.0, Execution time: 0.9523
Epoch: 1810, Score:  11.0, Execution time: 0.9756
Epoch: 1811, Score:  19.0, Execution time: 1.0544
Epoch: 1812, Score:  18.0, Execution time: 1.0225
Epoch: 1813, Score:  12.0, Execution time: 0.9905
Epoch: 1814, Score:  15.0, Execution time: 0.9889
Epoch: 1815, Score:  13.0, Execution time: 0.9627
Epoch: 1816, Score:  19.0, Execution time: 1.0194
Epoch: 1817, Score:  11.0, Execution time: 1.0153
Epoch: 1818, Score:   8.0, Execution time: 1.0190
Epoch: 1819, Score:  16.0, Execution time: 1.0193
Epoch: 1820, Score:  13.0, Execution time: 1.0068
Epoch: 1821, Score:   8.0, Execution time: 1.0088
Epoch: 1822, Score:  16.0, Execution time: 0.9789
Epoch: 1823, Score:   6.0, Execution time: 0.9557
Epoch: 1824, Score:  12.0, Execution time: 0.9960


Epoch: 1969, Score:   8.0, Execution time: 0.9359
Epoch: 1970, Score:  18.0, Execution time: 0.9540
Epoch: 1971, Score:  15.0, Execution time: 0.9440
Epoch: 1972, Score:  18.0, Execution time: 0.9469
Epoch: 1973, Score:  18.0, Execution time: 1.0105
Epoch: 1974, Score:  12.0, Execution time: 0.9608
Epoch: 1975, Score:  12.0, Execution time: 0.9563
Epoch: 1976, Score:  16.0, Execution time: 0.9864
Epoch: 1977, Score:  14.0, Execution time: 0.9764
Epoch: 1978, Score:  18.0, Execution time: 1.0080
Epoch: 1979, Score:  19.0, Execution time: 0.9706
Epoch: 1980, Score:  14.0, Execution time: 0.9730
Epoch: 1981, Score:  12.0, Execution time: 0.9661
Epoch: 1982, Score:  12.0, Execution time: 0.9785
Epoch: 1983, Score:  18.0, Execution time: 0.9712
Epoch: 1984, Score:  17.0, Execution time: 0.9539
Epoch: 1985, Score:   9.0, Execution time: 0.9510
Epoch: 1986, Score:  23.0, Execution time: 0.9568
Epoch: 1987, Score:  10.0, Execution time: 0.9733
Epoch: 1988, Score:  11.0, Execution time: 1.0050


Epoch: 2133, Score:  12.0, Execution time: 0.9743
Epoch: 2134, Score:  10.0, Execution time: 0.9192
Epoch: 2135, Score:  13.0, Execution time: 0.9148
Epoch: 2136, Score:   2.0, Execution time: 0.9253
Epoch: 2137, Score:  14.0, Execution time: 0.9203
Epoch: 2138, Score:  12.0, Execution time: 0.9607
Epoch: 2139, Score:  14.0, Execution time: 0.9162
Epoch: 2140, Score:   7.0, Execution time: 0.9056
Epoch: 2141, Score:   9.0, Execution time: 0.9412
Epoch: 2142, Score:  12.0, Execution time: 0.9266
Epoch: 2143, Score:  13.0, Execution time: 0.9272
Epoch: 2144, Score:   8.0, Execution time: 0.9111
Epoch: 2145, Score:  16.0, Execution time: 0.9156
Epoch: 2146, Score:   6.0, Execution time: 0.9576
Epoch: 2147, Score:   8.0, Execution time: 0.9625
Epoch: 2148, Score:  13.0, Execution time: 0.9699
Epoch: 2149, Score:  12.0, Execution time: 0.9868
Epoch: 2150, Score:  12.0, Execution time: 1.0094
Epoch: 2151, Score:  16.0, Execution time: 1.0150
Epoch: 2152, Score:  13.0, Execution time: 1.0278


Epoch: 2297, Score:  13.0, Execution time: 0.9882
Epoch: 2298, Score:  17.0, Execution time: 0.9589
Epoch: 2299, Score:   8.0, Execution time: 0.9357
Epoch: 2300, Score:   6.0, Execution time: 0.9183
Epoch: 2301, Score:  21.0, Execution time: 0.9192
Epoch: 2302, Score:  13.0, Execution time: 0.9770
Epoch: 2303, Score:   6.0, Execution time: 0.9259
Epoch: 2304, Score:   8.0, Execution time: 0.9078
Epoch: 2305, Score:  13.0, Execution time: 0.9041
Epoch: 2306, Score:  16.0, Execution time: 0.9323
Epoch: 2307, Score:  18.0, Execution time: 0.9442
Epoch: 2308, Score:  13.0, Execution time: 0.9305
Epoch: 2309, Score:  13.0, Execution time: 0.9164
Epoch: 2310, Score:  12.0, Execution time: 0.9193
Epoch: 2311, Score:   4.0, Execution time: 0.9022
Epoch: 2312, Score:   7.0, Execution time: 0.9267
Epoch: 2313, Score:   9.0, Execution time: 0.9020
Epoch: 2314, Score:  18.0, Execution time: 0.9353
Epoch: 2315, Score:  13.0, Execution time: 0.9716
Epoch: 2316, Score:  13.0, Execution time: 0.9733


Epoch: 2461, Score:   7.0, Execution time: 0.9364
Epoch: 2462, Score:  13.0, Execution time: 0.9131
Epoch: 2463, Score:  14.0, Execution time: 0.9257
Epoch: 2464, Score:  13.0, Execution time: 0.9290
Epoch: 2465, Score:   6.0, Execution time: 0.9164
Epoch: 2466, Score:   2.0, Execution time: 0.9340
Epoch: 2467, Score:  18.0, Execution time: 0.9065
Epoch: 2468, Score:  12.0, Execution time: 0.9167
Epoch: 2469, Score:  20.0, Execution time: 0.9305
Epoch: 2470, Score:  10.0, Execution time: 0.9590
Epoch: 2471, Score:   8.0, Execution time: 0.9200
Epoch: 2472, Score:   6.0, Execution time: 0.9293
Epoch: 2473, Score:   7.0, Execution time: 0.9593
Epoch: 2474, Score:  10.0, Execution time: 0.9486
Epoch: 2475, Score:  14.0, Execution time: 0.9404
Epoch: 2476, Score:  13.0, Execution time: 0.9141
Epoch: 2477, Score:   1.0, Execution time: 0.9166
Epoch: 2478, Score:  12.0, Execution time: 0.9104
Epoch: 2479, Score:  11.0, Execution time: 0.9127
Epoch: 2480, Score:   1.0, Execution time: 0.9106


Epoch: 2625, Score:  12.0, Execution time: 0.9449
Epoch: 2626, Score:  16.0, Execution time: 0.9139
Epoch: 2627, Score:  10.0, Execution time: 0.9161
Epoch: 2628, Score:  15.0, Execution time: 0.9375
Epoch: 2629, Score:  10.0, Execution time: 0.9093
Epoch: 2630, Score:  16.0, Execution time: 0.9213
Epoch: 2631, Score:   5.0, Execution time: 0.9121
Epoch: 2632, Score:  19.0, Execution time: 0.9370
Epoch: 2633, Score:  14.0, Execution time: 0.9483
Epoch: 2634, Score:  11.0, Execution time: 0.9428
Epoch: 2635, Score:  11.0, Execution time: 0.9273
Epoch: 2636, Score:  10.0, Execution time: 0.9248
Epoch: 2637, Score:  13.0, Execution time: 0.9244
Epoch: 2638, Score:  11.0, Execution time: 0.9111
Epoch: 2639, Score:  14.0, Execution time: 0.9321
Epoch: 2640, Score:  18.0, Execution time: 0.9942
Epoch: 2641, Score:  13.0, Execution time: 0.9146
Epoch: 2642, Score:  14.0, Execution time: 0.9110
Epoch: 2643, Score:  10.0, Execution time: 0.9098
Epoch: 2644, Score:  11.0, Execution time: 0.9434


Epoch: 2789, Score:  13.0, Execution time: 0.9760
Epoch: 2790, Score:  16.0, Execution time: 0.9268
Epoch: 2791, Score:  13.0, Execution time: 0.9109
Epoch: 2792, Score:  16.0, Execution time: 0.9294
Epoch: 2793, Score:  12.0, Execution time: 0.9627
Epoch: 2794, Score:  14.0, Execution time: 0.9474
Epoch: 2795, Score:  12.0, Execution time: 0.9295
Epoch: 2796, Score:  12.0, Execution time: 0.9372
Epoch: 2797, Score:  16.0, Execution time: 0.9245
Epoch: 2798, Score:   6.0, Execution time: 0.9092
Epoch: 2799, Score:   8.0, Execution time: 0.9371
Epoch: 2800, Score:  12.0, Execution time: 0.9365
Epoch: 2801, Score:   4.0, Execution time: 0.9180
Epoch: 2802, Score:  14.0, Execution time: 0.9162
Epoch: 2803, Score:  19.0, Execution time: 0.9572
Epoch: 2804, Score:  13.0, Execution time: 0.9719
Epoch: 2805, Score:  12.0, Execution time: 0.9908
Epoch: 2806, Score:   7.0, Execution time: 0.9360
Epoch: 2807, Score:  14.0, Execution time: 0.9273
Epoch: 2808, Score:  18.0, Execution time: 0.9199


Epoch: 2953, Score:  13.0, Execution time: 0.9606
Epoch: 2954, Score:  14.0, Execution time: 0.9563
Epoch: 2955, Score:  14.0, Execution time: 0.9424
Epoch: 2956, Score:  10.0, Execution time: 0.9118
Epoch: 2957, Score:  14.0, Execution time: 0.9640
Epoch: 2958, Score:  15.0, Execution time: 0.9222
Epoch: 2959, Score:  18.0, Execution time: 0.9215
Epoch: 2960, Score:  14.0, Execution time: 0.9292
Epoch: 2961, Score:  13.0, Execution time: 0.9276
Epoch: 2962, Score:  15.0, Execution time: 0.9387
Epoch: 2963, Score:  13.0, Execution time: 0.9278
Epoch: 2964, Score:  11.0, Execution time: 0.9196
Epoch: 2965, Score:  12.0, Execution time: 0.9095
Epoch: 2966, Score:   6.0, Execution time: 0.9147
Epoch: 2967, Score:  12.0, Execution time: 0.9233
Epoch: 2968, Score:  11.0, Execution time: 0.9331
Epoch: 2969, Score:   8.0, Execution time: 0.9302
Epoch: 2970, Score:  14.0, Execution time: 0.9446
Epoch: 2971, Score:  14.0, Execution time: 0.9307
Epoch: 2972, Score:  12.0, Execution time: 0.9199


Epoch: 3117, Score:   9.0, Execution time: 0.9382
Epoch: 3118, Score:   9.0, Execution time: 0.9381
Epoch: 3119, Score:  17.0, Execution time: 0.9413
Epoch: 3120, Score:   8.0, Execution time: 0.9426
Epoch: 3121, Score:   7.0, Execution time: 0.9455
Epoch: 3122, Score:  14.0, Execution time: 0.9419
Epoch: 3123, Score:  13.0, Execution time: 0.9249
Epoch: 3124, Score:   8.0, Execution time: 0.9574
Epoch: 3125, Score:  18.0, Execution time: 0.9585
Epoch: 3126, Score:  14.0, Execution time: 1.0084
Epoch: 3127, Score:  16.0, Execution time: 0.9358
Epoch: 3128, Score:  11.0, Execution time: 0.9351
Epoch: 3129, Score:  17.0, Execution time: 0.9396
Epoch: 3130, Score:  15.0, Execution time: 0.9422
Epoch: 3131, Score:  11.0, Execution time: 0.9379
Epoch: 3132, Score:  11.0, Execution time: 0.9342
Epoch: 3133, Score:  13.0, Execution time: 0.9302
Epoch: 3134, Score:   7.0, Execution time: 0.9405
Epoch: 3135, Score:  16.0, Execution time: 0.9463
Epoch: 3136, Score:   5.0, Execution time: 0.9336


Epoch: 3281, Score:  11.0, Execution time: 0.9202
Epoch: 3282, Score:  10.0, Execution time: 0.9079
Epoch: 3283, Score:  14.0, Execution time: 0.9124
Epoch: 3284, Score:  15.0, Execution time: 0.9092
Epoch: 3285, Score:  13.0, Execution time: 0.9199
Epoch: 3286, Score:  11.0, Execution time: 0.9158
Epoch: 3287, Score:   9.0, Execution time: 0.9108
Epoch: 3288, Score:   1.0, Execution time: 0.9203
Epoch: 3289, Score:  12.0, Execution time: 0.9348
Epoch: 3290, Score:   7.0, Execution time: 0.9437
Epoch: 3291, Score:  17.0, Execution time: 0.9191
Epoch: 3292, Score:  10.0, Execution time: 0.9183
Epoch: 3293, Score:  15.0, Execution time: 0.9093
Epoch: 3294, Score:   7.0, Execution time: 0.9021
Epoch: 3295, Score:  11.0, Execution time: 0.9127
Epoch: 3296, Score:  11.0, Execution time: 0.9659
Epoch: 3297, Score:  13.0, Execution time: 0.9439
Epoch: 3298, Score:  15.0, Execution time: 0.9256
Epoch: 3299, Score:  12.0, Execution time: 0.9291
Epoch: 3300, Score:  13.0, Execution time: 0.9349


Epoch: 3445, Score:  11.0, Execution time: 0.9797
Epoch: 3446, Score:   9.0, Execution time: 0.9378
Epoch: 3447, Score:   8.0, Execution time: 0.9258
Epoch: 3448, Score:  14.0, Execution time: 0.9317
Epoch: 3449, Score:   7.0, Execution time: 0.9244
Epoch: 3450, Score:  14.0, Execution time: 0.9271
Epoch: 3451, Score:  16.0, Execution time: 0.9250
Epoch: 3452, Score:  19.0, Execution time: 0.9182
Epoch: 3453, Score:  18.0, Execution time: 0.9115
Epoch: 3454, Score:  18.0, Execution time: 0.9257
Epoch: 3455, Score:  13.0, Execution time: 0.9118
Epoch: 3456, Score:  14.0, Execution time: 0.9387
Epoch: 3457, Score:  14.0, Execution time: 0.9177
Epoch: 3458, Score:  17.0, Execution time: 0.9099
Epoch: 3459, Score:  15.0, Execution time: 0.9255
Epoch: 3460, Score:  10.0, Execution time: 0.9401
Epoch: 3461, Score:   9.0, Execution time: 0.9249
Epoch: 3462, Score:  11.0, Execution time: 0.9389
Epoch: 3463, Score:  11.0, Execution time: 0.9134
Epoch: 3464, Score:  13.0, Execution time: 0.9109


Epoch: 3609, Score:  13.0, Execution time: 0.9331
Epoch: 3610, Score:   7.0, Execution time: 0.9297
Epoch: 3611, Score:  14.0, Execution time: 0.9435
Epoch: 3612, Score:  12.0, Execution time: 0.9376
Epoch: 3613, Score:  13.0, Execution time: 0.9369
Epoch: 3614, Score:  11.0, Execution time: 0.9465
Epoch: 3615, Score:   8.0, Execution time: 0.9953
Epoch: 3616, Score:   3.0, Execution time: 0.9113
Epoch: 3617, Score:  17.0, Execution time: 0.9191
Epoch: 3618, Score:  14.0, Execution time: 0.9257
Epoch: 3619, Score:  16.0, Execution time: 0.9483
Epoch: 3620, Score:   6.0, Execution time: 0.9447
Epoch: 3621, Score:  15.0, Execution time: 0.9661
Epoch: 3622, Score:  11.0, Execution time: 0.9335
Epoch: 3623, Score:  15.0, Execution time: 0.9304
Epoch: 3624, Score:   5.0, Execution time: 0.9254
Epoch: 3625, Score:  11.0, Execution time: 0.9325
Epoch: 3626, Score:  18.0, Execution time: 0.9317
Epoch: 3627, Score:   7.0, Execution time: 0.9027
Epoch: 3628, Score:   6.0, Execution time: 0.9209


Epoch: 3773, Score:  11.0, Execution time: 0.9398
Epoch: 3774, Score:  13.0, Execution time: 0.9463
Epoch: 3775, Score:  11.0, Execution time: 0.9503
Epoch: 3776, Score:  13.0, Execution time: 0.9594
Epoch: 3777, Score:  12.0, Execution time: 0.9888
Epoch: 3778, Score:  15.0, Execution time: 0.9951
Epoch: 3779, Score:   8.0, Execution time: 0.9667
Epoch: 3780, Score:  11.0, Execution time: 0.9339
Epoch: 3781, Score:  18.0, Execution time: 0.9408
Epoch: 3782, Score:  15.0, Execution time: 1.0121
Epoch: 3783, Score:  10.0, Execution time: 0.9463
Epoch: 3784, Score:  24.0, Execution time: 0.9632
Epoch: 3785, Score:  15.0, Execution time: 0.9331
Epoch: 3786, Score:   9.0, Execution time: 0.9562
Epoch: 3787, Score:  18.0, Execution time: 0.9642
Epoch: 3788, Score:  10.0, Execution time: 0.9725
Epoch: 3789, Score:  12.0, Execution time: 0.9519
Epoch: 3790, Score:  12.0, Execution time: 0.9331
Epoch: 3791, Score:  13.0, Execution time: 0.9355
Epoch: 3792, Score:  13.0, Execution time: 0.9379


Epoch: 3937, Score:  13.0, Execution time: 0.9432
Epoch: 3938, Score:  10.0, Execution time: 0.9432
Epoch: 3939, Score:  13.0, Execution time: 0.9266
Epoch: 3940, Score:  14.0, Execution time: 0.9480
Epoch: 3941, Score:  14.0, Execution time: 0.9378
Epoch: 3942, Score:   8.0, Execution time: 0.9258
Epoch: 3943, Score:  11.0, Execution time: 0.9265
Epoch: 3944, Score:  18.0, Execution time: 0.9431
Epoch: 3945, Score:  11.0, Execution time: 0.9426
Epoch: 3946, Score:  10.0, Execution time: 0.9398
Epoch: 3947, Score:  14.0, Execution time: 0.9251
Epoch: 3948, Score:  16.0, Execution time: 0.9231
Epoch: 3949, Score:  16.0, Execution time: 0.9392
Epoch: 3950, Score:  18.0, Execution time: 0.9920
Epoch: 3951, Score:   5.0, Execution time: 0.9476
Epoch: 3952, Score:  16.0, Execution time: 0.9449
Epoch: 3953, Score:  12.0, Execution time: 0.9202
Epoch: 3954, Score:  12.0, Execution time: 0.9230
Epoch: 3955, Score:  16.0, Execution time: 0.9706
Epoch: 3956, Score:  18.0, Execution time: 0.9690
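
Per-episode scores like those above are noisy, so progress is easier to judge from a moving average. This is a minimal sketch, not part of the original run: `RollingMean` is a hypothetical helper, and the default window of 100 episodes is an assumption to adjust to whatever criterion the project instructions define.

```python
from collections import deque


class RollingMean:
    """Mean of the most recent `window` scores (hypothetical helper)."""

    def __init__(self, window=100):
        # The bounded deque discards the oldest score once the window is full.
        self.scores = deque(maxlen=window)

    def add(self, score):
        """Record a score and return the mean over the current window."""
        self.scores.append(score)
        return sum(self.scores) / len(self.scores)
```

Inside the training loop, calling `rolling.add(score)` after each episode and printing the returned value alongside the score would show the trend directly.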


In [15]:
import torch

torch.save(agent.qnetwork_local.state_dict(), '20200704a_qnetwork_local_statedict.pth')

torch.save(agent.qnetwork_target.state_dict(), '20200704a_qnetwork_target_statedict.pth')

In [16]:
env.close()