# Mathias Babin - P1 Navigation Test

This is my implementation of the P1 Navigation project for [Udacity's Deep Reinforcement Learning course](https://www.udacity.com/course/deep-reinforcement-learning-nanodegree--nd893). Details on the project are provided in the **README** for this repository. The purpose of this notebook is to watch a **finished** agent perform in this environment. If you wish to **train** an agent yourself, please go to the **Navigation_Train** notebook included in this repository.


### 1. Setting up the Environment

The following cells import the required packages and set up the environment. The first cell verifies that both [Unity ML-Agents](https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Installation.md) and [NumPy](http://www.numpy.org/) have been installed correctly. It also imports PyTorch and this project's `Agent` class.

In [1]:
from unityagents import UnityEnvironment
import torch
from Agent import Agent

This project was built and tested on a 64-bit Linux system. To run it on a different OS, change the file path in the next cell to one of the following:

- **Mac**: `"path/to/Banana.app"`
- **Windows** (x86): `"path/to/Banana_Windows_x86/Banana.exe"`
- **Windows** (x86_64): `"path/to/Banana_Windows_x86_64/Banana.exe"`
- **Linux** (x86): `"path/to/Banana_Linux/Banana.x86"`
- **Linux** (x86_64): `"path/to/Banana_Linux/Banana.x86_64"`

Note that all of these files **_should_** already be included in the repository as .zip files. Simply extract the one that matches your current OS (the Linux 32-bit and 64-bit builds are already extracted).
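
As a rough sketch, the path selection described above could be automated with the standard `platform` module. The helper name `banana_build_path` is hypothetical and not part of this repository. The returned paths are relative forms of the placeholders listed above and assume the archives were extracted next to this notebook. The 64-bit check is a simple heuristic, so adjust it if your machine reports something unusual.

```python
import platform

def banana_build_path(system=None, is_64bit=None):
    """Return a relative path to the Banana build for an OS (hypothetical helper)."""
    # Fall back to the current machine when no values are passed in.
    system = system or platform.system()
    if is_64bit is None:
        # Heuristic: names such as "x86_64", "AMD64" and "arm64" end in "64".
        is_64bit = platform.machine().endswith("64")

    if system == "Darwin":  # macOS
        return "Banana.app"
    if system == "Windows":
        folder = "Banana_Windows_x86_64" if is_64bit else "Banana_Windows_x86"
        return folder + "/Banana.exe"
    if system == "Linux":
        return "Banana_Linux/Banana.x86_64" if is_64bit else "Banana_Linux/Banana.x86"
    raise ValueError("Unsupported OS: " + system)
```

The result could then be passed to the environment in the next cell, for example `UnityEnvironment(file_name=banana_build_path())`.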

The next cell sets up the environment. **_IMPORTANT:_** If the following cell opens a Unity window that crashes, this is because the rest of the cells in the project are not being executed fast enough. To avoid this, please select **Restart & Run All** under **Kernel**. This will execute all the cells in the project.

In [2]:
env = UnityEnvironment(file_name="Banana_Linux/Banana.x86_64")

INFO:unityagents:
'Academy' started successfully!
Unity Academy name: Academy
        Number of Brains: 1
        Number of External Brains : 1
        Lesson number : 0
        Reset Parameters :
		
Unity brain name: BananaBrain
        Number of Visual Observations (per agent): 0
        Vector Observation space type: continuous
        Vector Observation space size (per agent): 37
        Number of stacked Vector Observation: 1
        Vector Action space type: discrete
        Vector Action space size (per agent): 4
        Vector Action descriptions: , , , 


### 2. Testing the Agent

Start by initializing the values needed to test the agent, then load the agent's weights from the *checkpoint.pth* file created by the **Navigation_Train** notebook.

In [3]:
brain_name = env.brain_names[0] # get the name of the first brain in the Unity environment
brain = env.brains[brain_name]
env_info = env.reset(train_mode=False)[brain_name] # reset the environment and obtain info on state/action space

# initialize agent with state size and action size.
agent = Agent(len(env_info.vector_observations[0]), brain.vector_action_space_size, seed=0)

# load the trained weights
agent.qnetwork_local.load_state_dict(torch.load('checkpoint.pth'))

Test the trained agent and display its final score.

In [4]:
state = env_info.vector_observations[0]  # get the first state
score = 0 # initialize the score
while True: # loop until the episode ends
    action = agent.act(state, 0) # select a greedy action
    env_info = env.step(action)[brain_name] # take that action
    score += env_info.rewards[0] # update the score with the reward for taking that action
    next_state = env_info.vector_observations[0] # the next state
    state = next_state # set current state to next state
    done = env_info.local_done[0] # get the value of the done bool, indicating the episode is over
    # end episode if done is true
    if done:
        break

print("Score: {}".format(score)) # print the score

Score: 14.0
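
The second argument passed to `agent.act` above is 0, which the comment describes as selecting a greedy action. Assuming that argument is the exploration rate of an epsilon-greedy policy (the `Agent` class is not shown in this notebook, so this is an assumption), the selection rule can be sketched as below. The function `epsilon_greedy` and the sample `q_values` are illustrative only and are not the repository's code.

```python
import random

def epsilon_greedy(q_values, eps, rng=random):
    """Pick a random action with probability eps, otherwise the highest-valued one."""
    if rng.random() < eps:
        # Explore: choose uniformly among all actions.
        return rng.randrange(len(q_values))
    # Exploit: choose the index of the largest action value.
    return max(range(len(q_values)), key=lambda a: q_values[a])

# Made-up action values for the four discrete actions.
q_values = [0.1, 0.7, 0.3, 0.2]
greedy_action = epsilon_greedy(q_values, eps=0.0)
```

With `eps=0.0` the random branch is never taken, so the agent always exploits its learned values, which is what you want when watching a finished agent rather than training one.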


Finally, close the environment.

In [5]:
env.close() # close the environment
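
If a cell raises an error before this point, `close()` is never reached and the Unity window may be left open. One possible way to guarantee the call is `contextlib.closing` from the standard library. The sketch below uses a stand-in class, `FakeEnv`, which is hypothetical and only mimics a `close()` method so that the example runs without Unity.

```python
from contextlib import closing

class FakeEnv:
    """Hypothetical stand-in for an environment object with a close() method."""
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

env_demo = FakeEnv()
try:
    # closing() calls env_demo.close() when the block exits, even on an error.
    with closing(env_demo):
        raise RuntimeError("simulated failure during an episode")
except RuntimeError:
    pass  # the error still propagates out of the with-block; it is caught here for the demo
```

After this runs, `env_demo.closed` is `True` even though the block failed partway through.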

### 3. Implementation Details

If you have any questions about the implementation details of this project, please refer to the **Report.pdf** file included with this repository for a full explanation of both the algorithms and the design decisions made.