# Evaluate Agent on Unity Environment

---

## Start the Environment

The steps below assume that you have followed the instructions in the README and that the Unity environment is ready to use.

In [1]:
from unityagents import UnityEnvironment

env = UnityEnvironment(file_name="Banana.app")

INFO:unityagents:
'Academy' started successfully!
Unity Academy name: Academy
        Number of Brains: 1
        Number of External Brains : 1
        Lesson number : 0
        Reset Parameters :
		
Unity brain name: BananaBrain
        Number of Visual Observations (per agent): 0
        Vector Observation space type: continuous
        Vector Observation space size (per agent): 37
        Number of stacked Vector Observation: 1
        Vector Action space type: discrete
        Vector Action space size (per agent): 4
        Vector Action descriptions: , , , 


Environments contain **_brains_**, which are responsible for deciding the actions of their associated agents. Here we take the first available brain and set it as the default brain to control from Python.

In [2]:
# get the default brain
brain_name = env.brain_names[0]
brain = env.brains[brain_name]

## Run the Agent

Specify the path to the saved model weights to evaluate.

In [3]:
model_path = "model-double-dueling-dqn.pt"

Run the cell below to watch the agent interact with the Unity environment.

In [None]:
import torch
from dqn_agent import Agent

env_info = env.reset(train_mode=False)[brain_name] # reset the environment
state_size = len(env_info.vector_observations[0])  # get state size
action_size = brain.vector_action_space_size       # get action size

# initialize agent
agent = Agent(state_size=state_size, action_size=action_size, seed=0, use_double_dqn=True, use_dueling_dqn=True)
agent.qnetwork_local.load_state_dict(torch.load(model_path, map_location=lambda storage, loc: storage))
agent.qnetwork_local.eval()

state = env_info.vector_observations[0]            # get the current state
score = 0                                          # initialize the score
while True:
    action = agent.act(state)                      # select an action according to the agent's policy
    env_info = env.step(action)[brain_name]        # send the action to the environment
    next_state = env_info.vector_observations[0]   # get the next state
    reward = env_info.rewards[0]                   # get the reward
    done = env_info.local_done[0]                  # see if episode has finished
    score += reward                                # update the score
    state = next_state                             # roll over the state to next time step
    if done:                                       # exit loop if episode finished
        break
    
print("Score: {}".format(score))