# Trained Agent Demo

---

In this notebook, we'll run a trained agent to see how it performs.

### Start the Environment

We begin by importing the necessary packages and starting the environment.
If the code cell below returns an error, please revisit the installation instructions in README.md.

In [11]:
from unityagents import UnityEnvironment
import numpy as np
import torch

env = UnityEnvironment(file_name="Banana_Windows_x86_64/Banana.exe")
brain_name = env.brain_names[0]
brain = env.brains[brain_name]

# reset the environment
env_info = env.reset(train_mode=True)[brain_name]

# number of agents in the environment
print('Number of agents:', len(env_info.agents))

# number of actions
action_size = brain.vector_action_space_size
print('Number of actions:', action_size)

# examine the state space 
state = env_info.vector_observations[0]
print('States look like:', state)
state_size = len(state)
print('States have length:', state_size)

INFO:unityagents:
'Academy' started successfully!
Unity Academy name: Academy
        Number of Brains: 1
        Number of External Brains : 1
        Lesson number : 0
        Reset Parameters :
		
Unity brain name: BananaBrain
        Number of Visual Observations (per agent): 0
        Vector Observation space type: continuous
        Vector Observation space size (per agent): 37
        Number of stacked Vector Observation: 1
        Vector Action space type: discrete
        Vector Action space size (per agent): 4
        Vector Action descriptions: , , , 


Number of agents: 1
Number of actions: 4
States look like: [1.         0.         0.         0.         0.84408134 0.
 0.         1.         0.         0.0748472  0.         1.
 0.         0.         0.25755    1.         0.         0.
 0.         0.74177343 0.         1.         0.         0.
 0.25854847 0.         0.         1.         0.         0.09355672
 0.         1.         0.         0.         0.31969345 0.
 0.        ]
States have length: 37


### Load Agent

We create a new agent and load the pre-trained weights.

In [13]:
# Import agent from 'src' folder.
import sys
sys.path.insert(0, 'src')
from dqn_agent import Agent

# Create agent.
agent = Agent(state_size=state_size, action_size=action_size, seed=0)

# Load weights. A separate name keeps the `state` observation from the first cell intact.
state_dict = torch.load('checkpoint.pth')
agent.qnetwork_local.load_state_dict(state_dict)
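
Because `'checkpoint.pth'` is a relative path, loading only works when the notebook is started from the directory that contains the file. The following is a minimal sketch of a pre-flight check; `require_file` is a hypothetical helper, not part of this project, and the temporary file merely stands in for the real checkpoint.

```python
from pathlib import Path
import tempfile

def require_file(path):
    """Return the resolved path, or raise a descriptive FileNotFoundError."""
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(
            f"Weights file not found: {p.resolve()} (current directory: {Path.cwd()})"
        )
    return p.resolve()

# Demonstration with a temporary file standing in for 'checkpoint.pth'
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "checkpoint.pth"
    demo.write_bytes(b"")
    found = require_file(demo)          # existing file: returns its path
    missing_ok = False
    try:
        require_file(Path(tmp) / "missing.pth")
    except FileNotFoundError:
        missing_ok = True               # missing file: clear error instead
```

In the notebook, the result could be passed straight to the loader, for example `torch.load(require_file('checkpoint.pth'))`.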

### Action!

Run this cell, switch to the environment window, and watch the agent collect bananas.

In [None]:
# Run the environment for several episodes and print the final score of each
episodes = 50
for e in range(1, episodes + 1):  # +1 so that all 50 episodes run
    env_info = env.reset(train_mode=False)[brain_name]
    state = env_info.vector_observations[0]
    score = 0
    for t in range(300):
        # the second argument (0) is presumably epsilon, i.e. always act greedily
        action = int(agent.act(state, 0))
        env_info = env.step(action)[brain_name]
        next_state = env_info.vector_observations[0]
        reward = env_info.rewards[0]
        done = env_info.local_done[0]
        # forwards the transition to the agent; omit this call if the demo
        # should not update the loaded weights
        agent.step(state, action, reward, next_state, done)
        state = next_state
        score += reward
        if done:
            break
    print("Episode %d, Score: %f" % (e, score))
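
The loop above calls `agent.act(state, 0)`. Assuming the second argument is the exploration rate epsilon, as is common in DQN agents (the code of `dqn_agent` is not shown here), a value of 0 means the agent never explores and always picks the action with the highest estimated value. The sketch below illustrates that rule in isolation; `epsilon_greedy` and the Q-values are hypothetical and only serve as an illustration.

```python
import random

def epsilon_greedy(q_values, eps, rng=random):
    """Return a random action index with probability eps, else the argmax."""
    if rng.random() < eps:
        return rng.randrange(len(q_values))
    # Greedy choice: index of the largest estimated action value
    return max(range(len(q_values)), key=lambda a: q_values[a])

# Made-up Q-values for the 4 discrete actions of this environment
q_values = [0.1, 0.7, -0.2, 0.3]

# eps=0.0 never takes the random branch, so the result is deterministic
greedy_action = epsilon_greedy(q_values, eps=0.0)
print(greedy_action)
```

With `eps=1.0` the same function would ignore the Q-values entirely and return a uniformly random action, which is the other extreme typically used at the start of training.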


When finished, you can close the environment.

In [None]:
env.close()
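
If the demo cell is interrupted or raises an error, the closing cell may never be run and the environment window stays open. One possible way to guard against this is `contextlib.closing`, which calls `close()` when the `with` block exits, even on an exception. The sketch below assumes only that the environment object exposes a `close()` method, as used above; `FakeEnv` is a hypothetical stand-in so the example runs without the Unity binary.

```python
from contextlib import closing

class FakeEnv:
    """Hypothetical stand-in for UnityEnvironment; only tracks close()."""
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

fake_env = FakeEnv()
try:
    with closing(fake_env) as env_handle:
        # Simulate a failure in the middle of the demo loop
        raise RuntimeError("episode interrupted")
except RuntimeError:
    pass

print(fake_env.closed)  # True: close() ran despite the exception
```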