# Navigation

---

In this notebook, you will learn how to use the Unity ML-Agents environment for the first project of the [Deep Reinforcement Learning Nanodegree](https://www.udacity.com/course/deep-reinforcement-learning-nanodegree--nd893).


## 1. Imports

In [1]:
import torch
from torch import nn
from torch.nn.functional import mse_loss
from torch.optim import Adam
import numpy as np
import matplotlib.pyplot as plt

In [2]:
from unityagents import UnityEnvironment

In [3]:
from dqn_agent import Agent
from dqn_trainer import DQNTrainer
from model import DQN

## 2. Setup

In [4]:
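# Path to the Banana environment build; adjust it for your machine.
env_path = "/Users/bothmena/Projects/ai/ReinforcementLearning/deep-reinforcement-learning/bin/Banana.app"
# Hidden-layer sizes of the Q-network (consumed in dqn_agent.py).
DQN_HIDDEN_LAYERS = [64, 64]
# DQN_HIDDEN_LAYERS = [64, 128, 128, 128]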

In [5]:
trainer = DQNTrainer(env_filename=env_path, target_score=1.)

In [6]:
trainer.init_env()

INFO:unityagents:
'Academy' started successfully!
Unity Academy name: Academy
        Number of Brains: 1
        Number of External Brains : 1
        Lesson number : 0
        Reset Parameters :
		
Unity brain name: BananaBrain
        Number of Visual Observations (per agent): 0
        Vector Observation space type: continuous
        Vector Observation space size (per agent): 37
        Number of stacked Vector Observation: 1
        Vector Action space type: discrete
        Vector Action space size (per agent): 4
        Vector Action descriptions: , , , 


In [7]:
# Use the default values for batch size, buffer size, etc.
# The defaults are defined in dqn_agent.py; you can override them by passing
# additional arguments to the Agent class call.
trainer.instantiate_agent(DQN_HIDDEN_LAYERS)
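
The buffer size and batch size mentioned above usually refer to an experience replay buffer. The contents of `dqn_agent.py` are not shown in this notebook, so the following is only a minimal sketch of how such a buffer is commonly built; the names `Experience` and `ReplayBuffer` and the tiny sizes are assumptions made for illustration.

```python
import random
from collections import deque, namedtuple

# Hypothetical names; the real dqn_agent.py may organise this differently.
Experience = namedtuple(
    "Experience", ["state", "action", "reward", "next_state", "done"]
)


class ReplayBuffer:
    """Fixed-size store of past transitions, sampled uniformly at random."""

    def __init__(self, buffer_size, batch_size, seed=0):
        # A deque with maxlen drops the oldest transition once it is full.
        self.memory = deque(maxlen=buffer_size)
        self.batch_size = batch_size
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        self.memory.append(Experience(state, action, reward, next_state, done))

    def sample(self):
        # Uniform sampling without replacement from the stored transitions.
        return self.rng.sample(list(self.memory), self.batch_size)

    def __len__(self):
        return len(self.memory)


# Tiny sizes so the behaviour is easy to inspect.
buffer = ReplayBuffer(buffer_size=5, batch_size=3)
for step in range(8):
    buffer.add(state=step, action=step % 4, reward=0.0,
               next_state=step + 1, done=False)
batch = buffer.sample()
print(len(buffer), len(batch))
```

With a buffer size of 5, only the five most recent transitions survive, and each call to `sample()` returns one batch drawn from them.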

In [8]:
trainer.train_dqn()

Episode 100	Average Score: 0.81
Episode 106	Average Score: 1.00
Environment solved in 6 episodes!	Average Score: 1.00
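
The log above reports the environment as solved "in 6 episodes" at episode 106, which suggests that the trainer averages scores over a trailing window of 100 episodes and subtracts that window from the episode count. `dqn_trainer.py` is not shown here, so the sketch below is an assumption about that criterion rather than the trainer's actual code; `first_solved_episode` and the score list are made up for illustration.

```python
from collections import deque


def first_solved_episode(scores, target_score, window=100):
    """Return (episode, average) for the first episode whose trailing-window
    average reaches target_score, or None if that never happens."""
    recent = deque(maxlen=window)
    for episode, score in enumerate(scores, start=1):
        recent.append(score)
        average = sum(recent) / window
        # Only judge the average once the window is completely filled.
        if len(recent) == window and average >= target_score:
            return episode, average
    return None


# Made-up scores: 100 episodes scoring 0.0, then 60 episodes scoring 2.0.
scores = [0.0] * 100 + [2.0] * 60
result = first_solved_episode(scores, target_score=1.0)
if result is not None:
    episode, average = result
    print("Solved at episode {} (average {:.2f}), i.e. {} episodes after the "
          "first full window".format(episode, average, episode - 100))
```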


In [None]:
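# Plot the scores recorded during training and save the figure.
plt.figure(figsize=(20, 10))
plt.plot(trainer.scores)
plt.ylabel('Score')
# The saved_plots/ directory must already exist, otherwise savefig fails.
plt.savefig('saved_plots/results_{}.png'.format(trainer.agent.model_id))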

## 3. Using the trained model to play


In [None]:
# trainer.set_trained_dqn('saved_weights/solved_64_64_score_13.03.pth')

In [None]:
trainer.play()

## 4. Train the agent with DQN

In [15]:
# Use the default values for batch size, buffer size, etc.
# The defaults are defined in dqn_agent.py; you can override them by passing
# additional arguments to the Agent class call.
# Note: state_size and action_size are not defined earlier in this notebook;
# the environment log above reports 37 and 4 for them.
agent = Agent(state_size=state_size, action_size=action_size, hidden_layers=DQN_HIDDEN_LAYERS)

When finished, you can close the environment.

In [None]:
env.close()

## 5. It's Your Turn!

Now it's your turn to train your own agent to solve the environment! When training the agent, set `train_mode=True`, so that the line for resetting the environment looks like the following:
```python
env_info = env.reset(train_mode=True)[brain_name]
```
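
To show where that reset line sits in a full episode, here is a minimal sketch of one interaction loop. It uses the `vector_observations`, `rewards`, and `local_done` fields that `unityagents` returns per brain; `run_episode`, `FakeEnv`, and `random_policy` are hypothetical names, and `FakeEnv` is only a stand-in so that the sketch runs without the Unity build.

```python
import random
from types import SimpleNamespace


def run_episode(env, brain_name, policy, train_mode=True, max_steps=1000):
    """Play one episode and return the total reward collected by agent 0."""
    env_info = env.reset(train_mode=train_mode)[brain_name]
    state = env_info.vector_observations[0]
    score = 0.0
    for _ in range(max_steps):
        action = policy(state)
        env_info = env.step(action)[brain_name]
        state = env_info.vector_observations[0]
        score += env_info.rewards[0]
        if env_info.local_done[0]:
            break
    return score


class FakeEnv:
    """Hypothetical stand-in for UnityEnvironment: 37-dimensional state,
    reward 1.0 per step, episode ends after `length` steps."""

    def __init__(self, brain_name, length=5):
        self.brain_name = brain_name
        self.length = length
        self.t = 0

    def _info(self, reward, done):
        info = SimpleNamespace(vector_observations=[[0.0] * 37],
                               rewards=[reward], local_done=[done])
        return {self.brain_name: info}

    def reset(self, train_mode=True):
        self.t = 0
        return self._info(0.0, False)

    def step(self, action):
        self.t += 1
        return self._info(1.0, self.t >= self.length)


rng = random.Random(0)


def random_policy(state):
    # Pick one of the 4 discrete actions uniformly at random.
    return rng.randrange(4)


episode_score = run_episode(FakeEnv("BananaBrain"), "BananaBrain", random_policy)
print(episode_score)
```

With the real environment, you would pass the `UnityEnvironment` instance and its brain name instead of `FakeEnv`, and replace `random_policy` with the agent's action selection.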