# Project 1: Navigation

---

This notebook uses the Unity ML-Agents environment for the first project of the [Deep Reinforcement Learning Nanodegree](https://www.udacity.com/course/deep-reinforcement-learning-nanodegree--nd893).

In [None]:
import sys

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from unityagents import UnityEnvironment

## Environment configuration

**_Before running the code cells below_**, set `ENVIRONMENT_PATH` to the location of the Unity environment on your machine (the `../environments/` folder is a convenient place for it).

In [None]:
ENVIRONMENT_PATH = "../environments/Banana.app"
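
A wrong path is easy to miss until the Unity environment fails to launch. As a minimal sketch, the path could be checked up front with a small helper. The name `require_environment` is hypothetical and not part of this project:

```python
from pathlib import Path


def require_environment(path):
    """Return the resolved path, or raise if nothing exists at that location.

    Hypothetical helper for illustration; it is not part of the project's code.
    """
    resolved = Path(path).expanduser().resolve()
    # On macOS a `.app` bundle is a directory, so test for existence
    # rather than for a regular file.
    if not resolved.exists():
        raise FileNotFoundError(
            "Unity environment not found at {}".format(resolved)
        )
    return resolved
```

Calling `require_environment(ENVIRONMENT_PATH)` before creating the environment would raise a `FileNotFoundError` that names the missing location.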

In [None]:
SRC_PATH = "../src"
MODEL_CHECKPOINT_PATH = "../models/drlnd_p1_model.pth"

In [None]:
sys.path.append(SRC_PATH)

In [None]:
from agents import Agent
from environments import UnityEnvWrapper

## Create the Unity environment

We wrap the Unity environment so that it exposes a Gym-like interface (`reset`, `step`, `close`). That way the agent's implementation doesn't need to change.

In [None]:
env = UnityEnvWrapper(UnityEnvironment(file_name=ENVIRONMENT_PATH))
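
The real `UnityEnvWrapper` lives in `../src/environments` and is not shown in this notebook. The sketch below only illustrates what such an adapter might look like for a single-agent environment. The class name `GymLikeUnityEnv` is hypothetical, and the attribute names on the wrapped object (`brain_names`, `brains`, `vector_observations`, `rewards`, `local_done`, and the two `*_space_size` fields) are assumptions about the `unityagents` interface, so check them against the installed version:

```python
class GymLikeUnityEnv:
    """Hypothetical Gym-style adapter around a single-agent Unity environment.

    Illustration only; the project's own UnityEnvWrapper may differ.
    """

    def __init__(self, unity_env):
        self._env = unity_env
        # Assumption: the first brain controls the only agent.
        self._brain_name = unity_env.brain_names[0]
        brain = unity_env.brains[self._brain_name]
        self.action_size = brain.vector_action_space_size
        self.state_size = brain.vector_observation_space_size

    def reset(self, train_mode=True):
        # Return only the observation of the single agent.
        info = self._env.reset(train_mode=train_mode)[self._brain_name]
        return info.vector_observations[0]

    def step(self, action):
        # Unpack the per-agent lists into a (next_state, reward, done) tuple.
        info = self._env.step(action)[self._brain_name]
        return info.vector_observations[0], info.rewards[0], info.local_done[0]

    def close(self):
        self._env.close()
```

Note that `step` returns a three-element tuple, matching the loop used later in this notebook, rather than Gym's four-element `(observation, reward, done, info)`.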

## Training an agent

Instantiate an `Agent` and train it on the environment:

In [None]:
agent = Agent(state_size=env.state_size, action_size=env.action_size, seed=0)

In [None]:
scores = agent.learn(environment=env, model_checkpoint_path=MODEL_CHECKPOINT_PATH)

### Plot scores

In [None]:
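fig, ax = plt.subplots()
x = np.arange(len(scores))
# Raw per-episode scores
ax.plot(x, scores, label="Score")
# Rolling mean over 100 episodes (NaN until 100 episodes are available)
ax.plot(x, pd.Series(scores).rolling(100).mean(), label="100-episode mean")
ax.set_ylabel("Score")
ax.set_xlabel("Episode #")
ax.legend()
fig.savefig("scores")
plt.show()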
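
By default `rolling(100)` needs a full window, so the averaged curve is empty for the first 99 episodes. The sketch below uses made-up scores and a window of 3 to show the difference that `min_periods=1` would make. It is an optional variation, not what the plot above does:

```python
import pandas as pd

# Made-up scores, for illustration only.
example_scores = [0.0, 2.0, 4.0, 6.0]
window = 3  # the notebook uses 100

# Default behaviour: NaN until a full window is available.
strict = pd.Series(example_scores).rolling(window).mean()

# With min_periods=1 the mean is taken over however many values exist so far.
lenient = pd.Series(example_scores).rolling(window, min_periods=1).mean()

print(strict.tolist())
print(lenient.tolist())
```

With `min_periods=1` the early part of the curve averages fewer episodes, so it is noisier than the rest and should be read with that in mind.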

## Load model and test a trained agent

In [None]:
agent = Agent.load(MODEL_CHECKPOINT_PATH)
state = env.reset(train_mode=False)

In [None]:
score = 0
# Run a single episode with the trained agent until the environment reports it is done
while True:
    action = agent.act(state)
    next_state, reward, done = env.step(action)
    score += reward
    state = next_state
    if done:
        break

print("Score: {}".format(score))

In [None]:
env.close()