# Collaboration and Competition

---

This notebook uses the Unity ML-Agents Tennis environment to train a MADDPG agent for the third project of the [Deep Reinforcement Learning Nanodegree](https://www.udacity.com/course/deep-reinforcement-learning-nanodegree--nd893) program.

In [None]:
import os
import sys

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from unityagents import UnityEnvironment

## Create the Unity environment

The code cell above imports the necessary packages. If it returns an error, please revisit the project instructions to double-check that you have installed [Unity ML-Agents](https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Installation.md) and [NumPy](http://www.numpy.org/), along with pandas and Matplotlib.

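__Before running the code cell below__, change the `ENVIRONMENT_PATH` variable to match the location of the Unity environment that you downloaded. The default points to the macOS build (`Tennis.app`); the commented-out line shows the equivalent path for the Linux build.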

In [None]:
# macOS build
ENVIRONMENT_PATH = os.path.join("..", "environments", "Tennis.app")
# Linux build
# ENVIRONMENT_PATH = os.path.join("..", "environments", "Tennis_Linux", "Tennis.x86_64")
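
A wrong path is an easy mistake to make at this step, so it can help to check it before launching Unity. The following is a minimal sketch, not part of the project code: `resolve_environment_path` is a hypothetical helper that only verifies that something exists at the given location.

```python
import os


def resolve_environment_path(path):
    """Return the absolute form of `path`, or raise if nothing exists there.

    Hypothetical helper for illustration; it is not part of unityagents.
    """
    absolute = os.path.abspath(path)
    # os.path.exists accepts both a regular file (the Linux binary)
    # and a directory (a macOS .app bundle is a directory).
    if not os.path.exists(absolute):
        raise FileNotFoundError(f"No Unity environment found at {absolute}")
    return absolute
```

Calling `resolve_environment_path(ENVIRONMENT_PATH)` before constructing `UnityEnvironment` would surface a misconfigured path as a plain `FileNotFoundError`.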

In [None]:
SEED = 0
SRC_PATH = os.path.join("..", "src")
AGENT_CHECKPOINT_DIR = os.path.join("..", "models")

In [None]:
sys.path.append(SRC_PATH)

In [None]:
from environments import UnityEnvWrapper
from agents.policy_based import MADDPG

In [None]:
# The Tennis environment has two agents, one per racket.
n_agents = 2

## Train an agent

In [None]:
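# Open the environment in a `with` block so the wrapper can clean up on exit.
with UnityEnvWrapper(UnityEnvironment(file_name=ENVIRONMENT_PATH)) as env:
    agent = MADDPG(
        state_size=env.state_size,
        action_size=env.action_size,
        n_agents=n_agents,
        seed=SEED,
    )
    # `fit` returns the scores that are plotted per episode further down.
    scores = agent.fit(
        environment=env,
        average_target_score=0.5,
        agent_checkpoint_dir=AGENT_CHECKPOINT_DIR,
    )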
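
The `average_target_score=0.5` argument suggests that training stops once an average score reaches 0.5. How `MADDPG.fit` computes that average is not shown in this notebook, so the sketch below rests on an assumption: the average is taken over a trailing window of episodes, with a window length of 100 chosen here for illustration. `first_solved_episode` is a hypothetical helper, not part of the `agents` package.

```python
from collections import deque


def first_solved_episode(scores, target=0.5, window=100):
    """Return the index of the first episode at which the mean of the
    last `window` scores reaches `target`, or None if it never does.

    The window length is an assumption made for this sketch.
    """
    recent = deque(maxlen=window)  # keeps only the latest `window` scores
    for episode, score in enumerate(scores):
        recent.append(score)
        # Only evaluate once a full window of scores is available.
        if len(recent) == window and sum(recent) / window >= target:
            return episode
    return None


# Small illustration with a window of 4 instead of 100.
example = [0.0] * 5 + [1.0] * 5
print(first_solved_episode(example, target=0.5, window=4))  # prints 6
```

Applied to the `scores` returned by `agent.fit`, a helper like this would report when the run first met the target under the assumed window.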

In [None]:
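# Hide the left, right, and top axis spines and set the figure size.
plt.rcParams["axes.spines.left"] = False
plt.rcParams["axes.spines.right"] = False
plt.rcParams["axes.spines.top"] = False
plt.rcParams["figure.figsize"] = [9, 6]

# Rolling mean and standard deviation over a 10-episode window;
# the first 9 entries are NaN until the window fills.
x = np.arange(len(scores))
mu = pd.Series(scores).rolling(10).mean()
std = pd.Series(scores).rolling(10).std()

# Raw scores, the rolling mean, and a band of one rolling std around it.
plt.plot(x, scores, linewidth=1)
plt.plot(x, mu)
plt.fill_between(x, mu + std, mu - std, facecolor="grey", alpha=0.4)
plt.ylabel("Score")
plt.xlabel("Episode #")

# Save the figure (Matplotlib's default output format is PNG), then display it.
plt.savefig("scores")
plt.show()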
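
To make the smoothing in the plotting cell above concrete, here is a standard-library sketch of what a trailing-window mean and sample standard deviation compute. `rolling_mean_std` is a hypothetical helper written for illustration; it uses `None` where pandas would produce `NaN` while the window is still filling, and `statistics.stdev` for the sample standard deviation (divisor `n - 1`, which matches the pandas default for rolling `std`).

```python
import statistics


def rolling_mean_std(values, window):
    """Trailing-window mean and sample standard deviation.

    Positions with fewer than `window` values so far are filled with None.
    """
    if window < 2:
        raise ValueError("window must be at least 2 for a sample std")
    means, stds = [], []
    for i in range(len(values)):
        if i + 1 < window:
            # Not enough history yet for a full window.
            means.append(None)
            stds.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            means.append(statistics.mean(chunk))
            stds.append(statistics.stdev(chunk))
    return means, stds


means, stds = rolling_mean_std([1.0, 2.0, 3.0, 4.0], window=2)
print(means)  # prints [None, 1.5, 2.5, 3.5]
```

This also explains why the smoothed curve and its shaded band only start once a full window of episodes has been recorded.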

## Load a pre-trained agent and play