# Collaboration and Competition

---

In this notebook, you will learn how to use the Unity ML-Agents environment for the third project of the [Deep Reinforcement Learning Nanodegree](https://www.udacity.com/course/deep-reinforcement-learning-nanodegree--nd893) program.

### 1. Start the Environment

We begin by importing the necessary packages. If the code cell below returns an error, please revisit the project instructions to double-check that you have installed [Unity ML-Agents](https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Installation.md) and [NumPy](http://www.numpy.org/). The cell also imports `torch.utils.tensorboard`, so PyTorch must be installed as well.

In [None]:
import copy

from torch.utils.tensorboard import SummaryWriter
from unityagents import UnityEnvironment

from madddpg_utils import (
    DDPGConfig, MADDDPGConfig, train_agent, EnvironmentWrapper,
    evaluate_current_weights, ExploreStrategy,
)
from maddpg_trainer import MADDPGManager
from maddpg_tennis import default_cfg

NUM_SUB_POLICIES = 1  # number of sub-policy configs created per agent

In [None]:
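# Launch the Tennis build (Linux binary; adjust the path for your platform).
env = UnityEnvironment(file_name='environments/Tennis_Linux/Tennis.x86_64')
# Wrap the raw environment; the wrapper exposes num_agents, state_sizes, and action_sizes.
env = EnvironmentWrapper(env)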

### 2. Init Agent

Build one MADDPG configuration per agent in the environment, then create the `MADDPGManager` that coordinates them.

In [None]:
multi_cfg = []
config = default_cfg()
config.multi_agent_actions = env.action_sizes
config.multi_agent_states = env.state_sizes
for i in range(env.num_agents):
    agent_cfg = copy.deepcopy(config)
    agent_cfg.state_size = config.multi_agent_states[i]
    agent_cfg.action_size = config.multi_agent_actions[i]
    maddpg_cfg = MADDDPGConfig()
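    # Deep-copy per sub-policy so the entries do not alias one config object.
    maddpg_cfg.subpolicy_configs = [copy.deepcopy(agent_cfg) for _ in range(NUM_SUB_POLICIES)]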
    multi_cfg.append(maddpg_cfg)

agent = MADDPGManager(maddpg_agents_configs=multi_cfg, update_every=config.update_every)
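
When more than one sub-policy is configured per agent, it matters whether the entries of `subpolicy_configs` are independent objects or references to the same one. The following is a minimal sketch of that difference. `ToyConfig` and its fields are hypothetical stand-ins, not the project's actual config class.

```python
import copy
from dataclasses import dataclass


@dataclass
class ToyConfig:
    """Hypothetical stand-in for a per-agent config object (illustrative values)."""
    state_size: int = 24
    action_size: int = 2
    lr: float = 1e-3


base = ToyConfig()

# List multiplication repeats the same reference three times.
shared = [base] * 3
shared[0].lr = 5e-4  # visible through every entry, because all entries are `base`
shared_ids = {id(c) for c in shared}

# Deep-copying per entry gives each sub-policy its own object.
independent = [copy.deepcopy(base) for _ in range(3)]
independent[0].lr = 1e-2  # only the first entry changes

print(len(shared_ids), [c.lr for c in shared])
print([c.lr for c in independent])
```

With `NUM_SUB_POLICIES = 1` the two approaches behave the same; the distinction only shows up once a sub-policy's config is modified after the list is built.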

### 3. Train your agent

This section trains a new agent. Weights are saved only once the agent reaches an average score of 0.5 or higher. To test the pre-trained weights instead, skip to Section 5.
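
To make the 0.5 threshold concrete, here is a rough sketch of how such a save criterion could be checked. It is not the logic inside `train_agent`, which this notebook does not show. The 100-episode window and the choice of the best agent's return as the episode score are assumptions, and the rollout is simulated with random numbers.

```python
import random
from collections import deque

SOLVE_THRESHOLD = 0.5  # threshold stated in the text above
WINDOW = 100           # assumption: averaging window, not stated in this notebook


def should_save(recent_scores, threshold=SOLVE_THRESHOLD, window=WINDOW):
    """Return True once a full window of episode scores averages >= threshold."""
    if len(recent_scores) < window:
        return False
    return sum(recent_scores) / len(recent_scores) >= threshold


recent = deque(maxlen=WINDOW)  # keeps only the most recent WINDOW scores
rng = random.Random(0)
saved_at = None

for episode in range(1, 301):
    # Stand-in for a real rollout: two agents' returns that improve over time.
    scale = min(1.0, episode / 150)
    agent_returns = [rng.uniform(0.0, 1.0) * scale for _ in range(2)]
    # Assumption: the episode score is the better of the two agents' returns.
    recent.append(max(agent_returns))
    if saved_at is None and should_save(recent):
        saved_at = episode  # a real trainer would write weights to disk here

print('first save at episode:', saved_at)
```

Requiring a full window before saving avoids writing weights after a few lucky early episodes.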

In [None]:
weight_dir = 'new_weights/'
train_agent(env, agent, main_weight_folder=weight_dir,
            n_episodes=1200, evaluation_freq=50)

### 4. Test your agent

Here you can test the weights trained in Section 3. Because weights are saved only once the agent reaches an average score of 0.5 or higher, this step requires that training reached that threshold. To test the pre-trained weights instead, skip to Section 5.
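
Before loading, it can help to confirm that training actually wrote something to the weight folder. The helper below is a hypothetical sketch, not part of the project code. Since the file layout expected by `load_weights` is not shown here, it only checks that the folder exists and contains at least one file.

```python
from pathlib import Path


def has_saved_weights(folder):
    """Return True if `folder` exists and contains at least one file (at any depth)."""
    path = Path(folder)
    return path.is_dir() and any(p.is_file() for p in path.rglob('*'))


# Example check against the folder used in Section 3.
if not has_saved_weights('new_weights/'):
    print('No files found in new_weights/; train longer or skip to Section 5.')
```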

In [None]:
agent.load_weights(main_folder=weight_dir)
evaluate_current_weights(env, agent, env.num_agents, train_mode=False)

### 5. Test pre-trained agent

Re-create the agent with the default configuration, then load the pre-trained weights from `best_weights/` and evaluate them.

In [None]:
multi_cfg = []
config = default_cfg()
config.multi_agent_actions = env.action_sizes
config.multi_agent_states = env.state_sizes
for i in range(env.num_agents):
    agent_cfg = copy.deepcopy(config)
    agent_cfg.state_size = config.multi_agent_states[i]
    agent_cfg.action_size = config.multi_agent_actions[i]
    maddpg_cfg = MADDDPGConfig()
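    # Deep-copy per sub-policy so the entries do not alias one config object.
    maddpg_cfg.subpolicy_configs = [copy.deepcopy(agent_cfg) for _ in range(NUM_SUB_POLICIES)]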
    multi_cfg.append(maddpg_cfg)

agent = MADDPGManager(maddpg_agents_configs=multi_cfg, update_every=config.update_every)

In [None]:
agent.load_weights(main_folder="best_weights/")
evaluate_current_weights(env, agent, env.num_agents, train_mode=False)

In [None]:
env.close()