# Collaboration and Competition

---


### 1. Start the Environment

We begin by importing the necessary packages. If the code cell below returns an error, please revisit the project instructions to double-check that you have installed [Unity ML-Agents](https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Installation.md), [NumPy](http://www.numpy.org/), and [PyTorch](https://pytorch.org/).

In [None]:
from unityagents import UnityEnvironment
import numpy as np
import torch
import sys
sys.path.append("../")

Next, we will start the environment! **_Before running the code cell below_**, change the `env_binary_path` argument to match the location of the Unity environment that you downloaded.

- **Mac**: `"path/to/Soccer.app"`
- **Windows** (x86): `"path/to/Soccer_Windows_x86/Soccer.exe"`
- **Windows** (x86_64): `"path/to/Soccer_Windows_x86_64/Soccer.exe"`
- **Linux** (x86): `"path/to/Soccer_Linux/Soccer.x86"`
- **Linux** (x86_64): `"path/to/Soccer_Linux/Soccer.x86_64"`
- **Linux** (x86, headless): `"path/to/Soccer_Linux_NoVis/Soccer.x86"`
- **Linux** (x86_64, headless): `"path/to/Soccer_Linux_NoVis/Soccer.x86_64"`

For instance, if you are using a Mac, then you downloaded `Soccer.app`. If this file is in the same folder as the notebook, then the call below should appear as follows:
```
env = DualParallelAgentsUnityEnvironment(
    name="Soccer",
    target_reward=2.5,
    env_binary_path="Soccer.app")
```
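
To avoid editing the path by hand on each machine, the list above can be turned into a small lookup. This is a minimal sketch, not part of the project: the helper name `soccer_binary_path` and its `base_dir`, `headless`, and `system` arguments are hypothetical, and it assumes the 64-bit builds with the folder names listed above. It relies only on the standard-library `platform` and `os` modules.

```python
import os
import platform

# Relative locations of the 64-bit Soccer builds, copied from the list above.
_SOCCER_BINARIES = {
    "Darwin": "Soccer.app",
    "Windows": os.path.join("Soccer_Windows_x86_64", "Soccer.exe"),
    "Linux": os.path.join("Soccer_Linux", "Soccer.x86_64"),
}
_SOCCER_LINUX_HEADLESS = os.path.join("Soccer_Linux_NoVis", "Soccer.x86_64")


def soccer_binary_path(base_dir=".", headless=False, system=None):
    """Return the expected Soccer binary path for an OS (hypothetical helper).

    `system` defaults to platform.system(), i.e. "Darwin", "Windows" or "Linux".
    """
    system = system or platform.system()
    if system not in _SOCCER_BINARIES:
        raise ValueError(f"No Soccer build listed for platform {system!r}")
    if system == "Linux" and headless:
        relative = _SOCCER_LINUX_HEADLESS
    else:
        relative = _SOCCER_BINARIES[system]
    return os.path.join(base_dir, relative)
```

The result could then be passed as `env_binary_path`, for example `soccer_binary_path("../environments")`, provided the builds were unpacked under that folder.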

In [None]:
from lib.env.DualParallelAgentsUnityEnvironment import DualParallelAgentsUnityEnvironment
env = DualParallelAgentsUnityEnvironment(
    name="Soccer",
    target_reward=2.5,
    env_binary_path='../environments/Soccer_Windows_x86_64/Soccer.exe')

### 2. Watch Trained MADDPG Agents

In [None]:
from lib.models.policy.DeterministicDiscretePolicy import DeterministicDiscretePolicy
from lib.models.function.StateActionValueFunction import StateActionValueFunction
from lib.agent.ddpg.DiscreteMADDPGRLAgent import DiscreteMADDPGRLAgent

In [None]:
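# Factory for each agent's actor: maps one agent's state to its action outputs,
# squashed into [-1, 1] by tanh.
policy = lambda s, a: DeterministicDiscretePolicy(
    state_size=s, action_size=a,
    seed=1, output_transform=lambda x: torch.tanh(x))
# Factory for the critic: its sizes span the states and actions of all agents,
# so the `s` and `a` arguments are ignored here.
value_function = lambda s, a: DeterministicDiscretePolicy(
    state_size=env.state_size * env.num_agents,
    action_size=env.action_size * env.num_agents,
    seed=1)

agent = DiscreteMADDPGRLAgent(
    get_actor=policy, get_critic=value_function,
    state_size=env.state_size, action_size=env.action_size, n_agents=env.num_agents, seed=1)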
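
In the cell above, the critic's sizes are multiplied by `env.num_agents`, which suggests a centralized critic that scores the joint state and joint action of all agents, as is usual in MADDPG. The sketch below shows one way such an input could be assembled, assuming per-agent arrays are simply flattened and concatenated. The helper `joint_critic_input` is hypothetical and not part of `lib`, and the sizes are made up for illustration; the real values come from `env.state_size`, `env.action_size`, and `env.num_agents`.

```python
import numpy as np


def joint_critic_input(states, actions):
    """Flatten per-agent states and actions into one critic input vector.

    states:  array-like of shape (num_agents, state_size)
    actions: array-like of shape (num_agents, action_size)
    """
    states = np.asarray(states, dtype=np.float32)
    actions = np.asarray(actions, dtype=np.float32)
    return np.concatenate([states.reshape(-1), actions.reshape(-1)])


# Illustrative sizes only (assumption, not read from the Soccer environment).
num_agents, state_size, action_size = 4, 6, 3
x = joint_critic_input(np.zeros((num_agents, state_size)),
                       np.ones((num_agents, action_size)))
print(x.shape)  # (36,) = 4 * 6 state values + 4 * 3 action values
```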

In [None]:
agent.load("Soccer-MADDPG")

In [None]:
for i in range(2):                                         # play game for 2 episodes
    states = env.reset()                                   # start a new episode
    trajectory_scores = np.zeros(env.num_agents)           # cumulative reward per agent
    while True:
        # select actions and send to environment
        pred = agent.act(states)
        next_states, rewards, dones = env.act(pred["actions"])

        states = next_states
        trajectory_scores = trajectory_scores + rewards

        # exit loop if episode finished for any agent
        if np.any(dones):
            break
    print(f'Scores from episode {i + 1}: {trajectory_scores}')
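
The same rollout logic can be checked without launching Unity by running it against stand-ins. The sketch below is only an illustration: `StubEnv`, `StubAgent`, and `run_episode` are hypothetical names, and the interface they mimic (`reset()`, `act(actions)` returning `(next_states, rewards, dones)`, a `num_agents` attribute, and an `"actions"` key in the agent's output) is assumed from the cell above rather than taken from the `lib` source.

```python
import numpy as np


class StubEnv:
    """Stand-in environment that ends every episode after `horizon` steps."""

    def __init__(self, num_agents=4, state_size=3, horizon=5):
        self.num_agents = num_agents
        self.state_size = state_size
        self.horizon = horizon
        self._t = 0

    def reset(self):
        self._t = 0
        return np.zeros((self.num_agents, self.state_size))

    def act(self, actions):
        self._t += 1
        next_states = np.full((self.num_agents, self.state_size), float(self._t))
        rewards = np.ones(self.num_agents)  # +1 per agent per step
        dones = [self._t >= self.horizon] * self.num_agents
        return next_states, rewards, dones


class StubAgent:
    """Stand-in agent that always picks action 0 for every player."""

    def act(self, states):
        return {"actions": np.zeros(len(states), dtype=int)}


def run_episode(env, agent):
    """Play one episode and return the per-agent cumulative rewards."""
    states = env.reset()
    scores = np.zeros(env.num_agents)
    while True:
        pred = agent.act(states)
        states, rewards, dones = env.act(pred["actions"])
        scores += rewards
        if np.any(dones):
            return scores


scores = run_episode(StubEnv(), StubAgent())
print(scores)
```

With the default `horizon=5` and a reward of 1 per step, every stub agent ends the episode with a score of 5, which makes it easy to confirm that rewards are accumulated and the loop terminates.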

When finished, you can close the environment.

In [None]:
env.close()