# Collaboration and Competition

---

You are welcome to use this coding environment to train your agent for the project.  Follow the instructions below to get started!

### 1. Start the Environment

Run the next code cell to install a few packages.  This line will take a few minutes to run!

In [1]:
!pip -q install ./python

tensorflow 1.7.1 has requirement numpy>=1.13.3, but you'll have numpy 1.12.1 which is incompatible.
ipython 6.5.0 has requirement prompt-toolkit<2.0.0,>=1.0.15, but you'll have prompt-toolkit 2.0.9 which is incompatible.


The environment is already saved in the Workspace and can be accessed at the file path provided below. 

In [2]:
#import envs
from buffer import ReplayBuffer
from maddpg import MADDPG
import torch
import numpy as np
import matplotlib.pyplot as plt   # plt is used to plot the scores after training
#from tensorboardX import SummaryWriter
import os
from collections import namedtuple, deque

# keep training awake
from workspace_utils import active_session

# for saving gif
import imageio

In [3]:
from unityagents import UnityEnvironment
import numpy as np

env = UnityEnvironment(file_name="/data/Tennis_Linux_NoVis/Tennis")

INFO:unityagents:
'Academy' started successfully!
Unity Academy name: Academy
        Number of Brains: 1
        Number of External Brains : 1
        Lesson number : 0
        Reset Parameters :
		
Unity brain name: TennisBrain
        Number of Visual Observations (per agent): 0
        Vector Observation space type: continuous
        Vector Observation space size (per agent): 8
        Number of stacked Vector Observation: 3
        Vector Action space type: continuous
        Vector Action space size (per agent): 2
        Vector Action descriptions: , 


Environments contain **_brains_** which are responsible for deciding the actions of their associated agents. Here we check for the first brain available, and set it as the default brain we will be controlling from Python.

In [4]:
# get the default brain
brain_name = env.brain_names[0]
brain = env.brains[brain_name]

### 2. Examine the State and Action Spaces

Run the code cell below to print some information about the environment.

In [5]:
# reset the environment
env_info = env.reset(train_mode=True)[brain_name]

# number of agents 
num_agents = len(env_info.agents)
print('Number of agents:', num_agents)

# size of each action
action_size = brain.vector_action_space_size
print('Size of each action:', action_size)

# examine the state space 
states = env_info.vector_observations
state_size = states.shape[1]
print('There are {} agents. Each observes a state with length: {}'.format(states.shape[0], state_size))
print('The state for the first agent looks like:', states[0])

Number of agents: 2
Size of each action: 2
There are 2 agents. Each observes a state with length: 24
The state for the first agent looks like: [ 0.          0.          0.          0.          0.          0.          0.
  0.          0.          0.          0.          0.          0.          0.
  0.          0.         -6.65278625 -1.5        -0.          0.
  6.83172083  6.         -0.          0.        ]
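
The length of 24 follows from the brain summary above: each agent receives 8 observation values, and 3 of these observations are stacked into one state vector. The sketch below splits the printed state back into its frames. It assumes the frames are stored as contiguous blocks of 8 values with the most recent one last, which is consistent with the first 16 entries being zero right after a reset but is not confirmed by the environment output itself.

```python
import numpy as np

# State of the first agent, copied from the output above.
state = np.array([
    0., 0., 0., 0., 0., 0., 0., 0.,
    0., 0., 0., 0., 0., 0., 0., 0.,
    -6.65278625, -1.5, -0., 0., 6.83172083, 6., -0., 0.,
])

OBS_SIZE = 8      # vector observation size per agent (from the brain summary)
NUM_STACKED = 3   # number of stacked vector observations (from the brain summary)

# Assumption: frames are contiguous blocks of OBS_SIZE values.
frames = state.reshape(NUM_STACKED, OBS_SIZE)

for i, frame in enumerate(frames):
    print('frame', i, frame)
```

Only the last block is non-zero here because the environment has produced a single observation since the reset.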


### 3. Take Random Actions in the Environment

In the next code cell, you will learn how to use the Python API to control the agent and receive feedback from the environment.

Note that **in this coding environment, you will not be able to watch the agents while they are training**, and you should set `train_mode=True` to restart the environment.

In [6]:
for i in range(5):                                         # play game for 5 episodes
    env_info = env.reset(train_mode=False)[brain_name]     # reset the environment    
    states = env_info.vector_observations                  # get the current state (for each agent)
    scores = np.zeros(num_agents)                          # initialize the score (for each agent)
    while True:
        actions = np.random.randn(num_agents, action_size) # select an action (for each agent)
        actions = np.clip(actions, -1, 1)                  # all actions between -1 and 1
        env_info = env.step(actions)[brain_name]           # send all actions to the environment
        next_states = env_info.vector_observations         # get next state (for each agent)
        rewards = env_info.rewards                         # get reward (for each agent)
        dones = env_info.local_done                        # see if episode finished
        scores += env_info.rewards                         # update the score (for each agent)
        states = next_states                               # roll over states to next time step
        if np.any(dones):                                  # exit loop if episode finished
            break
    print('Total score (averaged over agents) this episode: {}'.format(np.mean(scores)))

Total score (averaged over agents) this episode: -0.004999999888241291
Total score (averaged over agents) this episode: -0.004999999888241291
Total score (averaged over agents) this episode: -0.004999999888241291
Total score (averaged over agents) this episode: -0.004999999888241291
Total score (averaged over agents) this episode: -0.004999999888241291
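
The random policy above draws each action component from a standard normal distribution and clips it to the range [-1, 1]. As a minimal sketch of that sampling step in isolation, the helper below (`sample_random_actions` is a hypothetical name, not part of `unityagents`) returns one clipped action per agent. It needs no running environment.

```python
import numpy as np

def sample_random_actions(num_agents, action_size, rng):
    """Draw standard-normal actions and clip them to [-1, 1]."""
    raw = rng.randn(num_agents, action_size)
    return np.clip(raw, -1.0, 1.0)

rng = np.random.RandomState(0)             # fixed seed so the draw is repeatable
actions = sample_random_actions(2, 2, rng)  # 2 agents, 2 action components each
print(actions.shape)
print(actions.min() >= -1.0, actions.max() <= 1.0)
```

Because roughly a third of standard-normal samples fall outside [-1, 1], clipping places a noticeable share of the actions exactly on the bounds.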


### 4. Train the agent

In [7]:
# instantiate maddpg agent
agent = MADDPG(action_size=action_size, n_agents=num_agents, seed=0)

In [8]:
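def maddpg_train(n_episodes=5000, train=True):
    """Run MADDPG on the Tennis environment and return the rolling average scores.

    The score of an episode is the larger of the two agents' summed rewards.
    Training stops early once the average of that score over the last
    100 episodes reaches 0.5.
    """
    scores_deque = deque(maxlen=100)   # scores of the most recent 100 episodes
    scores_all = []                    # rolling average recorded after every episode

    for i_episode in range(1, n_episodes+1):
        total = []                     # per-step rewards: one row per step, one column per agent
        env_info = env.reset(train_mode=train)[brain_name]
        state = env_info.vector_observations

        while True:
            action = agent.act(state, add_noise=train)             # one action per agent
            env_info = env.step(action)[brain_name]                # send all actions to the environment
            next_state = env_info.vector_observations              # next state (for each agent)
            rewards = env_info.rewards                             # reward (for each agent)
            done = env_info.local_done                             # episode-finished flag (for each agent)
            agent.step(state, action, rewards, next_state, done)   # pass the transition to the agent
            state = next_state
            total.append(rewards)
            if any(done):
                break

        # episode score: sum the rewards per agent, then take the better agent
        scores_deque.append(np.max(np.sum(np.array(total), axis=0)))
        avg_score = np.mean(scores_deque)   # rolling average over up to 100 episodes
        scores_all.append(avg_score)

        # note: the value printed here is the rolling average, not the single-episode score
        print('\rEpisode number {}\tAverage Score for the episode: {:.2f}'.format(i_episode, avg_score))

        # log average score every 100 episodes
        if i_episode % 100 == 0:
            print('\rEpisode number {}\tAverage Score for last 100 episodes: {:.2f}'.format(i_episode, avg_score))

        # break and report success if environment is solved
        if avg_score >= 0.5 and train:
            print('\nSolved in {:d} episodes\tAverage Score: {:.2f}'.format(i_episode, np.mean(scores_deque)))
            agent.save_agents()
            break

    return scores_all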

with active_session():
    scores_all = maddpg_train()

fig = plt.figure()
ax = fig.add_subplot(111)
plt.plot(np.arange(1, len(scores_all)+1), scores_all)
plt.ylabel('Score')
plt.xlabel('Episode #')
plt.show()

Episode number 1	Average Score for the episode: 0.00
Episode number 2	Average Score for the episode: 0.00
Episode number 3	Average Score for the episode: 0.00
Episode number 4	Average Score for the episode: 0.03
Episode number 5	Average Score for the episode: 0.02
Episode number 6	Average Score for the episode: 0.02
Episode number 7	Average Score for the episode: 0.01
Episode number 8	Average Score for the episode: 0.01
Episode number 9	Average Score for the episode: 0.01
Episode number 10	Average Score for the episode: 0.01
Episode number 11	Average Score for the episode: 0.01
Episode number 12	Average Score for the episode: 0.01
Episode number 13	Average Score for the episode: 0.01
Episode number 14	Average Score for the episode: 0.01
Episode number 15	Average Score for the episode: 0.01
Episode number 16	Average Score for the episode: 0.01
Episode number 17	Average Score for the episode: 0.01
Episode number 18	Average Score for the episode: 0.01
[...]

Episode number 151	Average Score for the episode: 0.00
Episode number 152	Average Score for the episode: 0.00
Episode number 153	Average Score for the episode: 0.00
Episode number 154	Average Score for the episode: 0.00
Episode number 155	Average Score for the episode: 0.00
Episode number 156	Average Score for the episode: 0.00
Episode number 157	Average Score for the episode: 0.00
Episode number 158	Average Score for the episode: 0.00
Episode number 159	Average Score for the episode: 0.00
Episode number 160	Average Score for the episode: 0.00
Episode number 161	Average Score for the episode: 0.00
Episode number 162	Average Score for the episode: 0.00
Episode number 163	Average Score for the episode: 0.00
Episode number 164	Average Score for the episode: 0.00
Episode number 165	Average Score for the episode: 0.00
Episode number 166	Average Score for the episode: 0.00
Episode number 167	Average Score for the episode: 0.00
Episode number 168	Average Score for the episode: 0.00
[...]

Episode number 299	Average Score for the episode: 0.01
Episode number 300	Average Score for the episode: 0.01
Episode number 300	Average Score for last 100 episodes: 0.01
Episode number 301	Average Score for the episode: 0.01
Episode number 302	Average Score for the episode: 0.01
Episode number 303	Average Score for the episode: 0.01
Episode number 304	Average Score for the episode: 0.01
Episode number 305	Average Score for the episode: 0.01
Episode number 306	Average Score for the episode: 0.01
Episode number 307	Average Score for the episode: 0.01
Episode number 308	Average Score for the episode: 0.01
Episode number 309	Average Score for the episode: 0.01
Episode number 310	Average Score for the episode: 0.01
Episode number 311	Average Score for the episode: 0.01
Episode number 312	Average Score for the episode: 0.01
Episode number 313	Average Score for the episode: 0.01
Episode number 314	Average Score for the episode: 0.01
Episode number 315	Average Score for the episode: 0.01
[...]

Episode number 446	Average Score for the episode: 0.01
Episode number 447	Average Score for the episode: 0.01
Episode number 448	Average Score for the episode: 0.00
Episode number 449	Average Score for the episode: 0.00
Episode number 450	Average Score for the episode: 0.00
Episode number 451	Average Score for the episode: 0.00
Episode number 452	Average Score for the episode: 0.00
Episode number 453	Average Score for the episode: 0.00
Episode number 454	Average Score for the episode: 0.00
Episode number 455	Average Score for the episode: 0.00
Episode number 456	Average Score for the episode: 0.00
Episode number 457	Average Score for the episode: 0.00
Episode number 458	Average Score for the episode: 0.00
Episode number 459	Average Score for the episode: 0.00
Episode number 460	Average Score for the episode: 0.00
Episode number 461	Average Score for the episode: 0.00
Episode number 462	Average Score for the episode: 0.00
Episode number 463	Average Score for the episode: 0.00
[...]

Episode number 595	Average Score for the episode: 0.00
Episode number 596	Average Score for the episode: 0.00
Episode number 597	Average Score for the episode: 0.00
Episode number 598	Average Score for the episode: 0.00
Episode number 599	Average Score for the episode: 0.00
Episode number 600	Average Score for the episode: 0.00
Episode number 600	Average Score for last 100 episodes: 0.00
Episode number 601	Average Score for the episode: 0.00
Episode number 602	Average Score for the episode: 0.00
Episode number 603	Average Score for the episode: 0.00
Episode number 604	Average Score for the episode: 0.00
Episode number 605	Average Score for the episode: 0.00
Episode number 606	Average Score for the episode: 0.00
Episode number 607	Average Score for the episode: 0.00
Episode number 608	Average Score for the episode: 0.00
Episode number 609	Average Score for the episode: 0.00
Episode number 610	Average Score for the episode: 0.00
Episode number 611	Average Score for the episode: 0.00
[...]

Episode number 742	Average Score for the episode: 0.00
Episode number 743	Average Score for the episode: 0.00
Episode number 744	Average Score for the episode: 0.00
Episode number 745	Average Score for the episode: 0.00
Episode number 746	Average Score for the episode: 0.00
Episode number 747	Average Score for the episode: 0.00
Episode number 748	Average Score for the episode: 0.00
Episode number 749	Average Score for the episode: 0.00
Episode number 750	Average Score for the episode: 0.00
Episode number 751	Average Score for the episode: 0.00
Episode number 752	Average Score for the episode: 0.00
Episode number 753	Average Score for the episode: 0.00
Episode number 754	Average Score for the episode: 0.00
Episode number 755	Average Score for the episode: 0.00
Episode number 756	Average Score for the episode: 0.00
Episode number 757	Average Score for the episode: 0.00
Episode number 758	Average Score for the episode: 0.00
Episode number 759	Average Score for the episode: 0.00
[...]

Episode number 891	Average Score for the episode: 0.00
Episode number 892	Average Score for the episode: 0.00
Episode number 893	Average Score for the episode: 0.00
Episode number 894	Average Score for the episode: 0.00
Episode number 895	Average Score for the episode: 0.00
Episode number 896	Average Score for the episode: 0.00
Episode number 897	Average Score for the episode: 0.00
Episode number 898	Average Score for the episode: 0.00
Episode number 899	Average Score for the episode: 0.00
Episode number 900	Average Score for the episode: 0.00
Episode number 900	Average Score for last 100 episodes: 0.00
Episode number 901	Average Score for the episode: 0.00
Episode number 902	Average Score for the episode: 0.00
Episode number 903	Average Score for the episode: 0.00
Episode number 904	Average Score for the episode: 0.00
Episode number 905	Average Score for the episode: 0.00
Episode number 906	Average Score for the episode: 0.00
Episode number 907	Average Score for the episode: 0.00
[...]

Episode number 1039	Average Score for the episode: 0.00
Episode number 1040	Average Score for the episode: 0.00
Episode number 1041	Average Score for the episode: 0.00
Episode number 1042	Average Score for the episode: 0.00
Episode number 1043	Average Score for the episode: 0.00
Episode number 1044	Average Score for the episode: 0.00
Episode number 1045	Average Score for the episode: 0.00
Episode number 1046	Average Score for the episode: 0.00
Episode number 1047	Average Score for the episode: 0.00
Episode number 1048	Average Score for the episode: 0.00
Episode number 1049	Average Score for the episode: 0.00
Episode number 1050	Average Score for the episode: 0.00
Episode number 1051	Average Score for the episode: 0.00
Episode number 1052	Average Score for the episode: 0.00
Episode number 1053	Average Score for the episode: 0.00
Episode number 1054	Average Score for the episode: 0.00
Episode number 1055	Average Score for the episode: 0.00
[...]

Episode number 1185	Average Score for the episode: 0.00
Episode number 1186	Average Score for the episode: 0.00
Episode number 1187	Average Score for the episode: 0.00
Episode number 1188	Average Score for the episode: 0.00
Episode number 1189	Average Score for the episode: 0.00
Episode number 1190	Average Score for the episode: 0.00
Episode number 1191	Average Score for the episode: 0.00
Episode number 1192	Average Score for the episode: 0.00
Episode number 1193	Average Score for the episode: 0.00
Episode number 1194	Average Score for the episode: 0.00
Episode number 1195	Average Score for the episode: 0.00
Episode number 1196	Average Score for the episode: 0.00
Episode number 1197	Average Score for the episode: 0.00
Episode number 1198	Average Score for the episode: 0.00
Episode number 1199	Average Score for the episode: 0.00
Episode number 1200	Average Score for the episode: 0.00
Episode number 1200	Average Score for last 100 episodes: 0.00
[...]

Episode number 1330	Average Score for the episode: 0.01
Episode number 1331	Average Score for the episode: 0.01
Episode number 1332	Average Score for the episode: 0.01
Episode number 1333	Average Score for the episode: 0.01
Episode number 1334	Average Score for the episode: 0.02
Episode number 1335	Average Score for the episode: 0.02
Episode number 1336	Average Score for the episode: 0.02
Episode number 1337	Average Score for the episode: 0.02
Episode number 1338	Average Score for the episode: 0.02
Episode number 1339	Average Score for the episode: 0.02
Episode number 1340	Average Score for the episode: 0.02
Episode number 1341	Average Score for the episode: 0.02
Episode number 1342	Average Score for the episode: 0.02
Episode number 1343	Average Score for the episode: 0.02
Episode number 1344	Average Score for the episode: 0.02
Episode number 1345	Average Score for the episode: 0.02
Episode number 1346	Average Score for the episode: 0.02
[...]

Episode number 1476	Average Score for the episode: 0.04
Episode number 1477	Average Score for the episode: 0.04
Episode number 1478	Average Score for the episode: 0.04
Episode number 1479	Average Score for the episode: 0.04
Episode number 1480	Average Score for the episode: 0.04
Episode number 1481	Average Score for the episode: 0.04
Episode number 1482	Average Score for the episode: 0.04
Episode number 1483	Average Score for the episode: 0.04
Episode number 1484	Average Score for the episode: 0.04
Episode number 1485	Average Score for the episode: 0.04
Episode number 1486	Average Score for the episode: 0.04
Episode number 1487	Average Score for the episode: 0.04
Episode number 1488	Average Score for the episode: 0.04
Episode number 1489	Average Score for the episode: 0.04
Episode number 1490	Average Score for the episode: 0.03
Episode number 1491	Average Score for the episode: 0.04
Episode number 1492	Average Score for the episode: 0.03
[...]

Episode number 1621	Average Score for the episode: 0.01
Episode number 1622	Average Score for the episode: 0.01
Episode number 1623	Average Score for the episode: 0.01
Episode number 1624	Average Score for the episode: 0.01
Episode number 1625	Average Score for the episode: 0.01
Episode number 1626	Average Score for the episode: 0.01
Episode number 1627	Average Score for the episode: 0.01
Episode number 1628	Average Score for the episode: 0.01
Episode number 1629	Average Score for the episode: 0.01
Episode number 1630	Average Score for the episode: 0.01
Episode number 1631	Average Score for the episode: 0.01
Episode number 1632	Average Score for the episode: 0.01
Episode number 1633	Average Score for the episode: 0.01
Episode number 1634	Average Score for the episode: 0.01
Episode number 1635	Average Score for the episode: 0.01
Episode number 1636	Average Score for the episode: 0.01
Episode number 1637	Average Score for the episode: 0.01
[...]

Episode number 1767	Average Score for the episode: 0.01
Episode number 1768	Average Score for the episode: 0.01
Episode number 1769	Average Score for the episode: 0.01
Episode number 1770	Average Score for the episode: 0.01
Episode number 1771	Average Score for the episode: 0.01
Episode number 1772	Average Score for the episode: 0.01
Episode number 1773	Average Score for the episode: 0.01
Episode number 1774	Average Score for the episode: 0.01
Episode number 1775	Average Score for the episode: 0.01
Episode number 1776	Average Score for the episode: 0.01
Episode number 1777	Average Score for the episode: 0.01
Episode number 1778	Average Score for the episode: 0.01
Episode number 1779	Average Score for the episode: 0.01
Episode number 1780	Average Score for the episode: 0.01
Episode number 1781	Average Score for the episode: 0.01
Episode number 1782	Average Score for the episode: 0.01
Episode number 1783	Average Score for the episode: 0.01
[...]

Episode number 1913	Average Score for the episode: 0.00
Episode number 1914	Average Score for the episode: 0.00
Episode number 1915	Average Score for the episode: 0.00
Episode number 1916	Average Score for the episode: 0.00
Episode number 1917	Average Score for the episode: 0.00
Episode number 1918	Average Score for the episode: 0.00
Episode number 1919	Average Score for the episode: 0.00
Episode number 1920	Average Score for the episode: 0.00
Episode number 1921	Average Score for the episode: 0.00
Episode number 1922	Average Score for the episode: 0.00
Episode number 1923	Average Score for the episode: 0.00
Episode number 1924	Average Score for the episode: 0.00
Episode number 1925	Average Score for the episode: 0.01
Episode number 1926	Average Score for the episode: 0.01
Episode number 1927	Average Score for the episode: 0.01
Episode number 1928	Average Score for the episode: 0.01
Episode number 1929	Average Score for the episode: 0.01
[...]

Episode number 2060	Average Score for the episode: 0.02
Episode number 2061	Average Score for the episode: 0.01
Episode number 2062	Average Score for the episode: 0.01
Episode number 2063	Average Score for the episode: 0.01
Episode number 2064	Average Score for the episode: 0.01
Episode number 2065	Average Score for the episode: 0.01
Episode number 2066	Average Score for the episode: 0.01
Episode number 2067	Average Score for the episode: 0.01
Episode number 2068	Average Score for the episode: 0.01
Episode number 2069	Average Score for the episode: 0.01
Episode number 2070	Average Score for the episode: 0.01
Episode number 2071	Average Score for the episode: 0.01
Episode number 2072	Average Score for the episode: 0.01
Episode number 2073	Average Score for the episode: 0.01
Episode number 2074	Average Score for the episode: 0.01
Episode number 2075	Average Score for the episode: 0.01
Episode number 2076	Average Score for the episode: 0.01
[...]

Episode number 2205	Average Score for the episode: 0.02
Episode number 2206	Average Score for the episode: 0.02
Episode number 2207	Average Score for the episode: 0.02
Episode number 2208	Average Score for the episode: 0.02
Episode number 2209	Average Score for the episode: 0.02
Episode number 2210	Average Score for the episode: 0.02
Episode number 2211	Average Score for the episode: 0.02
Episode number 2212	Average Score for the episode: 0.02
Episode number 2213	Average Score for the episode: 0.02
Episode number 2214	Average Score for the episode: 0.02
Episode number 2215	Average Score for the episode: 0.02
Episode number 2216	Average Score for the episode: 0.02
Episode number 2217	Average Score for the episode: 0.02
Episode number 2218	Average Score for the episode: 0.02
Episode number 2219	Average Score for the episode: 0.02
Episode number 2220	Average Score for the episode: 0.02
Episode number 2221	Average Score for the episode: 0.02
[...]

Episode number 2351	Average Score for the episode: 0.02
Episode number 2352	Average Score for the episode: 0.02
Episode number 2353	Average Score for the episode: 0.02
Episode number 2354	Average Score for the episode: 0.02
Episode number 2355	Average Score for the episode: 0.02
Episode number 2356	Average Score for the episode: 0.02
Episode number 2357	Average Score for the episode: 0.02
Episode number 2358	Average Score for the episode: 0.02
Episode number 2359	Average Score for the episode: 0.02
Episode number 2360	Average Score for the episode: 0.02
Episode number 2361	Average Score for the episode: 0.02
Episode number 2362	Average Score for the episode: 0.02
Episode number 2363	Average Score for the episode: 0.02
Episode number 2364	Average Score for the episode: 0.03
Episode number 2365	Average Score for the episode: 0.03
Episode number 2366	Average Score for the episode: 0.03
Episode number 2367	Average Score for the episode: 0.03
[...]

KeyboardInterrupt: 
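
The numbers in the log above are rolling averages, not single-episode scores. The training loop sums each agent's rewards over an episode, keeps the larger of the two sums as the episode score, and averages those scores over a window of the most recent episodes. The sketch below reproduces that bookkeeping on made-up rewards. The helper name `episode_score`, the reward values, and the window length of 3 (the notebook uses 100) are illustrative assumptions.

```python
from collections import deque

import numpy as np

def episode_score(step_rewards):
    """Sum rewards per agent over the episode, then return the larger sum."""
    per_agent = np.sum(np.array(step_rewards), axis=0)
    return float(np.max(per_agent))

# Made-up per-step rewards for two agents (one row per step).
step_rewards = [[0.0, 0.0], [0.1, 0.0], [0.0, -0.01], [0.1, 0.0]]
score = episode_score(step_rewards)
print('episode score:', score)

# Rolling window: once it is full, each append drops the oldest score.
window = deque(maxlen=3)
for s in [0.0, 0.2, 0.6, 0.9]:   # made-up episode scores
    window.append(s)

rolling_avg = float(np.mean(window))   # averages only the last 3 scores
solved = rolling_avg >= 0.5            # same threshold as in maddpg_train
print('scores in window:', list(window))
print('rolling average >= 0.5:', solved)
```

With a window of 100, a few good episodes move the average only slightly, which is why the logged value changes so slowly.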

### 5. Watch the trained agent

In [None]:
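# Load the trained weights. The attribute names and checkpoint file names below
# must match what your MADDPG class and agent.save_agents() actually use.
agent.actor_local.load_state_dict(torch.load('checkpoint_actor.pth'))
agent.critic_local.load_state_dict(torch.load('checkpoint_critic.pth'))

env_info = env.reset(train_mode=False)[brain_name]      # reset the environment
state = env_info.vector_observations                    # get the current state (for each agent)
for t in range(200):
    action = agent.act(state, add_noise=False)          # act without exploration noise
    env_info = env.step(action)[brain_name]             # send all actions to the environment
    state = env_info.vector_observations                # roll over to the next state
    if np.any(env_info.local_done):                     # exit loop if episode finished
        break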

When finished, you can close the environment.

In [7]:
env.close()