For basic information about the Overcooked game, see the [project introduction](https://github.com/HumanCompatibleAI/overcooked_ai#introduction).
# Introduction to the agent evaluator
The agent evaluator is an object used to evaluate different agents.

In [2]:
from overcooked_ai_py.agents.benchmarking import AgentEvaluator
agent_eval = AgentEvaluator({"layout_name": "cramped_room"}, {"horizon": 100})

The central method of the AgentEvaluator object is evaluate_agent_pair, which runs two agents on the chosen layout.
Other methods call evaluate_agent_pair with pre-built agents. Let's run two of them and compare the results.

In [4]:
# Random agents take random actions
trajectory_random_pair = agent_eval.evaluate_random_pair(num_games=2, display=False)
print("Random pair rewards", trajectory_random_pair["ep_returns"])
# Human model agents act greedily, without much cooperation
trajectory_human_pair = agent_eval.evaluate_human_model_pair(num_games=2, display=False)
print("Greedy human pair rewards", trajectory_human_pair["ep_returns"])

Avg rew: 0.00 (std: 0.00, se: 0.00); avg len: 100.00; : 100%|██████████| 2/2 [00:00<00:00, 34.75it/s]
Avg rew: 40.00 (std: 0.00, se: 0.00); avg len: 100.00; : 100%|██████████| 2/2 [00:00<00:00, 29.22it/s]

Random pair rewards [0 0]
Greedy human pair rewards [40 40]
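
The progress-bar lines above summarize the episode returns as a mean, a standard deviation, and a standard error. As a rough sketch of how such a summary could be recomputed by hand, the snippet below uses only the standard library. The helper name `summarize_returns` is hypothetical, and it assumes the standard error is the population standard deviation divided by the square root of the number of games, which may differ from what the library computes internally.

```python
import math
import statistics

def summarize_returns(ep_returns):
    """Return (mean, std, se) for a sequence of episode returns (hypothetical helper)."""
    n = len(ep_returns)
    mean = statistics.fmean(ep_returns)
    # Assumption: population standard deviation, and se = std / sqrt(n).
    std = statistics.pstdev(ep_returns)
    se = std / math.sqrt(n)
    return mean, std, se

# Returns copied from the output above
for name, returns in [("Random pair", [0, 0]), ("Greedy human pair", [40, 40])]:
    mean, std, se = summarize_returns(returns)
    print(f"{name}: avg rew {mean:.2f} (std: {std:.2f}, se: {se:.2f})")
```

With only two identical games per pair the spread is zero, so a larger num_games is needed before the std and se values say anything useful.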





Besides looking at the reward, we can dig deeper into what happened during the agent interactions.
The first way is to step through the whole trajectory using an interactive display method. 
It works only in IPython notebooks and shows an ASCII-based trajectory slider.  
The meaning of each character is explained in the "Custom layouts" section.  
For better-looking visualizations, check out [overcooked-demo](https://github.com/HumanCompatibleAI/overcooked-demo).

In [5]:
agent_eval.interactive_from_traj(trajectory_human_pair, traj_idx=0)

Output()

IntSlider(value=0, max=99)

Another way to inspect a trajectory is a graph that visualizes which items are picked up, dropped, and held over time.  

In [6]:
# set ipython=False to open graph in default browser
agent_eval.events_visualization(trajectory_human_pair, traj_index=0, ipython=True, chart_settings={"add_cumulative_data": False})

Hover the mouse over an event to highlight all interactions with the associated object.
chart_settings is an object that configures the JavaScript charting code. If add_cumulative_data is not set to False, we can see which agent performs most of the object interactions: the lines show each player's cumulative interaction count. Changing the held item (e.g. swapping a held empty dish for a soup) is currently implemented as 2 actions.

In [7]:
agent_eval.events_visualization(trajectory_human_pair, traj_index=0, ipython=True)
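
To make the cumulative lines concrete, here is a small sketch of how a running per-player interaction count could be derived from a list of events. The event format (dicts with `timestep` and `player` keys) and the helper name `cumulative_interactions` are assumptions made for illustration; they are not the data structure used by the charting code.

```python
from collections import defaultdict

def cumulative_interactions(events, num_players=2):
    """Map each player index to a list of (timestep, running_count) pairs.

    `events` is a hypothetical list of dicts with "timestep" and "player" keys.
    """
    counts = defaultdict(int)
    series = {player: [] for player in range(num_players)}
    for event in sorted(events, key=lambda e: e["timestep"]):
        player = event["player"]
        counts[player] += 1
        series[player].append((event["timestep"], counts[player]))
    return series

# Made-up events: player 0 interacts three times, player 1 once.
example_events = [
    {"timestep": 3, "player": 0},
    {"timestep": 7, "player": 1},
    {"timestep": 9, "player": 0},
    {"timestep": 15, "player": 0},
]
series = cumulative_interactions(example_events)
print(series)
```

Plotting each player's pairs as a step line gives the kind of curve the chart draws: the player whose line ends highest handled the most objects.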

## Custom layouts
Besides the premade layouts found in the [layout directory](https://github.com/HumanCompatibleAI/overcooked_ai/tree/master/overcooked_ai_py/data/layouts), you can create your own layouts to run agents on. Let's first look at an example layout:

In [8]:
corner_circuit_grid = {
    "grid":  """XXXPPXXX
                X  2   X
                D XXXX S
                X  1   X
                XXXOOXXX""",
    "start_order_list": None,
    "cook_time": 20,
    "num_items_for_soup": 3,
    "delivery_reward": 20,
    "rew_shaping_params": None
}

The layout terrain is defined by grid. Every character is one tile. The available tiles are:
- empty space - ' '
- counter - 'X'
- onion dispenser - 'O'
- tomato dispenser - 'T'
- pot (place where players cook soup from onions and tomatoes) - 'P'
- dish dispenser - 'D'
- serving location - 'S'
- player starting location - the player's number (e.g. '1', '2')  
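
The tile legend above can be turned into a quick sanity check before saving a layout. The following is a minimal sketch, not the library's own parser: `parse_grid` and `VALID_TILES` are hypothetical names, and it assumes that rows are separated by newlines and that the indentation around each row is not part of the layout.

```python
VALID_TILES = set(" XOTPDS")  # tiles from the legend above; digits mark player starts

def parse_grid(grid_str):
    """Split a layout string into rows and check it against the tile legend."""
    # Assumption: whitespace around each row is indentation, not empty tiles.
    rows = [row.strip() for row in grid_str.strip().split("\n")]
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all rows must have the same length")
    players = []
    for y, row in enumerate(rows):
        for x, tile in enumerate(row):
            if tile.isdigit():
                players.append((tile, (x, y)))
            elif tile not in VALID_TILES:
                raise ValueError(f"unknown tile {tile!r} at {(x, y)}")
    return rows, sorted(players)

rows, players = parse_grid("""XXXPPXXX
                              X  2   X
                              D XXXX S
                              X  1   X
                              XXXOOXXX""")
print(len(rows), "rows of width", len(rows[0]))
print("player starts:", players)
```

Because rows are stripped, this check only works for layouts whose rows start and end with a non-space tile, as in the example grid.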
  
You can save a layout in the overcooked_ai/overcooked_ai_py/data/layouts directory and then create the agent evaluator with AgentEvaluator({"layout_name": layout_name}), where layout_name is the filename without the ".layout" extension.  
You can also generate random but valid grids automatically. Let's create one and run agents on it.

In [9]:

mdp_params = {}
mdp_fn_params = {"size_bounds": ((4,7), (4,7)), # (min_layout_size, max_layout_size)
                "prop_empty":(0.6, 0.8), # (min, max) proportion of empty space in generated layout
                "prop_feats":(0.1, 0.2)} # (min, max) proportion of counters with features on them
env_params =  {"horizon": 100}
agent_eval = AgentEvaluator(mdp_params, env_params, mdp_fn_params=mdp_fn_params)


trajectory_random_pair = agent_eval.evaluate_random_pair(num_games=2, display=False)
print("Random pair rewards", trajectory_random_pair["ep_returns"])

def print_grid(grid):
    for line in grid:
        print("".join(line))
        
for i, params in enumerate(trajectory_random_pair["mdp_params"]):
    if i:
        print("")
    print("Grid number %d:" % i)
    print_grid(params["terrain"])

Avg rew: 0.00 (std: 0.00, se: 0.00); avg len: 100.00; : 100%|██████████| 2/2 [00:00<00:00, 35.85it/s]

Skipping trajectory consistency checking because MDP was recognized as variable. Trajectory consistency checking is not yet supported for variable MDPs.
Random pair rewards [0 0]
Grid number 0:
XXOXXXX
X  XXXX
X PXXXX
S  DXXX
XX XXXX
X  XXXX
XXXXXXX

Grid number 1:
XXXXXXX
XX    X
X P   X
X   DSX
X     X
XXXXOXX
XXXXXXX





## Custom agents
We can also run our own custom agents to see how they perform. Let's re-create an agent that takes random actions on our own.

In [10]:
import numpy as np
from overcooked_ai_py.mdp.actions import Action, Direction
from overcooked_ai_py.agents.agent import Agent, AgentPair

class CustomRandomAgent(Agent):
    """
    An agent that picks uniformly at random from Action.ALL_ACTIONS.
    NOTE: Because all actions are sampled, this includes the interact action
    """   
    def action(self, state):
        action_probs = np.zeros(Action.NUM_ACTIONS)
        legal_actions = Action.ALL_ACTIONS
        legal_actions_indices = np.array([Action.ACTION_TO_INDEX[motion_a] for motion_a in legal_actions])
        action_probs[legal_actions_indices] = 1 / len(legal_actions_indices)
        return Action.sample(action_probs), {"action_probs": action_probs}

    def actions(self, states, agent_indices):
        return [self.action(state) for state in states]


agent_pair = AgentPair(CustomRandomAgent(), CustomRandomAgent())
agent_eval = AgentEvaluator({"layout_name": "cramped_room"}, {"horizon": 100})
trajectory_custom_random_pair = agent_eval.evaluate_agent_pair(agent_pair, num_games=2, display=False)
agent_eval.events_visualization(trajectory_custom_random_pair, traj_index=0, ipython=True, chart_settings={"add_cumulative_data": False})

Avg rew: 0.00 (std: 0.00, se: 0.00); avg len: 100.00; : 100%|██████████| 2/2 [00:00<00:00, 37.39it/s]


CustomRandomAgent is a lightweight version of RandomAgent from the overcooked_ai_py.agents.agent module. Calling agent_eval.evaluate_agent_pair(agent_pair, num_games=2, display=False) has the same effect as agent_eval.evaluate_random_pair(num_games=2, all_actions=True, display=False).
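
The core of `CustomRandomAgent.action` is building a uniform probability vector and sampling from it. The sketch below reproduces that idea with the standard library only, so it can be run without the game installed. The labels in `ALL_ACTIONS` are placeholders rather than the library's actual `Action` constants, and `random.choices` stands in for `Action.sample`.

```python
import random

# Placeholder action labels (assumption); the real values live in Action.ALL_ACTIONS.
ALL_ACTIONS = ["north", "south", "east", "west", "stay", "interact"]

def uniform_action(rng=random):
    """Sample one action uniformly and return it with the probabilities used."""
    action_probs = [1 / len(ALL_ACTIONS)] * len(ALL_ACTIONS)
    action = rng.choices(ALL_ACTIONS, weights=action_probs, k=1)[0]
    return action, {"action_probs": action_probs}

# A seeded generator makes the sketch reproducible between runs.
action, info = uniform_action(random.Random(0))
print(action, info["action_probs"])
```

Returning the probabilities alongside the action mirrors the (action, info) pair that the custom agent returns above.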