## Introduction to overcooked_ai

Overcooked-AI is a benchmark environment for fully cooperative multi-agent performance, based on the wildly popular video game [Overcooked](http://www.ghosttowngames.com/overcooked/). 

The goal of the game is to deliver soups as fast as possible. Each soup requires placing up to 3 ingredients in a pot, waiting for the soup to cook, and then having an agent pick up the soup and deliver it. The agents should split up tasks on the fly and coordinate effectively in order to achieve high reward.

You can **try out the game [here](https://humancompatibleai.github.io/overcooked-demo/)** (playing with some previously trained DRL agents). To play with your own trained agents using this interface, you can use [this repo](https://github.com/HumanCompatibleAI/overcooked-demo). To run human-AI experiments, check out [this repo](https://github.com/HumanCompatibleAI/overcooked-hAI-exp). You can find some human-human gameplay data already collected [here](https://github.com/HumanCompatibleAI/human_aware_rl/tree/master/human_aware_rl/data/human/anonymized).
The agent evaluator is an object used to evaluate different agents.

Check out [this repo](https://github.com/HumanCompatibleAI/human_aware_rl) for the DRL implementations compatible with the environment and for reproducing the results of our paper: *[On the Utility of Learning about Humans for Human-AI Coordination](https://arxiv.org/abs/1910.05789)* (also see our [blog post](https://bair.berkeley.edu/blog/2019/10/21/coordination/)).

## Setup
Run the cell below only if you have not installed overcooked_ai yet (e.g. when using this notebook in Google Colab); it installs the newest version of overcooked_ai from the GitHub repository.

In [6]:
# all imports used in this tutorial, run this if you want to jump to different sections and run only selected cells
import numpy as np
from overcooked_ai_py.mdp.actions import Action, Direction
from overcooked_ai_py.agents.agent import Agent, AgentPair, StayAgent, GreedyAgent
from overcooked_ai_py.agents.benchmarking import AgentEvaluator, LayoutGenerator
from overcooked_ai_py.visualization.state_visualizer import StateVisualizer


pygame 2.0.1 (SDL 2.0.14, Python 3.8.8)
Hello from the pygame community. https://www.pygame.org/contribute.html


## Agent evaluator introduction
The easiest way to start using overcooked_ai is the agent evaluator object, which lets you run agents on the chosen layouts.

In [7]:
from overcooked_ai_py.agents.benchmarking import AgentEvaluator, LayoutGenerator
mdp_gen_params = {"layout_name": 'cramped_room'}
mdp_fn = LayoutGenerator.mdp_gen_fn_from_dict(mdp_gen_params)
env_params = {"horizon": 100}
agent_eval = AgentEvaluator(env_params=env_params, mdp_fn=mdp_fn)

To create an agent evaluator you need to supply 2 parameters: mdp_fn and env_params.
mdp_fn is a function that returns an OvercookedGridworld object, which resolves the interactions of agents with the environment. The quickest way to create a valid mdp_fn is to pass a dict with the layout name to LayoutGenerator.mdp_gen_fn_from_dict. More on the generation of layouts later.
env_params is a dict with additional options. The most important key to supply here is horizon, which sets how many timesteps each episode lasts.

The central method of the AgentEvaluator object is evaluate_agent_pair, which runs 2 agents on the chosen layout. Other methods call evaluate_agent_pair with preexisting agents. Let's run 2 of them and compare the results.

In [8]:
# does random actions
trajectories_random_pair = agent_eval.evaluate_random_pair(num_games=2)
print("Random pair rewards", trajectories_random_pair["ep_returns"])
# Human model agent does greedy actions without much cooperation
# NOTE: requires layout with only one order - 3 onion soup
trajectories_human_pair = agent_eval.evaluate_human_model_pair(num_games=2)
print("Greedy human pair rewards", trajectories_human_pair["ep_returns"])

  return np.random.choice(Action.ALL_ACTIONS, p=action_probs)
  0%|          | 0/2 [00:00<?, ?it/s]

Computing MotionPlanner to be saved in /d/mleguill/Documents/git_repo/overcooked_mlg/overcooked_ai_py/data/planners/cramped_room_mp.pkl
It took 0.049886465072631836 seconds to create mp





TypeError: '>' not supported between instances of 'NoneType' and 'int'

Let's check what happened in both trajectories. We can do this with the `StateVisualizer` object, which lets you visualize states (instances of `OvercookedState`). Let's visualize the starting state. It is the same for every agent type:

In [9]:
from overcooked_ai_py.visualization.state_visualizer import StateVisualizer

grid = agent_eval.env.mdp.terrain_mtx
starting_state_random_pair = trajectories_random_pair["ep_states"][0][0]
print("starting state of random pair")
# StateVisualizer initialization params are used for additional configuration,
#  let's use defaults here by not supplying any kwarg
img_path = StateVisualizer().display_rendered_state(starting_state_random_pair, grid=grid, ipython_display=True)

starting_state_human_pair = trajectories_human_pair["ep_states"][0][0]
print("starting state of greedy human model pair")
img_path = StateVisualizer().display_rendered_state(starting_state_human_pair, grid=grid, ipython_display=True)

NameError: name 'trajectories_random_pair' is not defined

You can also check out whole trajectories by using `display_rendered_trajectory` method:

In [5]:
print("random trajectory")
# grid is taken automatically from trajectories dict so there is no need to supply it here
img_dir_path = StateVisualizer().display_rendered_trajectory(trajectories_random_pair,
                                                             trajectory_idx=0,ipython_display=True)

random trajectory


interactive(children=(IntSlider(value=0, description='timestep', max=99), Output()), _dom_classes=('widget-int…

Let's compare the chaotic actions above with more logical actions of the greedy human model below:

In [6]:
print("greedy human model trajectory")
img_dir_path = StateVisualizer().display_rendered_trajectory(trajectories_human_pair,
                                                             trajectory_idx=0, ipython_display=True)

greedy human model trajectory


interactive(children=(IntSlider(value=0, description='timestep', max=99), Output()), _dom_classes=('widget-int…

## Custom layouts
Besides the premade layouts found in the [layout directory](https://github.com/HumanCompatibleAI/overcooked_ai/tree/master/src/overcooked_ai_py/data/layouts), you can create your own layouts to run agents on. Let's first look at an example layout:
```
{
    "grid":  """XXXPPXXX
                X  2   X
                D XXXX S
                X  1   X
                XXXOOXXX""",
    "start_order_list": None,
    "cook_time": 20,
    "num_items_for_soup": 3,
    "delivery_reward": 20,
    "rew_shaping_params": None
}
```

Layout territory is defined by grid. Every character is one tile. Available tiles are:
- empty space - ' '
- counter - 'X'
- onion dispenser - 'O'
- tomato dispenser - 'T'
- pot (place where players cook soup from onions and tomatoes) - 'P' 
- dish dispenser - 'D'
- serving location - 'S'
- player starting location - number  
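
The tile legend above can be checked mechanically. The following is a minimal sketch that does not use overcooked_ai: `parse_grid` and `TERRAIN_TILES` are hypothetical names introduced only for illustration. It splits a grid string into rows, rejects characters that are not in the legend, and reports the player start tiles as (x, y) = (column, row).

```python
# Hypothetical helper, not part of overcooked_ai: validates a layout grid
# against the tile legend above and locates the player start tiles.
TERRAIN_TILES = {" ", "X", "O", "T", "P", "D", "S"}


def parse_grid(grid_string):
    """Return (rows, player_positions) for a multi-line grid string."""
    # Rows may be indented inside the layout dict, so strip each one.
    # This assumes every row starts and ends with a non-space tile.
    rows = [list(row.strip()) for row in grid_string.strip().split("\n")]
    width = len(rows[0])
    players = {}
    for y, row in enumerate(rows):
        if len(row) != width:
            raise ValueError(f"row {y} has width {len(row)}, expected {width}")
        for x, tile in enumerate(row):
            if tile.isdigit():
                players[int(tile)] = (x, y)  # (column, row)
            elif tile not in TERRAIN_TILES:
                raise ValueError(f"unknown tile {tile!r} at {(x, y)}")
    return rows, players


# The grid from the example layout above, indentation included.
example_grid = """XXXPPXXX
                X  2   X
                D XXXX S
                X  1   X
                XXXOOXXX"""
rows, players = parse_grid(example_grid)
print(len(rows), "rows of width", len(rows[0]))
print("player start positions:", players)
```

Running a check like this before saving a `.layout` file catches ragged rows and stray characters early, before the environment tries to load the grid.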
  
You can save a layout in the overcooked_ai/overcooked_ai_py/data/layouts directory and then create an agent evaluator with AgentEvaluator({"layout_name": layout_name}), where layout_name is the filename without the `.layout` extension.  
You can also generate random but valid grids in an automated way. Let's create one and run agents on it.

In [None]:
mdp_gen_params = {"inner_shape": (7,7), # shape of the layout
                "prop_empty":0.2, # proportion of empty space in generated layout
                "prop_feats":0.8, # proportion of counters with features on them
                "display": False,
                "start_all_orders": # list of recipes that can be delivered
                   [{ "ingredients" : ["onion", "onion", "onion"]},
                    { "ingredients" : ["onion", "onion"]},
                    { "ingredients" : ["onion"]}],
                # (optional param) reward for delivering recipes (for every recipe in start_all_orders)
                "recipe_values" : [20, 9, 4], 
                # (optional param) cooking time of recipes (for every recipe in start_all_orders)
                "recipe_times" : [20, 15, 10]
                 }

env_params =  {"horizon": 100}

mdp_fn = LayoutGenerator.mdp_gen_fn_from_dict(mdp_gen_params, outer_shape=(7, 7))
agent_eval = AgentEvaluator(env_params=env_params, mdp_fn=mdp_fn)

trajectories_random_pair = agent_eval.evaluate_random_pair(num_games=1)
print("Random pair rewards", trajectories_random_pair["ep_returns"])

def pretty_grid(grid):
    return "\n".join("".join(line) for line in grid)

print("\nGenerated grid:\n" + pretty_grid(trajectories_random_pair["mdp_params"][0]["terrain"]))
print("random pair trajectory on generated grid")
img_dir_path = StateVisualizer().display_rendered_trajectory(trajectories_random_pair,
                                                             trajectory_idx=0, ipython_display=True)

  0%|          | 0/1 [00:00<?, ?it/s]

Recomputing motion planner due to: [Errno 2] No such file or directory: '/overcooked_ai_py/data/planners/XXSSPDX|XD    P|XPPX  X|D O2 DX|O 1   X|XDPX  D|XXXXPSX_mp.pkl'
Computing MotionPlanner to be saved in /overcooked_ai_py/data/planners/XXSSPDX|XD    P|XPPX  X|D O2 DX|O 1   X|XDPX  D|XXXXPSX_mp.pkl


Avg rew: 0.00 (std: 0.00, se: 0.00); avg len: 100.00; : 100%|██████████| 1/1 [00:00<00:00,  1.05it/s]

It took 0.9023308753967285 seconds to create mp
Skipping trajectory consistency checking because MDP was recognized as variable. Trajectory consistency checking is not yet supported for variable MDPs.
Random pair rewards [0]

Generated grid:
XXSSPDX
XD    P
XPPX  X
D O  DX
O     X
XDPX  D
XXXXPSX
random pair trajectory on generated grid





interactive(children=(IntSlider(value=0, description='timestep', max=99), Output()), _dom_classes=('widget-int…

## Custom agents
We can also run our own custom agents to see how they work. Let's re-create the agents doing random actions on our own.

In [None]:
import numpy as np
from overcooked_ai_py.mdp.actions import Action, Direction
from overcooked_ai_py.agents.agent import Agent, AgentPair

class CustomRandomAgent(Agent):
    """
    An agent that picks uniformly at random among all actions.
    NOTE: Action.ALL_ACTIONS also contains the interact action, so this agent can interact
    """   
    def action(self, state):
        action_probs = np.zeros(Action.NUM_ACTIONS)
        legal_actions = Action.ALL_ACTIONS
        legal_actions_indices = np.array([Action.ACTION_TO_INDEX[motion_a] for motion_a in legal_actions])
        action_probs[legal_actions_indices] = 1 / len(legal_actions_indices)
        return Action.sample(action_probs), {"action_probs": action_probs}

    def actions(self, states, agent_indices):
        return [self.action(state) for state in states]


agent_pair = AgentPair(CustomRandomAgent(), CustomRandomAgent())
mdp_gen_params = {"layout_name": 'cramped_room'}
mdp_fn = LayoutGenerator.mdp_gen_fn_from_dict(mdp_gen_params)
env_params = {"horizon": 200}
agent_eval = AgentEvaluator(env_params=env_params, mdp_fn=mdp_fn)
trajectories_custom_random_pair = agent_eval.evaluate_agent_pair(agent_pair, num_games=2)
print("Custom random pair rewards", trajectories_custom_random_pair["ep_returns"])

Avg rew: 0.00 (std: 0.00, se: 0.00); avg len: 200.00; : 100%|██████████| 2/2 [00:00<00:00, 13.11it/s]

Skipping trajectory consistency checking because MDP was recognized as variable. Trajectory consistency checking is not yet supported for variable MDPs.
Custom random pair rewards [0 0]





CustomRandomAgent is a lightweight version of RandomAgent from the overcooked_ai_py.agents.agent module. ```agent_eval.evaluate_agent_pair(agent_pair)``` has the same effect as ```agent_eval.evaluate_random_pair()```.
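
The progress line printed during evaluation summarizes the per-game returns as an average with `std` and `se`. The sketch below shows one way to compute the same kind of summary from the `"ep_returns"` entry of a trajectories dict. `summarize_returns` is a hypothetical helper, and the formulas (population standard deviation, `se = std / sqrt(n)`) are an assumption about how the library computes them, not something confirmed here.

```python
import math
import statistics


def summarize_returns(trajectories):
    """Mean, std and standard error of the per-episode returns."""
    returns = [float(r) for r in trajectories["ep_returns"]]
    mean = statistics.fmean(returns)
    # Assumption: population std and se = std / sqrt(n); the library may differ.
    std = statistics.pstdev(returns)
    se = std / math.sqrt(len(returns))
    return {"avg_rew": mean, "std": std, "se": se}


# Stand-in for the dict returned by evaluate_agent_pair; the values are
# illustrative only and do not come from a real run.
fake_trajectories = {"ep_returns": [20, 40, 0]}
summary = summarize_returns(fake_trajectories)
print("Avg rew: {avg_rew:.2f} (std: {std:.2f}, se: {se:.2f})".format(**summary))
```

With only one or two games, as in the cells above, the standard error says little; comparing agents meaningfully needs a larger `num_games`.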

## Single player variant
If you want a single-player variant, set one of the agents to stay in place and do nothing. `StayAgent` is such an agent. It is best to choose a layout where the idle player does not block a crucial path to a resource, e.g. the only onion dispenser on the layout.

In [None]:
from overcooked_ai_py.agents.agent import StayAgent, RandomAgent, GreedyHumanModel
mdp_gen_params = {"layout_name": 'five_by_five'}
mdp_fn = LayoutGenerator.mdp_gen_fn_from_dict(mdp_gen_params)
env_params = {"horizon": 100}
agent_eval = AgentEvaluator(env_params=env_params, mdp_fn=mdp_fn)

single_random_agent_pair = AgentPair(RandomAgent(all_actions=True), StayAgent())
trajectories_single_random_agent = agent_eval.evaluate_agent_pair(single_random_agent_pair, num_games=1)
print("single random agent rewards", trajectories_single_random_agent["ep_returns"])

single_greedy_agent_pair = AgentPair(GreedyHumanModel(agent_eval.env.mlam), StayAgent())
trajectories_single_greedy_agent = agent_eval.evaluate_agent_pair(single_greedy_agent_pair, num_games=1)
print("single greedy agent rewards", trajectories_single_greedy_agent["ep_returns"])
print("single greedy agent trajectory")
img_dir_path = StateVisualizer().display_rendered_trajectory(trajectories_single_greedy_agent,
                                                             trajectory_idx=0, ipython_display=True)

Avg rew: 0.00 (std: 0.00, se: 0.00); avg len: 100.00; : 100%|██████████| 1/1 [00:00<00:00,  8.30it/s]


Recomputing motion planner due to: [Errno 2] No such file or directory: '/overcooked_ai_py/data/planners/five_by_five_mp.pkl'
Computing MotionPlanner to be saved in /overcooked_ai_py/data/planners/five_by_five_mp.pkl
It took 0.09471964836120605 seconds to create mp
Skipping trajectory consistency checking because MDP was recognized as variable. Trajectory consistency checking is not yet supported for variable MDPs.
single random agent rewards [0]
Computing MediumLevelActionManager
Recomputing planner due to: [Errno 2] No such file or directory: '/overcooked_ai_py/data/planners/five_by_five_am.pkl'
Computing MediumLevelActionManager to be saved in /overcooked_ai_py/data/planners/five_by_five_am.pkl


Avg rew: 20.00 (std: 0.00, se: 0.00); avg len: 100.00; : 100%|██████████| 1/1 [00:00<00:00, 23.20it/s]

It took 1.557342767715454 seconds to create mlam
Skipping trajectory consistency checking because MDP was recognized as variable. Trajectory consistency checking is not yet supported for variable MDPs.
single greedy agent rewards [20]
single greedy agent trajectory





interactive(children=(IntSlider(value=0, description='timestep', max=99), Output()), _dom_classes=('widget-int…

In [7]:
from overcooked_ai_py.mdp.layout_generator import LayoutGenerator, MDPParamsGenerator, DEFAULT_PARAMS_SCHEDULE_FN

mdp_param_generator = MDPParamsGenerator(DEFAULT_PARAMS_SCHEDULE_FN)
layout_generator = LayoutGenerator(mdp_param_generator)

## Rotating or transposing a map

In [26]:
from copy import deepcopy
from overcooked_ai_py.mdp.overcooked_mdp import OvercookedGridworld
from overcooked_ai_py.mdp.layout_generator import Grid
import numpy as np
#mdp_gen_params = {"layout_name": 'mlg180'}
for k in range(10):
    mdp = OvercookedGridworld.from_layout_name('trial0_'+str(k), 'overcooked_ai_py/data/layouts')
    base_grid = mdp.terrain_mtx
    counter_goals = mdp.counter_goals
    for idx, pos in enumerate(mdp.mdp_params['start_player_positions']):
        base_grid[pos[1]][pos[0]] = str(idx+1)
    for pos in counter_goals:
        base_grid[pos[1]][pos[0]] = 'K'
    base_grid = np.array(base_grid)
    #mdp.terrain_mtx = np.rot90(base_grid)
    #mdp.terrain_mtx = np.transpose(base_grid)
    print(mdp.terrain_mtx)
    rotated_counter_goals = np.argwhere(mdp.terrain_mtx == 'K')
    mdp.counter_goals = [(x,y) for y,x in rotated_counter_goals]
    print(mdp.counter_goals)
    mdp.terrain_mtx = np.where(mdp.terrain_mtx=="K", "X", mdp.terrain_mtx)
    #print(mdp.start_player_positions)
    print(mdp.terrain_mtx)
    #print(np.where(mdp.terrain_mtx=='1')[::-1])
    mdp.start_player_positions[0] = np.where(mdp.terrain_mtx=='1')[::-1]
    mdp.start_player_positions[1] = np.where(mdp.terrain_mtx=='2')[::-1]
    #print(mdp.start_player_positions)
    mdp.to_layout_file('overcooked_ai_py/data/layouts/trial4_'+str(k)+'.layout')


[['X', 'O', 'X', 'X', 'X', 'X', 'X'], ['X', ' ', ' ', ' ', ' ', ' ', 'S'], ['K', ' ', ' ', '1', ' ', ' ', 'X'], ['P', ' ', ' ', '2', ' ', ' ', 'X'], ['K', ' ', ' ', ' ', ' ', ' ', 'D'], ['K', ' ', ' ', ' ', ' ', ' ', 'X'], ['X', 'X', 'X', 'T', 'X', 'X', 'X']]
[]
[['X' 'O' 'X' 'X' 'X' 'X' 'X']
 ['X' ' ' ' ' ' ' ' ' ' ' 'S']
 ['K' ' ' ' ' '1' ' ' ' ' 'X']
 ['P' ' ' ' ' '2' ' ' ' ' 'X']
 ['K' ' ' ' ' ' ' ' ' ' ' 'D']
 ['K' ' ' ' ' ' ' ' ' ' ' 'X']
 ['X' 'X' 'X' 'T' 'X' 'X' 'X']]
[['X', 'K', 'K', 'P', 'K', 'X', 'X'], ['X', ' ', ' ', '1', ' ', ' ', 'X'], ['X', ' ', ' ', ' ', 'T', ' ', 'X'], ['X', ' ', 'X', ' ', 'X', ' ', 'X'], ['X', ' ', 'O', ' ', ' ', ' ', 'X'], ['X', ' ', ' ', '2', ' ', ' ', 'X'], ['X', 'D', 'X', 'X', 'X', 'S', 'X']]
[]
[['X' 'K' 'K' 'P' 'K' 'X' 'X']
 ['X' ' ' ' ' '1' ' ' ' ' 'X']
 ['X' ' ' ' ' ' ' 'T' ' ' 'X']
 ['X' ' ' 'X' ' ' 'X' ' ' 'X']
 ['X' ' ' 'O' ' ' ' ' ' ' 'X']
 ['X' ' ' ' ' '2' ' ' ' ' 'X']
 ['X' 'D' 'X' 'X' 'X' 'S' 'X']]
[['X', 'X', 'O', 'X', 'D', 'X', 'X'], 

In [2]:
from copy import deepcopy
from overcooked_ai_py.mdp.overcooked_mdp import OvercookedGridworld
from overcooked_ai_py.mdp.layout_generator import Grid
import numpy as np
#mdp_gen_params = {"layout_name": 'mlg180'}
for k in range(5):
    print(k)
    mdp = OvercookedGridworld.from_layout_name('coord_map0_'+str(k), 'overcooked_ai_py/data/layouts')
    base_grid = mdp.terrain_mtx
    for idx, pos in enumerate(mdp.mdp_params['start_player_positions']):
        base_grid[pos[1]][pos[0]] = str(idx+1)
    base_grid = np.array(base_grid)
    #mdp.terrain_mtx = np.rot90(base_grid)
    mdp.terrain_mtx = np.transpose(base_grid)
    print(mdp.terrain_mtx)
    rotated_counter_goals = np.argwhere(mdp.terrain_mtx == 'Y')
    mdp.counter_goals = [(x,y) for y,x in rotated_counter_goals]
    print(mdp.counter_goals)
    mdp.terrain_mtx = np.where(mdp.terrain_mtx=="Y", "Y", mdp.terrain_mtx)
    #print(mdp.start_player_positions)
    print(mdp.terrain_mtx)
    #print(np.where(mdp.terrain_mtx=='1')[::-1])
    mdp.start_player_positions[0] = np.where(mdp.terrain_mtx=='1')[::-1]
    mdp.start_player_positions[1] = np.where(mdp.terrain_mtx=='2')[::-1]
    #print(mdp.start_player_positions)
    mdp.to_layout_file('overcooked_ai_py/data/layouts/coord_map4_'+str(k)+'.layout')



0
[['X' 'X' 'D' 'X' 'X' 'X' 'X']
 ['X' ' ' ' ' ' ' 'X' ' ' 'X']
 ['X' ' ' ' ' 'T' ' ' ' ' 'X']
 ['X' ' ' '2' 'P' '1' ' ' 'X']
 ['X' 'S' 'X' 'X' ' ' ' ' 'X']
 ['X' 'X' ' ' ' ' ' ' 'O' 'X']
 ['X' 'X' 'X' 'X' 'X' 'X' 'X']]
[]
[['X' 'X' 'D' 'X' 'X' 'X' 'X']
 ['X' ' ' ' ' ' ' 'X' ' ' 'X']
 ['X' ' ' ' ' 'T' ' ' ' ' 'X']
 ['X' ' ' '2' 'P' '1' ' ' 'X']
 ['X' 'S' 'X' 'X' ' ' ' ' 'X']
 ['X' 'X' ' ' ' ' ' ' 'O' 'X']
 ['X' 'X' 'X' 'X' 'X' 'X' 'X']]
1
[['X' 'O' 'X' 'X' 'X' 'S' 'X']
 ['X' ' ' 'X' ' ' ' ' ' ' 'X']
 ['X' ' ' 'X' 'X' 'T' ' ' 'X']
 ['X' ' ' 'P' ' ' ' ' ' ' 'X']
 ['X' '1' 'X' ' ' ' ' ' ' 'X']
 ['X' ' ' 'X' ' ' ' ' '2' 'X']
 ['X' 'X' 'X' 'X' 'D' 'X' 'X']]
[]
[['X' 'O' 'X' 'X' 'X' 'S' 'X']
 ['X' ' ' 'X' ' ' ' ' ' ' 'X']
 ['X' ' ' 'X' 'X' 'T' ' ' 'X']
 ['X' ' ' 'P' ' ' ' ' ' ' 'X']
 ['X' '1' 'X' ' ' ' ' ' ' 'X']
 ['X' ' ' 'X' ' ' ' ' '2' 'X']
 ['X' 'X' 'X' 'X' 'D' 'X' 'X']]
2
[['X' 'T' 'X' 'X' 'X' 'X' 'X']
 ['X' ' ' ' ' ' ' ' ' ' ' 'X']
 ['X' '1' ' ' 'X' ' ' ' ' 'X']
 ['X' ' ' ' ' 'X' ' ' '
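
The cells above write the player numbers into the grid, transform the grid, and then read the positions back from the transformed grid. The sketch below illustrates that idea on a tiny grid in plain Python, without the library. The helper names are hypothetical; `transpose` and `rotate_ccw` give the same results as `np.transpose` and `np.rot90` on a 2-D array, and positions are (x, y) = (column, row).

```python
def transpose(grid):
    """Swap rows and columns (same result as np.transpose on a 2-D array)."""
    return [list(col) for col in zip(*grid)]


def rotate_ccw(grid):
    """Rotate 90 degrees counterclockwise (same result as np.rot90)."""
    return transpose(grid)[::-1]


def find_tiles(grid, tile):
    """Return (x, y) = (column, row) positions of every matching tile."""
    return [(x, y) for y, row in enumerate(grid)
            for x, t in enumerate(row) if t == tile]


# Tiny illustrative grid with both players already written into it.
grid = [list("XP1"),
        list("O 2")]

for name, transformed in [("transposed", transpose(grid)),
                          ("rotated", rotate_ccw(grid))]:
    print(name)
    print("\n".join("".join(row) for row in transformed))
    # Player positions are recovered from the transformed grid itself.
    print("player 1:", find_tiles(transformed, "1"),
          "player 2:", find_tiles(transformed, "2"))
```

Marking positions inside the grid before transforming it avoids having to derive a separate coordinate formula for each rotation or transposition.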

## Generating map/recipe pairs

In [5]:
from overcooked_ai_py.mdp.overcooked_mdp import Recipe
from itertools import combinations
from copy import deepcopy
Recipe.configure({'onion_value': 3, 'tomato_value': 2})
Recipe.ALL_RECIPES
RECIPES = {}
for recipe in Recipe.ALL_RECIPES:
    RECIPES[recipe] = recipe.value
    #print(recipe.value)
comb = combinations(RECIPES, 6)
print(len(list(deepcopy(comb))))
comb_value = {}
for orders in comb:
    value =0
    for recipe in orders:
        value += recipe.value
    comb_value[orders] = value
comb_value_sorted = {k: v for k, v in sorted(comb_value.items(), key=lambda item: item[1])}
#print(comb_value_sorted)
selected_orders = [orders for orders, value in comb_value_sorted.items() if value == 33] #170 157 178 183 186 191
former_value = 0
count = 0
for key, value in comb_value_sorted.items():
    if value != former_value :
        print(former_value, count)
        former_value = value
        count = 1
    else : 
        count += 1 
        
for order in selected_orders :
    print([{"ingredients" : list(ingredients)
           } for ingredients in order])
        


84
0 0
26 1
27 2
28 3
29 5
30 6
31 8
32 9
33 10
34 9
35 9
36 7
37 6
38 4
39 3
40 1
[{'ingredients': ['onion', 'tomato', 'tomato']}, {'ingredients': ['onion', 'onion', 'onion']}, {'ingredients': ['onion', 'tomato']}, {'ingredients': ['tomato', 'tomato']}, {'ingredients': ['tomato', 'tomato', 'tomato']}, {'ingredients': ['tomato']}]
[{'ingredients': ['onion', 'tomato', 'tomato']}, {'ingredients': ['onion', 'onion', 'onion']}, {'ingredients': ['onion', 'tomato']}, {'ingredients': ['tomato', 'tomato']}, {'ingredients': ['tomato']}, {'ingredients': ['onion', 'onion']}]
[{'ingredients': ['onion', 'tomato', 'tomato']}, {'ingredients': ['onion', 'onion', 'onion']}, {'ingredients': ['tomato', 'tomato']}, {'ingredients': ['onion', 'onion', 'tomato']}, {'ingredients': ['onion']}, {'ingredients': ['tomato']}]
[{'ingredients': ['onion', 'tomato', 'tomato']}, {'ingredients': ['onion', 'onion', 'onion']}, {'ingredients': ['onion']}, {'ingredients': ['tomato', 'tomato', 'tomato']}, {'ingredients': ['t
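
The tally above can be reproduced without the library. This sketch assumes that the recipe set consists of the 9 onion/tomato recipes with 1 to 3 ingredients (consistent with the 84 = C(9, 6) combinations printed above) and that a recipe's value is the sum of its ingredient values; both are assumptions, and `recipe_value` is a hypothetical helper. `collections.Counter` does the grouping by total value.

```python
from collections import Counter
from itertools import combinations, combinations_with_replacement

# Assumption: a recipe is worth the sum of its ingredient values.
INGREDIENT_VALUES = {"onion": 3, "tomato": 2}

# All order-independent recipes of 1 to 3 ingredients: 2 + 3 + 4 = 9 recipes.
recipes = [
    recipe
    for size in (1, 2, 3)
    for recipe in combinations_with_replacement(sorted(INGREDIENT_VALUES), size)
]


def recipe_value(recipe):
    """Sum of the ingredient values of one recipe."""
    return sum(INGREDIENT_VALUES[ingredient] for ingredient in recipe)


# Every set of 6 distinct recipes, tallied by its total value.
order_sets = list(combinations(recipes, 6))
value_counts = Counter(
    sum(recipe_value(recipe) for recipe in orders) for orders in order_sets
)

print(len(order_sets))
for total, count in sorted(value_counts.items()):
    print(total, count)

# Order sets with one chosen total value, in the dict form used by the layouts.
selected_orders = [o for o in order_sets if sum(map(recipe_value, o)) == 33]
for orders in selected_orders[:2]:
    print([{"ingredients": list(recipe)} for recipe in orders])
```

A `Counter` reports every total, including the largest one; a loop that prints only when the value changes needs an extra print after the last item to do the same.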

In [2]:
!pwd

/home/marin/ONERA_soft/overcooked_mlg


In [4]:
from itertools import product
import json
selected_orders = [orders for orders, value in comb_value_sorted.items() if value == 170 ]
agents = ["Lazy", "Greedy", "Rational"]
layoutsblock1 = ["mlg","mlg90", "mlg180", "mlg270", "mlgT", "mlg90T", "mlg180T", "mlg270T", "mlg"]
layoutsblock2 = layoutsblock1[::-1]
agentsXrecipes = list(product(selected_orders, agents))
for i in range(len(agentsXrecipes)):
    agentsXrecipes[i] = list(agentsXrecipes[i])
block1 = agentsXrecipes[:9]
block2 = agentsXrecipes[9:]

for i in range(len(block1)):
    block1[i].append(layoutsblock1[i])
for i in range(len(block2)):
    block2[i].append(layoutsblock2[i])
#print(block1)
#print(block2)

dic_block1 = {}
for idx, trial in enumerate(block1):
    dic_block1[idx] = {'all_orders' : [{"ingredients" : list(ingredients)} for ingredients in trial[0]], "agent" : trial[1], "layout": trial[2] }
    
dic_block2 = {}
for idx, trial in enumerate(block2):
    dic_block2[idx] = {'all_orders' : [{"ingredients" : list(ingredients)} for ingredients in trial[0]], "agent" : trial[1], "layout": trial[2] }
    
print(dic_block1)
print(dic_block2)

dic_trials = {"block1" : dic_block1, "block2" : dic_block2}

{0: {'all_orders': [{'ingredients': ['tomato', 'tomato', 'tomato']}, {'ingredients': ['onion', 'onion', 'onion']}, {'ingredients': ['tomato', 'tomato']}, {'ingredients': ['onion', 'onion']}], 'agent': 'Lazy', 'layout': 'mlg'}, 1: {'all_orders': [{'ingredients': ['tomato', 'tomato', 'tomato']}, {'ingredients': ['onion', 'onion', 'onion']}, {'ingredients': ['tomato', 'tomato']}, {'ingredients': ['onion', 'onion']}], 'agent': 'Greedy', 'layout': 'mlg90'}, 2: {'all_orders': [{'ingredients': ['tomato', 'tomato', 'tomato']}, {'ingredients': ['onion', 'onion', 'onion']}, {'ingredients': ['tomato', 'tomato']}, {'ingredients': ['onion', 'onion']}], 'agent': 'Rational', 'layout': 'mlg180'}, 3: {'all_orders': [{'ingredients': ['tomato', 'tomato', 'tomato']}, {'ingredients': ['onion', 'onion', 'onion']}, {'ingredients': ['onion']}, {'ingredients': ['onion', 'tomato', 'tomato']}], 'agent': 'Lazy', 'layout': 'mlg270'}, 4: {'all_orders': [{'ingredients': ['tomato', 'tomato', 'tomato']}, {'ingredients

In [5]:
with open("trials.json", 'w') as f:
    json.dump(dic_trials, f)
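
The block construction above pairs every (orders, agent) combination with a layout, using the layout list in reverse order for the second block. The following is a compact sketch of the same idea with `zip`; `build_blocks` is a hypothetical helper, and the order sets and layout names in the demo are shortened placeholders rather than the real experiment data.

```python
import json
from itertools import product


def build_blocks(order_sets, agents, layouts, block_size):
    """Pair each (orders, agent) combination with a layout, in two blocks."""
    trials = [
        {"all_orders": [{"ingredients": list(r)} for r in orders], "agent": agent}
        for orders, agent in product(order_sets, agents)
    ]
    block1, block2 = trials[:block_size], trials[block_size:]
    if max(len(block1), len(block2)) > len(layouts):
        raise ValueError("not enough layouts for the number of trials")
    # Block 1 walks the layouts forward, block 2 walks them backward.
    for trial, layout in zip(block1, layouts):
        trial["layout"] = layout
    for trial, layout in zip(block2, reversed(layouts)):
        trial["layout"] = layout
    return {"block1": dict(enumerate(block1)), "block2": dict(enumerate(block2))}


# Placeholder inputs for illustration only.
order_sets = [(("onion",), ("tomato",)), (("onion", "onion"), ("tomato",))]
agents = ["Lazy", "Greedy", "Rational"]
layouts = ["mlg", "mlg90", "mlg180"]

dic_trials = build_blocks(order_sets, agents, layouts, block_size=3)
restored = json.loads(json.dumps(dic_trials))
print(list(dic_trials["block1"]), list(restored["block1"]))
```

Note that JSON object keys are always strings, so the integer trial indices come back as `"0"`, `"1"`, ... when `trials.json` is read again.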

In [1]:
import numpy as np
from copy import deepcopy
from overcooked_ai_py.mdp.overcooked_mdp import OvercookedGridworld
from overcooked_ai_py.mdp.layout_generator import Grid
from overcooked_ai_py.utils import read_layout_dict

In [10]:
layout_dir =  'overcooked_ai_py/data/layouts/marin_II_constrained'
base_layout_params = read_layout_dict('marinII0', layout_dir)
grid = base_layout_params['grid']
grid = [layout_row.strip() for layout_row in grid.split("\n")]
grid = np.array([[c for c in row] for row in grid])
print(grid)


[['X' 'O' 'X' 'X' 'X' 'X' 'D' 'X' 'X' 'X']
 ['X' ' ' ' ' ' ' 'X' ' ' ' ' ' ' ' ' 'X']
 ['X' ' ' '1' ' ' 'X' ' ' ' ' ' ' ' ' 'X']
 ['X' ' ' ' ' ' ' 'X' ' ' ' ' ' ' ' ' 'X']
 ['X' ' ' ' ' ' ' ' ' ' ' ' ' 'X' 'X' 'X']
 ['X' ' ' ' ' ' ' '2' ' ' ' ' ' ' ' ' 'S']
 ['X' ' ' 'X' ' ' ' ' ' ' ' ' ' ' ' ' 'X']
 ['X' ' ' 'Y' ' ' ' ' ' ' ' ' ' ' ' ' 'X']
 ['X' ' ' 'Y' ' ' ' ' ' ' ' ' ' ' ' ' 'X']
 ['X' 'P' 'X' 'X' 'T' 'X' 'X' 'X' 'X' 'X']]


In [17]:
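def make_grid_string(grid):
    # Join the tiles of each row into one line, one line per grid row
    grid_string = "".join("".join(line) + "\n" for line in grid)
    # Keep the trailing blank line and tab indentation of the original version
    grid_string += "\n\t\t\t\t"
    return grid_string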

In [18]:
print(make_grid_string(grid))

XOXXXXDXXX
X   X    X
X 1 X    X
X   X    X
X      XXX
X   2    S
X X      X
X Y      X
X Y      X
XPXXTXXXXX

				
