Skip to content
Contains the GYM environment for the Chefs Hat game
Branch: master
Clone or download

Latest commit

Fetching latest commit…
Cannot retrieve the latest commit at this time.


Type Name Latest commit message Commit time
Failed to load latest commit information.

This repository holds the OpenGYM environment for the Chef's Hat card game.

Chef's Hat Card game

Chef's Hat Card Game

Chef's Hat is a cardgame designed with specific HRI requirements in mind, which allows it to be followed and modeled by artificial agents with ease. Also, the game mechanics were designed to evoke different affective interactions within the game, which can be easily perceived and displayed by a robot. Furthermore, the game elements were design to facilitate the extraction of the game state through the use of QR-codes and specific turn taking actions, which do not break the game flow.

Fora a complete overview on the development of the game, refer to:

If you want to have access to the game materials (cards and playing field), please contact us using the contact information at the end of the page.

Summary of game rules

Chef's Hat Card Game

During each game there are three phases: Start of the game, Making Pizzas, End of the game. The game starts with the cards been shuffled and dealt with the players. Then, starting from the second game, the exchange of roles takes place based on the last games' finishing positions. The player who finished first becomes the Chef, the one that finished second becomes the Sous-Chef, the one that finished third becomes the Waiter and the last one the Dishwasher. Once the roles are exchanged, the exchange of the cards starts. The Dishwasher has to give the two cards with the highest values to the Chef, who in return gives back two cards of their liking. The Waiter has to give their lowest valued card to the Sous-Chef, who in return gives one card of their liking.

If, after the exchange of roles, any of the players have two jokers at hand, they can perform a special action: in case of the Dishwasher, this is "Food Fight" (the hierarchy is inverted), in case of the other roles it is "Dinner is served" (there will be no card exchange during that game).

Once all of the cards and roles are exchanged, the game starts. The goal of each player is to discard all the cards at hand. They can do this by making a pizza by laying down the cards into the playing field, represented by a pizza dough. The person who possesses a Golden 11 card at hand starts making the first pizza of the game. A pizza is done when no one can, or wants, to lay down any ingredients anymore. A player can play cards by discarding their ingredient cards on the pizza base. To play cards, they need to be rarer (i.e. lowest face values) than the previously played cards. The ingredients are played from highest to the lowest number, that means from 11 to 1. Players can play multiple copies of an ingredient at once, but always have to play an equal or greater amount of copies than the previous player did. If a player cannot (or does not want) to play, they pass until the next pizza starts. A joker card is also available and when played together with other cards, it assumes their value. When played alone, the joker has the highest face value (12). Once everyone has passed, they start a new pizza by cleaning the playing field, and the last player to play an ingredient is the first one to start the new pizza.

Chef's Hat OpenAI Gym Simulation Environment

This environment is freely available for scientific purposes and implements all the rules and mechanics of the Chef's Hat game.

For an in-depth look at the functioning of the environment, please refer to:

The environment is build based on the OpenAi Gym toolkit, so it implements a portable and reusable environment. To complement the environment and help with the experimental simulations, we encapsulate the enviromment using the ChefsHatExperimentHandler module.

The experiment handler runs a series of games. Each game is independent, but based on Chef's Hat rules and the end results of each game affects the roles of the next game.

Space an Action definition

Chef's Hat Card Game

The simulator represents the current game state for each player as the cards the player has at hand and the cards in the playing field. For each player, there are a total of 200 allowed actions: to discard one card of face value 1 represents one move or to discard 3 cards of face value 1 and a joker is another move, while passing is considered another move. Each player can only do one action per game turn.

Each action taken by a player is validated based on a look-up-table, created in real-time based on the player's hand and the cards in the playing field. This is a crucial step to guarantee that a taken action is allowed given the game context and to guarantee that the game rules are maintained. The Figure above illustrates an example of calculated possible actions given a game state. The blue areas mark all the possible action states, while the gray areas mark actions that are not allowed due to the game's mechanics. We observed that, given this particular game state, this player would only be allowed to perform one of three actions (marked in green), while any other action (marked in red) would be considered as invalid and not would be carried on by the simulator.


Install the requirements from the Requirements.txt file.


To run the experiment, you have to provide the experiment handler four agents. We include here the following agents implementations:

  • Dummy agent Random: an agent that does not learn, and select all actions randomly based on the possible actions. (more info:
  • Dummy agent AlwaysOneCard: an agent that does not learn, and only select all actions randomly based on discarding o nly one card at a time.
  • DeepQL Agent: an agent that learns how to play the game based on deep Q-learning. (more info:

Simulation Data Structure

The environment creates a set of logs and metrics that can help to understand the game state and the agents' behavior during the experiments. For each experiment, one folder is created with the following structure:

  • Dataset : Holds the generated dataset with all the actions taken by the agents during the game
  • Log: Holds the log of the game, if enabled, and the metrics .csv file with all the metrics of all players for all games.
  • Model: Holds the learned models, per game, when using learning agents.
  • Plots: Holds all the plots generated during the simulations.

The environment saves a dataset, in the format of .csv files, that stores the dated game state, the taken actions, and game events that happen during one game play. The dataset is used later on to create a game-play video, and to extract specific metrics of that gameplay.

Dataset Example

Our Examples folder holds specific examples of recorded datasets.

Logs and Metrics

If requested, the environment saves a human-friendly log of all the experiments within the Log folder. This log, saved in simple .txt form, can be used to perform human analysis on the game state, but also to identify anomalous behavior in long simulations. It saves all the game state, player hand and board, actions taken, reward and number of invalida actions per player turn.

Together with the log, the environment saves the following metrics per player for the entire simulation:

  • The number of rounds per game.
  • The number of victories.
  • The total averaged reward per game.
  • The finishing positions.
  • The number of invalid actions.

Also in the Log folder, the environment saves a metrics.csv file. This file contains the following metrics per player per game:

  • Finishing position
  • Average reward over the entire game
  • Number of passes for each game quarter
  • Number of discards for each game quarter

The number of passes and discards are calculated for each game quarter (25% of the game duration), and help us to understand the agents' strategy while playing the game.


The environment calculates a series of plots to better understand the simulation. The plots folder holds:

Plots per game

  • A) Number of rounds per game
  • B) Victories histogram per game
  • C) Finishing position per player
  • D) Reward valid actions per player
  • E) Invalid actions per player
  • F) Players timeline per game
  • G) The discard behavior of each player per game

Running experiments

Currently, the environment implements the following experiments and examples:

  • illustrates how to run an experiment composed of four agents and a series of games.
  • Experiments_Media/ illustrates how to generate videos from collected datasets.
  • Experiments_Media/ illustrates how to generate timeline plots from collected datasets.
  • Experiments_Optmization folder contains examples on how to run an agent optmization based on Hyperopt.
  • Experiments_Publications folder contains the following implementations:
    • IROS_2020: Barros, P., Sciutti, A., Hootsmans, I. M., Opheij, L. M., Toebosch, R. H., & Barakova, E. (2020) The Chef's Hat Simulation Environment for Reinforcement-Learning-Based Agents. arXiv preprint arXiv:2003.05861. (

The simulator has a specific funcionality that reads a given dataset and generates an entire video of that gameplay. The video can be used to evaluate a specific behavior, or just to illustrate how the agents play the game. The following link directs to a video example: Watch the video


The Examples folder contains the datasets and generated videos of two conditions: Running all four agents as random agents, and running all four agents as learning agents.

Use and distribution policy

All the examples in this repository are distributed under a Non-Comercial license. If you use this environment, you have to agree with the following itens:

  • To cite our associated references in any of your publication that make any use of these examples.
  • To use the environment for research purpose only.
  • To not provide the environment to any second parties.


  • Barros, P., Sciutti, A., Hootsmans, I. M., Opheij, L. M., Toebosch, R. H., & Barakova, E. (2020). It's Food Fight! Introducing the Chef's Hat Card Game for Affective-Aware HRI. Accepted at the HRI2020 Workshop on Exploring Creative Content in Social Robotics! arXiv preprint arXiv:2002.11458.

  • Barros, P., Sciutti, A., Hootsmans, I. M., Opheij, L. M., Toebosch, R. H., & Barakova, E. (2020) The Chef's Hat Simulation Environment for Reinforcement-Learning-Based Agents. arXiv preprint arXiv:2003.05861.


Pablo Barros -

You can’t perform that action at this time.