Puppets share state in Scenarios #70

Closed
jagapiou opened this issue Aug 22, 2022 · 1 comment
Labels
bug Something isn't working

Comments

@jagapiou
Member

Certain puppeteer functions (e.g. GrimTwoResourceInTheMatrix) are not pure functions but callable objects that maintain their own state between invocations of __call__ and update it as a side effect. As a result, puppeteer state is not exposed in the PuppetPolicy state returned by initial_state and transformed by step. This has two consequences:

  1. Since puppeteer functions are shared between PuppetPolicy instances, multiple puppets running in the same Scenario erroneously share a single puppeteer state.
  2. Puppeteer state is not reset at the start of a new episode.

Both of these issues will cause undefined behavior when running the affected scenarios.
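A minimal sketch of how both problems can arise. The classes below are hypothetical stand-ins written for illustration; they are not the Melting Pot implementations, and the boolean `was_defected_on` observation is an assumption.

```python
class GrimPuppeteer:
    """Hypothetical stateful callable: not a pure function."""

    def __init__(self):
        # Hidden state, invisible to the policy state.
        self.triggered = False

    def __call__(self, was_defected_on: bool) -> str:
        if was_defected_on:
            self.triggered = True  # side effect on the shared object
        return "defect" if self.triggered else "cooperate"


class PuppetPolicy:
    """Hypothetical policy wrapper whose state omits the puppeteer state."""

    def __init__(self, puppeteer):
        self._puppeteer = puppeteer

    def initial_state(self):
        # Nothing here resets the puppeteer's hidden state.
        return ()

    def step(self, was_defected_on: bool, state):
        return self._puppeteer(was_defected_on), state


# One puppeteer object shared by two puppets in the same scenario.
shared = GrimPuppeteer()
puppet_a = PuppetPolicy(shared)
puppet_b = PuppetPolicy(shared)

# Episode 1: someone defects on puppet_a only.
state_a = puppet_a.initial_state()
state_b = puppet_b.initial_state()
action_a, state_a = puppet_a.step(True, state_a)
action_b, state_b = puppet_b.step(False, state_b)

# Episode 2: initial_state is called again, but the trigger persists.
state_b = puppet_b.initial_state()
action_b_new_episode, _ = puppet_b.step(False, state_b)
```

In this sketch `action_b` is a defection even though nobody defected on `puppet_b` (problem 1), and `action_b_new_episode` is still a defection after the new call to `initial_state` (problem 2).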

Affected bots:

  • chicken_puppet_grim
  • cleanup_puppet_reciprocator_threshold_low
  • cleanup_puppet_reciprocator_threshold_mid
  • prisoners_dilemma_puppet_grim_threshold_high
  • prisoners_dilemma_puppet_grim_threshold_low
  • stag_hunt_puppet_grim

Affected scenarios:

  • chicken_in_the_matrix_4
  • clean_up_4
  • clean_up_5
  • clean_up_6
  • prisoners_dilemma_in_the_matrix_4
  • prisoners_dilemma_in_the_matrix_5
  • stag_hunt_in_the_matrix_2

No other scenarios were affected by this bug.

NOTE: This bug did not affect the evaluations run on these scenarios in the Melting Pot paper.

@jagapiou jagapiou added the bug Something isn't working label Aug 22, 2022
@jagapiou jagapiou pinned this issue Aug 22, 2022
@jzleibo
Collaborator

jzleibo commented Aug 22, 2022

I'll add some interpretation of this issue, and of what has changed now that it is fixed.

The puppet bots for prisoners_dilemma_in_the_matrix, chicken_in_the_matrix, and stag_hunt_in_the_matrix implement grim trigger strategies. Once any player defects on one of these bots, the bot "triggers" and defects in all future interactions. Note: once triggered, they defect on everyone, not just the player that defected on them (these are 8 player games).

The bug was that multiple puppet bots playing in the same episode would share state with one another instead of maintaining their own state. So, while it was in effect, when any player defected on any puppet bot then ALL puppet bots would share that state change and consequently start defecting indiscriminately. So, these situations were even more grim than we had intended.

As a result, we expect scores will likely increase on the affected *_in_the_matrix scenarios now that the fix is in.
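One way to give each puppet its own trigger, sketched here under assumptions and not necessarily how the fix was implemented, is to make the puppeteer a pure function that takes and returns its state explicitly. The names `GrimState`, `grim_initial_state`, and `grim_step` are hypothetical.

```python
from typing import NamedTuple, Tuple


class GrimState(NamedTuple):
    """Hypothetical explicit puppeteer state."""
    triggered: bool


def grim_initial_state() -> GrimState:
    # Called once per puppet per episode, so every episode starts untriggered.
    return GrimState(triggered=False)


def grim_step(was_defected_on: bool, state: GrimState) -> Tuple[str, GrimState]:
    # Pure function: the new state is returned, never stored on a shared object.
    triggered = state.triggered or was_defected_on
    action = "defect" if triggered else "cooperate"
    return action, GrimState(triggered=triggered)


# Two puppets in the same episode, each holding its own state.
state_a = grim_initial_state()
state_b = grim_initial_state()
action_a, state_a = grim_step(True, state_a)   # a player defects on puppet A
action_b, state_b = grim_step(False, state_b)  # nobody defects on puppet B

# A new episode starts from a fresh state.
action_a_next_episode, _ = grim_step(False, grim_initial_state())
```

Here only puppet A is triggered. Puppet B keeps cooperating, and puppet A cooperates again at the start of the next episode.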

For clean_up, the situation is different. The affected bots are supposed to clean the river whenever a specified number of other players are also cleaning the river. Since this was a global signal in the first place, it didn't matter that it was erroneously shared between puppet bots. They were all meant to clean at the same time, and under the same circumstances. So fixing the bug should not have any effect on clean_up results.

copybara-service bot pushed a commit that referenced this issue Aug 23, 2022
More principled fix for #70

PiperOrigin-RevId: 469493212
Change-Id: I3f1ab3917911a03b02b42f50ee40bd03684767fe
@jagapiou jagapiou unpinned this issue Nov 11, 2022