Certain puppeteer functions (e.g. GrimTwoResourceInTheMatrix) are not pure functions, but are Callable objects that maintain their own state between calls to __call__ and update it as a side effect. This means that puppeteer state is not exposed in the PuppetPolicy state returned by initial_state and transformed by step. This has two consequences:
Since puppeteer functions are shared between PuppetPolicy instances, multiple puppets running in the same Scenario will erroneously share a single puppeteer state.
Puppeteer state is not reset at the start of a new episode.
Both of these issues will cause undefined behavior when running the affected scenarios.
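To make the two problems concrete, here is a minimal sketch of the failure mode. The classes `StatefulGrimPuppeteer` and `ToyPuppetPolicy` are hypothetical stand-ins written for this illustration; they are not the Melting Pot implementations, and the goal strings are assumptions.

```python
class StatefulGrimPuppeteer:
    """Hypothetical stand-in for a stateful puppeteer (not the Melting Pot class)."""

    def __init__(self, threshold: int) -> None:
        self._threshold = threshold
        self._defections_seen = 0  # hidden state, mutated as a side effect

    def __call__(self, partner_defected: bool) -> str:
        if partner_defected:
            self._defections_seen += 1
        if self._defections_seen >= self._threshold:
            return "defect"
        return "cooperate"


class ToyPuppetPolicy:
    """Hypothetical policy wrapper whose state never includes the puppeteer's."""

    def __init__(self, puppeteer) -> None:
        self._puppeteer = puppeteer

    def initial_state(self) -> tuple:
        return ()  # the puppeteer's hidden counter is NOT reset here

    def step(self, partner_defected: bool, state: tuple):
        return self._puppeteer(partner_defected), state


# One puppeteer object shared by two puppets in the same scenario.
shared = StatefulGrimPuppeteer(threshold=1)
puppet_a = ToyPuppetPolicy(shared)
puppet_b = ToyPuppetPolicy(shared)

state_a = puppet_a.initial_state()
state_b = puppet_b.initial_state()
goal_a, state_a = puppet_a.step(True, state_a)   # someone defected on puppet_a
goal_b, state_b = puppet_b.step(False, state_b)  # nobody defected on puppet_b

# Start of a "new episode": the policy state is fresh, the puppeteer is not.
state_b = puppet_b.initial_state()
goal_b_next_episode, state_b = puppet_b.step(False, state_b)

print(goal_a, goal_b, goal_b_next_episode)
```

In this sketch `puppet_b` defects even though nobody defected on it, and it keeps defecting after `initial_state` is called again, because the counter lives inside the shared callable rather than in the policy state.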
Affected bots:
chicken_puppet_grim
cleanup_puppet_reciprocator_threshold_low
cleanup_puppet_reciprocator_threshold_mid
prisoners_dilemma_puppet_grim_threshold_high
prisoners_dilemma_puppet_grim_threshold_low
stag_hunt_puppet_grim
Affected scenarios:
chicken_in_the_matrix_4
clean_up_4
clean_up_5
clean_up_6
prisoners_dilemma_in_the_matrix_4
prisoners_dilemma_in_the_matrix_5
stag_hunt_in_the_matrix_2
No other scenarios were affected by this bug.
NOTE: This bug did not affect the evaluations run on these scenarios in the Melting Pot paper.
I'll add some additional interpretation of how to understand this issue, and what has changed now that it is fixed.
The puppet bots for prisoners_dilemma_in_the_matrix, chicken_in_the_matrix, and stag_hunt_in_the_matrix implement grim trigger strategies. Once any player defects on one of these bots, the bot "triggers" and defects in all future interactions. Note: once triggered, a bot defects on everyone, not just the player that defected on it (these are 8-player games).
The bug was that multiple puppet bots playing in the same episode would share state with one another instead of maintaining their own state. So, while it was in effect, when any player defected on any puppet bot then ALL puppet bots would share that state change and consequently start defecting indiscriminately. So, these situations were even more grim than we had intended.
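For contrast, here is a rough sketch of a grim trigger written as a pure function with explicit state, so each puppet carries its own trigger flag. This only illustrates the general idea and is not necessarily how the fix was implemented; `GrimState`, `grim_initial_state`, and `grim_step` are hypothetical names.

```python
from typing import NamedTuple, Tuple


class GrimState(NamedTuple):
    """Hypothetical per-puppet state: has this puppet been triggered yet?"""
    triggered: bool


def grim_initial_state() -> GrimState:
    # Called at the start of every episode, so the trigger is always reset.
    return GrimState(triggered=False)


def grim_step(partner_defected: bool, state: GrimState) -> Tuple[str, GrimState]:
    # Pure function: the new state is returned rather than stored on an object.
    triggered = state.triggered or partner_defected
    goal = "defect" if triggered else "cooperate"
    return goal, GrimState(triggered=triggered)


# Two puppets in the same episode, each holding its own state.
state_a = grim_initial_state()
state_b = grim_initial_state()
goal_a, state_a = grim_step(True, state_a)   # someone defected on puppet a
goal_b, state_b = grim_step(False, state_b)  # nobody defected on puppet b

print(goal_a, goal_b)
```

Here a defection against one puppet triggers only that puppet, which matches the intended per-bot behavior described above.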
As a result, we expect scores will likely increase on the affected *_in_the_matrix scenarios now that the fix is in.
For clean_up, the situation is different. The affected bots are supposed to clean the river whenever a specified number of other players are also cleaning the river. Since this was a global signal in the first place, it didn't matter that it was erroneously shared between puppet bots. They were all meant to clean at the same time, and under the same circumstances. So fixing the bug should not have any effect on clean_up results.
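A toy sketch of why sharing is harmless in this case: if every puppet reacts to the same global signal, a shared puppeteer and separate puppeteers produce the same goals. `ToyReciprocator`, the goal strings, and the signal values are all assumptions made for illustration, not the actual clean_up puppeteer.

```python
class ToyReciprocator:
    """Hypothetical: clean when at least `threshold` other players are cleaning."""

    def __init__(self, threshold: int) -> None:
        self._threshold = threshold
        self._cleaning = False  # internal state, updated on every call

    def __call__(self, num_others_cleaning: int) -> str:
        self._cleaning = num_others_cleaning >= self._threshold
        return "clean" if self._cleaning else "eat"


# Made-up global signal: how many other players are cleaning at each step.
signals = [0, 1, 2, 3, 1]

# Buggy setup: two puppets call one shared puppeteer.
shared = ToyReciprocator(threshold=2)
shared_goals = [(shared(n), shared(n)) for n in signals]

# Fixed setup: each puppet has its own puppeteer.
own_a = ToyReciprocator(threshold=2)
own_b = ToyReciprocator(threshold=2)
separate_goals = [(own_a(n), own_b(n)) for n in signals]

print(shared_goals == separate_goals)
```

Because both puppets are fed the same signal at every step, the two setups agree in this sketch.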