Puppets share state in Scenarios #70

jagapiou · 2022-08-22T12:12:40Z

Certain puppeteer functions (e.g. GrimTwoResourceInTheMatrix) are not pure functions but Callable objects that maintain their own state between calls to __call__ and update it as a side effect. As a result, puppeteer state is not exposed in the PuppetPolicy state returned by initial_state and transformed by step.

Since puppeteer functions are shared between PuppetPolicy instances, multiple puppets running in the same Scenario erroneously share a single puppeteer state.
Puppeteer state is not reset at the start of a new episode.

Both of these issues will cause undefined behavior when running the affected scenarios.
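
To make the failure mode concrete, here is a minimal sketch of the pattern described above. It is not the Melting Pot code: GrimCallable, Policy, and the boolean was_defected_on input are hypothetical stand-ins chosen for illustration.

```python
class GrimCallable:
    """Hypothetical stateful puppeteer: defects forever once defected on."""

    def __init__(self):
        # Hidden state that lives on the callable itself, not in policy state.
        self._triggered = False

    def __call__(self, was_defected_on: bool) -> str:
        if was_defected_on:
            self._triggered = True  # side effect, invisible to the caller
        return "defect" if self._triggered else "cooperate"


class Policy:
    """Hypothetical stand-in for a policy that wraps a puppeteer callable."""

    def __init__(self, puppeteer):
        self._puppeteer = puppeteer

    def step(self, was_defected_on: bool) -> str:
        return self._puppeteer(was_defected_on)


# One callable shared by two policies, as in the bug report.
shared = GrimCallable()
puppet_a = Policy(shared)
puppet_b = Policy(shared)

first_a = puppet_a.step(was_defected_on=True)   # A is defected on
first_b = puppet_b.step(was_defected_on=False)  # B was never defected on

# Nothing resets the callable, so a "new episode" starts already triggered.
next_episode_b = puppet_b.step(was_defected_on=False)
```

In this sketch puppet_b defects even though nobody defected on it, and it keeps defecting in the next episode because the callable's hidden flag is never cleared.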

Affected bots:

chicken_puppet_grim
cleanup_puppet_reciprocator_threshold_low
cleanup_puppet_reciprocator_threshold_mid
prisoners_dilemma_puppet_grim_threshold_high
prisoners_dilemma_puppet_grim_threshold_low
stag_hunt_puppet_grim

Affected scenarios:

chicken_in_the_matrix_4
clean_up_4
clean_up_5
clean_up_6
prisoners_dilemma_in_the_matrix_4
prisoners_dilemma_in_the_matrix_5
stag_hunt_in_the_matrix_2

No other scenarios were affected by this bug.

NOTE: This bug did not affect the evaluations run on these scenarios in the Melting Pot paper.

jzleibo · 2022-08-22T12:33:01Z

I'll add some additional interpretation of how to understand this issue, and what has changed now that it is fixed.

The puppet bots for prisoners_dilemma_in_the_matrix, chicken_in_the_matrix, and stag_hunt_in_the_matrix implement grim trigger strategies. Once any player defects on one of these bots, it "triggers" and starts defecting in all future interactions. Note: once triggered, a bot defects on everyone, not just the player that defected on it (these are 8-player games).

The bug was that multiple puppet bots playing in the same episode would share state with one another instead of maintaining their own state. So, while it was in effect, when any player defected on any puppet bot then ALL puppet bots would share that state change and consequently start defecting indiscriminately. So, these situations were even more grim than we had intended.

As a result, we expect scores will likely increase on the affected *_in_the_matrix scenarios now that the fix is in.
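
As a rough illustration of per-puppet grim trigger state, the sketch below threads the state explicitly through initial_state and step instead of hiding it on a shared object. This is only one possible approach under assumed names (GrimState, the boolean was_defected_on input, and the string actions are all hypothetical) and is not necessarily how the actual fix is implemented.

```python
from typing import NamedTuple, Tuple


class GrimState(NamedTuple):
    """Hypothetical per-puppet state: has this puppet been defected on?"""
    triggered: bool


def initial_state() -> GrimState:
    # Called at the start of every episode, so the trigger is always reset.
    return GrimState(triggered=False)


def step(was_defected_on: bool, state: GrimState) -> Tuple[str, GrimState]:
    # Pure function: the new state is returned rather than stored.
    triggered = state.triggered or was_defected_on
    action = "defect" if triggered else "cooperate"
    return action, GrimState(triggered=triggered)


# Each puppet owns its own state, so one trigger does not spread.
state_a = initial_state()
state_b = initial_state()
action_a, state_a = step(True, state_a)   # A is defected on
action_b, state_b = step(False, state_b)  # B is not

# A new episode starts from a fresh, untriggered state.
fresh_action, _ = step(False, initial_state())
```

Here only the puppet that was defected on switches to defecting, which matches the intended behavior described above.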

For clean_up, the situation is different. The affected bots are supposed to clean the river whenever a specified number of other players are also cleaning the river. Since this was a global signal in the first place, it didn't matter that it was erroneously shared between puppet bots. They were all meant to clean at the same time, and under the same circumstances. So fixing the bug should not have any effect on clean_up results.
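
The reasoning for clean_up can be checked on a toy model. The sketch assumes a deliberately simplified reciprocator (ThresholdCleaner, a hypothetical name) whose only input is a global count of players cleaning; the real puppeteers may hold more state than this.

```python
class ThresholdCleaner:
    """Hypothetical puppeteer: cleans while enough others are cleaning."""

    def __init__(self, threshold: int):
        self._threshold = threshold
        self._cleaning = False  # state derived only from the global signal

    def __call__(self, num_others_cleaning: int) -> str:
        self._cleaning = num_others_cleaning >= self._threshold
        return "clean" if self._cleaning else "wait"


# The same global signal is observed by every puppet at each step.
signal = [0, 1, 2, 3, 1]

# Buggy arrangement: two puppets call one shared object.
shared = ThresholdCleaner(threshold=2)
shared_actions = [(shared(n), shared(n)) for n in signal]

# Fixed arrangement: each puppet has its own object.
own_a = ThresholdCleaner(threshold=2)
own_b = ThresholdCleaner(threshold=2)
separate_actions = [(own_a(n), own_b(n)) for n in signal]
```

Because every puppet updates its state from the same global input, the shared and separate arrangements produce identical action sequences in this model.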

jagapiou added the bug label Aug 22, 2022

jagapiou pinned this issue Aug 22, 2022

copybara-service bot closed this as completed in 7ee768f Aug 22, 2022

copybara-service bot pushed a commit that referenced this issue Aug 23, 2022

Create a Puppeteer class.

3515a62

More principled fix for #70 PiperOrigin-RevId: 469493212 Change-Id: I3f1ab3917911a03b02b42f50ee40bd03684767fe

jagapiou unpinned this issue Nov 11, 2022
