8 players
2 states: ['safe', 'distancing']
4 actions: [A, B, C, D]
D > C > B > A
optimal strategy: follow the crowd
safe: no more than 4 players should choose the same action
distancing: everyone receives a lower reward; to return to the safe state, 2 players must choose each action
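
The state rules above can be read as a small transition function. The sketch below is only one interpretation: it assumes the game leaves 'safe' as soon as more than 4 players pick the same action, and returns from 'distancing' to 'safe' only when exactly 2 players pick each action. The names next_state, ACTIONS and N_PLAYERS are hypothetical and are not taken from the repository code.

```python
from collections import Counter

ACTIONS = ["A", "B", "C", "D"]  # hypothetical constant, mirrors the action list above
N_PLAYERS = 8                   # hypothetical constant, mirrors the player count above


def next_state(state, actions):
    """Return the next state given the current state and one action per player.

    Assumed rules (an interpretation of the notes, not the repository's code):
    - 'safe' stays 'safe' while no action is chosen by more than 4 players.
    - 'distancing' returns to 'safe' only when every action has exactly 2 players.
    """
    if len(actions) != N_PLAYERS:
        raise ValueError(f"expected {N_PLAYERS} actions, got {len(actions)}")
    counts = Counter(actions)
    if state == "safe":
        # Crowding on one action (5 or more players) triggers distancing.
        return "safe" if max(counts.values()) <= 4 else "distancing"
    # In 'distancing', only a perfectly even 2-2-2-2 split restores 'safe'.
    balanced = all(counts.get(a, 0) == 2 for a in ACTIONS)
    return "safe" if balanced else "distancing"
```

For example, under these assumptions next_state("safe", list("DDDDDCBA")) moves to 'distancing' because five players chose D, and next_state("distancing", list("AABBCCDD")) moves back to 'safe'.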
To reproduce the results, run plot.py.