# Rock Paper Scissors - Nash Equilibrium Strategy

An example of applying the Nash equilibrium principle to the Rock-Paper-Scissors game.

![](https://storage.googleapis.com/kaggle-competitions/kaggle/22838/logos/header.png?t=2020-11-02-21-55-44)

<a id="top"></a>

<div class="list-group" id="list-tab" role="tablist">
<h3 class="list-group-item list-group-item-action active" data-toggle="list" style='color:black; background:#FBE338; border:0' role="tab" aria-controls="home"><center>Quick Navigation</center></h3>

* [1. Nash Equilibrium Overview](#1)
* [2. Agent Code](#2)
* [3. Battle Examples](#3)

<a id="1"></a>
<h2 style='background:#FBE338; border:0; color:black'><center>Nash Equilibrium Overview</center></h2>

In game theory, the Nash equilibrium, named after the mathematician John Forbes Nash Jr., is a proposed solution of a non-cooperative game involving two or more players in which each player is assumed to know the equilibrium strategies of the other players, and no player has anything to gain by changing only their own strategy. [Wikipedia](https://en.wikipedia.org/wiki/Nash_equilibrium#cite_note-Osborne-1)

Consider the Rock-Paper-Scissors payoff matrix (our reward/action is in blue, the opponent's reward/action is in red):

![](https://i.imgur.com/aEL9IKd.png)

Suppose we play each action with equal probability 1/3. For the game to be in equilibrium, the opponent must then do the same.   
To see why, suppose instead that the opponent always plays Rock. Then:
- he ties a third of the time against Rock, 
- he loses a third of the time against Paper,
- and he wins a third of the time against Scissors.

His expected reward is then 1/3 \* 0 + 1/3 \* (-1) + 1/3 \* 1 = 0.    
**But in this case, we can switch our strategy to always playing Paper and win every time.**

![](https://i.imgur.com/5FYS8L4.png)

If the opponent always plays Paper, then:
- he wins a third of the time against Rock,
- he ties a third of the time against Paper,
- and he loses a third of the time against Scissors.

His expected reward is then 1/3 \* 1 + 1/3 \* 0 + 1/3 \* (-1) = 0.    
**But in this case, we can switch our strategy to always playing Scissors and win every time.**

![](https://i.imgur.com/doHd5dP.png)

If the opponent always plays Scissors, then:
- he loses a third of the time against Rock,
- he wins a third of the time against Paper,
- and he ties a third of the time against Scissors.

His expected reward is then 1/3 \* (-1) + 1/3 \* 1 + 1/3 \* 0 = 0.    
**But in this case, we can switch our strategy to always playing Rock and win every time.**

![](https://i.imgur.com/yjy0yCx.png)

The only remaining way to be in equilibrium is for both players to choose each action uniformly at random. Then neither player can gain anything by changing strategy, which is exactly the Nash equilibrium.
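
The argument above can be checked numerically. The following is a minimal sketch, not part of the competition code: the helper names `payoff` and `expected_payoffs` are made up for illustration, and the action encoding 0 = Rock, 1 = Paper, 2 = Scissors is assumed. It computes the expected reward of each pure action against a given mixed strategy.

```python
from fractions import Fraction

ACTIONS = ["Rock", "Paper", "Scissors"]  # assumed encoding: 0, 1, 2


def payoff(a, b):
    """Reward for playing action a against action b: 1 win, 0 tie, -1 loss."""
    d = (a - b) % 3
    return 0 if d == 0 else (1 if d == 1 else -1)


def expected_payoffs(opponent_mix):
    """Expected reward of each pure action against a mixed strategy."""
    return [
        sum(p * payoff(a, b) for b, p in enumerate(opponent_mix))
        for a in range(3)
    ]


uniform = [Fraction(1, 3)] * 3
always_rock = [Fraction(1), Fraction(0), Fraction(0)]

# Against the uniform mix, every action has the same expected reward,
# so no deviation is profitable.
print(expected_payoffs(uniform))

# Against a pure Rock player, one action is strictly better than the rest.
rewards = expected_payoffs(always_rock)
best = max(range(3), key=lambda a: rewards[a])
print(ACTIONS[best], rewards[best])
```

Exact fractions are used instead of floats so that the zero expected reward is not blurred by rounding.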

Slides and more information: [Game Theory 101: Rock, Paper, Scissors](https://www.youtube.com/watch?v=-1GDMXoMdaY&ab_channel=CrashCourse)

<a id="2"></a>
<h2 style='background:#FBE338; border:0; color:black'><center>Agent Code</center></h2>

To create an agent for this competition, we must put its code in a \*.py file.   
To do this, we can use the [magic commands](https://ipython.readthedocs.io/en/stable/interactive/magics.html) of Jupyter Notebooks.    
One of these commands is [writefile](https://ipython.readthedocs.io/en/stable/interactive/magics.html#cellmagic-writefile), which writes the contents of the cell to a file.

Let's create an agent that returns a random integer from 0 to 2 (inclusive) on every step (the Nash equilibrium strategy).   
**You must also put all the necessary imports in the \*.py file; in our example, this is the `random` module.**

In [None]:
%%writefile submission.py

import random

def nash_equilibrium_agent(observation, configuration):
    return random.randint(0, 2)
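
As a quick sanity check, the agent can be called many times outside the environment to confirm that each action comes up about a third of the time. This is only a sketch: the agent is redefined here so the block runs on its own, `None` is passed for both arguments because this agent ignores them, and the fixed seed and sample size are arbitrary choices.

```python
import random
from collections import Counter


def nash_equilibrium_agent(observation, configuration):
    return random.randint(0, 2)


N = 30000
random.seed(0)  # fixed seed only so the check is repeatable

# Count how often each action (0, 1, 2) is returned.
counts = Counter(nash_equilibrium_agent(None, None) for _ in range(N))
frequencies = {action: counts[action] / N for action in range(3)}
print(frequencies)  # each value should be close to 1/3
```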

<a id="3"></a>
<h2 style='background:#FBE338; border:0; color:black'><center>Battle Examples</center></h2>

We need to import the library for creating environments and simulating agent battles:

In [None]:
from kaggle_environments import make

In [None]:
env = make("rps", configuration={"episodeSteps": 1000})

Let's create a second agent that copies our previous action.

In [None]:
%%writefile copy_opponent_agent.py

def copy_opponent_agent(observation, configuration):
    if observation.step > 0:
        return observation.lastOpponentAction
    else:
        return 0
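
To see what this agent does without running a full episode, it can be called by hand. In the sketch below, `SimpleNamespace` is only a stand-in for the observation object that the environment passes in (an assumption made for illustration), and the agent is repeated so the block is self-contained.

```python
from types import SimpleNamespace


def copy_opponent_agent(observation, configuration):
    if observation.step > 0:
        return observation.lastOpponentAction
    else:
        return 0


# Stand-in observations: only the fields the agent reads are provided.
first_step = SimpleNamespace(step=0)
later_step = SimpleNamespace(step=5, lastOpponentAction=2)

print(copy_opponent_agent(first_step, None))  # 0: no history yet, so it plays Rock
print(copy_opponent_agent(later_step, None))  # 2: repeats the opponent's last action
```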

Now let's simulate the battle between `nash_equilibrium_agent` and `copy_opponent_agent`:

In [None]:
# nash_equilibrium_agent vs copy_opponent_agent
env.run(["submission.py", "copy_opponent_agent.py"])

env.render(mode="ipython", width=800, height=800)

In [None]:
%%writefile reactionary.py

import random
from kaggle_environments.envs.rps.utils import get_score

last_react_action = None


def reactionary(observation, configuration):
    global last_react_action
    if observation.step == 0:
        last_react_action = random.randrange(0, configuration.signs)
    elif get_score(last_react_action, observation.lastOpponentAction) <= 1:
        last_react_action = (observation.lastOpponentAction + 1) % configuration.signs

    return last_react_action

In [None]:
# nash_equilibrium_agent vs nash_equilibrium_agent
env.run(["submission.py", "submission.py"])

env.render(mode="ipython", width=800, height=800)

In [None]:
# nash_equilibrium_agent vs reactionary
env.run(["submission.py", "reactionary.py"])

env.render(mode="ipython", width=800, height=800)
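
The three matchups above can also be approximated without `kaggle_environments`. The sketch below is a simplified stand-alone simulation, not the environment's own scoring code: the agent signatures, the `play` helper, and the `counter_agent` (a simplified opponent that always plays the action beating our previous one) are all hypothetical. It illustrates that the random strategy ends close to a zero total score against each kind of opponent.

```python
import random


def payoff(a, b):
    """Reward for playing action a against action b: 1 win, 0 tie, -1 loss."""
    d = (a - b) % 3
    return 0 if d == 0 else (1 if d == 1 else -1)


def random_agent(my_last, opp_last):
    return random.randint(0, 2)


def copy_agent(my_last, opp_last):
    # Repeats the opponent's previous action; plays Rock on the first step.
    return 0 if opp_last is None else opp_last


def counter_agent(my_last, opp_last):
    # Plays the action that beats the opponent's previous action.
    return random.randrange(3) if opp_last is None else (opp_last + 1) % 3


def play(agent_a, agent_b, steps=1000):
    """Return agent_a's total score over the given number of steps."""
    last_a = last_b = None
    total = 0
    for _ in range(steps):
        a = agent_a(last_a, last_b)
        b = agent_b(last_b, last_a)
        total += payoff(a, b)
        last_a, last_b = a, b
    return total


random.seed(0)  # fixed seed only so the run is repeatable
opponents = [("copy", copy_agent), ("counter", counter_agent), ("random", random_agent)]
for name, opponent in opponents:
    print(name, play(random_agent, opponent))
```

Because the random agent's move is independent of everything the opponent knows, each step has expected reward 0, so the totals fluctuate around zero regardless of the opponent's logic.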