# Memory Patterns vs Nash Equilibrium: Rock Paper Scissors


### 100 seasons of Memory Patterns vs Nash on Rock Paper Scissors
### 1000 episodes per season

### Bonus: Dataset generation

<a id="1"></a>
<h2 style='background:#FBE338; border:0; color:black'><center>Agent: Nash Equilibrium</center></h2>

![](https://upload.wikimedia.org/wikipedia/commons/thumb/a/a9/John_Forbes_Nash%2C_Jr._by_Peter_Badge.jpg/220px-John_Forbes_Nash%2C_Jr._by_Peter_Badge.jpg)

*...if we all go for the blonde we are blocking each other.*

In [None]:
%%writefile nash_equilibrium.py

import random

def nash_equilibrium(observation, configuration):
    return random.randint(0, 2)
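
Playing each action with probability 1/3 is the equilibrium strategy for Rock Paper Scissors because its expected payoff is zero against any opponent mix, so no opponent can exploit it. The sketch below checks this with exact fractions. The helper names (`payoff`, `expected_payoff`) are illustrative and not part of the agent file above.

```python
from fractions import Fraction

def payoff(mine, theirs):
    """Return +1 for a win, -1 for a loss, 0 for a tie (0 rock, 1 paper, 2 scissors)."""
    diff = (mine - theirs) % 3
    if diff == 0:
        return 0
    return 1 if diff == 1 else -1

def expected_payoff(my_mix, opp_mix):
    """Expected payoff of mixed strategy my_mix against mixed strategy opp_mix."""
    return sum(my_mix[a] * opp_mix[b] * payoff(a, b)
               for a in range(3) for b in range(3))

uniform = [Fraction(1, 3)] * 3
always_rock = [Fraction(1), Fraction(0), Fraction(0)]
skewed = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]

# The uniform mix neither gains nor loses on average, whatever the opponent does.
print(expected_payoff(uniform, always_rock))
print(expected_payoff(uniform, skewed))
```

The same holds in the other direction: no strategy, however clever, has a positive expected score against the uniform mix.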

<a id="1"></a>
<h2 style='background:#FBE338; border:0; color:black'><center>Agent: Memory Patterns</center></h2>

[Memory_Patterns](https://www.kaggle.com/yegorbiryukov/rock-paper-scissors-with-memory-patterns) by [Yegor Biryukov](https://www.kaggle.com/yegorbiryukov)

In [None]:
%%writefile memory_patterns.py

import random

# pattern length in actions: 6 steps x 2 actions per step (the agent's and the opponent's)
memory_length = 12
# current memory of the agent
current_memory = []
# list of memory patterns
memory_patterns = []

def find_pattern(memory):
    """ find appropriate pattern in memory """
    for pattern in memory_patterns:
        actions_matched = 0
        for i in range(memory_length):
            if pattern["actions"][i] == memory[i]:
                actions_matched += 1
            else:
                break
        # if memory fits this pattern
        if actions_matched == memory_length:
            return pattern
    # appropriate pattern not found
    return None

def my_agent(obs, conf):
    """ your ad here """
    # if it's not first step, add opponent's last action to agent's current memory
    if obs["step"] > 0:
        current_memory.append(obs["lastOpponentAction"])
    # if length of current memory is bigger than necessary for a new memory pattern
    if len(current_memory) > memory_length:
        # get memory of the previous step
        previous_step_memory = current_memory[:memory_length]
        previous_pattern = find_pattern(previous_step_memory)
        if previous_pattern is None:
            previous_pattern = {
                "actions": previous_step_memory.copy(),
                "opp_next_actions": [
                    {"action": 0, "amount": 0, "response": 1},
                    {"action": 1, "amount": 0, "response": 2},
                    {"action": 2, "amount": 0, "response": 0}
                ]
            }
            memory_patterns.append(previous_pattern)
        for action in previous_pattern["opp_next_actions"]:
            if action["action"] == obs["lastOpponentAction"]:
                action["amount"] += 1
        # delete first two elements in current memory (actions of the oldest step in current memory)
        del current_memory[:2]
    my_action = random.randint(0, 2)
    pattern = find_pattern(current_memory)
    if pattern is not None:
        my_action_amount = 0
        for action in pattern["opp_next_actions"]:
            # if this opponent's action occurred more times than currently chosen action
            # or, if it occurred the same number of times, choose randomly among them
            if (action["amount"] > my_action_amount or
                    (action["amount"] == my_action_amount and random.random() > 0.5)):
                my_action_amount = action["amount"]
                my_action = action["response"]
    current_memory.append(my_action)
    return my_action
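
The agent above keeps a sliding window of the last 12 actions (6 steps, interleaved as agent action then opponent action), counts which action the opponent played after each window it has seen, and answers with the action that beats the most frequent one. The sketch below shows the same idea with a dictionary keyed by the window. It is a hypothetical variant for illustration, not the original author's code: `pattern_counts`, `record` and `respond` are made-up names, and ties are broken uniformly at random, which differs slightly from the coin-flip tie-break above.

```python
import random
from collections import defaultdict

MEMORY_LENGTH = 12  # 6 steps, each stored as (agent action, opponent action)

# window (as a tuple) -> how often the opponent followed it with rock, paper, scissors
pattern_counts = defaultdict(lambda: [0, 0, 0])

def record(window, opp_next_action):
    """Remember that the opponent played opp_next_action right after this window."""
    pattern_counts[tuple(window)][opp_next_action] += 1

def respond(window, rng=random):
    """Counter the opponent's most frequent follow-up, or play randomly if the window is new."""
    key = tuple(window)
    if key not in pattern_counts:
        return rng.randint(0, 2)
    counts = pattern_counts[key]
    best = max(counts)
    likely = rng.choice([a for a, c in enumerate(counts) if c == best])
    # (a + 1) % 3 beats a: paper beats rock, scissors beats paper, rock beats scissors
    return (likely + 1) % 3

window = [0, 1] * 6          # six steps of "agent rock, opponent paper"
record(window, 2)            # opponent followed with scissors twice...
record(window, 2)
record(window, 0)            # ...and with rock once
print(respond(window))       # counters scissors, the most frequent follow-up
```

A dictionary lookup avoids scanning every stored pattern on each step, which matters as the pattern list grows over a 1000-step game.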


<a id="11"></a>
<h2 style='background:#FBE338; border:0; color:black'><center>Validate</center></h2>




In [None]:
from kaggle_environments import make

env = make("rps", configuration={"episodeSteps": 1000})

env.run(["memory_patterns.py", "nash_equilibrium.py"])

env.render(mode="ipython", width=800, height=800)

<a id="11"></a>
<h2 style='background:#FBE338; border:0; color:black'><center>Action</center></h2>



In [None]:
seasons = 100
episodes = 1000

In [None]:
import numpy as np
import pandas as pd
import json

import matplotlib.pyplot as plt
import seaborn as sns

from kaggle_environments import make

from IPython.display import Markdown as md

# Collect rows in plain lists and build the DataFrames once at the end:
# DataFrame.append was removed in pandas 2.0, and row-by-row appends are slow.
action_rows = []
leaderboard_rows = []

env = make("rps", configuration={"episodeSteps": episodes})

for season in range(seasons):
    env.reset()
    results = env.run(["memory_patterns.py", "nash_equilibrium.py"])
    for result in results:
        # skip the initial step
        if result[0].observation.step == 0:
            continue
        action_rows.append({"season": season,
                            "episode": result[0].observation.step,
                            "Memory Action": result[0].action,
                            "Nash Action": result[1].action,
                            "Memory Reward": result[0].reward,
                            "Nash Reward": result[1].reward})
        # the final step of a season carries the season's final rewards
        if result[0].status == "DONE":
            leaderboard_rows.append({"season": season,
                                     "Memory Reward": result[0].reward,
                                     "Nash Reward": result[1].reward})

action_board = pd.DataFrame(action_rows, columns=["season",
                                                  "episode",
                                                  "Memory Action",
                                                  "Nash Action",
                                                  "Memory Reward",
                                                  "Nash Reward"])
leaderboard = pd.DataFrame(leaderboard_rows, columns=["season",
                                                      "Memory Reward",
                                                      "Nash Reward"])
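
To make the bookkeeping in the loop above easier to follow, here is a minimal local stand-in that plays two agent functions against each other without `kaggle_environments`. It is a sketch under stated assumptions: the reward is treated as a running score (+1 per win, -1 per loss), observations are plain dictionaries with `step` and `lastOpponentAction` keys, and `score`, `play_season` and `tally` are hypothetical helpers, not part of the Kaggle API.

```python
def score(a, b):
    """+1 if action a beats b, -1 if it loses, 0 on a tie (0 rock, 1 paper, 2 scissors)."""
    return (0, 1, -1)[(a - b) % 3]

def play_season(agent_a, agent_b, steps):
    """Play one season and return one row per step, mirroring the action_board columns."""
    rows = []
    reward_a = 0
    last_a = last_b = None
    for step in range(steps):
        # Each agent sees the step number and the opponent's previous action.
        act_a = agent_a({"step": step, "lastOpponentAction": last_b}, None)
        act_b = agent_b({"step": step, "lastOpponentAction": last_a}, None)
        reward_a += score(act_a, act_b)  # assumption: reward is a running score
        rows.append({"episode": step + 1,
                     "Memory Action": act_a,
                     "Nash Action": act_b,
                     "Memory Reward": reward_a,
                     "Nash Reward": -reward_a})
        last_a, last_b = act_a, act_b
    return rows

def tally(final_rewards):
    """Count seasons won, lost and tied from the first agent's final rewards."""
    wins = sum(r > 0 for r in final_rewards)
    losses = sum(r < 0 for r in final_rewards)
    ties = sum(r == 0 for r in final_rewards)
    return wins, losses, ties

# Two trivial agents with the same (observation, configuration) signature as above.
def always_paper(obs, conf):
    return 1

def always_rock(obs, conf):
    return 0

rows = play_season(always_paper, always_rock, steps=5)
print(rows[-1]["Memory Reward"], rows[-1]["Nash Reward"])
```

The last row of each season plays the role of a leaderboard entry, and `tally` corresponds to the win, loss and tie counts reported in the Result section.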

<h1 style='background:#FBE338; border:0; color:black'><center>Result</center></h1>


In [None]:
md('# Memory Patterns - Nash Equilibrium : {} - {}'.format(len(leaderboard[leaderboard["Memory Reward"] > 0]), len(leaderboard[leaderboard["Nash Reward"] > 0])))

In [None]:
md('# Tie : {}'.format(len(leaderboard[leaderboard["Memory Reward"] == 0])))

In [None]:
if (len(leaderboard[leaderboard["Memory Reward"] > 0]) == len(leaderboard[leaderboard["Nash Reward"] > 0])):
    winner = "Tie!"
elif (len(leaderboard[leaderboard["Memory Reward"] > 0]) > len(leaderboard[leaderboard["Nash Reward"] > 0])):
    winner = "Winner is Memory Patterns!"
else:
    winner = "Winner is Nash!"
md('<a id="11"></a><h1 style=\'background:#FBE338; border:0; color:black\'><center>{}</center></h1>'.format(winner))
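
Because every strategy has an expected score of zero against a uniformly random opponent, a lead in season wins may simply be chance. One way to check is an exact two-sided sign test on wins versus losses. The sketch below assumes seasons are independent and drops tied seasons; `sign_test_p_value` is a hypothetical helper, not something the notebook defines.

```python
from math import comb

def sign_test_p_value(wins, losses):
    """Two-sided exact binomial p-value for wins vs losses under a fair coin (ties dropped)."""
    n = wins + losses
    if n == 0:
        return 1.0
    k = min(wins, losses)
    # Probability of a split at least this lopsided in one direction...
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # ...doubled for the two-sided test and capped at 1.
    return min(1.0, 2 * tail)

# Example with made-up counts: 55 seasons won, 45 lost, ties ignored.
print(sign_test_p_value(55, 45))
```

With the notebook's data, the two arguments would be the counts of seasons with `Memory Reward > 0` and `Nash Reward > 0`. A large p-value suggests the declared winner is not distinguishable from a coin flip.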

<h1 style='background:#FBE338; border:0; color:black'><center>Analysis</center></h1>

## Season results

In [None]:
leaderboard.plot(subplots=True, figsize=(15,10))

## Season reward histogram

In [None]:
leaderboard[['Memory Reward', 'Nash Reward']].plot.hist(bins=10,  alpha=0.5, figsize=(15,10))

## Actions histogram

In [None]:
action_board[['Memory Action', 'Nash Action']].plot.hist(bins=3, alpha=0.5, xticks=[0,1,2], figsize=(15,10))

## Rewards across all episodes

In [None]:
fig, ax = plt.subplots(figsize=(20,10))
for i, g in action_board.groupby('season'):
    g.plot(x='episode', y='Memory Reward', ax=ax, legend=False )

## First-half rewards

In [None]:
fig, ax = plt.subplots(figsize=(20,15))
for i, g in action_board[(action_board['episode']<episodes/2)].groupby('season'):
    g.plot(x='episode', y='Memory Reward', ax=ax, legend=False )

## Middle-third rewards

In [None]:
fig, ax = plt.subplots(figsize=(20,15))
for i, g in action_board[((action_board['episode']>episodes/3) & (action_board['episode']<2*episodes/3))].groupby('season'):
    g.plot(x='episode', y='Memory Reward', ax=ax, legend=False )

## Second-half rewards

In [None]:
fig, ax = plt.subplots(figsize=(20,15))
for i, g in action_board[action_board['episode']>episodes/2].groupby('season'):
    g.plot(x='episode', y='Memory Reward', ax=ax, legend=False )

<h1 style='background:#FBE338; border:0; color:black'><center>Conclusion</center></h1>

<h1 style='background:#FBE338; border:0; color:black'><center>Dataset</center></h1>

The data generated in this notebook is exported and publicly shared in the [Rock Paper Scissors Agents Battles](https://www.kaggle.com/jumaru/rock-paper-scissors-agents-battles) dataset.

## Leaderboard

### First 5 seasons rewards

In [None]:
leaderboard.head()

### Last 5 seasons rewards

In [None]:
leaderboard.tail()

## Rewards Statistics 

In [None]:
leaderboard.describe()

## Action board

### First 5 actions

In [None]:
action_board.head()

### Last 5 actions

In [None]:
action_board.tail()

## Actions Statistics

In [None]:
action_board.drop(columns='season').describe()

# Data export

In [None]:
# Report boards
leaderboard_csv = 'Memory_Patterns_leaderboard_S' + str(seasons) + 'E' + str(episodes) + '.csv'
action_board_csv = 'Memory_Patterns_action_board_S'+ str(seasons) + 'E' + str(episodes) + '.csv'
leaderboard.to_csv(leaderboard_csv)
action_board.to_csv(action_board_csv)
print(leaderboard_csv)
print(action_board_csv)

# References

* [Rock Paper Scissors - Nash Equilibrium Strategy](https://www.kaggle.com/ihelon/rock-paper-scissors-nash-equilibrium-strategy) & [Rock Paper Scissors - Agents Comparison](https://www.kaggle.com/ihelon/rock-paper-scissors-agents-comparison) by [Yaroslav Isaienkov](https://www.kaggle.com/ihelon)
* [(Not so) Markov](https://www.kaggle.com/alexandersamarin/not-so-markov) by [Alexander Samarin](https://www.kaggle.com/alexandersamarin)
* [LB simulation](https://www.kaggle.com/superant/lb-simulation) by [Ant 🐜](https://www.kaggle.com/superant)
* [Memory_Patterns](https://www.kaggle.com/yegorbiryukov/rock-paper-scissors-with-memory-patterns) by [Yegor Biryukov](https://www.kaggle.com/yegorbiryukov)