<a id="top"></a>
<h1 align='center' style='padding:5%; color:white; background:#646464;'>Contents</h1>

## XG Boost agents
* [Agent: Hit The Last Own Action](#1)
* [Agent: Rock](#2)
* [Agent: Paper](#3)
* [Agent: Scissors](#4)
* [Agent: Copy Opponent](#5)
* [Agent: Reactionary](#6)
* [Agent: Counter Reactionary](#7)
* [Agent: Statistical](#8)
* [Agent: Nash Equilibrium](#9)
* [Agent: Statistical Prediction](#10)

### Performance
* [Results](#103)
* [Review](#104)


In [None]:
%%writefile random_forest_random.py
import random
import pandas as pd
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
import xgboost as xgb
from xgboost.sklearn import XGBClassifier

actions =  np.empty((0,0), dtype = int)
observations =  np.empty((0,0), dtype = int)
total_reward = 0

def random_forest_random(observation, configuration):
    global actions, observations, total_reward
    
    if observation.step == 0:
        action = random.randint(0,2)
        actions = np.append(actions , [action])
        return action
    
    if observation.step == 1:
        action = random.randint(0,2)
        actions = np.append(actions , [action])
        observations = np.append(observations , [observation.lastOpponentAction])
        # Keep track of score
        winner = int((3 + actions[-1] - observation.lastOpponentAction) % 3)
        if winner == 1:
            total_reward = total_reward + 1
        elif winner == 2:
            total_reward = total_reward - 1        
        return action

    # Record the opponent's last action so both tables (actions & observations) have the same length.
    observations = np.append(observations , [observation.lastOpponentAction])
    
    # Prepare Data for training
    # Drop the last row ([:-1]): the opponent's response to it is not known yet.
    X_train = np.vstack((actions[:-1], observations[:-1])).T
    
    # Create Y by rolling observations to bring future a step earlier 
    shifted_observations = np.roll(observations, -1)
    
    # trim rolled & last element from rolled observations
    y_train = shifted_observations[:-1].T
    
    # Set the history period. Long chains here will need a lot of time
    if len(X_train) > 25:
        random_window_size = 10 + random.randint(0,10)
        X_train = X_train[-random_window_size:]
        y_train = y_train[-random_window_size:]
   
    # Train a classifier model
    model = XGBClassifier(
        learning_rate=0.01,
        n_estimators=25,
        nthread=4,
    )
    model.fit(X_train, y_train)

    # Predict
    X_test = np.empty((0,0), dtype = int)
    X_test = np.append(X_test, [int(actions[-1]), observation.lastOpponentAction])
    prediction = model.predict(X_test.reshape(1, -1))

    # Keep track of score
    winner = int((3 + actions[-1] - observation.lastOpponentAction) % 3)
    if winner == 1:
        total_reward = total_reward + 1
    elif winner == 2:
        total_reward = total_reward - 1
   
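    # Prepare action: play the move that beats the predicted opponent action
    predicted = int(prediction[0])  # predict() returns a one-element array
    action = (predicted + 1) % 3
    
    # If losing a bit then change strategy and break the patterns by playing a bit random
    # (win_tie == 0 ties with the predicted move, win_tie == 1 beats it)
    if total_reward < -2:
        win_tie = random.randint(0,1)
        action = (predicted + win_tie) % 3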

    # Update actions
    actions = np.append(actions , [action])

    # Action 
    return action 
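The feature and label construction inside the agent can be checked in isolation. The sketch below is a minimal reproduction, assuming two equal-length move histories; `build_training_set` is a hypothetical helper name and is not part of the notebook.

```python
import numpy as np

def build_training_set(actions, observations):
    """Pair each round (own action, opponent action) with the opponent's NEXT action.

    Hypothetical helper: mirrors the X_train / y_train steps of the agent above.
    """
    actions = np.asarray(actions, dtype=int)
    observations = np.asarray(observations, dtype=int)
    # Features: one row per round, except the last round (its label is unknown).
    X = np.vstack((actions[:-1], observations[:-1])).T
    # Labels: opponent actions shifted one step earlier; the wrapped-around
    # last element produced by np.roll is trimmed off.
    y = np.roll(observations, -1)[:-1]
    return X, y

X, y = build_training_set([0, 1, 2, 0], [1, 1, 2, 0])
print(X.tolist())  # [[0, 1], [1, 1], [2, 2]]
print(y.tolist())  # [1, 2, 0]
```

Each row of `X` holds one round, and the matching entry of `y` is what the opponent played in the following round, which is the value the classifier is asked to predict.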

<a id="1"></a>
<h1 align='center' style='padding:5%; background:#646464; color:white'>Agent: Hit The Last Own Action</h1>

The idea of the agent:

- Many agents use a simple baseline: copy the opponent's last action.
- Against such an agent, our last action becomes the opponent's next action, so we can simply play the move that beats our own last action.

In [None]:
%%writefile hit_the_last_own_action.py

my_last_action = 0

def hit_the_last_own_action(observation, configuration):
    global my_last_action
    my_last_action = (my_last_action + 1) % 3
    
    return my_last_action
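To see why this works against a copying opponent, the rounds can be simulated without the Kaggle environment. This is a minimal sketch under stated assumptions: `outcome` and `simulate` are hypothetical helpers, the copying opponent is modelled inline, and its first move is passed in as a parameter instead of being drawn at random.

```python
def outcome(mine, theirs):
    """Return 1 if `mine` beats `theirs`, -1 if it loses, 0 on a tie.

    Moves are encoded as 0 = rock, 1 = paper, 2 = scissors.
    """
    return {0: 0, 1: 1, 2: -1}[(mine - theirs) % 3]

def simulate(rounds, opponent_first_move=0):
    """Play 'hit the last own action' against an opponent that copies us."""
    my_last_action = 0
    results = []
    for step in range(rounds):
        # The copying opponent repeats our previous action (fixed move on step 0).
        opponent_action = opponent_first_move if step == 0 else my_last_action
        # Same rule as the agent above: play what beats our own last action.
        my_last_action = (my_last_action + 1) % 3
        results.append(outcome(my_last_action, opponent_action))
    return results

print(simulate(5))     # [1, 1, 1, 1, 1]
print(simulate(5, 2))  # [-1, 1, 1, 1, 1]
```

Only the first round depends on the opponent's opening move; every later round is a win, because the opponent always plays the move we have just moved away from.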

<a id="2"></a>
<h1 align='center' style='padding:5%; background:#646464; color:white'>Agent: Rock</h1>

The idea of the agent:

- Always uses Rock action

In [None]:
%%writefile rock.py

def rock(observation, configuration):
    return 0

<a id="3"></a>
<h1 align='center' style='padding:5%; background:#646464; color:white'>Agent: Paper</h1>

The idea of the agent:

- Always uses Paper action

In [None]:
%%writefile paper.py

def paper(observation, configuration):
    return 1


<a id="4"></a>
<h1 align='center' style='padding:5%; background:#646464; color:white'>Agent: Scissors</h1>

The idea of the agent:

- Always uses Scissors action

In [None]:
%%writefile scissors.py

def scissors(observation, configuration):
    return 2

<a id="5"></a>
<h1 align='center' style='padding:5%; background:#646464; color:white'>Agent: Copy Opponent</h1>

The idea of the agent:

- Copy the last action of the opponent

In [None]:
%%writefile copy_opponent.py

import random
from kaggle_environments.envs.rps.utils import get_score

def copy_opponent(observation, configuration):
    if observation.step > 0:
        return observation.lastOpponentAction
    else:
        return random.randrange(0, configuration.signs)

<a id="6"></a>
<h1 align='center' style='padding:5%; background:#646464; color:white'>Agent: Reactionary</h1>

The idea of the agent:

- Play the move that beats the opponent's last action

In [None]:
%%writefile reactionary.py

import random
from kaggle_environments.envs.rps.utils import get_score

last_react_action = None


def reactionary(observation, configuration):
    global last_react_action
    if observation.step == 0:
        last_react_action = random.randrange(0, configuration.signs)
    elif get_score(last_react_action, observation.lastOpponentAction) <= 1:
        last_react_action = (observation.lastOpponentAction + 1) % configuration.signs

    return last_react_action

<a id="7"></a>
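<h1 align='center' style='padding:5%; background:#646464; color:white'>Agent: Counter Reactionary</h1>

The idea of the agent:

- After a win, expect the opponent to counter our last action and play the move that beats that counter; otherwise, play the move that beats the opponent's last action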

In [None]:
%%writefile counter_reactionary.py

import random
from kaggle_environments.envs.rps.utils import get_score

last_counter_action = None


def counter_reactionary(observation, configuration):
    global last_counter_action
    if observation.step == 0:
        last_counter_action = random.randrange(0, configuration.signs)
    elif get_score(last_counter_action, observation.lastOpponentAction) == 1:
        last_counter_action = (last_counter_action + 2) % configuration.signs
    else:
        last_counter_action = (observation.lastOpponentAction + 1) % configuration.signs

    return last_counter_action

<a id="8"></a>
<h1 align='center' style='padding:5%; background:#646464; color:white'>Agent: Statistical</h1>

In [None]:
%%writefile statistical.py

import random
from kaggle_environments.envs.rps.utils import get_score

action_histogram = {}

def statistical(observation, configuration):
    global action_histogram
    if observation.step == 0:
        action_histogram = {}
        return
    action = observation.lastOpponentAction
    if action not in action_histogram:
        action_histogram[action] = 0
    action_histogram[action] += 1
    mode_action = None
    mode_action_count = None
    for k, v in action_histogram.items():
        if mode_action_count is None or v > mode_action_count:
            mode_action = k
            mode_action_count = v
            continue

    return (mode_action + 1) % configuration.signs
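The histogram loop above picks the opponent's most frequent move and plays what beats it. A compact way to express the same idea, shown here only as a sketch, uses `collections.Counter`; `counter_to_most_frequent` is a hypothetical helper that takes the full opponent history as a list instead of reading it from `observation`.

```python
from collections import Counter

def counter_to_most_frequent(opponent_history, signs=3):
    """Return the move that beats the opponent's most frequent move so far."""
    counts = Counter(opponent_history)
    # most_common(1) returns a list with one (move, count) pair; on ties the
    # move seen first is returned, as in the strict `>` comparison above.
    most_frequent_move, _ = counts.most_common(1)[0]
    return (most_frequent_move + 1) % signs

print(counter_to_most_frequent([0, 0, 1, 2, 0]))  # 1 (paper beats the frequent rock)
print(counter_to_most_frequent([2, 2, 1]))        # 0 (rock beats the frequent scissors)
```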

<a id="9"></a>
<h1 align='center' style='padding:5%; background:#646464; color:white'>Agent: Nash Equilibrium</h1>

Reference - [Rock Paper Scissors - Nash Equilibrium Strategy](https://www.kaggle.com/ihelon/rock-paper-scissors-nash-equilibrium-strategy)

Nash Equilibrium Strategy: always play a uniformly random action

In [None]:
%%writefile nash_equilibrium.py

import random

def nash_equilibrium(observation, configuration):
    return random.randint(0, 2)
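Why a uniformly random agent cannot be exploited can be verified with exact arithmetic. The sketch below assumes the usual +1 / 0 / -1 payoff for win / tie / loss; `PAYOFF` and `expected_payoff` are hypothetical names introduced for illustration.

```python
from fractions import Fraction

# PAYOFF[mine][theirs] with 0 = rock, 1 = paper, 2 = scissors.
PAYOFF = [
    [0, -1, 1],   # rock:     ties rock, loses to paper, beats scissors
    [1, 0, -1],   # paper:    beats rock, ties paper, loses to scissors
    [-1, 1, 0],   # scissors: loses to rock, beats paper, ties scissors
]

def expected_payoff(my_probs, their_probs):
    """Expected per-round reward for two mixed strategies."""
    return sum(
        p * q * PAYOFF[i][j]
        for i, p in enumerate(my_probs)
        for j, q in enumerate(their_probs)
    )

uniform = [Fraction(1, 3)] * 3
always_rock = [1, 0, 0]
mostly_paper = [Fraction(1, 10), Fraction(8, 10), Fraction(1, 10)]

print(expected_payoff(uniform, always_rock))   # 0
print(expected_payoff(uniform, mostly_paper))  # 0
```

Every column of the payoff matrix sums to zero, so the uniform mix earns an expected reward of exactly zero against any opponent distribution. It cannot be beaten on average, but it also cannot profit from a predictable opponent.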

<a id="10"></a>
<h1 align='center' style='padding:5%; background:#646464; color:white'>Agent: Statistical Prediction</h1>

In [None]:
%%writefile statistical_prediction.py

import random
import pydash
from collections import Counter

# Create a small amount of starting history
history = {
    "guess":      [0,1,2],
    "prediction": [0,1,2],
    "expected":   [0,1,2],
    "action":     [0,1,2],
    "opponent":   [0,1],
}
def statistical_prediction_agent(observation, configuration):    
    global history
    actions         = list(range(configuration.signs))  # [0,1,2]
    last_action     = history['action'][-1]
    opponent_action = observation.lastOpponentAction if observation.step > 0 else 2
    
    history['opponent'].append(opponent_action)

    # Make weighted random guess based on the complete move history, weighted towards relative moves based on our last action 
    move_frequency       = Counter(history['opponent'])
    response_frequency   = Counter(zip(history['action'], history['opponent'])) 
    move_weights         = [ move_frequency.get(n,1) + response_frequency.get((last_action,n),1) for n in range(configuration.signs) ] 
    guess                = random.choices( population=actions, weights=move_weights, k=1 )[0]
    
    # Compare our guess to how our opponent actually played
    guess_frequency      = Counter(zip(history['guess'], history['opponent']))
    guess_weights        = [ guess_frequency.get((guess,n),1) for n in range(configuration.signs) ]
    prediction           = random.choices( population=actions, weights=guess_weights, k=1 )[0]

    # Repeat, but based on how many times our prediction was correct
    prediction_frequency = Counter(zip(history['prediction'], history['opponent']))
    prediction_weights   = [ prediction_frequency.get((prediction,n),1) for n in range(configuration.signs) ]
    expected             = random.choices( population=actions, weights=prediction_weights, k=1 )[0]

    # Play the +1 counter move
    action = (expected + 1) % configuration.signs
    
    # Persist state
    history['guess'].append(guess)
    history['prediction'].append(prediction)
    history['expected'].append(expected)
    history['action'].append(action)

    # Print debug information
    print('opponent_action                = ', opponent_action)
    print('move_weights,       guess      = ', move_weights, guess)
    print('guess_weights,      prediction = ', guess_weights, prediction)
    print('prediction_weights, expected   = ', prediction_weights, expected)
    print('action                         = ', action)
    print()
    
    return action
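The first weighting step of this agent can be worked through on a small history. The sketch below isolates the `move_weights` expression in a hypothetical helper of the same name, with the histories passed in as arguments instead of read from the global `history`.

```python
from collections import Counter

def move_weights(actions, opponent_moves, last_action, signs=3):
    """Weight each opponent move by its overall count plus its count
    in rounds where we played `last_action`."""
    move_frequency = Counter(opponent_moves)
    # Pairs of (our action, opponent move) stored at the same history index.
    response_frequency = Counter(zip(actions, opponent_moves))
    # The default of 1 for unseen entries keeps every weight positive,
    # so random.choices can still pick a move that has never occurred.
    return [
        move_frequency.get(n, 1) + response_frequency.get((last_action, n), 1)
        for n in range(signs)
    ]

weights = move_weights(
    actions=[0, 1, 2, 0, 0],
    opponent_moves=[1, 1, 2, 1, 1],
    last_action=0,
)
print(weights)  # [2, 7, 2]
```

Here the opponent played paper (1) four times overall and three times in rounds where we played rock (0), so paper receives weight 4 + 3 = 7, while the other two moves fall back to small default weights.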

<a id="103"></a>
<h1 align='center' style='padding:10%; background-color:#646464; color:white;'>Results</h1>

In [None]:
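# Assumes `scores` and `list_names` were produced by an evaluation cell that is not shown here.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df_scores = pd.DataFrame(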
    scores, 
    index=list_names, 
    columns=range(10),
)


plt.figure(figsize=(15, 10))
sns.heatmap(
    df_scores, annot=True, cbar=False, cmap='coolwarm', linewidths=1, linecolor='black', fmt="d"
)
plt.suptitle('Random Forest Random vs all agents', fontsize=20)
plt.title('Final Reward Score', fontsize=15)
plt.xticks(rotation=90, fontsize=15)
plt.yticks(fontsize=15);

In [None]:
df_review = pd.DataFrame()
df_review['Won'] = df_scores.select_dtypes(include='int').gt(0).sum(axis=1)
df_review['Tie'] = df_scores.select_dtypes(include='int').eq(0).sum(axis=1)
df_review['Lost'] = df_scores.select_dtypes(include='int').lt(0).sum(axis=1)
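The Won / Tie / Lost tally counts, per opponent, how many matches ended with a positive, zero or negative final reward. A small self-contained sketch with made-up numbers (the scores and opponent names below are illustrative only, not results from this notebook):

```python
import pandas as pd

# Hypothetical final rewards: one row per opponent, one column per match.
example_scores = pd.DataFrame(
    [[20, -5, 0], [0, 0, 3]],
    index=["rock", "copy_opponent"],
    columns=range(3),
)

example_review = pd.DataFrame({
    "Won": example_scores.gt(0).sum(axis=1),   # matches with reward > 0
    "Tie": example_scores.eq(0).sum(axis=1),   # matches with reward == 0
    "Lost": example_scores.lt(0).sum(axis=1),  # matches with reward < 0
})

print(example_review.loc["rock"].tolist())           # [1, 1, 1]
print(example_review.loc["copy_opponent"].tolist())  # [1, 2, 0]
```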

In [None]:
plt.figure(figsize=(5, 10))
sns.heatmap(
    df_review, annot=True, cbar=False, cmap='coolwarm', linewidths=1, linecolor='black', fmt="d"
)
plt.suptitle('Random Forest Random vs all agents', fontsize=20)
plt.title('Total games Won-Tie-Lost', fontsize=15)
plt.xticks(rotation=90, fontsize=15)
plt.yticks(fontsize=15);

<a id="104"></a>
<h1 align='center' style='padding:10%; background-color:#646464; color:white;'>Review</h1>

* `Random Forest Random` can identify the patterns of all simple agents within 5 actions.
* `Statistical` is an easy opponent: `Random Forest Random` performs better against it almost every time.
* Luck is crucial for the outcome against `Opponent Transition Matrix`, `Decision Tree Classifier` and `Statistical Prediction`, as the results can vary a lot between matches.
* The final conclusion is that `Random Forest Classifiers` can be used to predict an opponent's actions in `Rock-Paper-Scissors`, but advanced `defensive` mechanisms are required once the opponent identifies the pattern.

**Disclaimer: The above review is based on multiple runs of this notebook and was revised after viewing the submissions in the competition after it ended.**