# RPS - Rock Paper Scissors Agent - using PPL

In this notebook I show an experiment that simulates the game of Rock Paper Scissors (RPS).
I use two players:
<ol>
    <li>Simple Player - plays according to a Categorical distribution seeded by an alpha vector</li>
    <li>Inferring Player - models the opponent as a Probabilistic Program and uses observations to infer the latent alpha vector.<br>
        Using this inferred vector, the player tries to exploit the simple player.</li>
</ol>
    

In [76]:
import numpy as np
import scipy as sp
import scipy.stats  # load the submodule explicitly so that sp.stats is available
import pymc3 as pm


### Simple Player
The simple player creates a categorical distribution (with a Dirichlet prior) from a given alpha vector and returns <num_of_samples> samples from this distribution.

### Smart Player

The inferring player uses the MCMC inference toolset. Each round, this player takes the moves of the simple player as observations and runs posterior inference on the same probabilistic model.

### RPS Model
A probabilistic program that describes the decision process.<br>
    $dir \sim Dirichlet(\alpha_1, \alpha_2, \alpha_3)$<br>
    $Action \sim Categorical(dir)$
    


In [77]:
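def rps_player_model(alpha=[1, 1, 1], observed=None):
    with pm.Model() as model:
        # Latent action probabilities with a Dirichlet prior
        dirichlet = pm.Dirichlet('dirichlet', a=alpha)
        # Actions: conditioned on the data when observed is given, free otherwise
        phi = pm.Categorical('phi', p=dirichlet, observed=observed)
        return model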
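
To make the generative story concrete, here is a minimal NumPy-only sketch of the same two-step process: draw a probability vector from the Dirichlet, then draw an action from it. The function name `sample_actions` and the `seed` argument are mine, not part of the notebook, and drawing one fresh probability vector per action is my understanding of what prior-predictive sampling from this model does.

```python
import numpy as np

def sample_actions(alpha, num_of_samples, seed=0):
    """Draw actions (0=Rock, 1=Paper, 2=Scissors) from the Dirichlet-Categorical model."""
    rng = np.random.default_rng(seed)
    # Step 1: dir ~ Dirichlet(alpha), one probability vector per sample
    probs = rng.dirichlet(alpha, size=num_of_samples)
    # Step 2: Action ~ Categorical(dir)
    return np.array([rng.choice(3, p=p) for p in probs])

actions = sample_actions([1, 6, 1], num_of_samples=10)
print(actions)
```

With alpha = [1, 6, 1] most of the drawn actions should be Paper (1), since the marginal probability of action k is alpha_k divided by the sum of alpha.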

## The hierarchy

### Base class of all players

In [86]:
class Player:
    def __init__(self, id):
        self.id = id

    def move(self, history):
        raise NotImplementedError()

### Naive player

In [87]:
class NaivePlayer(Player):
    """Naive player chooses a move according to fixed probabilities.
    """
    def __init__(self, id, p=[1, 1, 1]):
        Player.__init__(self, id)
        p = np.array(p)
        p = p/sum(p)
        self.p = p
    
    def move(self, history):
        return np.argmax(sp.stats.multinomial.rvs(1, self.p))

### Frequentist Players

In [114]:
class FrequentistPlayer(Player):
    """Frequentist player uses prior history 
    to choose a move
    """
    def __init__(self, id, counts=None):
        Player.__init__(self, id)
        if counts is None:
            counts = [1, 1, 1]
        self.counts = counts
    
    def stats(self, history):
        counts = self.counts[:]
        for id, m in history:
            if id != self.id:
                counts[m] += 1
        return np.array(counts)

In [119]:
class FixedFrequentistPlayer(FrequentistPlayer):
    def __init__(self, id, counts=None):
        FrequentistPlayer.__init__(self, id, counts)
        
    def move(self, history):
        counts = self.stats(history)
        return np.argmax(counts)
    
# Example
ffp = FixedFrequentistPlayer(1)
print([ffp.move([(1, 1), (2, 1), (1, 0), (2, 1)]) for _ in range(10)])
print([ffp.move([(1, 1), (2, 2), (1, 0), (2, 1), (1, 2), (2, 2)]) for _ in range(10)])

[1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
[2, 2, 2, 2, 2, 2, 2, 2, 2, 2]


In [122]:
class RandomFrequentistPlayer(FrequentistPlayer):
    def __init__(self, id, counts=None):
        FrequentistPlayer.__init__(self, id, counts)
        
    def move(self, history):
        counts = self.stats(history)
        return np.argmax(sp.stats.multinomial.rvs(n=1, p=counts/sum(counts)))
    
# Example
rfp = RandomFrequentistPlayer(1)
print([rfp.move([(1, 1), (2, 1), (1, 0), (2, 1)]) for _ in range(10)])
print([rfp.move([(1, 1), (2, 2), (1, 0), (2, 1), (1, 2), (2, 2)]) for _ in range(10)])

[1, 0, 1, 2, 1, 1, 1, 2, 2, 2]
[0, 2, 1, 1, 2, 2, 1, 2, 2, 2]


In [None]:
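class BayesianPlayer(Player):
    def __init__(self, id, alpha=None):
        Player.__init__(self, id)
        if alpha is None:
            alpha = 1, 1, 1
        self.alpha = np.array(alpha)

    def model(self, observed=None):
        # This cell is cut off at `def model`; delegating to rps_player_model
        # is an assumed completion so that the class at least parses and runs.
        return rps_player_model(alpha=self.alpha, observed=observed)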

In [99]:
ROCK = 0
PAPER = 1
SCISSORS = 2
def score(m1, m2):
    # ROCK < PAPER < SCISSORS < ROCK
    if m1==m2:
        return 0
    score = -1
    if m1 > m2:
        m1, m2 = m2, m1
        score = -score
    if m2 - m1 == 2:
        score = -score
    return score

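def game(player1, player2, n=10):
    """Play n rounds and return the per-round scores from player1's point of view."""
    history = []
    scores = []
    for i in range(n):
        # Both players choose before either move is recorded,
        # so neither sees the other's move for the current round
        m1 = player1.move(history)
        m2 = player2.move(history)
        history.append([1, m1])
        history.append([2, m2])
        scores.append(score(m1, m2))
    return scores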

In [55]:
score(SCISSORS, ROCK)

-1
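
The swap-and-flip logic in `score` can be cross-checked against a shorter formulation based on modular arithmetic. This is only a sketch for verification; `score_mod` is a name I introduce here, not something the notebook defines.

```python
ROCK, PAPER, SCISSORS = 0, 1, 2

def score_mod(m1, m2):
    """Return +1 if m1 beats m2, -1 if m1 loses to m2, 0 on a tie."""
    # Each move beats the one just below it in the cycle ROCK < PAPER < SCISSORS < ROCK,
    # so (m1 - m2) % 3 is 0 for a tie, 1 for a win and 2 for a loss.
    return (0, 1, -1)[(m1 - m2) % 3]

# Full payoff table: row = first player's move, column = second player's move
payoff = [[score_mod(a, b) for b in range(3)] for a in range(3)]
for row in payoff:
    print(row)
```

The table is antisymmetric, and `score_mod(SCISSORS, ROCK)` gives -1, the same value as the cell above.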

### Posterior Inference
Inference with the PPL uses the Metropolis-Hastings algorithm, chosen because the action distribution is discrete; the latent Dirichlet vector itself is continuous.

In [3]:
def infer(model):
    with model:
        trace = pm.sample(step=pm.Metropolis(), model=model, return_inferencedata=True, progressbar=False)
        return trace
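
Because the Dirichlet prior is conjugate to the Categorical likelihood, the posterior for this particular model is also available in closed form as Dirichlet(alpha + counts). The sketch below is a sanity check to compare against the MCMC result, not a replacement for it; `dirichlet_posterior` and the example observations are hypothetical and do not come from the notebook.

```python
import numpy as np

def dirichlet_posterior(alpha, observed_moves):
    """Closed-form posterior of a Dirichlet prior after observing categorical moves."""
    alpha = np.asarray(alpha, dtype=float)
    # Count how often each of the three actions was observed
    counts = np.bincount(np.asarray(observed_moves, dtype=int), minlength=3)
    posterior_alpha = alpha + counts
    posterior_mean = posterior_alpha / posterior_alpha.sum()
    return posterior_alpha, posterior_mean

# Hypothetical observations: one Rock, seven Papers, two Scissors
post_alpha, post_mean = dirichlet_posterior([1, 1, 1], [1, 1, 2, 1, 0, 1, 1, 2, 1, 1])
print(post_alpha, post_mean)
```

The posterior mean of the sampled `dirichlet` variable should land close to `post_mean` when the chains have mixed well.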

### Posterior Predictive Sampling
Sample from the posterior predictive distribution and return, for each position, the median action over all draws, used here as a proxy for the most common action.<br>
The smart player uses this sampling to play.

In [4]:
def sample_from_posterior(model, trace):
    with model:
        posterior_pred = pm.sample_posterior_predictive(trace, progressbar=False)
        median_over_samples = np.median(posterior_pred['phi'], axis=0)
        return median_over_samples
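
A median over action labels is not always the most common action, and with an even number of draws it can fall between two labels. If the most common action is what is wanted, a column-wise mode is one alternative. This is a sketch under the assumption that the draws form an integer array of shape (n_draws, n_positions); `most_common_action` is a hypothetical helper and the `draws` array is made up for illustration.

```python
import numpy as np

def most_common_action(samples, n_actions=3):
    """Column-wise mode of an integer array shaped (n_draws, n_positions)."""
    samples = np.asarray(samples, dtype=int)
    # counts[a, j] = number of draws in which position j holds action a
    counts = np.stack([(samples == a).sum(axis=0) for a in range(n_actions)])
    return counts.argmax(axis=0)

# Made-up draws: 7 draws for 2 positions
draws = np.array([[0, 2], [0, 2], [0, 2], [1, 2], [1, 1], [2, 0], [2, 2]])
print(np.median(draws, axis=0))   # median per position
print(most_common_action(draws))  # mode per position
```

In the first column the median is 1 (Paper) although Rock is the most frequent label, which is the kind of disagreement the mode avoids.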

### Sampling from the model without observations
The simple player uses this sampling.<br> Given an alpha vector, it draws samples from the distribution.

In [5]:
def sample_from_prior(model, num_of_samples):
    with model:
        samples = pm.sample_prior_predictive(num_of_samples)['phi']
        return samples

### Simulation
In the experiment we watch these two players play against each other and examine the results.

### Some auxiliary functions for RPS
Rock Paper Scissors is a popular game.
It has 3 actions (Rock, Paper, Scissors); each action beats exactly one other action and loses to exactly one other.

In [6]:
def beats(i):
    return (i + 1) % 3

In [7]:
from enum import IntEnum

class RPS(IntEnum):
    ROCK = 0
    PAPER = 1
    SCISSORS = 2
    
def get_result(first_player, second_player):
    if first_player == second_player:
        return 0
    elif (first_player == RPS.ROCK and second_player == RPS.SCISSORS) or (
            first_player == RPS.PAPER and second_player == RPS.ROCK) or (
            first_player == RPS.SCISSORS and second_player == RPS.PAPER):
        return 1
    else:
        return -1
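
Once the opponent's action probabilities have been estimated, exploiting them means picking the move with the highest expected payoff. The following is a minimal sketch of that step, assuming a payoff of +1 for a win, 0 for a tie and -1 for a loss; `PAYOFF` and `best_response` are names introduced here for illustration.

```python
import numpy as np

# Row = our move, column = opponent's move (0=Rock, 1=Paper, 2=Scissors)
PAYOFF = np.array([[0, -1, 1],
                   [1, 0, -1],
                   [-1, 1, 0]])

def best_response(p_opponent):
    """Return the move with the highest expected payoff and all three expectations."""
    expected = PAYOFF @ np.asarray(p_opponent, dtype=float)
    return int(np.argmax(expected)), expected

# Opponent believed to play Paper 75% of the time
move, expected = best_response([0.125, 0.75, 0.125])
print(move, expected)
```

Here the best response is Scissors (2), with an expected payoff of 0.625 per move.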

### Simulator
We run <num_of_simulations> simulations.<br>
In each simulation the simple player draws a batch of actions from the probabilistic model parameterized by the alpha vector.<br>
The smart player runs inference on these observations and proposes one action per observed position (10 here).<br> The simple player then draws a fresh batch of 10 actions from the same distribution, and the simulator compares the two sequences move by move and updates the win, loss, and tie counts.<br>
At the end we look at the average number of wins for each player and the average number of ties per simulation.<br>
We check whether the smart player is really "smarter" than the simple player.

In [8]:
def simulate_with_latent_alpha(num_of_simulations=10, alpha=[1, 1, 1]):
    total_smart_player_wins = 0
    total_simple_player_wins = 0
    total_ties = 0

    for i in range(num_of_simulations):
        # Learning phase
        simple_player = rps_player_model(alpha=alpha)

        # Moves of the simple player that the smart player observes (redrawn every simulation)
        simple_player_observations = sample_from_prior(simple_player, num_of_samples=10)

        # gets a list of observed values and returns the distribution of probable action
        smart_player = rps_player_model(observed=simple_player_observations)
        trace = infer(smart_player)

        smart_player_next_moves = sample_from_posterior(smart_player, trace)
        smart_player_next_moves = list(map(beats, smart_player_next_moves))

        # Evaluation phase
        simple_player_next_moves = sample_from_prior(simple_player, num_of_samples=10)

        smart_player_wins = 0
        simple_player_wins = 0
        ties = 0

        for j in range(len(simple_player_next_moves)):
            result = get_result(smart_player_next_moves[j], simple_player_next_moves[j])
            if result > 0:
                smart_player_wins += 1
            elif result < 0:
                simple_player_wins += 1
            else:
                ties += 1
        total_smart_player_wins += smart_player_wins
        total_simple_player_wins += simple_player_wins
        total_ties += ties
        print(f'in simulation {i}: wins: {smart_player_wins}, loses: {simple_player_wins}, ties: {ties}')
    print(
        f'For opponent\'s alpha vector: {alpha} averages in all simulations is wins: {total_smart_player_wins / num_of_simulations} '
        f' loses:{total_simple_player_wins / num_of_simulations} ties:{total_ties / num_of_simulations}')
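
As a rough yardstick for reading the simulator's averages, one can compute what a player would expect if it knew the opponent's marginal action probabilities, alpha_k / sum(alpha), and always countered the most likely action. This is a back-of-the-envelope sketch, not the smart player's actual strategy; `yardstick` is a hypothetical helper.

```python
import numpy as np

def yardstick(alpha, moves_per_simulation=10):
    """Expected (wins, losses, ties) when always countering the most likely action."""
    p = np.asarray(alpha, dtype=float)
    p = p / p.sum()                  # marginal probability of each opponent action
    k = int(np.argmax(p))            # opponent's most likely action; we play (k + 1) % 3
    wins = p[k] * moves_per_simulation              # opponent plays k: we win
    ties = p[(k + 1) % 3] * moves_per_simulation    # opponent plays our move: tie
    losses = p[(k + 2) % 3] * moves_per_simulation  # opponent plays the move that beats ours
    return wins, losses, ties

for alpha in ([1, 10, 10], [10, 6, 1], [1, 6, 1], [1, 3, 5], [1, 1, 1]):
    wins, losses, ties = yardstick(alpha)
    print(f'alpha={alpha}: wins={wins:.2f} losses={losses:.2f} ties={ties:.2f}')
```

For alpha = [1, 6, 1] this gives 7.5 wins, 1.25 losses and 1.25 ties per 10 moves.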

### Experiments
I check the results for different alpha vectors played by the simple player.

alpha = [1, 10, 10] (Rock is played less often)

In [9]:
simulate_with_latent_alpha(alpha=[1, 10, 10])

Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 15 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 0: wins: 5, loses: 0, ties: 5


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 9 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 1: wins: 6, loses: 0, ties: 4


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 9 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 2: wins: 2, loses: 0, ties: 8


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 14 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 3: wins: 3, loses: 7, ties: 0


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 14 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 4: wins: 4, loses: 0, ties: 6


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 10 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 5: wins: 5, loses: 5, ties: 0


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 10 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 6: wins: 6, loses: 0, ties: 4


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 10 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 7: wins: 4, loses: 1, ties: 5


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 13 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 8: wins: 2, loses: 0, ties: 8


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 10 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 9: wins: 5, loses: 1, ties: 4
For opponent's alpha vector: [1, 10, 10] averages in all simulations is wins: 4.2  loses:1.4 ties:4.4


In [10]:
simulate_with_latent_alpha(alpha=[10, 6, 1])

Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 9 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 0: wins: 5, loses: 0, ties: 5


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 9 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 1: wins: 5, loses: 0, ties: 5


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 9 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 2: wins: 5, loses: 0, ties: 5


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 8 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 3: wins: 6, loses: 0, ties: 4


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 9 seconds.
The number of effective samples is smaller than 25% for some parameters.


in simulation 4: wins: 6, loses: 4, ties: 0


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 10 seconds.
The number of effective samples is smaller than 25% for some parameters.


in simulation 5: wins: 5, loses: 5, ties: 0


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 10 seconds.
The number of effective samples is smaller than 25% for some parameters.


in simulation 6: wins: 2, loses: 8, ties: 0


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 9 seconds.
The number of effective samples is smaller than 25% for some parameters.


in simulation 7: wins: 1, loses: 9, ties: 0


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 10 seconds.
The number of effective samples is smaller than 25% for some parameters.


in simulation 8: wins: 6, loses: 0, ties: 4


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 11 seconds.
The number of effective samples is smaller than 25% for some parameters.


in simulation 9: wins: 3, loses: 6, ties: 1
For opponent's alpha vector: [10, 6, 1] averages in all simulations is wins: 4.4  loses:3.2 ties:2.4


In [11]:
simulate_with_latent_alpha(alpha=[1, 6, 1])

Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 10 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 0: wins: 7, loses: 0, ties: 3


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 12 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 1: wins: 7, loses: 1, ties: 2


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 9 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 2: wins: 9, loses: 1, ties: 0


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 8 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 3: wins: 8, loses: 1, ties: 1


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 8 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 4: wins: 7, loses: 2, ties: 1


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 10 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 5: wins: 7, loses: 3, ties: 0


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 10 seconds.
The number of effective samples is smaller than 25% for some parameters.


in simulation 6: wins: 7, loses: 2, ties: 1


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 12 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 7: wins: 7, loses: 2, ties: 1


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 11 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 8: wins: 6, loses: 1, ties: 3


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 11 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 9: wins: 8, loses: 1, ties: 1
For opponent's alpha vector: [1, 6, 1] averages in all simulations is wins: 7.3  loses:1.4 ties:1.3


In [12]:
simulate_with_latent_alpha(alpha=[1, 3, 5])

Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 10 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 0: wins: 2, loses: 5, ties: 3


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 11 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 1: wins: 2, loses: 5, ties: 3


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 8 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 2: wins: 5, loses: 4, ties: 1


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 9 seconds.
The number of effective samples is smaller than 25% for some parameters.


in simulation 3: wins: 7, loses: 2, ties: 1


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 12 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 4: wins: 6, loses: 3, ties: 1


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 10 seconds.
The rhat statistic is larger than 1.05 for some parameters. This indicates slight problems during sampling.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 5: wins: 3, loses: 4, ties: 3


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 8 seconds.
The number of effective samples is smaller than 25% for some parameters.


in simulation 6: wins: 4, loses: 1, ties: 5


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 8 seconds.
The rhat statistic is larger than 1.05 for some parameters. This indicates slight problems during sampling.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 7: wins: 5, loses: 3, ties: 2


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 8 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 8: wins: 6, loses: 3, ties: 1


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 9 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 9: wins: 2, loses: 1, ties: 7
For opponent's alpha vector: [1, 3, 5] averages in all simulations is wins: 4.2  loses:3.1 ties:2.7


In [13]:
simulate_with_latent_alpha(alpha=[1, 1, 1])

Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 11 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 0: wins: 4, loses: 4, ties: 2


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 12 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 1: wins: 3, loses: 1, ties: 6


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 11 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 2: wins: 5, loses: 2, ties: 3


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 16 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 3: wins: 4, loses: 1, ties: 5


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 10 seconds.
The number of effective samples is smaller than 25% for some parameters.


in simulation 4: wins: 4, loses: 4, ties: 2


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 9 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 5: wins: 4, loses: 2, ties: 4


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 11 seconds.
The number of effective samples is smaller than 25% for some parameters.


in simulation 6: wins: 3, loses: 2, ties: 5


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 11 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 7: wins: 1, loses: 4, ties: 5


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 10 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.


in simulation 8: wins: 3, loses: 5, ties: 2


Multiprocess sampling (2 chains in 2 jobs)
Metropolis: [dirichlet]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 9 seconds.
The number of effective samples is smaller than 25% for some parameters.


in simulation 9: wins: 3, loses: 4, ties: 3
For opponent's alpha vector: [1, 1, 1] averages in all simulations is wins: 3.4  loses:2.9 ties:3.7


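### Summary
We can see from the experiments that the "smart" player is able to exploit the simple opponent.<br> The farther the opponent is from a completely random strategy, the better we exploit it: average wins vs. losses per simulation are 7.3 vs. 1.4 for alpha [1, 6, 1], but only 4.2 vs. 3.1 for alpha [1, 3, 5].<br> When the opponent plays completely at random (alpha [1, 1, 1]), the smart player does the same and the results are roughly even (3.4 wins vs. 2.9 losses).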
