In [22]:
import numpy as np
import cvxpy as cp

# 2.1.f Traditional Rock Paper Scissors

In [34]:
# payoff matrix of the rock-paper-scissors game

M = np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]])

print(M)
print('Rock = 0, Paper = 1, Scissors = 2')

[[ 0 -1  1]
 [ 1  0 -1]
 [-1  1  0]]
Rock = 0, Paper = 1, Scissors = 2


#### Assuming that the row-player’s strategy is to play rock with probability 1, derive the best-response strategy of the column-player 

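(i) Logical reasoning: when the row-player always plays rock, her payoffs against rock, paper and scissors are M[Rock,:] = (0, -1, 1). The column-player wants to minimize the row-player's payoff, so his best response is to play paper with probability 1.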
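
The reasoning in (i) can be checked directly with a small sketch. It assumes, as in the rest of the notebook, that M holds the payoffs of the row-player and that the column-player minimizes them; the names `row_payoffs` and `best_response` are illustrative and not part of the original notebook.

```python
import numpy as np

# payoff matrix of the row-player (Rock = 0, Paper = 1, Scissors = 2)
M = np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]])
names = ['rock', 'paper', 'scissors']

# payoffs of the row-player when she plays rock against each pure column strategy
row_payoffs = M[0, :]

# the column-player picks the pure strategy that minimizes the row-player's payoff
best_response = int(np.argmin(row_payoffs))
print(names[best_response], row_payoffs[best_response])
```

The smallest entry of the rock row is -1 and sits in the paper column, which matches the answer obtained by reasoning.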

In [35]:
Rock, Paper, Scissors = 0, 1, 2

# (ii) solving a linear program

# define the variables
x = cp.Variable(3, nonneg=True)

# define the objective function
obj = cp.Minimize(cp.sum(cp.multiply(M[Rock,:], x)))

# define the constraints
constraints = [cp.sum(x) == 1]

# define the problem
prob = cp.Problem(obj, constraints)

# solve the problem
prob.solve()

# print the solution
for name, idx in [('rock', Rock), ('paper', Paper), ('scissors', Scissors)]:
    print(f'The best response strategy of the column player is to play *{name}* with probability', round(x.value[idx], 3))

The best response strategy of the column player is to play *rock* with probability 0.0
The best response strategy of the column player is to play *paper* with probability 1.0
The best response strategy of the column player is to play *scissors* with probability 0.0


Duality theory can be used to reformulate the inner minimization of the minimax problem max(p∈∆) min(q∈∆) p^⊤Mq as a maximization problem.

For a fixed row strategy p ∈ ∆, the inner problem is the linear program:

minimize(q) p^⊤Mq subject to ∑ qj = 1, q ≥ 0

Let λ be the Lagrange multiplier associated with the constraint that the probabilities qj sum to 1. Because this is an equality constraint, λ is unrestricted in sign. Keeping q ≥ 0 as an explicit constraint, the Lagrangian function is:

L(q,λ) = p^⊤Mq + λ(1 - ∑ qj) = λ + ∑ qj((M^⊤p)j - λ)

The dual function is the infimum of the Lagrangian function over the primal variable q ≥ 0:

g(λ) = inf(q≥0) L(q,λ) = λ if (M^⊤p)j ≥ λ for all j, and -∞ otherwise

If some coefficient (M^⊤p)j - λ is negative, letting the corresponding qj grow drives the Lagrangian to -∞. If all coefficients are non-negative, the infimum is attained at q = 0 and equals λ.

The dual problem is therefore:

maximize(λ) λ subject to (M^⊤p)j ≥ λ for all j

Its optimal value is min(j) (M^⊤p)j, the payoff of the row-player against the best pure response of the column-player. The inner problem is a feasible and bounded linear program, so strong duality holds and the optimal values of the primal and the dual problem coincide.

Replacing the inner minimization by its dual turns the minimax problem into a single linear program:

maximize(p,λ) λ subject to (M^⊤p)j ≥ λ for all j, ∑ pi = 1, p ≥ 0

This is the maximization problem that is equivalent to the original minimax problem.
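
The key fact behind the inner minimization over q, namely that a linear function on the simplex ∆ attains its minimum at a pure strategy, can be checked numerically. The sketch below fixes an arbitrary row strategy `p` (an illustrative choice, not taken from the exercise) and compares the smallest entry of M^⊤p with the payoff against many randomly sampled mixed strategies q.

```python
import numpy as np

M = np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]])

# an arbitrary mixed strategy of the row-player (illustrative choice)
p = np.array([0.5, 0.3, 0.2])

# expected payoff of the row-player against each pure column strategy
payoff_vs_pure = M.T @ p

# value of the inner minimization: the smallest entry of M^T p
inner_value = payoff_vs_pure.min()

# sample mixed strategies q uniformly on the simplex and evaluate p^T M q
rng = np.random.default_rng(0)
Q = rng.dirichlet(np.ones(3), size=10_000)
sampled = Q @ payoff_vs_pure

# no sampled q does better for the column-player than the best pure response
print(inner_value, sampled.min() >= inner_value - 1e-12)
```

Sampling is not a proof, but it illustrates why the inner problem can be replaced by the constraints (M^⊤p)j ≥ λ for every pure strategy j.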

In [36]:
# Construct the Nash strategies of both players and report the expected payoff of the row-player. Interpret your results.

# define the variables: x is the strategy of the column player, y the strategy of the row player
x = cp.Variable((1,3), nonneg=True)
y = cp.Variable((1,3), nonneg=True)
t = cp.Variable()  # payoff the row player can guarantee herself
s = cp.Variable()  # payoff the column player can hold the row player down to

# define the objective function
# The row player maximizes t and the column player minimizes s.
# The two linear programs share no variables, so they can be solved jointly.
obj = cp.Maximize(t - s)

# define the constraints
constraints = [cp.sum(x) == 1, cp.sum(y) == 1,
               y @ M >= t,     # the row player earns at least t against every pure column strategy
               M @ x.T <= s]   # the row player earns at most s with every pure row strategy

# define the problem
prob = cp.Problem(obj, constraints)

# solve the problem
prob.solve()
print(f'Status: {prob.status}\n')

# print the solution
print( "The best response strategy of the *column* player is to play *rock* with probability", np.round(x.value, 3)[0, Rock])
print( "The best response strategy of the *column* player is to play *paper* with probability", np.round(x.value, 3)[0, Paper])
print( "The best response strategy of the *column* player is to play *scissors* with probability", np.round(x.value, 3)[0, Scissors])
print('')
print( "The best response strategy of the *row* player is to play *rock* with probability", np.round(y.value, 3)[0, Rock])
print( "The best response strategy of the *row* player is to play *paper* with probability", np.round(y.value, 3)[0, Paper])
print( "The best response strategy of the *row* player is to play *scissors* with probability", np.round(y.value, 3)[0, Scissors])

print('\nThe expected payoff of the row player is', np.round(t.value, 3))
print('the expected payoff of the column player is', np.round(-t.value, 3))

Status: optimal

The best response strategy of the *column* player is to play *rock* with probability 0.333
The best response strategy of the *column* player is to play *paper* with probability 0.333
The best response strategy of the *column* player is to play *scissors* with probability 0.333

The best response strategy of the *row* player is to play *rock* with probability 0.333
The best response strategy of the *row* player is to play *paper* with probability 0.333
The best response strategy of the *row* player is to play *scissors* with probability 0.333

The expected payoff of the row player is 0.0
the expected payoff of the column player is -0.0


Let p* be a Nash strategy for the row-player in the game represented by the matrix M, i.e. an optimal solution of the minimax problem

max(p∈∆) min(q∈∆) p^⊤Mq

and let v* denote its optimal value, so that:

v* = min(q∈∆) p*^⊤Mq (1)

A minimum over ∆ is a lower bound on the objective at every point of ∆. Therefore (1) implies, for any column strategy q ∈ ∆:

p*^⊤Mq ≥ v* (2)

In other words, the expected payoff of the row-player when she plays a Nash strategy cannot fall below the optimal value of the minimax problem, irrespective of the column-player's strategy.

Symmetrically, let q* be a Nash strategy for the column-player, i.e. an optimal solution of min(q∈∆) max(p∈∆) p^⊤Mq. By the minimax theorem, which follows from strong duality of linear programs, this problem has the same optimal value v*. Hence, for any row strategy p ∈ ∆:

p^⊤Mq* ≤ v* (3)

Combining (2) and (3) gives:

p^⊤Mq* ≤ v* ≤ p*^⊤Mq (4)

for all p, q ∈ ∆. Setting p = p* and q = q* shows that p*^⊤Mq* = v*, so neither player can improve his or her expected payoff by unilaterally deviating from the pair (p*, q*).
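
For the traditional game this guarantee can be illustrated numerically. The sketch below assumes the uniform strategy (1/3, 1/3, 1/3) reported by the solver above as the row-player's Nash strategy and 0 as the value of the game, then evaluates her payoff against randomly sampled column strategies. The sampling scheme and variable names are illustrative.

```python
import numpy as np

M = np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]])

# Nash strategy of the row-player in traditional rock-paper-scissors
p_star = np.ones(3) / 3
game_value = 0.0

# payoff of the row-player against each pure column strategy
payoff_vs_pure = p_star @ M

# payoff against randomly sampled mixed column strategies
rng = np.random.default_rng(1)
Q = rng.dirichlet(np.ones(3), size=1_000)
payoffs = Q @ payoff_vs_pure

# the payoff never falls below the value of the game
print(np.allclose(payoff_vs_pure, game_value), payoffs.min() >= game_value - 1e-12)
```

Against the uniform strategy every column strategy yields the same payoff, so the column-player has no way to push the row-player below the value of the game.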

# 2.1.g Modified Rock Paper Scissors

In [39]:
# Consider a modified rock-paper-scissors game, where the payoff of the row-player amounts to +2 instead of +1 if she wins by playing rock. 

# define the modified payoff matrix

print("Payoff matrix of the modified rock-paper-scissors game:\n")
M = np.array([[0, -1, 2], [1, 0, -1], [-1, 1, 0]])
print(M)
print('Rock = 0, Paper = 1, Scissors = 2\n')

# define the variables
x = cp.Variable((3,1), nonneg=True)
y = cp.Variable((3,1), nonneg=True)

# define the objective function

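# CAREFUL: this objective is not DCP. Both x.T and M@y are affine in the variables,
# so their elementwise product is bilinear (neither convex nor concave), and
# maximizing a max of such terms is rejected by CVXPY.
obj = cp.Maximize(cp.max(cp.multiply(x.T, M@y)))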

# define the constraints
constraints = [cp.sum(x) == 1, cp.sum(y) == 1]

# define the problem
prob = cp.Problem(obj, constraints)

# print if the problem is DCP
print(prob.is_dcp())

# solve the problem
prob.solve()
print(f'Status: {prob.status}')

print(x.value)
print(y.value)

# print the solution
print( "The best response strategy of the *column* player is to play *rock* with probability", np.round(x.value, 3)[Rock, 0])
print( "The best response strategy of the *column* player is to play *paper* with probability", np.round(x.value, 3)[Paper, 0])
print( "The best response strategy of the *column* player is to play *scissors* with probability", np.round(x.value, 3)[Scissors, 0])
print('')
print( "The best response strategy of the *row* player is to play *rock* with probability", np.round(y.value, 3)[Rock, 0])
print( "The best response strategy of the *row* player is to play *paper* with probability", np.round(y.value, 3)[Paper, 0])
print( "The best response strategy of the *row* player is to play *scissors* with probability", np.round(y.value, 3)[Scissors, 0])

print(f'\nThe expected payoff of the row player is', np.round(prob.value, 3))
print(f'the expected payoff of the column player is', np.round(-prob.value, 3))

Payoff matrix of the modified rock-paper-scissors game:

[[ 0 -1  2]
 [ 1  0 -1]
 [-1  1  0]]
Rock = 0, Paper = 1, Scissors = 2

False


DCPError: Problem does not follow DCP rules. Specifically:
The objective is not DCP. Its following subexpressions are not:
[[1.]
 [1.]
 [1.]] @ var612.T @ [[ 0. -1.  2.]
 [ 1.  0. -1.]
 [-1.  1.  0.]] @ var613 @ [[1. 1. 1.]]

In [6]:
import nashpy as nash
import numpy as np

A = np.array([[0, -1, 2], [1, 0, -1], [-1, 1, 0]])
B = - A
rps = nash.Game(A, B)
rps

Zero sum game with payoff matrices:

Row player:
[[ 0 -1  2]
 [ 1  0 -1]
 [-1  1  0]]

Column player:
[[ 0  1 -2]
 [-1  0  1]
 [ 1 -1  0]]

In [20]:
eqs = rps.support_enumeration()

eqs = list(eqs)

print("The Nash equilibria are:\n")
print("For the row player: rock : ", np.round(eqs[0][0][0], 2) ,", paper : ", np.round(eqs[0][0][1], 2), ", scissors : ", np.round(eqs[0][0][2], 2))
print("For the column player: rock : ", np.round(eqs[0][1][0], 2) ,", paper : ", np.round(eqs[0][1][1], 2), ", scissors : ", np.round(eqs[0][1][2], 2))

The Nash equilibria are:

For the row player: rock :  0.25 , paper :  0.42 , scissors :  0.33
For the column player: rock :  0.33 , paper :  0.42 , scissors :  0.25


In [None]:
# This is the answer we are looking for, but we need to use CVXPY to get it (not nashpy)