# Proof that the myopic strategy outperforms the one-shot strategy for finite-horizon models with $T \geq 2$

James Yu, 8 August 2022

In [1]:
from sympy import *

In [2]:
# d is the Python name for the partial sum Sigma_j defined below
lambda_, delta, c, d, T = symbols("lambda_j, delta, c, Sigma_j, T")

## Proof by induction on the horizon length $T$:

First, recall that we already know the models have identical payoff for $T = 1$, so begin with a base case of $T = 2$. In order to do this, we need the general payoff form for each model first.

In the one-shot model, recall that the optimal payoff is:

$$-\tilde{x_0}^\prime \tilde{x_0} - c \tilde{r_0}^\prime \tilde{r_0} - \sum\limits_1^T \delta^t \tilde{x_1}^\prime D^{2(t-1)} \tilde{x_1}$$

where we know:

$$\tilde{r_0} = -(cI + \delta[\sum\limits_0^{T-1} \delta^t D^{2t}])^{-1} \delta [\sum\limits_0^{T-1} \delta^t D^{2t}] D\tilde{x_0}$$

and:

$$\tilde{x_1} = D\tilde{x_0} + \tilde{r_0}$$

First, notice we can reparametrize the sum of payoffs from $t = 1$ to $T$ by factoring out $\delta$ (and subsequently factor out $\tilde{x_1}$ to put the sum into the same form as the sums in $\tilde{r_0}$):

$$-\tilde{x_0}^\prime \tilde{x_0} - c \tilde{r_0}^\prime \tilde{r_0} - \delta \sum\limits_1^T \delta^{t-1} \tilde{x_1}^\prime D^{2(t-1)} \tilde{x_1}$$

$$= -\tilde{x_0}^\prime \tilde{x_0} - c \tilde{r_0}^\prime \tilde{r_0} - \delta \tilde{x_1}^\prime [\sum\limits_0^{T-1} \delta^t D^{2t}] \tilde{x_1}$$

Next, denote $\sum\limits_0^{T-1} \delta^t D^{2t}$ as $\Sigma$. Substituting $\tilde{r_0}$ and $\tilde{x_1}$, we then have:

$$= -\tilde{x_0}^\prime \tilde{x_0} - c\tilde{x_0}^\prime D^2 \Sigma^2 \delta^2 (cI + \delta \Sigma)^{-2} \tilde{x_0} - \delta (D\tilde{x_0} - (cI + \delta\Sigma)^{-1}\delta\Sigma D \tilde{x_0})^\prime \Sigma (D\tilde{x_0}- (cI + \delta\Sigma)^{-1}\delta\Sigma D \tilde{x_0})$$

$$= -\tilde{x_0}^\prime \tilde{x_0} - c\tilde{x_0}^\prime D^2 \Sigma^2 \delta^2 (cI + \delta \Sigma)^{-2} \tilde{x_0} - \delta \tilde{x_0}^\prime D^2(I- (cI + \delta\Sigma)^{-1}\delta\Sigma)^2 \Sigma \tilde{x_0}$$

Next, factor out $\tilde{x_0}$ and consider an arbitrary dimension of the resulting matrix (where $\Sigma_j$ is the $j$-th diagonal of $\Sigma$):

$$-1 - c\lambda_j^2 \Sigma_j^2 \delta^2 (c + \delta \Sigma_j)^{-2} - \delta \lambda_j^2 (1 - (c + \delta\Sigma_j)^{-1} \delta\Sigma_j)^2 \Sigma_j$$

$$= -1 - \frac{c\lambda_j^2 \Sigma_j^2 \delta^2}{(c + \delta \Sigma_j)^2} - \delta \lambda_j^2 (1 - (c + \delta\Sigma_j)^{-1} \delta\Sigma_j)^2 \Sigma_j$$

$$= -1 - \frac{c\lambda_j^2 \Sigma_j^2 \delta^2}{(c + \delta \Sigma_j)^2} - \delta \lambda_j^2 (1 - \frac{\delta\Sigma_j}{c + \delta\Sigma_j})^2 \Sigma_j$$

$$= -1 - \frac{c\lambda_j^2 \Sigma_j^2 \delta^2}{(c + \delta \Sigma_j)^2} - \delta \lambda_j^2 (1 - 2\frac{\delta\Sigma_j}{c + \delta\Sigma_j} + (\frac{\delta\Sigma_j}{c + \delta\Sigma_j})^2) \Sigma_j$$

$$= -1 - \frac{c\lambda_j^2 \Sigma_j^2 \delta^2}{(c + \delta \Sigma_j)^2} - \delta \Sigma_j \lambda_j^2 + 2\frac{\delta^2\lambda_j^2\Sigma_j^2}{c + \delta\Sigma_j} - \frac{\delta\Sigma_j\lambda_j^2(\delta\Sigma_j)^2}{(c + \delta\Sigma_j)^2}$$

$$= -1 - \frac{\lambda_j^2 \Sigma_j^2 \delta^2(c + \delta\Sigma_j)}{(c + \delta \Sigma_j)^2} - \delta \Sigma_j \lambda_j^2 + 2\frac{\delta^2\lambda_j^2\Sigma_j^2}{c + \delta\Sigma_j}$$

$$= -1 - \frac{\lambda_j^2 \Sigma_j^2 \delta^2}{c + \delta \Sigma_j} - \delta \Sigma_j \lambda_j^2 + 2\frac{\delta^2\lambda_j^2\Sigma_j^2}{c + \delta\Sigma_j}$$

$$= -1 + \frac{\lambda_j^2 \Sigma_j^2 \delta^2}{c + \delta \Sigma_j} - \frac{\delta \Sigma_j \lambda_j^2 (c+\delta\Sigma_j)}{c + \delta \Sigma_j}$$

$$= -1 + \frac{\lambda_j^2 \Sigma_j^2 \delta^2 - c\delta \Sigma_j \lambda_j^2 - \delta^2\Sigma_j^2 \lambda_j^2}{c + \delta \Sigma_j}$$

$$= -1 - \frac{c\delta \Sigma_j \lambda_j^2}{c + \delta \Sigma_j}$$
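The closed form above can be spot-checked numerically. The sketch below is not part of the original notebook: the function names are illustrative, and it assumes a single dimension with $\tilde{x_0}$ normalized to 1. It evaluates the one-shot payoff term by term from the definitions of $\tilde{r_0}$ and $\tilde{x_1}$ and compares the result with $-1 - \frac{c\delta \Sigma_j \lambda_j^2}{c + \delta \Sigma_j}$ using exact rational arithmetic.

```python
from fractions import Fraction as F

def one_shot_direct(lam, delta, c, T):
    """Per-dimension one-shot payoff built term by term (x_0 normalized to 1)."""
    sigma = sum(delta**t * lam**(2 * t) for t in range(T))   # Sigma_j
    r0 = -delta * sigma * lam / (c + delta * sigma)          # scalar r_0
    x1 = lam + r0                                            # scalar x_1 = D x_0 + r_0
    tail = sum(delta**t * x1**2 * lam**(2 * (t - 1)) for t in range(1, T + 1))
    return -1 - c * r0**2 - tail

def one_shot_closed(lam, delta, c, T):
    """Closed form -1 - c*delta*Sigma_j*lambda_j^2 / (c + delta*Sigma_j)."""
    sigma = sum(delta**t * lam**(2 * t) for t in range(T))
    return -1 - c * delta * sigma * lam**2 / (c + delta * sigma)

# Arbitrary illustrative parameter values
lam, delta, c = F(3, 4), F(9, 10), F(1, 2)
for T in range(1, 6):
    assert one_shot_direct(lam, delta, c, T) == one_shot_closed(lam, delta, c, T)
```

Exact fractions are used so that agreement is an equality rather than a floating-point tolerance.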

Next, the optimal payoff for the myopic model is:

$$-\sum\limits_0^{T-1} \delta^t (\tilde{x_t}^\prime \tilde{x_t} + c \tilde{r_t}^\prime \tilde{r_t}) - \delta^T \tilde{x_T}^\prime \tilde{x_T}$$

where:

$$L^m = -\frac{\delta}{\delta+c}D$$

$$\tilde{x_t} = (D+L^m)^t \tilde{x_0} = (\frac{c}{\delta+c})^t D^t \tilde{x_0}$$

$$\tilde{r_t} = L^m \tilde{x_t}$$

Thus, the payoff function simplifies to:

$$-\sum\limits_0^{T-1} \delta^t \tilde{x_t}^\prime (I + c \frac{\delta^2}{(\delta+c)^2}D^2)\tilde{x_t} - \delta^T \tilde{x_T}^\prime \tilde{x_T}$$

$$= -\sum\limits_0^{T-1} \delta^t \tilde{x_0}^\prime (\frac{c}{\delta+c})^{2t}D^{2t} (I + c \frac{\delta^2}{(\delta+c)^2}D^2)\tilde{x_0} - \delta^T \tilde{x_0}^\prime (\frac{c}{\delta+c})^{2T} D^{2T} \tilde{x_0}$$

$$= -\tilde{x_0}^\prime (I + c \frac{\delta^2}{(\delta+c)^2}D^2) [\sum\limits_0^{T-1} \frac{c^{2t} \delta^t}{(\delta+c)^{2t}}D^{2t}]\tilde{x_0} - \delta^T \tilde{x_0}^\prime (\frac{c}{\delta+c})^{2T} D^{2T} \tilde{x_0}$$

We can now again factor out $\tilde{x_0}$ and consider an arbitrary dimension of the resulting matrix:

$$-(1 + c \frac{\delta^2}{(\delta+c)^2}\lambda_j^2) [\sum\limits_0^{T-1} \frac{c^{2t} \delta^t}{(\delta+c)^{2t}}\lambda_j^{2t}] - \delta^T (\frac{c\lambda_j}{\delta+c})^{2T}$$
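As a hedged cross-check of this expression, the sketch below iterates the scalar dynamics $\tilde{x}_{t+1} = \lambda_j \tilde{x}_t + \tilde{r}_t$ with $\tilde{r}_t = L^m \tilde{x}_t$ and accumulates the discounted payoff directly, then compares it with the closed form. The function names are illustrative, $\tilde{x_0}$ is normalized to 1, and the discount factor $\delta^t$ is assumed to multiply both the state and control terms, as in the simplified payoff above.

```python
from fractions import Fraction as F

def myopic_simulated(lam, delta, c, T):
    """Per-dimension myopic payoff obtained by iterating the dynamics (x_0 = 1)."""
    L = -delta / (delta + c) * lam          # scalar L^m
    x, total = F(1), F(0)
    for t in range(T):
        r = L * x                           # r_t = L^m x_t
        total -= delta**t * (x**2 + c * r**2)
        x = lam * x + r                     # x_{t+1} = D x_t + r_t
    return total - delta**T * x**2          # terminal term

def myopic_closed(lam, delta, c, T):
    """Closed form: geometric sum with ratio q plus the terminal term."""
    q = c**2 * delta * lam**2 / (delta + c)**2
    lead = 1 + c * delta**2 * lam**2 / (delta + c)**2
    terminal = delta**T * (c * lam / (delta + c))**(2 * T)
    return -lead * sum(q**t for t in range(T)) - terminal

# Arbitrary illustrative parameter values
lam, delta, c = F(3, 4), F(9, 10), F(1, 2)
for T in range(1, 6):
    assert myopic_simulated(lam, delta, c, T) == myopic_closed(lam, delta, c, T)
```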

From here, we can now check the two payoff expressions for $T = 2$. First, the one-shot payoff contains the sum $\Sigma_j$ which for $T = 2$ evaluates to $1 + \delta \lambda_j^2$. Substituting this into the payoff expression gives:

In [3]:
one_shot_payoff = -1 - (c * delta * lambda_**2 * (1 + delta * lambda_**2) / (c + delta * (1 + delta * lambda_**2)))
one_shot_payoff

-c*delta*lambda_j**2*(delta*lambda_j**2 + 1)/(c + delta*(delta*lambda_j**2 + 1)) - 1

The myopic payoff expression for $T = 2$ is:

In [4]:
myopic_payoff = -(1 + c * (delta * lambda_ / (delta + c))**2) * (1 + delta * (c*lambda_/(delta+c))**2) - delta**2 * (c*lambda_/(delta+c))**4
myopic_payoff

-c**4*delta**2*lambda_j**4/(c + delta)**4 + (-c*delta**2*lambda_j**2/(c + delta)**2 - 1)*(c**2*delta*lambda_j**2/(c + delta)**2 + 1)

If we subtract the one-shot payoff from the myopic payoff, we get:

In [5]:
simplify(myopic_payoff - one_shot_payoff)

c**2*delta**3*lambda_j**4*(-c*delta*lambda_j**2 + c + delta)/(c**4 + c**3*delta**2*lambda_j**2 + 4*c**3*delta + 3*c**2*delta**3*lambda_j**2 + 6*c**2*delta**2 + 3*c*delta**4*lambda_j**2 + 4*c*delta**3 + delta**5*lambda_j**2 + delta**4)

which is positive when $c > 0$, $\delta > 0$, $\lambda_j > 0$ and $\delta\lambda_j^2 \leq 1$: the denominator is then a sum of positive terms, and the remaining factor of the numerator satisfies $\delta + c - c\delta\lambda_j^2 = \delta + c(1 - \delta\lambda_j^2) > 0$. Thus, for $T = 2$, the myopic payoff is strictly greater than the one-shot payoff. Note that if $c$ and $\delta$ are both zero, multiple results contain $0/0$ forms, so this case is ignored.
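The denominator of the $T = 2$ difference appears to factor as $(c+\delta)^3(c + \delta + \delta^2\lambda_j^2)$, which makes its sign easier to read. The sketch below checks this factored form against the two $T = 2$ payoff expressions on a small grid of exact rational values. The grid and function names are illustrative choices, and every grid point satisfies $\delta\lambda_j^2 \leq 1$.

```python
from fractions import Fraction as F
from itertools import product

def gap_T2(lam, delta, c):
    """Myopic minus one-shot payoff for T = 2, from the two expressions above."""
    s = 1 + delta * lam**2                                   # Sigma_j for T = 2
    one_shot = -1 - c * delta * lam**2 * s / (c + delta * s)
    k = c * lam / (delta + c)
    myopic = -(1 + c * (delta * lam / (delta + c))**2) * (1 + delta * k**2) - delta**2 * k**4
    return myopic - one_shot

def gap_T2_factored(lam, delta, c):
    """Same difference with the denominator written in factored form."""
    num = c**2 * delta**3 * lam**4 * (c + delta - c * delta * lam**2)
    den = (c + delta)**3 * (c + delta + delta**2 * lam**2)
    return num / den

grid = [F(1, 4), F(1, 2), F(1)]
for lam, delta, c in product(grid, repeat=3):
    # Both forms agree, and the gap is strictly positive on this grid
    assert gap_T2(lam, delta, c) == gap_T2_factored(lam, delta, c) > 0
```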

Next, assume that the myopic payoff is strictly greater than the one-shot payoff for an arbitrary length $T \geq 2$. We will show that the myopic payoff is then also strictly greater than the one-shot payoff for length $T + 1$.

First, recall that the myopic payoff for horizon length $T$ is:

$$V_{0, T}^M = -(1 + c \frac{\delta^2}{(\delta+c)^2}\lambda_j^2) [\sum\limits_0^{T-1} \frac{c^{2t} \delta^t}{(\delta+c)^{2t}}\lambda_j^{2t}] - \delta^T (\frac{c\lambda_j}{\delta+c})^{2T}$$

For horizon length $T + 1$, the payoff is:

$$V_{0, T+1}^M = -(1 + c \frac{\delta^2}{(\delta+c)^2}\lambda_j^2) [\sum\limits_0^T \frac{c^{2t} \delta^t}{(\delta+c)^{2t}}\lambda_j^{2t}] - \delta^{T+1} (\frac{c\lambda_j}{\delta+c})^{2(T+1)}$$

$$ = -(1 + c \frac{\delta^2}{(\delta+c)^2}\lambda_j^2) [\sum\limits_0^{T-1} \frac{c^{2t} \delta^t}{(\delta+c)^{2t}}\lambda_j^{2t}] - \frac{c^{2T}\delta^T \lambda_j^{2T}}{(\delta+c)^{2T}} - \frac{c^{2T+1}\delta^{T+2}\lambda_j^{2T+2}}{(\delta+c)^{2T+2}} - \delta^{T+1} (\frac{c\lambda_j}{\delta+c})^{2(T+1)}$$

$$= V_{0,T}^M - \frac{c^{2T+1}\delta^{T+2}\lambda_j^{2(T+1)}}{(\delta+c)^{2(T+1)}} - \delta^{T+1}(\frac{c\lambda_j}{\delta+c})^{2(T+1)}$$

$$= V_{0,T}^M - \frac{c^{2T+1}\delta^{T+2}\lambda_j^{2(T+1)} + \delta^{T+1}(c\lambda_j)^{2(T+1)}}{(\delta+c)^{2(T+1)}}$$

$$= V_{0,T}^M - \frac{c^{2T+1}\delta^{T+1}\lambda_j^{2(T+1)}(\delta+c)}{(\delta+c)^{2(T+1)}}$$

$$= V_{0,T}^M - \frac{c^{2T+1}\delta^{T+1}\lambda_j^{2(T+1)}}{(\delta+c)^{2T+1}}$$

From here we can see that $V_{0,T+1}^M - V_{0,T}^M = -\frac{c^{2T+1}\delta^{T+1}\lambda_j^{2(T+1)}}{(\delta+c)^{2T+1}}$.
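This increment can be checked against the closed-form myopic payoff for consecutive horizons. The following is a minimal sketch with illustrative function names and arbitrary parameter values, again using exact rationals.

```python
from fractions import Fraction as F

def myopic_closed(lam, delta, c, T):
    """Per-dimension myopic payoff V^M_{0,T} in closed form."""
    q = c**2 * delta * lam**2 / (delta + c)**2
    lead = 1 + c * delta**2 * lam**2 / (delta + c)**2
    terminal = delta**T * (c * lam / (delta + c))**(2 * T)
    return -lead * sum(q**t for t in range(T)) - terminal

def myopic_increment(lam, delta, c, T):
    """Claimed value of V^M_{0,T+1} - V^M_{0,T}."""
    return -(c**(2 * T + 1) * delta**(T + 1) * lam**(2 * (T + 1))) / (delta + c)**(2 * T + 1)

lam, delta, c = F(2, 3), F(4, 5), F(3, 2)
for T in range(1, 7):
    step = myopic_closed(lam, delta, c, T + 1) - myopic_closed(lam, delta, c, T)
    assert step == myopic_increment(lam, delta, c, T)
```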

Next, recall that the one-shot payoff for horizon length $T$ is:

$$V_{0,T}^O = -1 - \frac{c\delta \Sigma_j \lambda_j^2}{c + \delta \Sigma_j}$$

where $\Sigma_j = \sum\limits_0^{T-1}\delta^t \lambda_j^{2t}$ given horizon length $T$. It follows that:

$$V_{0,T+1}^O = -1 - \frac{c\delta (\Sigma_j+\delta^T\lambda_j^{2T}) \lambda_j^2}{c + \delta (\Sigma_j+\delta^T\lambda_j^{2T})}$$

Thus, $V_{0,T+1}^O - V_{0,T}^O$ is:

In [6]:
one_shot_old = -1 -c * delta * lambda_**2 * d / (c + delta * d)
one_shot_new = - 1 - c * delta * lambda_**2 * (d + delta**T * lambda_**(2*T)) / (c + delta * (d + delta**T * lambda_**(2*T)))
one_shot_difference = expand(simplify(one_shot_new - one_shot_old))
one_shot_difference

-c**2*delta*delta**T*lambda_j**2*lambda_j**(2*T)/(Sigma_j**2*delta**2 + 2*Sigma_j*c*delta + Sigma_j*delta**2*delta**T*lambda_j**(2*T) + c**2 + c*delta*delta**T*lambda_j**(2*T))
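The same difference can be derived by hand, which also exposes the factored form of the denominator. Writing $\varepsilon = \delta^T\lambda_j^{2T}$ (a shorthand introduced only for this check):

$$V_{0,T+1}^O - V_{0,T}^O = -c\delta\lambda_j^2 \left[\frac{\Sigma_j + \varepsilon}{c + \delta(\Sigma_j + \varepsilon)} - \frac{\Sigma_j}{c + \delta\Sigma_j}\right]$$

$$= -c\delta\lambda_j^2 \cdot \frac{(\Sigma_j + \varepsilon)(c + \delta\Sigma_j) - \Sigma_j(c + \delta\Sigma_j + \delta\varepsilon)}{(c + \delta\Sigma_j)(c + \delta\Sigma_j + \delta\varepsilon)}$$

$$= -c\delta\lambda_j^2 \cdot \frac{c\varepsilon}{(c + \delta\Sigma_j)(c + \delta\Sigma_j + \delta\varepsilon)} = -\frac{c^2\delta^{T+1}\lambda_j^{2(T+1)}}{(c + \delta\Sigma_j)(c + \delta\Sigma_j + \delta^{T+1}\lambda_j^{2T})}$$

Expanding $(c + \delta\Sigma_j)(c + \delta\Sigma_j + \delta^{T+1}\lambda_j^{2T})$ recovers the denominator in the output above.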

Now, if we take $V_{0,T+1}^M - V_{0,T}^M$ and subtract $V_{0,T+1}^O - V_{0,T}^O$, we get:

In [7]:
myopic_difference = -lambda_**(2*(T+1)) * c**(2*T + 1) * delta**(T+1) / ((delta+c)**(2*T + 1))
simplify(myopic_difference - one_shot_difference)

delta**(T + 1)*lambda_j**(2*T + 2)*(c**2 - c**(2*T + 1)*(c + delta)**(-2*T - 1)*(Sigma_j**2*delta**2 + 2*Sigma_j*c*delta + Sigma_j*delta**(T + 2)*lambda_j**(2*T) + c**2 + c*delta**(T + 1)*lambda_j**(2*T)))/(Sigma_j**2*delta**2 + 2*Sigma_j*c*delta + Sigma_j*delta**(T + 2)*lambda_j**(2*T) + c**2 + c*delta**(T + 1)*lambda_j**(2*T))

The denominator is positive, so we consider the numerator:

In [8]:
numer = fraction(simplify(myopic_difference - one_shot_difference))[0]
numer

delta**(T + 1)*lambda_j**(2*T + 2)*(c**2 - c**(2*T + 1)*(c + delta)**(-2*T - 1)*(Sigma_j**2*delta**2 + 2*Sigma_j*c*delta + Sigma_j*delta**(T + 2)*lambda_j**(2*T) + c**2 + c*delta**(T + 1)*lambda_j**(2*T)))

In the event that $\delta$ or $\lambda_j$ is zero, the payoff differences evaluate to zero and there is no difference between them (this also occurs in the infinite-horizon solution) so we ignore these cases. As a result, since we are considering positive cases, we can remove these two variables from the front of the numerator and, after simplification, we are left with:

In [9]:
numer_simplify = simplify(c**2 - (c/(c+delta))**(2*T + 1) * ((d*delta + c)**2 + lambda_**(2*T) * delta**(T+1) * (c + delta*d)))
numer_simplify

c**2 - (c/(c + delta))**(2*T + 1)*(Sigma_j*delta + c)*(Sigma_j*delta + c + delta**(T + 1)*lambda_j**(2*T))

Factoring out $c^2$, we get:

$$c^2 (1 - \frac{c^{2T-1}}{(c+\delta)^{2T+1}} (c+\delta\Sigma_j)(c + \delta\Sigma_j + \delta^{T+1}\lambda_j^{2T}))$$

We assume $c$ is positive for the same comparison reasons as before (again, this occurs in the infinite-horizon case as well). Then we can remove $c^2$ and focus on the remaining component:

$$1 - \frac{c^{2T-1}}{(c+\delta)^{2T+1}} (c+\delta\Sigma_j)(c + \delta\Sigma_j + \delta^{T+1}\lambda_j^{2T})$$

which is positive if and only if we have the following:

$$1 > \frac{c^{2T-1}}{(c+\delta)^{2T+1}} (c+\delta\Sigma_j)(c + \delta\Sigma_j + \delta^{T+1}\lambda_j^{2T})$$

Since $(c+\delta)^{2T+1} > 0$, this is equivalent to the following statement:

$$c^{2T-1}(c+\delta\Sigma_j)(c + \delta\Sigma_j + \delta^{T+1}\lambda_j^{2T}) < (c+\delta)^{2T+1}$$

To show this, notice that when $\delta\lambda_j^2 \leq 1$ every term $\delta^t \lambda_j^{2t}$ is at most 1, so $\Sigma_j = \sum\limits_0^{T-1}\delta^t \lambda_j^{2t} \leq \sum\limits_0^{T-1} 1 = T$, and likewise $\delta\Sigma_j + \delta^{T+1}\lambda_j^{2T} = \delta(\Sigma_j + \delta^T\lambda_j^{2T}) = \delta(\sum\limits_0^T \delta^t \lambda_j^{2t}) \leq \delta(\sum\limits_0^T 1) = \delta(T+1)$.

It follows that:

$$c^{2T-1}(c+\delta\Sigma_j)(c + \delta\Sigma_j + \delta^{T+1}\lambda_j^{2T}) \leq c^{2T-1}(c+\delta T)(c + \delta(T+1))$$

We can expand $c^{2T-1}(c+\delta T)(c + \delta(T+1))$ as follows:

$$= c^{2T-1}(c^2 + c\delta T + c\delta (T+1) + \delta^2 T (T+1))$$

$$= c^{2T+1} + c^{2T}\delta T + c^{2T}\delta (T+1) + c^{2T-1}\delta^2 T (T+1)$$

$$= c^{2T+1} + c^{2T}\delta (2T+1) + c^{2T-1}\delta^2 T (T+1)$$

Now, if we expand $(c+\delta)^{2T+1}$, by the binomial theorem we get the following:

$$(c+\delta)^{2T+1} = {2T+1 \choose 0} c^{2T+1}\delta^0  + {2T+1 \choose 1} c^{2T}\delta^1 + {2T+1 \choose 2} c^{2T-1}\delta^2 + \dots + {2T+1 \choose 2T+1} c^0 \delta^{2T+1}$$

$$= c^{2T+1} + (2T+1) c^{2T}\delta + \frac{(2T+1)(2T)}{2} c^{2T-1}\delta^2 + \dots + \delta^{2T+1}$$

$$= c^{2T+1} + (2T+1) c^{2T}\delta + T(2T+1) c^{2T-1}\delta^2 + \dots + \delta^{2T+1}$$

If we then compute $(c+\delta)^{2T+1} - c^{2T-1}(c+\delta T)(c + \delta(T+1))$, we get:

$$c^{2T+1} + (2T+1) c^{2T}\delta + T(2T+1) c^{2T-1}\delta^2 + {2T+1 \choose 3}c^{2T-2}\delta^3 + \dots + \delta^{2T+1} - c^{2T+1} - c^{2T}\delta (2T+1) - c^{2T-1}\delta^2 T (T+1)$$

$$= T^2 c^{2T-1}\delta^2 + {2T+1 \choose 3}c^{2T-2}\delta^3 + \dots + \delta^{2T+1} > 0$$

Thus, 

$$c^{2T-1}(c+\delta\Sigma_j)(c + \delta\Sigma_j + \delta^{T+1}\lambda_j^{2T}) \leq c^{2T-1}(c+\delta T)(c + \delta(T+1)) < (c+\delta)^{2T+1}$$
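The second, strict inequality in this chain does not depend on $\lambda_j$, so it is easy to probe numerically. The sketch below checks it for a handful of positive values of $c$ and $\delta$ and several horizons. The sampled values and function names are illustrative, and a passing check is supporting evidence only, not a substitute for the binomial argument.

```python
from fractions import Fraction as F
from itertools import product

def lhs_bound(c, delta, T):
    """c^(2T-1) * (c + delta*T) * (c + delta*(T+1))."""
    return c**(2 * T - 1) * (c + delta * T) * (c + delta * (T + 1))

def rhs(c, delta, T):
    """(c + delta)^(2T+1)."""
    return (c + delta)**(2 * T + 1)

values = [F(1, 10), F(1, 2), F(1), F(5)]
for c, delta, T in product(values, values, range(1, 8)):
    # Strict inequality for positive c and delta
    assert lhs_bound(c, delta, T) < rhs(c, delta, T)
```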

This implies

$$1 > \frac{c^{2T-1}}{(c+\delta)^{2T+1}} (c+\delta\Sigma_j)(c + \delta\Sigma_j + \delta^{T+1}\lambda_j^{2T})$$

and so $1 - \frac{c^{2T-1}}{(c+\delta)^{2T+1}} (c+\delta\Sigma_j)(c + \delta\Sigma_j + \delta^{T+1}\lambda_j^{2T})$ is positive. 

This then implies that $(V_{0,T+1}^M - V_{0,T}^M) - (V_{0,T+1}^O - V_{0,T}^O) > 0$. Rearranging, we have that:

$$V_{0,T}^M + V_{0,T+1}^M - V_{0,T}^M = V_{0,T+1}^M > V_{0,T+1}^O + V_{0,T}^M - V_{0,T}^O$$

By the induction hypothesis, $V_{0,T}^M > V_{0,T}^O$, that is, $V_{0,T}^M - V_{0,T}^O > 0$, so we get:

$$V_{0,T+1}^M > V_{0,T+1}^O + V_{0,T}^M - V_{0,T}^O > V_{0,T+1}^O$$

and conclude with $V_{0,T+1}^M > V_{0,T+1}^O$, as was to be shown.

Thus, by induction, we have that the myopic payoff strictly exceeds the one-shot payoff for all $T \geq 2$ given nonzero $c, \delta, \lambda_j$ (otherwise the myopic payoff only weakly exceeds the one-shot payoff).
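As a final sanity check on the result, the two per-dimension closed forms can be compared directly over a range of horizons. The sketch below is illustrative only: the function names and the parameter grid are assumptions, every grid point has positive $c$, $\delta$, $\lambda_j$ with $\delta\lambda_j^2 \leq 1$, and a finite grid cannot replace the induction argument.

```python
from fractions import Fraction as F
from itertools import product

def one_shot(lam, delta, c, T):
    """Per-dimension one-shot payoff V^O_{0,T}."""
    sigma = sum(delta**t * lam**(2 * t) for t in range(T))
    return -1 - c * delta * sigma * lam**2 / (c + delta * sigma)

def myopic(lam, delta, c, T):
    """Per-dimension myopic payoff V^M_{0,T}."""
    q = c**2 * delta * lam**2 / (delta + c)**2
    lead = 1 + c * delta**2 * lam**2 / (delta + c)**2
    terminal = delta**T * (c * lam / (delta + c))**(2 * T)
    return -lead * sum(q**t for t in range(T)) - terminal

grid = [F(1, 4), F(1, 2), F(9, 10), F(1)]
for lam, delta, c in product(grid, repeat=3):
    # The payoffs coincide at T = 1 and the myopic payoff is strictly larger afterwards
    assert myopic(lam, delta, c, 1) == one_shot(lam, delta, c, 1)
    for T in range(2, 9):
        assert myopic(lam, delta, c, T) > one_shot(lam, delta, c, T)
```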