# Optimization of two random normals

--- **Kowshik Bettadapura**

```{admonition} Problem statement.

*Let $X$ and $Y$ be two random normals, that is, two normally distributed random variables. For a given level of risk $r^2$, which combination $Z_{\lambda, \delta} = \lambda X + \delta Y$ gives the largest expectation?*

```

## Preliminaries


### Independent randoms

The normal distribution is an instance of the class of *stable* distributions. This means any linear combination of independent normal random variables is again normally distributed. So if $X\sim \mathcal N(\mu_X, \sigma^2_X)$ and $Y\sim \mathcal N(\mu_Y, \sigma_Y^2)$ with $X$ and $Y$ independent, then any linear combination $Z_{\lambda, \delta} = \lambda X + \delta Y$ for $\lambda, \delta\in \mathbb R$ will be distributed according to

```{math}

Z_{\lambda, \delta} \sim \mathcal N\left(\lambda\mu_X + \delta\mu_Y, (\lambda\sigma_X)^2 + (\delta\sigma_Y)^2\right).

```

The expected value is $\mathbb E Z_{\lambda, \delta} = \lambda\mu_X + \delta\mu_Y$ with variance, or "volatility", $\sigma_{Z_{\lambda, \delta}}^2 = (\lambda\sigma_X)^2 + (\delta\sigma_Y)^2$.

### Non-independent randoms

In the case where $X$ and $Y$ are *not* independent, they will in general have non-zero covariance, denoted $c_{X, Y}$. This affects the probability law governing the linear combination $Z_{\lambda, \delta}$, which remains normal provided $X$ and $Y$ are jointly normal.

```{note}

The covariance of $X$ with itself is just the variance $\sigma_X^2$, i.e., $c_{X, X} = \sigma_X^2$.

```

With the covariance $c_{X, Y}$ the combination $Z_{\lambda, \delta}$ is distributed according to the law,

```{math}

Z_{\lambda, \delta}
\sim
\mathcal N\left(\lambda\mu_X + \delta\mu_Y, (\lambda\sigma_X)^2 + (\delta\sigma_Y)^2 + 2c_{X, Y}\lambda\delta\right)

```

Note that in both the independent and the dependent case the expectation $\mathbb EZ_{\lambda, \delta}$ of $Z_{\lambda, \delta}$ is the same. What changes is the volatility.
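The two distribution laws above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `combination_moments` and the numerical inputs are assumptions made for the example, not part of the original text.

```python
def combination_moments(lam, delta, mu_x, mu_y, sigma_x, sigma_y, cov_xy=0.0):
    """Return (mean, variance) of Z = lam*X + delta*Y.

    cov_xy = 0.0 corresponds to the independent case; a non-zero value
    adds the cross term 2 * cov_xy * lam * delta to the variance.
    """
    mean = lam * mu_x + delta * mu_y
    variance = (lam * sigma_x) ** 2 + (delta * sigma_y) ** 2 + 2.0 * cov_xy * lam * delta
    return mean, variance


# Illustrative (made-up) parameters.
mean_ind, var_ind = combination_moments(0.5, 0.5, 0.08, 0.05, 0.2, 0.1)
mean_dep, var_dep = combination_moments(0.5, 0.5, 0.08, 0.05, 0.2, 0.1, cov_xy=0.01)
```

The mean is identical in both calls; only the variance picks up the covariance term.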

#### A remark on the statistics

Variance is the square of the standard deviation. Accordingly, the variance is always non-negative for *mathematical* reasons. In contrast, the standard deviation need not be non-negative for *mathematical* reasons, but must be non-negative if it is to be understood as a statistical parameter.

In the distribution law for $Z_{\lambda, \delta}$, its variance is

```{math}

\mathrm{Var}~Z_{\lambda, \delta}
=
(\lambda\sigma_X)^2 + (\delta\sigma_Y)^2 + 2c_{X, Y}\lambda\delta. 
```

Unlike in the case of independent randoms, it is not immediately clear here that $\mathrm{Var}~Z_{\lambda,\delta} \geq 0$. In order to see that it is we will need to use the following:

```{admonition} Fact 

For random variables $X, Y$

$$
\begin{align}
-\sigma_X\sigma_Y \leq c_{X, Y} \leq \sigma_X\sigma_Y
\end{align}
$$

```

We will now prove the following.

```{admonition} Proposition 

For normal randoms $X, Y$, $\mathrm{Var}~Z_{\lambda, \delta}\geq0$ for any $\lambda, \delta$.

```


````{admonition} Proof.

The proof proceeds by contradiction. Since the statement in the proposition is about a statistical parameter, it suffices to prove the following mathematical proposition: let $x, y$ be positive real numbers and $\rho \leq 1$ a parameter. Then

```{math}

x^2 + y^2 \geq 2\rho xy.

```

To prove the above, suppose to the contrary that $x^2 + y^2 < 2\rho xy$. We aim to arrive at a contradiction. Firstly, since $x$ and $y$ are positive, $xy> 0$. Dividing through by $xy$ and using that $\rho\leq 1$ gives,

$$
\begin{align}
\frac{x^2 + y^2}{xy} < 2\rho \leq 2.
\end{align}
$$

Set $z = \frac{x}{y}$. Then $z$ will again be a real, positive number. The above inequality becomes

$$
\begin{align}
z + \frac{1}{z} < 2 &\iff z^2 + 1 < 2z
\\
&\iff z^2 - 2z + 1 < 0 
\\
&\iff (z-1)^2 < 0.
\end{align}
$$

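We have now arrived at a contradiction. Since $z$ is real, $z-1$ is real, so $(z-1)^2$ is necessarily non-negative. Hence $(z-1)^2< 0$ cannot be true.

From here we use that $c_{X, Y} = \rho\sigma_X\sigma_Y$ for $\rho$ the *Pearson correlation coefficient* (a parameter between $-1$ and $1$). Apply the inequality with $x = |\lambda|\sigma_X$ and $y = |\delta|\sigma_Y$, and with $\rho$ replaced by $-\rho$ if $\lambda\delta > 0$ or kept as $\rho$ if $\lambda\delta < 0$ (if $\lambda\delta = 0$ the variance is a single square). We deduce that the variance $\mathrm{Var}~Z_{\lambda,\delta}$ above is indeed non-negative for any $\lambda, \delta$.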

````
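As a complementary check that covers every sign of $\lambda, \delta$ at once, the variance can also be written as a sum of two squares. The following is a sketch that uses only $c_{X, Y} = \rho\sigma_X\sigma_Y$ and $|\rho|\leq 1$; expanding the right-hand side recovers the variance term by term.

```{math}

(\lambda\sigma_X)^2 + (\delta\sigma_Y)^2 + 2\rho\sigma_X\sigma_Y\lambda\delta
=
\left(\lambda\sigma_X + \rho\,\delta\sigma_Y\right)^2
+
\left(1 - \rho^2\right)(\delta\sigma_Y)^2
\geq 0.

```

Both summands are non-negative because $1 - \rho^2 \geq 0$, and the variance vanishes only if both squares vanish.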



## Optimization

From the problem statement, we want to maximize the expectation $\mathbb EZ_{\lambda, \delta}$ subject to a fixed level of risk $r$, which can be taken here to mean the standard deviation in $Z_{\lambda, \delta}$. We first consider the case where $X$ and $Y$ are independent below, after which we proceed to study the dependent case.

### Independent randoms

For two independent randoms $X, Y$ recall that 

```{math}

Z_{\lambda, \delta} \sim \mathcal N\left(\lambda \mu_X + \delta\mu_Y, (\lambda\sigma_X)^2 + (\delta\sigma_Y)^2\right)

```

Set $r^2 = (\lambda\sigma_X)^2 + (\delta\sigma_Y)^2$. Our problem is thus:

- maximize $\lambda \mu_X + \delta\mu_Y$; 
- with $r^2$ *constant*.


See that we need to maximize a linear form in the parameters $\lambda, \delta$ subject to a quadratic constraint in $\lambda, \delta$. Thus, we find here a typical problem in the area of *quadratic programming*. It will not be necessary to appeal to this area however as, in our setting, an analytic solution can be obtained through the method of *Lagrange multipliers*.


#### Solution via Lagrange multipliers

Let $\mathcal L$ be the Lagrange functional which, in our setting, will be given by

```{math}

\mathcal L(\lambda, \delta, C)
=
\left(\lambda \mu_X + \delta\mu_Y\right)
-
\frac{C}{2}
\left((\lambda\sigma_X)^2 + (\delta\sigma_Y)^2 - r^2\right)

```

Choosing $\frac{C}{2}$ in $\mathcal L$ above is convenient in undertaking forthcoming calculations. This factor, or the constant $C$ itself, is usually referred to as the *Lagrange multiplier*.

```{note}
If $(\lambda^*, \delta^*)$ is an optimal point of $\mathcal L$ subject to $\partial \mathcal L/\partial C = 0$, then $\mathcal L(\lambda^*, \delta^*, C)$ is at a minimum or maximum along the curve $\partial \mathcal L/\partial C = 0$. In particular, $d\mathcal L|_{(\lambda^*, \delta^*)} =0$. This latter condition is our means of finding the optimal points $(\lambda^*, \delta^*)$.
```

The differential $d\mathcal L$ of $\mathcal L$ is,

$$
\begin{align}
d\mathcal L
&=
\mu_Xd\lambda + \mu_Y d\delta
-
\frac{C}{2}
\left(
2\sigma_X^2\lambda d\lambda
+
2\sigma_Y^2\delta d\delta
\right)
-
\frac{1}{2}dC
\left((\lambda\sigma_X)^2 + (\delta\sigma_Y)^2 - r^2\right)
\\
&=
\left(\mu_X - C\sigma_X^2\lambda\right)d\lambda
+
\left(\mu_Y - C\sigma_Y^2\delta\right)d\delta
-
\frac{1}{2}
\left((\lambda\sigma_X)^2 + (\delta\sigma_Y)^2 - r^2\right)
dC
\end{align}
$$


Setting $d\mathcal L = 0$ gives the equations,

$$
\begin{align}
\mu_X - C\sigma^2_X\lambda &= 0;
\\
\mu_Y - C\sigma_Y^2\delta &= 0;
\\
(\lambda\sigma_X)^2 + (\delta\sigma_Y)^2 - r^2 &= 0.
\end{align}
$$

This last equation is just the specification of the curve $\partial \mathcal L/\partial C = 0$ and, ostensibly, adds no further information. It is referred to as the *constraint equation*.

From the first two equations we obtain expressions for the parameters $\lambda, \delta$ in terms of the Lagrange multiplier $C$,

$$
\begin{align}
\lambda 
= \frac{\mu_X}{C\sigma_X^2}
&&\mbox{and}&&
\delta
=
\frac{\mu_Y}{C\sigma_Y^2}
\end{align}
$$

In order to find the *optimal values* $\lambda^*, \delta^*$ it remains to determine the multiplier $C$. This is where we can appeal to the constraint equation (the last of the above three equations). 

Recall that $\mu_X, \sigma_X, \mu_Y, \sigma_Y$ and $r^2$ are all constants, established at the outset of the problem statement. Substituting $\lambda, \delta$ above into the constraint equation results in the following quadratic equation for $C$,

```{math}

r^2C^2 - \left(\frac{\mu_X}{\sigma_X}\right)^2 - \left(\frac{\mu_Y}{\sigma_Y}\right)^2 = 0.

```

This readily gives the following values for $C$,

$$
\begin{align}
C_\pm
=
\pm\frac{1}{r}
\sqrt{\left(\frac{\mu_X}{\sigma_X}\right)^2 + \left(\frac{\mu_Y}{\sigma_Y}\right)^2}
\end{align}
$$

Optimal parameters are therefore,

$$
\begin{align}
\lambda^*_\pm
=
\pm
\frac{r\mu_X}{\sigma_X^2\sqrt{\left(\frac{\mu_X}{\sigma_X}\right)^2 + \left(\frac{\mu_Y}{\sigma_Y}\right)^2}}
&&\mbox{and}&&
\delta^*_\pm
=
\pm\frac{r\mu_Y}{\sigma_Y^2\sqrt{\left(\frac{\mu_X}{\sigma_X}\right)^2 + \left(\frac{\mu_Y}{\sigma_Y}\right)^2}}.
\end{align}
$$

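Now recall from the problem statement that we want the pair $(\lambda^*, \delta^*)$ which maximizes the expectation $\mathbb EZ_{\lambda^*, \delta^*} = \lambda^*\mu_X + \delta^*\mu_Y$. Both parameters share the same multiplier, so the only candidates are $(\lambda^*_+, \delta^*_+)$ and $(\lambda^*_-, \delta^*_-)$. Substituting $\lambda = \mu_X/(C\sigma_X^2)$ and $\delta = \mu_Y/(C\sigma_Y^2)$ into the linear form gives $\mathbb EZ_{\lambda, \delta} = \frac{1}{C}\left((\mu_X/\sigma_X)^2 + (\mu_Y/\sigma_Y)^2\right)$, which at $C = C_\pm$ equals $\pm r\sqrt{(\mu_X/\sigma_X)^2 + (\mu_Y/\sigma_Y)^2}$. The positive multiplier $C_+$ therefore gives the maximum and $C_-$ the minimum, whatever the signs of $\mu_X$ and $\mu_Y$.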

```{admonition} Example

In the case where $\mu_X, \mu_Y > 0$ it is clear that $(\lambda^*, \delta^*) = (\lambda^*_+, \delta^*_+)$ are the appropriate optima which maximize the linear form. 

```

Whichever optimum $(\lambda^*, \delta^*)$ we find, it means the following: with $r$ fixed, $\mathbb EZ_{\lambda^*, \delta^*} \geq \mathbb EZ_{\lambda, \delta}$ for all $\lambda, \delta$ satisfying the constraint $(\lambda\sigma_X)^2 + (\delta\sigma_Y)^2 = r^2$.
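The closed-form solution for independent randoms can be sketched numerically as follows. The code starts from the relations $\lambda = \mu_X/(C\sigma_X^2)$, $\delta = \mu_Y/(C\sigma_Y^2)$ and the positive root $C_+$; the function name `optimal_independent` and the input values are assumptions chosen for illustration.

```python
import math


def optimal_independent(mu_x, mu_y, sigma_x, sigma_y, r):
    """Return (lambda*, delta*) maximizing lam*mu_x + delta*mu_y
    subject to (lam*sigma_x)**2 + (delta*sigma_y)**2 == r**2."""
    s = math.sqrt((mu_x / sigma_x) ** 2 + (mu_y / sigma_y) ** 2)
    if s == 0.0:
        raise ValueError("both means are zero: every feasible point has zero expectation")
    c_plus = s / r  # positive Lagrange multiplier C_+
    lam = mu_x / (c_plus * sigma_x ** 2)
    delta = mu_y / (c_plus * sigma_y ** 2)
    return lam, delta


# Illustrative (made-up) parameters.
lam_star, delta_star = optimal_independent(0.08, 0.05, 0.2, 0.1, r=0.15)
risk_sq = (lam_star * 0.2) ** 2 + (delta_star * 0.1) ** 2  # should equal r**2
expectation = lam_star * 0.08 + delta_star * 0.05
```

Evaluating the constraint at the returned pair is a quick way to confirm that the solution lies on the prescribed risk level.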


### Non-independent randoms

The case of non-independent randoms is a bit more involved. The general principles are the same however. We take $X, Y$ non-independent to mean that they have non-vanishing covariance $c_{X, Y}$. Any linear combination $Z_{\lambda, \delta} = \lambda X + \delta Y$ will be distributed according to,

```{math}

Z_{\lambda, \delta}
\sim 
\mathcal N\left(\lambda\mu_X + \delta\mu_Y, (\lambda\sigma_X)^2 + (\delta\sigma_Y)^2 + 2c_{X, Y}\lambda\delta \right)

```

We can use the method of Lagrange multipliers again to find the parameter values $\lambda^*, \delta^*$ which maximize the expectation $\mathbb EZ_{\lambda, \delta}$ for all $\lambda, \delta$ subject to a fixed level of risk $r^2 = (\lambda\sigma_X)^2 + (\delta\sigma_Y)^2 + 2c_{X, Y}\lambda\delta$.

#### Lagrange multipliers for dependent randoms

The Lagrangian functional $\mathcal L$ mediating our optimisation is the following,

```{math}

\mathcal L(\lambda, \delta, C)
=
\lambda\mu_X + \delta\mu_Y
-
\frac{C}{2}
\left(
(\lambda\sigma_X)^2 + (\delta\sigma_Y)^2 + 2c_{X, Y}\lambda\delta
-
r^2
\right)

```

Its differential is,

$$
\begin{align}
d\mathcal L
=&~
\left(\mu_X - C\left(\sigma_X^2\lambda + c_{X, Y}\delta\right)\right)d\lambda
+
\left(\mu_Y - C\left(\sigma_Y^2\delta + c_{X, Y}\lambda\right)\right)d\delta
\\
&
-\frac{1}{2}
\left(
(\lambda\sigma_X)^2 + (\delta\sigma_Y)^2 + 2c_{X, Y}\lambda\delta 
-
r^2
\right)
dC.
\end{align}
$$

Setting $d\mathcal L = 0$ now gives the following equations,

$$
\begin{align}
\mu_X - C\left(\sigma_X^2\lambda + c_{X, Y}\delta\right)
&= 0
\\
\mu_Y - C\left(\sigma_Y^2\delta + c_{X, Y}\lambda\right)
&= 0
\\
(\lambda\sigma_X)^2 + (\delta\sigma_Y)^2 + 2c_{X, Y}\lambda\delta 
-
r^2
&= 0.
\end{align}
$$

Recall that the last equation is referred to as the *constraint equation*. Solving the first two equations simultaneously gives the following expressions for $\lambda$ and $\delta$ in terms of the problem constants $\mu_X, \sigma_X, \mu_Y, \sigma_Y, c_{X, Y}$ and the yet to be determined multiplier $C$,

$$
\begin{align}
\lambda
=
\frac{\sigma_Y^2\mu_X - c_{X, Y}\mu_Y}{\sigma_X^2\sigma_Y^2 - c^2_{X, Y}}\frac{1}{C}
&&\mbox{and}&&
\delta
=
\frac{\sigma_X^2\mu_Y - c_{X, Y}\mu_X}{\sigma_X^2\sigma_Y^2 - c_{X, Y}^2}\frac{1}{C}.
\end{align}
$$

```{note}

As a quick consistency check, note that if $X$ and $Y$ were independent, then $c_{X, Y} = 0$. In this case, we recover the expressions for $\lambda, \delta$ in the case of independent randoms.

```

````{warning}

If $X$ and $Y$ are perfectly correlated, then $c_{X, Y} = \pm\sigma_X\sigma_Y$, i.e., $c_{X, Y}^2 = \sigma_X^2\sigma_Y^2$, in which case $\lambda,\delta$ above are *undefined*. The case of perfectly correlated random variables needs to be considered separately. In what follows we assume $c^2_{X, Y}\neq \sigma^2_X\sigma^2_Y$.

````


In order to determine the multiplier $C$ we need to substitute the expressions $\lambda, \delta$ above into the constraint equation. To ease our calculation define the following:

$$
\begin{align}
A_\lambda 
&= 
\sigma_X\frac{\sigma_Y^2\mu_X - c_{X, Y}\mu_Y}{\sigma_X^2\sigma_Y^2 - c^2_{X, Y}};
\\
A_\delta
&=
\sigma_Y\frac{\sigma_X^2\mu_Y - c_{X, Y}\mu_X}{\sigma_X^2\sigma_Y^2 - c^2_{X, Y}};
\\
A_{\lambda\delta}
&=
c_{X, Y}
\left(
\frac{\sigma_Y^2\mu_X - c_{X, Y}\mu_Y}{\sigma_X^2\sigma_Y^2 - c^2_{X, Y}}
\right)
\left(
\frac{\sigma_X^2\mu_Y - c_{X, Y}\mu_X}{\sigma_X^2\sigma_Y^2 - c^2_{X, Y}}
\right).
\end{align}
$$

The constraint equation now becomes

```{math}

\frac{A^2_\lambda}{C^2} + \frac{A^2_\delta}{C^2} + 2\frac{A_{\lambda\delta}}{C^2} - r^2 = 0

```

and hence we find,

```{math}

C_\pm = \pm\frac{1}{r}\sqrt{A^2_\lambda + A^2_\delta + 2A_{\lambda\delta}}

```

````{note}

Generally, our solution is well defined if and only if $A^2_\lambda + A^2_\delta + 2A_{\lambda\delta}> 0$, i.e., if $C$ is *real*. This is of course true however for the same reasons that $\mathrm{Var}~Z_{\lambda,\delta} \geq 0$ (see the proof of this statement above).

````

With possible constants $C_\pm$ established, the optimal values $\lambda^*,\delta^*$ are given by

$$
\begin{align}
\lambda^*_\pm
&=
\pm
\frac{\sigma_Y^2\mu_X - c_{X, Y}\mu_Y}{\sigma_X^2\sigma_Y^2 - c^2_{X, Y}}\frac{r}{\sqrt{A^2_\lambda + A^2_\delta + 2A_{\lambda\delta}}}
\\
\delta^*_\pm
&=
\pm
\frac{\sigma_X^2\mu_Y - c_{X, Y}\mu_X}{\sigma_X^2\sigma_Y^2 - c_{X, Y}^2}\frac{r}{\sqrt{A^2_\lambda + A^2_\delta + 2A_{\lambda\delta}}}.
\end{align}
$$

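Both parameters are built from the same multiplier $C$, so the only candidate pairs are $(\lambda^*_+, \delta^*_+)$ and $(\lambda^*_-, \delta^*_-)$; the mixed pairs $(\lambda^*_+, \delta^*_-)$ and $(\lambda^*_-, \delta^*_+)$ do not solve the first two equations. It remains to see which pair maximizes the linear form $\mathbb EZ_{\lambda,\delta} = \lambda \mu_X + \delta\mu_Y$. Substituting the expressions for $\lambda, \delta$ in terms of $C$ shows that $\lambda\mu_X + \delta\mu_Y = \frac{1}{C}\left(A^2_\lambda + A^2_\delta + 2A_{\lambda\delta}\right)$, so at $C = C_\pm$ the expectation is $\pm r\sqrt{A^2_\lambda + A^2_\delta + 2A_{\lambda\delta}}$ and the pair $(\lambda^*_+, \delta^*_+)$ is the maximizer.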

```{note}

As a consistency check, if $X, Y$ are independent randoms, $c_{X, Y} = 0$. In this case the above expressions reduce to what we found earlier for independent randoms.

```
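The dependent case can be sketched in the same way. The code below solves the first two stationarity equations by Cramer's rule and then rescales onto the constraint using the positive multiplier $C_+$; the function name `optimal_correlated` and the numbers are illustrative assumptions, and perfect correlation is rejected as in the warning above.

```python
import math


def optimal_correlated(mu_x, mu_y, sigma_x, sigma_y, cov_xy, r):
    """Return (lambda*, delta*) maximizing lam*mu_x + delta*mu_y subject to
    (lam*sigma_x)**2 + (delta*sigma_y)**2 + 2*cov_xy*lam*delta == r**2."""
    det = sigma_x ** 2 * sigma_y ** 2 - cov_xy ** 2
    if det <= 0.0:
        raise ValueError("perfectly correlated (or invalid) covariance: handle separately")
    # (a, b) = (lambda * C, delta * C): solution of the two linear equations.
    a = (sigma_y ** 2 * mu_x - cov_xy * mu_y) / det
    b = (sigma_x ** 2 * mu_y - cov_xy * mu_x) / det
    # q = A_lambda**2 + A_delta**2 + 2 * A_{lambda delta}
    q = (a * sigma_x) ** 2 + (b * sigma_y) ** 2 + 2.0 * cov_xy * a * b
    if q == 0.0:
        raise ValueError("both means are zero: every feasible point has zero expectation")
    scale = r / math.sqrt(q)  # this is 1 / C_+
    return a * scale, b * scale


# Illustrative (made-up) parameters; cov_xy = 0.01 corresponds to rho = 0.5.
lam_star, delta_star = optimal_correlated(0.08, 0.05, 0.2, 0.1, 0.01, r=0.15)
risk_sq = (lam_star * 0.2) ** 2 + (delta_star * 0.1) ** 2 + 2.0 * 0.01 * lam_star * delta_star
```

Calling the function with `cov_xy=0.0` reproduces the independent solution, which mirrors the consistency check in the note above.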

### For perfectly correlated variables 

If $X$ and $Y$ are perfectly correlated, $c_{X, Y} = \pm\sigma_X\sigma_Y$. 

```{note}

The *Pearson correlation coefficient* is the scalar $\rho$ such that $c_{X, Y} = \rho \sigma_X\sigma_Y$. If $X$ and $Y$ are perfectly correlated, then $\rho = \pm 1$ which is equivalent to $c_{X, Y} = \pm\sigma_X\sigma_Y$.

```


Any linear combination $Z_{\lambda, \delta}$ follows the law,

```{math}

Z_{\lambda, \delta} \sim \mathcal N\left(\lambda\mu_X+\delta\mu_Y, (\lambda\sigma_X \pm \delta\sigma_Y)^2\right)

```

where the sign $\pm$ in the variance is in accordance with whether $\rho = 1$ or $\rho = -1$.


#### Special case

Suppose $X = Y$. Then $c_{X, Y} = \sigma_X^2$ and 

$$
\begin{align}
Z_{\lambda, \delta} 
&= (\lambda + \delta)X
\\
&\sim \mathcal N\left((\lambda + \delta)\mu_X, (\lambda + \delta)^2\sigma_X^2\right)
\\
&= 
\mathcal N\left(\gamma \mu_X, \gamma^2\sigma_X^2\right)
\end{align}
$$

where we have set $\gamma = \lambda + \delta$. The problem of maximizing the expectation $\mathbb EZ_{\lambda, \delta}$ subject to a fixed level of risk $r^2$ amounts to solving for $\gamma$ in the equation $r^2 = \gamma^2\sigma_X^2$, which gives

```{math}

\gamma^* = \pm\frac{r}{\sigma_X}.

```

As for finding the optimal values $(\lambda^*, \delta^*)$, note that this is an underdetermined problem. We can choose any value for $\delta^*$. Then with $\gamma^* =\pm r/\sigma_X$ above, we obtain $\lambda^* = \pm r/\sigma_X - \delta^*$. The choice of sign determines whether we have a maximum or minimum value, and itself depends on the sign of $\mu_X$.

#### General case

With $c_{X, Y} = \pm\sigma_X\sigma_Y$, that is $\rho = \pm 1$, $X$ and $Y$ are perfectly correlated. Hence a linear relation perfectly describes each variable in terms of the other. This means there exist constants $a, b$ imposing a linear relation between $Y$ and $X$,

```{math}

Y = aX + b.

```

With $a, b$ determined, see that 

$$
\begin{align}
Z_{\lambda, \delta}
&= 
\lambda X + \delta Y
\\
&= 
(\lambda + a\delta)X + b\delta 
\\
&\sim \mathcal N\left((\lambda + a\delta )\mu_X + b\delta, (\lambda + \delta a)^2\sigma_X^2\right)
\\
&= \mathcal N\left(\gamma\mu_X + b\delta, \gamma^2\sigma_X^2\right)
\end{align}
$$

where we set $\gamma = \lambda + a\delta$ and used that $\mathrm{Var}(X) = \mathrm{Var}(X + \mathrm{constant})$.

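As far as the constraint is concerned, maximizing the expectation $\mathbb EZ_{\lambda, \delta}$ subject to a fixed risk $r^2$ proceeds exactly as in the special case above. We want $\gamma^*$ such that $(\gamma^*)^2\sigma_X^2 = r^2$. Hence,

```{math}

\gamma^* = \pm\frac{r}{\sigma_X}.

```

Unlike the special case, the expectation $\gamma\mu_X + b\delta$ retains the term $b\delta$, which the constraint does not restrict. If $b\neq 0$, the expectation can therefore be made arbitrarily large through the choice of $\delta$ once $\gamma = \gamma^*$ is fixed; a finite maximum exists only when $b = 0$.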


#### Determining the linear parameters

If we know the respective means and variances $\mu_X, \sigma_X, \mu_Y, \sigma_Y$ and if we know $Y = aX + b$, the linear parameters $a, b$ can be determined as follows. Using linearity of the mean we have

$$
\begin{align}
\mu_Y
&= \mathbb EY
\\
&= \mathbb E(a X + b)
\\
&= a\mathbb EX + \mathbb Eb
\\
&= a\mu_X + b.
\end{align}
$$

And using that the variance is translation invariant,

$$
\begin{align}
\sigma^2_Y 
= 
\sigma^2_{aX + b}
= 
a^2 \sigma_X^2.
\end{align}
$$

Hence, 

$$
\begin{align}
a
=
\pm
\frac{\sigma_Y}{\sigma_X}
&&
\mbox{and}
&&
b
=
\frac{\sigma_X\mu_Y \mp \sigma_Y\mu_X}{\sigma_X}.
\end{align}
$$

The sign $\pm$ above is determined by the covariance. If $c_{X, Y} = +\sigma_X\sigma_Y$, then $X$ and $Y$ are *positively correlated* so $a> 0$; otherwise if $c_{X, Y} = -\sigma_X\sigma_Y$, $X$ and $Y$ are *negatively* correlated in which case the slope is negative, i.e., $a< 0$. As standard deviations are always positive, the fraction $\sigma_Y/\sigma_X$ will always be positive. Hence,


$$
\begin{align}
a
=
\left\{
\begin{array}{rc}
\frac{\sigma_Y}{\sigma_X} & \mbox{if $c_{X, Y} = \sigma_X\sigma_Y$};
\\
-\frac{\sigma_Y}{\sigma_X} & \mbox{if $c_{X, Y} = -\sigma_X\sigma_Y$}
\end{array}
\right.
&&
\mbox{and}
&&
b
=
\left\{
\begin{array}{rc}
\frac{\sigma_X\mu_Y - \sigma_Y\mu_X}{\sigma_X} & \mbox{if $c_{X, Y} = \sigma_X\sigma_Y$};
\\
\frac{\sigma_X\mu_Y + \sigma_Y\mu_X}{\sigma_X} & \mbox{if $c_{X, Y} = -\sigma_X\sigma_Y$.}
\end{array}
\right.
\end{align}
$$
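The case distinction above is short enough to sketch directly. The helper name `linear_relation` and its boolean flag are assumptions introduced for this example; the flag plays the role of the sign of $c_{X, Y}$.

```python
def linear_relation(mu_x, mu_y, sigma_x, sigma_y, positively_correlated=True):
    """Return (a, b) with Y = a*X + b for perfectly correlated X and Y."""
    sign = 1.0 if positively_correlated else -1.0
    a = sign * sigma_y / sigma_x
    b = mu_y - a * mu_x  # equals (sigma_x*mu_y -/+ sigma_y*mu_x) / sigma_x
    return a, b


# Illustrative (made-up) parameters.
a_pos, b_pos = linear_relation(0.08, 0.05, 0.2, 0.1, positively_correlated=True)
a_neg, b_neg = linear_relation(0.08, 0.05, 0.2, 0.1, positively_correlated=False)
```

In both cases the pair reproduces the mean and variance of $Y$, that is, $a\mu_X + b = \mu_Y$ and $a^2\sigma_X^2 = \sigma_Y^2$.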

## Applications

We consider an application of these optimizations to portfolio theory. In this context, how might one interpret $X, Y$ and $Z_{\lambda, \delta}$? Generally, $X$ and $Y$ might refer to particular stocks or funds while $Z$ refers to a combination, or *portfolio*.

There are two obvious metrics to which $X$ and $Y$ could refer: *stock prices* and *growth rates*. The relation between the two is the following. For a time $T$, if $p_0^X$ and $p_1^X$ refer to the price of $X$ at the start and end of $T$ respectively, then the growth rate $r_X$ over $T$ is defined by

```{math}

p_1^X = p_0^Xe^{r_XT}.

```

That is, $r_X = \frac{1}{T} \ln(p_1^X/p_0^X)$. 
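The relation between prices and growth rates can be sketched as a pair of inverse functions. The names `growth_rate` and `end_price` and the sample prices are assumptions made for illustration.

```python
import math


def growth_rate(p_start, p_end, period_length):
    """Continuously compounded growth rate r with p_end = p_start * exp(r * T)."""
    if p_start <= 0 or p_end <= 0 or period_length <= 0:
        raise ValueError("prices and period length must be positive")
    return math.log(p_end / p_start) / period_length


def end_price(p_start, rate, period_length):
    """Inverse relation: the price reached after growing at `rate` for T."""
    return p_start * math.exp(rate * period_length)


# Illustrative (made-up) prices over a period of length T = 1.
r_x = growth_rate(100.0, 105.0, 1.0)
p_back = end_price(100.0, r_x, 1.0)  # recovers the end price, up to rounding
```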

```{note}

Assuming growth rates, rather than prices, are normally distributed is another way to say *prices are log-normal*.

```

Should the statistics we observe refer to the variation in prices or the variation in growth rates? We address the issue with prices in the following.

```{warning}

Suppose $X$ is a stock and $p^X$ the random variable representing the price of $X$. Based on a sampling of prices over a time period, let $\mu_X$ be the average and $\sigma_X^2$ the variance in prices. The issue now is that the distribution $\mathcal N(\mu_X, \sigma_X^2)$ does *not* represent growth in the unit price of $X$. Rather, it assumes the price of $X$ will hover around the average $\mu_X$ with volatility $\sigma^2_X$. This will be the case if one expects *zero* growth in the price of $X$. If there is evidence of growth in the price of $X$ however, the distribution $\mathcal N(\mu_X, \sigma_X^2)$ is doomed to grossly mispredict future values of $p^X$.

```

### Growth rates 

Continuing on from the above warning, a probability distribution is more appropriate for modelling *rates of growth*. This is because a constant, non-zero rate of growth will, depending on its sign, represent changes in the price more consistent with a growing or declining value. If the *average* growth rate is positive, it will indicate an upward trend in the price, ideally reflecting overall growth in value.


Rates of growth can be sampled from historical data by segmenting a time *range* $T_{range}$ into time periods of length $T$ and calculating the daily growth of the asset or stock over each such period. 

For stocks $X, Y$ let $r_X, r_Y$ be the random variables representing the respective growth rates in $X,Y$ over time periods of length $T$. Then $\mu_X, \mu_Y$ are the *average* rates of growth over the whole time range $T_{range}$. 
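Sampling growth rates from a price history can be sketched as follows. The price series is made up for illustration, the helper name `period_growth_rates` is an assumption, and consecutive prices are assumed to be one period apart.

```python
import math
import statistics


def period_growth_rates(prices, period_length=1.0):
    """Growth rates over consecutive periods, from a list of positive prices."""
    return [
        math.log(p_next / p_prev) / period_length
        for p_prev, p_next in zip(prices, prices[1:])
    ]


# Made-up daily closing prices for a stock X.
prices_x = [100.0, 101.0, 99.5, 102.0, 103.5]
rates_x = period_growth_rates(prices_x)

mu_x = statistics.mean(rates_x)      # sample average growth rate
sigma_x = statistics.stdev(rates_x)  # sample standard deviation of growth rates
```

Because logarithms telescope, the sample mean equals the growth rate over the whole range divided by the number of periods.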

```{note}

These averages $\mu_X, \mu_Y$ reflect the expected level of growth in $X, Y$ over a given day. To annualize these rates we simply multiply by the number of days in a year. Similarly, the standard deviation represents expected volatility in daily growth. If daily growth rates are assumed independent, it is the variance that scales with the number of days, so the standard deviation is annualized by multiplying by the *square root* of the number of days in a year.

```

Over the time range a portfolio $Z = \lambda X + \delta Y$ grows at an average rate $\mu_Z = \lambda \mu_X + \delta \mu_Y$.

```{warning}

Here $r_{Z} = \lambda r_X + \delta r_Y$ does *not* mean we are holding $\lambda$ units of $X$ and $\delta$ units of $Y$. 

```

Let $\lambda^*, \delta^*$ be optimal parameters. Then $\mu_{Z^*}$ represents the *maximum* expected growth subject to a fixed risk tolerance, being the variance in growth rates $\sigma^2_{Z}$ for the portfolio $Z$. 

Now suppose we know $\mu_{Z^*}$ by having found the optimal parameters $\lambda^*, \delta^*$. Suppose moreover that we have an amount $N$ to invest. *How much of $N$ should we invest into holding $X$ and how much into holding $Y$?* 

### Optimal unit holdings

To answer this question, recall from earlier that unit prices are related to growth rates over a time period of length $T$ by,

$$
\begin{align}
p_1^X = p_0^Xe^{r_XT};
&&
p_1^Y = p_0^Ye^{r_YT}
&&
\mbox{and}
&&
p_1^Z = p_0^Ze^{r_ZT}.
\end{align}
$$

Now let $N$ be our total investment amount. Write $N = N_X + N_Y$ where $N_X$ represents the amount invested in $X$ and $N_Y$ the amount invested in $Y$. 

Above, the prices $p_i^X, p_i^Y, p_i^Z$ are over a time period. We invest at the start of the time *range* $T_{range}$ however, so these prices are now at the start and end of the time range. That is, $p_0^X, p_1^X$; $p_0^Y, p_1^Y$ and $p_0^Z, p_1^Z$ now denote the starting and ending prices respectively of $X, Y$ and $Z$ over the time range $T_{range}$. The respective unit holdings are,

$$
\begin{align}
\lambda_X = \frac{N_X}{p_0^X}
&&
\mbox{and}
&&
\delta_Y = \frac{N_Y}{p_0^Y}
\end{align}
$$

Our objective is to find a formula relating $\lambda_X, \delta_Y$ with the optimal parameters $\lambda^*, \delta^*$ in the average growth rate $\mu_{Z^*}$. With $\lambda_X,\delta_Y$ as above, note that $p_0^{Z}, p_1^Z$ will be related to the prices $p_0^X, p_1^X$ and $p_0^Y, p_1^Y$ by

$$
\begin{align}
p_0^Z = \lambda_X p_0^X + \delta_Yp_0^Y
&&
\mbox{and}
&&
p_1^Z = \lambda_X p_1^X + \delta_Yp_1^Y.
\end{align}
$$

Let $\ell$ denote the length of the time range $T_{range}$. With $\mu_{Z} = \lambda \mu_X + \delta \mu_Y$, see that 

```{math}
p_1^Z = p_0^Ze^{\mu_Z\ell} = p_0^Ze^{\lambda \mu_X\ell + \delta \mu_Y\ell}.
```

From the above expressions for $p_0^Z, p_1^Z$ we find

$$
\begin{align}
\lambda_X p_0^Xe^{\mu_X\ell} + \delta_Yp_0^Ye^{\mu_Y\ell}
&=
p_1^Z 
\\
&= p_0^Z e^{\mu_Z\ell}
\\
&= \left(\lambda_X p_0^X + \delta_Yp_0^Y\right)e^{\lambda \mu_X\ell + \delta \mu_Y\ell}
\end{align}
$$


Hence,

$$
\begin{align}
\lambda_X
=
\delta_Y
\frac{p_0^Y}{p_0^X}
\frac{e^{\lambda \mu_X\ell + \delta \mu_Y\ell} - e^{\mu_Y\ell}}{e^{\mu_X\ell} - e^{\lambda\mu_X\ell + \delta\mu_Y\ell}}
\end{align}
$$

Now recall that we have a specified investment amount $N$. Using this gives,

$$
\begin{align}
N 
&= 
N_X + N_Y
\\
&=
p_0^X\lambda_X + p_0^Y\delta_Y
\\
\Longrightarrow
\delta_Y
&= 
N\frac{1}{p_0^Y} - \lambda_X\frac{p_0^X}{p_0^Y}
\\
&= 
N\frac{1}{p_0^Y} - \delta_Y \left(\frac{e^{\lambda \mu_X\ell + \delta \mu_Y\ell} - e^{\mu_Y\ell}}{e^{\mu_X\ell} - e^{\lambda\mu_X\ell + \delta\mu_Y\ell}}\right) 
\end{align}
$$

Rearranging for $\delta_Y$ gives an expression for $\delta_Y$ in terms of known parameters. Substituting this into $\lambda_X$ then gives $\lambda_X$ in terms of known parameters as well. We find,

$$
\begin{align}
\lambda_X
=
\frac{N}{p^X_0}
\frac{e^{\lambda \mu_X\ell + \delta \mu_Y\ell} - e^{\mu_Y\ell}}{e^{\mu_X\ell} - e^{\mu_Y\ell}}
&&
\mbox{and}
&&
\delta_Y
=
\frac{N}{p_0^Y}
\frac{e^{\mu_X\ell} - e^{\lambda \mu_X\ell + \delta \mu_Y\ell}}{e^{\mu_X\ell} - e^{\mu_Y\ell}}
\end{align}
$$

As a consistency check, see that $p_0^X\lambda_X + p_0^Y\delta_Y = N$. Note moreover that it is relatively symmetric, as we might expect since $X$ and $Y$ enter symmetrically (i.e., one is not preferable to the other). 

```{note}

By relatively symmetric it is meant *symmetric* if $\lambda = \delta$. So in the case $\lambda = \delta$, interchanging $X$ and $Y$ has no effect on the respective holdings.

```


````{warning} 

The formulae for the holdings $\lambda_X, \delta_Y$ are not defined in the case $\mu_X = \mu_Y$.

````
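The holdings formulae can be sketched in code as below. The function name `unit_holdings`, its argument names and the numerical inputs are assumptions for illustration; the case $\mu_X = \mu_Y$ is rejected, in line with the warning.

```python
import math


def unit_holdings(total, p0_x, p0_y, mu_x, mu_y, lam, delta, ell):
    """Return (units of X, units of Y) for an investment `total`.

    lam, delta are the growth-rate weights and ell is the length of the
    time range; mu_z = lam*mu_x + delta*mu_y is the portfolio growth rate.
    """
    e_x = math.exp(mu_x * ell)
    e_y = math.exp(mu_y * ell)
    if e_x == e_y:
        raise ValueError("holdings are undefined when mu_x equals mu_y")
    e_z = math.exp((lam * mu_x + delta * mu_y) * ell)
    units_x = (total / p0_x) * (e_z - e_y) / (e_x - e_y)
    units_y = (total / p0_y) * (e_x - e_z) / (e_x - e_y)
    return units_x, units_y


# Illustrative (made-up) inputs: daily rates over a range of 250 days.
units_x, units_y = unit_holdings(10000.0, 50.0, 20.0, 0.0004, 0.0002, 0.6, 0.4, 250)
budget = 50.0 * units_x + 20.0 * units_y  # should recover the investment amount
```

The returned values are real numbers and may be negative, so any rounding to whole units or interpretation as a short position is left to the caller.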

We can conclude now that for optimal parameters $\lambda^*, \delta^*$ defining the maximum growth rate $\mu_{Z^*}$ subject to a fixed volatility: for a given investment amount $N$, we should hold $\lambda_X^*$ units of $X$ and $\delta_Y^*$ units of $Y$ where $\lambda_X^*, \delta_Y^*$ are given by the above formulas for $\lambda_X, \delta_Y$ with $\lambda = \lambda^*$ and $\delta = \delta^*$.

```{note}

Since unit holdings are integers, we ought to hold the respective *floors* of $\lambda_X^*, \delta_Y^*$. Moreover, these holdings could be negative which signifies taking a short position in the stock in question.

```

The optimal amount to invest in $X$ is then $p_0^X\lambda_X^*$ and in $Y$ it is $p_0^Y\delta^*_Y$.


```{note}

The answer we have obtained above does not generalize to find optimal holdings among three or more stocks. It was essential to use the relation $N_X = N - N_Y$ to eliminate one degree of freedom and thereby return a unique solution. If there are three or more stocks, this relation will still eliminate one degree of freedom, but there will be more degrees of freedom overall. As such, this relation alone will not be sufficient to yield a unique solution.

```

### Remarks on the perfectly correlated case

With the shift in focus from prices to growth rates, recall that the portfolio $Z_{\lambda, \delta} = \lambda X + \delta Y$ represents the fund which grows at the rate $\lambda r_X + \delta r_Y$ over a given day. Accordingly, it grows at an *average* daily rate $\mu_Z = \lambda\mu_X + \delta \mu_Y$. If the growth rates $r_X, r_Y$ are perfectly correlated, then holding $Y$ represents either adding to one's holding position in $X$ or shorting $X$. Either way, the growth rate is unaffected. After all, the growth rate in one unit of $X$ is the same as the growth rate of one million units of $X$. The expected *return* of course differs by a factor of a million, but this is ultimately constrained by our initial investment amount $N$.

If our random variables represent growth rates, it does not make sense to consider the case of perfectly correlated stocks. Indeed, in order to achieve a higher or lower rate of growth it is necessary to combine independent or imperfectly correlated stocks.

Another way to see this is by heeding the above warning. From the above formulae for unit holdings $\lambda_X, \delta_Y$:

- if $\mu_X = \mu_Y$, then $\lambda_X, \delta_Y$ are both undefined;
- if $\lambda\mu_X = (1-\delta)\mu_Y$, then concerning the $X$ holdings, $\lambda_X = 0$;
- if $\delta\mu_Y = (1-\lambda)\mu_X$, then concerning the $Y$ holdings, $\delta_Y = 0$.

Now recall that if $X, Y$ are perfectly correlated, then they lie in a linear relationship, i.e., are linearly *dependent*. In the case they are "canonically dependent", $X = Y$, by the first dot point we see it will not make sense to study stochasticity in the daily growth rates of $X, Y$. 

More generally, recall that if $X$ and $Y$ are linearly dependent, the equation for determining the optimal parameters $\lambda^*, \delta^*$ is underdetermined. We can solve uniquely for a linear combination $\gamma^*$ of $\lambda^*, \delta^*$, but there will always be a degree of freedom in choosing $\lambda^*, \delta^*$. By the second and third dot point above, this degree of freedom can be exploited to set either $\lambda_X$ or $\delta_Y$ to zero. This accords with the common sense understanding above that if $X$ and $Y$ are perfectly correlated, holding both means to either add to our holding or short position in one. This can only affect the sign of the growth rate but not its magnitude (e.g., $-X$ grows at an expected rate of $-\mu_X$).
