In [None]:
import re
from pathlib import Path

from dialoghelper import add_msg

def md_to_notes(path):
    "Read markdown file and create a note for each header section"
    txt = Path(path).read_text()
    # The capture group keeps each header (levels 1-4) in the split output,
    # so parts = [preamble, header1, body1, header2, body2, ...].
    parts = re.split(r'^(#{1,4}\s+.+)$', txt, flags=re.MULTILINE)
    # Any text before the first header becomes its own note.
    if parts[0].strip(): add_msg(content=parts[0].strip())
    # Pair each header with the body that follows it.
    for i in range(1, len(parts), 2):
        content = parts[i] + (parts[i+1] if i+1 < len(parts) else '')
        if content.strip(): add_msg(content=content.strip())

In [None]:
md_to_notes('./md/ch09.md')

## Chapter 9

## Coherent Measures of Risk

Alternatives to the variance-as-risk measure are ubiquitous and, as indicated by our previous observations such as the inverse relationship between variance-as-risk and *ex-post* returns, warranted. In this chapter we identify the requirements for so-called *coherent measures of risk*. Following the approach of previous chapters, we do not take these conditions as necessary in any sense but, rather, as a taxonomy of what one might articulate as a reasonable risk measure. For example, at the outset of the text we compared unlevered and levered portfolios and noted that any reasonable measure of risk should account for leverage. This, among other similar features, is indeed the case for coherent measures of risk.

After establishing the standard conditions, we give several examples of potential measures of risk, some of which may fail one or another requirement of being a coherent measure of risk. A few particularly popular risk measures will be formulated as linear programming problems.

### 9.1 Definition and Examples

We consider coherent measures of risk from the frame of portfolio positions. Specifically, for a fixed universe of $N$ securities, we define a portfolio, $\Pi$, as the positions in these assets,

$$\Pi = \begin{pmatrix} p_1 \\ \vdots \\ p_N \end{pmatrix}.$$

In this formulation, we treat the $p_i$ as nonrandom. It is common to use $\Pi$ to refer to both the portfolio and its vector of positions.

The performance of a portfolio is subject to some stochastic element. We let

$$r = \begin{pmatrix} r_1 \\ \vdots \\ r_N \end{pmatrix}$$

denote the performance of the $N$ assets in our universe, with $r_i$ corresponding to the $i$th asset. We may regard $r$ as either normalized performance (that is, percent change) or changes in value measured in the currency of $\Pi$; in the former case, we often assume that the $p_i$ are normalized as weights as well, while in the latter, $p_i$ will denote the position in security $i$, irrespective of other positions taken. Where necessary, we will specify which of these interpretations for $r$ is being used. In contrast to our work in mean-variance optimization, however, we will prefer positions and changes in security values over returns and normalized positions in the present work and will assume this is the case unless otherwise specified. Of course, there is also an implicit assumption of a unit of time over which $r$ is evaluated. The profit or loss of the portfolio $\Pi$ over a given period is simply $\Pi'r$, as we have seen previously. Finally, to be consistent with the conditions that follow, we will fix the sign of $r$ so that a positive value indicates a loss for a long (positive) position.

Before stating the mathematical formulation of what makes a coherent measure of risk, we first give some qualitative motivation. First, we would expect that any reasonable risk measure should respect leverage. That is, a levered portfolio should multiply the risk of its un-levered counterpart. Next, we would want that the addition of a risk-free asset to a portfolio should not add to the risk of the portfolio. This condition will actually be strengthened in what follows. Third, based on all that has preceded, we have some preference for diversification, and we would prefer a risk measure that reflects this. Finally, if one portfolio almost surely exceeds another under all scenarios,  $r$ , then we should like to say that the first is less risky than the second.

With these features established, we say

$$\rho: \Pi \to \mathbb{R} \tag{9.1}$$

is a coherent measure of risk if  $\rho(\cdot)$  satisfies the following conditions:

**Positive Homogeneity** For  $\lambda$  a positive scalar,  $\rho(\lambda \cdot \Pi) = \lambda\rho(\Pi)$ .

**Translation Invariance** For  $p_f$  a position in the risk-free asset with profit,  $r_f$ ,  $\rho(\Pi + p_f) = \rho(\Pi) - p_f \cdot r_f$ .

**Monotonicity** If  $\Pi_1'r < \Pi_2'r$  for all instances of  $r$ , then  $\rho(\Pi_1) < \rho(\Pi_2)$ .

**Subadditivity**  $\rho(\Pi_1 + \Pi_2) \le \rho(\Pi_1) + \rho(\Pi_2)$ .

Portfolio variance and volatility are our archetypal examples of risk measures. It should be readily apparent that these examples will not satisfy all of the conditions above. Translation Invariance is immediately an issue, since variance is translation invariant in the traditional sense; viz., adding a nonrandom scalar to a random variable does not affect the variance of the random variable. Furthermore, Positive Homogeneity gives an indication that volatility may be better suited to the remaining conditions than variance. Continuing this cursory discussion, it may also be apparent to some readers that coherent measures of risk will focus on values of the distribution of loss (which must be denoted as positive in the above framing) rather than the moments of these distributions.

We proceed by analyzing various risk measures according to the conditions just set out.

#### 9.1.1 Volatility

If we define  $\rho(\Pi)$  by the volatility of the portfolio loss  $\Pi'r$ ,  $\nu(\Pi)=\sqrt{Var(\Pi'r)}$ , or in terms of the inner product and associated norm defined previously,

$$\nu(\Pi)=\sqrt{(\Pi,\Pi)_\Sigma}=||\Pi||_\Sigma,$$

for  $\Sigma$  the covariance of  $r$ , many of the conditions for a coherent measure of risk are immediately evident.

To check Positive Homogeneity, we simply calculate, for  $\lambda>0$ ,

$$\begin{aligned}\nu(\lambda\Pi)&=||\lambda\Pi||_\Sigma\\&=\lambda||\Pi||_\Sigma\\&=\lambda\nu(\Pi).\end{aligned}$$

As alluded to earlier, volatility does not adhere to the Translation Invariance condition of a coherent measure of risk since the inner product defining volatility is translation invariant with respect to nonrandom shifts; viz.,

$$\begin{aligned}\nu(\Pi+p_f)&=||\Pi+p_f||_\Sigma\\&=||\Pi||_\Sigma\\&=\nu(\Pi).\end{aligned}$$

This same reasoning precludes Monotonicity from obtaining for volatility. Consider two portfolios that are identical but for an additional long position in the risk-free asset in the first portfolio. Clearly, in terms of the positions of the portfolios (and not weights), the first portfolio's losses will always be less than the second's, since the risk-free asset creates a parallel shift in these losses. However, as just shown, the risks of these portfolios are equal. More generally, since coherent measures of risk are location dependent while volatility and variance are not, this condition will fail in more general cases as well.

Lastly, to verify that volatility is Subadditive, we begin with the square of $\nu(\cdot)$ for ease of exposition,

$$\begin{aligned}\nu(\Pi_1+\Pi_2)^2 &= \|\Pi_1+\Pi_2\|_\Sigma^2 \\ &= \|\Pi_1\|_\Sigma^2 + \|\Pi_2\|_\Sigma^2 + 2(\Pi_1, \Pi_2)_\Sigma \\ &\le \|\Pi_1\|_\Sigma^2 + \|\Pi_2\|_\Sigma^2 + 2|(\Pi_1, \Pi_2)_\Sigma| \\ &\le \|\Pi_1\|_\Sigma^2 + \|\Pi_2\|_\Sigma^2 + 2\|\Pi_1\|_\Sigma \cdot \|\Pi_2\|_\Sigma \\ &= (\|\Pi_1\|_\Sigma + \|\Pi_2\|_\Sigma)^2,\end{aligned}$$

where the second inequality is obtained from Cauchy-Schwarz. Taking square roots, we have

$$\nu(\Pi_1+\Pi_2)\le\|\Pi_1\|_\Sigma+\|\Pi_2\|_\Sigma=\nu(\Pi_1)+\nu(\Pi_2).$$
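The volatility properties derived above can be spot-checked numerically. The following is a minimal sketch, assuming a made-up two-asset covariance matrix `SIGMA` and made-up positions `pi_1` and `pi_2`; none of these numbers come from the text, and a check on one example is of course not a proof.

```python
import math

# Hypothetical covariance matrix of the loss vector r for two assets
# (illustrative numbers, not taken from the text).
SIGMA = [[0.04, 0.01],
         [0.01, 0.09]]

def inner(p, q, sigma=SIGMA):
    """Inner product (p, q)_Sigma = p' Sigma q."""
    n = len(p)
    return sum(p[i] * sigma[i][j] * q[j] for i in range(n) for j in range(n))

def vol(p):
    """Volatility nu(p) = ||p||_Sigma."""
    return math.sqrt(inner(p, p))

pi_1 = [100.0, -50.0]   # positions, not weights
pi_2 = [-20.0, 80.0]
pi_sum = [a + b for a, b in zip(pi_1, pi_2)]
lam = 3.0

# Positive Homogeneity: nu(lam * pi) equals lam * nu(pi).
homogeneous = math.isclose(vol([lam * x for x in pi_1]), lam * vol(pi_1))
# Cauchy-Schwarz: |(pi_1, pi_2)_Sigma| <= ||pi_1||_Sigma * ||pi_2||_Sigma.
cauchy_schwarz = abs(inner(pi_1, pi_2)) <= vol(pi_1) * vol(pi_2)
# Subadditivity: nu(pi_1 + pi_2) <= nu(pi_1) + nu(pi_2).
subadditive = vol(pi_sum) <= vol(pi_1) + vol(pi_2)

print(homogeneous, cauchy_schwarz, subadditive)
```

Here the two portfolios are negatively correlated under `SIGMA`, so the combined volatility falls well below the sum of the individual volatilities.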

The above construction gives some guidance in identifying why variance is not Subadditive in the general case. This is left as an exercise.

In analyzing volatility above, we noted that a coherent measure of risk must be location dependent. One rather immediate candidate, consistent with previous work, might be to define a risk measure based on both expectation and volatility as

$$\rho_\gamma(\Pi)=\mathbb{E}(\Pi'r)+\gamma\cdot\|\Pi\|_\Sigma \tag{9.2}$$

for some fixed  $\gamma>0$ . We leave it to the reader to show that Positive Homogeneity, Translation Invariance, and Subadditivity hold for  $\rho_\gamma$ . Monotonicity fails in the general case, however, since we have not specified *r* sufficiently to connect the rank ordering of portfolio performance to the values given by  $\rho_\gamma$ .

This, too, gives some indication of a way to construct a coherent measure of risk from this work. Namely, in the case that $r$ has an elliptical distribution, $\rho_\gamma$ is directly related to the percentiles of the loss distribution defined by $\Pi'r$. As a simplified example, let $r\sim N(\mu,\Sigma)$. The cumulative distribution function for portfolio losses, $Y$, is then given in terms of the standard normal cumulative distribution function, $\Phi$, by

$$F_\Pi(y)=\Phi\left(\frac{y-\Pi'\mu}{\|\Pi\|_\Sigma}\right)$$

where, again,  $F_\Pi(y)=\mathbb{P}(Y<y)$ . The definition of  $\rho_\gamma$ , then, implies that  $\gamma$  exactly determines a probability; viz., for fixed  $\gamma$ ,

$$\begin{aligned}F_\Pi(\mathbb{E}(\Pi'r)+\gamma\cdot\|\Pi\|_\Sigma)&=\Phi\left(\frac{\mathbb{E}(\Pi'r)+\gamma\cdot\|\Pi\|_\Sigma-\Pi'\mu}{\|\Pi\|_\Sigma}\right)\\&=\Phi(\gamma).\end{aligned}$$

That is,  $\gamma$  determines the probability that portfolio losses are bounded above by  $\rho_\gamma(\Pi)$ ; viz.,  $\mathbb{P}(Y<\mathbb{E}(\Pi'r)+\gamma\cdot\|\Pi\|_\Sigma)$ .

We may then reformulate the problem in terms of these percentiles. For example, if we want to look at the 95<sup>th</sup> percentile of portfolio losses, we fix  $\gamma$  as  $\gamma=\Phi^{-1}(0.95)$ . For this  $\gamma$ ,  $\rho_\gamma(\Pi)$  is the value such that losses will only exceed  $\rho_\gamma(\Pi)$  5% of the time.
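As an illustration of the percentile reading of $\rho_\gamma$, the sketch below computes $\gamma=\Phi^{-1}(0.95)$ with the standard library's `statistics.NormalDist` and confirms that the resulting level sits at the 95th percentile of a normally distributed loss. The inputs `mu`, `sigma`, and `pi` are hypothetical values chosen only for this example.

```python
import math
from statistics import NormalDist

def loss_moments(pi, mu, sigma):
    """Mean and standard deviation of the loss Y = pi'r when r has mean mu, covariance sigma."""
    n = len(pi)
    mean = sum(pi[i] * mu[i] for i in range(n))
    var = sum(pi[i] * sigma[i][j] * pi[j] for i in range(n) for j in range(n))
    return mean, math.sqrt(var)

def rho_gamma(pi, mu, sigma, beta):
    """(9.2) with gamma fixed as Phi^{-1}(beta)."""
    gamma = NormalDist().inv_cdf(beta)
    mean, std = loss_moments(pi, mu, sigma)
    return mean + gamma * std

# Hypothetical inputs: expected losses per unit position and loss covariance.
mu = [-0.5, -0.2]
sigma = [[4.0, 1.0],
         [1.0, 9.0]]
pi = [10.0, 5.0]

level = rho_gamma(pi, mu, sigma, beta=0.95)

# Under r ~ N(mu, Sigma), Y is normal with the moments above,
# so P(Y < level) should be recovered as beta.
mean, std = loss_moments(pi, mu, sigma)
print(level, NormalDist(mean, std).cdf(level))
```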

Returning to the question of constructing a coherent measure of risk via volatility, we have, finally, that for normally distributed $r$, $\rho_\gamma$ is a coherent measure of risk, since we may now verify that it satisfies Monotonicity. This follows directly from the work we just established: if $\Pi'_1 r < \Pi'_2 r$ for all instances of $r$, then, necessarily, the percentiles of the loss distribution for $\Pi_1$ are all less than those of $\Pi_2$. Consequently, $\rho_\gamma(\Pi_1) < \rho_\gamma(\Pi_2)$.

The generalization to elliptical distributions follows this same pattern, and the exercise is left to the reader.

A further generalization of the above leads to the concept of Value-at-Risk, which we formally define and analyze in the next example.

#### 9.1.2 Value-at-Risk

Given a percentile, $\beta$, the $\beta$ Value-at-Risk, or $\beta$-VaR, is the smallest value, $\alpha_\beta$, such that the probability of losses exceeding $\alpha_\beta$ is at most $1-\beta$. In the continuous example given above, this may be stated more concisely since $\alpha_\beta$ is simply the $\beta$ percentile of losses.<sup>1</sup>

It is useful to formalize some underlying concepts. If  $r$  has probability density function  $f(\cdot)$ , then the probability that portfolio losses of the portfolio,  $\Pi$ , do not exceed some specified value,  $\alpha$ , is given by the integral

$$\Psi(\Pi,\alpha)=\int_{\Pi'r<\alpha} f(r)dr. \tag{9.3}$$

The  $\beta$ -VaR is determined from  $\Psi$  as

$$\alpha_\beta(\Pi)=\min\left\{\alpha\in\mathbb{R}|\Psi(\Pi,\alpha)\ge\beta\right\}, \tag{9.4}$$

for some  $\beta\in(0,1)$ . It is common to discuss  $\beta$  as a percent; e.g., we often say 95% VaR for  $\beta=0.95$ . In spite of our statement that we will consider continuous portfolio losses in our treatment, we note that the definition given in (9.4) accounts for the non-continuous case by assigning  $\alpha_\beta(\Pi)$  to the leftmost point in the nonempty interval consisting of values,  $\alpha$  such that  $\Psi(\Pi,\alpha)\ge\beta$ .

The $\beta$-VaR of a portfolio may be approximated by sampling according to the distribution of $r$. For such samples, $\{r_k\}_{k=1}^K$, the integral defining (9.3) is approximated as

$$\Psi(\Pi,\alpha)\approx\frac{1}{K}\sum_{k=1}^K\delta(\Pi'r_k<\alpha), \tag{9.5}$$

where $\delta(\Pi'r_k<\alpha)=1$ if $\Pi'r_k<\alpha$ and 0 otherwise. Notice that the density function is replaced by $\frac{1}{K}$ since the $r_k$ are sampled according to the density of $r$.
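The sampling approximation (9.5) translates directly into code. In this sketch, `psi_hat` is the indicator average in (9.5) and `var_hat` reads the sample $\beta$-VaR off the sorted losses. Both function names, the normal sampling model, and the positions are assumptions made for illustration. For a finite sample the quantile is taken with "at or below" rather than the strict inequality, which is one common convention and not something the text prescribes.

```python
import math
import random

def portfolio_losses(pi, samples):
    """Sampled portfolio losses pi'r_k."""
    return [sum(p * r for p, r in zip(pi, r_k)) for r_k in samples]

def psi_hat(pi, samples, alpha):
    """(9.5): fraction of sampled losses strictly below alpha."""
    losses = portfolio_losses(pi, samples)
    return sum(1 for y in losses if y < alpha) / len(losses)

def var_hat(pi, samples, beta):
    """Sample beta-VaR: the ceil(beta * K)-th smallest sampled loss."""
    losses = sorted(portfolio_losses(pi, samples))
    # The small tolerance guards against floating-point round-off in beta * K.
    k = max(1, math.ceil(beta * len(losses) - 1e-12))
    return losses[k - 1]

random.seed(0)
# Hypothetical model: two assets with independent, mean-zero normal losses.
samples = [[random.gauss(0.0, 1.0), random.gauss(0.0, 2.0)] for _ in range(10_000)]
pi = [1.0, 0.5]

alpha_95 = var_hat(pi, samples, 0.95)
print(alpha_95, psi_hat(pi, samples, alpha_95))
```

Under this model the loss has variance $1 + 0.25\cdot 4 = 2$, so the estimate should land near $\Phi^{-1}(0.95)\sqrt{2}\approx 2.33$, up to sampling error.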

We have already established that  $\beta$ -VaR is a coherent measure of risk when  $r$  has an elliptical distribution. However, this is not true in the general case.

<sup>1</sup>It is somewhat unfortunate to reuse the variables,  $\alpha$  and  $\beta$  in this way, but it is common in the literature.

Positive Homogeneity of  $\beta$ -VaR follows directly from the definitions. For  $\lambda > 0$ ,

$$\begin{aligned}\Psi(\lambda\Pi, \alpha) &= \int_{\lambda\Pi'r < \alpha} f(r)dr \\ &= \int_{\Pi'r < \frac{\alpha}{\lambda}} f(r)dr \\ &= \Psi\left(\Pi, \frac{\alpha}{\lambda}\right).\end{aligned}$$

So that

$$\begin{aligned}\alpha_\beta(\lambda\Pi) &= \min\{\alpha \in \mathbb{R} | \Psi(\lambda\Pi, \alpha) \ge \beta\} \\ &= \min\left\{\alpha \in \mathbb{R} \middle| \Psi\left(\Pi, \frac{\alpha}{\lambda}\right) \ge \beta\right\} \\ &= \min\left\{\lambda\frac{\alpha}{\lambda} \in \mathbb{R} \middle| \Psi\left(\Pi, \frac{\alpha}{\lambda}\right) \ge \beta\right\} \\ &= \lambda \min\left\{\frac{\alpha}{\lambda} \in \mathbb{R} \middle| \Psi\left(\Pi, \frac{\alpha}{\lambda}\right) \ge \beta\right\} \\ &= \lambda\alpha_\beta(\Pi).\end{aligned}$$

Translation Invariance follows similarly since, for a risk-free position, $p_f$,

$$\begin{aligned}\Psi(\Pi + p_f, \alpha) &= \int_{\Pi'r - p_f r_f < \alpha} f(r)dr \\ &= \int_{\Pi'r < \alpha + p_f r_f} f(r)dr \\ &= \Psi(\Pi, \alpha + p_f r_f).\end{aligned}$$

The remainder of this verification is left as an exercise.

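Monotonicity is similarly obtained. If $\Pi'_1 r < \Pi'_2 r$ for all instances of $r$, then $\Pi'_2 r < \alpha$ implies $\Pi'_1 r < \alpha$, so that $\Psi(\Pi_1, \alpha) \ge \Psi(\Pi_2, \alpha)$ for every $\alpha$. Hence,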

$$\begin{aligned}\alpha_\beta(\Pi_1) &= \min\{\alpha \in \mathbb{R} | \Psi(\Pi_1, \alpha) \ge \beta\} \\ &\le \min\{\alpha \in \mathbb{R} | \Psi(\Pi_2, \alpha) \ge \beta\} \\ &= \alpha_\beta(\Pi_2).\end{aligned}$$

The techniques used to verify these conditions cannot be used to establish Subadditivity. In fact, $\beta$-VaR is not subadditive in the general case. Consider a portfolio with a single position in each of two defaultable bonds, $B_1$ and $B_2$, with recoveries of \$40 and \$60, respectively. In the case of default, the value of each bond is given by its recovery value. Let the current values of $B_1$ and $B_2$ be \$100 and \$105, respectively, and assume each bond has a 3% chance of defaulting over the next year. Assume further that defaults are independent and that, if no default occurs, each bond's price increases by \$1.

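We begin by looking at the portfolio holding both bonds and consider the loss if a default occurs under the three combinations of defaults ($B_1$ defaults alone, $B_2$ defaults alone, or both default) and their probabilities. The probability-weighted loss over these scenarios is

$$\begin{aligned}&(0.03 \cdot 0.97)\big((100 - 40) + (105 - 106)\big) \\ +\;&(0.97 \cdot 0.03)\big((100 - 101) + (105 - 60)\big) \\ +\;&(0.03 \cdot 0.03)\big((100 - 40) + (105 - 60)\big) \approx 3.09,\end{aligned}$$

spread over scenarios with a combined probability of $1 - 0.97^2 = 5.91\%$. Because this probability exceeds 5%, the 95% VaR of the portfolio must be one of the default losses: the smallest of these, \$44, incurred when $B_2$ alone defaults. For each of the bonds considered in isolation, though, the probability of default is only 3%, so the 95% VaR is -\$1. The portfolio's 95% VaR thus exceeds the sum of the individual values, -\$2, indicating that diversification *increases* risk.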
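The two-bond example can be checked by enumerating the default scenarios directly and reading the 95% VaR off the resulting discrete loss distribution, following definition (9.4). This is a minimal sketch: the helper names are hypothetical, and the quantile uses "at or below" for the discrete outcomes, corresponding to the leftmost-point convention described after (9.4).

```python
from itertools import product

P_DEFAULT = 0.03
# (current value, recovery value) for B1 and B2; a surviving bond gains $1.
BONDS = [(100.0, 40.0), (105.0, 60.0)]

def loss_distribution(bonds):
    """Sorted (loss, probability) pairs over all independent default scenarios."""
    outcomes = {}
    for defaults in product([False, True], repeat=len(bonds)):
        loss, prob = 0.0, 1.0
        for (value, recovery), defaulted in zip(bonds, defaults):
            loss += (value - recovery) if defaulted else -1.0
            prob *= P_DEFAULT if defaulted else 1.0 - P_DEFAULT
        outcomes[loss] = outcomes.get(loss, 0.0) + prob
    return sorted(outcomes.items())

def value_at_risk(dist, beta):
    """Smallest loss level whose cumulative probability reaches beta."""
    cumulative = 0.0
    for loss, prob in dist:
        cumulative += prob
        if cumulative >= beta - 1e-12:
            return loss
    return dist[-1][0]

portfolio_var = value_at_risk(loss_distribution(BONDS), 0.95)
single_vars = [value_at_risk(loss_distribution([bond]), 0.95) for bond in BONDS]

print(portfolio_var, single_vars, portfolio_var > sum(single_vars))
```

The final comparison is the failure of Subadditivity: the VaR of the combined position exceeds the sum of the stand-alone VaRs.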

One feature of  $\beta$ -VaR that may be gleaned from the final example on its lack of general subadditivity is that  $\beta$ -VaR is not concerned with losses in the right portion of the loss distribution past  $\alpha_\beta(\Pi)$ ; viz., in the case of defaultable bonds, the default losses were not accounted for when considering the 95% VaR. One solution to account for such tail behavior is to compute the average loss conditioned on exceeding  $\alpha_\beta(\Pi)$ . This is exactly the definition of conditional value at risk, or CVaR, which we consider in our next example.

#### 9.1.3 Conditional Value-at-Risk

The  $\beta$ -CVaR of the portfolio  $\Pi$ , which we will denote by  $\phi_\beta(\Pi)$ , is the average of losses exceeding  $\alpha_\beta(\Pi)$ , the  $\beta$ -VaR. Formally, we have

$$\phi_\beta(\Pi)=\frac{1}{1-\beta}\int_{\Pi'r\ge\alpha_\beta(\Pi)}\Pi'r f(r)dr, \tag{9.6}$$

where, as usual,  $f(\cdot)$  is the probability density function for  $r$ .

We leave the verification that  $\beta$ -CVaR is a coherent measure of risk as an exercise. To show Monotonicity and Subadditivity, it is useful to write  $\phi_\beta(\Pi)$  as the average of all  $\tilde{\beta}$ -VaR values of the portfolio  $\Pi$  as  $\tilde{\beta}$  ranges from  $\beta$  to 1. That is,

$$\phi_\beta(\Pi)=\frac{1}{1-\beta}\int_\beta^1\alpha_{\tilde{\beta}}(\Pi)d\tilde{\beta}. \tag{9.7}$$

To prove this, we note that we may write  $\phi_\beta(\Pi)$  with respect to the distribution of portfolio losses directly,

$$\phi_\beta(\Pi)=\frac{1}{1-\beta}\int_{\alpha_\beta(\Pi)}^\infty yf_\mathcal{L}(y)dy, \tag{9.8}$$

where $f_\mathcal{L}(\cdot)$ is the probability density function for portfolio losses. If we let $\tilde{\beta}=F_\mathcal{L}(y)$, with $F_\mathcal{L}(\cdot)$ the cumulative distribution function, then $d\tilde{\beta}=f_\mathcal{L}(y)dy$. By noticing that $F_\mathcal{L}(y)$ is exactly $\Psi(\Pi,y)$, we have, using leftmost-point values, $F_\mathcal{L}^{-1}(\tilde{\beta})=y=\alpha_{\tilde{\beta}}(\Pi)$. Completing the change of variables in (9.8) gives (9.7) as desired.

Notice that (9.7) gives that $\beta$-CVaR is always at least as large as $\beta$-VaR. As such, if a particular risk tolerance is defined by $\beta$-VaR, matching this risk tolerance with a $\beta$-CVaR target will always suffice. Further, as the latter is a coherent measure of risk, it may be more desirable to focus on $\beta$-CVaR directly.

Another useful formula for  $\beta$ -CVaR is given by

$$\phi_\beta(\Pi)=\min_\alpha\left(\alpha+\frac{1}{1-\beta}\int_{r\in\mathbb{R}^N}[\Pi'r-\alpha]_+f(r)dr\right), \tag{9.9}$$

where, as usual,  $[x]_+ = \max(0, x)$ . Defining

$$G_\beta(\Pi,\alpha)=\alpha+\frac{1}{1-\beta}\int_{r\in\mathbb{R}^N}[\Pi'r-\alpha]_+f(r)dr,$$

it may be shown that for fixed  $\Pi$  [29, 30],

$$\frac{\partial G_\beta}{\partial\alpha}=1+\frac{1}{1-\beta}(\Psi(\Pi,\alpha)-1). \tag{9.10}$$

Taking a second partial with respect to  $\alpha$  shows that  $G_\beta$  is also convex in  $\alpha$ . As a result, the function is minimized in  $\alpha$  when  $\frac{\partial G_\beta}{\partial\alpha}=0$ ; in other words, when  $\Psi(\Pi,\alpha)=\beta$ . We have established earlier that this is satisfied exactly when  $\alpha=\alpha_\beta(\Pi)$ . Hence, we are left to verify that

$$\phi_\beta(\Pi)=\alpha_\beta(\Pi)+\frac{1}{1-\beta}\int_{r\in\mathbb{R}^N}[\Pi'r-\alpha_\beta(\Pi)]_+f(r)dr.$$

We have

$$\begin{aligned}\int_{r\in\mathbb{R}^N}[\Pi'r-\alpha_\beta(\Pi)]_+f(r)dr&=\int_{\Pi'r\ge\alpha_\beta(\Pi)}(\Pi'r-\alpha_\beta(\Pi))f(r)dr\\&=\int_{\Pi'r\ge\alpha_\beta(\Pi)}\Pi'rf(r)dr-(1-\beta)\alpha_\beta(\Pi)\\&=(1-\beta)\phi_\beta(\Pi)-(1-\beta)\alpha_\beta(\Pi),\end{aligned}$$

so that multiplying through by  $(1-\beta)^{-1}$  confirms the result.

Finally, we note, but do not prove here, that  $G_\beta(\Pi,\alpha)$  is convex in the joint variables of  $\Pi$  and  $\alpha$  [27].

We are now in a position, again, to provide an approximation to  $\beta$ -CVaR according to sampling from the distribution of  $r$ . Using the same notation as before, we have

$$G_\beta(\Pi,\alpha)=\alpha+\frac{1}{1-\beta}\frac{1}{K}\sum_{k=1}^K[\Pi'r_k-\alpha]_+, \tag{9.11}$$

and  $\phi_\beta(\Pi)$  may thus be approximated by minimizing (9.11) over  $\alpha$ . We note that if  $\beta$  is close to 1, it may be beneficial (for numerical stability) to minimize  $(1-\beta)G_\beta(\Pi,\alpha)$ . The question of how to handle the  $[\cdot]_+$  function in a minimization setting will be addressed subsequently in the chapter.
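To make the minimization of (9.11) concrete, the sketch below evaluates the sampled objective and minimizes it over $\alpha$ by brute force. Because the sampled objective is convex and piecewise linear in $\alpha$ with kinks only at the sampled losses, scanning the sampled losses themselves is enough to find a minimizer. The ten losses are a hypothetical toy sample, and the function names are illustrative.

```python
def g_beta(losses, alpha, beta):
    """(9.11): alpha + (1 / ((1 - beta) K)) * sum_k [loss_k - alpha]_+ ."""
    k = len(losses)
    return alpha + sum(max(y - alpha, 0.0) for y in losses) / ((1.0 - beta) * k)

def var_cvar_by_minimization(losses, beta):
    """Minimize (9.11) over alpha by scanning the sampled losses (the kink points)."""
    alpha_star = min(losses, key=lambda a: g_beta(losses, a, beta))
    return alpha_star, g_beta(losses, alpha_star, beta)

# Hypothetical sampled portfolio losses Pi'r_k, k = 1..10.
losses = [float(v) for v in range(1, 11)]
alpha_star, cvar = var_cvar_by_minimization(losses, beta=0.75)

print(alpha_star, cvar)
```

The minimizing `alpha_star` plays the role of the sample $\beta$-VaR, and the minimum value is the sample $\beta$-CVaR, which is never smaller than `alpha_star`. A brute-force scan costs on the order of $K^2$ operations; it is used here only to keep the example transparent.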

Both $\beta$-VaR and $\beta$-CVaR are quantile-based risk measures, relying on the distribution of portfolio losses. One feature that they do not retain as a result is the serial ordering of portfolio losses; viz., like all risk measures considered so far, they disregard whether large portfolio losses cluster, a feature of returns we have noted previously. Common risk measures which are order-dependent are constructed from portfolio drawdowns. We examine some of these examples next.

#### 9.1.4 Drawdown Measures

An especially common portfolio evaluation metric in hedge funds is the maximum drawdown over various time windows. Using the same notation as we have throughout, we now add a serial component to our portfolio loss variable, $r$, which we denote, as usual, by $r_t$. We define the relative portfolio value at time $k$ of a portfolio $\Pi$ by

$$P(\Pi,k)=-\Pi'\sum_{t=1}^k r_t, \tag{9.12}$$

where the minus sign accounts for our convention of  $r$  being a loss and discrete time steps are chosen. In the case of a continuous process, we may write

$$P(\Pi,\tau)=-\Pi'\int_1^\tau r_t dt.$$

In both discrete and continuous settings, the drawdown function at time  $\tau$  is given by

$$D(\Pi,\tau)=\max_{1\le t\le\tau} P(\Pi,t)-P(\Pi,\tau). \tag{9.13}$$

We will specifically write $D(\Pi,k)$ when using a discrete sample. The maximum drawdown over a period $t\in[1,T]$ is then

$$M(\Pi)=\max_{1\le t\le T}D(\Pi,t), \tag{9.14}$$

and the period average drawdown, or simply average drawdown, over the same time window is

$$A(\Pi)=\frac{1}{T}\sum_{k=1}^T D(\Pi,k). \tag{9.15}$$

As we have written it, the average drawdown is drawdown per unit time denoted by the discrete steps in  $[1,T]$ . In the case of a continuous process, we may write the average drawdown as

$$A(\Pi)=\frac{1}{T}\int_{t=0}^T D(\Pi,t)dt.$$
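For a discrete path, definitions (9.12) through (9.15) reduce to a running-peak computation. The sketch below takes the per-period portfolio losses $\Pi'r_t$ as a plain list; the five-period path is a hypothetical example and the function names are illustrative.

```python
def portfolio_values(period_losses):
    """(9.12): P(k) = -(cumulative loss through period k), for k = 1..T."""
    values, total = [], 0.0
    for loss in period_losses:
        total += loss
        values.append(-total)
    return values

def drawdowns(period_losses):
    """(9.13): D(k) = max_{1 <= t <= k} P(t) - P(k)."""
    out, peak = [], float("-inf")
    for value in portfolio_values(period_losses):
        peak = max(peak, value)
        out.append(peak - value)
    return out

def max_drawdown(period_losses):
    """(9.14): largest drawdown along the path."""
    return max(drawdowns(period_losses))

def average_drawdown(period_losses):
    """(9.15): drawdown per unit time along the path."""
    dd = drawdowns(period_losses)
    return sum(dd) / len(dd)

# Hypothetical per-period portfolio losses Pi'r_t (positive = loss).
path = [-2.0, 1.0, 3.0, -1.0, 2.0]
print(drawdowns(path), max_drawdown(path), average_drawdown(path))
```

Reordering the same five losses generally changes the drawdowns, which is the order dependence that distinguishes these measures from the quantile-based ones.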

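Neither maximum drawdown nor average drawdown is a coherent measure of risk, as each fails Translation Invariance. Subadditivity, by contrast, holds along any fixed path $\{r_t\}$: by (9.13), $D(\Pi,k)$ is a maximum of functions that are linear in $\Pi$, so that $D(\Pi_1+\Pi_2,k)\le D(\Pi_1,k)+D(\Pi_2,k)$. Monotonicity may be shown if, for instance, the definition is extended to multiple periods; viz., $\Pi'_1 r_t < \Pi'_2 r_t$ for every $t$. These verifications are left as exercises.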

Analogues of both  $\beta$ -VaR and  $\beta$ -CVaR may be constructed from the maximum drawdown distribution. For example, following Chekhlov, Uryasev, and Zabarankin [7], we may define  $\beta$ -Drawdown-at-Risk ( $\beta$ -DaR) and  $\beta$ -Conditional Drawdown at Risk ( $\beta$ -CDaR) as

$$\delta_\beta(\Pi)=\min\left\{\delta\in\mathbb{R}|\mathbb{P}(D(\Pi,t)<\delta)\ge\beta\text{ for }t\in[0,T]\right\}, \tag{9.16}$$

and

$$\Delta_\beta(\Pi)=\frac{1}{1-\beta}\frac{1}{T}\int_{D(\Pi,t)\ge\delta_\beta(\Pi)}D(\Pi,t)dt. \tag{9.17}$$

Following our previous work, one may show that (9.17) may also be written, in analogy with (9.7), as

$$\Delta_\beta(\Pi)=\frac{1}{1-\beta}\int_\beta^1\delta_{\tilde{\beta}}(\Pi)d\tilde{\beta}. \tag{9.18}$$

Similarly, it is helpful to write (9.17) as a minimization problem in $\delta$, and the resulting equation should be familiar:

$$\Delta_\beta(\Pi)=\min_\delta\left(\delta+\frac{1}{1-\beta}\frac{1}{T}\int_0^T[D(\Pi,t)-\delta]_+dt\right). \tag{9.19}$$

Of particular utility is being able to approximate the objective function in (9.19), which may be accomplished by writing

$$D_\beta(\Pi,\delta)=\delta+\frac{1}{1-\beta}\frac{1}{K}\sum_{k=1}^K\left[\max_{0\le j\le k}\left(P(\Pi,j)-P(\Pi,k)\right)-\delta\right]_+ \tag{9.20}$$

where  $r_t$  has been sampled according to the distribution of losses for the securities in question. We will return to this problem when we define drawdown constraints in an optimization setting. Before proceeding, however, we note that there is some difference between the sampling properties used to approximate  $\beta$ -CVaR and  $\beta$ -CDaR exhibited here. Namely, in the former case, sampling was according to the distribution of losses, irrespective of serial ordering. As such, as  $K$  grew, we might expect that the value obtained converged to the population  $\beta$ -CVaR. (This is the case.) In the above, however,  $K$  represents the number of time steps in the path of losses which we have used to define portfolio value. As such, to converge to (9.19), one would have to average (9.20) over several paths.
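The single-path objective (9.20) can be evaluated in the same way as its $\beta$-CVaR counterpart, with the sampled losses replaced by the drawdowns along the path. The sketch below does this for one hypothetical five-period path, taking the running peak from the first period as in (9.13). Per the caveat above, the result describes that one path only and would need to be averaged over many paths to approximate (9.19).

```python
def drawdowns(period_losses):
    """D(k) = max_{1 <= j <= k} P(j) - P(k), with P(k) = -(cumulative loss)."""
    out, total, peak = [], 0.0, float("-inf")
    for loss in period_losses:
        total += loss
        value = -total
        peak = max(peak, value)
        out.append(peak - value)
    return out

def d_beta(period_losses, delta, beta):
    """(9.20): delta + (1 / ((1 - beta) K)) * sum_k [D(k) - delta]_+ ."""
    dd = drawdowns(period_losses)
    return delta + sum(max(d - delta, 0.0) for d in dd) / ((1.0 - beta) * len(dd))

def dar_cdar_single_path(period_losses, beta):
    """Minimize (9.20) over delta by scanning the path's own drawdown values."""
    candidates = drawdowns(period_losses)
    delta_star = min(candidates, key=lambda d: d_beta(period_losses, d, beta))
    return delta_star, d_beta(period_losses, delta_star, beta)

# Hypothetical per-period portfolio losses Pi'r_t along one path.
path = [-2.0, 1.0, 3.0, -1.0, 2.0]
delta_star, cdar = dar_cdar_single_path(path, beta=0.5)

print(delta_star, cdar)
```

On this path the result lies between the average drawdown and the maximum drawdown, as one would expect from a tail average of drawdowns.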

We state, but do not prove, that  $\beta$ -CDaR satisfies all conditions of being a coherent measure of risk but Translation Invariance, where  $\Delta_\beta(\Pi+p_f)=\Delta_\beta(\Pi)$ . These verifications for the continuous case are left as exercises.

### 9.2 Implementation as Linear Constraints

Several of the new risk measures introduced in this chapter rely on an unspecified distribution for the loss variable, $r$. That is, while we might associate variance (or volatility) as a risk measure associated with and justified by assuming $r$ has a normal distribution via arguments based on CAPM, we have no theoretical underpinning to determine such a distribution for, say, $\beta$-CVaR generally. This is true even while we showed an equivalence of $\beta$-CVaR to volatility as a risk measure when the joint distribution of losses is assumed normal. While on the one hand this distribution relaxation allows greater flexibility, on the other, we have yet to specify how we might implement any of these new risk measures in practice.

In this section we outline procedures to include any of  $\beta$ -CVaR, average drawdowns, maximum drawdowns, and  $\beta$ -CDaR as linear constraints in identifying some optimal portfolio,  $\Pi^*$ . We do not include constraints in  $\beta$ -VaR or  $\beta$ -DaR, as these are outside the scope of the text.<sup>2</sup> Throughout, we will assume a baseline portfolio optimization problem of the type seen in (7.32); that is,

$$\begin{array}{ll}\min_{\Pi} & \Pi'\hat{r} \\ & A\Pi = b \\ & C\Pi \ge d.\end{array}\tag{9.21}$$

We will then identify auxiliary variables which we must add to the problem for each risk measure considered. Afterwards, we discuss methods to simulate specified distributions for  $r$ ; largely collecting previous results from the text as opposed to introducing any new concept in this regard.

Given this choice of arrangement, we begin by first assuming some sampling technique from the loss variable,  $r$ , is possible, and then discuss a method for such sampling.

#### 9.2.1 $\beta$ -CVaR Constraints

Assume that  $\{r_k\}_{k=1}^K$  is sampled according to the distribution of  $r$ , and  $\beta$  is fixed between 0 and 1. The discretization of the objective function used to determine  $\beta$ -CVaR given by (9.11),

$$G_\beta(\Pi,\alpha)=\alpha+\frac{1}{1-\beta}\frac{1}{K}\sum_{k=1}^K[\Pi'r_k-\alpha]_+,$$

may be rewritten using auxiliary variables as

$$\begin{array}{l}\alpha+\frac{1}{1-\beta}\frac{1}{K}\sum_{k=1}^K z_k \\ z_k \ge 0 \\ z_k \ge \Pi'r_k-\alpha.\end{array}\tag{9.22}$$

<sup>2</sup>This may be surprising given that we will be able to, for instance, constrain  $\beta$ -CVaR, and by doing so, obtain the  $\beta$ -VaR of the portfolio. However, the integration of  $\beta$ -VaR used to obtain  $\beta$ -CVaR is the key feature that distinguishes the two, as it leads to smoothness as well as Subadditivity.

Rockafellar and Uryasev [27] establish that when minimization of (9.21) is carried out in $\Pi\times\alpha\times z$, with $\alpha+\frac{1}{1-\beta}\frac{1}{K}\sum_{k=1}^Kz_k$ bounded above by some risk tolerance as an additional constraint, the result is that $\alpha^*$ will be the approximate $\beta$-VaR (approximate due to discretization), and the approximate $\beta$-CVaR is represented and bounded by the constraint as well.

Putting this all together, with $\omega$ denoting the risk tolerance, the $\beta$-CVaR constrained problem becomes

$$\begin{array}{rl} \min_{(\Pi,\alpha,z)} & \Pi'\hat{r} \\ & A\Pi = b \\ & C\Pi \ge d \\ & \alpha + \frac{1}{1-\beta}\frac{1}{K}\sum_{k=1}^K z_k \le \omega \\ & z \ge 0 \\ & z_k - \Pi'r_k + \alpha \ge 0. \end{array} \tag{9.23}$$

We leave it as an exercise to rewrite this problem in matrix form. The triplet  $(\Pi^*,\alpha^*,z^*)$  solving (9.23) yields the  $\beta$ -VaR for the portfolio,  $\Pi^*$ , in  $\alpha^*$ . Of course, the  $\beta$ -CVaR of the portfolio is  $\alpha^*+\frac{1}{1-\beta}\frac{1}{K}\sum_{k=1}^Kz_k^*$ , and this provides an upper bound for  $\beta$ -VaR, as noted previously. As a result, while we do not give a method of how to constrain  $\beta$ -VaR here, we may conservatively constrain a portfolio to not exceed some  $\beta$ -VaR limit by using that same limit in the above.
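To see how the auxiliary variables in (9.23) stand in for the $[\cdot]_+$ terms, the sketch below fixes a candidate $(\Pi,\alpha)$, builds the smallest $z$ allowed by the constraints $z_k\ge 0$ and $z_k-\Pi'r_k+\alpha\ge 0$, and checks the risk-tolerance row. It does not solve the optimization problem and ignores the rows $A\Pi=b$ and $C\Pi\ge d$. The samples, positions, and function names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def tightest_z(pi, samples, alpha):
    """Smallest z with z_k >= 0 and z_k >= pi'r_k - alpha, i.e. z_k = [pi'r_k - alpha]_+ ."""
    return [max(0.0, dot(pi, r_k) - alpha) for r_k in samples]

def cvar_lhs(alpha, z, beta):
    """Left-hand side of the risk-tolerance row: alpha + sum(z) / ((1 - beta) K)."""
    return alpha + sum(z) / ((1.0 - beta) * len(z))

def satisfies_cvar_rows(pi, samples, alpha, z, beta, omega, tol=1e-9):
    """Check the three CVaR-related rows of (9.23) for a candidate (pi, alpha, z)."""
    rows_ok = all(
        z_k >= -tol and z_k - dot(pi, r_k) + alpha >= -tol
        for z_k, r_k in zip(z, samples)
    )
    return rows_ok and cvar_lhs(alpha, z, beta) <= omega + tol

# Hypothetical loss samples for two assets and a candidate portfolio.
samples = [[1.0, 0.0], [0.0, 1.0], [2.0, 1.0], [-1.0, 0.0]]
pi = [1.0, 2.0]
alpha, beta = 2.0, 0.5

z = tightest_z(pi, samples, alpha)
print(z, cvar_lhs(alpha, z, beta))
print(satisfies_cvar_rows(pi, samples, alpha, z, beta, omega=3.0))
```

Any larger $z$ also satisfies the two sign rows but raises the left-hand side of the tolerance row, which is why only the tightest $z$ matters for feasibility.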

We note that (9.23) is convex in  $(\Pi,\alpha)$ ; a result which may be proved directly by resorting to the KKT conditions of the problem. Consequently, we may also, for instance, create an analogous mean- $\beta$ -CVaR curve as previously considered in mean-variance optimization. Finally, and as seen previously, an alternate formulation of (9.23) with  $\beta$ -CVaR objective should be apparent.

#### 9.2.2 Drawdown Constraints

We next assume that  $\{r_t\}_{t=1}^T$  is sampled according to the distribution of  $r$ , noting that some serial relationships may be introduced; i.e., we do not assume necessarily that these observations are iid, and the ordering in  $t$  cannot be disregarded.

The maximum drawdown for this sample, for fixed  $\Pi$ , is given by

$$\begin{aligned} M(\Pi) &= \max_{1\le k\le T} D(\Pi,k) \\ &= \max_{1\le k\le T} \left( \max_{1\le j\le k} (P(\Pi,j)) - P(\Pi,k) \right) \\ &= \max_{1\le k\le T} \left( \max_{1\le j\le k} \left( -\Pi'\sum_{t=1}^j r_t \right) + \Pi'\sum_{t=1}^k r_t \right) \\ &= \max_{1\le k\le T} \left( \max_{1\le j\le k} (-\Pi'R_j) + \Pi'R_k \right) \end{aligned}$$

where we have introduced the variable $R_j = \sum_{t=1}^j r_t$ as a cumulative uncompounded loss.

As in the previous subsection, we may rewrite  $M(\Pi)$  using auxiliary variables, this time as

$$\begin{aligned}z_k &\ge -\Pi' R_k \\z_k &\ge z_{k-1}\end{aligned}\tag{9.24}$$

for  $k = 1, \dots, T$ , and  $z_0 = 0$ .

The addition of a maximum drawdown constraint to (9.21) becomes, then,

$$\begin{array}{rl}\min_{(\Pi, z)} & \Pi' \hat{r} \\ & A\Pi = b \\ & C\Pi \ge d \\ & z_k + \Pi' R_k \le \omega \\ & z_k + \Pi' R_k \ge 0 \\ & z_k - z_{k-1} \ge 0,\end{array}\tag{9.25}$$

where, again,  $z_0 = 0$ , and  $k$  ranges from 1 to  $T$ . Notice that  $z_k + \Pi' R_k$  stands in for  $D(\Pi, k)$ . In (9.25), each  $D(\Pi, k)$  is constrained by some upper bound, ensuring the maximum drawdown is likewise bounded. The formulation of average drawdown constraints is similarly handled and the exercise is left to the reader.
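The role of the auxiliary variables in (9.24) and (9.25) can be illustrated for a fixed portfolio: the smallest $z_k$ permitted by the constraints is the running maximum of $-\Pi'R_j$ (started from $z_0=0$), and $z_k+\Pi'R_k$ then reproduces the drawdown at step $k$. The sketch below uses a hypothetical single-asset path and checks the drawdown rows against a tolerance `omega`; it is a feasibility check for a given $\Pi$, not a solver.

```python
def cumulative_losses(pi, path):
    """Pi'R_k for k = 1..T, where R_k is the cumulative loss vector through period k."""
    out, total = [], 0.0
    for r_t in path:
        total += sum(p * r for p, r in zip(pi, r_t))
        out.append(total)
    return out

def tightest_z(cum_losses):
    """Smallest z satisfying z_k >= -Pi'R_k and z_k >= z_{k-1}, with z_0 = 0."""
    z, prev = [], 0.0
    for c in cum_losses:
        prev = max(prev, -c)
        z.append(prev)
    return z

# Hypothetical path of losses for a single asset, and a unit position.
path = [[-2.0], [1.0], [3.0], [-1.0], [2.0]]
pi = [1.0]

cum = cumulative_losses(pi, path)
z = tightest_z(cum)
dd = [z_k + c for z_k, c in zip(z, cum)]   # stands in for D(Pi, k)

omega = 5.0
feasible = all(0.0 <= d <= omega for d in dd)
print(z, dd, feasible)
```

Because $z_0=0$, the running peak here includes the starting value of zero; on this path the first period is a gain, so the peak is set in period one either way.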

These same auxiliary variables may be used to formulate a $\beta$-CDaR constraint. The approach should be recognizable from the $\beta$-CVaR case. Here, we leverage (9.20),

$$D_\beta(\Pi, \delta) = \delta + \frac{1}{1-\beta} \frac{1}{T} \sum_{k=1}^T \left[ \max_{1 \le j \le k} (-\Pi' R_j) + \Pi' R_k - \delta \right]_+$$

which now becomes

$$\begin{array}{l}\delta + \frac{1}{1-\beta} \frac{1}{T} \sum_{k=1}^T u_k \\ u_k \ge 0 \\ u_k \ge z_k + \Pi' R_k - \delta \\ z_k \ge -\Pi' R_k \\ z_k \ge z_{k-1},\end{array}\tag{9.26}$$

with, again, $z_0 = 0$, and for $k = 1, \dots, T$. A $\beta$-CDaR constrained version of (9.21) may now be given as

$$\begin{array}{rl} \min_{(\Pi,\delta,u,z)} & \Pi'\hat{r} \\ & A\Pi = b \\ & C\Pi \ge d \\ & \delta + \frac{1}{1-\beta}\frac{1}{T}\sum_{k=1}^T u_k \le \omega \\ & u_k \ge 0 \\ & u_k - z_k - \Pi'R_k + \delta \ge 0 \\ & z_k + \Pi'R_k \ge 0 \\ & z_k - z_{k-1} \ge 0. \end{array}\tag{9.27}$$

Chekhlov, Uryasev, and Zabarankin [7] establish, just as with $\beta$-CVaR, that the optimization with $\beta$-CDaR constraints results in $\delta^*$ being the $\beta$-DaR and $D_\beta(\Pi^*,\delta^*)$ the $\beta$-CDaR of the portfolio, $\Pi^*$.

In both examples presented here, we have considered a single path  $\{r_t\}_{t=1}^T$ . While it is perhaps reasonable to use, say, a historical set of losses for such a path, generally speaking this is myopic with respect to the distribution of losses; viz., the maximum drawdown of one sample path may not be indicative of an expected maximum drawdown. The above work may be easily modified to account for multiple sample paths, and we leave this as an exercise for the reader. In addition to this being an exercise, we suggest that in practice this method is preferred; i.e., defining a random process for  $r_t$  from which many sample paths may be used and incorporated in, for example, (9.27).

#### 9.2.3 Sampling from $r$

One of the standard tools for the sampling problem required for  $\beta$ -CVaR is the theory of copulas, first seen in (3.16). Recall that the copula framework allows for the flexibility of specifying marginal distributions as well as joint distributions.

Based on these results, we may construct joint distributions,  $F(\cdot)$  and marginals,  $\{F_i(\cdot)\}$  based on *a priori* views. For example, in equity returns, we have frequently emphasized a preference for using a Student  $t$  distribution with five degrees of freedom for both marginals as well as the joint distribution of returns as this allows for excess kurtosis. In the present case, we note that such a choice reduces the  $\beta$ -CVaR optimization constraint to a variance constraint as the distribution is elliptical. Even so, two reasons may justify the slight complexity added over using a normal distribution.

First, if a portfolio is managed to a  $\beta$ -CVaR level, we have already seen ample evidence that normal distributions are inadequate in capturing tail behavior and will thus underestimate tail risk. So, while percentiles of the loss distribution map identically to portfolio variance in both the Student  $t$  and normally distributed cases, the former gives a better measure of expected tail losses.

Second, it is not uncommon to consider portfolios with both equities and securities based on these equities; viz., equity options. In this case, changes in option security prices are nonlinear in changes in the respective underlying equity prices. As such, an elliptical distribution of returns for equities does not translate to an elliptical distribution for the portfolio loss random variable. This highlights a key differentiator between a  $\beta$ -CVaR optimization procedure and a mean-variance one. The next example considers this with some more specificity.

**Example 9.2.1.** Suppose that a portfolio is to be constructed with positions in some fixed set of equities,  $p_E$  and options written on those equities,  $p_O$ . For our present purposes, it is sufficient to note that it is common to estimate changes in option prices via Taylor expansions in the underlying stock price; viz.,

$$\Delta v_{O,j}(\Delta v_{E,j})\approx\frac{\partial v_{O,j}}{\partial v_{E,j}}\Delta v_{E,j}+\frac{1}{2}\frac{\partial^2v_{O,j}}{\partial v_{E,j}^2}\Delta v_{E,j}^2,\tag{9.28}$$

where  $v_{E,j}$  and  $v_{O,j}$  denote the current price of the  $j^{\text{th}}$  equity and option, respectively.<sup>3</sup>

If equity returns,  $\zeta$  are sampled from  $St_{\mu,\Sigma;\nu}(\cdot)$ , with  $\nu=5$ , and  $\mu$  and  $\Sigma$  matching the sample moments of the historical returns, then for sample  $\zeta_i$ , the associated loss variable for the  $j^{\text{th}}$  equity is exactly

$$r_{i,E,j}=-\Delta v_{i,E,j}=-\zeta_{i,j}\cdot v_{E,j}.$$

The loss variable for the associated option in this same sample is then

$$r_{i,O,j}=-\Delta v_{i,O,j}=\frac{\partial v_{O,j}}{\partial v_{E,j}}r_{i,E,j}-\frac{1}{2}\frac{\partial^2v_{O,j}}{\partial v_{E,j}^2}r_{i,E,j}^2.$$

A full accounting of loss under the simulation driven by  $\zeta_i$  may be achieved in the single vector,

$$r_i=\left(\begin{array}{ c } r_{i,E} \\ r_{i,O} \end{array}\right)$$

From here, it should be clear how to construct a  $\beta$ -CVaR constrained portfolio. Notice that significant information would be lost if one were to focus solely on variance as a risk proxy, as the distribution of portfolio losses is not elliptical.
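The simulation in Example 9.2.1 may be sketched as follows. This is an illustration under stated assumptions rather than a prescribed implementation: the Student $t$ draws use the standard normal-over-chi-square construction, $\Sigma$ is treated as the target covariance (so the scale matrix is $\Sigma(\nu-2)/\nu$, which requires $\nu > 2$), and the prices, deltas, and gammas are made-up inputs. The function names are hypothetical.

```python
import numpy as np

def sample_student_t(mu, cov, nu, n, rng):
    """Draw n samples from a multivariate Student t with mean mu.

    Assumption: cov is the desired covariance of the samples, so the
    scale matrix is cov * (nu - 2) / nu (valid only for nu > 2).
    """
    mu = np.asarray(mu, dtype=float)
    scale = np.asarray(cov, dtype=float) * (nu - 2.0) / nu
    z = rng.multivariate_normal(np.zeros(mu.size), scale, size=n)
    w = rng.chisquare(nu, size=n) / nu
    return mu + z / np.sqrt(w)[:, None]

def delta_gamma_losses(zeta, price, delta, gamma):
    """Stack equity and option losses, one row per simulation.

    zeta:  (n, J) sampled equity returns
    price: (J,) current equity prices v_E
    delta: (J,) first derivatives of option price in the stock price
    gamma: (J,) second derivatives of option price in the stock price
    """
    r_equity = -zeta * price                                      # r_E = -zeta * v_E
    r_option = delta * r_equity - 0.5 * gamma * r_equity ** 2     # from (9.28)
    return np.hstack([r_equity, r_option])

rng = np.random.default_rng(0)
mu = np.array([0.001, 0.002])                    # illustrative sample mean
cov = np.array([[4e-4, 1e-4], [1e-4, 9e-4]])     # illustrative sample covariance
zeta = sample_student_t(mu, cov, nu=5, n=5000, rng=rng)
losses = delta_gamma_losses(zeta, price=np.array([100.0, 50.0]),
                            delta=np.array([0.5, 0.6]),
                            gamma=np.array([0.02, 0.03]))
print(losses.shape)
```

Each row of `losses` is one sample $r_i$, so the array can be passed directly as the scenario matrix in a $\beta$-CVaR constrained problem. The quadratic term makes the option columns skewed even though the equity columns are elliptical.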

We note that we have not covered the needed theory to introduce a serial relationship in  $r_t$ . That is, all of the above assumes iid samples from the distribution of portfolio losses. To more carefully treat the drawdown constraints considered in this chapter, it is necessary to introduce some time series dynamics into the processes involved.

<sup>3</sup>The interested reader may look at the Black-Scholes formula which gives a closed form solution for so-called European call and put options, which give the buyer the right (but not the obligation) to buy or sell a stock, respectively, for a specified price on a specified future date.

### Exercises

1. Show that the expectation operator is a coherent measure of risk. Discuss why this might not be an appropriate risk measure, and note that as a result the conditions for a coherent measure of risk may only be construed as necessary, but not sufficient, for risk management.
2. Prove that variance is not Subadditive.
3. Prove that  $\rho_\gamma(\cdot)$ , as defined by (9.2), satisfies Positive Homogeneity, Translation Invariance, and Subadditivity.
4. Prove that if  $r$  is elliptically distributed (not just normally distributed), then  $\rho_\gamma(\cdot)$  is a coherent measure of risk and coincides with Value-at-Risk for some  $\beta$ . Expressly determine this  $\beta$ .
   1. Suppose that  $r = B \cdot f + \epsilon$  for some vector of stochastic factors,  $f \in \mathbb{R}^M$  and idiosyncratic component,  $\epsilon \in \mathbb{R}^N$ . If both  $f$  and  $\epsilon$  are elliptically distributed, prove that  $\rho_\gamma$  is a coherent measure of risk when profit and loss are given by  $r$ .
5. Show that a convex combination of coherent measures of risk is itself a coherent measure of risk.
6. Suppose  $\rho(\cdot)$  satisfies all conditions of a coherent measure of risk but Translation Invariance, but that  $\rho(\Pi + p_f) = \rho(\Pi)$ . Prove that  $\tilde{\rho}(\Pi) = \mathbb{E}(\Pi) + \gamma\rho(\Pi)$  is a coherent measure of risk for any  $\gamma > 0$ .
7. Finish the verification that  $\beta$ -VaR satisfies Translation Invariance.
8. Prove that  $\beta$ -CVaR is a coherent measure of risk.
9. Show that  $G_\beta$  as given in (9.11) is convex in  $\alpha$  for fixed  $\Pi$ .
10. Prove that both maximum drawdown and average drawdown as defined by (9.14) and (9.15), respectively, satisfy Positive Homogeneity and Monotonicity. Give examples of maximum drawdown failing Translation Invariance and Subadditivity.
11. Prove that in the continuous case,  $\beta$ -CDaR satisfies all conditions of being a coherent measure of risk but Translation Invariance, where  $\Delta_\beta(\Pi + p_f) = \Delta_\beta(\Pi)$ . You may assume  $\delta_\beta(\Pi)$  satisfies the Monotonicity condition and that (9.18) holds. How might you modify a risk measure based on  $\beta$ -CDaR to make it a coherent measure of risk?
12. Rewrite (9.23) in matrix form.
    1. How would you modify the problem to incorporate an initial portfolio,  $\Pi_0$ , and turnover constraints?

13. How might you tell if the  $\beta$ -CVaR constraint in (9.23) is binding? If it is not binding, how might this change the interpretation of  $\alpha^*$ ?
14. Write the average drawdown analogue of (9.25).
15. What are the limiting cases for  $\beta$ -CDaR as  $\beta \downarrow 0$  and  $\beta \uparrow 1$ ?
16. Focusing on the maximum drawdown constrained problem (9.25), suppose  $M$  sample paths are drawn rather than just one. Write an optimization problem constraining the average maximum drawdown over these  $M$  sample paths.
17. Following the methodology in Chapter 2, select the fifty largest stocks in the cross-sectional and historical return data on the last date available. Using the full 121 weeks of returns available, simulate 5,000 samples from a Student  $t$  distribution using the sample mean and covariance and five degrees of freedom as parameters. Carry out a  $\beta$ -CVaR minimization with  $\beta = 0.95$ . Include a no short sale constraint, maximum position of 5% constraint, and gross notional of 1. Bound the  $\beta$ -CVaR by the  $\beta$ -VaR from the sample assuming an evenly weighted portfolio. For the optimal portfolio:
    1. What are the  $\beta$ -VaR and  $\beta$ -CVaR values from the optimizer?
    2. How do these compare to the  $\beta$ -VaR and  $\beta$ -CVaR values from the sample?
    3. How does the  $\beta$ -VaR compare to the analytic approximation given by using  $\rho_\gamma(\cdot)$ ?

![](b5659b5bfe809ed3258999dc668f535a_img.jpg)