#Position Concentration Risk

By Maxwell Margenot and Delaney Granizo-Mackenzie.

Part of the Quantopian Lecture Series:

www.quantopian.com/lectures
github.com/quantopian/research_public

Notebook released under the Creative Commons Attribution 4.0 License.

When trading, it is important to diversify your risks. If you concentrate your positions in only a few assets, you are disproportionately exposed to the risks of those assets. This notebook is designed to show how diversifying your portfolio can result in a lower overall risk profile.

In [None]:
import pandas as pd
import numpy as np

import matplotlib.pyplot as plt

##Intuition

Let's say you learned to count cards at Blackjack. Although most casinos will throw you out if you are caught, card counting gives you a [1% edge over the house](https://en.wikipedia.org/wiki/Card_counting). If you walked into the casino with \$10,000 to bet, it would clearly be insane to place all the money on one game. Although you have a 51% chance of winning that game, the house still has a 49% chance. The expected outcome is for you to win the game, but the variance is incredibly high.

Let's say you placed your money on 100 different tables. This is known as making independent bets, because the outcome of one table doesn't affect any of the others. Your variance will be reduced as you make more and more bets. You would still expect to win 51% of the tables, but the chance of losing money is greatly reduced. Let's see this in action.

###Simulating Blackjack Games

Each game will be won with a 51% probability. We can simulate this using a binomial distribution, which is parameterized with the number of trials we perform (games), and the chance of each trial succeeding.

First we'll simulate 1000 different universes in which you walk into the casino and play one game.

In [None]:
universes = 1000

results = np.zeros((universes, 1))
for i in range(universes):
    results[i] = np.random.binomial(n = 1, p=0.51)

Now let's check the mean and standard deviation of the results. We see that because there are so many 0s and so many 1s, and nothing in between, the standard deviation is very high. This is saying that you should expect to win half a game, with the potential outcomes being approximately evenly distributed between a loss and a win. Because you played so few games, you have given no time for your edge to work.

In [None]:
np.mean(results), np.std(results)

Now let's simulate 1000 universes in which you walk into the casino and play 100 games.

In [None]:
universes = 1000

results = np.zeros((universes, 1))
for i in range(universes):
    results[i] = np.random.binomial(n = 100, p=0.51)

np.mean(results), np.std(results)

Now we see that the average result is close to 51 games won, and the standard deviation is small relative to the number of games played. You're likely still not safe, however: your expected edge is only one game above break-even, whereas the standard deviation is about five games. This means there is still a substantial chance that you lose more games than you win. Finally let's try 10,000 games.

In [None]:
universes = 1000

results = np.zeros((universes, 1))
for i in range(universes):
    results[i] = np.random.binomial(n = 10000, p=0.51)

np.mean(results), np.std(results)

In this case we're much safer, as the expected edge is 100 games, roughly twice the standard deviation of about 50 games.
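The simulated means and standard deviations can be cross-checked against the closed-form binomial moments, $np$ for the mean and $\sqrt{np(1-p)}$ for the standard deviation. The sketch below is a minimal check that assumes the same fixed 51% win probability used in the simulations; the helper name `binomial_summary` is ours for illustration and is not part of the lecture.

```python
import numpy as np

def binomial_summary(n, p=0.51):
    """Closed-form mean, std, and edge-to-std ratio for n independent games."""
    mean_wins = n * p
    std_wins = np.sqrt(n * p * (1 - p))
    edge = mean_wins - n / 2.0  # expected wins above break-even
    return mean_wins, std_wins, edge / std_wins

# Same game counts as the three simulations above
for n in (1, 100, 10000):
    mean_wins, std_wins, ratio = binomial_summary(n)
    print(n, round(mean_wins, 2), round(std_wins, 2), round(ratio, 2))
```

The edge grows in proportion to $n$ while the standard deviation grows only like $\sqrt{n}$, so the ratio of edge to standard deviation improves as more independent games are played.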

NOTE: There is a subtlety that it's not always valid to use a standard deviation, as the underlying distribution of data in this case is not normal. We use it here because standard deviation is the metric of volatility used in finance, and it still reflects how much 'spread' exists in the data. Be careful not to abuse standard deviation in practice by assuming the underlying data is normal.

##Expanding to Portfolio Theory

The same exact principle exists in portfolio theory. If you think you have an edge over the market in picking stocks that will go up or down, you should try to make as many independent bets as possible. This can be accomplished by investing in as many uncorrelated assets as possible. Let's take a look at an example.

Remember that in finance, volatility is measured by the standard deviation of a time series, and the amount of future risk of a portfolio is estimated by past portfolio volatility.

####Case 1: Investing in Few Assets

Let's simulate some assets by sampling from a normal distribution.

NOTE: In practice, real financial asset returns are rarely normally distributed, so this is not a great assumption. However, it's okay here to get our point across because we are just concerned with correlation and level of volatility.

In [None]:
R_1 = np.random.normal(1.01, 0.03, 100)
A_1 = np.cumprod(R_1)
P = A_1
plt.plot(P)
plt.xlabel('Time')
plt.ylabel('Price');

In this case, we're totally exposed to the volatility of that asset, as our portfolio is entirely that asset.

####Case 2: Investing in Many Correlated Assets

In this case we expand our asset pool, but there is still a large amount of pairwise correlation between the returns. We simulate this by simulating assets 2 through N as asset 1 plus some noise.

In [None]:
N = 10

returns = np.zeros((N, 100))
assets = np.zeros((N, 100))

R_1 = np.random.normal(1.01, 0.03, 100)
returns[0] = R_1
assets[0] = np.cumprod(R_1)
plt.plot(assets[0], alpha=0.1)

for i in range(1, N):
    R_i = R_1 + np.random.normal(0.001, 0.01, 100)
    returns[i] = R_i
    assets[i] = np.cumprod(R_i)
    
    plt.plot(assets[i], alpha=0.1)

R_P = np.mean(returns, axis=0)
P = np.mean(assets, axis=0)
plt.plot(P)
plt.xlabel('Time')
plt.ylabel('Price');

print('Asset Volatilities')
print([np.std(R) for R in returns])
print('Mean Asset Volatility')
print(np.mean([np.std(R) for R in returns]))
print('Portfolio Volatility')
print(np.std(R_P))

Here you can see the portfolio accompanied by all the assets, with the assets drawn much more faintly. The important thing to note is that the portfolio undergoes all the same shocks as the assets, because when one asset is up or down, all the others are likely to be so as well. This is the problem with correlated assets. Let's take a look at the volatility of the assets and the volatility of the portfolio.

The mean volatility of our assets is roughly the same as the portfolio volatility. We have gained almost nothing by making more bets. You can think of highly correlated bets as nearly identical to the original bet. If the outcome of the second bet is correlated with the first, then really you've just made much the same bet twice and you have barely reduced your volatility.
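To make "a large amount of pairwise correlation" concrete, the correlation between the simulated return streams can be measured directly with `np.corrcoef`. This is a minimal sketch that regenerates returns with the same parameters as Case 2; the seeded generator `rng` is an assumption added here for reproducibility and is not used elsewhere in the notebook.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator (assumption, for reproducibility)

N, T = 10, 100
R_1 = rng.normal(1.01, 0.03, T)
returns = np.zeros((N, T))
returns[0] = R_1
for i in range(1, N):
    # Assets 2 through N are asset 1 plus independent noise
    returns[i] = R_1 + rng.normal(0.001, 0.01, T)

corr = np.corrcoef(returns)               # N x N correlation matrix
off_diag = corr[~np.eye(N, dtype=bool)]   # drop the 1s on the diagonal
mean_pairwise_corr = off_diag.mean()
print(mean_pairwise_corr)
```

With these parameters the population correlation between any two of the noisy copies is $0.03^2 / (0.03^2 + 0.01^2) = 0.9$, so the sample value should land close to that.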

####Case 3: Investing in Many Uncorrelated Assets

In this case we independently generate a bunch of assets and construct a portfolio that combines all of them.

In [None]:
N = 10

assets = np.zeros((N, 100))
returns = np.zeros((N, 100))

for i in range(N):
    R_i = np.random.normal(1.01, 0.03, 100)
    returns[i] = R_i
    assets[i] = np.cumprod(R_i)
    
    plt.plot(assets[i], alpha=0.1)

R_P = np.mean(returns, axis=0)
P = np.mean(assets, axis=0)
plt.plot(P)
plt.xlabel('Time')
plt.ylabel('Price');

print('Asset Volatilities')
print([np.std(R) for R in returns])
print('Mean Asset Volatility')
print(np.mean([np.std(R) for R in returns]))
print('Portfolio Volatility')
print(np.std(R_P))

Now we see the benefits of diversification. Holding more uncorrelated assets smooths out our portfolio. When one is down, the others are no more likely to be down, so the bumps both upwards and downwards are often much smaller. The more assets we hold, the more we'll reduce our volatility as well. Let's check that.

In [None]:
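max_size = 100
portfolio_sizes = np.arange(1, max_size + 1)
portfolio_volatilities_by_size = np.zeros(max_size)

for N in portfolio_sizes:

    # Simulate N independent return streams of 100 periods each
    returns = np.zeros((N, 100))

    for i in range(N):
        R_i = np.random.normal(1.01, 0.03, 100)
        returns[i] = R_i

    # Equal-weighted portfolio return in each period
    R_P = np.mean(returns, axis=0)

    # Store at index N - 1 so that no slot is left at its initial zero
    portfolio_volatilities_by_size[N - 1] = np.std(R_P)

plt.plot(portfolio_sizes, portfolio_volatilities_by_size)
plt.xlabel('Uncorrelated Portfolio Size')
plt.ylabel('Uncorrelated Portfolio Volatility');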
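For $N$ uncorrelated assets that share the same volatility $\sigma$ and are held in equal weights, the portfolio volatility is $\sigma / \sqrt{N}$. The sketch below compares simulated volatilities against that curve. It is illustrative only: the helper `simulate_portfolio_vol` is a hypothetical name, the generator is seeded for reproducibility, and it uses 1000 periods rather than the notebook's 100 to reduce sampling noise.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator (assumption, for reproducibility)
sigma = 0.03                    # per-asset volatility, as in the cells above
T = 1000                        # more periods than the notebook, to reduce noise

def simulate_portfolio_vol(N, T=T):
    """Volatility of an equal-weighted portfolio of N independent assets."""
    returns = rng.normal(1.01, sigma, size=(N, T))
    return np.std(returns.mean(axis=0))

for N in (1, 4, 25, 100):
    simulated = simulate_portfolio_vol(N)
    theoretical = sigma / np.sqrt(N)
    print(N, round(simulated, 4), round(theoretical, 4))
```

The two columns should track each other closely, which shows that the benefit of each extra asset shrinks as the portfolio grows: going from 1 to 4 assets halves volatility, but halving it again takes 16.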

##Final Point

Be invested in as many uncorrelated assets as possible. In finance this is known as diversification. If you have a pricing model, price everything and invest accordingly. This concept is explained in the Long-Short Equity Lecture.

###Capital Constraints

Because of transaction costs, you need to have certain minimum amounts of capital to invest in large numbers of assets. Therefore sometimes you are unable to invest in hundreds or thousands of assets. In this case you should still try to maximize your portfolio size, keeping in mind that even with a portfolio of size 20, you can still find 20 relatively uncorrelated assets, and that's better than nothing.

##Now Let's Explain with Math Rather Than Pictures

One of the key aspects of modern portfolio theory is that by combining multiple assets into a portfolio, you can reduce the entire package's overall risk. Since we represent the volatility of an asset by its standard deviation, we can easily show this mathematically.

Say that we have two assets in a portfolio, $S_1$ and $S_2$, with weights $\omega_1$ and $\omega_2$ such that $\omega_1 + \omega_2 = 1$. Call the portfolio $P$ and say that $S_1$ and $S_2$ have mean and standard deviation $\mu_1, \sigma_1$ and $\mu_2, \sigma_2$ respectively. We can calculate the value of $P$ easily.

$$ P = \omega_1 S_1 + \omega_2 S_2 $$

Now we set $\mu_P$ as the return of the portfolio $P$. It is simple to calculate the expected return of this portfolio:

$$ E[\mu_P] = E[\omega_1 \mu_1 + \omega_2 \mu_2] = \omega_1 E[\mu_1] + \omega_2 E[\mu_2] $$

As you can see, the expected return of the overall portfolio can be directly determined using the expected returns of the assets *in* the portfolio as well as their associated weights. Similarly, we can use these same characteristics to determine the overall risk of the portfolio, $\sigma_p$. First, we calculate the variance of the portfolio, $\sigma_p^2 = VAR[P]$. Then we say that the correlation between $S_1$ and $S_2$ is $COR[S_1,S_2] = \frac{COV[S_1,S_2]}{\sigma_1\sigma_2} = \rho_{12}$. The calculations then follow:

\begin{eqnarray}
\sigma_p^2 &=& VAR[P] \\
    &=& VAR[\omega_1 S_1 + \omega_2 S_2] \\
    &=& VAR[\omega_1 S_1] + VAR[\omega_2 S_2] + 2COV[\omega_1 S_1,\omega_2 S_2] \\
    &=& \omega_1^2 VAR[S_1] + \omega_2^2 VAR[S_2] + 2\omega_1\omega_2 COV[S_1,S_2] \\
    &=& \omega_1^2 \sigma_1^2 + \omega_2^2 \sigma_2^2 + 2\rho_{12}\omega_1\omega_2\sigma_1\sigma_2
\end{eqnarray}
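The two-asset identity $\sigma_p^2 = \omega_1^2 \sigma_1^2 + \omega_2^2 \sigma_2^2 + 2\rho_{12}\omega_1\omega_2\sigma_1\sigma_2$ can be checked numerically. The sketch below does so on simulated returns; the weights, seed, and distribution parameters are illustrative assumptions and are not taken from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator (assumption, for reproducibility)

# Two correlated return series: S_2 is partly driven by S_1
S_1 = rng.normal(0.01, 0.03, 1000)
S_2 = 0.5 * S_1 + rng.normal(0.01, 0.02, 1000)
w_1, w_2 = 0.6, 0.4  # illustrative weights summing to 1

sigma_1, sigma_2 = np.std(S_1), np.std(S_2)
rho_12 = np.corrcoef(S_1, S_2)[0, 1]

# Variance computed directly from the portfolio series
direct = np.var(w_1 * S_1 + w_2 * S_2)

# Variance computed from the two-asset formula
formula = (w_1**2 * sigma_1**2 + w_2**2 * sigma_2**2
           + 2 * rho_12 * w_1 * w_2 * sigma_1 * sigma_2)

print(direct, formula)
```

The two values agree up to floating-point error. Because $\rho_{12} \leq 1$, the portfolio volatility with non-negative weights can never exceed the weighted average of the individual volatilities, and it falls further below that average as the correlation decreases.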