# Course 1: Introduction to Portfolio Construction and Analysis with Python
## Module 2: An Introduction to Portfolio Optimization

**Question 1**

Use the EDHEC Hedge Fund Indices data set that we used in the lab assignment as well as in the previous week’s assignments. Load them into Python and perform the following analysis based on data since 2000 (including all of 2000): What was the Monthly Parametric Gaussian VaR at the 1% level (as a +ve number) of the Distressed Securities strategy?

Enter the positive number as a percent, e.g. for 5.32% enter 5.32.

In [2]:
import pandas as pd
import numpy as np
hfi = pd.read_csv("data/edhec-hedgefundindices.csv",header=0, index_col=0, parse_dates=True)
hfi = hfi/100
hfi.index = hfi.index.to_period('M')
hfi = hfi["2000":]
hfi

date,Convertible Arbitrage,CTA Global,Distressed Securities,Emerging Markets,Equity Market Neutral,Event Driven,Fixed Income Arbitrage,Global Macro,Long/Short Equity,Merger Arbitrage,Relative Value,Short Selling,Funds Of Funds
2000-01,0.0227,0.0128,0.0088,0.0077,0.0075,0.0088,0.0041,0.0021,0.0075,0.0143,0.0173,0.0427,0.0169
2000-02,0.0267,-0.0022,0.0421,0.0528,0.0253,0.0346,0.0097,0.0408,0.0699,0.0239,0.0185,-0.1340,0.0666
2000-03,0.0243,-0.0138,0.0103,0.0318,0.0134,0.0069,-0.0061,-0.0104,0.0006,0.0131,0.0163,-0.0230,0.0039
2000-04,0.0223,-0.0241,-0.0101,-0.0541,0.0168,-0.0059,-0.0006,-0.0304,-0.0201,0.0188,0.0092,0.1028,-0.0269
2000-05,0.0149,0.0114,-0.0132,-0.0433,0.0062,-0.0034,0.0107,-0.0070,-0.0097,0.0146,0.0080,0.0704,-0.0122
...,...,...,...,...,...,...,...,...,...,...,...,...,...
2018-07,0.0021,-0.0058,0.0093,0.0040,-0.0010,0.0055,0.0022,-0.0014,0.0067,-0.0021,0.0045,-0.0052,0.0018
2018-08,0.0024,0.0166,0.0002,-0.0277,0.0004,0.0011,0.0017,-0.0007,0.0035,0.0050,-0.0002,-0.0214,0.0015
2018-09,0.0034,-0.0054,0.0050,-0.0110,-0.0016,0.0032,0.0036,0.0006,-0.0023,0.0028,0.0018,0.0036,-0.0022
2018-10,-0.0073,-0.0314,-0.0158,-0.0315,-0.0129,-0.0257,-0.0023,-0.0096,-0.0402,-0.0080,-0.0109,0.0237,-0.0269


In [3]:
from scipy.stats import norm

def var_gaussian(r, level=5, modified=False):
    """
    Returns the Parametric Gaussian VaR of a Series or DataFrame
    If "modified" is True, then the modified VaR is returned,
    using the Cornish-Fisher modification
    """
    # compute the Z score assuming it was Gaussian
    z = norm.ppf(level/100)
    if modified:
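        # modify the Z score based on observed skewness and kurtosis
        # (skewness() and kurtosis() are defined in a later cell, which must
        # run before var_gaussian is called with modified=True)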
        s = skewness(r)
        k = kurtosis(r)
        z = (z +
                (z**2 - 1)*s/6 +
                (z**3 -3*z)*(k-3)/24 -
                (2*z**3 - 5*z)*(s**2)/36
            )
    return -(r.mean() + z*r.std(ddof=0))

(var_gaussian(hfi, level = 1)["Distressed Securities"]*100).round(2)

3.14
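
The calculation above reduces to one formula: the VaR is the negated return at the chosen Gaussian quantile, `-(mean + z * std)`. A minimal sketch with made-up inputs (a 1% mean monthly return and a 2% monthly volatility, both hypothetical) is shown below. It uses the standard library's `statistics.NormalDist().inv_cdf` in place of `scipy.stats.norm.ppf`, which gives the same quantile, so the snippet runs without SciPy.

```python
from statistics import NormalDist

def gaussian_var_sketch(mu, sigma, level=5):
    """Parametric Gaussian VaR as a positive number (hypothetical helper)."""
    z = NormalDist().inv_cdf(level / 100)  # about -2.326 at the 1% level
    return -(mu + z * sigma)

# Hypothetical inputs, not taken from the EDHEC data
example_var = gaussian_var_sketch(mu=0.01, sigma=0.02, level=1)
print(round(example_var * 100, 2))  # 3.65
```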

**Question 2** 

Use the same data set as in the previous question. What was the 1% VaR for the same strategy after applying the Cornish-Fisher Adjustment?

In [4]:
def skewness(r):
    """
    Alternative to scipy.stats.skew()
    Computes the skewness of the supplied Series or DataFrame
    Returns a float or a Series
    """
    demeaned_r = r - r.mean()
    # use the population standard deviation, so set ddof=0
    sigma_r = r.std(ddof=0)
    exp = (demeaned_r**3).mean()
    return exp/sigma_r**3

def kurtosis(r):
    """
    Alternative to scipy.stats.kurtosis()
    Computes the kurtosis of the supplied Series or DataFrame
    Returns a float or a Series
    """
    demeaned_r = r - r.mean()
    # use the population standard deviation, so set ddof=0
    sigma_r = r.std(ddof=0)
    exp = (demeaned_r**4).mean()
    return exp/sigma_r**4

(var_gaussian(hfi, level = 1, modified = True)["Distressed Securities"]*100).round(2)

4.97
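
Written out, the adjustment coded in `var_gaussian` replaces the Gaussian quantile $z$ with a Cornish-Fisher quantile built from the sample skewness $S$ and kurtosis $K$. The symbols $S$, $K$, $\mu$ and $\sigma$ are labels chosen here for illustration; the terms mirror the code line by line:

$$
z_{CF} = z + \frac{z^2 - 1}{6}\,S + \frac{z^3 - 3z}{24}\,(K - 3) - \frac{2z^3 - 5z}{36}\,S^2,
\qquad
\mathrm{VaR}_{CF} = -\left(\mu + z_{CF}\,\sigma\right)
$$

For a sample with $S = 0$ and $K = 3$ the correction terms vanish, so $z_{CF} = z$ and the modified VaR equals the Gaussian VaR.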

**Question 3**

Use the same dataset as the previous question. What was the Monthly Historic VaR at the 1% level (as a +ve number) of the Distressed Securities strategy?

In [5]:
def var_historic(r, level=5):
    """
    Returns the historic Value at Risk at a specified level
    i.e. returns the number such that "level" percent of the returns
    fall below that number, and the (100-level) percent are above
    """
    if isinstance(r, pd.DataFrame):
        return r.aggregate(var_historic, level=level)
    elif isinstance(r, pd.Series):
        return -np.percentile(r, level)
    else:
        raise TypeError("Expected r to be a Series or DataFrame")

(var_historic(hfi, level = 1)["Distressed Securities"]*100).round(2)

4.26
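
To see what `np.percentile` does inside `var_historic`, the sketch below applies it to a small made-up return series (the numbers are hypothetical, chosen so the result can be checked by hand). With NumPy's default linear interpolation, a level that falls between two observations gives a value between them.

```python
import numpy as np

# Eleven hypothetical monthly returns, sorted for readability
toy_returns = np.array([-0.10, -0.05, -0.02, 0.00, 0.01, 0.02,
                        0.03, 0.04, 0.05, 0.06, 0.07])

# The 10th percentile sits exactly on the second-worst return (-0.05)
var_10 = -np.percentile(toy_returns, 10)

# The 5th percentile falls halfway between -0.10 and -0.05
var_5 = -np.percentile(toy_returns, 5)

print(round(var_10, 4), round(var_5, 4))  # 0.05 0.075
```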

**Question 4**

Next, load the 30 industry return data using the erk.get_ind_returns() function that we developed during the lab sessions. For purposes of the remaining questions, use data during the 5-year period 2013-2017 (both inclusive) to estimate the expected returns as well as the covariance matrix. To be able to respond to the questions, you will need to build the MSR, EW and GMV portfolios consisting of the "Books", "Steel", "Oil", and "Mines" industries. Assume the risk-free rate over the 5-year period is 10%.
What is the weight of Steel in the EW Portfolio?

In [6]:
def get_ind_file(filetype):
    """
    Load and format the Ken French 30 Industry Portfolios files
    """
    known_types = ["returns", "nfirms", "size"]
    if filetype not in known_types:
        sep = ','
        raise ValueError(f'filetype must be one of:{sep.join(known_types)}')
    # compare strings with ==; "is" tests object identity, not equality
    if filetype == "returns":
        name = "vw_rets"
        divisor = 100
    elif filetype == "nfirms":
        name = "nfirms"
        divisor = 1
    elif filetype == "size":
        name = "size"
        divisor = 1
    ind = pd.read_csv(f"data/ind30_m_{name}.csv", header=0, index_col=0)/divisor
    ind.index = pd.to_datetime(ind.index, format="%Y%m").to_period('M')
    ind.columns = ind.columns.str.strip()
    return ind

data = get_ind_file("returns")
columns = ["Books","Steel","Oil","Mines"]
returns = data.loc["2013":"2017", columns]
returns.head()

,Books,Steel,Oil,Mines
2013-01,0.0513,0.0428,0.0788,0.0059
2013-02,-0.0654,-0.0268,0.0052,-0.0756
2013-03,0.0778,0.021,0.0209,0.0091
2013-04,-0.0029,-0.0441,-0.0129,-0.1057
2013-05,0.0479,0.0384,0.0307,0.0022


In [7]:
100/returns.shape[1]

25.0

**Question 5**

What is the weight of the largest component of the MSR portfolio?

In [8]:
from scipy.optimize import minimize
def annualize_rets(r, periods_per_year):
    """
    Annualizes a set of returns
    """
    compounded_growth = (1+r).prod()
    n_periods = r.shape[0]
    return compounded_growth**(periods_per_year/n_periods)-1

def portfolio_return(weights, returns):
    """
    Computes the return on a portfolio from constituent returns and weights
    weights are a numpy array or Nx1 matrix and returns are a numpy array or Nx1 matrix
    """
    return weights.T @ returns

def portfolio_vol(weights, covmat):
    """
    Computes the vol of a portfolio from a covariance matrix and constituent weights
    weights are a numpy array or N x 1 matrix and covmat is an N x N matrix
    """
    return (weights.T @ covmat @ weights)**0.5

def msr(riskfree_rate, er, cov):
    """
    Returns the weights of the portfolio that gives you the maximum Sharpe ratio
    given the riskfree rate and expected returns and a covariance matrix
    """
    n = er.shape[0]
    init_guess = np.repeat(1/n, n)
    bounds = ((0.0, 1.0),) * n # an N-tuple of 2-tuples!
    # construct the constraints
    weights_sum_to_1 = {'type': 'eq',
                        'fun': lambda weights: np.sum(weights) - 1
    }
    def neg_sharpe(weights, riskfree_rate, er, cov):
        """
        Returns the negative of the Sharpe ratio
        of the given portfolio
        """
        r = portfolio_return(weights, er)
        vol = portfolio_vol(weights, cov)
        return -(r - riskfree_rate)/vol
    
    weights = minimize(neg_sharpe, init_guess,
                       args=(riskfree_rate, er, cov), method='SLSQP',
                       options={'disp': False},
                       constraints=(weights_sum_to_1,),
                       bounds=bounds)
    return weights.x

er = annualize_rets(returns,12)
cov = returns.cov()
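
The annualization used for `er` compounds the observed returns and rescales the exponent to one year. A minimal sketch on a made-up series (a flat 1% per month for 24 months, which is hypothetical data) makes the arithmetic easy to verify: the result should equal 1.01 raised to the 12th power, minus 1. The helper name `annualize_rets_sketch` is chosen here for illustration and works on a plain NumPy array.

```python
import numpy as np

def annualize_rets_sketch(r, periods_per_year):
    """Annualize a 1-D array of periodic returns by compounding."""
    compounded_growth = np.prod(1 + r)
    n_periods = r.shape[0]
    return compounded_growth ** (periods_per_year / n_periods) - 1

# Hypothetical input: 1% per month for two years
flat_monthly = np.repeat(0.01, 24)
annual = annualize_rets_sketch(flat_monthly, 12)
print(round(annual, 4))  # 0.1268
```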

In [9]:
msr_weights = msr(0.1, er, cov)
pd.options.display.float_format = '{:.2f}'.format
df_weights = pd.DataFrame({'industry':er.index, 'msr_weights': msr_weights*100})
df_weights.sort_values("msr_weights", ascending = False).head(1)

,industry,msr_weights
1,Steel,100.0


**Question 6**

Which of the 4 components has the largest weight in the MSR portfolio?

In [10]:
df_weights.sort_values("msr_weights", ascending = False).head(1)["industry"]

1    Steel
Name: industry, dtype: object

**Question 7**

How many of the components of the MSR portfolio have non-zero weights?

In [11]:
df_weights

,industry,msr_weights
0,Books,0.0
1,Steel,100.0
2,Oil,0.0
3,Mines,0.0
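
The table is rounded to two decimals, and a numerical optimizer such as SLSQP can return tiny values instead of exact zeros. One way to count the non-zero components is to compare against a tolerance. The sketch below assumes a hypothetical raw weight vector and an arbitrary tolerance of `1e-6`; neither comes from the actual optimizer run.

```python
import numpy as np

# Hypothetical raw optimizer output: two weights are floating-point noise
raw_weights = np.array([2.1e-16, 1.0, 0.0, 5.6e-17])

# Treat anything within the tolerance of zero as a zero weight
is_nonzero = ~np.isclose(raw_weights, 0.0, atol=1e-6)
n_nonzero = int(is_nonzero.sum())
print(n_nonzero)  # 1
```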


**Question 8**

What is the weight of the largest component of the GMV portfolio?

In [12]:
def gmv(cov):
    """
    Returns the weights of the Global Minimum Volatility portfolio
    given a covariance matrix
    """
    n = cov.shape[0]
    return msr(0, np.repeat(1, n), cov)
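
The reason `gmv` can reuse `msr`: when every expected return is the same and the risk-free rate is 0, the Sharpe ratio of a fully invested portfolio is just 1/volatility, so the maximum-Sharpe weights are also the minimum-volatility weights. The sketch below checks this with a brute-force grid over a hypothetical two-asset covariance matrix (not the industry data). For a diagonal covariance the minimum-variance weights are proportional to the inverse variances, which gives 9/13 for the first asset here.

```python
import numpy as np

# Hypothetical two-asset covariance matrix (uncorrelated assets)
toy_cov = np.array([[0.04, 0.00],
                    [0.00, 0.09]])
equal_er = np.repeat(1.0, 2)  # identical expected returns, as in gmv()

# Every long-only, fully invested candidate on a fine grid
grid = np.linspace(0, 1, 1001)
candidates = np.column_stack([grid, 1 - grid])

vols = np.array([np.sqrt(w @ toy_cov @ w) for w in candidates])
sharpes = (candidates @ equal_er - 0.0) / vols  # risk-free rate of 0

i_max_sharpe = int(np.argmax(sharpes))
i_min_vol = int(np.argmin(vols))
print(i_max_sharpe == i_min_vol, candidates[i_min_vol])
```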

In [13]:
gmv_weights = gmv(cov)
df_weights["gmv_weights"] = gmv_weights*100
df_weights.sort_values("gmv_weights",ascending = False).head(1)

,industry,msr_weights,gmv_weights
0,Books,0.0,47.7


**Question 9**

Which of the 4 components has the largest weight in the GMV portfolio?

In [14]:
df_weights.sort_values("gmv_weights",ascending = False).head(1)["industry"]

0    Books
Name: industry, dtype: object

**Question 10**

How many of the components of the GMV portfolio have non-zero weights?

In [15]:
df_weights

,industry,msr_weights,gmv_weights
0,Books,0.0,47.7
1,Steel,100.0,0.0
2,Oil,0.0,43.41
3,Mines,0.0,8.89


**Question 11**

Assume two different investors invested in the GMV and MSR portfolios at the start of 2018 using the weights we just computed. Compute the annualized volatility of these two portfolios over the 12 months of 2018. (Hint: Use the portfolio_vol code we developed in the lab and use ind["2018"][l].cov() to compute the covariance matrix for 2018, assuming that the variable ind holds the industry returns and the variable l holds the list of industry portfolios you are willing to hold. Don't forget to annualize the volatility.)

What would be the annualized volatility over 2018 using the weights of the MSR portfolio?

In [16]:
# restrict to calendar year 2018 explicitly, as the question asks
cov_18 = data.loc['2018':'2018', columns].cov()

def portfolio_vol(weights, covmat):
    """
    Computes the vol of a portfolio from a covariance matrix and constituent weights
    weights are a numpy array or N x 1 matrix and covmat is an N x N matrix
    """
    return (weights.T @ covmat @ weights)**0.5

vol_msr = portfolio_vol(msr_weights, cov_18)
annualized_vol = vol_msr*(12**0.5)
(annualized_vol*100).round(2)

21.98
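
The annualization step multiplies the monthly volatility by the square root of 12, which relies on the usual assumption that monthly returns are serially uncorrelated so that variance scales linearly with time. A small sketch with hypothetical weights and a hypothetical monthly covariance matrix shows the arithmetic:

```python
import numpy as np

# Hypothetical inputs: two uncorrelated assets, each with 2% monthly vol
toy_weights = np.array([0.5, 0.5])
monthly_cov = np.array([[0.0004, 0.0],
                        [0.0, 0.0004]])

# Portfolio vol is sqrt(w' * Cov * w); then scale by sqrt(12) to annualize
monthly_vol = (toy_weights.T @ monthly_cov @ toy_weights) ** 0.5
annual_vol = monthly_vol * np.sqrt(12)
print(round(annual_vol * 100, 2))  # 4.9
```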

**Question 12**

What would be the annualized volatility over 2018 using the weights of the GMV portfolio? (Reminder and Hint: Use the portfolio_vol code we developed in the lab and use ind["2018"][l].cov() to compute the covariance matrix for 2018, assuming that the variable ind holds the industry returns and the variable l holds the list of industry portfolios you are willing to hold. Don't forget to annualize the volatility.)

In [17]:
vol_gmv = portfolio_vol(gmv_weights, cov_18)
annualized_vol = vol_gmv*(12**0.5)
(annualized_vol*100).round(2)

18.97