# <span style="color:red">QPM: Assignment 7</span>

###  LAFTIT Mehdi, LIN Christine, MUSEUX Célia and YANG Hexuan 

Financial Engineering - Quantitative Portfolio Management

**Due date:** 24/11/2023 9am

**Resources:** Fama and French (2015), Carhart (1997), Hou, Xue, and Zhang (2015) and Frazzini and Pedersen (2014).

## <span style="color:green">Preliminary step</span>

We import the libraries we are going to use in this assignment:

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import minimize
from tabulate import tabulate



The data file for this assignment has monthly excess returns for nine firm-specific characteristics: 

- Market
- SMB (Small Minus Big)
- HML (High Minus Low)
- RMW (Robust Minus Weak)
- CMA (Conservative Minus Aggressive)
- UMD (Up Minus Down)
- ROE (Return On Equity)
- IA (Investment to Assets)
- BAB (Betting Against Beta)

We assume that these returns were generated by Nt = 2000 stocks and that the number of stocks is constant over time.

In [2]:
factor = pd.read_excel('QPM-FactorsData.xlsx')
factor.set_index('Dates', inplace=True)
factor.head(7)

| Dates | Market | SMB | HML | RMW | CMA | UMD | ROE | IA | BAB |
|---|---|---|---|---|---|---|---|---|---|
| 196702 | 0.0078 | 0.0334 | -0.0217 | 0.0194 | -0.0094 | 0.0356 | 0.035317 | -0.002064 | 0.0262 |
| 196703 | 0.0399 | 0.0163 | 0.0031 | 0.009 | -0.0151 | 0.0142 | 0.018876 | -0.016933 | 0.0081 |
| 196704 | 0.0389 | 0.0062 | -0.0264 | 0.0243 | -0.0375 | 0.0064 | 0.010983 | -0.029519 | 0.0171 |
| 196705 | -0.0433 | 0.0198 | 0.008 | -0.0175 | 0.0161 | 0.0067 | 0.005234 | 0.024686 | 0.0201 |
| 196706 | 0.0241 | 0.0596 | 0.0096 | -0.0064 | -0.0239 | 0.0603 | 0.002945 | -0.0217 | -0.0163 |
| 196707 | 0.0458 | 0.0308 | 0.0265 | 0.0051 | 0.0272 | -0.0107 | -0.007125 | 0.023713 | 0.0456 |
| 196708 | -0.0089 | 0.0047 | 0.0146 | 0.0042 | 0.0141 | -0.0141 | -0.00678 | 0.018169 | 0.0227 |


## <span style="color:green">Question 1 of Assignment 7</span>

We now explain why one might expect these nine factors to be related to stock returns. The first five factors (Market, SMB, HML, RMW, CMA) are from Fama and French (2015), the sixth (UMD) is from Carhart (1997), the profitability (ROE) and investment (IA) factors are from Hou, Xue, and Zhang (2015), and the betting-against-beta (BAB) factor is from Frazzini and Pedersen (2014).

- **Market**: represents the excess return of the overall market over the risk-free rate. Stocks tend to move in tandem with the broader market, making market performance a crucial determinant of individual stock returns.


- **SMB**: is the return on a diversified portfolio of small stocks minus the return on a diversified portfolio of big stocks. Historically, small-cap stocks have outperformed large-cap stocks, and SMB captures this size effect.


- **HML**: captures the spread in returns between stocks with high book-to-market ratios (value stocks) and low book-to-market ratios (growth stocks). Historically, value stocks have outperformed growth stocks, and HML quantifies this value effect.


- **RMW**: is the difference between the returns on diversified portfolios of stocks with robust and weak profitability. Historically, stocks with higher profitability have tended to outperform less profitable stocks, and RMW captures this profitability factor.


- **CMA**: is the difference between the returns on diversified portfolios of the stocks of low and high investment firms, which we call conservative and aggressive. Historically, stocks of firms that invest conservatively have outperformed those of firms that invest aggressively, and CMA quantifies this investment effect.



- **BAB factor**: is a portfolio that holds low-beta assets, leveraged to a beta of one, and that shorts high-beta assets, de-leveraged to a beta of one. Frazzini and Pedersen (2014) find that BAB factors have a positive average return, and that this return is increasing in the ex ante tightness of funding constraints and in the spread in betas between high- and low-beta securities. During times of tightening funding liquidity constraints, the BAB factor realizes negative returns while its expected future return rises. Whereas the Capital Asset Pricing Model (CAPM) implies that a higher beta earns a proportionally higher expected return, the BAB evidence suggests that low-beta stocks outperform high-beta stocks on a risk-adjusted basis.


- **Investment factor I/A**: is the difference (low-minus-high), each month, between the simple average of the returns on the 6 low I/A portfolios and the simple average of the returns on the 6 high I/A portfolios. Historically, stocks of companies with lower investment have tended to outperform those with higher investment, and IA captures this investment efficiency factor.


- **ROE factor**: is the difference (high-minus-low), each month, between the simple average of the returns on the 6 high ROE portfolios and the simple average of the returns on the 6 low ROE portfolios. Companies with higher ROE are often associated with higher stock returns, indicating the importance of profitability in stock performance.


- **UMD factor**: is the return on the equal-weighted average of the best-performing stocks minus the return on the equal-weighted average of the worst-performing stocks (the top and bottom 30% in Carhart (1997)). This stock momentum strategy selects stocks based on their prior one-year returns, skipping the most recent month, and holds them for a month. Stocks that have performed well in the recent past (winners) tend to continue to perform well, while those that have performed poorly (losers) tend to continue to underperform. UMD captures this momentum effect.



Summary: each of these factors captures a documented return spread, so together they help explain the variation in stock returns.
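The long-short construction shared by most of the factors above can be illustrated with a simplified sketch. It is not the exact methodology of any of the cited papers (which use double or triple sorts and specific breakpoints); the function name `long_short_return`, the 30% cutoff and the synthetic data are assumptions made only for illustration.

```python
import numpy as np

def long_short_return(characteristic, next_returns, quantile=0.3):
    """Equal-weighted return of the top-quantile stocks minus the bottom-quantile stocks.

    Simplified single-sort illustration (hypothetical helper, not the papers' method).
    """
    c = np.asarray(characteristic, dtype=float)
    r = np.asarray(next_returns, dtype=float)
    low_cut, high_cut = np.quantile(c, [quantile, 1 - quantile])
    long_leg = r[c >= high_cut].mean()    # stocks with a high characteristic value
    short_leg = r[c <= low_cut].mean()    # stocks with a low characteristic value
    return long_leg - short_leg

# Synthetic example: 2000 stocks whose returns load slightly on the characteristic
rng = np.random.default_rng(0)
char = rng.normal(size=2000)
rets = 0.002 * char + rng.normal(0.0, 0.05, size=2000)
factor_return = long_short_return(char, rets)
print(f"Long-short return this month: {factor_return:.4f}")
```

Repeating this every month yields a time series comparable in spirit to the factor columns in the data file.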

## <span style="color:green">Question 2 of Assignment 7</span>

We have to find the optimal θ vector for a mean-variance investor with risk aversion of γ = 5 if the investor can invest in only these nine factors. 
We then use the entire dataset to estimate the nine factors’ mean and covariance of returns (i.e., we do not need to do out-of-sample analysis).

In [3]:
# We define the risk aversion
gamma = 5

We calculate the mean and covariance of returns for the nine factors:

In [4]:
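mean_returns = np.mean(factor, axis=0)
# np.std defaults to the population standard deviation (ddof=0),
# whereas DataFrame.cov() uses the sample estimator (ddof=1)
std_returns = np.std(factor, axis=0)
covariance_matrix = factor.cov()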

In [5]:
# We create a DataFrame for better formatting
summary_table = pd.DataFrame({
    'Factor': factor.columns,
    'Mean': mean_returns,
    'Volatility': std_returns
})

table_str = tabulate(summary_table, headers='keys', tablefmt='fancy_grid', showindex=False)

print("\033[1m\nFactor mean:\033[0m")
print(table_str)

print("\033[1m\nCovariance matrix:\033[0m")
covariance_matrix

Factor mean:
╒══════════╤════════════╤══════════════╕
│ Factor   │       Mean │   Volatility │
╞══════════╪════════════╪══════════════╡
│ Market   │ 0.00569706 │    0.0455527 │
├──────────┼────────────┼──────────────┤
│ SMB      │ 0.00176368 │    0.0308541 │
├──────────┼────────────┼──────────────┤
│ HML      │ 0.00235549 │    0.0293381 │
├──────────┼────────────┼──────────────┤
│ RMW      │ 0.00266739 │    0.0219473 │
├──────────┼────────────┼──────────────┤
│ CMA      │ 0.00276414 │    0.0200224 │
├──────────┼────────────┼──────────────┤
│ UMD      │ 0.00628362 │    0.0428759 │
├──────────┼────────────┼──────────────┤
│ ROE      │ 0.00506562 │    0.0256235 │
├──────────┼────────────┼──────────────┤
│ IA       │ 0.00340875 │    0.0188358 │
├──────────┼────────────┼──────────────┤
│ BAB      │ 0.0087255  │    0.0335746 │
╘══════════╧════════════╧══════════════╛
Covariance matrix:


| | Market | SMB | HML | RMW | CMA | UMD | ROE | IA | BAB |
|---|---|---|---|---|---|---|---|---|---|
| Market | 0.002078 | 0.000426 | -0.000306 | -0.000216 | -0.00035 | -0.000327 | -0.000271 | -0.000316 | -0.000135 |
| SMB | 0.000426 | 0.000953 | -0.000142 | -0.000278 | -9.4e-05 | -6.7e-05 | -0.000312 | -0.000134 | -6e-05 |
| HML | -0.000306 | -0.000142 | 0.000862 | 5.3e-05 | 0.000405 | -0.000264 | -0.000108 | 0.000366 | 0.00033 |
| RMW | -0.000216 | -0.000278 | 5.3e-05 | 0.000482 | -5e-06 | 0.000102 | 0.000369 | 3.3e-05 | 0.000232 |
| CMA | -0.00035 | -9.4e-05 | 0.000405 | -5e-06 | 0.000402 | -1.3e-05 | -3.6e-05 | 0.000344 | 0.000208 |
| UMD | -0.000327 | -6.7e-05 | -0.000264 | 0.000102 | -1.3e-05 | 0.001841 | 0.000563 | 3e-06 | 0.000282 |
| ROE | -0.000271 | -0.000312 | -0.000108 | 0.000369 | -3.6e-05 | 0.000563 | 0.000658 | 1.7e-05 | 0.000244 |
| IA | -0.000316 | -0.000134 | 0.000366 | 3.3e-05 | 0.000344 | 3e-06 | 1.7e-05 | 0.000355 | 0.000202 |
| BAB | -0.000135 | -6e-05 | 0.00033 | 0.000232 | 0.000208 | 0.000282 | 0.000244 | 0.000202 | 0.001129 |


The return of the parametric portfolio at time $t + 1$, $r_{p,t+1}(\theta)$, is:

> $r_{p,t+1}(\theta) = r_{b,t+1} + \theta^\top r_{c,t+1}$

where:

- $r_{t+1}$ is the $N_t \times 1$ return vector at time $t + 1$
- $r_{b,t+1} = w_b^\top r_{t+1}$ is the benchmark portfolio return at time $t + 1$
- $r_{c,t+1} = X^\top r_{t+1} / N_t$ is the characteristic return vector at time $t + 1$, which contains the returns of the long-short portfolios corresponding to the $K$ characteristics, scaled by the number of firms $N_t$.


So, the parametric-portfolio return is the benchmark-portfolio return (Market portfolio) plus the return of the characteristic portfolio.
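As a small numerical check of this decomposition, the sketch below verifies on synthetic data that the portfolio return computed from stock-level weights equals the benchmark return plus $\theta^\top r_{c,t+1}$. All inputs are made up for illustration: the equal-weighted benchmark, the random characteristics and the variable names are assumptions, not part of the assignment data.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 2000, 8                               # stocks and characteristics (illustrative sizes)

r_next = rng.normal(0.01, 0.05, size=N)      # synthetic stock returns r_{t+1}
X = rng.normal(size=(N, K))                  # synthetic raw characteristics
X = (X - X.mean(axis=0)) / X.std(axis=0)     # cross-sectional standardization
w_b = np.full(N, 1.0 / N)                    # equal-weighted benchmark (assumption)
theta = rng.normal(size=K)                   # arbitrary theta, only for the check

# Factor-level view: benchmark return plus theta times characteristic returns
r_b = w_b @ r_next
r_c = X.T @ r_next / N
r_p_factor = r_b + theta @ r_c

# Stock-level view: tilt the benchmark weights by the characteristics
w = w_b + X @ theta / N
r_p_stock = w @ r_next

print(np.isclose(r_p_factor, r_p_stock))     # both views give the same return
```

Because the standardized characteristics have zero cross-sectional mean, the tilt adds up to zero and the stock-level weights still sum to one.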



We choose the weights $\theta$ by maximizing mean-variance utility:
> $\max_{\theta} \; E_t[r_{p,t+1}(\theta)] - \frac{\gamma}{2} V_t[r_{p,t+1}(\theta)]$

where:
- $\gamma$ is the investor’s risk-aversion parameter
- $E_t[r_{p,t+1}(\theta)]$ is the mean of the parametric-portfolio return
- $V_t[r_{p,t+1}(\theta)]$ is the variance of the parametric-portfolio return
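For reference, this objective is quadratic in $\theta$, so without any additional constraint on $\theta$ it has a closed-form maximizer. The derivation below is a sketch; the symbols $\mu_b$, $\mu_c$, $\sigma_b^2$, $\Sigma_{cc}$ and $\Sigma_{cb}$ are notation introduced here (mean and variance of the benchmark return, mean vector and covariance matrix of the characteristic returns, and their covariance with the benchmark) and do not appear in the assignment.

$$
E_t[r_{p,t+1}(\theta)] = \mu_b + \theta^\top \mu_c,
\qquad
V_t[r_{p,t+1}(\theta)] = \sigma_b^2 + 2\,\theta^\top \Sigma_{cb} + \theta^\top \Sigma_{cc}\,\theta
$$

Setting the gradient of the utility with respect to $\theta$ to zero gives

$$
\mu_c - \gamma\left(\Sigma_{cb} + \Sigma_{cc}\,\theta\right) = 0
\quad\Longrightarrow\quad
\theta^{*} = \Sigma_{cc}^{-1}\left(\frac{\mu_c}{\gamma} - \Sigma_{cb}\right)
$$

The second term shows that the tilt toward each characteristic is reduced when that characteristic's return covaries positively with the benchmark.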

**STEP 1**: we first define $r_{b,t+1}$ and $r_{c,t+1}$:

In [6]:
rb_t1 = factor["Market"]

rc_t1 = factor.iloc[:, 1:]

**STEP 2**: we now define the function to calculate parametric portfolio return:

In [7]:
def parametric_portfolio_return(theta):
    return rb_t1 + np.dot(theta, rc_t1.T)

**STEP 3**: Next we define the function to calculate mean-variance utility:

In [8]:
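def mean_variance_utility(weights, gamma):
    # Returns the NEGATIVE of the mean-variance utility,
    # so that minimizing this function maximizes the utility
    rp_t1 = parametric_portfolio_return(weights)
    mean_rp_t1 = np.mean(rp_t1)
    var_rp_t1 = np.var(rp_t1)
    return -mean_rp_t1 + (gamma / 2) * var_rp_t1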

**STEP 4**: the goal now is to minimize the function defined just above (maximizing the mean-variance utility $u$ is equivalent to minimizing $-u$):

In [17]:
initial_guess = np.zeros(8)
weights_market = np.zeros_like(initial_guess)
weights_market[0] = 1  # weight of 1 on the market benchmark

# Equality constraint: the market weight (1) plus the eight θ sum to 1,
# which means the eight θ themselves must sum to zero
constraints = {'type': 'eq', 'fun': lambda weights: np.sum(weights) + weights_market[0] - 1}

# We minimize the negative utility over the eight θ
result = minimize(mean_variance_utility, initial_guess, args=(gamma,), method='SLSQP', constraints=constraints)
optimal_weights_remaining = result.x
# We prepend the market weight of 1 to obtain the full nine-element vector
optimal_weights = np.insert(optimal_weights_remaining, 0, 1.0)


factor_names = factor.columns
print("\033[1m\nOptimal weights (θ) are =\033[0m ")
for factor_name, weight in zip(factor_names, optimal_weights):
    print(f"{factor_name}: {weight:.6f}")

Optimal weights (θ) are =
Market: 1.000000
SMB: -0.804106
HML: -0.587802
RMW: -1.308155
CMA: 0.051407
UMD: 0.068890
ROE: 0.711131
IA: 0.662123
BAB: 1.206512


*Note that we assumed that the investor's benchmark is the market portfolio; we therefore put a weight of 1 on the market and ran the mean-variance optimization over the remaining 8 factors.*

*Since the investor can invest in only these nine factors, we added a constraint that the nine weights sum to 1, which forces the eight θ to sum to zero.*
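Because the objective is quadratic and the constraint is linear, the numerical solution can be cross-checked against the solution of the corresponding first-order (KKT) linear system. The sketch below is one possible way to do this, shown on synthetic data; the helper names `closed_form_theta` and `utility` are hypothetical, and population moments (`ddof=0`) are used on the assumption that the objective relies on `np.var`, as in the notebook.

```python
import numpy as np

def closed_form_theta(rb, rc, gamma):
    """Maximize mean - gamma/2 * variance of rb + rc @ theta subject to sum(theta) = 0."""
    rb = np.asarray(rb, dtype=float)
    rc = np.asarray(rc, dtype=float)
    T, K = rc.shape
    mu_c = rc.mean(axis=0)
    rc_centered = rc - mu_c
    rb_centered = rb - rb.mean()
    sigma_cc = rc_centered.T @ rc_centered / T   # covariance of characteristic returns
    sigma_cb = rc_centered.T @ rb_centered / T   # covariance with the benchmark
    # KKT system: gamma * sigma_cc @ theta + lam * 1 = mu_c - gamma * sigma_cb, 1' theta = 0
    kkt = np.zeros((K + 1, K + 1))
    kkt[:K, :K] = gamma * sigma_cc
    kkt[:K, K] = 1.0
    kkt[K, :K] = 1.0
    rhs = np.append(mu_c - gamma * sigma_cb, 0.0)
    return np.linalg.solve(kkt, rhs)[:K]

def utility(theta, rb, rc, gamma):
    rp = rb + rc @ theta
    return rp.mean() - 0.5 * gamma * rp.var()

# Synthetic monthly returns, for illustration only
rng = np.random.default_rng(1)
rb_demo = rng.normal(0.005, 0.045, size=600)
rc_demo = rng.normal(0.003, 0.03, size=(600, 8))
theta_demo = closed_form_theta(rb_demo, rc_demo, gamma=5)
print(theta_demo.round(4), theta_demo.sum().round(10))
```

On the assignment data, one could compare `closed_form_theta(rb_t1.values, rc_t1.values, gamma)` with `result.x`. If the two differ noticeably, tightening the SLSQP tolerance (for example through `options={'ftol': ...}`) is one thing to try.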

## <span style="color:green">Question 3 of Assignment 7</span>

We now have to find the Sharpe ratio for each of the nine factors and compare it to that of the parametric portfolio we have identified in the previous question.

In [10]:
sharpe_ratios = mean_returns/ std_returns 

print("\033[1m\nSharpe Ratios for Each Factor:\033[0m")
for factor_name, sharpe_ratio in zip(factor.columns, sharpe_ratios):
    print(f"{factor_name}: {sharpe_ratio:.6f}")

Sharpe Ratios for Each Factor:
Market: 0.125065
SMB: 0.057162
HML: 0.080288
RMW: 0.121536
CMA: 0.138052
UMD: 0.146553
ROE: 0.197695
IA: 0.180972
BAB: 0.259884


In [11]:
sharpe_ratios_ann = sharpe_ratios*np.sqrt(12)

print("\033[1m\nAnnualized Sharpe Ratios for Each Factor:\033[0m")
for factor_name, sharpe_ratio in zip(factor.columns, sharpe_ratios_ann):
    print(f"{factor_name}: {sharpe_ratio:.6f}")

Annualized Sharpe Ratios for Each Factor:
Market: 0.433239
SMB: 0.198014
HML: 0.278125
RMW: 0.421013
CMA: 0.478227
UMD: 0.507676
ROE: 0.684834
IA: 0.626907
BAB: 0.900263


We calculate the Sharpe ratio of the parametric portfolio:

In [12]:
portfolio_returns = parametric_portfolio_return(optimal_weights[1:])
sharpe_ratio_portfolio = np.mean(portfolio_returns) / np.std(portfolio_returns)

print("\n\033[1mSharpe Ratio for Parametric Portfolio:\033[0m ")
print(f"{sharpe_ratio_portfolio:.6f}")


Sharpe Ratio for Parametric Portfolio:
0.268187


In [13]:
sharpe_ratios_portfolio_ann = sharpe_ratio_portfolio*np.sqrt(12)

print("\n\033[1mAnnualized Sharpe Ratio for Parametric Portfolio:\033[0m ")
print(f"{sharpe_ratios_portfolio_ann :.6f}")


Annualized Sharpe Ratio for Parametric Portfolio:
0.929025


The parametric portfolio's Sharpe ratio (0.93 annualized) is higher than that of every individual factor, although only marginally above that of BAB (0.90), the best single factor. This difference arises because the portfolio is constructed to maximize mean-variance utility while taking into account the correlations between the factors. <br> <br>
Thus, the diversification provided by combining the factors in the portfolio reduces risk without significantly sacrificing returns, which improves the overall Sharpe ratio.
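The diversification argument can be made concrete with a stylized example. The sketch below uses made-up moments for two uncorrelated factors with identical Sharpe ratios (these numbers are assumptions, not estimates from the data) and shows that an equal-weighted combination has a Sharpe ratio larger by a factor of $\sqrt{2}$. The helper `sharpe_ratio` is a hypothetical convenience function mirroring the mean-over-standard-deviation calculation used above.

```python
import numpy as np

def sharpe_ratio(excess_returns, periods_per_year=1):
    """Mean over population standard deviation of excess returns, optionally annualized."""
    r = np.asarray(excess_returns, dtype=float)
    return r.mean() / r.std() * np.sqrt(periods_per_year)

# Two hypothetical, uncorrelated factors with the same mean and volatility
mu = np.array([0.004, 0.004])
cov = np.array([[0.03**2, 0.0],
                [0.0, 0.03**2]])
w = np.array([0.5, 0.5])

single = mu[0] / np.sqrt(cov[0, 0])            # Sharpe ratio of one factor alone
combined = (w @ mu) / np.sqrt(w @ cov @ w)     # Sharpe ratio of the 50/50 combination

print(f"single: {single:.4f}, combined: {combined:.4f}, ratio: {combined / single:.4f}")
```

With positively correlated factors the gain is smaller, which is consistent with the modest improvement observed here over the best single factor.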

## <span style="color:green">Question 4 of Assignment 7</span>

Having obtained the optimal θ vector (named `optimal_weights` here), we now explain how one would obtain the optimal portfolio weights for each of the Nt = 2000 assets that are used to form each of the nine factors.

Brandt, Santa-Clara, and Valkanov (2009) propose that the $N_t \times 1$ vector of parametric portfolio weights $w_t(\theta)$ is

> $w_t(\theta) = w_{b,t} + (F_{1,t}\theta_1 + F_{2,t}\theta_2 + \dots + F_{K,t}\theta_K)/N_t$

where
- $w_{b,t}$ is the $N_t \times 1$ benchmark portfolio at time $t$
- $F_{k,t}$ is the $N_t \times 1$ long-short characteristic portfolio obtained by standardizing the $k$th firm-specific characteristic at time $t$
- $\theta_k$ is the (scalar) weight on the $k$th characteristic portfolio in the parametric portfolio
- $N_t$ is the number of firms at time $t$.



Equivalently, stock by stock, we parameterize the optimal portfolio weights as a function of the stocks’ characteristics:

> $w_{i,t} = w_{b,i,t} + \frac{1}{N_t} \theta^\top F_{i,t}$


where $w_{b,i,t}$ is the weight of stock $i$ at date $t$ in a benchmark portfolio such as the value-weighted market portfolio, $\theta$ is a vector of coefficients to be estimated, and $F_{i,t}$ are the characteristics of stock $i$, standardized cross-sectionally to have zero mean and unit standard deviation across all stocks at date $t$. Note that, rather than estimating one weight for each stock, we estimate weights as a single function of characteristics that applies to all stocks.


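In practice, we would need data on the stock-level characteristics, i.e. the $N_t \times 8$ matrix whose rows are the vectors $F_{i,t}$, standardize it cross-sectionally at each date, and plug it into the formula together with the benchmark weights and the estimated $\theta$.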
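A minimal sketch of that last step is given below. Since the stock-level data are not part of this assignment, the characteristics and market capitalizations are simulated; the function name `parametric_weights` and the pairing of each θ with a column of the simulated matrix are assumptions made only for illustration. The θ values are the eight non-market weights reported in Question 2.

```python
import numpy as np

def parametric_weights(w_benchmark, characteristics, theta):
    """Stock weights w_b + F @ theta / N, with F standardized cross-sectionally."""
    w_b = np.asarray(w_benchmark, dtype=float)
    X = np.asarray(characteristics, dtype=float)
    n_stocks = X.shape[0]
    F = (X - X.mean(axis=0)) / X.std(axis=0)     # zero mean, unit std across stocks
    return w_b + F @ np.asarray(theta, dtype=float) / n_stocks

rng = np.random.default_rng(0)
N, K = 2000, 8
market_cap = rng.lognormal(mean=0.0, sigma=1.0, size=N)   # simulated firm sizes
w_b = market_cap / market_cap.sum()                       # value-weighted benchmark
raw_characteristics = rng.normal(size=(N, K))             # simulated characteristics

# Theta for SMB, HML, RMW, CMA, UMD, ROE, IA, BAB from the Question 2 output
theta_hat = np.array([-0.804106, -0.587802, -1.308155, 0.051407,
                      0.068890, 0.711131, 0.662123, 1.206512])

w = parametric_weights(w_b, raw_characteristics, theta_hat)
print(w.shape, round(w.sum(), 6))
```

Because the standardized characteristics average to zero across stocks, the tilts net out and the resulting weights sum to one, like the benchmark weights.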