<h1 id="Contents">Contents<a href="#Contents"></a></h1>
        <ol>
        <li><a class="" href="#Fama%E2%80%93French-five-factor-model">Fama–French five-factor model</a>
<ol><li><a class="" href="#Market-Excess-Return">Market Excess Return</a></li>
<li><a class="" href="#Small-Minus-Big">Small Minus Big</a></li>
<li><a class="" href="#High-Minus-Low">High Minus Low</a></li>
<li><a class="" href="#Robust-Minus-Weak">Robust Minus Weak</a></li>
<li><a class="" href="#Conservative-Minus-Aggressive">Conservative Minus Aggressive</a></li>
<li><a class="" href="#The-Model">The Model</a></li>
<li><a class="" href="#Calculating-the-Factors-for-Apple">Calculating the Factors for Apple</a></li>
</ol>
        </li>
        </ol>

In [1]:
import pandas as pd
import numpy as np
import plotly.graph_objects as go
import plotly.express as px
import plotly.io as pio
from plotly.subplots import make_subplots

pio.renderers.default = "notebook"
pio.templates.default = "plotly_dark"

from pystock.FFF import FamaFrenchFactors
from pystock.portfolio import Stock
from pystock.utils import merge_dfs

In [2]:
fff = FamaFrenchFactors()
apple = Stock("AAPL", directory="Data/AAPL.csv")
apple.load_data(columns=["Adj Close"], rename_cols=["Close"], frequency="M")

fff5 = fff.load(factors=5, directory="fff", frequency="M")
fff5.head()

Unnamed: 0,Mkt-RF,SMB,HML,RMW,CMA,RF
1963-07-31,-0.0013,0.0011,-0.0003,-0.0013,0.003,0.00012
1963-08-31,0.0044,0.0015,-0.0013,0.0015,-0.0021,0.00011
1963-09-30,-0.006,0.0021,0.0008,0.0024,0.0013,0.00014
1963-10-31,0.0021,-0.0003,0.0008,0.001,-0.0026,0.00013
1963-11-30,0.0134,0.003,0.0029,-0.0031,-0.0015,0.00015


In [5]:
fff.mean_values()

Mkt-RF    0.000799
SMB       0.001204
HML      -0.000043
RMW      -0.000530
CMA      -0.000082
RF        0.000172
dtype: float64

In [6]:
apple.data

Unnamed: 0_level_0,Close
Date,Unnamed: 1_level_1
1980-12-31,0.118546
1981-01-31,0.098137
1981-02-28,0.092058
1981-03-31,0.085110
1981-04-30,0.098571
...,...
2022-07-31,162.015808
2022-08-31,156.959625
2022-09-30,137.971115
2022-10-31,153.086044


# Fama–French five-factor model

The Fama–French five-factor model explains asset returns with five factors: market, size, value, profitability, and investment. Fama and French proposed it in 2015 as an extension of their 1993 three-factor model (market, size, and value), and it has been widely used in finance since.

The five factors describing the model are:

1. market excess return (`Mkt-RF` in the data, written $MER$ below)
2. the outperformance of small versus big companies (SMB)
3. the outperformance of high book/market versus low book/market companies (HML)
4. the profitability factor (RMW)
5. investment factor (CMA)

Let's consider these factors in more detail.

## Market Excess Return

This is equal to $R_m - R_f$, the return of the market portfolio minus the risk-free rate. It is the extra return an investor earns for holding the market portfolio instead of a risk-free asset.

Here $R_m$ is the value-weighted (market-capitalization-weighted) return of all common stocks listed on the NYSE, AMEX, and NASDAQ, and $R_f$ is the one-month Treasury bill rate.

We'll be denoting it as $MER$.

## Small Minus Big

This is the average return on the nine small stock portfolios minus the average return on the nine big stock portfolios:

$$
\begin{aligned}
SMB_{(B/M)} &= \tfrac{1}{3}(\text{Small Value} + \text{Small Neutral} + \text{Small Growth}) - \tfrac{1}{3}(\text{Big Value} + \text{Big Neutral} + \text{Big Growth}) \\
SMB_{(OP)} &= \tfrac{1}{3}(\text{Small Robust} + \text{Small Neutral} + \text{Small Weak}) - \tfrac{1}{3}(\text{Big Robust} + \text{Big Neutral} + \text{Big Weak}) \\
SMB_{(INV)} &= \tfrac{1}{3}(\text{Small Conservative} + \text{Small Neutral} + \text{Small Aggressive}) - \tfrac{1}{3}(\text{Big Conservative} + \text{Big Neutral} + \text{Big Aggressive}) \\
SMB &= \tfrac{1}{3}\left(SMB_{(B/M)} + SMB_{(OP)} + SMB_{(INV)}\right)
\end{aligned}
$$

We'll be denoting it as $SMB$.
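
The averaging above can be sketched in plain Python. The helper name `size_spread` and all return values are made up for illustration; they are not figures from the Fama–French data files.

```python
def size_spread(small, big):
    """Average of the small portfolios minus average of the big portfolios."""
    return sum(small) / len(small) - sum(big) / len(big)

# Hypothetical one-month returns (decimal form) for the three 2x3 sorts.
# Each tuple lists the three portfolios of one size group.
bm_small, bm_big = (0.021, 0.015, 0.010), (0.012, 0.009, 0.004)    # Value, Neutral, Growth
op_small, op_big = (0.018, 0.014, 0.011), (0.010, 0.008, 0.006)    # Robust, Neutral, Weak
inv_small, inv_big = (0.019, 0.013, 0.012), (0.011, 0.007, 0.005)  # Conservative, Neutral, Aggressive

smb_bm = size_spread(bm_small, bm_big)
smb_op = size_spread(op_small, op_big)
smb_inv = size_spread(inv_small, inv_big)

# SMB is the simple average of the three size spreads.
smb = (smb_bm + smb_op + smb_inv) / 3
print(f"SMB = {smb:.4f}")
```

Because every group has three portfolios, this is the same as averaging all nine small portfolios and subtracting the average of all nine big ones.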

## High Minus Low

HML (High Minus Low) is the average return on the two value portfolios minus the average return on the two growth portfolios.

$$
HML = \tfrac{1}{2}(\text{Small Value} + \text{Big Value}) - \tfrac{1}{2}(\text{Small Growth} + \text{Big Growth})
$$

We'll be denoting it as $HML$.
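
As a small illustration, the HML spread can be written as a two-leg long-short helper. The function name `long_short` and the return values are hypothetical; any factor defined as "half of (small + big) on one side minus half of (small + big) on the other" would fit the same helper.

```python
def long_short(small_long, big_long, small_short, big_short):
    """1/2 (small + big) of the long leg minus 1/2 (small + big) of the short leg."""
    return 0.5 * (small_long + big_long) - 0.5 * (small_short + big_short)

# Hypothetical one-month returns in decimal form.
hml = long_short(
    small_long=0.021,   # Small Value
    big_long=0.012,     # Big Value
    small_short=0.010,  # Small Growth
    big_short=0.004,    # Big Growth
)
print(f"HML = {hml:.4f}")
```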

## Robust Minus Weak

RMW (Robust Minus Weak) is the average return on the two robust operating profitability portfolios minus the average return on the two weak operating profitability portfolios:

$$
RMW = \tfrac{1}{2}(\text{Small Robust} + \text{Big Robust}) - \tfrac{1}{2}(\text{Small Weak} + \text{Big Weak})
$$

We'll be denoting it as $RMW$.

## Conservative Minus Aggressive

CMA (Conservative Minus Aggressive) is the average return on the two conservative investment portfolios minus the average return on the two aggressive investment portfolios:

$$
CMA = \tfrac{1}{2}(\text{Small Conservative} + \text{Big Conservative}) - \tfrac{1}{2}(\text{Small Aggressive} + \text{Big Aggressive})
$$

We'll be denoting it as $CMA$.

## The Model

Using these factors, we can write the model as

$$
R = \alpha + \beta_1 MER + \beta_2 SMB + \beta_3 HML + \beta_4 RMW + \beta_5 CMA + \epsilon
$$

where $R$ is the return of the asset, $\alpha$ is the intercept, $\beta_1, \dots, \beta_5$ are the slopes on the five factors, and $\epsilon$ is the error term.

In the standard form, the dependent variable is the asset's excess return over the risk-free rate $R_f$, which is what we regress on below:

$$
R - R_f = \alpha + \beta_1 MER + \beta_2 SMB + \beta_3 HML + \beta_4 RMW + \beta_5 CMA + \epsilon
$$

The excess return of a stock is thus modeled as a linear combination of the five factors. The coefficients $\beta_1, \dots, \beta_5$ are the stock's loadings (sensitivities) on the factors. The intercept $\alpha$ is the expected excess return of the asset when all the factors are zero. Fitting the model to each stock in a portfolio and weighting the fitted expected returns by the portfolio weights gives the expected return of the portfolio.
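
To make the regression concrete, here is a minimal sketch that generates synthetic factor data with known coefficients and recovers them by ordinary least squares using NumPy. All numbers (`true_alpha`, `true_betas`, the noise levels) are invented for illustration and have nothing to do with the Apple data used later.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs = 500

# Synthetic factor returns: the columns stand in for MER, SMB, HML, RMW, CMA.
factors = rng.normal(0.0, 0.02, size=(n_obs, 5))

# Made-up "true" intercept and slopes used to generate the excess return.
true_alpha = 0.002
true_betas = np.array([1.1, -0.3, 0.4, 0.2, -0.1])
noise = rng.normal(0.0, 0.005, size=n_obs)
excess_return = true_alpha + factors @ true_betas + noise

# Prepend a column of ones so the first coefficient is the intercept alpha.
X = np.column_stack([np.ones(n_obs), factors])
coef, *_ = np.linalg.lstsq(X, excess_return, rcond=None)

alpha_hat, betas_hat = coef[0], coef[1:]
print("alpha:", round(float(alpha_hat), 4))
print("betas:", np.round(betas_hat, 2))
```

The estimates should land close to the values used to generate the data, which is the sense in which the slopes measure factor exposure.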

## Calculating the Factors for Apple

In [7]:
apple.data.head()

Unnamed: 0_level_0,Close
Date,Unnamed: 1_level_1
1980-12-31,0.118546
1981-01-31,0.098137
1981-02-28,0.092058
1981-03-31,0.08511
1981-04-30,0.098571


In [8]:
fff5.head()

Unnamed: 0,Mkt-RF,SMB,HML,RMW,CMA,RF
1963-07-31,-0.0013,0.0011,-0.0003,-0.0013,0.003,0.00012
1963-08-31,0.0044,0.0015,-0.0013,0.0015,-0.0021,0.00011
1963-09-30,-0.006,0.0021,0.0008,0.0024,0.0013,0.00014
1963-10-31,0.0021,-0.0003,0.0008,0.001,-0.0026,0.00013
1963-11-30,0.0134,0.003,0.0029,-0.0031,-0.0015,0.00015


In [9]:
fff5.tail()

Unnamed: 0,Mkt-RF,SMB,HML,RMW,CMA,RF
2022-07-31,0.0144,-0.0088,0.0045,0.0048,-0.0118,4e-05
2022-08-31,-0.0074,0.0022,-0.0044,-0.0063,-0.0012,8e-05
2022-09-30,-0.0142,0.0059,0.0027,-0.0067,-0.0009,9e-05
2022-10-31,-0.0067,0.0047,0.0075,-0.0021,0.0048,0.00011
2022-11-30,0.0312,-0.0014,-0.0207,-0.0078,-0.0142,0.00014


In [10]:
apple = apple.data.pct_change().dropna()
apple.head()

Unnamed: 0_level_0,Close
Date,Unnamed: 1_level_1
1981-01-31,-0.172163
1981-02-28,-0.061943
1981-03-31,-0.075475
1981-04-30,0.158162
1981-05-31,0.167398


In [12]:
df = merge_dfs([apple, fff5], join="inner")
df.head()

Unnamed: 0,Close,Mkt-RF,SMB,HML,RMW,CMA,RF
1981-01-31,-0.172163,-0.0048,0.0046,0.0095,-0.0051,0.0078,0.00049
1981-02-28,-0.061943,0.0095,0.0014,-0.0049,0.0035,-0.0045,0.00056
1981-03-31,-0.075475,0.0105,-0.0019,-0.001,-0.0014,-0.0021,0.00055
1981-04-30,0.158162,0.0001,0.0019,-0.0042,-0.001,0.0004,0.00051
1981-05-31,0.167398,-0.0046,0.0068,0.0034,-0.0001,0.0026,0.00057


Now we'll build the dependent variable for the regression: the stock's realized return minus the risk-free rate, i.e. its excess return.

In [13]:
y = df["Close"]-df["RF"]

In [14]:
X = df.drop(["Close", "RF"], axis=1)

In [15]:
X.columns

Index(['Mkt-RF', 'SMB', 'HML', 'RMW', 'CMA'], dtype='object')

In [16]:
import statsmodels.api as sm

In [17]:
X = sm.add_constant(X)

In [18]:
model = sm.OLS(y, X).fit()

In [19]:
print(model.summary())

                            OLS Regression Results                            
Dep. Variable:                      y   R-squared:                       0.018
Model:                            OLS   Adj. R-squared:                  0.008
Method:                 Least Squares   F-statistic:                     1.848
Date:                Sun, 01 Jan 2023   Prob (F-statistic):              0.102
Time:                        15:37:32   Log-Likelihood:                 321.73
No. Observations:                 503   AIC:                            -631.5
Df Residuals:                     497   BIC:                            -606.1
Df Model:                           5                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.0230      0.006      3.919      0.0

In [20]:
params = model.params

In [21]:
params

const     0.023013
Mkt-RF   -0.034175
SMB      -0.423884
HML      -3.084173
RMW       0.096715
CMA       0.819950
dtype: float64

In [24]:
params["rf"] = 1
params

const     0.023013
Mkt-RF   -0.034175
SMB      -0.423884
HML      -3.084173
RMW       0.096715
CMA       0.819950
rf        1.000000
dtype: float64

In [36]:
A = [1]
A.extend(params.values)

In [38]:
pd.Series(A, index=["const"]+list(params.index))

const     1.000000
const     0.023013
Mkt-RF   -0.034175
SMB      -0.423884
HML      -3.084173
RMW       0.096715
CMA       0.819950
rf        1.000000
dtype: float64

In [25]:
mean_mkt = df["Mkt-RF"].mean()
mean_smb = df["SMB"].mean()
mean_hml = df["HML"].mean()
mean_rf = df["RF"].mean()
mean_rmw = df["RMW"].mean()
mean_cma = df["CMA"].mean()

data = np.array([1, mean_mkt, mean_smb, mean_hml, mean_rmw, mean_cma, mean_rf])

In [26]:
expected_return = np.dot(params, data)
print(f"Expected return: {expected_return*100:.2f}%")

Expected return: 2.27%


In [29]:
def bigger_frequency(freq1, freq2):
    """
    Check if `freq1` is a coarser (longer-period) frequency than `freq2`, e.g. "M" vs "D"

    Parameters
    ----------
    freq1 : str
        Frequency 1
    freq2 : str
        Frequency 2

    Returns
    -------
    bool
        True if `freq1` is bigger than `freq2`
    """

    freq_map = {
        "D": 1,
        "W": 2,
        "M": 3,
        "Q": 4,
        "Y": 5,
    }
    return freq_map[freq1] > freq_map[freq2]

In [30]:
bigger_frequency("M", "D")

True
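
When a daily price series has to be matched to monthly factors, one common approach is to keep the last available price of each month before computing returns. The sketch below illustrates that idea with made-up prices; the helper `month_end_prices` is hypothetical and is not pystock's implementation, which is not shown in this notebook.

```python
from datetime import date

def month_end_prices(daily):
    """Keep the last available price of each (year, month)."""
    last = {}
    for day in sorted(daily):  # iterate in chronological order
        # Later days in the same month overwrite earlier ones.
        last[(day.year, day.month)] = daily[day]
    return last

# Made-up daily closes spanning two months.
daily = {
    date(2022, 1, 28): 100.0,
    date(2022, 1, 31): 102.0,
    date(2022, 2, 1): 101.0,
    date(2022, 2, 28): 105.0,
}
monthly = month_end_prices(daily)
print(monthly)
```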

In [1]:
from pystock.portfolio import Stock

In [2]:
apple = Stock("AAPL", directory="Data/AAPL.csv")

In [3]:
apple.load_data(columns=["Adj Close"], rename_cols=["Close"], frequency="D")

Unnamed: 0_level_0,Close
Date,Unnamed: 1_level_1
1980-12-12,0.099874
1980-12-13,0.099874
1980-12-14,0.099874
1980-12-15,0.094663
1980-12-16,0.087715
...,...
2022-12-25,131.860001
2022-12-26,131.860001
2022-12-27,130.029999
2022-12-28,126.040001


In [4]:
apple.load_fff(factors=5, directory="fff", frequency="M")

Unnamed: 0,Mkt-RF,SMB,HML,RMW,CMA,RF
1963-07-31,-0.0013,0.0011,-0.0003,-0.0013,0.0030,0.00012
1963-08-31,0.0044,0.0015,-0.0013,0.0015,-0.0021,0.00011
1963-09-30,-0.0060,0.0021,0.0008,0.0024,0.0013,0.00014
1963-10-31,0.0021,-0.0003,0.0008,0.0010,-0.0026,0.00013
1963-11-30,0.0134,0.0030,0.0029,-0.0031,-0.0015,0.00015
...,...,...,...,...,...,...
2022-07-31,0.0144,-0.0088,0.0045,0.0048,-0.0118,0.00004
2022-08-31,-0.0074,0.0022,-0.0044,-0.0063,-0.0012,0.00008
2022-09-30,-0.0142,0.0059,0.0027,-0.0067,-0.0009,0.00009
2022-10-31,-0.0067,0.0047,0.0075,-0.0021,0.0048,0.00011


In [5]:
params = apple.calculate_fff()

Frequency of stock stock and fama french factors are not equal. Equating frequencies... to D and M respectively.
Fama French Factors Calculated
                            OLS Regression Results                            
Dep. Variable:                      y   R-squared:                       0.018
Model:                            OLS   Adj. R-squared:                  0.008
Method:                 Least Squares   F-statistic:                     1.848
Date:                Sun, 01 Jan 2023   Prob (F-statistic):              0.102
Time:                        15:14:49   Log-Likelihood:                 321.73
No. Observations:                 503   AIC:                            -631.5
Df Residuals:                     497   BIC:                            -606.1
Df Model:                           5                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025

In [9]:
apple.params

const     0.023013
Mkt-RF   -0.034175
SMB      -0.423884
HML      -3.084173
RMW       0.096715
CMA       0.819950
dtype: float64

In [10]:
apple.__dict__.keys()

dict_keys(['name', 'directory', 'loaded', 'return_', 'fff', 'data', 'columns', 'start_date', 'end_date', 'frequency', 'params'])

In [12]:
apple.frequency

'M'

In [1]:
from pystock.portfolio import Portfolio
from pystock.models import Model
from pystock.FFF import FamaFrenchFactors
import numpy as np

In [2]:
benchmark_dir = "Data/GSPC.csv"
benchmark_name = "S&P"

stock_dirs = ["Data/AAPL.csv", "Data/MSFT.csv", "Data/GOOG.csv", "Data/TSLA.csv"]
stock_names = ["AAPL", "MSFT", "GOOG", "TSLA"]

pt = Portfolio(benchmark_dir, benchmark_name, stock_dirs, stock_names)
start_date = "2012-01-01"
end_date = "2022-12-20"
pt.load_benchmark(columns=["Adj Close"], rename_cols=["Close"], start_date=start_date, end_date=end_date)
pt.load_all(columns=["Adj Close"], rename_cols=["Close"], start_date=start_date, end_date=end_date)

In [3]:
m = Model("M")

In [4]:
pt.calculate_fff_params(factors=5, directory="fff", frequency="M", verbose=0)

Frequency of stock stock and fama french factors are not equal. Equating frequencies... to D and M respectively.
Frequency of stock stock and fama french factors are not equal. Equating frequencies... to D and M respectively.
Frequency of stock stock and fama french factors are not equal. Equating frequencies... to D and M respectively.
Frequency of stock stock and fama french factors are not equal. Equating frequencies... to D and M respectively.
Done. Here are the parameters
+------------+------------+------------+------------+
|       AAPL |       MSFT |       GOOG |       TSLA |
|------------+------------+------------+------------|
|  0.0185366 |  0.0192982 |  0.0140484 |  0.0389433 |
|  0.791173  |  1.05123   |  0.0481774 | -2.34344   |
| -3.74438   | -1.16769   | -1.68188   | -9.60767   |
| -2.06079   | -1.68714   | -2.2483    | -1.77381   |
|  1.57347   |  2.876     |  3.37034   | -2.31403   |
|  3.77877   |  1.96595   |  0.6877    | -4.76799   |
|  1         |  1         |  1  

In [8]:
pt.stocks[0].params

const     0.018537
Mkt-RF    0.791173
SMB      -3.744382
HML      -2.060791
RMW       1.573470
CMA       3.778772
rf        1.000000
dtype: float64
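
Because least squares is linear in the dependent variable, the factor loadings of a fixed-weight portfolio equal the weighted average of its constituents' loadings, provided every stock is fitted over the same sample. A minimal sketch using the coefficients from the table printed by `calculate_fff_params` above (rows read as Mkt-RF, SMB, HML, RMW, CMA, matching the AAPL output) and hypothetical equal weights:

```python
# Factor loadings as printed in the table above (Mkt-RF, SMB, HML, RMW, CMA).
betas = {
    "AAPL": [0.791173, -3.74438, -2.06079, 1.57347, 3.77877],
    "MSFT": [1.05123, -1.16769, -1.68714, 2.876, 1.96595],
    "GOOG": [0.0481774, -1.68188, -2.2483, 3.37034, 0.6877],
    "TSLA": [-2.34344, -9.60767, -1.77381, -2.31403, -4.76799],
}
# Hypothetical equal weights; they must sum to 1.
weights = {name: 0.25 for name in betas}

factor_names = ["Mkt-RF", "SMB", "HML", "RMW", "CMA"]
portfolio_betas = {
    factor: sum(weights[name] * betas[name][k] for name in betas)
    for k, factor in enumerate(factor_names)
}
for factor, beta in portfolio_betas.items():
    print(f"{factor}: {beta:.4f}")
```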

In [6]:
fff = FamaFrenchFactors()

fff5 = fff.load(factors=5, directory="fff", frequency="M")
fff5.head()

Unnamed: 0,Mkt-RF,SMB,HML,RMW,CMA,RF
1963-07-31,-0.0013,0.0011,-0.0003,-0.0013,0.003,0.00012
1963-08-31,0.0044,0.0015,-0.0013,0.0015,-0.0021,0.00011
1963-09-30,-0.006,0.0021,0.0008,0.0024,0.0013,0.00014
1963-10-31,0.0021,-0.0003,0.0008,0.001,-0.0026,0.00013
1963-11-30,0.0134,0.003,0.0029,-0.0031,-0.0015,0.00015


In [7]:
means = fff.mean_values()

In [8]:
means

const     1.000000
Mkt-RF    0.000799
SMB       0.001204
HML      -0.000043
RMW      -0.000530
CMA      -0.000082
RF        0.000172
dtype: float64

In [11]:
np.dot(means, pt.stocks[2].params)

0.010487174535701976
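
The dot product above is simply $\alpha \cdot 1 + \sum_k \beta_k \bar{F}_k + 1 \cdot \bar{R}_f$, where $\bar{F}_k$ are the mean factor returns. The sketch below spells it out with the rounded values displayed earlier, using the GOOG column of the parameter table (the third stock, index 2).

```python
# Rounded values copied from the outputs above.
# Order: const, Mkt-RF, SMB, HML, RMW, CMA, RF.
means = [1.0, 0.000799, 0.001204, -0.000043, -0.000530, -0.000082, 0.000172]
goog_params = [0.0140484, 0.0481774, -1.68188, -2.2483, 3.37034, 0.6877, 1.0]

# alpha * 1 + sum(beta_k * mean factor_k) + 1 * mean risk-free rate
expected_return = sum(m * p for m, p in zip(means, goog_params))
print(f"{expected_return:.4f}")
```

Since the factors are monthly, the result is an expected monthly return; it matches the `np.dot` output up to the rounding of the displayed inputs.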

In [14]:
pt["AAPL"].params

const     0.018537
Mkt-RF    0.791173
SMB      -3.744382
HML      -2.060791
RMW       1.573470
CMA       3.778772
rf        1.000000
dtype: float64

In [19]:
pt["AAPL"].fff.mean_values()

const     1.000000
Mkt-RF    0.000799
SMB       0.001204
HML      -0.000043
RMW      -0.000530
CMA      -0.000082
RF        0.000172
dtype: float64

In [22]:
means.iloc[[0,1,2,3,-1]]

const     1.000000
Mkt-RF    0.000799
SMB       0.001204
HML      -0.000043
RF        0.000172
dtype: float64

In [23]:
pt["AAPL"].params.iloc[[0,1,2,3,-1]]

const     0.018537
Mkt-RF    0.791173
SMB      -3.744382
HML      -2.060791
rf        1.000000
dtype: float64

In [7]:
isinstance(None, object)

True