# Imports

In [1]:
from pystock.portfolio import Portfolio, Stock
from pystock.models import Model
from pystock.fff import FamaFrenchFactors

In [2]:
import warnings

warnings.filterwarnings("ignore")

# The `FamaFrenchFactors` class

This class is used to download and load Fama French factors. Start by creating an instance of the class.

In [3]:
fff = FamaFrenchFactors()

## Working with the Fama-French Factors

### Downloading FFF

To download the factors, use the `download` function. It takes the following parameters:

    frequency : str, optional
        The frequency of the data. The default is "D".
    factors : int, optional
        The number of factors. The default is 3. Possible values are 3 and 5
    directory : str, optional
        The directory to save the file. The default is ".".
    overwrite : bool, optional
        Whether to overwrite the file if it already exists. The default is False.

`factors` has two possible values, 3 and 5.

In [5]:
file_path = fff.download(frequency="D", factors=5, directory=".", overwrite=True)

Downloading Fama French Factors. This may take about 10 seconds.
Download complete. File saved as fff_daily_5_factors.csv
Use load() to load the file as a pandas dataframe.


### `load` function

Once downloaded, the factors can be loaded with the `load` function, which takes the following parameters:

    directory : str, optional
        The directory to load the file from. The default is ".".
    frequency : str, optional
        The frequency of the data. The default is "M".
    factors : int, optional
        The number of factors. The default is 3. Possible values are 3 and 5
    preprocess : bool, optional
        Whether to preprocess the data. The default is True.

In [6]:
fff5 = fff.load(frequency="D", factors=5, directory=".", preprocess=True)

In [7]:
fff5

Unnamed: 0,Mkt-RF,SMB,HML,RMW,CMA,RF
1963-07-01,-0.0067,0.0002,-0.0035,0.0003,0.0013,0.00012
1963-07-02,0.0079,-0.0028,0.0028,-0.0008,-0.0021,0.00012
1963-07-03,0.0063,-0.0018,-0.0010,0.0013,-0.0025,0.00012
1963-07-05,0.0040,0.0009,-0.0028,0.0007,-0.0030,0.00012
1963-07-08,-0.0063,0.0007,-0.0020,-0.0027,0.0006,0.00012
...,...,...,...,...,...,...
2022-11-23,0.0063,-0.0025,-0.0094,-0.0073,-0.0057,0.00014
2022-11-25,-0.0002,0.0027,0.0044,-0.0016,0.0014,0.00014
2022-11-28,-0.0155,-0.0047,-0.0020,0.0032,0.0031,0.00014
2022-11-29,-0.0018,0.0035,0.0103,0.0019,0.0047,0.00014


These factors will be used by the `fff3` and `fff5` models later. For now, let's look at what else the `FamaFrenchFactors` class can do.

## Some More Functions

### Changing the Frequency

You can change the frequency of the factors using the `change_frequency` function. It takes just one parameter:

    frequency : str, optional
        The frequency of the data. The default is "D".

In [5]:
fff5_quarterly = fff.change_frequency(frequency="Q")
fff5_quarterly

Unnamed: 0,Mkt-RF,SMB,HML,RMW,CMA,RF
1963-09-30,-0.0157,-0.0052,0.0013,-0.0071,0.0029,0.0027
1963-12-31,0.0183,-0.0210,-0.0002,0.0003,-0.0007,0.0029
1964-03-31,0.0141,0.0123,0.0340,-0.0221,0.0322,0.0031
1964-06-30,0.0127,0.0029,0.0062,-0.0028,-0.0017,0.0030
1964-09-30,0.0269,-0.0034,0.0170,-0.0056,0.0062,0.0028
...,...,...,...,...,...,...
2021-09-30,-0.0437,0.0114,0.0508,-0.0190,0.0214,0.0000
2021-12-31,0.0310,-0.0077,0.0328,0.0492,0.0443,0.0001
2022-03-31,0.0305,-0.0215,-0.0180,-0.0156,0.0317,0.0001
2022-06-30,-0.0843,0.0130,-0.0597,0.0185,-0.0470,0.0006


> The function changes the frequency of the data in place and is meant for downsampling. If you upsample (for example, from monthly to daily), the results will be wrong: the function uses `ffill` to fill missing values, so going from monthly to daily repeats the same value for every day in the month.
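The caveat above can be reproduced with plain pandas. The sketch below is not pystock's implementation; it assumes only that upsampling is followed by a forward fill, as the note says, and the two month-end values are made up.

```python
import pandas as pd

# Made-up month-end factor values (illustrative only).
monthly = pd.Series(
    [0.010, 0.020],
    index=pd.to_datetime(["2022-01-31", "2022-02-28"]),
)

# Upsample to daily; forward fill copies the last known value onto every new day.
daily = monthly.resample("D").ffill()

# Every day before the next month-end carries the same repeated value.
print(daily.loc["2022-02-01":"2022-02-27"].nunique())
print(daily.tail(3))
```

No new information is created by the upsampling: the daily series still contains only the two original values.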


### Calculating Mean "Returns"

The Fama-French model needs the mean of each factor column to calculate the expected return of a stock. The class provides a function for this:

In [6]:
means = fff.calculate_mean_values()

In [7]:
means

const     1.000000
Mkt-RF    0.002920
SMB       0.004762
HML       0.001381
RMW       0.003614
CMA       0.002217
RF        0.003654
dtype: float64

Note that there is an extra value named `const`. This is here because the Fama-French model has a constant term. Using `mean` in this form makes it easy to calculate the expected return of a stock.
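To illustrate why the `const` entry is convenient, here is a minimal sketch of the arithmetic. The factor means are the ones printed above; the coefficients in `coefs` are hypothetical values chosen for illustration, not pystock output. Whether pystock adds `RF` back to the result is not shown in this document, so the sketch stops at the dot product.

```python
# Mean factor values as printed above.
means = {
    "const": 1.0,
    "Mkt-RF": 0.002920,
    "SMB": 0.004762,
    "HML": 0.001381,
    "RMW": 0.003614,
    "CMA": 0.002217,
}

# Hypothetical regression coefficients for some stock (assumed for illustration).
coefs = {
    "const": 0.001,
    "Mkt-RF": 1.1,
    "SMB": -0.2,
    "HML": -0.5,
    "RMW": 0.6,
    "CMA": 0.0,
}

# Because means["const"] is 1, the intercept is handled by the same dot product
# as the factor terms, with no special case.
expected_return = sum(coefs[name] * means[name] for name in coefs)
print(expected_return)
```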

# The `Stock` class


## Creating a `Stock` object


Start by creating a `Stock` object from a name and the path to its CSV file (the `Stock` class was imported from `pystock.portfolio` above):


In [3]:
apple = Stock("AAPL", "Data/AAPL.csv")
apple


Stock(name=AAPL)

Let's see what the `Stock` object has:


In [4]:
apple.__dict__


{'name': 'AAPL',
 'directory': 'Data/AAPL.csv',
 'loaded': False,
 'return_': {},
 'fff': <pystock.FFF.FamaFrenchFactors at 0x7f173634e850>}

`return_` is a dictionary that will hold the returns of the stock. Each value is either a float (the mean return) or a `pd.Series` of floats (the return for each period).


`fff` is a reference to a `FamaFrenchFactors` object; we will see later how it is used. `loaded` is `True` if the stock data has been loaded and `False` otherwise. Let's load the data. The `load_data` function takes a number of parameters:

    start_date : str, optional
        Start date of the data, by default None
    end_date : str, optional
        End date of the data, by default None
    columns : list, optional
        Columns to keep, by default None which means keep all columns
    frequency : str, optional
        Frequency of the data, by default "D"
    rename_cols : list, optional
        Columns to rename, by default None


The function returns a `pd.DataFrame` with the data. Let's see what the data looks like:


## `load_data` function


In [5]:
start_date = "2010-01-01"
end_date = "2022-12-20"
frequency = "D"
apple.load_data(start_date=start_date, end_date=end_date, frequency=frequency)
apple.__dict__.keys()


dict_keys(['name', 'directory', 'loaded', 'return_', 'fff', 'data', 'columns', 'start_date', 'end_date', 'frequency'])

The `Stock` object now has some more attributes. `data` is a `pd.DataFrame` with the data. `start_date` and `end_date` are the start and end dates of the data. `columns` is a list of the columns of the data. `frequency` is the frequency of the data.


In [6]:
apple.data.head()


Date,Open,High,Low,Close,Adj Close,Volume
2010-01-04,7.6225,7.660714,7.585,7.643214,6.515212,493729600
2010-01-05,7.664286,7.699643,7.616071,7.656429,6.526478,601904800
2010-01-06,7.656429,7.686786,7.526786,7.534643,6.422666,552160000
2010-01-07,7.5625,7.571429,7.466071,7.520714,6.41079,477131200
2010-01-08,7.510714,7.571429,7.466429,7.570714,6.453411,447610800


In [7]:
apple.loaded

True

As you can see, `loaded` is now equal to `True`.


## Working With Returns


Next, we'll calculate various returns using the object. For this, we have the `freq_return` function having the following parameters:

    frequency : str, optional
        Frequency of the data, by default "M"
    mean : bool, optional
        Whether to return the mean of the return, by default True
    column : str, optional
        Column to calculate the return, by default "Close"


In [31]:
daily_return_series = apple.freq_return(frequency="D", mean=False)
daily_return_avg = apple.freq_return(frequency="D", mean=True)
display(daily_return_series.head())
display(daily_return_avg)


Date
2010-01-05    0.001729
2010-01-06   -0.015906
2010-01-07   -0.001849
2010-01-08    0.006648
2010-01-09    0.000000
Freq: D, Name: Close, dtype: float64

0.0007156125449657148

In [32]:
monthly_return_series = apple.freq_return(frequency="M", mean=False)
monthly_return_avg = apple.freq_return(frequency="M", mean=True)
display(monthly_return_series.head())
display(monthly_return_avg)


Date
2010-02-28    0.065396
2010-03-31    0.148470
2010-04-30    0.111021
2010-05-31   -0.016125
2010-06-30   -0.020827
Freq: M, Name: Close, dtype: float64

0.02320634467521188

These returns are saved in the `return_` attribute of the object. Note that the dictionary is keyed by frequency, so each key holds whatever was calculated last for that frequency. Here that is the mean return, because the call with `mean=True` came last.


In [33]:
apple.return_


{'D': 0.0007156125449657148, 'M': 0.02320634467521188}
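The daily values printed above are consistent with simple percentage changes of the `Close` column. The sketch below checks this by hand on the first five closing prices from the table. It only illustrates the arithmetic and is not pystock's code.

```python
from statistics import mean

# First five daily closing prices of AAPL, copied from the table above.
close = [7.643214, 7.656429, 7.534643, 7.520714, 7.570714]

# Simple return: r_t = P_t / P_{t-1} - 1.
returns = [curr / prev - 1 for prev, curr in zip(close, close[1:])]

print([round(r, 6) for r in returns])
print(mean(returns))  # mean of this short window only, not of the full series
```

Rounded to six decimals, these should agree with the first four entries of the daily return series shown earlier.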

## Changing the frequency of the data


In [34]:
apple.frequency


'D'

In [35]:
apple.data.head()


Date,Open,High,Low,Close,Adj Close,Volume
2010-01-04,7.6225,7.660714,7.585,7.643214,6.515212,493729600
2010-01-05,7.664286,7.699643,7.616071,7.656429,6.526478,601904800
2010-01-06,7.656429,7.686786,7.526786,7.534643,6.422666,552160000
2010-01-07,7.5625,7.571429,7.466071,7.520714,6.41079,477131200
2010-01-08,7.510714,7.571429,7.466429,7.570714,6.453411,447610800


The data was loaded at daily frequency. To change it to another frequency, use the `change_frequency` function. It takes just one parameter:

    frequency : str
        Frequency of the data


In [36]:
apple.change_frequency("M")


In [37]:
apple.frequency


'M'

In [38]:
apple.data.head()


Date,Open,High,Low,Close,Adj Close,Volume
2010-01-31,7.181429,7.221429,6.794643,6.859286,5.846978,1245952400
2010-02-28,7.227857,7.3275,7.214286,7.307857,6.229348,507460800
2010-03-31,8.410357,8.450357,8.373571,8.392857,7.154222,430659600
2010-04-30,9.618214,9.663214,9.321429,9.324643,7.948493,542463600
2010-05-31,9.263929,9.264286,9.048214,9.174286,7.820327,815614800


> The function changes the frequency of the data in place and is meant for downsampling. If you upsample (for example, from monthly to daily), the results will be wrong: the function uses `ffill` to fill missing values, so going from monthly to daily repeats the same value for every day in the month.


## `Stock` object with `FamaFrenchFactors`


The `fff` attribute of a `Stock` object is a reference to a `FamaFrenchFactors` object, which is used to get the Fama-French factors. See the corresponding section for more details. Here, we'll give a brief overview of how to use it.


### `download_fff` function


This is a wrapper around `FamaFrenchFactors.download`. It takes the same parameters as `FamaFrenchFactors.download`, plus a few others such as `load`, and returns the same thing. It downloads the data from the Fama-French website. Again, see the corresponding section for more details.


The `load` parameter controls whether the downloaded data is also loaded into the `FamaFrenchFactors` object. If `load` is `True`, the data is loaded into the `fff` attribute of the `Stock` object. If `load` is `False`, the data is only downloaded, not loaded. This is useful if you want to download the data now and load it later.


In [21]:
fff_data = apple.download_fff(frequency="D", factors=5, directory="Data", load=True)


Downloading Fama French Factors. This may take about 10 seconds.
Download complete. File saved as Data/fff_daily_5_factors.csv
Use load() to load the file as a pandas dataframe.


Since we have used `load=True`, the data is loaded into the `fff` attribute of the `Stock` object.

In [23]:
apple.fff.data

Unnamed: 0,Mkt-RF,SMB,HML,RMW,CMA,RF
1963-07-01,-0.0067,0.0002,-0.0035,0.0003,0.0013,0.00012
1963-07-02,0.0079,-0.0028,0.0028,-0.0008,-0.0021,0.00012
1963-07-03,0.0063,-0.0018,-0.0010,0.0013,-0.0025,0.00012
1963-07-05,0.0040,0.0009,-0.0028,0.0007,-0.0030,0.00012
1963-07-08,-0.0063,0.0007,-0.0020,-0.0027,0.0006,0.00012
...,...,...,...,...,...,...
2022-11-23,0.0063,-0.0025,-0.0094,-0.0073,-0.0057,0.00014
2022-11-25,-0.0002,0.0027,0.0044,-0.0016,0.0014,0.00014
2022-11-28,-0.0155,-0.0047,-0.0020,0.0032,0.0031,0.00014
2022-11-29,-0.0018,0.0035,0.0103,0.0019,0.0047,0.00014


### `load_fff` function

This is a wrapper around `FamaFrenchFactors.load`. It takes the same parameters as `FamaFrenchFactors.load` and returns the same thing. It loads the data from a local file, if one exists.

In [9]:
apple.load_fff(frequency="D", factors=5, directory="Data")

Unnamed: 0,Mkt-RF,SMB,HML,RMW,CMA,RF
1963-07-01,-0.0067,0.0002,-0.0035,0.0003,0.0013,0.00012
1963-07-02,0.0079,-0.0028,0.0028,-0.0008,-0.0021,0.00012
1963-07-03,0.0063,-0.0018,-0.0010,0.0013,-0.0025,0.00012
1963-07-05,0.0040,0.0009,-0.0028,0.0007,-0.0030,0.00012
1963-07-08,-0.0063,0.0007,-0.0020,-0.0027,0.0006,0.00012
...,...,...,...,...,...,...
2022-11-23,0.0063,-0.0025,-0.0094,-0.0073,-0.0057,0.00014
2022-11-25,-0.0002,0.0027,0.0044,-0.0016,0.0014,0.00014
2022-11-28,-0.0155,-0.0047,-0.0020,0.0032,0.0031,0.00014
2022-11-29,-0.0018,0.0035,0.0103,0.0019,0.0047,0.00014


### Calculating Fama-French Factors

The factors can be calculated using the `calculate_fff` function. It takes the following parameters:

    column : str, optional
        Column to calculate the fama french factors on, by default "Close"
    verbose : int, optional
        Verbosity, by default 1

The function raises an error if either the `Stock` or the `FamaFrenchFactors` object is not loaded.

In [10]:
params = apple.calculate_fff(column = "Close")

Fama French Factors Calculated
                            OLS Regression Results                            
Dep. Variable:                      y   R-squared:                       0.534
Model:                            OLS   Adj. R-squared:                  0.534
Method:                 Least Squares   F-statistic:                     744.8
Date:                Sun, 01 Jan 2023   Prob (F-statistic):               0.00
Time:                        19:20:12   Log-Likelihood:                 9671.0
No. Observations:                3250   AIC:                        -1.933e+04
Df Residuals:                    3244   BIC:                        -1.929e+04
Df Model:                           5                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.0004

In [11]:
params

const     0.000370
Mkt-RF    1.176796
SMB      -0.176945
HML      -0.499524
RMW       0.594542
CMA      -0.015124
dtype: float64
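The summary printed above is an ordinary least-squares (OLS) fit with an intercept, which is where the `const` entry comes from. The sketch below shows the same idea with NumPy on synthetic data. It is not pystock's code: the factor values and `true_params` are made up, and this document does not state whether pystock regresses raw or excess returns.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic daily factor values (hypothetical, not real Fama-French data).
n_obs = 500
factors = rng.normal(0.0, 0.01, size=(n_obs, 5))  # Mkt-RF, SMB, HML, RMW, CMA

# Build stock returns from known loadings plus a little noise.
true_params = np.array([0.0004, 1.2, -0.2, -0.5, 0.6, 0.0])  # const first
X = np.column_stack([np.ones(n_obs), factors])  # the column of ones is the const term
y = X @ true_params + rng.normal(0.0, 0.001, size=n_obs)

# Ordinary least squares: find b that minimises ||X b - y||^2.
params, *_ = np.linalg.lstsq(X, y, rcond=None)

names = ["const", "Mkt-RF", "SMB", "HML", "RMW", "CMA"]
for name, value in zip(names, params):
    print(f"{name:>7} {value: .4f}")
```

The recovered coefficients should be close to `true_params`, in the same way that the values returned by `calculate_fff` estimate the loadings of the stock on each factor.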

# `Portfolio` class

The class represents a portfolio which has a list of stocks and a benchmark. You can also provide a weight for each stock. 

## Getting Started with `Portfolio`

To start, you must provide at least the directory of the benchmark data and its name. Stocks are optional at this point: you can pass a list of stock names and their directories now, or add them later. Weights can also be provided. If they are not (the default is `"equal"`), each stock gets a weight of 1/n, where n is the number of stocks.

    def __init__(self, benchmark_dir, benchmark_name, stocks_dir=None, stocks_name=None, weights=None):

In [2]:
benchmark_name = "S&P"
benchmark_dir = "Data/GSPC.csv"

portfolio = Portfolio(benchmark_dir=benchmark_dir, benchmark_name=benchmark_name)
portfolio

Portfolio(S&P,[])

In [3]:
len(portfolio)

1

The representation of the portfolio shows the name of the benchmark and the stocks in the portfolio. The length of the portfolio is the number of stocks in it, including the benchmark.

## Customizing the `Portfolio`

In [4]:
portfolio.benchmark.loaded

False

Right now, portfolio has just one unloaded benchmark and no stocks. Let's load the benchmark and add a stock.

### Loading The Benchmark

This can be done by using the `load_benchmark` function. It takes the following parameters:

    start_date : str, optional
        Start date of the data, by default None
    end_date : str, optional
        End date of the data, by default None
    columns : list, optional
        Columns to keep, by default None which means keep all columns
    frequency : str, optional
        Frequency of the data, by default "D"
    rename_cols : list, optional
        Columns to rename, by default None

In [5]:
start_date = "2012-01-01"
end_date = "2022-12-20"
frequency = "D"
portfolio.load_benchmark(start_date=start_date, end_date=end_date, frequency=frequency)

In [6]:
portfolio.benchmark.loaded

True

In [7]:
portfolio.benchmark.data.head()

Date,Open,High,Low,Close,Adj Close,Volume
2012-01-03,1258.859985,1284.619995,1258.859985,1277.060059,1277.060059,3943710000
2012-01-04,1277.030029,1278.72998,1268.099976,1277.300049,1277.300049,3592580000
2012-01-05,1277.300049,1283.050049,1265.26001,1281.060059,1281.060059,4315950000
2012-01-06,1280.930054,1281.839966,1273.339966,1277.810059,1277.810059,3656830000
2012-01-07,1280.930054,1281.839966,1273.339966,1277.810059,1277.810059,3656830000


>Alternatively, you can use the `Stock.load_data` function to load the benchmark data since benchmark is just a `Stock` object.

### Changing The Benchmark

You can also change the benchmark by using the `change_benchmark` function. It takes the following parameters:

    benchmark_dir : str
        Directory of the benchmark
    benchmark_name : str
        Name of the benchmark
    load : bool, optional
        Load the data, by default True
    use_prev : bool, optional
        Use the values of start_date, end_date, columns, frequency, rename_cols from the previous benchmark, by default True
    start_date : str, optional
        Start date, by default None
    end_date : str, optional
        End date, by default None
    columns : list, optional
        Columns to keep, by default None
    frequency : str, optional
        Frequency of the data, by default "D"
    rename_cols : list, optional
        Columns to rename, by default None

In [6]:
dji_name = "Dow_Jones"
dji_dir = "Data/DJI.csv"

portfolio.change_benchmark(benchmark_dir=dji_dir, benchmark_name=dji_name, load=True, use_prev=False)

In [7]:
portfolio

Portfolio(Dow_Jones,[])

In [8]:
portfolio.benchmark.loaded

True

### Adding A Stock

#### Quick Way

The class provides a function `add_stocks` to add a stock. It takes the following parameters:

    stock_dirs : list
        List of stock directories
    stock_names : list, optional
        List of stock names, by default None
    load_data : bool, optional
        Whether to load the data, by default True
    start_date : str, optional
        Start date, by default None
    end_date : str, optional
        End date, by default None
    columns : list, optional
        Columns to keep, by default None
    frequency : str, optional
        Frequency of the data, by default "D"
    rename_cols : list, optional
        Columns to rename, by default None
    overwrite : bool, optional
        Whether to overwrite existing stocks, by default False

The quickest way to add one or more stocks is to pass the `stock_dirs` and `stock_names` parameters. Let's see this in action:

In [9]:
stock_names = ["AAPL"]
stock_dirs = ["Data/AAPL.csv"]

portfolio.add_stocks(stock_dirs = stock_dirs, stock_names = stock_names, load_data=False, frequency=frequency, start_date=start_date, end_date=end_date)

> To add a single stock, pass its name and its directory each inside a list, as done here.

In [12]:
portfolio

Portfolio(Dow_Jones,['AAPL'])

#### Using `Stock` Object

Another way is to first create the `Stock` object and then add it using the same method.

In [10]:
google = Stock("GOOG", "Data/GOOG.csv")
portfolio.add_stocks(stocks=[google], load_data=False, frequency=frequency, start_date=start_date, end_date=end_date)

In [14]:
portfolio

Portfolio(Dow_Jones,['AAPL', 'GOOG'])

Now, our portfolio has one benchmark and two stocks.

In [15]:
portfolio.weights

array([0.5, 0.5])

You can see that `weights` has been adjusted.

### Adding Multiple Stocks

For this, just pass a list of stock directories and names. The `weights` will be adjusted accordingly. Or you can pass a list of `Stock` objects.

In [11]:
stock_names = ["TSLA", "MSFT"]
stock_dirs = ["Data/TSLA.csv", "Data/MSFT.csv"]

portfolio.add_stocks(stock_dirs = stock_dirs, stock_names = stock_names, load_data=False, frequency=frequency, start_date=start_date, end_date=end_date)

In [12]:
portfolio

Portfolio(Dow_Jones,['AAPL', 'GOOG', 'TSLA', 'MSFT'])

> Note that two `Stock` objects are considered equal if they have the same name, so you cannot have two stocks with the same name in the portfolio. If you try to add a stock whose name already exists, the existing stock is overwritten or the command is ignored, depending on the value of the `overwrite` parameter.

In [13]:
google = Stock("GOOG", "Data/GOOG.csv")

portfolio.add_stocks(stocks=[google], load_data=False, frequency=frequency, start_date=start_date, end_date=end_date, overwrite=False)

Stock GOOG already exists
You have not specified overwrite=True. Skipping...


In [14]:
portfolio

Portfolio(Dow_Jones,['AAPL', 'GOOG', 'TSLA', 'MSFT'])

In [15]:
google = Stock("GOOG", "Data/GOOG.csv")

portfolio.add_stocks(stocks=[google], load_data=False, frequency=frequency, start_date=start_date, end_date=end_date, overwrite=True)

Stock GOOG already exists
Overwriting...


In [16]:
portfolio

Portfolio(Dow_Jones,['AAPL', 'TSLA', 'MSFT', 'GOOG'])
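The behaviour shown above (skip when `overwrite=False`, replace and move to the end when `overwrite=True`) can be sketched in a few lines. `FakeStock` and `add_stock` are hypothetical stand-ins written for this illustration; they are not part of pystock, and the real implementation may differ.

```python
class FakeStock:
    """Stand-in for pystock's Stock; equality is based on the name only."""

    def __init__(self, name, directory):
        self.name = name
        self.directory = directory

    def __eq__(self, other):
        return isinstance(other, FakeStock) and self.name == other.name


def add_stock(stocks, new_stock, overwrite=False):
    """Hypothetical helper: return a new list with `new_stock` added."""
    if new_stock in stocks:  # membership test uses FakeStock.__eq__
        if not overwrite:
            return list(stocks)  # skip and keep the existing entry
        stocks = [s for s in stocks if s != new_stock]  # drop the old entry
    return list(stocks) + [new_stock]


stocks = [FakeStock(n, f"Data/{n}.csv") for n in ["AAPL", "GOOG", "TSLA", "MSFT"]]
duplicate = FakeStock("GOOG", "Data/GOOG.csv")

kept = add_stock(stocks, duplicate, overwrite=False)
replaced = add_stock(stocks, duplicate, overwrite=True)
print([s.name for s in kept])
print([s.name for s in replaced])
```

Dropping the old entry and appending the new one is one way to explain why `GOOG` appears last in the portfolio after the overwrite.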

### Removing Stocks

To remove a `Stock` from the `Portfolio`, use the `remove_stocks` function. It takes the following parameter:

    names : list
        A list of names of the stocks to remove

In [17]:
portfolio.remove_stocks(["GOOG"])

In [18]:
portfolio

Portfolio(Dow_Jones,['AAPL', 'TSLA', 'MSFT'])

### Change the Frequency of the Data

To change the frequency of the portfolio, use the `change_benchmark_frequency` function. It takes the following parameters:

    frequency : str
        Frequency of the data
    change_stocks : bool, optional
        Whether to change the frequency of the stock data, by default True

In [19]:
portfolio.benchmark.frequency

'D'

However, you can change the frequency only if you have loaded the data. If you have not loaded the data, then the function will throw an error.

In [21]:
portfolio.change_benchmark_frequency("M")

AttributeError: 'Stock' object has no attribute 'data'

Loading data will be covered in the next section. For now, as benchmark is already loaded, we will change the frequency of the benchmark.

In [23]:
portfolio.change_benchmark_frequency("M", change_stocks=False)

In [24]:
portfolio.benchmark.frequency

'M'

>Although you can get away with changing the frequency of the benchmark only, it is recommended to change the frequency of the stock data as well.

## Loading the Data

Often, when you run a function, you will get an exception saying "'Stock' object has no attribute 'data'". This happens because the `Stock` has not been loaded yet, which you can check with the `loaded` attribute of the `Stock` object.

In [30]:
for stock, name in portfolio:
    print(name, stock.loaded)

Dow_Jones True
AAPL False
TSLA False
MSFT False


We see that no stock data is loaded. Let's load the data.

> You can use the `Portfolio` as an iterator. Some more details about these special methods will be covered later.

There are three main functions for loading data. We have already discussed `load_benchmark`. The other two are discussed below.

### `load_one_stock`

As the name suggests, this loads data of one stock specified by the `name` parameter. The function is built on `Stock.load_data` function. It takes the following parameters:

    name : str
        Name of the stock
    start_date : str, optional
        Start date, by default None
    end_date : str, optional
        End date, by default None
    columns : list, optional
        Columns to keep, by default None
    frequency : str, optional
        Frequency of the data, by default "D"
    rename_cols : list, optional
        Columns to rename, by default None
    overwrite : bool, optional
        Whether to overwrite existing data, by default False

In [31]:
apple_data = portfolio.load_one_stock("AAPL", frequency=frequency, start_date=start_date, end_date=end_date)

In [32]:
for stock, name in portfolio:
    print(name, stock.loaded)

Dow_Jones True
AAPL True
TSLA False
MSFT False


The data of `AAPL` is now loaded. We get some more attributes by loading the data. See the `Stock` class for more details.

### `load_all`

As the name suggests, this loads data of all the stocks in the portfolio. It takes the following parameters:

    start_date : str, optional
        Start date, by default None
    end_date : str, optional
        End date, by default None
    columns : list, optional
        Columns to keep, by default None
    frequency : str, optional
        Frequency of the data, by default "D"
    rename_cols : list, optional
        Columns to rename, by default None
    overwrite : bool, optional
        Whether to overwrite existing data, by default False

Previously, we just loaded the apple data, now we'll load all the data.

In [33]:
portfolio.load_all(frequency=frequency, start_date=start_date, end_date=end_date)

In [34]:
for stock, name in portfolio:
    print(name, stock.loaded)

Dow_Jones True
AAPL True
TSLA True
MSFT True


Let's see the data of these stocks.

In [36]:
portfolio["AAPL"].data.head()

Date,Open,High,Low,Close,Adj Close,Volume
2012-01-03,14.621429,14.732143,14.607143,14.686786,12.519279,302220800
2012-01-04,14.642857,14.81,14.617143,14.765714,12.58656,260022000
2012-01-05,14.819643,14.948214,14.738214,14.929643,12.726295,271269600
2012-01-06,14.991786,15.098214,14.972143,15.085714,12.85933,318292800
2012-01-07,14.991786,15.098214,14.972143,15.085714,12.85933,318292800


In [37]:
portfolio["TSLA"].data.head()

Date,Open,High,Low,Close,Adj Close,Volume
2012-01-03,1.929333,1.966667,1.843333,1.872,1.872,13921500
2012-01-04,1.880667,1.911333,1.833333,1.847333,1.847333,9451500
2012-01-05,1.850667,1.862,1.79,1.808,1.808,15082500
2012-01-06,1.813333,1.852667,1.760667,1.794,1.794,14794500
2012-01-07,1.813333,1.852667,1.760667,1.794,1.794,14794500


## Special Methods

### Representation

The `Portfolio` class implements the `__repr__` method, which lets it represent the object in an understandable manner.

In [93]:
portfolio

Portfolio(Dow_Jones,['AAPL', 'TSLA', 'MSFT'])

You can see that the representation of `Portfolio` has the name of the benchmark and the list of the stocks. This lets us have a "peek" at the portfolio!

### String

You can "print" the `Portfolio` and it will give a peek of the portfolio:

In [95]:
print(portfolio)

Portfolio with benchmark Dow_Jones and stocks ['AAPL', 'TSLA', 'MSFT']


### Using the `in` Keyword

The `Portfolio` class implements the `__contains__` special method. This makes it easy to use the `in` keyword to check if a `Stock` is in the `Portfolio`. Use the stock name or the `Stock` object.

In [89]:
"AAPL" in portfolio, "TCS" in portfolio

(True, False)

In [90]:
portfolio.stocks[0] in portfolio

True

In [91]:
portfolio.benchmark in portfolio

True

### Using Subscripting

You can use the name of the stock to get the `Stock` from the `Portfolio` object:

In [92]:
portfolio["AAPL"]

Stock(name=AAPL)

In [94]:
portfolio["Dow_Jones"]

Stock(name=Dow_Jones)

### Iteration

You can iterate over the `Portfolio`.

In [97]:
for stock, name in portfolio:
    print(stock.name, name)

Dow_Jones Dow_Jones
AAPL AAPL
TSLA TSLA
MSFT MSFT


The `Portfolio` iterator yields the `Stock` and the name of the stock. Note that the first entry is that of the benchmark.

You can use the `list` constructor to create a list of (stock, name) pairs:

In [98]:
list(portfolio)

[(Stock(name=Dow_Jones), 'Dow_Jones'),
 (Stock(name=AAPL), 'AAPL'),
 (Stock(name=TSLA), 'TSLA'),
 (Stock(name=MSFT), 'MSFT')]
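For readers unfamiliar with Python's special methods, the following toy classes reproduce the behaviour demonstrated in this section. `MiniStock` and `MiniPortfolio` are hypothetical names used only here; this is a sketch of the protocol, not pystock's source.

```python
class MiniStock:
    """Stand-in for pystock's Stock with only a name."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"Stock(name={self.name})"


class MiniPortfolio:
    """Toy container mimicking the behaviour shown above."""

    def __init__(self, benchmark, stocks):
        self.benchmark = benchmark
        self.stocks = list(stocks)

    def _members(self):
        # The benchmark comes first, followed by the stocks.
        return [self.benchmark] + self.stocks

    def __len__(self):
        return len(self._members())

    def __contains__(self, item):
        # Accept either a name or an object with a `name` attribute.
        name = item if isinstance(item, str) else item.name
        return any(s.name == name for s in self._members())

    def __getitem__(self, name):
        for s in self._members():
            if s.name == name:
                return s
        raise KeyError(name)

    def __iter__(self):
        return ((s, s.name) for s in self._members())

    def __repr__(self):
        return f"Portfolio({self.benchmark.name},{[s.name for s in self.stocks]})"


p = MiniPortfolio(MiniStock("Dow_Jones"), [MiniStock(n) for n in ["AAPL", "TSLA", "MSFT"]])
print(p, len(p), "AAPL" in p, "TCS" in p, p["AAPL"])
```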

## Merging

Merging is necessary for calculating various stock parameters used in the portfolio optimization models. For this reason, we have a couple of methods.

### Merging With the Benchmark

This is necessary for calculating the $\alpha$ and $\beta$ parameters. Use the `merge_stock_with_benchmark` function for this.

In [38]:
merged = portfolio.merge_stock_with_benchmark("AAPL")

In [40]:
merged.head()

Date,Dow_Jones_Open,Dow_Jones_High,Dow_Jones_Low,Dow_Jones_Close,Dow_Jones_Adj Close,Dow_Jones_Volume,AAPL_Open,AAPL_High,AAPL_Low,AAPL_Close,AAPL_Adj Close,AAPL_Volume
2012-01-31,12632.900391,12632.900391,12632.900391,12632.900391,12632.900391,0,16.27107,16.365713,16.18107,16.302856,13.896848,391683600
2012-02-29,12952.099609,12952.099609,12952.099609,12952.099609,12952.099609,0,19.341429,19.557501,19.132143,19.372856,16.513771,952011200
2012-03-31,13212.0,13212.0,13212.0,13212.0,13212.0,0,21.741785,21.805714,21.355,21.4125,18.252399,731038000
2012-04-30,13213.599609,13213.599609,13213.599609,13213.599609,13213.599609,0,21.35,21.371429,20.821428,20.856428,17.778395,506144800
2012-05-31,12393.5,12393.5,12393.5,12393.5,12393.5,0,20.740713,20.767857,20.409286,20.633215,17.58812,491674400


> When merging, it is recommended that you keep only the columns that will be required later. Usually "Close" is the only useful column, so it is a good idea to load just this column when calling the `load_all` method.
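The merged frame above has each column prefixed with the name of its source and the rows aligned on the date index. A minimal pandas sketch of that idea follows. The prices are made up, only a "Close" column is kept, and using `add_prefix` with `join` is an assumption about how such a merge could be done, not a description of pystock's internals.

```python
import pandas as pd

dates = pd.to_datetime(["2012-01-31", "2012-02-29", "2012-03-31"])

# Made-up closing prices; only the "Close" column is kept.
benchmark = pd.DataFrame({"Close": [100.0, 102.0, 101.0]}, index=dates)
stock = pd.DataFrame({"Close": [10.0, 11.0, 12.0]}, index=dates)

# Prefix each frame's columns with its name, then align rows on the date index.
merged = benchmark.add_prefix("Dow_Jones_").join(stock.add_prefix("AAPL_"), how="inner")
print(merged.columns.tolist())
```

With a single column per stock, the merged frame has one column per member of the portfolio instead of six.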

### Merge Everything

Use the `merge_all` function for this. This merges all the stocks with benchmark. Note that all stocks must be loaded.

In [41]:
merged_all = portfolio.merge_all()

In [42]:
merged_all.columns

Index(['Dow_Jones_Open', 'Dow_Jones_High', 'Dow_Jones_Low', 'Dow_Jones_Close',
       'Dow_Jones_Adj Close', 'Dow_Jones_Volume', 'AAPL_Open', 'AAPL_High',
       'AAPL_Low', 'AAPL_Close', 'AAPL_Adj Close', 'AAPL_Volume', 'TSLA_Open',
       'TSLA_High', 'TSLA_Low', 'TSLA_Close', 'TSLA_Adj Close', 'TSLA_Volume',
       'MSFT_Open', 'MSFT_High', 'MSFT_Low', 'MSFT_Close', 'MSFT_Adj Close',
       'MSFT_Volume'],
      dtype='object')

Since we kept all the columns while loading, `merge_all` produces a large number of columns.

## Returning the Return

The return of a stock is one of its most important features. The `Portfolio` class provides a number of ways to get it.

Of course, you can get the return by calling the methods built into the `Stock` object. Here, we'll discuss the methods of the `Portfolio` object.

> Both of these methods, as well as most of the methods discussed below, take a `column` parameter that dictates which column to use when calculating the corresponding values. The default is "Close" and you should usually not change it. An exception is when you want to use "Adj Close". Even then, it is recommended that you rename the column from "Adj Close" to "Close" while loading the data.

### Return of A Single Stock

This can be determined using the `get_stock_return` method. As usual, pass the name of the stock. The method also takes a `frequency` parameter.

In [62]:
apple_return, apple_std = portfolio.get_stock_return("AAPL")

In [65]:
apple_return, apple_std

(0.02034460996923982, 0.08107760447739829)

>The methods of this object return an average return. If you want a series of returns, use the methods of the `Stock` object.
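As a rough illustration of what a (mean, standard deviation) pair summarises, the sketch below computes both from a short list of monthly returns. The five values are the first monthly returns of AAPL printed in the `Stock` section, used here only as sample input. Whether pystock uses the sample or the population standard deviation is not stated in this document, so `stdev` (sample) is an assumption.

```python
from statistics import mean, stdev

# First five monthly returns of AAPL from the `Stock` section (sample input only).
monthly_returns = [0.065396, 0.148470, 0.111021, -0.016125, -0.020827]

avg = mean(monthly_returns)
std = stdev(monthly_returns)  # sample standard deviation (n - 1 denominator), assumed

print(avg, std)
```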

### Return of All Stocks

Use the `get_all_stock_returns` function!

In [66]:
monthly_returns = portfolio.get_all_stock_returns()
monthly_returns

Unnamed: 0,Stock,Monthly_Mean_Return,Monthly_Return_STD
0,AAPL,0.020345,0.081078
1,TSLA,0.050261,0.182803
2,MSFT,0.018516,0.060729


### What About the Whole Portfolio?

Well, you can use the `portfolio_return` method to get this. The function gives a weighted return. You can also specify the `weights`.

In [68]:
portfolio_return_equal, _ = portfolio.portfolio_return()
portfolio_return_equal

0.029707288531672652

In [70]:
portfolio_return_just_apple, _ = portfolio.portfolio_return(weights=[1,0,0])
portfolio_return_just_apple

0.02034460996923982
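
The two results above are consistent with a plain weighted average of the per-stock mean returns. The sketch below redoes the arithmetic with the rounded values from the `get_all_stock_returns` table; `weighted_return` is a hypothetical helper, not a pystock function.

```python
# Monthly mean returns copied from the `get_all_stock_returns` table above.
mean_returns = {"AAPL": 0.020345, "TSLA": 0.050261, "MSFT": 0.018516}

def weighted_return(returns, weights):
    """Weighted average of per-stock mean returns."""
    if len(returns) != len(weights):
        raise ValueError("need one weight per stock")
    return sum(r * w for r, w in zip(returns, weights))

values = list(mean_returns.values())
equal = [1 / len(values)] * len(values)

equal_weighted = weighted_return(values, equal)
only_apple = weighted_return(values, [1, 0, 0])
print(round(equal_weighted, 6), only_apple)
```

With equal weights this gives about 0.029707, and with all the weight on AAPL it gives AAPL's own mean return, in line with the outputs above.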

## Calculating the Stock Parameters

### alpha and beta

These two parameters are required for the CAPM and SIM models. There are two methods for calculating them:

#### `get_stock_params`

This function returns the parameters for one stock identified by the name of the stock.

In [47]:
tesla_alpha, tesla_beta = portfolio.get_stock_params("TSLA")

In [48]:
print(tesla_alpha, tesla_beta)

0.04086664252883307 1.713736456801432


#### `get_all_stock_params`

This returns parameters for all the stocks in the portfolio.

In [54]:
alpha_beta_all = portfolio.get_all_stock_params(return_dict=False, column="Close")

In [55]:
alpha_beta_all

Unnamed: 0,Stock,Alpha,Beta
0,AAPL,0.013671,0.976056
1,TSLA,0.040867,1.713736
2,MSFT,0.013301,0.858644


After calling `get_all_stock_params`, the `alpha` and `beta` of a stock can also be accessed through the attributes of that stock.

In [59]:
for stock in portfolio.stocks:
    print(stock.name, stock.alpha, stock.beta)

AAPL 0.013670770282234219 0.9760561904155619
TSLA 0.04086664252883307 1.713736456801432
MSFT 0.013301247231778873 0.8586435486025358


The parameters can also be accessed directly from the `Portfolio`:

In [60]:
portfolio.alphas, portfolio.betas

([0.013670770282234219, 0.04086664252883307, 0.013301247231778873],
 [0.9760561904155619, 1.713736456801432, 0.8586435486025358])
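
If you are curious where such numbers come from, alpha and beta are conventionally the intercept and slope of a least-squares line fitted to stock returns against benchmark returns. The following is a dependency-free sketch of that textbook regression on hypothetical data. Whether pystock regresses raw or excess returns is not stated here, so treat the helper as an illustration only.

```python
def alpha_beta(stock_returns, benchmark_returns):
    """Least-squares fit of stock = alpha + beta * benchmark."""
    n = len(stock_returns)
    if n != len(benchmark_returns) or n < 2:
        raise ValueError("need two equally long series with 2+ points")
    mean_s = sum(stock_returns) / n
    mean_b = sum(benchmark_returns) / n
    # Slope = covariance(stock, benchmark) / variance(benchmark).
    cov = sum((s - mean_s) * (b - mean_b)
              for s, b in zip(stock_returns, benchmark_returns))
    var_b = sum((b - mean_b) ** 2 for b in benchmark_returns)
    beta = cov / var_b
    alpha = mean_s - beta * mean_b
    return alpha, beta

# Hypothetical returns built so that stock = 0.01 + 1.5 * benchmark exactly.
benchmark = [0.02, -0.01, 0.03, 0.00, -0.02]
stock = [0.01 + 1.5 * b for b in benchmark]

alpha, beta = alpha_beta(stock, benchmark)
print(round(alpha, 6), round(beta, 6))
```

Because the data were constructed from a known line, the fit recovers an alpha of 0.01 and a beta of 1.5.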

### Summary!

The `Portfolio` object has a `summary` method which prints a summary of the portfolio. The method accepts `frequency`, `weights` and `column`, all of which can be left at their defaults:

In [86]:
portfolio.summary()

Portfolio Summary
*****************

Portfolio with benchmark Dow_Jones and stocks ['AAPL', 'TSLA', 'MSFT']
Here are the summary of stocks in the portfolio
+----+---------+-----------------------+----------------------+-----------+----------+----------+
|    | Stock   |   Monthly_Mean_Return |   Monthly_Return_STD |     Alpha |     Beta |   Weight |
|----+---------+-----------------------+----------------------+-----------+----------+----------|
|  0 | AAPL    |             0.0203446 |            0.0810776 | 0.0136708 | 0.976056 | 0.333333 |
|  1 | TSLA    |             0.0502608 |            0.182803  | 0.0408666 | 1.71374  | 0.333333 |
|  2 | MSFT    |             0.0185164 |            0.0607293 | 0.0133012 | 0.858644 | 0.333333 |
+----+---------+-----------------------+----------------------+-----------+----------+----------+
The covariance matrix is as follows
+------+------------+------------+------------+
|      |       AAPL |       TSLA |       MSFT |
|------+------------+-----

> If you are feeling lazy and don't want to call several methods to calculate the `return`, `alpha` and `beta`, you can just call the `summary` method; it calculates all of these values under the hood!

> The calculation of FFF parameters, however, is not included in the `summary` method. These calculations are more involved, and unless you want to optimize the portfolio using the `fff3` or `fff5` model, you don't need them.

### FFF Parameters

To use the Fama–French three-factor or five-factor model, you need the corresponding factor parameters. As usual, there are two methods for this:

#### `calculate_fff_params_one`

This calculates the FFF params for the given stock. You can pass the name of the stock or the stock itself. The function uses the `Stock.load_fff` method to load the FFF data. See the corresponding section for more detail.

In [79]:
apple_fff5 = portfolio.calculate_fff_params_one("AAPL", frequency="M", factors=5, directory="Data")

In [75]:
apple_fff5

const     0.005514
Mkt-RF    1.197614
SMB      -0.258640
HML      -0.513560
RMW       0.750830
CMA      -0.181558
rf        1.000000
dtype: float64
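
Coefficients like these are typically obtained by regressing a stock's returns on the factor columns. Below is a dependency-free sketch of such a multi-factor least-squares fit on hypothetical data with three factors. The helpers `solve_linear` and `ols` are made up for this illustration, and the `rf` entry of exactly 1 in the output above is not something the sketch reproduces. In practice you would more likely reach for `numpy.linalg.lstsq` or a statistics package.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def ols(factor_rows, y):
    """Least-squares coefficients [const, b1, b2, ...] via normal equations."""
    rows = [[1.0] + list(f) for f in factor_rows]    # prepend the intercept
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve_linear(xtx, xty)

# Hypothetical factor rows in the order (Mkt-RF, SMB, HML).
factor_rows = [
    (0.01, 0.00, 0.02), (-0.02, 0.01, 0.00), (0.03, -0.01, 0.01),
    (0.00, 0.02, -0.01), (0.02, 0.01, -0.02), (-0.01, -0.02, 0.01),
]
# Returns generated from known coefficients, so the fit can be checked.
true_coefs = [0.01, 1.2, -0.3, 0.5]
stock_returns = [
    true_coefs[0] + sum(c * f for c, f in zip(true_coefs[1:], row))
    for row in factor_rows
]

coefs = ols(factor_rows, stock_returns)
print([round(c, 6) for c in coefs])
```

The recovered coefficients match the ones used to generate the data, which is a quick sanity check that the fit behaves as a regression should.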

#### `calculate_fff_params`

This calculates the FFF params for every stock in the portfolio.

In [80]:
all_ff5 = portfolio.calculate_fff_params(frequency="M", factors=5, directory="Data", verbose=0)

Done. Here are the parameters
+-------------+------------+-------------+
|        AAPL |       TSLA |        MSFT |
|-------------+------------+-------------|
|  0.00551378 |  0.0341999 |  0.00794491 |
|  1.19761    |  1.87135   |  0.993504   |
| -0.25864    | -0.379263  | -0.787259   |
| -0.51356    | -0.610234  |  0.0173204  |
|  0.75083    | -1.45574   | -0.100924   |
| -0.181558   | -0.497976  | -0.529687   |
|  1          |  1         |  1          |
+-------------+------------+-------------+


Once you have calculated the FFF parameters, you can access them with the `params` attribute of the `Stock` object.

In [82]:
portfolio["AAPL"].params

const     0.005514
Mkt-RF    1.197614
SMB      -0.258640
HML      -0.513560
RMW       0.750830
CMA      -0.181558
rf        1.000000
dtype: float64

Or use the `stock_params` attribute of the `Portfolio` object:

In [84]:
portfolio.stock_params

{'AAPL': const     0.005514
 Mkt-RF    1.197614
 SMB      -0.258640
 HML      -0.513560
 RMW       0.750830
 CMA      -0.181558
 rf        1.000000
 dtype: float64,
 'TSLA': const     0.034200
 Mkt-RF    1.871348
 SMB      -0.379263
 HML      -0.610234
 RMW      -1.455743
 CMA      -0.497976
 rf        1.000000
 dtype: float64,
 'MSFT': const     0.007945
 Mkt-RF    0.993504
 SMB      -0.787259
 HML       0.017320
 RMW      -0.100924
 CMA      -0.529687
 rf        1.000000
 dtype: float64}

#### The Mean Values

These values are required when calculating the expected stock return using the `fff3` or `fff5` model. If you have called the `calculate_fff_params_one` or `calculate_fff_params` method, you don't need to do anything else: the mean values have already been calculated and can be accessed through the `mean_values` attribute. If you have not called at least one of these methods, well, call it!

In [85]:
portfolio.mean_values

const     1.000000
Mkt-RF    0.005592
SMB       0.002236
HML       0.003101
RMW       0.002819
CMA       0.002955
RF        0.003621
dtype: float64

# `Model` Class

This class has methods to optimize the portfolio. The class is built on top of the `Portfolio` class. Let's get started!

## Getting Started With `Model`

Let's instantiate the model:

In [3]:
model = Model("M")

The only parameters which the `Model` expects are the `frequency` and `risk_free_rate`.

### Creating a Portfolio

The easiest way to get started with `Model` is by using the `create_portfolio` method. This method creates a portfolio by using the `benchmark_dir`, `benchmark_name`, `stock_dirs`, and `stock_names`. The method accepts some other parameters which are necessary to create a `Portfolio`.

In [5]:
benchmark_dir = "Data/GSPC.csv"
benchmark_name = "S&P"

stock_dirs = ["Data/AAPL.csv", "Data/MSFT.csv", "Data/GOOG.csv", "Data/TSLA.csv"]
stock_names = ["AAPL", "MSFT", "GOOG", "TSLA"]

frequency = "M"
start_date = "2012-01-01"
end_date = "2022-12-20"

portfolio = model.create_portfolio(
    benchmark_dir=benchmark_dir,
    benchmark_name=benchmark_name,
    stock_dirs=stock_dirs,
    stock_names=stock_names,
    frequency=frequency,
    start_date=start_date,
    end_date=end_date
)


Loading benchmark...
Loading stocks...
Calculating other results...
Portfolio Summary
*****************

Portfolio with benchmark S&P and stocks ['AAPL', 'MSFT', 'GOOG', 'TSLA']
Here are the summary of stocks in the portfolio
+----+---------+-----------------------+----------------------+------------+----------+----------+
|    | Stock   |   Monthly_Mean_Return |   Monthly_Return_STD |      Alpha |     Beta |   Weight |
|----+---------+-----------------------+----------------------+------------+----------+----------|
|  0 | AAPL    |             0.0216164 |            0.0813903 | 0.00979829 | 1.22897  |     0.25 |
|  1 | MSFT    |             0.0202463 |            0.0607759 | 0.010973   | 0.964346 |     0.25 |
|  2 | GOOG    |             0.0171436 |            0.0645638 | 0.00711404 | 1.04298  |     0.25 |
|  3 | TSLA    |             0.0502608 |            0.182803  | 0.0336704  | 1.72526  |     0.25 |
+----+---------+-----------------------+----------------------+------------+-----

The `create_portfolio` method returns the `Portfolio` object. It does all the work of loading the data, merging the data and calculating the parameters. If your goal is to optimize the portfolio using the `capm` or `sim` model, you don't need to do anything else. Just call the `optimize_portfolio` method.

This method, by default, loads just the "Adj Close" column and renames it to "Close".

Though this method is enough for many tasks, it is not the recommended way to use the module. You should create a `Portfolio` object yourself and then add it to the `Model` object with one of the methods below.

### Adding a Portfolio

Start by creating a `Portfolio` object. Then, use the `add_portfolio` method to add it to the `Model` object.

In [7]:
benchmark_dir = "Data/GSPC.csv"
benchmark_name = "S&P"

stock_dirs = ["Data/AAPL.csv", "Data/MSFT.csv", "Data/GOOG.csv", "Data/TSLA.csv"]
stock_names = ["AAPL", "MSFT", "GOOG", "TSLA"]

frequency = "M"
pt = Portfolio(benchmark_dir, benchmark_name, stock_dirs, stock_names)
start_date = "2012-01-01"
end_date = "2022-12-20"
pt.load_benchmark(
    columns=["Adj Close"],
    rename_cols=["Close"],
    start_date=start_date,
    end_date=end_date,
    frequency=frequency,
)
pt.load_all(
    columns=["Adj Close"],
    rename_cols=["Close"],
    start_date=start_date,
    end_date=end_date,
    frequency=frequency,
)

Let's print the portfolio summary:

In [8]:
pt.summary()

Portfolio Summary
*****************

Portfolio with benchmark S&P and stocks ['AAPL', 'MSFT', 'GOOG', 'TSLA']
Here are the summary of stocks in the portfolio
+----+---------+-----------------------+----------------------+------------+----------+----------+
|    | Stock   |   Monthly_Mean_Return |   Monthly_Return_STD |      Alpha |     Beta |   Weight |
|----+---------+-----------------------+----------------------+------------+----------+----------|
|  0 | AAPL    |             0.0216164 |            0.0813903 | 0.00979829 | 1.22897  |     0.25 |
|  1 | MSFT    |             0.0202463 |            0.0607759 | 0.010973   | 0.964346 |     0.25 |
|  2 | GOOG    |             0.0171436 |            0.0645638 | 0.00711404 | 1.04298  |     0.25 |
|  3 | TSLA    |             0.0502608 |            0.182803  | 0.0336704  | 1.72526  |     0.25 |
+----+---------+-----------------------+----------------------+------------+----------+----------+
The covariance matrix is as follows
+------+------

Note that you need to calculate the FFF parameters explicitly if you want to use the FFF models. Let's do that:

In [9]:
pt.calculate_fff_params(frequency="M", factors=5, directory="Data", verbose=0)

Done. Here are the parameters
+-------------+-------------+-------------+------------+
|        AAPL |        MSFT |        GOOG |       TSLA |
|-------------+-------------+-------------+------------|
|  0.00682682 |  0.00970946 |  0.00699103 |  0.0341999 |
|  1.19736    |  0.994033   |  1.01023    |  1.87135   |
| -0.249423   | -0.776582   | -0.572982   | -0.379263  |
| -0.512566   |  0.0184882  |  0.199086   | -0.610234  |
|  0.749702   | -0.103125   | -0.107732   | -1.45574   |
| -0.201821   | -0.548657   | -0.867854   | -0.497976  |
|  1          |  1          |  1          |  1         |
+-------------+-------------+-------------+------------+


Great! Now you can optimize the portfolio. But there is another method which we need to discuss.

### Updating a Portfolio

The `Model` object accepts just one `Portfolio`. You can update the portfolio with another one:

In [10]:
benchmark_dir = "Data/GSPC.csv"
benchmark_name = "S&P"

stock_dirs = ["Data/AAPL.csv", "Data/MSFT.csv", "Data/GOOG.csv"]
stock_names = ["AAPL", "MSFT", "GOOG"]

frequency = "M"
pt2 = Portfolio(benchmark_dir, benchmark_name, stock_dirs, stock_names)
start_date = "2012-01-01"
end_date = "2022-12-20"
pt2.load_benchmark(
    columns=["Adj Close"],
    rename_cols=["Close"],
    start_date=start_date,
    end_date=end_date,
    frequency=frequency,
)
pt2.load_all(
    columns=["Adj Close"],
    rename_cols=["Close"],
    start_date=start_date,
    end_date=end_date,
    frequency=frequency,
)

In [11]:
model.portfolio

Portfolio(S&P,['AAPL', 'MSFT', 'GOOG', 'TSLA'])

In [14]:
model.update_portfolio(pt2, weights="equal")

Adding portfolio...
Portfolio Summary
*****************

Portfolio with benchmark S&P and stocks ['AAPL', 'MSFT', 'GOOG']
Here are the summary of stocks in the portfolio
+----+---------+-----------------------+----------------------+------------+----------+----------+
|    | Stock   |   Monthly_Mean_Return |   Monthly_Return_STD |      Alpha |     Beta |   Weight |
|----+---------+-----------------------+----------------------+------------+----------+----------|
|  0 | AAPL    |             0.0216164 |            0.0813903 | 0.00979829 | 1.22897  | 0.333333 |
|  1 | MSFT    |             0.0202463 |            0.0607759 | 0.010973   | 0.964346 | 0.333333 |
|  2 | GOOG    |             0.0171436 |            0.0645638 | 0.00711404 | 1.04298  | 0.333333 |
+----+---------+-----------------------+----------------------+------------+----------+----------+
The covariance matrix is as follows
+------+------------+------------+------------+
|      |       AAPL |       MSFT |       GOOG |
|----

In [15]:
model.portfolio

Portfolio(S&P,['AAPL', 'MSFT', 'GOOG'])

The function calls the `Portfolio.summary()` method to make the model ready for optimization.

### The `load_portfolio` Function

Suppose you created a `Portfolio` but have not loaded the data yet, and you then attached it to the `Model` by setting the `portfolio` attribute. You can load the benchmark and stock data through the `Model.portfolio` attribute, or you can use the `load_portfolio` method, which does all of this.

In [19]:
benchmark_dir = "Data/GSPC.csv"
benchmark_name = "S&P"

stock_dirs = ["Data/AAPL.csv", "Data/MSFT.csv", "Data/GOOG.csv"]
stock_names = ["AAPL", "MSFT", "GOOG"]

frequency = "M"
pt2 = Portfolio(benchmark_dir, benchmark_name, stock_dirs, stock_names)
start_date = "2012-01-01"
end_date = "2022-12-20"

model = Model("M")

model.portfolio = pt2

In [20]:
model.portfolio

Portfolio(S&P,['AAPL', 'MSFT', 'GOOG'])

In [21]:
model.load_portfolio(
    columns=["Adj Close"],
    rename_cols=["Close"],
    start_date=start_date,
    end_date=end_date,
    frequency=frequency,
)

Loading benchmark...
Loading stocks...
Calculating other results...
Portfolio Summary
*****************

Portfolio with benchmark S&P and stocks ['AAPL', 'MSFT', 'GOOG']
Here are the summary of stocks in the portfolio
+----+---------+-----------------------+----------------------+------------+----------+----------+
|    | Stock   |   Monthly_Mean_Return |   Monthly_Return_STD |      Alpha |     Beta |   Weight |
|----+---------+-----------------------+----------------------+------------+----------+----------|
|  0 | AAPL    |             0.0216164 |            0.0813903 | 0.00979829 | 1.22897  | 0.333333 |
|  1 | MSFT    |             0.0202463 |            0.0607759 | 0.010973   | 0.964346 | 0.333333 |
|  2 | GOOG    |             0.0171436 |            0.0645638 | 0.00711404 | 1.04298  | 0.333333 |
+----+---------+-----------------------+----------------------+------------+----------+----------+
The covariance matrix is as follows
+------+------------+------------+------------+
|    

## Optimization

Before optimizing the portfolio, suppose you want to try some `weights` and see how the return and risk change. Or you just want to see the expected return of a stock based on its calculated parameters. The `Model` has some methods for this. Let's create a model:

In [12]:
benchmark_dir = "Data/GSPC.csv"
benchmark_name = "S&P"

stock_dirs = ["Data/AAPL.csv", "Data/MSFT.csv", "Data/GOOG.csv", "Data/TSLA.csv"]
stock_names = ["AAPL", "MSFT", "GOOG", "TSLA"]

frequency = "M"
pt = Portfolio(benchmark_dir, benchmark_name, stock_dirs, stock_names)
start_date = "2012-01-01"
end_date = "2022-12-20"
pt.load_benchmark(
    columns=["Adj Close"],
    rename_cols=["Close"],
    start_date=start_date,
    end_date=end_date,
    frequency=frequency,
)
pt.load_all(
    columns=["Adj Close"],
    rename_cols=["Close"],
    start_date=start_date,
    end_date=end_date,
    frequency=frequency,
)

In [13]:
model = Model()
model.add_portfolio(pt, weights="equal")

Adding portfolio...
Portfolio Summary
*****************

Portfolio with benchmark S&P and stocks ['AAPL', 'MSFT', 'GOOG', 'TSLA']
Here are the summary of stocks in the portfolio
+----+---------+-----------------------+----------------------+------------+----------+----------+
|    | Stock   |   Monthly_Mean_Return |   Monthly_Return_STD |      Alpha |     Beta |   Weight |
|----+---------+-----------------------+----------------------+------------+----------+----------|
|  0 | AAPL    |             0.0216164 |            0.0813903 | 0.00979829 | 1.22897  |     0.25 |
|  1 | MSFT    |             0.0202463 |            0.0607759 | 0.010973   | 0.964346 |     0.25 |
|  2 | GOOG    |             0.0171436 |            0.0645638 | 0.00711404 | 1.04298  |     0.25 |
|  3 | TSLA    |             0.0502608 |            0.182803  | 0.0336704  | 1.72526  |     0.25 |
+----+---------+-----------------------+----------------------+------------+----------+----------+
The covariance matrix is as fo

We won't calculate the FFF parameters just yet.

### `expected_return_of_stock`

This function returns what the name says:

In [5]:
exp_return = model.expected_return_of_stock(pt["AAPL"], model="capm")
exp_return



1.1054838377097305

In [6]:
exp_return = model.expected_return_of_stock(pt["AAPL"], model="sim")
exp_return



1.115282131986719
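
For reference, the textbook forms of the two models are short enough to write out. The sketch below uses hypothetical inputs and is not meant to reproduce the numbers above, since the market return and risk-free rate that pystock uses internally are not shown here. Both function names are made up for this illustration.

```python
def capm_expected_return(beta, market_return, risk_free_rate):
    """Textbook CAPM: rf + beta * (E[Rm] - rf)."""
    return risk_free_rate + beta * (market_return - risk_free_rate)

def sim_expected_return(alpha, beta, market_return):
    """Textbook single-index model: alpha + beta * E[Rm]."""
    return alpha + beta * market_return

# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
beta = 1.2
alpha = 0.0005
market_return = 0.01     # assumed mean monthly benchmark return
risk_free_rate = 0.002   # assumed monthly risk-free rate

capm = capm_expected_return(beta, market_return, risk_free_rate)
sim = sim_expected_return(alpha, beta, market_return)
print(round(capm, 6), round(sim, 6))
```

In these textbook forms the two models give exactly the same answer when `alpha` equals `risk_free_rate * (1 - beta)`, so a fitted alpha near that value leads to similar expected returns.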

The returns from the `capm` and `sim` models are almost the same. Let's try the FFF models. As the warning message says, we first have to do the FFF calculations.

In [16]:
pt.calculate_fff_params(frequency="M", factors=5, directory="Data", verbose=0)

Done. Here are the parameters
+-------------+-------------+-------------+------------+
|        AAPL |        MSFT |        GOOG |       TSLA |
|-------------+-------------+-------------+------------|
|  0.00682682 |  0.00970946 |  0.00699103 |  0.0341999 |
|  1.19736    |  0.994033   |  1.01023    |  1.87135   |
| -0.249423   | -0.776582   | -0.572982   | -0.379263  |
| -0.512566   |  0.0184882  |  0.199086   | -0.610234  |
|  0.749702   | -0.103125   | -0.107732   | -1.45574   |
| -0.201821   | -0.548657   | -0.867854   | -0.497976  |
|  1          |  1          |  1          |  1         |
+-------------+-------------+-------------+------------+


In [8]:
exp_return = model.expected_return_of_stock(pt["AAPL"], model="fff5")
exp_return

1.65139661893182
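
The `fff5` figure above can be approximately reproduced by multiplying each AAPL coefficient by the matching factor mean and summing. The sketch below does this with the rounded values printed in this document. It assumes the `mean_values` printed earlier also apply to this portfolio, and it is an observation about the printed numbers rather than a description of pystock's internals.

```python
# AAPL five-factor coefficients, copied from the parameter table above.
params = {
    "const": 0.00682682, "Mkt-RF": 1.19736, "SMB": -0.249423,
    "HML": -0.512566, "RMW": 0.749702, "CMA": -0.201821, "rf": 1.0,
}
# Factor means copied from the `mean_values` output shown earlier. The
# risk-free mean is labelled "RF" there and is paired with "rf" here.
mean_values = {
    "const": 1.0, "Mkt-RF": 0.005592, "SMB": 0.002236,
    "HML": 0.003101, "RMW": 0.002819, "CMA": 0.002955, "rf": 0.003621,
}

# Expected return = sum over terms of coefficient * mean factor value.
expected = sum(params[name] * mean_values[name] for name in params)
print(f"{expected * 100:.4f}%")
```

With these rounded inputs the sum comes to about 1.651%, which is close to the value printed above.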

In [9]:
exp_return = model.expected_return_of_stock(pt["AAPL"], model="fff3")
exp_return

1.499680184234527

So, the FFF models predict higher returns!

### `portfolio_info`

This method returns the expected return and risk of the portfolio given the `weights` and `model`:

In [14]:
model.portfolio

Portfolio(S&P,['AAPL', 'MSFT', 'GOOG', 'TSLA'])

In [18]:
weights = "equal"
model_ = "capm"
exp_return, variance, _ = model.portfolio_info(weights=weights, model=model_)
print(f"Expected Return: {exp_return:.2f}%")
print(f"Expected Variance: {variance:.2f}")

Expected Return: 1.11%
Expected Variance: 0.55


In [19]:
weights = "equal"
model_ = "fff5"
exp_return, variance, _ = model.portfolio_info(weights=weights, model=model_)
print(f"Expected Return: {exp_return:.2f}%")
print(f"Expected Variance: {variance:.2f}")

Expected Return: 2.11%
Expected Variance: 0.55


In [20]:
weights = [0.2, 0.2, 0.2, 0.4]
model_ = "fff5"
exp_return, variance, _ = model.portfolio_info(weights=weights, model=model_)
print(f"Expected Return: {exp_return:.2f}%")
print(f"Expected Variance: {variance:.2f}")

Expected Return: 2.49%
Expected Variance: 0.86


### Plotting the Portfolio Frontier

If you have just two stocks in your portfolio, you can use the `portfolio_frontier` method to plot the portfolio frontier with a model.

In [23]:
pt.remove_stocks(["TSLA", "MSFT"])

In [24]:
model.portfolio

Portfolio(S&P,['AAPL', 'GOOG'])

As you have deleted two stocks, you need to call `summary` again to recalculate the params.

In [27]:
pt.summary()

Portfolio Summary
*****************

Portfolio with benchmark S&P and stocks ['AAPL', 'GOOG']
Here are the summary of stocks in the portfolio
+----+---------+-----------------------+----------------------+------------+---------+----------+
|    | Stock   |   Monthly_Mean_Return |   Monthly_Return_STD |      Alpha |    Beta |   Weight |
|----+---------+-----------------------+----------------------+------------+---------+----------|
|  0 | AAPL    |             0.0216164 |            0.0813903 | 0.00979829 | 1.22897 |      0.5 |
|  1 | GOOG    |             0.0171436 |            0.0645638 | 0.00711404 | 1.04298 |      0.5 |
+----+---------+-----------------------+----------------------+------------+---------+----------+
The covariance matrix is as follows
+------+------------+------------+
|      |       AAPL |       GOOG |
|------+------------+------------|
| AAPL | 0.00662438 | 0.00215109 |
| GOOG | 0.00215109 | 0.00416849 |
+------+------------+------------+
Portfolio Return: 0.0193

You will also need to delete the series of calculated FFF params for these stocks.

In [35]:
del pt.stock_params["MSFT"]
del pt.stock_params["TSLA"]

In [36]:
model.portfolio.stock_params

{'AAPL': const     0.006827
 Mkt-RF    1.197356
 SMB      -0.249423
 HML      -0.512566
 RMW       0.749702
 CMA      -0.201821
 rf        1.000000
 dtype: float64,
 'GOOG': const     0.006991
 Mkt-RF    1.010230
 SMB      -0.572982
 HML       0.199086
 RMW      -0.107732
 CMA      -0.867854
 rf        1.000000
 dtype: float64}

In [29]:
model.portfolio_frontier(model="capm")

In [37]:
model.portfolio_frontier(model="sim")

In [30]:
model.portfolio_frontier(model="fff3")

In [38]:
model.portfolio_frontier(model="fff5")

The `fff3` frontier comes out very different from the others.
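
To see what a two-stock frontier consists of, you can sweep the weight of one stock from 0 to 1 and record the variance and return at each step. The sketch below does that with the AAPL and GOOG means and covariances printed in the two-stock summary above. It uses the historical means rather than any model's expected returns, so it only approximates the idea behind `portfolio_frontier`.

```python
# Monthly means and covariance entries copied from the two-stock summary above.
means = {"AAPL": 0.0216164, "GOOG": 0.0171436}
var_aapl, var_goog, cov_ag = 0.00662438, 0.00416849, 0.00215109

def frontier_point(w_aapl):
    """Return (variance, expected return) with w_aapl in AAPL, rest in GOOG."""
    w_goog = 1 - w_aapl
    exp_return = w_aapl * means["AAPL"] + w_goog * means["GOOG"]
    variance = (w_aapl ** 2 * var_aapl + w_goog ** 2 * var_goog
                + 2 * w_aapl * w_goog * cov_ag)
    return variance, exp_return

# Sweep the AAPL weight from 0% to 100% in 1% steps: (weight, variance, return).
frontier = [(i / 100, *frontier_point(i / 100)) for i in range(101)]

# The leftmost point of the curve is the minimum-variance mix.
w_min, var_min, ret_min = min(frontier, key=lambda p: p[1])
print(f"min variance {var_min:.6f} at AAPL weight {w_min:.2f}")
```

Plotting the second element of each tuple against the third would give the familiar sideways parabola.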

### Optimization (at last!)

Okay, let's optimize the following `Portfolio`:

In [39]:
benchmark_dir = "Data/GSPC.csv"
benchmark_name = "S&P"

stock_dirs = ["Data/AAPL.csv", "Data/MSFT.csv", "Data/GOOG.csv", "Data/TSLA.csv"]
stock_names = ["AAPL", "MSFT", "GOOG", "TSLA"]

frequency = "M"
pt = Portfolio(benchmark_dir, benchmark_name, stock_dirs, stock_names)
start_date = "2012-01-01"
end_date = "2022-12-20"
pt.load_benchmark(
    columns=["Adj Close"],
    rename_cols=["Close"],
    start_date=start_date,
    end_date=end_date,
    frequency=frequency,
)
pt.load_all(
    columns=["Adj Close"],
    rename_cols=["Close"],
    start_date=start_date,
    end_date=end_date,
    frequency=frequency,
)

In [40]:
model = Model()
model.add_portfolio(pt, weights="equal")

Adding portfolio...
Portfolio Summary
*****************

Portfolio with benchmark S&P and stocks ['AAPL', 'MSFT', 'GOOG', 'TSLA']
Here are the summary of stocks in the portfolio
+----+---------+-----------------------+----------------------+------------+----------+----------+
|    | Stock   |   Monthly_Mean_Return |   Monthly_Return_STD |      Alpha |     Beta |   Weight |
|----+---------+-----------------------+----------------------+------------+----------+----------|
|  0 | AAPL    |             0.0216164 |            0.0813903 | 0.00979829 | 1.22897  |     0.25 |
|  1 | MSFT    |             0.0202463 |            0.0607759 | 0.010973   | 0.964346 |     0.25 |
|  2 | GOOG    |             0.0171436 |            0.0645638 | 0.00711404 | 1.04298  |     0.25 |
|  3 | TSLA    |             0.0502608 |            0.182803  | 0.0336704  | 1.72526  |     0.25 |
+----+---------+-----------------------+----------------------+------------+----------+----------+
The covariance matrix is as fo

In [41]:
pt.calculate_fff_params(frequency="M", factors=5, directory="Data", verbose=0)

Done. Here are the parameters
+-------------+-------------+-------------+------------+
|        AAPL |        MSFT |        GOOG |       TSLA |
|-------------+-------------+-------------+------------|
|  0.00682682 |  0.00970946 |  0.00699103 |  0.0341999 |
|  1.19736    |  0.994033   |  1.01023    |  1.87135   |
| -0.249423   | -0.776582   | -0.572982   | -0.379263  |
| -0.512566   |  0.0184882  |  0.199086   | -0.610234  |
|  0.749702   | -0.103125   | -0.107732   | -1.45574   |
| -0.201821   | -0.548657   | -0.867854   | -0.497976  |
|  1          |  1          |  1          |  1         |
+-------------+-------------+-------------+------------+


All set! All you need now is to call `optimize_portfolio` with the `model`, `risk` and `can_short` parameters. You may call `portfolio_info` first with default parameters to get an idea of how much risk to consider.

In [42]:
model.portfolio_info()

(1.112657603301926, 0.550881209951722, 0.742213722556867)

It seems that the variance of the `Portfolio` with "equal" `weights` is 0.551. Let's see what the maximum return is at about that level of risk.

In [47]:
def get_return(risk, can_short):
    models = ["capm", "sim", "fff3", "fff5"]
    for m in models:
        print(f"Optimizing for -> {m.upper()}")
        _ = model.optimize_portfolio(m, risk=risk, can_short=can_short)
        print()

In [48]:
risk = 0.5
can_short = False
get_return(risk, can_short)

Optimizing for -> CAPM
Optimized successfully.
Expected return: 1.1155%
Variance: 0.5000%
Expected weights:
--------------------
AAPL: 47.20%
MSFT: 0.00%
GOOG: 36.08%
TSLA: 16.73%

Optimizing for -> SIM
Optimized successfully.
Expected return: 1.1283%
Variance: 0.5000%
Expected weights:
--------------------
AAPL: 46.20%
MSFT: 0.00%
GOOG: 36.81%
TSLA: 17.00%

Optimizing for -> FFF3
Optimized successfully.
Expected return: 2.3360%
Variance: 0.5000%
Expected weights:
--------------------
AAPL: 0.00%
MSFT: 53.08%
GOOG: 23.86%
TSLA: 23.06%

Optimizing for -> FFF5
Optimized successfully.
Expected return: 2.0628%
Variance: 0.5000%
Expected weights:
--------------------
AAPL: 15.42%
MSFT: 53.74%
GOOG: 9.06%
TSLA: 21.78%



So, the FFF3 model gives the best return, 2.336%, for the weights:

    AAPL: 0.00%
    MSFT: 53.08%
    GOOG: 23.86%
    TSLA: 23.06%
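
The kind of problem solved above (maximize the expected return subject to a cap on the variance, without shorting) can be sketched on a small scale with a brute-force grid search. The example uses the two-stock AAPL and GOOG figures printed in the frontier section. The cap of 0.005 is assumed to correspond to `risk=0.5`, since the output reports variance in percent. This is not how pystock optimizes; it only shows the shape of the problem.

```python
means = [0.0216164, 0.0171436]            # AAPL, GOOG monthly mean returns
cov = [[0.00662438, 0.00215109],
       [0.00215109, 0.00416849]]

def stats(w_aapl):
    """Expected return and variance for a long-only AAPL/GOOG mix."""
    w = [w_aapl, 1 - w_aapl]
    ret = sum(wi * mi for wi, mi in zip(w, means))
    var = sum(w[i] * cov[i][j] * w[j] for i in range(2) for j in range(2))
    return ret, var

def max_return_under_cap(variance_cap, steps=100):
    """Best (weight, return, variance) on the grid, or None if infeasible."""
    best = None
    for i in range(steps + 1):
        w = i / steps
        ret, var = stats(w)
        if var <= variance_cap and (best is None or ret > best[1]):
            best = (w, ret, var)
    return best

best = max_return_under_cap(0.005)
print(best)
```

Because AAPL has the higher mean return here, the search pushes the AAPL weight up until the variance cap becomes binding, which is why the optimized variances above sit right at the requested risk.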

Let's allow shorting:

In [49]:
risk = 0.5
can_short = True
get_return(risk, can_short)

Optimizing for -> CAPM
Optimized successfully.
Expected return: 1.1165%
Variance: 0.5001%
Expected weights:
--------------------
AAPL: 48.81%
MSFT: -8.60%
GOOG: 44.26%
TSLA: 15.53%

Optimizing for -> SIM
Optimized successfully.
Expected return: 1.1288%
Variance: 0.5001%
Expected weights:
--------------------
AAPL: 47.40%
MSFT: -5.97%
GOOG: 42.38%
TSLA: 16.19%

Optimizing for -> FFF3
Optimized successfully.
Expected return: 2.3391%
Variance: 0.5000%
Expected weights:
--------------------
AAPL: -5.51%
MSFT: 56.67%
GOOG: 25.98%
TSLA: 22.86%

Optimizing for -> FFF5
Optimized successfully.
Expected return: 2.0628%
Variance: 0.5000%
Expected weights:
--------------------
AAPL: 15.42%
MSFT: 53.74%
GOOG: 9.06%
TSLA: 21.78%



A very small increase in the maximum return is observed (2.339%) for

    AAPL: -5.51%
    MSFT: 56.67%
    GOOG: 25.98%
    TSLA: 22.86%

Let's increase the risk to 1:

In [50]:
risk = 1
can_short = False
get_return(risk, can_short)

Optimizing for -> CAPM
Optimized successfully.
Expected return: 1.2233%
Variance: 1.0000%
Expected weights:
--------------------
AAPL: 62.21%
MSFT: 0.00%
GOOG: 0.00%
TSLA: 37.79%

Optimizing for -> SIM
Optimized successfully.
Expected return: 1.2421%
Variance: 1.0001%
Expected weights:
--------------------
AAPL: 61.20%
MSFT: 0.00%
GOOG: 0.74%
TSLA: 38.06%

Optimizing for -> FFF3
Optimized successfully.
Expected return: 3.0121%
Variance: 1.0000%
Expected weights:
--------------------
AAPL: 0.00%
MSFT: 47.74%
GOOG: 6.33%
TSLA: 45.92%

Optimizing for -> FFF5
Optimized successfully.
Expected return: 2.6580%
Variance: 1.0000%
Expected weights:
--------------------
AAPL: 10.72%
MSFT: 44.08%
GOOG: 0.00%
TSLA: 45.20%



The maximum return increases (as it should). The new maximum return is 3.0121% for

    AAPL: 0.00%
    MSFT: 47.74%
    GOOG: 6.33%
    TSLA: 45.92%

In [51]:
risk = 1
can_short = True
get_return(risk, can_short)

Optimizing for -> CAPM
Optimized successfully.
Expected return: 1.2434%
Variance: 1.0001%
Expected weights:
--------------------
AAPL: 76.10%
MSFT: -57.41%
GOOG: 49.23%
TSLA: 32.08%

Optimizing for -> SIM
Optimized successfully.
Expected return: 1.2591%
Variance: 0.9999%
Expected weights:
--------------------
AAPL: 73.48%
MSFT: -52.53%
GOOG: 45.76%
TSLA: 33.30%

Optimizing for -> FFF3
Optimized successfully.
Expected return: 3.0453%
Variance: 0.9999%
Expected weights:
--------------------
AAPL: -24.65%
MSFT: 63.64%
GOOG: 15.33%
TSLA: 45.68%

Optimizing for -> FFF5
Optimized successfully.
Expected return: 2.6661%
Variance: 1.0000%
Expected weights:
--------------------
AAPL: 14.17%
MSFT: 58.19%
GOOG: -16.04%
TSLA: 43.68%



Allowing shorting does not have a very large effect.

Finally, we'll consider a very small risk.

In [52]:
risk = 0.1
can_short = False
get_return(risk, can_short)

Optimizing for -> CAPM
Optimization failed. Positive directional derivative for linesearch
Here are the last results:
Expected return: 0.9828%
Variance: 0.2992%
Expected weights:
--------------------
AAPL: 14.90%
MSFT: 47.07%
GOOG: 38.03%
TSLA: 0.00%

Optimizing for -> SIM
Optimization failed. Positive directional derivative for linesearch
Here are the last results:
Expected return: 0.9921%
Variance: 0.2992%
Expected weights:
--------------------
AAPL: 14.90%
MSFT: 47.07%
GOOG: 38.03%
TSLA: 0.00%

Optimizing for -> FFF3
Optimization failed. Positive directional derivative for linesearch
Here are the last results:
Expected return: 1.6268%
Variance: 0.2992%
Expected weights:
--------------------
AAPL: 14.90%
MSFT: 47.07%
GOOG: 38.03%
TSLA: 0.00%

Optimizing for -> FFF5
Optimization failed. Positive directional derivative for linesearch
Here are the last results:
Expected return: 1.4503%
Variance: 0.2992%
Expected weights:
--------------------
AAPL: 14.90%
MSFT: 47.07%
GOOG: 38.03%
TSLA: 

The model cannot optimize for this low a risk. The best result is:

    Expected return: 1.6268%
    Variance: 0.2992%
    Expected weights:
    AAPL: 14.90%
    MSFT: 47.07%
    GOOG: 38.03%
    TSLA: 0.00%
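
One plausible reading of this failure is that the requested variance lies below the lowest variance any long-only mix of these stocks can reach, so no set of weights satisfies the constraint. The sketch below illustrates that idea for two assets, using the AAPL and GOOG covariance entries printed in the frontier section and the standard closed-form minimum-variance weight. The target of 0.001 is assumed to correspond to `risk=0.1`. The "Positive directional derivative for linesearch" text matches one of the exit messages of SciPy's SLSQP solver, so pystock may rely on it, but that is an inference from the output.

```python
def min_variance_two_assets(var_a, var_b, cov_ab):
    """Long-only minimum-variance weight in asset A and the resulting variance."""
    denom = var_a + var_b - 2 * cov_ab
    w_a = (var_b - cov_ab) / denom           # unconstrained optimum
    w_a = min(1.0, max(0.0, w_a))            # clamp to long-only weights
    w_b = 1 - w_a
    variance = w_a ** 2 * var_a + w_b ** 2 * var_b + 2 * w_a * w_b * cov_ab
    return w_a, variance

# AAPL/GOOG covariance entries copied from the two-stock summary earlier.
w_aapl, floor = min_variance_two_assets(0.00662438, 0.00416849, 0.00215109)

target = 0.001   # assumed to correspond to risk=0.1 (variance shown in percent)
print(f"lowest variance {floor:.6f} at AAPL weight {w_aapl:.3f}")
print("target reachable:", target >= floor)
```

For this pair the floor is roughly 0.0035, well above the target. Running a check like this before optimizing can tell you whether a given `risk` is attainable at all.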