# Investment Simulation

## Methodology

We hypothesize that the asset allocation decision follows a multi-factor model with the following factors:  
1) Market Capitalization.  
2) Book-to-market Ratio.  
3) Past 2 to 12 month return.  
4) Past 1 month return.  
5) Past 13 to 60 month return.  
6) Stock return variance.  
7) Operating profitability.  
8) Investment.  
9) Accruals.  
10) CAPM beta.  
11) Net share issuance.  

The objective is an asset allocation strategy that allocates according to the fitted coefficients of these factors. We therefore model the expected return premium as a linear function of the factors.  

Let $X_i$ be the factors.  

The expected return of each asset is a key input to our model. The expected return premium $E[R_i - R_m]$ of each asset is given by:
  
\begin{equation}
    \begin{split}
        E[R_i - R_m] =& \gamma_0 + \gamma_1 X_1 + \gamma_2 X_2 + \dots + \gamma_n X_n \\
                     =& \gamma_0 + \sum_{i=1}^n \gamma_i X_i  \\
    \end{split}
\end{equation}
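
As a rough numerical illustration of the equation above, the premium for every asset can be evaluated with a single matrix product. The intercept, coefficients, and factor exposures below are made-up values for three hypothetical factors, not the fitted ones used later in this notebook.

```python
import numpy as np

# Hypothetical values for illustration only (not the fitted coefficients).
gamma_0 = 0.001                           # intercept
gamma = np.array([0.002, -0.001, 0.004])  # one coefficient per factor

# Rows are assets, columns are the factor exposures X_1..X_3.
X = np.array([
    [1.0, 0.0, 0.5],
    [-0.5, 1.0, 0.0],
])

# gamma_0 plus the coefficient-weighted sum of factors, for every asset at once.
premium = gamma_0 + X @ gamma
print(premium)
```

Adding the expected market return to each entry would turn these premia into expected asset returns.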
  
To arrive at the coefficients for these factors, we will train our model based on 250 different 72-month blocks of the data.
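
The fitting code is not part of this notebook, so the following is only a sketch of one common way to estimate an intercept and coefficients for a linear model of this form: ordinary least squares via `np.linalg.lstsq`. The data are synthetic, the variable names are hypothetical, and nothing here should be read as the procedure actually used on the 72-month blocks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data for illustration: 200 observations of 3 hypothetical factors.
n_obs, n_factors = 200, 3
X = rng.normal(size=(n_obs, n_factors))
true_gamma_0 = 0.001
true_gamma = np.array([0.002, -0.001, 0.004])
y = true_gamma_0 + X @ true_gamma + rng.normal(scale=1e-4, size=n_obs)

# Prepend a column of ones so the intercept is estimated with the slopes.
design = np.column_stack([np.ones(n_obs), X])
estimate, *_ = np.linalg.lstsq(design, y, rcond=None)

fitted_gamma_0, fitted_gamma = estimate[0], estimate[1:]
print(fitted_gamma_0, fitted_gamma)
```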

The input parameters for our model are:  
1) Expected monthly returns for each fund.  
2) Monthly data of the 14 factors indicated above.  
3) Expected returns for each fund based on monthly data.  
4) Standard deviation for each fund based on monthly data.  

For weight optimization, we use the Sequential Least Squares Programming (SLSQP) method via $scipy.optimize.minimize()$.
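
To show the shape of an SLSQP call before it is applied to the funds, here is a minimal sketch on a toy problem: the minimum-variance mix of two uncorrelated hypothetical assets under a sum-to-one constraint. The covariance values and function names are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

# Toy covariance matrix for two uncorrelated hypothetical assets.
cov = np.array([[0.04, 0.0],
                [0.0, 0.01]])

def portfolio_variance(w):
    return w @ cov @ w

# An 'eq' constraint must evaluate to zero at a feasible point.
constraints = [{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}]

res = minimize(portfolio_variance, x0=np.array([0.5, 0.5]),
               method="SLSQP", constraints=constraints,
               options={"ftol": 1e-12})
print(res.x)
```

For this toy case the analytic answer weights each asset inversely to its variance, that is 0.2 and 0.8, which gives a simple check on the solver output.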

In [1]:
from IPython.core.interactiveshell import InteractiveShell

InteractiveShell.ast_node_interactivity = "all"

## Data Importing  

First order of business is to import the historical performance data of the assets we are trying to allocate:

In [2]:
import pandas as pd
import numpy as np

# Load the Excel Sheet
fn = r'Round_1.xls'

xl = pd.ExcelFile(fn)

dfs = {sh: xl.parse(sh) for sh in xl.sheet_names}  # Read each sheet into a dict

# Assign each sheet (dict) to a separate dataframe
dfReturns = dfs['returns']
dfChar = dfs['characteristics']

# Clean up characteristic sheet
dfChar = dfChar.T  # Transpose data so funds' factors end up in rows
dfChar.columns = dfChar.iloc[0]  # Set name for columns to be factor name
dfChar.drop(dfChar.index[0], inplace=True)  # Drop the leftover header row created by the transpose
dfChar

Unnamed: 0,Market capitalization,Book-to-market ratio,Past 2 to 12 month return,Past 1 month return,Past 13 to 60 month return,Stock return variance,Operating profitability,Investment,Accruals,CAPM beta,Net share issuance
Fund 1,-0.6,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
Fund 2,-0.6,-0.2,0.04,0.004,0.1,0.0,-0.05,0.05,0.0,0.0,0.04
Fund 3,0.4,-0.3,0.06,0.006,0.15,0.0,-0.075,0.075,-0.3,0.0,0.06
Fund 4,-0.6,1.0,-0.2,-0.02,-0.5,0.0,0.25,-0.25,0.0,0.0,-0.2
Fund 5,0.6,-0.3,0.06,0.006,0.15,0.0,-0.075,0.075,0.5,0.0,0.06
Fund 6,1.0,-0.3,0.1,0.0,0.1,0.0,0.0,1.0,0.0,0.0,0.2
Fund 7,-0.6,-0.2,1.0,0.0,0.2,0.1,0.2,0.2,0.0,0.0,0.1
Fund 8,0.6,0.06,0.0,0.0,0.0,-0.6,0.0,0.0,0.0,-0.18,0.0
Fund 9,0.2,0.21,-0.25,0.0,0.04,-0.03,0.44,-0.06,0.0,0.0,-0.03
Fund 10,-0.6,-0.09,-0.25,0.0,-0.01,-0.03,-0.06,0.44,0.0,0.0,0.07
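
As a possible alternative to the transpose-then-relabel steps above, the name column can be made the index before transposing. The sketch below uses a tiny stand-in frame rather than the real sheet, and assumes the factor names sit in a first column called `Unnamed: 0`, as the printed table suggests. One practical difference is that the values stay numeric, whereas transposing while the text column is still present tends to produce object-typed columns.

```python
import pandas as pd

# Tiny stand-in for the raw sheet: factor names in the first column, one column per fund.
raw = pd.DataFrame({
    "Unnamed: 0": ["Market capitalization", "Book-to-market ratio"],
    "Fund 1": [-0.6, 0.0],
    "Fund 2": [-0.6, -0.2],
})

# Index by factor name first, then transpose so each fund becomes a row.
tidy = raw.set_index("Unnamed: 0").T
tidy.columns.name = None  # drop the leftover "Unnamed: 0" label

print(tidy)
print(tidy.dtypes)
```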


Next, we inspect the data and calculate the variance-covariance matrix.

In [3]:
histR = dfReturns.loc[:, slice("Fund 1", "Fund 10")]  # Keep only the fund return columns
histR.head()
# Note: np.cov treats each row as a variable, so without a transpose this is a
# month-by-month matrix; the fund-by-fund matrix is built later with .T
vcovHistRMat = np.cov(histR)
vcovHistRMat

Unnamed: 0,Fund 1,Fund 2,Fund 3,Fund 4,Fund 5,Fund 6,Fund 7,Fund 8,Fund 9,Fund 10
0,0.0315,0.0492,0.082652,0.0284,0.076272,0.0516,0.0373,0.03782,0.0499,0.0406
1,0.0161,0.0145,0.003718,0.0075,-0.007507,-0.0342,0.0212,-0.02865,-0.00295,0.01165
2,0.011,0.0162,0.01895,-0.0027,0.017551,0.0358,0.0221,0.011466,0.004,-0.01625
3,-0.0329,-0.0324,-0.039429,-0.0353,-0.033744,-0.028,-0.0318,-0.037986,-0.0393,-0.04665
4,-0.024,-0.0036,0.020851,-0.0044,0.001634,0.0113,0.0139,0.004149,-0.0254,-0.04695


array([[ 3.27176865e-04, -6.99425383e-05,  9.83658782e-05, ...,
         2.66842795e-04, -7.58903333e-05,  3.75715021e-05],
       [-6.99425383e-05,  3.53335513e-04, -1.07897875e-04, ...,
        -4.48946388e-04, -1.82612144e-04, -4.26115224e-04],
       [ 9.83658782e-05, -1.07897875e-04,  2.04920228e-04, ...,
         3.73950966e-04,  2.80721838e-05,  2.39309457e-04],
       ...,
       [ 2.66842795e-04, -4.48946388e-04,  3.73950966e-04, ...,
         1.13005344e-03,  1.27878979e-04,  7.51289431e-04],
       [-7.58903333e-05, -1.82612144e-04,  2.80721838e-05, ...,
         1.27878979e-04,  2.53328052e-04,  2.73012338e-04],
       [ 3.75715021e-05, -4.26115224e-04,  2.39309457e-04, ...,
         7.51289431e-04,  2.73012338e-04,  8.00702385e-04]])
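
By default `np.cov` treats each row as a variable and each column as an observation, which is consistent with the large, truncated matrix printed above. The sketch below uses synthetic returns (hypothetical sizes: 60 months, 10 funds) to show how the orientation changes the shape of the result.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic returns: 60 months (rows) for 10 hypothetical funds (columns).
returns = rng.normal(scale=0.03, size=(60, 10))

cov_rows_as_variables = np.cov(returns)    # default rowvar=True: one variable per month
cov_funds = np.cov(returns, rowvar=False)  # one variable per fund

print(cov_rows_as_variables.shape)
print(cov_funds.shape)
```

Passing `rowvar=False` is equivalent to transposing the input first, which is the form needed for a fund-by-fund covariance matrix.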

In [4]:
dfReturns.head()

Unnamed: 0.1,Unnamed: 0,RMRF,RSMB,RHML,RF,Fund 1,Fund 2,Fund 3,Fund 4,Fund 5,Fund 6,Fund 7,Fund 8,Fund 9,Fund 10
0,Month 1,0.040151,0.014519,-0.034433,0.003875,0.0315,0.0492,0.082652,0.0284,0.076272,0.0516,0.0373,0.03782,0.0499,0.0406
1,Month 2,-0.022998,0.026868,-0.017723,0.003706,0.0161,0.0145,0.003718,0.0075,-0.007507,-0.0342,0.0212,-0.02865,-0.00295,0.01165
2,Month 3,0.013343,-0.022135,-0.023471,0.003892,0.011,0.0162,0.01895,-0.0027,0.017551,0.0358,0.0221,0.011466,0.004,-0.01625
3,Month 4,-0.040294,-0.001657,-0.000332,0.003779,-0.0329,-0.0324,-0.039429,-0.0353,-0.033744,-0.028,-0.0318,-0.037986,-0.0393,-0.04665
4,Month 5,0.008569,0.000378,0.002764,0.004416,-0.024,-0.0036,0.020851,-0.0044,0.001634,0.0113,0.0139,0.004149,-0.0254,-0.04695


In [5]:
dfChar

Unnamed: 0,Market capitalization,Book-to-market ratio,Past 2 to 12 month return,Past 1 month return,Past 13 to 60 month return,Stock return variance,Operating profitability,Investment,Accruals,CAPM beta,Net share issuance
Fund 1,-0.6,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
Fund 2,-0.6,-0.2,0.04,0.004,0.1,0.0,-0.05,0.05,0.0,0.0,0.04
Fund 3,0.4,-0.3,0.06,0.006,0.15,0.0,-0.075,0.075,-0.3,0.0,0.06
Fund 4,-0.6,1.0,-0.2,-0.02,-0.5,0.0,0.25,-0.25,0.0,0.0,-0.2
Fund 5,0.6,-0.3,0.06,0.006,0.15,0.0,-0.075,0.075,0.5,0.0,0.06
Fund 6,1.0,-0.3,0.1,0.0,0.1,0.0,0.0,1.0,0.0,0.0,0.2
Fund 7,-0.6,-0.2,1.0,0.0,0.2,0.1,0.2,0.2,0.0,0.0,0.1
Fund 8,0.6,0.06,0.0,0.0,0.0,-0.6,0.0,0.0,0.0,-0.18,0.0
Fund 9,0.2,0.21,-0.25,0.0,0.04,-0.03,0.44,-0.06,0.0,0.0,-0.03
Fund 10,-0.6,-0.09,-0.25,0.0,-0.01,-0.03,-0.06,0.44,0.0,0.0,0.07


## Optimization Functions and Constraint Functions

We define the objective function $negSharpe()$ and two constraint functions, $conSumWeight()$ and $conLimitWeight()$.

### negSharpe()  
Our objective is to maximize the Sharpe ratio. Since $scipy.optimize.minimize()$ minimizes its objective, we return the negative of the Sharpe ratio.  
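
Written out, the objective takes the following form. The symbols are notation introduced here for illustration: $w$ is the weight vector (wVector), $\mu$ the vector of expected fund returns (returnsVector), $\Sigma$ the covariance matrix of fund returns (covMatrix), and $\bar{r}_f$ the mean risk-free rate (mean_rf).

\begin{equation}
    \text{negSharpe}(w) = -\frac{w^\top \mu - \bar{r}_f}{\sqrt{w^\top \Sigma \, w}}
\end{equation}
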

### conSumWeight() 
This is an equality constraint in the scipy.optimize engine, ensuring that our weights sum to 1.  

### conLimitWeight()  
This is an inequality constraint, ensuring that no weight has an absolute value exceeding 2.
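
The sign conventions matter here: in `scipy.optimize.minimize`, an `'eq'` constraint function must return zero at a feasible point, and an `'ineq'` constraint function must return non-negative values. The sketch below checks both conventions on two hand-picked weight vectors; the function names and vectors are hypothetical and separate from the notebook's own definitions.

```python
import numpy as np

def sum_to_one(w):
    # Equality constraint: feasible when the value is exactly zero.
    return np.sum(w) - 1.0

def within_limit(w, limit=2.0):
    # Inequality constraint: feasible when every element is non-negative.
    return limit - np.abs(w)

feasible = np.array([2.0, -1.5, 0.5])    # sums to 1, all |w_i| <= 2
infeasible = np.array([2.5, -1.5, 0.0])  # sums to 1, but 2.5 exceeds the limit

print(sum_to_one(feasible), within_limit(feasible))
print(sum_to_one(infeasible), within_limit(infeasible))
```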

In [6]:
import numpy as np

# Negative Sharpe ratio (minimizing it maximizes the Sharpe ratio).
# Relies on the globals returnsVector, covMatrix and mean_rf set before optimizing.
def negSharpe(wVector):
    ret = np.dot(np.squeeze(np.asarray(wVector)), np.squeeze(np.asarray(returnsVector)))
    risk = np.sqrt(np.matmul(np.matmul(wVector, covMatrix), np.transpose(wVector)))
    return -(ret - mean_rf) / risk

# Equality constraint: weights sum to 1 (value must equal 0)
def conSumWeight(wVector):
    return np.sum(wVector) - 1

# Inequality constraint: |weight| <= 2 (every element must be >= 0)
def conLimitWeight(wVector):
    return 2 - np.abs(wVector)

### Weights Optimization

Review the fitted intercept and coefficients for the asset risk premium:

In [7]:
# Reload data from Data Processing Engine
%store -r intercept
%store -r coefArr

eReturnsIntercept = intercept
eReturnsCoefs = coefArr

eReturnsIntercept
eReturnsCoefs

array([0.00116192])

array([[-2.99164282e-04, -3.05415656e-05,  6.12886809e-03,
        -4.68012161e-03, -2.55798465e-03, -5.11186897e-03,
         1.81583266e-03, -4.38320389e-03, -1.86641086e-03,
         1.01806976e-03, -2.65130098e-03]])

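With the inputs in place, the next cell builds the covariance matrix and the expected-return vector, then runs the SLSQP optimizer: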

In [8]:
import scipy.optimize as optimize

# Initial guess: equal weights of 0.1 (10%) for each of the 10 funds (1-D, as minimize expects)
initial_guess = np.full(10, .1)

# Set up covariance matrix of fund returns (transpose so each fund is a variable)
covMatrix = np.cov(dfReturns.loc[:,"Fund 1": "Fund 10"].T)

# Set up E[r_f]
mean_rf = dfReturns["RF"].mean()

# Set up E[R_m] (RMRF is the market return in excess of RF)
mean_rm = dfReturns["RMRF"].mean() + dfReturns["RF"].mean()

# Set up factor matrix (funds in rows, factors in columns)
factorMat = dfChar.loc[:,"Market capitalization": "Net share issuance"]

# Set up E[R_i] vector: fitted premium E[R_i - R_m] plus E[R_m]
returnsVector = np.dot(eReturnsCoefs, factorMat.T) + np.full((1,10), eReturnsIntercept) + np.full((1,10), mean_rm)

# Set up constraints
cons = [{'type': 'eq', 'fun': conSumWeight}, 
        {'type': 'ineq', 'fun': conLimitWeight}]

# Optimize!
result = optimize.minimize(negSharpe, initial_guess, constraints=cons, method='SLSQP')

In [9]:
dfReturns.head()

Unnamed: 0.1,Unnamed: 0,RMRF,RSMB,RHML,RF,Fund 1,Fund 2,Fund 3,Fund 4,Fund 5,Fund 6,Fund 7,Fund 8,Fund 9,Fund 10
0,Month 1,0.040151,0.014519,-0.034433,0.003875,0.0315,0.0492,0.082652,0.0284,0.076272,0.0516,0.0373,0.03782,0.0499,0.0406
1,Month 2,-0.022998,0.026868,-0.017723,0.003706,0.0161,0.0145,0.003718,0.0075,-0.007507,-0.0342,0.0212,-0.02865,-0.00295,0.01165
2,Month 3,0.013343,-0.022135,-0.023471,0.003892,0.011,0.0162,0.01895,-0.0027,0.017551,0.0358,0.0221,0.011466,0.004,-0.01625
3,Month 4,-0.040294,-0.001657,-0.000332,0.003779,-0.0329,-0.0324,-0.039429,-0.0353,-0.033744,-0.028,-0.0318,-0.037986,-0.0393,-0.04665
4,Month 5,0.008569,0.000378,0.002764,0.004416,-0.024,-0.0036,0.020851,-0.0044,0.001634,0.0113,0.0139,0.004149,-0.0254,-0.04695


## Results 

Examine our result:

In [10]:
result

     fun: -1.1746145209617604
     jac: array([0.09733638, 0.09795088, 0.09713349, 0.09719698, 0.09704521,
       0.09774138, 0.0977768 , 0.09699115, 0.01200101, 0.13123661])
 message: 'Optimization terminated successfully'
    nfev: 212
     nit: 19
    njev: 19
  status: 0
 success: True
       x: array([-0.95064586,  0.34451712,  0.4308091 ,  1.76048221, -0.22086964,
       -0.65105197,  0.53668159, -0.24992254,  2.        , -2.        ])

### Optimized Weights  

The optimized weight vector is:  

In [11]:
idx = pd.Index(["Fund 1", 
               "Fund 2", 
               "Fund 3", 
               "Fund 4", 
               "Fund 5", 
               "Fund 6", 
               "Fund 7", 
               "Fund 8", 
               "Fund 9", 
               "Fund 10"])
optimizedResult = pd.DataFrame(result.x, columns=["Weight"]).set_index(idx)
optimizedResult

Unnamed: 0,Weight
Fund 1,-0.950646
Fund 2,0.344517
Fund 3,0.430809
Fund 4,1.760482
Fund 5,-0.22087
Fund 6,-0.651052
Fund 7,0.536682
Fund 8,-0.249923
Fund 9,2.0
Fund 10,-2.0


### Constraint check  
The weights sum to 1, so the sum constraint is satisfied. A visual check of the weights above shows that the upper and lower limits of 200% are also satisfied:

In [12]:
result.x.sum()

1.0000000000000002
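
The limit check can also be done programmatically rather than by eye. This sketch copies the weights from the optimizer output above; the tolerance is a hypothetical choice to absorb rounding in the printed values.

```python
import numpy as np

# Weights copied from the optimizer output above.
weights = np.array([-0.95064586, 0.34451712, 0.4308091, 1.76048221, -0.22086964,
                    -0.65105197, 0.53668159, -0.24992254, 2.0, -2.0])

tol = 1e-6  # hypothetical tolerance for rounding in the printed values
sum_ok = abs(weights.sum() - 1.0) < tol
limit_ok = bool(np.all(np.abs(weights) <= 2.0 + tol))

# 1-based fund numbers whose weight sits on the +/-2 limit.
at_limit = np.flatnonzero(np.isclose(np.abs(weights), 2.0)) + 1

print(sum_ok, limit_ok, at_limit)
```

Funds sitting exactly on the limit indicate that the weight constraint is binding for them, so the solution would likely change if the limit were relaxed.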

### Sharpe Ratio  

The Sharpe Ratio is estimated to be:

In [13]:
abs(result.fun)

1.1746145209617604

### Portfolio Returns  

The expected monthly portfolio return is:  

In [14]:
np.dot(np.squeeze(np.asarray(result.x)), np.squeeze(np.asarray(returnsVector)))

0.03742679212066288

### Portfolio Risk  

The portfolio risk (standard deviation of monthly returns) is:

In [15]:
np.sqrt(np.matmul(np.matmul(result.x, covMatrix), np.transpose(result.x)))

0.028445248569855405
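
As a consistency check, the three reported figures can be tied together through the Sharpe ratio definition. The sketch below copies the printed values and backs out the risk-free rate they imply.

```python
# Figures copied from the outputs above.
sharpe = 1.1746145209617604
portfolio_return = 0.03742679212066288
portfolio_risk = 0.028445248569855405

# Sharpe = (return - rf) / risk, so rf = return - Sharpe * risk.
implied_rf = portfolio_return - sharpe * portfolio_risk
print(round(implied_rf, 6))
```

The implied rate is about 0.004 per month, in line with the RF values shown in the first rows of the returns data.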