
# Stock market factor model experiments

### Druce Vertes
### Metis Bootcamp NYC
### 10/12/2018

# Can we predict stock returns and outperform the market with simple models?

# Thought experiment: If we couldn't predict attractive returns, why buy stocks?

# Spoiler: Predicting 'something' about returns is possible. Predicting consistently better than the collective wisdom of the market is very hard.

### Project: Model monthly stock returns using

- Economic indicators scraped from [FRED](https://fred.stlouisfed.org/) using Selenium:
    - Growth (Leading index)
    - Inflation (CPI ex food & energy)
    - Monetary Tightness (10-year bond rate - 3-month T-bill rate)
    - All shifted 1 month so we predict October using September data
- [CRSP + Compustat](http://www.crsp.com/products/research-products/crspcompustat-merged-database)
  - 55 years of monthly data
  - Top 50% of stocks by market cap
    - [Market cap cutoffs](http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/Data_Library/det_48_ind_port.html) from Fama/French
    - about 500 stocks at start of sample
    - about 1200 at end of sample
    - peaked at ~1400 stocks in early 2000s
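The one-month lag on the economic indicators can be sketched with `DataFrame.shift`. The column names and values below are made up for illustration; only the lag logic follows the description above.

```python
import pandas as pd

# Hypothetical monthly indicator values; the real series come from FRED.
econ = pd.DataFrame(
    {
        "growth": [1.2, 1.3, 1.1, 1.4],
        "inflation": [2.0, 2.1, 2.2, 2.1],
        "tightness": [0.9, 0.8, 0.7, 0.6],
    },
    index=pd.period_range("2018-07", periods=4, freq="M"),
)

# Shift every column down one row, so each month's row carries
# the reading from the previous month.
econ_lagged = econ.shift(1)

# The first month has no prior observation, so it is NaN and gets dropped.
econ_lagged = econ_lagged.dropna()
```

After the shift, the October row holds the September readings, so a model fit on that row only sees data that was available before October.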


![Dataframe.png](Dataframe.png)

- Predictors:
  - Value (Stock Price / Book Value)
  - Momentum (11-month change in stock price, ending one month before the prediction month)
  - Industry (Fama/French 48 industries, one-hot encoded with `pd.get_dummies`)
  - Economic indicators: Growth (LEI), Inflation (CPI), Monetary policy (30y-3m)
- Response:
  - 3-month absolute return (also tried relative return v. S&P)
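A minimal sketch of how the momentum and industry predictors might be built with pandas. The panel layout, column names, prices, and industry labels are all hypothetical; the notebook's actual CRSP/Compustat columns are not shown here.

```python
import numpy as np
import pandas as pd

# Hypothetical long-format panel: one row per (stock, month).
months = pd.period_range("2017-01", periods=14, freq="M")
panel = pd.DataFrame(
    {
        "stock": ["A"] * 14 + ["B"] * 14,
        "month": list(months) * 2,
        "price": np.concatenate([np.linspace(10, 23, 14), np.linspace(50, 37, 14)]),
        "industry": ["Banks"] * 14 + ["Autos"] * 14,
    }
)

by_stock = panel.groupby("stock")["price"]
# Price one month ago divided by price twelve months ago: an 11-month
# change that skips the most recent month.
panel["momentum"] = by_stock.shift(1) / by_stock.shift(12) - 1

# One-hot encode the industry label into one indicator column per industry.
industry_dummies = pd.get_dummies(panel["industry"], prefix="ind")
features = pd.concat([panel[["momentum"]], industry_dummies], axis=1)
```

Each stock's first 12 months have no momentum value, since the calculation needs a price from twelve months back.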



# Factor models *

 - Original [CAPM](http://book.ivo-welch.info/read/source.mba/chap10.pdf) (Sharpe, building on Markowitz's portfolio theory) models stock returns as risk-free rate + beta * market excess return + epsilon
     - Risk-free rate and beta explain ~70% of the variation in returns (epsilon ~= 30%)
 - [Fama & French](https://faculty.chicagobooth.edu/john.cochrane/teaching/35904_Asset_Pricing/Fama_French_multifactor_explanations.pdf) added value and size (market cap)
     - Fama/French 3 factor model explains ~90%
 - [Carhart](https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1540-6261.1997.tb03808.x) added momentum, reflecting the idea that prices trend more than a random walk would predict
     - Trends may reflect hidden factors that take time to be appreciated by market (slow information diffusion v. perfect information)
     - Success breeds success. Companies whose stock goes up subsequently get good press, hire better people, make better acquisitions, get better credit terms, etc., which leads to more success. Conversely, problems beget problems (reflexivity; there is never just one cockroach in the kitchen)
 - Fund companies like [AQR](https://www.aqr.com/Insights/Research/Journal-Article/Value-and-Momentum-Everywhere) have put value/momentum models into practice and have done OK

### (* As far as I can tell, simple linear regression models with fancy names)
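To illustrate the footnote, here is a rough sketch of a four-factor regression fit by plain least squares. The factor returns, betas, and noise levels are synthetic and invented for this example; they are not market data and not the notebook's results.

```python
import numpy as np

rng = np.random.default_rng(0)
n_months = 240

# Synthetic monthly factor returns (hypothetical):
# market excess return, size, value, momentum.
factors = rng.normal(0.0, 0.04, size=(n_months, 4))
true_betas = np.array([1.1, 0.3, 0.4, 0.2])
alpha = 0.001

# Excess return of one stock or portfolio = alpha + betas . factors + noise
excess_ret = alpha + factors @ true_betas + rng.normal(0.0, 0.02, n_months)

# Ordinary least squares with an intercept column.
X = np.column_stack([np.ones(n_months), factors])
coef, *_ = np.linalg.lstsq(X, excess_ret, rcond=None)

# Share of return variance explained by the fitted factors.
fitted = X @ coef
ss_res = np.sum((excess_ret - fitted) ** 2)
ss_tot = np.sum((excess_ret - excess_ret.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot
```

Here `coef[0]` is the estimated alpha and `coef[1:]` are the estimated factor betas. Dropping columns from `factors` gives the three-factor or single-factor version of the same regression.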


From https://en.wikipedia.org/wiki/Carhart_four-factor_model 

![Carhart.png](Carhart.png)


# Baseline: Momentum only

- Generate momentum quintile buckets
- Each month, go long the stocks in e.g. top momentum quintile
- Seems to have worked (surprisingly well?)
    
![quintiles1.png](quintiles1.png)

![quintiles1table.png](quintiles1table.png)
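The quintile bucketing described above could look roughly like this with `pd.qcut`. The panel is random synthetic data with hypothetical column names, so the resulting returns mean nothing; the sketch only shows the mechanics.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n_months, n_stocks = 6, 50

# Hypothetical panel: a momentum signal and the subsequent return per stock-month.
panel = pd.DataFrame(
    {
        "month": np.repeat(np.arange(n_months), n_stocks),
        "momentum": rng.normal(0.0, 0.3, n_months * n_stocks),
        "fwd_ret": rng.normal(0.01, 0.08, n_months * n_stocks),
    }
)

# Rank stocks into quintiles within each month (0 = lowest momentum, 4 = highest).
panel["quintile"] = (
    panel.groupby("month")["momentum"]
    .transform(lambda s: pd.qcut(s, 5, labels=False))
    .astype(int)
)

# Equal-weighted average forward return for each month and quintile.
quintile_ret = panel.groupby(["month", "quintile"])["fwd_ret"].mean().unstack()

# Long-only strategy: hold the top momentum quintile each month.
top_quintile_ret = quintile_ret[4]
```

Bucketing within each month, rather than across the whole sample, keeps the quintiles comparable between months with strong and weak overall momentum.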

# Question: can we do better with machine learning?

# OLS model

- Fit RET vs.
  - 2 Fundamental variables: Value, Momentum
  - 3 Econ variables
  - 48 industries (1-hot dummies)
- Get a small but nonzero R-squared ~ 0.5% out of sample
  - Bucket predicted return into quintiles
  - These quintiles show much less return spread than the momentum-only quintiles
  - This is surprising, because the prediction should contain the momentum information plus whatever information is in the other variables
  - Still working out how to fix this and what I might be doing wrong

![Quintiles2.png](Quintiles2.png)

![OLS.png](OLS.png)
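For reference, a minimal sketch of the fit, out-of-sample R-squared, and quintile bucketing steps using NumPy only. The data are synthetic with a deliberately weak signal, and the 75/25 chronological split is an assumption; the notebook's actual split and library are not shown.

```python
import numpy as np

rng = np.random.default_rng(2)
n_obs, n_features = 4000, 5

# Synthetic predictors and a response with a deliberately weak signal.
X = rng.normal(size=(n_obs, n_features))
y = 0.05 * X[:, 0] + rng.normal(size=n_obs)

# Chronological split (assumed): fit on the first 75% of rows, test on the rest.
split = int(n_obs * 0.75)
X_train = np.column_stack([np.ones(split), X[:split]])
X_test = np.column_stack([np.ones(n_obs - split), X[split:]])
y_train, y_test = y[:split], y[split:]

# OLS fit on the training rows, then predict the held-out rows.
coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
pred = X_test @ coef

# Out-of-sample R-squared, measured against predicting the training mean.
ss_res = np.sum((y_test - pred) ** 2)
ss_tot = np.sum((y_test - y_train.mean()) ** 2)
r2_oos = 1 - ss_res / ss_tot

# Bucket the held-out predictions into quintiles (0 = lowest predicted return).
edges = np.quantile(pred, [0.2, 0.4, 0.6, 0.8])
quintile = np.searchsorted(edges, pred, side="right")
mean_ret_by_quintile = np.array([y_test[quintile == q].mean() for q in range(5)])
```

Out-of-sample R-squared computed this way can be negative when the model predicts worse than the training mean, which is one reason a small positive value is not trivial.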


Things I tried:

- Basic profiling, pair plot
- Lasso, ElasticNet, Ridge: none improved on the OLS R-squared
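Of the regularized variants listed, ridge regression has a closed form that is easy to sketch in NumPy. This is an illustration on synthetic data, not the code behind the result above; the source does not say which library or penalty strengths were used, and `ridge_fit` is a hypothetical helper.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge: solve (X'X + alpha * I) b = X'y on centered data."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    # Centering leaves the intercept unpenalized; recover it afterwards.
    intercept = y_mean - x_mean @ coef
    return coef, intercept

rng = np.random.default_rng(3)
X = rng.normal(size=(500, 10))
y = 0.05 * X[:, 0] + rng.normal(size=500)  # weak signal, mostly noise

ols_coef, ols_intercept = ridge_fit(X, y, alpha=0.0)       # alpha = 0 reduces to OLS
ridge_coef, ridge_intercept = ridge_fit(X, y, alpha=100.0)  # shrinks coefficients toward 0
```

When the true signal is this weak, shrinking the coefficients changes the predictions very little, which fits with regularization not improving on the OLS R-squared.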
    

Todos:

- P/E or EBIT / EV + some leverage measure
- Beta adjust