# Project 7 - Cross Sectional Regressions

Bowen Chen

March 8, 2018

In [80]:
# R Setup
options(repr.plot.width= 8, repr.plot.height=5, warn = -1)

library(plyr)
library(data.table)
library(lubridate)
library(dplyr)
library(magrittr)
library(ggplot2)

## Executive Summary

In this project, we examine whether four pre-designed factors, **firm market cap**, **price-normalized accruals**, **the earnings-price ratio**, and **1/price**, are appropriate predictors of stock returns. Annual fundamentals data for all stocks (1950 - 2017) are downloaded from the CRSP/Compustat Merged database. Cross-sectional regressions are run separately for each fiscal year, and the yearly coefficients are then averaged across years. The average coefficients are shown below.



| Intercept | Beta MktCap   | Beta E/P    | Beta A/R    | Beta 1/Price |
|-----------|---------------|-------------|-------------|--------------|
| 0.3244531 | -5.565982e-06 | 0.000603361 | -0.08465663 | -0.09723741  |



The exercise suggests that these four factors are not strong predictors of stock returns, although **price-normalized accruals** showed some predictive power.

## Key Questions to Answer

* Are the four pre-designed factors (firm market cap, price-normalized accruals, the earnings-price ratio, and 1/price) appropriate predictors of stock returns?

* Which of the four factors shows the most predictive power for stock returns?

## Computation


The data contain the following fields for all stocks from 1950-01-01 to 2017-12-31:

* *datadate* - The date data is recorded, in annual frequency
* *fyear* - Fiscal year
* *tic* - Ticker symbol under which the stock trades on its exchange
* *fyr* - not used
* *csho* - Common shares outstanding
* *ni* - Net Income
* *reecch* - Change in accounts receivable
* *costat* - Company status, Inactive/Active
* *prcc_f* - Fiscal year end stock prices

In [81]:
crossSectionalData  = fread('data/crosssectional_data.csv')

In [82]:
crossSectionalData = crossSectionalData %>% setNames(c("PERMNO", "Data Date", "Fiscal Year", "Ticker", "fyr",
                                              "Shares Outstanding", "Net Income", "Change in A/R",
                                              "Company Status", "Stock Prices"))

**Clean missing data**

For simplicity, remove all rows that have missing data.

In [83]:
crossSectionalData = crossSectionalData[complete.cases(crossSectionalData), ]
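
As a minimal sketch of what this filter does, the toy table below (hypothetical values, not taken from the project file) shows that `complete.cases()` keeps a row only when none of its columns is `NA`. The names `toy`, `keep`, `toyClean`, and `rowsDropped` are illustrative.

```r
library(data.table)

# Toy data (hypothetical values, not from the project file)
toy = data.table(Ticker         = c("AAA", "BBB", "CCC"),
                 `Net Income`   = c(10, NA, 5),
                 `Stock Prices` = c(20, 15, NA))

# complete.cases() returns TRUE only for rows with no NA in any column
keep = complete.cases(toy)

# Keep the complete rows and count how many were dropped
toyClean    = toy[keep, ]
rowsDropped = nrow(toy) - nrow(toyClean)
```

Counting the dropped rows in this way is a cheap check on how much of the sample the simplification discards.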

**Find the market cap, earnings-price ratio, price-normalized accruals, and 1/price**

* market cap = stock price x shares outstanding
* earnings-price ratio = net income / stock price
* price-normalized accruals (change in accounts receivable) = change in A/R / market cap
* 1/price = 1 / stock price


In [84]:
crossSectionalData[, `:=` (`Market Cap` = `Shares Outstanding` * `Stock Prices`, 
                            `earnings/price` = `Net Income`/`Stock Prices`,
                            `price normalized A/R` = `Change in A/R`/(`Shares Outstanding` * `Stock Prices`),
                             `1/price`  = 1/`Stock Prices`)]
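
The arithmetic behind these four columns can be checked on a single hypothetical firm. The sketch below uses made-up numbers and one extra column, `earnings/market cap`, which is not part of the original analysis: it is included only as an assumption-labelled alternative, because dividing a firm-level net income by a per-share price (as the definition above does) yields a ratio that scales with the number of shares.

```r
library(data.table)

# Hypothetical firm: 100 shares at a price of 20, net income 50, A/R change 10
toy = data.table(`Shares Outstanding` = 100, `Stock Prices` = 20,
                 `Net Income` = 50, `Change in A/R` = 10)

# Market cap: 100 * 20 = 2000
toy[, `Market Cap` := `Shares Outstanding` * `Stock Prices`]

# Earnings-price ratio as defined in this notebook: 50 / 20 = 2.5
toy[, `earnings/price` := `Net Income` / `Stock Prices`]

# Scale-free alternative (assumption, not used in the notebook): 50 / 2000 = 0.025
toy[, `earnings/market cap` := `Net Income` / `Market Cap`]

# Price-normalized accruals: 10 / 2000 = 0.005
toy[, `price normalized A/R` := `Change in A/R` / `Market Cap`]

# Inverse price: 1 / 20 = 0.05
toy[, `1/price` := 1 / `Stock Prices`]
```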

In [85]:
crossSectionalData = crossSectionalData[,c('Fiscal Year', 'Ticker', 'Market Cap','earnings/price',
                                           'price normalized A/R', '1/price', 'Stock Prices')]

**Find Stock Returns**

Find the lagged (previous fiscal year) price for each stock ticker.

In [86]:
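# dplyr::lag shifts prices down one row within each ticker; rows are assumed to be in fiscal-year order
crossSectionalData[, `Lag Stock Prices`:= dplyr::lag(`Stock Prices`), by=Ticker]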
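
The per-ticker lag only gives the previous year's price if rows are ordered by fiscal year within each ticker. The sketch below illustrates this on a toy panel (hypothetical prices) using `data.table::shift()` as an alternative to `lag()`. The explicit sort with `setorderv()` is an added safeguard assumed here, not a step from the original notebook.

```r
library(data.table)

# Toy panel (hypothetical prices), deliberately out of year order
toy = data.table(Ticker         = c("AAA", "AAA", "AAA", "BBB", "BBB"),
                 `Fiscal Year`  = c(2002, 2000, 2001, 2001, 2000),
                 `Stock Prices` = c(12, 10, 11, 40, 50))

# Sort by ticker and year so "previous row" means "previous fiscal year"
setorderv(toy, c("Ticker", "Fiscal Year"))

# shift() lags within each ticker, so prices never cross from one firm to another
toy[, `Lag Stock Prices` := shift(`Stock Prices`, n = 1, type = "lag"), by = Ticker]

# Simple return; the first year of each ticker has no lag and stays NA
toy[, `Stock Returns` := (`Stock Prices` - `Lag Stock Prices`) / `Lag Stock Prices`]
```

Note that this still treats consecutive rows as consecutive years. A ticker with a gap in its fiscal years would get a multi-year return, so a check on the year difference may be worth adding.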

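Compute the stock returns, then remove records whose price-normalized A/R ratio is not finite, along with any remaining rows that have missing values (for example, the first year of each ticker, which has no lagged price).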

In [87]:
crossSectionalData[, `Stock Returns` := (`Stock Prices` - `Lag Stock Prices`)/`Lag Stock Prices`]
crossSectionalData = crossSectionalData[is.finite(crossSectionalData$`price normalized A/R`), ]
crossSectionalData = crossSectionalData[complete.cases(crossSectionalData), ]
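
The two filters above catch different things, which the small sketch below illustrates on a hypothetical vector: `is.finite()` rejects `Inf`, `-Inf`, and `NA`, while `complete.cases()` rejects only missing values. One consequence is that an infinite return (from a lagged price of zero) would pass `complete.cases()`. Whether such rows exist in the project data is not known here.

```r
# Hypothetical ratio values, including infinite and missing entries
x = c(0.5, Inf, -Inf, NA)

# is.finite() is TRUE only for ordinary numbers
finiteFlag = is.finite(x)

# complete.cases() only checks for missing values, so Inf and -Inf pass
completeFlag = complete.cases(data.frame(x = x))

# A stricter filter (assumption: applied to returns as well as A/R) would use is.finite()
xClean = x[finiteFlag]
```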

**Cross-Sectional Regression**

Estimate the cross-sectional regression coefficients separately for each remaining fiscal year.
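
Written out, the regression fitted for each fiscal year $t$ across stocks $i$ has the form below. The symbols are notation introduced here for illustration: $r_{i,t}$ is the stock return, $MC$ the market cap, $EP$ the earnings-price ratio, $AR$ the price-normalized change in accounts receivable, and $P$ the fiscal-year-end price.

$$
r_{i,t} = \alpha_t + \beta^{MC}_t \, MC_{i,t} + \beta^{EP}_t \, EP_{i,t} + \beta^{AR}_t \, AR_{i,t} + \beta^{1/P}_t \, \frac{1}{P_{i,t}} + \varepsilon_{i,t},
\qquad
r_{i,t} = \frac{P_{i,t} - P_{i,t-1}}{P_{i,t-1}}
$$

Each year yields its own intercept $\alpha_t$ and four slopes, so the output is one row of five coefficients per fiscal year.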

In [88]:
# Fit the regression once per fiscal year and return its five coefficients as named columns
fitYear = function(d) {
    fit = lm(`Stock Returns` ~ `Market Cap` +
                               `earnings/price` +
                               `price normalized A/R` +
                               `1/price`, data = d, na.action = na.omit)
    setNames(as.list(coef(fit)),
             c("Intercept", "Beta MktCap", "Beta E/P", "Beta A/R", "Beta 1/Price"))
}

# keyby sorts the result by fiscal year; .SD holds the rows of the current year
crossSectionalregression = crossSectionalData[, fitYear(.SD), keyby = `Fiscal Year`]
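
To make the per-year mechanics easy to verify, here is a minimal sketch on a toy panel with a single factor, where the true coefficients are known by construction. The column names `year`, `x`, and `y` and the object names `toy`, `byYear`, and `avgCoef` are hypothetical, and the grouped `data.table` call is one possible way to run a regression per group, not necessarily the approach used above.

```r
library(data.table)

# Toy panel (hypothetical): y is an exact linear function of x,
# with a different line in each year
toy = data.table(year = rep(c(2000, 2001), each = 4),
                 x    = rep(c(1, 2, 3, 4), times = 2))
toy[, y := ifelse(year == 2000, 1 + 2 * x, 3 - x)]

# One regression per year; coef() returns c(intercept, slope)
byYear = toy[, as.list(coef(lm(y ~ x, data = .SD))), keyby = year]
setnames(byYear, c("year", "Intercept", "Beta x"))

# Average the yearly coefficients across years (drop the year column first)
avgCoef = colMeans(byYear[, -1])
```

In this toy case the yearly fits recover intercepts 1 and 3 and slopes 2 and -1, so the averages are 2 and 0.5.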

Average each coefficient across all years and use the averages as the factor loadings.

In [96]:
t(colMeans(crossSectionalregression[, -1]))

| Intercept | Beta MktCap   | Beta E/P    | Beta A/R    | Beta 1/Price |
|-----------|---------------|-------------|-------------|--------------|
| 0.3244531 | -5.565982e-06 | 0.000603361 | -0.08465663 | -0.09723741  |
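
An average coefficient on its own does not say whether a factor matters, since the factors are on very different scales and the yearly estimates vary. One common check, in the spirit of the Fama-MacBeth procedure, is to compare each average with its standard error across years. The sketch below uses hypothetical yearly slopes (not the project's estimates) and is an added illustration, not part of the original analysis.

```r
# Hypothetical yearly slope estimates for one factor (not the project's results)
betaByYear = c(-0.10, 0.05, -0.20, -0.15, 0.00)

nYears   = length(betaByYear)
meanBeta = mean(betaByYear)               # time-series average of the yearly slopes
seBeta   = sd(betaByYear) / sqrt(nYears)  # standard error of that average
tStat    = meanBeta / seBeta              # average measured in standard-error units
```

With the real output, the same three lines could be applied to each column of `crossSectionalregression`. An absolute t-statistic well above roughly 2 would suggest the average slope differs from zero.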


From the average cross-sectional coefficients above, we conclude that the four factors are not strong predictors of stock returns, as suggested by the large average intercept (about 0.32), which is the part of the average return not attributed to any factor. The largest coefficient in magnitude belongs to **1/price**, which is expected. **Price-normalized accruals**, measured here as the price-normalized change in accounts receivable, also carries a sizable coefficient and appears to have some predictive power. Overall, stock returns are very difficult to predict, and the results of this project are consistent with that view.