# XGBoost for Prediction using Senator Trading Data
* Laurent Lanteigne
* Max Frankel
* Yan Sun


### Introduction
Legal insider trading is a useful source of information for forecasting stock prices. The XGBoost algorithm can be used to identify relationships between insider activity and future prices, and thus help us trade on that activity.

#### Legal Insider Trading
Insiders have several kinds of information advantage. First, they know in advance which major events will affect stock prices. This advantage disappears once the events are announced, usually within a short time. Second, insiders can assess the company's earnings prospects and growth potential better than outsiders. This advantage is usually long term and does not rely on specific events. Third, insiders can better assess the intrinsic value of a company and can thus identify and exploit opportunities when the stock market overvalues or undervalues it. Fourth, insiders have a better sense of industry and macroeconomic trends and thus tend to predict future macroeconomic directions more accurately. The first three advantages improve insiders' ability to predict firm-level price movements: they appear able to sell when the price is high and buy when the price is low. The fourth advantage improves insiders' aggregate ability to time the market.

Under the STOCK Act, US senators are required to disclose their stock trades publicly. We can obtain the trade feed, including the filing and transaction dates, stock issuers, lower and upper bounds for trade size, and transaction side, from the Securities and Exchange Commission (SEC) website.

#### XGBoost
XGBoost is a decision-tree-based ensemble machine learning algorithm that uses a gradient boosting framework. On small-to-medium structured/tabular data, XGBoost stands out among machine learning algorithms in both speed and performance. We briefly review the Gradient Boosting Machine (GBM) and why XGBoost suits the needs of this project.

A boosting algorithm fits an ensemble model of the following form:

$$
f(x) = \sum_{m=0}^{M} f_m(x) = f_0(x) + \sum_{m=1}^M \theta_m\phi_m(x).
$$

where $f_0$ is the initial guess, $\phi_m(x)$ is the base estimator at the $m^{th}$ iteration, and $\theta_m$ is the weight for the $m^{th}$ estimator. GBM constructs a forward additive model by performing gradient descent in function space. As in the classic gradient descent framework in parameter space, at the $m^{th}$ iteration the direction of steepest descent is given by

$$
-g_m(x) = - \left(\frac{\partial L(y,f(x))}{\partial f(x)}\right)_{f(x)=f^{m-1}(x)}.
$$
This gives the direction in which to move to reduce the loss function. Typically, the base estimator is fit to this negative gradient under a squared-error criterion, which leads to

$$
\phi_m = \text{argmin}_{\phi}\sum_{i=1}^n \left[-g_m(x_i)-\phi(x_i)\right]^2.
$$

To determine the size of the step along the negative gradient, we similarly solve

$$
\rho_m = \text{argmin}_\rho \sum_{i=1}^n L\left(y_i, f^{m-1}(x_i) + \rho \phi_m(x_i)\right).
$$

Finally, the update at the $m^{th}$ iteration is

$$
f_m(x) = \eta \rho_m \phi_m(x),
$$
where $\eta$ is the learning rate parameter.
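
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in this project: it assumes a single feature, the loss $L = \frac{1}{2}(y - f)^2$ (so the negative gradient is simply the residual), and a one-split regression stump as the base estimator $\phi_m$. The function names `fit_stump`, `gbm_fit`, and `gbm_predict` are hypothetical.

```python
from statistics import fmean

def fit_stump(x, target):
    """Least-squares regression stump on one feature: (threshold, left_mean, right_mean)."""
    best = None
    for t in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, target) if xi <= t]
        right = [r for xi, r in zip(x, target) if xi > t]
        lm, rm = fmean(left), fmean(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # every x is identical, so no split is possible
        m = fmean(target)
        return (x[0], m, m)
    return best[1:]

def predict_stump(stump, xi):
    t, lm, rm = stump
    return lm if xi <= t else rm

def gbm_fit(x, y, n_rounds=50, eta=0.1):
    f0 = fmean(y)                      # initial guess f_0
    f = [f0] * len(y)                  # current ensemble prediction f^{m-1}(x_i)
    rounds = []
    for _ in range(n_rounds):
        # Negative gradient -g_m(x_i); equals the residual for L = 0.5 * (y - f)^2.
        neg_grad = [yi - fi for yi, fi in zip(y, f)]
        # Direction: fit phi_m to the negative gradient by least squares.
        stump = fit_stump(x, neg_grad)
        phi = [predict_stump(stump, xi) for xi in x]
        # Step length rho_m: closed-form line search for the squared-error loss.
        denom = sum(p * p for p in phi)
        rho = sum(r * p for r, p in zip(neg_grad, phi)) / denom if denom > 0 else 0.0
        # Update with learning rate eta: f_m(x) = eta * rho_m * phi_m(x).
        f = [fi + eta * rho * p for fi, p in zip(f, phi)]
        rounds.append((rho, stump))
    return f0, eta, rounds

def gbm_predict(model, xi):
    f0, eta, rounds = model
    return f0 + sum(eta * rho * predict_stump(s, xi) for rho, s in rounds)
```

For other loss functions the line search generally has no closed form and would be solved numerically.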

XGBoost, short for "Extreme Gradient Boosting", was introduced by Tianqi Chen and Carlos Guestrin in 2016 (1). GBM splits the optimization problem into two parts: first determining the direction of the step $\phi_m$, and then the optimal step length $\rho_m$. XGBoost instead tries to do this in one step by directly solving

$$
\frac{\partial L(y, f^{m-1}(x)+f_m(x))}{\partial f_m(x)} = 0
$$
for each input in the data set. The main reason to use XGBoost for a GBM model is that, by using a second-order Taylor expansion of the loss function and solving the problem in one step instead of two, the convergence and execution speed of the algorithm are greatly improved.
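
For reference, the second-order approximation can be written out as follows. The notation is borrowed from the XGBoost paper (1) rather than defined in this write-up: $g_i$ and $h_i$ are the first and second derivatives of the loss at the current prediction, $\Omega$ is the regularization term with parameters $\gamma$ and $\lambda$, $T$ is the number of leaves, $w_j$ is the weight of leaf $j$, and $I_j$ is the set of observations falling in leaf $j$.

$$
\sum_{i=1}^n L\left(y_i, f^{m-1}(x_i) + f_m(x_i)\right) + \Omega(f_m) \approx \sum_{i=1}^n \left[ L\left(y_i, f^{m-1}(x_i)\right) + g_i f_m(x_i) + \frac{1}{2} h_i f_m(x_i)^2 \right] + \Omega(f_m),
$$

$$
g_i = \left(\frac{\partial L(y_i,f(x_i))}{\partial f(x_i)}\right)_{f(x)=f^{m-1}(x)}, \qquad h_i = \left(\frac{\partial^2 L(y_i,f(x_i))}{\partial f(x_i)^2}\right)_{f(x)=f^{m-1}(x)}, \qquad \Omega(f_m) = \gamma T + \frac{1}{2}\lambda \sum_{j=1}^T w_j^2.
$$

For a fixed tree structure, setting the derivative with respect to each leaf weight to zero gives the direction and step length together:

$$
w_j^* = -\frac{\sum_{i \in I_j} g_i}{\sum_{i \in I_j} h_i + \lambda}.
$$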

____________________________________________________________________________________________________________________________________________________________________________________________
(1) Tianqi Chen, Carlos Guestrin, 2016 XGBoost: A Scalable Tree Boosting System https://arxiv.org/abs/1603.02754



### Model

#### Data Cleaning and Variable Selection

#### Model Optimization

#### Model Performance

### Trading Strategy Implementation

#### Key Assumptions
The initial capital is 1,000,000,000.  

#### Entry and Exit Rules

##### Entry Signal
On each filing date, the strategy enters a long/short position if there is a win signal for the Senator's transaction from the output generated by the XGBoost model.

###### Leverage
When the leverage parameter is set to 0, the strategy takes an initial position size of 10,000 for each long position. Depending on the short bias parameter, the strategy takes some multiple of that 10,000 on each short. Since the initial capital is 100,000 times this, this setting lets you look at the strategy PnL in some sense from an ergodic perspective as opposed to a time-average one.

If the leverage parameter is set to -1, the strategy weights the initial position size by the midpoint of the reported range in the relevant trade disclosure.

If the leverage parameter is greater than 0, the long position entry size is set as a percentage of the NLQ (net liquidating value) of the portfolio, i.e. holdings plus cash, so it scales up and down with current profitability.

For all the leverage parameters, each short taken is the default long position size multiplied by the short bias. 
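
The sizing rules above can be summarized in one function. This is a sketch under stated assumptions, not the backtest's actual code: the name `entry_size` and all argument names are hypothetical, and a positive leverage value is assumed to be a fraction of net liquidating value (for example `0.01` for 1%).

```python
def entry_size(leverage, is_short, short_bias=1.0, nlv=None,
               range_low=None, range_high=None, base_size=10_000.0):
    """Position size at entry for a long or short trade."""
    if leverage == 0:
        # Fixed size for every long position.
        long_size = base_size
    elif leverage == -1:
        # Midpoint of the range reported in the trade disclosure.
        long_size = 0.5 * (range_low + range_high)
    elif leverage > 0:
        # Assumption: leverage is a fraction of net liquidating value (holdings + cash).
        long_size = leverage * nlv
    else:
        raise ValueError(f"unsupported leverage parameter: {leverage}")
    # Shorts are the default long size scaled by the short bias.
    return long_size * short_bias if is_short else long_size
```

For example, `entry_size(0, is_short=True, short_bias=2.0)` gives a short of twice the fixed long size.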

##### Funding
If cash holdings go negative, the strategy pays interest at the funding rate. The backtest lets you vary this rate to better assess the vulnerability of the strategy to changes in funding costs. The implementation assumes interest earned on cash reserves is 150bps less than the funding rate, down to a floor of 0%.
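
A minimal sketch of the daily interest accrual described above. The function name `daily_cash_interest`, its arguments, and the 360-day count are assumptions made for illustration; the source does not state the day-count convention used in the backtest.

```python
def daily_cash_interest(cash, funding_rate, spread=0.015, days_per_year=360):
    """Interest accrued on the cash balance for one day (negative means a cost)."""
    if cash < 0:
        # Borrowed cash is charged the full funding rate.
        rate = funding_rate
    else:
        # Cash reserves earn the funding rate less 150bps, floored at 0%.
        rate = max(funding_rate - spread, 0.0)
    return cash * rate / days_per_year
```

Sweeping `funding_rate` over a range of values shows how sensitive the strategy PnL is to the cost of funding.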

##### Exit Signal
On each trading day, the strategy checks two exit signals for outstanding positions.

###### Stop Loss
If, on any day of the simulation, the present position value has lost more than a proportion $s$ of the portfolio value (cash holdings plus outstanding position value at position entry time), the strategy forces an exit at current prices.

###### Force Exit after a Predetermined Time Period
The strategy does not hold any position longer than a predetermined period $\tau$, both because of funding and liquidity issues and to isolate the "alpha" in the signal. Any open position that has been held for $\tau$ days is closed.
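
The two exit checks can be combined into one daily test per open position. This is a sketch only: the function `exit_reason` and its arguments are hypothetical, the loss is assumed to be measured against the portfolio value recorded when the position was entered, and whether $\tau$ counts trading days or calendar days is left to the caller.

```python
def exit_reason(position_pnl, portfolio_value_at_entry, days_held, s, tau):
    """Return why a position should be closed today, or None to keep holding it."""
    # Stop loss: the position has lost more than a proportion s of the
    # portfolio value (cash plus positions) measured at entry time.
    if position_pnl < -s * portfolio_value_at_entry:
        return "stop_loss"
    # Forced exit: the position has been held for tau days.
    if days_held >= tau:
        return "max_holding_period"
    return None
```

The stop loss is checked first, so a position that breaches both conditions on the same day is reported as a stop-loss exit.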

#### Transaction Costs
The strategy assumes we trade on the close, with some type of MOC (market-on-close) order. To get a rough approximation of the impact of transaction costs on strategy performance, the backtest analytics assume by default a market impact of 50bps per trade. This is a parameter, so you can vary the market impact and look at the performance of the trading strategy under the assumption that a larger amount of capital is allocated.
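
One simple way to model this cost is to move the fill price against the trade. The sketch below assumes that interpretation; the function name `fill_price` and its arguments are hypothetical, and the source does not say exactly how the backtest applies the 50bps.

```python
def fill_price(close, is_buy, impact_bps=50.0):
    """Closing price adjusted for market impact, always against the trader."""
    impact = impact_bps / 10_000.0  # convert basis points to a fraction
    # Buys fill above the close, sells fill below it.
    return close * (1.0 + impact) if is_buy else close * (1.0 - impact)
```

Under this model a round trip (entry plus exit) costs roughly twice the per-trade impact.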

#### Short Bias
Most of the trades during the out-of-sample (OOS) period were long trades. To decorrelate the strategy from the broader market uplift and isolate any potential alpha from the signal, we can adjust the strategy to overweight short trades. Short bias is a parameter that sets the ratio of short position sizing to long position sizing.

#### Miscellaneous

### Performance

#### PnL

#### Statistics

### Reference
Zhu, C., Wang, L., & Yang, T. (2014). "Swimming Ducks Forecast the Coming of Spring": The predictability of aggregate insider trading on future market returns in the Chinese market. China Journal of Accounting Research, 7(3), 179-201. ISSN 1755-3091. https://doi.org/10.1016/j.cjar.2014.08.001

Safer, A., & Wilamowski, B. M. (1998). Neural networks and MARS for prediction using legal insider stock trading data (dissertation). http://www.eng.auburn.edu/~wilambm/pap/1998/ANNIE98_LegalInsider_Safer_Sprecher.pdf