# Predicting Index Effective Date Price Movement by Matt Silva

### Problem Statement
   Every quarter, the major index providers (MSCI, S&P, Russell, FTSE) rebalance or reconstitute their indices. BlackRock manages 2 trillion dollars' worth of passive index funds, whether ETFs or SMAs. At every rebalance these funds must match the closing price to perform in line with their investment mandate. Because of the large volume of securities in the funds, it has grown harder and harder to execute our full position on the close of effective date, so we are looking to develop strategies to find liquidity around the index date while preserving capital relative to our closing benchmark price. Simply put, we can't put a large amount of money into the closing auction without impacting the price of the security. As fiduciaries we must find a way to responsibly execute these changes in the portfolio without losing value for the fund and without influencing the stock price.

I want to know which factors, if any, can help me predict how a security's price will move on effective date. We can't trade earlier than effective date, but we can leave a tail on a trade if we have high conviction that the price will revert in the days following the rebalance.


### Goals of the Research
   
   The goal of my research is to find trends or patterns in historical factors and historical price movements on effective date. The variables I will analyze are country, industry, market class, weight change to the benchmark, excess trading volume, and historical price changes. We already use these factors to make decisions today, but those decisions are not derived from data; they are derived from the personal experience and so-called expertise of our portfolio managers.

### Data Analysis:
   
   I've collected data on over 4,000 stocks, all of which went through a weight change during the MSCI rebalance on May 31, 2017. Data collection began when MSCI announced the changes to the indices on May 15th. From May 15th to May 30th I believe the stocks trade differently, because many money managers try to make money off the index fund managers who must trade on May 31st. The data I collected is summarized in the table below.
   
Variable | Description | Type of Variable
--- | --- | ---
country | Country the stock trades in | Categorical
marketclass | DM = Developed, DMSC = Developed Small Cap, EM = Emerging, EMSC = Emerging Small Cap | Categorical
security | Security name | Description
industry | Industry classification of the security | Categorical
weightchange | Percentage of market cap change to the benchmarks | Number
excessvolume | Percentage difference in trading volume after announce date relative to the stock's annual average daily volume | Number
pricechangepercent | Percentage change in price of the stock on effective date | Number

   The data I collected is from the most recent rebalance. A more desirable dataset would span multiple rebalances. I intend to update my dataset every rebalance in the hope of finding trends that occur consistently each quarter.
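
The `excessvolume` definition in the table above can be written as a small helper. This is a minimal sketch, assuming the value is expressed as a fraction of the annual average daily volume; the function name and both argument names are hypothetical and not part of the dataset itself.

```python
def excess_volume(post_announce_adv, annual_adv):
    """Fractional difference between post-announcement and annual average daily volume.

    Both argument names are illustrative assumptions, not columns in the dataset.
    """
    if annual_adv <= 0:
        raise ValueError("annual_adv must be positive")
    return post_announce_adv / annual_adv - 1.0


# Made-up volumes: 1.5M shares/day after the announcement vs. a 1.0M annual average.
print(excess_volume(1_500_000, 1_000_000))  # 0.5, i.e. 50% above normal volume
```

A positive value means the stock traded more heavily than usual after the announcement; a negative value means it traded less.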
   
### What is the outcome:
  
   I want to create two models: one that predicts the actual percentage change in the price and one that predicts whether a security will trade up or down on effective date. For the first I will create a regression model with multiple variables, and for the second I will create a logistic regression model. I am hoping to gain any insight I can from these models so I can create a trade strategy based on data analysis as opposed to gut instinct.
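
The two models described above could be set up roughly as follows. This is only a sketch under stated assumptions: the data is randomly generated (it carries no market meaning), only a subset of the predictors is used, and scikit-learn's `LinearRegression` and `LogisticRegression` stand in for whichever estimators are eventually chosen.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression, LogisticRegression

# Synthetic stand-in data; every value here is random, not real rebalance data.
rng = np.random.default_rng(0)
n = 200
toy = pd.DataFrame({
    "marketclass": rng.choice(["DM", "DMSC", "EM", "EMSC"], size=n),
    "weightchange": rng.normal(0.0, 0.5, size=n),
    "excessvolume": rng.normal(0.0, 0.3, size=n),
})
# Assumed relationship, built in purely so the example has something to fit.
toy["pricechangeeffective"] = 0.8 * toy["excessvolume"] + rng.normal(0.0, 0.5, size=n)

# One-hot encode the categorical predictor; numeric predictors pass through.
X = pd.get_dummies(
    toy[["marketclass", "weightchange", "excessvolume"]],
    columns=["marketclass"], drop_first=True, dtype=float,
)
y_pct = toy["pricechangeeffective"]   # target for the regression model
y_up = (y_pct > 0).astype(int)        # 1 = traded up, 0 = traded down or flat

reg = LinearRegression().fit(X, y_pct)
clf = LogisticRegression(max_iter=1000).fit(X, y_up)

print(dict(zip(X.columns, reg.coef_.round(3))))  # fitted regression coefficients
print(clf.predict_proba(X.iloc[:3])[:, 1])       # estimated probability of an up move
```

Both models share the same feature matrix, so any difference in what they reveal comes from the target (size of the move versus direction of the move) rather than from the inputs.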

### What are the predictors:

   The predictors are country, market classification, industry classification, weight change to the benchmark, and excess trading volume between announcement and effective date.

### What is the Hypothesis:

   H: I predict that stock prices are affected by the index rebalance announcement and that these movements can help me predict the movement of a security on effective date.
    
### Initial Questions:
   Do brokers in specific countries follow the index announcements more than others?
   Does the market segment have an effect on price movement?
   Are certain industries affected more after announcement date?
   Will the degree of change to the benchmark give me an indication of price movement?
   If there is a large spike in volume prior to effective date, how will that affect the price on effective date?
   

### Exploratory Data Initiatives:
   Initially I will obtain descriptive statistics, distribution characteristics, and correlations among the variables. I will also create different plots that I hope will give me insight into how I should build a predictive model.

   Eventually I hope to find trends that will help me predict price movements of the rebalance stocks on effective date.
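
The first exploratory steps named above (descriptive statistics, correlations, and a per-group comparison) might look like the sketch below. The six rows are made up for illustration, and the column names are taken from the data table, so they may differ from the headers in the actual file.

```python
import pandas as pd

# Tiny made-up frame; values are illustrative only.
toy = pd.DataFrame({
    "marketclass": ["DM", "DM", "EM", "EM", "EMSC", "EMSC"],
    "weightchange": [0.10, -0.05, 0.20, 0.00, -0.30, 0.15],
    "excessvolume": [0.40, -0.10, 0.70, 0.05, -0.20, 0.30],
    "pricechangepercent": [0.8, -0.3, 1.5, 0.1, -0.9, 0.6],
})

numeric = toy.select_dtypes("number")

summary = numeric.describe()   # count, mean, std, quartiles per numeric column
corr = numeric.corr()          # pairwise Pearson correlations
by_class = (
    toy.groupby("marketclass")["pricechangepercent"]
       .agg(["mean", "std", "count"])   # price move summarized per market class
)

print(summary)
print(corr["pricechangepercent"])
print(by_class)
```

The same `groupby` pattern applies to `country` or `industry`, which speaks directly to the initial questions about whether some segments react more than others.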

In [19]:
import os

import numpy as np
import pandas as pd
pd.set_option('display.max_rows', 10)
pd.set_option('display.max_columns', 10)
pd.set_option('display.notebook_repr_html', True)

import matplotlib.pyplot as plt
%matplotlib inline
plt.style.use('ggplot')

import statsmodels.formula.api as smf

from sklearn import linear_model

In [20]:
df = pd.read_csv('QIRSummaryMay2017.csv')

In [21]:
df

Unnamed: 0,sedol,country,marketclass,pro_forma_gics_industry_group_name,weightchange,excessvolume,pricechangeannouncement,pricechangeeffective
0,B1TJG95,INDIA,EMSC,Insurance,308.33%,0.720641636,-94,93
1,BNN7WR9,CHINA,EMSC,Food Beverage & Tobacco,120.00%,0.309378945,-0.74,-0.26
2,BGSH2S6,BRAZIL,EMSC,Consumer Services,114.29%,0.044939547,-2.99,1.99
3,BD6P5Q0,USA,DMSC,Software & Services,107.51%,-0.34737835,-0.69,-0.31
4,BVG6VZ0,THAILAND,EMSC,Media,100.50%,-0.381416738,-0.5,-0.5
...,...,...,...,...,...,...,...,...
8529,6122265,TAIWAN,EMSC,,-100.00%,0.351928854,-1.25,0.25
8530,BS7JP33,TAIWAN,EMSC,,-100.00%,-0.272885381,2.5,-3.5
8531,B1CDRR9,TAIWAN,EMSC,,-100.00%,-0.143160222,-1.3,0.3
8532,B606XG6,TAIWAN,EMSC,,-100.00%,-0.306892754,-2.8,1.8


In [13]:
df.isnull().sum()

sedol                                   0
country                                 0
marketclass                             0
pro_forma_gics_industry_group_name    214
weightchange                            0
excessvolume                            0
pricechangeannouncement                 0
pricechangeeffective                    0
dtype: int64

I'm not going to drop the rows that are missing industry. I think it is more important to keep those securities in the analysis.
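
One way to keep those rows while still using industry as a categorical predictor is to give the missing values their own label. A minimal sketch on made-up rows; the `"Unknown"` label and the identifiers are assumptions for illustration.

```python
import numpy as np
import pandas as pd

col = "pro_forma_gics_industry_group_name"

toy = pd.DataFrame({
    "sedol": ["AAA0001", "BBB0002", "CCC0003"],   # made-up identifiers
    col: ["Media", np.nan, "Insurance"],
})

# Treat a missing industry as its own category instead of dropping the row.
toy[col] = toy[col].fillna("Unknown")

print(toy[col].value_counts())
```

This keeps every security in the sample, at the cost of one extra category whose coefficient has no industry interpretation.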

It seems as though price change is most correlated with excess volume. This seems to reinforce some of my original thoughts: market participants actively trade these securities prior to effective date because they know index portfolio managers will be buying these names from them at a higher price than they paid leading into the rebalance.

In [14]:
df.describe()

Unnamed: 0,sedol,country,marketclass,pro_forma_gics_industry_group_name,weightchange,excessvolume,pricechangeannouncement,pricechangeeffective
count,8534,8534,8534,8320,8534,8534,8534,8534
unique,8534,46,4,24,1150,8497,2520,2520
top,2795371,USA,DMSC,Capital Goods,0.00%,#VALUE!,-1,0
freq,1,2427,4063,939,4088,32,117,117
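
The `describe()` output above reports `count`, `unique`, `top`, and `freq`, which is what pandas returns for non-numeric columns. That suggests the numeric fields were read as strings: `weightchange` carries a `%` suffix and `excessvolume` contains spreadsheet error values such as `#VALUE!`. A minimal sketch of one way to coerce such columns to numbers before computing correlations, using a few made-up rows in the same format:

```python
import pandas as pd

# Made-up rows that mimic the string formats seen in the output above.
toy = pd.DataFrame({
    "weightchange": ["308.33%", "0.00%", "-100.00%"],
    "excessvolume": ["0.720641636", "#VALUE!", "-0.306892754"],
    "pricechangeeffective": ["1.99", "0", "-3.5"],
})

clean = pd.DataFrame({
    # Strip the percent sign, then convert from percent to a fraction.
    "weightchange": pd.to_numeric(
        toy["weightchange"].str.rstrip("%"), errors="coerce") / 100.0,
    # errors="coerce" turns unparseable strings such as "#VALUE!" into NaN.
    "excessvolume": pd.to_numeric(toy["excessvolume"], errors="coerce"),
    "pricechangeeffective": pd.to_numeric(
        toy["pricechangeeffective"], errors="coerce"),
})

print(clean.dtypes)
print(clean.isna().sum())   # how many values could not be parsed, per column
```

Once the columns are numeric, `clean.describe()` reports means and quartiles and `clean.corr()` becomes available.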


I need to separate the securities with zero change to the index and track how their prices move relative to the securities that have weight changes in the index.
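
A sketch of that split, assuming `weightchange` has already been converted to a numeric column. The rows are made up, and `haschange` is a hypothetical helper column introduced only for this example.

```python
import pandas as pd

toy = pd.DataFrame({
    "weightchange": [0.0, 0.25, 0.0, -1.0, 1.2],           # numeric fractions, made up
    "pricechangeeffective": [0.1, 1.4, -0.2, -2.0, 0.9],   # made-up values
})

# Flag securities whose index weight actually changed.
toy["haschange"] = toy["weightchange"] != 0

# Compare effective-date price moves between the two groups.
comparison = (
    toy.groupby("haschange")["pricechangeeffective"]
       .agg(["count", "mean", "std"])
)
print(comparison)
```

The zero-change group then acts as a baseline for how index members moved on effective date without any rebalance-driven trading.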