# Earnings Open Interest Shift Trading

Our goal is to trade earnings announcements using a few indicators, under the following assumption: options markets are more sophisticated than equity markets and process news better. The signal we want to develop builds on this finding:

> We find that option traders’ information processing ability stems primarily from
> their ability to identify underreaction to news: their trades on news days are only
> informative about future stock prices when they buy call options on stocks that have
> experienced positive news events and when they buy put options on stocks that have
> experienced negative news events, while otherwise their trades are on average uninformative.
> 
> -[M. Cremers, A. Fodor, D. Weinbaum][1]

We have pulled earnings surprise data from [StreetInsider.com](http://www.streetinsider.com/), so we
can classify each earnings announcement as "positive" or "negative" news by the sign of its surprise.
We will test thresholds for how large a surprise must be to count as "positive" or "negative"; we expect
that many surprises will be very small and therefore neutral events overall. We do not want to trade these events.
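
One way to make the neutral cutoff concrete is to bucket each surprise by its relative size. The sketch below is illustrative only: the helper name `classify_surprise`, the relative-surprise definition, and the 5% default threshold are assumptions, not values taken from the StreetInsider data or from the paper.

```python
def classify_surprise(actual_eps, estimate_eps, threshold=0.05):
    """Label an earnings surprise as 'positive', 'negative', or 'neutral'.

    Hypothetical helper: the relative-surprise definition and the 5%
    default threshold are assumptions chosen for illustration.
    """
    if estimate_eps == 0:
        # Relative surprise is undefined here; fall back to the raw difference.
        surprise = actual_eps - estimate_eps
    else:
        surprise = (actual_eps - estimate_eps) / abs(estimate_eps)

    if surprise > threshold:
        return 'positive'
    if surprise < -threshold:
        return 'negative'
    # Small surprises in either direction are treated as non-events.
    return 'neutral'


for actual, estimate in [(1.10, 1.00), (0.98, 1.00), (0.80, 1.00)]:
    print(actual, estimate, classify_surprise(actual, estimate))
```

The threshold is the parameter we would vary when testing how strong a surprise needs to be before an event is worth trading.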

Additionally, we need a proxy for what it means to "buy call options" or "buy put options."
We use the change in an option's open interest as a proxy for buying pressure in either direction.
For example, if call open interest increases while put open interest decreases, we
classify this as "buying call options" behavior.
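
The open interest proxy can be sketched as a small classifier, combined with the news direction from the quote above. This is a minimal sketch under stated assumptions: the function names, the label strings, and the rule that mixed or flat changes produce no signal are all hypothetical, not the notebook's actual implementation.

```python
def classify_oi_shift(call_oi_prior, call_oi_now, put_oi_prior, put_oi_now):
    """Classify the open interest shift between two dates.

    Hypothetical helper: returns 'call_buying' when call open interest
    rises while put open interest falls, 'put_buying' for the mirror
    case, and 'unclear' otherwise.
    """
    call_change = call_oi_now - call_oi_prior
    put_change = put_oi_now - put_oi_prior
    if call_change > 0 and put_change < 0:
        return 'call_buying'
    if put_change > 0 and call_change < 0:
        return 'put_buying'
    return 'unclear'


def trade_direction(news, oi_shift):
    """Return +1 (long), -1 (short), or 0 (no trade).

    Trades only when option buying agrees with the news direction,
    following the quoted finding. `news` is assumed to be one of
    'positive', 'negative', or 'neutral'.
    """
    if news == 'positive' and oi_shift == 'call_buying':
        return 1
    if news == 'negative' and oi_shift == 'put_buying':
        return -1
    return 0


shift = classify_oi_shift(call_oi_prior=1000, call_oi_now=1400,
                          put_oi_prior=900, put_oi_now=700)
print(shift, trade_direction('positive', shift))
```

Cases where calls and puts move in the same direction are left as `'unclear'` here; whether to trade them is a design choice we have not settled.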

[1]: http://optionmetrics.com/m-cremers-a-fodor-d-weinbaum-where-do-informed-traders-trade-first-option-trading-activity-news-releases-and-stock-return-predictability-working-paper-series/

In [2]:
import sys
sys.path.append('../../utils/')

from query import get_connection
from trading_days import TradingDay

date_format = '%Y-%m-%d'

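We have previously compiled a list of earnings dates for stocks in the Russell 3000. We will look at earnings data for the top 50 and middle 50 companies by market capitalization. The selection below takes tickers in file order, so it assumes `all_earnings.csv` is sorted by market capitalization.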

In [None]:
import pandas as pd
import numpy as np
earnings_dates = pd.read_csv('all_earnings.csv', index_col=0)
earnings_dates['NormDate'] = pd.to_datetime(
    earnings_dates['NormDate'],
    format='%Y-%m-%d',
    errors='coerce'
)
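# Unique tickers, in the order they first appear in the file.
tickers = earnings_dates['Ticker'].unique()
top_50 = tickers[:50]

# 50 tickers starting at the midpoint of the list.
tick_len = len(tickers)
mid_50 = tickers[tick_len//2:tick_len//2+50]

all_tickers = np.hstack((top_50, mid_50))
all_tickers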

We're now going to fill our database with the information our strategy needs. For each ticker we fetch all of its earnings dates and populate the database once per parameter setting. Each setting is a pair: the number of business days before the earnings date used as the prior date, and the number of business days after it at which the trade closes.

In [None]:
date_settings = [
    (1, 20),
    (1, 10),
    (1, 30),
    (2, 10),
    (2, 20),
    (2, 30),
    (3, 10),
    (3, 20),
    (3, 30)
]

# Read the SQL template once; the context manager closes the file.
with open('OpenInterestInsert.sql', 'r') as f:
    base_insert = f.read()

c = get_connection()
for bus_days_prior, bus_days_after in date_settings:
    td_prior = TradingDay(bus_days_prior)
    td_after = TradingDay(bus_days_after)
    for ticker in all_tickers:
        # Drop dates that failed to parse (errors='coerce' yields NaT,
        # which cannot be formatted with strftime).
        dates = earnings_dates.loc[
            earnings_dates['Ticker'] == ticker, 'NormDate'
        ].dropna()
        print("Processing {} dates for ticker {}".format(
            len(dates), ticker
        ))
        for date in dates:
            query = base_insert.format(
                ticker=ticker,
                bus_days_prior=bus_days_prior,
                bus_days_after=bus_days_after,
                trade_open=date.strftime(date_format),
                trade_close=(date + td_after).strftime(date_format),
                prior_date=(date - td_prior).strftime(date_format)
            )
            try:
                c.execute(query)
            except Exception:
                # Show the failing query before re-raising.
                print(query)
                raise
            print("It didn't break!")
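
To make the date arithmetic in the loop concrete, here is a standalone sketch of how the three dates in each query relate to one earnings date. It uses pandas' `BDay` offset as a stand-in for our `TradingDay` helper, which is an assumption: `BDay` skips weekends only and knows nothing about exchange holidays, so the results may differ from what `TradingDay` produces. The function name `trade_window` is hypothetical.

```python
import pandas as pd
from pandas.tseries.offsets import BDay

date_format = '%Y-%m-%d'


def trade_window(earnings_date, bus_days_prior, bus_days_after):
    """Return the prior, open, and close dates for one earnings event.

    Hypothetical helper: BDay stands in for the notebook's TradingDay
    and ignores exchange holidays.
    """
    date = pd.Timestamp(earnings_date)
    return {
        'prior_date': (date - BDay(bus_days_prior)).strftime(date_format),
        'trade_open': date.strftime(date_format),
        'trade_close': (date + BDay(bus_days_after)).strftime(date_format),
    }


# Example with the (1, 10) setting on a Monday earnings date.
print(trade_window('2016-02-01', 1, 10))
```

With this input the close date lands on 2016-02-15, which was a U.S. market holiday (Presidents' Day). That is exactly the kind of case where a holiday-aware trading-day helper is needed instead of a plain business-day offset.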