# Research

by Joshua Isaacson and Hannah Isaacson 

For our Fall 2017 SICE@IU undergraduate research project, *A Sentiment-Based Long-Short Equity Strategy*.

## Components

1. Universe Selection
2. Factor Analysis
3. Rebalancing
4. Portfolio
5. Pipeline

## Universe Selection

This component covers how we define the trading universe in which the algorithm operates.

### Imports 

In [99]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from quantopian.research import run_pipeline
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data.psychsignal import stocktwits
from quantopian.pipeline.data import Fundamentals
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.pipeline.filters.fundamentals import IsPrimaryShare
from quantopian.pipeline.factors import CustomFactor, Returns
from quantopian.pipeline.classifiers.fundamentals import Sector
from quantopian.pipeline.data.sentdex import sentiment_free
from quantopian.pipeline.factors import SimpleMovingAverage
from time import time
import alphalens as al

### Universe Filters

For our strategy, we restrict the base universe to securities that meet all of the following criteria. Each security:

* is a primary share
* is listed as a common stock
* is not a depositary receipt (ADR/GDR)
* is not trading over-the-counter (OTC)
* is not when-issued (WI)
* is not a limited partnership (LP)
* is not an ETF (has Morningstar fundamental data)
* has a price greater than or equal to $2.00
* is found in the PsychSignal dataset

In [100]:
#is a primary share
primary_share = IsPrimaryShare()

#is a common stock
common_stock = Fundamentals.security_type.latest.eq('ST00000001')

#not a depositary receipt
not_depositary = ~Fundamentals.is_depositary_receipt.latest

#not trading over-the-counter
not_otc = ~Fundamentals.exchange_id.latest.startswith('OTC')

#not when-issued
not_wi = ~Fundamentals.symbol.latest.endswith('.WI')

#not a limited partnership
not_lp_name = ~Fundamentals.standard_name.latest.matches('.* L[. ]?P.?$')
not_lp_balance_sheet = Fundamentals.limited_partnership.latest.isnull()

#not an ETF
have_market_cap = Fundamentals.market_cap.latest.notnull()

#equity price greater than or equal to $2.00
#(.latest turns the column into a factor, which can then be compared to a number)
price_filter = USEquityPricing.close.latest >= 2.00

#Filter
tradeable_stocks = (
    primary_share
    & common_stock
    & not_depositary
    & not_otc
    & not_wi
    & not_lp_name
    & not_lp_balance_sheet
    & have_market_cap
    & price_filter
)
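
To illustrate how the individual criteria combine, here is a minimal pandas sketch of the same idea outside the pipeline API. The tickers, column names, and values are hypothetical toy data, and only three of the criteria are mimicked; it is not the notebook's actual universe.

```python
import pandas as pd

# Hypothetical toy data: one row per security (not real market data).
securities = pd.DataFrame(
    {
        "is_primary_share": [True, True, False, True],
        "is_common_stock": [True, True, True, False],
        "close": [15.20, 1.50, 40.00, 8.75],
    },
    index=["AAA", "BBB", "CCC", "DDD"],
)

# Each criterion is a boolean Series; `&` keeps only rows that pass every test.
price_ok = securities["close"] >= 2.00
tradeable = securities["is_primary_share"] & securities["is_common_stock"] & price_ok

print(securities.index[tradeable].tolist())
```

Only `AAA` passes all three tests here: `BBB` fails the price floor, `CCC` is not a primary share, and `DDD` is not a common stock.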

## Factor Analysis

We want to test how well our alpha factors predict relative price movements. A wide range of mutually independent factors yields a better ranking scheme.

The factors we are going to evaluate are:
* bearish_intensity
* bullish_intensity
* sentiment_signal
* sentiment moving average (10, 20, 30, 50, 80 day)
    * simple and exponential
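
One common way to measure how well a factor predicts relative price movements is the rank (Spearman) correlation between factor values and subsequent returns, often called the information coefficient. The sketch below is a minimal illustration on made-up numbers, not the notebook's analysis; the tickers, values, and the `forward_return` column are all hypothetical.

```python
import pandas as pd

# Hypothetical cross-section for a single day: factor value and next-period return.
day = pd.DataFrame(
    {
        "factor": [0.5, 2.0, -1.0, 3.0, 1.0],
        "forward_return": [0.01, 0.03, -0.02, 0.02, 0.00],
    },
    index=["AAA", "BBB", "CCC", "DDD", "EEE"],
)

# Spearman correlation equals the Pearson correlation of the ranks.
ic = day["factor"].rank().corr(day["forward_return"].rank())

print(round(ic, 3))
```

A value near 1 means the factor orders stocks much as their later returns do, a value near 0 means no relationship, and a negative value means the ordering is reversed.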

### Fields in PsychSignal Dataset

In [101]:
def print_fields(dataset):
    """Print the name and dtype of every column in a pipeline dataset."""
    print("Dataset: %s\n" % dataset.__name__)
    print("Fields:")
    for field in dataset.columns:
        print("%s - %s" % (field.name, field.dtype))
    print("\n")

print_fields(stocktwits)

Dataset: stocktwits

Fields:
bull_scored_messages - float64
bullish_intensity - float64
symbol - object
bull_minus_bear - float64
bull_bear_msg_ratio - float64
source - object
bear_scored_messages - float64
total_scanned_messages - float64
asof_date - datetime64[ns]
bearish_intensity - float64




### Fields in Sentdex Sentiment Analysis Dataset

In [102]:
# print_fields is already defined in the PsychSignal cell above
print_fields(sentiment_free)

Dataset: sentiment_free

Fields:
sentiment_signal - float64
symbol - object
asof_date - datetime64[ns]




The dataset columns are assigned to variables to reduce clutter.

In [111]:
sentdex_sentiment_signal = sentiment_free.sentiment_signal
# name chosen by analogy with the line above
stocktwits_bearish_intensity = stocktwits.bearish_intensity

### Dealing with NaN Values

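In [None]:
# Assumes `df` holds sentiment values (it is not defined earlier in this notebook).
# Zeros count as missing; each gap becomes the mean of its forward- and back-filled neighbors.
signal = df.replace(to_replace=0, value=np.nan)
adjusted_sentiment_signal = signal.where(signal.notnull(), other=(signal.ffill() + signal.bfill()) / 2)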
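
The cell above appears to aim at replacing missing readings with the average of the nearest earlier and later observations. The following is a minimal sketch of that idea on a toy Series, assuming that zeros should be treated as missing; the values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sentiment readings; 0.0 and NaN both stand for "no usable signal".
signal = pd.Series([2.0, 0.0, 4.0, np.nan, 6.0])

# Step 1: mark zeros as missing.
missing = signal.replace(0, np.nan)

# Step 2: fill each gap with the mean of the previous and next valid values.
neighbor_mean = (missing.ffill() + missing.bfill()) / 2
filled = missing.where(missing.notnull(), other=neighbor_mean)

print(filled.tolist())
```

Gaps at the very start or end of a series have only one neighbor, so they remain NaN under this scheme and would need separate handling.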

### Sentiment Signal Moving Averages

Simple Moving Averages

In [107]:
sma_10 = SimpleMovingAverage(inputs=[sentiment_free.sentiment_signal], window_length=10)
sma_20 = SimpleMovingAverage(inputs=[sentiment_free.sentiment_signal], window_length=20)
sma_30 = SimpleMovingAverage(inputs=[sentiment_free.sentiment_signal], window_length=30)
sma_50 = SimpleMovingAverage(inputs=[sentiment_free.sentiment_signal], window_length=50)
sma_80 = SimpleMovingAverage(inputs=[sentiment_free.sentiment_signal], window_length=80)
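
Each factor above averages the sentiment signal over the trailing `window_length` days. As a rough pandas analogue on a single toy series (the values and the 3-day window are hypothetical, chosen to keep the arithmetic easy to check):

```python
import pandas as pd

# Hypothetical daily sentiment readings for one stock.
daily_signal = pd.Series([1.0, 3.0, 5.0, 7.0, 9.0])

# Trailing 3-day simple moving average; the first two days lack a full window.
sma_3 = daily_signal.rolling(window=3).mean()

print(sma_3.tolist())
```

Longer windows smooth the signal more but react more slowly to new information, which is why several window lengths are evaluated side by side.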

Exponential Weighted Moving Averages
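
No exponential averages are defined in this notebook yet. As a minimal sketch of the idea, assuming a plain pandas series rather than a pipeline factor, an exponentially weighted average gives recent observations more weight than older ones. The values and the `span` of 3 below are hypothetical.

```python
import pandas as pd

# Hypothetical daily sentiment readings for one stock.
daily_signal = pd.Series([1.0, 3.0, 5.0, 7.0, 9.0])

# span=3 corresponds to a smoothing factor alpha = 2 / (span + 1) = 0.5.
# adjust=False uses the recursive form: y[t] = (1 - alpha) * y[t-1] + alpha * x[t].
ewma_3 = daily_signal.ewm(span=3, adjust=False).mean()

print(ewma_3.tolist())
```

Compared with a simple moving average of similar length, the exponential version responds faster to the latest readings because their weights decay geometrically rather than dropping to zero at the window edge.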

### Sector Codes

In [108]:
MORNINGSTAR_SECTOR_CODES = {
     -1: 'Misc',
    101: 'Basic Materials',
    102: 'Consumer Cyclical',
    103: 'Financial Services',
    104: 'Real Estate',
    205: 'Consumer Defensive',
    206: 'Healthcare',
    207: 'Utilities',
    308: 'Communication Services',
    309: 'Energy',
    310: 'Industrials',
    311: 'Technology',
}
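
A lookup table like this is typically used to turn numeric sector codes into readable labels. A minimal sketch, assuming the codes arrive as a pandas Series; the tickers are hypothetical and only a subset of the table is repeated to keep the example self-contained.

```python
import pandas as pd

# Subset of the sector table above, repeated so the example runs on its own.
SECTOR_NAMES = {
    -1: "Misc",
    103: "Financial Services",
    206: "Healthcare",
    311: "Technology",
}

# Hypothetical sector codes for four made-up tickers.
sector_codes = pd.Series([311, 206, -1, 103], index=["AAA", "BBB", "CCC", "DDD"])

# Series.map looks up each code in the dictionary.
sector_names = sector_codes.map(SECTOR_NAMES)

print(sector_names.tolist())
```

Codes that are absent from the dictionary map to NaN, so an unexpected code shows up as a missing label rather than raising an error.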

### Getting Data

In [109]:
pipe = Pipeline()

pipe.add(stocktwits.bearish_intensity.latest, 'bearish_intensity')
pipe.add(stocktwits.bullish_intensity.latest, 'bullish_intensity')
pipe.add(sentiment_free.sentiment_signal.latest, 'sentiment_signal')
pipe.add(sma_10, 'sma_10')
pipe.add(sma_20, 'sma_20')
pipe.add(sma_30, 'sma_30')
pipe.add(sma_50, 'sma_50')
pipe.add(sma_80, 'sma_80')

start_timer = time()
results = run_pipeline(pipe, '2015-01-01', '2016-01-01')
end_timer = time()


In [110]:
results.head()

|  |  | bearish_intensity | bullish_intensity | sentiment_signal | sma_10 | sma_20 | sma_30 | sma_50 | sma_80 |
|---|---|---|---|---|---|---|---|---|---|
| 2015-01-02 00:00:00+00:00 | Equity(2 [ARNC]) | 0.0 | 1.2 | 2.0 | 2.8 | 3.6 | 4.266667 | 4.26 | 2.7375 |
| 2015-01-02 00:00:00+00:00 | Equity(21 [AAME]) | 0.0 | 0.0 | NaN | NaN | NaN | NaN | NaN | NaN |
| 2015-01-02 00:00:00+00:00 | Equity(24 [AAPL]) | 1.82 | 1.46 | 2.0 | 1.8 | 0.2 | 0.8 | 0.8 | 0.875 |
| 2015-01-02 00:00:00+00:00 | Equity(25 [ARNC_PR]) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2015-01-02 00:00:00+00:00 | Equity(31 [ABAX]) | 0.0 | 0.0 | NaN | NaN | NaN | NaN | NaN | NaN |
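
The pipeline output is indexed by two levels, date and asset, so most follow-up work is either a single-day cross-section or a per-day operation. The sketch below shows both on a toy frame with the same index shape; the tickers, values, and the level names `date` and `asset` are assumptions for illustration (the real output leaves the levels unnamed).

```python
import pandas as pd

# Hypothetical frame shaped like the pipeline output: (date, asset) MultiIndex.
index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2015-01-02", "2015-01-05"]), ["AAA", "BBB", "CCC"]],
    names=["date", "asset"],
)
toy_results = pd.DataFrame(
    {"sentiment_signal": [2.0, -1.0, 5.0, 3.0, 6.0, 0.0]}, index=index
)

# Cross-section: every asset on one date.
one_day = toy_results.xs(pd.Timestamp("2015-01-02"), level="date")

# Per-day ranking: 1 = strongest sentiment on that date.
daily_rank = toy_results.groupby(level="date")["sentiment_signal"].rank(ascending=False)

print(one_day)
print(daily_rank)
```

Ranking within each date, rather than across the whole frame, keeps the comparison relative to that day's universe, which is what a long-short ranking scheme needs.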
