# Research

by Joshua Isaacson and Hannah Isaacson 

This is the research for our Fall 2017 SICE@IU undergraduate research project, *A Sentiment-Based Long-Short Equity Strategy*.

## Components

1. Universe Selection
2. Factor Analysis
3. Rebalancing
4. Portfolio
5. Pipeline

## Universe Selection

This component covers how we define the trading universe in which the algorithm operates.

### Imports 

In [7]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from quantopian.pipeline.data.psychsignal import stocktwits
from quantopian.pipeline.data import Fundamentals
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.pipeline.filters.fundamentals import IsPrimaryShare

### Universe Filters

For our strategy, we want our base universe to be filtered by the following criteria:

* is a primary share
* is listed as a common stock
* is not a depositary receipt (ADR/GDR)
* is not trading over-the-counter (OTC)
* is not when-issued (WI)
* is not a limited partnership (LP)
* is not an ETF (ETFs have no Morningstar fundamental data, so we require a non-null market cap)
* has a price greater than or equal to $2.00
* is found in the PsychSignal dataset

In [8]:
def universe_filter():
    """Return a pipeline filter for the tradeable base universe."""

    # Is a primary share
    primary_share = IsPrimaryShare()

    # Is a common stock (Morningstar security type code for common stock)
    common_stock = Fundamentals.security_type.latest.eq('ST00000001')

    # Not a depositary receipt (ADR/GDR)
    not_depositary = ~Fundamentals.is_depositary_receipt.latest

    # Not trading over-the-counter
    not_otc = ~Fundamentals.exchange_id.latest.startswith('OTC')

    # Not when-issued
    not_wi = ~Fundamentals.symbol.latest.endswith('.WI')

    # Not a limited partnership: check both the company name and the balance sheet
    not_lp_name = ~Fundamentals.standard_name.latest.matches('.* L[. ]?P.?$')
    not_lp_balance_sheet = Fundamentals.limited_partnership.latest.isnull()

    # Not an ETF: ETFs have no Morningstar fundamental data, so market cap is null
    have_market_cap = Fundamentals.market_cap.latest.notnull()

    # Latest close price greater than or equal to $2.00
    # (.latest turns the price column into a factor that can be compared)
    price_filter = USEquityPricing.close.latest >= 2.00

    # Combine the criteria; a stock must pass every filter
    tradeable_stocks = (
        primary_share
        & common_stock
        & not_depositary
        & not_otc
        & not_wi
        & not_lp_name
        & not_lp_balance_sheet
        & have_market_cap
        & price_filter
    )

    return tradeable_stocks
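
The limited-partnership name check relies on a regular expression that is easy to misread. The sketch below runs the same pattern through Python's standard `re` module on a few made-up company names. It assumes the pipeline's `matches` method applies the pattern with `re.match` semantics (anchored at the start of the string); the helper `looks_like_lp` and the sample names are hypothetical and not taken from Morningstar data.

```python
import re

# Same pattern that universe_filter() passes to standard_name.latest.matches(...).
LP_NAME_PATTERN = re.compile(r'.* L[. ]?P.?$')


def looks_like_lp(standard_name):
    """Return True if the name ends in an LP-style suffix (LP, L.P., L P, ...)."""
    return LP_NAME_PATTERN.match(standard_name) is not None


# Hypothetical names, for illustration only.
sample_names = [
    'EXAMPLE PIPELINE PARTNERS LP',
    'EXAMPLE ENERGY PARTNERS L.P.',
    'EXAMPLE HOLDINGS L P',
    'EXAMPLE SOFTWARE INC',
]

lp_flags = {name: looks_like_lp(name) for name in sample_names}

for name in sample_names:
    print('%-32s %s' % (name, lp_flags[name]))
```

Note that the trailing `.?` accepts any single character, so a name ending in ` LPX` would also be flagged by this pattern.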

## Factor Analysis

We want to test how well our alpha factors predict relative price movements. Combining a wide range of mutually independent factors tends to yield a better ranking scheme.
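
One common way to put a number on how well a factor predicts relative price movements is the information coefficient: the Spearman rank correlation between factor values and subsequent returns across stocks. The sketch below is a plain-Python illustration of that idea and is not necessarily the measure used in this project; the function names and sample numbers are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_ic(factor_values, forward_returns):
    """Pearson correlation of the ranks; None if either input is constant."""
    factor_ranks = average_ranks(factor_values)
    return_ranks = average_ranks(forward_returns)
    n = len(factor_ranks)
    mean_f = sum(factor_ranks) / n
    mean_r = sum(return_ranks) / n
    cov = sum((f - mean_f) * (r - mean_r)
              for f, r in zip(factor_ranks, return_ranks))
    var_f = sum((f - mean_f) ** 2 for f in factor_ranks)
    var_r = sum((r - mean_r) ** 2 for r in return_ranks)
    if var_f == 0 or var_r == 0:
        return None
    return cov / (var_f * var_r) ** 0.5


# Hypothetical factor values and next-period returns for four stocks.
factor = [0.8, -0.2, 0.5, 0.1]
returns = [0.03, -0.01, 0.02, 0.00]
ic = spearman_ic(factor, returns)
print(ic)
```

A value near 1 means the factor ranks stocks in the same order as their later returns, a value near -1 means the ranking is inverted, and a value near 0 means the factor carries little ranking information.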

The factors we are going to evaluate are:

* bearish_intensity
* bullish_intensity
* sentiment score, defined as

$$\text{SentimentScore} = \frac{\text{bullScoredMessages} - \text{bearScoredMessages}}{\text{bullScoredMessages} + \text{bearScoredMessages}}$$

* sentiment moving average (10-, 20-, 30-, and 50-day)
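
To make the sentiment score and its moving average concrete, here is a minimal plain-Python sketch. The function names and the sample message counts are hypothetical, returning `None` when no messages were scored is an assumption about how to handle the zero denominator, and a 3-day window stands in for the 10- to 50-day windows only to keep the example short.

```python
def sentiment_score(bull_scored_messages, bear_scored_messages):
    """(bull - bear) / (bull + bear), or None when no messages were scored."""
    total = bull_scored_messages + bear_scored_messages
    if total == 0:
        return None  # assumption: treat "no scored messages" as missing
    return (bull_scored_messages - bear_scored_messages) / float(total)


def simple_moving_average(values, window):
    """Trailing mean over `window` observations.

    Yields None until the window is full or if it contains a missing value.
    """
    averages = []
    for i in range(len(values)):
        recent = values[max(0, i + 1 - window):i + 1]
        if len(recent) < window or None in recent:
            averages.append(None)
        else:
            averages.append(sum(recent) / float(window))
    return averages


# Hypothetical daily (bull, bear) scored-message counts for one stock.
daily_counts = [(3, 1), (1, 3), (4, 0), (2, 2)]
scores = [sentiment_score(bull, bear) for bull, bear in daily_counts]
smoothed = simple_moving_average(scores, window=3)

print(scores)
print(smoothed)
```

The score always lies between -1 (only bearish messages) and 1 (only bullish messages), so the moving average is bounded by the same range.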

### Fields in PsychSignal Dataset

In [9]:
def print_fields(dataset):
    """Print the name and dtype of every column in a pipeline dataset."""
    print("Dataset: %s\n" % dataset.__name__)
    print("Fields:")
    for field in list(dataset.columns):
        print("%s - %s" % (field.name, field.dtype))
    print("\n")

for data in (stocktwits,):
    print_fields(data)

Dataset: stocktwits

Fields:
bear_scored_messages - float64
asof_date - datetime64[ns]
bearish_intensity - float64
bull_scored_messages - float64
bullish_intensity - float64
symbol - object
bull_minus_bear - float64
bull_bear_msg_ratio - float64
source - object
total_scanned_messages - float64




### Sector Codes

In [11]:
MORNINGSTAR_SECTOR_CODES = {
     -1: 'Misc',
    101: 'Basic Materials',
    102: 'Consumer Cyclical',
    103: 'Financial Services',
    104: 'Real Estate',
    205: 'Consumer Defensive',
    206: 'Healthcare',
    207: 'Utilities',
    308: 'Communication Services',
    309: 'Energy',
    310: 'Industrials',
    311: 'Technology',
}