Quantopian Pairs Trading Algo - Deprecation Fix #1550

Closed
wants to merge 252 commits into from

@ash487 ash487 commented Oct 23, 2016

Greetings Quantopian Community,

I attended the NYC event on pairs trading. The current example algorithm is deprecated, so it cannot be deployed in live trading. With the fix marked in the code below, users can deploy the algorithm in live trading.

import numpy as np
import statsmodels.api as sm
import pandas as pd
from zipline.utils import tradingcalendar
import pytz


def initialize(context):
    # Quantopian backtester specific variables
    set_slippage(slippage.FixedSlippage(spread=0))
    set_commission(commission.PerTrade(cost=1))
    set_symbol_lookup_date('2010-01-01')

    # Placeholder tickers: fill in the two symbols of the pair to trade.
    context.stock_pairs = [(symbol(' '), symbol(' '))]
    # set_benchmark(context.y)

    context.num_pairs = len(context.stock_pairs)
    # strategy specific variables
    context.lookback = 20 # used for regression
    context.z_window = 20 # used for zscore calculation, must be <= lookback

    context.spread = np.ndarray((context.num_pairs, 0))
    # context.hedgeRatioTS = np.ndarray((context.num_pairs, 0))
    context.inLong = [False] * context.num_pairs
    context.inShort = [False] * context.num_pairs

    # Only do work 30 minutes before close
    schedule_function(func=check_pair_status, date_rule=date_rules.every_day(), time_rule=time_rules.market_close(minutes=30))

# Will be called on every trade event for the securities you specify. 
def handle_data(context, data):
    # Our work is now scheduled in check_pair_status
    pass

def check_pair_status(context, data):
    if get_open_orders():
        return

    new_spreads = np.ndarray((context.num_pairs, 1))

    for i in range(context.num_pairs):

        (stock_y, stock_x) = context.stock_pairs[i]


        # *** THE CHANGE ***
        ######################################################

        Y = data.history(stock_y, 'price', 35, '1d').iloc[-context.lookback:]
        X = data.history(stock_x, 'price', 35, '1d').iloc[-context.lookback:]

        ######################################################

        try:
            hedge = hedge_ratio(Y, X, add_const=True)      
        except ValueError as e:
            log.debug(e)
            return

        # context.hedgeRatioTS = np.append(context.hedgeRatioTS, hedge)

        # Use positional access: Y and X are Series indexed by date
        new_spreads[i, :] = Y.iloc[-1] - hedge * X.iloc[-1]

        if context.spread.shape[1] > context.z_window:
            # Keep only the z-score lookback period
            spreads = context.spread[i, -context.z_window:]

            zscore = (spreads[-1] - spreads.mean()) / spreads.std()

            if context.inShort[i] and zscore < 0.0:
                order_target(stock_y, 0)
                order_target(stock_x, 0)
                context.inShort[i] = False
                context.inLong[i] = False
                record(X_pct=0, Y_pct=0)
                return

            if context.inLong[i] and zscore > 0.0:
                order_target(stock_y, 0)
                order_target(stock_x, 0)
                context.inShort[i] = False
                context.inLong[i] = False
                record(X_pct=0, Y_pct=0)
                return

            if zscore < -1.0 and (not context.inLong[i]):
                # Only trade if NOT already in a trade
                y_target_shares = 1
                X_target_shares = -hedge
                context.inLong[i] = True
                context.inShort[i] = False

                (y_target_pct, x_target_pct) = computeHoldingsPct(y_target_shares, X_target_shares, Y.iloc[-1], X.iloc[-1])
                # Give each pair an equal 1/num_pairs share of the portfolio
                order_target_percent(stock_y, y_target_pct / float(context.num_pairs))
                order_target_percent(stock_x, x_target_pct / float(context.num_pairs))
                record(Y_pct=y_target_pct, X_pct=x_target_pct)
                return

            if zscore > 1.0 and (not context.inShort[i]):
                # Only trade if NOT already in a trade
                y_target_shares = -1
                X_target_shares = hedge
                context.inShort[i] = True
                context.inLong[i] = False

                (y_target_pct, x_target_pct) = computeHoldingsPct(y_target_shares, X_target_shares, Y.iloc[-1], X.iloc[-1])
                # Give each pair an equal 1/num_pairs share of the portfolio
                order_target_percent(stock_y, y_target_pct / float(context.num_pairs))
                order_target_percent(stock_x, x_target_pct / float(context.num_pairs))
                record(Y_pct=y_target_pct, X_pct=x_target_pct)

    context.spread = np.hstack([context.spread, new_spreads])

def hedge_ratio(Y, X, add_const=True):
    if add_const:
        X = sm.add_constant(X)
        model = sm.OLS(Y, X).fit()
        return model.params[1]
    model = sm.OLS(Y, X).fit()
    return model.params.values

def computeHoldingsPct(yShares, xShares, yPrice, xPrice):
    yDol = yShares * yPrice
    xDol = xShares * xPrice
    notionalDol =  abs(yDol) + abs(xDol)
    y_target_pct = yDol / notionalDol
    x_target_pct = xDol / notionalDol
    return (y_target_pct, x_target_pct)
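
For readers without access to the Quantopian environment, the signal arithmetic in the algorithm above can be tried on its own. The following is a minimal sketch, not the algorithm itself: it swaps `sm.OLS` for a plain NumPy least-squares fit, and the names `ols_hedge_ratio`, `zscore_of_last`, and `holdings_pct`, as well as the price series, are made up for illustration.

```python
import numpy as np


def ols_hedge_ratio(y, x):
    """Slope of y regressed on x with an intercept.

    NumPy stand-in (an assumption for this sketch) for the
    sm.OLS call in hedge_ratio().
    """
    design = np.column_stack([np.ones(len(x)), x])
    coeffs, _, _, _ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs[1]


def zscore_of_last(spreads):
    """Z-score of the most recent spread within its window."""
    spreads = np.asarray(spreads, dtype=float)
    return (spreads[-1] - spreads.mean()) / spreads.std()


def holdings_pct(y_shares, x_shares, y_price, x_price):
    """Same arithmetic as computeHoldingsPct(): signed dollar weights
    normalized by the gross notional of the two legs."""
    y_dol = y_shares * y_price
    x_dol = x_shares * x_price
    notional = abs(y_dol) + abs(x_dol)
    return y_dol / notional, x_dol / notional


# Synthetic prices, invented for illustration: y is exactly 2 * x + 1.
x = np.array([10.0, 11.0, 12.0, 13.0, 14.0])
y = 2.0 * x + 1.0

hedge = ols_hedge_ratio(y, x)

# Long one share of y, short `hedge` shares of x (the zscore < -1 branch).
y_pct, x_pct = holdings_pct(1, -hedge, y[-1], x[-1])

# A window whose last value sits well above the rest.
z = zscore_of_last([0.0, 0.0, 0.0, 0.0, 5.0])
```

Because the weights are divided by the gross notional, their absolute values always sum to one, so the pair as a whole uses the allocated fraction of the portfolio regardless of the hedge ratio.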

phil.zhang and others added 30 commits September 2, 2016 16:47
When in python2.7, and unicode_literals is imported
type check will raise error because 'type' is not str but unicode
…#1470)

This reverts commit 5b1aa5e.

The paradigm is: we're calculating a new capital base for the
performance period. We are therefore using the total
portfolio_value, not just the cash, to calculate the
difference from the specified target as the algorithm
has meaningful holdings.
Remove module scope invocations of `get_calendar('NYSE')`, which cuts
zipline import time in half on my machine. This makes the zipline CLI
noticeably more responsive, and it reduces memory consumed at import
time from 130MB to 90MB.

Before:

$ time python -c 'import zipline'

real    0m1.262s
user    0m1.128s
sys     0m0.120s

After:

$ time python -c 'import zipline'

real    0m0.676s
user    0m0.536s
sys     0m0.132s
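
The general pattern behind this kind of import-time saving can be sketched as follows. This is an illustration under stated assumptions, not the actual zipline change: `_build_calendar` and `get_calendar_lazily` are hypothetical names, and the dictionary stands in for an expensive calendar object. The point is only that nothing is built when the module is imported, and the object is built once on first use.

```python
from functools import lru_cache

# Counts how many times the (pretend) expensive construction runs.
BUILD_COUNT = 0


def _build_calendar(name):
    """Hypothetical stand-in for an expensive calendar constructor."""
    global BUILD_COUNT
    BUILD_COUNT += 1
    return {"name": name}


@lru_cache(maxsize=None)
def get_calendar_lazily(name):
    """Build the calendar on first request and reuse it afterwards."""
    return _build_calendar(name)


# No module-scope call has happened yet, so nothing has been built.
builds_before_first_use = BUILD_COUNT

first = get_calendar_lazily("NYSE")
second = get_calendar_lazily("NYSE")
```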
MAINT: remove __getitem__ as alias of __getattr__
…me-from-pushing-this-commit-directly-;_;

ENH: improve warning for protocol getitem
Update release notes.

Generate api stubs.
REL: Prepare for 1.0.2 release.
PERF: Remove import-time calendar creations.
* REF: More options before raise MultiFound.

* TST: Checks corner case for fuzzy matching.
BUG: run_algorithm with no data source should default
This reverts commit a5ecaf4.

This causes downstream problems; unsure why, Jamie advised
reverting.
These were previously available like the others.
Refcount pipeline terms during execution and release terms once they're
no longer needed.

This dramatically reduces memory usage on large pipelines.
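
The idea of releasing intermediate results as soon as nothing else needs them can be sketched in a few lines. This is a toy model under stated assumptions, not zipline's pipeline engine: `run_with_refcounts`, its arguments, and the four-term chain are all hypothetical.

```python
def run_with_refcounts(order, deps, compute, outputs):
    """Evaluate terms in dependency order, dropping each result once
    every term that consumes it has run.

    order:   terms sorted so that inputs come before their consumers
    deps:    dict mapping a term to the list of terms it reads
    compute: callable(term, inputs_dict) -> value
    outputs: set of terms whose values must be returned
    """
    # One reference per consumer, plus one for each requested output.
    refcounts = {term: 0 for term in order}
    for term in order:
        for dep in deps.get(term, ()):
            refcounts[dep] += 1
    for term in outputs:
        refcounts[term] += 1

    workspace = {}
    peak = 0
    for term in order:
        inputs = {dep: workspace[dep] for dep in deps.get(term, ())}
        workspace[term] = compute(term, inputs)
        peak = max(peak, len(workspace))

        # This term is done with its inputs: release the ones nobody needs.
        for dep in deps.get(term, ()):
            refcounts[dep] -= 1
            if refcounts[dep] == 0:
                del workspace[dep]

        # A term with no consumers that is not an output can go right away.
        if refcounts[term] == 0:
            del workspace[term]

    return workspace, peak


# A simple chain a -> b -> c -> d where only d is requested.
order = ["a", "b", "c", "d"]
deps = {"b": ["a"], "c": ["b"], "d": ["c"]}
result, peak = run_with_refcounts(
    order,
    deps,
    compute=lambda term, inputs: sum(inputs.values()) + 1,
    outputs={"d"},
)
```

In this chain at most two results are held at once, whereas keeping everything until the end would hold all four.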
Eddie Hebert and others added 16 commits October 26, 2016 14:41
There have been cases where the requested start or end date is not in
the history calendar.

Add the beginning and end of the calendar to the KeyError to give more
detail to figure out root cause.
…ar-mismatch

MAINT: Add more info to history calendar KeyError.
This provides a 15% speedup for an algo that calls `data.current` with
1000 every minute.
Make `__next__` and `seek` share code instead of seek() calling
`__next__`.  This avoids having to make a large number of integer
comparisons and `asanyarray` calls when seeking more than one tick
forward.
This is a dramatic speedup (~25% in local benchmarks) for history calls
with a large number of assets and a short window length.
This shaves off 20 out of 160 seconds for an algorithm that makes a
large number of large universe, short window_length `history()` calls.
`_get_minute_window_data` was just forwarding its input to a method with
the same signature.
Avoids a couple function calls in a hot path.
Instead of using the difference between the session close of the front
contract before the roll and the open of the back contract on the
beginning of the roll, use the close of both at the end of the session
before the roll.

The close of the session prior to the roll is used in lieu of settlement data.
…t-closes

BUG: Use proxy for settlement on future adjustments.
@richafrank
Member

Hi @ash487! It looks like this PR proposes we merge our master branch into @ssanderson's "revamp-tutorial" branch. Is the fix you're suggesting on a branch somewhere that you want us to pull in?

Eddie Hebert and others added 11 commits October 27, 2016 16:23
Apply offset value when writing out the rolls in a continuous future
which is offset from the primary.
This boundary case was exposed with internal fixture data which used a
continuous future with a contract chain of size one.
BUG: Fix continuous future history with offsets.
This will keep `opens`, `closes`, `early_closes`, etc to the
same pattern.
Rename _get_daily_window_for_sids to _get_daily_window_data.
Rename _get_minute_window_for_assets to _get_minute_window_data.
Rename _get_daily_data to get_daily_spot_value.
@richafrank
Member

@ash487 I'm going to close this, but feel free to open a new PR using the branch with your fix!

@richafrank richafrank closed this Oct 28, 2016