New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create in-memory restricted list #1487

Merged
merged 15 commits into from Oct 3, 2016

Conversation

Projects
None yet
3 participants
@lianga888
Contributor

lianga888 commented Sep 14, 2016

  • Create a restricted list manager that takes in information about restricted sids and stores in memory upon instantiation
  • Create a restrictions controller that aggregates and provides restrictions information from multiple restricted list managers
  • Make this restrictions controller available to BarData so that can_trade can take into account these restrictions
  • Register each restricted list manager as a trading control using set_do_not_order_list so that we may log/fail when we try to order a restricted sid
@coveralls

This comment has been minimized.

coveralls commented Sep 14, 2016

Coverage Status

Coverage increased (+0.04%) to 86.715% when pulling fb98c9a on rlist into 5e52d29 on master.

@lianga888

This comment has been minimized.

Contributor

lianga888 commented Sep 16, 2016

Some performance stats

The following are all 1-year backtest that calls can_trade for each 10 sids every bar:

No restrictions, run on master
85914154 function calls (85850195 primitive calls) in 136.524 seconds

No restrictions
88863552 function calls (88799595 primitive calls) in 138.518 seconds
ncalls  tottime  percall  cumtime  percall filename:lineno(function)
983100    1.060    0.000    1.821    0.000 rl_manager.py:32(is_restricted)

Empty list of restrictions
89848085 function calls (89783898 primitive calls) in 151.495 seconds
ncalls  tottime  percall  cumtime  percall filename:lineno(function)
983100    1.307    0.000    6.101    0.000 rl_manager.py:32(is_restricted)

Every sid is restricted for every minute
22692416 function calls (22627694 primitive calls) in 28.267 seconds (shorter because can_trade can just return False after checking the restricted list without checking the rest of the logic)
ncalls  tottime  percall  cumtime  percall filename:lineno(function)
983100    0.880    0.000   13.295    0.000 rl_manager.py:32(is_restricted)

Every sid is restricted for every minute (don't return after checking restricted list)
97704933 function calls (97640211 primitive calls) in 162.710 seconds
ncalls  tottime  percall  cumtime  percall filename:lineno(function)
983100    1.324    0.000   20.534    0.000 rl_manager.py:32(is_restricted)

10 restricted sids that are not the 10 sids checked
89874265 function calls (89808548 primitive calls) in 143.355 seconds
ncalls  tottime  percall  cumtime  percall filename:lineno(function)
983100    1.206    0.000    5.904    0.000 rl_manager.py:32(is_restricted)

30 restrictions, 10 sids are restricted for every minute
22739981 function calls (22672901 primitive calls) in 28.331 seconds
ncalls  tottime  percall  cumtime  percall filename:lineno(function)
983100    0.883    0.000   13.258    0.000 rl_manager.py:32(is_restricted)

The following is 1-year backtest that calls can_trade for each 30 sids every bar:

Every sid is restricted for every minute
51301697 function calls (51235132 primitive calls) in 58.730 seconds
ncalls  tottime  percall  cumtime  percall filename:lineno(function)
2850990    2.318    0.000   37.002    0.000 rl_manager.py:32(is_restricted)
@ssanderson

This comment has been minimized.

Member

ssanderson commented Sep 16, 2016

@lianga888 for the stats above, are the can_trade calls in a loop, or are we doing can_trade([list, of assets])?

@ssanderson

This comment has been minimized.

Member

ssanderson commented Sep 16, 2016

How do the timings above compare to an algo calling can_trade a comparable number of times on master?

@lianga888

This comment has been minimized.

Contributor

lianga888 commented Sep 16, 2016

@ssanderson they are in a loop. Is it more meaningful to call them all at once?

@ssanderson

This comment has been minimized.

Member

ssanderson commented Sep 16, 2016

@ssanderson they are in a loop. Is it more meaningful to call them all at once?

If they're in a loop, then you're adding a lot of function call overhead to the timings. In general, I would hope that we can process can_trade([list, of, stocks]) faster than for stock in list: can_trade(stock), because we have the potential to do work more in batches in the latter case.

@lianga888

This comment has been minimized.

Contributor

lianga888 commented Sep 19, 2016

Did a run of the no-restrictions case with can_trade called for all stocks at once, and it seems to be slower than the one called in loop?

Mon Sep 19 09:21:41 2016    algobench/results/call-at-once/stats.dmp

         94171976 function calls (93911468 primitive calls) in 167.678 seconds

   Random listing order was used
   List reduced from 1831 to 1 due to restriction <'is_restricted'>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   983100    1.195    0.000    2.112    0.000 rl_manager.py:32(is_restricted)

vs. 

Mon Sep 19 09:26:00 2016    algobench/results/call-in-loop/stats.dmp

         88863566 function calls (88799609 primitive calls) in 148.885 seconds

   Random listing order was used
   List reduced from 1831 to 1 due to restriction <'is_restricted'>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   983100    1.094    0.000    1.915    0.000 rl_manager.py:32(is_restricted)
@lianga888

This comment has been minimized.

Contributor

lianga888 commented Sep 19, 2016

In any case, it seems like these performance numbers are good enough?

@ssanderson

This comment has been minimized.

Member

ssanderson commented Sep 19, 2016

In TradingAlgorithm.set_do_not_order_list, the docstring is out of date (it says that the input needs to be a container[Asset].

'Restriction', ['sid',
'effective_date',
'expiry_date',
'restriction_type']

This comment has been minimized.

@ssanderson

ssanderson Sep 19, 2016

Member

What are the valid values for this? We probably want an enum for these.

This comment has been minimized.

@ssanderson

ssanderson Sep 19, 2016

Member

This might just want to be a True/False value until we know what the actual values are going to be.

This comment has been minimized.

@ssanderson

ssanderson Sep 19, 2016

Member

Or, equivalently, an enum with ALLOWED and FROZEN as the only states.

)


class RestrictionsController(object):

This comment has been minimized.

@ssanderson

ssanderson Sep 19, 2016

Member

I don't think we need this class. This is just another implementation of RLManager that takes two restricted lists as inputs.

def add_restrictions(self, rl_manager):
self.rl_managers.append(rl_manager)

def restrictions(self, sid, dt):

This comment has been minimized.

@ssanderson

ssanderson Sep 19, 2016

Member

I don't think this needs to be a method on the interface. I'm also not sure we need this for the current implementation goals.

return len(self.restrictions(sid, dt)) != 0


class RLManager(with_metaclass(abc.ABCMeta)):

This comment has been minimized.

@ssanderson

ssanderson Sep 19, 2016

Member

I'd probably just call this interface something like Restrictions.

"""
ABC for a restricted list manager that returns information about
restrictions from a single source
"""

This comment has been minimized.

@ssanderson

ssanderson Sep 19, 2016

Member

We probably want documentation for the interface methods here.

restrictions from a single source
"""

def __init__(self, **kwargs):

This comment has been minimized.

@ssanderson

ssanderson Sep 19, 2016

Member

We don't need this constructor.

raise NotImplementedError


class StaticRestrictedList(RLManager):

This comment has been minimized.

@ssanderson

ssanderson Sep 19, 2016

Member

This can probably just be StaticRestrictions.

Tracks restrictions with defined start, end and type for different sids
"""

def __init__(self, restrictions):

This comment has been minimized.

@ssanderson

ssanderson Sep 19, 2016

Member

I think we can simplify this a fair amount if we assume that the restrictions are disjoint and sorted:

# Assumptions:
# Restrictions don't overlap.
# Restrictions are ordered by effective_date.
def restriction_type(restrictions, sid, dt):
    state = ALLOWED
    for r in restrictions.get(sid, ()):
        if r.effective_date > dt:
            break
        state = r.state
    return state
'Restriction', ['sid',
'effective_date',
'expiry_date',
'restriction_type']

This comment has been minimized.

@ssanderson

ssanderson Sep 19, 2016

Member

Or, equivalently, an enum with ALLOWED and FROZEN as the only states.

raise NotImplementedError

@abc.abstractmethod
def is_restricted(self, sid, dt):

This comment has been minimized.

@ssanderson

ssanderson Sep 19, 2016

Member

This should probably take an iterable of sids to allow for a future vectorized implementation.

pass

@abc.abstractmethod
def restrictions(self, sid, dt):

This comment has been minimized.

@ssanderson

ssanderson Sep 19, 2016

Member

Noted below, but I don't think we need this in the interface.

def __init__(self, restricted_list):
super(StaticRestrictedList, self).__init__()

self._restricted_set = set(restricted_list)

This comment has been minimized.

@ssanderson

ssanderson Sep 19, 2016

Member

You might make this a frozenset, since we don't want to mutate it in place.

@@ -16,7 +16,7 @@

class SecurityList(object):

def __init__(self, data, current_date_func, asset_finder):

This comment has been minimized.

@ssanderson

ssanderson Sep 19, 2016

Member

👍 on all of this.

self.order_count = 0
self.set_do_not_order_list(restricted_list)
self.set_do_not_order_list(restricted_list, fail_on_violation)

This comment has been minimized.

@ssanderson

ssanderson Sep 19, 2016

Member

This might want to be a string/enum like on_error with values of 'warn' or 'error' rather than a bool. Having an enum makes it easier to extend the behavior here in the future.

@@ -38,7 +38,7 @@ class AlgorithmSimulator(object):
}

def __init__(self, algo, sim_params, data_portal, clock, benchmark_source,
universe_func):
rl_controller, universe_func):

This comment has been minimized.

@ssanderson

ssanderson Sep 19, 2016

Member

References to rl_controller can probably just be restrictions.

@@ -137,8 +151,11 @@ def validate(self,
"""
Fail if the asset is in the restricted_list.
"""
if asset in self.restricted_list:
self.fail(asset, amount, _algo_datetime)
if self.restricted_list.is_restricted(asset, _algo_datetime):

This comment has been minimized.

@ssanderson

ssanderson Sep 19, 2016

Member

We might rename this to restrictions for consistency elsewehere.

This comment has been minimized.

@ssanderson

ssanderson Sep 19, 2016

Member

The underscores here are intended to signify unused parameters, so this should probably no longer be prefixed with an underscore.

def log(self, asset, amount, datetime, metadata=None):
constraint = repr(self)
if metadata:
constraint = "{constraint} (Metadata: {metadata})".format(

This comment has been minimized.

@ssanderson

ssanderson Sep 19, 2016

Member

This is duplicated between fail and log. We should probably refactor to something like _format_error_message.

constraint=constraint,
metadata=metadata
)
log.error("Order for %s shares of %s at %s violates trading "

This comment has been minimized.

@ssanderson

ssanderson Sep 19, 2016

Member

We don't need to format the string ahead of time here. Instead, this wants to be:

log.error('Order for {num_shares} of {asset} at {dt} violates ...', num_shares=amount, asset=asset, dt=datetime ...).

@@ -75,6 +78,16 @@ def fail(self, asset, amount, datetime, metadata=None):
datetime=datetime,
constraint=constraint)

def log(self, asset, amount, datetime, metadata=None):

This comment has been minimized.

@ssanderson

ssanderson Sep 19, 2016

Member

I think rather than making a new method here, we should just push the notion of fail_on_violation into the base class.

@@ -482,6 +487,9 @@ cdef class BarData:
cdef object session_label
cdef object dt_to_use_for_exchange_check,

if self._rl_controller.is_restricted(asset, adjusted_dt):

This comment has been minimized.

@ssanderson

ssanderson Sep 19, 2016

Member

We might want to cache the method lookup here at the constructor. That will save us an attribute access per invocation here.

@@ -185,6 +189,7 @@ cdef class BarData:
self._adjust_minutes = False

self._trading_calendar = trading_calendar
self._rl_controller = rl_controller

This comment has been minimized.

@ssanderson

ssanderson Sep 19, 2016

Member

See note below re: caching this.

@@ -153,6 +153,9 @@ cdef class BarData:
data_frequency : {'minute', 'daily'}
The frequency of the bar data; i.e. whether the data is
daily or minute bars
rl_controller : callable

This comment has been minimized.

@ssanderson

ssanderson Sep 19, 2016

Member

callable isn't the right annotation here. We just just put Restrictions (probably with the module path when we decide on it.)

self.set_do_not_order_list(self.rl.leveraged_etf_list)
self.order_count = 0
self.sid = self.symbol(symbol)

def handle_data(self, data):
if not self.order_count:
if self.sid not in \
self.rl.leveraged_etf_list:
if not self.rl.leveraged_etf_list.is_restricted(

This comment has been minimized.

@ssanderson

ssanderson Sep 19, 2016

Member

We probably need to check if this is going to break any existing algos on Q.

This comment has been minimized.

@ssanderson

ssanderson Sep 19, 2016

Member

If it does we may need to add a backwards compat shim.

@lianga888 lianga888 force-pushed the rlist branch 5 times, most recently from d32355f to a07791b Sep 20, 2016

@coveralls

This comment has been minimized.

coveralls commented Sep 20, 2016

Coverage Status

Coverage increased (+0.02%) to 86.691% when pulling a07791b on rlist into 3fff659 on master.

@lianga888 lianga888 force-pushed the rlist branch 2 times, most recently from b1610d1 to 283f13e Sep 20, 2016

@coveralls

This comment has been minimized.

coveralls commented Sep 20, 2016

Coverage Status

Coverage increased (+0.06%) to 86.733% when pulling 283f13e on rlist into 3fff659 on master.

@coveralls

This comment has been minimized.

coveralls commented Sep 20, 2016

Coverage Status

Coverage increased (+0.06%) to 86.733% when pulling 283f13e on rlist into 3fff659 on master.

@ssanderson

Apparently I had a bunch of staged comments from the last time I looked at this. Apologies; still learning the new review interface I guess.

from unittest import TestCase
import pandas as pd

from zipline.rl_manager import InMemoryRLManager, Restriction, \

This comment has been minimized.

@ssanderson

ssanderson Sep 23, 2016

Member

Style nit: I would format this as:
from zipline.rl_manager import ( <imports> )

rlm1 = InMemoryRLManager(restrictions)

self.assertEqual(rlm1.restrictions(
1, str_to_ts('2011-01-03')), {'freeze'})

This comment has been minimized.

@ssanderson

ssanderson Sep 23, 2016

Member

Similar general style nit: I tend to prefer putting the closing paren here on a new line, which makes it easier to add or remove a parameter from the parameter list here.

self.assertEqual(rlm1.restrictions(
1, str_to_ts('2011-01-03 14:31')), {'freeze'})
self.assertEqual(rlm1.restrictions(
1, str_to_ts('2011-01-04')), {'freeze', 'liquidate'})

This comment has been minimized.

@ssanderson

ssanderson Sep 23, 2016

Member

I don't think we need the tests for restrictions right now.

self.assertEqual(rlm1.restrictions(
1, str_to_ts('2011-01-05')), {'liquidate'})
self.assertEqual(rlm1.restrictions(
2, str_to_ts('2011-01-03')), {'liquidate'})

This comment has been minimized.

@ssanderson

ssanderson Sep 23, 2016

Member

We should make sure we test intraday additions/removals of restrictions.

@@ -768,11 +775,13 @@ def test_minute_midnight(self):

midnight_bar_data = \
BarData(self.data_portal, lambda: midnight, 'minute',

This comment has been minimized.

@ssanderson

ssanderson Sep 23, 2016

Member

These call sites might benefit from refactoring to a self.make_bar_data method or something like it, since they all share the same parameters except one.

"""

minutes_to_check = [
(pd.Timestamp("2016-01-05", tz="UTC"), False),

This comment has been minimized.

@ssanderson

ssanderson Sep 23, 2016

Member

We should make sure that we test with minutely data.

@@ -1059,11 +1125,40 @@ def test_get_value_adjustments(self,
self.data_portal,
lambda: self.equity_daily_bar_days[2],
"daily",
self.trading_calendar
self.trading_calendar,
self.restrictions_controller

This comment has been minimized.

@ssanderson

ssanderson Sep 23, 2016

Member

Same deal as below, this would have been an easier change to make if we refactored the creation of BarData to one call site.

self.assertEqual(
380,
bar_data.current(self.ILLIQUID_SPLIT_ASSET, "price")
)

bar_data = BarData(
self.data_portal, lambda: day0_minutes[-1], "minute",
self.trading_calendar
self.trading_calendar,

This comment has been minimized.

@ssanderson

ssanderson Sep 23, 2016

Member

Same note as below re: refactoring these call sites to be easier to update in bulk.

@@ -25,6 +25,8 @@

from zipline.data.data_portal import DataPortal
from zipline.protocol import BarData
from zipline.rl_manager import RestrictionsController, InMemoryRLManager, \

This comment has been minimized.

@ssanderson

ssanderson Sep 23, 2016

Member

See note below re: formatting here.

@@ -750,7 +774,8 @@ def test_orders_stop_limit(self):
bar_data = BarData(self.data_portal,
lambda: self.minutes[1],
self.sim_params.data_frequency,
self.trading_calendar)

This comment has been minimized.

@ssanderson

ssanderson Sep 23, 2016

Member

See notes elsewhere re: refactoring BarData call sites.

This comment has been minimized.

@lianga888

lianga888 Sep 23, 2016

Contributor

I've definitely refactored it and added a fixture. Not sure how you're able to comment on a piece of code that's no longer there

This comment has been minimized.

@ssanderson

ssanderson Sep 26, 2016

Member

I think this comment was left over from previously-staged comments. Feel free to ignore.

self.trading_calendar)
bar_data = self.create_bardata(
simulation_dt_func=lambda: self.minutes[0],
data_frequency=self.sim_params.data_frequency

This comment has been minimized.

@ssanderson

ssanderson Sep 23, 2016

Member

Are there any cases where this test suite wants to pass a data frequency different from self.sim_params.data_frequency? If not, should we just push that logic into create_bardata?

This comment has been minimized.

@lianga888

lianga888 Sep 26, 2016

Contributor

It's true that only self.sim_params.data_frequency is ever passed in this test suite. But what about the other suites that use this fixture and want to use a frequency other than self.sim_params.data_frequency?

class WithCreateBarData(WithDataPortal, WithTradingCalendars):

def create_bardata(self, simulation_dt_func, **kwargs):
data_portal = kwargs.get('data_portal', self.data_portal)

This comment has been minimized.

@ssanderson

ssanderson Sep 23, 2016

Member

These should just be function parameters. (For data_portal and trading_calendar, we can make the function parameter default None, and then use an attribute of self if it's not provided). The only time it's correct to accept **kwargs in a function is if we're forwarding those kwargs to another function.

@@ -505,9 +505,10 @@ def initialize(self, asset=None, max_shares=None, max_notional=None):


class SetDoNotOrderListAlgorithm(TradingAlgorithm):
def initialize(self, sid=None, restricted_list=None):
def initialize(self, sid=None, restricted_list=None,

This comment has been minimized.

@ssanderson

ssanderson Sep 23, 2016

Member

Style nit: can we either put the parameters here all on one line, or put one parameter on a line? The way this is written makes it hard to see at a glance how many arguments this function accepts.

def is_restricted(self, assets, dt):
if isinstance(assets, Asset):
return False
return pd.Series(data={asset: False for asset in assets})

This comment has been minimized.

@ssanderson

ssanderson Sep 23, 2016

Member

We should generally avoid using the Series constructor from a dict if we can, especially in performance-critical paths. This should be pd.Series(index=assets, data=False).

"""

def __init__(self, restricted_list):
super(StaticRestrictions, self).__init__()

This comment has been minimized.

@ssanderson

ssanderson Sep 23, 2016

Member

There's no reason to call super here.

If dynamic information should be displayed as well, pass it in via
`metadata`.
"""
constraint = self._format_constraint(metadata)

This comment has been minimized.

@ssanderson

ssanderson Sep 23, 2016

Member

I might call this constraint_msg or something similar to clarify that this isn't a constraint object, it's just a string with info about the constraint.

@@ -86,19 +103,19 @@ class MaxOrderCount(TradingControl):
placed in a given trading day.
"""

def __init__(self, max_count):
def __init__(self, max_count, on_error='fail'):

This comment has been minimized.

@ssanderson

ssanderson Sep 23, 2016

Member

I'm inclined to say that on_error should be a required parameter here, with defaults being supplied at the user-facing API layer of set_*_control. Thoughts?

This comment has been minimized.

@ssanderson

ssanderson Sep 23, 2016

Member

As a general rule, we want to make parameters required as early in the stack as possible. The more places a parameter is a default, the more opportunities we create to accidentally forget to pass the non-default parameter.

@@ -120,25 +137,25 @@ class RestrictedListOrder(TradingControl):
Parameters
----------
restricted_list : container[Asset]
The assets that cannot be ordered.
restrictions : Implementation of zipline.finance.restrictions.Restrictions

This comment has been minimized.

@ssanderson

ssanderson Sep 23, 2016

Member

You don't need Implementation of. The usual format for these comments is "name : type", where type might be a base class.

Is the asset or assets restricted on this dt?
"""
raise NotImplementedError

This comment has been minimized.

@ssanderson

ssanderson Sep 23, 2016

Member

This is raising an exception type rather than an instance. That works, but mostly just for legacy compatibility with old python, and raising a naked type is usually frowned upon. A common pattern for abc's like this is to raise a NotImplementedError with the name of the method not implemented. In this case, it would be raise NotImplementedError('is_restricted').

break
state = r.state
return state == RESTRICTION_STATES.FROZEN
return pd.Series(data={

This comment has been minimized.

@ssanderson

ssanderson Sep 23, 2016

Member

We should really try to avoid constructing Series from dictionaries in performance-critical paths. Better here would be:

index = pd.Index(assets)
is_restricted = partial(self.is_restricted, dt=dt)  # or self._is
values = [is_restricted(a) for a in assets]  # This could also be `map(is_restricted, assets)`, but there's some py3 compat trickiness
rlm = InMemoryRestrictions(restrictions)

# Check the restrictions are ordered by dt for each sid
self.assertEqual(

This comment has been minimized.

@ssanderson

ssanderson Sep 26, 2016

Member

I don't think we should be making assertions about the internal representation of the data that's chosen here. As long as the Restrictions class behaves as expected, it should be free to store the data however it wants.

('2011-01-04', '2011-01-05', '2011-01-06')]
)

self.assertFalse(

This comment has been minimized.

@ssanderson

ssanderson Sep 26, 2016

Member

As a reader, it's hard for me to see the relationship between the assertions being made here and the original test data. I would write this as something like:

    def test_historical_restrictions(self):
        A1_FREEZE = str_to_ts('2011-01-04')
        A1_THAW = str_to_ts('2011-01-05')
        A1_RE_FREEZE = str_to_ts('2011-01-06')
        minute = pd.Timedelta('1 minute')

        # Build restrictions.

        def assert_is_restricted(asset, dt):
            self.assertTrue(rlm.is_restricted(asset, dt))

        def assert_not_restricted(asset, dt):
            self.assertFalse(rlm.is_restricted(asset, dt))

        # Not restricted until on or after the freeze.
        assert_not_restricted(self.ASSET1, A1_FREEZE - one_minute)
        assert_is_restricted(self.ASSET1, A1_FREEZE)
        assert_is_restricted(self.ASSET1, A1_FREEZE + one_minute)

        # Unrestricted on or after the thaw.
        assert_is_restricted(self.ASSET1, A1_THAW - one_minute)
        assert_not_restricted(self.ASSET1, A1_THAW)
        assert_not_restricted(self.ASSET1, A1_THAW + one_minute)

        # Restricted again on or after the freeze.
        assert_not_restricted(self.ASSET1, A1_RE_FREEZE - one_minute)
        assert_is_restricted(self.ASSET1, A1_RE_FREEZE)
        assert_is_restricted(self.ASSET1, A1_RE_FREEZE + one_minute)
        # Should stay restricted for the rest of time.
        assert_is_restricted(self.ASSET1, A1_RE_FREEZE + one_minute * 100000)

         # Assertions for ASSET2.

The above also has the benefit that we could parametrize on the freeze/thaw dates (e.g. to ensure that we still work properly intraday) without affecting the business logic of the test.

cls.ASSET2 = cls.asset_finder.retrieve_asset(2)
cls.ASSET3 = cls.asset_finder.retrieve_asset(3)

def test_in_memory_restrictions(self):

This comment has been minimized.

@ssanderson

ssanderson Sep 26, 2016

Member

Other cases that we should probably test:

  • Multiple assets with restrictions on the same date shouldn't interact. (Right now no assets ever have restrictions that apply on the same day.)
  • The same asset with consecutive FROZEN or ALLOWED entries (this should either be an error, or we should ensure that it works. I'm inclined to say the former, since there's a clear and obvious semantics for it.)
  • Restrictions with intraday effective_dates. (See my note below re: refactoring this to be more readable/parametrizable.
  • Re-ordering of the restrictions here and asserting that there's no change in behavior.
"""

def __new__(cls, sub_restrictions):
"""

This comment has been minimized.

@ssanderson

ssanderson Sep 30, 2016

Member

Generally we don't put docstrings on __init__ and __new__, since the constructor for a type gets documented in the class docstring.

def is_restricted(self, assets, dt):
if isinstance(assets, Asset):
return any(
r.is_restricted(assets, dt) for r in self.sub_restrictions)

This comment has been minimized.

@ssanderson

ssanderson Sep 30, 2016

Member

Personal style note: I find it easier to read and edit expressions like this if the closing paren is on a new line by itself.

class HistoricalRestrictions(Restrictions):
"""
Historical restrictions stored in memory with effective dates for each
asset

This comment has been minimized.

@ssanderson

ssanderson Sep 30, 2016

Member

Style nit: period here.


class SecurityListRestrictions(Restrictions):
"""
Restrictions based on a security list

This comment has been minimized.

@ssanderson

ssanderson Sep 30, 2016

Member

Same note as above. The initial sentence for a docstring should end in a period.


class Restrictions(with_metaclass(abc.ABCMeta)):
"""
Abstract restricted list interface

This comment has been minimized.

@ssanderson

ssanderson Sep 30, 2016

Member

Often for the ABC in a hierarchy, you want a sentence or two describing what an instance of that type represents abstractly. Here, it would be something like: "ABC representing a set of stocks that an algorithm is restricted from trading.". For bonus points, you could add a Methods section listing the names and signatures of the methods required for implementing the interface.


class NoRestrictions(Restrictions):
"""
A no-op restrictions that contains no restrictions

This comment has been minimized.

@ssanderson

ssanderson Sep 30, 2016

Member

Period here.

class StaticRestrictions(Restrictions):
"""
Static restrictions stored in memory that are constant regardless of dt
for each asset

This comment has been minimized.

@ssanderson

ssanderson Sep 30, 2016

Member

Period here.


@api_method
@expect_types(
restrictions=(Restrictions,),

This comment has been minimized.

@ssanderson

ssanderson Sep 30, 2016

Member

Minor note: this can just be Restrictions instead of a single-element tuple.

@ssanderson

This comment has been minimized.

Member

ssanderson commented Sep 30, 2016

@lianga888 this looks great to me. I left a bunch of ticky-tack style notes on the docstrings for Restrictions, but otherwise I think this is good to merge pending downstream readiness.

assert_not_restricted(self.ASSET3, dt)
assert_all_restrictions([False, False, False], dt)

def test_union_restrictions(self):

This comment has been minimized.

@ssanderson

Andrew Liang and others added some commits Sep 28, 2016

Andrew Liang
MAINT: Deprecate `set_do_not_order_list`
In favor of a new method `set_restrictions` which takes a Restrictions
object. Calls to `set_do_not_order_list` should raise a deprecation
warning and create an equivalent Restrictions object, with which
`set_restrictions` will be called. For convenience, create a
RestrictionsSet from which the "restrictions" version of a security
list can be accessed
TEST: Clarify test_restrictions a bit.
- Use parameter_space instead of `parameterized.expand`.
- Use a timedelta instead of concatenating strings.
- Use a (possibly no-op) scramble function instead of reordering list
  literals.
- Use `freeze_dt, unfreeze_dt, re_freeze_dt` instead of `dts[n]`.
- Rename `assert_vectorized_results` to `assert_all_restrictions`.
MAINT: Use Timedelta instead of DateOffset.
In days_at_time, use a Timedelta instead of a DateOffset.  We were
previously using DateOffset to work around a pandas 16 bug even though
it raises a PerformanceWarning.  Now that we're on pandas 18, we can use
the much simpler Timedelta construction.
BUG: `days_at_time` should return UTC dates.
Adds an example and clarifies the docs.
Andrew Liang
MAINT: Rename restrictions.py to asset_restrictions.py
For clarity as to what sort of restrictions these are
@coveralls

This comment has been minimized.

coveralls commented Sep 30, 2016

Coverage Status

Coverage increased (+0.1%) to 86.802% when pulling c5ee71a on rlist into 12357a8 on master.

@lianga888 lianga888 merged commit ea99962 into master Oct 3, 2016

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@lianga888 lianga888 deleted the rlist branch Oct 3, 2016

@lianga888

This comment has been minimized.

Contributor

lianga888 commented Oct 4, 2016

The following are all 1-year backtest that calls can_trade for each 10 sids every bar:

No restrictions, run on master
59027945 function calls (59016856 primitive calls) in 93.299 seconds

No restrictions
59028121 function calls (59016996 primitive calls) in 95.662 seconds
ncalls tottime percall cumtime percall filename:lineno(function)
979200 0.556 0.000 0.829 0.000 asset_restrictions.py:123(is_restricted)

Empty list of restrictions
59029588 function calls (59018233 primitive calls) in 91.450 seconds
ncalls tottime percall cumtime percall filename:lineno(function)
979200 0.763 0.000 1.025 0.000 asset_restrictions.py:143(is_restricted)

Every sid is restricted for every minute(by-passes remaining can_trade logic because restricted)
12805905 function calls (12794098 primitive calls) in 14.291 seconds
ncalls tottime percall cumtime percall filename:lineno(function)
979200 0.689 0.000 2.551 0.000 asset_restrictions.py:177(is_restricted)

10 restricted sids that are not the 10 sids checked(runs remaining can_trade logic)
61013160 function calls (61000358 primitive calls) in 101.846 seconds
ncalls tottime percall cumtime percall filename:lineno(function)
979200 1.010 0.000 2.882 0.000 asset_restrictions.py:177(is_restricted)

30 restrictions, 10 sids are restricted for every minute(by-passes remaining can_trade logic because restricted)
12852158 function calls (12838031 primitive calls) in 14.053 seconds
ncalls tottime percall cumtime percall filename:lineno(function)
979200 0.670 0.000 2.484 0.000 asset_restrictions.py:177(is_restricted)

The following are all 1-year backtest that calls can_trade for each 30 sids every bar:

Every sid is restricted for every minute(by-passes remaining can_trade logic because restricted)
25871908 function calls (25858296 primitive calls) in 21.991 seconds
ncalls tottime percall cumtime percall filename:lineno(function)
2839680 1.734 0.000 6.550 0.000 asset_restrictions.py:177(is_restricted)

@lianga888

This comment has been minimized.

Contributor

lianga888 commented Oct 5, 2016

screen shot 2016-10-05 at 9 47 50 am

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment