New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: custom class registration support #2210

Merged
merged 46 commits into from Jul 12, 2018

Conversation

Projects
None yet
4 participants
@jnazaren
Contributor

jnazaren commented Jun 12, 2018

This PR further extends #2200 and #2201, which respectively implement custom blotter and custom command line argument support. It includes some restructuring of the code, and a new general RegistrationManager class defined in the extensions.py file started in #2201 . Users can now add RegistrationManager instances for abstract base classes into the global custom_types dictionary by decorating abstract base classes with @create_registration_manager. Once this is done, subclasses can be registered with the following decorator: @register(base-class, 'name'). Registration of subclasses will also add options to the corresponding base class's click option in the __main__.run() command, which can be set with --[base-class]-class 'name'. This PR fully preserves the functionality introduced in #2201 , which includes support for accessing custom command line arguments introduced with -x via namespaces, as in zipline.extension_args.[argument].[optional_namespaces...]

jnazaren added some commits May 30, 2018

@richafrank

Great functionality enhancement @jnazaren ! Round of comments inline. Nothing major. We should also update the body of this PR description to change "metaclass" to "base class".

@@ -17,7 +17,7 @@
import pandas as pd

This comment has been minimized.

@richafrank

richafrank Jun 14, 2018

Member

What are your thoughts on the new module layout here? Would we want it to match the layout of the classes under test, e.g. tests/finance/blotter/test_blotter.py? If there's just the one module here, do you think the subpackage is useful?

This comment has been minimized.

@jnazaren

jnazaren Jun 14, 2018

Contributor

@richafrank Thank you for pointing that out! I originally had it this way because I also had a blotter/test_core.py for testing registration, but now it's no longer needed due to the RegistrationManager class.

super(CmdLineTestCase, self).init_instance_fixtures()

# make sure this starts empty
# TODO: decide how this needs to be reset

This comment has been minimized.

@richafrank

richafrank Jun 14, 2018

Member

Do we need to DO this still?

This comment has been minimized.

@jnazaren

jnazaren Jun 14, 2018

Contributor

That was a comment I left when @llllllllll mentioned we still need to confirm how this field would be reset. I think it works fine this way - any thoughts?

'key': 'value',
}
)
create_args(arg_list, zipline.extension_args)

This comment has been minimized.

@richafrank

richafrank Jun 14, 2018

Member

If the other tests all need to touch the global value anyway, then disregard this suggestion, but we could isolate this test more by passing a new Namespace() here. Just thinking about the remaining TODO and needing to reset the global state.

This comment has been minimized.

@jnazaren

jnazaren Jun 14, 2018

Contributor

One of the test functions does need to access the global state, as it tests user input, so using the global is required there. However, I could use a Namespace instance for the other tests.


def init_instance_fixtures(self):
super(RegistrationManagerTestCase, self).init_instance_fixtures()
self.rm = RegistrationManager(Blotter)

This comment has been minimized.

@richafrank

richafrank Jun 14, 2018

Member

Is it important (e.g. for speed, mutated state, ...) that the same RegistrationManager be used across these tests? If not, my tendency would be to make a new one per test function, so there's less danger of one test affecting another unexpectedly or imposing a required order for the tests running.

This comment has been minimized.

@jnazaren

jnazaren Jun 14, 2018

Contributor

Ok, I've changed it

(self.dtype.__name__, name, sorted(self._classes)),
)

def class_exists(self, name):

This comment has been minimized.

@richafrank

richafrank Jun 14, 2018

Member

What do you think about class_registered? This seems more about registration than existence.

This comment has been minimized.

@jnazaren

jnazaren Jun 14, 2018

Contributor

Sure, that sounds better

class RegistrationManager(object):

def __init__(self, dtype):
self.dtype = dtype

This comment has been minimized.

@richafrank

richafrank Jun 14, 2018

Member

How does dtype here compare to numpy.dtype? What's the "d" here?

This comment has been minimized.

@jnazaren

jnazaren Jun 14, 2018

Contributor

It's the abstract base class that the instance is responsible for managing, which is why it's a bit like dtype in numpy. type would also make sense.


import click
from zipline.algorithm import TradingAlgorithm

This comment has been minimized.

@richafrank

richafrank Jun 14, 2018

Member

The overall ordering of imports we try for is stdlib, 3rd-party libs, this package's modules; and alphabetized within each. Want to move these new imports below (looks like the import of TradingAlgorithm used to be below)?

This comment has been minimized.

@jnazaren

jnazaren Jun 14, 2018

Contributor

Sure, I was also a bit confused about why that import statement got moved around

@@ -181,6 +186,15 @@ def choose_loader(column):
except ValueError as e:
raise _RunAlgoError(str(e))

blotter_class = kwargs.pop("Blotter_class")

This comment has been minimized.

@richafrank

richafrank Jun 14, 2018

Member

What casing scheme are we using? Blotter_class

This comment has been minimized.

@richafrank

richafrank Jun 14, 2018

Member

The caller of this below looks like it could pass in a class name or a class. Does load expect a class?

@@ -181,6 +186,15 @@ def choose_loader(column):
except ValueError as e:
raise _RunAlgoError(str(e))

blotter_class = kwargs.pop("Blotter_class")
if blotter_class is not None:

This comment has been minimized.

@richafrank

richafrank Jun 14, 2018

Member

When would someone pass None here?

This comment has been minimized.

@jnazaren

jnazaren Jun 14, 2018

Contributor

I changed this a bit in order to support the fact that subclasses can now be chosen via command line options in the form of --[base-classname]-class [default | other options...]. In the case of Blotter, we'd be expecting --Blotter-class 'name', where the name defaults to default. I don't think I thought it through fully though, so I'll change this.

environ):
environ,
*args,
**kwargs):

This comment has been minimized.

@richafrank

richafrank Jun 14, 2018

Member

Does this need to accept unknown more arguments as we have here, or could it accept just blotter_class? I didn't see any other uses of args or kwargs.

This comment has been minimized.

@jnazaren

jnazaren Jun 14, 2018

Contributor

This is for new custom command line arguments in the case where the user adds a new type of base class. However, this is pretty unlikely now that I think of it, as I was expecting this kind of thing to be our responsibility. Any extra subclasses added will simply be options in the list, so that doesn't really change anything. I'll change this back to just blotter.

add_cli_option(
run,
name="--%s-class" % c.__name__.lower(),
choices=list(custom_types[c].get_registered_classes().keys()),

This comment has been minimized.

@llllllllll

llllllllll Jun 20, 2018

Member

do we need the .keys() here? doesn't the default iterator traverse the keys?

run,
name="--%s-class" % c.__name__.lower(),
choices=list(custom_types[c].get_registered_classes().keys()),
help="The subclass of %s to use, defaults to 'default'"

This comment has been minimized.

@llllllllll

llllllllll Jun 20, 2018

Member

can we group this string expression in parens so it is clear that the % ... is attached to the string?

"""
A placeholder object representing a namespace
"""
pass

This comment has been minimized.

@llllllllll

llllllllll Jun 20, 2018

Member

you don't need pass if you have the docstring.

def register(self, name, custom_class=None):

if custom_class is None:
return partial(self.register, name)

This comment has been minimized.

@llllllllll

llllllllll Jun 20, 2018

Member

if you curry the method, you only need to program against the fully applied case, the decorator manages this part.



# A global dictionary for storing instances of RegistrationManager:
custom_types = OrderedDict([])

This comment has been minimized.

@llllllllll

llllllllll Jun 20, 2018

Member

why is this ordered?

This comment has been minimized.

@jnazaren

jnazaren Jun 22, 2018

Contributor

@llllllllll This was originally so that we would have an easy way of removing/modifying any of the RegistrationManager instances, but I don't think we would really have to do anything like that due to the fact that the user only has these options displayed when they start running zipline, right?

pass


class RegistrationManager(object):

This comment has been minimized.

@llllllllll

llllllllll Jun 20, 2018

Member

NounManager names can often be shortened to a more direct word. What about the more concise: Registry, which means the same thing?

This comment has been minimized.

@jnazaren

jnazaren Jun 22, 2018

Contributor

Sure

@@ -51,6 +59,17 @@ def main(extension, strict_extensions, default_extension):
os.environ,
)

for c in custom_types:

This comment has been minimized.

@llllllllll

llllllllll Jun 20, 2018

Member

Right now the user still needs to add an argument to run to handle any of these. These are also known at import time, so could we move this code to module scope or make it less automatic?


class RegistrationManagerTestCase(ZiplineTestCase):

def init_instance_fixtures(self):

This comment has been minimized.

@llllllllll

llllllllll Jun 20, 2018

Member

we can remove this if it is just calling super.

super(RegistrationManagerTestCase, self).init_instance_fixtures()

def test_load_not_registered(self):
rm = RegistrationManager(Blotter)

This comment has been minimized.

@llllllllll

llllllllll Jun 20, 2018

Member

can we use a fake class here instead of Blotter?

import sys
import warnings

from runpy import run_path

This comment has been minimized.

@llllllllll

llllllllll Jun 20, 2018

Member

this was sorted alphabetically before

jnazaren added some commits Jun 22, 2018

@llllllllll

Sorry it took me so long to get back to this. My only remaining question is around when we want to parse the args in __main__, other than that, this looks really good!

with assert_raises_str(ValueError, msg):
parse_extension_arg('1=test3', {})
msg = (
'invalid extension argument arg4 test4, '

This comment has been minimized.

@llllllllll

llllllllll Jul 11, 2018

Member

should we quote the name of the key in this error message to make it clear that it parsed as a single string with a space?

)


class FakeInterface(with_metaclass(ABCMeta)):

This comment has been minimized.

@llllllllll

llllllllll Jul 11, 2018

Member

Does this actually need to be an abc?

This comment has been minimized.

@jnazaren

jnazaren Jul 12, 2018

Contributor

It doesn't need to be, but it does make it more realistic if we have it this way (the registry would usually be instantiated with an abc) -- I can change it though

This comment has been minimized.

@llllllllll

llllllllll Jul 12, 2018

Member

This makes me think that it is a requirement, even if it is not. For unit tests, it is best to keep the test limited to just a small component under test.

This comment has been minimized.

@jnazaren

jnazaren Jul 12, 2018

Contributor

Ok, makes sense

expected_classes = {'ayy-lmao': ProperDummyInterface}
assert_equal(rm.get_registered_classes(), expected_classes)
assert_is(rm.load('ayy-lmao'), ProperDummyInterface)
assert_true(rm.class_registered('ayy-lmao'))

This comment has been minimized.

@llllllllll

llllllllll Jul 11, 2018

Member

assert true normally should take a msg. If this fails, it normally just says something like: False != True which is not the most helpful.

@@ -68,7 +70,6 @@ def cleanup_tempdir():
_()
del _


__all__ = [

This comment has been minimized.

@llllllllll

llllllllll Jul 11, 2018

Member

we should put extension_args in the all`

@click.option(
'-x',
multiple=True,
help='Any custom command line arguments to define, in key=value form'

This comment has been minimized.

@llllllllll

llllllllll Jul 11, 2018

Member

missing a period

@@ -51,6 +60,8 @@ def main(extension, strict_extensions, default_extension):
os.environ,
)

create_args(x, zipline.extension_args)

This comment has been minimized.

@llllllllll

llllllllll Jul 11, 2018

Member

should we parse the args before loading the extension? I could imagine the extension code wanting access to this to find resources

@@ -267,6 +286,25 @@ def run(ctx,
return perf


def add_cli_option(func, name, choices, help):

This comment has been minimized.

@llllllllll

llllllllll Jul 11, 2018

Member

I think this is now unused.

@llllllllll

This comment has been minimized.

Member

llllllllll commented Jul 12, 2018

LGTM, ready to merge on pass!

@jnazaren jnazaren merged commit ce69ad0 into master Jul 12, 2018

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@jnazaren jnazaren deleted the custom_class_support branch Jul 12, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment