Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PERF: Remove import-time calendar creations. #1471

Merged
merged 4 commits into from Sep 8, 2016
Merged

Conversation

@ssanderson
Copy link
Contributor

@ssanderson ssanderson commented Sep 5, 2016

Remove module scope invocations of get_calendar('NYSE'), which cuts
zipline import time in half on my machine. This make the zipline CLI
noticeably more responsive, and it reduces memory consumed at import
time from 130MB to 90MB.

All of the module-scope get_calendar invocations were in the bundle registration code. There are two major changes required to make this work:

  1. Added a register_calendar_alias method and reworked the way calendars are stored internally in TradingCalendarDispatcher. There's now always one canonical name under which a calendar is stored, but possibly many "aliases" to that calendar. This is now used for the QUANDL and YAHOO loaders to alias NYSE.
  2. Removed the creation of a calendar during bundle registration. The calendar associated with a bundle is now always registered by name.

Before:

$ time python -c 'import zipline'

real 0m1.262s
user 0m1.128s
sys 0m0.120s

After:

$ time python -c 'import zipline'

real 0m0.676s
user 0m0.536s
sys 0m0.132s

cc @jbredeche on the calendar changes and @llllllllll on the bundle setup changes.

@ssanderson ssanderson changed the title PERF: Remove module-scope calendar creations. PERF: Remove import-time calendar creations. Sep 5, 2016
@ssanderson ssanderson force-pushed the fix-slow-startup branch 2 times, most recently from eb5c4d5 to 53ea5c9 Sep 5, 2016
Remove module scope invocations of `get_calendar('NYSE')`, which cuts
zipline import time in half on my machine. This make the zipline CLI
noticeably more responsive, and it reduces memory consumed at import
time from 130MB to 90MB.

Before:

$ time python -c 'import zipline'

real    0m1.262s
user    0m1.128s
sys     0m0.120s

After:

$ time python -c 'import zipline'

real    0m0.676s
user    0m0.536s
sys     0m0.132s
@ssanderson ssanderson force-pushed the fix-slow-startup branch from 53ea5c9 to 1ca23f2 Sep 6, 2016
@coveralls
Copy link

@coveralls coveralls commented Sep 6, 2016

Coverage Status

Coverage decreased (-0.02%) to 86.591% when pulling 977d1fa on fix-slow-startup into c09f7ab on master.

1 similar comment
@coveralls
Copy link

@coveralls coveralls commented Sep 6, 2016

Coverage Status

Coverage decreased (-0.02%) to 86.591% when pulling 977d1fa on fix-slow-startup into c09f7ab on master.

@@ -369,7 +369,7 @@ def _empty_ingest(self, _wrote_to=[]):
"""
if not self.bundles:
@self.register('bundle',
calendar=get_calendar('NYSE'),
calendar_name=('NYSE'),

This comment has been minimized.

@llllllllll

llllllllll Sep 6, 2016
Contributor

we can drop the parens

# These need to happen after the other imports.
from . algorithm import TradingAlgorithm
from . import api

__version__ = get_versions()['version']

This comment has been minimized.

@llllllllll

llllllllll Sep 6, 2016
Contributor

why did this move before the other imports

calendars : dict[str -> TradingCalendar]
Initial set of calendars.
calendar_factories : dict[str -> function]
Factories for lazy

This comment has been minimized.

@llllllllll

llllllllll Sep 6, 2016
Contributor

This seems cut off

"""
# Use an OrderedDict as an ordered set so that we can return the order
# of aliases in the event of a cycle.
seen = OrderedDict({name: None})

This comment has been minimized.

@llllllllll

llllllllll Sep 6, 2016
Contributor

this initialization will be overridden on the first line of the loop

This comment has been minimized.

@llllllllll

llllllllll Sep 6, 2016
Contributor

Could we just use a list here, I can't imagine this growing very large

edit: read the comment, the list question still stands

cycle = seen.keys()
cycle.append(name)
raise CyclicCalendarAlias(
cycle=" -> ".join([repr(k) for k in cycle])

This comment has been minimized.

@llllllllll

llllllllll Sep 6, 2016
Contributor

you can drop the brackets

# of aliases in the event of a cycle.
seen = OrderedDict({name: None})

while name in self._aliases:

This comment has been minimized.

@llllllllll

llllllllll Sep 6, 2016
Contributor

This creates a really nice error, thanks

@@ -612,5 +620,5 @@ def should_clean(name):

bundles, register, unregister, ingest, load, clean = _make_bundle_core()

register_calendar("YAHOO", get_calendar("NYSE"))
register_calendar("QUANDL", get_calendar("NYSE"))
register_calendar_alias("YAHOO", "NYSE")

This comment has been minimized.

@llllllllll

llllllllll Sep 6, 2016
Contributor

can we move these to the yahoo and quandl modules?

# PERF: Fire a warning if calendars were instantiated during zipline import.
# Having calendars doesn't break anything per-se, but it makes zipline imports
# noticeably slower, which becomes particularly noticeable in the Zipline CLI.
from zipline.utils.calendars.calendar_utils import global_calendar_dispatcher

This comment has been minimized.

@llllllllll

llllllllll Sep 6, 2016
Contributor

+1 glad we will catch this early in the future

@@ -157,7 +157,7 @@ def adjustments_callback(request):
self.register(
'bundle',
yahoo_equities(self.symbols),
calendar=self.calendar,
calendar_name='NYSE',

This comment has been minimized.

@llllllllll

llllllllll Sep 6, 2016
Contributor

should we use YAHOO?

This comment has been minimized.

@ssanderson

ssanderson Sep 8, 2016
Author Contributor

The vanilla yahoo equities loader still uses NYSE, so I think this is fine.


def test_follow_alias_chain(self):
self.assertIs(
self.dispatcher.get_calendar('ICE'),

This comment has been minimized.

@llllllllll

llllllllll Sep 6, 2016
Contributor

assertions are normally flipped like: result, expected

Scott Sanderson added 2 commits Sep 8, 2016
Scott Sanderson
@coveralls
Copy link

@coveralls coveralls commented Sep 8, 2016

Coverage Status

Coverage increased (+0.05%) to 86.659% when pulling 40fa6ae on fix-slow-startup into c09f7ab on master.

2 similar comments
@coveralls
Copy link

@coveralls coveralls commented Sep 8, 2016

Coverage Status

Coverage increased (+0.05%) to 86.659% when pulling 40fa6ae on fix-slow-startup into c09f7ab on master.

@coveralls
Copy link

@coveralls coveralls commented Sep 8, 2016

Coverage Status

Coverage increased (+0.05%) to 86.659% when pulling 40fa6ae on fix-slow-startup into c09f7ab on master.

@ssanderson ssanderson merged commit 1ccc9e4 into master Sep 8, 2016
2 checks passed
2 checks passed
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@ssanderson ssanderson deleted the fix-slow-startup branch Sep 8, 2016
bartosh pushed a commit to bartosh/zipline that referenced this pull request Sep 27, 2016
PERF: Remove import-time calendar creations.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants
You can’t perform that action at this time.