New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PERF: Save up to 80% of the calls to minute_to_session_label #1492

Merged
merged 1 commit into from Sep 16, 2016

Conversation

Projects
None yet
5 participants
@jbredeche
Member

jbredeche commented Sep 16, 2016

One year NYSE test that buys a lot triggers 492,963 calls to
minute_to_session_label. Only 98924 (~= (390 * 252)) make it past the
cache and trigger the heavier computation.

@richafrank

This comment has been minimized.

Member

richafrank commented Sep 16, 2016

I'm excited! Can you paste the before and after?

@jbredeche

This comment has been minimized.

Member

jbredeche commented Sep 16, 2016

reference is one year NYSE backtest that buys 1 share of apple every minute.

before, the zipline sim took 64 seconds. afterwards, 56 seconds. (on my machine)

@@ -690,11 +699,19 @@ def minute_to_session_label(self, dt, direction="next"):
pd.Timestamp (midnight UTC)
The label of the containing session.
"""
if direction == "next":

This comment has been minimized.

@ssanderson

ssanderson Sep 16, 2016

Member

How does this compare to just decorating this method with cache_tools.lru_cache(maxsize=1)?

This comment has been minimized.

@jbredeche

jbredeche Sep 16, 2016

Member

We no longer have cache_tools in zipline, replaced it with lru-dict, which is faster.

I don't think it can be used as a decorator out of the box, but that could definitely be added.

# inputs.
self._minute_to_session_label_cache = LRU(1)
self._session_nanos = self.schedule.index.values.astype(np.int64)

This comment has been minimized.

@jbredeche

jbredeche Sep 16, 2016

Member

oops, this is cruft, will remove

This comment has been minimized.

@obsidia4

obsidia4 Sep 27, 2016

When I import zipline ,it returns 'ImportError: No module named 'lru''.
My system is win10-64bit,and environment is python 3.5. Would you please tell me what should I do to fix it?

@coveralls

This comment has been minimized.

coveralls commented Sep 16, 2016

Coverage Status

Coverage increased (+0.01%) to 86.658% when pulling b565ed3 on cache-minute-to-session-lookups into f1dba1d on master.

PERF: Save up to 75% of the calls to minute_to_session_label
One year NYSE test that buys a lot triggers 492,963 calls to
minute_to_session_label.  Only 98924 ~(390 * 252) make it past the
cache and trigger the heavier computation.

@jbredeche jbredeche force-pushed the cache-minute-to-session-lookups branch from b565ed3 to 1d2e101 Sep 16, 2016

@coveralls

This comment has been minimized.

coveralls commented Sep 16, 2016

Coverage Status

Coverage increased (+0.009%) to 86.657% when pulling 1d2e101 on cache-minute-to-session-lookups into f1dba1d on master.

2 similar comments
@coveralls

This comment has been minimized.

coveralls commented Sep 16, 2016

Coverage Status

Coverage increased (+0.009%) to 86.657% when pulling 1d2e101 on cache-minute-to-session-lookups into f1dba1d on master.

@coveralls

This comment has been minimized.

coveralls commented Sep 16, 2016

Coverage Status

Coverage increased (+0.009%) to 86.657% when pulling 1d2e101 on cache-minute-to-session-lookups into f1dba1d on master.

@jbredeche jbredeche merged commit 592a4df into master Sep 16, 2016

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@jbredeche jbredeche deleted the cache-minute-to-session-lookups branch Sep 16, 2016

if direction == "previous":
if direction == "next":

This comment has been minimized.

@richafrank

richafrank Sep 16, 2016

Member

👍 on putting the most common case first

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment