
PERF: Variable-size minutely cache. #2110

Merged: 2 commits into master on Feb 14, 2018
Conversation

ssanderson (Contributor):

Use a variable-size cache for minutely pricing data.

Increase the default cache size for close to 3000, since close prices
are used in many places in the simulation as the best-known price of
assets. This dramatically speeds up algorithms that read the prices of
many assets without ordering them.
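The per-field sizing described above can be sketched as follows. This is an illustrative mock-up, not zipline's actual implementation: the `LRUCache` class here is a minimal stand-in for zipline's internal cache, and the 1550 sizes for `open` and `high` are assumed to mirror the `low`/`volume` entries shown in the diff below.

```python
from collections import OrderedDict

FIELDS = ('open', 'high', 'low', 'close', 'volume')

# Larger cache for 'close', since close prices are read far more often
# than the other fields during a simulation.
DEFAULT_MINUTELY_SID_CACHE_SIZES = {
    'open': 1550,    # assumed; not shown in the diff excerpt
    'high': 1550,    # assumed; not shown in the diff excerpt
    'low': 1550,
    'close': 3000,
    'volume': 1550,
}


class LRUCache:
    """Minimal LRU cache, used here only to illustrate per-field sizing."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self._data = OrderedDict()

    def __setitem__(self, key, value):
        self._data.pop(key, None)
        self._data[key] = value
        if len(self._data) > self.maxsize:
            self._data.popitem(last=False)  # evict least recently used

    def __getitem__(self, key):
        value = self._data.pop(key)
        self._data[key] = value  # mark as most recently used
        return value


# One independently-sized cache per field, instead of one shared size.
caches = {
    field: LRUCache(DEFAULT_MINUTELY_SID_CACHE_SIZES[field])
    for field in FIELDS
}
```

The point of the design is that only the hot field (`close`) pays for a larger cache, while the rarely-read fields keep a smaller footprint.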

'low': 1550,
'volume': 1550,
}
assert set(FIELDS) == set(DEFAULT_MINUTELY_SID_CACHE_SIZES)
Contributor:

put a message on this

@@ -898,8 +899,22 @@ class BcolzMinuteBarReader(MinuteBarReader):
zipline.data.minute_bars.BcolzMinuteBarWriter
"""
FIELDS = ('open', 'high', 'low', 'close', 'volume')
DEFAULT_MINUTELY_SID_CACHE_SIZES = {
'close': 3000,
Contributor:

are we sure we are okay with doubling this?

ssanderson (Author), Feb 14, 2018:
I believe so. #2108 shrinks the expected size of the volume cache significantly for most algorithms, so we have some wiggle room. For Quantopian, even if we were fully saturating all the caches, the extra memory cost of increasing just this field isn't that big of a deal, and the performance cost of thrashing this cache is bad enough that I think any zipline user who is doing so would almost certainly prefer the extra memory cost here in exchange for the substantial (~50% or more in local testing) speedup this provides.
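For scale, a back-of-the-envelope estimate of the memory cost under discussion. The ~3 KB per-entry figure is an assumption for illustration (one trading day of float64 minute values); the real cached-entry size depends on zipline's storage layout.

```python
# Rough estimate only; the per-entry size is an assumed figure, not
# measured from zipline's actual cache entries.
MINUTES_PER_DAY = 390   # US equity market minutes per session
BYTES_PER_VALUE = 8     # float64
entry_bytes = MINUTES_PER_DAY * BYTES_PER_VALUE  # ~3 KB per cached day

extra_entries = 3000 - 1550   # growth of the close cache vs. the old default
extra_mb = extra_entries * entry_bytes / 1e6
print(f"{extra_mb:.1f} MB")   # a few MB extra at full saturation
```

Under this assumption, doubling the close cache costs only a few megabytes even when fully saturated, which is consistent with the trade-off argued in the comment above.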

coveralls commented Feb 14, 2018

Coverage increased (+0.003%) to 87.57% when pulling fd9feaf on variable-cache into ba60392 on master.

@ssanderson ssanderson removed the request for review from ehebert February 14, 2018 06:20
@ssanderson ssanderson merged commit a4a4401 into master Feb 14, 2018
@ssanderson ssanderson deleted the variable-cache branch February 14, 2018 16:24