PERF: Variable-size minutely cache. #2110
Use a variable-size cache for minutely pricing data, so each field can have its own cache size. Increase the default cache size for the 'close' field to 3000, since close prices are used in many places in the simulation as the best-known price of assets. This dramatically speeds up (about 50% or more in local testing) algorithms that read the prices of many assets without ordering them.
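To make the idea concrete, here is a minimal sketch of per-field LRU caches with different sizes. It is not the code in this PR: `LRUCache` and `make_field_caches` are hypothetical names, the stdlib `OrderedDict` stands in for whatever cache zipline actually uses, and the 1550 entries for 'open' and 'high' are an assumption (the diff only shows 'low', 'volume', and 'close').

```python
from collections import OrderedDict

FIELDS = ('open', 'high', 'low', 'close', 'volume')

# 'close', 'low', and 'volume' come from the diff in this PR; the values
# for 'open' and 'high' are assumed for illustration.
DEFAULT_MINUTELY_SID_CACHE_SIZES = {
    'open': 1550,
    'high': 1550,
    'low': 1550,
    'close': 3000,
    'volume': 1550,
}


class LRUCache:
    """Hypothetical minimal LRU cache holding at most `maxsize` entries."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        # Mark the key as most recently used.
        self._data.move_to_end(key)
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.maxsize:
            # Evict the least recently used entry.
            self._data.popitem(last=False)

    def __len__(self):
        return len(self._data)


def make_field_caches(sizes=None):
    """Build one sid-keyed cache per field, each with its own size."""
    if sizes is None:
        sizes = DEFAULT_MINUTELY_SID_CACHE_SIZES
    return {field: LRUCache(sizes[field]) for field in FIELDS}
```

With this layout, raising the size for one field (here 'close') does not change the memory budget of the others.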
zipline/data/minute_bars.py
Outdated
        'low': 1550,
        'volume': 1550,
    }
    assert set(FIELDS) == set(DEFAULT_MINUTELY_SID_CACHE_SIZES)
Put a message on this assert.
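One way to act on this, sketched under assumptions: `check_cache_sizes` is a hypothetical helper, and the message wording is only a suggestion, not what was merged. The 1550 values for 'open' and 'high' are likewise assumed.

```python
FIELDS = ('open', 'high', 'low', 'close', 'volume')

# Illustrative sizes; 'open' and 'high' are assumed values.
DEFAULT_MINUTELY_SID_CACHE_SIZES = {
    'open': 1550,
    'high': 1550,
    'low': 1550,
    'close': 3000,
    'volume': 1550,
}


def check_cache_sizes(fields, sizes):
    """Assert that `sizes` has exactly one entry per field, with a message."""
    missing = sorted(set(fields) - set(sizes))
    extra = sorted(set(sizes) - set(fields))
    assert not missing and not extra, (
        "cache sizes must be defined for exactly the reader's FIELDS; "
        "missing=%r, unexpected=%r" % (missing, extra)
    )


check_cache_sizes(FIELDS, DEFAULT_MINUTELY_SID_CACHE_SIZES)
```

If someone later adds a field without a cache size, the failure names the offending keys instead of raising a bare `AssertionError`.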
@@ -898,8 +899,22 @@ class BcolzMinuteBarReader(MinuteBarReader):
    zipline.data.minute_bars.BcolzMinuteBarWriter
    """
    FIELDS = ('open', 'high', 'low', 'close', 'volume')
    DEFAULT_MINUTELY_SID_CACHE_SIZES = {
        'close': 3000,
Are we sure we are okay with doubling this?
I believe so. #2108 significantly shrinks the expected size of the volume cache for most algorithms, so we have some wiggle room. For Quantopian, even if we were fully saturating all the caches, the extra memory cost of increasing just this field isn't that big of a deal. The performance cost of thrashing this cache is bad enough that I think any zipline user who is thrashing it would almost certainly prefer the extra memory cost in exchange for the substantial speedup this provides (~50% or more in local testing).
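The thrashing concern can be illustrated with a toy simulation. This is a sketch, not a benchmark of zipline: `lru_hit_rate` and `cyclic_reads` are hypothetical helpers, and the access pattern (reading every asset's price once per pass, in the same order) is an assumed worst case. When an LRU cache is smaller than the set of keys being cycled through, every entry is evicted before it is read again.

```python
from collections import OrderedDict


def lru_hit_rate(capacity, keys):
    """Return the fraction of lookups in `keys` served by an LRU cache."""
    cache = OrderedDict()
    hits = 0
    total = 0
    for key in keys:
        total += 1
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
    return hits / total if total else 0.0


def cyclic_reads(n_assets, n_passes):
    """Yield sids 0..n_assets-1 in order, repeated n_passes times."""
    for _ in range(n_passes):
        for sid in range(n_assets):
            yield sid


# Cache smaller than the working set: every lookup misses.
small = lru_hit_rate(1550, cyclic_reads(3000, 10))
# Cache that fits the working set: only the first pass misses.
large = lru_hit_rate(3000, cyclic_reads(3000, 10))
```

In this toy pattern the smaller cache never produces a hit, while the larger one serves every lookup after the first pass, which is the kind of cliff the larger 'close' cache is meant to avoid.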