Improve price performance for 1000+ asset universes. (#2108)
PERF: Optimize price for liquid assets.

When using `price`, the call to `last_traded_dt` for every value retrieved
became a noticeable bottleneck in algorithms which used over 1000 assets.

Instead of calling `last_traded_dt` before every retrieval of a `close` for the
`price` field, assume that `close` will retrieve a non-empty value, and then
forward fill if it is empty.

This change optimizes for the case of a tradeable universe which is predominantly
composed of liquid assets.
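
A minimal sketch of the optimistic read described above, not the zipline implementation. `FakeMinuteReader`, `price`, the integer minute keys, and the locally defined `NoDataOnDate` are hypothetical stand-ins, and the cross-day adjustment step is left out. The fake reader counts `get_last_traded_dt` calls to show that a liquid asset never triggers the lookup.

```python
import math


class NoDataOnDate(Exception):
    """Local stand-in for the exception of the same name used in the diff."""


class FakeMinuteReader:
    """Hypothetical in-memory reader that counts last-traded lookups."""

    def __init__(self, closes):
        # closes maps (sid, minute) -> close price; NaN marks an empty bar.
        self.closes = closes
        self.last_traded_calls = 0

    def get_value(self, sid, dt, column):
        # Only 'close' is modelled; a missing key plays the role of "no data".
        try:
            return self.closes[(sid, dt)]
        except KeyError:
            raise NoDataOnDate()

    def get_last_traded_dt(self, sid, dt):
        # The lookup that the commit tries to avoid in the common case.
        self.last_traded_calls += 1
        traded = [
            minute
            for (s, minute), value in self.closes.items()
            if s == sid and minute <= dt and not math.isnan(value)
        ]
        return max(traded) if traded else None


def price(reader, sid, dt):
    """Read `close` at dt first; forward fill only if that read is empty."""
    try:
        value = reader.get_value(sid, dt, 'close')
        if not math.isnan(value):
            return value  # liquid asset: no last-traded lookup needed
    except NoDataOnDate:
        pass  # fall through to the forward-fill path

    query_dt = reader.get_last_traded_dt(sid, dt)
    if query_dt is None:
        return math.nan  # the asset has never traded
    return reader.get_value(sid, query_dt, 'close')


reader = FakeMinuteReader({
    (1, 0): 10.0, (1, 1): 10.5,          # sid 1 trades every minute
    (2, 0): 7.0, (2, 1): float('nan'),   # sid 2 has an empty bar at minute 1
})
liquid = price(reader, 1, 1)      # served by the direct read
illiquid = price(reader, 2, 1)    # falls back to minute 0
```

In this sketch only the illiquid asset pays for the `get_last_traded_dt` call, which is the trade-off the message describes: an extra failed read for illiquid assets in exchange for one fewer lookup per liquid asset.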
ehebert committed Feb 14, 2018
1 parent 4002a03 commit ba60392
Showing 1 changed file with 30 additions and 21 deletions.
51 changes: 30 additions & 21 deletions zipline/data/data_portal.py
@@ -648,34 +648,43 @@ def get_adjusted_value(self, asset, field, dt,
     def _get_minute_spot_value(self, asset, column, dt, ffill=False):
         reader = self._get_pricing_reader('minute')
 
-        if ffill:
-            # If forward filling, we want the last minute with values (up to
-            # and including dt).
-            query_dt = reader.get_last_traded_dt(asset, dt)
-
-            if pd.isnull(query_dt):
-                # no last traded dt, bail
-                if column == 'volume':
-                    return 0
-                else:
+        if not ffill:
+            try:
+                return reader.get_value(asset.sid, dt, column)
+            except NoDataOnDate:
+                if column != 'volume':
                     return np.nan
-        else:
-            # If not forward filling, we just want dt.
-            query_dt = dt
+                else:
+                    return 0
 
+        # At this point the pairing of column='close' and ffill=True is
+        # assumed.
         try:
-            result = reader.get_value(asset.sid, query_dt, column)
+            # Optimize the best case scenario of a liquid asset
+            # returning a valid price.
+            result = reader.get_value(asset.sid, dt, column)
+            if not pd.isnull(result):
+                return result
         except NoDataOnDate:
-            if column == 'volume':
-                return 0
-            else:
-                return np.nan
+            # Handling of no data for the desired date is done by the
+            # forward filling logic.
+            # The last trade may occur on a previous day.
+            pass
+        # If forward filling, we want the last minute with values (up to
+        # and including dt).
+        query_dt = reader.get_last_traded_dt(asset, dt)
+
+        if pd.isnull(query_dt):
+            # no last traded dt, bail
+            return np.nan
+
+        result = reader.get_value(asset.sid, query_dt, column)
 
-        if not ffill or (dt == query_dt) or (dt.date() == query_dt.date()):
+        if (dt == query_dt) or (dt.date() == query_dt.date()):
             return result
 
-        # the value we found came from a different day, so we have to adjust
-        # the data if there are any adjustments on that day barrier
+        # the value we found came from a different day, so we have to
+        # adjust the data if there are any adjustments on that day barrier
         return self.get_adjusted_value(
             asset, column, query_dt,
             dt, "minute", spot_value=result