Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Currency Improvements #2609

Merged
merged 11 commits into from
Jan 10, 2020
Merged

Currency Improvements #2609

merged 11 commits into from
Jan 10, 2020

Conversation

ssanderson
Copy link
Contributor

@ssanderson ssanderson commented Jan 9, 2020

This PR contains several improvements to the currency conversion support that was added in #2586 and #2597.

  • Fixes a rather embarrasing bug where we loaded currency codes incorrectly unless we queried for all the sids contained in an HDF5 file: fb07353.
  • Adds handling for currency queries containing unknown sids. This is useful for robustness against invalid input data, or for against data artifacts produced at different times: 006b7b9. This works as follows:
    • Adds a new special sentinel, zipline.currency.MISSING_CURRENCY_CODE (value 'XXX', which denotes no currency according to wikipedia). This currency is returned by CurrencyAwareDailyBarReader for unknown sids, or sids for which currency is unknown. I'm using this rather than None or '\0\0\0' because we're currently returning codes in numpy arrays of dtype S3, which can't hold None, and these values might get displayed to users, which makes \0\0\0 an awkward choice. Another option would be to use None, and switch to using object arrays, but that makes things like sorting annoying (None can't be compared with strings in python 3). Object array with None is probably what we'd want to expose to users if we allow querying for currencies though (it's what we do elsewhere for string-typed data, at least), so that might be preferable.
    • Adds handling for queries of unknown currencies to the currency reader interface. Queries for unknown currencies yield np.nan.
  • Adds version metadata and checking to the h5 currency format, to allow for safer future upgrades: 9e3deab.

Scott Sanderson added 6 commits January 8, 2020 15:34
- Fixes a bug where we only returned the correct currency codes when loading
  all sids stored in an hdf5 file.
- Adds a failing test for the bug.
Don't crash on queries for currency codes of possibly-unknown sids on calls to
`SessionBarReader.currency_codes`. When a currency code is requested for an
unknown sid, we return zipline.currency.MISSING_CURRENCY_CODE for that sid.
These are good sanity checks/documentation, and they don't meaningfully impact
performance.
@coveralls
Copy link

coveralls commented Jan 9, 2020

Coverage Status

Coverage decreased (-0.002%) to 88.031% when pulling 09fb188 on currency-fixes into 5e61943 on master.

Copy link
Contributor

@peterhbromley peterhbromley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all LGTM, I like that you found the wikipedia reference that 'XXX' denotes missing currency. That makes me feel good about choosing that as the display value to the users.

Scott Sanderson added 5 commits January 10, 2020 12:51
This causes the stored currencies to get out of sync with
cls.FX_RATES_CURRENCIES.
Pipeline column objects override __lt__ to raise an error, which means using
the default sorted() implementation doesn't work.
@ssanderson ssanderson merged commit 3ea5916 into master Jan 10, 2020
@ssanderson ssanderson deleted the currency-fixes branch January 10, 2020 20:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants