Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: HDF5-based FX Reader/Writer. #2597

Merged
merged 8 commits into from
Jan 3, 2020
Merged

ENH: HDF5-based FX Reader/Writer. #2597

merged 8 commits into from
Jan 3, 2020

Conversation

ssanderson
Copy link
Contributor

HDF5-based implementation of FXRateReader.

  • Adds new classes, HDF5FXRateReader and HDF5FXRateWriter for reading/writing fx rates stored in hdf5.
  • Moves existing content of zipline/data/fx.py into a directory. We now have zipline.data.fx as a module, with child modules implementing readers, plus zipline.data.fx.base defining the interface they all implement.
  • Adds a basic test suite that should be suitable for testing any implementation of FXRateReader.
    • Adds subclasses for testing InMemoryFXRateReader (previously only indirectly tested via the pipeline API tests) and the new HDF5FXRateReader.
    • Updates testing infrastructure to be able to share code between in memory and h5-based tests.

@coveralls
Copy link

coveralls commented Dec 17, 2019

Coverage Status

Coverage increased (+0.1%) to 88.032% when pulling 49fbc20 on h5-fx into bbf9221 on master.

Scott Sanderson added 4 commits December 17, 2019 14:34
- Add test coverage for non-scalar lookups.
- Performance optimizations for InMemoryFXRateReader and HDF5FXRateReader. We
  no longer construct a DataFrame on each call to get_rates, since doing so is
  surprisingly expensive and caused test_lookup_scalar to take more time than
  was reasonable.
Explicitly require that pipeline-compatible readers support 'default' as a
key. I'm currently just using the string 'default' for this. We should probably
move that value to a shared location, but it's not clear to me that value
should live since it's defining an interface boundary between pipeline and fx
data.
- Rather than using numpy's S3 and U3 types, which behave differently in py2
  and py3, just expect object arrays of `str` everywhere.

  This makes it slightly more expensive for us to read the currency index from
  the file in py3, but that array is on the order of a hundred or so elements
  total, so that's not a major concern, and it simplifies the handling
  elsewhere.
Copy link
Contributor

@quantophred quantophred left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See what you think of the comments: if you think they're fine as-is, then LGTM 👍

zipline/data/fx/exploding.py Show resolved Hide resolved
zipline/data/fx/hdf5.py Outdated Show resolved Hide resolved
zipline/data/fx/in_memory.py Show resolved Hide resolved
zipline/data/fx/in_memory.py Show resolved Hide resolved
zipline/data/fx/hdf5.py Show resolved Hide resolved
Scott Sanderson added 3 commits January 2, 2020 10:49
This makes it easier to distinguish different usages of 'default', and makes it
easier in the future to change how we handle defaults.
@ssanderson ssanderson removed the request for review from peterhbromley January 3, 2020 20:19
@ssanderson ssanderson merged commit 09032e9 into master Jan 3, 2020
@ssanderson ssanderson deleted the h5-fx branch January 3, 2020 21:09
@ssanderson ssanderson mentioned this pull request Jan 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants