New Backend, HDF5_01 - Optimizations for large fixed size arrayset schemas. #160

Merged
merged 5 commits into from
Nov 21, 2019

Conversation

rlizzo
Member

@rlizzo rlizzo commented Nov 8, 2019

Motivation and Context

Why is this change required? What problem does it solve?:

Significant performance improvements for larger fixed size arrayset data.

Inline documentation explains rationale.

If it fixes an open issue, please link to the issue here:

Description

Describe your changes in detail:

  • new HDF5_01 backend (to complement HDF5_00).
  • introducing methods to more intelligently select a backend for an arrayset schema based on its access conventions and setup.
  • updated tests
  • updated docs
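
To illustrate the second bullet, here is a minimal sketch of what schema-based backend selection could look like. It is not the code from this PR: the `ArraysetSchema` class, the `select_backend` function, and the `LARGE_SAMPLE_BYTES` threshold are all hypothetical names and values chosen for illustration. Only the backend names `HDF5_00` and `HDF5_01`, and the idea that `HDF5_01` targets large fixed-size schemas, come from this description.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ArraysetSchema:
    """Hypothetical description of an arrayset schema (not Hangar's own type)."""
    shape: Tuple[int, ...]   # maximum shape of a single sample
    itemsize: int            # bytes per element
    variable_shape: bool     # True if samples may be smaller than `shape`


# Assumed cutoff for what counts as a "large" sample; illustrative only.
LARGE_SAMPLE_BYTES = 1_000_000


def select_backend(schema: ArraysetSchema) -> str:
    """Pick a backend name from the schema's size and shape convention."""
    sample_nbytes = math.prod(schema.shape) * schema.itemsize
    if not schema.variable_shape and sample_nbytes >= LARGE_SAMPLE_BYTES:
        # Large, fixed-size samples: the case HDF5_01 is described as targeting.
        return "HDF5_01"
    # Everything else falls back to the existing backend in this sketch.
    return "HDF5_00"


print(select_backend(ArraysetSchema((1024, 1024), 4, variable_shape=False)))
print(select_backend(ArraysetSchema((1024, 1024), 4, variable_shape=True)))
```

Under these assumptions, the first call selects `HDF5_01` (a 4 MiB fixed-size sample) and the second selects `HDF5_00` (the same shape, but variable).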

Types of changes

What types of changes does your code introduce? Put an x in all the boxes that apply:

  • Documentation update
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Is this PR ready for review, or a work in progress?

  • Ready for review
  • Work in progress

How Has This Been Tested?

Put an x in the boxes that apply:

  • Current tests cover modifications made
  • New tests have been added to the test suite
  • Modifications were made to existing tests to support these changes
  • Tests may be needed, but they are not included when the PR was proposed
  • I don't know. Help!

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have signed (or will sign when prompted) the tensorwerk CLA.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@rlizzo rlizzo self-assigned this Nov 8, 2019
@rlizzo rlizzo added the enhancement New feature or request label Nov 8, 2019
@rlizzo rlizzo requested a review from hhsecond November 8, 2019 16:01
@rlizzo rlizzo added Awaiting Review Author has determined PR changes are nearly complete and ready for formal review. WIP Don't merge; Work in Progress labels Nov 8, 2019
@rlizzo
Member Author

rlizzo commented Nov 8, 2019

Hey @hhsecond, this is a first-draft PR. It's ready for a high-level review, but it will definitely need some work before we think about merging it.

@codecov

codecov bot commented Nov 8, 2019

Codecov Report

Merging #160 into master will decrease coverage by 0.17%.
The diff coverage is 92.64%.

@@            Coverage Diff             @@
##           master     #160      +/-   ##
==========================================
- Coverage   95.31%   95.14%   -0.17%     
==========================================
  Files          64       65       +1     
  Lines       11548    11821     +273     
  Branches      977     1023      +46     
==========================================
+ Hits        11006    11246     +240     
- Misses        361      384      +23     
- Partials      181      191      +10

Impacted Files                               Coverage Δ
tests/test_diff_staged_summary.py            100% <ø> (ø) ⬆️
src/hangar/remote/client.py                  80.29% <ø> (-0.23%) ⬇️
tests/test_checkout.py                       99.8% <100%> (ø) ⬆️
tests/test_arrayset_backends.py              100% <100%> (ø) ⬆️
tests/test_cli.py                            99.57% <100%> (ø) ⬆️
src/hangar/arrayset.py                       95.1% <100%> (ø) ⬆️
src/hangar/backends/hdf5_00.py               92.99% <100%> (+0.03%) ⬆️
tests/test_arrayset.py                       100% <100%> (ø) ⬆️
tests/property_based/test_pbt_arrayset.py    100% <100%> (ø) ⬆️
tests/property_based/test_pbt_metadata.py    100% <100%> (ø) ⬆️
... and 14 more
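
The percentages in the report can be reproduced from the Hits, Misses, and Partials rows. The sketch below assumes coverage is computed as hits divided by the sum of all three counts (partial lines count as not covered), which matches the figures shown. The function name `coverage_percent` is illustrative.

```python
def coverage_percent(hits: int, misses: int, partials: int) -> float:
    """Line coverage as a percentage; partial lines count as not covered."""
    total_lines = hits + misses + partials
    return 100.0 * hits / total_lines


# Counts taken from the coverage diff above.
master = coverage_percent(hits=11006, misses=361, partials=181)
merged = coverage_percent(hits=11246, misses=384, partials=191)

print(f"master: {master:.2f}%")          # master: 95.31%
print(f"#160:   {merged:.2f}%")          # #160:   95.14%
print(f"delta:  {merged - master:+.2f}%")  # delta:  -0.17%
```

The three counts also sum to the reported Lines totals (11548 on master, 11821 with #160), so the drop of 0.17% comes from the 273 added lines being covered at a slightly lower rate than the existing code.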

@rlizzo rlizzo force-pushed the fixed-size-arrayset-backend-optim branch from f562d36 to 38f5df2 Compare November 8, 2019 17:06
@rlizzo
Member Author

rlizzo commented Nov 9, 2019

@elistevens this will interest you. Once this is merged (hopefully in the next few days) I'll be pushing v0.4.0b1

@rlizzo rlizzo force-pushed the fixed-size-arrayset-backend-optim branch 2 times, most recently from 5b7a483 to 9a2f406 Compare November 11, 2019 22:47
@tensorwerk tensorwerk deleted a comment from lgtm-com bot Nov 11, 2019
@tensorwerk tensorwerk deleted a comment from lgtm-com bot Nov 11, 2019
@tensorwerk tensorwerk deleted a comment from lgtm-com bot Nov 21, 2019
@rlizzo rlizzo force-pushed the fixed-size-arrayset-backend-optim branch from 4afb38b to c30f326 Compare November 21, 2019 11:44
@tensorwerk tensorwerk deleted a comment from lgtm-com bot Nov 21, 2019
@tensorwerk tensorwerk deleted a comment from lgtm-com bot Nov 21, 2019
@rlizzo rlizzo force-pushed the fixed-size-arrayset-backend-optim branch from 40f5545 to d6a9bfd Compare November 21, 2019 12:26
@tensorwerk tensorwerk deleted a comment from lgtm-com bot Nov 21, 2019
…rethink how we express what each backend is optimized for during selection
… is not a PR which should introduce changes to that particular backends performance
@rlizzo rlizzo force-pushed the fixed-size-arrayset-backend-optim branch from d6a9bfd to 368f5eb Compare November 21, 2019 12:48
@tensorwerk tensorwerk deleted a comment from lgtm-com bot Nov 21, 2019
@rlizzo rlizzo force-pushed the fixed-size-arrayset-backend-optim branch from 368f5eb to 9167eee Compare November 21, 2019 12:50
@lgtm-com

lgtm-com bot commented Nov 21, 2019

This pull request fixes 16 alerts when merging 9167eee into d105e56 - view on LGTM.com

fixed alerts:

  • 8 for Unused local variable
  • 5 for Unused import
  • 1 for Unmatchable dollar in regular expression
  • 1 for Module-level cyclic import
  • 1 for Module is imported with 'import' and 'import from'

@rlizzo rlizzo removed Awaiting Review Author has determined PR changes are nearly complete and ready for formal review. WIP Don't merge; Work in Progress labels Nov 21, 2019
@rlizzo rlizzo merged commit 2ed1157 into tensorwerk:master Nov 21, 2019
Labels
enhancement New feature or request Resolved