-
Notifications
You must be signed in to change notification settings - Fork 59
Use hash of entries for merge key. #103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Let's run it on platinum genomes to ensure everything works as expected.
@bashir2 FYI, this would be a great candidate for the 'large' integration test based on platinum genomes.
| variant.reference_bases or '', | ||
| ','.join(variant.alternate_bases or [])]]) | ||
| hashlib.sha256(variant.reference_bases or '').hexdigest(), | ||
| hashlib.sha256(','.join(variant.alternate_bases or [])).hexdigest()]]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider making a private method _get_hash_key and use it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
| variant.end or '', | ||
| variant.reference_bases or '', | ||
| ','.join(variant.alternate_bases or [])]]) | ||
| hashlib.sha256(variant.reference_bases or '').hexdigest(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if md5 is good enough here. 256 seems like an overkill :p
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Used md5 instead.
| def test_get_merge_keys(self): | ||
| strategy = move_to_calls_strategy.MoveToCallsStrategy(None, None, None) | ||
|
|
||
| empty_string_hash = hashlib.sha256('').hexdigest() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider making a helper method here as well. I think it's ok to use the one in the private lib too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
a162a85 to
30c6b03
Compare
Pull Request Test Coverage Report for Build 283
💛 - Coveralls |
Hash of reference_bases and alternate_bases is used for the merge key. fixes googlegenomics#102
arostamianfar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
|
Job finished successfully on platinum dataset. Thanks for the review! |
Hash of reference_bases and alternate_bases is used for the merge key.
fixes #102