Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unsigned int is backed by a long value #339

Merged
merged 1 commit into from Nov 12, 2015
Merged

Conversation

vadimzalunin
Copy link
Contributor

replaced exception when reading unsigned int in ReadTag with a direct conversion to long

@droazen
Copy link
Contributor

droazen commented Oct 22, 2015

@vadimzalunin I've made some changes in master to make it easier to see which tests are failing in travis, and have taken the liberty of rebasing this branch on top of those changes. It's clear from doing this that the test failure in this branch is closely related to the changes you've made in it, as you should now be able to see in travis:

[testng] FAILED: test("auxf#values")
[testng] htsjdk.samtools.util.RuntimeEOFException: htsjdk.samtools.SAMException: Attribute type class java.lang.Long not supported. Tag: I9
[testng]    at htsjdk.samtools.CRAMIterator.hasNext(CRAMIterator.java:257)
[testng]    at htsjdk.samtools.CRAMComplianceTest.test(CRAMComplianceTest.java:123)
[testng] Caused by: htsjdk.samtools.SAMException: Attribute type class java.lang.Long not supported. Tag: I9
[testng]    at htsjdk.samtools.SAMRecord.setAttribute(SAMRecord.java:1210)
[testng]    at htsjdk.samtools.SAMRecord.setAttribute(SAMRecord.java:1201)
[testng]    at htsjdk.samtools.SAMRecord.setAttribute(SAMRecord.java:1177)
[testng]    at htsjdk.samtools.cram.build.Cram2SamRecordFactory.create(Cram2SamRecordFactory.java:93)
[testng]    at htsjdk.samtools.CRAMIterator.nextContainer(CRAMIterator.java:191)
[testng]    at htsjdk.samtools.CRAMIterator.hasNext(CRAMIterator.java:255)
[testng]    ... 22 more
[testng] ... Removed 21 stack frames
[testng] 
[testng] ===============================================
[testng]     Ant test
[testng]     Tests run: 22443, Failures: 1, Skips: 0
[testng] ===============================================

@droazen droazen assigned vadimzalunin and unassigned droazen Oct 22, 2015
@droazen
Copy link
Contributor

droazen commented Oct 22, 2015

Back to @vadimzalunin for test fixes

@mp15
Copy link
Member

mp15 commented Oct 23, 2015

Looks like this only failing on the incorrect test testUnsignedIntegerSAM which is expecting an exception to be thrown (it shouldn't be with this fix in place). Maybe update the test?

@vadimzalunin
Copy link
Contributor Author

@mp15 done

@droazen
Copy link
Contributor

droazen commented Oct 23, 2015

@vadimzalunin Could you please squash this branch into a single commit?

@vadimzalunin
Copy link
Contributor Author

@droazen squashed

@droazen droazen assigned droazen and unassigned vadimzalunin Oct 23, 2015
@@ -1203,7 +1203,7 @@ protected void setAttribute(final short tag, final Object value) {

protected void setAttribute(final short tag, final Object value, final boolean isUnsignedArray) {
if (value != null &&
!(value instanceof Byte || value instanceof Short || value instanceof Integer ||
!(value instanceof Byte || value instanceof Short || value instanceof Integer || value instanceof Long ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this change to allow Long values directly in SAMRecord attributes, what protection do we have against a client setting an attribute outside the range of uint32_t that bams can handle? The htsjdk BAM codec will fail given a Long value that exceeds BinaryCodec.MAX_UINT.

Also, if you set such an attribute using setAttribute() and then call getIntegerAttribute() you'll get a RuntimeException if the value exceeds Integer.MAX_VALUE (even though the attribute may be marked as of type i/I, like the ia:i:4294967295 attribute in http://gatkforums.broadinstitute.org/discussion/comment/24038/#Comment_24038).

Can you comment on these concerns? I don't want to block this important fix, but I want to be sure we're not doing something here that will cause problems later on.

@droazen droazen assigned vadimzalunin and unassigned droazen Oct 23, 2015
@droazen
Copy link
Contributor

droazen commented Oct 23, 2015

Back to @vadimzalunin for comments.

@vadimzalunin vadimzalunin force-pushed the CRAM_uint32_t_in_ReadTag branch 2 times, most recently from 794f45e to 0c3f09b Compare October 28, 2015 16:02
@vadimzalunin
Copy link
Contributor Author

@droazen re-implemented the fix, basically made uint32 policy the same in CRAM as in BAM. Added dedicated test class and a BAM file.

@vadimzalunin
Copy link
Contributor Author

added tests for values over max unsigned int

@vadimzalunin
Copy link
Contributor Author

@droazen please review

@mp15
Copy link
Member

mp15 commented Nov 2, 2015

@vadimzalunin It won't stop an effective review but it looks like you've accidentally left the squashed branch as two commits with same message rather than one? (This has been fixed by 19e8e3b)

@@ -379,6 +379,8 @@ protected void finish() {
outputStream.flush();
if (indexer != null)
indexer.finish();
}catch (final RuntimeException re) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

} catch

@cmnbroad cmnbroad mentioned this pull request Nov 4, 2015
@nh13
Copy link
Member

nh13 commented Nov 4, 2015

@droazen thanks for the context as it wasn't clear here, it also is timely as htsjdk states that java 7 will be supported only until October 2015, so we should feel free to move to java 8 if we need specific features there.

There seem to be a lot of Cram changes here, are those also necessary given @droazen suggestions, which seem reasonable?

@droazen
Copy link
Contributor

droazen commented Nov 6, 2015

@vadimzalunin Ping me once you've made the requested changes and this is ready for final review.

@vdauwera vdauwera added the cram label Nov 11, 2015
@vadimzalunin vadimzalunin force-pushed the CRAM_uint32_t_in_ReadTag branch 3 times, most recently from 78309c8 to 7749f23 Compare November 11, 2015 13:11
@vadimzalunin
Copy link
Contributor Author

@droazen please review

@@ -1159,6 +1161,32 @@ public Object getAttribute(final short tag) {
}

/**
* A convinience method that will return a valid unsigned integer or fail with an exception if the tag value is invalid.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

convenience -> convenience

@vadimzalunin
Copy link
Contributor Author

Sorry guys, I appreciate this has turned into a large patch. I tried to cover critical points with tests, in particular see src/tests/java/htsjdk/samtools/SAMIntegerTagTest.java
One issue is that SAM is different from BAM and CRAM.
Another issue is ambiguity in type detection, for example -1 may be a valid signed or invalid unsigned.

@vadimzalunin
Copy link
Contributor Author

@droazen ping

@droazen
Copy link
Contributor

droazen commented Nov 12, 2015

@vadimzalunin Am going to take a look now -- while you're waiting, perhaps you could take a look at this serious-sounding CRAM bug we discovered recently?

#364

It sounds like the kind of thing that might be a quick fix (ie., a simple off-by-one error)

@droazen
Copy link
Contributor

droazen commented Nov 12, 2015

Looks like there are a few additional changes I'd like to see here -- since we're pressed for time, I'll make these changes myself and push them directly into this branch to spare us another code review cycle. While I do this if you could look at #364 it would be much appreciated, as we can't really recommend that anyone use GATK 3.5 with cram for anything other than testing purposes if we have bugs like that open.

@droazen
Copy link
Contributor

droazen commented Nov 12, 2015

Ok, I've finished making changes to this branch -- a summary of the notable changes:

-BinaryTagCodec and ReadTag return an int for I-typed attributes when possible, only returning a long for values > Integer.MAX_VALUE and <= BinaryCodec.MAX_UINT. This is slightly more conservative and less likely to break client code.

-Removed some unrelated cram changes in CRAMIterator (allowing null ReferenceSource) that conflicted with changes in #368

-Added an overload of getUnsignedIntegerAttribute() that takes a String (with tests).

-Moved isValidUnsignedInteger() to SAMUtils

-Improved error messages, documentation, and naming of test cases.

Will rebase and merge now, once tests pass.

@vadimzalunin
Copy link
Contributor Author

@droazen thanks! looks good to me 👍

droazen added a commit that referenced this pull request Nov 12, 2015
unsigned int is backed by a long value
@droazen droazen merged commit 9bd6406 into master Nov 12, 2015
@droazen droazen deleted the CRAM_uint32_t_in_ReadTag branch November 12, 2015 23:27
@coveralls
Copy link

Coverage Status

Changes Unknown when pulling ca7e1c9 on CRAM_uint32_t_in_ReadTag into ** on master**.

yfarjoun pushed a commit to yfarjoun/htsjdk that referenced this pull request Jun 5, 2019
While it was a Google Docs document, the specification was corrected
to reflect that HTTP Range byte ranges are zero-based fully-inclusive.
Update the OpenAPI description correspondingly.

Also fix typos.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants