SAMUtils:getOtherCanonicalAlignments extract 'SA' tag and return a list of supplementary alignments #685
Conversation
coveralls
commented
Aug 16, 2016
@jamesemery could you please review?
coveralls
commented
Aug 17, 2016
jamesemery
commented on an outdated diff
Aug 17, 2016
+ otherRec.setReadBases( record.getReadBases() );
+ otherRec.setBaseQualities( record.getBaseQualities() );
+
+ /* get reference sequence */
+ final int tid = record.getHeader().getSequenceIndex( commaStrs[0] );
+ if( tid == -1 ) throw new SAMException("Unknown contig in " + semiColonStr);
+ otherRec.setReferenceIndex( tid );
+
+ /* fill other fields */
+ otherRec.setAlignmentStart( Integer.parseInt(commaStrs[1]) );
+ otherRec.setFlags(
+ SAMFlag.SUPPLEMENTARY_ALIGNMENT.flag +
+ (commaStrs[2].equals("+") ? 0 : SAMFlag.READ_REVERSE_STRAND.flag) );
+ otherRec.setCigar( TextCigarCodec.decode( commaStrs[3] ) );
+ otherRec.setMappingQuality( Integer.parseInt(commaStrs[4]) );
+ otherRec.setAttribute( SAMTagUtil.getSingleton().NM , Integer.parseInt(commaStrs[5]) );
jamesemery
Contributor
|
jamesemery
commented on an outdated diff
Aug 17, 2016
@@ -1096,4 +1097,79 @@ public static SAMRecord clipOverlappingAlignedBases(final SAMRecord record, fina
 public static boolean isValidUnsignedIntegerAttribute(long value) {
 return value >= 0 && value <= BinaryCodec.MAX_UINT;
 }
+
+ /**
+ * Extract a List of 'other canonical alignments' from a SAM record. Those alignments are stored as a string in the 'SA' tag as defined
+ * in the SAM specification.
+ * Each record in the List is a non-paired read.
+ * The name, sequence and qualities are copied from the original record.
+ * The SAM flag is set to <code>SUPPLEMENTARY_ALIGNMENT (+ READ_REVERSE_STRAND )</code>
+ * @param record must be non null and must have a non-null associated header.
+ * @return a list of 'other canonical alignments' SAMRecords. The list is empty if the 'SA' attribute is missing.
+ */
+ public static List<SAMRecord> getOtherCanonicalAlignments(final SAMRecord record) {
+ final Pattern semiColonPattern = Pattern.compile("[;]");
+ final Pattern commaPattern = Pattern.compile("[,]");
jamesemery
Contributor
|
jamesemery
commented on the diff
Aug 17, 2016
+
+ SAMRecord other = suppl.get(0);
+ Assert.assertEquals(other.getReferenceName(),"2");
+ Assert.assertEquals(other.getAlignmentStart(),500);
+ Assert.assertFalse(other.getReadNegativeStrandFlag());
+ Assert.assertEquals(other.getMappingQuality(), 60);
+ Assert.assertEquals(other.getAttribute(SAMTagUtil.getSingleton().NM),1);
+ Assert.assertEquals(other.getCigarString(),"3S2=1X2=2S");
+
+ other = suppl.get(1);
+ Assert.assertEquals(other.getReferenceName(),"1");
+ Assert.assertEquals(other.getAlignmentStart(),191);
+ Assert.assertTrue(other.getReadNegativeStrandFlag());
+ Assert.assertEquals(other.getMappingQuality(), 60);
+ Assert.assertEquals(other.getAttribute(SAMTagUtil.getSingleton().NM),0);
+ Assert.assertEquals(other.getCigarString(),"8M2S");
jamesemery
Contributor
|
jamesemery
commented on the diff
Aug 17, 2016
+
+ for( final String semiColonStr : semiColonStrs ) {
+ /* ignore empty string */
+ if( semiColonStr.isEmpty() ) continue;
+
+ /* break string using comma */
+ final String commaStrs[] = commaPattern.split(semiColonStr);
+ if( commaStrs.length != 6 ) throw new SAMException("Bad 'SA' attribute in " + semiColonStr);
+
+ /* create the new record */
+ final SAMRecord otherRec = samReaderFactory.createSAMRecord( record.getHeader() );
+
+ /* copy fields from the original record */
+ otherRec.setReadName( record.getReadName() );
+ otherRec.setReadBases( record.getReadBases() );
+ otherRec.setBaseQualities( record.getBaseQualities() );
|
|
jamesemery
commented on an outdated diff
Aug 17, 2016
+ final SAMRecord otherRec = samReaderFactory.createSAMRecord( record.getHeader() );
+
+ /* copy fields from the original record */
+ otherRec.setReadName( record.getReadName() );
+ otherRec.setReadBases( record.getReadBases() );
+ otherRec.setBaseQualities( record.getBaseQualities() );
+
+ /* get reference sequence */
+ final int tid = record.getHeader().getSequenceIndex( commaStrs[0] );
+ if( tid == -1 ) throw new SAMException("Unknown contig in " + semiColonStr);
+ otherRec.setReferenceIndex( tid );
+
+ /* fill other fields */
+ otherRec.setAlignmentStart( Integer.parseInt(commaStrs[1]) );
+ otherRec.setFlags(
+ SAMFlag.SUPPLEMENTARY_ALIGNMENT.flag +
jamesemery
Contributor
|
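For readers following the hunks above without htsjdk at hand, here is a minimal, dependency-free sketch of the same parsing steps: split the 'SA' value on semicolons, skip empty entries, split each entry on commas, require six fields, and derive the flag from the strand. The class SaTagSketch, the Alignment value class, and the sample string are hypothetical stand-ins for illustration only; they are not part of this PR or of htsjdk, and the sample string is merely one value consistent with the assertions in the test hunk. The sketch throws IllegalArgumentException where the PR throws SAMException, and it combines the flag bits with a bitwise OR rather than '+' (the result is the same here because the two bits are distinct).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, htsjdk-free illustration of the 'SA' parsing shown in the diff. */
class SaTagSketch {
    /** SAM flag bits as defined in the SAM specification. */
    static final int READ_REVERSE_STRAND = 0x10;
    static final int SUPPLEMENTARY_ALIGNMENT = 0x800;

    /** Hypothetical value class standing in for SAMRecord. */
    static final class Alignment {
        final String contig;
        final int start;
        final int flags;
        final String cigar;
        final int mapq;
        final int nm;

        Alignment(String contig, int start, int flags, String cigar, int mapq, int nm) {
            this.contig = contig;
            this.start = start;
            this.flags = flags;
            this.cigar = cigar;
            this.mapq = mapq;
            this.nm = nm;
        }

        boolean isReverseStrand() {
            return (flags & READ_REVERSE_STRAND) != 0;
        }
    }

    /** Parses an 'SA' value; returns an empty list when the value is missing. */
    static List<Alignment> parse(final String saValue) {
        final List<Alignment> result = new ArrayList<>();
        if (saValue == null) return result;
        for (final String entry : saValue.split(";")) {
            // ignore empty entries, e.g. from a doubled semicolon
            if (entry.isEmpty()) continue;
            final String[] fields = entry.split(",");
            if (fields.length != 6) {
                throw new IllegalArgumentException("Bad 'SA' attribute in " + entry);
            }
            // every parsed entry is flagged supplementary; '-' adds the reverse-strand bit
            final int flags = SUPPLEMENTARY_ALIGNMENT
                    | ("+".equals(fields[2]) ? 0 : READ_REVERSE_STRAND);
            result.add(new Alignment(
                    fields[0],
                    Integer.parseInt(fields[1]),
                    flags,
                    fields[3],
                    Integer.parseInt(fields[4]),
                    Integer.parseInt(fields[5])));
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed sample value, chosen to match the assertions in the test hunk.
        for (final Alignment a : parse("2,500,+,3S2=1X2=2S,60,1;1,191,-,8M2S,60,0;")) {
            System.out.println(a.contig + ":" + a.start + " " + a.cigar
                    + " reverse=" + a.isReverseStrand());
        }
    }
}
```

Unlike the PR's function, this sketch does not resolve the contig name against a sequence dictionary, so an unknown contig is not detected.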
Thank you for the review, and back to you @jamesemery.
NB: I'll be away from GitHub for a few days.
coveralls
commented
Aug 17, 2016
@droazen These changes look good to me.
lbergelson
merged commit c3d5a88
into
samtools:master
Aug 25, 2016
Thanks @lindenb and @jamesemery
lindenb commented Aug 16, 2016
Description
I've added a new function getOtherCanonicalAlignments in SAMUtils. This function is used to extract the 'SA' tag of a SAMRecord as a List<SAMRecord> of supplementary alignments.
testOtherCanonicalAlignments was added to SAMUtilsTest (it creates one read with an SA tag and checks the values of the supplementary alignments).
PS: The SA tag is defined in the spec as
Checklist