Rename SetNmAndUqTags and fix #622 (MergeBamAlignment MD tag). #636
505970d to 9d2018a
9d2018a to 6463c53
final boolean calculateMd = rec.getStringAttribute(SAMTag.MD.name()) != null;
if (calculateNm || calculateMd) {
    final byte[] referenceBases = this.refSeq.get(this.refSeq.getSequenceDictionary().getSequenceIndex(rec.getReferenceName())).getBases();
    SequenceUtil.calculateMdAndNmTags(rec, referenceBases, calculateNm, calculateMd);
In the code that you removed, calculateNm and calculateMd were reversed!
I see it. I really like Scala, where we can have named params. Sigh.
I agree. It's especially important for booleans, but all params could benefit. Not sure why Java doesn't have it too...
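Since Java has no named parameters, one way to keep two adjacent booleans from being swapped silently is to pass an `EnumSet` of flags instead. This is only a minimal sketch: `TagToCalculate` and `describe` are hypothetical names made up for illustration and are not part of Picard or htsjdk.

```java
import java.util.EnumSet;

public class TagFlagsSketch {
    /** Hypothetical flags standing in for the calculateNm / calculateMd booleans. */
    enum TagToCalculate { NM, MD }

    /** Hypothetical helper: reports which tags would be recalculated. */
    static String describe(final EnumSet<TagToCalculate> tags) {
        final boolean calculateNm = tags.contains(TagToCalculate.NM);
        final boolean calculateMd = tags.contains(TagToCalculate.MD);
        return "NM=" + calculateNm + ", MD=" + calculateMd;
    }

    public static void main(final String[] args) {
        // Each tag is named at the call site, so argument order cannot be mixed up.
        System.out.println(describe(EnumSet.of(TagToCalculate.MD)));
        System.out.println(describe(EnumSet.of(TagToCalculate.NM, TagToCalculate.MD)));
    }
}
```

With flags, a call such as `describe(EnumSet.of(TagToCalculate.MD))` reads the same regardless of how the method orders its internals, which is the property the named-parameter comments above are asking for.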
If I understand correctly, this reverts the fixing of tags when the cigar changes, but includes MD in the fixTags CLP. Should we purge MD, NM and UQ when we change the cigar? At least we will not be leaving around incorrect information...
@yfarjoun: If downstream users are impacted (i.e. they want MD tags but no longer have them), then they have been relying on potentially the wrong values. I vote for removing NM/MD/UQ when we clip overlapping reads. Also, should we do the same for clipping adapters (see
-public class SetNmAndUqTags extends CommandLineProgram {
-    static final String USAGE_SUMMARY = "Fixes the UQ and NM tags in a SAM file. ";
-    static final String USAGE_DETAILS = "This tool takes in a SAM or BAM file (sorted by coordinate) and calculates the NM and UQ tags by comparing with the reference."+
+public class SetNmMDAndUqTags extends CommandLineProgram {
It's going to drive me crazy that it's Nm, Uq and MD in the name. Can you change to Md please?
@nh13 @yfarjoun Agreed that MergeBamAlignment should probably remove tags rather than output invalid values. The other option is to add back @nh13's change but have it use either the indexed fasta file (potentially loading the same region many times) or load the entire reference into memory (requiring more memory).
That would require a new type of ReferenceFile reader, no?
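The trade-off raised above (re-reading the same region from an indexed fasta versus holding the whole reference in memory) can be illustrated with a bounded cache that keeps only the most recently used contigs. This is a swapped-in middle ground for illustration, not something proposed or implemented in this PR; `ContigCache`, its `loader` function, and the `loads` counter are all hypothetical, and the loader merely stands in for an indexed fasta lookup.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

public class ContigCacheSketch {
    /** Hypothetical bounded cache: keeps at most maxContigs reference sequences in memory. */
    static final class ContigCache {
        private final Function<String, byte[]> loader; // stand-in for an indexed fasta lookup
        private final Map<String, byte[]> cache;
        int loads = 0; // how many times the loader had to be called

        ContigCache(final int maxContigs, final Function<String, byte[]> loader) {
            this.loader = loader;
            // accessOrder=true keeps entries ordered from least to most recently used.
            this.cache = new LinkedHashMap<String, byte[]>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(final Map.Entry<String, byte[]> eldest) {
                    return size() > maxContigs;
                }
            };
        }

        byte[] getBases(final String contig) {
            byte[] bases = cache.get(contig);
            if (bases == null) {
                bases = loader.apply(contig);
                loads++;
                cache.put(contig, bases);
            }
            return bases;
        }
    }

    public static void main(final String[] args) {
        final ContigCache cache = new ContigCache(2, name -> name.getBytes());
        cache.getBases("chr1");
        cache.getBases("chr1"); // served from memory
        cache.getBases("chr2");
        cache.getBases("chr3"); // evicts chr1, the least recently used
        cache.getBases("chr1"); // has to be loaded again
        System.out.println("loads = " + cache.loads);
    }
}
```

For queryname-ordered output, consecutive reads can map to any contig, so a small cache like this would still reload often; that is the memory-versus-reload tension the comment describes.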
@yfarjoun I think I will remove the tags, and in the case folks output queryname, the old behavior stands. I will add some documentation somewhere about this too.
But even if outputting in queryname the tags will be incorrect... so why not
6463c53 to 8728570
@yfarjoun see my latest changes. I remove the NM/MD/UQ tags for reads that are clipped due to adapters or overlapping bases. I also added some doc to
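The behavior described above (dropping NM/MD/UQ once a read has been clipped, so that no stale values are left behind) can be sketched as follows. This is not the PR's actual code: `clearStaleTags` is a hypothetical helper, and a plain `Map` stands in for a SAM record's optional attributes so that the example runs without htsjdk. With htsjdk itself, passing `null` to `SAMRecord.setAttribute` should clear a tag, but check that against the version you use.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StaleTagSketch {
    /** Tags whose values depend on the alignment and go stale once the CIGAR changes. */
    static final List<String> ALIGNMENT_DEPENDENT_TAGS = Arrays.asList("NM", "MD", "UQ");

    /**
     * Hypothetical helper operating on a stand-in for a record's attributes.
     * Removes the alignment-dependent tags only when the read was clipped.
     */
    static void clearStaleTags(final Map<String, Object> attributes, final boolean wasClipped) {
        if (wasClipped) {
            ALIGNMENT_DEPENDENT_TAGS.forEach(attributes::remove);
        }
    }

    public static void main(final String[] args) {
        final Map<String, Object> attributes = new HashMap<>();
        attributes.put("NM", 2);
        attributes.put("MD", "10A5^AC6");
        attributes.put("UQ", 40);
        attributes.put("RG", "sample1");

        clearStaleTags(attributes, true);
        System.out.println(attributes.keySet()); // only RG remains
    }
}
```

Unclipped reads are left alone, so tags that are still valid survive; only reads whose alignment was altered lose them.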
@@ -477,7 +476,7 @@ public void mergeAlignment(final File referenceFasta) {
 
         for (final SAMRecord rec : sink.sorter) {
             if (!rec.getReadUnmappedFlag() && refSeq != null) {
-                fixNMandUQ(rec, refSeq, bisulfiteSequence);
+                fixNmMdAndUQ(rec, refSeq, bisulfiteSequence);
UQ -> Uq
Fixed.
Can I haz a small test that shows that the tags are actually removed?
Back to you @nh13
Why assign me here? This has some work to be done:
Thanks! :-)
@yfarjoun you need to un-assign yourself even if you assign someone else. Multiple folks can be assigned an issue now.
1c24dfb to 57c0ac1
@yfarjoun added the test and rebased onto master. After a final 👍 this is ready to go.
private final Log log = Log.getInstance(SetNmMdAndUqTags.class);

public static void main(final String[] argv) {
Is this main still needed with the new system?
Not sure, but I am going to leave it here as other classes have it; cleaning this up across all classes could be done in another PR.
final ReferenceSequenceFileWalker refSeq = new ReferenceSequenceFileWalker(REFERENCE_SEQUENCE);

StreamSupport.stream(reader.spliterator(),false)
extra space missing here?
For aligning the line, and after the comma.
Fixed
Sorry for the delay. Thanks!! 👍
1. Rename SetNmAndUqTags to SetNmMDAndUqTags. 2. Fixes: #622
57c0ac1 to 0ea9d4c
@yfarjoun @tfenne
WIP
Fixes #634 as an alternate to reverting (#635).