-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Coverm filter inverse paired reads #159
Comments
Hi, Good point. The way coverm is interpreting this is that you want a BAM file with only proper pairs that don't pass the I suggest taking a two step approach - grab reads that fail in the same way you did above, and then concatenate unpaired reads and unmapped reads with samtools and its flags for filtering. Does that make sense? I understand this is a bit confusing, and the documentation could be better - will fix. |
Thanks for this quick and helpful response! I see, I think that makes sense. One follow up question, I guess I would then expect the bam file to still contain a matching number of reads in the forward and reverse file with this proper pairs flag, rather than having a different number between the files. I.e., if only proper pairs are passed to this output bam file, and the bam file contains output of properly paired reads which do not meet the identity and alignment requirements, should the fwd and rev fastq files have the same number of reads? Could you explain this piece a bit more? |
Good q. I think what is happening is that they are aligned as a proper pair, but then since you used thresholds which apply to a single side of each pair, then only one of the pair is present in the output BAM. Does that make sense? I have to admit it has been a while since I looked at that code though. Perhaps it would be worth verifying by inspecting the mappings of now-singleton mappings in the original file? |
I see, yes that does make sense. I'll definitely inspect the singletons to see if that is the case. Thanks! |
Hello,
I had a question about the filtering module of coverm, I'm guessing the answer may have to do with the SAM flags within the files passed to coverm filter. When running coverm filter on a bam file (mapped with only paired reads) with the
--inverse --proper-pairs-only
flags, I would expect that the output file contains all singletons and unmapped pairs of the original bam file, and any other reads not passing filtering, thus being 'proper unmapped pairs'. However, when I look to output all of these unmapped reads to a fastq file usingsamtools fastq -1 r1.fq -2 r2.fq
, I get an uneven number of reads in the forward and reverse file. Is there a reason for this?I understand this issue may not be on the side of coverm, but rather on the samtools call, and can ask this elsewhere if that is the case. Thanks!
Example:
The text was updated successfully, but these errors were encountered: