From Artem by email: "I'd be happy to abandon paired-end analysis and it would make our lives much simpler, if there is data showing that it's not going to be informative for detecting CoVs."
It seems certain to me that read pairing can have no meaningful benefit in the big compute, though it might in a second pass to analyze candidate datasets. If a read has at least one alignment, bowtie2 will find and report one regardless of what happens to its mate; if it has none, bowtie2 will report none, again regardless of its mate. This is equally true in paired and unpaired mode. Essentially the only benefit of pairing is to increase MAPQ when the two reads map close together, at a distance consistent with the estimated range of construct lengths (say, 200 to 500 nt for a typical Illumina shotgun library). In rare cases, this resolves the location of a read that maps to repetitive sequence. For example, if R1 has a unique mapping but R2 maps to repeat copies that are 300 nt and 1000 nt distant, then the first copy is probably right. In unpaired mode, bowtie2 would make an arbitrary choice between the two R2 alignments and would assign a very low MAPQ.
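The repeat example can be written out as a small calculation. This is only a sketch of the reasoning, not bowtie2's actual scoring: the function name `consistent_mates`, the coordinates, and the 200 to 500 nt window are illustrative assumptions taken from the numbers in the paragraph above.

```python
def consistent_mates(r1_pos, r2_candidates, min_dist=200, max_dist=500):
    """Keep the R2 candidate positions whose distance from R1 falls in
    the expected construct-length range (hypothetical helper, not bowtie2 code)."""
    return [p for p in r2_candidates if min_dist <= abs(p - r1_pos) <= max_dist]


# R1 maps uniquely; R2 maps to two repeat copies, 300 nt and 1000 nt away.
r1 = 10_000
r2_hits = [r1 + 300, r1 + 1000]

# Only the copy 300 nt away is consistent with a 200-500 nt construct.
print(consistent_mates(r1, r2_hits))
```

Without the R1 anchor, both R2 candidates are equally good, which is why the unpaired choice is arbitrary and receives a low MAPQ.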
To restate: if bowtie2 finds alignments for a read pair in paired mode, it will almost always find alignments for R1 and R2 separately in unpaired mode. The only difference is that in paired mode the location may be better resolved and the MAPQ may be higher. We don't care about location or MAPQ in the first round, so these benefits have no value there.
You can verify this by comparing paired and unpaired mode in the benchmark tests.
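One way to run that comparison is to collect, for each mode, the set of fragments with at least one aligned read and check whether the two sets differ. The sketch below assumes both runs produced SAM text, that read names match across runs apart from an optional trailing `/1` or `/2`, and that a record is aligned when FLAG bit 0x4 is unset (per the SAM specification). The function name `hit_fragments` is hypothetical.

```python
def hit_fragments(sam_text):
    """Return the names of fragments with at least one aligned read.

    A record counts as aligned when SAM FLAG bit 0x4 (segment unmapped)
    is not set. A trailing /1 or /2 is stripped so that paired and
    unpaired runs can be compared by fragment name (an assumption about
    how the reads were named).
    """
    hits = set()
    for line in sam_text.splitlines():
        if not line or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.split("\t")
        flag = int(fields[1])
        if flag & 0x4:
            continue  # read is unmapped
        name = fields[0]
        if name.endswith(("/1", "/2")):
            name = name[:-2]
        hits.add(name)
    return hits


def compare_modes(paired_sam, unpaired_sam):
    """Return (fragments hit only in paired mode, fragments hit only in unpaired mode)."""
    paired, unpaired = hit_fragments(paired_sam), hit_fragments(unpaired_sam)
    return paired - unpaired, unpaired - paired
```

If the argument above holds, both returned sets should be empty or nearly empty on the benchmark data.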