Recommended usage with paired-end reads #42

mikemc · 2019-11-16T21:12:24Z

The paper and wiki say that reads should be appropriately quality controlled. Do you recommend a particular filter-and-trim strategy for paired-end Illumina reads that works well with mOTUs2? Also, the wiki gives two different commands for profiling with paired-end reads:

As fastq input is possible to provide paired end reads:

motus profile -f forward_reads.fastq -r reverse_reads.fastq

as well as single reads that comes from quality filtering:

motus profile -f forward_reads.fq -r reverse_reads.fq -s single_reads.fq

Can you clarify what is happening in the second case (what is in single_reads.fq and how is it being used with the forward and reverse reads) and which method you typically suggest? Thanks!


AlessioMilanese · 2019-11-17T11:12:01Z

Hi @mikemc,
Thanks for your interest in mOTUs.

So, you have paired-end reads (forward and reverse). Some reads are of low quality, and it is better to either remove them or trim their low-quality bases.

You can use trimmomatic to filter and trim reads. Note that if you remove one read from the forward file, then the corresponding read in the reverse file doesn't have a pair. You have two options now:

also throw away the read in the reverse file, or
keep the read, but put it in a new file (which contains reads without pairs), and we call this file single.
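To make the two options concrete, here is a minimal Python sketch of how filtering can separate surviving pairs from orphaned reads. The function name `split_pairs`, the toy read IDs, and the "contains N" filter are illustrative assumptions; in practice trimmomatic does this for you and writes the paired and unpaired files itself.

```python
def split_pairs(forward, reverse, keep):
    """Split reads into surviving pairs and orphaned singles.

    forward, reverse: dicts mapping read ID -> sequence (same IDs in both).
    keep: function(sequence) -> bool, a stand-in for a real quality filter.
    """
    paired_fwd, paired_rev, single = {}, {}, {}
    for read_id in forward:
        fwd_ok = keep(forward[read_id])
        rev_ok = keep(reverse[read_id])
        if fwd_ok and rev_ok:
            # Both mates survive: they stay in the paired files.
            paired_fwd[read_id] = forward[read_id]
            paired_rev[read_id] = reverse[read_id]
        elif fwd_ok:
            # Only the forward mate survives: it goes to the single file.
            single[read_id + "/1"] = forward[read_id]
        elif rev_ok:
            # Only the reverse mate survives: it goes to the single file.
            single[read_id + "/2"] = reverse[read_id]
    return paired_fwd, paired_rev, single


# Toy data: a read made of "N" fails the hypothetical filter.
forward = {"r1": "ACGT", "r2": "NNNN", "r3": "GGCA"}
reverse = {"r1": "TTGA", "r2": "CCAT", "r3": "NNNN"}
paired_fwd, paired_rev, single = split_pairs(
    forward, reverse, lambda seq: "N" not in seq
)
print(sorted(paired_fwd), sorted(single))
```

The three results correspond to the files given to -f, -r, and -s in the second command from the wiki.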

mikemc · 2019-11-17T15:08:59Z

keep the read, but put it in a new file (which contains reads without pairs), and we call this file single.

I see, thanks for the clarification! Does the mapping done by motus profile use the paired-end information, or would the results be the same if the forward and reverse reads were filtered independently (losing the paired correspondence and keeping reads where the mate was discarded) and then passed to mOTUs with

motus profile -s forward_reads_filtered.fq,reverse_reads_filtered.fq

AlessioMilanese · 2019-11-17T20:13:02Z

motus profile uses paired-end information when mapping the reads.

It helps to assign reads that map to multiple species. For example, if for_read maps equally well to species_1 and species_2, and rev_read maps equally well to species_2 and species_3, then we know that the read pair comes from species_2.
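The reasoning in this example amounts to intersecting the candidate species of the two mates. The Python sketch below only illustrates that idea; it is not how mOTUs implements the assignment internally, and the function name `resolve_pair` and the fall-back to the union when no species is shared are assumptions made for the illustration.

```python
def resolve_pair(forward_hits, reverse_hits):
    """Return the species supported by both mates.

    If the mates share no species, fall back to the union of candidates
    (an assumption for this sketch, not documented mOTUs behaviour).
    """
    shared = forward_hits & reverse_hits
    return shared if shared else forward_hits | reverse_hits


# Candidate species for each mate, taken from the example above.
forward_hits = {"species_1", "species_2"}
reverse_hits = {"species_2", "species_3"}

print(resolve_pair(forward_hits, reverse_hits))  # {'species_2'}
```

If the same two reads were passed as unpaired reads with -s, each would stay ambiguous between two species, which is why keeping the pairing helps.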

mikemc · 2019-11-18T22:16:22Z

Got it, thanks.

I'm interested in whether you think trimming and filtering is necessary (beyond removing any adapter sequences), and if so what parameters (e.g. for trimmomatic) you use in your own analysis or in the benchmarking datasets from the mOTUs2 paper. The BWA creator has suggested that quality trimming is not necessary for BWA-MEM due to the soft-clipping behavior, but the BWA FAQ indicates error rates above 2% on 100bp reads might not work well, so it is unclear to me whether spurious alignments are likely enough to be a problem for the resulting mOTUs2 profiles. Have you seen benefits of trimming and filtering beyond computational speed-ups from having to align fewer reads? For example, fewer false positives in the mOTUs that are identified?

AlessioMilanese · 2019-11-19T12:53:47Z

The BWA creator has suggested that quality trimming is not necessary for BWA-MEM due to the soft-clipping behavior, but the BWA FAQ indicates error rates above 2% on 100bp reads might not work well, so it is unclear to me whether spurious alignments are likely enough to be a problem for the resulting mOTUs2 profiles. Have you seen benefits of trimming and filtering beyond computational speed-ups from having to align fewer reads? For example, fewer false positives in the mOTUs that are identified?

Trimming reads doesn't make much of a difference. For example, here I evaluated it on ~100 samples with mOTUs v2: changing -g has a bigger effect than trimming.

AlessioMilanese · 2019-11-19T13:14:42Z

and if so what parameters (e.g. for trimmomatic) you use in your own analysis or in the benchmarking datasets from the mOTUs2 paper.

I used the following parameters for trimmomatic:

LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36

which are also the ones used on the trimmomatic website.
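Putting the pieces of this thread together, a possible workflow can be sketched as two command lines, built here in Python without executing them. The jar path and all file names are placeholders, not values from this thread. The output order (forward paired, forward unpaired, reverse paired, reverse unpaired) follows trimmomatic's paired-end mode, and passing both unpaired files to -s as a comma-separated list is an assumption based on the form used earlier in this thread.

```python
import shlex

# Trimming steps quoted above.
STEPS = ["LEADING:3", "TRAILING:3", "SLIDINGWINDOW:4:15", "MINLEN:36"]

# Placeholder output names (assumptions for this sketch).
FWD_PAIRED, FWD_UNPAIRED = "forward_paired.fq", "forward_unpaired.fq"
REV_PAIRED, REV_UNPAIRED = "reverse_paired.fq", "reverse_unpaired.fq"


def trimmomatic_pe_command(fwd, rev, jar="trimmomatic.jar"):
    """Build a trimmomatic paired-end command; the jar path is a placeholder."""
    outputs = [FWD_PAIRED, FWD_UNPAIRED, REV_PAIRED, REV_UNPAIRED]
    return ["java", "-jar", jar, "PE", fwd, rev, *outputs, *STEPS]


def motus_profile_command():
    """Build a motus profile command that uses both pairs and orphaned reads."""
    singles = ",".join([FWD_UNPAIRED, REV_UNPAIRED])
    return ["motus", "profile", "-f", FWD_PAIRED, "-r", REV_PAIRED, "-s", singles]


trim_cmd = trimmomatic_pe_command("forward_reads.fastq", "reverse_reads.fastq")
motus_cmd = motus_profile_command()
print(shlex.join(trim_cmd))
print(shlex.join(motus_cmd))
```

The printed lines can be checked by eye before running them in a shell. Adapter removal is not included here.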

mikemc · 2019-11-19T15:42:19Z

Great, thanks again for the info. That answers my questions so I'll go ahead and close the issue.

AlessioMilanese self-assigned this Nov 17, 2019

AlessioMilanese added the "help wanted" label Nov 17, 2019

mikemc closed this as completed Nov 19, 2019
