Recommended usage with paired-end reads #42

Closed
mikemc opened this issue Nov 16, 2019 · 7 comments

@mikemc

mikemc commented Nov 16, 2019

The paper and wiki say that reads should be appropriately quality controlled. Do you recommend a particular filter-and-trim strategy for paired-end Illumina reads that works well with mOTUs2? Also, the wiki gives two different commands for profiling with paired-end reads:

As fastq input is possible to provide paired end reads:

motus profile -f forward_reads.fastq -r reverse_reads.fastq

as well as single reads that comes from quality filtering:

motus profile -f forward_reads.fq -r reverse_reads.fq -s single_reads.fq

Can you clarify what is happening in the second case (what is in single_reads.fq and how is it being used with the forward and reverse reads) and which method you typically suggest? Thanks!

AlessioMilanese self-assigned this Nov 17, 2019
AlessioMilanese added the "help wanted" label Nov 17, 2019
@AlessioMilanese
Member

Hi @mikemc,
Thanks for your interest in mOTUs.

So, you have paired-end reads (forward and reverse). Some reads are of low quality, and it is better to either remove them or trim their low-quality bases.

You can use Trimmomatic to filter and trim reads. Note that if you remove a read from the forward file, the corresponding read in the reverse file no longer has a pair. You then have two options:

  1. also throw away the read in the reverse file, or
  2. keep the read, but put it in a new file (which contains reads without pairs), and we call this file single.
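
The second option can be sketched as a two-step workflow. This is only an illustration under assumptions: it presumes `trimmomatic` and `motus` are on the PATH (Trimmomatic is often run as `java -jar trimmomatic-<version>.jar` instead), all file names and the `sample` prefix are placeholders, and the trimming steps are the ones from Trimmomatic's quick-start example. The helper names `run` and `trim_and_profile` are hypothetical. By default the script only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Sketch: trim paired-end reads, then profile the surviving pairs together
# with the reads whose mate was discarded. All file names are placeholders.

# Print the command when DRY_RUN is 1 (the default here); otherwise run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: trim_and_profile FORWARD REVERSE OUTPUT_PREFIX
trim_and_profile() {
    # Trimmomatic PE writes four files, in this order: forward paired,
    # forward unpaired, reverse paired, reverse unpaired.
    run trimmomatic PE "$1" "$2" \
        "${3}_1P.fq" "${3}_1U.fq" \
        "${3}_2P.fq" "${3}_2U.fq" \
        LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36 &&
    # Surviving pairs go to -f/-r; the unpaired ("single") reads go to -s,
    # given as a comma-separated list.
    run motus profile -f "${3}_1P.fq" -r "${3}_2P.fq" \
        -s "${3}_1U.fq,${3}_2U.fq"
}

trim_and_profile forward_reads.fastq reverse_reads.fastq sample
```

Passing both unpaired files to `-s` as a comma-separated list mirrors the `-s file1,file2` form used later in this thread; concatenating them into one `single_reads.fq` first should be equivalent.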

@mikemc
Author

mikemc commented Nov 17, 2019

  2. keep the read, but put it in a new file (which contains reads without pairs), and we call this file single.

I see, thanks for the clarification! Does the mapping done by motus profile use the paired-end information, or would the results be the same if the forward and reverse reads were filtered independently (losing the paired correspondence and keeping reads where the mate was discarded) and then passed to mOTUs with

motus profile -s forward_reads_filtered.fq,reverse_reads_filtered.fq

@AlessioMilanese
Member

motus profile uses paired-end information when mapping the reads.

This helps to assign reads that map to multiple species. For example, if for_read maps equally well to species_1 and species_2, and rev_read maps equally well to species_2 and species_3, then we know that the read pair comes from species_2.
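
The example above amounts to intersecting the candidate species of the two mates. The sketch below is purely conceptual and is not how mOTUs implements it; the function name `shared_hits` and the variable names are hypothetical.

```shell
#!/bin/sh
# Conceptual sketch only: NOT the mOTUs implementation.
# shared_hits prints every candidate that appears in both
# whitespace-separated lists, one per line.
shared_hits() {
    for a in $1; do
        for b in $2; do
            if [ "$a" = "$b" ]; then
                echo "$a"
            fi
        done
    done
}

for_read_hits="species_1 species_2"   # forward read maps equally well to these
rev_read_hits="species_2 species_3"   # reverse read maps equally well to these

# Only species_2 is consistent with both mates.
shared_hits "$for_read_hits" "$rev_read_hits"
```

If the two files are filtered independently and passed with `-s`, each read is treated on its own and this pairing constraint is lost.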

@mikemc
Copy link
Author

mikemc commented Nov 18, 2019

Got it, thanks.

I'm interested in whether you think trimming and filtering is necessary (beyond removing any adapter sequences), and if so what parameters (e.g. for trimmomatic) you use in your own analysis or in the benchmarking datasets from the mOTUs2 paper. The BWA creator has suggested that quality trimming is not necessary for BWA-MEM due to the soft-clipping behavior, but the BWA FAQ indicates error rates above 2% on 100bp reads might not work well, so it is unclear to me whether spurious alignments are likely enough to be a problem for the resulting mOTUs2 profiles. Have you seen benefits of trimming and filtering beyond computational speed-ups from having to align less reads? I.e., more false-positives in the mOTUs that are identified?

@AlessioMilanese
Member

The BWA creator has suggested that quality trimming is not necessary for BWA-MEM due to the soft-clipping behavior, but the BWA FAQ indicates error rates above 2% on 100bp reads might not work well, so it is unclear to me whether spurious alignments are likely enough to be a problem for the resulting mOTUs2 profiles. Have you seen benefits of trimming and filtering beyond computational speed-ups from having to align less reads? I.e., more false-positives in the mOTUs that are identified?

Trimming reads doesn't make much of a difference. For example, here I evaluated this on ~100 samples with mOTUs v2. Changing -g has a bigger effect than trimming.
[Image: diference_trimming]
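
To check the effect of `-g` on your own data, one option is to profile the same sample at several cutoffs and compare the results. The sketch below is an assumption-laden illustration, not the evaluation shown in the figure: the file names, the cutoff values 1, 3 and 6, the `.motus` output suffix, and the helper names `run` and `profile_with_cutoffs` are all placeholders, and it assumes `-o` sets the output file of `motus profile`. It prints the commands by default; set `DRY_RUN=0` to run them.

```shell
#!/bin/sh
# Sketch: profile one sample at several -g (marker gene cutoff) values.

# Print the command when DRY_RUN is 1 (the default here); otherwise run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: profile_with_cutoffs FORWARD REVERSE OUTPUT_PREFIX "CUTOFFS"
profile_with_cutoffs() {
    for g in $4; do
        # One output file per cutoff, e.g. sample_g3.motus
        run motus profile -f "$1" -r "$2" -g "$g" -o "${3}_g${g}.motus"
    done
}

profile_with_cutoffs forward_reads.fq reverse_reads.fq sample "1 3 6"
```

Running the same loop on trimmed and untrimmed input would give a direct comparison of the two effects.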

@AlessioMilanese
Member

and if so what parameters (e.g. for trimmomatic) you use in your own analysis or in the benchmarking datasets from the mOTUs2 paper.

I used the following parameters for trimmomatic:

LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36

which are also used on the Trimmomatic website.

@mikemc
Author

mikemc commented Nov 19, 2019

Great, thanks again for the info. That answers my questions so I'll go ahead and close the issue.

mikemc closed this as completed Nov 19, 2019