Trouble with uniquely mapped reads #43

nafeesatk · 2017-02-23T22:27:14Z

I am trying to just learn how to use the tool and am having issues with the first step (species taxonomy) when running samples downloaded from NCBI. I ran the test script and went through the tutorial without issue. I downloaded one of the sequences from the Hadza microbiome study and tried to run MIDAS with that. At the first step I did not get any uniquely mapped reads and therefore can't seem to get the species taxonomy and abundance for the sample. I've run in to this with other sequences I've tried as well. Here is the result I got:

sbrgxps@sbrgxps-XPS-8910:~/MIDAS$ run_midas.py species SRR1927149 -1 ../rampelliseqs/SRR1927149.fq

MIDAS: Metagenomic Intra-species Diversity Analysis System
version 1.2.1; github.com/snayfach/MIDAS
Copyright (C) 2015-2016 Stephen Nayfach
Freely distributed under the GNU General Public License (GPLv3)

===========Parameters===========
Command: /home/sbrgxps/MIDAS/scripts/run_midas.py species SRR1927149 -1 ../rampelliseqs/SRR1927149.fq
Script: run_midas.py species
Database: /home/sbrgxps/MIDAS/midas_db_v1.2
Output directory: SRR1927149
Input reads (1st mate): ../rampelliseqs/SRR1927149.fq
Input reads (2nd mate): None
Remove temporary files: False
Word size for database search: 28
Minimum mapping alignment coverage: 0.75
Number of reads to use from input: use all
Number of threads for database search: 1

Aligning reads to marker-genes database
17.92 minutes
0.73 Gb maximum memory

Classifying reads
total alignments: 25486
uniquely mapped reads: 0
ambiguously mapped reads: 0
0.0 minutes
0.74 Gb maximum memory

Estimating species abundance
total marker-gene coverage: 0.0
0.0 minutes
0.74 Gb maximum memory

Did I install something incorrectly or is there an issue with how I am running the scripts? I am new to using metagenomics analysis tools so I wasn't sure where to start with troubleshooting.

Thank you!

snayfach · 2017-02-23T22:52:36Z

Hi nafeesatk,

Everything you've done looks correct. However, I've run that same sample through MIDAS and get several hundred uniquely reads mapped to the marker genes.

I think the easiest solution would be for me to recreate the issue on my system using your FASTQ file. If you send me your email address, I can send a dropbox link where you can upload your FASTQ.

Thanks,
Stephen

nafeesatk · 2017-02-24T00:03:44Z

Hi Stephen,

Thank you so much for your help! My email is ntkhan@eng.ucsd.edu.

Best,

Nafeesa

snayfach · 2017-02-24T00:41:09Z

Hi Nafeesa,

I see the issue now. It appears you have merged read1 and read2 from SRR1927149_1.fastq and SRR1927149_2.fastq.

For most Illumina libraries read1 and read2 occur about 350 bp apart. So you won't get an alignment that covers the entire read, because in reality, there is a big gap between them. MIDAS sees that the alignments only cover a portion of the read and is therefore throwing them out.

I would strongly recommend running MIDAS without merging your reads. But if this is not possible, you can try decreasing the value for --aln_cov. The default if 0.7, but you can try dropping it to 0.35.

Best,
Stephen

snayfach closed this as completed Feb 24, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Trouble with uniquely mapped reads #43

Trouble with uniquely mapped reads #43

nafeesatk commented Feb 23, 2017

snayfach commented Feb 23, 2017

nafeesatk commented Feb 24, 2017

snayfach commented Feb 24, 2017

Trouble with uniquely mapped reads #43

Trouble with uniquely mapped reads #43

Comments

nafeesatk commented Feb 23, 2017

snayfach commented Feb 23, 2017

nafeesatk commented Feb 24, 2017

snayfach commented Feb 24, 2017