screen.seqs errors with optimize option #288

Closed
khyox opened this Issue Nov 30, 2016 · 5 comments

khyox commented Nov 30, 2016

I am running mothur v1.38.1 on a GNU/Linux box. When I call screen.seqs from a batch file, combining the count, alignreport and minscore options works as expected and filters the sequences:

(...)
align.seqs(reference=/software/mothur/silva.v123.V4.fasta, flip=t)
(...)
Output File Names:
myproject.trim.contigs.good.good.unique.align
myproject.trim.contigs.good.good.unique.align.report
(...)
summary.seqs(fasta=current, count=current)
(...)
screen.seqs(fasta=current, count=current, alignreport=myproject.trim.contigs.good.good.unique.align.report, minscore=50)
Using myproject.trim.contigs.good.good.count_table as input file for the count parameter.
Using myproject.trim.contigs.good.good.unique.align as input file for the fasta parameter.

Using 64 processors.
Processing sequence: 100
(...)
Processing sequence: 384

Output File Names:
myproject.trim.contigs.good.good.unique.align.good.align.report
myproject.trim.contigs.good.good.unique.good.align
myproject.trim.contigs.good.good.unique.bad.accnos
myproject.trim.contigs.good.good.good.count_table

It took 1 secs to screen 24569 sequences.

However, when I use the optimize option instead (together with count and alignreport), the command fails with errors about a namefile, even though no name file is given as input:

(...)
screen.seqs(fasta=current, count=current, alignreport=myproject.trim.contigs.good.good.unique.align.report, optimize=minscore, criteria=99)
Using myproject.trim.contigs.good.good.count_table as input file for the count parameter.
Using myproject.trim.contigs.good.good.unique.align as input file for the fasta parameter.

Using 64 processors.
[ERROR]: M01167_151_000000000-AV07A_1_1101_27033_7405 is not in your namefile, please correct.
[ERROR]: M01167_151_000000000-AV07A_1_1101_16105_10387 is not in your namefile, please correct.
[ERROR]: M01167_151_000000000-AV07A_1_1101_2853_13443 is not in your namefile, please correct.
[ERROR]: M01167_151_000000000-AV07A_1_1101_16835_17019 is not in your namefile, please correct.
(...)

Yet the labels do exist in the current name file (as reported by get.current() just before screen.seqs):

> egrep M01167_151_000000000-AV07A_1_1101_27033_7405 myproject.trim.contigs.good.good.names
M01167_151_000000000-AV07A_1_1101_27033_7405	M01167_151_000000000-AV07A_1_1101_27033_7405,...
> egrep M01167_151_000000000-AV07A_1_1101_16105_10387 myproject.trim.contigs.good.good.names
M01167_151_000000000-AV07A_1_1101_16105_10387	M01167_151_000000000-AV07A_1_1101_16105_10387,...
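
The same check can be done for many labels at once. Below is a minimal Python sketch, not part of mothur, that collects the first column of a name file or count table and reports which labels are absent. It assumes both files are tab-separated with the sequence label in the first column, and that a count table starts with a header line whose first field is `Representative_Sequence`. The function names `read_labels` and `missing_labels` are hypothetical helpers introduced only for this illustration.

```python
def read_labels(path):
    """Return the set of first-column labels from a tab-separated file.

    Assumption: works for a mothur name file (label <TAB> comma-separated
    list) and for a count table (header line, then label <TAB> counts).
    """
    labels = set()
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            # Skip blank lines and comment lines.
            if not line or line.startswith("#"):
                continue
            first = line.split("\t", 1)[0]
            # Assumed count table header; it is not a sequence label.
            if first == "Representative_Sequence":
                continue
            labels.add(first)
    return labels


def missing_labels(wanted, path):
    """Return the labels from `wanted` that are absent from the file, in order."""
    known = read_labels(path)
    return [label for label in wanted if label not in known]
```

Calling `missing_labels` with the labels from the error messages and the path of the count table (or name file) returns an empty list when every label is present, which would suggest that the error does not come from the input files themselves.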

Finally, I tried to work around the problem by generating an updated name file and passing it to screen.seqs, without success:

align.seqs(reference=/software/mothur/silva.v123.V4.fasta, flip=t)
(...)
Output File Names:
myproject.trim.contigs.good.good.unique.align
myproject.trim.contigs.good.good.unique.align.report
(...)
summary.seqs(fasta=current, count=current)
(...)
unique.seqs(fasta=current, count=current, format=name)
Using myproject.trim.contigs.good.good.count_table as input file for the count parameter.
Using myproject.trim.contigs.good.good.unique.align as input file for the fasta parameter.
(...)
Output File Names:
myproject.trim.contigs.good.good.unique.names
myproject.trim.contigs.good.good.unique.unique.align

screen.seqs(fasta=current, name=current, alignreport=gollum.trim.contigs.good.good.unique.align.report, optimize=minscore, criteria=99)
Using gollum.trim.contigs.good.good.unique.unique.align as input file for the fasta parameter.
Using gollum.trim.contigs.good.good.unique.names as input file for the name parameter.

Using 64 processors.
[ERROR]: M01167_151_000000000-AV07A_1_1101_27033_7405 is not in your namefile, please correct.
[ERROR]: M01167_151_000000000-AV07A_1_1101_16105_10387 is not in your namefile, please correct.
(...)

No luck again.
Thanks!

Contributor

mothur-westcott commented Jan 3, 2017

Sorry for the confusion. The "namefile" wording in the error message is misleading: in this case it refers to either the count file or the name file. Thank you for bringing the issue with the optimize parameter to my attention. I will look into it and work on a fix.

khyox commented Jan 8, 2017

Thank you very much Sarah!

Contributor

mothur-westcott commented Jan 9, 2017

The fix will be part of 1.39.0, which will be released next week. :)

khyox commented Jan 25, 2017

Thank you very much for the release of 1.39.0 with this solved! :)

Contributor

mothur-westcott commented Jan 26, 2017

😄
