Count.seqs large=t parameter #203

Closed
mothur-westcott opened this Issue Jan 28, 2016 · 2 comments

Contributor

mothur-westcott commented Jan 28, 2016

Works fine without large=t

`mothur > count.seqs(name=EMP2.cat.trim.unique.pick.good.filter.unique.abund.precluster.pick.names, group=EMP2.cat.trim.unique.pick.good.filter.unique.abund.precluster.pick.groups, large=t)

Using 1 processors.
It took 28716 seconds to sort and index the group and name files.
[ERROR]: found AF10.10.15.1181373_14597 in your groupfile, but AF10.10.15.1181373_14597288 was in your namefile, please correct.
It took 9937 seconds to create the count table file.

mothur > quit()

However, I am sure that both sequences are in my group file and my name file:

This is a grep of my group file:

AF10.10.15.1181373_14597 AF10.10.15.1181373
AF10.10.15.1181373_14597489 AF10.10.15.1181373
AF10.10.15.1181373_14597288 AF10.10.15.1181373

My name file has the sequences in the following context:

TV10.11.7.1182081_14566,AF10.10.15.1181373_14597,TV10.4.7.1181534_14605,
,AF10.9.15.1181763_14597487,AF10.10.15.1181373_14597489,5.11.6I.1181392_14597501,
AF11.2.7.1181945_14597254,AF10.10.15.1181373_14597288,IF5.23.1182198_14603576,

As a result, the count.seqs command did not return a count file.`
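
For anyone who wants to confirm independently that every name in a name file has an exact match in the group file, here is a minimal Python sketch. It is not mothur's implementation and does not reproduce the `large=t` code path. It assumes the usual two-column layouts: a name file with a representative name followed by a comma-separated list of names, and a group file with a sequence name followed by its group. The helper names (`read_group_file`, `read_name_file`, `missing_from_groups`) and the toy data, which reuses IDs from the report above, are illustrative only.

```python
from io import StringIO


def read_group_file(handle):
    """Map each sequence name to its group (assumed: two whitespace-separated columns)."""
    groups = {}
    for line in handle:
        parts = line.split()
        if len(parts) >= 2:
            groups[parts[0]] = parts[1]
    return groups


def read_name_file(handle):
    """Yield every name in the second, comma-separated column of a name file."""
    for line in handle:
        parts = line.split()
        if len(parts) >= 2:
            for name in parts[1].split(","):
                if name:  # skip empty fields left by stray commas
                    yield name


def missing_from_groups(name_handle, group_handle):
    """Return the sorted names that appear in the name file but not in the group file."""
    groups = read_group_file(group_handle)
    # Dictionary lookup is an exact string match, so a name that is merely a
    # prefix of another (..._14597 vs ..._14597288) is never confused with it.
    return sorted({n for n in read_name_file(name_handle) if n not in groups})


# Toy data (hypothetical layout, IDs taken from the report).
group_text = (
    "AF10.10.15.1181373_14597\tAF10.10.15.1181373\n"
    "AF10.10.15.1181373_14597489\tAF10.10.15.1181373\n"
    "AF10.10.15.1181373_14597288\tAF10.10.15.1181373\n"
)
name_text = (
    "AF10.10.15.1181373_14597\t"
    "AF10.10.15.1181373_14597,AF10.10.15.1181373_14597489\n"
    "AF10.10.15.1181373_14597288\tAF10.10.15.1181373_14597288\n"
)

missing = missing_from_groups(StringIO(name_text), StringIO(group_text))
print(missing)  # an empty list means every name has an exact match
```

To run it on real files, pass open file handles instead of the `StringIO` objects. An empty result would suggest the input files are consistent and the mismatch reported above comes from the command rather than the data.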

@mothur-westcott mothur-westcott added this to the Version 1.38.0 milestone Jan 28, 2016

Contributor

pschloss commented Apr 14, 2016

I feel like we should probably scrap the large=T option in cluster as it actually performs worse than large=F for every dataset.

Contributor

mothur-westcott commented Jul 12, 2016

Removes large parameter from count.seqs with commit f6d9e36
