Merge.groups #105

Closed
mothur-westcott opened this issue Jun 8, 2015 · 5 comments

Comments

@mothur-westcott
Contributor

Add a method parameter. Options are sum, average, and median. The default is sum, which is mothur's current behavior.

@mothur-westcott
Contributor Author

@pschloss What do you want to do with the average and median options when there are OTUs of low abundance? For example, you may have a count table or shared OTU that looks like:

1 0 0 2 0 2 0 0 1 0 0

Given the merging of the groups as:

1 0 0 - average = 0 median = 0
2 0 - average = 1 median = 1
2 0 0 - average = 0 median = 0
1 0 0 - average = 0 median = 0

Do you want to round up for the average? A ceiling on the int? With the median as is, we could in theory end up with empty count tables. Another approach?
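
To make the rounding question concrete, here is a minimal Python sketch that reproduces the numbers above. It is not mothur's code: `merge_counts` and its `round_up` flag are hypothetical names used only for illustration, and the average is assumed to be truncated to an integer unless `round_up` is set.

```python
import math
from statistics import median

# The example OTU's 11 counts, split into the four merged groups listed above.
merged_groups = [[1, 0, 0], [2, 0], [2, 0, 0], [1, 0, 0]]


def merge_counts(counts, method="sum", round_up=False):
    """Collapse one OTU's counts for the samples in a merged group (hypothetical helper)."""
    if method == "sum":
        return sum(counts)
    if method == "average":
        mean = sum(counts) / len(counts)
        # Assumption: truncate to an int by default, or take the ceiling if requested.
        return math.ceil(mean) if round_up else int(mean)
    if method == "median":
        return int(median(counts))
    raise ValueError(f"unknown method: {method}")


for method in ("sum", "average", "median"):
    print(method, [merge_counts(g, method) for g in merged_groups])
# sum [1, 2, 2, 1]
# average [0, 1, 0, 0]
# median [0, 1, 0, 0]
print("average, ceiling", [merge_counts(g, "average", round_up=True) for g in merged_groups])
# average, ceiling [1, 1, 1, 1]
```

With a ceiling, every non-empty group keeps at least one count, whereas truncation and the median both drop the OTU from three of the four merged groups.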

@pschloss
Contributor

Ugh, I really don't like average and median when people have different numbers of sequences per group. For average can we give the float? And for median can you remove the column?

@mothur-westcott
Contributor Author

For the average, I could give a float and return a relabund file instead of a shared file, but that wouldn't work for the count table, since the count table requires integers. Maybe we don't allow average for the count table? For the median in a shared file, I could remove OTUs that zero out. For the median in the count table, I could remove sequences that zero out and produce an accnos file you could use to remove those sequences from your other files. Your thoughts?
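
A rough Python sketch of the count-table median idea, for discussion only. The function name, the dictionary layout, and the sample data are all made up for illustration; the only assumption about mothur itself is that an accnos file is a plain list of sequence names.

```python
from statistics import median


def median_merge_count_rows(rows, merge_map):
    """Merge groups by median and separate out sequences that zero out.

    rows:      {sequence_name: {old_group: count}}   (hypothetical layout)
    merge_map: {new_group: [old_group, ...]}
    Returns (merged_rows, zeroed_names); zeroed_names is what would go in the accnos file.
    """
    merged, zeroed = {}, []
    for name, counts in rows.items():
        new_counts = {
            new: int(median([counts[g] for g in olds]))
            for new, olds in merge_map.items()
        }
        if sum(new_counts.values()) == 0:
            zeroed.append(name)  # no counts left in any merged group
        else:
            merged[name] = new_counts
    return merged, zeroed


# Made-up data: five original groups merged into two.
rows = {
    "seq1": {"A": 1, "B": 0, "C": 0, "D": 2, "E": 0},
    "seq2": {"A": 5, "B": 4, "C": 6, "D": 0, "E": 0},
    "seq3": {"A": 1, "B": 0, "C": 0, "D": 0, "E": 0},
}
merge_map = {"X": ["A", "B", "C"], "Y": ["D", "E"]}

merged, zeroed = median_merge_count_rows(rows, merge_map)
print(merged)  # {'seq1': {'X': 0, 'Y': 1}, 'seq2': {'X': 5, 'Y': 0}}
print(zeroed)  # ['seq3']
```

Here seq3 has no counts left after merging, so its name would be written out for removal from the other files.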

@pschloss
Contributor

How about this...

For average and median, check that all of the groups have the same number of sequences, if not - throw an error and stop the command.

Don't use average for a count table. For median, can we force them to provide the fasta file and do the remove.seqs step on the fasta file at the same time?
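
The rules proposed above could be sketched as a single up-front check. This is a Python illustration only, not what the command does: `check_merge_request`, its arguments, and the error messages are hypothetical.

```python
def check_merge_request(method, group_totals, input_type, fasta=None):
    """Raise ValueError if a merge request breaks the proposed rules.

    group_totals: {group_name: total number of sequences in that group}
    input_type:   "shared" or "count"
    fasta:        fasta file name, needed for median on a count table
    """
    if method not in ("sum", "average", "median"):
        raise ValueError(f"unknown method: {method}")
    if method == "sum":
        return  # current behavior, no extra restrictions

    # average and median only make sense when every group has the same depth
    if len(set(group_totals.values())) > 1:
        raise ValueError(f"{method} requires the same number of sequences in every group")

    if input_type == "count":
        if method == "average":
            raise ValueError("average is not allowed for a count table")
        if fasta is None:
            raise ValueError("median on a count table requires a fasta file")


# Unequal groups are fine for sum but stop the command for average or median.
check_merge_request("sum", {"A": 100, "B": 250}, "count")
try:
    check_merge_request("median", {"A": 100, "B": 250}, "shared")
except ValueError as err:
    print(err)
```

Failing before any merging starts keeps the command from writing partial output files.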

mothur-westcott added a commit that referenced this issue Sep 15, 2015
Issue #105

Bug Fix: missing tab in header count.seqs large=t
mothur-westcott added a commit that referenced this issue Sep 15, 2015
Issue #105

Fixes design flaw in designMap new functions
@mothur-westcott
Contributor Author

Completed with commit f487554

2 participants