add functionality for collapsing replicate samples in either OTU tables or distance matrices #1678

gregcaporaso · 2014-09-24T14:16:47Z

Define a replicate group as samples that are considered replicates of each other (e.g., biological or technical replicates). Samples belonging to a replicate group, in practice, are likely to be grouped based on a pair of sample metadata categories (e.g., subject-id and replicate-number).
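As a rough illustration of that definition, here is a minimal sketch of deriving replicate groups from sample metadata. It assumes the metadata is available as a plain dict keyed by sample ID; the function name `replicate_groups`, the sample IDs, and the metadata values are all hypothetical. It also assumes one reading of the example above: samples sharing a `subject-id` form a group, and `replicate-number` only distinguishes the members.

```python
from collections import defaultdict


def replicate_groups(sample_metadata, group_categories):
    """Map each tuple of category values to the sample IDs that share it.

    sample_metadata: dict of sample ID -> dict of category -> value
    group_categories: categories whose combined values define a group
    """
    groups = defaultdict(list)
    for sample_id, categories in sample_metadata.items():
        key = tuple(categories[c] for c in group_categories)
        groups[key].append(sample_id)
    return dict(groups)


# Hypothetical metadata: subject A has two replicates, subject B has one.
metadata = {
    "A.1": {"subject-id": "A", "replicate-number": "1"},
    "A.2": {"subject-id": "A", "replicate-number": "2"},
    "B.1": {"subject-id": "B", "replicate-number": "1"},
}

groups = replicate_groups(metadata, ["subject-id"])
```

Passing more than one category in `group_categories` would group on their combined values instead.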

It's common that we have replicate samples in a study, but when we start performing downstream analyses we want to collapse the replicates to a single sample per replicate group.

Some possible ways that we would want to collapse samples in a replicate group at the OTU table or distance matrix stage are:

randomly select one sample from the replicate group
for each observation, take the median count across samples in the replicate group
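The two options above could be sketched as follows. This is only an illustration, not the eventual QIIME implementation: it assumes the OTU table is a plain dict of sample ID to a list of counts (one entry per observation, in a fixed order), and the function names and counts are hypothetical.

```python
import random
from statistics import median


def collapse_random(table, group, rng=random):
    """Return the counts of one randomly chosen sample in the group."""
    return list(table[rng.choice(group)])


def collapse_median(table, group):
    """Return, for each observation, the median count across the group."""
    # zip(*...) turns per-sample rows into per-observation columns.
    return [median(column) for column in zip(*(table[s] for s in group))]


# Hypothetical counts for three replicates over three observations.
table = {"A.1": [10, 0, 3], "A.2": [12, 2, 3], "A.3": [50, 1, 0]}
group = ["A.1", "A.2", "A.3"]

median_profile = collapse_median(table, group)
random_profile = collapse_random(table, group, random.Random(0))
```

Note that with an even number of replicates `statistics.median` averages the two middle values, so the collapsed counts may no longer be integers.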

Other ideas for how we might want to collapse these?

If anyone has code for doing this already, please follow up here. I'll take the lead on this as I need the code for an analysis I'm running now.

rob-knight · 2014-09-24T14:21:23Z

Mean count across samples

Pick sample with the most reads (i.e., collapse before rarefaction)

Sum counts across samples (before or after rarefaction)

Pick sample that is the centroid of the set of replicates
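A minimal sketch of these four suggestions, under stated assumptions: the OTU table is a dict of sample ID to a list of counts, the distance matrix is a nested dict of sample ID to sample ID to distance, and "centroid" is interpreted as the medoid (the replicate with the smallest total distance to the others in its group). All function names and values are hypothetical.

```python
from statistics import mean


def _columns(table, group):
    """Yield the per-observation counts across the samples in the group."""
    return zip(*(table[s] for s in group))


def collapse_mean(table, group):
    """Mean count per observation across the group."""
    return [mean(column) for column in _columns(table, group)]


def collapse_sum(table, group):
    """Summed count per observation across the group."""
    return [sum(column) for column in _columns(table, group)]


def collapse_most_reads(table, group):
    """Counts of the replicate with the largest total read count."""
    deepest = max(group, key=lambda s: sum(table[s]))
    return list(table[deepest])


def collapse_centroid(distances, group):
    """ID of the replicate with the smallest total distance to the group."""
    return min(group, key=lambda s: sum(distances[s][t] for t in group))


# Hypothetical counts and pairwise distances for three replicates.
table = {"A.1": [10, 0, 2], "A.2": [12, 2, 4], "A.3": [50, 1, 0]}
distances = {
    "A.1": {"A.1": 0.0, "A.2": 0.2, "A.3": 0.9},
    "A.2": {"A.1": 0.2, "A.2": 0.0, "A.3": 0.8},
    "A.3": {"A.1": 0.9, "A.2": 0.8, "A.3": 0.0},
}
group = ["A.1", "A.2", "A.3"]

mean_profile = collapse_mean(table, group)
sum_profile = collapse_sum(table, group)
deepest_profile = collapse_most_reads(table, group)
centroid_id = collapse_centroid(distances, group)
```

The count-based functions produce a new profile for the group, while the centroid function returns an existing sample ID, which makes it usable on a distance matrix alone.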

Also, you might want to apply automated outlier detection on a per-group or per-dataset basis before running any of these.
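No particular outlier-detection method is specified here, so the following is just one possible per-group heuristic, sketched under assumptions: a replicate is flagged when its mean distance to the other group members exceeds a multiple of the group's median mean distance. The function name `flag_outliers`, the `factor` parameter, and the distances are hypothetical.

```python
from statistics import mean, median


def flag_outliers(distances, group, factor=1.5):
    """Return replicates whose mean distance to the rest of the group
    exceeds `factor` times the group's median mean distance."""
    if len(group) < 3:
        # With fewer than three replicates there is no majority to compare to.
        return []
    mean_dist = {
        s: mean(distances[s][t] for t in group if t != s) for s in group
    }
    cutoff = factor * median(mean_dist.values())
    return [s for s in group if mean_dist[s] > cutoff]


# Hypothetical pairwise distances: A.3 sits far from the other two replicates.
distances = {
    "A.1": {"A.2": 0.1, "A.3": 0.9},
    "A.2": {"A.1": 0.1, "A.3": 0.9},
    "A.3": {"A.1": 0.9, "A.2": 0.9},
}

outliers = flag_outliers(distances, ["A.1", "A.2", "A.3"])
```

Flagged samples could then be dropped from the group before any of the collapsing strategies is applied.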

Thanks for adding functionality for doing this generally and right -- it will be really useful!

gregcaporaso · 2014-09-24T16:32:23Z

Some initial experiments with this here.

gregcaporaso added the enhancement and question labels Sep 24, 2014

gregcaporaso self-assigned this Sep 24, 2014

jairideout added this to the QIIME 1.9.0 milestone Dec 8, 2014

gregcaporaso closed this as completed in 7a5b786 Dec 18, 2014
