using phyloseq distance with normalisations from DESeq or edgeR #449

Closed
FabianRoger opened this issue Mar 18, 2015 · 1 comment

Comments

@FabianRoger

Hi,

Sorry for this naive question, but after reading your convincing article ("Waste Not, Want Not") I would like to use the proposed techniques on my data. In particular, I want to calculate distance matrices using the appropriate alternatives to rarefaction, but I am not sure whether or how this has been implemented in the distance() function of the phyloseq package. It doesn't seem to have an option for choosing the normalisation method (e.g. rarefaction, proportions, edgeR or DESeq). I found code for these normalisations in the supplementary material, but the resulting objects are of class "DGEList", which distance() does not support.

Before I spend more time piecing together the necessary lines of code, I just want to make sure that I haven't missed something obvious: has this already been implemented in some function or extension that I overlooked?

Any help would be appreciated!

Thanks for your great contributions,

Fabian

@joey711
Owner

joey711 commented Jul 16, 2015

Transformations of the count values will be upstream of the distance() function call, except in cases where the distance method that you select already has a normalization implied/required in the method.

So for your case you would perform the variance stabilization first, and then replace the otu_table component with the variance stabilized version.

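library(phyloseq)
library(DESeq2)
dds = phyloseq_to_deseq2(physeq, ~Treatment)
# getVarianceStabilizedData() needs size factors and dispersions estimated first
dds = estimateSizeFactors(dds)
dds = estimateDispersions(dds)
# Make a copy of your phyloseq object, which you will then modify with VST values
physeqvsd = physeq
vsd = getVarianceStabilizedData(dds)
otu_table(physeqvsd) <- otu_table(vsd, taxa_are_rows = TRUE)
distance(physeqvsd, ...)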

This should be enough to get you started. Variance stabilization often produces negative values for very rare/low counts. If that is a problem for the distance method you plan to use, you can set these to zero, add pseudocounts before VST, or use some other procedure.
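
As a rough illustration of the "set these to zero" option, here is a minimal base-R sketch. The matrix vsd_toy and the helper clamp_negatives() are hypothetical names with made-up values, used only so that the example runs without phyloseq or DESeq2 installed. They are not part of either package.

```r
# Toy stand-in for a variance-stabilized matrix (taxa as rows, samples as columns).
# The values are made up for illustration only.
vsd_toy <- matrix(
  c(-0.8,  2.1, 3.4,
     0.5, -0.2, 1.9),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("taxon1", "taxon2"), c("s1", "s2", "s3"))
)

# Hypothetical helper: replace negative entries with zero, leave the rest unchanged.
clamp_negatives <- function(mat) {
  mat[mat < 0] <- 0
  mat
}

vsd_clamped <- clamp_negatives(vsd_toy)
```

With real data, the same replacement could presumably be applied to the vsd matrix before it is passed to otu_table(vsd, taxa_are_rows = TRUE). Whether zeroing is appropriate depends on the distance method you choose.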

If getVarianceStabilizedData is taking too long, you might try rlog, also in DESeq2.

See the following closed issue for more details...

#445

@joey711 joey711 closed this as completed Jul 16, 2015