-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
How do you define contrasts in the hierarchical multiple testing? #18
Comments
Ben
I think your question is about the structSSI component of the test which
allows
you to test different levels of taxonomy
(ie is there a difference at the phyla level, is there a difference at the
genus level for the phyla that are
significantly different..etc).
Reading the complete article about testing with hierarchical structure
https://www.jstatsoft.org/article/view/v059i13
When doing testing for more than two groups, the only conclusion in a first
round of tests is:
there is a difference, the extra testing is done afterwards but looking at
boxplots of the significant taxa
gives the direction.
Susan
…On Tue, Jun 6, 2017 at 9:12 AM, benkraj ***@***.***> wrote:
Hi-
Within DESeq you normally apply some sort of contrast with the results
function i.e.:
ddsTEST <- phyloseq_to_deseq2(ps.glom, ~ SeasonLoc)
resT = results(ddsTEST, contrast=c("SeasonLoc", "PermDry", "PermWet"))
Which then will give you log2fold changes and adjusted p-values between
your defined groups (i.e. "PermDry" compared to "PermWet" in this example).
I understand with the hierarchical multiple testing it's accounting for
higher level structure in the data based on phylogeny, but I don't see how
I define which groups the procedure compares. In the workflow, which age
bins is it finding have different Lachnospiraceae abundance as there are 3
bins (0-100, 100-200, 200-400)? Is there any sort of directionality you can
get from the output (increasing/decreasing abundance)?
Let me know if I'm being unclear or you need more info.
Thanks for any help,
Ben
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#18>, or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABJcvXVzsTI5kHCr4JcX2zvwLJCWY9R0ks5sBXp9gaJpZM4Nxk1V>
.
--
Susan Holmes
Professor, Statistics and BioX
John Henry Samter Fellow in Undergraduate Education
Sequoia Hall,
390 Serra Mall
Stanford, CA 94305
http://www-stat.stanford.edu/~susan/
|
Hi Susan- Thanks for your response. Yes I am after the structSSI portion of the analysis. I have read through that paper, and I think maybe I follow now. Is it best to define your group comparison (or rounds of group comparisons) using the treePValues command's "environments"? i.e.
Then say I want to compare only "Soil" vs "Skin" samples, I would just define a new environment, i.e.:
This would then be analogous to the contrasts in DESeq? It seems like it works, and then I wouldn't need to do any sort of phyloseq modification, i.e. subsetting the phyloseq to just my Soil and Skin samples. In regards to the boxplots/ways to get fold change direction, if I understand correctly, I then could use basically what I had been before with DESeq to get the fold change information, but replace the p/adjusted p-values it calculates with the hierarchical multiple testing adjusted ones? Are there alternative helper functions I could use for this? Does this seem to be correct in the approach? Thanks, |
Ben
Your approach is the best one, sorry not many helpers as everyone tailors
their use of the
tools to a different workflow so not much help having a completely general
tool, but seems like
your approach is good.
If you are testing many different subsets for the same data, you may have
to cost yourself
a little for multiple testing too.
Susan
…On Wed, Jun 7, 2017 at 1:24 PM, benkraj ***@***.***> wrote:
Hi Susan-
Thanks for your response.
Yes I am after the structSSI portion of the analysis. I have read through
that paper, and I think maybe I follow now. Is it best to define your group
comparison (or rounds of group comparisons) using the treePValues command's
"environments"?
i.e.
First round (comparing everything):
data("chlamydiae", package = "structSSI")
environments <- sample_data(chlamydiae)$SampleType
abundances <- otu_table(chlamydiae)
chl.tree <- get.edgelist(as.igraph(phy_tree(chlamydiae)))
chl.pval <- treePValues(chl.tree, abundances, environments)
chl.hfdr <- hFDR.adjust(chl.pval, chl.tree, alpha = 0.1)
Then say I want to compare only "Soil" vs "Skin" samples, I would just
define a new environment, i.e.:
enviro2 <- as.factor(c("Soil", "Skin"))
chl.pval2 <- treePValues(chl.tree, abundances, enviro2)
chl.hfdr2 <- hFDR.adjust(chl.pval2, chl.tree, alpha = 0.1)
This would then be analogous to the contrasts in DESeq? It seems like it
works, and then I wouldn't need to do any sort of phyloseq modification,
i.e. subsetting the phyloseq to just my Soil and Skin samples.
In regards to the boxplots/ways to get fold change direction, if I
understand correctly, I then could use basically what I had been before
with DESeq to get the fold change information, but replace the p/adjusted
p-values it calculates with the hierarchical multiple testing adjusted
ones? Are there alternative helper functions I could use for this?
Does this seem to be correct in the approach?
Thanks,
Ben
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#18 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABJcvYl4TsK4JqiK4RKpNuXAl4idPvpYks5sBwbhgaJpZM4Nxk1V>
.
--
Susan Holmes
Professor, Statistics and BioX
John Henry Samter Fellow in Undergraduate Education
Sequoia Hall,
390 Serra Mall
Stanford, CA 94305
http://www-stat.stanford.edu/~susan/
|
Hi-
Within DESeq you normally apply some sort of contrast with the results function i.e. (from: https://www.bioconductor.org/packages/release/bioc/vignettes/phyloseq/inst/doc/phyloseq-mixture-models.html):
ddsTEST <- phyloseq_to_deseq2(ps.glom, ~ SeasonLoc)
ddsTEST = estimateSizeFactors(ddsTEST, geoMeans=geoMeans)
ddsTEST <- DESeq(ddsTEST, fitType="parametric")
resT = results(ddsTEST, contrast=c("SeasonLoc", "PermDry", "PermWet"))
Which then will give you log2fold changes and adjusted p-values between your defined groups (i.e. "PermDry" compared to "PermWet" in this example).
I understand with the hierarchical multiple testing it's accounting for higher level structure in the data based on phylogeny, but I don't see how I define which groups the procedure compares. In the workflow, which age bins is it finding have different Lachnospiraceae abundance as there are 3 bins (0-100, 100-200, 200-400)? Is there any sort of directionality you can get from the output (increasing/decreasing abundance)?
Let me know if I'm being unclear or you need more info.
Thanks for any help,
Ben
The text was updated successfully, but these errors were encountered: