Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How do you define contrasts in the hierarchical multiple testing? #18

Open
benkraj opened this issue Jun 6, 2017 · 3 comments
Open

Comments

@benkraj
Copy link

benkraj commented Jun 6, 2017

Hi-

Within DESeq you normally apply some sort of contrast with the results function i.e. (from: https://www.bioconductor.org/packages/release/bioc/vignettes/phyloseq/inst/doc/phyloseq-mixture-models.html):
ddsTEST <- phyloseq_to_deseq2(ps.glom, ~ SeasonLoc)
ddsTEST = estimateSizeFactors(ddsTEST, geoMeans=geoMeans)
ddsTEST <- DESeq(ddsTEST, fitType="parametric")
resT = results(ddsTEST, contrast=c("SeasonLoc", "PermDry", "PermWet"))

Which then will give you log2fold changes and adjusted p-values between your defined groups (i.e. "PermDry" compared to "PermWet" in this example).

I understand with the hierarchical multiple testing it's accounting for higher level structure in the data based on phylogeny, but I don't see how I define which groups the procedure compares. In the workflow, which age bins is it finding have different Lachnospiraceae abundance as there are 3 bins (0-100, 100-200, 200-400)? Is there any sort of directionality you can get from the output (increasing/decreasing abundance)?

Let me know if I'm being unclear or you need more info.

Thanks for any help,
Ben

@spholmes
Copy link
Owner

spholmes commented Jun 7, 2017 via email

@benkraj
Copy link
Author

benkraj commented Jun 7, 2017

Hi Susan-

Thanks for your response.

Yes I am after the structSSI portion of the analysis. I have read through that paper, and I think maybe I follow now. Is it best to define your group comparison (or rounds of group comparisons) using the treePValues command's "environments"?

i.e.
First round (comparing everything):

data("chlamydiae", package = "structSSI")
environments <- sample_data(chlamydiae)$SampleType
abundances <- otu_table(chlamydiae)
chl.tree <- get.edgelist(as.igraph(phy_tree(chlamydiae)))
chl.pval <- treePValues(chl.tree, abundances, environments)
chl.hfdr <- hFDR.adjust(chl.pval, chl.tree, alpha = 0.1)

Then say I want to compare only "Soil" vs "Skin" samples, I would just define a new environment, i.e.:

enviro2 <- as.factor(c("Soil", "Skin"))
chl.pval2 <- treePValues(chl.tree, abundances, enviro2)
chl.hfdr2 <- hFDR.adjust(chl.pval2, chl.tree, alpha = 0.1)

This would then be analogous to the contrasts in DESeq? It seems like it works, and then I wouldn't need to do any sort of phyloseq modification, i.e. subsetting the phyloseq to just my Soil and Skin samples.

In regards to the boxplots/ways to get fold change direction, if I understand correctly, I then could use basically what I had been before with DESeq to get the fold change information, but replace the p/adjusted p-values it calculates with the hierarchical multiple testing adjusted ones? Are there alternative helper functions I could use for this?

Does this seem to be correct in the approach?

Thanks,
Ben

@spholmes
Copy link
Owner

spholmes commented Jun 7, 2017 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants