Analysis (DESeq2) at the genus level (rather than OTU) #683

Closed
zapratte opened this Issue Nov 8, 2016 · 2 comments

Projects

None yet

3 participants

@zapratte
zapratte commented Nov 8, 2016

This isn't an issue per say, but I'm not entirely sure where to put this. We'd like to conduct analyses (particularly DESeq2 and heat maps) at the genus level, rather than the OTU level. I'm quite clumsy at R. My current workflow for DESeq2 is mostly bits and pieces taken from other scripts:
map<-import_qiime_sample_data(file.choose())
tree<-read_tree(file.choose())
biomfile<-import_biom(file.choose(),parseFunction=parse_taxonomy_greengenes)
data<-merge_phyloseq(biomfile,tree,map)
deseqdata<-phyloseq_to_deseq2(data,~Microbiome)
gm_mean<-function(x, na.rm=TRUE){exp(sum(log(x[x > 0]), na.rm=na.rm) / length(x))}
geoMeans<-apply(counts(deseqdata),1,gm_mean)
deseqdata<-estimateSizeFactors(deseqdata,geoMeans=geoMeans)
deseqdata<-DESeq(deseqdata,fitType="local")
res<-results(deseqdata,contrast=c("Microbiome","Factor1","Factor2"))
alpha<-0.01
sigtab<-res[which(res$padj<alpha),]
sigtab<-cbind(as(sigtab,"data.frame"),as(tax_table(data)[rownames(sigtab),],"matrix"))
write.table(sigtab,"SidDifSpecies.txt",sep="\t")

Any help manipulating the data and/or script to analyze at the genus level would be greatly appreciated. Please excuse the clumsy post and code- it's my first one.

@StackerEd

There are a few ways I can think of doing it - I'd say the best ways are to filter out Genus level OTU's in phyloseq and convert to DESeq or pull it from the data frame of the DESeq object (I believe this will be harder). Use subset_taxa to achieve this

assuming your data puts "NA" in your taxonomy table, this will work. replace this with "" blank characters or whatever your table uses

GenusFiltered <- subset_taxa(data, Genus != "NA" & Species == "NA")

@joey711
Owner
joey711 commented Dec 29, 2016 edited

I would suggest you agglomerate to the genus level, using the tax_glom function

physeq = merge_phyloseq(biomfile, tree, map)
GenusFiltered <- subset_taxa(physeq, Genus != "NA" & Species == "NA")
physeqGenus = tax_glom(GenusFiltered, "Genus")

And then go ahead and do your analysis using the typical phyloseq code, etc.

@joey711 joey711 closed this Dec 29, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment