-
Notifications
You must be signed in to change notification settings - Fork 187
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
using the same sample for two treatment categories #278
Comments
Are the initial samples closer to the "light" or the "dark"? It is possible to duplicate the initial values for both Light and Dark, but this risks being misleading to the reader because it will appear in the graphic that you made an extra pair of observations with high precision. It should only take 3 lines or so, using |
The initial samples are what the "light" and "dark" samples are made from so I wouldn't say the initial samples were closer to either. I will be careful to make it clear in the caption that the initials are plotted twice. If you had a minute to sketch an example that would be great. I'll look at "subset_samples" and "merge_samples" to see if I can figure it out too. Thanks!! ----- Original Message ----- Are the initial samples closer to the "light" or the "dark"? It is possible to duplicate the initial values for both Light and Dark, but this risks being misleading to the reader because it will appear in the graphic that you made an extra pair of observations with high precision. It should only take 3 lines or so, using Reply to this email directly or view it on GitHub: |
Issue 278library("phyloseq")
data("soilrep")
soilrep
# For speed, lets keep only the most-abundant 100 OTUs
keepTaxa = names(sort(taxa_sums(soilrep), decreasing = TRUE)[1:20])
soilrep = prune_taxa(keepTaxa, soilrep)
soilrep
# What are the variables listed?
sample_variables(soilrep)
get_variable(soilrep, "Treatment")
# Create a Time variable
sample_data(soilrep)$Time <- substr(sample_data(soilrep)$Sample, 1, 1)
# Rename entries for `warmed` and `clipped`
sample_data(soilrep)$warmed <- ifelse(sample_data(soilrep)$warmed == "yes",
"warmed", "unwarmed")
sample_data(soilrep)$clipped <- ifelse(sample_data(soilrep)$clipped == "yes",
"clipped", "unclipped") This dataset has its own control samples, and so can help illustrate the steps needed to duplicate "control" samples and merge them with the rest of the data. We will treat the taxmat = matrix(sample(letters[1:6], 70, replace = TRUE), nrow = ntaxa(soilrep),
ncol = 7)
rownames(taxmat) <- taxa_names(soilrep)
colnames(taxmat) <- c("Domain", "Phylum", "Class", "Order", "Family", "Genus",
"Species")
soilrep = merge_phyloseq(soilrep, tax_table(taxmat))
soilrep
Of course, the I want this to match the example's NA control values, so I'll re-assign Time-1 variables entries for controlSamples = sample_names(subset_samples(soilrep, warmed == "unwarmed" &
Time == "1"))
sample_data(soilrep)[controlSamples, ]$warmed <- NA
sample_data(soilrep)
1. Copy a subset of the control samples you want to duplicateIn this case, warmed is controlUU = subset_samples(soilrep, is.na(warmed) & Time == "1")
controlUU
2. Hacking part: rename each sample in its component object# sample data
sd = sample_data(controlUU)
rownames(sd) <- paste0(rownames(sd), "_dup")
# otu_table
OTU = otu_table(controlUU)
sample_names(OTU) <- paste0(sample_names(OTU), "_dup")
# Avoid conflict between factor and character
sd$clipped <- as.character(sd$clipped)
# Pretend these came from the warmed case, even though they didn't They are
# already in the main dataset as NA
sd$warmed <- "warmed" 3. Add the copied/sample-renamed components to the original datasetdata1 = merge_phyloseq(soilrep, sd, OTU)
data1
Also need to set the original control copies as sample_data(data1)$warmed[is.na(sample_data(data1)$warmed)] <- "unwarmed" 4. Compare both plots, with and without duplicated samplesplot_bar(soilrep, "Time", fill = "Phylum", facet_grid = warmed ~ clipped) plot_bar(data1, "Time", fill = "Phylum", facet_grid = warmed ~ clipped) |
Note that I fixed a minor internal issue in phyloseq to allow the hacking part to work. Relates to character variables being coerced to factors automatically when |
Thanks so much Joey!!! To update phyloseq to 1.7.11, I did the following: install.packages("devtools") Here is the output. There were 18 warnings but it seemed to work. Does that seem right to you???
|
Joey, I followed your example with my data and it worked perfectly! Thank you!!! I had a quick question about the order of things. Should I first make the object with the duplicate initial condition, then rarefy, then use tax_glom or topp/topf and plot_bar? Or should I rarefy first, then duplicate the initials samples, then tax_glom and plot_bar? If I wanted to plot relative abundance would I transform_samples before or after tax_glom or topp/topf? Thanks again! |
Lauren, Sorry, I haven't finished testing version As for pre-processing your counts, you probably want to do any agglomeration ( As for rarefying: You shouldn't rarefy. Ever. See this thread on the topic: And for completeness, Susan and I explain in detail why you shouldn't rarefy in our pre-publication draft on Q-Bio-arXiv. http://arxiv.org/abs/1310.0424 Make sure to look at revision 2, its figures are nicer :-) We're hoping to get the complete article out in an open-access journal very soon. Hope that helps. Version Cheers joey p.s. I'll close this once |
Hope that works! joey |
hi,
for our experiments, we made two initial microcosm mixtures (filtered and unfiltered), divided each into two, treatment (exposed to light) and control (kept in dark), and sampled them over time. so for every time point we sequenced 4 samples, one treatment (light) and one control (dark) for each of two conditions (filtered and unfiltered). but we only sequenced the initial microcosm mixtures (filtered and unfiltered) once at time 0 so there is an NA in the map file for the treatment (light or dark).
a problem arises using plot_bar.
here is the code and attached is the plot.
plot_bar(dataR_nosea_taxglom_relab, "Time", fill = "Phylum", facet_grid = Treatment ~ Filtered_Unfiltered)
as you can see in the figure, the initial samples plot under NA. I would like them to plot under both treatments (light and dark). I'm guessing this means I need to create a phyloseq object with the samples labeled differently and in there twice but I haven't been able to figure out how to do this. any suggestions or examples?
thank you!!!
lauren
The text was updated successfully, but these errors were encountered: