Speedyseq is an R package for microbiome data analysis that extends the popular phyloseq package. Speedyseq began with the limited goal of providing faster versions of phyloseq’s plotting and taxonomic merging functions, but now contains a growing number of enhancements to phyloseq which I have found useful.
Install the current development version with the remotes package:

    # install.packages("remotes")
    remotes::install_github("mikemc/speedyseq")
Method 1: Call speedyseq functions explicitly when you want to use speedyseq’s version instead of phyloseq’s. This method ensures that you do not unintentionally call speedyseq’s version of a phyloseq function.

    library(phyloseq)
    data(GlobalPatterns)
    system.time(
      # Calls phyloseq's psmelt
      df1 <- psmelt(GlobalPatterns) # slow
    )
    #>    user  system elapsed
    #>   6.320   0.063   6.390
    system.time(
      df2 <- speedyseq::psmelt(GlobalPatterns) # fast
    )
    #>    user  system elapsed
    #>   0.344   0.004   0.245
    dplyr::all_equal(df1, df2, ignore_row_order = TRUE)
    #> [1] TRUE
    detach(package:phyloseq)

Method 2: Load speedyseq, which will load phyloseq and all speedyseq functions and cause calls to the overlapping function names to go to speedyseq by default.

    library(speedyseq)
    #> Loading required package: phyloseq
    #>
    #> Attaching package: 'speedyseq'
    #> The following objects are masked from 'package:phyloseq':
    #>
    #>     filter_taxa, plot_bar, plot_heatmap, plot_tree, psmelt, tax_glom, tip_glom,
    #>     transform_sample_counts
    data(GlobalPatterns)
    system.time(
      ps1 <- phyloseq::tax_glom(GlobalPatterns, "Genus") # slow
    )
    #>    user  system elapsed
    #>  31.856   0.106  32.031
    system.time(
      # Calls speedyseq's tax_glom
      ps2 <- tax_glom(GlobalPatterns, "Genus") # fast
    )
    #>    user  system elapsed
    #>   0.241   0.000   0.230

Loading speedyseq will also load the magrittr pipe (%>%) to allow pipe chains with phyloseq objects:

    gp.filt.prop <- GlobalPatterns %>%
      filter_taxa2(~ sum(. > 0) > 5) %>%
      transform_sample_counts(~ . / sum(.))

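To make the two formula steps in the pipe chain concrete, here is a minimal base R sketch of the same arithmetic on a small made-up count matrix (taxa in rows, samples in columns). It does not call speedyseq. The matrix `counts` is hypothetical, the prevalence threshold is scaled down from 5 to 2 to suit four samples, and the sketch assumes that `.` stands for one taxon's counts across samples in the filtering step and for one sample's counts across taxa in the transformation step.

```r
# Hypothetical toy data: 3 taxa (rows) by 4 samples (columns).
counts <- matrix(
  c(5, 0, 3, 2,
    0, 0, 1, 0,
    4, 6, 0, 8),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("taxon", 1:3), paste0("sample", 1:4))
)

# Step 1, analogous to filter_taxa2(~ sum(. > 0) > 2): keep taxa that are
# present (count > 0) in more than 2 samples.
prevalence <- rowSums(counts > 0)
filtered <- counts[prevalence > 2, , drop = FALSE]

# Step 2, analogous to transform_sample_counts(~ . / sum(.)): divide each
# sample's counts by that sample's total, so every column sums to 1.
proportions <- sweep(filtered, 2, colSums(filtered), "/")

rownames(proportions)
colSums(proportions)
```

Filtering before converting to proportions means the proportions are relative to the retained taxa only, which matches the order of the steps in the chain above.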
Faster implementations of phyloseq functions
- psmelt() and the plotting functions that use it: plot_bar(), plot_heatmap(), and plot_tree().
- The taxonomic merging functions tax_glom() and tip_glom(). Speedyseq’s tip_glom() also has significantly lower memory usage.

These functions should generally work as drop-in replacements for phyloseq’s versions, with additional arguments allowing for modified behavior. Differences in row order (for psmelt()) and taxon order (for tax_glom()) can occur; see the package documentation for details.
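
Because row order can differ between the two implementations, a comparison of their outputs should ignore it. The following is a minimal base R sketch of one way to do that, assuming the tables can be sorted on a set of columns that uniquely identifies each row. The data frames, the helper `sort_rows()`, and the key columns are hypothetical stand-ins, not real psmelt() output or part of either package.

```r
# Hypothetical stand-ins for two melted tables holding the same rows in a
# different order.
df1 <- data.frame(
  OTU = c("a", "a", "b", "b"),
  Sample = c("s1", "s2", "s1", "s2"),
  Abundance = c(10, 0, 3, 7),
  stringsAsFactors = FALSE
)
df2 <- df1[c(4, 1, 3, 2), ]

# Sort a table by the columns that identify a row, then reset row names so
# that only the contents are compared.
sort_rows <- function(df, keys) {
  ord <- do.call(order, unname(as.list(df[keys])))
  out <- df[ord, , drop = FALSE]
  rownames(out) <- NULL
  out
}

keys <- c("OTU", "Sample")
same_content <- isTRUE(all.equal(sort_rows(df1, keys), sort_rows(df2, keys)))
same_order <- identical(df1$Abundance, df2$Abundance)

same_content
same_order
```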
New taxonomic merging functions
- A general-purpose merging function merge_taxa_vec() that provides a vectorized version of phyloseq’s merge_taxa().
- A function tree_glom() that performs direct phylogenetic merging of taxa. This function provides an alternative to the indirect phylogenetic merging done by tip_glom() that is much faster and arguably more intuitive.
See the Changelog for details and examples.
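
As a rough illustration of what merging taxa in a vectorized way means, the base R sketch below sums the count rows of taxa that share a group label in a single call. It is not speedyseq's implementation and does not show the interface of merge_taxa_vec(); the matrix `counts` and the labels in `group` are hypothetical.

```r
# Hypothetical toy data: 4 taxa (rows) by 3 samples (columns).
counts <- matrix(
  c(1, 2, 3,
    4, 5, 6,
    7, 8, 9,
    1, 0, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("taxon", 1:4), paste0("sample", 1:3))
)

# One group label per taxon; taxa that share a label are merged.
group <- c("g1", "g2", "g1", "g2")

# rowsum() sums the rows within each group in one vectorized call, rather
# than merging one group at a time in a loop.
merged <- rowsum(counts, group)
merged
```

Merging this way leaves each sample's total count unchanged, which gives a quick sanity check on any merging step.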