Home

Welcome to the examples page!

A few practical examples that demonstrate some of the functionality (not a complete overview of all features).
If you have a cool contribution, just let us know!

Filling the taxonomy_table gaps

taxa_resolve_NAs(): NAs in the taxonomy are filled with the nearest known higher taxonomic rank. If your taxonomy does not use rank prefixes such as k__ or f__, they are added so you can track which taxonomic level was inserted. A family (f__) can therefore appear in the Genus rank but remains recognizable. The biggest advantage is more resolution at the lower taxonomic ranks, because the sometimes bulky NA category is split up. These NAs arise when a classifier using an LCA (lowest common ancestor) algorithm cannot decide, for instance, which Genus a sequence belongs to and assigns it to the earliest consensus instead, for instance at Family level. That knowledge can be propagated down the ranks!
On the fly, a GenusSpecies rank can be added if Species-level data is available (usually truncated, WITHOUT the Genus name). The GenusSpecies column is formatted without rank prefixes, but includes [rank] when the value comes from a level other than Species.

# Fill NA taxonomic ranks with the nearest known higher rank (using phyloseq object ps.bac2)
ps.bac3 <- taxa_resolve_NAs( ps.bac2, verbose=FALSE, genusspecies=TRUE )

Before:
After:

The resolver is able to handle 'gaps':
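The gap-filling idea can be illustrated with a minimal base R sketch. This is not the package's implementation: `fill_na_ranks` is a hypothetical helper, the toy table is made up, and the prefixes are assumed to be present already. It only shows how a known higher rank is carried down into NA cells, including a gap in the middle of a lineage (OTU3).

```r
# Toy taxonomy table: rows are taxa, columns are ranks (highest rank first).
tax <- matrix(
  c("k__Bacteria", "p__Firmicutes", "f__Lachnospiraceae", NA,
    "k__Bacteria", "p__Firmicutes", NA,                   NA,
    "k__Bacteria", NA,              "f__Rikenellaceae",   "g__Alistipes"),
  ncol = 4, byrow = TRUE,
  dimnames = list(paste0("OTU", 1:3), c("Kingdom", "Phylum", "Family", "Genus"))
)

# Hypothetical helper (not part of the package): walk the ranks from left to
# right and replace each NA with the value of the rank directly to its left.
# Because earlier columns are already filled, the nearest known rank propagates.
fill_na_ranks <- function(tax) {
  for (j in seq_len(ncol(tax))[-1]) {
    missing <- is.na(tax[, j])
    tax[missing, j] <- tax[missing, j - 1]
  }
  tax
}

filled <- fill_na_ranks(tax)
print(filled)
```

In this toy table the Genus of OTU1 becomes f__Lachnospiraceae, so the prefix still shows that the value came from Family level.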

Sample_sum plot

# function defaults
sample_sum_plot( ps, percentKeep = 90, cutoff = FALSE, numberKeep = FALSE, logy = TRUE, color = "", size.points = 3,
                 stats = TRUE, subtitle = "", crosshair = TRUE, xlab.rel = 0.8, namesColumn = "" )

Plots the total sum of OTU counts per sample, sorted in decreasing order. Some simple stats, naming, tweaking and coloring options are provided. It returns a classic ggplot2 object, so you can tweak it afterwards as much as you want.
Tip: use ggplotly( sample_sum_plot_object ) from the plotly library to make your ggplots interactive!
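As a rough sketch of the quantity being plotted (not the function's actual code), the per-sample totals can be computed and sorted in base R. The count matrix below is made up, and taxa are assumed to be in rows.

```r
# Toy OTU table: rows are taxa, columns are samples (counts are made up).
otu <- matrix(
  c(120,  5, 900,
     30,  0, 450,
      0, 15, 150),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("OTU", 1:3), c("S1", "S2", "S3"))
)

# Total counts per sample, sorted in decreasing order.
sums <- sort(colSums(otu), decreasing = TRUE)
print(sums)
```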

# Example
ps.bac3.ssplot <- sample_sum_plot( ps.bac3, numberKeep = 70, color="copdcaco", logy=TRUE, namesColumn = "ID" )

Simple bar plot

# function defaults
ps_plot_bar( ps, taxrank = "Phylum", top = 10, taxglom = TRUE, taxglom.next = "", sort.stack = TRUE,
             x = "Sample", y = "Abundance", logy = FALSE, group = "", group_fun = "mean", facet_grid = NULL,
             NArm = FALSE, xlab.rel = 0.5, legend = TRUE, legend.col = 1, legend.size = 8,
             title = paste0("Barplot ",taxrank," - ", if (top > 0) {paste0("Top", top)} else {"all taxa"}),
             title.center = TRUE, show.other = TRUE, other.label = "Other", other.color = "white" )

Shows a stacked bar plot with stack sorting (the taxa with the highest overall abundance are stacked first), an alternating color scheme for easy category tracking, an easy top-x selection and further tweaking options. It returns a classic ggplot2 object, so any post-tweaks are possible.

# Demo bar plot of a relative-abundance phyloseq object
ps_plot_bar( ps.r, taxrank="Class", top=7, xlab.rel=0.7, x="Seq_id")
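The top-x selection with an "Other" category can be sketched in base R as follows. This is an illustration under assumptions, not the internals of ps_plot_bar: the abundance matrix is made up, and taxa are ranked by their summed abundance over all samples.

```r
# Toy relative abundances: rows are classes, columns are samples (made up).
abund <- matrix(
  c(0.50, 0.40,
    0.30, 0.35,
    0.15, 0.05,
    0.05, 0.20),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("Bacilli", "Clostridia", "Bacteroidia", "Actinobacteria"),
                  c("S1", "S2"))
)

top <- 2
# Rank taxa by overall abundance, highest first, and keep the first `top`.
ranked <- names(sort(rowSums(abund), decreasing = TRUE))
keep <- ranked[seq_len(min(top, length(ranked)))]

# Keep the top taxa and sum the remainder into a single "Other" row.
lumped <- rbind(
  abund[keep, , drop = FALSE],
  Other = colSums(abund[setdiff(rownames(abund), keep), , drop = FALSE])
)
print(lumped)
```

Each sample still sums to 1 after lumping, so the bars keep their full height.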

There are some handy tweaks for getting to know the data. For instance, bars can be split on a lower taxonomic rank.
The same data, colored by Class but with the segments of 'Group' left visible instead of being agglomerated:

ps_plot_bar( tmp, taxrank="Class", top=7, xlab.rel=0.7, x="Seq_id", taxglom = FALSE, taxglom.next = "Group")

The same data showing Accession level segments:

ps_plot_bar( tmp, taxrank="Class", top=7, xlab.rel=0.7, x="Seq_id", taxglom = FALSE, taxglom.next = "ID_accession")

Prevalence plotting

# function defaults
prevalence_plot( ps, rank = "Phylum", xlog = TRUE, ylog = FALSE, yrelative = FALSE, strbck = "#f0f0f0",
                 title = paste0("Taxa (", rank, ") prevalence over all samples (Nsamples=", nsamples(ps), ")") )

Plotting the prevalence of taxa under a specified rank is sometimes useful to get an overall picture of the taxa distribution. It shows each taxon under the specified rank by its total summed abundance (x-axis) and the number of samples it was present in (y-axis), faceted by the specified rank.
There are some classic options for tweaking, or you can manipulate the returned ggplot2 object directly.

# example
prevalence_plot( ps_arg2.fpkm, rank="Class" )
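The two axes can be sketched from a plain count matrix in base R. This is a minimal illustration with made-up counts, assuming taxa in rows and treating any count above zero as "present"; the function itself may compute them differently.

```r
# Toy OTU table: rows are taxa, columns are samples (counts are made up).
otu <- matrix(
  c(10, 0, 5,
     0, 0, 2,
     7, 3, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("OTU", 1:3), c("S1", "S2", "S3"))
)

prev <- data.frame(
  taxon = rownames(otu),
  total_abundance = rowSums(otu),   # x-axis: summed abundance over all samples
  prevalence = rowSums(otu > 0),    # y-axis: number of samples with a count > 0
  row.names = NULL
)
print(prev)
```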
