The BIOM output file from step 4 of the FROGS pipeline (« FROGS Affiliation OTU » tool) contains an abundance table of OTUs and their taxonomies. With this information, a metabolic and functional prediction can be made with the «Function Table for Tax4Fun matrix (Galaxy Version 1.0.0)» tool.
Three remarks:
- The database used by the «Function Table for Tax4Fun matrix (Galaxy Version 1.0.0)» tool is SILVA version 123.
- The FROGS team recommends taxonomic affiliation with blastn, which generates multi-affiliations. OTUs whose taxonomy contains the term « multi-affiliation » cannot be matched to KEGG organisms.
- The «Function Table for Tax4Fun matrix (Galaxy Version 1.0.0)» tool takes as input a phyloseq object in rdata format.
Four intermediate steps must be performed before using the «Function Table for Tax4Fun matrix (Galaxy Version 1.0.0)» tool:
- Convert the BIOM file to a tabular file,
- Replace « multi-affiliation » terms with empty strings,
- Convert the tabular file back to a BIOM file,
- Build a phyloseq object in rdata format from the previous BIOM file.
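The replacement step in the list above can be sketched in Python for readers who want to check a tabular file outside Galaxy. This is a minimal illustration, not the code of the «Find and replace» tool: the column names and the two example rows are hypothetical, and the case-insensitive match is an assumption, since the exact capitalisation of the term in a given file may vary.

```python
import re

# Case-insensitive pattern: the capitalisation used in real files is an assumption.
MULTI_AFFILIATION = re.compile(r"multi-affiliation", re.IGNORECASE)


def blank_multi_affiliations(tsv_text: str) -> str:
    """Return the tabular text with every 'multi-affiliation' term replaced by ''."""
    return MULTI_AFFILIATION.sub("", tsv_text)


# Hypothetical two-OTU excerpt of a tabular file (column names are illustrative).
example = (
    "#blast_taxonomy\tobservation_name\tsampleA\n"
    "Bacteria;Firmicutes;Multi-affiliation\tCluster_1\t12\n"
    "Bacteria;Proteobacteria;Escherichia\tCluster_2\t7\n"
)

cleaned = blank_multi_affiliations(example)
print(cleaned)
```

Only the matched term is removed; the surrounding separators and all other columns are left untouched, so the table keeps the same number of fields per row.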
To carry out these steps quickly after the FROGS pipeline, the workflow «Function Table for Tax4Fun matrix » is available in .ga format. You can download this workflow from your Galaxy account, check the tool parameters, and then run the pipeline on your own data.
Figure 1: Workflow "Function Table for Tax4Fun matrix".
Figure 2: “FROGS BIOM to TSV (Galaxy Version 2.1.0)” tool.
Figure 3: “Find and replace (Galaxy Version 1.0.0)” tool.
Figure 4: “FROGS TSV_to_BIOM (Galaxy Version 2.0.0)” tool.
Figure 5: “FROGSSTAT Phyloseq Import Data (Galaxy Version 1.0.1)” tool.
At the end of these four steps, the «Function Table for Tax4Fun matrix (Galaxy Version 1.0.0)» tool can be run:
Figure 6: “Function Table (Galaxy Version 1.0.0)” tool.
A description of the input and output files, references, and links to the R package «themetagenomics» and its manual are available on the tool form.
The arguments, described in the manual, are as follows: t4f(otu_table=ABUND, rows_are_taxa=FALSE, tax_table=TAX, reference_path=tmp, type='uproc', short=TRUE, cn_normalize=TRUE, sample_normalize=TRUE, drop=TRUE)
The column headers of the output tables of the tool are shifted one column to the left.
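A header shifted in this way can be realigned after download. The Python sketch below assumes that the output is tab-delimited and that the shift arises because the first column (the row identifiers) has no header, so the header line has exactly one field fewer than the data rows. Both points are assumptions to check against your own files; the function name and the placeholder column name are illustrative.

```python
def realign_header(table_text: str, first_column: str = "id") -> str:
    """Prepend a name for the first column when the header is one field short."""
    lines = table_text.splitlines()
    header = lines[0].split("\t")
    first_row = lines[1].split("\t") if len(lines) > 1 else header
    # Patch the header only when it is exactly one field shorter than the data.
    if len(first_row) == len(header) + 1:
        header = [first_column] + header
    return "\n".join(["\t".join(header)] + lines[1:]) + "\n"


# Hypothetical excerpt: two sample columns, but three fields per data row.
shifted = "sample1\tsample2\nK00001\t3\t0\nK00002\t1\t5\n"
fixed = realign_header(shifted, first_column="KO")
print(fixed)
```

A table whose header already matches its data rows is returned unchanged, so the function can be applied to all three output files without checking each one first.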
This tool uses Tax4Fun, through the themetagenomics R package, with the otu_table and tax_table of a phyloseq object to generate a function table.
The David et al. time series dataset is used as an example (source: https://cran.r-project.org/web/packages/themetagenomics/vignettes/functional_prediction.html).
Data file (rdata format): one phyloseq object containing the OTU abundance table and the corresponding taxonomy table. This file can be the output of the FROGSSTAT Phyloseq Import Data tool.
The David dataset has an OTU abundance table with 1493 taxa and 746 samples, and a taxonomy table with 1493 taxa by 7 taxonomic ranks.
The OTU abundance table (taxa are rows):
The taxonomy table (taxa are rows):
The output consists of three files: the function table, which contains the KO term counts across samples; the KEGG metadata, which describes the KO terms; and the Tax4Fun-specific metadata, which holds the FTU scores.
The function table «fxn_table»: the function table (txt format) contains the KO term counts across samples.
The KEGG metadata «fxn_meta»: the KEGG metadata (txt format) describes the KO terms.
The Tax4Fun-specific metadata «method_meta»: the Tax4Fun-specific metadata (txt format) holds the FTU quality control score for each sample. The FTU score is the fraction of OTUs that could not be mapped to KEGG organisms; a high score means that the prediction for that sample rests on a small part of the community.
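To make the FTU score concrete, the Python sketch below computes a per-sample fraction of unmapped OTUs on a hypothetical three-OTU table. It is an illustration only, not the code used by Tax4Fun or themetagenomics. It assumes that the fraction is weighted by read counts; an unweighted count of OTUs would be another possible reading of the definition. The function name, OTU identifiers, and counts are all invented for the example.

```python
def ftu_per_sample(abundances, mapped):
    """Per-sample share of reads that belong to OTUs with no KEGG organism match.

    abundances: {otu_id: {sample: count}}
    mapped: set of OTU ids that were matched to a KEGG organism
    """
    totals, unmapped = {}, {}
    for otu, counts in abundances.items():
        for sample, count in counts.items():
            totals[sample] = totals.get(sample, 0) + count
            if otu not in mapped:
                unmapped[sample] = unmapped.get(sample, 0) + count
    # Samples with no reads are skipped to avoid dividing by zero.
    return {s: unmapped.get(s, 0) / t for s, t in totals.items() if t > 0}


# Hypothetical counts: otu2 is the only OTU without a KEGG organism match.
abund = {
    "otu1": {"A": 80, "B": 10},
    "otu2": {"A": 20, "B": 30},
    "otu3": {"A": 0, "B": 60},
}
scores = ftu_per_sample(abund, mapped={"otu1", "otu3"})
print(scores)
```

Under these assumptions, sample A has 20 of 100 reads in the unmapped OTU and sample B has 30 of 100, so B would have the higher FTU score.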
Helene Billard - UMR-1280 PhAN Inra-Universite de Nantes
Sarah Maman - Sigenae GenPhySE - Inra Occitanie
Depending on the help provided, you can cite us in the acknowledgements, the references, or both. Examples:
Acknowledgements :
We wish to thank the SIGENAE group and UMR-1280 PhAN Inra-Universite de Nantes.
References :
- SIGENAE [http://www.sigenae.org/]
- R package "themetagenomics" (author: Stephen Woloszynek): https://cran.r-project.org/web/packages/themetagenomics/
- Manual ("t4f": description, usage, arguments, outputs, references): https://cran.r-project.org/web/packages/themetagenomics/themetagenomics.pdf