REFT (Root Exudate Feature Toolkit) is an R package for molecule-oriented analysis of root exudate and metabolomics annotation tables. It supports batch database searching, SMILES matching, and calculation of six molecular descriptors.
- Read Excel annotation tables
- Match PubChem records in the order Name -> Other_name(Kegg_name) -> Kegg_ID -> HMDB_ID
- Return a matching log and unmatched records
- Optionally export Excel output files when output_dir is explicitly supplied
- Calculate six molecular descriptors using rcdk:
  - MW
  - nHBAcc
  - ALogP
  - FMF
  - HybRatio
  - nAcid
- Query KEGG reactions linked to EC numbers for microbe-associated reaction annotation
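The matching order in the feature list above amounts to a first-hit fallback across identifier columns. The following is a minimal sketch of that idea only, not REFT's internal code: `known_smiles`, `lookup_smiles()`, and `match_record()` are hypothetical names, and the toy lookup table stands in for a real PubChem query.

```r
# Toy lookup table standing in for a PubChem query (illustration only).
known_smiles <- c(
  "acetic acid" = "CC(=O)O",
  "C00469"      = "CCO"
)

# Hypothetical helper: return a SMILES string, or NA when the term has no hit.
lookup_smiles <- function(term) {
  if (is.null(term) || is.na(term) || !nzchar(term)) return(NA_character_)
  hit <- known_smiles[term]
  if (is.na(hit)) NA_character_ else unname(hit)
}

# Try each identifier column in priority order and stop at the first hit.
match_record <- function(record,
                         fields = c("Name", "Other_name(Kegg_name)",
                                    "Kegg_ID", "HMDB_ID")) {
  for (field in fields) {
    smiles <- lookup_smiles(record[[field]])
    if (!is.na(smiles)) {
      return(list(smiles = smiles, matched_by = field))
    }
  }
  # No column produced a hit: the record would go to the unmatched table.
  list(smiles = NA_character_, matched_by = NA_character_)
}

# The name is unknown here, so the match falls through to Kegg_ID.
rec <- list(
  Name = "unknown compound",
  "Other_name(Kegg_name)" = NA_character_,
  Kegg_ID = "C00469",
  HMDB_ID = NA_character_
)
hit <- match_record(rec)
hit$matched_by
```

Recording which column produced the hit is what makes a matching log possible; records that exhaust all four columns end up with NA in both fields.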
# Install the dependencies
install.packages(c(
  "readxl", "dplyr", "purrr", "stringr", "tibble",
  "writexl", "webchem", "rcdk", "rcdklibs"
))

# Install REFT from your configured repository
install.packages("REFT")

# Or install from a local source archive
install.packages("REFT_0.1.4.tar.gz", repos = NULL, type = "source")

By default, reft_run() and reft_run_simple() return results in R and do not write files.
library(REFT)
res <- reft_run_simple(
input_file = "example_root_exudate.xlsx"
)
head(res$descriptors)

To write output files, explicitly provide an output directory. In examples, use tempdir() or another user-chosen location.
res <- reft_run_simple(
input_file = "example_root_exudate.xlsx",
output_dir = tempdir()
)

# Name the input columns explicitly with reft_run()
library(REFT)
res <- reft_run(
input_file = "your_data.xlsx",
name_col = "Name",
other_col = "Other_name(Kegg_name)",
hmdb_col = "HMDB_ID",
kegg_col = "Kegg_ID",
output_dir = tempdir()
)

# Query KEGG reactions linked to the EC numbers in the input file
res <- reft_kegg_microbe_run(
input_file = "microbe_ec.csv",
output_dir = tempdir()
)
head(res$results)

If you want to cache PubChem results, explicitly choose a path. In examples or tests, use tempdir().

options(REFT.pubchem_cache_file = file.path(tempdir(), "REFT_pubchem_cache.rds"))

When output_dir is explicitly supplied, reft_run() writes:
- metabolites_6_descriptors.xlsx
- unmatched_smiles.xlsx
- pubchem_match_log.xlsx

When output_dir is explicitly supplied, reft_kegg_microbe_run() writes:
- microbe_ec_kegg_reactions.xlsx

Both reft_run() and reft_run_simple() return a named list:
- descriptors: final result table containing SMILES and the six descriptors
- unmatched: records that were not matched to SMILES
- match_log: PubChem matching log

reft_kegg_microbe_run() returns a named list:
- results: final microbe-EC-reaction table
- ec_to_reaction: EC-to-reaction mapping table
- reaction_details: reaction detail table
- compound_table: compound formula table

- KEGG_ID and HMDB_ID currently follow the original workflow and are queried through PubChem name-based searching, so the hit rate may vary among identifiers.
- rcdk requires a working Java environment.
- If PubChem or KEGG network access is unstable, some records may return NA.

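One generic way to cope with intermittent NA results from a remote service is to retry the call a few times with a growing pause. The sketch below is an illustration under that assumption: `retry_on_na()` and `flaky_lookup()` are hypothetical helpers, not REFT functions, and the flaky function simply simulates two failed attempts.

```r
# Hypothetical helper (not part of REFT): retry a call that may return NA.
retry_on_na <- function(fun, ..., max_tries = 3, wait_seconds = 1) {
  result <- NA
  for (attempt in seq_len(max_tries)) {
    # Treat an error the same way as an NA result.
    result <- tryCatch(fun(...), error = function(e) NA)
    if (!all(is.na(result))) break
    # Wait a little longer after each failed attempt.
    if (attempt < max_tries) Sys.sleep(wait_seconds * attempt)
  }
  result
}

# Stand-in for an unstable lookup: returns NA on the first two calls.
calls <- 0
flaky_lookup <- function(term) {
  calls <<- calls + 1
  if (calls < 3) NA_character_ else paste0("hit:", term)
}

value <- retry_on_na(flaky_lookup, "citrate", wait_seconds = 0)
value
```

A record that is still NA after the last attempt is returned as NA, so genuinely unmatched compounds are not hidden by the retry.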
REFT can be installed without loading rcdk at package startup. Only the molecular descriptor calculation step requires rcdk/rJava.
If descriptor calculation fails on Windows, check that Java is installed and that R and Java use matching architectures.
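The Java check can be done from within R. This is a rough diagnostic sketch using base R and, if it is installed, rJava; it is not a REFT function, and the variable names are illustrative.

```r
# Architecture of the running R session, for example "x86_64".
r_arch <- R.version$arch
message("R architecture: ", r_arch)

# JAVA_HOME is often, but not always, set when Java is installed.
java_home <- Sys.getenv("JAVA_HOME", unset = NA)
message("JAVA_HOME: ", if (is.na(java_home)) "<not set>" else java_home)

# Look for a java executable on the PATH; an empty string means none was found.
java_bin <- Sys.which("java")
message("java on PATH: ", if (nzchar(java_bin)) java_bin else "<not found>")

# If rJava loads, start the JVM and ask it for its own architecture.
if (requireNamespace("rJava", quietly = TRUE)) {
  jvm_ok <- tryCatch({
    rJava::.jinit()
    TRUE
  }, error = function(e) FALSE)
  if (jvm_ok) {
    jvm_arch <- rJava::.jcall("java/lang/System", "S", "getProperty", "os.arch")
    message("JVM architecture: ", jvm_arch)
  } else {
    message("rJava is installed, but the JVM could not be started.")
  }
} else {
  message("rJava is not installed or could not be loaded.")
}
```

R and Java may label the same architecture differently (for example x86_64 versus amd64), so compare whether both are 64-bit rather than expecting identical strings.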