REFT

REFT (Root Exudate Feature Toolkit) is an R package for molecule-oriented analysis of root exudate and metabolomics annotation tables. It supports batch database searching, SMILES matching, and calculation of six molecular descriptors.

Main features

  • Read Excel annotation tables
  • Match PubChem records in the order Name -> Other_name(Kegg_name) -> Kegg_ID -> HMDB_ID
  • Return a matching log and unmatched records
  • Optionally export Excel output files when output_dir is explicitly supplied
  • Calculate six molecular descriptors using rcdk:
    • MW
    • nHBAcc
    • ALogP
    • FMF
    • HybRatio
    • nAcid
  • Query KEGG reactions linked to EC numbers for microbe-associated reaction annotation
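The matching order listed above reads naturally as a fallback chain: try one identifier column, and move to the next only if it yields nothing. The sketch below illustrates that idea in base R. It is not REFT's implementation: `match_first()` and `toy_lookup` are hypothetical names, and the lookup vector stands in for a real PubChem query so the example runs offline.

```r
# Hypothetical stand-in for a PubChem query: maps a query string to SMILES.
toy_lookup <- c("ethanol" = "CCO", "C00033" = "CC(=O)O")

# Try each identifier column in order and stop at the first hit.
match_first <- function(record, cols, lookup) {
  for (col in cols) {
    query <- record[[col]]
    # Skip missing or empty identifiers.
    if (is.null(query) || is.na(query) || !nzchar(query)) next
    hit <- lookup[query]
    if (!is.na(hit)) {
      return(list(smiles = unname(hit), matched_by = col))
    }
  }
  # No column produced a hit.
  list(smiles = NA_character_, matched_by = NA_character_)
}

cols <- c("Name", "Other_name(Kegg_name)", "Kegg_ID", "HMDB_ID")
rec <- list(
  Name = "acetate",
  `Other_name(Kegg_name)` = NA_character_,
  Kegg_ID = "C00033",
  HMDB_ID = "HMDB0000042"
)

res_demo <- match_first(rec, cols, toy_lookup)
res_demo$matched_by
```

In this toy record the name is not in the lookup and the alternative name is missing, so the match falls through to the Kegg_ID column.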

Install dependencies

install.packages(c(
  "readxl", "dplyr", "purrr", "stringr", "tibble",
  "writexl", "webchem", "rcdk", "rcdklibs"
))

Install REFT

Option 1: Install from CRAN

install.packages("REFT")

Option 2: Install from a source tarball

install.packages("REFT_0.1.4.tar.gz", repos = NULL, type = "source")

Quick start

By default, reft_run() and reft_run_simple() return results in R and do not write files.

library(REFT)

res <- reft_run_simple(
  input_file = "example_root_exudate.xlsx"
)

head(res$descriptors)

To write output files, explicitly provide an output directory. In examples, use tempdir() or another user-chosen location.

res <- reft_run_simple(
  input_file = "example_root_exudate.xlsx",
  output_dir = tempdir()
)

Custom column names

library(REFT)

res <- reft_run(
  input_file = "your_data.xlsx",
  name_col = "Name",
  other_col = "Other_name(Kegg_name)",
  hmdb_col = "HMDB_ID",
  kegg_col = "Kegg_ID",
  output_dir = tempdir()
)

KEGG-microbe workflow

library(REFT)

res <- reft_kegg_microbe_run(
  input_file = "microbe_ec.csv",
  output_dir = tempdir()
)

head(res$results)
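This README does not say how REFT queries KEGG internally. As a rough illustration of what an EC-to-reaction mapping looks like, the sketch below parses text in the two-column, tab-delimited shape returned by the KEGG REST `link` operation (for example `https://rest.kegg.jp/link/reaction/ec:1.1.1.1`). The sample text and the `parse_kegg_link()` helper are illustrative assumptions, and no network request is made.

```r
# Illustrative sample in the tab-delimited shape of a KEGG "link" response.
sample_response <- "ec:1.1.1.1\trn:R00623\nec:1.1.1.1\trn:R00754\n"

# Hypothetical helper: turn the response text into an EC-to-reaction table.
parse_kegg_link <- function(txt) {
  tab <- utils::read.delim(
    text = txt, header = FALSE, sep = "\t",
    col.names = c("ec", "reaction"),
    colClasses = "character"
  )
  # Drop the database prefixes so the IDs are easier to join on.
  tab$ec <- sub("^ec:", "", tab$ec)
  tab$reaction <- sub("^rn:", "", tab$reaction)
  tab
}

ec_map <- parse_kegg_link(sample_response)
ec_map
```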

Optional PubChem cache

If you want to cache PubChem results, explicitly choose a path. In examples or tests, use tempdir().

options(REFT.pubchem_cache_file = file.path(tempdir(), "REFT_pubchem_cache.rds"))
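The on-disk format of REFT's cache is not documented here. To show the general idea of an RDS-backed lookup cache, the sketch below uses a hypothetical `cached_lookup()` helper with its own temporary file, so it does not touch the `REFT.pubchem_cache_file` option or any real cache.

```r
# Hypothetical helper: return the cached value for `key` if present,
# otherwise call `fetch(key)`, store the result, and return it.
cached_lookup <- function(key, fetch, cache_file) {
  cache <- if (file.exists(cache_file)) readRDS(cache_file) else list()
  if (!is.null(cache[[key]])) return(cache[[key]])
  value <- fetch(key)
  cache[[key]] <- value
  saveRDS(cache, cache_file)
  value
}

# Demo with a fake "remote" function that counts how often it is called.
demo_cache <- tempfile(fileext = ".rds")
n_calls <- 0
slow_fetch <- function(key) {
  n_calls <<- n_calls + 1
  toupper(key)
}

first <- cached_lookup("ethanol", slow_fetch, demo_cache)
second <- cached_lookup("ethanol", slow_fetch, demo_cache)
n_calls
```

The second call is served from the RDS file, so the fake remote function runs only once.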

Output files

When output_dir is explicitly supplied, reft_run() writes:

  • metabolites_6_descriptors.xlsx
  • unmatched_smiles.xlsx
  • pubchem_match_log.xlsx

When output_dir is explicitly supplied, reft_kegg_microbe_run() writes:

  • microbe_ec_kegg_reactions.xlsx
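After a run with output_dir set, it can be useful to confirm which of the files listed above were actually written. The sketch below is a base R check; `check_outputs()` is a hypothetical helper, and the demo creates an empty placeholder file instead of running REFT.

```r
# File names taken from the reft_run() list above.
expected_files <- c(
  "metabolites_6_descriptors.xlsx",
  "unmatched_smiles.xlsx",
  "pubchem_match_log.xlsx"
)

# Hypothetical helper: report which expected files exist in output_dir.
check_outputs <- function(output_dir, files) {
  present <- file.exists(file.path(output_dir, files))
  stats::setNames(present, files)
}

# Demo: a fresh directory with one empty placeholder file.
demo_dir <- tempfile("reft_out_")
dir.create(demo_dir)
file.create(file.path(demo_dir, "unmatched_smiles.xlsx"))

status <- check_outputs(demo_dir, expected_files)
status
```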

Returned object

Both reft_run() and reft_run_simple() return a named list:

  • descriptors: final result table containing SMILES and six descriptors
  • unmatched: records that were not matched to SMILES
  • match_log: PubChem matching log
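A quick way to gauge a run is to compare the sizes of these elements. The sketch below builds a mock list with the same element names and computes a match rate. The column names and values in the mock data frames are illustrative assumptions, not REFT's actual output schema.

```r
# Mock result mirroring the element names described above.
mock_res <- list(
  descriptors = data.frame(
    Name = c("ethanol", "acetic acid"),
    SMILES = c("CCO", "CC(=O)O")
  ),
  unmatched = data.frame(Name = "unknown compound X"),
  match_log = data.frame(
    Name = c("ethanol", "acetic acid", "unknown compound X"),
    matched = c(TRUE, TRUE, FALSE)
  )
)

# Share of input records that ended up with a SMILES string.
n_matched <- nrow(mock_res$descriptors)
n_unmatched <- nrow(mock_res$unmatched)
match_rate <- n_matched / (n_matched + n_unmatched)
round(match_rate, 2)
```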

reft_kegg_microbe_run() returns a named list:

  • results: final microbe-EC-reaction table
  • ec_to_reaction: EC-to-reaction mapping table
  • reaction_details: reaction detail table
  • compound_table: compound formula table

Notes

  • Kegg_ID and HMDB_ID values follow the original workflow: they are submitted to PubChem as name-based queries, so the hit rate may differ between identifier types.
  • rcdk requires a working Java environment.
  • If PubChem or KEGG network access is unstable, some records may return NA.
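One common way to cope with flaky network lookups is to retry a call that fails or returns NA. The sketch below shows the pattern in base R. `with_retry()` is a hypothetical helper, not part of REFT, and the failing function is simulated so the example runs offline.

```r
# Hypothetical helper: call fun(...) up to max_tries times, treating an
# error or an all-NA result as a failed attempt.
with_retry <- function(fun, ..., max_tries = 3, wait = 0) {
  out <- NA
  for (i in seq_len(max_tries)) {
    out <- tryCatch(fun(...), error = function(e) NA)
    if (length(out) > 0 && !all(is.na(out))) return(out)
    if (i < max_tries) Sys.sleep(wait)  # pause before the next attempt
  }
  out
}

# Simulated lookup that fails twice, then succeeds.
attempts <- 0
flaky <- function(x) {
  attempts <<- attempts + 1
  if (attempts < 3) stop("simulated network error")
  paste("result for", x)
}

out_demo <- with_retry(flaky, "ethanol", max_tries = 5)
attempts
```

For real queries, a non-zero `wait` keeps repeated requests from hammering the remote service.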

Java / rcdk note

REFT does not load rcdk when the package is installed or attached. Only the molecular descriptor calculation step requires rcdk/rJava.
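To check that rcdk works independently of REFT, the six descriptors can be requested from rcdk directly. The sketch below assumes the standard CDK descriptor class names for MW, nHBAcc, ALogP, FMF, HybRatio, and nAcid; how REFT calls rcdk internally is not shown in this README. The rcdk step is skipped when rcdk or Java is unavailable. Note that the ALOGP class returns extra columns alongside ALogP, and its values may be NA unless molecules are prepared with explicit hydrogens.

```r
# Assumed CDK descriptor classes for the six descriptors named above.
cdk_prefix <- "org.openscience.cdk.qsar.descriptors.molecular."
desc_classes <- paste0(cdk_prefix, c(
  "WeightDescriptor",              # MW
  "HBondAcceptorCountDescriptor",  # nHBAcc
  "ALOGPDescriptor",               # ALogP (plus additional columns)
  "FMFDescriptor",                 # FMF
  "HybridizationRatioDescriptor",  # HybRatio
  "AcidicGroupCountDescriptor"     # nAcid
))

# Only attempt the calculation when rcdk (and therefore Java) loads.
desc_demo <- NULL
if (requireNamespace("rcdk", quietly = TRUE)) {
  desc_demo <- tryCatch({
    mols <- rcdk::parse.smiles(c("CCO", "CC(=O)O"))
    rcdk::eval.desc(mols, desc_classes)
  }, error = function(e) NULL)
}

if (is.null(desc_demo)) {
  message("rcdk/Java not available; descriptor step skipped.")
} else {
  print(desc_demo)
}
```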

If descriptor calculation fails on Windows, check that Java is installed and that R and Java use matching architectures (for example, both 64-bit).
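A minimal diagnostic sketch for comparing the two architectures from within R. It assumes rJava is the bridge in use and skips the Java side if rJava cannot be loaded. `os.arch` is a standard Java system property; the variable names are illustrative.

```r
# Architecture of the running R session.
r_bits <- 8 * .Machine$sizeof.pointer
r_arch <- R.version$arch

# JAVA_HOME as seen by R (NA if unset).
java_home <- Sys.getenv("JAVA_HOME", unset = NA)

# Architecture reported by the JVM, if rJava can start one.
java_arch <- NA_character_
if (requireNamespace("rJava", quietly = TRUE)) {
  java_arch <- tryCatch({
    rJava::.jinit()
    rJava::.jcall("java/lang/System", "S", "getProperty", "os.arch")
  }, error = function(e) NA_character_)
}

message("R: ", r_arch, " (", r_bits, "-bit)")
message("JAVA_HOME: ", java_home)
message("Java os.arch: ", java_arch)
```

If the Java line shows NA while Java is installed, the JVM could not be started from R, which is consistent with a missing or mismatched Java installation.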

About

This is a read-only mirror of the CRAN R package repository.

REFT: Root Exudate Feature Toolkit
Homepage: https://github.com/gaoguozhen1/REFT
Bug reports: https://github.com/gaoguozhen1/REFT/issues
