chemspiderapi

As of 18 June 2021, RaoulWolf/chemspiderapi is the new official repository of {chemspiderapi}. The old repository (NIVANorge/chemspiderapi) is orphaned.

ChemSpider introduced a new API syntax in late 2018 and scheduled the old API syntax for shutdown at the end of November 2018. {chemspiderapi} provides R wrappers around the new API services from ChemSpider.

The aim of this package is to:

  1. Translate the new ChemSpider API services into R-friendly functions.
  2. Include thorough quality checking before the query is sent, to avoid using up the query quota on, e.g., spelling errors.
  3. Implement the R functionality in a way that is suitable for both {base}- and {tidyverse}-style programming.
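
To illustrate the second aim, the following is a minimal sketch of a pre-query check. It is not the package's actual validation code: `check_inchikey()` is a hypothetical helper that only tests the standard InChIKey layout (14 uppercase letters, 10 uppercase letters, and 1 uppercase letter, separated by hyphens) so that a malformed input fails locally instead of consuming a query.

```r
# Hypothetical helper (not part of {chemspiderapi}): reject malformed
# InChIKeys before any API query is sent.
check_inchikey <- function(inchikey) {
  if (!is.character(inchikey) || length(inchikey) != 1L || is.na(inchikey)) {
    stop("`inchikey` must be a single, non-missing character string.",
         call. = FALSE)
  }
  # Standard InChIKey layout: 14 letters, 10 letters, 1 letter.
  if (!grepl("^[A-Z]{14}-[A-Z]{10}-[A-Z]$", inchikey)) {
    stop("`inchikey` does not look like a valid InChIKey.", call. = FALSE)
  }
  invisible(inchikey)
}

# A well-formed key passes silently; a malformed one stops with an error.
check_inchikey("BSYNRYMUTXBXSQ-UHFFFAOYSA-N")
bad_input <- try(check_inchikey("bsynrymutxbxsq"), silent = TRUE)
```

A check like this only guards the format; whether the key matches a ChemSpider record still has to be determined by the API itself.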

{chemspiderapi} relies on API keys to access ChemSpider’s API services. For this, we recommend storing the ChemSpider API key using the {keyring} package.
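
As a rough sketch of how key handling could look, the helper below first checks an environment variable and otherwise falls back to {keyring}. The helper name `get_chemspider_key()`, the service name "ChemSpider", and the variable name `CHEMSPIDER_API_KEY` are illustrative assumptions, not conventions defined by {chemspiderapi}. The environment variable is included only as a convenience for non-interactive sessions.

```r
# One-time, interactive step (prompts for the key and stores it):
# keyring::key_set(service = "ChemSpider")

# Hypothetical helper: the service and variable names are assumptions.
get_chemspider_key <- function(service = "ChemSpider",
                               envvar = "CHEMSPIDER_API_KEY") {
  # 1. Environment variable, e.g. for scripts that run unattended.
  key <- Sys.getenv(envvar, unset = "")
  if (nzchar(key)) {
    return(key)
  }
  # 2. Key stored earlier with keyring::key_set().
  if (requireNamespace("keyring", quietly = TRUE)) {
    return(keyring::key_get(service = service))
  }
  stop("No ChemSpider API key found.", call. = FALSE)
}
```

The returned string can then be passed to whichever argument of a wrapper function expects the API key; see the function documentation for the exact argument name.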

To limit the rate of API queries, we recommend using the {ratelimitr} package or purrr::slowly() within the {tidyverse} package collection.
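
The idea behind rate limiting can be sketched in base R as follows. This is an illustration of the technique, not a replacement for {ratelimitr} or purrr::slowly(); `rate_limited()` is a hypothetical wrapper, and the query function here is a stand-in that does not contact ChemSpider.

```r
# Hypothetical wrapper: returns a version of `f` that waits until at least
# `min_interval` seconds have passed since the previous call.
rate_limited <- function(f, min_interval = 1) {
  last_call <- NULL
  function(...) {
    if (!is.null(last_call)) {
      since_last <- as.numeric(difftime(Sys.time(), last_call, units = "secs"))
      wait <- min_interval - since_last
      if (wait > 0) Sys.sleep(wait)
    }
    last_call <<- Sys.time()
    f(...)
  }
}

# Stand-in for an API query (no network access involved).
slow_lookup <- rate_limited(function(x) toupper(x), min_interval = 0.2)

started <- Sys.time()
results <- vapply(c("a", "b", "c"), slow_lookup, character(1),
                  USE.NAMES = FALSE)
elapsed <- as.numeric(difftime(Sys.time(), started, units = "secs"))
```

With {purrr} installed, a comparable wrapper can be built with purrr::slowly(f, rate = purrr::rate_delay(1)).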

To handle PNG images, the {magick} and {png} packages are recommended.

We furthermore recommend the {memoise} package to “remember” the results of API queries (i.e., to avoid spending the API allowance on repeated queries).
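
What “remembering” means can be sketched with a small base R cache. This only illustrates the principle; `remember()` and `fake_query()` are hypothetical names, and in practice memoise::memoise() offers the same behaviour with more features.

```r
# Hypothetical cache: stores each result under its (single) input value.
remember <- function(f) {
  cache <- new.env(parent = emptyenv())
  function(x) {
    key <- as.character(x)
    if (!exists(key, envir = cache, inherits = FALSE)) {
      assign(key, f(x), envir = cache)  # only reached on the first request
    }
    get(key, envir = cache, inherits = FALSE)
  }
}

# Stand-in for an API query that counts how often it is actually run.
n_queries <- 0
fake_query <- function(name) {
  n_queries <<- n_queries + 1
  paste("result for", name)
}

cached_query <- remember(fake_query)
first <- cached_query("caffeine")   # runs fake_query()
second <- cached_query("caffeine")  # served from the cache

# With {memoise} installed, the equivalent is:
# cached_query <- memoise::memoise(fake_query)
```

After both calls, the stand-in query has run only once, which is the saving that matters when every real query counts against a quota.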

Additionally, the superb {webchem} package provides access to a lot of additional chemistry-related API services and is highly recommended.

Installation

R package

Install the package from GitHub (using the {remotes} package; automatically installed as part of {devtools}):

# install.packages("devtools")
remotes::install_github("RaoulWolf/chemspiderapi")

The development version can be installed with:

# install.packages("devtools")
remotes::install_github("RaoulWolf/chemspiderapi", ref = "dev")

Currently, Linux is the only continuous integration environment on which {chemspiderapi} is tested. It should also install smoothly on Windows 10 and macOS machines. Please open an issue if you run into any trouble.

Dependencies

{chemspiderapi} relies on two essential dependencies. The {curl} package is used to handle the API queries and the {jsonlite} package is necessary to wrap and unwrap information for the queries.

If not already installed, these packages will be installed automatically when installing {chemspiderapi}. Should this result in trouble, the dependency packages can be installed manually:

install.packages(c("curl", "jsonlite"))

If {curl} or {jsonlite} is missing, (almost) all functions of {chemspiderapi} will fail and throw an error.
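
A quick way to check for both dependencies before running any queries might look like this. It is a sketch using base R's requireNamespace() only; the variable names are arbitrary.

```r
# Check whether the two essential dependencies can be loaded.
required <- c("curl", "jsonlite")
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)

missing_pkgs <- required[!installed]
if (length(missing_pkgs) > 0) {
  message("Missing dependencies: ", paste(missing_pkgs, collapse = ", "),
          ". Install them with install.packages().")
} else {
  message("All essential dependencies are available.")
}
```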

Coverage

As of 2021-06-29, the following functionalities are implemented and fully annotated (100%):

FILTERING

| ChemSpider Compound API | {chemspiderapi} Wrapper |
| --- | --- |
| filter-element-post | post_element() |
| filter-formula-batch-post | post_formula_batch() |
| filter-formula-batch-queryId-results-get | get_formula_batch_query_id_results() |
| filter-formula-batch-queryId-status-get | get_formula_batch_query_id_status() |
| filter-formula-post | post_formula() |
| filter-inchi-post | post_inchi() |
| filter-inchikey-post | post_inchikey() |
| filter-intrinsicproperty-post | post_intrinsic_property() |
| filter-mass-batch-post | post_mass_batch() |
| filter-mass-batch-queryId-results-get | get_mass_batch_query_id_results() |
| filter-mass-batch-queryId-status-get | get_mass_batch_query_id_status() |
| filter-mass-post | post_mass() |
| filter-name-post | post_name() |
| filter-queryId-results-get | get_query_id_results() |
| filter-queryId-results-sdf-get | get_query_id_results_sdf() |
| filter-queryId-status-get | get_query_id_status() |
| filter-smiles-post | post_smiles() |

LOOKUPS

| ChemSpider Compound API | {chemspiderapi} Wrapper |
| --- | --- |
| lookups-datasources-get | get_data_sources() |

RECORDS

| ChemSpider Compound API | {chemspiderapi} Wrapper |
| --- | --- |
| records-batch-post | post_batch() |
| records-recordId-details-get | get_record_id_details() |
| records-recordId-externalreferences-get | get_record_id_external_references() |
| records-recordId-image-get | get_record_id_image() |
| records-recordId-mol-get | get_record_id_mol() |

TOOLS

| ChemSpider Compound API | {chemspiderapi} Wrapper |
| --- | --- |
| tools-convert-post | post_convert() |
| tools-validate-inchikey-post | post_validate_inchikey() |

Best practices for ChemSpider’s Compound APIs

This section will be updated with practical examples in the future.

The basic workflow order for the above FILTERING queries is:

  1. POST Query

  2. GET Status

  3. GET Results (after GET Status returns "Complete")

In practice, this means the following possible workflows can be implemented (from left to right):

| POST Query | GET Status | GET Results |
| --- | --- | --- |
| post_element() | get_query_id_status() | get_query_id_results() |
| post_formula() | get_query_id_status() | get_query_id_results() |
| post_formula_batch() | get_formula_batch_query_id_status() | get_formula_batch_query_id_results() |
| post_inchi() | get_query_id_status() | get_query_id_results() |
| post_inchikey() | get_query_id_status() | get_query_id_results() |
| post_intrinsic_property() | get_query_id_status() | get_query_id_results() |
| post_mass() | get_query_id_status() | get_query_id_results() |
| post_mass_batch() | get_mass_batch_query_id_status() | get_mass_batch_query_id_results() |
| post_name() | get_query_id_status() | get_query_id_results() |
| post_smiles() | get_query_id_status() | get_query_id_results() |
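
The POST, GET Status, GET Results order can be sketched as a small polling loop. The code below deliberately uses stand-in functions instead of the real wrappers, because the exact arguments of the {chemspiderapi} functions are not shown here. `poll_query()`, the fake status and results functions, the "Processing" status string, and the record IDs are all assumptions made for illustration; only the "Complete" status comes from the workflow described above.

```r
# Hypothetical helper: ask for the status until it is "Complete",
# then fetch the results. Gives up after `max_tries` status checks.
poll_query <- function(query_id, status_fun, results_fun,
                       wait = 1, max_tries = 10) {
  for (i in seq_len(max_tries)) {
    status <- status_fun(query_id)
    if (identical(status, "Complete")) {
      return(results_fun(query_id))
    }
    Sys.sleep(wait)  # be gentle with the query quota
  }
  stop("Query ", query_id, " did not complete after ", max_tries,
       " status checks.", call. = FALSE)
}

# Stand-ins for the GET Status and GET Results wrappers (no network access).
make_fake_status <- function(ready_after = 3) {
  n_checks <- 0
  function(query_id) {
    n_checks <<- n_checks + 1
    if (n_checks >= ready_after) "Complete" else "Processing"
  }
}
fake_results <- function(query_id) c(2157L, 2424L)  # made-up record IDs

record_ids <- poll_query("fake-query-id",
                         status_fun = make_fake_status(ready_after = 3),
                         results_fun = fake_results,
                         wait = 0.1)
```

In a real workflow, the query ID would come from one of the POST functions, and the two stand-ins would be replaced by the matching status and results wrappers from the table above.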

Typically, the result will be one or more ChemSpider IDs (record_id). These can be used as input for the RECORDS queries above, e.g., get_record_id_details().

Vignettes

As of 2021-06-29, the following five vignettes are available:

  • Storing and Accessing API Keys: A basic example of how to safely store and retrieve API keys using the {keyring} package.

  • Rate-Limiting API Queries: Multiple examples of how to slow down API queries. {base}, {ratelimitr}, and {tidyverse} workflows are provided.

  • Remembering API Queries: A basic example of how to “remember” the results of API queries using the {memoise} package. This can greatly reduce redundant API queries.

  • Saving MOL and SDF Files: Examples on how to save MOL and SDF files as returned by get_record_id_mol() and get_query_id_results_sdf().

  • Saving PNG Images: Examples on how to save PNG files, as returned by get_record_id_image(), using functionalities of the {magick} or {png} packages.

Funding

This package was originally created in the Section for Ecotoxicology and Risk Assessment at the Norwegian Institute for Water Research (NIVA) and funded by the Research Council of Norway (RCN), project 268294: Cumulative Hazard and Risk Assessment of Complex Mixtures and Multiple Stressors (MixRisk).

License

MIT © Raoul Wolf
