webchem is an R package to retrieve chemical information from the web.
This package interacts with a suite of web APIs for chemical information.
Source | Functions | API Docs | API key |
---|---|---|---|
Chemical Identifier Resolver (CIR) | cir_query() | link | none |
ChemSpider | get_csid(), csid_compinfo(), csid_extcompinfo() | link | link |
PubChem | get_cid(), cid_compinfo() | link | none |
Chemical Translation Service (CTS) | cts_convert(), cts_compinfo() | link | none |
ChemSpider functions require a security token. Please register at RSC (https://www.rsc.org/rsc-id/register) to retrieve a security token.
webchem is currently not available on CRAN. Install it from GitHub using devtools:
install.packages("devtools")
library("devtools")
install_github("edild/webchem")
library("webchem")
Retrieve CAS numbers and the molecular weight of Triclosan. Use first = TRUE to return only the first hit.
cir_query('Triclosan', 'cas')
#> [1] "3380-34-5" "112099-35-1" "88032-08-0"
cir_query('Triclosan', 'cas', first = TRUE)
#> [1] "3380-34-5"
cir_query('Triclosan', 'mw')
#> [1] "289.5451"
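Because a name query can return several CAS numbers, it can be useful to sanity-check each hit before using it further. The following is a minimal sketch in base R that recomputes the CAS check digit; `is_valid_cas()` is a hypothetical helper and not part of webchem.

```r
# Hypothetical helper (not part of webchem): validate a CAS Registry Number
# by recomputing its check digit.
is_valid_cas <- function(cas) {
  # Expect 2-7 digits, then 2 digits, then 1 check digit, separated by hyphens
  if (!grepl("^[0-9]{2,7}-[0-9]{2}-[0-9]$", cas)) {
    return(FALSE)
  }
  digits <- as.integer(strsplit(gsub("-", "", cas), "")[[1]])
  check <- digits[length(digits)]
  body <- rev(digits[-length(digits)])
  # Weight the remaining digits 1, 2, 3, ... from right to left;
  # the sum modulo 10 must equal the check digit
  sum(body * seq_along(body)) %% 10 == check
}

# The three hits returned for Triclosan above
hits <- c("3380-34-5", "112099-35-1", "88032-08-0")
vapply(hits, is_valid_cas, logical(1))
```

A passing check only means the number is well formed; it does not confirm that the number belongs to the queried compound.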
Query SMILES and InChIKey from a CAS number (Triclosan).
Inputs might be ambiguous, so we can specify where to search using the resolver argument.
cir_query('3380-34-5', 'smiles')
#> [1] "C1=CC(=CC(=C1OC2=CC=C(C=C2Cl)Cl)O)Cl"
cir_query('3380-34-5', 'stdinchikey', resolver = 'cas_number')
#> [1] "InChIKey=XEFQLINVKFYRCS-UHFFFAOYSA-N"
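The InChIKey above is returned with an `InChIKey=` prefix, whereas the bare key is what gets passed to later queries. A small sketch of how the prefix might be stripped and the result checked against the standard 14-10-1 letter layout; both helper names are hypothetical and not part of webchem.

```r
# Hypothetical helper (not part of webchem): drop an optional "InChIKey=" prefix
strip_inchikey_prefix <- function(x) {
  sub("^InChIKey=", "", x)
}

# Hypothetical helper: standard InChIKeys consist of 14 + 10 + 1 uppercase
# letters separated by hyphens
is_inchikey <- function(x) {
  grepl("^[A-Z]{14}-[A-Z]{10}-[A-Z]$", x)
}

key <- strip_inchikey_prefix("InChIKey=XEFQLINVKFYRCS-UHFFFAOYSA-N")
key
is_inchikey(key)
```

The check is purely syntactic: it does not verify that the key corresponds to an existing compound.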
Convert the InChIKey of Triclosan to a ChemSpider ID and retrieve the number of rings:
cir_query('XEFQLINVKFYRCS-UHFFFAOYSA-N', 'chemspider_id', first = TRUE)
#> [1] "5363"
cir_query('XEFQLINVKFYRCS-UHFFFAOYSA-N', 'ring_count')
#> [1] "2"
You'll need an API key:
token <- '<YOUR TOKEN HERE>'
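To avoid hard-coding the token in scripts, one option is to read it from an environment variable. This is a sketch under assumptions: the helper `get_token()` and the variable name `CHEMSPIDER_TOKEN` are illustrative choices, not a webchem convention.

```r
# Hypothetical helper (not part of webchem): read the security token from an
# environment variable. The name CHEMSPIDER_TOKEN is an assumption.
get_token <- function(var = "CHEMSPIDER_TOKEN") {
  token <- Sys.getenv(var, unset = "")
  if (!nzchar(token)) {
    stop("Environment variable ", var, " is not set.", call. = FALSE)
  }
  token
}

# Usage, once the variable is set (for example in ~/.Renviron):
# token <- get_token()
```

Failing early with a clear message is usually easier to debug than a rejected request from the web service.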
Retrieve the ChemSpider ID of Triclosan
(id <- get_csid('Triclosan', token = token))
#> [1] "5363"
Use this ID to query information from ChemSpider
csid_extcompinfo(id, token = token)
#> CSID
#> "5363"
#> MF
#> "C_{12}H_{7}Cl_{3}O_{2}"
#> SMILES
#> "c1cc(c(cc1Cl)O)Oc2ccc(cc2Cl)Cl"
#> InChI
#> "InChI=1/C12H7Cl3O2/c13-7-1-3-11(9(15)5-7)17-12-4-2-8(14)6-10(12)16/h1-6,16H"
#> InChIKey
#> "XEFQLINVKFYRCS-UHFFFAOYAS"
#> AverageMass
#> "289.5418"
#> MolecularWeight
#> "289.5418"
#> MonoisotopicMass
#> "287.951172"
#> NominalMass
#> "288"
#> ALogP
#> "5.53"
#> XLogP
#> "5"
#> CommonName
#> "Triclosan"
Retrieve PubChem CID
get_cid('Triclosan')
#> [1] "5564" "131203" "627458" "15942656" "16220126" "16220128"
#> [7] "16220129" "16220130" "18413505" "22947105" "23656593" "24848164"
#> [13] "25023954" "25023955" "25023956" "25023957" "25023958" "25023959"
#> [19] "25023960" "25023961" "25023962" "25023963" "25023964" "25023965"
#> [25] "25023966" "25023967" "25023968" "25023969" "25023970" "25023971"
#> [31] "25023972" "25023973" "45040608" "45040609" "67606151" "71752714"
cid <- get_cid('3380-34-5')
Use this CID to retrieve some chemical properties:
props <- cid_compinfo(cid)
props$InChIKey
#> [1] "XEFQLINVKFYRCS-UHFFFAOYSA-N"
props$MolecularWeight
#> [1] "289.541780"
props$IUPACName
#> [1] "5-chloro-2-(2,4-dichlorophenoxy)phenol"
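Several of the values shown so far are returned as character strings, so they need converting before any arithmetic. As a small illustration, the sketch below takes the molecular weights reported above by CIR, ChemSpider and PubChem (copied in by hand, not queried) and compares them numerically.

```r
# Molecular weights of Triclosan as printed in the examples above
mw <- c(CIR = "289.5451", ChemSpider = "289.5418", PubChem = "289.541780")

# as.numeric() drops names, so restore them afterwards
mw_num <- as.numeric(mw)
names(mw_num) <- names(mw)
mw_num

# Spread between the largest and smallest reported value
diff(range(mw_num))
```

The sources agree to about two decimal places for this compound.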
CTS can convert between nearly any pair of chemical identifiers:
cts_convert(query = '3380-34-5', from = 'CAS', to = 'PubChem CID')
#> [1] "5564"
cts_convert(query = '3380-34-5', from = 'CAS', to = 'ChemSpider')
#> [1] "5363"
(inchk <- cts_convert(query = 'Triclosan', from = 'Chemical Name', to = 'inchikey'))
#> [1] "XEFQLINVKFYRCS-UHFFFAOYSA-N"
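When converting many identifiers, a single failed request should not abort the whole run. The sketch below wraps an arbitrary query function in `tryCatch()`; `query_each()` is a hypothetical helper, and `fake_lookup()` is an offline stand-in used only so that the example runs without network access.

```r
# Hypothetical helper (not part of webchem): apply a query function to several
# inputs and return NA for any input that raises an error.
query_each <- function(queries, fun, ..., pause = 0) {
  out <- lapply(queries, function(q) {
    res <- tryCatch(fun(q, ...), error = function(e) NA_character_)
    Sys.sleep(pause)  # optional pause between requests
    res
  })
  names(out) <- queries
  out
}

# Offline stand-in for a web query; it knows a single CAS number
fake_lookup <- function(query, from, to) {
  known <- c("3380-34-5" = "5564")
  if (!query %in% names(known)) stop("no hit for ", query)
  unname(known[query])
}

res <- query_each(c("3380-34-5", "not-a-cas"), fake_lookup,
                  from = "CAS", to = "PubChem CID")
res
```

With webchem loaded, one could presumably pass `cts_convert` as `fun`, assuming it takes the query as its first argument as in the calls above. A non-zero `pause` keeps the request rate modest.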
Moreover, we can retrieve a lot of information stored in the CTS database using the InChIKey:
info <- cts_compinfo(inchikey = inchk)
info[1:5]
#> $inchikey
#> [1] "XEFQLINVKFYRCS-UHFFFAOYSA-N"
#>
#> $inchicode
#> [1] "InChI=1S/C12H7Cl3O2/c13-7-1-3-11(9(15)5-7)17-12-4-2-8(14)6-10(12)16/h1-6,16H"
#>
#> $molweight
#> [1] 289.5418
#>
#> $exactmass
#> [1] 287.9512
#>
#> $formula
#> [1] "C12H7Cl3O2"
Without these fantastic web services, webchem wouldn't be here.
Therefore, kudos to the web service providers and developers!
If you're more familiar with Python than with R, you should check out Matt Swain's repositories: ChemSpiPy, PubChemPy and CIRpy provide functionality similar to webchem.
- Please report any issues, bugs or feature requests.
- License: MIT