Search by canonical SMILES to retrieve all stereoisomers #42
Comments
My fork (https://github.com/BalooRM/PubChemPy) has an update to pubchempy.py that permits searching by SMILES to retrieve specific isomers. In the example below, the canonical SMILES for albuterol, which has 2 stereoisomers and a non-specific structure in PubChem, is used in a fastidentity search with identity_type = same_isotope, and all three records are retrieved. PubChem also has other isotopic forms of albuterol, which are not part of this result. The synchronous ("fast") searches are documented here:

The following output is generated by the test code which follows.

```
get_compounds by SMILES
CID 2083
IUPAC Name 4-[2-(tert-butylamino)-1-hydroxyethyl]-2-(hydroxymethyl)phenol
Canonical SMILES CC(C)(C)NCC(C1=CC(=C(C=C1)O)CO)O
Isomeric SMILES CC(C)(C)NCC(C1=CC(=C(C=C1)O)CO)O

get_compounds by SMILES: searchtype='fastidentity', identity_type='same_isotope'
CID 2083
IUPAC Name 4-[2-(tert-butylamino)-1-hydroxyethyl]-2-(hydroxymethyl)phenol
Canonical SMILES CC(C)(C)NCC(C1=CC(=C(C=C1)O)CO)O
Isomeric SMILES CC(C)(C)NCC(C1=CC(=C(C=C1)O)CO)O
CID 123600
IUPAC Name 4-[(1R)-2-(tert-butylamino)-1-hydroxyethyl]-2-(hydroxymethyl)phenol
Canonical SMILES CC(C)(C)NCC(C1=CC(=C(C=C1)O)CO)O
Isomeric SMILES CC(C)(C)NC[C@@H](C1=CC(=C(C=C1)O)CO)O
CID 182176
IUPAC Name 4-[(1S)-2-(tert-butylamino)-1-hydroxyethyl]-2-(hydroxymethyl)phenol
Canonical SMILES CC(C)(C)NCC(C1=CC(=C(C=C1)O)CO)O
Isomeric SMILES CC(C)(C)NC[C@H](C1=CC(=C(C=C1)O)CO)O

get_cids by SMILES
[2083]

get_cids by SMILES: searchtype=fastidentity, identity_type='same_isotope'
https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/fastidentity/smiles/cids/JSON?identity_type=same_isotope
[2083, 123600, 182176]
```

```python
import pubchempy as pcp

mycid = 2083
mycansmiles = "CC(C)(C)NCC(C1=CC(=C(C=C1)O)CO)O"
myisosmiles = "CC(C)(C)NCC(C1=CC(=C(C=C1)O)CO)O"

# Default lookup: only the record without specified stereochemistry is returned.
print("get_compounds by SMILES")
for compound in pcp.get_compounds(mycansmiles, 'smiles'):
    print('CID\t', compound.cid)
    print('IUPAC Name\t', compound.iupac_name)
    print('Canonical SMILES\t', compound.canonical_smiles)
    print('Isomeric SMILES\t', compound.isomeric_smiles)

# fastidentity lookup (supported by the fork): the stereoisomers are returned as well.
print("\nget_compounds by SMILES: searchtype='fastidentity', identity_type='same_isotope'")
for compound in pcp.get_compounds(mycansmiles, 'smiles', searchtype='fastidentity', identity_type='same_isotope'):
    print('CID\t', compound.cid)
    print('IUPAC Name\t', compound.iupac_name)
    print('Canonical SMILES\t', compound.canonical_smiles)
    print('Isomeric SMILES\t', compound.isomeric_smiles)

print("\nget_cids by SMILES")
print(pcp.get_cids(mycansmiles, 'smiles'))

print("\nget_cids by SMILES: searchtype=fastidentity, identity_type='same_isotope'")
print("https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/fastidentity/smiles/cids/JSON?identity_type=same_isotope")
print(pcp.get_cids(mycansmiles, 'smiles', searchtype='fastidentity', identity_type='same_isotope'))
```
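For anyone who wants to check the result without the fork, the same fastidentity request can be issued against PUG REST directly. This is only a rough sketch using the standard library: the helper names (`fastidentity_url`, `parse_cids`, `fetch_cids`) are made up for illustration and are not part of PubChemPy, the SMILES is percent-encoded into the URL path (an assumption about how best to pass it), and the `IdentifierList`/`CID` response layout is what I would expect PUG REST to return for a `cids/JSON` request.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def fastidentity_url(smiles, identity_type="same_isotope"):
    """Build a fastidentity CID request URL for a SMILES string (hypothetical helper)."""
    # Percent-encode everything, including '(', ')', '=' and '/', so the SMILES
    # stays a single path segment.
    encoded = quote(smiles, safe="")
    return (
        f"{PUG_REST}/compound/fastidentity/smiles/{encoded}"
        f"/cids/JSON?identity_type={identity_type}"
    )


def parse_cids(payload):
    """Extract the CID list from a response body.

    Assumes the layout {"IdentifierList": {"CID": [...]}}; returns an empty
    list if those keys are absent.
    """
    return json.loads(payload).get("IdentifierList", {}).get("CID", [])


def fetch_cids(smiles, identity_type="same_isotope"):
    """Send the request and return the CIDs (needs network access)."""
    with urlopen(fastidentity_url(smiles, identity_type), timeout=30) as response:
        return parse_cids(response.read().decode("utf-8"))
```

Calling `fetch_cids("CC(C)(C)NCC(C1=CC(=C(C=C1)O)CO)O")` should correspond to the last `get_cids` call in the test code above. Keeping URL construction and parsing separate from the network call makes both easy to check offline.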
Is it possible to perform a PUG REST synchronous (fastidentity) search to retrieve all related isomers for a canonical SMILES string (unspecified stereochemistry)? Currently, get_cids() returns a list with a single CID.
For example, the following request returns the desired information as JSON.
CC(C)(C)NCC(C1=CC(=C(C=C1)O)CO)O is the canonical SMILES for albuterol (CID = 2083).
https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/fastidentity/smiles/CC(C)(C)NCC(C1=CC(=C(C=C1)O)CO)O/cids/JSON?identity_type=same_isotope
Returns: