How to get the PubChem similarities between two compounds? #8

Closed
beyondpie opened this issue Oct 11, 2015 · 2 comments

@beyondpie

Hi, nice to find this package.
I currently have multiple compounds in SDF format (they come from the ZINC database).
Is it possible to use their SDF files to get their PubChem similarities?
Thanks ~
Songpeng

@mcs07
Owner

mcs07 commented Oct 11, 2015

If you want similarities between compounds in an SDF file, I would recommend generating fingerprints and calculating similarities locally using RDKit (or OpenBabel, CDK, etc.). Something like:

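from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

# Compare the first two molecules in the SDF file using Morgan fingerprints (radius 2)
mols = Chem.SDMolSupplier('myfile.sdf')
fp1 = AllChem.GetMorganFingerprint(mols[0], 2)
fp2 = AllChem.GetMorganFingerprint(mols[1], 2)
DataStructs.TanimotoSimilarity(fp1, fp2)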

But if you specifically want to use PubChem fingerprints you can do something like this with PubChemPy:

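def tanimoto(compound1, compound2):
    # Compound.fingerprint is a hex string; convert it to an integer bit field
    fp1 = int(compound1.fingerprint, 16)
    fp2 = int(compound2.fingerprint, 16)
    # Count the bits set in each fingerprint and the bits set in both
    fp1_count = bin(fp1).count('1')
    fp2_count = bin(fp2).count('1')
    both_count = bin(fp1 & fp2).count('1')
    # Tanimoto = shared bits / bits set in either fingerprint
    return float(both_count) / (fp1_count + fp2_count - both_count)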

I added a more complete example here: https://github.com/mcs07/PubChemPy/blob/master/examples/Chemical%20fingerprints%20and%20similarity.ipynb
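
For anyone who wants to try the bit counting without a network connection, here is a minimal sketch of the same Tanimoto calculation applied directly to hex strings. The helper names `hex_to_bits` and `tanimoto_hex` and the toy 8-bit fingerprints are made up for illustration and are not part of PubChemPy. Returning 0.0 for two empty fingerprints is a convention chosen for this sketch.

```python
def hex_to_bits(hex_fp):
    """Expand a hex-encoded fingerprint into a zero-padded string of '0'/'1'."""
    return bin(int(hex_fp, 16))[2:].zfill(len(hex_fp) * 4)


def tanimoto_hex(hex_fp1, hex_fp2):
    """Tanimoto coefficient of two hex-encoded bit fingerprints."""
    fp1 = int(hex_fp1, 16)
    fp2 = int(hex_fp2, 16)
    count1 = bin(fp1).count("1")
    count2 = bin(fp2).count("1")
    both = bin(fp1 & fp2).count("1")
    union = count1 + count2 - both
    # Two all-zero fingerprints share nothing to compare; 0.0 is an
    # assumption of this sketch, not a PubChem convention.
    return both / union if union else 0.0


# Toy 8-bit fingerprints (hypothetical, not real PubChem data)
a = "f0"  # bits 11110000
b = "3c"  # bits 00111100
print(hex_to_bits(a), hex_to_bits(b))
print(tanimoto_hex(a, b))  # 2 shared bits out of 6 set in either
```

With PubChemPy, the same function should accept `compound1.fingerprint` and `compound2.fingerprint` as its two arguments, since those are the hex strings used in the `tanimoto` function above.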

@beyondpie
Author

Great!
Yes, I also use RDKit. For this part, I only want the PubChem similarities.
Now I see: with compound.fingerprint in your package, I can get not only the similarities but also the PubChem fingerprints themselves.
Thanks a lot!
Songpeng
