# Compute PAINS filters

The PAINS SMARTS patterns, as implemented in RDKit, can be found here: https://github.com/rdkit/rdkit/blob/3af7aeaaea348ef25e974056ad1b593efa4e7f8d/Data/Pains/wehi_pains.csv

There is also some useful discussion around them here: https://github.com/rdkit/rdkit/pull/536

Per this article, call Chem.AddHs(mol) on the query molecule first: http://rdkit.blogspot.com/2015/08/curating-pains-filters.html

An interesting article on PAINS by Derek Lowe is here: https://www.science.org/content/blog-post/no-easy-road-getting-rid-pains

The newer, updated PAINS SMARTS patterns are here: https://github.com/rdkit/rdkit/blob/master/Data/Pains/wehi_pains.csv
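The Chem.AddHs step mentioned above can be sketched as follows. This is a minimal illustration, not the article's own code: it assumes the three built-in PAINS catalogs and uses phenol as an arbitrary test molecule. The helper name `has_pains` is hypothetical. Whether explicit hydrogens change the result depends on how the catalog's queries treat hydrogens, so the sketch simply computes both answers for comparison.

```python
from rdkit import Chem
from rdkit.Chem.FilterCatalog import FilterCatalog, FilterCatalogParams

# Build a catalog holding all three PAINS filter sets
params = FilterCatalogParams()
params.AddCatalog(FilterCatalogParams.FilterCatalogs.PAINS_A)
params.AddCatalog(FilterCatalogParams.FilterCatalogs.PAINS_B)
params.AddCatalog(FilterCatalogParams.FilterCatalogs.PAINS_C)
catalog = FilterCatalog(params)


def has_pains(m):
    # Hypothetical helper: True if any catalog entry matches the molecule
    return catalog.GetFirstMatch(m) is not None


mol = Chem.MolFromSmiles('c1ccccc1O')  # phenol, implicit hydrogens only
mol_h = Chem.AddHs(mol)                # returns a new molecule with explicit hydrogens

# Atom counts before and after adding hydrogens
n_atoms, n_atoms_h = mol.GetNumAtoms(), mol_h.GetNumAtoms()

# Compare the filter result with and without explicit hydrogens
hit_without_hs = has_pains(mol)
hit_with_hs = has_pains(mol_h)
print(n_atoms, n_atoms_h, hit_without_hs, hit_with_hs)
```

Note that Chem.AddHs does not modify its argument, so the original `mol` can still be used for anything that expects implicit hydrogens.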

In [13]:
from rdkit import Chem
from rdkit.Chem.FilterCatalog import FilterCatalog, FilterCatalogParams

params = FilterCatalogParams()
params.AddCatalog(FilterCatalogParams.FilterCatalogs.PAINS_A) # there are 3 PAINS filter sets (A, B and C); all three are added here
params.AddCatalog(FilterCatalogParams.FilterCatalogs.PAINS_B)
params.AddCatalog(FilterCatalogParams.FilterCatalogs.PAINS_C)

catalog = FilterCatalog(params)
    
def pains(mol):
    # GetFirstMatch returns the first matching catalog entry, or None if no PAINS filter matches
    entry = catalog.GetFirstMatch(mol)

    # Always return a real boolean: True if a PAINS filter matched, False otherwise
    return entry is not None

In [7]:
# Let's pass pains() a molecule that does not have a PAINS substructure:

mol = Chem.MolFromSmiles('CNc1c2cc(Oc7c(cc(F)cn7)C(=O)NC(C4C(=O)C=C(CC4)C)NC(=O)C=C)ccc2[nH]n1')
pains(mol)

False

In [8]:
# Let's pass a molecule that has a PAINS substructure:

pa = Chem.MolFromSmiles('O=C(Cn1cnc2c1c(=O)n(C)c(=O)n2C)N/N=C/c1c(O)ccc2c1cccc2')
pains(pa)

True
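A yes/no answer does not say which PAINS pattern was hit. The sketch below, which assumes the same three catalogs as above, collects the description of every matching entry using `GetMatches` and `GetDescription`. The helper name `pains_alerts` is hypothetical and not part of RDKit.

```python
from rdkit import Chem
from rdkit.Chem.FilterCatalog import FilterCatalog, FilterCatalogParams

# Same catalog setup as in the notebook: PAINS sets A, B and C
params = FilterCatalogParams()
params.AddCatalog(FilterCatalogParams.FilterCatalogs.PAINS_A)
params.AddCatalog(FilterCatalogParams.FilterCatalogs.PAINS_B)
params.AddCatalog(FilterCatalogParams.FilterCatalogs.PAINS_C)
catalog = FilterCatalog(params)


def pains_alerts(mol):
    # Hypothetical helper: descriptions of all PAINS entries that match mol
    # (an empty list means no PAINS filter matched)
    return [entry.GetDescription() for entry in catalog.GetMatches(mol)]


flagged = Chem.MolFromSmiles('O=C(Cn1cnc2c1c(=O)n(C)c(=O)n2C)N/N=C/c1c(O)ccc2c1cccc2')
clean = Chem.MolFromSmiles('c1ccccc1C(=O)O')

flagged_alerts = pains_alerts(flagged)
clean_alerts = pains_alerts(clean)
print(flagged_alerts)
print(clean_alerts)
```

Returning a list keeps the boolean check trivial (`bool(pains_alerts(mol))`) while preserving the alert names for later inspection.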

In [11]:
# To check for PAINS in a DataFrame containing SMILES:

import pandas as pd

smi = {'smiles': ['c1ccccc1C(=O)O', 'c1ccccc1N', 'c1ccccc1O', 'c1ccccc1ON']}
da = pd.DataFrame(data=smi)
da['mol'] = da['smiles'].apply(Chem.MolFromSmiles)

da['PAINS?'] = da['mol'].apply(pains)

In [12]:
da

| | smiles | mol | PAINS? |
|---|---|---|---|
| 0 | c1ccccc1C(=O)O | `<rdkit.Chem.rdchem.Mol object at 0x15651fb50>` | False |
| 1 | c1ccccc1N | `<rdkit.Chem.rdchem.Mol object at 0x15651ff40>` | False |
| 2 | c1ccccc1O | `<rdkit.Chem.rdchem.Mol object at 0x156580040>` | False |
| 3 | c1ccccc1ON | `<rdkit.Chem.rdchem.Mol object at 0x1565800b0>` | False |
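Real datasets often contain SMILES that RDKit cannot parse, in which case Chem.MolFromSmiles returns None and the filter call fails. The following is one possible way to guard against that, assuming it is acceptable to record a missing value for unparsable rows. The helper name `pains_or_none` and the deliberately invalid string `'not_a_smiles'` are illustrative only.

```python
import pandas as pd
from rdkit import Chem, RDLogger
from rdkit.Chem.FilterCatalog import FilterCatalog, FilterCatalogParams

# Silence RDKit's parse-error messages for the deliberately invalid SMILES below
RDLogger.DisableLog('rdApp.*')

params = FilterCatalogParams()
params.AddCatalog(FilterCatalogParams.FilterCatalogs.PAINS_A)
params.AddCatalog(FilterCatalogParams.FilterCatalogs.PAINS_B)
params.AddCatalog(FilterCatalogParams.FilterCatalogs.PAINS_C)
catalog = FilterCatalog(params)


def pains_or_none(smiles):
    # Hypothetical helper: None for unparsable SMILES, otherwise True/False
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    return catalog.GetFirstMatch(mol) is not None


df = pd.DataFrame({'smiles': ['c1ccccc1C(=O)O', 'not_a_smiles']})
df['PAINS?'] = df['smiles'].apply(pains_or_none)
print(df)
```

Rows with a missing value in the `PAINS?` column can then be reviewed or dropped separately, rather than silently counted as passing the filter.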
