Python wrapper to ease the calculation of PaDEL molecular descriptors and fingerprints.
Olivier J. M. Béquignon is neither the copyright holder of PaDEL nor responsible for it.
The work carried out here concerns:
- the Python wrapper,
- the ePaDEL executable,
- the extendedlibpadeldescriptor library.
From source:
git clone https://github.com/OlivierBeq/PaDEL_pywrapper.git
pip install ./PaDEL_pywrapper
With pip:
pip install padel-pywrapper
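After either installation route, one may want to confirm that the package is importable before running the examples. The following is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of the wrapper.

```python
from importlib.util import find_spec


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located by the import system."""
    # find_spec returns None when a top-level module cannot be found
    return find_spec(module_name) is not None
```

Calling `is_installed("PaDEL_pywrapper")` should then return `True` once the installation has succeeded.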
Descriptors of the module PaDEL_pywrapper.descriptor can be computed as follows:
from PaDEL_pywrapper import PaDEL
from PaDEL_pywrapper.descriptor import ALOGP, Crippen, FMF
from rdkit import Chem
smiles_list = [
# erlotinib
"n1cnc(c2cc(c(cc12)OCCOC)OCCOC)Nc1cc(ccc1)C#C",
# midecamycin
"CCC(=O)O[C@@H]1CC(=O)O[C@@H](C/C=C/C=C/[C@@H]([C@@H](C[C@@H]([C@@H]([C@H]1OC)O[C@H]2[C@@H]([C@H]([C@@H]([C@H](O2)C)O[C@H]3C[C@@]([C@H]([C@@H](O3)C)OC(=O)CC)(C)O)N(C)C)O)CC=O)C)O)C",
# selenofolate
"C1=CC(=CC=C1C(=O)NC(CCC(=O)OCC[Se]C#N)C(=O)O)NCC2=CN=C3C(=N2)C(=O)NC(=N3)N",
]
mols = [Chem.MolFromSmiles(smiles) for smiles in smiles_list]
descriptors = [ALOGP, Crippen, FMF]
padel = PaDEL(descriptors)
print(padel.calculate(mols))
Instances of descriptors can be supplied as well.
descriptors = [ALOGP(), Crippen(), FMF()]
padel = PaDEL(descriptors)
print(padel.calculate(mols))
To calculate all possible descriptors, import the descriptors list directly from the module PaDEL_pywrapper:
from PaDEL_pywrapper import descriptors
padel = PaDEL(descriptors)
print(padel.calculate(mols))
By default, the ignore_3D parameter is set to True, which prevents any supplied 3D descriptor from being calculated.
If the molecules have 3D coordinates, 3D descriptor calculation can be turned on.
from rdkit.Chem import AllChem
mols = [Chem.AddHs(mol) for mol in mols]
_ = [AllChem.EmbedMolecule(mol) for mol in mols]
padel = PaDEL(descriptors, ignore_3D=False)
print(padel.calculate(mols))
# A molecule without embedded 3D coordinates cannot be used with ignore_3D=False
mol = Chem.MolFromSmiles('CCC')
padel = PaDEL(descriptors, ignore_3D=False)
print(padel.calculate([mol]))
# ValueError: Cannot calculate descriptors for a conformer-less molecule
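To avoid the error above on a mixed set of molecules, conformer-less entries can be separated out before the calculation. The sketch below is an assumption-laden illustration, not part of the wrapper: the helper name `split_by_conformer` is hypothetical, and it relies only on each object exposing a `GetNumConformers()` method, as RDKit molecules do.

```python
from typing import Iterable, List, Tuple


def split_by_conformer(mols: Iterable) -> Tuple[List, List]:
    """Separate molecules that carry at least one conformer from those that do not.

    Each item only needs a GetNumConformers() method, as RDKit Mol objects have.
    """
    with_3d, without_3d = [], []
    for mol in mols:
        if mol.GetNumConformers() > 0:
            with_3d.append(mol)
        else:
            without_3d.append(mol)
    return with_3d, without_3d
```

With the objects defined earlier, `ready, missing = split_by_conformer(mols)` followed by `padel.calculate(ready)` would restrict the 3D calculation to molecules that have coordinates, leaving `missing` to be embedded or handled separately.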
Fingerprints of the module PaDEL_pywrapper.descriptor can be computed as follows:
from PaDEL_pywrapper.descriptor import GraphOnlyFP
fp = GraphOnlyFP
padel = PaDEL([fp], ignore_3D=False)
print(padel.calculate(mols))
Custom parameter sets can be provided for some fingerprints:
fp = GraphOnlyFP(size=2048, searchDepth=8)
padel = PaDEL([fp], ignore_3D=False)
print(padel.calculate(mols))
class PaDEL:
    ...
    def calculate(self, mols: Iterable[Chem.Mol], show_banner: bool = True, njobs: int = 1, chunksize: int = 100):

- mols : Iterable[Chem.Mol]
  RDKit molecule objects for which to obtain PaDEL descriptors.
- show_banner : bool
  Whether to display the default notice about PaDEL descriptors.
- njobs : int
  Maximum number of simultaneous processes. Ignored if self.descriptors are instances and not class names.
- chunksize : int
  Maximum number of molecules each process is responsible for. Ignored if self.descriptors are instances and not class names.
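The interplay of njobs and chunksize can be pictured with a small standalone sketch. It does not reproduce the wrapper's internal code: the helper names `all_classes` and `chunked` are hypothetical. It only illustrates how a caller might check that a descriptor list contains classes alone (the case in which, per the notes above, njobs and chunksize are taken into account) and what splitting molecules into batches of at most chunksize items looks like.

```python
import inspect
from itertools import islice
from typing import Iterable, Iterator, List


def all_classes(descriptors: Iterable) -> bool:
    """Return True if every entry is a class rather than an instance."""
    return all(inspect.isclass(d) for d in descriptors)


def chunked(items: Iterable, chunksize: int) -> Iterator[List]:
    """Yield successive lists holding at most `chunksize` items each."""
    if chunksize < 1:
        raise ValueError("chunksize must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, chunksize))
        if not chunk:
            return
        yield chunk
```

For instance, 250 molecules with chunksize=100 would form three batches of 100, 100 and 50 molecules, so an njobs value above 3 would bring no further benefit under this picture.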
Details about each descriptor and fingerprint can be obtained as follows:
print(ALOGP.description)
print(GraphOnlyFP.description)
For full details about all descriptors, one can obtain the path to the original Excel file of the PaDEL descriptors with:
print(padel.details)