# Chapter 4: Compound Screening - Interactive Notebook

This notebook demonstrates code and concepts from [Chapter 4: Compound Screening](../chapters/chapter4-compound-screening.qmd) of the book.

You can run and modify the code cells below to explore compound screening hands-on.

## Install and Import Libraries

We'll use PubChemPy and the ChEMBL web resource client to search for compounds. If they are not installed, uncomment and run the pip install line below.

In [15]:
#!pip install pubchempy chembl_webresource_client pandas

In [16]:
import pubchempy as pcp
from chembl_webresource_client.new_client import new_client
import pandas as pd

## Search PubChem for Aspirin

Let's search PubChem for aspirin and display its canonical SMILES.

In [17]:
compounds = pcp.get_compounds('aspirin', 'name')
for c in compounds:
    print(f"CID: {c.cid}, Canonical SMILES: {c.canonical_smiles}")

CID: 2244, Canonical SMILES: CC(=O)OC1=CC=CC=C1C(=O)O



## Search ChEMBL for Approved Drugs

Let's query ChEMBL for approved drugs (`max_phase=4`) and display the ChEMBL IDs and preferred names of the first five results.

In [18]:
molecule = new_client.molecule
approved_drugs = molecule.filter(max_phase=4)
for drug in approved_drugs[:5]:
    print(drug['molecule_chembl_id'], drug['pref_name'])

CHEMBL2 PRAZOSIN
CHEMBL3 NICOTINE
CHEMBL4 OFLOXACIN
CHEMBL5 NALIDIXIC ACID
CHEMBL6 INDOMETHACIN
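
The notebook imports pandas, but the loop above only prints. As a rough sketch of how such records could be tabulated, the snippet below builds a `DataFrame` from a list of dictionaries. To stay runnable offline, the five records are copied from the output above rather than fetched. With a live query, passing `list(approved_drugs[:5])` in place of `records` should behave the same way, on the assumption that the client returns dictionary-like records, as the indexing in the loop above suggests.

```python
import pandas as pd

# Records copied from the cell output above, shaped like the dictionaries
# that the loop indexes with drug['molecule_chembl_id'] and drug['pref_name'].
records = [
    {"molecule_chembl_id": "CHEMBL2", "pref_name": "PRAZOSIN"},
    {"molecule_chembl_id": "CHEMBL3", "pref_name": "NICOTINE"},
    {"molecule_chembl_id": "CHEMBL4", "pref_name": "OFLOXACIN"},
    {"molecule_chembl_id": "CHEMBL5", "pref_name": "NALIDIXIC ACID"},
    {"molecule_chembl_id": "CHEMBL6", "pref_name": "INDOMETHACIN"},
]

# Keep only the two fields of interest, in a fixed column order.
df = pd.DataFrame(records, columns=["molecule_chembl_id", "pref_name"])
print(df.to_string(index=False))
```

Selecting columns explicitly matters more with live data, where each record carries many additional fields.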


## Link to Book Chapter

For more details and explanations, see [Chapter 4: Compound Screening](../chapters/chapter4-compound-screening.qmd) in the book.

## Drug-likeness Filtering (Lipinski's Rule of Five)

Let's filter compounds by Lipinski's Rule of Five using RDKit. The rule treats a molecule as drug-like (likely to be orally bioavailable) when its molecular weight is at most 500 Da, its LogP is at most 5, and it has at most 5 H-bond donors and at most 10 H-bond acceptors.
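
The rule itself is a set of threshold comparisons, so it can be checked without any cheminformatics library once the descriptors are known. The following is a minimal sketch, not the book's `lipinski_filter` (whose implementation is not shown in this notebook). The function name `rule_of_five_violations` is illustrative. The aspirin values are the ones reported in this notebook's RDKit output, and the second descriptor set is invented purely to show a failing case.

```python
# Standard Rule of Five cutoffs (upper bounds, inclusive).
RO5_LIMITS = {"MW": 500.0, "LogP": 5.0, "HBD": 5, "HBA": 10}


def rule_of_five_violations(props):
    """Return the names of the descriptors that exceed their Rule of Five cutoff."""
    return [name for name, limit in RO5_LIMITS.items() if props[name] > limit]


# Aspirin descriptors as printed in this notebook's RDKit output.
aspirin = {"MW": 180.159, "LogP": 1.3101, "HBD": 1, "HBA": 3}
# Hypothetical descriptor set, invented only to trigger violations.
made_up = {"MW": 612.4, "LogP": 6.2, "HBD": 2, "HBA": 8}

for label, props in [("aspirin", aspirin), ("made-up compound", made_up)]:
    violations = rule_of_five_violations(props)
    print(f"{label}: {len(violations)} violation(s) {violations}")
```

Returning the list of violated descriptors rather than a single boolean leaves the pass criterion to the caller. Some filters require zero violations and others tolerate one; which convention `lipinski_filter` follows is not stated here.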

In [19]:
# !pip install rdkit pandas pubchempy
import pandas as pd
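# lipinski_filter is a helper from a local tools/chem_utils.py module (assumed to be part of the book's repository).
# Run the notebook from a directory where `tools` is importable, or add that directory to sys.path first.
from tools.chem_utils import lipinski_filter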

# Example: Filter a list of SMILES
smiles_list = [
    'CC(=O)OC1=CC=CC=C1C(=O)O',  # Aspirin
    'CC(=O)NC1=CC=C(O)C=C1',     # Paracetamol
    'CCCCCCCCCCCCCCCC(=O)O'      # Palmitic acid (not drug-like)
]
for smi in smiles_list:
    print(f'{smi}: Lipinski drug-like? {lipinski_filter(smi)}')

ModuleNotFoundError: No module named 'tools'

## ADMET Property Calculation (Local, RDKit)

Let's calculate basic ADMET-relevant properties locally using RDKit. We'll compute molecular weight, LogP, H-bond donors/acceptors, rotatable bonds, and topological polar surface area (TPSA).

In [None]:
from tools.chem_utils import calc_admet_properties

# Example: Calculate ADMET properties for aspirin
smiles = 'CC(=O)OC1=CC=CC=C1C(=O)O'
props = calc_admet_properties(smiles)
if props:
    print('RDKit ADMET properties for aspirin:')
    for k, v in props.items():
        print(f'{k}: {v}')
else:
    print('Could not calculate properties.')

RDKit ADMET properties for aspirin:
MW: 180.15899999999996
LogP: 1.3101
HBD: 1
HBA: 3
RotB: 2
TPSA: 63.60000000000001
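
If `tools.chem_utils` is not importable in your environment, the same six descriptors can be computed directly with RDKit. The sketch below is a stand-in, assuming that descriptors from `rdkit.Chem.Descriptors`, `Crippen`, and `Lipinski` are an acceptable substitute. The function name `admet_descriptors` is hypothetical, and nothing here is confirmed to match how `calc_admet_properties` is implemented. Only the dictionary keys are chosen to mirror the output above.

```python
from rdkit import Chem
from rdkit.Chem import Crippen, Descriptors, Lipinski


def admet_descriptors(smiles):
    """Return basic ADMET-relevant descriptors, or None if the SMILES cannot be parsed."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:  # RDKit returns None for SMILES it cannot parse
        return None
    return {
        "MW": Descriptors.MolWt(mol),             # average molecular weight
        "LogP": Crippen.MolLogP(mol),             # Wildman-Crippen LogP estimate
        "HBD": Lipinski.NumHDonors(mol),          # H-bond donors
        "HBA": Lipinski.NumHAcceptors(mol),       # H-bond acceptors
        "RotB": Lipinski.NumRotatableBonds(mol),  # rotatable bonds
        "TPSA": Descriptors.TPSA(mol),            # topological polar surface area
    }


props = admet_descriptors("CC(=O)OC1=CC=CC=C1C(=O)O")  # aspirin
for name, value in props.items():
    # Rounding hides floating-point noise such as 180.15899999999996.
    print(f"{name}: {round(value, 2)}")
```

Returning `None` for unparseable input keeps the `if props:` guard used in the cell above meaningful.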


## ADMET Prediction with SwissADME (Web Tool)

For more advanced ADMET predictions, you can use the free SwissADME web tool. Paste your SMILES into https://www.swissadme.ch/ to get a report covering physicochemical properties, lipophilicity, water solubility, pharmacokinetics, drug-likeness, and medicinal chemistry friendliness.

Example workflow:
1. Copy the SMILES string (e.g., for aspirin: `CC(=O)OC1=CC=CC=C1C(=O)O`).
2. Go to [SwissADME](https://www.swissadme.ch/).
3. Paste the SMILES and submit.
4. Review the ADMET and drug-likeness results.
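
When several compounds need to be submitted at once, step 1 can be scripted. The sketch below formats name and SMILES pairs as plain text, assuming the SwissADME input box accepts one SMILES per line with an optional name after a space. Check the site's own input instructions before relying on this layout. The helper name `to_smiles_lines` is illustrative.

```python
compounds = [
    ("Aspirin", "CC(=O)OC1=CC=CC=C1C(=O)O"),
    ("Palmitic acid", "CCCCCCCCCCCCCCCC(=O)O"),
]


def to_smiles_lines(named_smiles):
    """Format (name, SMILES) pairs as 'SMILES name' lines, one compound per line."""
    lines = []
    for name, smiles in named_smiles:
        # Replace whitespace so the name stays a single token after the SMILES.
        safe_name = "_".join(name.split())
        lines.append(f"{smiles} {safe_name}")
    return "\n".join(lines)


# Copy the printed text into the web form.
print(to_smiles_lines(compounds))
```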

## Compound Prioritization Example

Let's combine drug-likeness and ADMET filters to prioritize compounds for synthesis or further study.

In [None]:
# Example: Prioritize a list of compounds using local RDKit properties
compound_data = [
    {'name': 'Aspirin', 'smiles': 'CC(=O)OC1=CC=CC=C1C(=O)O'},
    {'name': 'Paracetamol', 'smiles': 'CC(=O)NC1=CC=C(O)C=C1'},
    {'name': 'Palmitic acid', 'smiles': 'CCCCCCCCCCCCCCCC(=O)O'}
]

prioritized = []
for c in compound_data:
    if lipinski_filter(c['smiles']):
        props = calc_admet_properties(c['smiles'])
        if props and props['TPSA'] < 90:  # Example: prioritize low TPSA (likely BBB permeable)
            prioritized.append(c['name'])

print('Prioritized compounds (Lipinski + TPSA < 90):', prioritized)

Prioritized compounds (Lipinski + TPSA < 90): ['Aspirin', 'Paracetamol']
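
A filter yields an unordered shortlist, and in practice the survivors usually need ranking as well. The sketch below works on records that already carry their descriptors, so it runs without RDKit. The `drug_like` flags follow the result printed above, aspirin's TPSA is taken from the RDKit output earlier in this notebook, and the other two TPSA values are approximate figures supplied for illustration rather than results from this notebook. The function name `rank_by_tpsa` is hypothetical.

```python
def rank_by_tpsa(candidates, max_tpsa=90.0):
    """Keep drug-like candidates below the TPSA cutoff, sorted by ascending TPSA."""
    kept = [c for c in candidates if c["drug_like"] and c["TPSA"] < max_tpsa]
    return sorted(kept, key=lambda c: c["TPSA"])


candidates = [
    {"name": "Aspirin", "drug_like": True, "TPSA": 63.6},          # TPSA from the notebook output
    {"name": "Paracetamol", "drug_like": True, "TPSA": 49.33},     # illustrative value
    {"name": "Palmitic acid", "drug_like": False, "TPSA": 37.3},   # illustrative value
]

for rank, c in enumerate(rank_by_tpsa(candidates), start=1):
    print(f"{rank}. {c['name']} (TPSA {c['TPSA']})")
```

Palmitic acid has the lowest TPSA of the three but is still excluded, which shows that the drug-likeness flag acts as a gate before any ranking takes place.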


### References and Further Reading
- [ChEMBL](https://www.ebi.ac.uk/chembl/) [@chembl]
- [PubChem](https://pubchem.ncbi.nlm.nih.gov) [@pubchem]