# This notebook is for playing around with OpenEye Toolkits

In [1]:
import os
os.environ['OE_LICENSE'] = '/home/ian/oe_license.txt'

See [this guide](https://docs.eyesopen.com/toolkits/python/oechemtk/molreadwrite.html#chapter-molreadwrite) for more info

Listing 1: High-level Molecule I/O using molstreams

In [1]:
from openeye import oechem

ifs = oechem.oemolistream()
ofs = oechem.oemolostream()

mol = oechem.OEGraphMol()

while oechem.OEReadMolecule(ifs, mol):
    oechem.OEWriteMolecule(ofs, mol)

In [None]:
from openeye import oechem

ifs = oechem.oemolistream()
ofs = oechem.oemolostream()
# Generator methods for reading molecules
for mol in ifs.GetOEGraphMols():
    oechem.OEWriteMolecule(ofs, mol)

Listing 2: Reading molecules into memory

In [None]:
ifs = oechem.oemolistream()
mollist = []

for mol in ifs.GetOEGraphMols():
    mollist.append(oechem.OEGraphMol(mol))

Listing 3: Explicitly setting file formats

In [None]:
from openeye import oechem

ifs = oechem.oemolistream()
ofs = oechem.oemolostream()

# This will convert from SDF to PDB
ifs.SetFormat(oechem.OEFormat_SDF)
ofs.SetFormat(oechem.OEFormat_PDB)

# There's many OEFormats to choose from: https://docs.eyesopen.com/toolkits/python/oechemtk/OEChemConstants/OEFormat.html#OEChem::OEFormat::PDB

for mol in ifs.GetOEGraphMols():
    oechem.OEWriteMolecule(ofs, mol)

Listing 4: Reading and writing molecule files

In [2]:
from openeye import oechem

ifs = oechem.oemolistream()
ofs = oechem.oemolostream()

if ifs.open("./molecules_bx_2024_06_18_154410.sdf"):
    if ofs.open("molecules_bx_2024_06_18_154410.pdb"):
        for mol in ifs.GetOEGraphMols():
            oechem.OEWriteMolecule(ofs, mol)
    else:
        oechem.OEThrow.Fatal("Unable to create 'molecules_bx_2024_06_18_154410.pdb'")
else:
    oechem.OEThrow.Fatal("Unable to open 'molecules_bx_2024_06_18_154410.sdf'")

Listing 5: Reading and writing molecule from memory buffers

In [None]:
from openeye import oechem


smiles = '''\
CCO
c1cnccc1'''

ims = oechem.oemolistream()
ims.SetFormat(oechem.OEFormat_SMI)
ims.openstring(smiles)

mols = []
mol = oechem.OEMol()
for mol in ims.GetOEMols():
    mols.append(oechem.OEMol(mol))

oms = oechem.oemolostream()
oms.SetFormat(oechem.OEFormat_SDF)
oms.openstring()

for mol in mols:
    oechem.OEWriteMolecule(oms, mol)

molfile = oms.GetString()
print("MOL string\n", molfile.decode('UTF-8'))

Listing 6: Reading and writing compressed molecule files

In [None]:
from openeye import oechem

# example files - these don't exist!
ifs = oechem.oemolistream("input.sdf.gz")
ofs = oechem.oemolostream("output.oeb.gz")

for mol in ifs.GetOEGraphMols():
    oechem.OEWriteMolecule(ofs, mol)

# Command Line Format Control
Using the methods outlined above, it is possible to allow the stream format to be controlled from the command line. OEChem TK’s oemolstreams control the format by interpreting the input and output file names.

**Listing 7: Controlling File Format from the Command Line**

In [None]:
from openeye import oechem
import sys

if len(sys.argv) != 3:
    oechem.OEThrow.Usage("%s <input> <output>" % sys.argv[0])

ifs = oechem.oemolistream()
ofs = oechem.oemolostream()

if not ifs.open(sys.argv[1]):
    oechem.OEThrow.Fatal("Unable to open %s" % sys.argv[1])

if not ofs.open(sys.argv[2]):
    oechem.OEThrow.Fatal("Unable to create %s" % sys.argv[2])

for mol in ifs.GetOEGraphMols():
    oechem.OEWriteMolecule(ofs, mol)

Listing 8: Controlling standard in and standard out File Format

In [None]:
from openeye import oechem

ifs = oechem.oemolistream(".sdf")
ofs = oechem.oemolostream(".mol2")

for mol in ifs.GetOEGraphMols():
    oechem.OEWriteMolecule(ofs, mol)

The `oemolstreambase.SetFlavor` method takes two unsigned integer arguments, the first is the format for which the flavor is being specified and the second is the flavor itself. The formats are specified as discussed in [File Formats](https://docs.eyesopen.com/toolkits/python/oechemtk/molreadwrite.html#section-molreadwrite-fileformats). The input flavors are specified in the `OEIFlavor` namespace and the output flavors are specified in the `OEOFlavor` namespace. Unlike the formats, the flavors are a bitmask and may be binary OR’d together. Under the `OEIFlavor` and `OEOFlavor` namespaces, there is a namespace for each format as well as a `OEIFlavor_Generic` namespace. The `OEOFlavor_Generic` namespace is used to control aromaticity perception and other properties common to all formats. To completely specify a flavor, one would typically binary-OR a `OEOFlavor_Generic` flag and a format specific flag and pass the resultant value to `oemolstreambase.SetFlavor`.
The default behavior for the `PDB reader` is that `TER` specifies the termination of a disconnected fragment within the same molecule while `END` specifies the termination of a connection table. However, some users may want to have the reader split `PDB` input files into different molecules every time a `TER` appears.

The following code is an example of changing the `PDB` reader flavor.

Listing 9: Changing oemolistream Reader Flavor

In [None]:
from openeye import oechem

ifs = oechem.oemolistream('input.pdb')
ofs = oechem.oemolostream('output.mol2')

flavor = oechem.OEIFlavor_Generic_Default | oechem.OEIFlavor_PDB_Default | oechem.OEIFlavor_PDB_TER
ifs.SetFlavor(oechem.OEFormat_PDB, flavor)

for mol in ifs.GetOEGraphMols():
    oechem.OEWriteMolecule(ofs, mol)

# Documentations for OEDock
See [here](https://docs.eyesopen.com/toolkits/python/dockingtk/theory/docking.html#chapter-docking) for more

# Initialization
An OEDock object must be initialized with a receptor object, prior to docking, scoring or annotating any molecules. This is done by passing a `OEDesignUnit` containing a receptor (see Receptors) to the `OEDock.`Initialize method.

`OEDesignUnit` derives from the [`OEBase` class](https://docs.eyesopen.com/toolkits/python/oechemtk/OESystemClasses/OEBase.html#OESystem::OEBase)
The abstract class OEBase defines the interface for run-time class extensibility and run-time type identification. Classes which derive from OEBase can store and retrieve data by association with integer or character string ‘tag’ identifiers.

OEGraphMol: https://docs.eyesopen.com/toolkits/python/oechemtk/molctordtor.html#chapter-molctordtor

Create a molecule

In [None]:
from openeye import oechem

mol = oechem.OEGraphMol()

Destroy a molecule

In [None]:
from openeye import oechem

mol = oechem.OEGraphMol()
del mol

Creating a molecule from a SMILES string

In [None]:
from openeye import oechem

# create a new molecule
mol = oechem.OEGraphMol()

# convert the SMILES string into a molecule
oechem.OESmilesToMol(mol, "c1ccccc1")

# Splitting Macromolecular Complexes (this method so far works the best, could use this as a second choice for ligand extraction)
See [here](https://docs.eyesopen.com/toolkits/python/oechemtk/proteinprep.html#splitting-macromolecular-complexes)
Listing 8: Splitting an Input Molecule

In [12]:
import openeye.oechem as oechem

# Function to read a molecular complex/ligand/protein from a file
def read_molecule(filename):
    ifs = oechem.oemolistream()
    if not ifs.open(filename):
        oechem.OEThrow.Fatal("Unable to open file %s" % filename)
    mol = oechem.OEGraphMol()
    oechem.OEReadMolecule(ifs, mol)
    ifs.close()
    return mol

# Function to split the complex into its components
def split_molecular_complex(mol):
    protein = oechem.OEGraphMol()
    ligand = oechem.OEGraphMol()
    water = oechem.OEGraphMol()
    other = oechem.OEGraphMol()

    # Split the complex
    oechem.OESplitMolComplex(ligand, protein, water, other, mol)

    return protein, ligand, water, other

def oe_split_complex(input_filename, output_basename):
    # Read the complex
    complex_molecule = read_molecule(input_filename)

    # Split the complex
    protein, ligand, water, other = split_molecular_complex(complex_molecule)

    # Write the components to files
    oechem.OEWriteMolecule(oechem.oemolostream(output_basename + "_protein.pdb"), protein)
    oechem.OEWriteMolecule(oechem.oemolostream(output_basename + "_ligand.pdb"), ligand)
    oechem.OEWriteMolecule(oechem.oemolostream(output_basename + "_water.pdb"), water)
    oechem.OEWriteMolecule(oechem.oemolostream(output_basename + "_other.pdb"), other)

# Example usage
input_filename = "./Mpro-x0072_0.pdb"  # Replace with your file path
output_basename = "./Mpro-x0072_0"  # Replace with your desired output file basename
oe_split_complex(input_filename, output_basename)

# Prepping a ligand/protein for docking (it is better to use Obabel for prepping ligands!)

In [14]:
import openeye.oechem as oechem

# Function to write the prepared ligand to a file
def write_molecule(molecule, output_filename):
    ofs = oechem.oemolostream()
    if not ofs.open(output_filename):
        oechem.OEThrow.Fatal("Unable to open file %s" % output_filename)
    oechem.OEWriteMolecule(ofs, molecule)
    ofs.close()

# Main function
def oe_prepare_ligand(input_filename, output_filename):
    # Read the ligand
    ligand = read_molecule(input_filename)

    # Add hydrogens
    oechem.OEPlaceHydrogens(ligand)

    # Write the prepared ligand
    write_molecule(ligand, output_filename)

# Example usage
input_filename = "molecules_bx_2024_06_18_154410.sdf"  # Replace with your input file path
output_filename = "molecules_bx_2024_06_18_154410_with_hydrogens.pdb"  # Replace with your output file path
oe_prepare_ligand(input_filename, output_filename)

In [17]:
# the same function could be used for prepping proteins
input_protein = "Mpro-x0072_0_protein.pdb"
output_protein = "Mpro-x0072_0_protein_with_hydrogens.pdb"
oe_prepare_ligand(input_protein, output_protein)

In [21]:
os.system("python3 ./OpenEye/make_design_units.py Mpro-x0072_0.pdb")

DPI: 0.12, RFree: 0.23, Resolution: 1.65
Processing BU # 1 with title: ---, chains AB, alt: A
Processing BU # 2 with title: ---, chains AB, alt: B
Skipping redundant DU with alts outside the site of interest, renaming existing to collapse alts
Discarding redundant alt DU with title ---(AB)altB > LIG(A-1101)
Skipping redundant DU with alts outside the site of interest, renaming existing to collapse alts
Discarding redundant alt DU with title ---(AB)altB > LIG(B-1101)
Superposition - RMSD: 0.00, Ref: , Fit: , SeqScore: 3102


0

# Reading a Design Unit 
See [here](https://docs.eyesopen.com/toolkits/python/oechemtk/OEBioFunctions/OEReadDesignUnit.html#OEBio::OEReadDesignUnit)

In [28]:
from openeye import oechem
du = oechem.OEDesignUnit()
oechem.OEReadDesignUnit('Mpro-x0072_0_DU_0_receptor.oedu', du)

True

In [None]:
# make design units for complexes
!python ./OpenEye/make_design_units.py "./Mpro-x0072_0.pdb"

# Docking Protocol
See [here](https://docs.eyesopen.com/toolkits/python/dockingtk/theory/docking.html#chapter-docking) for a guide

## Initialization
An OEDock object must be initialized with a receptor object, prior to docking, scoring or annotating any molecules. This is done by passing a OEDesignUnit containing a receptor (see Receptors) to the OEDock.Initialize method.

In [None]:
!python ./OpenEye/MakeReceptor.py -in ./Mpro-x0072_0_DU_0.oedu -out Mpro-x0072_0_DU_0_receptor.oedu

In [50]:
# Import necessary OpenEye modules
from openeye.oechem import *
from openeye.oeomega import *
from openeye.oedocking import *

# Initialize OEChem
# oechem.OEThrow.SetLevel(OEErrorLevel.Warning)
# oechem.OEThrow.SetOutputStream(OEFixedAtomHydrogenHandling)
# oedocking.OEDock.SetVerbose(True)

# Load receptor from a file
receptor_filename = "Mpro-x0072_0_DU_0_receptor.oedu"
# receptor = OEGraphMol()
# if not oemolistream(receptor_filename, receptor):
#     raise RuntimeError(f"Unable to open receptor file: {receptor_filename}")

# Initialize the docking
dock = OEDock()
dock.Initialize(receptor)
dock.IsInitialized()
# Save the initialized receptor if needed
# initialized_receptor_filename = "initialized_receptor.oedu"
# ofs = oemolostream(initialized_receptor_filename)
# oechem.OEWriteMolecule(ofs, receptor)
# ofs.close()

# print("Receptor initialized for docking successfully.")


False

<openeye.oechem.OEDesignUnit; proxy of <Swig Object of type 'std::vector< OEBio::OEDesignUnit * >::value_type' at 0x72b29c409a10> >

In [33]:
receptor_du = oechem.OEDesignUnit()
oechem.OEReadDesignUnit('Mpro-x0072_0_DU_0_receptor.oedu', receptor_du)
OEDock(receptor_du)
# Initialize()

NameError: name 'OEDock' is not defined

## Docking Molecules
Once the OEDock object has been initialized molecules are docked using the OEDock.DockMultiConformerMolecule method.

Docking requires a multiconformer representation of the molecule as input. Docking selects the top scoring docked pose from the provided ensemble of conformers. The score of the docked molecule can be obtained by calling the OEMolBase.GetEnergy method of pose.

OEDock can also return alternate as well as top scoring poses of the docked molecule.

In [54]:
# Import necessary OpenEye modules
from openeye.oechem import *
from openeye.oedocking import *

# Initialize OEChem
# oechem.OEThrow.SetLevel(OEErrorLevel.Warning)

# Load docked molecules from a file
docked_molecules_filename = "molecules_bx_2024_06_18_154410.sdf"
ifs = oemolistream(docked_molecules_filename)
docked_molecules = []
for mol in ifs.GetOEGraphMols():
    docked_molecules.append(OEGraphMol(mol))

ifs.close()

# Retrieve and print the docking score for each molecule
for i, mol in enumerate(docked_molecules):
    docking_score = mol.GetEnergy()
    print(f"Molecule {i+1} docking score: {docking_score:.2f}")

print("Docking scores retrieved successfully.")


Molecule 1 docking score: 0.00
Molecule 2 docking score: 0.00
Molecule 3 docking score: 0.00
Molecule 4 docking score: 0.00
Molecule 5 docking score: 0.00
Molecule 6 docking score: 0.00
Molecule 7 docking score: 0.00
Molecule 8 docking score: 0.00
Molecule 9 docking score: 0.00
Molecule 10 docking score: 0.00
Molecule 11 docking score: 0.00
Molecule 12 docking score: 0.00
Molecule 13 docking score: 0.00
Molecule 14 docking score: 0.00
Docking scores retrieved successfully.
