I have a series of protein structures (over 100) which are dimers, with ligands in both chains.  

**I would like to create design units where the ligand is in 1) chain A and not chain B, 2) in chain B and not in chain A.** However, creating a design unit doesn't allow me to specify which chain the ligand resides in. 

If you can advise on how to do this, I will be very grateful.  


**Failing that, I would like to determine programmatically which chain the ligand is in after the design unit has been created.** 

I've included the protein 'Mpro-P0208.pdb' as an example. The site residues (which I define as those within 4 angstroms of the ligand) in chain A are 41, 49, 140-142, 144-145, 163-166 and 187-189. Almost all of those residues are also site residues in chain B. The structure also contains alternate locations, and the `OEMakeDesignUnits` function therefore creates 2 design units. 

Calling `GetSiteResidues` on each design unit gives the same residues, but in a different chain for each design unit. As I do not know how the ligand or chain is chosen or renamed by the `OEMakeDesignUnits` function, I don't know how to determine which chain the ligand is in. See the code below: 

```python
from openeye import oespruce, oechem

from glob import glob
```

Make design units from PDB:

```python
pdb = 'Mpro-P0208.pdb'

# Redirect toolkit messages to an in-memory stream
errfs = oechem.oeosstream()
oechem.OEThrow.SetOutputStream(errfs)
oechem.OEThrow.Clear()
oechem.OEThrow.SetLevel(oechem.OEErrorLevel_Verbose) # capture verbose error output

# Read the PDB file, keeping header data and alternate locations
ifs = oechem.oemolistream(pdb)
ifs.SetFlavor(oechem.OEFormat_PDB, oechem.OEIFlavor_PDB_Default | oechem.OEIFlavor_PDB_DATA | oechem.OEIFlavor_PDB_ALTLOC)  # noqa
mol = oechem.OEGraphMol()
flag = oechem.OEReadMolecule(ifs, mol)
ifs.close()

opts = oespruce.OEMakeDesignUnitOptions()
opts.GetSplitOptions().SetMinLigAtoms(7) # minimum fragment size (in heavy atoms)
opts.GetPrepOptions().SetStrictProtonationMode(True)
opts.GetPrepOptions().GetBuildOptions().SetCapNTermini(False)
opts.GetPrepOptions().GetBuildOptions().SetCapCTermini(False)
opts.GetPrepOptions().GetBuildOptions().SetBuildLoops(True)
opts.GetPrepOptions().GetBuildOptions().SetBuildSidechains(True)
opts.GetPrepOptions().GetBuildOptions().GetCapBuilderOptions().SetAllowTruncate(False)
metadata = oespruce.OEStructureMetadata()

dus = list(oespruce.OEMakeDesignUnits(mol, metadata, opts))
```


Getting the site residues for each design unit seems to indicate that the site is different in each design unit: 

```python
for i, du in enumerate(dus):
    p1 = oechem.OEGraphMol()
    l1 = oechem.OEGraphMol()
    du.GetLigand(l1)
    du.GetProtein(p1)

    print(du.GetSiteResidues()[:5])

    with oechem.oemolostream(f'protein-{i}.pdb') as ofs:
        oechem.OEWriteMolecule(ofs, oechem.OEGraphMol(p1))
    with oechem.oemolostream(f'ligand-{i}.pdb') as ofs:
        oechem.OEWriteMolecule(ofs, oechem.OEGraphMol(l1))
```

```
['SER:1: :A:1: ', 'HIS:41: :B:2: ', 'TYR:54: :B:2: ', 'PHE:140: :B:2: ', 'LEU:141: :B:2: ']
['HIS:41: :A:1: ', 'MET:49: :A:1: ', 'PHE:140: :A:1: ', 'LEU:141: :A:1: ', 'ASN:142: :A:1: ']
```
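One way to read the chain off the output above is to tally the chain IDs in the site-residue strings. This is only a minimal sketch: it assumes the chain ID is the fourth colon-separated field (an inference from the printed strings, not from the toolkit documentation), and `site_chain_counts` and `majority_chain` are hypothetical helper names.

```python
from collections import Counter


def site_chain_counts(site_residues):
    """Count how many site residues fall in each chain.

    Assumes entries look like 'HIS:41: :B:2: ', with the chain ID in the
    fourth colon-separated field (inferred from the printed output).
    """
    return Counter(res.split(":")[3].strip() for res in site_residues)


def majority_chain(site_residues):
    """Return the chain ID that contributes the most site residues."""
    return site_chain_counts(site_residues).most_common(1)[0][0]


# The first five site residues printed for each design unit above.
du0_site = ['SER:1: :A:1: ', 'HIS:41: :B:2: ', 'TYR:54: :B:2: ',
            'PHE:140: :B:2: ', 'LEU:141: :B:2: ']
du1_site = ['HIS:41: :A:1: ', 'MET:49: :A:1: ', 'PHE:140: :A:1: ',
            'LEU:141: :A:1: ', 'ASN:142: :A:1: ']

print(majority_chain(du0_site), majority_chain(du1_site))
```

In practice this would be called on the full `du.GetSiteResidues()` list rather than the first five entries. It only reports which chain label dominates the site in that design unit; if chains are relabelled or moved during preparation, that label need not match the chain in the deposited PDB file.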


However, writing out the proteins shows that it is essentially the same site: either the physical location of the chains has changed or the chain labels have changed. Below I'm showing the backbone nitrogen atom of HIS 41 in each file:  

```
!grep 'ATOM.*N   HIS [AB]  41' protein-0.pdb
```

```
ATOM    303  N   HIS A  41     -27.069  -2.718  28.828  1.00 35.79           N
ATOM   2671  N   HIS B  41      12.500 -12.615  -5.472  1.00 40.85           N
```


```
!grep 'ATOM.*N   HIS [AB]  41' protein-1.pdb
```

```
ATOM    303  N   HIS A  41      12.638 -12.480  -5.591  1.00 35.79           N
ATOM   2671  N   HIS B  41     -27.048  -2.689  28.604  1.00 40.85           N
```
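The comparison above can be made quantitative by measuring the distances between the HIS 41 backbone nitrogens across the two files. Below is a minimal sketch using only the standard library and the fixed-column PDB layout (x, y, z in columns 31-54). `parse_xyz` is a hypothetical helper, and the four records are copied from the grep output.

```python
import math


def parse_xyz(pdb_line):
    """Extract (x, y, z) from a fixed-column PDB ATOM record."""
    return (float(pdb_line[30:38]),
            float(pdb_line[38:46]),
            float(pdb_line[46:54]))


# HIS 41 backbone N records, keyed by (file, chain), copied from above.
records = {
    ("protein-0", "A"): "ATOM    303  N   HIS A  41     -27.069  -2.718  28.828  1.00 35.79           N",
    ("protein-0", "B"): "ATOM   2671  N   HIS B  41      12.500 -12.615  -5.472  1.00 40.85           N",
    ("protein-1", "A"): "ATOM    303  N   HIS A  41      12.638 -12.480  -5.591  1.00 35.79           N",
    ("protein-1", "B"): "ATOM   2671  N   HIS B  41     -27.048  -2.689  28.604  1.00 40.85           N",
}
coords = {key: parse_xyz(line) for key, line in records.items()}

# Compare every chain in protein-0 against every chain in protein-1.
for chain_0 in "AB":
    for chain_1 in "AB":
        d = math.dist(coords[("protein-0", chain_0)],
                      coords[("protein-1", chain_1)])
        print(f"protein-0 chain {chain_0} vs protein-1 chain {chain_1}: {d:.2f} A")
```

If the same-label distances are tens of angstroms while the cross-label distances are well under 1 angstrom, chain A in one design unit occupies the position that chain B occupies in the other. That is consistent with the observation above, but it does not by itself show whether the coordinates or the labels were changed.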
