# System building: Protein with Ligand

In this tutorial, we will show how to build a protein-ligand system for simulating binding. The example system is trypsin (the protein) and benzamidine (the ligand).

Let's start by doing some imports and definitions:

In [1]:
from htmd.ui import *
from htmd.home import home
from os.path import join
config(viewer='webgl')
datadir = home(dataDir='building-protein-ligand')

2022-03-18 11:07:15,277 - numexpr.utils - INFO - NumExpr defaulting to 8 threads.
2022-03-18 11:07:15,605 - rdkit - INFO - Enabling RDKit 2021.09.4 jupyter extensions



Please cite HTMD: Doerr et al.(2016)JCTC,12,1845. https://dx.doi.org/10.1021/acs.jctc.6b00049

HTMD Documentation at: https://www.htmd.org/docs/latest/

You are on the latest HTMD version (unpackaged : /home/sdoerr/Work/htmd/htmd).



## Load the protein-ligand complex

The protein-ligand complex can be obtained from the Protein Data Bank (PDB ID: 3PTB). It is also included in the data distributed with HTMD, so either source can be used:

In [2]:
# One can download it directly from the RCSB servers
prot = Molecule('3PTB')
# Or use the pdb file found in the HTMD data directory
prot = Molecule(join(datadir, 'trypsin.pdb'))

In [3]:
prot.view()


## Clean the structures

The PDB crystal structure contains the protein as well as water molecules, a calcium ion and a ligand. We start by removing the ligand from the protein `Molecule`; it will be loaded separately later so that it can be manipulated on its own.

In [4]:
prot.remove('resname BEN')

2022-03-18 11:07:19,831 - moleculekit.molecule - INFO - Removed 9 atoms. 1692 atoms remaining in the molecule.


array([1630, 1631, 1632, 1633, 1634, 1635, 1636, 1637, 1638], dtype=int32)

## Preparing the protein

In this step, we prepare the protein for simulation by adding hydrogens, setting the protonation states, and optimizing the protein (see the protein preparation tutorial for more details):

In [5]:
prot = systemPrepare(prot, pH=7.0)




---- Molecule chain report ----
Chain A:
    First residue: ILE    16  
    Final residue: HOH   809  
---- End of chain report ----



2022-03-18 11:07:23,112 - moleculekit.tools.preparation - INFO - Modified residue CYS    22 A to CYX
2022-03-18 11:07:23,113 - moleculekit.tools.preparation - INFO - Modified residue HIS    40 A to HIE
2022-03-18 11:07:23,113 - moleculekit.tools.preparation - INFO - Modified residue CYS    42 A to CYX
2022-03-18 11:07:23,114 - moleculekit.tools.preparation - INFO - Modified residue HIS    57 A to HIP
2022-03-18 11:07:23,115 - moleculekit.tools.preparation - INFO - Modified residue CYS    58 A to CYX
2022-03-18 11:07:23,115 - moleculekit.tools.preparation - INFO - Modified residue HIS    91 A to HID
2022-03-18 11:07:23,116 - moleculekit.tools.preparation - INFO - Modified residue CYS   128 A to CYX
2022-03-18 11:07:23,116 - moleculekit.tools.preparation - INFO - Modified residue CYS   136 A to CYX
2022-03-18 11:07:23,117 - moleculekit.tools.preparation - INFO - Modified residue CYS   157 A to CYX
2022-03-18 11:07:23,118 - moleculekit.tools.preparation - INFO - Modified residue CYS   168
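
The residue names in the log follow the Amber-style convention: `CYX` is a cysteine in a disulfide bridge, while `HID`, `HIE` and `HIP` are histidine protonated on the delta nitrogen, the epsilon nitrogen, or both. As a minimal sketch of how such a log could be summarized, the snippet below tallies the new residue names from a few lines copied from the output above. The regular expression and the helper name `tally_modifications` are illustrative assumptions, not part of HTMD.

```python
import re
from collections import Counter

# A few messages copied from the preparation log above (timestamps dropped)
log_lines = [
    "Modified residue CYS    22 A to CYX",
    "Modified residue HIS    40 A to HIE",
    "Modified residue CYS    42 A to CYX",
    "Modified residue HIS    57 A to HIP",
    "Modified residue HIS    91 A to HID",
]

# Captures: old residue name, residue number, chain, new residue name
pattern = re.compile(r"Modified residue (\w+)\s+(\d+) (\w) to (\w+)")


def tally_modifications(lines):
    """Count how many residues were renamed to each new residue name."""
    counts = Counter()
    for line in lines:
        match = pattern.search(line)
        if match:
            _old_name, _resid, _chain, new_name = match.groups()
            counts[new_name] += 1
    return counts


counts = tally_modifications(log_lines)
print(dict(counts))
```

A quick tally like this makes it easy to check that every histidine received an explicit protonation state before building.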

## Define segments

To build a system in HTMD, we need to assign each chemical molecule to its own segment. This prevents the builder from accidentally bonding different molecules to each other and allows us to add caps to them.

In [6]:
prot = autoSegment(prot, sel='protein')
prot.set('segid', 'W', sel='water')
prot.set('segid', 'CA', sel='resname CA')

2022-03-18 11:07:23,209 - moleculekit.tools.autosegment - INFO - Created segment P0 between resid 16 and 245.
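
To illustrate the idea of segments, here is a simplified sketch of gap-based segmentation. It starts a new segment only where the residue numbering jumps and the consecutive CA atoms are also far apart in space. The function name, the 4 Angstrom threshold and the toy coordinates are assumptions made for this illustration; this is not the `autoSegment` implementation.

```python
import math


def split_into_segments(residues, spatial_gap=4.0, basename="P"):
    """Group residues into segments.

    residues: list of (resid, (x, y, z)) tuples, one per CA atom, in chain order.
    A new segment starts when the numbering is discontinuous AND the two
    CA atoms are more than spatial_gap Angstrom apart.
    """
    segments = {}
    index = 0
    current = [residues[0][0]]
    for (prev_id, prev_xyz), (cur_id, cur_xyz) in zip(residues, residues[1:]):
        numbering_gap = (cur_id - prev_id) not in (0, 1)
        far_apart = math.dist(prev_xyz, cur_xyz) > spatial_gap
        if numbering_gap and far_apart:
            segments[f"{basename}{index}"] = current
            index += 1
            current = []
        current.append(cur_id)
    segments[f"{basename}{index}"] = current
    return segments


# Toy chain: 17 -> 19 skips a number but the atoms are adjacent (no break),
# 19 -> 30 skips numbers and the atoms are far apart (break).
toy = [
    (16, (0.0, 0.0, 0.0)),
    (17, (3.8, 0.0, 0.0)),
    (19, (7.6, 0.0, 0.0)),
    (30, (30.0, 0.0, 0.0)),
    (31, (33.8, 0.0, 0.0)),
]
segments = split_into_segments(toy)
print(segments)
```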


Center the protein at the origin:

In [7]:
prot.center()

## Let's work on the ligand!

Load the ligand from the HTMD data:

In [8]:
ligand = Molecule(join(datadir, 'BEN.mol2'))

Let's center the ligand and visualize it:

In [9]:
ligand.center()
ligand.view()


In [10]:
# We can give a convenient segid and resname to the ligand
# The resname should be BEN to match the parameters in the
# rtf and prm files.
ligand.set('segid','L')
ligand.set('resname','BEN')

However, the ligand is now located inside the protein, since both have been centered at the origin.
We would like the ligand to be:

* At a certain distance from the protein
* Rotated randomly, to provide different starting conditions

## Let's randomize the ligand position

In [11]:
ligand.rotateBy(uniformRandomRotation())

This took care of the ligand rotation around its own center. 
We still need to position it far from the protein.
First, find out the radius of the protein:

![maxdist](http://pub.htmd.org/tutorials/system-building-protein-ligand/maxdist.png)

In [12]:
from moleculekit.util import maxDistance
D = maxDistance(prot, 'all')
print(D)

28.830644436506475
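
Because the protein has been centered at the origin, the largest distance of any atom from the origin can serve as its radius. The following is a minimal sketch of that calculation on toy coordinates. It assumes that `maxDistance` measures from the origin by default; the helper `max_distance` is a stand-in written for this example.

```python
import math


def max_distance(coords, origin=(0.0, 0.0, 0.0)):
    """Return the largest Euclidean distance between any atom and the origin."""
    return max(math.dist(atom, origin) for atom in coords)


# Three toy atom positions in Angstrom
coords = [(1.0, 2.0, 2.0), (-3.0, 0.0, 4.0), (0.0, 0.0, 1.0)]
radius = max_distance(coords)
print(radius)  # 5.0, from the atom at (-3, 0, 4)
```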


In [13]:
D += 10
# Move the ligand along X to 10 Angstrom beyond the furthest protein atom
ligand.moveBy([D, 0, 0])
# rotateBy rotates around [0, 0, 0] by default. Since the ligand has been moved
# away from the center, this places it at a random point on a sphere of radius D
# (the protein radius plus 10 Angstrom) around [0, 0, 0]
ligand.rotateBy(uniformRandomRotation())
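
The placement step relies on the fact that a rotation about the origin preserves the distance to the origin. The sketch below demonstrates this with a random rotation built from a random unit quaternion (Shoemake's method). This technique is swapped in for illustration and is not necessarily how `uniformRandomRotation` is implemented; the helper names are hypothetical and the value of `D` is the rounded radius printed above plus 10 Angstrom.

```python
import math
import random


def random_rotation_matrix(rng=random):
    """Build a 3x3 rotation matrix from a uniformly random unit quaternion."""
    u1, u2, u3 = rng.random(), rng.random(), rng.random()
    x = math.sqrt(1 - u1) * math.sin(2 * math.pi * u2)
    y = math.sqrt(1 - u1) * math.cos(2 * math.pi * u2)
    z = math.sqrt(u1) * math.sin(2 * math.pi * u3)
    w = math.sqrt(u1) * math.cos(2 * math.pi * u3)
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]


def rotate(matrix, point):
    """Apply a 3x3 rotation matrix to a 3D point."""
    return tuple(sum(row[i] * point[i] for i in range(3)) for row in matrix)


D = 28.83 + 10            # protein radius plus a 10 Angstrom margin
start = (D, 0.0, 0.0)     # ligand position after moveBy([D, 0, 0])
rng = random.Random(0)    # seeded only to make the example reproducible
placed = rotate(random_rotation_matrix(rng), start)
distance = math.dist(placed, (0.0, 0.0, 0.0))
print(placed, distance)
```

Whatever rotation is drawn, `distance` stays equal to `D` up to floating-point error, so the ligand always ends up on the same sphere around the protein.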

### Mix it all together

In [14]:
mol = Molecule(name='combo')
mol.append(prot)
mol.append(ligand)
mol.reps.add(sel='protein', style='NewCartoon', color='Secondary Structure')
mol.reps.add(sel='resname BEN', style='Licorice')
mol.view()


## Solvate

> Water is the driving force of all nature. --Leonardo da Vinci

![waterbox](http://pub.htmd.org/tutorials/system-building-protein-ligand/waterbox.png)

In [15]:
# We solvate with a larger box to fully solvate the ligand
DW = D + 5
smol = solvate(mol, minmax=[[-DW, -DW, -DW], [DW, DW, DW]])
smol.reps.add(sel='water', style='Lines')
smol.view()

2022-03-18 11:07:23,604 - htmd.builder.solvate - INFO - Using water pdb file at: /home/sdoerr/Work/htmd/htmd/share/solvate/wat.pdb
2022-03-18 11:07:24,373 - htmd.builder.solvate - INFO - Replicating 8 water segments, 2 by 2 by 2
Solvating: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:04<00:00,  1.70it/s]
2022-03-18 11:07:30,226 - htmd.builder.solvate - INFO - 20147 water molecules were added to the system.
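
As a rough sanity check on the number of waters, the box dimensions can be worked out by hand. The sketch below uses the rounded radius printed earlier and the number density of liquid water (about 0.0334 molecules per cubic Angstrom). It is an upper-bound estimate only, since it ignores the volume taken up by the protein and ligand.

```python
D = 28.83 + 10           # protein radius plus the 10 Angstrom ligand margin
DW = D + 5               # half-width of the water box used in solvate()
edge = 2 * DW            # box edge length in Angstrom
volume = edge ** 3       # box volume in cubic Angstrom

WATERS_PER_A3 = 0.0334   # approximate number density of liquid water
upper_bound = volume * WATERS_PER_A3

print(f"edge = {edge:.2f} A, volume = {volume:.0f} A^3")
print(f"at most about {upper_bound:.0f} waters would fit in an empty box")
```

The estimate comes out at roughly 22,500 molecules, which is consistent with the 20147 waters reported in the log once the space occupied by the solute is taken into account.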



## Build the System with a specific forcefield

HTMD aims to be force-field agnostic. Once you have a solvated system, you can build it with either Amber or CHARMM. The following sections both start from the same solvated system, so either one can be followed.

Special care must be taken in this case because benzamidine is not included by default in either force field, so its topology and parameters must be supplied explicitly.

### CHARMM forcefield

In [16]:
charmm.listFiles()

---- Topologies files list: /home/sdoerr/Work/htmd/htmd/share/builder/charmmfiles/top/ ----
/top/top_all22_prot.rtf
/top/top_all22star_prot.rtf
/top/top_all35_ethers.rtf
/top/top_all36_carb.rtf
/top/top_all36_cgenff.rtf
/top/top_all36_lipid.rtf
/top/top_all36_na.rtf
/top/top_all36_prot.rtf
/top/top_water_ions.rtf
---- Parameters files list: /home/sdoerr/Work/htmd/htmd/share/builder/charmmfiles/par/ ----
/par/par_all22_prot.prm
/par/par_all22star_prot.prm
/par/par_all35_ethers.prm
/par/par_all36_carb.prm
/par/par_all36_cgenff.prm
/par/par_all36_lipid.prm
/par/par_all36_na.prm
/par/par_all36_prot.prm
/par/par_all36m_prot.prm
/par/par_water_ions.prm
---- Stream files list: /home/sdoerr/Work/htmd/htmd/share/builder/charmmfiles/str/ ----
/str/carb/toppar_all36_carb_glycolipid.str
/str/carb/toppar_all36_carb_glycopeptide.str
/str/carb/toppar_all36_carb_imlab.str
/str/carb/toppar_all36_carb_model.str
/str/cphmd/protpatch_protein_toppar36.str
/str/lipid/toppar_all36_lipid_bacterial.str
/str/li

### Build and ionize using CHARMM

In [17]:
topos_charmm  = charmm.defaultTopo() + [join(datadir, 'BEN.rtf')]
params_charmm = charmm.defaultParam() + [join(datadir, 'BEN.prm')]

bmol_charmm = charmm.build(smol, topo=topos_charmm, param=params_charmm, outdir='./build_charmm')

2022-03-18 11:07:34,415 - htmd.builder.charmm - INFO - Writing out segments.
2022-03-18 11:07:40,823 - htmd.builder.builder - INFO - 6 disulfide bonds were added


Disulfide Bond between: UniqueResidueID<resname: 'CYX', chain: 'A', resid: 42, insertion: '', segid: 'P0'>
                   and: UniqueResidueID<resname: 'CYX', chain: 'A', resid: 58, insertion: '', segid: 'P0'>

Disulfide Bond between: UniqueResidueID<resname: 'CYX', chain: 'A', resid: 136, insertion: '', segid: 'P0'>
                   and: UniqueResidueID<resname: 'CYX', chain: 'A', resid: 201, insertion: '', segid: 'P0'>

Disulfide Bond between: UniqueResidueID<resname: 'CYX', chain: 'A', resid: 191, insertion: '', segid: 'P0'>
                   and: UniqueResidueID<resname: 'CYX', chain: 'A', resid: 220, insertion: '', segid: 'P0'>

Disulfide Bond between: UniqueResidueID<resname: 'CYX', chain: 'A', resid: 168, insertion: '', segid: 'P0'>
                   and: UniqueResidueID<resname: 'CYX', chain: 'A', resid: 182, insertion: '', segid: 'P0'>

Disulfide Bond between: UniqueResidueID<resname: 'CYX', chain: 'A', resid: 22, insertion: '', segid: 'P0'>
                   and: Uni

2022-03-18 11:07:41,186 - htmd.builder.charmm - INFO - Starting the build.
2022-03-18 11:07:41,739 - htmd.builder.charmm - INFO - Finished building.
2022-03-18 11:07:44,567 - htmd.builder.ionize - INFO - Adding 10 anions + 0 cations for neutralizing and 0 ions for the given salt concentration 0 M.
2022-03-18 11:07:49,244 - htmd.builder.charmm - INFO - Writing out segments.
2022-03-18 11:07:56,064 - htmd.builder.charmm - INFO - Starting the build.
2022-03-18 11:07:56,547 - htmd.builder.charmm - INFO - Finished building.
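
The ionization log line reports 10 anions and no cations, which is what one would expect for a solute with a net charge of +10 when monovalent counter-ions are used. The arithmetic can be sketched as follows; the helper `neutralizing_ions` is hypothetical and assumes ions of unit charge.

```python
def neutralizing_ions(net_charge):
    """Return (anions, cations) of unit charge needed to cancel net_charge."""
    if net_charge > 0:
        # A positive solute is neutralized with anions
        return net_charge, 0
    # A negative (or neutral) solute is neutralized with cations
    return 0, -net_charge


anions, cations = neutralizing_ions(+10)
print(f"Adding {anions} anions + {cations} cations for neutralizing")
```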


### AMBER forcefield

In [18]:
amber.listFiles()

---- Forcefield files list: /home/sdoerr/miniconda3/envs/testall/dat/leap/cmd/ ----
leaprc.amberdyes
leaprc.conste
leaprc.constph
leaprc.DNA.bsc1
leaprc.DNA.OL15
leaprc.ffAM1
leaprc.ffPM3
leaprc.gaff
leaprc.gaff2
leaprc.GLYCAM_06EPb
leaprc.GLYCAM_06j-1
leaprc.lipid14
leaprc.lipid17
leaprc.mimetic.ff15ipq
leaprc.modrna08
leaprc.music
leaprc.phosaa10
leaprc.phosaa14SB
leaprc.phosaa19SB
leaprc.protein.fb15
leaprc.protein.ff03.r1
leaprc.protein.ff03ua
leaprc.protein.ff14SB
leaprc.protein.ff14SB_modAA
leaprc.protein.ff14SBonlysc
leaprc.protein.ff15ipq
leaprc.protein.ff15ipq-vac
leaprc.protein.ff19ipq
leaprc.protein.ff19SB
leaprc.protein.ff19SB_modAA
leaprc.RNA.LJbb
leaprc.RNA.OL3
leaprc.RNA.ROC
leaprc.RNA.Shaw
leaprc.RNA.YIL
leaprc.water.fb3
leaprc.water.fb4
leaprc.water.opc
leaprc.water.opc3
leaprc.water.spce
leaprc.water.spceb
leaprc.water.tip3p
leaprc.water.tip4pd
leaprc.water.tip4pew
leaprc.xFPchromophores
---- OLD Forcefield files list: /home/sdoerr/miniconda3/envs/testall/dat/leap/cmd

### Build and ionize using Amber

In [19]:
topos_amber = amber.defaultTopo() + [join(datadir, 'BEN.mol2')]
frcmods_amber = amber.defaultParam() + [join(datadir, 'BEN.frcmod')]

In [20]:
bmol_amber = amber.build(smol, topo=topos_amber, param=frcmods_amber, outdir='./build_amber')

2022-03-18 11:08:15,764 - htmd.builder.amber - INFO - Detecting disulfide bonds.
2022-03-18 11:08:15,777 - htmd.builder.builder - INFO - 6 disulfide bonds were added


Disulfide Bond between: UniqueResidueID<resname: 'CYX', chain: 'A', resid: 26, insertion: '', segid: 'P0'>
                   and: UniqueResidueID<resname: 'CYX', chain: 'A', resid: 42, insertion: '', segid: 'P0'>

Disulfide Bond between: UniqueResidueID<resname: 'CYX', chain: 'A', resid: 117, insertion: '', segid: 'P0'>
                   and: UniqueResidueID<resname: 'CYX', chain: 'A', resid: 184, insertion: '', segid: 'P0'>

Disulfide Bond between: UniqueResidueID<resname: 'CYX', chain: 'A', resid: 174, insertion: '', segid: 'P0'>
                   and: UniqueResidueID<resname: 'CYX', chain: 'A', resid: 198, insertion: '', segid: 'P0'>

Disulfide Bond between: UniqueResidueID<resname: 'CYX', chain: 'A', resid: 149, insertion: '', segid: 'P0'>
                   and: UniqueResidueID<resname: 'CYX', chain: 'A', resid: 163, insertion: '', segid: 'P0'>

Disulfide Bond between: UniqueResidueID<resname: 'CYX', chain: 'A', resid: 8, insertion: '', segid: 'P0'>
                   and: Uniq

2022-03-18 11:08:21,552 - htmd.builder.amber - INFO - Starting the build.
2022-03-18 11:08:41,021 - htmd.builder.amber - INFO - Finished building.
2022-03-18 11:08:52,058 - htmd.builder.ionize - INFO - Adding 10 anions + 0 cations for neutralizing and 0 ions for the given salt concentration 0 M.
2022-03-18 11:09:00,448 - htmd.builder.amber - INFO - Starting the build.
2022-03-18 11:09:19,650 - htmd.builder.amber - INFO - Finished building.
2022-03-18 11:09:34,759 - moleculekit.tools.sequencestructuralalignment - INFO - Alignment #0 was done on 223 residues: 2-224


## Visualize

The built system can be visualized with the waters hidden, so that the inserted ions are visible:

In [21]:
bmol_charmm.view(sel='not water') # visualize the charmm built system
# bmol_amber.view(sel='not water') # uncomment to visualize the amber built system


`bmol_charmm` and `bmol_amber` are `Molecule` objects that contain the built systems, but the full set of files needed to run a simulation is written to the respective `outdir` (`./build_charmm` and `./build_amber` in this case).
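
To see what the builders wrote to disk, one can simply list the output directories. This is a small sketch using only the standard library; it assumes the `outdir` values passed to the build calls above (`./build_charmm` and `./build_amber`), and the helper `list_build_files` is hypothetical. It returns an empty list if a directory does not exist, for example when only one of the two builds was run.

```python
import os


def list_build_files(outdir):
    """Return the sorted file names in outdir, or [] if it does not exist."""
    if not os.path.isdir(outdir):
        return []
    return sorted(os.listdir(outdir))


for outdir in ("./build_charmm", "./build_amber"):
    print(outdir, list_build_files(outdir))
```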