## Adding Hydrogens to Nonstandard Molecules
* https://openmm.github.io/openmm-cookbook/latest/notebooks/cookbook/Adding%20Hydrogens%20to%20Nonstandard%20Molecules.html
* Modeller.addHydrogens() adds missing hydrogens to standard molecules such as proteins and nucleic acids. To simulate something it doesn’t know about, such as a drug molecule or a non-natural amino acid, you can still use addHydrogens(), but you must first tell it which hydrogens the new residue can have.

* Start by creating an XML file describing the molecule or molecules. Here is an example of a file defining hydrogens for NLN, a modified ASN residue that is missing a hydrogen and can have a glycan bonded to it. (This example is taken from the glycam-hydrogens.xml file that is bundled with OpenMM. You would not actually need to define this particular residue yourself.)

In [7]:
# Write the hydrogen definitions to MyHydrogen.xml so they can be loaded later.
hydrogen_xml = '''<Residues>
  <Residue name="NLN">
    <H name="H" parent="N" terminal="-C"/>
    <H name="H1" parent="N" terminal="N"/>
    <H name="H2" parent="N" terminal="N"/>
    <H name="H3" parent="N" terminal="N"/>
    <H name="HA" parent="CA"/>
    <H name="HB2" parent="CB"/>
    <H name="HB3" parent="CB"/>
    <H name="HD21" parent="ND2"/>
  </Residue>
</Residues>
'''

with open('MyHydrogen.xml', 'w') as f:
    f.write(hydrogen_xml)

* There is one <Residue> tag for every residue you want to define.

* Each <Residue> tag contains one <H> tag for every hydrogen that can appear in the residue.

* The parent attribute is the name of the heavy atom the hydrogen is bonded to.

* All atom and residue names must exactly match the names present in your Topology.

* The terminal attribute:
    * This optional attribute marks hydrogens that might or might not be present, depending on the residue’s position in the chain. Its value should contain one or more of the characters “N”, “C”, and “-”.
    * “N” indicates the hydrogen should be added to N-terminal residues.
    * “C” indicates it should be added to C-terminal residues.
    * “-” indicates it should be added to residues that are not at either end of the chain.
    * For example, terminal="-C" on the hydrogen named H above means it is added to interior and C-terminal residues, but not to N-terminal ones.
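
The terminal rules above can be illustrated with a small standalone sketch. It does not call OpenMM and is not OpenMM's own implementation. It only parses the example definitions with the standard library and applies the rule as described in this section. The helper name hydrogens_for and the position codes passed to it are assumptions made for illustration, and the sketch ignores the case of a residue that is both N- and C-terminal.

```python
import xml.etree.ElementTree as ET

# Same definitions as the NLN example above.
HYDROGEN_XML = """<Residues>
  <Residue name="NLN">
    <H name="H" parent="N" terminal="-C"/>
    <H name="H1" parent="N" terminal="N"/>
    <H name="H2" parent="N" terminal="N"/>
    <H name="H3" parent="N" terminal="N"/>
    <H name="HA" parent="CA"/>
    <H name="HB2" parent="CB"/>
    <H name="HB3" parent="CB"/>
    <H name="HD21" parent="ND2"/>
  </Residue>
</Residues>"""


def hydrogens_for(xml_text, residue_name, position):
    """Hypothetical helper: list the hydrogens expected at a chain position.

    position is 'N' (N-terminal), 'C' (C-terminal), or '-' (interior).
    """
    if position not in ("N", "C", "-"):
        raise ValueError("position must be 'N', 'C', or '-'")
    root = ET.fromstring(xml_text)
    names = []
    for residue in root.findall("Residue"):
        if residue.get("name") != residue_name:
            continue
        for h in residue.findall("H"):
            terminal = h.get("terminal")
            # No terminal attribute: the hydrogen is always present.
            # Otherwise it is present only if the position code is listed.
            if terminal is None or position in terminal:
                names.append(h.get("name"))
    return names


for position in ("N", "-", "C"):
    print(position, hydrogens_for(HYDROGEN_XML, "NLN", position))
```

An N-terminal NLN gets H1, H2, and H3 in place of H, while interior and C-terminal residues get H only. Hydrogens without a terminal attribute (HA, HB2, HB3, HD21) appear in every case.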

In [13]:
from openmm.app import *

# Load the two structures that will be merged below.
pdb1 = PDBFile('../villin.pdb')
pdb2 = PDBFile('../ala_ala_ala.pdb')

In [15]:
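# Register the custom hydrogen definitions. This is called on the Modeller class, not on an instance.
Modeller.loadHydrogenDefinitions('MyHydrogen.xml')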

In [16]:
# Start from the first structure, then append the second one to the same model.
modeller = Modeller(pdb1.topology, pdb1.positions)
modeller.add(pdb2.topology, pdb2.positions)
mergedTopology = modeller.topology
mergedPositions = modeller.positions

In [17]:
mergedTopology

<Topology; 2 chains, 2801 residues, 8900 atoms, 6143 bonds>

In [18]:
mergedPositions

Quantity(value=[Vec3(x=2.516, y=1.4160000000000001, z=1.9440000000000002), Vec3(x=2.4350000000000005, y=1.3730000000000002, z=1.987), Vec3(x=2.5980000000000003, y=1.368, z=1.9760000000000002), Vec3(x=2.5180000000000002, y=1.51, z=1.9809999999999999), Vec3(x=2.5090000000000003, y=1.3920000000000001, z=1.798), Vec3(x=2.47, y=1.2930000000000001, z=1.7760000000000002), Vec3(x=2.442, y=1.497, z=1.7100000000000002), Vec3(x=2.459, y=1.463, z=1.6079999999999999), Vec3(x=2.495, y=1.5910000000000002, z=1.724), Vec3(x=2.293, y=1.5010000000000001, z=1.74), Vec3(x=2.2870000000000004, y=1.561, z=1.831), Vec3(x=2.241, y=1.5830000000000002, z=1.6210000000000002), Vec3(x=2.263, y=1.5250000000000001, z=1.532), Vec3(x=2.132, y=1.593, z=1.6260000000000003), Vec3(x=2.2920000000000003, y=1.679, z=1.6219999999999999), Vec3(x=2.2100000000000004, y=1.3730000000000002, z=1.7590000000000001), Vec3(x=2.2520000000000002, y=1.2930000000000001, z=1.6980000000000002), Vec3(x=2.209, y=1.3470000000000002, z=1.865), Vec