# Run HEroBM backmapping #

### 1. Create the inference config file ###

Specify the following parameters:
- mapping: the CG mapping to use.
- input: the input file to backmap. It can be a pdb or gro file, or any format compatible with MDAnalysis.
- inputtraj: optional xtc or trr trajectory to load on top of the input file.
- isatomistic: set to True if the input file is at atomistic resolution. In this case, the input is first converted to CG according to the specified mapping, and the model then backmaps the CG structure to atomistic resolution. This is used to evaluate model performance.
- selection: optional selection of atoms/beads/residues/molecules, applied to the input as a pre-processing step.
- trajslice: optional selection of frames, using the Python slice format [from]:[to][:step].
- model: either a deployed model or the .pth model file saved during training (usually, '.../run_name/best_model.pth').
- output: optional folder where the backmapped result is saved.
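
To make the trajslice format concrete, the sketch below shows how a string such as "990:991" corresponds to a Python slice over frame indices. The helper `parse_trajslice` is hypothetical and written only for illustration; how HEroBM parses this string internally is not shown here and may differ.

```python
def parse_trajslice(spec):
    """Hypothetical helper: turn '[from]:[to][:step]' into a Python slice."""
    parts = spec.split(":")
    if len(parts) not in (2, 3):
        raise ValueError(f"expected [from]:[to][:step], got {spec!r}")
    # Empty fields (e.g. '::10') become None, as in ordinary slicing.
    values = [int(p) if p.strip() else None for p in parts]
    return slice(*values)


# Stand-in for the frame indices of a 1000-frame trajectory.
frames = list(range(1000))

print(frames[parse_trajslice("990:991")])  # [990]
print(frames[parse_trajslice("::250")])    # [0, 250, 500, 750]
```

With these semantics, "990:991" selects the single frame with index 990, since the end of a Python slice is exclusive.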

In [None]:
args_dict = {
    "mapping": "martini3.membrane",
    "input":       "/scratch/angiod/POPC/traj/output.pdb",
    "inputtraj":   "/scratch/angiod/POPC/traj/output.xtc",
    "isatomistic": True, # Set this to False when backmapping actual CG
    "selection": "resname POPC and not element H",
    "trajslice": "990:991",
    "model": "/home/angiod@usi.ch/HEroBM/deployed/martini3.popc.pth",
    # "bead_types_filename": "bead_types.bbcommon.yaml",
    "output": "../backmapped/POPC/martini3.membrane",
    "device": "cpu",
}

### 2. Run backmapping ###

The 'tolerance' parameter is the threshold used for energy minimisation.

Note: if you test the A2A model on other proteins, the energy-minimised version will most likely have a lower RMSD than the raw backmapped version. This is expected: the model was trained on a single system and does not yet generalise well, so it may produce clashes in some cases. Even a mild energy minimisation resolves these clashes and yields a sound structure to simulate.

In [None]:
from herobm.scripts.run_inference import run_backmapping

backmapped_filenames, backmapped_minimised_filenames, true_filenames, cg_filenames = run_backmapping(args_dict, tolerance=500.0)

Once training has completed, you will find the results in the './results/tutorial/A2A.martini' folder.

Model weights are saved as 'best_model.pth', while the 'config.yaml' file contains all the directives to load the model, to either perform inference or to continue training/fine-tune the model.

In [None]:
import MDAnalysis as mda
import nglview as nv
from MDAnalysis.analysis import rms

In [None]:
backmapped_u = mda.Universe(*backmapped_filenames[:1])
cg_u = mda.Universe(*cg_filenames[:1])
sel = 'all'

merged_u = mda.Merge(backmapped_u.select_atoms(sel), cg_u.select_atoms(sel))

w = nv.show_mdanalysis(merged_u)
w.add_representation('spacefill', selection='.BB .SC1 .SC2 .SC3 .SC4', radiusScale=0.2)
w.add_representation('licorice', selection='protein and not (.BB .SC1 .SC2 .SC3 .SC4)')
w

In [None]:
backmapped_u = mda.Universe(*backmapped_filenames[:1])
ref_u = mda.Universe(*true_filenames[:1])

print("RMSD BB:", rms.rmsd(backmapped_u.select_atoms('name N CA C O').positions, ref_u.select_atoms('name N CA C O').positions, superposition=False))
print("RMSD SC:", rms.rmsd(backmapped_u.select_atoms('protein and not name N CA C O').positions, ref_u.select_atoms('protein and not name N CA C O').positions, superposition=False))
print("RMSD ALL:", rms.rmsd(backmapped_u.select_atoms('all').positions, ref_u.select_atoms('all').positions, superposition=False))

sel = 'all'
w = nv.show_mdanalysis(backmapped_u)
w.add_representation('licorice', selection='all')
w
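
The RMSD values printed above are computed with superposition=False, so the two coordinate sets are compared as they are, with no rotational or translational fitting. A minimal NumPy sketch of that quantity follows, assuming unweighted atoms and two arrays of shape (n_atoms, 3) listed in the same atom order. The function `rmsd_no_fit` is a hypothetical name used only for illustration and is not part of MDAnalysis or HEroBM.

```python
import numpy as np


def rmsd_no_fit(a, b):
    """Unweighted RMSD between two (n_atoms, 3) arrays, without any fitting."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("coordinate arrays must have the same shape")
    # Squared displacement per atom, averaged over atoms, then square-rooted.
    return float(np.sqrt(np.mean(np.sum((a - b) ** 2, axis=1))))


# Toy coordinates: every atom is displaced by the vector (3, 4, 0).
reference = np.zeros((4, 3))
shifted = reference + np.array([3.0, 4.0, 0.0])

print(rmsd_no_fit(shifted, reference))  # 5.0
```

Because no fitting is applied, a rigid shift of the whole structure counts fully towards the RMSD. This suits the comparison here, where the backmapped and reference structures share the same frame of reference.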

View the reconstructed and original structures together.

In [None]:
merged_u = mda.Merge(backmapped_u.select_atoms(sel), ref_u.select_atoms(sel))

w = nv.show_mdanalysis(merged_u)
w.add_representation('licorice', selection='all')
w

Finally, have a look at the minimised structure and compare it to the ground truth.

In [None]:
backmapped_minimised_u = mda.Universe(*backmapped_minimised_filenames[:1])
merged_minimised_u = mda.Merge(backmapped_minimised_u.select_atoms(sel), ref_u.select_atoms(sel))

s = 'name N CA C O and not resname ACE NME'
print("RMSD BB:", rms.rmsd(backmapped_minimised_u.select_atoms(s).positions, ref_u.select_atoms(s).positions, superposition=False))
s = 'protein and not name N CA C O OXT and not type H and not resname ACE NME'
print("RMSD SC:", rms.rmsd(backmapped_minimised_u.select_atoms(s).positions, ref_u.select_atoms(s).positions, superposition=False))

w = nv.show_mdanalysis(merged_minimised_u)
w.add_representation('licorice', selection='protein')
w