In [7]:
import sys
import numpy as np
import pandas as pd

### Extract MFEP trajectory using MFEP profile ("mfep-profile-np.dat" generated using "mfep-profile-np.ipynb") and COLVAR file produced at the end of metadynamics simulations

From the COLVAR file produced at the end of the metadynamics simulations, discard the header lines (those whose first field is #!) and keep the first three columns, which correspond to time, s(R) and z(R)

In [8]:
%%bash
awk '$1!="#!" {print $1,$2,$3}' COLVAR > time_cvs.dat
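The same header filtering can be sketched in plain Python. This is only an illustration, assuming the layout described above (header lines whose first field is #!, then whitespace-separated columns with time, s(R) and z(R) first). The function name `read_time_cvs` and the sample lines are hypothetical and not part of the original workflow.

```python
def read_time_cvs(lines):
    """Return (time, s, z) tuples from COLVAR-style lines, skipping '#!' headers."""
    rows = []
    for line in lines:
        fields = line.split()
        # Skip blank lines and header lines whose first field is "#!"
        if not fields or fields[0] == "#!":
            continue
        # Keep only the first three columns: time, s(R), z(R)
        time, s, z = (float(v) for v in fields[:3])
        rows.append((time, s, z))
    return rows


# Hypothetical sample with a repeated header and an extra trailing column
sample = [
    "#! FIELDS time s z metad.bias",
    "0.000 1.02 0.0011 0.0",
    "#! FIELDS time s z metad.bias",
    "2.000 1.57 0.0024 0.3",
]
rows = read_time_cvs(sample)
print(rows)
```

Comparing the first field against "#!" mirrors the awk condition, so header lines are dropped wherever they appear in the file.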

In [9]:
# Load the filtered COLVAR columns written by the previous cell ("time_cvs.dat")
time_cvs=pd.read_csv("time_cvs.dat",header=None,sep=r"\s+",usecols=[0,1,2])
time_cvs.columns = ["time","s(R)","z(R)"]

Convert z(R) from nm$^2$ to Å$^2$ (multiply by 100), truncate the time values to integers, and round s(R) and z(R) to one decimal place

In [10]:
# "time" is cast to an integer; z(R) is scaled by 100 (1 nm^2 = 100 Å^2)
time_cvs["time"]=time_cvs["time"].astype(int)
time_cvs["z(R)"]=(time_cvs["z(R)"]*100).round(1)
time_cvs["s(R)"]=time_cvs["s(R)"].round(1)
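As a quick check on the conversion factor: 1 nm = 10 Å, so 1 nm$^2$ = 10 × 10 = 100 Å$^2$. The sketch below applies that factor and the one-decimal rounding to a few made-up values. The helper name `z_nm2_to_A2` and the sample numbers are hypothetical.

```python
NM2_TO_A2 = 100.0  # 1 nm = 10 Å, hence 1 nm^2 = 100 Å^2


def z_nm2_to_A2(z_nm2, ndigits=1):
    """Convert a z(R) value from nm^2 to Å^2 and round it."""
    return round(z_nm2 * NM2_TO_A2, ndigits)


# Hypothetical z(R) values in nm^2
z_converted = [z_nm2_to_A2(v) for v in (0.0011, 0.0024, 0.0150)]
print(z_converted)
```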

Read the "mfep-profile-np.dat" file generated using "mfep-profile-np.ipynb" and store s(R) and z(R) values into s and z numpy arrays

In [11]:
gprofile=pd.read_csv("mfep-profile-np.dat",header=0,sep=r"\s+",usecols=[0,1,2])
gprofile.index=range(1,len(gprofile)+1)  # 1-based bin index (25 bins here)
gprofile["s(R)"]=gprofile["s(R)"].round(1)
s=np.array(gprofile["s(R)"])
z=np.array(gprofile["z(R)"])

For each MFEP (s,z) bin, scan the "time_cvs" dataframe and extract the rows whose (s,z) values fall in that bin. A tolerance is allowed in s (sdev) and z (zdev) so that each bin collects enough frames. The extracted dataframes are first appended to a list; the times in each dataframe are then written to "time{i}.txt", where "i" is the bin index

In [13]:
list1=[]
sdev=0.1   # tolerance in s(R)
zdev=0.05  # tolerance in z(R)
for (i,j) in zip(s,z):
    # rows with s(R) in [i-sdev, i+sdev] and z(R) in [j-zdev, j+zdev]
    x=time_cvs[time_cvs["s(R)"].between(i-sdev,i+sdev) & time_cvs["z(R)"].between(j-zdev,j+zdev)]
    list1.append(x)
# write the times of bin i (1-based) to "time{i}.txt"
for i,x in enumerate(list1,start=1):
    x["time"].to_csv(f"time{i}.txt",index=False)
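The window test used for each bin can be illustrated on a handful of records without pandas. This is a minimal sketch, assuming inclusive bounds as in the cell above. The function name `times_in_window` and the sample records are hypothetical.

```python
def times_in_window(records, s0, z0, sdev=0.1, zdev=0.05):
    """Return the times of records with |s - s0| <= sdev and |z - z0| <= zdev."""
    return [
        t
        for (t, s, z) in records
        if s0 - sdev <= s <= s0 + sdev and z0 - zdev <= z <= z0 + zdev
    ]


# Hypothetical (time, s, z) records
records = [
    (0, 1.0, 0.10),   # inside the window around (1.0, 0.1)
    (2, 1.5, 0.12),   # s too large
    (4, 2.0, 0.30),   # s and z too large
    (6, 1.05, 0.40),  # s fits, z too large
    (8, 0.95, 0.12),  # inside the window
]
selected = times_in_window(records, s0=1.0, z0=0.1)
empty = times_in_window(records, s0=5.0, z0=0.1)
print(selected, empty)
```

A bin can come back empty when no frame falls inside its window, so it may be worth checking the number of rows per bin before moving on.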

Next, the extracted times for each bin are read and one time per bin is chosen arbitrarily (here, the last entry of each file); the corresponding pdb file is then extracted from the GROMACS trajectory
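The choice of one time per bin can also be sketched in Python, which makes the empty-bin case explicit. This assumes each "time{i}.txt" file holds a "time" header line followed by one integer time per line. The function name `last_time` is hypothetical.

```python
def last_time(lines):
    """Return the last integer time found in the lines, or None if there is none."""
    times = []
    for line in lines:
        text = line.strip()
        # Skip the "time" header and any blank lines
        if text.lstrip("-").isdigit():
            times.append(int(text))
    return times[-1] if times else None


print(last_time(["time", "120", "340", "980"]))
print(last_time(["time"]))
```

Returning None for a bin without any times lets the caller skip it, rather than passing a non-numeric line on to the frame-extraction step.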

In [14]:
%%bash
# take the last time listed for each bin
for i in {1..25}
do
    tail -1 time$i.txt | awk '{printf "%1.0f\n",$1}'
done > time_final.txt

# dump the frame at that time for each bin (group 1 = Protein)
for i in {1..25}
do
    rm -f $i.pdb
    echo 1 | gmx trjconv -s abl1_md.tpr -f abl1_md-fit.xtc -o $i.pdb -dump $(awk -v var="$i" 'NR==var{print}' time_final.txt)
done

Note that major changes are planned in future for trjconv, to improve usability and utility.
Select group for output
Selected 1: 'Protein'

             :-) GROMACS - gmx trjconv, 2020.1-Ubuntu-2020.1-1 (-:


Concatenate all pdb files to make an MFEP trajectory

In [15]:
%%bash
rm -f mfep-traj-np.pdb
for i in {1..25}
do
    cat $i.pdb >> mfep-traj-np.pdb
done
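A Python version of the concatenation step might look like the sketch below. It assumes the per-bin files are named "1.pdb" to "25.pdb" as above. The function name `concatenate_frames` is hypothetical, and reporting missing files is an addition that the bash loop does not perform.

```python
from pathlib import Path


def concatenate_frames(frame_paths, out_path):
    """Write each existing frame file to out_path in order; return missing paths."""
    missing = []
    with open(out_path, "w") as out:
        for path in map(Path, frame_paths):
            if not path.exists():
                missing.append(str(path))
                continue
            text = path.read_text()
            # Make sure consecutive frames are separated by a newline
            out.write(text if text.endswith("\n") else text + "\n")
    return missing


# Example call for the 25 bins used in this notebook:
# concatenate_frames([f"{i}.pdb" for i in range(1, 26)], "mfep-traj-np.pdb")
```

Opening the output file in write mode replaces any previous trajectory, which plays the role of the rm step in the bash cell.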