Daniel R. Roe edited this page Aug 3, 2018 · 31 revisions

Welcome to the CPPTRAJ wiki!

Here you will find (hopefully) useful information on using and developing CPPTRAJ.



3D data sets: More options when saving/reading 3D data sets.

3D data written in "standard" format can now be read back in by CPPTRAJ. One new keyword of particular note is sparse: 3D data sets in "standard" format can be written as sparse with this keyword, i.e. voxels (bins) with no population can be skipped, which can dramatically reduce the file size for sparse grids.
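
To illustrate why skipping empty voxels shrinks the output, here is a minimal Python sketch (not CPPTRAJ code). The function name `write_grid_lines` and the `i j k value` line layout are assumptions made for this example; the actual "standard" file layout is not reproduced here.

```python
def write_grid_lines(grid, sparse=False):
    """Return one text line per voxel of a nested-list 3D grid.

    Hypothetical layout: 'i j k value'. With sparse=True, voxels whose
    value is zero (no population) are skipped.
    """
    lines = []
    for i, plane in enumerate(grid):
        for j, row in enumerate(plane):
            for k, value in enumerate(row):
                if sparse and value == 0:
                    continue
                lines.append(f"{i} {j} {k} {value}")
    return lines


# A 10x10x10 grid with only two populated voxels.
grid = [[[0.0] * 10 for _ in range(10)] for _ in range(10)]
grid[1][2][3] = 4.0
grid[7][7][7] = 1.5

dense = write_grid_lines(grid)
sparse = write_grid_lines(grid, sparse=True)
print(len(dense), len(sparse))
```

For a mostly empty grid like this one, the sparse output contains only the populated voxels, which is where the file size reduction comes from.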


Cluster analysis: add ability to save more than 1 representative structure per cluster

A new keyword, savenreps <#>, is now available for cluster analysis which allows the top <#> cluster representatives to be saved instead of just 1. See the PR for full details.


Add particle mesh Ewald energy calculation

Adds particle mesh Ewald electrostatics to the energy command. Reciprocal part of the sum is handled by the helPME library.


Add command to analyze Amber constant pH simulation output

Cpptraj can now read in and analyze Amber constant pH simulation data. See the PR for full details.


Addition of some script-like syntax ('for', etc)

This adds for loops and script-like variables.

  • Script variables. Script variables start with $ and are distinct from data sets. They can be set via the set command or are created via for loops (more on those below). E.g.
set prefix = OUTPUT_DIR/run1
trajin mytraj.nc
rmsd first :1-12 out $prefix/rmsd.dat
rmsd first nofit :3 out $prefix/rmsd.dat

Script variables can be used pretty much anywhere. There are also special modes for set. You can set a variable to contain the total number of atoms, residues, or molecules in a mask, or you could set a variable to the current total number of input trajectory frames (i.e. from trajin statements). Script variables can also be appended to. Say you have two trajectories, one of length 100 and one of length 50:

trajin traj1.nc
set split = trajinframes
set split += ,
trajin traj2.nc
set split += trajinframes

The script variable $split would be set to 100,150 and could then be used e.g. for the splitframes arguments of the cluster command. The show command can be used to show all current script variables and their values.
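
The append behaviour described above can be mimicked with a small Python toy model (not CPPTRAJ code). The class `ScriptVars` and its methods are hypothetical names used only to trace how `$split` ends up as `100,150`; it assumes that `trajinframes` is the running total of frames from all `trajin` statements so far, as the text describes.

```python
class ScriptVars:
    """Toy model of script variables (hypothetical, for illustration only)."""

    def __init__(self):
        self.values = {}
        self.total_frames = 0  # running total of frames from trajin statements

    def trajin(self, nframes):
        self.total_frames += nframes

    def set(self, name, value):
        self.values[name] = str(value)

    def append(self, name, value):
        self.values[name] = self.values.get(name, "") + str(value)


v = ScriptVars()
v.trajin(100)                      # trajin traj1.nc (100 frames)
v.set("split", v.total_frames)     # set split = trajinframes
v.append("split", ",")             # set split += ,
v.trajin(50)                       # trajin traj2.nc (50 frames)
v.append("split", v.total_frames)  # set split += trajinframes
print(v.values["split"])           # 100,150
```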

  • 'for' loops. Can do regular numerical for loops, e.g.
for i=1;i<10;i++
  distance d$i @$i @20
done

This would add 9 distance commands named d1 - d9 between atoms 1-9 and atom 20 (the loop stops before i reaches 10). For loops can also iterate over mask expressions, e.g.

for atoms Natom inmask :2-129@N&!:PRO atoms Hatom inmask :2-129@H v=1;v++
  vector v$v $Natom ired $Hatom
done

This would add vector commands for each N-H bond vector in residues 2-129 (skipping prolines).


Expanded mask syntax

  • Select by chain ID. ::
::B@CA # Select all atoms named CA in chain B
  • Select by molecule number. ^
^1,2:DC@P # Select all atoms named P in residues named DC in molecules 1 and 2
  • Select by original (e.g. PDB) residue number. :;
:;3-5,8-10 # Select residues originally numbered 3 through 5 and 8 through 10
  • Select molecules by distance.
@5<^3.0 # Select all molecules within 3 Angstroms of atom number 5

The chain ID and original residue number selection are most useful for PDB topology or Amber topology with PDB info added.


Fixes and enhancements for SPAM

  • Distances are now properly imaged in the energy calculation.
  • The purewater calculation is now an order of magnitude faster.


Improvements to the 'check' Action

  • check is now orders of magnitude faster due to pair list usage.


  • Add support for reading energy terms from CHARMM output.
  • New modes for dataset command: droppoints and keeppoints:
  {drop|keep}points {range <range arg> | [start <#>] [stop <#>] [offset <#>]}
                    [name <output set>] <set arg1> ...
    Drop specified points from or keep specified points in data set(s).
  • New ways to modify dimension info for data sets with dataset:
  dim {xdim|ydim|zdim|ndim <#>} [label <label>] [min <min>] [step <step>]
    Change specified dimension in set(s).

Version 18 Beta

New Features and Enhancements

CPPTRAJ Version 17 (April 18 2017)

New Features and Enhancements

  • CUDA-enabled closest/watershell commands.
  • Additional general speed improvements for closest/watershell commands.
  • comparetop command for reporting differences between topologies.
  • Wavelet analysis via wavelet, with WAFEX (wavelet analysis feature extraction).
  • Handle very large systems (tested on ~11.6 M atoms).
  • Support Gromacs XTC format.
  • nativecontacts command can now track non-native contacts.
  • autoimage anchor keyword can now focus on specified region of molecule for tightly packed systems.
  • SPAM (spam) command now finished, works with OpenMP/MPI.
  • OpenMP parallelized volmap command.
  • GIST (gist) command speed improved (OpenMP), handles water models with extra points.

Other Improvements

  • Can read custom nucleic acid bases in nastruct command.
  • Enable changing of matrix data set mode type for matrix read in from file.
  • Properly detect symmetric matrix data for read-in matrix data.
  • More consistent results with Intel compilers.
  • sybylatom and sybylbond options for Mol2 trajout, similar to antechamber -ac and -bc options.
  • align command for performing best-fit structure alignment.
  • New NetCDF cluster pairwise matrix format.
  • Sieved frames now included in cluster results calculations.
  • cluster pairwise matrices can be treated like data sets; used for multiple cluster commands, written in different formats, etc.
  • cluster hierarchical agglomerative action speedup via OpenMP.
  • More options for choosing best representative structures in clustering.
  • Force trajectories can now be processed.
  • Fix torsion analysis in the stat command.
  • combinecrd command now works with box info.
  • Improvements for printing topology information (atoms, resinfo, molinfo, etc). Output can be redirected to files.
  • CHARMM shape matrix data now properly handled.

CPPTRAJ Version 16 (March 15, 2016)

File Formats

  • Charmm COR (read only)
  • Gromacs XVG (read only)
  • Gromacs TOP (read only)
  • CCP4 Grid Density (read/write)
  • PDB files:
    • Read and write more information (insertion codes, chain IDs, CRYST1, CONECT records, etc).
    • Improved element determination.
  • OpenDX files
    • bincenter: On write grid mesh can be aligned on grid bin centers instead of corners.
    • gridwrap: Like bincenter but also wrap grid density.
    • gridext: Like bincenter but also print an extra layer of empty bins.
  • NetCDF trajectory files: Can now read/write force information.
  • Mol2 files: sybyltype: Can output SYBYL atom types in Mol2 files if AMBERHOME is set.
  • Add keepext keyword to the amber restart, ncrestart, mol2, and pdb file formats to keep the extension intact and prepend the set number to the extension instead (e.g. out.X.pdb).


  • Across-trajectory parallelization with cpptraj.MPI. Input trajectory/ensemble reads/writes are now parallelized in cpptraj.MPI.
    • New command ensemblesize should be used with cpptraj.MPI to improve efficiency of ensemble set up in parallel.
  • Data files can be converted via command line, e.g. cpptraj -d input.dat -w output.agr
  • onlymembers <list> keyword for trajout/outtraj; during ensemble run only write trajectory data for specified ensemble members.
  • Simple math can be performed with 2D/3D (matrix/grid) data sets (addition, subtraction, etc).
  • Can be configured with both MPI and OpenMP at the same time.
  • Non-data set file output from most Actions (e.g. hbond avgout) can be combined.
  • rotdif is now an Analysis.
  • Will now compile under Windows.
  • dataset
    • Data sets can be concatenated with dataset cat <dataset> <dataset> ...
    • Can create new data set with X values from one set and Y values from another: dataset makexy <set1> <set2>
  • Can manipulate reference structures using crdaction, so e.g. reference coordinates can be stripped, translated, rotated, etc.
  • Can be compiled without ARPACK (makes diagmatrix with larger matrices very slow).
  • Compressed (gzip/bzip2) input files can be read.
  • dihedralscan has been renamed permutedihedrals to better reflect what the command does.


  • esander: Calculate energies from sander (PME/GB) if compiled with sander API.
  • diffusion
    • Now uses data set framework; can control output format/precision etc.
    • New syntax, average is the default (old syntax still supported):
        [{out <filename> | separateout <suffix>}] [time <time per frame>]
        [<mask>] [<set name>] [individual] [diffout <filename>] [nocalc]
    • Automatically calculate diffusion constants from linear regression.
    • Fix orthorhombic imaging bug.
    • Support non-orthorhombic imaging; faster than having a preceding unwrap.
  • New keyword parmout <top file> added to closest and strip commands for writing out topology files.
  • nastruct
    • Base pairing detection improved. Can detect non-WC base pairing, and base pairs in multi-stranded systems (e.g. G-quadruplex).
    • Better handling of systems where base pairs break and reform.
    • allframes: Recalculate base pairing each frame.
    • groovecalc: Can now specify 3dna to perform groove calculation of El Hassan and Calladine (as opposed to simple which is just based on phosphate/O4' distances).
    • New output column for BP.<suffix> file: BP, set to 1 if base pair present and 0 otherwise.
    • calcnohb: If specified base pair parameters will be calculated even if no hydrogen bonds present between previously found base pair.
  • closest: center option added for using center of mass of solute (instead of all atoms).
  • symmrmsd: Speedup; fixed bug in Hungarian algorithm that caused slow convergence.
  • rms
    • savematrices: Save rotation matrices to data set <name>[RM]
    • nomod keyword to prevent modification of coords even when calculating best-fit RMSD.
  • rotate
    • Can rotate around axis created from two points defined by user: axis0 <mask0> axis1 <mask1>.
    • Can rotate from rotation matrices data set (from e.g. rms savematrices): usedata <set name> [inverse]
  • replicatecell: Can now replicate more than 1 image in any direction.
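
The diffusion item above mentions calculating diffusion constants from linear regression. As a minimal Python sketch of that idea (not CPPTRAJ code), assume the Einstein relation MSD(t) = 2 * ndim * D * t, so that D is the fitted slope of mean squared displacement versus time divided by 2 * ndim. The function names and the synthetic data are hypothetical; the fitting range and units used by the diffusion command are not shown here.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope*x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def diffusion_constant(times, msd, ndim=3):
    """Einstein relation: MSD(t) = 2 * ndim * D * t, so D = slope / (2 * ndim)."""
    slope, _ = linear_fit(times, msd)
    return slope / (2 * ndim)


# Synthetic data: D = 0.5 (arbitrary units) in 3D gives MSD = 3.0 * t.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
msd = [3.0 * t for t in times]
D = diffusion_constant(times, msd)
print(D)
```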


  • vectormath: Can perform calculations on two vectors each of size N, or one vector of size 1 and another vector of size N.
  • ti: Calculate average DV/DL from input data sets and perform Gaussian quadrature integration.
  • modes: Fix fluct and displ for when eigenvalues are not in units of cm^-1. Also add option calcall to not skip zero/negative eigenvectors.
  • curvefit: Add gauss keyword to fit Gaussian.
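
The ti item above combines per-window averages of DV/DL with Gaussian quadrature. A minimal Python sketch of that combination (not CPPTRAJ code) follows. It assumes 3-point Gauss-Legendre quadrature on the interval [0, 1]; the names `ti_free_energy`, `LAMBDAS`, and `WEIGHTS` are hypothetical, and how the ti command selects its points and weights is not shown here.

```python
import math

# 3-point Gauss-Legendre nodes and weights mapped onto the interval [0, 1].
_offset = 0.5 * math.sqrt(3.0 / 5.0)
LAMBDAS = [0.5 - _offset, 0.5, 0.5 + _offset]
WEIGHTS = [5.0 / 18.0, 8.0 / 18.0, 5.0 / 18.0]


def average(values):
    return sum(values) / len(values)


def ti_free_energy(dvdl_sets, weights=WEIGHTS):
    """Sum of weight_i * <DV/DL>_i over the lambda windows."""
    return sum(w * average(s) for w, s in zip(weights, dvdl_sets))


# Synthetic check: pretend <DV/DL>(lambda) = 3*lambda**2 + 2*lambda,
# whose exact integral over [0, 1] is 2.
dvdl_sets = [[3 * lam**2 + 2 * lam] * 4 for lam in LAMBDAS]
dG = ti_free_energy(dvdl_sets)
print(round(dG, 10))
```

Three-point Gauss-Legendre quadrature is exact for polynomials up to degree 5, which is why the synthetic quadratic integrates to its exact value.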

As of September 24, 2015

File Formats

  • Gromacs GRO (read only)


  • Ability to read and process constant pH REMD logs.
  • PDB files:
    • CONECT records used to determine bonds; also add noconect option to PDB parm read to prevent using CONECT records to determine bonds.
    • For PDB write, add include_ep keyword to enable write of extra points.
  • Fix segfault when using DataSet_Coords_TRJ with trajectories containing velocity information.
  • Fix reading MDOUT from pmemd TI run.
  • Fix residue single char output (TYR mistakenly converted to R instead of Y).
  • Data sets should now be properly appended to by readdata.
  • Can be configured to use FFTW3 (configure -fftw3 --with-fftw3=).
  • Data sets can now be selected using wildcard chars * and ?, so one could for example specify C?[Life*] to get C1[Lifetime], C2[Lifelink], etc.
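
The wildcard selection in the last item can be illustrated with a short Python sketch (not CPPTRAJ code). It assumes `*` matches any run of characters and `?` matches exactly one character, with everything else (including the square brackets) taken literally. The helper names are hypothetical, and CPPTRAJ's own matching rules may differ in detail.

```python
import re


def wildcard_to_regex(pattern):
    """Translate * and ? into a regex; every other character is literal."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def select_sets(pattern, names):
    """Return the names that match the whole wildcard pattern."""
    rx = wildcard_to_regex(pattern)
    return [n for n in names if rx.fullmatch(n)]


names = ["C1[Lifetime]", "C2[Lifelink]", "C10[Lifetime]", "C1[Max]"]
selected = select_sets("C?[Life*]", names)
print(selected)
```

Python's `fnmatch` is deliberately not used here because it treats `[...]` as a character class, whereas in this sketch the brackets are part of the data set name.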


  • nastruct:
    • Add Zp calculation (3DNA style).
    • Better determination of base pairing; can now recognize non-WC base pairs.
    • Improved determination of base pair steps.
  • check: Add around <mask> functionality to check command to restrict part of the system checked.


  • remlog: Ability to determine stats for different dimensions in remlog analysis.
  • kde: Fix to Kullback-Leibler divergence analysis in kde that occasionally resulted in 'nan' being written.
  • ired
    • Fixes for the relaxation calculation (integration of Cm(t) and calculation of Cj(t)).
    • Add full delta*S^2 matrix output option, ds2matrix <file>.
  • avg: oversets keyword; can now calculate average over multiple sets (as opposed to average of each individual set).
  • cluster: Fixes to dpeaks clustering. Add option to choose points manually (choosepoints manual) via the options distancecut <distcut> and densitycut <densitycut>. Probably more reliable than choosepoints auto at the moment.
  • calcstate: Assign states to frames based on data set criteria; resulting sets can be used in e.g. lifetime analysis.
  • multicurve: Non-linear curve fitting for multiple input data sets.
  • wavelet: Wavelet analysis on Cartesian coordinates.

As of May 1, 2015

  • Ability to perform math from command line, including limited ability to perform math on 1D data sets.
  • curvefit: Non-linear curve fitting.
  • cluster:
    • k-means clustering: [kmeans clusters <n> [randompoint [kseed <seed>]] [maxit <iterations>]]
    • Can visualize pairwise distance matrix from cluster analysis in 2D (drawgraph) or 3D (drawgraph3d).
  • grid: Non-orthogonal grids; new keyword boxref. Usage: grid <filename> boxref <ref name/tag> <nx> <ny> <nz>
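
For readers unfamiliar with k-means, the following Python sketch shows the basic iteration (Lloyd's algorithm) on 1D points. It is not CPPTRAJ's implementation. The function `kmeans_1d` is hypothetical, and its `seed` and `maxit` parameters only loosely mirror the roles of the kseed and maxit keywords listed above.

```python
import random


def kmeans_1d(points, k, seed=1, maxit=100):
    """Lloyd's algorithm on 1D points; returns (centroids, assignments)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)   # random initial points
    assignments = None
    for _ in range(maxit):
        # Assign each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: abs(p - centroids[c])) for p in points
        ]
        if new_assignments == assignments:
            break                        # converged: nothing changed
        assignments = new_assignments
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:                  # keep old centroid if cluster is empty
                centroids[c] = sum(members) / len(members)
    return centroids, assignments


points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]
centroids, assignments = kmeans_1d(points, k=2)
print(sorted(centroids))
```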