
enable use of CVs defined by PyTorch neural network models #570

Draft · wants to merge 73 commits into base: master

Conversation

@zwpku (Member) commented Aug 28, 2023

This branch implements a class called torchANN, which allows CV components to be defined by loading pretrained PyTorch neural network models.

Installation Steps

  1. Download LibTorch. This library is required to enable the torchann class. First, download the archive and unzip it:

         wget https://download.pytorch.org/libtorch/nightly/cpu/libtorch-cxx11-abi-shared-with-deps-latest.zip
         unzip libtorch-cxx11-abi-shared-with-deps-latest.zip
    

    This unpacks the library into the current directory; suppose it is located at /path/to/libtorch.

  2. Patch the MD engine. This step is done as usual with the script update-colvars-code.sh. From the source directory of the Colvars package, run:

         ./update-colvars-code.sh /path/to/md-engine        
    
  3. Compilation. This step depends on the engine being built.

    • NAMD: add "--with-colvars-torch --torch-prefix path/to/libtorch" to the arguments of ./config

      Assuming the packages required to build NAMD (e.g. charm, tcl/tcl-threaded) are already prepared,
      NAMD can be compiled with the following commands:

        ./config Linux-x86_64-g++ --charm-arch multicore-linux-x86_64 --with-colvars-torch    \
              --torch-prefix /path/to/libtorch  --with-fftw3 --fftw-prefix /path/to/fftw
        cd Linux-x86_64-g++
        make 
    
    • GROMACS: add "-DTorch_DIR=/path/to/libtorch/share/cmake/Torch" when running cmake

    An example of the command is:

        cmake .. -DCMAKE_INSTALL_PREFIX=/home/username/local/gromacs  \
                        -DFFTWF_LIBRARY=/home/username/mambaforge/lib/libfftw3f.so  \
                        -DFFTWF_INCLUDE_DIR=/home/username/mambaforge/include \
                        -DTorch_DIR=/path/to/libtorch/share/cmake/Torch/  \
                        -DCMAKE_CXX_COMPILER=/usr/bin/mpicxx \
                        -DOpenMP_gomp_LIBRARY=/home/username/mambaforge/lib/libgomp.so
    
    • LAMMPS: only building with CMake is supported. In the LAMMPS source directory, run

         mkdir build && cd build
         cmake ../cmake -D PKG_COLVARS=yes -D COLVARS_TORCH=yes
    

    and set the variable Torch_DIR in the file CMakeCache.txt. When a CPU-only version of the LibTorch
    library is used, it may also be necessary to set the MKL include path to empty:

         MKL_INCLUDE_DIR:PATH=
    

    Alternatively, one could combine these steps in one command:

         cmake ../cmake -D PKG_COLVARS=yes -D COLVARS_TORCH=yes \
             -D Torch_DIR=/path/to/libtorch/share/cmake/Torch -D MKL_INCLUDE_DIR=
    

    After that, run make and make install to compile and install the package.

The class has only been tested with simple neural network models (e.g. an autoencoder for alanine dipeptide) under the NAMD and GROMACS engines. Feedback is welcome!

A (trivial) example

  1. Create a PyTorch model
import torch

class MyModel(torch.nn.Module):
    def forward(self, x):
        return x

model = MyModel()
scripted_cv_filename = './identity.pt'
torch.jit.script(model).save(scripted_cv_filename)

This Python script simply creates a model that is an identity map and saves it to a file named identity.pt.
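As an optional sanity check, the scripted file can be reloaded the same way Colvars will load it and verified to act as the identity on a pair of dihedral values. The snippet below recreates the model so that it is self-contained:

```python
import torch

class MyModel(torch.nn.Module):
    def forward(self, x):
        return x

torch.jit.script(MyModel()).save('identity.pt')

# Reload the scripted model and verify the identity behaviour
model = torch.jit.load('identity.pt')
inp = torch.tensor([-60.0, 45.0])  # e.g. two dihedral angles, in degrees
assert torch.equal(model(inp), inp)
```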

  2. Define the Colvars config file

This file defines two CVs using the torchann class, taking other CV components (here, dihedral angles) as inputs.

colvarsTrajFrequency    10000
colvarsRestartFrequency 10000

colvar {
  name nn_0
  lowerBoundary -180.0
  upperBoundary 180
  width 5.0
  extendedLagrangian on
  extendedFluctuation 5.0
  extendedTimeConstant 200

  torchann {
    modelFile identity.pt
    m_output_index 0
    period 360

    dihedral {
      group1 { 
	atomnumbers 5
      }
      group2 { 
	atomnumbers 7
      }
      group3 { 
	atomnumbers 9
      }
      group4 { 
	atomnumbers 15
      }
    }

    dihedral {
      group1 { 
	atomnumbers 7
      }
      group2 { 
	atomnumbers 9
      }
      group3 { 
	atomnumbers 15
      }
      group4 { 
	atomnumbers 17
      }
    }

  }
}

colvar {
  name nn_1
  lowerBoundary -180.0
  upperBoundary 180
  width 5.0
  extendedLagrangian on
  extendedFluctuation 5.0
  extendedTimeConstant 200

  torchann {
    modelFile identity.pt
    m_output_index 1
    period 360

    dihedral {
      group1 { 
	atomnumbers 5
      }
      group2 { 
	atomnumbers 7
      }
      group3 { 
	atomnumbers 9
      }
      group4 { 
	atomnumbers 15
      }
    }

    dihedral {
      group1 { 
	atomnumbers 7
      }
      group2 { 
	atomnumbers 9
      }
      group3 { 
	atomnumbers 15
      }
      group4 { 
	atomnumbers 17
      }
    }
  }
}

abf {
  colvars nn_0 nn_1
  fullSamples	200
}
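For reference, the scalar that each dihedral component above feeds into torchann is the usual torsion angle. A self-contained pure-Python sketch of that quantity (the coordinates in the example call are illustrative, not taken from a real structure):

```python
import math

def dihedral(p1, p2, p3, p4):
    """Torsion angle (degrees) defined by four points, with the
    usual sign convention (0 for cis/eclipsed, +/-180 for trans)."""
    def sub(a, b):
        return [a[i] - b[i] for i in range(3)]
    def dot(a, b):
        return sum(a[i] * b[i] for i in range(3))
    def cross(a, b):
        return [a[1]*b[2] - a[2]*b[1],
                a[2]*b[0] - a[0]*b[2],
                a[0]*b[1] - a[1]*b[0]]
    b1, b2, b3 = sub(p2, p1), sub(p3, p2), sub(p4, p3)
    n1, n2 = cross(b1, b2), cross(b2, b3)   # normals of the two planes
    b2n = math.sqrt(dot(b2, b2))
    m1 = cross(n1, [v / b2n for v in b2])
    return math.degrees(math.atan2(dot(m1, n2), dot(n1, n2)))

# Planar cis arrangement -> 0 degrees
print(dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0)))
```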

@giacomofiorin (Member)

@zwpku Thank you for handling merge conflicts coming from master. PR #643 contains more of them, but that could be merged after this PR to make it easier.

To move forward, can you provide a list of commands to install libTorch, so that the feature can be included in the automated tests? I can help integrate them with the GitHub Actions workflows and the hodgepodge of testing scripts in this repo if needed.

libTorch should ideally be installed from a precompiled package, unless compilation from source can proceed in a reasonable time (e.g. less than 30 minutes) when no cache is present.

@giacomofiorin (Member)

Hi @zwpku I'd like to move ahead with #643, which will create some merge conflicts with this branch. To avoid that, it may be better to merge this branch first, if you think it's ready for review. Do you have the PyTorch installation instructions suitable for reproducing the results of the tests that you added?

@zwpku (Member, Author) commented Feb 21, 2024

Hi @giacomofiorin, sorry for being inactive for a while. Once downloaded, LibTorch can be linked against directly when compiling patched MD engines. However, when I tried to repeat all the steps (from downloading to testing) today, some errors were reported while compiling the engines (both NAMD and GROMACS). I will resolve the problem tomorrow and let you know the steps.

@zwpku (Member, Author) commented Feb 22, 2024

Hi @giacomofiorin, I updated this branch with master and included the list of commands for installation at the beginning of this page. On my local machine, the installation works for both NAMD 3.0b3 and gromacs-2023.2.

It seems that older versions of NAMD (e.g. versions from 2022) are no longer supported?

Let me know if any further information about the installation is needed.

@giacomofiorin (Member)

Thanks for the quick reply!

NAMD 2.x (i.e. the master branch) is effectively abandoned by the NAMD team, which will make all new releases based on the devel branch.

@giacomofiorin (Member)

@zwpku We have now a small conflict in the GROMACS 2023 patch due to a change in master.

Would you have any strong objections against supporting GROMACS >= 2024.0 and only those for this feature? (I can make that change)

@zwpku (Member, Author) commented Apr 23, 2024

@giacomofiorin, sure, it is fine to support only GROMACS >= 2024 for this feature.

@zwpku (Member, Author) commented Apr 23, 2024

@HanatoK @giacomofiorin
An issue I noticed in my tests was that ABF simulations with a CV defined by this class slow down quite a bit.
The PyTorch model I used involves a rotation/translation layer (implemented by myself in Python) and takes the coordinates of backbone atoms as input. The computation may be faster when the model is simpler (e.g. a function of CVs). But I guess that, in general, this class, which calls the LibTorch library, is slower than the neuralNetwork class. I wanted to look further into its performance, but haven't gotten to it yet...

I wonder whether one can implement similar CV classes using APIs from other ML packages, e.g. TensorFlow or JAX. Any ideas?

@HanatoK (Member) commented Apr 23, 2024

@zwpku I am also working on a new NAMD interface that supports GPU-resident calculations, and I also use LibTorch as an example. Based on my experience, it depends on how you implement the rotation/translation layer in Python. My impression is that if you use the default automatic differentiation, then it will be slow. I think the performance is better when using manual derivatives (my code is in https://github.com/HanatoK/namd_pytorchforces_example/blob/1f05b007b724502fccc23e06207f5ac865691285/build_rmsd_cv_opt.py#L143-L155). Have you tried implementing a custom backward or something similar?

As for JAX, I am also looking into it, but it seems to lack documentation for its C++ API. By the way, the PyTorch community is moving to its new AOTInductor interface, and TorchScript is in maintenance mode, but again the documentation of AOTInductor is a problem.
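To illustrate the manual-derivative idea in a generic way (this is a minimal sketch, not the code linked above): a torch.autograd.Function with a hand-written backward replaces the traced backward graph with a closed-form expression.

```python
import torch

class ManualTanh(torch.autograd.Function):
    """tanh with a hand-written backward instead of traced autodiff."""
    @staticmethod
    def forward(ctx, x):
        y = torch.tanh(x)
        ctx.save_for_backward(y)
        return y

    @staticmethod
    def backward(ctx, grad_out):
        (y,) = ctx.saved_tensors
        return grad_out * (1.0 - y * y)  # d tanh(x)/dx = 1 - tanh(x)^2

x = torch.linspace(-2.0, 2.0, 5, requires_grad=True)
ManualTanh.apply(x).sum().backward()
# x.grad now holds 1 - tanh(x)^2, computed from the stored forward result
```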

@zwpku (Member, Author) commented Apr 24, 2024

@HanatoK Thanks a lot for this information. The model I used for the CV was obtained by learning eigenfunctions associated with the dynamics. I included a transformation after the input layer of the model to make sure the function is invariant under rotation and translation. In the end, it has the form $f(x)=g(T(x))$, where the input $x$ are the coordinates of the non-hydrogen atoms and $T:\mathbb{R}^d\rightarrow \mathbb{R}^d$ is the optimal transformation minimizing the RMSD. My implementation (from some time ago) is here: https://github.com/zwpku/molann/blob/cdd140e40036a6c41b3a2df08f002c6fb2443b6e/molann/ann.py#L157
As a result, in my case computing the derivatives of $f$ w.r.t. $x$ by autodiff requires differentiating the steps of the RMSD minimization, in particular the SVD. That possibly made my CVs expensive to use in simulations.

@zwpku (Member, Author) commented Apr 24, 2024

@HanatoK @giacomofiorin
A question (not about this class, but about the RMSD function) occurred to me while writing the previous comment, and I would like to hear your thoughts. To my understanding, $\mathrm{RMSD}(x) = |x-c(x)-R(x)x_{\mathrm{ref}}|^2$, where $c(x)$ and $R(x)$ are the optimal translation and the optimal rotation, respectively. If this is correct, then computing the derivative of the RMSD w.r.t. $x$ would also require differentiating both $c(x)$ and $R(x)$ by the chain rule, since they both depend on $x$.
On the other hand, in the code shared by @HanatoK in the comment above, this dependence does not seem to be taken into account (I have only read part of the code, so maybe I misunderstood). Also, in Colvars there seems to be an option in the rmsd CV class that controls whether to handle this dependence.
It is not clear to me how to understand the code. Did I miss something, or is there some cancellation that allows you to avoid differentiating $c(x)$ and $R(x)$? Thanks!

@HanatoK (Member) commented Apr 24, 2024

@zwpku The RMSD is something special. I don't think it requires calculating the derivative of the rotation matrix $R(x,x_\mathrm{ref})$ with respect to $x$ or $x_\mathrm{ref}$. You can see "The algorithm" section of this paper, page 1855. The $\mathcal{U}^T(q_\mathrm{max})$ there is just the transpose of the rotation matrix.

For other CVs depending on the aligned positions, I think Colvars does have the fit gradients for them, but I am not quite sure.

@giacomofiorin (Member)

Hi @zwpku! @HanatoK is correct that the RMSD does not normally need those derivatives, because they are zero as long as the atoms included in the RMSD are the same ones over which the roto-translation is computed. If the two are not the same (i.e. if fittingGroup is defined), then the fit gradients are non-zero and need to be calculated. The same is also true if rotated coordinates are used in a CV other than the RMSD, which is the only one whose fit gradients are zero by definition.
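This cancellation can be checked numerically. The sketch below (pure Python, a 2D toy example, not Colvars code) computes the mean squared deviation after optimal alignment, and verifies that the gradient obtained while holding the optimal rotation and translation fixed matches a finite-difference gradient of the fully re-optimized quantity: at the optimum, the terms involving derivatives of $R(x)$ and $c(x)$ vanish.

```python
import math

def msd_aligned(x, y):
    """Mean squared deviation of 2D points x from reference y
    after the optimal rigid alignment (translation + rotation)."""
    n = len(x)
    cx = [sum(p[k] for p in x) / n for k in (0, 1)]
    cy = [sum(p[k] for p in y) / n for k in (0, 1)]
    xs = [(p[0] - cx[0], p[1] - cx[1]) for p in x]
    ys = [(p[0] - cy[0], p[1] - cy[1]) for p in y]
    # In 2D the optimal rotation angle has a closed form
    t = math.atan2(sum(a[0]*b[1] - a[1]*b[0] for a, b in zip(xs, ys)),
                   sum(a[0]*b[0] + a[1]*b[1] for a, b in zip(xs, ys)))
    c, s = math.cos(t), math.sin(t)
    resid = [(c*a[0] - s*a[1] - b[0], s*a[0] + c*a[1] - b[1])
             for a, b in zip(xs, ys)]
    msd = sum(r[0]**2 + r[1]**2 for r in resid) / n
    return msd, (c, s), resid

x = [(0.1, 0.0), (1.0, 0.2), (0.4, 1.1), (-0.5, 0.6)]
y = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0), (-0.4, 0.5)]
msd, (c, s), resid = msd_aligned(x, y)

n, h, max_err = len(x), 1e-6, 0.0
for i in range(n):
    for k in range(2):
        # Gradient treating R and c as constants: (2/n) * R^T r_i
        g = (2.0 / n) * (c*resid[i][0] + s*resid[i][1] if k == 0
                         else -s*resid[i][0] + c*resid[i][1])
        xp = [list(p) for p in x]; xp[i][k] += h
        xm = [list(p) for p in x]; xm[i][k] -= h
        fd = (msd_aligned(xp, y)[0] - msd_aligned(xm, y)[0]) / (2 * h)
        max_err = max(max_err, abs(g - fd))
# max_err stays tiny even though the finite difference re-optimizes
# the rotation and translation for every perturbed configuration
```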

Unfortunately, due to limitations in the current data structures, we do not have support for computing/storing the fit gradients of vector-valued variables (#87), which include Cartesian coordinates. But that limitation should not affect you, because you have already defined the torchANN function to return a scalar value.

Regarding speed, I would finish up this PR for setting up NN functions of CVs. But if you want to use Cartesian coordinates directly, I would recommend checking out @HanatoK's preliminary work in that direction. I do acknowledge that you may want to use rotated-frame coordinates, which is not possible now but may become possible later (or is it?).

@zwpku (Member, Author) commented Apr 25, 2024

@HanatoK @giacomofiorin Thank you both for the explanations. The case of the RMSD function is now clear to me.
torchann defines a function of CVs. To define a function of Cartesian coordinates directly, one can combine it with cartesian, as in the example below:

torchann {
    modelFile model.pt
    m_output_index 0
    cartesian {
        atoms {
            atomNumbers {
                1 5 6 7 9 11 15 16 17 19
            }
        }
    }
}
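For such a setup, the referenced model.pt would need to consume the flattened coordinates provided by cartesian. A hypothetical example (the radius-of-gyration function, the assumed flat input layout, and the file name rgyr.pt are illustrative assumptions, not part of this PR):

```python
import torch

class RadiusOfGyration(torch.nn.Module):
    """Illustrative scalar CV: radius of gyration (no mass weighting)
    of flattened Cartesian coordinates (x1, y1, z1, x2, ...)."""
    def forward(self, x):
        pos = x.reshape(-1, 3)               # (N, 3) positions
        centered = pos - pos.mean(dim=0)
        return centered.pow(2).sum(dim=1).mean().sqrt().reshape([1])

torch.jit.script(RadiusOfGyration()).save('rgyr.pt')
```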

It is not clear whether I can define a function of fitted coordinates by simply turning on the options rotateToReference and centerToReference in atoms:

torchann {
    modelFile model.pt
    m_output_index 0
    cartesian {
        atoms {
            centerToReference yes
            rotateToReference yes
            atomNumbers {
                1 5 6 7 9 11 15 16 17 19
            }
            refPositionsFile all.pdb
        }
    }
}

I will look into the code to better understand it.

@HanatoK (Member) commented Apr 25, 2024

@zwpku I attach a configuration from a binding free-energy (BFEE) calculation here for your reference.

eulerTheta {
    atoms {
        indexGroup         ligand
        centerReference    on
        rotateReference    on
        centerToOrigin     on
        enableFitGradients on
        fittingGroup {
            indexGroup protein
        }
        refpositionsfile ./complex_largeBox.xyz
    }
    refpositionsfile ./complex_largeBox.xyz
}

The system has two biomolecules, namely a protein and a ligand. We want to align the simulated system to the reference positions using only the protein, and then measure the relative Euler theta angle of the ligand with respect to the reference.

If you only work with a single molecule, then you might not need the fittingGroup. eulerTheta accepts refpositionsfile, but your torchann doesn't need it.
