vendruscolo-lab/ab42-kinetic-ensemble

A kinetic ensemble of the Alzheimer's Aβ peptide

This repository contains the full code and some small example data to reproduce our results on the kinetic ensemble of amyloid-β 42. See also the original implementation of the constrained VAMPNets.

Reproducibility information

The analysis was performed on a single Google Compute Engine instance with 12 vCPUs, 78 GB of memory, and one NVIDIA Tesla V100 GPU. The instance used the c3-deeplearning-tf-ent-2-1-cu100-20200131 image, based on Debian 10, with CUDA 10.0 and TensorFlow 2.1. The original environment for this machine is provided in env-tf-original.yml. Training a single network takes approximately 2 hours on this hardware.

Dataset

The full dataset is available on Zenodo. It is around 45 GB in size and includes the following directories:

  • trajectories/: The simulation trajectories, subsampled to a 250 ps timestep. The simulations were performed in 5 rounds of 1024 individual trajectories each. The aggregated simulated time is 314 µs for the reduced form of Aβ42 and 317 µs for the oxidised form. Also includes the chemical shifts back-calculated with CamShift as implemented in PLUMED.
  • intermediate/: Intermediate data files, such as the calculated inter-residue minimum distances, and the full model outputs in the form of transition matrices, weights, and timescales.
  • models/: The neural network models including weights and trajectory indices used for training.
  • structures{,-alt}/: State structures sampled from the trajectories in two different ways.
  • figs/: Raw figures for the paper, generated by the notebooks.
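
As a rough sanity check on the trajectory figures listed above, the following sketch converts the aggregated simulation times into approximate frame counts at the 250 ps sampling interval. It assumes that the 5 rounds of 1024 trajectories apply to each form of the peptide separately, which the description does not state explicitly. The helper names are illustrative and are not part of the repository.

```python
# Back-of-the-envelope numbers derived from the dataset description.
# Assumption: 5 rounds x 1024 trajectories were run for each form separately.

FRAME_INTERVAL_PS = 250          # trajectories are subsampled to 250 ps
ROUNDS = 5
TRAJECTORIES_PER_ROUND = 1024

# Aggregated simulated time per form, in microseconds (from the list above).
AGGREGATE_US = {"reduced": 314, "oxidised": 317}

PS_PER_US = 1_000_000


def frame_count(aggregate_us, interval_ps=FRAME_INTERVAL_PS):
    """Approximate number of stored frames for a given aggregated time."""
    return aggregate_us * PS_PER_US // interval_ps


def mean_trajectory_ns(aggregate_us, n_trajectories):
    """Mean trajectory length in nanoseconds, averaged over all trajectories."""
    return aggregate_us * 1000 / n_trajectories


n_trajectories = ROUNDS * TRAJECTORIES_PER_ROUND

for form, total_us in AGGREGATE_US.items():
    print(
        f"{form}: ~{frame_count(total_us):,} frames, "
        f"~{mean_trajectory_ns(total_us, n_trajectories):.1f} ns per trajectory on average"
    )
```

Under these assumptions each form corresponds to roughly 1.26 to 1.27 million frames, which is useful to keep in mind when deciding how much of the 45 GB dataset to download or load into memory.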

Notebooks

The easiest way to try out the notebooks is with conda. We include several environment specifications: env-tf.yml for running the neural network, env-analysis.yml for the analysis and plotting of the results, and env-msmbuilder.yml for the msm-classic.ipynb notebook with the conventional Markov models. Because installing TensorFlow is highly specific to your machine, we strongly recommend following the official installation instructions. To create an environment, for example the analysis one, run conda env create -f env-analysis.yml and activate it with conda activate analysis. You will also need to install a TensorFlow 2.* compatible version of vamptools for training.
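
The environment setup described above could be scripted along the following lines. This is a minimal sketch, not part of the repository: the function names and the dry_run flag are illustrative, and the script only prints the commands unless dry_run is set to False. Since TensorFlow installation is machine specific, env-tf.yml may need adjusting before it is used this way.

```python
import shutil
import subprocess
from typing import List

# Environment files named in the paragraph above.
ENV_FILES = ["env-tf.yml", "env-analysis.yml", "env-msmbuilder.yml"]


def conda_create_command(env_file: str) -> List[str]:
    """Return the argv for creating a conda environment from a spec file."""
    return ["conda", "env", "create", "-f", env_file]


def create_environments(env_files: List[str], dry_run: bool = True) -> List[List[str]]:
    """Build (and optionally run) one `conda env create` call per file."""
    commands = [conda_create_command(f) for f in env_files]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # Fail early with a clear message if conda is not installed.
            if shutil.which("conda") is None:
                raise RuntimeError("conda was not found on PATH")
            subprocess.run(cmd, check=True)
    return commands


# Dry run: only prints the commands that would be executed.
commands = create_environments(ENV_FILES, dry_run=True)
```

Activation still has to happen in your own shell (for example conda activate analysis), because a child process cannot change the environment of the shell that launched it.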

  • msm-vampe-hyperpar.ipynb: Hyperparameter search code. Can be run with papermill in the env-tf.yml environment.
  • msm-vampe-training.ipynb: Training code. Can be run with papermill in the env-tf.yml environment.
  • msm-vampe-convergence.ipynb: Simple convergence test using subsets of the trajectory data with the trained models. Run in the env-tf.yml environment.
  • msm-vampe-analysis.ipynb: Full analysis and plots of the ensemble. Run in the env-analysis.yml environment.
  • msm-classic.ipynb: Attempt at building a classic MSM. Run in the env-msmbuilder.yml environment.
  • model.py: The neural network model code.
  • data.py: The data generators and handlers, without any TensorFlow-related code, for use with msm-vampe-analysis.ipynb.
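
The pairing of notebooks and environments in the list above can be summarised as a small lookup table, together with a helper that assembles a papermill command line. This is only a sketch: the helper and the output filename are hypothetical, and no notebook parameters are passed because their names are not documented here. It builds the command without executing it.

```python
# Which environment file each notebook is meant to run in, per the list above.
NOTEBOOK_ENVIRONMENTS = {
    "msm-vampe-hyperpar.ipynb": "env-tf.yml",
    "msm-vampe-training.ipynb": "env-tf.yml",
    "msm-vampe-convergence.ipynb": "env-tf.yml",
    "msm-vampe-analysis.ipynb": "env-analysis.yml",
    "msm-classic.ipynb": "env-msmbuilder.yml",
}


def papermill_command(notebook, output, parameters=None):
    """Build a papermill CLI call; parameter names are left to the caller."""
    cmd = ["papermill", notebook, output]
    for name, value in (parameters or {}).items():
        # papermill's -p option takes a parameter name followed by its value.
        cmd += ["-p", name, str(value)]
    return cmd


# Hypothetical output name, chosen only for illustration.
cmd = papermill_command("msm-vampe-training.ipynb", "msm-vampe-training-out.ipynb")
print(" ".join(cmd))
print("run inside:", NOTEBOOK_ENVIRONMENTS["msm-vampe-training.ipynb"])
```

Writing the executed notebook to a separate output file keeps the original notebook unchanged, which makes repeated training or hyperparameter runs easier to compare.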
