aNiMAte

aNiMAte is an unsupervised reconstruction approach that eliminates post-hoc fitting of atomic models into 3D maps. It deforms a given atomic model along its normal modes and thereby directly provides atomistic details of the structural spread of arbitrarily sized molecules captured by cryo-EM.
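
To make the idea of deforming a model along its normal modes concrete, here is a minimal sketch. It assumes the usual linear normal-mode picture, where each deformed structure is the reference coordinates plus a weighted sum of per-atom mode displacements. The function name, the data layout and the toy numbers are illustrative assumptions, not the code in src.

```python
from typing import List, Sequence

Coords = List[List[float]]  # one [x, y, z] triple per atom


def deform_along_modes(reference: Coords,
                       modes: Sequence[Coords],
                       amplitudes: Sequence[float]) -> Coords:
    """Return reference + sum_k amplitudes[k] * modes[k], atom by atom."""
    if len(modes) != len(amplitudes):
        raise ValueError("need exactly one amplitude per mode")
    deformed = [list(atom) for atom in reference]  # copy, leave the input untouched
    for mode, amplitude in zip(modes, amplitudes):
        if len(mode) != len(reference):
            raise ValueError("each mode needs one displacement per atom")
        for atom, displacement in zip(deformed, mode):
            for axis in range(3):
                atom[axis] += amplitude * displacement[axis]
    return deformed


# Toy two-atom model with two hypothetical modes (not taken from the data directory).
reference = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
modes = [
    [[0.0, 1.0, 0.0], [0.0, -1.0, 0.0]],  # atoms move in opposite y directions
    [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]],   # both atoms shift along z
]
deformed = deform_along_modes(reference, modes, amplitudes=[0.5, 2.0])
print(deformed)
```

In this picture, one set of mode amplitudes describes one conformation, so a distribution of amplitudes describes the structural spread.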

Dependencies are listed in requirements.txt and are already incorporated in a Docker/Singularity image that's available on Docker Hub. You can build your own image by editing the provided Dockerfile and running docker build.

SLAC central install

aNiMAte has been extensively tested at the SLAC Shared Scientific Data Facility (SDF) where the Singularity image is available at /sdf/group/ml/CryoNet/singularity_images/animate_latest.sif. To update it:
cd /sdf/group/ml/CryoNet/singularity_images
singularity pull -F docker://fpoitevi/animate:latest

Getting Started

After cloning the code, everything should be run from the top-level directory of the repository. SDF SLURM job submission scripts are included under scripts, and they rely on config files like the ones included under configs. The config files specify the arguments passed to the three modes the code can run in: 1) Generating datasets (RELION starfiles), 2) Training (from starfiles or simulation), and 3) Evaluation (using a checkpoint after/during training).

1- Generating Datasets

In generation mode, the code generates a simulated cryo-EM dataset (particle .mrcs files and a RELION 3.1 .star starfile), starting from an atomic model PDB file. The simulation parameters are specified in a config file. An example simulation for Adenylate kinase is included under configs/ak-atomic-primal-sim.ini, and can be run on SDF as follows:

sbatch -t 1:00:00 scripts/submit_job_generate.sh configs/ak-atomic-primal-sim.ini sim-data

This will generate a dataset under the relative directory sim-data, but a fully resolved path can also be passed as the second argument to the SLURM script.
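
To sanity-check a generated starfile, a small reader can be handy. The sketch below parses only the loop_ tables of STAR-formatted text and keeps every value as a string. It does not handle quoted values containing spaces or non-loop key/value entries. The sample text is made up in the RELION 3.1 layout (an optics block followed by a particles block) and is not actual output of the generation job. The function name read_star_loops is hypothetical.

```python
from typing import Dict, List

# Made-up sample in RELION 3.1 layout; not actual output of the generation job.
SAMPLE_STAR = """\
# version 30001

data_optics

loop_
_rlnOpticsGroup #1
_rlnVoltage #2
1 300.0

data_particles

loop_
_rlnImageName #1
_rlnOpticsGroup #2
000001@particles.mrcs 1
000002@particles.mrcs 1
"""


def read_star_loops(text: str) -> Dict[str, List[Dict[str, str]]]:
    """Parse loop_ tables into {block_name: [row_dict, ...]}; values stay strings."""
    tables: Dict[str, List[Dict[str, str]]] = {}
    block = None
    labels: List[str] = []
    in_loop = False
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("data_"):
            block, labels, in_loop = line, [], False
            tables[block] = []
        elif line == "loop_":
            labels, in_loop = [], True
        elif line.startswith("_"):
            if in_loop:
                labels.append(line.split()[0])  # drop the trailing "#n" column index
        elif in_loop and block is not None:
            tables[block].append(dict(zip(labels, line.split())))
    return tables


tables = read_star_loops(SAMPLE_STAR)
particles = tables["data_particles"]
print(len(particles), particles[0]["_rlnImageName"])
```

For a real dataset, pass the contents of the generated .star file instead of SAMPLE_STAR.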

2- Training

There are two submodes for training a model: 1) with simulated data that is generated on-the-fly using NMA (same dynamic model as the reconstructed model), and 2) with data read from RELION 3.1 starfiles (like the one simulated above or experimental datasets). An example training run on simulated data can be launched as follows:

sbatch -t 1:00:00 --gpus 1 scripts/submit_job.sh configs/ak-atomic-primal-sim.ini

While the model is training, intermediate results (tensorboard logs, model checkpoints, etc.) are saved under ./logs, with each training run in a subdirectory named after the SLURM job ID (e.g. logs/8007570_0). Training can be monitored on SDF by running a tensorboard server pointing to the logs directory:

singularity exec -B /sdf /sdf/group/ml/CryoNet/singularity_images/animate_latest.sif tensorboard --logdir=logs/ --port=6007 &
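
When many runs accumulate under logs, it can help to group the run directories by job ID. The sketch below assumes training directories are named [SLURM_JOB_ID]_[index], as in logs/8007570_0, and it ignores everything else, including directories without a suffix. The helper name group_training_runs is hypothetical, and the demo runs on a temporary directory standing in for ./logs.

```python
import re
import tempfile
from pathlib import Path
from typing import Dict, List

# Training run directories look like [SLURM_JOB_ID]_[index], e.g. 8007570_0.
RUN_DIR = re.compile(r"^(\d+)_(\d+)$")


def group_training_runs(log_root: Path) -> Dict[str, List[Path]]:
    """Map each SLURM job ID to its run directories, ordered by numeric suffix."""
    runs: Dict[str, List[Path]] = {}
    for entry in log_root.iterdir():
        match = RUN_DIR.match(entry.name)
        if entry.is_dir() and match:
            runs.setdefault(match.group(1), []).append(entry)
    for dirs in runs.values():
        # Numeric sort so that _10 comes after _2.
        dirs.sort(key=lambda path: int(path.name.rsplit("_", 1)[1]))
    return runs


# Demo on a throwaway directory tree standing in for ./logs.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ["8007570_0", "7993005_0", "7993005_2", "7993005_10", "7993005"]:
        (root / name).mkdir()
    (root / "notes.txt").write_text("not a run directory\n")
    found = {job: [p.name for p in dirs]
             for job, dirs in group_training_runs(root).items()}

print(found)
```

On SDF, calling group_training_runs(Path("logs")) from the top-level directory would list the real runs.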

Reproducing Paper Results

For large biomolecules, like the ones in the aNiMAte paper, you'll most likely need to run on multiple GPUs. The atomic models and their corresponding precalculated NMA modes are included in this repo under data. The datasets (starfiles) used in the paper are already on SDF, so to submit a training run for the Spliceosome on the GeForce RTX 2080 Ti GPUs, run something like:

sbatch --nodes 2 --gres gpu:geforce_rtx_2080_ti:8 --cpus-per-task 32 -t 24:00:00 scripts/submit_job.sh configs/splice-atomic-primal-relion.ini

This launches training on 16 GPUs (2 nodes × 8 GPUs) with 4 threads per GPU for data I/O. Each GPU writes its own log under logs/[SLURM_JOB_ID]_[GPU_INDEX]. Similarly, a training run for the Ribosome can be launched using configs/ribosome-atomic-primal-relion.ini.
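
The GPU and thread counts quoted above follow from the sbatch flags. The short sketch below spells out that arithmetic so it can be redone for a different allocation. It assumes one SLURM task per node whose CPUs are shared evenly among that node's GPUs. The Allocation class is purely illustrative and is not part of the submission scripts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Allocation:
    nodes: int          # --nodes
    gpus_per_node: int  # count at the end of --gres gpu:<type>:<count>
    cpus_per_task: int  # --cpus-per-task

    @property
    def total_gpus(self) -> int:
        return self.nodes * self.gpus_per_node

    @property
    def io_threads_per_gpu(self) -> int:
        # Assumes one SLURM task per node, its CPUs shared evenly by that node's GPUs.
        return self.cpus_per_task // self.gpus_per_node


spliceosome = Allocation(nodes=2, gpus_per_node=8, cpus_per_task=32)
print(spliceosome.total_gpus, spliceosome.io_threads_per_gpu)  # prints "16 4"
```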

3- Evaluation/Inference

During or after a training run, an evaluation run can be submitted with the provided scripts/submit_job_eval.sh script, for example:

sbatch -t 02:00:00 -n 16 scripts/submit_job_eval.sh logs/8007570_0/config.ini logs/8007570_0/models/checkpoints/model_current.pth

This script takes two arguments: the training config file and the model checkpoint to load. You will usually also need to provide the evaluation dataset (starfile), which can be the same as the training starfile. To do so, add or update the val_relion_star_file argument in the config INI file passed to submit_job_eval.sh before running the above command.
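
Setting val_relion_star_file can be done by hand or scripted. The sketch below assumes the config file consists of simple key = value lines with # or ; comments, and it edits the file as plain text so that it does not depend on how the code parses its configs. The helper name set_config_value, the batch_size entry and the starfile paths are hypothetical. Only val_relion_star_file comes from this README.

```python
import tempfile
from pathlib import Path


def set_config_value(config_path: Path, key: str, value: str) -> None:
    """Set `key = value`, replacing the existing line for `key` or appending one."""
    lines = config_path.read_text().splitlines()
    new_line = f"{key} = {value}"
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith(("#", ";")):
            continue  # leave comment lines alone
        if "=" in stripped and stripped.split("=", 1)[0].strip() == key:
            lines[i] = new_line
            break
    else:
        lines.append(new_line)  # key not present yet
    config_path.write_text("\n".join(lines) + "\n")


# Demo on a temporary file standing in for logs/[SLURM_JOB_ID]_0/config.ini.
with tempfile.TemporaryDirectory() as tmp:
    config = Path(tmp) / "config.ini"
    config.write_text("# hypothetical training config\nbatch_size = 8\n")
    set_config_value(config, "val_relion_star_file", "sim-data/first.star")
    set_config_value(config, "val_relion_star_file", "sim-data/particles.star")
    updated = config.read_text()

print(updated)
```

Working on a copy of the training config keeps the original run directory unchanged.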

After evaluation is done, a new subdirectory is created under logs with the format logs/[SLURM_JOB_ID]; it contains the evaluation output and NMA plots. More visualizations can be generated from the provided Jupyter notebook notebooks/inspect_predictions.ipynb. A second notebook, notebooks/evaluate_delta_scores.ipynb, evaluates the "Delta" score described in the paper.

Accessing Paper Results

The runs/models used for the paper are listed below with their corresponding job IDs. They can be accessed on SDF or downloaded from the Google Drive mirror of /sdf/group/ml/CryoNet/aNiMAte/publication at https://bit.ly/aNiMAte_data. In that Google Drive directory, each subdirectory corresponds to one dataset and contains 3 directories: configs/, data/ and logs/, which hold the input config files, the input data files and the curated output files, respectively.

Description	SLURM Job ID	original SDF Path	curated SDF Path
Spliceosome, trained on half 1	7993005	`/sdf/group/ml/CryoNet/cryonettorch/logs/7993005_0`	`/sdf/group/ml/CryoNet/aNiMAte/publication/empiar10180/logs/7993005_0`
Spliceosome, evaluated on half 1		`/sdf/group/ml/CryoNet/cryonettorch/logs/7993005`	`/sdf/group/ml/CryoNet/aNiMAte/publication/empiar10180/logs/7993005`
Spliceosome, trained on half 2	7997982	`/sdf/group/ml/CryoNet/cryonettorch/logs/7997982_0`	`/sdf/group/ml/CryoNet/aNiMAte/publication/empiar10180/logs/7997982_0`
Spliceosome, evaluated on half 2		`/sdf/group/ml/CryoNet/cryonettorch/logs/7997982`	`/sdf/group/ml/CryoNet/aNiMAte/publication/empiar10180/logs/7997982`
Spliceosome, trained on full data	8008310	`/sdf/group/ml/CryoNet/cryonettorch/logs/8008310_0`	`/sdf/group/ml/CryoNet/aNiMAte/publication/empiar10180/logs/8008310_0`
Spliceosome, evaluated on full data		`/sdf/group/ml/CryoNet/cryonettorch/logs/8008310`	`/sdf/group/ml/CryoNet/aNiMAte/publication/empiar10180/logs/8008310`
Spliceosome (no-head), trained on half 1	8007570	`/sdf/group/ml/CryoNet/cryonettorch/logs/8007570_0`
Spliceosome (no-head), trained on half 2	7992273	`/sdf/group/ml/CryoNet/cryonettorch/logs/7992273_0`
Spliceosome (no-head), trained on full data	8019309	`/sdf/group/ml/CryoNet/cryonettorch/logs/8019309_0`	`/sdf/group/ml/CryoNet/aNiMAte/publication/empiar10180/logs/8019309_0`
Spliceosome (no-head), evaluated on full data		`/sdf/group/ml/CryoNet/cryonettorch/logs/8019309`	`/sdf/group/ml/CryoNet/aNiMAte/publication/empiar10180/logs/8019309`
Ribosome, trained on half 1	10438442	`/sdf/group/ml/CryoNet/aNiMAte/logs/10438442_0`	`/sdf/group/ml/CryoNet/aNiMAte/publication/empiar10028/logs/10438442_0`
Ribosome, evaluated on half 1		`/sdf/group/ml/CryoNet/aNiMAte/logs/10438442`	`/sdf/group/ml/CryoNet/aNiMAte/publication/empiar10028/logs/10438442`
Ribosome, trained on half 2	7971120	`/sdf/group/ml/CryoNet/cryonettorch/logs/7971120_0`	`/sdf/group/ml/CryoNet/aNiMAte/publication/empiar10028/logs/7971120_0`
Ribosome, evaluated on half 2		`/sdf/group/ml/CryoNet/cryonettorch/logs/7971120`	`/sdf/group/ml/CryoNet/aNiMAte/publication/empiar10028/logs/7971120`
Ribosome, trained on full data	8007515	`/sdf/group/ml/CryoNet/cryonettorch/logs/8007515_0`	`/sdf/group/ml/CryoNet/aNiMAte/publication/empiar10028/logs/8007515_0`
Ribosome, evaluated on full data		`/sdf/group/ml/CryoNet/cryonettorch/logs/8007515`	`/sdf/group/ml/CryoNet/aNiMAte/publication/empiar10028/logs/8007515`
Ribosome (no-ES12), trained on half 1	8333192	`/sdf/group/ml/CryoNet/cryonettorch/logs/8333192_0`
Ribosome (no-ES12), trained on half 2	8333194	`/sdf/group/ml/CryoNet/cryonettorch/logs/8333194_0`
Ribosome (no-ES12), trained on full data	8209068	`/sdf/group/ml/CryoNet/cryonettorch/logs/8209068_0`	`/sdf/group/ml/CryoNet/aNiMAte/publication/empiar10028/logs/8209068_0`
Ribosome (no-ES12), evaluated on full data		`/sdf/group/ml/CryoNet/cryonettorch/logs/8209068`	`/sdf/group/ml/CryoNet/aNiMAte/publication/empiar10028/logs/8209068`
