DEFMap: Dynamics Extraction From cryo-em Map

This package provides an implementation of dynamics prediction from cryo-EM density maps.

A Google Colaboratory version of DEFMap is now available. To try it, please visit ColabDEFMap.

Dependency

Environment

  • Ubuntu: 16.04 or 18.04 (confirmed)
  • CUDA Toolkit version: 10.0 (any compatible version is OK)
  • NVIDIA driver version: 410.66 (any compatible version is OK)

Package

  • python: >3.6
  • keras-gpu: 2.2.4
  • tensorflow-gpu: 1.13.1
  • htmd: 1.15.2
  • EMAN2: 2.3
  • PyMOL (optional)
  • UCSF Chimera (optional)

Hardware requirements

DEFMap does not require a specific high-end computer. For optimal performance, we recommend a computer with the following specs:

  • RAM: 96+GB
  • CPU: 12+ cores, 2.60 GHz.
  • GPU: NVIDIA GeForce GTX1080Ti (or better GPU)

Setup

Please refer to the following link. Typical installation time is expected to be less than 30 minutes.

If you want to use docker, please refer to the following link.

Example usage

Usage 1: Dynamics Prediction and Voxel Visualization

DEFMap only uses a cryo-EM map file for dynamics prediction and voxel visualization.

  • Scale voxel length and map resolution
cd data
e2proc3d.py 015_emd_3984.map 015_emd_3984_5.0A_rescaled.mrc --clip=160,160,160 --scale=0.9 --process=filter.lowpass.gauss:cutoff_freq=0.2
# See the following link for e2proc3d.py options: https://blake.bcm.edu/emanwiki/EMAN2/Programs/e2proc3d
  • Create dataset
cd ../preprocessing/
python prep_dataset.py -m ../data/015_emd_3984_5.0A_rescaled.mrc -o ../data/sample.jbl -p
  • Inference
cd ../
# It's going to take a while (less than 10 minutes depending on your computer).
python 3dcnn_main.py infer --test_dataset data/sample.jbl -o model/model_res5A.h5 --prediction_output result/prediction.jbl

The joblib output file contains a Python dictionary object (key: voxel coordinate, value: logarithm of RMSF).
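As a minimal sketch of how the prediction file might be inspected, the snippet below assumes the file can be read with `joblib.load` and yields a dictionary as described above. The helper names `load_prediction` and `summarize_log_rmsf` are hypothetical and not part of DEFMap, and the toy keys (integer `(x, y, z)` tuples) are an assumption about the coordinate format.

```python
from statistics import mean


def load_prediction(path):
    """Load a prediction file (assumed to be readable with joblib.load)."""
    import joblib  # third-party package, imported lazily so the rest runs without it

    return joblib.load(path)


def summarize_log_rmsf(prediction):
    """Return basic statistics for a {voxel coordinate: log RMSF} mapping."""
    values = [float(v) for v in prediction.values()]
    if not values:
        raise ValueError("prediction is empty")
    return {
        "n_voxels": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
    }


if __name__ == "__main__":
    # Toy stand-in for the real output; the key format is an assumption.
    toy = {(10, 12, 7): -0.31, (10, 12, 8): 0.05, (11, 12, 8): 0.42}
    print(summarize_log_rmsf(toy))
```

With a real run, `summarize_log_rmsf(load_prediction("result/prediction.jbl"))` would give a quick sanity check of the predicted value range before visualization.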

  • Visualization
    Choose a map-intensity threshold using a viewer such as UCSF Chimera. Voxels with an intensity above the threshold are then selected for visualization.
    In this example, 0.0252 is used as the threshold.

First, run the following command.

python postprocessing/rmsf_map2grid.py -m ./data/015_emd_3984_5.0A_rescaled.mrc -p result/prediction.jbl -t 0.0252

Then, launch a GUI viewer that can display PDB files (this tutorial uses PyMOL) and open 015_emd_3984_5.0A_rescaled.pdb. Finally, run the following command in the PyMOL command line to color the selected voxels by their predicted values:

show_as nb_spheres; spectrum b, slate_orange_red, minimum=-1, maximum=2
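The thresholding idea in this step can be illustrated with a small sketch. This is not how `rmsf_map2grid.py` is implemented; it only shows the selection rule stated above (keep voxels whose map intensity exceeds the threshold). The function `select_voxels` and the representation of the map as a dictionary of intensities are hypothetical.

```python
def select_voxels(intensity, prediction, threshold):
    """Keep predictions only for voxels whose map intensity exceeds the threshold.

    intensity:  {voxel coordinate: map intensity} (hypothetical representation)
    prediction: {voxel coordinate: log RMSF}
    """
    return {
        coord: value
        for coord, value in prediction.items()
        # Voxels missing from the intensity map are treated as below threshold.
        if intensity.get(coord, float("-inf")) > threshold
    }


if __name__ == "__main__":
    intensity = {(0, 0, 0): 0.030, (1, 0, 0): 0.010}
    prediction = {(0, 0, 0): 0.2, (1, 0, 0): -0.4, (2, 0, 0): 0.1}
    # Only (0, 0, 0) has an intensity above the example threshold of 0.0252.
    print(select_voxels(intensity, prediction, threshold=0.0252))
```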

Usage 2: Dynamics Prediction and Mapping onto an Atomic Model

For atomic-level dynamics visualization, DEFMap needs a PDB file corresponding to the cryo-EM map used for inference.

  • Scale voxel length and map resolution
cd preprocessing
python rescale.py -l ../data/sample_pdb_map.list -s 1.5 -r 5
  • Create dataset
python prep_dataset.py -m ../data/015_emd_3984_5.0A_rescaled.mrc -o ../data/sample.jbl -p
  • Inference
cd ../
# It's going to take a while (less than 10 minutes depending on your computer).
python 3dcnn_main.py infer --test_dataset data/sample.jbl -o model/model_res5A.h5 --prediction_output result/prediction.jbl
  • Visualization

First, run the following command to create a dynamics-mapped PDB file.

cd postprocessing
python rmsf_map2model_for_defmap.py -l ../data/sample_for_visual.list -p ../result/prediction.jbl --normalize

Then, launch a GUI viewer that can display PDB files (this tutorial uses PyMOL) and open defmap_norm_model.pdb. Finally, run the following command in the PyMOL command line to color the protein structure by the mapped dynamics:

spectrum b, slate_orange_red, minimum=-1, maximum=2
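Because `spectrum b` colors atoms by the B-factor field, the mapped values presumably live in that column of the output PDB file. The sketch below shows one way such a mapping could work under two loudly stated assumptions: that `--normalize` behaves like a z-score normalization (this README does not say so), and that values are written to the standard temperature-factor columns 61-66 of ATOM records. The helpers `zscore` and `set_bfactor` are hypothetical and not part of DEFMap.

```python
from statistics import mean, pstdev


def zscore(values):
    """Z-score normalize a sequence (assumed stand-in for --normalize)."""
    values = list(values)
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values identical: return zeros rather than dividing by zero.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def set_bfactor(pdb_line, value):
    """Return an ATOM/HETATM line with columns 61-66 (B-factor) set to value."""
    line = pdb_line.rstrip("\n").ljust(66)  # pad short lines up to the field end
    return line[:60] + f"{value:6.2f}" + line[66:]


if __name__ == "__main__":
    per_atom_values = [-0.5, 0.0, 0.5]  # toy per-atom log RMSF values
    normalized = zscore(per_atom_values)
    template = "ATOM".ljust(80)  # placeholder line, not a real atom record
    for value in normalized:
        print(set_bfactor(template, value)[60:66])
```

Keeping the normalization separate from the file writing makes it easy to swap in a different scaling if the actual behavior of `--normalize` differs.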

Usage 3: Dynamics Prediction Using Models Trained by Different Resolutions

Here, we show dynamics prediction using models trained on datasets low-pass filtered to 6Å and 7Å (the examples above used a dataset low-pass filtered to 5Å).

  • Scale voxel length and map resolution
cd data
# for 6Å
e2proc3d.py 015_emd_3984.map 015_emd_3984_6A_rescaled.mrc --clip=160,160,160 --scale=0.9 --process=filter.lowpass.gauss:cutoff_freq=0.17
# for 7Å
e2proc3d.py 015_emd_3984.map 015_emd_3984_7A_rescaled.mrc --clip=160,160,160 --scale=0.9 --process=filter.lowpass.gauss:cutoff_freq=0.14
  • Create dataset
cd ../preprocessing/
# for 6Å
python prep_dataset.py -m ../data/015_emd_3984_6A_rescaled.mrc -o ../data/sample_6A.jbl -p
# for 7Å
python prep_dataset.py -m ../data/015_emd_3984_7A_rescaled.mrc -o ../data/sample_7A.jbl -p
  • Inference
cd ../
# It's going to take a while.
# for 6Å
python 3dcnn_main.py infer --test_dataset data/sample_6A.jbl -o model/model_res6A.h5 --prediction_output result/prediction_6A.jbl
# for 7Å
python 3dcnn_main.py infer --test_dataset data/sample_7A.jbl -o model/model_res7A.h5 --prediction_output result/prediction_7A.jbl

The joblib output file contains a Python dictionary object (key: voxel coordinate, value: logarithm of RMSF).
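The `cutoff_freq` values used in the `e2proc3d.py` commands in this README (0.2 for 5Å, 0.17 for 6Å, 0.14 for 7Å) match the reciprocal of the target resolution rounded to two decimals. The sketch below reproduces those numbers and assembles the corresponding argument list. Whether the authors derived the values this way is an assumption, and the helper names `lowpass_cutoff_freq` and `build_e2proc3d_args` are hypothetical; only the flags shown in the commands above are used.

```python
def lowpass_cutoff_freq(resolution_angstrom, ndigits=2):
    """Return 1 / resolution (in 1/Å), rounded as in the example commands."""
    if resolution_angstrom <= 0:
        raise ValueError("resolution must be positive")
    return round(1.0 / resolution_angstrom, ndigits)


def build_e2proc3d_args(src, dst, resolution_angstrom, clip=160, scale=0.9):
    """Assemble an e2proc3d.py argument list mirroring the README commands."""
    cutoff = lowpass_cutoff_freq(resolution_angstrom)
    return [
        "e2proc3d.py",
        src,
        dst,
        f"--clip={clip},{clip},{clip}",
        f"--scale={scale}",
        f"--process=filter.lowpass.gauss:cutoff_freq={cutoff}",
    ]


if __name__ == "__main__":
    for res in (5, 6, 7):
        print(res, lowpass_cutoff_freq(res))
    print(" ".join(build_e2proc3d_args(
        "015_emd_3984.map", "015_emd_3984_6A_rescaled.mrc", 6)))
```

The argument list could be passed to `subprocess.run` on a machine where EMAN2 is installed; the clip and scale defaults simply repeat the values used in the examples and may need adjusting for other maps.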

License

This package is licensed under the MIT License - see the LICENSE file for details.

Authors

Shigeyuki Matsumoto: shigeyuki.matsumoto@riken.jp
Shoichi Ishida: ishida.sho.nm@yokohama-cu.ac.jp (maintainer)

Reference

@article{Matsumoto2021,
  doi = {10.1038/s42256-020-00290-y},
  url = {https://doi.org/10.1038/s42256-020-00290-y},
  year = 2021,
  month = {feb},
  volume = {3},
  number = {2},
  pages = {153--160},
  author = {Shigeyuki Matsumoto and Shoichi Ishida and Mitsugu Araki and Takayuki Kato and Kei Terayama and Yasushi Okuno},
  title = {Extraction of protein dynamics information from cryo-{EM} maps using deep learning},
  journal = {Nature Machine Intelligence}
}