MarkovFit is a protein structural modeling method for electron microscopy maps that uses a Markov random field.
Copyright (C) 2022 Eman Alnabati, Juan Esquivel-Rodriguez, Genki Terashi, Daisuke Kihara, and Purdue University.
License: GPL v3 for academic use. (For commercial use, please contact us for different licensing.)
Contact: Daisuke Kihara (dkihara@purdue.edu)
Cite:
- Pre-required software
- Compile Source Code
- Project Steps
- Generate simulated maps of subunits using EMAN2 package
- Run FFT Search
- Handle and Cluster FFT Search Results
- Compute Pairwise Scores of Pairs of Subunits
- Generate MRF graph and Apply Belief Propagation
- Extract top10 Final Structures using MaxHeap Tree
- Run an example
- Python 3 : https://www.python.org/downloads/
- EMAN2 : https://blake.bcm.edu/emanwiki/EMAN2/Install/
- Chimera : https://www.cgl.ucsf.edu/chimera/download.html
- FFTW: http://www.fftw.org/download.html
- GCC Compiler
- FFT_Search Folder: make EMVEC_FIT_PowerFit
- Handle Folder: g++ handle_result.cc -o handle
- Pairwise_Scores Folder: make pairwise_scores_mpi
e2pdb2mrc.py --apix=voxel_spacing --res=resolution input_pdb_file output_mrc_file
For experimental map fitting, use the voxel spacing and resolution of that map.
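A minimal sketch of how the command above could be assembled for several subunits at once. The helper name `build_e2pdb2mrc_cmd`, the file names, and the values 1.0 and 6.0 are illustrative assumptions; only the `--apix` and `--res` options and the argument order come from the usage line above.

```python
from pathlib import Path

def build_e2pdb2mrc_cmd(pdb_file, apix, res):
    """Return the argument list for one e2pdb2mrc.py call (hypothetical helper)."""
    # Assumption: the output map is named after the input PDB, with a .mrc suffix.
    out_file = str(Path(pdb_file).with_suffix(".mrc"))
    return ["e2pdb2mrc.py", f"--apix={apix}", f"--res={res}", pdb_file, out_file]

# Placeholder subunit files; replace with the real subunit PDB files.
subunits = ["A.pdb", "B.pdb", "C.pdb"]
commands = [build_e2pdb2mrc_cmd(p, apix=1.0, res=6.0) for p in subunits]

for cmd in commands:
    # Print the command; pass `cmd` to subprocess.run to execute it with EMAN2 installed.
    print(" ".join(cmd))
```

Building the argument list separately from running it makes it easy to check the voxel spacing and resolution before any map is generated.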
./FFT_Search/EMVEC_FIT_PowerFit -a main_map -b subunit1_mrc_map -t main_map_contour_level -T subunit_map_contour_level -c no_processes -P true -M 2 -s voxel_space -p map_type > output_file
-a: Main map
-b: Subunit map
-t [float]: Threshold of density main_map def=0.000
-T [float]: Threshold of density sub_map def=0.000
-c [int]: Number of cores for threads def=2
-g [float]: Bandwidth of the Gaussian filter def=16.0, sigma = 0.5*[float]
-s [float]: Sampling grid space def=7.0
-M [int]: Sampling Angle interval Mode 1-3 def=2
1: 20.83 degree, 648 samples
2: 10.07 degree, 7,416 samples
3: 4.71 degree, 70,728 samples
-C: Cross Correlation Coefficient and Overlap Mode
Using normalized density value by Gaussian Filter
-P: Pearson Correlation Coefficient and Overlap Mode
Using normalized density value by Gaussian Filter and average density
-p: Map type: 1 for experimental, 2 for simulated def=1
output_file contains the transformations applied to the subunit map, along with their goodness-of-fit scores.
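The options above can be collected into a small command builder. This is a sketch under stated assumptions: the function names and the example file names are hypothetical, while the flags, defaults, and angular sampling figures are copied from the option list above.

```python
# Angular sampling modes from the option list: mode -> (interval in degrees, samples).
ANGLE_MODES = {1: (20.83, 648), 2: (10.07, 7416), 3: (4.71, 70728)}

def gaussian_sigma(bandwidth=16.0):
    """Sigma of the Gaussian filter, using sigma = 0.5 * bandwidth from the -g option."""
    return 0.5 * bandwidth

def build_fft_search_cmd(main_map, subunit_map, main_contour, sub_contour,
                         cores=2, mode=2, spacing=7.0, map_type=1):
    """Return the argument list for one FFT search run (hypothetical helper)."""
    if mode not in ANGLE_MODES:
        raise ValueError("mode must be 1, 2, or 3")
    return ["./FFT_Search/EMVEC_FIT_PowerFit",
            "-a", main_map, "-b", subunit_map,
            "-t", str(main_contour), "-T", str(sub_contour),
            "-c", str(cores), "-P", "true",
            "-M", str(mode), "-s", str(spacing), "-p", str(map_type)]

# Placeholder file names and contour levels.
cmd = build_fft_search_cmd("main.mrc", "A.mrc", 0.05, 0.01, cores=4)
interval, samples = ANGLE_MODES[2]
print(" ".join(cmd), "> A_search.txt")
print(f"mode 2: {interval} degrees, {samples} rotations, sigma={gaussian_sigma()}")
```

The search writes its results to standard output, so when running the command from Python the output would need to be redirected to a file, as in the usage line.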
./Handle/handle --dist-threshold max_dist_to_show --min-dist cluster_dist_threshold --correct-x center_x --correct-y center_y --correct-z center_z --i input_file --o output_file
--min-dist: Distance used for clustering search results def=8
--i: Input file containing FFT search results
--o: Output file to store results after sorting and clustering
optional:
To print search results and their distance to the reference structure after sorting and clustering:
--correct-x: X value of center of reference/native structure
--correct-y: Y value of center of reference/native structure
--correct-z: Z value of center of reference/native structure
--dist-threshold: Show only results with distance < threshold to native structure
output_file containing sorted and clustered results
output_file_b4_clstering containing sorted results before clustering
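The sorting and clustering step can be pictured with a simple greedy scheme: rank results by score, then keep a result only if it is at least `--min-dist` away from every result already kept. This is an illustrative sketch, not the algorithm in handle_result.cc, which is not described here. The tuple layout, the assumption that a higher score is better, and the function name are all hypothetical.

```python
import math

def cluster_by_distance(results, min_dist=8.0):
    """Greedy distance clustering of (score, (x, y, z)) results.

    Assumes a higher score means a better fit.
    """
    ranked = sorted(results, key=lambda r: r[0], reverse=True)
    kept = []
    for score, pos in ranked:
        # Keep a result only if it is far enough from every better-ranked kept result.
        if all(math.dist(pos, kept_pos) >= min_dist for _, kept_pos in kept):
            kept.append((score, pos))
    return kept

# Made-up search results for illustration only.
results = [(0.90, (0.0, 0.0, 0.0)), (0.85, (3.0, 0.0, 0.0)),
           (0.80, (10.0, 0.0, 0.0)), (0.70, (12.0, 0.0, 0.0))]
clustered = cluster_by_distance(results, min_dist=8.0)
for score, pos in clustered:
    print(score, pos)
```

With the default threshold of 8, the second and fourth results fall within 8 units of a better-ranked neighbor and are dropped, leaving one representative per cluster.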
mpirun -np no_processes ./Pairwise_Scores/pairwise_scores_mpi --input-pdb subunit1_pdb ... --input-pdb subunitN_pdb --labels A,B,C --transforms-file sub1_processed_search_result_file --transforms-file subN_processed_search_result_file --calpha main_pdb --transforms-num no_results_per_subunit
-np: Number of processes
--input-pdb: Subunit pdb files
--labels: List of subunits IDs separated by comma
--transforms-file: Subunit processed search result file
--calpha main_pdb: Native structure pdb file
--output-prefix: Prefix for output files
--transforms-num: Number of results to be considered for each subunit
One .mrf file for each subunit containing RMSD and goodness-of-fit scores for each transformation
One .mrf file for each pair of subunits containing RMSD and pairwise scores for each of their transformations
R --slave --vanilla --file=./MRF/mrf.R --args "operation='map'" "singletons=c('sub1.mrf',...,'subN.mrf')" "pairwise=c('sub1-sub2.mrf','sub1-sub3.mrf',...,'sub(N-1)-subN.mrf')" "weights=potential.collection.weights(CC=0.5,Overlap=0.9,PhysicsScore=1,no_clashes=0.8)" "output.prefix='mrf'"
singletons: List of .mrf files of all subunits
pairwise: List of .mrf files of all pairs of subunits
weights: weights to be used for the goodness-of-fit scores and pairwise scores
output.prefix: Prefix of the output file
prefix_top100.txt file containing the transformations of subunits and pairs of subunits sorted by their final beliefs computed by the belief propagation algorithm.
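For N subunits there are N(N-1)/2 pairwise files, so writing the `singletons` and `pairwise` arguments by hand becomes error-prone. The sketch below generates both lists in the pattern shown in the command above. The helper names are hypothetical, and the file names assume the `subX.mrf` and `subX-subY.mrf` pattern; the actual names depend on the prefix used in the pairwise scoring step.

```python
from itertools import combinations

def mrf_file_lists(labels):
    """Return (singletons, pairwise) .mrf file names for the given subunit labels."""
    singletons = [f"{s}.mrf" for s in labels]
    # One file per unordered pair, in the same order as the command above.
    pairwise = [f"{a}-{b}.mrf" for a, b in combinations(labels, 2)]
    return singletons, pairwise

def r_vector(name, files):
    """Format a list of file names as an R character vector argument."""
    quoted = ",".join(f"'{f}'" for f in files)
    return f"{name}=c({quoted})"

singletons, pairwise = mrf_file_lists(["sub1", "sub2", "sub3"])
print(r_vector("singletons", singletons))
print(r_vector("pairwise", pairwise))
```

The two printed strings can be pasted as the quoted `singletons=...` and `pairwise=...` arguments after `--args`.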
python3 ./MaxHeap/max-heap.py --mrf-file prefix_top100.txt --clash-threshold no_clashes -dir pdb_dir
--mrf-file: File containing the results of MRF and belief propagation
--clash-threshold: Maximum number of acceptable clashes between pairs of subunits
-dir: Directory of the native structure pdb files to generate the transformed structures
PDB files of the top 10 structures of each subunit (subID_decoy_MaxHeap_structure#.pdb)
File for each subunit containing subunit transformations of the top 10 structures (subID_clashesThreshold_MaxHeap.txt)
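The idea of selecting the best structures under a clash limit can be illustrated with Python's `heapq` module. This is a simplified sketch, not the logic of max-heap.py: the candidate tuple layout, the function name, and the example values are assumptions made for illustration.

```python
import heapq

def top_structures(candidates, clash_threshold, k=10):
    """Return the k highest-belief candidates whose clash count is acceptable.

    candidates: iterable of (belief, clashes, structure_id) tuples (assumed layout).
    """
    acceptable = (c for c in candidates if c[1] <= clash_threshold)
    # heapq.nlargest keeps only k items in its internal heap while scanning.
    return heapq.nlargest(k, acceptable, key=lambda c: c[0])

# Made-up candidates for illustration only.
candidates = [(0.91, 2, "s1"), (0.95, 40, "s2"), (0.88, 0, "s3"), (0.97, 5, "s4")]
best = top_structures(candidates, clash_threshold=10, k=2)
for belief, clashes, structure_id in best:
    print(structure_id, belief, clashes)
```

Here "s2" has the second-highest belief but is rejected because its 40 clashes exceed the threshold of 10.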
./run.sh
The run.sh file contains all the commands to generate the top 10 refined structures of the target in the output folder.
Expected_Output folder contains all output files of the example target.
- subunitID_decoy_MaxHeap_#.pdb: PDB structures.