Select Protein Structures of Genes Included in the Deafness Variation Database.
This repository compiles select homology models for genes included in the Deafness Variation Database, which can be accessed at deafnessvariationdatabase.org. Each starting homology model in this repository was acquired from ModBase or SwissProt, modeling paradigms that use a fixed-charge potential and sequence alignment to orthologous proteins to predict the structures of proteins that have not been determined experimentally. The repository also includes a corresponding refined structure for each input model. The refinements were performed with the Force Field X (FFX) software package, which uses the AMOEBA polarizable force field as its potential energy function.
Detailed refinement statistics are included in the Refinements.csv file. Structures were scored with MolProbity; to learn more about MolProbity, visit the main page at http://molprobity.biochem.duke.edu. In summary, the average MolProbity score of the 472 structures in this repository was 2.16 before refinement and 1.04 after refinement, a significant improvement (lower MolProbity scores indicate better model quality). These results provide strong evidence that the physics-based refinements performed on these models allow for a better understanding of their true structures.
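As a rough illustration of how the averages above could be recomputed from Refinements.csv, the following Python sketch reads a CSV file and averages the scores before and after refinement. The column names (Original_MolProbity and FFX_MolProbity) and the two sample rows are assumptions made for this example only; check the header of the actual Refinements.csv and adjust the names accordingly.

```python
import csv
import io
from statistics import mean


def average_scores(csv_file,
                   before_col="Original_MolProbity",  # assumed column name
                   after_col="FFX_MolProbity"):       # assumed column name
    """Return (mean score before refinement, mean score after refinement)."""
    before, after = [], []
    for row in csv.DictReader(csv_file):
        before.append(float(row[before_col]))
        after.append(float(row[after_col]))
    return mean(before), mean(after)


# Placeholder rows, NOT real values from this repository.
sample = io.StringIO(
    "Gene,Range,Original_MolProbity,FFX_MolProbity\n"
    "GENE_A,1-100,2.0,1.0\n"
    "GENE_B,5-250,3.0,1.5\n"
)
before_avg, after_avg = average_scores(sample)
print(f"before: {before_avg:.2f}, after: {after_avg:.2f}")
```

To run it against the real file, replace the in-memory sample with open("Refinements.csv", newline="") and pass the column names found in its header.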
Two main computational algorithms were used in the refinements: a local minimization and a rotamer optimization. The local minimization uses quasi-Newton techniques to approximate the Hessian and follow the local energy landscape, while the rotamer optimization discretizes the side-chain conformations of each residue into a set of rotamers and searches for the approximate lowest-energy combination, given flexible side chains and a fixed backbone. The refinement procedure consists of a loose local minimization, followed by a rotamer optimization and then a tighter local minimization; the last step lets the rigid rotamer conformations relax.
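The idea behind the rotamer optimization can be illustrated with a toy Python sketch: each residue has a small discrete set of rotamers, the backbone is fixed, and the goal is the combination with the lowest total energy. This is not the FFX implementation; the exhaustive search, the function name optimize_rotamers, and the energy values are placeholders chosen for illustration, and a real protein needs a far more efficient search than enumeration.

```python
from itertools import product


def optimize_rotamers(self_energy, pair_energy):
    """Exhaustively search rotamer assignments for the lowest total energy.

    self_energy[i][r]: energy of residue i in rotamer r against the fixed backbone.
    pair_energy[(i, j)][r][s]: interaction energy between residue i in rotamer r
    and residue j in rotamer s.
    """
    n = len(self_energy)
    best_assignment, best_energy = None, float("inf")
    # Enumerate every combination of one rotamer per residue.
    for assignment in product(*(range(len(e)) for e in self_energy)):
        energy = sum(self_energy[i][assignment[i]] for i in range(n))
        for (i, j), table in pair_energy.items():
            energy += table[assignment[i]][assignment[j]]
        if energy < best_energy:
            best_assignment, best_energy = assignment, energy
    return best_assignment, best_energy


# Placeholder energies in arbitrary units (hypothetical, for illustration only).
self_energy = [
    [0.0, 1.0],       # residue 0: two rotamers
    [0.5, 0.0, 2.0],  # residue 1: three rotamers
    [0.0, 0.3],       # residue 2: two rotamers
]
pair_energy = {
    # Rotamer 0 of residue 0 clashes with rotamer 1 of residue 1.
    (0, 1): [[0.0, 5.0, 0.0],
             [0.0, 0.0, 0.0]],
}

assignment, energy = optimize_rotamers(self_energy, pair_energy)
print(assignment, energy)
```

In this toy case, picking each residue's best rotamer independently would select the clashing pair, which is why the side chains are optimized together rather than one at a time.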
The repository is organized as follows. The “root” directory (i.e., dvd-structures) contains one subdirectory per gene, named after that gene. Within each gene subdirectory, each homology model has its own directory, named for the residue range the model covers. Each model directory contains four files: the original PDB file, the refined PDB file, a PDF of the MolProbity output for the original structure, and a PDF of the MolProbity output for the refined structure. For every gene and model range, the original structure is named GENE_MODELRANGE.pdb, the refined structure is named GENE_MODELRANGE_FFX.pdb, and the MolProbity outputs for the original and refined structures are named Original_MolProbity.pdf and FFX_MolProbity.pdf, respectively. For example, the directory “dvd-structures” contains a directory “ACTG1”, which corresponds to the gene ACTG1. That directory contains a single model range, covering residues 6 to 375. The model directory holds four files: ACTG1_6-375.pdb (the original model), ACTG1_6-375_FFX.pdb (the model refined with the FFX software package), Original_MolProbity.pdf (the MolProbity output for the original model), and FFX_MolProbity.pdf (the MolProbity output for the refined model).
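The naming scheme above can be expressed as a small Python helper that builds the four expected paths for one model. This is a sketch under stated assumptions: the helper name model_files is hypothetical, the model directory is assumed to be named START-END (for example 6-375), and file names follow the GENE_MODELRANGE.pdb and GENE_MODELRANGE_FFX.pdb patterns described above.

```python
from pathlib import Path


def model_files(root, gene, start, end):
    """Return the four expected file paths for one homology model."""
    model_range = f"{start}-{end}"  # assumed directory name format
    model_dir = Path(root) / gene / model_range
    stem = f"{gene}_{model_range}"
    return {
        "original": model_dir / f"{stem}.pdb",
        "refined": model_dir / f"{stem}_FFX.pdb",
        "original_molprobity": model_dir / "Original_MolProbity.pdf",
        "refined_molprobity": model_dir / "FFX_MolProbity.pdf",
    }


files = model_files("dvd-structures", "ACTG1", 6, 375)
for label, path in files.items():
    print(f"{label}: {path.as_posix()}")
```

Calling path.exists() on each returned value is a quick way to confirm whether a local checkout actually follows this layout.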