Multiple Structure Alignment Datastructures #277
Closed
The core data structures for the Multiple Alignment object have been created: MultipleAlignment, BlockSet, Block, Pose.
The distanceMatrix has been renamed to distanceTables to match the AFPChain nomenclature. The description of replaceOptAln has also been made more general.
The Pose contains the translation and the rotationMatrix, which describe the 3D transformation of the proteins. A demo for the display of the multiple alignment has been created.
Done in order to generalize the 3D GUI features of the Structure Alignment and to implement a Multiple Alignment GUI for the new MultipleAlignment object.
The multiple alignments can be visualized through the MultipleAlignmentJmol class, adapted from the StructureAlignmentJmol. The coloring of the different blocks and the alignment menus are still not implemented.
Gaps are described by null values in the Blocks of the MultipleAlignment. Now the Jmol class accounts for these gaps and does not color them.
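The gap convention above can be sketched as follows. This is a minimal illustration, not the actual Block class: GapBlockSketch, countUngappedColumns and isColored are hypothetical names, and a block is modeled simply as one list of residue indices per structure, with null marking a gap.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a Block: one row of residue indices per structure.
class GapBlockSketch {

    // rows.get(s).get(c) is the residue index of structure s at column c,
    // or null if structure s has a gap at that column.
    static int countUngappedColumns(List<List<Integer>> rows) {
        if (rows.isEmpty()) {
            return 0;
        }
        int columns = rows.get(0).size();
        int ungapped = 0;
        for (int c = 0; c < columns; c++) {
            boolean complete = true;
            for (List<Integer> row : rows) {
                if (row.get(c) == null) {
                    complete = false;
                    break;
                }
            }
            if (complete) {
                ungapped++;
            }
        }
        return ungapped;
    }

    // A display class would skip coloring whenever the position is a gap.
    static boolean isColored(List<List<Integer>> rows, int structure, int column) {
        return rows.get(structure).get(column) != null;
    }

    public static void main(String[] args) {
        List<List<Integer>> rows = Arrays.asList(
                Arrays.asList(10, 11, 12, 13),
                Arrays.asList(5, null, 6, 7),
                Arrays.asList(null, 20, 21, 22));
        System.out.println("Ungapped columns: " + countUngappedColumns(rows));
    }
}
```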
from the Pose class, because it is a static variable that does not depend on the specific BlockSet. It only stores the intra-residue distances of every protein.
The wrong line was commented out, so the molecule was not colored.
Adapted the display method in StructureAlignmentDisplay to rotate and display in Jmol the atoms of a MultipleAlignment.
Minor changes to respond to TODOs
Interfaces for the classes Block, Pose and BlockSet have been created to generalize and document all the methods needed for a MultipleAlignment object.
The interfaces have been implemented again and the Jmol display also works for the new MultipleAlignment DS composition.
Added methods to calculate internal variables (update), and moved the cache variables (RMSD, TM-score, similarity, coverage) from the MultipleAlignment to these two classes.
Another layer in the OO data structure has been added to allow returning alternative alignments. An ensemble of MSTA is a collection of MultipleAlignment objects. Another change has been the addition of two different implementations of Pose, one to determine global superimpositions and another to determine flexible part superimpositions.
When an object is created with the constructor and its parent is set, the parent also gets a link to the object automatically.
The Ensemble can calculate the distance matrices for every structure in the updateDistanceMatrix() method. Automatic cross-references were added to the setParent() methods, for consistency.
All pairwise structural comparisons are evaluated to build the background distance matrices. Atoms can be rotated from Pose as well.
A new abstract Pose implementation has been created that calculates the TM-score and RMSD of the alignment. AlignmentJmol has been renamed to AbstractAlignmentJmol to make clear that it is an abstract class.
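For reference, the two scores mentioned above can be sketched for a single pair of already-superposed coordinate arrays. This is only an illustration under assumptions: SuperpositionScoresSketch and its methods are hypothetical names, coordinates are plain double[3] arrays rather than Atom objects, and the TM-score uses the standard d0 = 1.24 * cbrt(L - 15) - 1.8 normalization, which is only meaningful for target lengths well above 15. How the actual Pose implementation combines pairwise scores for a multiple alignment is not shown here.

```java
// Hypothetical sketch: scores for two equal-length, already-superposed
// coordinate arrays, where a[i] is aligned to b[i].
class SuperpositionScoresSketch {

    private static double squaredDistance(double[] p, double[] q) {
        double dx = p[0] - q[0];
        double dy = p[1] - q[1];
        double dz = p[2] - q[2];
        return dx * dx + dy * dy + dz * dz;
    }

    // Root mean square deviation over the aligned pairs.
    static double rmsd(double[][] a, double[][] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += squaredDistance(a[i], b[i]);
        }
        return Math.sqrt(sum / a.length);
    }

    // TM-score normalized by targetLength (assumed to be well above 15).
    static double tmScore(double[][] a, double[][] b, int targetLength) {
        double d0 = 1.24 * Math.cbrt(targetLength - 15.0) - 1.8;
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += 1.0 / (1.0 + squaredDistance(a[i], b[i]) / (d0 * d0));
        }
        return sum / targetLength;
    }

    public static void main(String[] args) {
        double[][] a = { { 0, 0, 0 }, { 1, 0, 0 } };
        double[][] b = { { 0, 0, 1 }, { 1, 0, 1 } };
        System.out.println("RMSD: " + rmsd(a, b));
        System.out.println("TM-score: " + tmScore(a, b, 100));
    }
}
```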
A new MultipleAlignment can be constructed from an AFPChain. It creates an equivalent alignment object, for backwards compatibility.
The clone methods now entirely change the links between the cloned and the original objects so that no cross-links occur.
An initial implementation of the CEMC algorithm for multiple structure alignment has been created. Now a seed MultipleAlignment can be created with a parallel pairwise all-to-all alignment. The MC optimization is still not implemented. A demo is available under the structure-gui package.
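The parallel all-to-all step can be sketched roughly as below. This is not the CEMC code: AllToAllSketch and PairScorer are hypothetical names, and the pairwise aligner is replaced by an arbitrary scoring callback so that the example stays self-contained. The point is only the pattern of submitting one task per unordered pair to a thread pool and collecting the results into a symmetric matrix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class AllToAllSketch {

    // Hypothetical stand-in for a pairwise structure alignment.
    interface PairScorer {
        double score(int i, int j);
    }

    // Scores every unordered pair (i, j) with i < j in parallel.
    static double[][] allToAll(int n, PairScorer scorer, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<double[]>> futures = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                for (int j = i + 1; j < n; j++) {
                    final int a = i;
                    final int b = j;
                    Callable<double[]> task =
                            () -> new double[] { a, b, scorer.score(a, b) };
                    futures.add(pool.submit(task));
                }
            }
            double[][] result = new double[n][n];
            for (Future<double[]> future : futures) {
                double[] r = future.get();
                result[(int) r[0]][(int) r[1]] = r[2];
                result[(int) r[1]][(int) r[0]] = r[2];
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        double[][] scores = allToAll(4, (i, j) -> i + j, 2);
        System.out.println("Score for pair (1, 3): " + scores[1][3]);
    }
}
```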
Part of the transition to replace AFPChain with the MultipleAlignment class. A core structure for the CEMC algorithm has also been created.
Only the CA atoms were rotated before. Now the whole structure is rotated. The Atoms are now a cache variable of the alignment; the real identifiers are the structureNames. The methods download the structures from the identifiers if the atoms are not present in the alignment.
The first class was only used in DB search and was very specific. Now it is general enough to allow any threaded pairwise alignment calculation.
The central structure identification is not the atomArrays, but the structureNames (from which the arrays can be recovered if they are not present).
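A rough sketch of that idea, with the atom arrays treated as a cache that is rebuilt from the names on demand. LazyAtomsSketch is a hypothetical class, and the loader function is a stand-in for whatever fetches the atoms for an identifier; no real download is performed here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class LazyAtomsSketch {

    private final List<String> structureNames;           // the real identifiers
    private final Function<String, double[][]> loader;   // hypothetical atom fetcher
    private List<double[][]> atomArrays;                  // cache, may be null

    LazyAtomsSketch(List<String> structureNames,
                    Function<String, double[][]> loader) {
        this.structureNames = structureNames;
        this.loader = loader;
    }

    // Returns the cached arrays, loading them from the names if absent.
    List<double[][]> getAtomArrays() {
        if (atomArrays == null) {
            atomArrays = new ArrayList<>();
            for (String name : structureNames) {
                atomArrays.add(loader.apply(name));
            }
        }
        return atomArrays;
    }

    // Drops the cache; the names are enough to recover it later.
    void clear() {
        atomArrays = null;
    }
}
```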
Added more families and examples to the DemoCEMC
The calculation of the angle was not possible because cos(theta) was out of range [-1,1].
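This kind of failure typically comes from floating-point error pushing the cosine slightly past 1 or below -1, so that Math.acos returns NaN. A common guard is to clamp before calling acos. The sketch below assumes the angle is derived from the trace of a 3x3 rotation matrix via cos(theta) = (trace - 1) / 2; AngleSketch and rotationAngle are hypothetical names, and the actual fix in this commit may differ.

```java
class AngleSketch {

    // Rotation angle in radians of a 3x3 rotation matrix.
    static double rotationAngle(double[][] r) {
        double trace = r[0][0] + r[1][1] + r[2][2];
        double cos = (trace - 1.0) / 2.0;
        // Clamp to [-1, 1] so that rounding error cannot produce NaN.
        cos = Math.max(-1.0, Math.min(1.0, cos));
        return Math.acos(cos);
    }

    public static void main(String[] args) {
        // Nearly the identity, with a diagonal entry slightly above 1.
        double[][] almostIdentity = {
                { 1.0000001, 0, 0 },
                { 0, 1, 0 },
                { 0, 0, 1 } };
        System.out.println("Angle: " + rotationAngle(almostIdentity));
    }
}
```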
Multiple alignments can be performed, but no gaps or circular permutations are handled yet.
The idea is that using vecmath more consistently throughout will increase performance, though Atom/JAMA-based code will stick around for a while.
Interfaces no longer inherit from Cloneable, so implementations should flag themselves specifically.
- Added ScoresCache to all levels of the hierarchy, which allows algorithm-specific scores to be added and retrieved. Replaces several methods for individual scores.
- Removed update methods from the interface
- Removed Pose in favor of raw vecmath transformation matrices
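The score cache idea can be sketched as a small map keyed by score name. This is an illustration only: ScoresCacheSketch is a hypothetical class, and the method names putScore, getScore and getScores are assumptions about the shape of such an interface, not a copy of it.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class ScoresCacheSketch {

    private Map<String, Double> scores; // created lazily to save memory

    // Stores or overwrites an algorithm-specific score.
    void putScore(String property, double score) {
        if (scores == null) {
            scores = new TreeMap<>();
        }
        scores.put(property, score);
    }

    // Returns the score, or null if it has not been stored.
    Double getScore(String property) {
        return scores == null ? null : scores.get(property);
    }

    // Names of all stored scores.
    Set<String> getScores() {
        if (scores == null) {
            return Collections.<String>emptySet();
        }
        return Collections.unmodifiableSet(scores.keySet());
    }
}
```

A single generic accessor like this replaces one getter and setter per score, at the cost of string keys that are not checked at compile time.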
Adapt the alignment panel to work with the MultipleAlignment DS. The changes do not work yet, pending the last changes in the DS.
This should be the preferred way of fetching CA atoms
Only the connection between the panel and the jmol has to be implemented.
- Setters now only modify downstream parts of the hierarchy. For instance, calling MultipleAlignment.setEnsemble() changes the alignment links without touching either new or old ensemble links, but MultipleAlignment.setBlockSet() does modify the alignment link for each block set.
- Added clear() methods for resetting cached variables
- Added MultipleAlignmentEnsemble.addMultipleAlignment() method, rather than modifying the underlying list directly.
- Improve documentation somewhat
- Fix infinite loop in toString methods
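The downstream-only setter rule can be illustrated with a two-level toy hierarchy. AlignmentNode and BlockSetNode are hypothetical classes, not the real MultipleAlignment and BlockSet; the sketch only shows that setting children re-points each child at its new parent, while setting a parent leaves both the old and the new parent's child lists untouched.

```java
import java.util.ArrayList;
import java.util.List;

class BlockSetNode {

    private AlignmentNode parent;

    AlignmentNode getParent() {
        return parent;
    }

    // Upstream setter: changes only this object's own link.
    void setParent(AlignmentNode parent) {
        this.parent = parent;
    }
}

class AlignmentNode {

    private List<BlockSetNode> blockSets = new ArrayList<>();

    List<BlockSetNode> getBlockSets() {
        return blockSets;
    }

    // Downstream setter: also re-points every child at this parent.
    void setBlockSets(List<BlockSetNode> blockSets) {
        this.blockSets = blockSets;
        for (BlockSetNode blockSet : blockSets) {
            blockSet.setParent(this);
        }
    }
}
```

With this rule a caller that moves a child between parents is responsible for updating both parent lists, which avoids hidden side effects in the setters.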
Conflicts: biojava-structure/src/main/java/org/biojava/nbio/structure/align/model/MultipleAlignmentImpl.java
Oops, this can go in 4.1
Replaced by #278
Introduces new data structures for structure alignments, created along with @lafita. The data structure can represent standard pairwise alignments, but also multiple alignments, flexible alignments, and non-topological alignments (#126).
The structure consists of a hierarchy of objects, from top to bottom: MultipleAlignmentEnsemble, MultipleAlignment, BlockSet, Block.
Some documentation still needs to be written and will be added to the cookbook.
A few other design decisions bear mention:
- Structures are identified by String, but this will change to StructureIdentifier following the completion of "Make loading of structures more consistent" (#81).
- Transformations are stored as Matrix4d objects. To support flexible alignments, the definitive matrices are stored in each BlockSet. However, a default matrix can be stored in MultipleAlignment to save memory for rigid alignments.
- AFPChain can be converted directly to MultipleAlignmentEnsemble.
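The default-versus-definitive matrix arrangement might look roughly like the sketch below. Everything here is an assumption made for illustration: RigidAlignmentSketch and FlexiblePartSketch are hypothetical classes, plain 4x4 double arrays stand in for Matrix4d so that the example needs no vecmath dependency, and the fallback from a block set to its parent's default is one plausible reading of the description, not a confirmed behavior.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parent: holds one default 4x4 transformation per structure.
class RigidAlignmentSketch {

    List<double[][]> defaultTransforms = new ArrayList<>();

    static double[][] identity() {
        double[][] m = new double[4][4];
        for (int i = 0; i < 4; i++) {
            m[i][i] = 1.0;
        }
        return m;
    }
}

// Hypothetical block set: may carry its own matrices for flexible parts.
class FlexiblePartSketch {

    private final RigidAlignmentSketch parent;
    private List<double[][]> ownTransforms; // null means "use the default"

    FlexiblePartSketch(RigidAlignmentSketch parent) {
        this.parent = parent;
    }

    void setTransforms(List<double[][]> transforms) {
        this.ownTransforms = transforms;
    }

    // Own matrices win; otherwise fall back to the parent's default.
    List<double[][]> getTransforms() {
        return ownTransforms != null ? ownTransforms : parent.defaultTransforms;
    }
}
```

For a rigid alignment every block set shares the single default list, so no per-block-set matrices need to be allocated.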
This pull request also bundles concurrent development of:
- AtomCache.getRepresentativeAtoms() method (that should replace getAtoms() everywhere)
- etc. etc.!
This is a fairly major feature addition, so I'll leave this request open for a few days to allow comments.