Improve structural alignment datastructures #126
Comments
Spencer From a Google Summer of Code project a couple years ago that Andreas and I The developer was a PhD student who is no longer active so this code needs Scooter On Mon, May 12, 2014 at 6:39 AM, Spencer Bliven notifications@github.com wrote:
Spencer: what is the "creative work-around" that you took to store that alignment?
@willishf This issue is about structure alignment. However, we don't have any order-independent sequence alignment algorithms either, and SequencePair wouldn't be able to store them if we did. So really the issue could be a feature request in both the structure and sequence spaces.

@andreasprlic The work-around was storing each side of the CP in a separate block in the AFPChain, then getting you to fix all the places in AFPWriter and other classes that assumed a sequential order between blocks. There are still places which assume sequential order within blocks for performance.

Basically, this issue is for a rewrite of AFPChain that I've long been thinking about. AFPChain started out as a bean for all the globals during the jFATCAT/jCE port, but now it is the core class for structure alignment. I would prefer a more conceptual data model of what constitutes an alignment. For the structure package this must include order-independent concepts to support CE-CP and derivative algorithms.
This should have been closed when the MultipleAlignment data structure (to store multiple structural alignments) was merged (#278).
BioJava contains a number of algorithms for aligning protein structures. In the most general case, an alignment consists of a mapping between residues of two (or more) proteins. However, for historical and performance reasons alignments are stored as linear, sorted arrays. This makes it difficult to express cases where the order of aligned residues differs between the two proteins. For instance, storing the following alignment requires some creative work-arounds:
Additionally, the class to store structural alignments (AFPChain) contains a number of unnecessary, poorly documented, or algorithm-specific parameters which should be removed or refactored.
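To make the ordering problem concrete, here is a minimal sketch of a block-based residue mapping that can hold a circularly permuted alignment. It is only an illustration under stated assumptions: `BlockAlignmentSketch`, `Block`, and all method names are hypothetical and are not part of BioJava or of AFPChain, and residues are assumed to be identified by non-negative integer indices.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names exist in BioJava.
public class BlockAlignmentSketch {

    /** A run of aligned pairs: res1[k] in protein 1 is aligned to res2[k] in protein 2. */
    static final class Block {
        final int[] res1;
        final int[] res2;

        Block(int[] res1, int[] res2) {
            if (res1.length != res2.length) {
                throw new IllegalArgumentException("Both sides of a block must have equal length");
            }
            this.res1 = res1.clone();
            this.res2 = res2.clone();
        }
    }

    private final List<Block> blocks = new ArrayList<>();

    void addBlock(int[] res1, int[] res2) {
        blocks.add(new Block(res1, res2));
    }

    /** Total number of aligned residue pairs over all blocks. */
    int alignedLength() {
        int n = 0;
        for (Block b : blocks) {
            n += b.res1.length;
        }
        return n;
    }

    /** Index in protein 2 aligned to residue i of protein 1, or -1 if i is unaligned. */
    int partnerOf(int i) {
        for (Block b : blocks) {
            for (int k = 0; k < b.res1.length; k++) {
                if (b.res1[k] == i) {
                    return b.res2[k];
                }
            }
        }
        return -1;
    }

    /** True only if both index series strictly increase when pairs are read in stored order. */
    boolean isSequential() {
        int last1 = -1;
        int last2 = -1; // residue indices are assumed to be non-negative
        for (Block b : blocks) {
            for (int k = 0; k < b.res1.length; k++) {
                if (b.res1[k] <= last1 || b.res2[k] <= last2) {
                    return false;
                }
                last1 = b.res1[k];
                last2 = b.res2[k];
            }
        }
        return true;
    }

    /** Toy circular permutation: the two halves of protein 1 map to swapped halves of protein 2. */
    static BlockAlignmentSketch circularPermutationExample() {
        BlockAlignmentSketch a = new BlockAlignmentSketch();
        a.addBlock(new int[] {0, 1, 2}, new int[] {3, 4, 5});
        a.addBlock(new int[] {3, 4, 5}, new int[] {0, 1, 2});
        return a;
    }

    public static void main(String[] args) {
        BlockAlignmentSketch a = circularPermutationExample();
        System.out.println("aligned pairs: " + a.alignedLength());
        System.out.println("partner of residue 4: " + a.partnerOf(4));
        System.out.println("sequential: " + a.isSequential());
    }
}
```

In this toy case each block is sequential on its own, but the alignment as a whole is not, so a single pair of sorted arrays cannot represent it. Code that consumes such a structure would need to avoid assuming a sequential order between blocks.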
AFPChain should be refactored into a new data structure that
Suggested for GSoC 2013