Chrisjurich/feature new mutation submodule#64
Merged
shaoqx merged 27 commits intoJul 4, 2022
…mutation submodule nearly ready
Collaborator
Can you change it to merging into develop_refactor?
shaoqx
approved these changes
Jul 4, 2022
Collaborator
shaoqx
left a comment
Great job! More review will be in a new pr of my edit
This branch is a rewrite of the mutation submodule. The base unit of the new mutation submodule is the `Mutation` namedtuple. `Mutation` describes a desired mutation for a given protein; its attributes are summarized in the docstring found in `enzy_htp/mutation/mutation`.

The `Mutation` namedtuple is critical for modeling mutations across a full enzyme system, as it allows the full space to be mapped in a way that can be sampled efficiently. The full mutational space of an enzyme contains N × 19 `Mutation` namedtuples, where N is the number of canonical residues in the sequence. In Python this can be represented as a `dict()` where each `(key, value)` pair is `(residue number, list() of Mutation namedtuples)`. This modeling approach is convenient both for applying restrictions and for randomly selecting residues.

Mutation restrictions are now represented and encoded via the `MutationRestrictions()` object. The source for this object is found in `enzy_htp/mutation/mutation_restrictions.py`. At its core, this object is a list of dict objects, each with the same keys. One dict exists for each residue, and it stores 1) whether the residue can be mutated at all and 2) which mutations are restricted. Potential restrictions include changes to size or charge in specific directions. `MutationRestrictions()` is compatible with the mapping of a protein's mutation space, so that applying it effectively reduces this space. When a residue is to be locked completely from mutation, the position is removed from the mutation dict, and other incompatible `Mutation` objects are removed.

The user-facing function in this module is `mutate_pdb()`. This function combines the previous two concepts with random selection from the mutational space and final mutation deployment. For a given protein, the function first creates all possible `Mutation`s in the aforementioned dict layout. Next, the restrictions (if present) are applied to reduce the possible space. Then the final list of `Mutation` namedtuples is selected by randomly choosing an available/eligible position in the structure and then randomly choosing a `Mutation` from the respective list. After a mutation is chosen from a specific position, that position is removed from the mutation dict. As a result, positions cannot be mutated twice and selection speed is drastically improved. Lastly, the desired mutations are deployed through an implementation engine. Currently the only engine is Amber's `tleap` program.

Tests for all of the above functionality are contained within `test/mutation/`.
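
For reviewers who want a concrete picture of the dict layout and the selection step described above, here is a minimal, self-contained sketch. It is not the code in this branch: the `Mutation` field names (`orig`, `target`, `chain_id`, `res_num`) and the helpers `build_mutation_space()`, `lock_positions()` and `sample_mutations()` are hypothetical names chosen for illustration, and only full locking of a position is modeled (no size or charge restrictions, no `tleap` deployment).

```python
import random
from collections import namedtuple

# Hypothetical stand-in for the real namedtuple; the field names are assumptions.
Mutation = namedtuple("Mutation", ["orig", "target", "chain_id", "res_num"])

CANONICAL = "ACDEFGHIKLMNPQRSTVWY"  # the 20 canonical one-letter codes


def build_mutation_space(sequence, chain_id="A"):
    """Map each 1-indexed residue number to its 19 possible Mutations."""
    space = {}
    for res_num, orig in enumerate(sequence, start=1):
        if orig not in CANONICAL:
            continue  # non-canonical residues contribute no entries
        space[res_num] = [
            Mutation(orig, target, chain_id, res_num)
            for target in CANONICAL
            if target != orig
        ]
    return space


def lock_positions(space, locked):
    """Return a copy of the space with fully locked positions removed."""
    return {num: list(muts) for num, muts in space.items() if num not in locked}


def sample_mutations(space, n, rng=None):
    """Randomly pick up to n Mutations, at most one per position."""
    rng = rng or random.Random()
    remaining = {num: list(muts) for num, muts in space.items()}  # keep input intact
    chosen = []
    for _ in range(n):
        if not remaining:
            break  # fewer eligible positions than requested mutations
        res_num = rng.choice(sorted(remaining))
        chosen.append(rng.choice(remaining[res_num]))
        del remaining[res_num]  # a position cannot be mutated twice
    return chosen


space = build_mutation_space("MKV")
reduced = lock_positions(space, locked={2})
picks = sample_mutations(reduced, n=2, rng=random.Random(0))
```

Deleting the chosen position from the working dict is what guarantees one mutation per residue, and it means later draws never have to re-check earlier picks.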