Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify _construct_atom_map for protein mutations #860

Merged
merged 15 commits into from Oct 29, 2021

Conversation

zhang-ivy
Copy link
Contributor

@zhang-ivy zhang-ivy commented Sep 1, 2021

Description

I've simplified the code for generating protein mutation atom mappings by generating the mapping directly from the old/new residue topologies. Generating the atom mapping no longer involves using AtomMapper or oemols, but oemols are still generated and attached to the output TopologyProposal because they are needed as input to the geometry engine.

By default, we map backbone atoms and CB atoms (except for GLY, where we map CA but not HA2 or HA3).

Motivation and context

Resolves #857

How has this been tested?

I added test_protein_atom_maps() to test_topology_proposal.py, which tests the generation of atom maps for some representative transformations.

Change log


@codecov
Copy link

codecov bot commented Sep 1, 2021

Codecov Report

Merging #860 (a8de474) into master (e165e76) will decrease coverage by 0.07%.
The diff coverage is 98.80%.

Co-authored-by: Mike Henry <11765982+mikemhenry@users.noreply.github.com>
Copy link
Contributor

@dominicrufa dominicrufa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good. the test doesnt provide full coverage of protein mutations, but that might be unnecessary?

@@ -336,6 +337,71 @@ def test_mutate_from_alanine():
else:
_ = generate_dipeptide_top_pos_sys(ala.topology, amino, ala.system, ala.positions, system_generator, conduct_geometry_prop=False)

def test_protein_atom_maps():

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like you hardcoded some specific mutations. is this robust enough? i guess you can augment this test as/if you encounter problems in future?

Copy link
Contributor Author

@zhang-ivy zhang-ivy Sep 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is robust enough, as the test cases are representative of the different cases handled by the code:

  1. mutations where the old or new residue is GLY
  2. mutations where the old or new residue is in ['THR', 'ILE', 'VAL'] -- these contain HB atoms whereas all other non GLY amino acids contain HB1/HB2/HB3
  3. all other mutations



# Define function for checking that the atom map is correct
def check_atom_map(topology_proposal, reference_map):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be a utility outside the test fn?

Copy link
Contributor Author

@zhang-ivy zhang-ivy Sep 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will never be used outside of test_protein_atom_maps, so there's no need to move it outside

new_oemol_res_copy : openeye.oechem.oemol object
copy of modified new oemol

local_atom_map : dict, key: int, value int
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i vaguely remember there being a reason we make a deepcopy of the oemol to append to the Topology object. can we keep that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scratch this. i dont think we need to keep deepcopy anymore.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed. the deepcopy was necessary before because we were deleting backbone atoms from the original oemol (which would then be inputted to AtomMapper). However, we are no longer using the oemol for generating the map.

current_oemol = PolymerProposalEngine.generate_oemol_from_pdb_template(current_residue_pdb_filename)
proposed_oemol = PolymerProposalEngine.generate_oemol_from_pdb_template(proposed_residue_pdb_filename)

old_oemol_res_copy = copy.deepcopy(current_oemol)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reference to the oemol deepcopy is here.

@zhang-ivy
Copy link
Contributor Author

@jchodera Could you review this and let us know if its ok to merge?

@zhang-ivy
Copy link
Contributor Author

@jchodera : Could you review this and let us know if its ok to merge?

@mikemhenry mikemhenry merged commit 7d76dda into master Oct 29, 2021
@mikemhenry mikemhenry deleted the simplify-protein-atom-mapping branch October 29, 2021 20:18
@mikemhenry mikemhenry mentioned this pull request Jan 17, 2022
@mikemhenry mikemhenry restored the simplify-protein-atom-mapping branch January 26, 2022 20:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

AtomMapper returns weird maps for certain protein mutation transformations
4 participants