arbitrary offsets #46

Closed
wants to merge 12 commits into from

@dkoes dkoes commented May 31, 2022

Description

This is a modification of the previous pull request that supports arbitrary offsets from the variable dihedral.
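To make "arbitrary offsets from the variable dihedral" concrete, here is a minimal sketch of generic NeRF-style atom placement in plain Python. It is not the code in this PR: `place_atom`, `dihedral`, `wrap`, and all coordinates, bond lengths, and angles below are hypothetical illustration values. The point is only that sibling atoms can share one variable dihedral `chi` and each be placed at `chi + offset` for any offset, not just pi.

```python
import math

def sub(u, v):
    return [u[i] - v[i] for i in range(3)]

def dot(u, v):
    return sum(u[i] * v[i] for i in range(3))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def unit(u):
    norm = math.sqrt(dot(u, u))
    return [x / norm for x in u]

def place_atom(a, b, c, bond_length, bond_angle, torsion):
    """Place atom d so that |c-d| = bond_length, angle(b, c, d) = bond_angle,
    and dihedral(a, b, c, d) = torsion (angles in radians)."""
    bc = unit(sub(c, b))
    n = unit(cross(sub(b, a), bc))   # normal of the a-b-c plane
    m = cross(n, bc)                 # completes the local frame (bc, m, n)
    local = [-bond_length * math.cos(bond_angle),
             bond_length * math.sin(bond_angle) * math.cos(torsion),
             bond_length * math.sin(bond_angle) * math.sin(torsion)]
    return [c[i] + local[0] * bc[i] + local[1] * m[i] + local[2] * n[i]
            for i in range(3)]

def dihedral(a, b, c, d):
    """Signed dihedral angle a-b-c-d in radians."""
    b1, b2, b3 = sub(b, a), sub(c, b), sub(d, c)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    y = dot(b2, cross(n1, n2))
    x = math.sqrt(dot(b2, b2)) * dot(n1, n2)
    return math.atan2(y, x)

def wrap(angle):
    """Wrap an angle into [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))

# Hypothetical reference atoms and geometry (not taken from any real residue).
a, b, c = [0.0, 1.4, 0.0], [0.0, 0.0, 0.0], [1.5, 0.0, 0.0]
chi = 1.0                                             # the variable dihedral
offsets = [0.0, 2 * math.pi / 3, -2 * math.pi / 3]    # arbitrary offsets
placed = [place_atom(a, b, c, 1.5, math.radians(109.5), chi + off)
          for off in offsets]
measured = [dihedral(a, b, c, d) for d in placed]
```

Measuring the dihedral of each placed atom recovers `chi + offset` (up to wrapping), which is a quick sanity check for any offset table.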

Coordinate building from angles is much faster. With autograd on, fastbuild is more than 300 times faster than the default routines for a 437-residue protein (and scales better). It uses parallel matrix operations and builds the backbone in a logarithmic number of steps. It should be straightforward to define alternative sidechain geometries.
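For readers unfamiliar with building a chain in a logarithmic number of steps, the sketch below shows one standard way to do it: a parallel prefix scan over per-link rigid transforms. It is written in plain Python so it runs without dependencies and is not the fastbuild code itself; `link_transform`, `scan_prefix`, and the torsions, bond angle, and bond length are hypothetical. In a PyTorch version, each scan round would presumably be one batched matrix multiply rather than a list comprehension.

```python
import math

def matmul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def link_transform(torsion, bond_angle, bond_length):
    """Hypothetical per-link rigid transform as a 4x4 homogeneous matrix:
    rotate about x by the torsion, about z by (pi - bond_angle), then
    translate along x by the bond length."""
    ct, st = math.cos(torsion), math.sin(torsion)
    ca, sa = math.cos(math.pi - bond_angle), math.sin(math.pi - bond_angle)
    rot_x = [[1, 0, 0, 0], [0, ct, -st, 0], [0, st, ct, 0], [0, 0, 0, 1]]
    rot_z = [[ca, -sa, 0, 0], [sa, ca, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
    shift = [[1, 0, 0, bond_length], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
    return matmul(matmul(rot_x, rot_z), shift)

def sequential_prefix(transforms):
    """Reference: accumulate T0, T0*T1, T0*T1*T2, ... one link at a time."""
    out, acc = [], None
    for t in transforms:
        acc = t if acc is None else matmul(acc, t)
        out.append(acc)
    return out

def scan_prefix(transforms):
    """Same prefix products via a Hillis-Steele scan.

    Every entry in a round is computed independently of the others, so a
    round can run in parallel; the number of rounds grows as ceil(log2(n))."""
    x = list(transforms)
    n, step, rounds = len(x), 1, 0
    while step < n:
        # Left operand covers the earlier links, so order is preserved.
        x = [x[i] if i < step else matmul(x[i - step], x[i]) for i in range(n)]
        step *= 2
        rounds += 1
    return x, rounds

n_links = 437  # illustrative chain length
links = [link_transform(0.1 * i, math.radians(110.0), 1.5)
         for i in range(n_links)]
frames, rounds = scan_prefix(links)
# The translation column of each cumulative frame is that link's end position.
positions = [[f[0][3], f[1][3], f[2][3]] for f in frames]
```

Because rigid-transform composition is associative, the scan gives the same frames as the sequential loop while needing only a logarithmic number of dependent rounds, which is what lets the work be spread across parallel matrix operations.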

This is a pure Python implementation. A C++ implementation improved speed by only about 10%, and I decided that wasn't worth the extra complexity and the opacity to PyTorch tools.

Todos

  • Need to fill in all the information required to build sidechain hydrogens in sc_all_atom_build_info. This is essential for getting the desired orders-of-magnitude speedups when constructing a full protein for OpenMM.
  • Need to deal with terminal atoms (e.g. OXT). In theory this can be implemented in the existing framework by creating C- and N-terminal residue types with their own sidechain definitions, but since it's only a couple of residues, a manual solution is probably acceptable, if less elegant.
  • Need to implement the backward pass of MakeSCCoords. Once this is done, switching to this custom autograd function should improve speed by an additional 3x.
  • Tests need to be written.
  • Batched versions of these functions might provide some additional performance benefit.

Questions

  • Why can't SidechainNet angles be appropriately oriented to begin with (instead of having to be shifted by pi)?

Status

  • Ready to go. Seriously, swap this in ASAP (need hydrogens) so training can be that much faster.

jonathanking and others added 12 commits November 4, 2021 11:21
Since custom datasets can have custom split names, the code no longer
assumes to know the split names when correcting 1GJJ_1_A.
p.build_coords_from_angles(p.angles, add_hydrogens=True) should build
coordinates if needed

sb = StructureBuilder(p.seq, ang=torch.Tensor(p.angles),device=p.device)
should not throw an error
nan instead of zero for uncalculated coordinates
build coords automatically
notebook
..not just 180 degrees.  Reference dihedral must be immediately after
source atom position (which is what is expected in any sane ordering).
@jonathanking jonathanking mentioned this pull request Jun 6, 2022
7 tasks
@jonathanking jonathanking deleted the branch jonathanking:research_openmm October 1, 2022 18:19