Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store initial and final topologies for all phases -- small molecule pipeline #1210

Merged
merged 10 commits into from Jul 31, 2023

Conversation

ijpulidos
Copy link
Contributor

@ijpulidos ijpulidos commented Jul 10, 2023

Description

These changes make serialization of initial (old) and final (new) topologies for complex and solvent phases for the small molecule pipeline, adding the RelativeFEPSetup._serialize_topologies method which is called at the end of the constructor.

Note that while these changes do NOT break the API, it is a change in the behavior since now the stored objects are different.

Motivation and context

For extracting trajectories for debugging or visualization purposes, it is important to have the topologies available in PDB for the different phases and the initial and final states of the transformation.

Resolves #1154

How has this been tested?

Test written, need to check locally that the files make sense.

Change log

Storage of topologies for initial and final states of the transformation, for all phases of the simulation. Useful for extracting trajectories.

@ijpulidos ijpulidos added this to the 0.10.3 - Bugfix release milestone Jul 10, 2023
@ijpulidos
Copy link
Contributor Author

Tested locally and visualizing the different generated PDB files for an example seem to work as expected.

@mikemhenry mikemhenry changed the base branch from main to 0.10.x July 11, 2023 21:01
@ijpulidos ijpulidos changed the title Initial and final topologies are serialized for complex and solvent phase -- small molecule pipeline Initial and final topologies are stored for complex and solvent phase -- small molecule pipeline Jul 27, 2023
if not os.path.exists(models_path):
os.makedirs(models_path)
for phase_key, phase_data in topologies_to_store.items():
pdb_filename = AnyPath(f"{models_path}/{self._trajectory_prefix}_{phase_key}.pdb")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for keeping this an AnyPath

@ijpulidos ijpulidos requested a review from zhang-ivy July 27, 2023 21:58
@ijpulidos ijpulidos changed the title Initial and final topologies are stored for complex and solvent phase -- small molecule pipeline Store initial and final topologies for all phases -- small molecule pipeline Jul 27, 2023
@codecov
Copy link

codecov bot commented Jul 27, 2023

Codecov Report

Merging #1210 (499b39e) into 0.10.x (4834ff7) will increase coverage by 0.07%.
The diff coverage is 100.00%.

Stores topologies and positions in PDB file for the given the phase, for both the initial (old) and final (new)
states. Stores both solvent and solute whenever possible (only "solute" in vacuum).

Generates two PDB files, one for each state (old/new).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that the RCSB is phasing out PDB files in favor of mmCIF files, should we allow the user to control which file format is used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion about this, but I would advocate on just sticking with one to avoid unnecessary complexity, as possible. If we feel we should be trying to push for PDBx/mmcif files, then I'm okay with that. Assuming openmm.PDBFile can handle that correctly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or openmm.PDBxFile in that case, I guess.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, lets use PDB files for now, but we will want to keep in mind that CIF files are eventually going to replace PDB files, so may want to open an issue about this (for the repo as a whole, not just the issue)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree. Created the issue in #1221

@@ -260,47 +258,6 @@ def warn(*args, **kwargs):
found_tla = True
assert found_tla == True, 'Spectator TLA not in old topology'


def test_host_guest_deterministic_geometries():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this test being deleted?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It basically got replaced by the test at

def test_relative_setup_topologies_storage(tmp_path):

The diff between the two is shown in the following screenshot
perses_tests_diff

You can see only the forcefield was changed between the tests (since we want to be testing openff rather than gaff, preferably), and the lines to test the storage pdbs were added.

@zhang-ivy
Copy link
Contributor

Left some comments, but otherwise looks good!

@mikemhenry mikemhenry added this pull request to the merge queue Jul 31, 2023
Merged via the queue into 0.10.x with commit b362e4b Jul 31, 2023
7 checks passed
@mikemhenry mikemhenry deleted the 1154-overhaul-topology-info branch July 31, 2023 22:37
github-merge-queue bot pushed a commit that referenced this pull request Sep 14, 2023
* run CI on new 0.10.x branch (#1212)

* run CI on new 0.10.x branch

* Update .github/workflows/CI.yaml

Co-authored-by: Iván Pulido <2949729+ijpulidos@users.noreply.github.com>

---------

Co-authored-by: Iván Pulido <2949729+ijpulidos@users.noreply.github.com>

* print perses version on startup (#1176)

* Update comments in `RESTCapableHybridTopologyFactory` (#1189)

* update comments

* bump ci

---------

Co-authored-by: Iván Pulido <2949729+ijpulidos@users.noreply.github.com>
Co-authored-by: Mike Henry <11765982+mikemhenry@users.noreply.github.com>

* use python executable from env (#1174)

* use python executable from env

* run tests using the shell

* parmed 4 seems to be giving us some issues

* actually I think we want the newer parmed

* pin pymbar for now

* ooof, parmed != pymbar

* add some more debugging to figure out how we are getting old parmed

* Update devtools/conda-envs/test_env.yaml

* see if shell=True fixes it

* removed parmed by mistake

* Support pymbar 4 (#1173)

* switch to using pymbar 3 & 4 support from openmmtools

* fix typo on import

* fix yaml

* switch to using pymbar 4

* missed a pymbar import

* missed another one

* bump ci

* missed a import

* go back to how it was

* Update devtools/conda-envs/test_env.yaml

---------

Co-authored-by: Iván Pulido <2949729+ijpulidos@users.noreply.github.com>

* Remove example testing (#1214)

* Store initial and final topologies for all phases -- small molecule pipeline (#1210)

* Initial and final topologies serialized per phase.

* Using properties instead of private attrs.

* Fix test

* Remove uneeded code/attributes

* bump ci

* Store phase topologies separately

* Fix tests. vacuum topologies expected.

* Better docstring.

---------

Co-authored-by: Mike Henry <11765982+mikemhenry@users.noreply.github.com>

* Improve docker building (#1200)

* Added note about example for adding oe license

* make docker file much simpler

* build images in CI

* forgot to add conda-forge

* fix permissions on a step

* get oe_license file mounted in docker container

* mount path must be absolute

* setup singulairty to test + fix testing on docker image

* add some testing deps to the image

* add -v for tests, fix envar

* tests are failing, want to test rest of pipeline

* use latest openmm version

* hardcode perses version for now

* bump ci

* make sure we can make openeye dir

* support a dev build as well

* don't hardcode value

* fix name clash

* forgot to add conda-forge

* bump ci

* test docker image and fix missing deps

* install ps

* also push latest tag

* don't build on tag since the conda-forge package won't exist yet

* don't test the examples

* Remove docker deb build, we can do these ourselves

* better document container use

* build a latest version for apptainer

* build with 11.2 to make things more compatable

* skip docker test to see if other bits build okay

* see if I can get the step to fail if there is an error

* skip docker tests to make sure apptainer builds okay

* Add mpiplus and mpi4py to docker image

* give correct path to oe license

* add mpi stuff to docker

* clean up diskspace before build

* skip tests for singularity now that the only failures come from a package bug

* Clean examples -- CLI protein-ligand example for Tyk2 (#1223)

* Improving examples dir structure and readme

* Adding Tyk2 CLI example

* removing new-cli/ripk2 example (deprecated)

* fix typo for link

* Clarify tyk2 cli example docs

* Realtime analysis interval to default to checkpoint interval (#1227)

* CI miscellaneous fixes (#1217)

* CI minor fixes. Allow codecov to fail.

* bump ci

---------

Co-authored-by: Mike Henry <11765982+mikemhenry@users.noreply.github.com>

* Changing offline freq default to checkpoint interval

* Fixing input yaml for example

* commenting offline-freq param

---------

Co-authored-by: Mike Henry <11765982+mikemhenry@users.noreply.github.com>

* MPI example with dipeptide mutation (#1228)

* peptide mutation MPI example added

* better documentation of motivation

* Fix/issue 1194 (#1230)

* set cutoff distance in sterics_custom_nonbonded_force

* matching cutoff for custom forces. Improving logic.

* test for HTF nonbonded cutoff

---------

Co-authored-by: Iván Pulido <2949729+ijpulidos@users.noreply.github.com>

* Fix/issue 1196 (#1229)

* CI miscellaneous fixes (#1217)

* CI minor fixes. Allow codecov to fail.

* bump ci

---------

Co-authored-by: Mike Henry <11765982+mikemhenry@users.noreply.github.com>

* added dels to contexts

* Update perses/app/setup_relative_calculation.py

---------

Co-authored-by: Iván Pulido <2949729+ijpulidos@users.noreply.github.com>

* Fix spectator support (#1233)

* fix spectator support. Enabling test.

* Test to run on GPU CI.

* pin <4 for pymbar on GPU

---------

Co-authored-by: Iván Pulido <2949729+ijpulidos@users.noreply.github.com>
Co-authored-by: Ivy Zhang <35546250+zhang-ivy@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Overhaul perses writing of topology information
3 participants