Planning for 7.7/8.0 #3191
Comments
I added one more to the list of features needing design that we should investigate: #2514: Fully flexible unit cells during NPT simulations |
@peastman : What if we call this the "initial draft" of features for 7.8 and tag all these issues for that milestone? If we decide to change the milestone to 8.0, we can simply rename the milestone to 8.0 and they'll all update. |
Some specific feedback on the larger features: This would be incredibly useful, but we would have to balance investment in the OpenMM Again, I'd be cautious here: How useful is the "latest version of AMOEBA"? Is there an automated parameterization scheme we can leverage? Would it be more useful to implement more "a la carte" polarizability for the Open Force Field effort to use to integrate polarization, given they produce easy-to-use, open source, fully automated parameterization tools? The solution we identified---use the assignment scheme already within OpenMM for validation and just bend it to also apply parameters---should be quick to implement. Beyond ~1 day worth of effort, probably not worth further investment.
The whole setup pipeline is our weakest link here. We might discuss how we could link up with others that have more programmatic ways to prepare systems for the longer term. Notably, HTMD (from @giadefa) has some very robust programmatic approaches to preparing membrane systems.
These all sound sensible and not that time-consuming to implement! For the "more research needed" features, we might seriously also discuss (perhaps at the next OpenMM call)
|
Also, this fancier hydrogen mass repartitioning feature just came across my radar, and might be an easy feature to include. |
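For readers unfamiliar with the idea, here is a minimal sketch of plain hydrogen mass repartitioning, not the fancier variant mentioned above. It assumes masses in amu, a bond list of index pairs, and a target hydrogen mass of 3.0 amu chosen only for illustration; the function name is hypothetical and this is not OpenMM's implementation.

```python
def repartition_hydrogen_mass(masses, bonds, is_hydrogen, hydrogen_mass=3.0):
    """Return new masses with each bonded hydrogen raised to hydrogen_mass.

    The added mass is taken from the heavy atom the hydrogen is bonded to,
    so the total mass of the system is unchanged.
    """
    new_masses = list(masses)
    for i, j in bonds:
        # Orient the pair so that h is the hydrogen and heavy is its partner.
        if is_hydrogen[i] and not is_hydrogen[j]:
            h, heavy = i, j
        elif is_hydrogen[j] and not is_hydrogen[i]:
            h, heavy = j, i
        else:
            continue  # heavy-heavy or H-H bond: nothing to repartition
        transfer = hydrogen_mass - masses[h]
        new_masses[h] += transfer
        new_masses[heavy] -= transfer
    return new_masses


# Methane-like example: one carbon bonded to four hydrogens.
masses = [12.011, 1.008, 1.008, 1.008, 1.008]
bonds = [(0, 1), (0, 2), (0, 3), (0, 4)]
is_hydrogen = [False, True, True, True, True]
new_masses = repartition_hydrogen_mass(masses, bonds, is_hydrogen)
```

Because mass only moves within each bonded pair, the total mass is conserved; the heavier hydrogens slow the fastest bond vibrations, which is what permits a longer time step.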
This doesn't involve parametrizing anything ourselves. It just means converting the latest version of the force field included with Tinker. We've had a lot of requests for it. |
I'm asking "how useful is the latest version of AMOEBA to our users"? Can they simulate protein-ligand complexes? Even if we don't have to parameterize anything, if our users have to run a ton of MP2 calculations and derive parameters by hand, it's not going to be too useful to them either. |
I don't think protein-ligand simulations are one of the main things AMOEBA gets used for. |
Well, that's because the tools to do this don't exist. Can you point me to the latest AMOEBA publications, parameters, and tools? We can do a little poll to see how helpful this actually would be to our users. I suspect it really isn't as helpful as focusing on things that can actually be used to model protein-ligand complexes. |
It's a common request. It comes up regularly both on the forum and in emails people send me. Protein-ligand complexes are only one narrow slice of what OpenMM gets used for. |
I'm extremely dubious: I just don't think that a force field only useful for protein-in-solvent---without even cofactors---is all that useful. Can you point me to the latest AMOEBA publications, parameters, and tools? We can do a little poll to see how helpful this actually would be to our users. Users can still use Tinker and Tinker-HP (which now supports GPUs) to run those simulations. The developers are funded by an ERC Synergy grant that provides millions of Euros to support that work. I just don't see this as being a useful priority for us, though supporting mix-and-match components of the potential could be highly useful above and beyond what Tinker provides. |
No, but I'm sure @jayponder can. |
Here's an idea: Could we make it easy to fit AMOEBA parameters for new small molecules by leveraging our QML potentials? Could we build a tool that would use one of the QML potentials we provide to generate AMOEBA parameters users could then easily use to simulate protein:ligand complexes with AMOEBA? And potentially support QML/AMOEBA simulations in a manner analogous to this really exciting QM/AMOEBA work from the AMOEBA folks? This could leverage our strengths. |
Hi @peastman, I wonder what the status is of making PME easier to support in alchemical free energy implementations built on OpenMM. Last time I looked into it, it wasn't straightforward to use softcore potentials in reciprocal-space calculations. I know @jchodera uses (or used to use) an end-state reweighting approach that I believe is not generally applicable. |
I vote for #1155 (report minimization progress)! Ideally it should be possible to get
Also, it would be good to have #2898 (prevent barostat from re-imaging molecules). We are seeing a lot of confused clients thinking that something went wrong with a system when the barostat wrapped it. |
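To illustrate why a wrapped system can look broken even though nothing is wrong, here is a small sketch of whole-molecule re-imaging in a rectangular box. It is an illustration only: the function name is hypothetical, the box is assumed rectangular with its origin at zero, and this is not a description of how OpenMM's barostat is implemented.

```python
import math


def wrap_molecule(positions, box):
    """Shift a whole molecule so its geometric center lies inside the box.

    positions: list of (x, y, z) tuples for one molecule
    box: (Lx, Ly, Lz) edge lengths of a rectangular periodic box
    Every atom gets the same shift, so the molecule stays intact.
    """
    n = len(positions)
    center = [sum(p[d] for p in positions) / n for d in range(3)]
    # Whole number of box lengths needed to bring the center into [0, L).
    shift = [-math.floor(center[d] / box[d]) * box[d] for d in range(3)]
    return [tuple(p[d] + shift[d] for d in range(3)) for p in positions]


# A two-atom molecule whose center has just drifted past the box edge at x = 3.
wrapped = wrap_molecule([(2.9, 1.0, 1.0), (3.3, 1.0, 1.0)], (3.0, 3.0, 3.0))
```

The molecule jumps by a full box length in x while its internal geometry is untouched. In a trajectory viewer that jump is easy to mistake for a problem with the system, which seems to be the confusion described above.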
Sorry, I've literally been in the wilderness most of this past week, and away from internet and cell phone connections... The standard AMOEBA parameterization tool is POLTYPE, which is available on Github from https://github.com/pren/poltype. Actually, instead you want the POLTYPE2 version, as that's the version that's been under active development for some time now (months to years..). It's fronted and primarily developed by Pengyu Ren's lab, but lots of people are working on it. The newer POLTYPE2 is not "perfect", but it's pretty good at this point. That's what everyone in the AMOEBA community is using for ligands, cofactors, etc. And it's the "official" AMOEBA parameter generator. POLTYPE is Python-based, and as of now you need to download it and run it yourself. It requires a Tinker installation, and access to either Psi4 or Gaussian. The plan is to put up a fully automated server site, but we're not there yet and likely won't be in the near future. Under the hood POLTYPE is roughly analogous to GAFF, CGenFF and LigParGen, but does significantly more semi-serious to serious QM in order to fit AMOEBA values- especially for electrostatics/polarization, but also for stuff like torsions on POLTYPE auto-generated model fragments, etc. These days it's fast enough to fairly easily run sets of ligands/drugs, even larger ones. Though you won't want to try to push all of PubChem through it :) |
LOL :) That ERC grant is spearheaded by Jean-Philip Piquemal and others in France. I think I'm listed somewhere in that huge proposal as a deep collaborator. It's a lot of funding, but as far as I can tell it's supporting dozens of different projects and topics, most of which are not AMOEBA-related. And I've never seen a penny of that money... For what it's worth, Pengyu is developing POLTYPE without any direct funding. And my entire group is one postdoc and two grad students, whom I struggle to keep fed... We do the best we can to advance Tinker/AMOEBA and try to make things available via Github. While we do now have our own GPU-capable codes (Tinker9 and Tinker-HP; see https://github.com/TinkerTools, which also has the main POLTYPE2 version), we would like to and intend to keep a fully correct, efficient and up-to-date version of AMOEBA running in OpenMM for the broader OpenMM community to use. |
What is the output of POLTYPE? It would be great if we could make it so molecules parametrized with it can be used in OpenMM. |
It generates a complete Tinker-compatible parameter file for the input molecule(s). As of now, I believe input is either SMILES strings, MOL2 files, or Tinker coordinates files. I think you folks already have tools to automate the conversion of the Tinker parameter file format to your XML-style format? We'd probably prefer to not build different outputs directly into the code, but we could discuss that... |
We do have a quite old, kind of hacked together script for converting Tinker parameter files to OpenMM force fields. It only supports the specific features that we needed for converting AMOEBA, so it isn't really a general purpose Tinker parameter converter. I'll need to update it for the features in the newer AMOEBA. The two obvious approaches would be either for POLTYPE to generate OpenMM XML files, or for OpenMM to have a well supported class for reading Tinker parameter files. I think both approaches could be made to work. The first approach obviously only needs to support the specific features found in POLTYPE output files. For the second approach, we would need to decide how general to make it. @pren your thoughts on this would be great. |
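As a rough idea of what the simplest slice of such a converter involves, here is a sketch that turns one Tinker `atom` definition into an OpenMM-style `<Type>` element. It assumes the usual Tinker layout (type, class, symbol, quoted description, atomic number, mass, valence) and covers only atom types; multipoles, polarization and every other AMOEBA term are left out. The function name and the tiny element table are hypothetical, and this is not the existing conversion script.

```python
import shlex
from xml.sax.saxutils import quoteattr

# Tiny lookup for illustration only; a real converter needs the full table.
ELEMENTS = {1: "H", 6: "C", 7: "N", 8: "O", 15: "P", 16: "S"}


def tinker_atom_to_xml(line):
    """Convert one Tinker 'atom' line into an OpenMM-style <Type> element."""
    # shlex keeps the quoted description together as a single field.
    fields = shlex.split(line)
    if len(fields) != 8 or fields[0] != "atom":
        raise ValueError("not a Tinker atom definition: %r" % line)
    _, type_id, class_id, _symbol, _description, atomic_number, mass, _valence = fields
    element = ELEMENTS[int(atomic_number)]
    return "<Type name=%s class=%s element=%s mass=%s/>" % (
        quoteattr(type_id),
        quoteattr(class_id),
        quoteattr(element),
        quoteattr(mass),
    )


xml_line = tinker_atom_to_xml('atom  1  1  N  "Glycine N"  7  14.003  3')
```

Atom types are the easy part. Most of the work in either approach discussed here is in the force terms, which is where deciding how general to make a reader matters.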
POLTYPE produces parameters for small molecules. Tinker has a library of macromolecule (AMOEBA) parameters that are often needed as well. So a class that reads Tinker AMOEBA parameters would be the minimum.
|
We've already got an XML file for the standard AMOEBA parameters, and we'll be creating an updated one for the newer version. So all that needs to be converted is the extra parameters generated by POLTYPE. |
Hi @jayponder @pren! Thanks for pointing out POLTYPE. A few questions:
|
Hi John,
Yes. Though as per my earlier reply, see the "poltype2" branch.
It is in active development. Though the current "development version", found in the "poltype2" repo is stable enough for general use. Lots of us are using it almost daily. If you prefer a different nomenclature, it's probably what you might consider an "alpha" or "beta" release version.
I would suggest citing Pengyu's POLTYPE Github site, and the original POLTYPE paper: "Automation of AMOEBA polarizable force field parameterization for small molecules", J. C. Wu, G. Chattree and P. Ren, Theoretical Chemistry Accounts, 131, 1138 (2012). There is not a publication for POLTYPE2 at present, though I'm sure Pengyu will publish one in due course; I'll leave it to him to elaborate on publication plans.
There are not really different "versions" of AMOEBA, just slight tweaks to the parameter sets over time. For proteins, nucleic acids, water, ions, etc. the current "production" parameters are all in the "amoebabio18.prm" parameter file distributed with the Tinker codes. This is our unified parameter set from circa 2018. The component pieces (proteins, etc.) were all published separately, and references can be found near the top of the parameter file. These are the parameters to be used with small organics, ligands, etc. generated via POLTYPE2.
I'll let Pengyu reply here. But anyone with even minimal Python experience can install it.
See the README_INSTALL.MD file.
As mentioned in my email above, you need either Psi4 or Gaussian. I am personally a big fan of Psi4, and if there are remaining dependencies specific for Gaussian, we should try to eliminate them so that only Psi4 is actually required (in case Gaussian is not available). One possible issue I am aware of is that Psi4 does not have gradients of implicit solvation models. This may be essentially the only remaining pure Gaussian dependency.
GDMA is distributed with Psi4 as an option to be included at build time, and is (I think) included in the executable with their Conda distribution. For use with Gaussian, you need to download the freely available GDMA code from Anthony Stone's web site. It is a small Fortran90 code that is relatively simple and trivially fast to build.
Again, up to Pengyu. But my guess is we would probably prefer for the translation to be done on the OpenMM end instead of on our end. This is a valid point for discussion.
I was actually not even aware of this, and don't know what any Perl code (!) is doing. My guess is that it's some small, trivial thing. Have to get back to you on this...
This is not really for you guys to do, and would probably be frustrating at present as the POLTYPE2 code is still changing frequently. POLTYPE2 is a Tinker family code, and developed by Pengyu's group. My intent in raising this whole conversation was not for you folks to feel like you have to distribute it as part of OpenMM. I simply wanted everyone to know that a parameterization tool is already available, and is under very active development. As I noted earlier, our eventual intent is to make AMOEBA parameterization, at least for molecules up to some reasonable size limit, available as a web service.
As above, this is for us to do. As you say, this is definitely on our roadmap, as it is essentially the GAFF or CGenFF for AMOEBA, and it belongs with Tinker and AMOEBA. We will try to make it easy for OpenMM users to access, and we are certainly open to suggestions to make that simpler. But Pengyu, and the rest of the Tinker and AMOEBA crew as needed, will develop, package and distribute POLTYPE and future parameterization tools. Best, Jay |
Goodness, somehow I had totally missed #3191 (comment), which anticipated the answers to many of my questions! Apologies for this, and thanks for the detailed reply. Going through this now... |
@jmichel80 : You'll be excited to check out #3173, which allows you to disable the direct-space PME calculation in The direct-space PME contributions can be handled by a Bookkeeping is still very complicated here, but this at least means you can fully treat softcore electrostatics as well in a flexible manner while retaining exact PME at the endstates. |
This is fantastic, @jayponder, since this would not even be possible without your help. I was hoping we could focus our limited resources (one NIH grant that was cut by >35%) on what would deliver the most unique value to users without being duplicative of other better-resourced projects. From twitter, it looked like Tinker-HP was ready to launch into the stratosphere. :) |
Hi John, One needs to be careful interpreting Twitter... I believe this has been amply proven by our collective experience with the political dysfunction here in America over the past few years. |
Thanks! I'll take a look at it. |
@peastman is right here. I do find it hard to use and pretty outdated, but we should be continuing to fully support it until we can gracefully transition to a more modern platform like GitHub Discussions. |
One of the (many) braindead features of the PDB format is that CONECT records don't necessarily imply covalent bonds. In particular, the spec for the CONECT record says: "For hydrogen bonds, when the hydrogen atom is present in the coordinates, a CONECT record between the hydrogen atom and its acceptor atom is generated." So the CONECT record here is quite possibly correct, albeit incredibly unhelpful :). |
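To make the problem concrete, here is a minimal sketch of reading a CONECT record, assuming the fixed-column layout of the PDB format (source atom serial in columns 7-11, up to four partner serials in the following 5-character fields). The function name and the sample record are made up for illustration.

```python
def parse_conect(line):
    """Return (source, partner) serial pairs from one PDB CONECT record."""
    if not line.startswith("CONECT"):
        return []
    source = int(line[6:11])
    pairs = []
    # Up to four partner serials, each in a 5-character field.
    for start in range(11, 31, 5):
        field = line[start:start + 5].strip()
        if field:
            pairs.append((source, int(field)))
    return pairs


# The record carries only serial numbers, with no field for the kind of
# connection, so a reader cannot tell a covalent bond from a hydrogen bond.
pairs = parse_conect("CONECT  413  412  414")
```

Any tool that turns these pairs directly into covalent bonds is making an assumption that the record itself does not support.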
... wow. I did not know that. Insane. |
Another reason to encourage people to move to PDBx/mmCIF. It provides an enumeration of connection types so you can distinguish a covalent bond from a hydrogen bond. |
But still (generally) no bond orders. Sigh. |
I believe we need both bond orders and formal charges in order to easily interconvert representations between toolkits. PDBx/mmCIF provides (optional) capabilities for both. @ijpulidos and the OpenFF crew are talking to the RCSB folks soon---it's possible we could persuade them to include these in published models in the RCSB, but given these models frequently lack many atoms, it wouldn't make any sense to do this. However, as a community standard for interchange, it could be sensible to standardize around PDBx/mmCIF with the bond orders and formal charges specified. |
I think we're getting close to being ready to build a release candidate. Are there any remaining fixes anyone particularly wants to be sure we get in first? |
Would be great to see this one addressed #3124 ! |
#1155 has been waiting for 6 years. |
Both of those are new features. Since we're past the beta, we're only doing bug fixes and documentation changes right now. We decided to cut the feature list short so we could get the context handling changes out sooner. |
Last chance to speak up before I start building the release candidate! |
Not really a feature as such, but would be really great to see what else needs to be done to make the PyPI builds a reality (#2871 and #3239). I don't think it's a lot - but I have a fellowship due in a week and then a pile of other things I've been putting aside while I focus on that, so can't look into it further for a while yet. Don't want to hold your release back on that, of course.
|
How about #3302 ? |
What in particular? There's some discussion of possible new features there, but I don't see anything that's a clear bug fix? |
We're hoping we can squeeze in a solution to #3301 (comment), and also get a couple of more days of data on the execution time cycling problem before locking into the release candidate! |
That's a feature, not a bug fix. This release already has a lot of optimizations to the long range correction. We can hopefully find ways to make it even faster (perhaps with #3368), but that's a feature for the next release. With the execution time cycling, is there any reason to think it's caused by a bug? As far as I can tell, everything works correctly. The only question is whether we can find ways to make it faster, and even at that we're only talking about a small speedup. |
7.7.0 is a feature release, not a bugfix release, right?
And none of them have actually fixed the 100x slowdown because we still haven't implemented the One Small Useful API extension I suggested. :)
Unclear, which is why I asked for another couple of days to convince ourselves it isn't.
In some cases, it's 2x. |
All features need to go in before the beta. We already cut a bunch of features so we could get out the context handling changes (needed for ML) as soon as possible. When we're all ready to build the release candidate, that's not the time for adding new features. At this point we're down to only merging high priority bug fixes. |
It's been another two days. Shall I start building it? |
After many trials with conda-forge, the release is now out! I'm now making my way through the release checklist.
|
Hooray! Where on this list should I tweet an announcement of the new release, and to which URL should I point them---the forum announcement or the GitHub release notes? For benchmarks, I'm easily able to collect benchmarks in the short term for these GPUs:
Once @dotsdl is able to build the new FAH OpenMM core22 from the openmm-7.7.0 conda-forge package, I can quickly set up and run a benchmark across all GPUs on Folding@home. For the future, how do you want to discuss improvements/updates to the release checklist process for the next release? Should I open an issue for that? |
It can be at the same time as the forum announcement. I would point to the GitHub release.
That would be great. Hopefully I can also benchmark a variety of GPUs on the NVIDIA cluster. Overhauling our benchmarks section would be useful. We probably want to have lots of benchmarks for just a few GPUs, plus just a couple of benchmarks that are run on many different GPUs. Folding@Home would be really useful for the latter. For the former, we need a controlled environment that will be consistent from one release to the next so we can track changes in performance.
Sounds good. |
How about these benchmarks:
I can run the first three, and Folding@home would be perfect for the fourth one. |
Can we add a box for this in #3191 (comment) if you'd like me to do this now, then?
If we're going to overhaul the page, let's think about what our users want from us a bit more deeply. Perhaps we could open a new thread to discuss? There are four main categories of benchmarks we would like to provide information for:
If we're going to rework the benchmarks page, perhaps we can think about how best to address these four categories of use cases? In the interim, we can certainly update the benchmarks right away as we collect data for your suggestion #3191 (comment). Does that sound reasonable? For running the benchmarks now, I can collect data on a single A100, but don't have access to multiple A100s or a Titan V. The Folding@home benchmarks will follow in a week or two (depending on when @dotsdl has time to build/test/deploy). |
Done.
Sounds good.
I do, so I can run them. |
Tweet posted and checkbox ticked: https://twitter.com/openmm_toolkit/status/1475569472693886982 |
We now have updated versions of OpenMM-Torch and OpenMM-Setup. I think those are the only downstream packages we need to update. Which means this issue can now be closed! |
We should start considering what features we want to include in the next release after 7.6 (whatever we end up calling it). Here's my first pass at reviewing open feature requests. To start with, here are some minor changes I suggest we include. I expect each of them to take a day or less to implement, so they won't take long.
#3185: Detect non-physical parameters
#3124: Add box shape option in addSolvent()
#3111: Calculate certain derivatives more robustly
#3071: Reporter options when resuming a simulation
#3053: Improve accuracy of Fortran constants
#2913: Use RMSDForce when defining molecules
#2870: Option not to overwrite metadynamics biases
#2655: Have Context store current step count
#97: Warn if unused flags are present
Here are some larger features that I also suggest including. They'll take more work.
#3181, #1195: Improve reporting of template matching errors
#3123, #2816: Support the latest version of AMOEBA
#3104, #2675: Support GB with newer force fields
#2955: Option to reduce CPU usage with OpenCL
#2921: Improve robustness of adding membranes
#2513: Improvements to Drude particles
The following all need research or design. I suggest investigating them, after which we might or might not decide to include them in this release.
#3097: Include search path when compiling CUDA code
#3077: Adaptive barostat interval algorithm
#3054: Recompute long-range dispersion correction when parameter offsets change
#2898: Prevent barostat from re-imaging molecules
#2871, #1305: PyPI packages
#2757: Allow ForceField to store extra descriptors for a residue
#2725: Provide a public method in forcefield.py to get templates and list unmatched residues
#2514: Fully flexible unit cells during NPT simulations
#1911: Copy CustomIntegrator Python attributes and methods
#1155: Report minimization progress
There also are a couple of other new force fields it would be good to include. We don't currently have open feature requests for them.
I understand Amber19 has issues that may make it difficult to support. We should at least investigate it.