Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Packmol: overhaul and add tests #17

Merged
merged 24 commits into from
Nov 17, 2021
Merged

Conversation

rkingsbury
Copy link
Collaborator

Summary

This PR overhauls PackmolWrapper and adds a set of thorough unit tests. These changes were motivated by my recent experience using the pymatgen PackmolRunner class (which has many deficiencies) and this one to try to build structures. With this PR, I have brought the mdgo version up to feature parity with the pymatgen one and made it more robust to cases where packmol times out or fails to satisfy all packing constraints.

Major changes include:

  • Changing the call signature so that the dict (formerly structures now molecules) also contains the numbers of each molecule. Previously the numbers were passed as a separate list. To me, if we are already providing a [dict] that specifies the names and structures of every molecule, we might as well include the numbers there too to minimize chances of a mistake.
  • Changed the "structure" key of the input dict to "coords"
  • Add several features that the pymatgen PackmolRunner had but this class did not. Namely, support for Molecule objects, ability to customize control parameters, and ability to automatically determine box size
  • Added a timeout option to the run_packmol() command
  • Added logic to catch cases where packmol fails but still returns zero exit code (see packmol returns exit code 0 even when errors m3g/packmol#28)
  • Allow files to be given as Path in addition to str
  • Support filenames or paths that contain spaces
  • Add unit tests for everything

Additional dependencies introduced (if any)

  • None

TODO - thinking longer term

I believe this PR is good to go, pending review.

However, longer term I also feel that it would make sense to move this functionality back into pymatgen and create a proper InputSet class for packmol that follows the forthcoming abstract IO interface I've been developing. For example, instead of make_packmol_input we would use write_input to be consistent with that interface.

It doesn't really make sense to adopt that new interface withinmdgo until it's merged into pymatgen. So one approach would be to merge this PR for use in mdgo now, with the understanding that it may quickly be superceded by a parallel implementation in pymatgen that has a slightly different interface. When that happens, I will deprecate the pymatgen PackmolRunner and we could also add a deprecation warning here. It's a little messy, but since not many people use mdgo yet I think it's OK.

An alternate approach would be to go ahead and change the interface here so that it looks and acts just like what will eventually be in pymatgen, without actually inheriting any of those base classes yet. That way people can start getting used to the new interface and not have to change their code when we update it to use the abcs in pymatgen later.

Happy to discuss in more detail @htz1992213 .

Before a pull request can be merged, the following items must be checked:

  • Code is in the standard Python style. The easiest way to handle this
    is to run the following in the correct sequence on your local machine. Start with running
    black on your new code. This will automatically reformat
    your code to PEP8 conventions and removes most issues. Then run
    pycodestyle, followed by
    flake8.
  • Docstrings have been added in the Google docstring format.
    Run pydocstyle on your code.
  • Type annotations are highly encouraged. Run mypy to type check your code.
  • Tests have been added for any new functionality or bug fixes.
  • All linting and tests pass.

@rkingsbury
Copy link
Collaborator Author

I guess the tests will not work in the CI environment because packmol is not installed as a dependency. Packmol is conda-installable but not pip-installable so I'm not sure about the best way to get the CI to install it.

@rkingsbury
Copy link
Collaborator Author

@htz1992213 after thinking more about this, I think there are 2 possible ways we can proceed. Either way, I'd like to develop a separate packmol IO class as part of my htmd branch in pymatgen. There's no reason this PackmolWrapper can't exist alongside the IO class in pymatgen; although I think eventually we should encourage people towards the full IO class once it's available.

Given that, we can either:

  1. Review and merge this as is, or
  2. Modify this PR so that the call signature of PackmolWrapper does not change at all (i.e., go back to structures and numbers args), to facilitate backward compatibility. This would avoid people having to update their code between now and when the IO class is ready.

Please let me know which way you'd like to go. Thanks!

@htz1992213
Copy link
Collaborator

@rkingsbury Thanks! I will work on the PR and get back to you!

@rkingsbury
Copy link
Collaborator Author

FYI @htz1992213 I have now implemented a full I/O class for packmol in pymatgen. See this file

It's not merged yet so still somewhat subject to change. A lot of the contents are taken from my work here, with some slight modifications like not using the mdgo method of volume estimation, for example.

@htz1992213
Copy link
Collaborator

@rkingsbury Thanks for the overhaul, the changes look great! One minor thing is about the tests. I added a packmol executable so that the CI can run the tests. The MoleculeMatcher in the test_random_seed test utilizes openbabel features which require additional dependencies and cause the CI to fail. Would it be possible to use an alternative approach to check packed structures are the same? For example by directly checking the XYZ file?

Copy link
Collaborator

@htz1992213 htz1992213 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rkingsbury I have suggested changes to the test that use alternative checks. Could you please take a look? Thanks!

tests/test_mdgopackmol.py Outdated Show resolved Hide resolved
tests/test_mdgopackmol.py Outdated Show resolved Hide resolved
tests/test_mdgopackmol.py Outdated Show resolved Hide resolved
tests/test_mdgopackmol.py Outdated Show resolved Hide resolved
rkingsbury and others added 5 commits November 15, 2021 19:00
Co-authored-by: Tingzheng Hou <thou@berkeley.edu>
Co-authored-by: Tingzheng Hou <thou@berkeley.edu>
Co-authored-by: Tingzheng Hou <thou@berkeley.edu>
Co-authored-by: Tingzheng Hou <thou@berkeley.edu>
@rkingsbury
Copy link
Collaborator Author

rkingsbury commented Nov 16, 2021

@rkingsbury Thanks for the overhaul, the changes look great! One minor thing is about the tests. I added a packmol executable so that the CI can run the tests. The MoleculeMatcher in the test_random_seed test utilizes openbabel features which require additional dependencies and cause the CI to fail. Would it be possible to use an alternative approach to check packed structures are the same? For example by directly checking the XYZ file?

Thanks for the good suggestions @htz1992213 ; I've adopted all of them. In addition, per discussion over in rkingsbury/pymatgen#3, I removed the " from pathnames in the input file as I received a report that they were causing problems with older version of packmol. However, it seems like removing them caused the tests here to start failing.

 E               ValueError: Packmol failed with return code 0 and stdout: : Could not open file. 
E                        Could not find file: /var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/tmpjcwz1yni/subdirectory
E                        Please check if all the input and structure 
E                        files are in the current directory or if the
E                        correct paths are provided.

Which version of packmol are you using in the CI? I set up a clean conda environment and installed the conda version of packmol, which is v20.010. All tests passed. Support for paths with spaces was added in v20.3.0. So based on my limited testing, it seems like having the " was not necessary to make this work with packmol versions pre-20.3.0, but then I don't understand why the CI version is sensitive to it. Basically, I'm confused!

I'm happy to hear opinions one way or another on whether we should leave the " in place.

@htz1992213 htz1992213 merged commit 4d8f00f into HouGroup:main Nov 17, 2021
@orionarcher
Copy link

orionarcher commented Nov 18, 2021

Just wanted to chime in. I've now had issues with the quotation marks in both v20.010 and v20.3.2. I'm not sure what we settled on here, but I am curious if anyone solved the quotation mark issue. I am also quite puzzled by the "quotation mark mystery".

@rkingsbury
Copy link
Collaborator Author

Just wanted to chime in. I've now had issues with the quotation marks in both v20.010 and v20.3.2. I'm not sure what we settled on here, but I am curious if anyone solved the quotation mark issue. I am also quite puzzled by the "quotation mark mystery".

Hey @orioncohen , I think Tingzheng and I figured it out. Here are the highlights as best I've been able to figure out:

  1. v20.010 from conda is actually v20.3.1. The version number is bad (I should open an issue about it at packmol but haven't yet)
  2. Quotation marks break older versions (prior to 20.3)
  3. Quotation marks break newer version when there are no spaces in the pathnames (I think?)
  4. Quotation marks are required by the newer version when there are spaces in the pathnames (I think?)

That's the logic that's implemented in Tingzheng's recent fix, which allowed all tests to pass here. Is this consistent with your testing? I'm going to add the fixes to my pymatgen version as well.

@rkingsbury
Copy link
Collaborator Author

Made a note about the version number issue here m3g/packmol#6

rkingsbury added a commit to rkingsbury/pymatgen that referenced this pull request Nov 22, 2021
See HouGroup/mdgo#17. Thanks @htz1992213
and @orioncohen for bug reporting and fixes.
janosh added a commit to materialsproject/pymatgen that referenced this pull request Mar 10, 2023
…tion into atomate2 (#2692)

* LAMMPS: add sets.py

* LAMMPS: linting with black

* LAMMPS: basic InputSet class

* Example generator docstring

* example

* linting

* revert alchemy change related to VaspInput

* fixed an issue where quotation marks inside strings were causing packmol to not recognize file names

* rm spurious quotation marks.

See rkingsbury#3

* fixed bug where working directory was changed and never restored

* two bug fixes

* Packmol: better handling of " in filepaths

See HouGroup/mdgo#17. Thanks @htz1992213
and @orioncohen for bug reporting and fixes.

* Packmol: add test for chdir behavior

* added myself to author list

* LAMMPS: add sets.py

* LAMMPS: linting with black

* LAMMPS: basic InputSet class

* Example generator docstring

* example

* linting

* revert alchemy change related to VaspInput

* LammpsInputConstructor initiation

* LammpsInputConstructor initiation

* Revert "LammpsInputConstructor initiation"

This reverts commit 62dd52e.

* helper methods for stages

* method to initiate new stage

* method to add single LAMMPS command

* method to add comment

* method to LAMMPS command as string

* method to add stage commands

* method to add commands using dict

* method to generate LAMMPS input

* comment fix

* method to generate LammpsInputFile

* method for str representation of LammpsInputFile

* method for reading LammpsInputFile str

* docstring format change

* from_file method

* Minor fix and __str__ method

* merge methods for add commands

* Merge LammpsInputConstructor and LammpsInputFile

* Remove LammpsInputConstructor

* Update LammpsInputFile Docstring

* formatting and minor fixes

* Update vasp outputs in case of serial compilation

* Linting

* Linting, pre-commit,...

* Simplified interface for LammpsInputFile

* Updated documentation

* Added unit tests

* Added parse rule about & symbol for the input file

* Tentative concrete implementation of Lammps sets for minimization

* Moved structure to get_input_set

* Adding a BaseLammpsGenerator object for cleaner further implementation

* pre-commit changes

* Typos, updated maintainers and contributors

* Modified the input_settings from dict of dicts to list of tuples. Easier handling of non-default cases.

* Added options to ignore comments when reading/writing inputs, and ignore stages when reading/writing inputs

* Added documentation.

* Remove tuples. Add functionalities to lammps inputs. Add/update tests.

* Made LammpsMinimization MSONable. These sets still require some refactorization to make them more sustainable (in the same fashion as vasp and lammps).

* Data -> structure bug in lammps sets

* Removed sets.py from pymatgen. This is workflow-related stuff and will now live in https://github.com/Matgenix/atomate2-lammps

* Added type hinting

* Upgraded documentation. Added basic sets.py back after discussion.

* Followed unittest2pytest

* Added functionalities to LammpsInputFile

* Added keep_stages option to LammpsInputSet and Generator.

* Removed changes in outputs, should not be part of this PR (and should in any case be updated).

* Included Matthew's suggesions.

* Including suggestions from @rkingsbury

* Updated tests

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Adding coul (short for coulomb in lammps io) to list of words to ignore by codespell.

* Reverted changes in vasp outputs, made a separate PR.

* Same as previous commit

* Attribute .list_of_commands is not .stages, and its structure is a list of dicts.

* Refactoring of some methods and parts. Added the possibility to add a whole stage, including checks needed on the given stage.

* Small refactoring, added a method to rename stages.

* make PackmolSet.from_directory @abc.abstractmethod

* ruff --fix $(gh pr view 2692 --json files --jq '.files.[].path' | grep '.py')

---------

Co-authored-by: Ryan Kingsbury <RKingsbury@lbl.gov>
Co-authored-by: orioncohen <orion@lbl.gov>
Co-authored-by: Ryan Kingsbury <rkingsbury@users.noreply.github.com>
Co-authored-by: mholekev <mholekev@ucsd.edu>
Co-authored-by: Janosh Riebesell <janosh.riebesell@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants