Feat/gsd with queue #1543
This will more easily allow for derived classes of GSDDumpWriter since they can choose when to write or not.
- Move initFileIO to protected
- Add method for querying file initialization
This class provides a triggerable multi-frame write.
Also move initFileIO to the constructor. This prevents the need for double caching of the buffer size prior to opening the file.
Do not rely on Python's garbage collector to guarantee that the file is written after the call to write() ends.
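As a rough illustration of the idea in this commit message, the sketch below makes the data durable before the write call returns instead of waiting for an object to be finalized. It is a generic Python pattern, not the GSD or HOOMD implementation; `write_frame` and the file name are hypothetical.

```python
import os
import tempfile


def write_frame(path, payload):
    """Hypothetical helper: append payload and make it durable before returning."""
    # The context manager closes the file deterministically when the block
    # exits, so nothing is left for the garbage collector to finish later.
    with open(path, "ab") as f:
        f.write(payload)
        f.flush()              # push Python's buffer to the OS
        os.fsync(f.fileno())   # ask the OS to commit the data to disk


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "frames.bin")  # assumed file name
    write_frame(path, b"frame-0")
    # The data is readable as soon as write_frame() returns.
    with open(path, "rb") as f:
        contents = f.read()
```

The explicit flush and close inside the call are what remove the dependence on finalization order; whether an `fsync` is also worthwhile depends on how much durability the caller needs.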
Also check for errors in new GSD API calls.
While this is unexpected, it is required to keep performance high: until the first write to a file occurs, all data is recorded and sorted in a GSDDumpWriter::GSDFrame object. Writing the first frame prevents this. However, we then don't know whether the first frame should be included in the actual analysis.
All tests including logging tests now pass.
46c68d8 to f141d6e
f141d6e to 021d2c8
I just had one suggestion
@pytest.fixture(scope='function')
def hoomd_snapshot(lattice_snapshot_factory):
Do we want to call this frame instead of snapshot? Or is it better to keep it consistent with the name of the module? Are there plans to rename this module to frame in the future?
No, these are hoomd.Snapshot objects. I renamed gsd.Snapshot to gsd.Frame to avoid confusion with hoomd.Snapshot.
@joaander The test errored due to a timeout, which is currently set at 3000 seconds (50 minutes), but the MPI tests finish in 30 seconds on hodges. Also, MPI tests for the neighbor list are erroring for some reason on hodges. Will look into that.
I can't look at it right now. Maybe later this week. If you can find a configuration that reproduces the deadlock, I usually debug them by 1) running the tests with mpirun and waiting for the deadlock, then 2) attaching a debugger to the processes to see where each rank is waiting.
pre-commit.ci autofix
The deadlock is solved. I had forgotten to avoid calling gsd functions on all ranks.
Thanks! The implementation is very clean. I have a few suggestions to improve the documentation.
hoomd/GSDDumpWriter.cc (outdated)
                frame.dihedral_data,
                frame.improper_data,
                frame.constraint_data,
                frame.pair_data);
        }
    }

    // emit on all ranks, the slot needs to handle the mpi logic.
    m_write_signal.emit(m_handle);
If this code were used, it would be a problem for the burst writer. However, this has now been replaced by Logger. I opened #1554 to remove these dead code paths. No need to make any changes in this pull request.
By "this", I refer to m_write_signal.emit.
Co-authored-by: Joshua A. Anderson <joaander@umich.edu>
This behavior is not intuitive but required for good performance.
@joaander I just thought of a potential mitigation of the initial write from the burst writer. We could either
What do you think?
Yes, we could encourage users to use
Adds option to run immediately upon attaching for those still wishing to use this.
This improves the ability of writers to write log data that depends on one or more computes or tuners immediately upon attaching.
6eb4630 to 2e430c7
I pushed some documentation revisions directly to the branch. Please review and make any further edits as needed. It looks like you need to resolve merge conflicts with trunk-major.
I have one comment on parameter naming to think about. Other than that, this is good to merge.
Thanks!
Description
Add a hoomd.write.Deque class which stores a std::deque of logger and trajectory data up to a specified maximum size N. The deque pushes data to the front and pops old data off the back of the deque. The analyze method only collects and removes data as necessary. All writes are triggered by the dump method exposed in Python. The purpose of this class is to allow selective high-frequency data storing, by allowing users to determine when to write data.
One foreseen pattern of use would be to create a custom action which composes a Deque object, calling dump when the specified conditions have been met.
Motivation and context
To provide a means for high-performance, high-frequency writes in HOOMD.
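The buffering behaviour outlined in the description can be sketched in plain Python as follows. This is only an illustration under stated assumptions: `BurstBuffer`, its `written` list (standing in for the output file), the trigger condition, and the oldest-first write order are all hypothetical, and none of this is the actual hoomd.write.Deque interface.

```python
from collections import deque


class BurstBuffer:
    """Hypothetical stand-in for the described buffering; not the HOOMD API."""

    def __init__(self, max_size):
        # A deque with maxlen discards items from the opposite end when full.
        self._frames = deque(maxlen=max_size)
        self.written = []  # stands in for the output file

    def analyze(self, timestep, data):
        # Push the newest frame to the front; once the buffer is full,
        # the oldest frame falls off the back automatically.
        self._frames.appendleft((timestep, data))

    def dump(self):
        # Write buffered frames oldest-first (an assumed ordering) and
        # leave the buffer empty.
        while self._frames:
            self.written.append(self._frames.pop())


buffer = BurstBuffer(max_size=3)
for step in range(10):
    energy = -1.0 * step  # placeholder for a logged quantity
    buffer.analyze(step, {"energy": energy})
    if step == 6:  # hypothetical trigger condition
        buffer.dump()

print([t for t, _ in buffer.written])  # [4, 5, 6]
```

Only the three most recent frames at the moment of the trigger reach the output, which is the point of the design: data is collected at high frequency but written only when the user decides it is worth keeping.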
How has this been tested?
Tests based on the GSD tests were added.
Change log