New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ParticleSpecies & RecordComponent Serialize #963
Conversation
I have implemented some of the suggestions above on a branch building on this one https://github.com/franzpoeschel/openPMD-api/tree/topic-serialize-suggestion |
Thank you for the suggestion, I like the |
Preparation to derive information necessary to construct a new series that can access the same RecordComponent.
- pickle Record Components - it's aliiive
7a1fdda
to
8a28920
Compare
// This is a big hack for now, but it works for our use | ||
// case, which is spinning up remote serial read series | ||
// for DASK. | ||
static auto series = openPMD::Series( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how much of a hack this static
member is to fix up lifetimes, but it seems to work pretty well :)
We only have little control when it gets destroyed, which is likely as late as process end/module unload.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens when a user wants to unpickle two different Series within the same application? The static auto series
will not be constructed a second time and then be stuck with the old file path.
Do you do this for avoiding the Series from going out of scope?
Would something like return seriesAccessor( std::move( series ), group)
help? This would ensure that the seriesAccessor
takes no reference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I think I see the problem.
add_pickle(
cl,
[](openPMD::Series & series, std::vector< std::string > const & group ) {
uint64_t const n_it = std::stoull(group.at(1));
return series.iterations[n_it].meshes[group.at(3)][group.at(4)];
}
)
This lets the Series
go out of scope and returns only the record component..
I think that this is a fundamental design issue with our Python bindings: Python has garbage collection, but we don't use it. Realistically, calling del series
should not be sufficient in Python to destroy the Series, since the series
should stay active as long as there is an active dependent member like a contained RecordComponent
.
So, moving away from a C++-centered memory model for our Python API should probably do the trick.
Three quick and dirty fixes, each quicker and dirtier than the last:
- Is it somehow possible in PyBind to quickly tell the garbage collection that in the above code the
series
and the returnedMeshRecordComponent
depend on each other? This should at least implement the above solution locally. - Make a subclass
PickledMeshRecordComponent
that privately stores theseries
. - Just do:
(This solution will only allow one deserialized object to be used at a time though)
static openPMD::Series series; series = openPMD::Series(...); // this is executed every time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fundamental design issue with our Python bindings: Python has garbage collection, but we don't use it.
Offline discussed: What we mean is we don't use the possibility to tell the garbage collector dependencies between our object hierarchy yet (in both directions) yet.
Is it somehow possible in PyBind to quickly tell the garbage collection that in the above code the series and the returned MeshRecordComponent depend on each other? This should at least implement the above solution locally.
Gotta investigate that, would be a nice and clean solution 👍
Currently not tested and likely to fail in this situation:
- unpickle of multiple objects of the same class (I test different classes to be pickled/unpickled)
good follow-up to at least have the last object (of the same kind) always be in usable state.
And lift-off 🚀 openpmd_dask.mp4 |
Preparation to derive information necessary to construct a new series that can access the an equivalent
ParticleSpecies
orRecordComponent
(see #952).