Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Finish pickle serialization for Pybind modules #160

Closed
breznak opened this issue Dec 14, 2018 · 13 comments · Fixed by #821
Closed

Finish pickle serialization for Pybind modules #160

breznak opened this issue Dec 14, 2018 · 13 comments · Fixed by #821
Assignees
Labels
code code enhancement, optimization, cleanup..programmer stuff Python Binding
Milestone

Comments

@breznak breznak added the code code enhancement, optimization, cleanup..programmer stuff label Dec 14, 2018
@breznak breznak added this to the pybind milestone Dec 14, 2018
@breznak
Copy link
Member Author

breznak commented Jan 14, 2019

@dkeeney can I assigne this to you? Prioritize as you find best

@dkeeney
Copy link

dkeeney commented Jan 14, 2019

yes, I will take that one.

@breznak
Copy link
Member Author

breznak commented Mar 22, 2019

Related to #266

  • iff we need python serialization, this might be a priority, as those are broken/skipped tests
  • or could be just defer serialization from python to c++ and always call cpp code?

@dkeeney
Copy link

dkeeney commented Mar 22, 2019

could be just defer serialization from python to c++

The problem is that to get a top to bottom serialization we need to include the state of the .py modules. In cases where the region plugin is implemented in .py code we could have a situation like this:

python app --> Network --> python region --> c++ algorithm

When the python app calls for serialization of the Network, the serialization needs to include the python region implementation. So, in bindings/py/cpp_src/PyBindRegin.cpp the C++ code will be calling pickle on the python code. That code needs to propagate the pickle until it gets to the bindings for the C++ algorithms and there it should call the C++ serialization.

If we are going to use Cereal serialization then this might be implemented a little different. So perhaps it would be best to put in Cereal first and then fix the pickle interface.

@breznak
Copy link
Member Author

breznak commented Jul 1, 2019

Just a memo that pickle is broken with python2 (due to pybind11) #530 (comment) and decided workaround to skip such tests selectively for py2.7 #530 (comment)

@breznak
Copy link
Member Author

breznak commented Jul 1, 2019

Now that @Lupino made it possible to serialize using pickle in python, we have 2 ways of serialization for python:

  • C++ serializable interface binding (save/load methods)
    • available for all (Serializable) classes
    • same syntax for c++,py code
    • works for both py versions (2, 3)
  • native python's pickle
    • works seamlessly for py-only, cpp-bindings code

My question is, for python:

  • should we support both methods?
  • only provide either of them?

@dkeeney
Copy link

dkeeney commented Jul 1, 2019

should we support both methods?

  • When @Lupino implemented pickle he called save(stream ) and load(stream ) which uses the Cereal Serialization of the C++ object.

  • Going the other direction, when the Cereal Serialization is serializing C++ bindings\py\cpp_src\plugin\PyBindRegion.cpp, it calls pickle to serialize the python code that it calls.

So, there is really only one serialization method, although it may be accessed in two different ways. C++ is serialized with Cereal and .py code is serialized with pickle.

@ctrl-z-9000-times
Copy link
Collaborator

I think we should keep both sets of methods, especially if they're already implemented.

Pickle is convenient, but im pretty sure it adds meta data which C++ does not understand.

The underlying C++ methods work in both python and C++, and data saved in one language can be loaded into the other.

@dkeeney
Copy link

dkeeney commented Jul 1, 2019

data saved in one language can be loaded into the other.

Well, not quite. If .py code serialized a TM using pickle then there will be a pickle wrapper around the Cereal Serialization of the C++ object that gets stored and C++ code would not be able to deserialize it. But if .py code called save(stream) directly on the TM object then the created archive file would be the same as TM being serialized by C++ code.

However, if .py code wants to serialize one of its own objects which contains a TM object as a member then it would have to use pickle to get everything in one file. So I think we need both entry points for .py serialization.

This means that we should probably require this pickle function for all C++ objects that are exposed to .py via the bindings, not just TM and SP. Anything that could be included in some other .py object.

@breznak
Copy link
Member Author

breznak commented Aug 9, 2019

his means that we should probably require this pickle function for all C++ objects that are exposed to .py via the bindings, not just TM and SP. Anything that could be included in some other .py object.

can we just add an override to the bindings that adds pickle() to all objects and it just calls save()?

@dkeeney
Copy link

dkeeney commented Aug 9, 2019

can we just add an override to the bindings that adds pickle() to all objects and it just calls save()?

I think that is the idea... This is the one I just did for the Array object.

			.def(py::pickle(
				[](const Array& self) {
					std::stringstream ss;
					self.save(ss);
					return py::bytes(ss.str());
			},
				[](const py::bytes& s) {
					std::istringstream ss(s);
					Array self;
					self.load(ss);
					return self;
			}));

@breznak
Copy link
Member Author

breznak commented Aug 9, 2019

This is the one I just did for the Array object.

yes, I'm doing the same for RDSE serialization #608 and I've plain copied that code from somewhere else. My point was if we can do an equivalent of Serializable.hpp with a templated method pickle() for python, but I guess Py does not do templates, so we can't (?)

@dkeeney
Copy link

dkeeney commented Nov 8, 2019

I think the last of the pickle serialization problems is covered in PR #747.
...at least until another pickle serialization problem comes up.

@dkeeney dkeeney closed this as completed Nov 8, 2019
@breznak breznak mentioned this issue Jun 2, 2020
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
code code enhancement, optimization, cleanup..programmer stuff Python Binding
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants