Add marshal module with ability to serialize/unserialize bytecode functions
#16615
@iabdalkader What do you think about this approach using |
Code size report: |
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files (coverage diff):

| | master | #16615 | +/- |
|---|---|---|---|
| Coverage | 98.59% | 98.53% | -0.06% |
| Files | 167 | 169 | +2 |
| Lines | 21599 | 21807 | +208 |
| Hits | 21295 | 21488 | +193 |
| Misses | 304 | 319 | +15 |
I think it's much better: CPython compatibility is always better than custom API, and in the future this can be extended to support other things. I thought I should give it a quick test, and I can confirm everything is still working fine. Amazing, thank you! This |
dc7def8 to
b63c48b
Compare
OK, great. Then I'll stick with this approach of implementing the
Yes, it allocates a little RAM. But if you have frozen code then
This already exists and you can use it here.
720de59 to
9737cc6
Compare
OK, this PR is done and ready for final review:
This looks great thanks! For reference my earlier attempts to expose a |
projectgus
left a comment
Few minor comments, but I'm really excited about this feature! Seems like it will enable some very interesting use cases for distributed MicroPython!
9737cc6 to
9667482
Compare
To make it easier to diagnose why CPython crashed. Signed-off-by: Damien George <damien@micropython.org>
The `mp_obj_code_t` and `mp_type_code` code object was defined internally in both `py/builtinevex.c` and `py/profile.c`, with completely different implementations (the former very minimal, the latter quite complete). This commit factors these implementations into a new, separate source file, and allows the code object to have four different modes, selected at compile-time:
- MICROPY_PY_BUILTINS_CODE_NONE: code object not included in the build.
- MICROPY_PY_BUILTINS_CODE_MINIMUM: very simple code object that just holds a reference to the function that it represents. This level is used when MICROPY_PY_BUILTINS_COMPILE is enabled.
- MICROPY_PY_BUILTINS_CODE_BASIC: simple code object that holds a reference to the proto-function and its constants.
- MICROPY_PY_BUILTINS_CODE_FULL: almost complete implementation of the code object. This level is used when MICROPY_PY_SYS_SETTRACE is enabled.
Signed-off-by: Damien George <damien@micropython.org>
This allows retrieving the code object of a function using `function.__code__`, and then reconstructing a function from a code object using `FunctionType(code_object)`. This feature is controlled by `MICROPY_PY_FUNCTION_ATTRS_CODE` and is enabled at the full-features level. Signed-off-by: Damien George <damien@micropython.org>
Serialises a bytecode function/generator to a valid .mpy as bytes. Signed-off-by: Damien George <damien@micropython.org>
This commit implements a small subset of the CPython `marshal` module. It implements `marshal.dumps()` and `marshal.loads()`, but only supports (un)marshalling code objects at this stage. The semantics match CPython, except that the actual marshalled bytes are not compatible with CPython's marshalled bytes. The module is enabled at the everything level (only on the unix coverage build at this stage). Signed-off-by: Damien George <damien@micropython.org>
Signed-off-by: Damien George <damien@micropython.org>
9667482 to
e40a3fd
Compare
Just to confirm, we don't need to enable |
That is correct. |
Summary
This PR adds the `marshal` module, with `marshal.dumps()` and `marshal.loads()` functions. These functions can serialize/unserialize Python objects, but for now only bytecode functions are supported. The semantics of this module match CPython.

Motivation: the original motivation here was to be able to serialize an existing function, send it over a network connection (or otherwise) to a remote MicroPython device, then unserialize it and execute it. This way one MicroPython device can dynamically execute code on another MicroPython device.

And the original implementation was a bit simpler than this PR, it was a dedicated pair of functions in the `micropython` module that were just used to serialize/unserialize a (bytecode) function. Eg:

But, instead of a MicroPython-specific API, it's definitely better to try and match an existing CPython API if possible. And in this case the `marshal` module does almost what was needed to serialize/unserialize bytecode functions. The main difference here is that `marshal` works with code objects, not functions. So the above code is:

Using `marshal` is a little more involved because you have to know about code objects (which `compile()` also returns, for example). But at least using `marshal` is fully compatible with CPython.

This PR implements the `marshal` module and the above marshal example works in MicroPython with this PR.

Testing
Tests are added to CI, and run on the unix coverage variant.
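
The round trip described in the Summary can be sketched as follows. This is only an illustration, not the PR's own example or test code: the function `add_offset`, the name `OFFSET`, and the `remote_globals` dict (standing in for the globals of a remote device) are all hypothetical. It uses `type(lambda: 0)` to obtain the function type, as mentioned in the PR description, and is written against CPython semantics; whether it runs unchanged on a given MicroPython build depends on that build enabling `marshal` and `function.__code__`.

```python
import marshal

# Obtain the function type without importing `types`.
FunctionType = type(lambda: 0)

OFFSET = 1  # hypothetical global used by the function below


def add_offset(x):
    # OFFSET is looked up in whichever globals the function is bound to.
    return x + OFFSET


# "Sender" side: marshal works on the code object, not the function itself.
payload = marshal.dumps(add_offset.__code__)

# "Receiver" side: rebuild a function from the code object, binding it to
# a separate globals dict that stands in for the remote device's globals.
remote_globals = {"OFFSET": 100}
code_object = marshal.loads(payload)
remote_func = FunctionType(code_object, remote_globals)

local_result = add_offset(1)    # uses this module's OFFSET
remote_result = remote_func(1)  # uses remote_globals["OFFSET"]
```

The payload is plain `bytes`, so in principle it can be carried over any transport. Note that, per the commit message, the marshalled bytes are not interchangeable between CPython and MicroPython.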
Trade-offs and Alternatives
The major trade-off/alternative here is a MicroPython-specific API vs a CPython-compatible API, namely the `marshal` module.

Using a MicroPython-specific API, eg `micropython.serialize_function()` and `micropython.unserialize_function()`:
- … `function.__code__`.
- … `type(lambda:0)(code_object, globals())`.

Implementing the `marshal` module:
- … `compile()` function.
- … `function.__code__`, which is what "py/objfun: Add function.__code__ attribute" (#12280) attempted to do.

Also note that the `pickle` module is no good here, because it cannot serialize functions. It simply serializes a reference to a function, which must already be in scope when the reference is unserialized.
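
The point about `pickle` can be checked on CPython with a small sketch. The choice of `json.dumps` as the function to pickle is arbitrary and only for illustration; any importable module-level function should behave the same way.

```python
import json
import pickle

# Pickling a function stores only where to find it (module and name),
# not its bytecode.
payload = pickle.dumps(json.dumps)
references_by_name = b"json" in payload and b"dumps" in payload

# Unpickling looks the name up again, so it only works where the function
# is already importable, and it returns the very same object.
restored = pickle.loads(payload)
same_object = restored is json.dumps
```

Because the payload carries a name rather than code, it is of no use for shipping a new function to a device that does not already have it, which is the use case motivating `marshal` here.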