Add support to save native, viper and inline-asm code to .mpy files #4535

dpgeorge · 2019-02-21T04:44:15Z

This set of patches provides support for saving and loading .mpy files that contain native code (native, viper and inline-asm). A lot of the groundwork was already done for this in the form of removing pointers from generated native code. The changes here are mainly to link in qstrs to the native code, and change the format of .mpy files to contain native code blocks.

A top-level summary:

- @micropython.native, @micropython.viper and @micropython.asm_thumb/asm_xtensa are now allowed in .py files when compiling to .mpy, and they work transparently to the user
- entire .py files can be compiled to native via mpy-cross -X emit=native and for the most part the generated .mpy files should work the same as their bytecode version
- .mpy file format is changed to 1) specify in the header if the file contains native code, and if so the architecture (eg x86, Thumb, Xtensa); 2) for each function block the kind is specified (bytecode, native, viper, asm)
- when native code is loaded from a .mpy file the native code must be modified (in place) to link qstr values in, just like bytecode (see link_qstr() functions in the patch)
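The in-place qstr linking in the last point can be pictured roughly as follows. This is only a sketch under stated assumptions, not the link_qstr() code from the patch: the 16-bit little-endian slot width, the `(offset, local_index)` link table and the names `link_qstrs`, `link_table` and `local_to_runtime` are all hypothetical, and the real functions are architecture-specific.

```python
import struct


def link_qstrs(code, link_table, local_to_runtime):
    """Patch qstr slots in a loaded native code blob, in place.

    code             -- bytearray holding the native code (modified in place)
    link_table       -- hypothetical list of (byte offset, file-local qstr index)
    local_to_runtime -- list mapping file-local qstr index -> runtime qstr value
    """
    for offset, local_index in link_table:
        # Assumption: each slot is a 16-bit little-endian value.
        struct.pack_into("<H", code, offset, local_to_runtime[local_index])
    return code


# Two slots, at byte offsets 2 and 6, receive the runtime qstr values.
code = bytearray(8)
link_qstrs(code, [(2, 0), (6, 1)], [0x0123, 0x0456])
```

The point of the sketch is only that the file stores qstrs by a local index, and the loader rewrites the code with whatever values the running interpreter assigned to them.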

To do:

- mpy-cross needs to support compiling for multiple architectures via a command-line switch like mpy-cross -march=thumb2 (at the moment it must be compiled separately for each arch, like mpy-cross-thumb2)
- mpy-tool.py needs support for freezing native code

In addition, this now defines a public, native ABI for dynamically loadable native code generated by other languages, like C, and so is intended to replace #1627.

This was tested on the unix port (x86, x86-64), stm32 and esp8266. To test it on unix do:

$ mpy-cross -X emit=native -mcache-lookup-bc -o file_native.mpy file.py
$ micropython -m file_native

pfalcon · 2019-02-22T14:49:03Z

py/persistentcode.c

-    byte *bytecode = m_new(byte, bc_len);
-    read_bytes(reader, bytecode, bc_len);
+    // load function kind
+    int kind = read_byte(reader);


Perhaps it's possible to combine "kind" with some other flags into this byte?

Ok, since it's only 2 bits I've now combined it with the number of bytes in the bytecode/function data, like: write_uint((len << 2) | (kind - 2))
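For readers following along, the packing described above can be sketched like this. Two things are assumptions rather than facts from the thread: the numeric kind values (the thread only implies that `kind - 2` fits in 2 bits, so the four kinds are taken to be 2..5), and the variable-length uint layout (7 bits per byte, most significant group first, high bit set on every byte except the last). The function names are made up for illustration.

```python
# Assumed numeric values for the four code kinds (only the range 2..5 is
# implied by "kind - 2" fitting into 2 bits).
KIND_BYTECODE, KIND_NATIVE, KIND_VIPER, KIND_ASM = 2, 3, 4, 5


def encode_uint(n):
    """Encode n in 7-bit groups, most significant first; high bit means 'more'."""
    out = [n & 0x7F]  # last byte has the high bit clear
    n >>= 7
    while n:
        out.append(0x80 | (n & 0x7F))
        n >>= 7
    return bytes(reversed(out))


def decode_uint(data, pos=0):
    """Return (value, next_pos) for a uint written by encode_uint."""
    n = 0
    while True:
        b = data[pos]
        pos += 1
        n = (n << 7) | (b & 0x7F)
        if not (b & 0x80):
            return n, pos


def pack_len_kind(length, kind):
    """Combine the data length and the 2-bit kind into one uint."""
    return encode_uint((length << 2) | (kind - 2))


def unpack_len_kind(data, pos=0):
    """Return (length, kind, next_pos)."""
    value, pos = decode_uint(data, pos)
    return value >> 2, (value & 3) + 2, pos
```

For short functions this costs nothing extra: a length below 32 together with the kind still fits in a single byte.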

pfalcon · 2019-02-22T14:51:04Z

py/persistentcode.c

+        mp_uint_t *ct = const_table;
+        size_t i;
+        if (kind != MP_CODE_NATIVE_VIPER) {
+            for (i = 0; i < prelude.n_pos_args + prelude.n_kwonly_args; ++i) {


I wouldn't find comments like "first entries in const table are function argument names" superfluous.

Ok, I added a few comments to the load/save functions.
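The layout rule under discussion, as far as the diff snippet shows it, could be modelled like this. It is a rough sketch: `build_const_table` and its parameters are hypothetical, the kind values are assumed, and whatever follows the argument names in the real table is not shown in the thread, so it is passed in as an opaque list.

```python
# Assumed numeric values for the code kinds; only their distinctness matters here.
KIND_BYTECODE, KIND_NATIVE, KIND_VIPER, KIND_ASM = 2, 3, 4, 5


def build_const_table(kind, arg_names, n_pos_args, n_kwonly_args, other_entries):
    """Model of the const table order suggested by the diff."""
    table = []
    if kind != KIND_VIPER:
        # First entries in the const table are the function argument names.
        table.extend(arg_names[: n_pos_args + n_kwonly_args])
    # Remaining entries (layout not shown in the thread) follow.
    table.extend(other_entries)
    return table
```

With this model, a viper function's table starts directly with the remaining entries, while every other kind leads with `n_pos_args + n_kwonly_args` names.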

pfalcon · 2019-02-22T14:57:10Z

tools/mpy-tool.py

@@ -428,6 +428,9 @@ def read_bytecode_qstrs(file, bytecode, ip):
        ip += sz

 def read_raw_code(f):
+    kind = read_uint(f)


Well, this doesn't correspond to C code, which reads a byte.

pfalcon · 2019-02-22T15:02:45Z

In addition, this now defines a public, native ABI for dynamically loadable native code generated by other languages, like C, and so is intended to replace #1627.

I don't see anything specific to that in this PR. And it's easier said than done; there are a lot of different things to consider for that. For example, a C module may want to print a Python object. But exposing the whole bunch of print functions doesn't make sense; instead mp_printf() should be extended to be able to print an object, just as it was already extended to print a qstr. And that's just one example.

dpgeorge · 2019-02-25T05:09:38Z

I updated and force-pushed this branch. New additions include:

- initial support for frozen native code in mpy-tool.py; tested by freezing upip into the unix executable
- added support to run test suite via compiling to native mpy, and enabled that on travis

dpgeorge · 2019-02-25T05:23:49Z

I don't see anything specific to that in this PR.

That and this PR actually have quite a lot in common. The main difference here is that the existing native function types (native/viper/asm) are reused to implement dynamically loadable native code from another language (like C). #1627 added some constants (false, true, none, ellipsis) to the mp_fun_table table, but these are now already there due to the previous improvements to the native code emitter. And while #1627 had a simple way to include qstrs in loadable native code (by allocating an array on the heap for the extra ones and letting the native code populate them at load time), this PR does it more efficiently by using the existing .mpy way of loading qstrs and then linking their values into the loaded native code.

So this PR here is kind of like a rework of #1627 given all the things that have changed since then, and also includes the ability to save native/viper/asm in .mpy files.

dpgeorge · 2019-02-25T05:33:30Z

I pushed a commit with an example of pure C code that generates a valid .mpy file containing a C function that can be imported and executed. The outer module is bytecode and there is a single function (called foo) which is native and is compiled from the corresponding foo() C function. Executing foo, it prints out the numbers 0-9, along with True/False if the number is odd/even, and returns a new list [42, 43].
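For comparison, a pure-Python function with the behaviour described above might look like the following. This is only an equivalent sketch: the exact output format of the C example is not shown in this thread, so the print layout here is an assumption.

```python
def foo():
    # Print 0-9, each with True if the number is odd and False if it is even.
    for i in range(10):
        print(i, i % 2 == 1)
    # Return a new list on every call.
    return [42, 43]
```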

pfalcon · 2019-02-28T08:46:23Z

I pushed a commit with an example of pure C code that generates a valid .mpy file containing a C function that can be imported and executed.

Quickly recording comments I had in mind on Monday looking at it, but didn't write down then:

So, this looks like a cute and creative way, but the level of hackiness can be judged by the fact that the "foo" function is not referenced anywhere, i.e. apparently it's just assumed that there's a single function in the module, which will end up at a given offset.

And yeah, sure, it's possible to write code in that way (@native indeed does!), and thanks for showing a way to do that in C. But I doubt the utility and practicality of that. After all, a (reasonable) human uses C when compiling Python isn't enough, and then they would like to use a plethora of devices and optimizations offered by C and native C interface of the system, not code it up through the needle's eye of a JIT compiler.

And beyond that, as I mentioned in my recent comment on #1627 (comment), the starting requirement for dynamically loaded modules should be, should have always been, "allow to compile the same module either statically or dynamically [and in static case, get the same (or 95%) of efficiency of natively static module]". As we already have static modules, they dictate the source format and structure. Dyna support could rehash something (e.g., add more macro calls, or wrap a few of existing things in macros), but that's it. I have a prototype which works like that.

While working on that prototype, which of course reuses your idea of own port-portable dynamod format, it occurred to me that that format won't be enough after all, e.g. it doesn't support dynamic linking itself, and one of the reasons people will write modules in C is to interface with existing system libraries. So, supporting native executable/shlib format is a must for systems which have it.

Let's count:

1. Static module format, we already have.
2. Static module format extended to dynaloading, self-contained OS-independent modules.
3. Native OS dynamodules.

And now you propose:

4. A completely new dynamodule format, not compatible with (== much less efficient) existing static modules, and which requires using completely different API from static modules.

Again, I'm not sure of the practicality of p.4. But the C example is of course golden, as a reference for someone who'll implement other types of JIT/AOT.

dpgeorge · 2019-02-28T13:28:56Z

the fact that the "foo" function is not referenced anywhere, i.e. apparently it's just assumed that there's a single function in the module, which will end up at a given offset.

Yes, it's a basic example to show that it's possible at all. It's also possible to have multiple entry points into the C code and a tool would need to be written to create a full-featured .mpy file from such a piece of C code (eg give a list of C symbols mapped to their Python name, in a macro like the existing MP_DEFINE_CONST_FUN_OBJ).

and then they would like to use a plethora of devices and optimizations offered by C

I don't see why they can't. As long as it compiles with -fPIC and -fPIE and doesn't have any relocations it will work.

and native C interface of the system,

On bare metal (which is really the focus of all this work) there is not really such a thing, or at least it is yet to be defined. The current (dynamic) native C interface is the ABI defined by the mp_fun_table which is enough to do anything that you can do in Python.

the starting requirement for dynamically loaded modules should be, should have always been, "allow to compile the same module either statically or dynamically"

With enough macros and maybe some extra preprocessing this could probably be achieved with the format here. To go a step further and do proper linking of symbols would be a lot of extra work, and I guess take a lot of extra code space to store all the possible symbol names and their addresses in the bare-metal firmware.

it occurred to me that that format won't be enough after all, e.g. it doesn't support dynamic linking itself, and one of the reasons people will write modules in C is to interface with existing system libraries. So, supporting native executable/shlib format is a must for systems which have it.

As above, it's a lot of work to add full dynamic linking to a bare-metal port, and it would be specific for each architecture. And I don't think it would buy much (over the current scheme of having no linking): the main case for dynamic loadable code would be to implement things that are too slow in Python, and for such routines (eg a FFT) the calls into the runtime system are not the bottleneck, it's the non-Python computation that is.

Let's count: Static module format, we already have.

Yes, we only have one format at the moment. And the proposal here is to add just one, to allow dynamic native code to be loaded in the simplest way (no linking, except for the qstr table). This comes for free with being able to load native/viper/asm and already opens up many possibilities.

Static module format extended to dynaloading, self-contained OS-independent modules.

As I said above, I'm not sure how this can be done efficiently. But I'm not opposed to eventually adding linking support to .mpy files, eg so the C code can have a BSS section and create static data structures. What's wrong though with starting at the simplest case (what's proposed in this PR) and improving it as usage experience dictates?

But remember, this PR here is primarily about adding support to save native/viper/asm code to .mpy files. It does it in an obvious and pretty minimal way.

pfalcon · 2019-02-28T14:51:43Z

Yes, it's a basic example to show that it's possible at all.

Sure, with macros, then external preprocessors it can be brushed up. It still will remain awkward API-wise IMHO.

and then they would like to use a plethora of devices and optimizations offered by C
I don't see why they can't.

API issue, again. I gave an example with mp_printf above: it's rather awkward to call Python print() from C.

and native C interface of the system,
On bare metal (which is really the focus of all this work) there is not really such a thing,

Well, there is: it's possible to write a new module, using the whole of the existing MicroPython API. And with #4195 it's even possible to distribute them without uPy itself (at the risk that if a module really used a private API, it'll stop working with another version).

or at least it is yet to be defined. The current (dynamic) native C interface is the ABI defined by the mp_fun_table which is enough to do anything that you can do in Python.

Yes, for the dynamic case it still needs to be defined, and my point was that defining it as mp_fun_table is way too restrictive, so this PR can't "replace" work being done in #1627.

But remember, this PR here is primarily about adding support to save native/viper/asm code to .mpy files. It does it in an obvious and pretty minimal way.

Yes, +1 to that. Please merge ASAP ;-).

Well, it's worth considering the implications of the file naming. Using the same suffix for all (binary) MicroPython modules is rather efficient. But is it really user friendly? Consider for example that it won't be possible to have both a bytecode and a compiled module of the same name. Is it relevant? I'm sure that a lot of clueless users, who will download random binaries from the internet prepared by other clueless users, will find that yes, it is. Should it be the limiting factor? Well, I'm ready to have all the use cases 1, 2, 3 I listed above use the .mpy extension, and if someone complains, say "it's not me, it's @dpgeorge who did that" ;-).

hoihu · 2019-02-28T19:54:20Z

But remember, this PR here is primarily about adding support to save native/viper/asm code to .mpy files. It does it in an obvious and pretty minimal way.

+1
this is a great feature and really opens up a lot of use cases for our application!

andrewleech · 2019-03-01T11:20:06Z

Excited for this feature, I just ran into the limitation of not being able to freeze a viper function along with the rest of my code in the FROZEN_MPY, it's scary how often you fix/extend something just before I realise I need it!

dpgeorge · 2019-03-04T14:20:06Z

Yes, for the dynamic case it still needs to be defined, and my point was that defining it as mp_fun_table is way too restrictive, so this PR can't "replace" work being done in #1627.

But #1627 doesn't do anything really different to what's done here, it also just provides mp_fun_table (with a few extras eg to create qstrs, which are no longer needed with the qstr linking done here). The set of macros from #1627 that allow a C module to be statically or dynamically compiled could be adjusted to work with this PR.

Well, it's worth considering the implications of the file naming. Using the same suffix for all (binary) MicroPython modules is rather efficient.

Yes it's worth considering. But I don't see how it could be anything other than .mpy, for the fact that this PR allows mixed bytecode and native code in the same file. If you have some pure bytecode in foo.mpy and then want to change just one small function to be faster with native code generation, it seems sensible that the file stays foo.mpy.

Consider for example that it won't be possible to have both a bytecode and a compiled module of the same name.

If needed that can be handled by using different names, eg foo.mpy and foo_x86.mpy, or by using different import paths and managing sys.path.
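The sys.path variant could look something like the sketch below. The directory layout (one subdirectory per architecture) and the helper name `add_arch_path` are assumptions for illustration only; how a port would determine its own architecture string is left to the caller.

```python
import sys


def add_arch_path(base_dir, arch, path=None):
    """Put the arch-specific module directory at the front of the import path.

    base_dir -- directory holding one subdirectory per architecture (assumed layout)
    arch     -- architecture name chosen by the caller, e.g. "x86"
    path     -- list to modify; defaults to sys.path
    """
    path = sys.path if path is None else path
    arch_dir = base_dir + "/" + arch
    if arch_dir not in path:
        # Searched first, so a native foo.mpy here shadows a bytecode foo.mpy
        # found later on the path.
        path.insert(0, arch_dir)
    return path
```

After a call such as `add_arch_path("/lib/native", "x86")`, a plain `import foo` would pick up the native build when one exists in that directory and fall back to the bytecode version otherwise.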

pfalcon · 2019-03-05T16:54:56Z

But #1627 doesn't do anything really different to what's done here ...

Well, that's exactly my point - neither #1627 nor this PR provides adequate implementation of "dynamically loadable modules", though they may provide some bits and pieces towards that yeah.

But I don't see how it could be anything other than .mpy, for the fact that this PR allows mixed bytecode and native code in the same file.

Right, the threshold is passed with saying "... and you can reuse the same format for native C modules". Should that be .mpy? If that format is later found not to be adequate for native modules, and other formats are added, should those still be .mpy?

If needed that can be handled by using different names, eg foo.mpy and foo_x86.mpy, or by using different import paths and managing sys.path.

Yes, something like that could be done.

Anyway, this would apparently be better continued within the frame of #1627, and this PR hopefully just merged soon.

dpgeorge · 2019-03-07T12:12:56Z

This was rebased on current master (with recent changes to optimise mpy size), some things fixed, tests added, coverage improved.

The new compile-time option is MICROPY_DEBUG_MP_OBJ_SENTINELS, disabled by default. This is to allow finer control of whether this debugging feature is enabled or not (because, for example, this setting must be the same for mpy-cross and the MicroPython main code when using native code generation).

Simplifies the code and fixes handling of the Ellipsis const in native code generation (which also needs the constant table so must set this flag).

n_obj no longer includes a count for mp_fun_table to make it a bit simpler.

This commit adds support for saving and loading .mpy files that contain native code (native, viper and inline-asm). A lot of the ground work was already done for this in the form of removing pointers from generated native code. The changes here are mainly to link in qstr values to the native code, and change the format of .mpy files to contain native code blocks (possibly mixed with bytecode).

A top-level summary:

- @micropython.native, @micropython.viper and @micropython.asm_thumb/asm_xtensa are now allowed in .py files when compiling to .mpy, and they work transparently to the user.
- Entire .py files can be compiled to native via mpy-cross -X emit=native and for the most part the generated .mpy files should work the same as their bytecode version.
- The .mpy file format is changed to 1) specify in the header if the file contains native code and if so the architecture (eg x86, ARMV7M, Xtensa); 2) for each function block the kind of code is specified (bytecode, native, viper, asm).
- When native code is loaded from a .mpy file the native code must be modified (in place) to link qstr values in, just like bytecode (see py/persistentcode.c:arch_link_qstr() function).

In addition, this now defines a public, native ABI for dynamically loadable native code generated by other languages, like C.

This adds support to freeze .mpy files that contain native code blocks.

Build with:

$ ./mk_native.sh

Execute with:

$ ./micropython
import native_ex
native_ex.foo()

dpgeorge · 2019-03-08T11:24:36Z

Ok, this was merged between 02cc288 and 1e23a29.

The commit with the example of generating a .mpy from a C file was not merged, it needs to be cleaned up and generalised a bit first.


dpgeorge mentioned this pull request Feb 21, 2019

Update .mpy version to version 4 #4536

Closed

pfalcon reviewed Feb 22, 2019


dpgeorge force-pushed the py-native-mpy branch from 24d9884 to 83c26b0 Compare February 25, 2019 05:05

dpgeorge force-pushed the py-native-mpy branch from f607856 to e9f37d3 Compare March 5, 2019 12:58

dpgeorge force-pushed the py-native-mpy branch 2 times, most recently from 72bd732 to c4695d6 Compare March 7, 2019 11:57

dpgeorge added 9 commits March 8, 2019 15:53

py/emitnative: Consolidate where HASCONSTS is set to load-const-obj fun.

01a1f31

Simplifies the code and fixes handling of the Ellipsis const in native code generation (which also needs the constant table so must set this flag).

py/emitnative: Provide concentrated points of qstr emit.

205edb4

py/emitnative: Adjust accounting of size of const_table.

3986820

n_obj no longer includes a count for mp_fun_table to make it a bit simpler.

py/emitglue: Remove union in mp_raw_code_t to combine bytecode & native.

636ed0f

tools/mpy-tool.py: Add support for freezing native code.

ea3c80a

This adds support to freeze .mpy files that contain native code blocks.

py/persistentcode: Bump .mpy version to 4.

9a5f92e

minimal/frozentest: Recompile now that mpy format and version changed.

7852b28

dpgeorge force-pushed the py-native-mpy branch from bdba214 to 5a16a11 Compare March 8, 2019 04:58

dpgeorge added 5 commits March 8, 2019 16:51

mpy-cross: Enable building of x64 native .mpy files.

31d2d83

tests/run-tests: Support running native tests via mpy.

6995523

travis: Enable test for running native code via mpy.

2bcb240

tools/upip.py: Use "raise arg" instead of no-arg raise form, for native.

6e11d86

unix/Makefile: Update coverage tests to match those in Travis.

c6a9bb2

dpgeorge force-pushed the py-native-mpy branch from 5a16a11 to 5f72835 Compare March 8, 2019 05:52

dpgeorge added 2 commits March 8, 2019 17:20

tests/import: Add test for importing x64 native code.

1e23a29

unix: Add example of dynamic loadable native C code.

81dee3e

Build with:

$ ./mk_native.sh

Execute with:

$ ./micropython
import native_ex
native_ex.foo()

dpgeorge force-pushed the py-native-mpy branch from 5f72835 to 81dee3e Compare March 8, 2019 06:21

dpgeorge closed this Mar 8, 2019

dpgeorge deleted the py-native-mpy branch March 8, 2019 11:24

dpgeorge mentioned this pull request Jul 10, 2019

trying to build mpy with dynamic libs for esp32 with xtensa-gcc #4916

Closed

tannewt pushed a commit to tannewt/circuitpython that referenced this pull request Apr 8, 2021

Merge pull request micropython#4535 from tannewt/packetbuffer_fixup

5b7f90b

Two small PacketBuffer fixes
