Add support to save native, viper and inline-asm code to .mpy files #4535

Closed
wants to merge 16 commits into from

dpgeorge
Member

This set of patches provides support for saving and loading .mpy files that contain native code (native, viper and inline-asm). A lot of the ground work was already done for this in the form of removing pointers from generated native code. The changes here are mainly to link in qstrs to the native code, and to change the format of .mpy files to contain native code blocks.

A top-level summary:

  • @micropython.native, @micropython.viper and @micropython.asm_thumb/asm_xtensa are now allowed in .py files when compiling to .mpy, and they work transparently to the user
  • entire .py files can be compiled to native via mpy-cross -X emit=native and for the most part the generated .mpy files should work the same as their bytecode version
  • .mpy file format is changed to 1) specify in the header if the file contains native code, and if so the architecture (eg x86, Thumb, Xtensa); 2) for each function block the kind is specified (bytecode, native, viper, asm)
  • when native code is loaded from a .mpy file the native code must be modified (in place) to link qstr values in, just like bytecode (see link_qstr() functions in the patch)
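As a rough illustration of the first bullet, a module like the one below could be compiled to .mpy once these decorators are allowed. The function names are hypothetical, and the try/except shim is an assumption added here only so the file also runs under CPython; on MicroPython the real micropython module provides the decorators.

```python
try:
    import micropython  # provided by MicroPython builds
except ImportError:
    # Assumption for illustration only: a stand-in so this file also runs on CPython.
    class micropython:
        @staticmethod
        def native(f):
            return f

        @staticmethod
        def viper(f):
            return f


@micropython.native
def sum_squares(n):
    # Ordinary Python semantics, emitted as machine code instead of bytecode.
    total = 0
    for i in range(n):
        total += i * i
    return total


@micropython.viper
def add_ints(a: int, b: int) -> int:
    # Under viper the int annotations select machine-word arithmetic.
    return a + b
```

Nothing in the source needs to change for the .mpy case; the decorators are the same ones already used when the file is compiled on the device.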

To do:

  • mpy-cross needs to support compiling for multiple architectures via a command-line switch like mpy-cross -march=thumb2 (at the moment it must be compiled separately for each arch, like mpy-cross-thumb2)
  • mpy-tool.py needs support for freezing native code

In addition, this now defines a public, native ABI for dynamically loadable native code generated by other languages, like C, and so is intended to replace #1627.


This was tested on the unix port (x86, x86-64), stm32 and esp8266. To test it on unix do:

$ mpy-cross -X emit=native -mcache-lookup-bc -o file_native.mpy file.py
$ micropython -m file_native

byte *bytecode = m_new(byte, bc_len);
read_bytes(reader, bytecode, bc_len);
// load function kind
int kind = read_byte(reader);
Contributor

Perhaps it's possible to combine "kind" with some other flags into this byte?

Member Author

Ok, since it's only 2 bits I've now combined it with the number of bytes in the bytecode/function data, like: write_uint((len << 2) | (kind - 2))
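A minimal sketch of that packing, written in Python in the style of mpy-tool.py. The numeric kind values are an assumption inferred from the `kind - 2` expression (four kinds starting at 2 fit in 2 bits), and the variable-length integer encoding (7 bits per byte, most significant group first, high bit set on every byte except the last) is likewise assumed here rather than taken from the patch.

```python
import io

# Assumed kind values: bytecode, native, viper, asm.
KIND_BYTECODE, KIND_NATIVE, KIND_VIPER, KIND_ASM = 2, 3, 4, 5


def write_uint(out, n):
    """Write n as a variable-length unsigned integer (assumed encoding)."""
    groups = [n & 0x7F]  # last byte: no continuation bit
    n >>= 7
    while n:
        groups.append(0x80 | (n & 0x7F))  # earlier bytes carry the continuation bit
        n >>= 7
    out.write(bytes(reversed(groups)))


def read_uint(f):
    """Read a variable-length unsigned integer written by write_uint."""
    n = 0
    while True:
        b = f.read(1)[0]
        n = (n << 7) | (b & 0x7F)
        if not (b & 0x80):
            return n


def write_kind_len(out, kind, length):
    # The low 2 bits hold the kind, the remaining bits hold the data length.
    write_uint(out, (length << 2) | (kind - 2))


def read_kind_len(f):
    v = read_uint(f)
    return (v & 3) + 2, v >> 2
```

For example, writing `(KIND_VIPER, 300)` into an `io.BytesIO` and reading it back with `read_kind_len` returns the same pair, so no separate kind byte is needed.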

mp_uint_t *ct = const_table;
size_t i;
if (kind != MP_CODE_NATIVE_VIPER) {
    for (i = 0; i < prelude.n_pos_args + prelude.n_kwonly_args; ++i) {
Contributor

I wouldn't find comments like "first entries in const table are function argument names" superfluous.

Member Author

Ok, I added a few comments to the load/save functions.
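The layout that the reviewed loop relies on can be sketched as follows. This is only an illustration of the ordering: the helper name and the kind values are hypothetical, and placing constant objects and child code after the argument names is an assumption, not something shown in the quoted hunk.

```python
# Assumed kind values: bytecode, native, viper, asm.
KIND_BYTECODE, KIND_NATIVE, KIND_VIPER, KIND_ASM = 2, 3, 4, 5


def build_const_table(kind, arg_names, objects, children):
    """Return a const table laid out as the loader loop expects."""
    table = []
    if kind != KIND_VIPER:
        # First entries in the const table are the function argument names
        # (positional args followed by keyword-only args).
        table.extend(arg_names)
    # Assumed ordering for the rest: constant objects, then child code blocks.
    table.extend(objects)
    table.extend(children)
    return table
```

Viper functions skip the argument-name entries, which is what the `kind != MP_CODE_NATIVE_VIPER` check in the hunk guards.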

@@ -428,6 +428,9 @@ def read_bytecode_qstrs(file, bytecode, ip):
    ip += sz

def read_raw_code(f):
    kind = read_uint(f)
Contributor

Well, this doesn't correspond to C code, which reads a byte.

Member Author

Now fixed.

@pfalcon
Contributor

pfalcon commented Feb 22, 2019

In addition, this now defines a public, native ABI for dynamically loadable native code generated by other languages, like C, and so is intended to replace #1627.

I don't see anything specific to that in this PR. And it's easier said than done; there are a lot of different things to consider for that. For example, a C module may want to print a Python object. But exposing the whole bunch of print functions doesn't make sense; instead mp_printf() should be extended to be able to print an object, just like it was already extended to print a qstr. And that's just one of the examples.

@dpgeorge
Member Author

I updated and force-pushed this branch. New additions include:

  • initial support for frozen native code in mpy-tool.py; tested by freezing upip into the unix executable
  • added support to run test suite via compiling to native mpy, and enabled that on travis

@dpgeorge
Member Author

I don't see anything specific to that in this PR.

That and this PR actually have quite a lot in common. The main difference here is that the existing native function types (native/viper/asm) are reused to implement dynamically loadable native code from another language (like C). #1627 added some constants (false, true, none, ellipsis) to the mp_fun_table table, but these are now already there due to the previous improvements to the native code emitter. And while #1627 had a simple way to include qstrs in loadable native code (by allocating an array on the heap for the extra ones and letting the native code populate them at load time), this PR does it more efficiently by using the existing .mpy way of loading qstrs and then linking their values into the loaded native code.

So this PR here is kind of like a rework of #1627 given all the things that have changed since then, and also includes the ability to save native/viper/asm in .mpy files.
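The idea of linking qstr values into loaded native code can be sketched in Python as below. Everything here is a simplification under stated assumptions: the real code patches architecture-specific instruction encodings, whereas this sketch assumes a plain little-endian 16-bit slot at each link offset, and `QstrPool`, `link_qstrs` and the link-table shape are hypothetical names.

```python
import struct


class QstrPool:
    """Toy stand-in for the runtime's interned-string table."""

    def __init__(self):
        self._ids = {}

    def intern(self, name):
        # Return the existing id, or assign the next free one (ids start at 1).
        return self._ids.setdefault(name, len(self._ids) + 1)


def link_qstrs(code, links, pool):
    """Patch runtime qstr ids into `code` in place.

    `code` is a bytearray of loaded native code; `links` is a list of
    (offset, name) pairs saying where each qstr value must be written.
    """
    for offset, name in links:
        struct.pack_into("<H", code, offset, pool.intern(name))
```

The point of the scheme is that the .mpy file carries the strings, not their ids, since the ids are only known once the strings have been interned on the target.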

@dpgeorge
Member Author

I pushed a commit with an example of pure C code that generates a valid .mpy file containing a C function that can be imported and executed. The outer module is bytecode and there is a single function (called foo) which is native and is compiled from the corresponding foo() C function. Executing foo, it prints out the numbers 0-9, along with True/False if the number is odd/even, and returns a new list [42, 43].
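For reference, a Python function with the behaviour described would look roughly like this. It is only a restatement of the description above, not the C code from the commit, and the exact output formatting is an assumption.

```python
def foo():
    for i in range(10):
        # True for odd numbers, False for even ones.
        print(i, i % 2 == 1)
    return [42, 43]
```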

@pfalcon
Contributor

pfalcon commented Feb 28, 2019

I pushed a commit with an example of pure C code that generates a valid .mpy file containing a C function that can be imported and executed.

Quickly recording comments I had in mind on Monday looking at it, but didn't write down then:

So, this looks like a cute and creative way, but the level of hackiness can be judged by the fact that the "foo" function is not referenced anywhere, i.e. apparently it's just assumed that there's a single function in the module, which will end up at a given offset.

And yeah, sure, it's possible to write code in that way (@native indeed does!), and thanks for showing a way to do that in C. But I doubt the utility and practicality of that. After all, a (reasonable) human uses C when compiling Python isn't enough, and they would like to use a plethora of devices and optimizations offered by C and the native C interface of the system, not code it up thru the needle's eye of a JIT compiler.

And beyond that, as I mentioned in my recent comment on #1627 (comment), the starting requirement for dynamically loaded modules should be, should have always been, "allow to compile the same module either statically or dynamically [and in static case, get the same (or 95%) of efficiency of natively static module]". As we already have static modules, they dictate the source format and structure. Dyna support could rehash something (e.g., add more macro calls, or wrap a few of existing things in macros), but that's it. I have a prototype which works like that.

While working on that prototype, which of course reuses your idea of own port-portable dynamod format, it occurred to me that that format won't be enough after all, e.g. it doesn't support dynamic linking itself, and one of the reasons people will write modules in C is to interface with existing system libraries. So, supporting the native executable/shlib format is a must for systems which have it.

Let's count:

  1. Static module format, we already have.
  2. Static module format extended to dynaloading, self-contained OS-independent modules.
  3. Native OS dynamodules.

And now you propose:

  4. A completely new dynamodule format, not compatible with (== much less efficient) existing static modules, and which requires using completely different API from static modules.

Again, I'm not sure of practicality of p.4. But the C example is of course golden, as reference to someone who'll implement other types of JIT/AOT.

@dpgeorge
Member Author

the fact that the "foo" function is not referenced anywhere, i.e. apparently it's just assumed that there's a single function in the module, which will end up at a given offset.

Yes, it's a basic example to show that it's possible at all. It's also possible to have multiple entry points into the C code and a tool would need to be written to create a full-featured .mpy file from such a piece of C code (eg give a list of C symbols mapped to their Python name, in a macro like the existing MP_DEFINE_CONST_FUN_OBJ).

and they would like to use a plethora of devices and optimizations offered by C

I don't see why they can't. As long as it compiles with -fPIC and -fPIE and doesn't have any relocations it will work.

and native C interface of the system,

On bare metal (which is really the focus of all this work) there is not really such a thing, or at least it is yet to be defined. The current (dynamic) native C interface is the ABI defined by the mp_fun_table which is enough to do anything that you can do in Python.

the starting requirement for dynamically loaded modules should be, should have always been, "allow to compile the same module either statically or dynamically"

With enough macros and maybe some extra preprocessing this could probably be achieved with the format here. To go a step further and do proper linking of symbols would be a lot of extra work, and I guess take a lot of extra code space to store all the possible symbol names and their addresses in the bare-metal firmware.

it occurred to me that that format won't be enough after all, e.g. it doesn't support dynamic linking itself, and one of the reasons people will write modules in C is to interface with existing system libraries. So, supporting the native executable/shlib format is a must for systems which have it.

As above, it's a lot of work to add full dynamic linking to a bare-metal port, and it would be specific for each architecture. And I don't think it would buy much (over the current scheme of having no linking): the main case for dynamic loadable code would be to implement things that are too slow in Python, and for such routines (eg a FFT) the calls into the runtime system are not the bottleneck, it's the non-Python computation that is.

Let's count: Static module format, we already have.

Yes, we only have one format at the moment. And the proposal here is to add just one, to allow dynamic native code to be loaded in the simplest way (no linking, except for the qstr table). This comes for free with being able to load native/viper/asm and already opens up many possibilities.

Static module format extended to dynaloading, self-contained OS-independent modules.

As I said above, I'm not sure how this can be done efficiently. But I'm not opposed to eventually adding linking support to .mpy files, eg so the C code can have a BSS section and create static data structures. What's wrong though with starting at the simplest case (what's proposed in this PR) and improving it as usage experience dictates?


But remember, this PR here is primarily about adding support to save native/viper/asm code to .mpy files. It does it in an obvious and pretty minimal way.

@pfalcon
Contributor

pfalcon commented Feb 28, 2019

Yes, it's a basic example to show that it's possible at all.

Sure, with macros, then external preprocessors it can be brushed up. It still will remain awkward API-wise IMHO.

and they would like to use a plethora of devices and optimizations offered by C
I don't see why they can't.

API issue, again. I gave an example with mp_printf above: it's rather awkward to call Python print() from C.

and native C interface of the system,
On bare metal (which is really the focus of all this work) there is not really such a thing,

Well, there is: it's possible to write a new module, using the whole of the existing MicroPython API. And with #4195 it's even possible to distribute them without uPy itself (at the risk that if a module really used a private API, it'll stop working with another version).

or at least it is yet to be defined. The current (dynamic) native C interface is the ABI defined by the mp_fun_table which is enough to do anything that you can do in Python.

Yes, for the dynamic case it still needs to be defined, and my point was that defining it as mp_fun_table is way too restrictive, so this PR can't "replace" work being done in #1627.

But remember, this PR here is primarily about adding support to save native/viper/asm code to .mpy files. It does it in an obvious and pretty minimal way.

Yes, +1 to that. Please merge ASAP ;-).

Well, it's worth considering the implications of the file naming. Using the same suffix for all (binary) MicroPython modules is rather efficient. But is it really user friendly? Consider for example that it won't be possible to have both a bytecode and a compiled module of the same name. Is it relevant? I'm sure that a lot of clueless users, who will download random binaries from the internet prepared by other clueless users, will find that yes. Should it be the limiting factor? Well, I'm ready to have all the usecases 1, 2, 3 I listed above use the .mpy extension, and if someone complains, say "it's not me, it's @dpgeorge who did that" ;-).

@hoihu
Sponsor Contributor

hoihu commented Feb 28, 2019

But remember, this PR here is primarily about adding support to save native/viper/asm code to .mpy files. It does it in an obvious and pretty minimal way.

+1
this is a great feature and really opens up a lot of use cases for our application!

@andrewleech
Sponsor Contributor

Excited for this feature. I just ran into the limitation of not being able to freeze a viper function along with the rest of my code in the FROZEN_MPY; it's scary how often you fix/extend something just before I realise I need it!

@dpgeorge
Member Author

dpgeorge commented Mar 4, 2019

Yes, for the dynamic case it still needs to be defined, and my point was that defining it as mp_fun_table is way too restrictive, so this PR can't "replace" work being done in #1627.

But #1627 doesn't do anything really different to what's done here, it also just provides mp_fun_table (with a few extras eg to create qstrs, which are no longer needed with the qstr linking done here). The set of macros from #1627 that allow a C module to be statically or dynamically compiled could be adjusted to work with this PR.

Well, it's worth considering the implications of the file naming. Using the same suffix for all (binary) MicroPython modules is rather efficient.

Yes it's worth considering. But I don't see how it could be anything other than .mpy, for the fact that this PR allows mixed bytecode and native code in the same file. If you have some pure bytecode in foo.mpy and then want to change just one small function to be faster with native code generation, it seems sensible that the file stays foo.mpy.

Consider for example that it won't be possible to have both a bytecode and a compiled module of the same name.

If needed that can be handled by using different names, eg foo.mpy and foo_x86.mpy, or by using different import paths and managing sys.path.
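The first option (different names) could be wrapped in a small helper along these lines. This is a sketch only: `import_best` and the `name_arch` naming scheme are hypothetical, and how the running architecture is determined is left to the caller.

```python
def import_best(name, arch):
    """Import `name_arch` if it exists, else fall back to plain `name`."""
    for candidate in (name + "_" + arch, name):
        try:
            return __import__(candidate)
        except ImportError:
            pass  # try the next, more generic, candidate
    raise ImportError("no module named " + name)
```

With foo_x86.mpy and foo.mpy both on the import path, `import_best("foo", "x86")` would pick the native build on x86 and the bytecode build elsewhere.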

@pfalcon
Contributor

pfalcon commented Mar 5, 2019

But #1627 doesn't do anything really different to what's done here ...

Well, that's exactly my point - neither #1627 nor this PR provides adequate implementation of "dynamically loadable modules", though they may provide some bits and pieces towards that yeah.

But I don't see how it could be anything other than .mpy, for the fact that this PR allows mixed bytecode and native code in the same file.

Right, the threshold is passed with saying "... and you can reuse the same format for native C modules". Should that be .mpy? If that format is later found not to be adequate for native modules, and other formats are added, should those still be .mpy?

If needed that can be handled by using different names, eg foo.mpy and foo_x86.mpy, or by using different import paths and managing sys.path.

Yes, something like that could be done.

Anyway, this would apparently be better continued within the frame of #1627, and this PR hopefully merged soon.

@dpgeorge dpgeorge force-pushed the py-native-mpy branch 2 times, most recently from 72bd732 to c4695d6 Compare March 7, 2019 11:57
@dpgeorge
Member Author

dpgeorge commented Mar 7, 2019

This was rebased on current master (with recent changes to optimise mpy size), some things fixed, tests added, coverage improved.

The new compile-time option is MICROPY_DEBUG_MP_OBJ_SENTINELS, disabled by
default.  This is to allow finer control of whether this debugging feature
is enabled or not (because, for example, this setting must be the same for
mpy-cross and the MicroPython main code when using native code generation).

Simplifies the code and fixes handling of the Ellipsis const in native code
generation (which also needs the constant table so must set this flag).

n_obj no longer includes a count for mp_fun_table to make it a bit simpler.

This commit adds support for saving and loading .mpy files that contain
native code (native, viper and inline-asm).  A lot of the ground work was
already done for this in the form of removing pointers from generated
native code.  The changes here are mainly to link in qstr values to the
native code, and change the format of .mpy files to contain native code
blocks (possibly mixed with bytecode).

A top-level summary:

- @micropython.native, @micropython.viper and @micropython.asm_thumb/
  asm_xtensa are now allowed in .py files when compiling to .mpy, and they
  work transparently to the user.

- Entire .py files can be compiled to native via mpy-cross -X emit=native
  and for the most part the generated .mpy files should work the same as
  their bytecode version.

- The .mpy file format is changed to 1) specify in the header if the file
  contains native code and if so the architecture (eg x86, ARMV7M, Xtensa);
  2) for each function block the kind of code is specified (bytecode,
  native, viper, asm).

- When native code is loaded from a .mpy file the native code must be
  modified (in place) to link qstr values in, just like bytecode (see
  py/persistentcode.c:arch_link_qstr() function).

In addition, this now defines a public, native ABI for dynamically loadable
native code generated by other languages, like C.

This adds support to freeze .mpy files that contain native code blocks.

Build with:

    $ ./mk_native.sh

Execute with:

    $ ./micropython
    import native_ex
    native_ex.foo()
@dpgeorge
Member Author

dpgeorge commented Mar 8, 2019

Ok, this was merged between 02cc288 and 1e23a29.

The commit with the example of generating a .mpy from a C file was not merged; it needs to be cleaned up and generalised a bit first.

@dpgeorge dpgeorge closed this Mar 8, 2019
@dpgeorge dpgeorge deleted the py-native-mpy branch March 8, 2019 11:24
tannewt pushed a commit to tannewt/circuitpython that referenced this pull request Apr 8, 2021