@agatti
Contributor

@agatti agatti commented Apr 7, 2025

Summary

This PR lets mpy_ld.py resolve symbols not only from the object files involved in the linking process, or from compiler supplied static libraries, but also from a list of symbols referenced by an absolute address (usually provided by the system's ROM).

This is needed for Xtensa/LX106 targets as some C stdlib functions are provided by the MCU's own ROM code to reduce the final code footprint. Given that there are different methods to reference external symbols for different processors, this change is currently limited to the Xtensa target and will need further modifications to be adapted to other targets.

Natmod builds can set the EXTERN_SYMS variable pointing to a file containing a series of symbols and their absolute addresses if they want to provide external symbols to the resolution process. Each line must contain two whitespace-separated fields, the name and its address (which will be parsed as a hexadecimal number). Empty lines are ignored, and lines starting with a "#" symbol will be treated as comment lines and thus ignored as well.
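As an illustration of the file format described above, here is a minimal sketch of a parser for it. This is not the code from this PR: the function name `parse_extern_syms` and the addresses in the sample are assumptions made up for the example.

```python
def parse_extern_syms(text):
    """Parse 'name address' lines into a dict; addresses are hexadecimal."""
    symbols = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Empty lines and lines starting with '#' are ignored.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(
                "line %d: expected 2 fields, got %d" % (lineno, len(fields))
            )
        name, address = fields
        # int(..., 16) accepts the address with or without a '0x' prefix.
        symbols[name] = int(address, 16)
    return symbols


# Placeholder symbol names and addresses, for illustration only.
sample = """
# ROM helpers
rom_helper_a 0x40001000
rom_helper_b 40002000
"""
print(parse_extern_syms(sample))
```

Raising on malformed lines instead of skipping them is a design choice of this sketch, so that a typo in the symbols file shows up at build time rather than as an undefined symbol later.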

Future work will be needed when the ESP32-C2 MCU support is brought in as some code from libgcc and newlib is moved into ROM on that platform as well (see ESP-IDF's linker files in components/esp_rom/esp32c2/ld/ - newlib, newlib-nano, and libgcc).

All natmods are now built for the Xtensa architecture as part of the CI build process.

Testing

Regular natmod tests were run on a NodeMCU v2 ESP8266 board by injecting the following code block when importing natmods on the board (see #14430 (comment)):

import esp, gc
gc.collect()
esp.set_native_code_location(200 * 4096, 30 * 4096)

To achieve this a new argument to tests/run-natmodtests.py is introduced so the above fragment is uploaded as part of the testing process. All tests were executed successfully with ./run-natmodtests.py -p -d /dev/ttyUSB0 -a xtensa -b xtensa_prelude.py extmod/<testfiles> with the following exceptions (xtensa_prelude.py is the extra code fragment being injected):

extmod/btree1.py: FAIL ("MemoryError: memory allocation failed, allocating 20734 bytes")
extmod/btree_closed.py: FAIL ("MemoryError: memory allocation failed, allocating 20734 bytes")
extmod/btree_error.py: FAIL ("MemoryError: memory allocation failed, allocating 20734 bytes")
extmod/btree_gc.py: FAIL ("MemoryError: memory allocation failed, allocating 20734 bytes")
extmod/deflate_compress.py: FAIL ("Error: memory allocation failed")

The btree natmod is probably a bit too much for an ESP8266 in its current form, and the same probably holds for the deflate natmod when it comes to compressing buffers.

Trade-offs and Alternatives

All tools/mpy_ld.py changes are active only for the EM_XTENSA architecture and they are opt-in, so there shouldn't be any trade-offs to be made here. As far as alternatives go, the only one is to move natmod code into one or more usermods and bring them in as part of the MicroPython build.

For tests/run-natmodtests.py the same applies (as in the injection being opt-in), although it is available to all platforms.

@codecov

codecov bot commented Apr 7, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.54%. Comparing base (9174cff) to head (193603d).
Report is 9 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #17091   +/-   ##
=======================================
  Coverage   98.54%   98.54%           
=======================================
  Files         169      169           
  Lines       21943    21943           
=======================================
  Hits        21623    21623           
  Misses        320      320           

@github-actions

github-actions bot commented Apr 7, 2025

Code size report:

   bare-arm:    +0 +0.000% 
minimal x86:    +0 +0.000% 
   unix x64:    +0 +0.000% standard
      stm32:    +0 +0.000% PYBV10
     mimxrt:    +0 +0.000% TEENSY40
        rp2:    +0 +0.000% RPI_PICO_W
       samd:    +0 +0.000% ADAFRUIT_ITSYBITSY_M4_EXPRESS
  qemu rv32:    +0 +0.000% VIRT_RV32

@agatti agatti force-pushed the mpy-ld-extern-provider branch from 1e9da04 to 70f8079 Compare April 7, 2025 23:11
@dpgeorge dpgeorge added the tools Relates to tools/ directory in source, or other tooling label Apr 8, 2025
@agatti agatti force-pushed the mpy-ld-extern-provider branch 3 times, most recently from 66abdd6 to dba82c8 Compare April 9, 2025 20:01
@jonnor
Contributor

jonnor commented Apr 12, 2025

Looks like useful additions. Especially as it will be needed for ESP32-C2.

@agatti agatti force-pushed the mpy-ld-extern-provider branch from dba82c8 to 8c945a8 Compare April 14, 2025 01:13
@dpgeorge dpgeorge added this to the release-1.26.0 milestone Apr 14, 2025
@dpgeorge
Member

This is a nice improvement, thanks!

But, the provided external symbol names/addresses are highly specific to esp8266 (right?) and I think they should all be consolidated into a single file. In fact they are already available in this repo in ports/esp8266/boards/eagle.rom.addr.v6.ld, so I would suggest using this file instead of adding fragments (mpy_ld.py could support loading a small fragment with just the required symbols, but for the examples here I think it can just refer to that eagle.rom.addr.v6.ld file).

@agatti
Contributor Author

agatti commented Apr 22, 2025

In theory this can also be used for other platforms that need to call into ROM from a natmod (or into functions available at a fixed address that have no stub for the linker), it just so happens that esp8266 is the only supported platform that has this specific need at the moment. I've limited this functionality to Xtensa only because that's what can be validated on hardware on my end.

I thought about parsing eagle.rom.addr.v6.ld, but that's a linkerscript and its syntax is much more involved than what I've implemented. A linkerscript parser that can be used from Python is available at https://github.com/tree-sitter-grammars/tree-sitter-linkerscript but that brings in py-tree-sitter as a requirement and thus a larger code footprint for its integration.

I don't mind to write either a parser that can understand linkerscripts just enough to parse eagle.rom.addr.v6.ld, or integrate py-tree-sitter and modify the CI script to install dependencies (looking back at this, documentation has to be updated either way to keep track of the new feature). I'm partial to the former, but the latter has the potential to be more generic.

@dpgeorge
Member

Following on from #17379 (comment), it seems that enabling LINK_RUNTIME=1 is enough to get all the xtensa natmods working on esp8266. So the in-ROM version of functions is never actually used (I didn't check the output .mpy file but I assume this is the case, because it's only if a symbol is undefined that the in-ROM locations are used).

But, as you point out, from a memory-use point of view it would be better to prefer in-ROM routines. Is it possible to make the in-ROM symbols have a preference over ones from elf files? I guess as a first attempt the in-ROM symbols can always override elf symbols, and in the future (if needed) we can have a way to select which is preferred.

I don't mind to write either a parser that can understand linkerscripts just enough to parse eagle.rom.addr.v6.ld

I think it's a good idea to be able to parse eagle.rom.addr.v6.ld. So a user can use that file verbatim if needed, instead of extracting the symbols they need into a separate file. I reckon just a custom parser that can parse just that file is enough. Probably a single regex could do it.
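To make the "single regex" idea concrete, here is a minimal sketch, not the parser that was eventually merged. It assumes each symbol sits on its own line, either as `PROVIDE ( name = 0x... );` or as a plain `name = 0x...;` assignment, with `/* ... */` block comments; the function name and the sample symbols are made up for the example.

```python
import re

# One symbol per line: "PROVIDE ( name = 0x1234 );" or "name = 0x1234;".
# The pattern is deliberately lenient and does not check that the
# parentheses are balanced.
SYMBOL_RE = re.compile(
    r"^\s*(?:PROVIDE\s*\(\s*)?([A-Za-z_][A-Za-z0-9_]*)"
    r"\s*=\s*(0x[0-9A-Fa-f]+)\s*\)?\s*;\s*$"
)
BLOCK_COMMENT_RE = re.compile(r"/\*.*?\*/", re.DOTALL)


def parse_linkerscript_symbols(text):
    """Return a {name: address} dict for the symbol lines found in text."""
    symbols = {}
    # Remove block comments first so commented-out symbols are not picked up.
    text = BLOCK_COMMENT_RE.sub("", text)
    for line in text.splitlines():
        match = SYMBOL_RE.match(line)
        if match:
            symbols[match.group(1)] = int(match.group(2), 16)
    return symbols


# Placeholder names and addresses, for illustration only.
sample = """
/* ROM
   symbols */
PROVIDE ( rom_helper_a = 0x40001000 );
rom_helper_b = 0x40002000;
"""
print(parse_linkerscript_symbols(sample))
```

Lines that do not match are skipped silently here, which keeps the sketch short but also hides typos; a stricter variant could report them instead.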

@agatti
Contributor Author

agatti commented May 29, 2025

But, as you point out, from a memory-use point of view it would be better to prefer in-ROM routines. Is it possible to make the in-ROM symbols have a preference over ones from elf files? I guess as a first attempt the in-ROM symbols can always override elf symbols, and in the future (if needed) we can have a way to select which is preferred.

Making ROM symbols override ELF ones isn't much of a deal (it's a matter of altering the symbol source lookup order in populate_got), but the unused ELF symbols code will still linger in the MPY file since they are already embedded in the input object file at resolution time. I haven't yet looked at how difficult it'd be to remove the extra symbols before resolution occurs.
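The lookup-order change can be pictured with a small sketch. This is not the actual populate_got code: `resolve_symbol`, the two dictionaries, and the addresses are hypothetical stand-ins for the real data structures in mpy_ld.py.

```python
def resolve_symbol(name, rom_symbols, elf_symbols):
    """Return (source, address) for name, preferring fixed-address symbols."""
    # Fixed-address (ROM) symbols are consulted first, so they override
    # any implementation that also exists in the ELF inputs.
    if name in rom_symbols:
        return ("rom", rom_symbols[name])
    if name in elf_symbols:
        return ("elf", elf_symbols[name])
    raise KeyError("undefined symbol: " + name)


# Placeholder addresses, for illustration only.
rom = {"memcpy": 0x40001000}
elf = {"memcpy": 0x0000042C, "helper": 0x00000100}
print(resolve_symbol("memcpy", rom, elf))
print(resolve_symbol("helper", rom, elf))
```

Swapping the two `if` blocks gives the earlier behaviour, where ROM symbols are only used as a last resort. Either way, the ELF implementation stays in the object file's .text data unless it is stripped separately.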

Edit

Turns out stripping symbols from within mpy_ld.py is doable but not trivial. The elftools python package has no provisions for altering the ELF data at that level. So this would require performing some binary surgery on the raw data section buffer and patch all symbol addresses located past the operation site to reflect the new starting offset (and fixing up all absolute relocations too), along with possibly monkey-patching some of elftools' code to use the altered copy of the section data rather than reading it straight from the ELF file itself (even more fun with compressed sections!).

I don't know what's the 1.26.0 tentative release date, but I'm partial to have this with ROM symbols support and redundant ELF symbols' code in - the final size of the MPY file is exactly the same when ELF resolution is used, as the only change is the symbol address in opcodes/tables pointing to ROM rather than elsewhere. Having stripped MPY files without duplicated ELF symbols is something I can work on at a later date since it'd take quite some testing to get it right in all cases.

Regarding per-symbol resolution order, how about having a --extern-blacklist parameter containing a comma-separated list of symbols that should never be picked from linkerscripts (with multiple argument occurrences adding to the list of symbols to skip)?
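A sketch of how such an option could be collected with argparse is shown below. The option is only a proposal at this point in the thread and does not exist in mpy_ld.py; the function name `parse_blacklist` is made up for the example.

```python
import argparse


def parse_blacklist(argv):
    """Collect symbol names from repeated, comma-separated occurrences."""
    parser = argparse.ArgumentParser()
    # action="append" keeps one list entry per occurrence of the option.
    parser.add_argument(
        "--extern-blacklist", action="append", default=[], metavar="SYMBOLS"
    )
    args = parser.parse_args(argv)
    names = set()
    for chunk in args.extern_blacklist:
        # Each occurrence may itself hold a comma-separated list.
        names.update(part.strip() for part in chunk.split(",") if part.strip())
    return names


print(sorted(parse_blacklist(
    ["--extern-blacklist", "memcpy,memset", "--extern-blacklist", "__divsi3"]
)))
```

Returning a set makes the later "should this symbol be skipped?" check a simple membership test and ignores duplicates across occurrences.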

In the meantime I've rewritten the linkerscript parser to also handle the ESP-IDF symbols lists so - once I enable this on RV32 as well - this should also work for the whole ESP32 MCU line. The only caveat would be that some extra argument processing will be needed in the makefile to pick up possibly more than just a single linkerscript (same goes for mpy_ld.py), as libc/libgcc/newlib symbol definitions are split across multiple files.

@dpgeorge
Member

I don't know what's the 1.26.0 tentative release date,

It's 1st August 2025. And the aim is to be strict with that date.

I'm partial to have this with ROM symbols support and redundant ELF symbols' code in - the final size of the MPY file is exactly the same when ELF resolution is used, as the only change is the symbol address in opcodes/tables pointing to ROM rather than elsewhere.

Yes, I'm happy with that. We can improve upon it later.

Regarding per-symbol resolution order

I think we can skip that feature for now, and just always prefer ROM.

Turns out stripping symbols from within mpy_ld.py is doable but not trivial

I don't fully understand why this is difficult: at the moment, if you use LINK_RUNTIME=1 but don't use any symbols from libc/libgcc/etc then it doesn't add any code at all. And because libc is an archive made of many .o object files, I assume if you reference a symbol in one of those object files then you only need to include that single object file, not the whole libc archive. Right? So that feature alone (of only included .o's when a symbol within them is referenced) should be enough to get optimal sized .mpy files (as long as the .o in the .a are fine grained enough).

@agatti
Contributor Author

agatti commented May 30, 2025

I don't fully understand why this is difficult: at the moment, if you use LINK_RUNTIME=1 but don't use any symbols from libc/libgcc/etc then it doesn't add any code at all. And because libc is an archive made of many .o object files, I assume if you reference a symbol in one of those object files then you only need to include that single object file, not the whole libc archive. Right? So that feature alone (of only included .o's when a symbol within them is referenced) should be enough to get optimal sized .mpy files (as long as the .o in the .a are fine grained enough).

That's indeed correct, but I probably didn't express myself clearly: my point applies to single symbols, regardless of the object files' granularity.

Let's look at the btree natmod as an example.

The ESP8266 ROM has mem{cpy,set,move,cmp,etc.} implementations, and they're exposed in the linkerscript file, so their addresses are used in the final MPY file if they're encountered, as expected when using an externs file.

However, if you build the natmod (even without LINK_RUNTIME in this case) and look at the generated build/btree_c.o file you can see those symbols being referenced, but their implementation is present in the object file itself:

$ nm build/btree_c.o | grep -i " mem"

0000042c T memcpy
0000045c T memmove
00000444 T memset

That object file is then used for linking by mpy_ld.py, and it contains one single .text section so, as far as I know, the only way for those redundant symbols to not show up in the final MPY file in that case without involving $CROSS_COMPILE-objcopy as a preprocessing pass is to cut the ranges out from the raw .text section data bytes and then fix up offsets past the cut out points.

I haven't tried building with -ffunction-sections to see if mpy_ld.py handles that; if it does then it's much easier to strip redundant symbols away, by skipping the whole section rather than operating on section data buffers.

@agatti agatti force-pushed the mpy-ld-extern-provider branch from 8c945a8 to a045c0f Compare June 1, 2025 09:57
@agatti
Contributor Author

agatti commented Jun 1, 2025

This iteration of the patch should address the pending review issues and enforce the new resolution behaviour, as in ROM symbols override ELF ones in every case.

However, this made testing more interesting. The modules that are too large to be imported as-is on an ESP8266, and that require calling esp.set_native_code_location before they can be loaded, now exhibited unexpected behaviour, with "impossible" crashes that had me disassembling code for some time to work out whether the patch was operating correctly. I've never had any luck using a JTAG probe with an ESP8266, so that was all trial and error. If anybody knows how to get JTAG working on it, please let me know.

The natmods in question are btree, deflate, and framebuf. btree is simply too large to be loaded unless some more RAM is freed anyway and so it couldn't be tested (as before).

framebuf tests work except for framebuf_polygon where it would crash inside mp_obj_get_int with a crash EPC1 pointing into the void (somewhere in the 0x8000_0000 address range). I've looked at the MPY CODE segment to check if I was poking in the wrong values, but things look OK: see https://gist.github.com/agatti/4398c2a28ed786171c31e54cef2f18cf, the ROM division symbol addresses are correctly put at 0x0000_1008 and 0x0000_100C and the code for framebuf_poly before and slightly after the crash point (0x0000_1f57 in the disassembled code) looks correct.

For deflate I've just looked at the first one, deflate_compress, crashing inside io.BytesIO.write when building the buffer to compress (appending b"micropython" 10 times).

Now, why didn't it crash before, when ROM symbols were picked only as a last resort? I have two hypotheses that still do not fully explain this.

  • Code (or something else) put in the memory area reserved by esp.set_native_code_location gets somehow corrupted at runtime (that'd explain the framebuf crash)
  • Mixing libgcc code with ESP8266 ROM symbols isn't really supported and some registers are trashed behind the scenes or there's some stack weirdness going on (deflate needs __modsi3, which is not present in the ROM and thus has to be fetched from libgcc.a).

Well, there's a third hypothesis: my board's flash has failing cells in the area used by esp.set_native_code_location, but I only have one ESP8266 board I can try this on at the moment. I'd be really surprised if it's this one though.

This needs a bit more investigation, but I doubt I'd make some progress quickly without getting JTAG to work. Again, if anybody knows how to get it to work on Linux, please speak up! :)

Edit: the target address range suspiciously looks like it's been relocated even though it shouldn't have been, but that might just be a coincidence, otherwise why would it crash after a ROM call just took place successfully?

Edit2: and yes, it does get relocated - I guess I'll have to claim relocation type 0x7E for py/persistentcode.c to leave it alone, since this is not going to be Xtensa specific and it's rather improbable that the function table will grow to 135 entries whilst keeping the current file format version number (plus I have no idea how to expand the Xtensa literal section to get this to work).

@agatti agatti force-pushed the mpy-ld-extern-provider branch from a045c0f to 406f513 Compare June 1, 2025 16:03
@agatti
Contributor Author

agatti commented Jun 1, 2025

@dpgeorge I've had to assign a new relocation type number (126, 0x7E) for fixed address entries to prevent their relocation at load time. It should not be a change that breaks existing code as the function table never reached that number of entries. However, if that's something you want to delay for a more significant set of MPY format changes that would require a version number bump, that's understandable.

That said, now all framebuf tests pass - deflate tests behave as when using ELF symbols except for tests/extmod/deflate_stream_error.py that crashes when using ROM symbols (the crash was due to an improper code location offset for esp.set_native_code_location, trashing data).

The request for help for JTAG on an ESP8266 is still valid, even though I've added an optional exception reporter to the ESP8266 port. It prints the contents of struct rst_info if the board rebooted due to an exception or a watchdog kicking in (hardware or software).

So, relocation type number concerns aside this should be ready for review - it incorporates the suggestions you've made earlier but there's some new code in.

@agatti agatti force-pushed the mpy-ld-extern-provider branch from 406f513 to bae6754 Compare June 1, 2025 16:19
@agatti agatti changed the title tools/mpy_ld.py: Add ROM symbols for Xtensa natmods. tools/mpy_ld.py: Resolve fixed-address symbols if requested. Jun 1, 2025
@agatti agatti force-pushed the mpy-ld-extern-provider branch 3 times, most recently from 3cd35ae to 2b46c83 Compare June 3, 2025 20:17
@agatti agatti force-pushed the mpy-ld-extern-provider branch 2 times, most recently from 23515be to 08bcdc9 Compare June 4, 2025 05:29
@agatti agatti force-pushed the mpy-ld-extern-provider branch from 08bcdc9 to c7681e7 Compare June 4, 2025 12:01
agatti added 9 commits June 4, 2025 22:35

This commit lets mpy_ld.py resolve symbols not only from the object
files involved in the linking process, or from compiler-supplied static
libraries, but also from a list of symbols referenced by an absolute
address (usually provided by the system's ROM).

This is needed for ESP8266 targets as some C stdlib functions are
provided by the MCU's own ROM code to reduce the final code footprint,
and therefore those functions' implementation was removed from the
compiler's support libraries.  This means that unless `LINK_RUNTIME` is
set (which lets tooling look at more libraries to resolve symbols) the
build process will fail as tooling is unaware of the ROM symbols'
existence.  With this change, fixed-address symbols can be exposed to
the symbol resolution step when performing natmod linking.

If there are symbols coming in from a fixed-address symbols list and
internal code or external libraries, the fixed-address symbol address
will take precedence in all cases.

Although this is - in theory - also working for the whole range of ESP32
MCUs, testing is currently limited to Xtensa processors and the example
natmods' makefiles only make use of this commit's changes for the
ESP8266 target.

Natmod builds can set the MPY_EXTERN_SYM_FILE variable pointing to a
linkerscript file containing a series of symbols (weak or strong) at a
fixed address; these symbols will then be used by the MicroPython
linker when packaging the natmod.  If a different natmod build method is
used (eg. custom CMake scripts), `tools/mpy_ld.py` can now accept a
command line parameter called `--externs` (or its short variant `-e`)
that contains the path of a linkerscript file with the fixed-address
symbols to use when performing the linking process.

The linkerscript file parser can handle a very limited subset of
binutils's linkerscript syntax, namely just block comments, strong
symbols, and weak symbols.  Each symbol must be in its own line for the
parser to succeed, empty lines or comment blocks are skipped.  For an
example of what this parser was meant to handle, you can look at
`ports/esp8266/boards/eagle.rom.addr.v6.ld` and follow its format.

The natmod developer documentation is also updated to reflect the new
command line argument accepted by `mpy_ld.py` and the use cases for the
changes introduced by this commit.

Signed-off-by: Alessandro Gatti <a.gatti@frob.it>

This commit introduces a mechanism to customise the code that is
injected to the board when performing a native module import.

A new argument, "-b"/"--begin", is added so regular Python code can be
inserted in the injected fragment between the module file creation and
the effective module import.  This is needed for running natmod tests on
ESP8266 as that board does not have enough memory to fit certain modules
unless additional configuration is performed.

Signed-off-by: Alessandro Gatti <a.gatti@frob.it>

This commit provides the appropriate external symbol addresses to let
the "random" example natmod build for the Xtensa platform.

On the ESP8266, signed integer division code isn't provided as part of
libgcc.a, libm.a, or libc.a, but it is instead provided by the ROM.
Regular builds inject the appropriate symbol addresses as part of the
linking process (see eagle.rom.addr.v6.ld), but natmods need this
information brought in from somewhere else.

Signed-off-by: Alessandro Gatti <a.gatti@frob.it>

This commit provides the appropriate external symbol addresses to let
the "framebuf" example natmod build for the Xtensa platform.

On the ESP8266, integer division code isn't provided as part of
libgcc.a, libm.a, or libc.a, but it is instead provided by the ROM.
Regular builds inject the appropriate symbol addresses as part of the
linking process (see eagle.rom.addr.v6.ld), but natmods need this
information brought in from somewhere else.

Signed-off-by: Alessandro Gatti <a.gatti@frob.it>

This commit provides the appropriate external symbol addresses to let
the "deflate" example natmod build for the Xtensa platform.

Unlike other natmods that require an external symbol list to build
without bringing in the whole runtime libraries set, this natmod is
referencing the `__modsi3` symbol, which was removed from the ESP8266's
SDK but is not present in ROM.  The latter only has a `__umodsi3`
implementation that only operates on unsigned values, and is thus unable
to handle this natmod.  Thus, the extended library resolution process is
enabled for this natmod as a `__modsi3` implementation is made available
that way (still using ROM symbols whenever possible).  This also means
that symbols that appear in both ROM and external libraries sort of
co-exist in the final MPY file, with ROM symbols being used by natmod
code but the implementation from the library still exists in the final
MPY file, unused.

Signed-off-by: Alessandro Gatti <a.gatti@frob.it>

This commit provides the appropriate external symbol addresses to let
the "btree" example natmod build for the Xtensa platform.

On the ESP8266, unsigned integer division code isn't provided as part of
libgcc.a, libm.a, or libc.a, but it is instead provided by the ROM.
Regular builds inject the appropriate symbol addresses as part of the
linking process (see eagle.rom.addr.v6.ld), but natmods need this
information brought in from somewhere else.

Signed-off-by: Alessandro Gatti <a.gatti@frob.it>

This commit adds an optional configuration option for the ESP8266 port
that, if the board rebooted due to a crash, will print to stdout some
information about the error that triggered the issue.

It is not possible using regular SDK functions to intercept errors and
print information at that stage, and the only error response from the
board is to reboot itself.  This is the next best thing, print some
error information just once at boot time after the crash - the least
invasive option given the situation we're in.

This is disabled by default, and can be enabled by enabling
MICROPY_HW_HARD_FAULT_DEBUG in the port configuration - obviously with a
small increase in the firmware code footprint.

Signed-off-by: Alessandro Gatti <a.gatti@frob.it>

This commit lets the CI pipeline build all natmods for the Xtensa
target, now that ROM symbols can be used in the linking process.

The restriction was put in place due to build failures on certain
natmods for Xtensa, as ROM symbols would not be used, causing undefined
symbol errors at build time.

Signed-off-by: Alessandro Gatti <a.gatti@frob.it>

This commit fixes a small yet harmless issue that occurs when invoking
`ci_native_mpy_modules_build` on a persistent environment, as only X64
MPY files would be removed by the cleaning process.

Now the correct architecture is passed at all times when cleaning before
building a natmod for a particular architecture, forcing a full build of
all files to better simulate the CI environment (where there's no state
persisted between runs for this step).

Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
Member

@dpgeorge dpgeorge left a comment

Thanks for updating, this looks good now. Tested building all natmod examples for xtensa arch, and ran them on esp8266 (using the new prelude option to run-natmodtests.py).

@dpgeorge dpgeorge force-pushed the mpy-ld-extern-provider branch from c7681e7 to 193603d Compare June 4, 2025 13:02
@dpgeorge dpgeorge merged commit 193603d into micropython:master Jun 4, 2025
68 checks passed
@agatti agatti deleted the mpy-ld-extern-provider branch June 4, 2025 18:49