tools/mpy_ld.py: Resolve fixed-address symbols if requested. #17091
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files (coverage diff, master vs. #17091):

| | master | #17091 | +/- |
|---|---|---|---|
| Coverage | 98.54% | 98.54% | |
| Files | 169 | 169 | |
| Lines | 21943 | 21943 | |
| Hits | 21623 | 21623 | |
| Misses | 320 | 320 | |
Looks like useful additions. Especially as it will be needed for ESP32-C2.
This is a nice improvement, thanks! But, the provided external symbol names/addresses are highly specific to esp8266 (right?) and I think they should all be consolidated into a single file. In fact they are already available in this repo in |
In theory this can also be used for other platforms that need to call into ROM from a natmod (or into functions available at a fixed address that have no stub for the linker); it just so happens that esp8266 is the only supported platform that has this specific need at the moment. I've limited this functionality to Xtensa only because that's what can be validated on hardware on my end. I thought about parsing I don't mind writing either a parser that can understand linkerscripts just enough to parse
Following on from #17379 (comment), it seems that enabling But, as you point out, from a memory-use point of view it would be better to prefer in-ROM routines. Is it possible to make the in-ROM symbols have a preference over ones from elf files? I guess as a first attempt the in-ROM symbols can always override elf symbols, and in the future (if needed) we can have a way to select which is preferred.
I think it's a good idea to be able to parse |
Making ROM symbols override ELF ones isn't much of a deal (it's a matter of altering the symbol source lookup order in Edit Turns out stripping symbols from within I don't know what's the 1.26.0 tentative release date, but I'm partial to have this with ROM symbols support and redundant ELF symbols' code in - the final size of the MPY file is exactly the same when ELF resolution is used, as the only change is the symbol address in opcodes/tables pointing to ROM rather than elsewhere. Having stripped MPY files without duplicated ELF symbols is something I can work on at a later date since it'd take quite some testing to get it right in all cases. Regarding per-symbol resolution order, how about having a In the meantime I've rewritten the linkerscript parser to also handle the ESP-IDF symbols lists so - once I enable this on RV32 as well - this should also work for the whole ESP32 MCU line. The only caveat would be that some extra argument processing will be needed in the makefile to pick up possibly more than just a single linkerscript (same goes for |
It's 1st August 2025. And the aim is to be strict with that date.
Yes, I'm happy with that. We can improve upon it later.
I think we can skip that feature for now, and just always prefer ROM.
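To make the agreed behaviour concrete, here is a minimal sketch of an "always prefer ROM" lookup. It is not the code in `tools/mpy_ld.py`: the function name, the two dictionaries, and the addresses are all hypothetical and only illustrate the precedence rule.

```python
def resolve_symbol(name, rom_symbols, elf_symbols):
    """Return (source, address) for a symbol, preferring fixed-address entries.

    Hypothetical helper: rom_symbols and elf_symbols are plain dicts mapping
    symbol names to addresses, not the linker's real data structures.
    """
    # Fixed-address (ROM) symbols win whenever they are present.
    if name in rom_symbols:
        return ("rom", rom_symbols[name])
    # Otherwise fall back to symbols found in object files or libraries.
    if name in elf_symbols:
        return ("elf", elf_symbols[name])
    raise KeyError(f"undefined symbol: {name}")


# Made-up addresses, for illustration only.
rom = {"memcpy": 0x40000100}
elf = {"memcpy": 0x00000040, "__modsi3": 0x00000080}

print(resolve_symbol("memcpy", rom, elf))    # taken from the ROM table
print(resolve_symbol("__modsi3", rom, elf))  # only available from ELF input
```

A per-symbol preference could later be added by consulting an override table before the first check, without changing the default.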
I don't fully understand why this is difficult: at the moment, if you use |
That's indeed correct, but I've probably expressed myself incorrectly as my point applies to single symbols regardless of the object files' granularity. Let's look at the btree natmod as an example. The ESP8266 ROM has mem{cpy,set,move,cmp,etc.} implementations, and they're exposed in the linkerscript file, so their addresses are used in the final MPY file if they're encountered, as expected when using an externs file. However, if you build the natmod (even without That object file is then used for linking by I haven't tried building with
This iteration of the patch should address the pending review issues and enforce the new resolution behaviour, as in ROM symbols override ELF ones in every case. However, this made testing more interesting. The modules that are too large to be imported as-is on an ESP8266 and require calling The natmods in question are
For Now, why it didn't crash before, when ROM symbols were picked only as a last resort? I have two hypotheses that still do not fully explain this.
Well, there's a third hypothesis, my board's flash has failing cells in the area used by This needs a bit more investigation, but I doubt I'd make some progress quickly without getting JTAG to work. Again, if anybody knows how to get it to work on Linux, please speak up! :) Edit: the target address range suspiciously looks like it's been relocated even though it shouldn't have been, but that might just be a coincidence, otherwise why would it crash after a ROM call just took place successfully? Edit2: and yes, it does get relocated - I guess I'll have to claim relocation type 0x7E for |
@dpgeorge I've had to assign a new relocation type number (126, 0x7E) for fixed address entries to prevent their relocation at load time. It should not be a change that breaks existing code as the function table never reached that number of entries. However, if that's something you want to delay for a more significant set of MPY format changes that would require a version number bump, that's understandable. That said, now all The request for help for JTAG on an ESP8266 is still valid, even though I've added an optional exception reporter to the ESP8266 port. It prints the contents of So, relocation type number concerns aside this should be ready for review - it incorporates the suggestions you've made earlier but there's some new code in. |
This commit lets mpy_ld.py resolve symbols not only from the object files involved in the linking process, or from compiler-supplied static libraries, but also from a list of symbols referenced by an absolute address (usually provided by the system's ROM).

This is needed for ESP8266 targets as some C stdlib functions are provided by the MCU's own ROM code to reduce the final code footprint, and therefore those functions' implementation was removed from the compiler's support libraries. This means that unless `LINK_RUNTIME` is set (which lets tooling look at more libraries to resolve symbols) the build process will fail as tooling is unaware of the ROM symbols' existence.

With this change, fixed-address symbols can be exposed to the symbol resolution step when performing natmod linking. If there are symbols coming in from a fixed-address symbols list and internal code or external libraries, the fixed-address symbol address will take precedence in all cases.

Although this is - in theory - also working for the whole range of ESP32 MCUs, testing is currently limited to Xtensa processors and the example natmods' makefiles only make use of this commit's changes for the ESP8266 target.

Natmod builds can set the MPY_EXTERN_SYM_FILE variable pointing to a linkerscript file containing a series of symbols (weak or strong) at a fixed address; these symbols will then be used by the MicroPython linker when packaging the natmod. If a different natmod build method is used (eg. custom CMake scripts), `tools/mpy_ld.py` can now accept a command line parameter called `--externs` (or its short variant `-e`) that contains the path of a linkerscript file with the fixed-address symbols to use when performing the linking process.

The linkerscript file parser can handle a very limited subset of binutils's linkerscript syntax, namely just block comments, strong symbols, and weak symbols. Each symbol must be in its own line for the parser to succeed, empty lines or comment blocks are skipped. For an example of what this parser was meant to handle, you can look at `ports/esp8266/boards/eagle.rom.addr.v6.ld` and follow its format.

The natmod developer documentation is also updated to reflect the new command line argument accepted by `mpy_ld.py` and the use cases for the changes introduced by this commit.

Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
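The restricted syntax described in this commit message can be illustrated with a short sketch. This is not the parser shipped in `tools/mpy_ld.py`. It assumes strong symbols look like `name = 0x...;` and weak symbols like `PROVIDE ( name = 0x... );`, each on its own line, and the function name, regular expressions, and sample addresses are made up for illustration.

```python
import re

# Assumed line shapes (not taken from mpy_ld.py):
#   strong symbol:  name = 0x40001000;
#   weak symbol:    PROVIDE ( name = 0x40001000 );
_NAME = r"([A-Za-z_][A-Za-z0-9_]*)"
_ADDR = r"(0[xX][0-9A-Fa-f]+)"
_STRONG = re.compile(rf"^{_NAME}\s*=\s*{_ADDR}\s*;$")
_WEAK = re.compile(rf"^PROVIDE\s*\(\s*{_NAME}\s*=\s*{_ADDR}\s*\)\s*;$")
_BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)


def parse_fixed_address_symbols(text):
    """Return a dict of symbol name -> absolute address."""
    symbols = {}
    # Drop /* ... */ comments first; they may span several lines.
    text = _BLOCK_COMMENT.sub("", text)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # empty lines are skipped
        match = _STRONG.match(line) or _WEAK.match(line)
        if match is None:
            raise ValueError(f"unsupported linkerscript line: {line!r}")
        symbols[match.group(1)] = int(match.group(2), 16)
    return symbols


SAMPLE = """
/* Illustrative entries only; these addresses are made up. */
PROVIDE ( example_weak = 0x40001000 );
example_strong = 0x40002000;
"""

print(parse_fixed_address_symbols(SAMPLE))
```

Rejecting any line that is neither a symbol nor a comment keeps the sketch strict, which matches the "each symbol must be in its own line" requirement above.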
This commit introduces a mechanism to customise the code that is injected to the board when performing a native module import. A new argument, "-b"/"--begin", is added so regular Python code can be inserted in the injected fragment between the module file creation and the effective module import. This is needed for running natmod tests on ESP8266 as that board does not have enough memory to fit certain modules unless additional configuration is performed. Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
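As a rough illustration of where the extra code lands, the sketch below assembles an injected fragment in three steps: create the module file, run the optional prelude, then import the module. It is a simplification under assumptions: the function name and the way the module file is written are hypothetical, and the real `tests/run-natmodtests.py` may expose the module to the board differently.

```python
def build_injected_script(module_name, mpy_bytes, prelude=""):
    """Assemble a script: file creation, optional prelude, then import.

    Hypothetical helper for illustration; not the test runner's real code.
    """
    filename = module_name + ".mpy"
    parts = [
        # Step 1: create the module file on the target.
        f"with open({filename!r}, 'wb') as f:\n    f.write({mpy_bytes!r})\n",
    ]
    if prelude:
        # Step 2: user-supplied code runs after the file exists but before
        # the import, e.g. to apply extra configuration on small boards.
        parts.append(prelude.rstrip("\n") + "\n")
    # Step 3: the effective module import.
    parts.append(f"import {module_name}\n")
    return "".join(parts)


script = build_injected_script("example_mod", b"M\x06", "import gc\ngc.collect()\n")
print(script)
```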
This commit provides the appropriate external symbol addresses to let the "random" example natmod build for the Xtensa platform. On the ESP8266, signed integer division code isn't provided as part of libgcc.a, libm.a, or libc.a, but it is instead provided by the ROM. Regular builds inject the appropriate symbol addresses as part of the linking process (see eagle.rom.addr.v6.ld), but natmods need this information brought in from somewhere else. Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
This commit provides the appropriate external symbol addresses to let the "framebuf" example natmod build for the Xtensa platform. On the ESP8266, integer division code isn't provided as part of libgcc.a, libm.a, or libc.a, but it is instead provided by the ROM. Regular builds inject the appropriate symbol addresses as part of the linking process (see eagle.rom.addr.v6.ld), but natmods need this information brought in from somewhere else. Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
This commit provides the appropriate external symbol addresses to let the "deflate" example natmod build for the Xtensa platform. Unlike other natmods that require an external symbol list to build without bringing in the whole runtime libraries set, this natmod is referencing the `__modsi3` symbol which was removed from the ESP8266's SDK but not present in ROM. The latter only has a `__umodsi3` implementation that only operates on unsigned values, and thus unable to handle this natmod. Thus, the extended library resolution process is enabled for this natmod as a `__modsi3` implementation is made available that way (still using ROM symbols whenever possible). This also means that symbols that appear in both ROM and external libraries sort of co-exist in the final MPY file, with ROM symbols being used by natmod code but the implementation from the library still exists in the final MPY file, unused. Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
This commit provides the appropriate external symbol addresses to let the "btree" example natmod build for the Xtensa platform. On the ESP8266, unsigned integer division code isn't provided as part of libgcc.a, libm.a, or libc.a, but it is instead provided by the ROM. Regular builds inject the appropriate symbol addresses as part of the linking process (see eagle.rom.addr.v6.ld), but natmods need this information brought in from somewhere else. Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
This commit adds an optional configuration option for the ESP8266 port that, if the board rebooted due to a crash, will print to stdout some information about the error that triggered the issue. It is not possible using regular SDK functions to intercept errors and print information at that stage, and the only error response from the board is to reboot itself. This is the next best thing, print some error information just once at boot time after the crash - the least invasive option given the situation we're in. This is disabled by default, and can be enabled by enabling MICROPY_HW_HARD_FAULT_DEBUG in the port configuration - obviously with a small increase in the firmware code footprint. Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
This commit lets the CI pipeline build all natmods for the Xtensa target, now that ROM symbols can be used in the linking process. The restriction was put in place due to build failures on certain natmods for Xtensa, as ROM symbols would not be used, causing undefined symbol errors at build time. Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
This commit fixes a small yet harmless issue that occurs when invoking `ci_native_mpy_modules_build` on a persistent environment, as only X64 MPY files would be removed by the cleaning process. Now the correct architecture is passed at all times when cleaning before building a natmod for a particular architecture, forcing a full build of all files to better simulate the CI environment (where there's no state persisted between runs for this step). Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
dpgeorge
left a comment
Thanks for updating, this looks good now. Tested building all natmod examples for xtensa arch, and ran them on esp8266 (using the new prelude option to run-natmodtests.py).
Summary
This PR lets mpy_ld.py resolve symbols not only from the object files involved in the linking process, or from compiler supplied static libraries, but also from a list of symbols referenced by an absolute address (usually provided by the system's ROM).
This is needed for Xtensa/LX106 targets as some C stdlib functions are provided by the MCU's own ROM code to reduce the final code footprint. Given that there are different methods to reference external symbols for different processors, this change is currently limited to the Xtensa target and it will need further modifications to be adapted to other targets.
Natmod builds can set the `EXTERN_SYMS` variable pointing to a file containing a series of symbols and their absolute address if they want to provide external symbols to the resolution process. Each line must contain two whitespace-separated fields, the name and its address (which will be parsed as a hexadecimal number). Empty lines are ignored, and lines starting with a "#" symbol will be treated as comment lines and thus ignored as well.

Future work will be needed when the ESP32-C2 MCU support is brought in as some code from libgcc and newlib is moved into ROM on that platform as well (see ESP-IDF's linker files in `components/esp_rom/esp32c2/ld/` - `newlib`, `newlib-nano`, and `libgcc`).

All natmods are now built for the Xtensa architecture as part of the CI build process.
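The two-field `EXTERN_SYMS` file format described in this summary is simple enough to sketch in a few lines. The function name and sample entries below are hypothetical, and this is not the code in `tools/mpy_ld.py`. The sketch also accepts an optional `0x` prefix on the address, which is an assumption rather than something the description states.

```python
def parse_extern_syms(text):
    """Parse '<name> <hex address>' lines into a dict of name -> address."""
    symbols = {}
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Empty lines and lines starting with "#" are ignored.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(f"line {number}: expected '<name> <hex address>'")
        name, address = fields
        # int(..., 16) accepts the digits with or without a "0x" prefix.
        symbols[name] = int(address, 16)
    return symbols


# Made-up names and addresses, for illustration only.
SAMPLE_SYMS = "# ROM helpers\n\nexample_a 4000abcd\nexample_b 0x10\n"
print(parse_extern_syms(SAMPLE_SYMS))
```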
Testing
Regular natmod tests were run on a NodeMCU v2 ESP8266 board by injecting the following code block when importing natmods on the board (see #14430 (comment)):
To achieve this a new argument to `tests/run-natmodtests.py` is introduced so the above fragment is uploaded as part of the testing process. All tests were executed successfully with `./run-natmodtests.py -p -d /dev/ttyUSB0 -a xtensa -b xtensa_prelude.py extmod/<testfiles>` with the following exceptions (`xtensa_prelude.py` is the extra code fragment being injected):

- `extmod/btree1.py`: FAIL ("MemoryError: memory allocation failed, allocating 20734 bytes")
- `extmod/btree_closed.py`: FAIL ("MemoryError: memory allocation failed, allocating 20734 bytes")
- `extmod/btree_error.py`: FAIL ("MemoryError: memory allocation failed, allocating 20734 bytes")
- `extmod/btree_gc.py`: FAIL ("MemoryError: memory allocation failed, allocating 20734 bytes")
- `extmod/deflate_compress.py`: FAIL ("Error: memory allocation failed")

The btree natmod probably is a bit too much for an ESP8266 in its current form, probably the same is valid for the deflate natmod when it comes to compressing buffers.
Trade-offs and Alternatives
All `tools/mpy_ld.py` changes are active only for the `EM_XTENSA` architecture and they are opt-in, so there shouldn't be any trade-offs to be made here. As far as alternatives go, the only one is to move natmod code into one or more usermods and bring them in as part of the MicroPython build.

For `tests/run-natmodtests.py` the same applies (as in the injection being opt-in), although it is available to all platforms.