mp_raw_code_load/mp_raw_code_save - enable to execute from flash without loading in RAM #4124

adritium · 2018-09-11T21:01:33Z

What changes need to be made to enable this feature?

dpgeorge · 2018-09-12T02:12:50Z

There is only one reason that bytecode in .mpy files needs to be loaded into RAM: because the qstr values embedded in the saved bytecode must be rewritten to match those of the VM/runtime (this is essentially the linking stage, resolving symbols).

It is possible to change this so that bytecode is read-only and doesn't need modification. It would require making a qstr translation table for each bytecode block, to translate between the qstr values in the .mpy file and the qstr values in the VM/runtime. This translation table would need to live in RAM, but otherwise the bytecode can stay in ROM/flash/etc. The downsides to this approach: 1) it takes more RAM if the bytecode is also in RAM (the usual case when executing from the REPL or from a .py file); 2) it decreases performance of the VM due to the extra lookup in the table for each opcode that uses a qstr (this is a hit for both .mpy and .py).
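As a rough illustration of the translation-table idea, here is a toy model in Python rather than the C of the real runtime. `Runtime`, `intern`, and `link_block` are hypothetical names invented for this sketch, not MicroPython APIs, and the qstr pool is reduced to a simple list.

```python
class Runtime:
    """Toy stand-in for the VM's interned-string (qstr) pool."""

    def __init__(self):
        self.strings = []   # qstr id -> string
        self.ids = {}       # string -> qstr id

    def intern(self, s):
        # Return the existing id, or allocate the next free one.
        if s not in self.ids:
            self.ids[s] = len(self.strings)
            self.strings.append(s)
        return self.ids[s]


def link_block(runtime, file_qstrs):
    """Build the per-block table: entry i is the runtime qstr id of the
    i-th string named in the .mpy file. Only this table needs RAM."""
    return [runtime.intern(s) for s in file_qstrs]


runtime = Runtime()
for name in ("print", "len"):   # qstrs the runtime already knows
    runtime.intern(name)

# Strings referenced by one saved bytecode block, in file order.
file_qstrs = ["foo", "len", "bar"]
table = link_block(runtime, file_qstrs)

# Read-only bytecode refers to file-local index 1; the table maps it to
# whatever id the runtime assigned to "len".
resolved = runtime.strings[table[1]]
```

The bytecode itself is never touched here; the only per-block allocation is `table`, which is the RAM cost described above.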

There would be a way to retain performance and low RAM usage of existing REPL/.py files, and also have .mpy in ROM: for each existing opcode which needs a qstr add a new opcode that gets that qstr via the translation table. The following opcodes would need to be added (because they take a qstr as an argument):

MP_BC_LOAD_CONST_STRING_VIA_TABLE
MP_BC_LOAD_METHOD_VIA_TABLE
MP_BC_LOAD_SUPER_METHOD_VIA_TABLE
MP_BC_LOAD_NAME_VIA_TABLE
MP_BC_LOAD_GLOBAL_VIA_TABLE
MP_BC_LOAD_ATTR_VIA_TABLE
MP_BC_STORE_NAME_VIA_TABLE
MP_BC_STORE_GLOBAL_VIA_TABLE
MP_BC_STORE_ATTR_VIA_TABLE
MP_BC_DELETE_NAME_VIA_TABLE
MP_BC_DELETE_GLOBAL_VIA_TABLE
MP_BC_IMPORT_NAME_VIA_TABLE
MP_BC_IMPORT_FROM_VIA_TABLE
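To show how one such opcode pair might behave, the following Python sketch models a single direct opcode and its table-based twin. The opcode numbers, the `(opcode, operand)` encoding, and the `run` function are assumptions made for illustration and do not reflect the real bytecode format.

```python
LOAD_NAME = 0            # operand is already a runtime qstr id
LOAD_NAME_VIA_TABLE = 1  # operand is an index into the block's qstr table


def run(bytecode, qstr_pool, namespace, qstr_table=None):
    """Execute (opcode, operand) pairs and return the value stack."""
    stack = []
    for opcode, operand in bytecode:
        if opcode == LOAD_NAME:
            qstr = operand
        elif opcode == LOAD_NAME_VIA_TABLE:
            qstr = qstr_table[operand]   # the one extra lookup
        else:
            raise ValueError("unknown opcode %r" % (opcode,))
        stack.append(namespace[qstr_pool[qstr]])
    return stack


qstr_pool = ["x", "y"]
namespace = {"x": 10, "y": 20}

# Code compiled from the REPL or a .py file: direct opcode, no table needed.
ram_result = run([(LOAD_NAME, 1)], qstr_pool, namespace)

# Code left in ROM: table-based opcode plus a one-entry table held in RAM.
rom_result = run([(LOAD_NAME_VIA_TABLE, 0)], qstr_pool, namespace, qstr_table=[1])
```

Both calls load the same name; only the second pays for the indirection, which is why existing REPL/.py code would keep its current speed and RAM use.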

adritium · 2018-09-12T02:14:53Z

@dpgeorge thanks for your reply!

adritium · 2018-09-12T18:12:21Z

@dpgeorge since RAM is a constant complaint of platforms micropython runs on, this sounds like a no-brainer.

Right?

adritium · 2018-09-12T18:15:27Z

> The following opcodes would need to be added (because they take a qstr as an argument):

For my understanding: would those opcodes exist only in the .mpy?

dpgeorge · 2018-09-13T01:25:01Z

> since RAM is a constant complaint of platforms micropython runs on, this sounds like a no-brainer.
> Right?

Not really. As I said above, implementing this feature would increase RAM usage for all code that is not in a .mpy file due to the additional qstr translation table (also for code that is in a .mpy but can't be executed from where it is stored, eg an SD card).

The other big issue to solve would be making sure that .mpy's live in a location that is 1) memory mapped to the CPU; 2) contiguous. For the stm32 port this means storing in internal flash, or external memory-mapped QSPI flash, and using a new filesystem that can store files contiguously. For esp8266 it means you can only use the flash below 1MiB for this kind of storage (because that's the only region that is memory mapped). Other systems will have similar constraints.

> For my understanding: would those opcodes exist only in the .mpy?

Only the .mpy will use these opcodes, but the VM still needs to implement them which means a moderate increase in code/firmware size. A way to improve this (reduce VM code size) would be to modify existing opcodes that have a qstr argument, so that when they decode the qstr value from the bytecode they check if it needs translation (eg if the high bit of the qstr is set then it is a relative qstr and needs to be translated using the qstr table). This approach would also allow mp_raw_code_load() to pick a strategy when loading a .mpy file: 1) if the bytecode is not memory mapped or contiguous and must be loaded into RAM anyway then it can translate the qstrs as it loads the bytecode [this is how it already works]; 2) if the bytecode is memory mapped and contiguous then it leaves it in ROM, takes a pointer to this ROM, and creates a qstr translation table in RAM, allowing the VM to translate them on the fly when it executes the code.
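The high-bit variant and the two loading strategies could be modelled roughly as follows. This is a Python sketch under stated assumptions: the 16-bit operand width, the `RELATIVE_FLAG` value, and the `decode_qstr` and `load_block` helpers are all hypothetical and are not taken from `mp_raw_code_load()`.

```python
RELATIVE_FLAG = 0x8000   # assumed: top bit of a 16-bit qstr operand


def decode_qstr(raw, qstr_table):
    """What every qstr-taking opcode would do when reading its operand."""
    if raw & RELATIVE_FLAG:
        return qstr_table[raw & ~RELATIVE_FLAG]   # relative: translate
    return raw                                    # absolute: use as-is


def load_block(raw_operands, file_qstrs, intern, memory_mapped_and_contiguous):
    """Return (operands, table) according to where the bytecode lives."""
    table = [intern(s) for s in file_qstrs]
    if memory_mapped_and_contiguous:
        # Strategy 2: leave the bytecode where it is, keep the table in RAM.
        return raw_operands, table
    # Strategy 1: copy into RAM and translate while loading; no table is kept.
    return [decode_qstr(r, table) for r in raw_operands], None


pool = {"print": 0}


def intern(s):
    # Reuse an existing id or hand out the next one.
    return pool.setdefault(s, len(pool))


file_qstrs = ["foo", "print"]
rom_ops = (RELATIVE_FLAG | 0, RELATIVE_FLAG | 1)   # tuple: stands in for ROM

ops_ram, table_ram = load_block(rom_ops, file_qstrs, intern, False)
ops_rom, table_rom = load_block(rom_ops, file_qstrs, intern, True)

seen_from_ram = [decode_qstr(r, table_ram) for r in ops_ram]
seen_from_rom = [decode_qstr(r, table_rom) for r in ops_rom]
```

In this model the VM runs the same decode step in both cases, so no extra opcodes are needed; the cost moves to a flag test on every qstr operand.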

tve · 2020-05-05T07:42:47Z

I'm wondering whether there is a different solution, which may be easier to implement but perhaps less flexible.

Assume that there is a designated area in flash for execute-from-flash ("EFF") modules: modules must be written into that area explicitly similar to the way one would write a module into the filesystem. (Maybe the area could be mounted into the filesystem namespace...)

When a new EFF module is written, its qstrs are resolved only against qstr tables that are in flash, i.e. against the constant table in the firmware and against previous EFF modules. New qstrs are written to a table in flash that is not immediately part of the qstr table chain. A reset is required to make the new qstrs part of the chain, so the newly written EFF module can only be used after a reset, which initializes the qstr chain to include the new table (or new entries). RAM-allocated qstrs always come on top of all these ROM/flash qstrs.

A natural result of this is that EFF modules form a stack. Only the top-most module can be removed (at a time) and it can really only be marked for removal to be erased on the next reset because qstr table entries used by code resident in RAM may point into it. (I believe this is the case with the qstr translation table as well unless all the strings themselves are copied to RAM.)

All this would lead to a stack model where EFF modules are pushed onto the stack, a reset is necessary before they can be used, and they can be popped off the stack again but only erased after another reset.
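The push/reset/pop life cycle described above can be simulated in a few lines of Python. This is only a sketch of the proposal: `EffArea` and its methods are hypothetical, flash is modelled as plain lists, and the rule that a second pop needs a reset first is one reading of "only the top-most module can be removed (at a time)".

```python
class EffArea:
    """Toy model of a flash area holding execute-from-flash modules."""

    def __init__(self, firmware_qstrs):
        self.firmware_qstrs = list(firmware_qstrs)
        self.modules = []                    # stack, oldest module first
        self.chain = list(firmware_qstrs)    # qstrs visible to the running VM

    def push(self, name, strings):
        # Resolve only against qstrs already in flash; the rest become a
        # new table that is not yet linked into the chain.
        known = set(self.firmware_qstrs)
        for module in self.modules:
            known.update(module["new_qstrs"])
        new_qstrs = []
        for s in strings:
            if s not in known:
                known.add(s)
                new_qstrs.append(s)
        self.modules.append({"name": name, "new_qstrs": new_qstrs,
                             "active": False, "doomed": False})

    def pop(self):
        # Only mark the top module; RAM code may still point at its qstrs.
        top = self.modules[-1]
        if top["doomed"]:
            raise RuntimeError("top module already marked; reset first")
        top["doomed"] = True

    def reset(self):
        # Erase marked modules, then rebuild the chain from what remains.
        self.modules = [m for m in self.modules if not m["doomed"]]
        self.chain = list(self.firmware_qstrs)
        for m in self.modules:
            m["active"] = True
            self.chain.extend(m["new_qstrs"])

    def usable(self, name):
        return any(m["name"] == name and m["active"] for m in self.modules)


area = EffArea(["print", "len"])
area.push("a", ["len", "foo"])
usable_before_reset = area.usable("a")
area.reset()
usable_after_reset = area.usable("a")

area.push("b", ["foo", "bar"])
area.reset()
chain_with_b = list(area.chain)
area.pop()
chain_while_marked = list(area.chain)   # unchanged until the next reset
area.reset()
chain_after_erase = list(area.chain)
```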

I believe that the translation table approach has the advantage that a newly written module can immediately be executed, but I believe it shares the same limitations when it comes to removal, ~~including the stack property~~ (edit: probably not, too late to think more).

dpgeorge · 2022-03-30T06:10:07Z

Static bytecode/.mpy files with a qstr indirection table were implemented in f2040bf.

adritium mentioned this issue Sep 14, 2018

Introduce abstraction (like pointer indirection) for memory read/write? #4140

Closed

gpshead mentioned this issue Jan 27, 2019

Save RAM: emit code directly to flash/filesystem/readonly-memory #4073

Closed

tannewt added a commit to tannewt/circuitpython that referenced this issue Feb 9, 2021

Merge pull request micropython#4124 from m4tk/main

6efd87b

Add display init code for Lilygo TTGO T8 ESP32-S2

laurensvalk mentioned this issue Jul 31, 2021

[Feature] Make builtin main importable pybricks/support#408

Closed

dpgeorge closed this as completed Mar 30, 2022

massimosala mentioned this issue May 30, 2023

Add VfsMap filesystem, mpremote deploy-mapfs, and ability to import .mpy files from ROM (WIP) #8381

Open
