py/persistentcode: Full set of optimisations to reduce mpy file size #4564

dpgeorge · 2019-03-01T04:07:33Z

This PR combines and improves upon previous PRs to reduce size of mpy files: #4473 (static qstrs), #4549 (qstr window).

It also adds a further optimisation (second commit) to pack the qstr data directly into the bytecode instead of writing it at the end of the bytecode. This allows to shrink the mpy file by about 2 bytes for every opcode that has a qstr argument.

Eg, for the sequence (LOAD_GLOBAL 'print', CALL_FUNCTION 0), instead of encoding it as 0x1c 0x00 0x00 0x64 0x00 0x05 p r i n t it's now encoded as 0x1c 0x05 p r i n t 0x64 0x00 (the 0x05 is the length of the string "print").

The mpy saver can write such a stream in one pass without needing any additional temporary memory, and same goes for the mpy loader.

Size reduction is about 25% for the qstr window, then a further 10% for packed qstrs in bytecode, then a further 5% with the addition of a static qstr set.

On a bunch of 19 test py scripts (including uasyncio, upip, lcd160cr), the sum of all 19 mpy file sizes went from 60064 bytes before the patches here down to 39185 bytes.

pfalcon · 2019-03-02T17:29:52Z

Unexpected twist. One may only wonder why it wasn't done like that from the beginning, and look for premonitions like "but before we could read the whole bytecode efficiently with one read() call", or "but before we could mmap() bytecode".

Skipping that, one can point that there's still no support for specifying the exact window size (in ones, not in powers of two).

dpgeorge · 2019-03-04T13:30:10Z

"but before we could read the whole bytecode efficiently with one read() call", or "but before we could mmap() bytecode".

Well, everything is a compromise / trade off. Efficiency concerns with multiple read() calls can be mitigated with a buffer (and vfs_reader already has one). Not being able to mmap a .mpy file is a valid concern, but I don't think it's an important one: since the data has to anyway be modified the mmap needs to be MAP_PRIVATE and it's really just a potential speed optimisation to load it, not one to reduce memory usage (and on bare-metal mmap doesn't really exist, and if it does then it must anyway allocate RAM, it can't map ROM).

I tested the speed of importing lcd160cr.mpy (a large file) before and after the set of patches here, and got:

        before  after
PYBv1.0  40 ms   20 ms
esp8266 226 ms  161 ms

So the import speed is reduced by a lot, most likely due to less handling of qstrs.

Skipping that, one can point that there's still no support for specifying the exact window size (in ones, not in powers of two).

The reason for a power-of-two is to be able to use & instead of % in the window computation. But if you really think it's an important option to have...

pfalcon · 2019-03-04T13:59:39Z

But if you really think it's an important option to have...

Well, that was a "feature request" from me from the start, and given all the other refactors made, hopefully this one can fit in too.

The reason for a power-of-two is to be able to use & instead of % in the window computation.

Otherwise it's just an if to wrap the index.

Thanks for other repsonses and figures - I agree with all of them.

dpgeorge · 2019-03-04T14:36:25Z

Well, that was a "feature request" from me from the start, and given all the other refactors made, hopefully this one can fit in too.

Ok, I pushed a commit implementing this. It needs further testing though (and squashing).

This is an implementation of a sliding qstr window used to reduce the number of qstrs stored in a .mpy file. The window size is configured to 32 entries which takes a fixed 64 bytes (16-bits each) on the C stack when loading/saving a .mpy file. It allows to remember the most recent 32 qstrs so they don't need to be stored again in the .mpy file. The qstr window uses a simple least-recently-used mechanism to discard the least recently used qstr when the window overflows (similar to dictionary compression). This scheme only needs a single pass to save/load the .mpy file. Reduces mpy file size by about 25% with a window size of 32.

Instead of emitting two bytes in the bytecode for where the linked qstr should be written to, it is now replaced by the actual qstr data, or a reference into the qstr window. Reduces mpy file size by about 10%.

When encoded in the mpy file, if qstr <= QSTR_LAST_STATIC then store two bytes: 0, static_qstr_id. Otherwise encode the qstr as usual (either with string data or a reference into the qstr window). Reduces mpy file size by about 5%.

dpgeorge · 2019-03-05T05:49:01Z

Merged in 5996eeb through 62483bb

pfalcon · 2019-03-05T06:39:16Z

Thanks!

[build] simplify makeqstrdata heuristic

dpgeorge mentioned this pull request Mar 1, 2019

py/persistentcode: Add a qstr window to save mpy files more efficiently. #4549

Closed

dpgeorge added 4 commits March 5, 2019 16:25

py/persistentcode: Pack qstrs directly in bytecode to reduce mpy size.

992a6e1

Instead of emitting two bytes in the bytecode for where the linked qstr should be written to, it is now replaced by the actual qstr data, or a reference into the qstr window. Reduces mpy file size by about 10%.

minimal/frozentest: Recompile now that mpy format changed.

62483bb

dpgeorge force-pushed the py-mpy-size-reduction branch from 6d1df83 to 62483bb Compare March 5, 2019 05:33

dpgeorge closed this Mar 5, 2019

dpgeorge deleted the py-mpy-size-reduction branch March 5, 2019 05:49

dpgeorge merged commit 62483bb into micropython:master Mar 5, 2019

tannewt added a commit to tannewt/circuitpython that referenced this pull request Apr 21, 2021

Merge pull request micropython#4564 from tyomitch/patch-1

e54e5e3

[build] simplify makeqstrdata heuristic

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

py/persistentcode: Full set of optimisations to reduce mpy file size #4564

py/persistentcode: Full set of optimisations to reduce mpy file size #4564

dpgeorge commented Mar 1, 2019

pfalcon commented Mar 2, 2019

dpgeorge commented Mar 4, 2019

pfalcon commented Mar 4, 2019

dpgeorge commented Mar 4, 2019

dpgeorge commented Mar 5, 2019

pfalcon commented Mar 5, 2019

py/persistentcode: Full set of optimisations to reduce mpy file size #4564

py/persistentcode: Full set of optimisations to reduce mpy file size #4564

Conversation

dpgeorge commented Mar 1, 2019

pfalcon commented Mar 2, 2019

dpgeorge commented Mar 4, 2019

pfalcon commented Mar 4, 2019

dpgeorge commented Mar 4, 2019

dpgeorge commented Mar 5, 2019

pfalcon commented Mar 5, 2019