py: convert None/False/True to small immediate values #5429

Merged 4 commits on Jan 12, 2020


dpgeorge
Member

Based on discussion in #5314, with full credit to @Jongy for the idea:

mp_const_none is widely used throughout the project
...
In order to make the loading of it easier, we can either make the value a small immediate like MP_OBJ_NULL

This PR does exactly that: it makes None, False and True small immediate values, namely MP_OBJ_NONE, MP_OBJ_FALSE and MP_OBJ_TRUE.

I had to steal some of the qstr encoding space in an object (see the change to py/mpconfig.h) to encode these new values, so that they are distinct from true object pointers. It turned out pretty neat and straightforward. Hardly any places in the code assume anything about the None/False/True objects, so they can easily become immediate values.

There are significant size reductions to ports:

   bare-arm:  -416 -0.628% 
minimal x86:  -308 -0.207% [incl -12(data)]
   unix x64:  -992 -0.199% [incl -256(data)]
      stm32: -1928 -0.507% PYBV10
     cc3200: -1240 -0.670% 
      esp32:  -528 -0.047% GENERIC [incl -32(data)]
        nrf:  -932 -0.638% pca10040
       samd:  -560 -0.549% ADAFRUIT_ITSYBITSY_M4_EXPRESS

Note that I have only made it work for object representation A so far. I have also not run any performance comparisons with these changes.

@Jongy
Contributor

Jongy commented Dec 17, 2019

Very cool! I had the idea earlier of increasing the number of bits used for the object tag in order to add a new tag for those objects, but I thought it was too far to go. Reusing the qstr tags for the price of reducing the bits for qstrs (which anyway don't make any use of the full bits) is a good trade-off.

@Jongy
Contributor

Jongy commented Dec 17, 2019

Btw, I wrote a short blog post about the idea behind #5320, and I mentioned this new PR there (see it here)

@dpgeorge
Member Author

I had the idea earlier of increasing the number of bits used for the object tag in order to add a new tag for those objects

I also thought about this but it'd mean aligning all static/ROM objects to 8 bytes (so the 3 lower bits of the pointer are 0) which would likely lead to code size increases.
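
To illustrate the alignment point, here is a minimal sketch (not code from this PR) that checks whether an address can be stored unmodified when a given number of low bits is reserved for tags. The helper name `pointer_fits_tag_bits` and the example addresses are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical helper: returns true if `addr` has its `tag_bits` lowest bits
// clear, so it could coexist with a tagging scheme that reserves those bits.
static inline bool pointer_fits_tag_bits(uintptr_t addr, unsigned tag_bits) {
    assert(tag_bits < 8 * sizeof(uintptr_t));
    uintptr_t mask = ((uintptr_t)1 << tag_bits) - 1;
    return (addr & mask) == 0;
}
```

With two tag bits, a 4-byte-aligned address such as 0x20000004 passes the check, but with three tag bits it does not; every static/ROM object would then need 8-byte alignment, and the padding is where the extra size would likely come from.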

@dpgeorge force-pushed the py-none-false-true-obj branch 3 times, most recently from 0d843c2 to c03f115 (December 27, 2019 12:41)
@dpgeorge force-pushed the py-none-false-true-obj branch 3 times, most recently from 986a01e to b2e1175 (January 9, 2020 01:10)
@dpgeorge changed the title from "WIP: convert None/False/True to small immediate values" to "py: convert None/False/True to small immediate values" (Jan 12, 2020)

This commit adjusts the definition of qstr encoding in all object
representations by taking a single bit from the qstr space and using it to
distinguish between qstrs and a new kind of literal object: immediate
objects.  In other words, the qstr space is divided in two pieces, one half
for qstrs and the other half for immediate objects.

There is still enough room for qstr values (29 bits in representation A on
a 32-bit architecture, and 19 bits in representation C) and the new
immediate objects can be used for things like None, False and True.
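
The split described above can be sketched as follows for representation A on a 32-bit target. This is an illustrative model only, not the code in py/obj.h: the names (`obj_t`, `obj_new_qstr`, and so on) are hypothetical, and the exact tag values are an assumption consistent with a 29-bit qstr payload.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical model of a 32-bit object word.
typedef uint32_t obj_t;

// Assumed low-bit layout:
//   ...xxx1 : small int
//   ...x010 : qstr (29-bit payload)
//   ...x110 : immediate object (29-bit payload)
//   ...xx00 : pointer to a word-aligned object
#define TAG_MASK      0x7u
#define TAG_QSTR      0x2u
#define TAG_IMMEDIATE 0x6u
#define PAYLOAD_LIMIT (1u << 29)

static inline bool obj_is_small_int(obj_t o) { return (o & 1u) != 0; }
static inline bool obj_is_qstr(obj_t o) { return (o & TAG_MASK) == TAG_QSTR; }
static inline bool obj_is_immediate(obj_t o) { return (o & TAG_MASK) == TAG_IMMEDIATE; }
static inline bool obj_is_pointer(obj_t o) { return (o & 3u) == 0; }

// Encode a qstr index; the payload must fit in 29 bits.
static inline obj_t obj_new_qstr(uint32_t q) {
    assert(q < PAYLOAD_LIMIT);
    return (q << 3) | TAG_QSTR;
}

// Encode an immediate-object index in the other half of the old qstr space.
static inline obj_t obj_new_immediate(uint32_t v) {
    assert(v < PAYLOAD_LIMIT);
    return (v << 3) | TAG_IMMEDIATE;
}

// Recover the payload of a qstr or immediate object.
static inline uint32_t obj_payload(obj_t o) { return o >> 3; }
```

The third bit is what separates qstrs from immediate objects; pointers and small ints are still recognised from the lowest two bits, so their encoding does not change.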

This option (enabled by default for object representation A, B, C) makes
None/False/True objects immediate objects, i.e. they are no longer a concrete
object in ROM but are rather just values, e.g. None=0x6 for representation A.
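
As a rough sketch of what "just values" means for representation A: the macro and function names below are hypothetical, and only None=0x6 comes from the text above. The payloads chosen for False and True are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t obj_t;  // hypothetical model of a 32-bit object word

// Immediate objects carry the tag 0b110 in their low three bits (assumed).
#define NEW_IMMEDIATE(val) ((obj_t)(((val) << 3) | 0x6u))

#define OBJ_NONE  NEW_IMMEDIATE(0u)  // 0x06, the value quoted above
#define OBJ_FALSE NEW_IMMEDIATE(1u)  // 0x0e, assumed payload
#define OBJ_TRUE  NEW_IMMEDIATE(3u)  // 0x1e, assumed payload

// Testing for None is a comparison against a small constant; no address
// of a ROM object has to be loaded first.
static inline bool obj_is_none(obj_t o) { return o == OBJ_NONE; }

// Convert between C bools and the immediate True/False values.
static inline obj_t obj_new_bool(bool b) { return b ? OBJ_TRUE : OBJ_FALSE; }

static inline bool obj_bool_value(obj_t o) {
    assert(o == OBJ_TRUE || o == OBJ_FALSE);
    return o == OBJ_TRUE;
}
```

Because the constants fit in a short immediate operand, a compiler can typically materialise them in one instruction rather than loading a pointer from a literal pool, which is one plausible source of the size savings.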

Doing this saves a considerable amount of code size, due to these objects
being widely used:

   bare-arm:  -392 -0.591%
minimal x86:  -252 -0.170% [incl +52(data)]
   unix x64:  -624 -0.125% [incl -128(data)]
unix nanbox:    +0 +0.000%
      stm32: -1940 -0.510% PYBV10
     cc3200: -1216 -0.659%
    esp8266:  -404 -0.062% GENERIC
      esp32:  -732 -0.064% GENERIC [incl +48(data)]
        nrf:  -988 -0.675% pca10040
       samd:  -564 -0.556% ADAFRUIT_ITSYBITSY_M4_EXPRESS

Thanks go to @Jongy aka Yonatan Goldschmidt for the idea.

This function is called often and with immediate objects enabled it has
more cases, so optimise it for speed.  With this optimisation the runtime
is now slightly faster with immediate objects enabled than with them
disabled.

@dpgeorge merged commit 4005760 into micropython:master (Jan 12, 2020)
@dpgeorge deleted the py-none-false-true-obj branch (January 12, 2020 14:46)
@dpgeorge
Member Author

This has now been merged.

Thanks @Jongy for the great idea!
