Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v1.13-126-g05f95682e ESP32 port speed decrease #6565

Closed
mtk11 opened this issue Oct 22, 2020 · 12 comments
Closed

v1.13-126-g05f95682e ESP32 port speed decrease #6565

mtk11 opened this issue Oct 22, 2020 · 12 comments

Comments

@mtk11
Copy link

mtk11 commented Oct 22, 2020

Experiencing speed decrease with the latest ESP32 port.

MicroPython v1.12 on 2019-12-20; ESP32 module with ESP32

Pystone(1.2) time for 500 passes = 312ms
This machine benchmarks at 1602 pystones/second

MicroPython v1.13-126-g05f95682e on 2020-10-22; ESP32 module with ESP32

Pystone(1.2) time for 500 passes = 493ms
This machine benchmarks at 1014 pystones/second

@robert-hh
Copy link
Contributor

robert-hh commented Oct 22, 2020

That is close to the clock frequency ratio of 160 vs 240 MHz. Did you run the tests at the same clock speed?

@mtk11
Copy link
Author

mtk11 commented Oct 22, 2020

Same - 240 MHz. With my custom ESP32 build the situation is even worse:

MicroPython v1.13-117-g6eebdbc49-dirty on 2020-10-22; ESP32 module with ESP32
Type "help()" for more information.

import machine
machine.freq(240000000)
machine.freq()
240000000
import pystone_lowmem
pystone_lowmem.main()
Pystone(1.2) time for 500 passes = 606ms
This machine benchmarks at 825 pystones/second

@dpgeorge
Copy link
Member

Please see previous discussions on this topic: #5040, #5109, #5673

@UnexpectedMaker
Copy link
Contributor

I've read through the other issues..... but going to add to this one as it's the latest.

I'm also seeing dramatic slowdowns in performance between v1.12-257-gbf7ae4d88, 1.13 stable and v1.13-150-gb7883ce74

I have 2x Reflow Master Pro's in front of me, same code, but different MPY versions (pre and post littlefs) and they are giving me crazy differences in performance and I can only think that littlefs is the cause as it's during the boot up and importing of modules it slows down as they all seem to happen during filesystem stuff.

I know these times are "un-scientific" - but good enough for my demonstration of the issue I am facing:

MicroPython v1.12-257-gbf7ae4d88 on 2020-05-07; TinyPICO with ESP32-PICO-D4
Boot time to menu: 8.46s 

MicroPython v1.13 on 2020-09-02; TinyPICO with ESP32-PICO-D4
Boot time to menu: 10.55s

MicroPython v1.13-150-gb7883ce74 on 2020-11-06; TinyPICO with ESP32-PICO-D4
Boot time to menu: 13.73s

When I serialise and save my data when I leave the RMP settings UI, time to save and get back to the menu are:

MicroPython v1.12-257-gbf7ae4d88 on 2020-05-07; TinyPICO with ESP32-PICO-D4
2.2s

MicroPython v1.13 on 2020-09-02; TinyPICO with ESP32-PICO-D4
3.1s

MicroPython v1.13-150-gb7883ce74 on 2020-11-06; TinyPICO with ESP32-PICO-D4
4.2s

Since 1.13 shipped I see MPY has now been pinned to core 0 instead of 1, and I remember something about @dpgeorge mentioning that some extra logging or journaling has been added to littlefs since it's addition.

@dpgeorge
Copy link
Member

dpgeorge commented Nov 6, 2020

@UnexpectedMaker you can switch to FAT FS by reformatting the flash partition. Something like uos.VfsFat.mkfs(esp32.Partition('vfs')). It would be interesting to see the results of your above timing but with FAT FS instead.

@UnexpectedMaker
Copy link
Contributor

@dpgeorge ok, I ran some tests on 1.13 stable and MicroPython v1.13-150-gb7883ce74, each time reformatting with uos.VfsFat.mkfs(esp32.Partition('vfs')) and then copying the same files back on.

Interesting results:

MicroPython v1.13 on 2020-09-02; TinyPICO with ESP32-PICO-D4
Boot time to menu: 8s <- down 3secs

MicroPython v1.13-150-gb7883ce74 on 2020-11-06; TinyPICO with ESP32-PICO-D4
Boot time to menu: 10.8s <- down 3secs

So a significant improvement using vfs over littlefs, and about the same slowdown between 1.13 stable and 1.13 latest nightly as I was seeing with littlefs - so performance losses all around :(

I understand that every IDF update/change that's introduced into MPY could have adverse effects on performance, so the non littlefs performance issues could be IDF related, but 1.13 stable and 1.12 appear to be neck and neck when both are running vfs, so it's something fairly new.

@UnexpectedMaker
Copy link
Contributor

I know you're super busy @dpgeorge but any more thoughts on this?

@dpgeorge
Copy link
Member

The main change to esp32 since v1.13 was indeed pinning MicroPython to core 0. And it does seem that this change gave the performance decrease that you see.

Running the performance benchmarks, comparing v1.13 with latest master:

diff of scores (higher is better)
N=100 M=100                v1.13 -> v1.13-180-g5af3c046c  diff      diff% (error%)
bm_chaos.py                  253.29 ->     204.19 :     -49.10 = -19.385% (+/-0.00%)
bm_fannkuch.py                60.67 ->      80.12 :     +19.45 = +32.059% (+/-0.00%)
bm_fft.py                   2633.39 ->    2707.87 :     +74.48 =  +2.828% (+/-0.00%)
bm_float.py                 4085.76 ->    3368.39 :    -717.37 = -17.558% (+/-0.01%)
bm_hexiom.py                  25.64 ->      24.51 :      -1.13 =  -4.407% (+/-0.00%)
bm_nqueens.py               3244.20 ->    2792.79 :    -451.41 = -13.914% (+/-0.00%)
bm_pidigits.py               586.65 ->     475.01 :    -111.64 = -19.030% (+/-0.01%)
builtin_sum_list.py          696.69 ->     682.20 :     -14.49 =  -2.080% (+/-0.00%)
builtin_sum_range.py         722.39 ->     722.40 :      +0.01 =  +0.001% (+/-0.00%)
misc_aes.py                  438.24 ->     413.36 :     -24.88 =  -5.677% (+/-0.00%)
misc_mandel.py              2859.17 ->    3174.19 :    +315.02 = +11.018% (+/-0.00%)
misc_pystone.py             1731.94 ->    1166.32 :    -565.62 = -32.658% (+/-0.01%)
misc_raytrace.py             264.84 ->     219.24 :     -45.60 = -17.218% (+/-0.00%)

It's not a consistent drop in performance (bm_fannkuch and misc_mandel are up) but it is a pretty broad decrease.

Then if I revert the commit that pinned MP to core 0 I get (comparing v1.13 with master+revert):

diff of scores (higher is better)
N=100 M=100                   v1.13 -> v1.13-180+revert   diff      diff% (error%)
bm_chaos.py                  253.29 ->     256.06 :      +2.77 =  +1.094% (+/-0.00%)
bm_fannkuch.py                60.67 ->      86.82 :     +26.15 = +43.102% (+/-0.01%)
bm_fft.py                   2633.39 ->    3095.36 :    +461.97 = +17.543% (+/-0.00%)
bm_float.py                 4085.76 ->    3745.47 :    -340.29 =  -8.329% (+/-0.30%)
bm_hexiom.py                  25.64 ->      28.01 :      +2.37 =  +9.243% (+/-0.00%)
bm_nqueens.py               3244.20 ->    3542.92 :    +298.72 =  +9.208% (+/-0.00%)
bm_pidigits.py               586.65 ->     550.34 :     -36.31 =  -6.189% (+/-0.00%)
builtin_sum_list.py          696.69 ->     694.96 :      -1.73 =  -0.248% (+/-0.00%)
builtin_sum_range.py         722.39 ->     722.38 :      -0.01 =  -0.001% (+/-0.00%)
misc_aes.py                  438.24 ->     434.58 :      -3.66 =  -0.835% (+/-0.01%)
misc_mandel.py              2859.17 ->    3055.87 :    +196.70 =  +6.880% (+/-0.00%)
misc_pystone.py             1731.94 ->    1793.01 :     +61.07 =  +3.526% (+/-0.01%)
misc_raytrace.py             264.84 ->     264.51 :      -0.33 =  -0.125% (+/-0.00%)

You can see that the performance diff is much smaller, and in fact mostly an improvement over v1.13.

For completeness, this is master compared to master+revert:

diff of scores (higher is better)
N=100 M=100    v1.13-180-g5af3c046c -> v1.13-180+revert   diff      diff% (error%)
bm_chaos.py                  204.19 ->     256.06 :     +51.87 = +25.403% (+/-0.00%)
bm_fannkuch.py                80.12 ->      86.82 :      +6.70 =  +8.362% (+/-0.00%)
bm_fft.py                   2707.87 ->    3095.36 :    +387.49 = +14.310% (+/-0.00%)
bm_float.py                 3368.39 ->    3745.47 :    +377.08 = +11.195% (+/-0.36%)
bm_hexiom.py                  24.51 ->      28.01 :      +3.50 = +14.280% (+/-0.00%)
bm_nqueens.py               2792.79 ->    3542.92 :    +750.13 = +26.860% (+/-0.00%)
bm_pidigits.py               475.01 ->     550.34 :     +75.33 = +15.859% (+/-0.01%)
builtin_sum_list.py          682.20 ->     694.96 :     +12.76 =  +1.870% (+/-0.00%)
builtin_sum_range.py         722.40 ->     722.38 :      -0.02 =  -0.003% (+/-0.00%)
misc_aes.py                  413.36 ->     434.58 :     +21.22 =  +5.134% (+/-0.01%)
misc_mandel.py              3174.19 ->    3055.87 :    -118.32 =  -3.728% (+/-0.00%)
misc_pystone.py             1166.32 ->    1793.01 :    +626.69 = +53.732% (+/-0.01%)
misc_raytrace.py             219.24 ->     264.51 :     +45.27 = +20.649% (+/-0.00%)

So you can clearly see that putting MP back on core 1 makes a big improvement to performance.

@UnexpectedMaker
Copy link
Contributor

Thanks for those benchmarks! Do we know what MPY was pinned to core 0? Was it something to do with a BT requirement?

Did you also have time to look into littlefs to see why it has also slowed everything down?

@dpgeorge
Copy link
Member

Do we know what MPY was pinned to core 0? Was it something to do with a BT requirement?

Yes, BLE does not work reliably with MP on core 1. See 311b851

Did you also have time to look into littlefs to see why it has also slowed everything down?

Littlefs is a reliable filesystem with atomic writes and wear leveling, and these features can lead to situations where file access is noticeably slow. From some testing I have done in the past littlefs is faster than FAT for writes, but slower when reading. When there are lots of imports then that can lead to lots of filesystem stat's (reading the directory listing), which can be slow. One way to improve this is to make sure sys.path only has the required paths in it and nothing more. Alternatively, use FAT.

@mtk11
Copy link
Author

mtk11 commented Jan 28, 2021

It seems the flash size greatly affects the speed - ESP32 port speed deterioration v923z/micropython-ulab#296 .

Kind of inconvenient to remove idf or micropython components. For example, in sdconfig.base setting CONFIG_PPP_SUPPORT=n, produces:

network_ppp.c:52:5: error: unknown type name 'ppp_pcb'
ppp_pcb *pcb;
...

Also, any convenient way to look what the included components size is ?

@dpgeorge
Copy link
Member

This should be fixed by 549448e (pystone now gives a score of about 1800 running at 160MHz).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants