mpprint: Correctly format leading zeros with separators. #18092

jepler · 2025-09-17T15:49:44Z

Summary

Correctly format integers with a grouping character and leading zeroes, such as "{:04,d}".format(0x100) -> "0,256".
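For reference, here is a minimal sketch of the target behaviour as standard Python produces it. The first case is the one quoted in the summary. The other two specs are illustrative choices, not taken from the PR's test file.

```python
# Zero padding combined with a grouping character, as formatted by CPython.
# Only the first case comes from the PR summary; the others are illustrative.
dec = "{:04,d}".format(0x100)    # decimal digits, comma every 3 digits
wide = "{:08,d}".format(1234)    # wider field, several zero groups
hexa = "{:06_x}".format(0x100)   # hex digits, underscore every 4 digits

print(dec)   # 0,256
print(wide)  # 0,001,234
print(hexa)  # 0_0100
```

Note that the result can be one character wider than the requested width, since the output may not start with a separator.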

Testing

I added a new test to ensure the implementation matches standard Python for the tested cases.

Trade-offs and Alternatives

I combined three different padding strings into a single string to reduce growth in const data.

The separator format option is already accepted but not supported for floating point numbers. Now, incorrect separator characters would be inserted in the padding positions when formatting an FP number, like +0,000,0003141.150

codecov · 2025-09-17T15:52:52Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.38%. Comparing base (44986b1) to head (068e110).
⚠️ Report is 62 commits behind head on master.

Additional details and impacted files

@@           Coverage Diff           @@
##           master   #18092   +/-   ##
=======================================
  Coverage   98.38%   98.38%           
=======================================
  Files         171      171           
  Lines       22299    22307    +8     
=======================================
+ Hits        21939    21947    +8     
  Misses        360      360


github-actions · 2025-09-17T16:00:53Z

Code size report:

   bare-arm:   +76 +0.134% 
minimal x86:   +57 +0.030% 
   unix x64:   +64 +0.007% standard
      stm32:   +72 +0.018% PYBV10
     mimxrt:   +64 +0.017% TEENSY40
        rp2:   +80 +0.009% RPI_PICO_W
       samd:   +68 +0.025% ADAFRUIT_ITSYBITSY_M4_EXPRESS
  qemu rv32:   +61 +0.013% VIRT_RV32

robert-hh · 2025-09-17T16:23:30Z

py/mpprint.c

+// strings with minimal flash size:
+//     0000000000000000 <- pad_zeros
+//                 0000_000 <- pad_zeros_comma (offset: 12)
+//                      000,00 <- pad_zeros_comma (offset: 17)


A few typos in the comment.
zeros -> zeroes
pad_zeros_comma (offset: 12) -> pad_zeroes_underscore (offset: 12)

Thank you, I think I fixed this now.

(projectwide zeros vs zeroes seems to be inconsistent but I'm happy to be consistent in this file!)

The #defines in the following lines use zeroes. Sorry, my comment was wrong then as well.

robert-hh · 2025-09-17T16:27:22Z

Besides that, it works in my test on a SAMD device. I could have used the UNIX port.

jepler · 2025-09-17T16:44:13Z

fwiw this was actually giving me trouble when I was working on #17688 and wanted to print out the constants in the uctypes module in hex with leading zeros and grouping chars. it's not just a random bug find.

robert-hh · 2025-09-17T16:46:35Z

Having the digits grouped is pretty convenient, so IMHO it's a good change.

AJMansfield · 2025-09-17T18:02:12Z

The separator format option is already accepted but not supported for floating point numbers. Now, incorrect separator characters would be inserted in the padding positions when formatting an FP number, like +0,000,0003141.150

A cpydiff for this would be good!

jepler · 2025-09-17T20:27:01Z

good idea, added.

AJMansfield

A few minor tweaks, but nothing that isn't just a strict formal equivalent.

I've tested this on my Pico2 / RP2350 / Cortex M33 @ 300MHz and can confirm that all relevant tests pass.

AJMansfield · 2025-09-18T15:42:30Z

tests/cpydiff/types_str_formatsep_float.py

+"""
+categories: Types,str
+description: MicroPython accepts but does not properly implement the "," or "_" grouping character for float values
+cause: To reduce code size, MicroPython does not implement this combination. Grouping characters will not appear in the number's significant digits and will appear at incorrect locations in leading leading zeros.


Suggested change

cause: To reduce code size, MicroPython does not implement this combination. Grouping characters will not appear in the number's significant digits and will appear at incorrect locations in leading leading zeros.

cause: To reduce code size, MicroPython does not implement this combination. Grouping characters will not appear in the number's significant digits and will appear at incorrect locations in leading zeros.
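To make the documented difference concrete, the sketch below shows how CPython groups a float's integer digits, with and without zero padding. The spec "+018,.3f" is an assumption chosen for illustration; the thread does not state which spec produced its example output. Per the PR description, MicroPython would instead insert separators at incorrect positions in the padding, like +0,000,0003141.150.

```python
value = 3141.15

# Without zero padding, CPython places the separator inside the significant digits.
plain = format(value, ",.3f")

# Hypothetical spec combining sign, zero padding, width and grouping.
padded = format(value, "+018,.3f")

print(plain)   # 3,141.150
print(padded)  # on CPython the separators stay aligned to groups of three digits
```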

AJMansfield · 2025-09-18T16:26:54Z

py/mpprint.c

+    } else if (fill == '0' && !grouping) {
        pad_chars = pad_zeroes;
-        pad_size = sizeof(pad_zeroes) - 1;
+        pad_size = 16;


Perhaps move these size values into #define constants? Just to keep all the information about these overlapping strings all together in one place.

Suggested change

pad_size = 16;

pad_size = pad_zeroes_size;

AJMansfield · 2025-09-18T16:27:37Z

py/mpprint.c

+    } else if (fill == '0') {
+        if (grouping == '_') {
+            pad_chars = pad_zeroes_underscore;
+            pad_size = 5;


Suggested change

pad_size = 5;

pad_size = pad_zeroes_underscore_size;

AJMansfield · 2025-09-18T16:28:00Z

py/mpprint.c

+            pad_size = 5;
+        } else {
+            pad_chars = pad_zeroes_comma;
+            pad_size = 4;


Suggested change

pad_size = 4;

pad_size = pad_zeroes_comma_size;

AJMansfield · 2025-09-18T16:28:29Z

py/mpprint.c

        pad_chars = pad_spaces;
-        pad_size = sizeof(pad_spaces) - 1;
-    } else if (fill == '0') {
+        pad_size = sizeof(pad_spaces);


Maybe this too, for symmetry?

Suggested change

pad_size = sizeof(pad_spaces);

pad_size = pad_spaces_size;

Plus just put this up with the other size definitions:

#define pad_spaces_size (sizeof(pad_spaces))

AJMansfield · 2025-09-18T16:54:40Z

py/mpprint.c

+#define pad_zeroes       (pad_common + 0)
+#define pad_zeroes_comma (pad_common + 17)
+#define pad_zeroes_underscore (pad_common + 12)


Suggested change

#define pad_zeroes (pad_common + 0)

#define pad_zeroes_comma (pad_common + 17)

#define pad_zeroes_underscore (pad_common + 12)

#define pad_zeroes (pad_common + 0)

#define pad_zeroes_size (16)

#define pad_zeroes_comma (pad_common + 17)

#define pad_zeroes_comma_size (4)

#define pad_zeroes_underscore (pad_common + 12)

#define pad_zeroes_underscore_size (5)

good ideas, done. thanks also for catching the doc mistake.
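As a quick check of the offsets and sizes discussed above, the overlapping layout can be modelled in Python. The string contents, offsets and sizes are copied from the diff and the suggested #defines. The `view` helper is a hypothetical name used only for this sketch. The thread does not explain the two trailing zeroes after the comma, so the sketch leaves them uninterpreted.

```python
# Contents of pad_common[23] as declared in the diff under review.
pad_common = "0" * 16 + "_" + "000" + "," + "00"

# (offset, size) pairs taken from the suggested #define constants.
PAD_ZEROES = (0, 16)
PAD_ZEROES_UNDERSCORE = (12, 5)
PAD_ZEROES_COMMA = (17, 4)

def view(offset_size):
    """Return the slice of pad_common that a (pointer, size) pair would cover."""
    offset, size = offset_size
    return pad_common[offset:offset + size]

print(len(pad_common))              # 23
print(view(PAD_ZEROES))             # 0000000000000000
print(view(PAD_ZEROES_UNDERSCORE))  # 0000_
print(view(PAD_ZEROES_COMMA))       # 000,
```

Each view is exactly one repeat of its padding pattern, which is why a single 23-byte array can serve all three cases.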

AJMansfield · 2025-09-18T17:03:30Z

py/mpprint.c

+static const char pad_spaces[16] = {' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' '};
+static const char pad_common[23] = {'0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '_', '0', '0', '0', ',', '0', '0'};


This would've been a perfect case for array range initializers if we didn't need to target MSVC. 🙃

#define pad_spaces_size (16)
static const char pad_spaces[pad_spaces_size] = { [0 ... pad_spaces_size - 1] = ' ' };
#define pad_common_size (23)
static const char pad_common[pad_common_size] = { [0 ... pad_common_size - 1] = '0', [16] = '_', [20] = ',' };

AJMansfield · 2025-09-18T19:11:03Z

Did a little bit of benchmarking just to make sure I understood how much advantage this code actually gets from having those padding buffers btw, thought I'd share.

Tested formatting all integers 1 to 20000, comparing the performance of the base case padding with exclamation marks vs the optimized case, for different padding lengths, on my Pico2 / RP2350 / Cortex M33 @ 300MHz:

sz	char='!'	char=' '	change
1	0.708	0.708	+0.00%
2	0.708	0.708	+0.00%
3	0.709	0.708	-0.14%
5	0.709	0.709	+0.00%
7	0.719	0.712	-0.97%
10	0.742	0.721	-2.83%
15	0.759	0.714	-5.93%
22	0.840	0.768	-8.57%
33	0.908	0.784	-13.66%
47	1.018	0.815	-19.94%
68	1.178	0.880	-25.30%
100	1.409	0.976	-30.73%
150	1.779	1.108	-37.72%
220	2.318	1.339	-42.23%
330	3.203	1.683	-47.46%
470	4.314	2.147	-50.23%
680	6.299	3.127	-50.36%
1000	9.408	4.664	-50.43%

I also did a test with deleting and simplifying away all of the code from the optimized cases, and then padding with zeroes:

sz	deleted	retained	change
none	0.653	0.686	+5.05%
1	0.676	0.712	+5.33%
2	0.676	0.712	+5.33%
3	0.676	0.712	+5.33%
5	0.676	0.712	+5.33%
7	0.685	0.719	+4.96%
10	0.708	0.728	+2.82%
15	0.722	0.723	+0.14%
22	0.810	0.774	-4.44%
33	0.875	0.793	-9.37%
47	0.979	0.824	-15.83%
68	1.133	0.885	-21.89%
100	1.352	0.981	-27.44%
150	1.706	1.114	-34.70%
220	2.222	1.348	-39.33%
330	3.070	1.692	-44.89%
470	4.135	2.152	-47.96%
680	6.049	3.135	-48.17%
1000	9.053	4.672	-48.39%

So it seems like doing this padding buffer optimization at all incurs about a 5% performance penalty to padding short integers 15 characters or less, but ends up cutting the execution time pretty well in half for long padding lengths.
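The "change" column in both tables appears to be the relative difference between the two timing columns. A small sketch, assuming that definition and using rows copied from the tables above, reproduces the figures:

```python
def percent_change(base, new):
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0

# (label, left column, right column) rows copied from the two tables above.
rows = [
    ("15", 0.759, 0.714),    # first table: char='!' vs char=' '
    ("100", 1.409, 0.976),   # first table
    ("none", 0.653, 0.686),  # second table: deleted vs retained
]

for label, base, new in rows:
    print(f"{label}\t{percent_change(base, new):+.2f}%")
```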

jepler · 2025-09-19T14:49:06Z

Since you're set up for benchmarking, maybe you'd see how using groups of 4 works out. It looks like it would be inexpensive in code size to let any character be padded in groups of 4.

jepler · 2025-09-19T15:09:56Z

text size of build-ADAFRUIT_ITSYBITSY_M4_EXPRESS/py/mpprint.o for various alternatives:

1924       - original version
1997 (+73) - implemented separators
1969 (+45) - space economized version

The space economized version is https://github.com/micropython/micropython/compare/master...jepler:leading-zeroes-alternate?expand=1 and would need to be squashed up. It uses the hard coded patterns for zeroes+grouping and a synthesized 4 byte fill for everything else, getting rid of some of the static array data.

The new padding patterns for commas-and-zeroes and underscores-and-zeroes are smooshed together into the existing pad_zeroes to save space.

Only the two combinations of (decimal + commas) and (other bases + underscores) are properly supported.

Add a test for it.

Closes micropython#18082

Signed-off-by: Jeff Epler <jepler@unpythonic.net>

Signed-off-by: Jeff Epler <jepler@unpythonic.net>

AJMansfield · 2025-09-19T22:58:31Z

Since you're set up for benchmarking, maybe you'd see how using groups of 4 works out. It looks like it would be inexpensive in code size to let any character be padded in groups of 4.

TBH a similar thought occurred to me as well. The real expensive bit of overhead here isn't the actual raw 1-byte-at-a-time data copying (instead of machine-word size transfers etc) --- it's that each print call involves dispatching a function pointer (and a whole lot more extra bookkeeping).

It might be just as much of a speedup --- and possibly a code-size reduction --- to drop the extra .rodata and the conditionals to use it and just fill a new buffer on the stack every time.

(Also another optimization that might be worthwhile is to see if there's a way to get the compiler to speculatively devirtualize some of those calls...)
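The point about per-call overhead can be illustrated by counting how often the output callback is invoked. This is a toy model, not MicroPython code: `CountingSink`, `pad_bytewise` and `pad_chunked` are hypothetical names, and the 16-character chunk mirrors the size of the existing padding arrays.

```python
class CountingSink:
    """Stand-in for a print backend; counts how often it is invoked."""

    def __init__(self):
        self.calls = 0
        self.parts = []

    def write(self, data):
        self.calls += 1
        self.parts.append(data)

    def text(self):
        return "".join(self.parts)


def pad_bytewise(sink, fill, n):
    """Base case: one callback invocation per padding character."""
    for _ in range(n):
        sink.write(fill)


def pad_chunked(sink, fill, n, chunk_size=16):
    """Optimized case: emit up to chunk_size characters per invocation."""
    chunk = fill * chunk_size
    while n > 0:
        take = min(n, chunk_size)
        sink.write(chunk[:take])
        n -= take


slow, fast = CountingSink(), CountingSink()
pad_bytewise(slow, "!", 1000)
pad_chunked(fast, " ", 1000)
print(slow.calls, fast.calls)  # 1000 63
```

The output length is the same in both cases. Only the number of dispatched calls differs, which is consistent with the cost being in the dispatch rather than in the byte copying.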

AJMansfield · 2025-09-21T19:14:45Z

I've spent some time playing around with this and was able to confirm my theory --- using a 20 byte padding buffer on the stack (=lcm(4,5), so it divides neatly into the stride length for both the underscore and comma cases), I'm able to drop about 80ms off the benchmark times for the short-padding cases, while still getting the ~2x speedup over the original "one character at a time" base-case strategy for any padding character on the long-padding cases --- while also dropping 32 bytes off the RP2 build size.
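The reason for choosing 20 can be sketched as follows. This is a Python model under stated assumptions, not the code from the branch: `make_pad_buffer` and `emit_padding` are hypothetical helpers, and the sketch does not handle the rule that output must not begin with a separator.

```python
from math import gcd

STRIDE_COMMA = 4       # "000," repeats every 4 characters
STRIDE_UNDERSCORE = 5  # "0000_" repeats every 5 characters
# Least common multiple of both strides: 20.
PAD_BUF_SIZE = STRIDE_COMMA * STRIDE_UNDERSCORE // gcd(STRIDE_COMMA, STRIDE_UNDERSCORE)


def make_pad_buffer(fill, grouping=None):
    """Fill a fresh buffer on every call, as a stack buffer would be."""
    buf = [fill] * PAD_BUF_SIZE
    if grouping is not None:
        stride = STRIDE_COMMA if grouping == "," else STRIDE_UNDERSCORE
        # A separator ends every group, so whole buffers tile without a seam.
        for i in range(stride - 1, PAD_BUF_SIZE, stride):
            buf[i] = grouping
    return "".join(buf)


def emit_padding(n, buf):
    """Return n padding characters, aligned to the right end of the pattern."""
    partial = n % len(buf)
    pieces = [buf[len(buf) - partial:]] if partial else []
    pieces.extend([buf] * (n // len(buf)))
    return "".join(pieces)


comma_buf = make_pad_buffer("0", ",")
underscore_buf = make_pad_buffer("0", "_")
print(emit_padding(7, comma_buf) + "256")  # 00,000,256
```

Because 20 is a multiple of both 4 and 5, repeating the buffer never breaks a group, and the same buffer size also serves any plain fill character.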

master...AJMansfield:micropython:leading-zeros-alt2

The benchmark results are from my Pico2 RP2350 running in Cortex M33 mode at 300 MHz. (Nice thing about using an embedded processor to benchmark is that you get near enough exact run-to-run repeatability.)

Raw Benchmark Results

master

Current master behavior.
branch: v1.26.0-162-g5284e0980

   text    data     bss     dec     hex filename
 309904       0    5020  314924   4ce2c /home/anson/mpy/micropython/ports/rp2/build-RPI_PICO2_M33/firmware.elf

internal_bench/format:
    0.682s (-00.00%) internal_bench/format-1-int.py
    0.709s (+03.86%) internal_bench/format-2.00-int-space-pad1.py
    0.708s (+03.85%) internal_bench/format-2.01-int-space-pad2.py
    0.708s (+03.83%) internal_bench/format-2.02-int-space-pad3.py
    0.708s (+03.83%) internal_bench/format-2.03-int-space-pad5.py
    0.710s (+04.11%) internal_bench/format-2.04-int-space-pad7.py
    0.720s (+05.53%) internal_bench/format-2.05-int-space-pad10.py
    0.706s (+03.49%) internal_bench/format-2.06-int-space-pad15.py
    0.767s (+12.50%) internal_bench/format-2.07-int-space-pad22.py
    0.784s (+14.87%) internal_bench/format-2.08-int-space-pad33.py
    0.811s (+18.86%) internal_bench/format-2.09-int-space-pad47.py
    0.870s (+27.59%) internal_bench/format-2.10-int-space-pad68.py
    0.959s (+40.63%) internal_bench/format-2.11-int-space-pad100.py
    1.079s (+58.17%) internal_bench/format-2.12-int-space-pad150.py
    1.296s (+90.03%) internal_bench/format-2.13-int-space-pad220.py
    1.616s (+136.90%) internal_bench/format-2.14-int-space-pad330.py
    2.048s (+200.21%) internal_bench/format-2.15-int-space-pad470.py
    2.984s (+337.45%) internal_bench/format-2.16-int-space-pad680.py
    4.452s (+552.57%) internal_bench/format-2.17-int-space-pad1000.py
    0.708s (+03.85%) internal_bench/format-3.00-int-unusual-pad1.py
    0.709s (+03.86%) internal_bench/format-3.01-int-unusual-pad2.py
    0.708s (+03.85%) internal_bench/format-3.02-int-unusual-pad3.py
    0.709s (+03.86%) internal_bench/format-3.03-int-unusual-pad5.py
    0.717s (+05.17%) internal_bench/format-3.04-int-unusual-pad7.py
    0.741s (+08.68%) internal_bench/format-3.05-int-unusual-pad10.py
    0.751s (+10.06%) internal_bench/format-3.06-int-unusual-pad15.py
    0.843s (+23.62%) internal_bench/format-3.07-int-unusual-pad22.py
    0.911s (+33.57%) internal_bench/format-3.08-int-unusual-pad33.py
    1.021s (+49.61%) internal_bench/format-3.09-int-unusual-pad47.py
    1.181s (+73.11%) internal_bench/format-3.10-int-unusual-pad68.py
    1.412s (+106.94%) internal_bench/format-3.11-int-unusual-pad100.py
    1.782s (+161.27%) internal_bench/format-3.12-int-unusual-pad150.py
    2.322s (+240.34%) internal_bench/format-3.13-int-unusual-pad220.py
    3.206s (+370.01%) internal_bench/format-3.14-int-unusual-pad330.py
    4.318s (+532.96%) internal_bench/format-3.15-int-unusual-pad470.py
    6.304s (+824.12%) internal_bench/format-3.16-int-unusual-pad680.py
    9.415s (+1280.15%) internal_bench/format-3.17-int-unusual-pad1000.py
    0.716s (+04.93%) internal_bench/format-4.00-int-group-pad1.py
    0.716s (+04.94%) internal_bench/format-4.01-int-group-pad2.py
    0.716s (+04.97%) internal_bench/format-4.02-int-group-pad3.py
    0.716s (+05.03%) internal_bench/format-4.03-int-group-pad5.py
    0.723s (+05.98%) internal_bench/format-4.04-int-group-pad7.py
    0.733s (+07.42%) internal_bench/format-4.05-int-group-pad10.py
    0.719s (+05.37%) internal_bench/format-4.06-int-group-pad15.py
    0.781s (+14.53%) internal_bench/format-4.07-int-group-pad22.py
    0.800s (+17.28%) internal_bench/format-4.08-int-group-pad33.py
    0.831s (+21.79%) internal_bench/format-4.09-int-group-pad47.py
    0.893s (+30.86%) internal_bench/format-4.10-int-group-pad68.py
    0.989s (+44.97%) internal_bench/format-4.11-int-group-pad100.py
    1.121s (+64.29%) internal_bench/format-4.12-int-group-pad150.py
    1.355s (+98.62%) internal_bench/format-4.13-int-group-pad220.py
    1.700s (+149.15%) internal_bench/format-4.14-int-group-pad330.py
    2.161s (+216.75%) internal_bench/format-4.15-int-group-pad470.py
    3.145s (+361.07%) internal_bench/format-4.16-int-group-pad680.py
    4.659s (+582.97%) internal_bench/format-4.17-int-group-pad1000.py
1 tests performed (55 individual testcases)

leading-zeros

Jepler's original version that implements grouping.
branch: v1.26.0-164-gf17e61759

   text    data     bss     dec     hex filename
 309984       0    5020  315004   4ce7c /home/anson/mpy/micropython/ports/rp2/build-RPI_PICO2_M33/firmware.elf

internal_bench/format:
    0.686s (+00.00%) internal_bench/format-1-int.py
    0.710s (+03.62%) internal_bench/format-2.00-int-space-pad1.py
    0.711s (+03.63%) internal_bench/format-2.01-int-space-pad2.py
    0.711s (+03.62%) internal_bench/format-2.02-int-space-pad3.py
    0.710s (+03.51%) internal_bench/format-2.03-int-space-pad5.py
    0.712s (+03.78%) internal_bench/format-2.04-int-space-pad7.py
    0.721s (+05.20%) internal_bench/format-2.05-int-space-pad10.py
    0.715s (+04.34%) internal_bench/format-2.06-int-space-pad15.py
    0.769s (+12.21%) internal_bench/format-2.07-int-space-pad22.py
    0.786s (+14.61%) internal_bench/format-2.08-int-space-pad33.py
    0.816s (+19.07%) internal_bench/format-2.09-int-space-pad47.py
    0.881s (+28.51%) internal_bench/format-2.10-int-space-pad68.py
    0.977s (+42.49%) internal_bench/format-2.11-int-space-pad100.py
    1.109s (+61.74%) internal_bench/format-2.12-int-space-pad150.py
    1.341s (+95.53%) internal_bench/format-2.13-int-space-pad220.py
    1.685s (+145.73%) internal_bench/format-2.14-int-space-pad330.py
    2.148s (+213.25%) internal_bench/format-2.15-int-space-pad470.py
    3.128s (+356.25%) internal_bench/format-2.16-int-space-pad680.py
    4.665s (+580.40%) internal_bench/format-2.17-int-space-pad1000.py
    0.711s (+03.62%) internal_bench/format-3.00-int-unusual-pad1.py
    0.711s (+03.64%) internal_bench/format-3.01-int-unusual-pad2.py
    0.711s (+03.62%) internal_bench/format-3.02-int-unusual-pad3.py
    0.710s (+03.56%) internal_bench/format-3.03-int-unusual-pad5.py
    0.719s (+04.84%) internal_bench/format-3.04-int-unusual-pad7.py
    0.743s (+08.30%) internal_bench/format-3.05-int-unusual-pad10.py
    0.760s (+10.85%) internal_bench/format-3.06-int-unusual-pad15.py
    0.842s (+22.76%) internal_bench/format-3.07-int-unusual-pad22.py
    0.910s (+32.65%) internal_bench/format-3.08-int-unusual-pad33.py
    1.019s (+48.60%) internal_bench/format-3.09-int-unusual-pad47.py
    1.179s (+71.99%) internal_bench/format-3.10-int-unusual-pad68.py
    1.410s (+105.63%) internal_bench/format-3.11-int-unusual-pad100.py
    1.781s (+159.71%) internal_bench/format-3.12-int-unusual-pad150.py
    2.320s (+238.33%) internal_bench/format-3.13-int-unusual-pad220.py
    3.204s (+367.29%) internal_bench/format-3.14-int-unusual-pad330.py
    4.315s (+529.35%) internal_bench/format-3.15-int-unusual-pad470.py
    6.300s (+818.79%) internal_bench/format-3.16-int-unusual-pad680.py
    9.410s (+1272.32%) internal_bench/format-3.17-int-unusual-pad1000.py
    0.719s (+04.90%) internal_bench/format-4.00-int-group-pad1.py
    0.719s (+04.89%) internal_bench/format-4.01-int-group-pad2.py
    0.719s (+04.90%) internal_bench/format-4.02-int-group-pad3.py
    0.720s (+04.99%) internal_bench/format-4.03-int-group-pad5.py
    0.727s (+06.05%) internal_bench/format-4.04-int-group-pad7.py
    0.739s (+07.81%) internal_bench/format-4.05-int-group-pad10.py
    0.740s (+07.97%) internal_bench/format-4.06-int-group-pad15.py
    0.795s (+16.01%) internal_bench/format-4.07-int-group-pad22.py
    0.824s (+20.12%) internal_bench/format-4.08-int-group-pad33.py
    0.887s (+29.31%) internal_bench/format-4.09-int-group-pad47.py
    0.974s (+42.00%) internal_bench/format-4.10-int-group-pad68.py
    1.092s (+59.22%) internal_bench/format-4.11-int-group-pad100.py
    1.284s (+87.31%) internal_bench/format-4.12-int-group-pad150.py
    1.580s (+130.39%) internal_bench/format-4.13-int-group-pad220.py
    2.076s (+202.72%) internal_bench/format-4.14-int-group-pad330.py
    2.694s (+292.84%) internal_bench/format-4.15-int-group-pad470.py
    3.941s (+474.81%) internal_bench/format-4.16-int-group-pad680.py
    5.977s (+771.71%) internal_bench/format-4.17-int-group-pad1000.py
1 tests performed (55 individual testcases)

leading-zeros-alt2

My new version that uses a fixed-size buffer on the stack that's filled at each call.
branch: v1.26.0-165-g1b42623f9

   text    data     bss     dec     hex filename
 309952       0    5020  314972   4ce5c /home/anson/mpy/micropython/ports/rp2/build-RPI_PICO2_M33/firmware.elf

internal_bench/format:
    0.615s (+00.00%) internal_bench/format-1-int.py
    0.625s (+01.52%) internal_bench/format-2.00-int-space-pad1.py
    0.625s (+01.52%) internal_bench/format-2.01-int-space-pad2.py
    0.625s (+01.53%) internal_bench/format-2.02-int-space-pad3.py
    0.633s (+02.88%) internal_bench/format-2.03-int-space-pad5.py
    0.644s (+04.61%) internal_bench/format-2.04-int-space-pad7.py
    0.653s (+06.19%) internal_bench/format-2.05-int-space-pad10.py
    0.647s (+05.18%) internal_bench/format-2.06-int-space-pad15.py
    0.693s (+12.66%) internal_bench/format-2.07-int-space-pad22.py
    0.713s (+15.84%) internal_bench/format-2.08-int-space-pad33.py
    0.757s (+23.05%) internal_bench/format-2.09-int-space-pad47.py
    0.795s (+29.22%) internal_bench/format-2.10-int-space-pad68.py
    0.881s (+43.27%) internal_bench/format-2.11-int-space-pad100.py
    0.995s (+61.80%) internal_bench/format-2.12-int-space-pad150.py
    1.180s (+91.81%) internal_bench/format-2.13-int-space-pad220.py
    1.487s (+141.75%) internal_bench/format-2.14-int-space-pad330.py
    1.874s (+204.54%) internal_bench/format-2.15-int-space-pad470.py
    2.687s (+336.72%) internal_bench/format-2.16-int-space-pad680.py
    3.994s (+549.14%) internal_bench/format-2.17-int-space-pad1000.py
    0.625s (+01.53%) internal_bench/format-3.00-int-unusual-pad1.py
    0.625s (+01.51%) internal_bench/format-3.01-int-unusual-pad2.py
    0.625s (+01.52%) internal_bench/format-3.02-int-unusual-pad3.py
    0.633s (+02.90%) internal_bench/format-3.03-int-unusual-pad5.py
    0.644s (+04.61%) internal_bench/format-3.04-int-unusual-pad7.py
    0.653s (+06.18%) internal_bench/format-3.05-int-unusual-pad10.py
    0.647s (+05.16%) internal_bench/format-3.06-int-unusual-pad15.py
    0.693s (+12.66%) internal_bench/format-3.07-int-unusual-pad22.py
    0.713s (+15.86%) internal_bench/format-3.08-int-unusual-pad33.py
    0.757s (+23.06%) internal_bench/format-3.09-int-unusual-pad47.py
    0.795s (+29.18%) internal_bench/format-3.10-int-unusual-pad68.py
    0.881s (+43.27%) internal_bench/format-3.11-int-unusual-pad100.py
    0.995s (+61.80%) internal_bench/format-3.12-int-unusual-pad150.py
    1.180s (+91.79%) internal_bench/format-3.13-int-unusual-pad220.py
    1.487s (+141.74%) internal_bench/format-3.14-int-unusual-pad330.py
    1.874s (+204.52%) internal_bench/format-3.15-int-unusual-pad470.py
    2.687s (+336.70%) internal_bench/format-3.16-int-unusual-pad680.py
    3.993s (+549.09%) internal_bench/format-3.17-int-unusual-pad1000.py
    0.632s (+02.74%) internal_bench/format-4.00-int-group-pad1.py
    0.632s (+02.74%) internal_bench/format-4.01-int-group-pad2.py
    0.632s (+02.77%) internal_bench/format-4.02-int-group-pad3.py
    0.633s (+02.94%) internal_bench/format-4.03-int-group-pad5.py
    0.656s (+06.56%) internal_bench/format-4.04-int-group-pad7.py
    0.665s (+08.16%) internal_bench/format-4.05-int-group-pad10.py
    0.659s (+07.11%) internal_bench/format-4.06-int-group-pad15.py
    0.705s (+14.56%) internal_bench/format-4.07-int-group-pad22.py
    0.725s (+17.76%) internal_bench/format-4.08-int-group-pad33.py
    0.769s (+24.99%) internal_bench/format-4.09-int-group-pad47.py
    0.808s (+31.39%) internal_bench/format-4.10-int-group-pad68.py
    0.895s (+45.45%) internal_bench/format-4.11-int-group-pad100.py
    1.007s (+63.73%) internal_bench/format-4.12-int-group-pad150.py
    1.193s (+93.98%) internal_bench/format-4.13-int-group-pad220.py
    1.500s (+143.75%) internal_bench/format-4.14-int-group-pad330.py
    1.885s (+206.44%) internal_bench/format-4.15-int-group-pad470.py
    2.758s (+348.33%) internal_bench/format-4.16-int-group-pad680.py
    4.050s (+558.32%) internal_bench/format-4.17-int-group-pad1000.py
1 tests performed (55 individual testcases)

jepler · 2025-09-22T19:47:30Z

That looks like a real promising alternative, especially if it's smaller than the others.

AJMansfield · 2025-09-26T14:26:40Z

That looks like a real promising alternative, especially if it's smaller than the others.

How to proceed here, then? It feels like there's really two different things here now, and I don't want to steal the grouping feature from you either.

My thought is to PR another version of this that limits its scope just to updating mp_print_strn to use a buffer on the stack, to be evaluated purely on the performance merits without the grouping logic to conflate against it. And then assuming that's accepted, this could then be rebased downstream to add the grouping tests and implement grouping against that new version.

jepler · 2025-09-26T16:05:10Z

If you have a branch that fixes the bug I was trying to address and is better in other respects, I'm not worried about the git Author or Co-Author credit.

This reworks `mp_print_strn` to use a stack-allocated padding buffer rather than special-cased hardcoded ROM strings in order to reduce code size and improve string formatting performance.

Note that this is actually just as performant, even for zeroes and spaces! On my RP2350 Cortex M33 hardware, spaces are about 1% faster for short-padding cases, and 3.4% faster for long-padding cases.

I've done some cursory tests for alternate values of `PAD_BUF_SIZE`, but the results definitely won't generalize to other architectures, and probably not even to other implementations of the same architecture.

The buffer size of 20 is chosen as the smallest size that easily admits a later implementation of micropython#18092 to support padding with grouping characters, to avoid pessimizing the short-padding cases any more than required.

I've also explored alternatives involving using `alloca` for the padding buffer, but the conditionals and fallback logic needed to bound stack usage for the pathological cases end up pessimizing code size beyond what's reasonable for the very marginal additional speed gains.

Signed-off-by: Anson Mansfield <amansfield@mantaro.com>

AJMansfield · 2025-09-26T17:31:35Z

If you have a branch that fixes the bug I was trying to address and is better in other respects, I'm not worried about the git Author or Co-Author credit.

If it was just about a vanity credit I wouldn't be fussed either lol. To me it's far more about preserving the chain of ideas and keeping the development history as easy to follow as possible for the next guy having to dig through a git blame trace to track down some obscure bug.

And either way --- I still think the case for using a buffer on the stack is strong enough to stand on its own, and more easily defended without the whole factorial space of other micro-optimisations that doing it together with the grouping feature adds.

jepler · 2025-09-28T01:08:06Z

I can probably "rebuild" this atop your branch if that's how it ends up happening.

robert-hh reviewed Sep 17, 2025

jepler force-pushed the leading-zeros branch from 8140ef4 to 52bfb7c Compare September 17, 2025 16:09

robert-hh reviewed Sep 17, 2025

jepler force-pushed the leading-zeros branch 2 times, most recently from 83f1b76 to 7bfb342 Compare September 17, 2025 16:26

AJMansfield approved these changes Sep 18, 2025

jepler force-pushed the leading-zeros branch from 569ae3f to 0a78e8e Compare September 19, 2025 15:07

jepler added 2 commits September 19, 2025 10:12

cpydiff: Document unsupported float format with grouping char.

068e110

Signed-off-by: Jeff Epler <jepler@unpythonic.net>

jepler force-pushed the leading-zeros branch from 0a78e8e to 068e110 Compare September 19, 2025 15:12

jepler mentioned this pull request Sep 19, 2025

ci.sh: Fix missing set -e. #18099

Closed

AJMansfield mentioned this pull request Sep 26, 2025

py/mpprint: Use a padding buffer on the stack. #18147

Open

dpgeorge added the py-core Relates to py/ directory in source label Sep 30, 2025
