iprintf/small_printf opts using Emscripten OS triple in LLVM backend #8348

kripken · 2019-03-26T23:37:27Z

This uses LLVM with @sunfishcode's printf-optimizing patch plus some local patches to define an Emscripten OS in clang, so that we get the printf opt (note the triple change). This PR then has some hackery to split out iprintf in musl, that is, it adds a printf variant with no float support.

This mostly works - printf with only integers leads to 7K less code. Code using floats is a tiny bit bigger, 50 bytes. Code using both float and int printf is an additional 70 bytes - so many programs may be 120 bytes or so larger. I don't see an obvious way to avoid that without adding magic elsewhere in the toolchain.
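To make "printf with only integers" concrete, here is a minimal sketch of the distinction. It is an illustration only: `format_uses_float` is a hypothetical helper, not something in musl, LLVM, or this PR, and it inspects the format string purely as a stand-in. How the LLVM patch actually decides is not described in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper (not part of musl or LLVM): reports whether a
   printf-style format string contains a floating-point conversion. */
bool format_uses_float(const char *fmt) {
  assert(fmt != NULL);
  for (const char *p = fmt; *p; p++) {
    if (*p != '%') continue;
    p++;
    if (*p == '%') continue;             /* literal "%%" */
    /* Skip flags, width, precision and length modifiers. */
    while (*p && strchr("-+ #0123456789.*hlLjzt", *p)) p++;
    if (*p == '\0') return false;        /* truncated specifier */
    if (strchr("aAeEfFgG", *p)) return true;
  }
  return false;
}
```

A call such as `printf("%d items\n", n)` would fall in the integer-only group, while `printf("%5.2f", x)` would need the float-capable variant.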

I worry about optimizing __small_printf, that is, the version of printf with double but not long double, which is not done in this PR. That seems even trickier, as the key fmt_fp method in musl is long double specific, and it seems we'd need to duplicate and modify it. That's not easy because the code is tricky, and it's not obvious to me that we can do it without a code size cost if both versions end up linked in. But I didn't look into it very carefully - perhaps @sunfishcode has a good idea?

If there isn't a good option here, then overall I lean towards optimizing iprintf, despite the musl hackery, and fixing the long double issue by removing float128 as we've been considering, so that we don't need the __small_printf stuff.

kripken · 2019-03-27T23:43:25Z

Doing some experimentation, iprintf hello world is 14720 bytes. Adding in long double makes it 22040. If long doubles are just doubles, that shrinks to 17468.
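Spelling out the arithmetic on those three measurements, as a small self-checking sketch. The constant names are made up for this note; only the byte counts come from the numbers above.

```c
#include <assert.h>

/* Byte counts copied from the measurements above. */
enum {
  SIZE_IPRINTF     = 14720,  /* integer-only hello world */
  SIZE_LONG_DOUBLE = 22040,  /* with long double support added */
  SIZE_DOUBLE_ONLY = 17468   /* long doubles treated as doubles */
};

/* Derived differences, in bytes. */
enum {
  COST_LONG_DOUBLE = SIZE_LONG_DOUBLE - SIZE_IPRINTF,     /* 7320 */
  COST_DOUBLE_ONLY = SIZE_DOUBLE_ONLY - SIZE_IPRINTF,     /* 2748 */
  SAVED_BY_DOUBLES = SIZE_LONG_DOUBLE - SIZE_DOUBLE_ONLY  /* 4572 */
};

/* The two float costs differ by exactly the saving. */
static_assert(COST_LONG_DOUBLE == COST_DOUBLE_ONLY + SAVED_BY_DOUBLES,
              "differences must add up");
```

So float support costs roughly 7.3K over iprintf with long doubles, versus roughly 2.7K if long doubles are just doubles.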

So long doubles are the bigger deal here, and I lean towards replacing them with doubles, if we don't find a better solution.

kripken · 2019-04-02T17:54:55Z

Ok, this is now ready for review. For landing, we need to wait for https://reviews.llvm.org/D57620 and then to land our own triple. (Tests will fail for the wasm backend path until then.)

Summary for the squashed commit:

Use the new emscripten triple in LLVM. This lets us avoid needing to define __EMSCRIPTEN__ ourselves, and enables optimizations for iprintf (integer printf), __small_printf (printf with floats but not long double), and friends.
Refactor/hack musl to have compact versions of those functions. To simplify things and avoid the duplicate code problem of having both printf and iprintf linked in (from separate object files) this uses an indirect call for fmt_fp (the key method that formats a floating point number) and pop_arg_long_double (a new method that pops a long double from the arguments). We then pass around either NULL or function pointers to the real implementation, e.g. iprintf sends NULL for both. This means an extra indirect call on them when they are actually used, but this is not too bad since fmt_fp is fairly heavy anyhow - adding an indirect call or two on top is probably not a huge deal.
This also makes printf only use doubles for actual formatting. That is, it casts a float128 to a float64 for printing. This makes things simpler (no need to create two fmt_fps) and avoids having duplicate code if both are linked in. This also keeps us consistent with how fastcomp prints currently, so this is not a regression. If users ask for full float128 printing, we can add it later (it has never come up so far).
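A rough sketch of the function-pointer arrangement described above, assuming a toy formatter that only understands `%d` and `%f`. All names here (`fmt_fp_fn`, `real_fmt_fp`, `core_vformat`, `sketch_isnprintf`, `sketch_snprintf`) are hypothetical, and the real musl code is structured differently. Returning -1 when a float conversion reaches the integer-only variant is a choice made for the sketch, not necessarily what the PR does.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback type: formats one double into buf. */
typedef int (*fmt_fp_fn)(char *buf, size_t size, double value);

/* The "heavy" float formatter; only the full variant refers to it. */
static int real_fmt_fp(char *buf, size_t size, double value) {
  return snprintf(buf, size, "%f", value);
}

/* Shared core. fmt_fp is NULL for the integer-only variant. */
static int core_vformat(char *out, size_t size, const char *fmt,
                        va_list ap, fmt_fp_fn fmt_fp) {
  size_t used = 0;
  assert(out != NULL && fmt != NULL);
  if (size == 0) return -1;
  for (const char *p = fmt; *p; p++) {
    char piece[64];
    int len;
    if (*p == '%' && p[1] == 'd') {
      len = snprintf(piece, sizeof piece, "%d", va_arg(ap, int));
      p++;
    } else if (*p == '%' && p[1] == 'f') {
      if (!fmt_fp) return -1;  /* no float support in this variant */
      len = fmt_fp(piece, sizeof piece, va_arg(ap, double));  /* indirect call */
      p++;
    } else {
      piece[0] = *p;           /* ordinary character, copied as-is */
      len = 1;
    }
    if (len < 0 || (size_t)len >= sizeof piece ||
        used + (size_t)len >= size) {
      return -1;               /* piece or output buffer too small */
    }
    memcpy(out + used, piece, (size_t)len);
    used += (size_t)len;
  }
  out[used] = '\0';
  return (int)used;
}

/* Integer-only entry point: passes NULL, so it never names real_fmt_fp. */
int sketch_isnprintf(char *out, size_t size, const char *fmt, ...) {
  va_list ap;
  va_start(ap, fmt);
  int r = core_vformat(out, size, fmt, ap, NULL);
  va_end(ap);
  return r;
}

/* Full entry point: passes the real float formatter. */
int sketch_snprintf(char *out, size_t size, const char *fmt, ...) {
  va_list ap;
  va_start(ap, fmt);
  int r = core_vformat(out, size, fmt, ap, real_fmt_fp);
  va_end(ap);
  return r;
}
```

The point of the shape is that the core exists once, and only the full entry point references the float formatter. When the entry points live in separate object files, a program that uses only the integer one should not need the float formatter linked in.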

Code size wise, iprintf (no float support) is around 3K smaller, and small (just double, no long double) is around 750 bytes smaller. Full printf is larger than small because it has the code to cast a float128 to a double. Note that full printf now does not have long double printing at full precision, so we save 4K or so there.

jgravelle-google · 2019-04-02T18:17:05Z

> This also makes printf only use doubles for actual formatting. That is, it casts a float128 to a float64 for printing.

I remember one of the arguments for making long double be float128 is so that it would work seamlessly with library code i.e. libc i.e. printf. And I remember my argument against being something along the lines of, if the way to get reasonable codesize is to downcast to 64 bits, then we don't actually get any of that benefit, and just have the additional complexity of working around it.

Not complaining about this implementation, that seems like the most reasonable thing given the constraints. Rather I'm complaining about the constraints, i.e. long double being float128.

kripken · 2019-04-02T20:01:39Z

> if the way to get reasonable codesize is to downcast to 64 bits, then we don't actually get any of that benefit, and just have the additional complexity of working around it.

Yeah, I think that's fair. We are doing extra work here just to keep it an option to interoperate with wasi, basically. It's an imperfect tradeoff for sure.

dschuff · 2019-04-02T23:43:28Z

tests/test_other.py

+          }
+        ''' % code)
+      run_process([PYTHON, EMCC, 'src.c', '-O1'])
+      return os.path.getsize('a.out.wasm')


Do we need to delete src.c at the end?

No, when the test ends the temp dir it ran in will be cleaned up.

kripken · 2019-04-03T01:12:30Z

LLVM side landed. This can land as soon as we have a new lkgr (the LLVM change should not break the bots, since we don't actually use the new OS triple until this PR).

sbc100 · 2019-04-03T01:15:02Z

tests/runner.py

@@ -135,6 +138,12 @@ def decorated(f):
  return decorated


+def only_wasm_backend(note=''):


How about no_fastcomp to match the existing no_wasm_backend?

Sounds good, changing to no_fastcomp.

sbc100 · 2019-04-03T01:17:35Z

tests/test_core.py

-    src = open(path_from_root('tests', 'core', 'test_strtold.c')).read()
-    expected = open(path_from_root('tests', 'core', expected_file)).read()
-    self.do_run(src, expected)
+    self.do_run_in_out_file_test('tests', 'core', 'test_strtold')


Why this change? Did fastcomp get fixed? Maybe a separate change?

This changed because the wasm backend used to print float128s with full precision, and after this PR, with float64 precision. There is no change to fastcomp - the wasm backend now emits the same as fastcomp here (so no more need to have two options for different outputs in this test).
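To illustrate what printing with float64 precision means here, a minimal sketch that narrows a long double to double before formatting. The function name is hypothetical and this is not the musl code. How many digits are actually lost depends on how wide long double is on the target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Formats a long double by first narrowing it to double, so only a
   double-precision formatter is needed. Any digits beyond double
   precision are lost in the cast. */
int format_long_double_as_double(char *buf, size_t size,
                                 long double value, int precision) {
  assert(buf != NULL && size > 0);
  double narrowed = (double)value;
  return snprintf(buf, size, "%.*f", precision, narrowed);
}
```

Values that are exactly representable as a double, such as 0.25, print the same either way. Differences only show up when more significant digits are requested than a double carries.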

sbc100 · 2019-04-03T01:19:47Z

tools/shared.py

@@ -952,7 +952,7 @@ def apply_configuration():

 # Target choice.
 ASM_JS_TARGET = 'asmjs-unknown-emscripten'
-WASM_TARGET = 'wasm32-unknown-unknown-wasm'


You can drop the final -wasm here too while you are there.

Cool, removed -wasm.

kripken added 6 commits March 25, 2019 17:21

wip [ci skip]

6086b5b

wip

279cbf1

work [ci skip]

2eb2dcb

wip [ci skip]

947048c

more work [ci skip]

5b9b6c7

test both too [ci skip]

b75efe7

kripken added 7 commits April 1, 2019 17:58

hackery [ci skip]

4cfca99

test [ci skip]

ff2d839

test updates

72a7668

Merge remote-tracking branch 'origin/incoming' into float128

019300b

test updates [ci skip]

b6e5680

cleanup

b7937d6

update symbols

fe2420e

kripken marked this pull request as ready for review April 2, 2019 17:55

kripken requested review from jgravelle-google, sbc100 and dschuff April 2, 2019 17:55

dschuff approved these changes Apr 2, 2019

Merge remote-tracking branch 'origin/incoming' into float128

a5686da

kripken changed the title ~~[for discussion] iprintf etc. float128 exploration~~ iprintf/small_printf opts using Emscripten OS triple in LLVM backend Apr 3, 2019

sbc100 approved these changes Apr 3, 2019

kripken and others added 4 commits April 2, 2019 20:25

only_wasm_backend => no_fastcomp

cf9acde

wasm32-unknown-emscripten

e17ba70

Merge remote-tracking branch 'origin/incoming' into float128

301f65c

Merge remote-tracking branch 'origin/incoming' into float128

7c64dbb

kripken merged commit b29f7dd into incoming Apr 4, 2019

kripken deleted the float128 branch April 4, 2019 00:22

kripken mentioned this pull request Apr 4, 2019

LLVM Backend Performance Measurement #5671

Closed

sbc100 mentioned this pull request May 9, 2019

emcc fails in SDL/SDL_config_minimal.h: typedef redefinition with different types ('unsigned int' vs 'unsigned long') #8569

Closed

VirtualTim added a commit to VirtualTim/emscripten that referenced this pull request May 23, 2019

Revert "iprintf/small_printf opts using Emscripten OS triple in LLVM …

adef080

…backend (emscripten-core#8348)" This reverts commit 9e837aa.

VirtualTim added a commit to VirtualTim/emscripten that referenced this pull request May 23, 2019

Revert "iprintf/small_printf opts using Emscripten OS triple in LLVM …

018b724

…backend (emscripten-core#8348)" This reverts commit 9e837aa.

aheejin added a commit to aheejin/emscripten that referenced this pull request Oct 25, 2022

Fix comment in gen_struct_info.py

e740cce

The optimization option was changed from `-Oz` to `-O0` due to `iprintf` calls in emscripten-core#8348.

aheejin added a commit to aheejin/emscripten that referenced this pull request Oct 25, 2022

Fix comment in gen_struct_info.py

4c5cd7f

The optimization option was changed from `-Oz` to `-O0` due to `iprintf` calls in emscripten-core#8348.

aheejin mentioned this pull request Oct 25, 2022

Fix comment in gen_struct_info.py #18103

Merged

aheejin added a commit that referenced this pull request Oct 26, 2022

Fix comment in gen_struct_info.py (#18103)

857bdf9

The optimization option was changed from `-Oz` to `-O0` due to `iprintf` calls in #8348.
