Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

iprintf/small_printf opts using Emscripten OS triple in LLVM backend #8348

Merged
merged 18 commits into from Apr 4, 2019

Conversation

kripken
Copy link
Member

@kripken kripken commented Mar 26, 2019

This uses LLVM with @sunfishcode 's printf-optimizing-patch plus some local patches to define an Emscripten OS in clang, so that we get the printf opt (note the triple change). This PR then has some hackery to split out iprintf in musl, that is, it adds a printf variant with no float support.

This mostly works - printf with only integers leads to 7K less code. Code using floats is a tiny bit bigger, 50 bytes. Code using both float and int printf is an additional 70 bytes - so many programs may be 120 bytes or so larger. I don't see an obvious way to avoid that without adding magic elsewhere in the toolchain.

I worry about optimizing __printf_small, that is, the version of printf with double but not long double, which is not done in this PR. That seems even trickier as the key fmt_fp method in musl is long double specific, and it seems we'd need to duplicate and modify it. That's not easy because the code is tricky, and it's not obvious to me that we can do it without a code size cost if both versions end up linked in. But I didn't look into it very carefully - perhaps @sunfishcode has a good idea?

If there isn't a good option here then overall I lean towards optimizing iprintf, despite the musl hackery, but that we fix the long double issue by removing float128 as we've been considering, and so don't need the __printf_small stuff.

@kripken
Copy link
Member Author

kripken commented Mar 27, 2019

Doing some experimentation, iprintf hello world is 14720 bytes. Adding in long double makes it 22040. If long doubles are just doubles, that shrinks to 17468.

So long doubles are the bigger deal here, and I lean towards replacing them with doubles, if we don't find a better solution.

@kripken
Copy link
Member Author

kripken commented Apr 2, 2019

Ok, this is now ready for review. For landing, we need to wait for https://reviews.llvm.org/D57620 and then to land our own triple. (Tests will fail for the wasm backend path until then.)

Summary for the squashed commit:

  • Use the new emscripten triple in LLVM. This lets us avoid needing to define __EMSCRIPTEN__ ourselves, and enables optimizations for iprintf (integer printf), __small_printf (printf with floats but not long double), and friends.
  • Refactor/hack musl to have compact versions of those functions. To simplify things and avoid the duplicate code problem of having both printf and iprintf linked in (from separate object files) this uses an indirect call for fmt_fp (the key method that formats a floating point number) and pop_arg_long_double (a new method that pops a long double from the arguments). We then pass around either NULL or function pointers to the real implementation, e.g. iprintf sends NULL for both. This means an extra indirect call on them when they are actually used, but this is not too bad since fmt_fp is fairly heavy anyhow - adding an indirect call or two on top is probably not a huge deal.
  • This also makes printf only use doubles for actual formatting. That is, it casts a float128 to a float64 for printing. This makes things simpler (no need to create two fmt_fps and avoids having duplicate code if both are linked in. This also keeps us consistent with how fastcomp prints currently, so this is not a regression. If users ask for full float128 printing, we can add it later (it has never come up so far).

Code size wise, iprintf (no float support) is around 3K smaller, and small (just double, no long double) is around 750 bytes smaller. Full printf is larger than small because it has the code to cast a float128 to a double. Note that full printf now does not have long double printing at full precision, so we save 4K or so there.

@kripken kripken marked this pull request as ready for review April 2, 2019 17:55
@jgravelle-google
Copy link
Contributor

This also makes printf only use doubles for actual formatting. That is, it casts a float128 to a float64 for printing.

I remember one of the arguments for making long double be float128 is so that it would work seamlessly with library code i.e. libc i.e. printf. And I remember my argument against being something along the lines of, if the way to get reasonable codesize is to downcast to 64 bits, then we don't actually get any of that benefit, and just have the additional complexity of working around it.

Not complaining about this implementation, that seems like the most reasonable thing given the constraints. Rather I'm complaining about the constraints, i.e. long double being float128.

@kripken
Copy link
Member Author

kripken commented Apr 2, 2019

if the way to get reasonable codesize is to downcast to 64 bits, then we don't actually get any of that benefit, and just have the additional complexity of working around it.

Yeah, I think that's fair. We are doing extra work here just to keep it an option to interoperate with wasi, basically. It's an imperfect tradeoff for sure.

}
''' % code)
run_process([PYTHON, EMCC, 'src.c', '-O1'])
return os.path.getsize('a.out.wasm')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to delete src.c at the end?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, when the test ends the temp dir it ran in will be cleaned up.

@kripken kripken changed the title [for discussion] iprintf etc. float128 exploration iprintf/small_printf opts using Emscripten OS triple in LLVM backend Apr 3, 2019
@kripken
Copy link
Member Author

kripken commented Apr 3, 2019

LLVM side landed. This can land as soon as we have a new lkgr (the LLVM change should not break the bots, since we don't actually use the new OS triple until this PR).

tests/runner.py Outdated
@@ -135,6 +138,12 @@ def decorated(f):
return decorated


def only_wasm_backend(note=''):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about no_fastcomp to match the existing no_wasm_backend?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, changing to no_fastcomp.

src = open(path_from_root('tests', 'core', 'test_strtold.c')).read()
expected = open(path_from_root('tests', 'core', expected_file)).read()
self.do_run(src, expected)
self.do_run_in_out_file_test('tests', 'core', 'test_strtold')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change? Did fastcomp get fixed? Maybe a separate change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changed because the wasm backend used to print float128s with full precision, and after this PR, with float64 precision. There is no change to fastcomp - the wasm backend now emits the same as fastcomp here (so no more need to have two options for different outputs in this test).

@@ -952,7 +952,7 @@ def apply_configuration():

# Target choice.
ASM_JS_TARGET = 'asmjs-unknown-emscripten'
WASM_TARGET = 'wasm32-unknown-unknown-wasm'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can drop the final -wasm here too while you are there.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, removed -wasm.

@kripken kripken merged commit b29f7dd into incoming Apr 4, 2019
@kripken kripken deleted the float128 branch April 4, 2019 00:22
VirtualTim pushed a commit to VirtualTim/emscripten that referenced this pull request May 21, 2019
…mscripten-core#8348)

Use the new emscripten triple in LLVM. This lets us avoid needing to define __EMSCRIPTEN__ ourselves, and enables optimizations for iprintf (integer printf), __small_printf (printf with floats but not long double), and friends.

Refactor/hack musl to have compact versions of those functions. To simplify things and avoid the duplicate code problem of having both printf and iprintf linked in (from separate object files) this uses an indirect call for fmt_fp (the key method that formats a floating point number) and pop_arg_long_double (a new method that pops a long double from the arguments). We then pass around either NULL or function pointers to the real implementation, e.g. iprintf sends NULL for both. This means an extra indirect call on them when they are actually used, but this is not too bad since fmt_fp is fairly heavy anyhow - adding an indirect call or two on top is probably not a huge deal.

This also makes printf only use doubles for actual formatting. That is, it casts a float128 to a float64 for printing. This makes things simpler (no need to create two fmt_fps and avoids having duplicate code if both are linked in. This also keeps us consistent with how fastcomp prints currently, so this is not a regression. If users ask for full float128 printing, we can add it later (it has never come up so far).

Code size wise, iprintf (no float support) is around 3K smaller, and small (just double, no long double) is around 750 bytes smaller. Full printf is larger than small because it has the code to cast a float128 to a double. Note that full printf now does not have long double printing at full precision, so we save 4K or so there.
VirtualTim added a commit to VirtualTim/emscripten that referenced this pull request May 23, 2019
VirtualTim added a commit to VirtualTim/emscripten that referenced this pull request May 23, 2019
belraquib pushed a commit to belraquib/emscripten that referenced this pull request Dec 23, 2020
…mscripten-core#8348)

Use the new emscripten triple in LLVM. This lets us avoid needing to define __EMSCRIPTEN__ ourselves, and enables optimizations for iprintf (integer printf), __small_printf (printf with floats but not long double), and friends.

Refactor/hack musl to have compact versions of those functions. To simplify things and avoid the duplicate code problem of having both printf and iprintf linked in (from separate object files) this uses an indirect call for fmt_fp (the key method that formats a floating point number) and pop_arg_long_double (a new method that pops a long double from the arguments). We then pass around either NULL or function pointers to the real implementation, e.g. iprintf sends NULL for both. This means an extra indirect call on them when they are actually used, but this is not too bad since fmt_fp is fairly heavy anyhow - adding an indirect call or two on top is probably not a huge deal.

This also makes printf only use doubles for actual formatting. That is, it casts a float128 to a float64 for printing. This makes things simpler (no need to create two fmt_fps and avoids having duplicate code if both are linked in. This also keeps us consistent with how fastcomp prints currently, so this is not a regression. If users ask for full float128 printing, we can add it later (it has never come up so far).

Code size wise, iprintf (no float support) is around 3K smaller, and small (just double, no long double) is around 750 bytes smaller. Full printf is larger than small because it has the code to cast a float128 to a double. Note that full printf now does not have long double printing at full precision, so we save 4K or so there.
aheejin added a commit to aheejin/emscripten that referenced this pull request Oct 25, 2022
The optimization option was changed from `-Oz` to `-O0` due to `iprintf`
calls in emscripten-core#8348.
aheejin added a commit to aheejin/emscripten that referenced this pull request Oct 25, 2022
The optimization option was changed from `-Oz` to `-O0` due to `iprintf`
calls in emscripten-core#8348.
aheejin added a commit that referenced this pull request Oct 26, 2022
The optimization option was changed from `-Oz` to `-O0` due to `iprintf`
calls in #8348.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants