Cheerp / Emscripten size comparison #76
Hello Alon. Sorry for the delay, we needed some time to reproduce results and gather all required data. We have been pleasantly surprised that Cheerp is now integrated in emscripten's benchmarks. We have been maintaining our own branch of emscripten to run tests, which we have now rebased. About the size results: the differences arise from the test cases being slightly different. I will focus on `primes` and `memops` here. To reproduce our results you need to simply remove the `printf` calls from those tests.
Of course, we use the same modified sources when building with Emscripten and with Cheerp. About larger-scale tests: various test cases need some patching to run with Cheerp. The branch we published contains all the required patches. It should be noted that the tests were originally ported to be compiled to plain JS (Cheerp …). As a side note, the …
Thanks @alexp-sssup !
I see, so we are indeed measuring something different. Yes, good point: when using printf in a benchmark that is just a few lines of code, like `primes`, the size of the printf implementation can dominate the measurement. However, without printf, tiny benchmarks like these produce no observable output, which raises its own problems. Overall, I'm more interested in moderate or large code size projects (the common case that I see among users), like say Box2D. Do you see the same as what I reported on that one (emscripten being 23% smaller than cheerp)?
I see. So the configure/make build systems that benchmarks like zlib, lua, etc. require won't work on cheerp, and I'd need to write a makefile manually if I want those tests to run?
That link requires me to log in, so I can't view it.
Like primes and memops, our version of Box2D is patched. Many of the patches are there to remove printf calls. With this patch, in my tests, the size between emscripten and cheerp becomes roughly the same. There is a ~2% difference either up or down depending on whether you choose the compressed or uncompressed version. As usual, the patched version is used when compiling both the emscripten and cheerp builds. configure/make should actually work, by using a wrapper script; this is documented here. About the link: I pasted the one from our private repo instead of the public one, I apologize. Here is the correct one: alexp-sssup/emscripten@60620e0
Interesting, yes, I see that when I just remove the printf from box2d (using your patch) then the cheerp and emscripten sizes become close. But this seems odd. Why does cheerp go from 150K to 122K just by removing printf - is that expected? Also, I'm not sure the benchmark is valid without the printing. Without printf, the LLVM optimizer may be able to remove code that we want to execute but that now has no side effects. If printf is a problem for cheerp, is there some other way to print stuff that is efficient for you?
Thanks, but I still can't get it to work. First, … Anyhow, maybe I missed or misunderstood something there. In general, it would be great to have a shared script for these comparisons, so we know and agree they are fair - perhaps you want to upstream some of the changes in your fork?
I found some time this weekend to dive into the box2d differences here in more detail. A large source of the difference is in system library code.
Overall, system lib differences account for a lot of the size difference between the compilers, but even though that's interesting to know, it's always going to be a tradeoff between compiling for size or speed - if one compiler started to build system libs with -Oz it might emit smaller code, but eventually users would notice it isn't as fast, etc. So maybe this isn't that important. Because of that I also did a dive into the wasm binaries themselves, looking function by function. I focused on the three largest functions in box2d. Looking at their binary sizes, Emscripten is smaller on all of them, by 18%, 10%, and 23% respectively. Another way to look at that is to run the Binaryen optimizer on Cheerp output, which shrinks it by 15%. That's pretty close to the per-function results, which makes sense if the two compilers' output is mostly similar, except that emscripten also runs the Binaryen optimizer. To summarize, most of the remaining per-function difference appears to come from running the Binaryen optimizer.
Hello Alon, keep in mind that there have been significant changes in our Wasm backend since my last comment, so you will need to use updated packages to reproduce our exact results. I will try to answer all the issues you raised.
Thanks for the detailed response!
Was that a typo perhaps, and you meant …?
Which emscripten version was that with? On the latest of both (Cheerp …)
The Cheerp results are similar to yours, except a little better - maybe since I tested on a newer version. But your emscripten results without printf are surprisingly poor - maybe part of the difference is that I'm using a newer version too, but I don't think we landed any major optimizations recently, so that is strange. Aside from measuring size in bytes, I also gave more in-depth details above that I don't think you responded to; I'm curious to get your perspective on them, and to check if I got something wrong.
It's the musl libc printf implementation - should be complete AFAIK.
Thanks for the link. I'm conflicted on testing with patches like these, though: on one hand, more comparisons are good, but on the other, I want to test on real-world code, without special porting to emscripten or cheerp.
Continuing my last response, I am open to code to run Cheerp with the right flags etc., and maybe minimal benchmark changes make sense (like removing printf), but I'd rather not modify zlib, bullet, box2d etc. significantly, since emscripten's goal is to run them well without porting (and the version in the test suite is used both for benchmarking and for testing). Perhaps, instead, we could create a separate repo for cross-compiler comparisons?
I see, thanks. Makes sense now.
Interesting. Perhaps we use different versions of dlmalloc then, or build it differently - we use …
I see that …
I apologize for taking so long to reply. To answer with the appropriate level of detail and precision I needed to dedicate significant time, which I could not find until now.
Oh sorry, I missed that there was a reply here... I do still think these comparisons are useful, but as you said too, it's hard to find time given all the other priorities we have, I guess.
Hi Cheerp devs :)
I see your website says "30% smaller than Emscripten" so I was curious to measure that. Running emscripten's `tests/test_benchmark.py` script, I see the following results, which are very different: emscripten's output is smaller on `test_primes` and 27% smaller on `test_memops`. (This counts the `.wasm` and `.js` files together, but it's also true when just looking at the wasm.) On `test_box2d`, emscripten is 23% smaller.

That's for size. Speed-wise, the results are a mix (some a little faster, some a little slower, some about the same).
Several of the tests hit problems when running the Cheerp output:

- `test_base64` hits `Error: this should be unreachable`.
- `test_fasta_float` hits `RuntimeError: index out of bounds`.
- `test_havlak` hangs.
- `test_box2d` hits `RuntimeError: indirect call signature mismatch`.

I also couldn't get some tests to build in Cheerp: `test_linpack`, `test_bullet`, `test_lua_binarytrees`, `test_lua_scimark`, `test_zlib`. For example `test_zlib` says … Maybe the script doesn't build them right? It calls Cheerp's llvm-ar etc., and the commands work with emscripten and native builds, but maybe something more needs to be done for Cheerp? Or am I hitting Cheerp limitations - would zlib, Lua, etc. need to be ported to Cheerp first?
This is on Cheerp `1516806243-1~xenial` (which is the latest nightly build I see for Xenial, from Jan 24) and emscripten 1.37.36 (last tagged release, from Mar 13).

These results are very different from the ones you reported; perhaps we are not measuring the same thing somehow? (All the details of how I got the measurements mentioned above are in the linked script that runs those benchmarks; I basically just ran that script as-is, except for uncommenting the line to enable running Cheerp.) Or maybe our results are about different versions?