New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add emscripten_memset_js and use it from memset #21683
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm!
If your change effects the codesize tests you will want to run test/runner other.*code_size* other.*metadce* --rebase
.
Bare in mind these tests are fairly sensitive, so you will want to make sure you hare using llvm and binary from the tip-of-tree version of emsdk. To do this you install emsdk with emsdk install tot && emsdk activate tot
and then use export EM_CONFIG=/path/to/emsdk/.emscripten
. The result if this is that when you run ./emcc
or ./test/runner
from your emscripten checkout (whereever it is) it will inherit everything else from emsdk (i.e. llvm and binaryen)
@@ -1237,7 +1238,7 @@ addToLibrary({ | |||
{{{ makeSetValue('tm', C_STRUCTS.tm.tm_yday, 'arraySum(isLeapYear(fullDate.getFullYear()) ? MONTH_DAYS_LEAP : MONTH_DAYS_REGULAR, fullDate.getMonth()-1)+fullDate.getDate()-1', 'i32') }}}; | |||
{{{ makeSetValue('tm', C_STRUCTS.tm.tm_isdst, '0', 'i32') }}}; | |||
{{{ makeSetValue('tm', C_STRUCTS.tm.tm_gmtoff, 'date.gmtoff', LONG_TYPE) }}}; | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe revert these whitespace changes.
_emscripten_memset_js(str, c, n); | ||
return str; | ||
} | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like emscripten_memcpy.c
only uses the _js
version once in that file, in the last case. That is, it doesn't use it in the first case, here, for -Oz
/ASan. Is there a reason to do things differently for memset?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want good performance in -Oz
, right? I can understand not wanting it to apply for asan, though, since the JS fill will bypass asan checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, for ASan I think we want to be able to instrument it as you said, and performance is less critical there.
For -Oz
we've focused on reducing size at all costs, which includes avoiding loop unrolling here. I think avoiding adding a fast path through JS is consistent with that?
I see that we currently have the following: I wonder why bulk memory operations are not used in -Oz? I would have though that they should be, the Wasm opcodes for memset and memcpy should provide the smallest code? Reading PR #19128 I was not able to find why Unless there is some counterintuitive reason (which would be a bit sad if that is the case :( ), I think the code for memset would rather want to look like: #ifdef __wasm_bulk_memory__
void *__memset(void *str, int c, size_t n) {
return _emscripten_memset_bulkmem(str, c, n);
}
#elif defined(EMSCRIPTEN_OPTIMIZE_FOR_OZ)
void *__memset(void *str, int c, size_t n) {
unsigned char *s = (unsigned char *)str;
#pragma clang loop unroll(disable)
while(n--) *s++ = c;
return str;
}
#else
... That should make the memset run fast also in interpreted wasm mode I'd believe? (in Wasm VMs, memsets and memcpys should call fast SIMD operations even when interpreting) Looks like the same situation applies to memcpy as well: |
@juj I think you're right, we should use bulk memory in |
It sounds like the correct construction for both memset and memcpy, then, is:
Is that right? For 'smallest' there is the difficulty in weighing the extra bytes for the JS helper against the performance loss from a bare memcpy/memset loop - it seems like it could be significant. And then I guess there's the standalone flag that enters the equation, in standalone mode we can't rely on JS. |
Indeed. In standalone mode we try our best to avoid JS imports. |
Yes, exactly, I believe that's the optimal order. I think we should do that in this PR and then as a separate followup we can fix memcpy. |
Can we fit the JS helper into |
I think its best to stick the simple/small solution in Going forward I imagine most folks will be using bulk memory so it should be a mute point anyway soon. |
In fact.. soon we can completely remove the explicit call to JS any since the plan for enabling bulk memory is to build with bulk-memory enabled. For users targeting legacy browsers we will then have binaryen lower the bulk memory instructions into something else (maybe a call to an import, maybe an inline implementation). |
It feels like this particular change is probably not worth it for most users, due to the conflict between code size demands and performance. (Plus I never figured out how to make the tests pass.) So I'm going to close it. Feel free to ping me if you'd like to see me bring it back. |
Addresses #21620
Sorry if I got something wrong here, I'm kind of flying blind based on the documentation... I was only able to successfully run the "other" tests and some of them fail out of the box on an unmodified checkout for me.
I'm guessing I need to update all of the text files under test/ that mention _emscripten_memcpy_js to also mention _emscripten_memset_js? Or are they automatically generated? When I did a local run it only seemed like my changes added one additional test failure, so I'm guessing I ran the wrong tests.