* MEM_INIT_METHOD != 1 with --with-memory-file 1 now triggers an assertion
* Consistently return '' instead of m.group(0) if there is no initializer
* Strip trailing zeros for emterpreter as well
* Include crc32 in literal only if it gets verified
* Enable assertions for the asm2m test run in general
* Disable assertions for one test case, fnmatch, to cover that as well
* Include the asm2m run name in two lists of run modes
* Add browser test to verify all pairs of bytes get encoded correctly
* Add browser test to verify that a >32M initializer works without chunking
* Omit duplicate var declaration for the memoryInitializer variable
* Minor comments and syntax improvements
* Capture the memory_init_file setting by its MEM_INIT_METHOD value
* Drop special handling for emterpreter, which shouldn't be needed any more
This makes the resulting literals more independent from the character encoding the environment assumes for the resulting file. It requires slightly more memory, but large bytes are far less common than small bytes (zero in particular), so the cost should not be too much. If we want to, we can still make this optional later on.
This is almost the standard CRC-32 algorithm, except that we omit the final XOR with -1 so that we can easily compare the result against zero. The length of the initializer is included in the data so that we don't have to worry about leading zeros (after XOR with the init value of -1). Useful read: http://www.ross.net/crc/download/crc_v3.txt
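A minimal sketch of the idea, not the code from this commit: it assumes the reflected CRC-32 polynomial 0xEDB88320, a 4-byte little-endian length prefix, and a checksum appended in little-endian order. The helper names `crc32_no_final_xor`, `seal` and `verify` are hypothetical. Because the final XOR is omitted, feeding the stored checksum back through the same routine drives the register to zero.

```python
import struct

def crc32_no_final_xor(data: bytes) -> int:
    """Standard reflected CRC-32 register, but without the final XOR with -1."""
    crc = 0xFFFFFFFF  # init value of -1, as in standard CRC-32
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right one bit; apply the polynomial if a 1 bit fell out.
            crc = (crc >> 1) ^ (0xEDB88320 if crc & 1 else 0)
    return crc

def seal(payload: bytes) -> bytes:
    """Prefix the length (assumed layout) and append the raw CRC register."""
    body = struct.pack("<I", len(payload)) + payload
    return body + struct.pack("<I", crc32_no_final_xor(body))

def verify(sealed: bytes) -> bool:
    """A sealed blob is intact when the CRC over all of it is exactly zero."""
    return crc32_no_final_xor(sealed) == 0
```

Applying the final XOR to the result of `crc32_no_final_xor` recovers the ordinary CRC-32, so the routine can be cross-checked against any standard implementation.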
There was a bug where the hex-to-oct conversion would match \\x01. But support for octal escape sequences is optional in any case, and forbidden in strict mode, so we should avoid using these. As per the ECMAScript 5.1 spec, any source character (which may be any Unicode code point) can be used inside a string literal, with the exception of backslash, line terminators and the quoting character. So we do just that: dump a lot of raw bytes into the string literal and escape only what needs to be escaped. There is one catch, though: sources are usually encoded in UTF-8, in which case we can't exactly plug in raw bytes, but have to use UTF-8 sequences for the range \x80 through \xff. This may cause problems if the source file is NOT interpreted as UTF-8.
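The escaping rule described above can be sketched roughly as follows. This is an illustration under assumptions, not the generator code itself: `js_string_literal` is a hypothetical helper, and each byte is mapped to the code point of the same value before the result is written out as UTF-8.

```python
def js_string_literal(data: bytes, quote: str = '"') -> str:
    """Build a JS string literal, escaping only what ES5.1 requires."""
    # Backslash, the quoting character, and the line terminators reachable
    # from a single byte (LF and CR) are the only characters escaped here.
    escapes = {"\\": "\\\\", "\n": "\\n", "\r": "\\r", quote: "\\" + quote}
    # latin-1 maps byte value N to code point N, one character per byte.
    chars = data.decode("latin-1")
    return quote + "".join(escapes.get(c, c) for c in chars) + quote

literal = js_string_literal(b'a"b\\c\n')
# Bytes 0x80..0xff each take two bytes once the source is saved as UTF-8.
utf8_size = len(js_string_literal(b"\x80\xff").encode("utf-8"))
```

The two-byte cost for values from \x80 through \xff is the UTF-8 catch mentioned above; a reader that decodes the file with another encoding would see different characters.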
Partial revert of commit 53a969d, plus addition of comment explaining why the verbiage about ld compatibility is there. Added a check for the 'GNU' token in 'emcc -v' output to tests.
… fix it by removing an unnecessarily strict assertion in formatString (code in formatString only needs 4 byte alignment, but asserts 8 bytes)
…ile switch-cases of unbounded length.
… not worth it for the rather slim amount of benefit it provides