Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible bug in WASM: Pointer math issue when using Emscripten as backend for FreeBasic #20325

Open
angros47 opened this issue Sep 22, 2023 · 44 comments

Comments

@angros47
Copy link

I have found some weird issues that lead to code producing the exception "index out of bounds". The minimal code that produces the error is:

int arr[14];
int main(){
	for (int i=0;i<=5;i++){
		*(int*)(((int)(int*)arr + (-i << (2 & 31))) + 52) = i;
	}
}

This code, that is very similar, seems to work with no problem:

int main(){
	int arr[14];
	for (int i=0;i<=5;i++){
		*(int*)(((int)(int*)arr + (-i << (2 & 31))) + 52) = i;
	}
}

What might cause it?
In case you wonder why I am using such a weird code, instead of a simple "arr[13-i]=i;", the reason is that the code is produced by the C emitter of the FreeBasic compiler (and it works perfectly with the GCC compiler on other platforms). More information is available on the FreeBasic forum, in this thread

@angros47
Copy link
Author

My emcc version is:

emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.1.46 (1960782)
clang version 18.0.0 (https://github.com/llvm/llvm-project 75501f53624de92aafce2f1da698b249a7293dc7)
Target: wasm32-unknown-emscripten
Thread model: posix
InstalledDir: /home/angelo/Documenti/FreeBASIC/emsdk/upstream/bin

@kripken
Copy link
Member

kripken commented Sep 22, 2023

Looks like undefined behavior:

$ ./emcc a.cpp -fsanitize=undefined
$ nodejs c.out.js
a.cpp:4:33: runtime error: left shift of negative value -1

@angros47
Copy link
Author

I tried to modify the test code to this:

int arr[14];
int main(){
	for (int i=0;i<=5;i++){
		*(int*)(((int)(int*)arr + (-(i << (2 & 31)))) + 52) = i;
	}
}

Same issue, "index out of bounds".
The code:

int main(){
	int arr[14];
	for (int i=0;i<=5;i++){
		*(int*)(((int)(int*)arr + (-(i << (2 & 31)))) + 52) = i;
	}
}

works normally.
By using "-(i << (2 & 31))", I made sure the value that is left shifted is positive, and the sign is changed only after, but this does not prevent the issue from happening. So, I tend to exclude this being the cause.

@angros47
Copy link
Author

Even weirder: if compiled with the parameter "-fsanitize=undefined", it returns no error and works as expected. Without that parameter, it crashes

@angros47
Copy link
Author

Further information: if the parameter "-sWASM=0" is added, all seems to work as expected.

Even the first code (with the syntax that is supposed to produce undefined behavior) works perfectly. The problem happens only when WASM code is produced, not when JavaScript code is produced

@angros47 angros47 changed the title Pointer math issue when using Emscripten as backend for FreeBasic Bug in WASM: Pointer math issue when using Emscripten as backend for FreeBasic Sep 24, 2023
@angros47 angros47 changed the title Bug in WASM: Pointer math issue when using Emscripten as backend for FreeBasic Possible bug in WASM: Pointer math issue when using Emscripten as backend for FreeBasic Sep 24, 2023
@sbc100
Copy link
Collaborator

sbc100 commented Sep 25, 2023

All the code examples above also crash when compiled with gcc or clang:

$ gcc -O2 test.c && ./a.out 
Segmentation fault
$ clang -O2 test.c && ./a.out 
Segmentation fault

What makes you think this is an emscripten specific issue?

@angros47
Copy link
Author

Have you compiled it with no change to native code for your machine? You should either try it on a 32 bit machine, or modify the line:

*(int*)(((int)(int*)arr + (-(i << (2 & 31)))) + 52) = i;

To:
*(int*)(((long long int)(int*)arr + (-(i << (2 & 31)))) + 52) = i;

The pointer is temporarily cast to an integer, and Emscripten uses 32 bit pointers, while your machine likely uses 64 bit pointers

@sbc100
Copy link
Collaborator

sbc100 commented Sep 26, 2023

I see, yes with -m32 I don't see a segfault.

I also can't repro with emscripten:

$ emcc -O2 test.c && node ./a.out.js

@angros47
Copy link
Author

I confirm: with -O2 it doesn't happen. Even weirder

@angros47
Copy link
Author

Also, I read that the left shift of a negative value is defined in C++20 specifications, actually

@juj
Copy link
Collaborator

juj commented Sep 26, 2023

*(int*)(((int)(int*)arr + (-(i << (2 & 31)))) + 52) = i;

does not seem correct? i ranges between 0 and 5 inclusive, so i << 2 ranges between 0 and 20 inclusive, so arr pointer gets subtracted by 20 in the last iteration of the loop. In C/C++, if you have int *arr, adding -20 to the pointer moves the pointer back by 20 elements, and not by 20 bytes.

So (int*)arr + -20 + 52 is same as ((int*)arr)[32], which will go out of bounds of int arr[14];, since it doesn't have enough size.

If you run with -fsanitize=undefined, you can see an error message

RuntimeError: memory access out of bounds

which is like expected, since the memory access does go out of bounds of the array.

with -O2 it doesn't happen. Even weirder

The reason for the issue hiding with optimizations is that it all gets optimized out since arr is never read from.

@juj
Copy link
Collaborator

juj commented Sep 26, 2023

Oh, looks like the + 52 part is outside the parentheses, casted to int, so that needs divided by 4 to convert to elements. 52/4 == 13, so looks like the elements it accesses are

i=0: arr[13]
i=1: arr[9]
i=2: arr[5]
i=3: arr[1]
i=4: arr[-3] -> out of bounds
i=5: arr[-7] -> out of bounds

@sbc100
Copy link
Collaborator

sbc100 commented Sep 26, 2023

I was able to get memory access out of bounds without the loop and just with i=1:

int arr[14];

int main() {
  int i = 1;
  *(int*)(((int)(int*)arr + (-(i << (2 & 31)))) + 52) = i;
}

@angros47
Copy link
Author

Then, why does it work if I disable the WASM and use the javascript emitter? It should either fail in both cases, or work in both cases?

And why does it work if the array is declared inside "main()"?

Also, there is a double casting: (int)(int*)arr, that means "arr" is first cast to an integer pointer, and immediately after is cast to integer. So, incremending or decrementing a simple integer should just change its value, and when it's cast again to a pointer, the pointer should refer to the new value.

I mean, let's say we declare a pointer p

int arr[100]; //an array. Let's suppose it's allocated at address 1000
int* p=arr; //Value of p should be 1000
p++; //increments p by one element, so its value should be 1004
p=(int*) ((int)p+1); //should be 1005

@angros47
Copy link
Author

Also: why this does cause an exception:

int arr[14];

int main() {
  int i = 1;
  *(int*)(((int)(int*)arr + (-(i << (2 & 31)))) + 52) = i;
}

And this doesn't?


int main() {
  int i = 1;
  *(int*)(((int)(int*)arr + (-(1 << (2 & 31)))) + 52) = i;
}

Isn't the code perfectly equivalent? I just replaced "i" with "1" inside the formula

@sbc100
Copy link
Collaborator

sbc100 commented Sep 26, 2023

Clearly the compiler/optimizer in llvm is doing something different in those two cases. My guess is that this is an issue the relates to offset parameter of the wasm i32.load and i32.store instruction which I believe is unsigned, so it could be an llvm bug, or it could be undefined behaviour.

(See https://github.com/sunfishcode/wasm-reference-manual/blob/master/WebAssembly.md#store)

@sbc100
Copy link
Collaborator

sbc100 commented Sep 26, 2023

It seems that its not the i32.store offset. That looks correct. Here is the failing main function:

0000c8 func[1] <main>:
 0000c9: 23 00                      | global.get 0
 0000cb: 41 10                      | i32.const 16
 0000cd: 6b                         | i32.sub
 0000ce: 22 00                      | local.tee 0
 0000d0: 41 01                      | i32.const 1
 0000d2: 36 02 0c                   | i32.store 2 12
 0000d5: 20 00                      | local.get 0
 0000d7: 41 00                      | i32.const 0
 0000d9: 20 00                      | local.get 0
 0000db: 28 02 0c                   | i32.load 2 12
 0000de: 41 02                      | i32.const 2
 0000e0: 74                         | i32.shl
 0000e1: 6b                         | i32.sub
 0000e2: 36 02 08                   | i32.store 2 8
 0000e5: 20 00                      | local.get 0
 0000e7: 28 02 08                   | i32.load 2 8
 0000ea: 20 00                      | local.get 0
 0000ec: 28 02 0c                   | i32.load 2 12
 0000ef: 36 02 b4 08                | i32.store 2 1076
 0000f3: 41 00                      | i32.const 0
 0000f5: 0b                         | end

The store instruction which is the one that is trapping looks reasonable. We have i32.store 2 1076 where 1076 is &arr + 52 (arr lives at address 1024)

@juj
Copy link
Collaborator

juj commented Sep 26, 2023

Also, there is a double casting: (int)(int*)arr, that means "arr" is first cast to an integer pointer, and immediately after is cast to intege

Ops, sorry, you are correct. I see now, the many parentheses tripped me up and I interpreted the subtraction before the second cast.

Indeed it does look like something is off here. A smaller test case

int arr[2];
int main(){
  for (int i = 0; i < 2; ++i) {
    *(int*)((int)arr + 4 + -i) = 1;
  }
}

also fails, whereas

int arr[2];
int main(){
  for (int i = 0; i < 2; ++i) {
    *(int*)((int)arr + 4 -i) = 1;
  }
}

doesn't. So something looks fishy here.

@angros47
Copy link
Author

I can confirm that.
Also, if doesn't file if compiled with -sWASM=0

So, the issue seems to be at WASM level

@juj
Copy link
Collaborator

juj commented Sep 26, 2023

Looks like -fsanitize=undefined also masks the issue.

int arr[2];
int main(){
  for (int i = 0; i < 2; ++i)
    *(int*)((int)arr + 4 + -(i<<2)) = 1;
}

passes with em++ a.cpp -o a.js -fsanitize=undefined, but fails if -fsanitize=undefined is removed.

@juj
Copy link
Collaborator

juj commented Sep 26, 2023

Simpler test case without a for loop:

int arr[2];
int i = 4;
int main() {
  *(int*)((int)arr + 4 + -i) = 1;
}

@sbc100
Copy link
Collaborator

sbc100 commented Sep 26, 2023

Oh nice, now the failing main function is just a few lines since its doesn't do __stack_pointer manipulation.

@angros47
Copy link
Author

Which browser are you using? I am using Firefox

@sbc100
Copy link
Collaborator

sbc100 commented Sep 26, 2023

You can repro in node on the command line, no need for a browser:

$ emcc test.c && node ./a.out.js
/usr/local/google/home/sbc/dev/wasm/emscripten/a.out.js:64
   throw ex;
   ^

RuntimeError: memory access out of bounds
    at wasm://wasm/d2db571a:wasm-function[1]:0xd4
    at Module._main (/usr/local/google/home/sbc/dev/wasm/emscripten/a.out.js:475:90)
    at callMain (/usr/local/google/home/sbc/dev/wasm/emscripten/a.out.js:497:13)
    at doRun (/usr/local/google/home/sbc/dev/wasm/emscripten/a.out.js:521:21)
    at run (/usr/local/google/home/sbc/dev/wasm/emscripten/a.out.js:533:3)
    at runCaller (/usr/local/google/home/sbc/dev/wasm/emscripten/a.out.js:488:18)
    at removeRunDependency (/usr/local/google/home/sbc/dev/wasm/emscripten/a.out.js:235:4)
    at receiveInstance (/usr/local/google/home/sbc/dev/wasm/emscripten/a.out.js:334:3)
    at receiveInstantiationResult (/usr/local/google/home/sbc/dev/wasm/emscripten/a.out.js:339:3)

@angros47
Copy link
Author

I see. I just want to rule out any possible bug in the virtual machine itself

@juj
Copy link
Collaborator

juj commented Sep 26, 2023

Failing wasm: (*(int*)((int)arr + 4 + -i) = 1;)

 (func $__original_main (result i32)
  (local $0 i32)
  (local $1 i32)
  (local $2 i32)
  (local $3 i32)
  (local $4 i32)
  (local $5 i32)
  (local.set $0 (i32.const 0) )
  (local.set $1 (i32.load offset=65536 (local.get $0) ) )
  (local.set $2 (i32.const 0) )
  (local.set $3 (i32.sub (local.get $2) (local.get $1) ) )
  (local.set $4 (i32.const 1) )
  (i32.store offset=65544 (local.get $3) (local.get $4) )
  (local.set $5 (i32.const 0) )
  (return (local.get $5) )
 )

Passing wasm: (*(int*)((int)arr + 4 -i) = 1;)

 (func $__original_main (result i32)
  (local $0 i32)
  (local $1 i32)
  (local $2 i32)
  (local $3 i32)
  (local $4 i32)
  (local $5 i32)
  (local $6 i32)
  (local $7 i32)
  (local.set $0 (i32.const 0) )
  (local.set $1 (i32.load offset=65536 (local.get $0) ) )
  (local.set $2 (i32.const 65540) )
  (local.set $3 (i32.const 4) )
  (local.set $4 (i32.add (local.get $2) (local.get $3) ) )
  (local.set $5 (i32.sub (local.get $4) (local.get $1) ) )
  (local.set $6 (i32.const 1) )
  (i32.store (local.get $5) (local.get $6) )
  (local.set $7 (i32.const 0) )
  (return (local.get $7) )
 )

@juj
Copy link
Collaborator

juj commented Sep 26, 2023

How do you get the nice one-liner wasm disassembly @sbc100 ? wasm-dis gives that verbose output that's not that nice to diff.

@sbc100
Copy link
Collaborator

sbc100 commented Sep 26, 2023

If you compile test.c to test.o you can then link with -O2 to get a much smaller failing main (thanks for wasm-opt):

$ emcc --minify=0 -O2 test.o 

This gives:

 (func $1 (param $0 i32) (param $1 i32) (result i32)
  (i32.store offset=1032
   (i32.sub
    (i32.const 0)
    (i32.load
     (i32.const 1024)
    )
   )
   (i32.const 1)
  )
  (i32.const 0)
 )

@sbc100
Copy link
Collaborator

sbc100 commented Sep 26, 2023

To get linear dis-assembly I use wasm-objdump -d .

@juj
Copy link
Collaborator

juj commented Sep 26, 2023

Hmm, for me it doesn't fail with -O1 or higher.

@juj
Copy link
Collaborator

juj commented Sep 26, 2023

The first code ends up doing a memory write to memory address -4 with offset +65544. I wonder if that is something that the access bounds validator doesn't like, and it immediately rejects the base address -4 as invalid, without realizing the offset would push it above into a valid memory area?

@sbc100
Copy link
Collaborator

sbc100 commented Sep 27, 2023

Hmm, for me it doesn't fail with -O1 or higher.

You compile test.c with -O0 and then link it with -O2.. that fails:

$ emcc --minify=0 -O2 test.o && node ./a.out.js
/usr/local/google/home/sbc/dev/wasm/emscripten/a.out.js:64
   throw ex;
   ^
RuntimeError: memory access out of bounds
    at wasm://wasm/d2db571a:wasm-function[1]:0xd4
    at Module._main (/usr/local/google/home/sbc/dev/wasm/emscripten/a.out.js:475:90)
    at callMain (/usr/local/google/home/sbc/dev/wasm/emscripten/a.out.js:497:13)
    at doRun (/usr/local/google/home/sbc/dev/wasm/emscripten/a.out.js:521:21)
    at run (/usr/local/google/home/sbc/dev/wasm/emscripten/a.out.js:533:3)
    at runCaller (/usr/local/google/home/sbc/dev/wasm/emscripten/a.out.js:488:18)
    at removeRunDependency (/usr/local/google/home/sbc/dev/wasm/emscripten/a.out.js:235:4)
    at receiveInstance (/usr/local/google/home/sbc/dev/wasm/emscripten/a.out.js:334:3)
    at receiveInstantiationResult (/usr/local/google/home/sbc/dev/wasm/emscripten/a.out.js:339:3)

@juj
Copy link
Collaborator

juj commented Sep 27, 2023

At least that's my takeaway from the crash..

Screen Shot 2023-09-27 at 2 59 33 AM

@sbc100
Copy link
Collaborator

sbc100 commented Sep 27, 2023

(note that I'm linking test.o .. compiled with -O0)

@sbc100
Copy link
Collaborator

sbc100 commented Sep 27, 2023

The first code ends up doing a memory write to memory address -4 with offset +65544. I wonder if that is something that the access bounds validator doesn't like, and it immediately rejects the base address -4 as invalid, without realizing the offset would push it above into a valid memory area?

I think the problem is that memory address -4 part. It looks like the engines always interpret both the address and the offset immediate as unsigned.. I wasn't aware of that but it looks like what is happening.

@juj
Copy link
Collaborator

juj commented Sep 27, 2023

The failure does occur equally in

  • Safari Version 14.0 (15610.1.28.1.9, 15610)
  • Firefox 117.0.1 (64-bit)
  • Chrome Version 119.0.6032.0 (Official Build) canary (x86_64)
  • Node.js v18.18.0

(tested on macOS Catalina 10.15.7 (19H2))

So the behavior is either specced to work like that, or every VM got it wrong (seems unlikely).

I think the problem is that memory address -4 part.

Yeah, the following code does give out of bounds:

int main() { ((char*)-1)[13] = 1; }

whereas

int main() { ((char*)0)[12] = 1; }

doesn't raise an out of bounds error.

Disassembly for the failing case being

 (func $__original_main (result i32)
  (local $0 i32)
  (local $1 i32)
  (local $2 i32)
  (local.set $0 (i32.const -1) )
  (local.set $1 (i32.const 1) )
  (i32.store8 offset=13 (local.get $0) (local.get $1) ) // RuntimeError: memory access out of bounds
  (local.set $2 (i32.const 0) )
  (return (local.get $2) )
 )

It looks like the engines always interpret both the address and the offset immediate as unsigned..

Wouldn't that need a combination of i32->u32->u64 32-bit->64-bit widening to happen?

i.e. even if address was interpreted as u32, 0xFFFFFFFF, then adding 13 to it would still produce 12 as result? Only if it was interpreted as u32, then internally extended to u64 for the offset calculation (to generate 0x10000000C) would that be a problem?

Though I suppose since it generates a x86 mov instruction, the CPU will always treat both the base number and offset as u64.

@juj
Copy link
Collaborator

juj commented Sep 27, 2023

Even further simplified test case:

char arr[2];
char i = 1;
int main() {
  *(arr + 1 + -i) = 1;
}

which is quite reasonable code. Although for some reason, it generates very silly wasm code:

 (func $__original_main (result i32)
  (local $0 i32)
  (local $1 i32)
  (local $2 i32)
  (local $3 i32)
  (local $4 i32)
  (local $5 i32)
  (local $6 i32)
  (local $7 i32)
  (local $8 i32)
  (local.set $0 (i32.const 0) )
  (local.set $1 (i32.load8_u offset=65536 (local.get $0) ) )
  (local.set $2 (i32.const 24) )
  (local.set $3 (i32.shl (local.get $1) (local.get $2) ) )
  (local.set $4 (i32.shr_s (local.get $3) (local.get $2) ) )
  (local.set $5 (i32.const 0) )
  (local.set $6 (i32.sub (local.get $5) (local.get $4) ) )
  (local.set $7 (i32.const 1) )
  (i32.store8 offset=65541 (local.get $6) (local.get $7) )
  (local.set $8 (i32.const 0) )
  (return (local.get $8) )
 )

the

  (local.set $1 (i32.load8_u offset=65536 (local.get $0) ) )
  (local.set $2 (i32.const 24) )
  (local.set $3 (i32.shl (local.get $1) (local.get $2) ) )
  (local.set $4 (i32.shr_s (local.get $3) (local.get $2) ) )

part could be replaced by a

  (local.set $4 (i32.load8_s offset=65536 (local.get $0) ) )

seems that Wasm backend does not understand how to emit i32.load8_s? (unsigned char i = 1; would also generate noisy wasm, something I think should be fixed in LLVM side?)

@juj
Copy link
Collaborator

juj commented Sep 27, 2023

Looks like

char arr[2];
int i = 1;
int main() {
  *(arr + 1 + -i) = 1;
}

generates the simplest test case:

 (func $__original_main (result i32)
  (local $0 i32)
  (local $1 i32)
  (local $2 i32)
  (local $3 i32)
  (local $4 i32)
  (local $5 i32)
  (local.set $0 (i32.const 0) )                            // $0 = 0
  (local.set $1 (i32.load offset=65536 (local.get $0) ) )  // $1 = MEM8[65536] (== 1)
  (local.set $2 (i32.const 0) )                            // $2 = 0
  (local.set $3 (i32.sub (local.get $2) (local.get $1) ) ) // $3 = $2 - $1 = 0 - 1 = -1
  (local.set $4 (i32.const 1) )                            // $4 = 1
  (i32.store8 offset=65541 (local.get $3) (local.get $4) ) // MEM8[$3 + 65541] = $4; RuntimeError: memory access out of bounds
  (local.set $5 (i32.const 0) )
  (return (local.get $5) )
 )

@kripken
Copy link
Member

kripken commented Sep 27, 2023

On that last testcase, I don't know if it has undefined behavior in C, but looks like it crashes in -O0 and succeeds in -O1, so that seems possible. Or, it might be a wasm backend bug, if there isn't undefined behavior. So the main question here is first whether this has UB or not.

Looking at that -O0 output, I simplified it slightly in a safe way to this:

$ wasm-opt a.out.wasm --simplify-locals --coalesce-locals --reorder-locals --vacuum --extract-function=__original_main  --print 
extracting __original_main
(module $a.out.wasm
 (type $0 (func (result i32)))
 (memory $0 256 256)
 (data $.data (i32.const 65536) "\01\00\00\00")
 (export "__original_main" (func $__original_main))
 (func $__original_main (result i32)
  (i32.store8 offset=65541
   (i32.sub
    (i32.const 0)
    (i32.load offset=65536
     (i32.const 0)
    )
   )
   (i32.const 1)
  )
  (return
   (i32.const 0)
  )
 )
)

That code is expected to crash: the load returns 1, and then 0 - 1 is negative 1, and that is a large unsigned value. Adding the offset of 65541 does not prevent the crash, because load/store offsets work in "infinite size", that is, there is no overflow: we have u32(-1) + 65541 = 4295032836, which is larger than 32 bits, but the VM computes it in as many bits are needed (logically). It is larger than the size of memory, so it traps.

That is, i32.load offset=X (i32.const Y) is very different from i32.load (i32.add (i32.const X) (i32.const Y)). The former does not do a 32-bit overflow, while the latter does. (For that reason, it is not possible to simply replace one with the other as an optimization. Binaryen can do this only if it has additional information about overflows not being possible. That is, if no 32-bit overflow can happen then the two are identical.)

@juj
Copy link
Collaborator

juj commented Sep 27, 2023

On that last testcase, I don't know if it has undefined behavior in C

Are you referring to the test code at

char arr[2];
int i = 1;
int main() {
  *(arr + 1 + -i) = 1;
}

I am fairly positive that this should not have any undefined behavior. The only thing that comes to mind in the standard is that weird rule that pointer calculations should not exceed [begin,end] ranges (even if not dereferenced) - but here arr + 1 is inside the array, and the resulting arr + 1 + -i is also inside the array so not even that rule would be violated.

Looks like the following test case also generates an error:

char arr[2];
int i = -1;
int main() {
  *(arr + 1 + i) = 1;
}

(one can also remove the +1 there and still observe a crash, and while the resulting address should line up to a known ok value, that would be UDB C/C++ standards-wise)

For all intents and purposes, that application should be the same as

char arr[2];
int i = -1;
int main() {
  char *ptr = arr + 1 + i;
  *ptr = 1;
}

(which does not crash) I can't think of a reason that there should be UDB in either of these programs.

Let's invoke some more people to poll: @tlively @dschuff ?

This scenario seems like an incorrect reversal of a base_address + constant_offset memory load somewhere in codegen? In an uncommon fashion, here arr + 1 looks like it has the role of a the constant offset (since arr is global data), and i is the variable base address, so codegen incorrectly switches these two around?

because load/store offsets work in "infinite size", that is, there is no overflow

Although not sure how this could be fixed without pessimizing codegen. To me seems like load/store offsets would want to work with u64 sizes to take advantage of overflow(?)

@kripken
Copy link
Member

kripken commented Sep 27, 2023

Are you referring to the test code [..]?

Sorry, I should have said, I referred to the last testcase before my last comment.

@juj
Copy link
Collaborator

juj commented Sep 28, 2023

That code is expected to crash: the load returns 1, and then 0 - 1 is negative 1, and that is a large unsigned value.

I agree that wasm disassembly is expected to crash given the wasm semantics. It feels like the root issue would be that wasm code should not have gotten generated in the first place?

@kripken
Copy link
Member

kripken commented Sep 28, 2023

I agree, if there is no UB here then that looks like a wasm backend bug. I'm hoping one of the people you cc'd can confirm if there is or isn't UB here.

@juj
Copy link
Collaborator

juj commented Oct 5, 2023

Stumbled today on a video by Jonathan Müller (@foonathan), who is part of the standardization committee, and gave a presentation at C++ On Sea 2023 conference this June:

https://youtu.be/zGWj7Qo_POY?t=34

there he does mention that *(array+17) syntax is by definition equivalent to array[17], when array is a built-in array. I think that is the same situation that occurs in this bug.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants