Join GitHub today
Tiny optim to avoid some spilling of floats in x87 #6924
Original bug ID: 6924
The code generated for
let f (m:float array) i x = Array.set m i x
on x86 (32-bit) unboxes the float argument, spills it, does the checkbound, and loads the float back from the stack before putting it in the array:
sub esp, 8 L100: fld REAL8 PTR [ecx] fstp REAL8 PTR [esp] mov ecx, DWORD PTR [eax-4] shr ecx, 10 cmp ecx, ebx jbe L101 fld REAL8 PTR [esp] fstp REAL8 PTR [ebx*4+eax-4] mov eax, 1 add esp, 8 ret
I guess that the current strategy for the x87 fp stack is to use it only within a basic block, and it would probably be quite difficult to change that in general. Fragments of numerical code which have branches only because of check bounds could probably benefit a little bit from optimizing cases such as above. The spill above can be avoided by the attached patch, which doesn't bind the rhs of the assignment (i.e. the unboxing code) before the checkbound. The patch is quite restrictive, it should be ok to do so as long as the rhs cannot raise an exception.
The code generated after the patch is:
sub esp, 8 L100: mov edx, DWORD PTR [eax-4] shr edx, 10 cmp edx, ebx jbe L101 fld REAL8 PTR [ecx] fstp REAL8 PTR [ebx*4+eax-4] mov eax, 1 add esp, 8 ret
A very rough micro-benchmark on the code below:
let f a (x : float) = for i = 1 to 1000 do for j = 0 to Array.length a - 1 do a.(j) <- x; a.(j) <- x; a.(j) <- x; a.(j) <- x done done let () = let a = Array.make 1024 0. in for i = 1 to 1000 do f a (float i) done
gives the following results:
Before patch, -unsafe mode: 1.8s Before patch, safe mode: 3.2s After patch, -unsafe mode: 1.8s After patch, safe mode: 2.1s
which shows that most of the overhead for bound checks comes from the spilling overhead.
Comment author: jmeber
I strongly advocate inclusion of this patch as it gives big speedups (on x86) of a very common and frequent operation in numeric code (in-place modification of float array with "simple" values) when you still want to benefit from bounds checking.