Description
Original bug ID: 5731
Reporter: jeffsco
Assigned to: meurer
Status: closed (set by @xavierleroy on 2015-12-11T18:19:43Z)
Resolution: fixed
Priority: normal
Severity: major
Platform: ARM
OS: Debian
OS Version: armel 2.6.32-5
Version: 4.00.0
Fixed in version: 4.00.1+dev
Category: back end (clambda to assembly)
Monitored by: cookedm
Bug description
Code generation error on ARM with VFPv3. A value is loaded into d7. A later call to float_of_int generates code that writes s14, which shares storage with the low-order word of d7 (the low-order mantissa bits) and so overwrites part of the value. The value in d7 is then used in a later computation, but by that point it has been destroyed.
If I make small changes to the code, the load of d7 happens after the float_of_int computation, and things work OK. Possibly some constraints on instruction orderings aren't being applied properly.
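The overlap between s14 and d7 follows from the VFP register layout, where each double-precision register dN shares storage with the single-precision pair s(2N) and s(2N+1). The sketch below only illustrates that mapping; the names `d_of_s` and `s_clobbers_d` are hypothetical helpers written for this report and are not part of the OCaml compiler.

```ocaml
(* Illustration of VFP register aliasing for d0..d15: the register dN
   shares storage with s(2N) and s(2N+1).
   d_of_s and s_clobbers_d are hypothetical names used only here. *)

(* Index of the double-precision register that overlaps s_n. *)
let d_of_s (n : int) : int =
  if n < 0 || n > 31 then invalid_arg "d_of_s: s register out of range";
  n / 2

(* True when a write to s_<s> modifies part of d_<d>. *)
let s_clobbers_d ~(s : int) ~(d : int) : bool = d_of_s s = d

let () =
  Printf.printf "s14 overlaps d%d\n" (d_of_s 14);
  Printf.printf "writing s14 clobbers d7: %b\n" (s_clobbers_d ~s:14 ~d:7)
```

Under this mapping, any instruction that writes s14 or s15 has to be treated as destroying d7.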
Steps to reproduce
You need a version of ocamlopt that supports ARMv7 with hardware floating point (VFPv3).
Then compile the following code:
let rate_pos scounts : float =
  let m_MIN = -999.0
  in let max1s = Array.make 14 m_MIN
  in let max2s = Array.make_matrix 14 14 m_MIN
  in let try_build (k1: int) (m: float) : unit =
    let denom = 12
    in let try1b (sawk1, xct) k =
      let () =
        if max2s.(k1).(k) > m then
          let adjm = if m <= m_MIN then 0.0 else m
          in let numer =
            if k = k1 then 48
            else if sawk1 then 36
            else 24
          in let f = float_of_int numer /. float_of_int denom
          in let () =
            if max1s.(k1) <= m_MIN then max1s.(k1) <- 0.0
          in
          max1s.(k1) <-
            max1s.(k1) +. (max2s.(k1).(k) -. adjm) *. f
      in
      if k = k1 then
        (true, xct)
      else
        (sawk1, xct + scounts.(k))
    in
    ignore (List.fold_left try1b (false, 0) [])
  in let () = Array.iteri try_build max1s
  in
  0.0
My compile line looks like this:
$ ocamlopt -ffpu vfpv3 -c -S rate.ml
I'm attaching the full assembly code output. The section with the error contains the code for the following lines:
in let f = float_of_int numer /. float_of_int denom
in let () =
if max1s.(k1) <= m_MIN then max1s.(k1) <- 0.0
The erroneous code looks like this (with added annotations):
ldr r12, [r2, #16] @ r12 <- m_MIN block
mov r0, r7, asr #1
ldr r7, [r2, #20]
movs r6, #0xc @ r6 <- denom
fmsr s14, r6
fsitod d10, s14 @ d10 <- float_of_int denom
ldr r6, [r7, #-4]
fldd d7, [r12, #0] @ d7 <- m_MIN
ldr r12, [r2, #28]
fmsr s14, r0 @ *** d7 is destroyed here ***
fsitod d9, s14 @ d9 <- float_of_int numer
cmp r12, r6, lsr #10
bcs .L111
add r6, r7, r12, lsl #2
fldd d6, [r6, #-4] @ d6 <- max1s.(k1)
fdivd d8, d9, d10
fcmpd d6, d7 @ *** This comparison fails ***
fmstat
bhi .L104
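The effect of the second fmsr on the comparison can be reproduced at the bit level on any host. The sketch below rests on three assumptions: it models d7 as a 64-bit pattern whose low 32 bits are s14, it uses numer = 24 (one of the three values in the source), and it assumes max1s.(k1) still holds m_MIN. The helper name `write_low_word` is hypothetical.

```ocaml
(* Host-side model of the clobber. Assumption: s14 is the low 32 bits of
   d7. write_low_word is a hypothetical helper, not compiler code. *)

(* Replace the low 32 bits of a 64-bit register image with w. *)
let write_low_word (d : int64) (w : int32) : int64 =
  let high = Int64.shift_left (Int64.shift_right_logical d 32) 32 in
  let low = Int64.logand (Int64.of_int32 w) 0xFFFFFFFFL in
  Int64.logor high low

let m_min = -999.0

(* Value left in d7 after "fmsr s14, r0" with r0 = numer. *)
let clobbered_d7 (numer : int32) : float =
  let d7 = Int64.bits_of_float m_min in      (* fldd d7, [r12, #0] *)
  Int64.float_of_bits (write_low_word d7 numer)

let () =
  let d6 = m_min in                          (* assumed: max1s.(k1) = m_MIN *)
  let d7 = clobbered_d7 24l in               (* assumed: numer = 24 *)
  Printf.printf "d7 bits after fmsr: 0x%016Lx\n" (Int64.bits_of_float d7);
  Printf.printf "d6 <= d7: %b (true is expected when d7 is intact)\n" (d6 <= d7)
```

With the low word overwritten, d7 holds a value slightly below -999.0, so the test max1s.(k1) <= m_MIN no longer succeeds for an element that still holds m_MIN.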
Additional information
I ran my tests on a stock OCaml 4.00.0 compiler in an emulated Linux ARMEL environment under QEMU on OS X. I don't think this affects the results; I see exactly the same instruction sequence in my experimental cross-compiler builds (compiling for iOS under OS X).
Here is my configure line:
$ ./configure --host armv5tejl-unknown-linux-gnueabihf
(Need to use gnueabihf to get VFPv3 support.)
Then build ocamlopt as usual.