`make test-complex` fails on 32-bit Linux (Heisenbug?) #10027

rickhg12hs · 2015-02-02T20:37:57Z

$ make test-complex
    JULIA test/complex
     * complex             exception on 1: ERROR: LoadError: test failed: (1.0 + NaN*im == 1.0 + 1.0im)
 in expression: round(Complex(1.5,0.5),RoundDown,RoundUp) == Complex(1.0,1.0)
 in error at error.jl:19
 in default_handler at test.jl:27
 in do_test at test.jl:50
 in runtests at /usr/local/src/julia/julia/test/testdefs.jl:78
 in anonymous at multi.jl:641
 in run_work_thunk at multi.jl:602
 in remotecall_fetch at multi.jl:675
 in remotecall_fetch at multi.jl:690
 in anonymous at task.jl:1644
while loading complex.jl, in expression starting on line 752
ERROR: LoadError: LoadError: test failed: (1.0 + NaN*im == 1.0 + 1.0im)
 in expression: round(Complex(1.5,0.5),RoundDown,RoundUp) == Complex(1.0,1.0)
 in error at error.jl:19
 in default_handler at test.jl:27
 in do_test at test.jl:50
 in runtests at /usr/local/src/julia/julia/test/testdefs.jl:78
 in anonymous at multi.jl:641
 in run_work_thunk at multi.jl:602
 in remotecall_fetch at multi.jl:675
 in remotecall_fetch at multi.jl:690
 in anonymous at task.jl:1644
while loading complex.jl, in expression starting on line 752
while loading /usr/local/src/julia/julia/test/runtests.jl, in expression starting on line 42

make[1]: *** [complex] Error 1
make: *** [test-complex] Error 2
$ ./julia -e 'versioninfo()'
Julia Version 0.4.0-dev+3062
Commit a6ae3f1* (2015-02-02 19:40 UTC)
Platform Info:
  System: Linux (i686-redhat-linux)
  CPU: Genuine Intel(R) CPU           T2250  @ 1.73GHz
  WORD_SIZE: 32
  BLAS: libopenblas (DYNAMIC_ARCH NO_AFFINITY Banias)
  LAPACK: libopenblas
  LIBM: libopenlibm
  LLVM: libLLVM-3.3

I poked around a bit and noted some curious symptoms. Are some intermediary results getting clobbered somewhere?

julia> round(Complex(1.5,0.5),RoundDown,RoundUp)
1.0 + NaN*im

julia> round(Complex(1.5,0.5))
2.0 + NaN*im

julia> @which round(Complex(1.5,0.5))
round(z::Complex{T<:Real}) at complex.jl:701

julia> @less round(Complex(1.5,0.5))
round(z::Complex) = Complex(round(real(z)), round(imag(z)))
@vectorize_1arg Complex round

function round(z::Complex, digits::Integer, base::Integer=10)
    Complex(round(real(z), digits, base),
            round(imag(z), digits, base))
end

float{T<:FloatingPoint}(z::Complex{T}) = z
float(z::Complex) = Complex(float(real(z)), float(imag(z)))
@vectorize_1arg Complex float

~
~

julia> function my_round(z::Complex)
           rrz = round(real(z))
           riz = round(imag(z))
           Complex(rrz, riz)
       end
my_round (generic function with 1 method)

julia> my_round(Complex(1.5,0.5))
2.0 + NaN*im

julia> function my_round2(z::Complex)
           @show rrz = round(real(z))
           @show riz = round(imag(z))
           Complex(rrz, riz)
       end
my_round2 (generic function with 1 method)

julia> my_round2(Complex(1.5,0.5))
rrz = round(real(z)) = 2.0
riz = round(imag(z)) = 0.0
2.0 + 0.0im

julia>

The text was updated successfully, but these errors were encountered:

jiahao · 2015-02-02T20:43:03Z

I have no idea why this error occurs. The tests were newly introduced by #8291; they appear to be deeper bugs that happen to be exposed by the complex rounding code.

jiahao · 2015-02-02T20:45:05Z

Out of curiosity, what happens if you dump the bits for the NaN value?

function my_round(z::Complex)
           rrz = round(real(z))
           riz = round(imag(z))
           @show bits(rrz), bits(riz)
           Complex(rrz, riz)
       end

rickhg12hs · 2015-02-02T21:28:39Z

@jiahao : Heisenbugginess

julia> function my_round(z::Complex)
                  rrz = round(real(z))
                  riz = round(imag(z))
                  @show bits(rrz), bits(riz)
                  Complex(rrz, riz)
              end
my_round (generic function with 1 method)

julia> my_round(Complex(1.5,0.5))
(bits(rrz),bits(riz)) = ("0100000000000000000000000000000000000000000000000000000000000000","0000000000000000000000000000000000000000000000000000000000000000")
2.0 + 0.0im

jiahao · 2015-02-02T21:30:30Z

Memory access error? stumped

rickhg12hs · 2015-02-02T21:31:02Z

Maybe help?

julia> function my_round2(z::Complex)
                  rrz = round(real(z))
                  riz = round(imag(z))
                  Complex(rrz, riz)
             end
my_round2 (generic function with 1 method)

julia> my_round2(Complex(1.5,0.5))
2.0 + NaN*im

julia> bits(imag(my_round2(Complex(1.5,0.5))))
"1111111111111000000000000000000000000000000000000000000000000000"

jiahao · 2015-02-02T21:40:45Z

I was wondering if the NaN payload could tell us anything, but it's all zeroed.

rickhg12hs · 2015-02-02T21:46:40Z

Here's another look:

julia> function my_round5(z::Complex)
                  rrz = round(real(z))
                  riz = round(imag(z))
                  my_cmplx = zero(Complex)
                  my_cmplx += rrz
                  my_cmplx += riz*im
                  my_cmplx
       end
my_round5 (generic function with 1 method)

julia> my_round5(Complex(1.5,0.5))
2.0 + 0.0im

rickhg12hs · 2015-02-02T21:49:02Z

Does this narrow it down to the constructor?

julia> function my_round6(z::Complex)
                  rrz = round(real(z))
                  riz = round(imag(z))
                  rrz + im*riz
       end
my_round6 (generic function with 1 method)

julia> my_round6(Complex(1.5,0.5))
2.0 + 0.0im

jiahao · 2015-02-02T21:50:44Z

I believe so. @StefanKarpinski any ideas?

jiahao · 2015-02-02T21:57:38Z

How about trying the inner constructor?

function my_round{T}(z::Complex{T})
                  rrz = round(real(z))
                  riz = round(imag(z))
                  Complex{T}(rrz, riz)
       end

rickhg12hs · 2015-02-02T21:59:48Z

julia> function my_round{T}(z::Complex{T})
                         rrz = round(real(z))
                         riz = round(imag(z))
                         Complex{T}(rrz, riz)
              end
my_round (generic function with 1 method)

julia> my_round(Complex(1.5,0.5))
2.0 + NaN*im

Define round for complex numbers

staticfloat · 2015-02-04T18:15:06Z

There seems to be something weird going on from the code_typed -> code_llvm step:

julia> @code_typed my_round(Complex(1.5,0.5))
1-element Array{Any,1}:
 :($(Expr(:lambda, Any[:z], Any[Any[:rrz,:riz],Any[Any[:z,Complex{Float64},0],Any[:rrz,Float64,18],Any[:riz,Float64,18]],Any[],Any[]], :(begin  # none, line 2:
        rrz = (top(box))(Float64,(top(rint_llvm))((top(getfield))(z::Complex{Float64},:re)::Float64)) # line 3:
        riz = (top(box))(Float64,(top(rint_llvm))((top(getfield))(z::Complex{Float64},:im)::Float64)) # line 4:
        return $(Expr(:new, Complex{Float64}, :(rrz::Float64), :(riz::Float64)))
    end::Complex{Float64}))))

julia> @code_llvm my_round(Complex(1.5,0.5))

define %Complex.5 @julia_my_round_43277(%Complex.5) {
top:
  %1 = extractvalue %Complex.5 %0, 0, !dbg !1032
  %2 = call double @llvm.rint.f64(double %1), !dbg !1032
  %3 = extractvalue %Complex.5 %0, 1, !dbg !1034
  %4 = call double @llvm.rint.f64(double %3), !dbg !1034
  %5 = insertvalue %Complex.5 undef, double %2, 0, !dbg !1035
  %6 = insertvalue %Complex.5 %5, double %4, 1, !dbg !1035, !julia_type !1036
  ret %Complex.5 %6, !dbg !1035
}

That undef looks awfully suspicious. @vtjnash, @Keno any idea on how I can debug why this undef is appearing? This is a pretty bad issue, as it essentially breaks the 32-bit build.

mbauman · 2015-02-04T18:30:23Z

That's just fine — it's the standard idiom for initializing a new aggregate type in LLVM. See the reference for insertvalue, particularly the example. You start with an object full of undefs and iteratively fill its elements.

vtjnash · 2015-02-04T19:30:07Z

Complex numbers don't round trip correctly to C. That will be fixed very soon in my PR jn/ccall3

staticfloat · 2015-02-04T19:31:49Z

We're not calling a C function with complex numbers though, are we?

jiahao · 2015-02-04T19:43:31Z

I don't believe so. Any C function calls are on the real and imaginary parts separately.

vtjnash · 2015-02-05T14:38:11Z

i'm getting invalid native code if i run this test with a sysimg, but seem to be getting correct results if a sysimg is not involved.

however, i think this might be fixed in llvm3.5

nalimilan · 2015-02-05T15:56:11Z

FWIW it happens with LLVM 3.3, 3.4 and 3.5 when building Fedora packages. E.g with 3.5:
https://copr-be.cloud.fedoraproject.org/results/nalimilan/julia-nightlies/fedora-rawhide-i386/julia-0.4.0-0.20150205.fc21/build.log

tkelman · 2015-02-09T05:58:15Z

We haven't had new nightlies for a week because of this. I think we should revert #8291 until either we figure out how to do it without breaking 32 bit, or we've upgraded LLVM and hopefully it's fixed there.

rickhg12hs · 2015-02-09T12:40:20Z

This is a bit slower and uses a bit more RAM, but works on my 32-bit Fedora.

Maybe a temporary workaround?

julia> @time round(Complex(1.5,0.5))
elapsed time: 2.3746e-5 seconds (100 bytes allocated)
2.0 + NaN*im

julia> @time round(Complex(1.5,0.5))
elapsed time: 2.3956e-5 seconds (100 bytes allocated)
2.0 + NaN*im

julia> @time round(Complex(1.5,0.5))
elapsed time: 3.0242e-5 seconds (100 bytes allocated)
2.0 + NaN*im

julia> my_round(z::Complex) = Complex(map(round,reim(z))...)   # <----- New Function Def
my_round (generic function with 1 method)

julia> @time my_round(Complex(1.5,0.5))
elapsed time: 0.006331542 seconds (19 kB allocated)
2.0 + 0.0im

julia> @time my_round(Complex(1.5,0.5))
elapsed time: 2.9893e-5 seconds (148 bytes allocated)
2.0 + 0.0im

julia> @time my_round(Complex(1.5,0.5))
elapsed time: 2.7448e-5 seconds (148 bytes allocated)
2.0 + 0.0im

julia> @time my_round(Complex(1.5,0.5))
elapsed time: 2.8426e-5 seconds (148 bytes allocated)
2.0 + 0.0im

julia> @time my_round(Complex(1.5,0.5))
elapsed time: 2.6959e-5 seconds (148 bytes allocated)
2.0 + 0.0im

staticfloat · 2015-02-09T20:16:56Z

@rickhg12hs would you be able to put together a PR that I could run on my buildbots to see if it does indeed fix the problem?

vtjnash · 2015-02-09T21:37:05Z

That'll fix it, but only because it very effectively defeats llvm optimization. Simply preventing inlining of round would have the same effect

rickhg12hs · 2015-02-10T02:12:28Z

@staticfloat : PR made. #10145

@vtjnash : Played with @noinline but clearly I don't know what it does and couldn't seem to use it (blind inclusion) to fix this issue.

vtjnash · 2015-02-10T02:16:00Z

would probably look something like:

@noinline noinline_round(x...) = round(x...)
round(c::Complex) = Complex(noinline_round(real(c)), noinline_round(imag(c)))

i realize that's a bit awkward, but annotating inline/noinline at the call site isn't actually supported yet

…ound Update base/complex.jl for Complex round (Fix/Workaround #10027)

JeffBezanson · 2015-02-10T16:03:17Z

+1 to @vtjnash 's suggestion. That would be a much better workaround. Even better if it is only done on 32-bit systems where the problem exists.

rickhg12hs · 2015-03-31T08:31:45Z

Since this is "working" now, should it be closed?

Comment in code mentions it's a workaround. Will this prompt a revisit with new LLVM version?

Is there a "workaround" issue tag that could be applied to prompt future look.

ViralBShah · 2015-03-31T08:36:07Z

Perhaps we should have a revisit label.

tkelman · 2015-12-04T05:20:33Z

This probably works better now that we restrict to >= pentium4

vtjnash · 2016-05-20T22:23:11Z

this was fixed with the sret fixes some time back. I'll remove the workaround code in #16219

jiahao added the system:32-bit Affects only 32-bit systems label Feb 2, 2015

tkelman referenced this issue Feb 3, 2015

Merge pull request #8291 from JuliaLang/cjh/round-complex

78a69a8

Define round for complex numbers

This was referenced Feb 5, 2015

Frequent Win32 segfaults on AppVeyor #10045

Closed

Faster setindex! for RHSs with inefficient linear indexing #10063

Merged

staticfloat mentioned this issue Feb 9, 2015

More comprehensive test coverage #9493

Closed

rickhg12hs mentioned this issue Feb 10, 2015

Update base/complex.jl for Complex round (Fix/Workaround #10027) #10145

Merged

staticfloat added a commit that referenced this issue Feb 10, 2015

Merge pull request #10145 from rickhg12hs/complex-round-32-bit-workar…

1575bc8

…ound Update base/complex.jl for Complex round (Fix/Workaround #10027)

simonster mentioned this issue Feb 13, 2015

Make order of arguments for groupby consistent with other iterators. JuliaCollections/Iterators.jl#30

Merged

JeffBezanson added a commit that referenced this issue Mar 9, 2015

restrict workaround for #10027 to 32-bit systems

f77d8a5

vtjnash closed this as completed May 20, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`make test-complex` fails on 32-bit Linux (Heisenbug?) #10027

`make test-complex` fails on 32-bit Linux (Heisenbug?) #10027

rickhg12hs commented Feb 2, 2015

jiahao commented Feb 2, 2015

jiahao commented Feb 2, 2015

rickhg12hs commented Feb 2, 2015

jiahao commented Feb 2, 2015

rickhg12hs commented Feb 2, 2015

jiahao commented Feb 2, 2015

rickhg12hs commented Feb 2, 2015

rickhg12hs commented Feb 2, 2015

jiahao commented Feb 2, 2015

jiahao commented Feb 2, 2015

rickhg12hs commented Feb 2, 2015

staticfloat commented Feb 4, 2015

mbauman commented Feb 4, 2015

vtjnash commented Feb 4, 2015

staticfloat commented Feb 4, 2015

jiahao commented Feb 4, 2015

vtjnash commented Feb 5, 2015

nalimilan commented Feb 5, 2015

tkelman commented Feb 9, 2015

rickhg12hs commented Feb 9, 2015

staticfloat commented Feb 9, 2015

vtjnash commented Feb 9, 2015

rickhg12hs commented Feb 10, 2015

vtjnash commented Feb 10, 2015

JeffBezanson commented Feb 10, 2015

rickhg12hs commented Mar 31, 2015

ViralBShah commented Mar 31, 2015

tkelman commented Dec 4, 2015

vtjnash commented May 20, 2016

make test-complex fails on 32-bit Linux (Heisenbug?) #10027

make test-complex fails on 32-bit Linux (Heisenbug?) #10027

Comments

rickhg12hs commented Feb 2, 2015

jiahao commented Feb 2, 2015

jiahao commented Feb 2, 2015

rickhg12hs commented Feb 2, 2015

jiahao commented Feb 2, 2015

rickhg12hs commented Feb 2, 2015

jiahao commented Feb 2, 2015

rickhg12hs commented Feb 2, 2015

rickhg12hs commented Feb 2, 2015

jiahao commented Feb 2, 2015

jiahao commented Feb 2, 2015

rickhg12hs commented Feb 2, 2015

staticfloat commented Feb 4, 2015

mbauman commented Feb 4, 2015

vtjnash commented Feb 4, 2015

staticfloat commented Feb 4, 2015

jiahao commented Feb 4, 2015

vtjnash commented Feb 5, 2015

nalimilan commented Feb 5, 2015

tkelman commented Feb 9, 2015

rickhg12hs commented Feb 9, 2015

staticfloat commented Feb 9, 2015

vtjnash commented Feb 9, 2015

rickhg12hs commented Feb 10, 2015

vtjnash commented Feb 10, 2015

JeffBezanson commented Feb 10, 2015

rickhg12hs commented Mar 31, 2015

ViralBShah commented Mar 31, 2015

tkelman commented Dec 4, 2015

vtjnash commented May 20, 2016

`make test-complex` fails on 32-bit Linux (Heisenbug?) #10027

`make test-complex` fails on 32-bit Linux (Heisenbug?) #10027