clang built julia is unstable #1013

Closed
ViralBShah opened this Issue Jul 7, 2012 · 12 comments


@ViralBShah
The Julia Language member

Building with make USECLANG=1 doesn't pass all tests. For me, it gets stuck here - and is perhaps related to the ccall issue discussed in #978. Also, the tests seem to take much longer when julia is built with clang.

$ make testall
    PERL test/unicode/UTF-32BE.txt
    PERL test/unicode/UTF-32LE.txt
    PERL test/unicode/UTF-16BE.txt
    PERL test/unicode/UTF-16LE.txt
    PERL test/unicode/UTF-8.txt
    JULIA test/all
     * all
     * core
     * numbers
     * strings
     * unicode
     * corelib
     * hashing
     * arrayops
     * lapack
     * factorizations
     * fft
     * sparse
     * arpack
     * bitarray
     * random
     * special
     * functional
     * bigint
     * distributions
     * combinatorics
     * statistics
     * integers
     * glpk
@StefanKarpinski
The Julia Language member

Same issue (also on latest OS X).

@vtjnash
The Julia Language member

Very strange. The failing function appears to be stat().

Sometime during the call to uv_fs_stat in jl_stat (ret = uv_fs_stat(uv_default_loop(), &req, path, NULL);), the stack can be seen to be destroyed.

It's not clear why this would be a problem, but the allocation of uv_fs_t req seems to be generating broken code.
I tried switching to alloca, but that still failed. However, switching to malloc or making this a static pointer (or presumably, but untested, refactoring to pass around a uv_fs_t buffer instead of a struct stat) would seem to make this OK. The real question is why are these failing?

The answer lies in a few missing compiler flags:
CFLAGS=" -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_DARWIN_USE_64_BIT_INODE=1 "
that are used by libuv but not by julia and alter the size required by the stat struct. I don't know what they mean, I just know that adding one or more of them will likely solve the issue.

(sorry if that reads a bit like a story, but I kept typing this as I kept finding new details)

@pao
The Julia Language member

USE_64_BIT_INODE sounds very much like it would affect the size of the stat structure, which contains the inode number.

@ViralBShah
The Julia Language member

I am trying out the suggested CFLAGS, but currently, make testall dies when julia is built with clang, and openblas is used.


$ make test-lapack
    JULIA test/lapack
     * lapack
assertion failed: norm(-(*(l,u),ref(a,p,:)))<Eps
 in load at util.jl:234
 in load at util.jl:246
 in runtests at ./runtests.jl:3
 in include at boot.jl:197
 in process_options at client.jl:172
 in _start at client.jl:214
at /Users/viral/julia-clang/test/lapack.jl:4
 in load at util.jl:257
 in runtests at ./runtests.jl:3
 in include at boot.jl:197
 in process_options at client.jl:172
 in _start at client.jl:214
at ./runtests.jl:48
 in include at boot.jl:197
 in process_options at client.jl:172
 in _start at client.jl:214
make[1]: *** [lapack] Error 1
make: *** [test-lapack] Error 2
@ViralBShah
The Julia Language member

I should add that all openblas tests pass for me. I wonder if this is some clang/ccall related issue, or some other corruption.

@nolta
The Julia Language member

@ViralBShah, are you running on sandybridge? If so, I'm seeing a similar error, and compiling openblas with TARGET=NEHALEM or USE_THREAD=0 seems to fix it.

xianyi/OpenBLAS#125

@ViralBShah
The Julia Language member

Yes, I am on sandybridge. Just seeing if openblas' sandybridge support works well enough, when built with Apple clang. I suspect this is going to take some time to stabilize.

@ViralBShah
The Julia Language member

Apart from the openblas+lapack tests, which seem to be an openblas issue, all other tests pass with the flags that @vtjnash mentions.

@ViralBShah ViralBShah added a commit that referenced this issue Jul 15, 2012
@ViralBShah ViralBShah Add -D_LARGEFILE_SOURCE -D_DARWIN_USE_64_BIT_INODE=1 to JCFLAGS
for Darwin+clang. Seems to fix the instability described in #1013,
and all tests except lapack pass now.

Should this be added for gcc, and in general for all platforms?
4f6050d
@nolta
The Julia Language member

This is not just a problem with clang. The lapack test also fails on my sandybridge ubuntu 10.04 box, for both gcc 4.4.3 and 4.7.1.

@ViralBShah
The Julia Language member

I am closing this bug and opening #1056 for the openblas issue.

@ViralBShah ViralBShah closed this Jul 15, 2012
@ViralBShah
The Julia Language member
@nolta
The Julia Language member

Yes, I get the exact same error as you do. It's what prompted me to open the openblas issue.

@HarlanH HarlanH added a commit to HarlanH/julia that referenced this issue Jul 18, 2012
@ViralBShah ViralBShah Add -D_LARGEFILE_SOURCE -D_DARWIN_USE_64_BIT_INODE=1 to JCFLAGS
for Darwin+clang. Seems to fix the instability described in #1013,
and all tests except lapack pass now.

Should this be added for gcc, and in general for all platforms?
6ecc231