FYI: highly selective segfault when linked against LLVM 3.4 #6369

Closed
cmundi opened this issue Apr 2, 2014 · 17 comments

@cmundi
Contributor

cmundi commented Apr 2, 2014

Caveat for anyone interested in building Julia with LLVM 3.4

I stumbled onto this after accidentally building Julia against LLVM 3.4. Since then I've done clean experiments (gist here) to confirm that the effect is plausibly related to LLVM 3.4. Although I have isolated the call where the segfault occurs, I still suspect trouble may be starting downstack. I have made no effort to study Julia's io internals and have not found a less byzantine way to trigger this fault.

@cmundi
Contributor Author

cmundi commented Apr 2, 2014

I should say clearly: I'm not asking for help on this. I don't regard it as a problem. I'm not interested in deviating from the nominally supported build configs without good reason. I stumbled into this and share it FYI.

@ihnorton
Member

ihnorton commented Apr 2, 2014

We are fairly actively maintaining support against the LLVM trunk, so that should work, and if it doesn't, Keno or I will fix it pretty quickly (I haven't pulled in about a week). However, I believe the plan is to skip 3.4 entirely.

@cmundi
Contributor Author

cmundi commented Apr 2, 2014

@ihnorton Cool. Let me know if you need more data.

@cmundi
Contributor Author

cmundi commented Apr 2, 2014

Also see this recent development which may get in the way of your investigation: #6371. It seems I was just unlucky enough to stumble onto this LLVM skew issue just before the deletion of StoredArray made things more interesting.

UPDATE: Never mind this StoredArray side-issue... got fixed a few minutes ago. Issue with LLVM 3.4 remains.

@cmundi
Contributor Author

cmundi commented Apr 2, 2014

If this isn't reproducible with LLVM 3.5 pre then I'd suggest closing this "issue" -- just an FYI for anyone unlucky enough to use the system LLVM @ 3.4

@cmundi
Contributor Author

cmundi commented Apr 2, 2014

FYI. I just confirmed that this segfault continues in the post-StoredArray era.

Building v0.2.0-2402-g9e26e21 with LLVM 3.4 still gives a segfault in the example I described.

I have not tested with any LLVM 3.5 pre and you'll get there before me anyway I expect.

@mbauman
Sponsor Member

mbauman commented Apr 2, 2014

I can reproduce with today's LLVM-svn. On OS X it's a bus error; I think the ios may be getting closed prematurely. The callback is in Cairo.jl, for what it's worth.

@cmundi
Contributor Author

cmundi commented Apr 2, 2014

Confirmed: we are both seeing the same callback fail in Cairo. Thanks for the cross-check.

On Apr 2, 2014 10:10 AM, "Matt Bauman" wrote:

I can reproduce with today's LLVM-svn. On OS X it's a bus error; I think the ios may be getting closed prematurely. The callback is in Cairo.jl (https://github.com/JuliaLang/Cairo.jl/blob/f2f8203ff77b8459cc602e2a425f5133046879cf/src/Cairo.jl#L70), for what it's worth.

@ihnorton
Member

I was able to reproduce this a few days ago, but no longer. Not sure what may have fixed it. @cmundi can you confirm?

@cmundi
Contributor Author

cmundi commented Apr 16, 2014

Good question. I switched to v0.2.0-2305-g4def095 and did make clean && make distcleanall and then found the build (everything stock; LLVM 3.3) is flat broken for me. (The last time I did a clean build was a few days ago.) The last few lines I see in a completely clean baseline build are

/usr/bin/install -c -m 644 include/libunwind-common.h '/home/cmundi/julia/usr/include'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   797  100   797    0     0   2325      0 --:--:-- --:--:-- --:--:--  2330
bzip2: (stdin) is not a bzip2 file.
/bin/tar: Child returned status 2
/bin/tar: Error is not recoverable: exiting now
make[2]: *** [patchelf-0.6/configure] Error 2
make[1]: *** [julia-release] Error 2
make: *** [release] Error 2

Well that's disappointing. I won't have much time to tinker before the weekend. But if I can just get a stable build I'll be happy to try again in about 24 hours.

@cmundi
Contributor Author

cmundi commented Apr 16, 2014

The deps/patchelf-0.6.tar.bz2 is not actually a bzip2 file! The contents of that file are

<HTML>
<HEAD>
</HEAD>
<BODY bgcolor="#FFFFFF">
<table width="100%" height="100%" border=0>
<tr>
<td width="100%" height="100%" align=center>
<span style="font-size:16px; font-family:Georgia,Garmond,'Times New Roman';color:#02316b">
<b>Diese Domain wurde gesperrt...</b>
</span>
<span style="font-size:12px; font-family:Arial,Verdana;color:#02316b">
<br>
<br>Falls Sie der Administrator dieser Domain
<br>sind und Fragen zur Sperrung Ihrer
<br>Domain haben, wenden Sie sich bitte
<br>an unser <a href="http://www.united-domains.de/support/" style="color:#02316b">Support-Team</a>.
</span>
<br>
<br><a href="http://www.united-domains.de" alt="Domain"><img src="http://www.united-domains.de/images/evolution/udag_logo.png" border="0" width="290" height="41"></a>
</td>
</tr>
</table>
</BODY>
</HTML>

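Now, mein Deutsch ist alt und schwach (my German is old and weak), but the page says roughly: "This domain has been suspended. If you are the administrator of this domain and have questions about the suspension, please contact our support team." I think that means someone's ISP thinks he was naughty.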
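For anyone who hits the same "not a bzip2 file" error, here is a minimal sketch in Python of a check that would flag an HTML error page saved under a .tar.bz2 name before tar ever sees it. This is not part of Julia's build system, and the function names are hypothetical. The only fact it relies on is that a bzip2 stream begins with the bytes BZh followed by a block-size digit from 1 to 9.

```python
import bz2

# Every bzip2 stream starts with "BZh" followed by a block-size digit 1-9.
BZIP2_MAGIC = b"BZh"


def looks_like_bzip2(data: bytes) -> bool:
    """Return True if `data` begins with a plausible bzip2 stream header."""
    return (
        len(data) >= 4
        and data[:3] == BZIP2_MAGIC
        and data[3:4] in b"123456789"
    )


def file_looks_like_bzip2(path: str) -> bool:
    """Check only the first four bytes of the file at `path` (hypothetical helper)."""
    with open(path, "rb") as f:
        return looks_like_bzip2(f.read(4))


if __name__ == "__main__":
    # A real bzip2 stream passes; an HTML error page does not.
    print(looks_like_bzip2(bz2.compress(b"patchelf")))
    print(looks_like_bzip2(b"<HTML>\n<HEAD>\n</HEAD>\n"))
```

Running `file_looks_like_bzip2("deps/patchelf-0.6.tar.bz2")` right after the download step would separate a bad mirror response from a genuine extraction failure. It only inspects the header, so it does not prove the archive is intact.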

@cmundi
Contributor Author

cmundi commented Apr 16, 2014

Yep. http://hydra.nixos.org/ ist verboten as of now (April 16, 02:22 UTC). Maybe we should fork https://github.com/NixOS/patchelf? Ah yes. #6532

@pao
Member

pao commented Apr 16, 2014

Hydra is back up. Switching to or adding a backup server is #6532. You may now return to your regularly scheduled issue.

@cmundi
Contributor Author

cmundi commented Apr 16, 2014

LOL. Thanks. I'll be someplace where I can do a build in about 12 hours.

On Apr 16, 2014 6:28 AM, "pao" wrote:

Hydra is back up. Switching to or adding a backup server is #6532. You may now return to your regularly scheduled issue.

@mbauman
Sponsor Member

mbauman commented Apr 16, 2014

I'm still seeing a bus error on LLVM-svn (849ca45/r206385) and julia c921451 with OS X, but I'm now missing the backtrace label in (what still looks like) the callback frame. Otherwise looks identical.

@cmundi
Contributor Author

cmundi commented Apr 17, 2014

In a quick test (recompiling to use LLVM 3.4) I was not able to reproduce the crash. This was a clean build with LLVM 3.3 called out in deps/Versions.make, followed by a second build with LLVM 3.4 called out in deps/Versions.make. Both builds passed test_gadfly.jl by producing a PNG. I just started a completely clean build (as in the original report and subsequent confirmations) using LLVM 3.4 and will update here.

Update & Correction: The segfault still exists, with the same backtrace, in a perfectly clean build with LLVM 3.4 (julia v0.2.0-2635-gc77e098).

@JeffBezanson
Sponsor Member

Closing as we're not planning to use LLVM 3.4, and will be forced to fix any issues around LLVM 3.5 very soon.


5 participants