Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition in building MPIR/yasm #11844

Closed
vbraun opened this issue Sep 24, 2011 · 20 comments
Closed

Race condition in building MPIR/yasm #11844

vbraun opened this issue Sep 24, 2011 · 20 comments

Comments

@vbraun
Copy link
Member

vbraun commented Sep 24, 2011

The included Yasm includes re2c which has a race condition during parallel build. See the upstream ticket for details:

http://tortall.lighthouseapp.com/projects/78676-yasm/tickets/238-parallel-make-fails

Fixed by #11616.

Depends on #11964
Depends on #11616

Upstream: Fixed upstream, but not in a stable release.

CC: @wbhart @nexttime

Component: packages: standard

Reviewer: Leif Leonhardy

Issue created by migration from https://trac.sagemath.org/ticket/11844

@vbraun vbraun added this to the sage-5.0 milestone Sep 24, 2011
@nexttime
Copy link
Mannequin

nexttime mannequin commented Sep 24, 2011

comment:1

Nice.

Which version of MPIR are you referring to?

(I don't know whether the yasm version in our "recent" MPIR packages ever changed at all, or which one we currently use.)

FWIW, IIRC we did a lot of massively parallel builds of MPIR 2.1.x, because of the race condition in MPIR's make install (fixed in MPIR 2.1.3 or .4), but never ran into an error caused by yasm.

@nexttime nexttime mannequin added p: minor / 4 and removed p: major / 3 labels Sep 24, 2011
@vbraun
Copy link
Member Author

vbraun commented Sep 24, 2011

comment:2

This is the mpir-1.2.2.p2.spkg from sage-4.7.2.alpha2. All versions of Yasm have the bug.

You probably didn't trip over the yasm bug because you were using a file system that allows you to open the same file multiple times for writing. Then you don't get the segfault, but you may get corrupted output. If you are extremely lucky gcc might even compile the clobbered output from re2c, and you'd just get mathematically wrong results for integer arithmetic.

@nexttime
Copy link
Mannequin

nexttime mannequin commented Sep 24, 2011

comment:3

Replying to @nexttime:

FWIW, IIRC we did a lot of massively parallel builds of MPIR 2.1.x, because of the race condition in MPIR's make install (fixed in MPIR 2.1.3 or .4), but never ran into an error caused by yasm.

The latter is apparently fixed in MPIR 2.1.4.

@nexttime
Copy link
Mannequin

nexttime mannequin commented Sep 24, 2011

comment:4

If it doesn't necessarily lead to a build error... :)

(I've also added a reference to this ticket on #11616.)

@nexttime nexttime mannequin added p: critical / 2 and removed p: major / 3 labels Sep 24, 2011
@nexttime
Copy link
Mannequin

nexttime mannequin commented Sep 24, 2011

comment:5

Replying to @nexttime:

Replying to @nexttime:

FWIW, IIRC we did a lot of massively parallel builds of MPIR 2.1.x, because of the race condition in MPIR's make install (fixed in MPIR 2.1.3 or .4), but never ran into an error caused by yasm.

The latter is apparently fixed in MPIR 2.1.4.

Not the latter, I meant MPIR's make install race condition. ;-)

@vbraun
Copy link
Member Author

vbraun commented Oct 3, 2011

comment:6

This is now fixed upstream in

yasm/yasm@2bd6651

@vbraun
Copy link
Member Author

vbraun commented Oct 3, 2011

Changed upstream from Workaround found; Bug reported upstream. to Fixed upstream, but not in a stable release.

@nexttime
Copy link
Mannequin

nexttime mannequin commented Oct 3, 2011

comment:7

Wonder whether they can include the fix in MPIR 2.5.0 (which should already be right before the door).

I'll provide updated MPIR spkgs (at least an MPIR 2.1.3.p5 or 2.1.4.pX spkg, the others for #11616) fixing the need to specify ABI=32 on 32-bit operating systems running on 64-bit CPUs anyway, then I can include such a fix by a patch to upstream.

Note that the yasm guys do not check whether the call to tmpfile() succeeded, nor is the file opened in text mode as it was before. (The latter should be irrelevant on our supported platforms -- including Cygwin AFAIK. Microsoft deprecated tmpfile() anyway... ;-) )

@vbraun
Copy link
Member Author

vbraun commented Oct 3, 2011

comment:8

Note that the yasm guys do not check whether the call to tmpfile() succeeded

In the unlikely case that tmpfile() can't find a unique file name, it'll return NULL so at least the yasm build will fail with a clean segfault. There are probably many other things that would go wrong at build time if tmpfile() fails...

@nexttime
Copy link
Mannequin

nexttime mannequin commented Oct 3, 2011

comment:9

Replying to @vbraun:

Note that the yasm guys do not check whether the call to tmpfile() succeeded

In the unlikely case that tmpfile() can't find a unique file name, it'll return NULL

ERRORS
       EACCES Search permission denied for directory in file's path prefix.

       EEXIST Unable to generate a unique filename.

       EINTR  The call was interrupted by a signal.

       EMFILE Too many file descriptors in use by the process.

       ENFILE Too many files open in the system.

       ENOSPC There was no room in the directory to add the new filename.

       EROFS  Read-only file system.

So there are a couple of reasons why tmpfile()could fail (and these may be temporary, i.e., later calls at least from other processes may well succeed).

Letting re2c segfault is of course the pythonic way... ;-)

@vbraun
Copy link
Member Author

vbraun commented Oct 3, 2011

comment:10

Replying to @nexttime:

So there are a couple of reasons why tmpfile()could fail

And all of them will break the build, whether or not re2c catches the error.

@nexttime
Copy link
Mannequin

nexttime mannequin commented Oct 5, 2011

comment:11

Just in case...

@nexttime
Copy link
Mannequin

nexttime mannequin commented Oct 5, 2011

Dependencies: #11896

@jdemeyer
Copy link

Changed dependencies from #11896 to #11964

@jdemeyer
Copy link

comment:12

Nothing to review here...

@nexttime
Copy link
Mannequin

nexttime mannequin commented Apr 5, 2012

comment:13

This is now fixed by the MPIR 2.4.0.p2 spkg at #11616 (patch to upstream added).

In case that gets merged, this ticket can be closed as a duplicate (or more precisely, as fixed / superseded by #11616).

@nexttime
Copy link
Mannequin

nexttime mannequin commented Apr 18, 2012

comment:14

Positive review w.r.t. that #11616 fixes this (by a patch from Sage, not MPIR upstream).

But we IMHO shouldn't close this ticket until #11616 got merged.

@nexttime
Copy link
Mannequin

nexttime mannequin commented Apr 18, 2012

Reviewer: Leif Leonhardy

@nexttime nexttime mannequin removed this from the sage-5.0 milestone Apr 18, 2012
@jdemeyer
Copy link

Changed dependencies from #11964 to #11964, #11616

@jdemeyer

This comment has been minimized.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants