Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pari fails to catch error (32-bit gcc 4.8.1) #14873

Closed
vbraun opened this issue Jul 9, 2013 · 16 comments
Closed

Pari fails to catch error (32-bit gcc 4.8.1) #14873

vbraun opened this issue Jul 9, 2013 · 16 comments

Comments

@vbraun
Copy link
Member

vbraun commented Jul 9, 2013

On Fedora 19 32-bit I get the following doctest failure:

sage: a = pari('2^100000000')
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-1-13afbe44126a> in <module>()
----> 1 a = pari('2^100000000')

/home/vbraun/Sage/sage-5.11.beta3/local/lib/python2.7/site-packages/sage/libs/pari/gen.so in sage.libs.pari.gen.PariInstance.__call__ (sage/libs/pari/gen.c:51081)()

RuntimeError: evaluating PARI string

This doctests checks that Pari errors are caught since the result doesn't fit into the default stack, so the pari stack needs to be enlarged during the computation.

Curiously, this works correctly on Fedora 19 x86_64. I haven't completely thought through if its an error in our code, but marking the variables used in the setjmp branch volatile fixes it. I am tempted to say that treating pari_errno as constant and optmizing it out / putting it in a register is legal if its not declared volatile.

Apply: attachment: trac_14873_pari_volatile.patch

CC: @nexttime @nbruin @pjbruin @tornaria @jdemeyer

Component: packages: standard

Author: Volker Braun

Reviewer: Peter Bruin

Merged: sage-5.12.beta0

Issue created by migration from https://trac.sagemath.org/ticket/14873

@vbraun
Copy link
Member Author

vbraun commented Jul 9, 2013

comment:2

For the record, this is Fedora 19 and

[vbraun@localhost ~]$ gcc --version
gcc (GCC) 4.8.1 20130603 (Red Hat 4.8.1-1)
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

@vbraun
Copy link
Member Author

vbraun commented Jul 10, 2013

comment:4

Meant to cc the other Bruin, maybe at least one of you cares about Pari ;-)

@pjbruin
Copy link
Contributor

pjbruin commented Jul 11, 2013

comment:5

The bug does not occur on Gentoo x86_64 with GCC 4.4.5. Unfortunately, I don't have a 32-bit system to test the patch, and I don't understand the subtleties of setjmp well enough to see if this is the right fix.

@vbraun
Copy link
Member Author

vbraun commented Jul 12, 2013

comment:6

In fact, that is one of the classic cases where you must declare the variable X volatile: First setjmp(), then change variable X, then longjmp back. If you want the value of the variable X to reflect the change between the setjmp and longjmp then it must be declared volatile.

If it doesn't occur on your system then that is just because your optimizer is not smart enough.

There isn't anything special about 32 vs 64 bit here, it just happened to be triggered in 32 bit only.

@pjbruin
Copy link
Contributor

pjbruin commented Jul 12, 2013

comment:7

Replying to @vbraun:

In fact, that is one of the classic cases where you must declare the variable X volatile: First setjmp(), then change variable X, then longjmp back. If you want the value of the variable X to reflect the change between the setjmp and longjmp then it must be declared volatile.

So far I understand the use of volatile. However, if that is the only problem, I would guess that a volatile declaration would only be needed for pari_retries and __catcherr. The variables pari_errno and __env don't seem to be the problem, since pari_errno is the return value of setjmp, and __env is not used inside the error-catching block. Or am I missing something?

I am also wondering why __catcherr is a global variable and not a stack variable like the others. Maybe the original author, who did not use volatile for this variable, had a compiler that put a local __catcherr into a register, but not a global one?

@vbraun
Copy link
Member Author

vbraun commented Jul 12, 2013

comment:8

I haven't really looked at what __catcherr is used for, but since the comment talks about it not being reentrant I assumed that it is intentionally global. It is accessed non-locally (throug the _pari_endcatch) macro from sage/libs/pari/misc.h. In fact, I don't know what the Pari err_catch() function is supposed to do, a web search (http://permalink.gmane.org/gmane.comp.mathematics.pari.devel/3108) suggests that it is intentionally undocumented and should not be used.

The fact that we pass a pointer to the jump buffer down to err_catch just before leaving its scope

#define _pari_catch {                                                         \
        jmp_buf __env;                                                        \
        ...
        __catcherr = err_catch(CATCH_ALL, &__env);                            \
    }

is very worrying. And by that, I mean: does not work as intended in my 32-bit VM. Pari err_catch() saves the pointer and dereferences it later when the memory location has potentially changed.

I agree that pari_errno would not be a problem if setjmp() were a normal function call. Actually, I just remembered that we had similar issues before (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56982). The problem is that calling setjmp in a combined assign / selection statement is not legal C/C++ (see http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1494.pdf Section 7.13.1.1).

This bug is also in the pari version we are shipping (in paricom.h the #define CATCH macro), but fixed in the upstream git version.

Updated patch fixes all that is wrong in pari_err.h, including the use of reserved C/C++ identifiers (double underscores and leading underscore for globals are reserved).

@vbraun
Copy link
Member Author

vbraun commented Jul 12, 2013

Updated patch

@pjbruin
Copy link
Contributor

pjbruin commented Jul 15, 2013

comment:9

Attachment: trac_14873_pari_volatile.patch.gz

There are a few more points that I am worried about, which are not strictly about this patch but about PARI error handling in general:

  • The file sage/libs/pari/pari_err.pxi causes sig_on() and sig_off() to be reinterpreted as _pari_sig_on() and _pari_sig_off(), which leads to confusion because (1) sig_on() and sig_off() already exist; (2) the names don't hint at the fact that one of the purposes is to catch PARI errors.
  • The functions sig_on() and sig_off(), modified as above, are very often called in different scopes by Sage (see sage/libs/pari/gen.pyx). It may be that (after applying this patch) _pari_catch_env remains in the scope long enough in all or most current uses in Sage, but there is nothing that guarantees this.
  • The function _pari_trap() from sage/libs/pari/gen.pyx is not declared in any header files; this led to it not being found in some cases when I was experimenting with it.
  • PARI's undocumented err_catch() and err_leave() functions no longer exist in current versions of PARI; one now has to use macros. This means that the current error-catching code will certainly break when we upgrade to a newer PARI. The following comes from a recent version of pari/paricom.h:
pari_CATCH(numer) {
  recovery
} pari_TRY {
  code
} pari_ENDCATCH
will execute 'code', then 'recovery' if exception 'numer' is thrown
[ any exception if numer == CATCH_ALL ].
pari_RETRY = as pari_TRY, but execute 'recovery', then 'code' again [still catching]

In the PARI version that Sage currently uses, similar macros exists without the prefix pari_, and they are implemented differently.

I made a patch that, as an alternative to sig_on()...sig_off(), implements new Cython macros sage_pari_catch(), sage_pari_catch_end() and sage_pari_catch_reset(). (I am open to suggestions for different names.) They use the PARI macros described above, as well as (the unmodified) sig_on()...sig_off(), and do not directly call setjmp(). The syntax is as follows:

sage_pari_catch()
...code...
sage_pari_catch_end()

Within such a block, one can put things like

if [some error or early return condition]:
    sage_pari_catch_reset()
    [return or raise exception]

Note that sage_pari_catch() and sage_pari_catch_end() must be on the same level, which is more restrictive than sig_on()...sig_off(). The advantage is that it is less of a hack (cf. #14817), and there is less potential for errors. I think that the new macros are cleaner and more portable to newer PARI versions.

I have not yet tried to change the hundreds of instances of sig_on()...sig_off() in gen.pyx, but I did make another version of my patch at #12142 that uses the new macros, and everything worked fine.

Perhaps we could first apply Volker Braun's patch to fix the doctest problem, and later do something along the lines of my patch?

@pjbruin
Copy link
Contributor

pjbruin commented Jul 15, 2013

Attachment: trac_14873_pari_catch.patch.gz

alternative macros for PARI error catching

@pjbruin

This comment has been minimized.

@pjbruin
Copy link
Contributor

pjbruin commented Jul 15, 2013

comment:11

PS: the new macros in the patch I just uploaded do not use any global variables and can presumably be nested, unlike the existing ones (I have not tested this yet).

@vbraun
Copy link
Member Author

vbraun commented Jul 15, 2013

comment:12

Since the error handling significantly changed with the new Pari version I would be in favor of doing both the update and necessary changes of our code on a future ticket.

I agree that there are no checks on whether _pari_catch_env stays in scope but at least it does not definitely fall out of scope so we should at least fix this asap.

@pjbruin
Copy link
Contributor

pjbruin commented Jul 15, 2013

Reviewer: Peter Bruin

@pjbruin
Copy link
Contributor

pjbruin commented Jul 15, 2013

comment:13

OK, I think the sensible thing to do is to give a positive review and postpone any further improvements in the error handling code to another ticket.

@vbraun
Copy link
Member Author

vbraun commented Jul 15, 2013

comment:14

I've made #14894 to track further improvements.

@jdemeyer
Copy link

jdemeyer commented Aug 2, 2013

Merged: sage-5.12.beta0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants