make ccall sigatomic (defer SIGINT handling) #2622

stevengj · 2013-03-20T03:02:10Z

Currently, a ctrl-c in the REPL can interrupt a ccall to an external library, leaving that library in an inconsistent state and producing crashes or other problems later (see e.g. PyCall issue #15).

Julia's C API (julia.h) has SIGATOMIC macros to protect critical code regions from being interrupted by ctrl-c (SIGINT), but these are not easily accessible from Julia.

Arguably, every ccall should be made sigatomic by default, on the theory that C code in general is not interrupt-safe (with a separate interruptable_ccall for cases that are known to be re-entrant and interrupt-safe). Note that Python does the same thing. As noted below, the performance penalty for this is almost certainly negligible under ordinary circumstances.


JeffBezanson · 2013-03-20T03:09:40Z

see also #1468

timholy · 2013-03-20T10:01:43Z

In code that repeatedly calls Cairo, I've also seen issues with Ctrl-C.

stevengj · 2013-03-20T17:38:44Z

Making ccall sigatomic by default seems like the only safe choice. The overhead for this looks like it would be two additions, three loads, two stores, and two comparisons/branches in a typical call; this is negligible for all but the most trivial C function calls.

Moreover, Julia's JIT compiler need not generate the sigatomic checks at all unless jl_install_sigint_handler has been called, which (on a single core) only happens in the REPL by default, so this wouldn't affect non-interactive code.

(One of the comparisons could be eliminated in typical calls by changing the conditional from jl_defer_signal == 0 && jl_signal_pending != 0 to jl_signal_pending != 0 && jl_defer_signal == 0 since both jl_signal_pending and jl_defer_signal will usually be zero.)
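A toy model of the deferral logic described above may make the cost and the reordered check easier to see. This is only a sketch in plain Julia, not the runtime's actual code: the `SignalState` type and the function names are hypothetical, and the two fields merely mirror `jl_defer_signal` and `jl_signal_pending`.

```julia
# Toy model of deferred interrupt handling; not Julia's actual runtime code.
mutable struct SignalState
    defer_signal::Int    # nesting depth of sigatomic regions (models jl_defer_signal)
    signal_pending::Int  # nonzero if an interrupt arrived while deferred (models jl_signal_pending)
end

# Enter a critical region: bump the nesting counter.
sigatomic_begin!(s::SignalState) = (s.defer_signal += 1; nothing)

# Leave a critical region and report whether a deferred interrupt should fire now.
function sigatomic_end!(s::SignalState)
    s.defer_signal -= 1
    # Test signal_pending first: it is almost always zero, so the
    # second comparison is skipped on the common path.
    if s.signal_pending != 0 && s.defer_signal == 0
        s.signal_pending = 0
        return true   # a real runtime would raise the interrupt here
    end
    return false
end

# What a SIGINT handler does in this model: record the signal if deferred.
function on_sigint!(s::SignalState)
    if s.defer_signal > 0
        s.signal_pending = 1
        return false  # deferred until the outermost region ends
    end
    return true       # delivered immediately
end
```

In this model, a call bracketed by `sigatomic_begin!` and `sigatomic_end!` costs two counter updates and, on the common path, a single comparison.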

stevengj · 2013-03-22T13:51:44Z

I'm labelling this a bug, since crashing generic C code after SIGINT is not a sane default.

JeffBezanson · 2013-03-23T06:32:39Z

This is reasonable. Unfortunately (as in #1468) we'd also like to be able to interrupt things like large matrix multiplies, but there doesn't seem to be much we can do about that on our end.


kmsquire · 2014-01-16T18:30:13Z

@JeffBezanson in #5399:

> I'm kind of against this since it adds overhead and probably still won't make everything work.

It would be nice to measure the impact.

How about adding a safe_ccall or atomic_ccall function with the same signature as ccall, but wrapped in sigatomic?

Alternatively, ccall could be made safe by default and unsafe_ccall could be provided. Thoughts?
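A minimal sketch of the wrapper idea, assuming a Julia version in which `disable_sigint` is available in Base (it is mentioned later in this thread, in #12309). The name `safe_strlen` and the choice of `strlen` as the C routine are illustrative only; no `safe_ccall` function is assumed to exist.

```julia
# Hypothetical wrapper: run one C call with Ctrl-C (SIGINT) delivery deferred.
function safe_strlen(s::String)
    # disable_sigint defers the interrupt until the do-block has finished.
    disable_sigint() do
        # strlen stands in for any C routine that is not interrupt-safe.
        ccall(:strlen, Csize_t, (Cstring,), s)
    end
end
```

Calling `safe_strlen("hello")` behaves like the bare `ccall`, except that an interrupt arriving during the C call is held back until the call returns.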

StefanKarpinski · 2014-01-16T18:49:02Z

> Alternatively, ccall could be made safe by default and unsafe_ccall could be provided.

This seems like the better approach to me if we're going to do this since we would want to default to safe and most cases where someone is calling out to C are not really that performance critical.

ivarne · 2014-01-16T19:02:57Z

I also think safe-by-default suits Julia well. If we do not have other behavior to group together with interruptable ccalls in an unsafe_ccall, I think a name containing int(errupt) would have advantages.

JeffBezanson · 2014-01-16T19:05:45Z

I just don't think asynch interrupt is that important. Plus, you might want to break out of a long-running C routine.

This doesn't warrant a safe_ or unsafe_ prefix; it only refers to a very narrow issue of interactively interrupting things. With that approach, the very simplest and most innocent of C routines (e.g. libm functions) would have to be called with unsafe_ccall, which doesn't seem right.

We are already doing delayed binding of ccalls by checking a cached function pointer for NULL. This would add yet another thing.

stevengj · 2014-01-16T19:35:09Z

There's no point in allowing people to break out of long-running C routines if doing so has a high probability of creating crashes later; I doubt that most long-running C code is interrupt-safe.

I find it hard to believe that the overhead will be anything but negligible except for a very small number of extremely trivial C routines, and decorating these with unsafe_ccall seems like a small price to pay for not having ctrl-C be horribly dangerous.

stevengj · 2014-01-16T19:36:29Z

(I would tend to call it interruptable_ccall rather than "unsafe", since the former is more informative.)

kmsquire · 2014-01-16T19:54:46Z

Creating a branch which adds sigatomic_begin() and sigatomic_end() directly to ccall and measuring the impact would give us something more concrete to argue over. ;-)

I won't have time to try this for at least a few days, so if anyone is up to the task, I'm eager to hear about it...

kmsquire · 2014-01-16T19:55:25Z

(And I agree that interruptable_ccall is more informative.)

JeffBezanson · 2014-01-16T20:01:00Z

And then I think there is a question of whether it will even work --- I doubt everything in our runtime system and libraries will be completely interrupt-safe. Technically after an async signal the process may be in an undefined state, no matter what you do.

A lot of important native routines are very tiny, for example sqrt, whose body is basically one instruction. But yes, we should definitely try it before saying any more about performance.

I would prefer almost anything to having to go around and write interruptible_ccall in various places, plus clutter the language with this extra word. If people really want this, I'd rather make every ccall atomic by default and let it be disabled by a compiler switch.

amitmurthy · 2014-01-17T05:38:22Z

Testing a wrapped ccall with these changes:

https://github.com/amitmurthy/julia/compare/amitm;sigatomic?expand=1

I get the following results (extreme case of just trying to detect overhead of sigatomic calls...)

On master:

julia> @time map!(x->sqrt(x), [1.0:float64(10^8)]);
elapsed time: 7.719709771 seconds (5599993224 bytes allocated)

julia> @time map!(x->sqrt(x), [1.0:float64(10^8)]);
elapsed time: 7.726340217 seconds (5599993224 bytes allocated)

julia> @time map!(x->sqrt(x), [1.0:float64(10^8)]);
elapsed time: 7.697574462 seconds (5599993224 bytes allocated)

with wrapped ccall:

julia> @time map!(x->sqrt(x), [1.0:float64(10^8)]);
elapsed time: 7.865826205 seconds (5599993128 bytes allocated)

julia> @time map!(x->sqrt(x), [1.0:float64(10^8)]);
elapsed time: 7.825473527 seconds (5599993128 bytes allocated)

julia> @time map!(x->sqrt(x), [1.0:float64(10^8)]);
elapsed time: 7.870493742 seconds (5599993128 bytes allocated)

I think we should just make every ccall (and any other similar functions) atomic by default. Even a compiler switch may not be required.

Seg faults simply leave a bad taste in the mouth.....

JeffBezanson · 2014-01-17T05:45:49Z

If I'm reading your patch correctly, I don't think this has any effect since the JL_SIGATOMIC_BEGIN happens at compile time. You'd need to generate code that does what JL_SIGATOMIC_BEGIN does.
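The compile-time versus run-time distinction can be illustrated with a loose analogy in plain Julia, where macros play the role of the code generator's "emit" functions. This is not the codegen patch itself, and every name below is hypothetical.

```julia
# Illustrative analogy only: macros stand in for the compiler's "emit" functions.
const EXPANSION_TIME_CALLS = Ref(0)
const RUN_TIME_CALLS = Ref(0)

# Mistake: the bookkeeping runs while code is being generated, once per call site.
macro guarded_at_expansion(ex)
    EXPANSION_TIME_CALLS[] += 1
    return esc(ex)
end

# Fix: the bookkeeping is part of the generated code, so it runs on every call.
macro guarded_at_runtime(ex)
    return quote
        RUN_TIME_CALLS[] += 1
        $(esc(ex))
    end
end

f(x) = @guarded_at_expansion sqrt(x)
g(x) = @guarded_at_runtime sqrt(x)

for _ in 1:3
    f(2.0)
    g(2.0)
end
```

After the loop, the first counter has been bumped only when `f` was defined, while the second has been bumped on each call to `g`. Only the second pattern can protect a call that happens at run time.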

amitmurthy · 2014-01-17T06:26:19Z

Right. Should have realized from the "emit" names...

Just tested again by just ccall'ing the jl_sigatomic* functions before and after the ccall for sqrt in math.jl and I am getting only slightly increased timings....

julia> @time map!(x->sqrt(x), [1.0:float64(10^8)]);
elapsed time: 7.756323121 seconds (5599993224 bytes allocated)

julia> @time map!(x->sqrt(x), [1.0:float64(10^8)]);
elapsed time: 7.742497754 seconds (5599993224 bytes allocated)

julia> @time map!(x->sqrt(x), [1.0:float64(10^8)]);
elapsed time: 7.76467246 seconds (5599993224 bytes allocated)

Patch - https://github.com/amitmurthy/julia/compare/amitm;sigatomic?expand=1

This should have a runtime effect, right?

stevengj · 2014-01-17T14:29:14Z

Don't benchmark anonymous functions. Write a foo!(X) function using an explicit loop.

Though honestly I don't see the point. The number of C functions as small as sqrt seems quite small, so I find it hard to believe that we'd use interruptable_ccall in more than a dozen or so places.
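One way to follow this advice, sketched under the assumption that the `sqrt` symbol resolves in the running process (typically the case where the C math library is already loaded). `foo!` is the name suggested above; the array size is arbitrary.

```julia
# Benchmark kernel: an explicit loop around a trivial C call, with no anonymous function.
function foo!(X::Vector{Float64})
    for i in eachindex(X)
        # Call the C library's sqrt directly so that ccall overhead is what gets measured.
        X[i] = ccall(:sqrt, Cdouble, (Cdouble,), X[i])
    end
    return X
end

X = collect(1.0:1.0e6)
foo!(copy(X))   # warm-up call so that compilation is not included in the timing
@time foo!(X)
```

Timing this loop with and without the sigatomic bracketing around the `ccall` isolates the per-call cost more cleanly than `map!` with a closure.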

JeffBezanson · 2014-01-17T18:08:33Z

At least use sqrt instead of x->sqrt(x); that's adding an extra layer of overhead. Better still call the vectorized sqrt.

amitmurthy · 2014-01-18T07:00:36Z

For

a=[1.0:float64(10^8)]
@time sqrt(a);

it is 1.06 seconds for the safe version and 0.59 seconds for the current version. Maybe the difference will be even smaller when emit_ccall itself generates sigatomic* code? In practical terms, @stevengj is right, we may as well just make ccall resilient to interrupts by default....

tknopp · 2014-01-18T07:40:58Z

I can see Jeff's point very well. Julia is in an outstanding position where the FFI has almost zero overhead because the overhead is JITed away. In all other languages I have seen, there is some overhead that prevents calling scalar C functions in tight loops. This usually leads to the recommendation to use the FFI only for vector operations, where the overhead is negligible.

stevengj · 2014-01-18T14:00:52Z

@amitmurthy, the difference will certainly be smaller when the sigatomic is inlined; it is much more expensive to do a subroutine call than a comparison.

@tknopp, the problem with this argument is that very very few C functions as simple as sqrt are worth calling in a tight loop. In most cases, if the C function is so trivial that the cost of an extra comparison and load is significant, then it is trivial enough to just inline the calculations and/or rewrite in Julia. In the few cases like sqrt that are basically wrappers around single instructions, (a) we can use interruptable_ccall and (b) we should consider having Julia inline it anyway.

JeffBezanson · 2014-01-18T21:09:29Z

What typically happens is somebody says "To try out julia, I compared a simple loop (c)calling sqrt in C++ and Julia, and Julia was 50% slower. Why is Julia so much slower for something so simple?" The C++ program, of course, just dies entirely on ^C, but that is not going to occur to anybody as a point of comparison.

From my perspective, we work very hard to generate good code, so it is disheartening when more stuff has to be crammed into the pipeline. Often each extra "thing" is 2%, but over time there are 20 of those things.

But, @stevengj is right that the overhead will be much less with the sigatomic stuff inlined, and in the specific case of sqrt we should be inlining it.

stevengj · 2014-01-18T21:47:23Z

@JeffBezanson, we already get tons of "I did [something naive] and Julia was X times slower than Y" comments. Slowing down the extremely unusual case of calling your own trivial C function in a tight loop (as opposed to something like sqrt which is in Base and for which we will have already used interruptable_ccall) seems likely to have negligible impact, in comparison with more common mistakes, on first impressions.

JeffBezanson · 2014-01-18T22:05:05Z

Ok, I think the way I want to handle this is to make ccall sigatomic by default, and have a compiler switch to control it. My reasoning is that whether you need ^C depends on whether you are working interactively, which is a fairly global property. When we reach the point of statically compiling whole programs, it makes sense to ask to strip out all sigatomic stuff.

stevengj · 2014-01-18T22:13:42Z

@JeffBezanson, why do you need a compiler switch? Just turn off the sigatomic ccall in the code generator if jl_install_sigint_handler has not been called, as I suggested above. This will automatically disable it in non-interactive usage.

mossr · 2015-01-17T20:29:03Z

@JeffBezanson, is there a way to completely remove signal handling (other than the //jl_install_sigint_handler method)? You mentioned a compiler switch; has that been implemented?

Because MATLAB runs on the JVM, if a library installs its own signal handler then complications can cause MATLAB to crash.

stevengj · 2015-05-22T14:49:54Z

(Note that the sqrt function has been inlined for some time, since 244ec92, so ccall overhead will no longer affect it. Shouldn't we also be using LLVM intrinsics for log and exp and a few others?)

ViralBShah · 2015-05-22T14:57:49Z

How do the LLVM intrinsics work? Do they end up calling the system libm, or does LLVM have fast implementations? We have @simonbyrne 's log implementation in Julia that is probably the fastest one for now.

simonbyrne · 2015-05-22T15:40:55Z

From what I understand they just call the system libm functions, though they are also able to optimise repeated calls with the same argument (#414, #9942, #10922).


* Remove unnecessary sigatomic
* Make flisp calls sigatomic
* Make type inference calls sigatomic
* Refactor interthread communication through signal
* Make sure `sleep` is aborted on `SIGINT` on Linux to deliver the exception faster
* Implement force signal throwing when `SIGINT` arrives too frequently
* Hack to abort io syscall on `SIGINT`

Fix #1468; Fix #2622; Towards #14675

stevengj · 2016-05-06T15:26:39Z

Hooray!

stevengj added a commit to stevengj/julia that referenced this issue Apr 4, 2013

add access to sigatomic_begin/end from Julia, to defer interrupts for non-interruptable ccalls (see also issue JuliaLang#2622)

50d0824

stevengj mentioned this issue Apr 4, 2013

RFC: access to sigatomic_begin/end in Julia for non-interrupt-safe C calls #2759

Merged

kmsquire mentioned this issue Aug 12, 2013

segfault when aborting Pkg2.add() #3991

Closed

JeffBezanson mentioned this issue Jan 16, 2014

Seg fault upon interrupting disk io in progress #5399

Closed

timholy mentioned this issue Mar 27, 2014

Control+C with imread() on TIFF file breaks Images JuliaImages/Images.jl#82

Closed

ivarne mentioned this issue May 21, 2014

Segfault when aborting matrix multiplication #1468

Closed

vtjnash mentioned this issue Jan 1, 2015

Ctrl-C randomly breaks out of Julia #9544

Closed

stevengj mentioned this issue Feb 18, 2015

Keyboard interrupt and segfault. JuliaLang/IJulia.jl#277

Closed

This was referenced Mar 3, 2015

Input Output error after stopping Notebook JuliaLang/IJulia.jl#283

Closed

killing notebook is unreliable JuliaLang/IJulia.jl#270

Closed

stevengj mentioned this issue May 22, 2015

Segfault on break #11382

Closed

This was referenced Jul 26, 2015

at-atomic macro as alternative to disable_sigint() #12309

Closed

REPL crashing on ctrl+c? jump-dev/CPLEX.jl#45

Closed

stevengj mentioned this issue Oct 19, 2015

AssertionError when hitting ^C at REPL #13664

Closed

yuyichao mentioned this issue Mar 14, 2016

RFC: more reliable & extensible ^C REPL interrupt #14032

Closed

randy3k added a commit to JuliaInterop/RCall.jl that referenced this issue Mar 21, 2016

disable SIGINT while running R code

4536b2f

see JuliaLang/julia#2622 and JuliaLang/julia#14675

vtjnash mentioned this issue Mar 25, 2016

Possible Segfault With SIGINT And File IO #10990

Closed

yuyichao mentioned this issue Apr 18, 2016

Optimize and clean up lock #15917

Merged

yuyichao mentioned this issue May 3, 2016

Use safepoint to deliver SIGINT #16174

Merged

vtjnash closed this as completed in #16174 May 6, 2016

stevengj mentioned this issue Sep 23, 2018

Re-enable SIGINT handler in pyjlwrap_call JuliaPy/PyCall.jl#574

Open
