Segfault when using assert with parallel=True #9029

Closed
99991 opened this issue Jun 21, 2023 · 8 comments · Fixed by #9048
Labels
bug - segfault (Bugs that cause SIGSEGV, SIGABRT, SIGILL, SIGBUS), ParallelAccelerator

Comments

@99991

99991 commented Jun 21, 2023

The following code triggers a segfault. Tested on Ubuntu 22.04 with Numba 0.56.4 and Numba 0.57.0.

from numba import njit
import numpy as np

@njit(parallel=True)
def foo():
    n = 1000

    for _ in range(n):
        for _ in range(n):
            values = np.zeros(3)

            np.max(values)

            assert n >= 0

foo()
  • I have tried using the latest released version of Numba (the most recent is
    visible in the change log: https://github.com/numba/numba/blob/main/CHANGE_LOG).
  • I have included a self-contained code sample to reproduce the problem,
    i.e. it's possible to run as 'python bug.py'.
@esc
Member

esc commented Jun 21, 2023

This happens when n is too big. On my machine, n = 323 still works, but 324 segfaults.

@esc added the bug - segfault and ParallelAccelerator labels Jun 21, 2023
@esc
Member

esc commented Jun 21, 2023

ping @DrTodd13

@DrTodd13
Collaborator

@99991 You can do the body of your nested for loops in another function and that should solve the problem. I wonder if this MWR obscures the issue a bit because parallelizing a "max" over 3 elements isn't very useful and in fact will slow the program down. Can the outer loops be collapsed into one loop with pndindex or one of the loops here made into a prange? If so, then the "max" inside the loop nest will be lowered sequentially and the problem will also go away. We think we know why this segfault is occurring; the fix is probably moderately difficult and we will be working on it, but we wanted to give some workarounds in the meantime.
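
A minimal sketch of the first workaround (moving the loop body into a separate njit helper; the helper name inner_body is illustrative, not from this issue):

from numba import njit
import numpy as np

@njit
def inner_body(n):
    # Same body as the reproducer, now compiled in its own (non-parallel) function.
    values = np.zeros(3)
    np.max(values)
    assert n >= 0

@njit(parallel=True)
def foo():
    n = 1000

    for _ in range(n):
        for _ in range(n):
            inner_body(n)

foo()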

@99991
Author

99991 commented Jun 22, 2023

> wanted to give some workarounds in the meantime.

Thank you!

> I wonder if this MWR obscures the issue a bit because parallelizing a "max" over 3 elements isn't very useful and in fact will slow the program down.

Definitely. But I doubt that anyone would read the 700 lines of filter kernels that this has been condensed from, and removing the prange was required for the M in MWR.

> You can do the body of your nested for loops in another function and that should solve the problem.

Perhaps, but the original code would then need to pass about 16 parameters, which is a bit much I think.

> Can the outer loops be collapsed into one loop with pndindex or one of the loops here made into a prange? If so, then the "max" inside the loop nest will be lowered sequentially and the problem will also go away.

The original code computes something for every pixel of an image without any pixel dependencies (embarrassingly parallel), so both of the loops could make use of prange. I hadn't heard of pndindex yet.
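
For reference, a sketch of what collapsing the two pixel loops with pndindex might look like (the per_pixel name and the per-pixel work are illustrative, not from the original code):

from numba import njit, pndindex
import numpy as np

@njit(parallel=True)
def per_pixel(image):
    h, w = image.shape
    out = np.empty((h, w))
    # pndindex iterates the collapsed (y, x) index space in parallel.
    for idx in pndindex(h, w):
        out[idx] = 2.0 * image[idx]
    return out

per_pixel(np.zeros((4, 4)))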

Anyway, when using prange for any loop and changing the assert to i >= 0, there are still segfaults:

import os

# Set the diagnostics level before importing numba so it is picked up at import time.
os.environ["NUMBA_PARALLEL_DIAGNOSTICS"] = "4"

from numba import njit, prange
import numpy as np

@njit(parallel=True)
def foo():
    for y in prange(1000):
        for x in range(1000):
            values = np.zeros(3)

            i = np.argmax(values)

            assert i >= 0

foo()

I do not know how to read the diagnostics, but it seems that something is still happening with argmax.
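
(Side note, not from the thread: once foo has been compiled, the same report can also be printed from the dispatcher, which avoids relying on the environment variable.)

foo()  # trigger compilation first
foo.parallel_diagnostics(level=4)  # print the parallel transform report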

@gmarkall
Member

@DrTodd13 Is it likely we can fix this by 0.58?

@DrTodd13
Collaborator

@99991 @stuartarchibald This doesn't appear to be documented anywhere, but the parallelism code will only convert a prange to run in parallel if it is able to rewrite the loop into a parfor, and it can only do that if there is only one exit from the loop. Because you have an assert in there, there is more than one exit from the loop. To some extent, an assert here is ill-defined anyway: sequentially you would hit the first failing assert, but in parallel you might hit a different assert, or a different element might trigger it. I'm not sure this was planned, but it is somewhat fortunate, because the parfor is turned into a gufunc and gufuncs swallow exceptions anyway.

So, if you take out the assert, the parfors in the loop body (the zeros and the argmax) are lowered sequentially and you don't get a segfault. Suggestions, Stuart?
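
One way to illustrate the single-exit point (a sketch, not from this thread): accumulate a failure count via a supported reduction inside the prange loop and assert only after the parallel region, so the loop body itself has no extra exit:

from numba import njit, prange
import numpy as np

@njit(parallel=True)
def foo():
    bad = 0
    for y in prange(1000):
        for x in range(1000):
            values = np.zeros(3)
            i = np.argmax(values)
            bad += i < 0  # sum reduction; no early exit from the loop body
    # Single exit from the prange loop; check the invariant afterwards.
    assert bad == 0

foo()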

@stuartarchibald
Contributor

@DrTodd13 maybe:

  • Document the lack of support (or if it's easy, which I don't think it necessarily will be, catch the issue with >1 loop exits at compile time)?
  • Fix the stack overflow issue (which seems to be addressed by "Dynamically allocate parfor schedule", #9048)?

How does that sound?

@DrTodd13
Collaborator

DrTodd13 commented Jul 7, 2023

> @DrTodd13 maybe:
>
>   • Document the lack of support (or if it's easy, which I don't think it necessarily will be, catch the issue with >1 loop exits at compile time)?
>   • Fix the stack overflow issue (which seems to be addressed by "Dynamically allocate parfor schedule", #9048)?
>
> How does that sound?

I added documentation and a warning for when a prange loop isn't parallelized (also in PR #9048). I also fixed the stack overflow issue in #9048.
