Numba workqueue threading error #665

Closed
theahura opened this issue May 5, 2021 · 9 comments

Comments

@theahura
theahura commented May 5, 2021

Thanks for the great library!

I'm getting the following error:
Terminating: Nested parallel kernel launch detected, the workqueue threading layer does not supported nested parallelism. Try the TBB threading layer.

The context is that I'm calling the umap library from flask and two clients are hitting a umap call at the same time. I'm guessing this is what is causing nested threading -- flask is already using some kind of threading, and umap seems to be doing the same under the hood. I think this is related to numba threading, but I'm setting a random seed so as per the docs I don't think this should be an issue.

Is there a proposed/preferred solution to this issue?

@celinesin

I think I might have a related problem, so I'm commenting here to stay in the loop.

I'm trying to serve a small dash app using Flask/wsgi/apache2, and while it runs fine locally, my app gets stuck in fit_transform(), and more specifically in fit().

There aren't any error messages... it just times out.

@lmcinnes
Owner

Unfortunately this is not a trivial issue to solve -- the problems are ultimately caused by libraries (here flask and numba) each wanting to do their own threading and not playing well together. My best guess for a solution would be to get numba to use a different threading layer. The suggested TBB threading layer is definitely going to be better. For numba to use it you need tbb installed -- you can get it from pip or conda. Ensuring it is enabled is possibly more tricky (I believe it should be the default if it is available), and you would have to consult the numba documentation for that.
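
A minimal sketch of one way to request a specific layer, assuming numba and numpy are installed. Numba reads the `NUMBA_THREADING_LAYER` environment variable when it is first imported, and `numba.threading_layer()` reports the layer that was actually loaded once a parallel function has run. The helper name `report_threading_layer` is made up for illustration, and the sketch returns `None` rather than failing if numba or the requested layer is unavailable:

```python
import os

# Request the TBB layer. This must happen before numba is first imported,
# because numba reads its configuration from the environment at import time.
os.environ["NUMBA_THREADING_LAYER"] = "tbb"


def report_threading_layer():
    """Return the name of the loaded threading layer, or None if unavailable."""
    try:
        import numpy as np
        from numba import njit, threading_layer
    except ImportError:
        return None  # numba or numpy is not installed

    @njit(parallel=True)
    def warmup(x):
        # With parallel=True this array expression is run as a parallel kernel.
        return x + 1.0

    try:
        warmup(np.zeros(8))
    except Exception:
        # Broad catch for this probe only: numba raises an error here when the
        # requested layer (e.g. tbb) cannot be loaded.
        return None
    return threading_layer()


print(report_threading_layer())
```

If this prints `None` with numba installed, the requested layer was probably not loadable; swapping `"tbb"` for `"omp"` in the assignment above is the other option discussed in this thread.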

@theahura
Author

Following up on this thread, using the omp threading layer (similar to tbb) worked for me. Glad to know that's also the suggested approach, so I'll close this out. Thanks for the response!

@lmcinnes
Owner

Hopefully one of omp or tbb works for @celinesin. If it doesn't work for you, please feel free to get the issue re-opened, Celine.

@celinesin

Interim update: omp didn't work, but as promised, getting numba to use tbb is indeed not so straightforward ^^;
I've posted in numba (numba/numba#7095).

Thanks for the hints so far!

@stuartarchibald

The most common cause of this issue is nesting parallel kernels. Here's an example:

from numba import njit, prange
import numpy as np

@njit(parallel=True)
def nested(x):
    for i in prange(len(x)):
        x[i] += 1


@njit(parallel=True)
def main():
    Z = np.zeros((5, 10))
    for i in prange(Z.shape[0]):
        nested(Z[i]) # A parallel region is calling another function with a parallel region!
    return Z

main()

The workqueue threading layer does not support nested kernels, whereas OpenMP and TBB do. The message you are seeing comes from the workqueue threading layer: it has detected that the code does something it doesn't support, and it is "protecting" itself from the corruption that would inevitably occur if it attempted to run it.

My first question is: do you have something like the case above, or do you have a parallel kernel being called from two Python threads? I think the effect would manifest as the same error message, because the workqueue backend isn't threadsafe and it would "see" parallel region launches from two Python threads as a kind of nesting.
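
For the second case, one possible workaround (not proposed in this thread, and it trades away concurrency) is to make sure only one Python thread at a time enters the code that launches parallel kernels. A minimal sketch using only the standard library follows; `serialized`, `_kernel_lock` and `embed` are hypothetical names, and the body of `embed` is a stand-in for a call such as `fit_transform()`:

```python
import threading
from functools import wraps

# One process-wide lock shared by every wrapped function (hypothetical name).
_kernel_lock = threading.Lock()


def serialized(func):
    """Decorator: allow only one thread at a time to run the wrapped function."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        with _kernel_lock:
            return func(*args, **kwargs)
    return wrapper


@serialized
def embed(data):
    # Stand-in for the real work, e.g. a call that launches numba kernels.
    # Here it just keeps the first two values of each row.
    return [row[:2] for row in data]


# Calls from several request-handling threads now run one after another.
threads = [threading.Thread(target=embed, args=([[1, 2, 3]],)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

A lock like this only covers threads within a single process, so it says nothing about servers that run several worker processes.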

@celinesin

(copypasta from the numba thread)
Just another update:

After much help from @stuartarchibald, we've figured out a few things:

  1. I was unable to install tbb and numba together, but that wasn't needed to begin with, since I had a functioning omp.
  2. Contrary to my original hypothesis, numba doesn't actually have a problem with flask; what I called "serving the app locally" was already using flask. Therefore, the problem may lie somewhere between numba and wsgi or apache2.
  3. The problem is not fundamentally something about running JIT code, but something about Numba's runtime. I can compile functions, but I can't execute them.

The search continues! My next hint is: https://modwsgi.readthedocs.io/en/develop/user-guides/processes-and-threading.html#

Thanks for all your help!

@lmcinnes
Owner

@celinesin : thanks for reporting back -- it sounds like a difficult issue and may not be related to threading, but instead some other aspect of things. You are welcome to open a separate issue for this if you want.

@celinesin

And in the end, my problem was an incorrect wsgi configuration... what a treasure hunt!
It works now, thanks for all the support!

jsnel added a commit to joernweissenborn/pyglotaran that referenced this issue Nov 20, 2022
This causes issues on Mac OS X. Also the fix results in a 2x speedup on Windows in some cases.

Also known as fix for "Terminating: Nested parallel kernel launch detected, the workqueue threading layer does not supported nested parallelism. Try the TBB threading layer."

Ref insightful comment on GitHub: lmcinnes/umap#665 (comment)
s-weigand pushed a commit to joernweissenborn/pyglotaran that referenced this issue Nov 20, 2022