Joblib makes callstacks impossible to decipher #12322
Comments
It probably would, as then one wouldn't have to call the functions that determine the effective n_jobs in a given context. Though I have a hard time seeing how we could hard-code n_jobs=1 and at the same time support the use case of n_jobs=None with context managers.
Maybe it would be worth opening an issue about it at joblib? Another imperfect solution could be to make joblib 0.11 (which, if I understand correctly, did have a fast path for n_jobs=1) compatible with scikit-learn 0.20 and use a site joblib 0.11; see joblib/joblib#786.
It's funny, because many years ago I wrote joblib because I kept having such pairs of paths in my code, and it turned out that they were a continuous source of bugs. I would advise avoiding that, and trying to find a fix in joblib.
We could probably make some effort to flatten the call trace when the SequentialBackend is active though
Yes, that's probably where there is room for improvement.
I'm not sure there's a way around the nesting, though? So with the call to hrm...
I'm not sure there's a way around the nesting, though?
Yes, some nesting will be unavoidable, though delayed doesn't add one
level of nesting currently. We might be able to boil down joblib.Parallel
to only one level of nesting in the sequential case.
I just found, back in the day, that writing my own code to special-case
no parallelism would lead me to make mistakes, and the resulting bugs were
hard to debug because they would happen in only one of the two
situations.
So I'm trying to work with a script of mine which is a pipeline containing a ColumnTransformer in cross-validation. I'm trying to do some profiling, but each of these three things calls into joblib, which adds 3-4 layers each to the stack. Similarly if I want to do interactive debugging, this is pretty hard to manage.
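To put a number on the extra layers described above, here is a minimal sketch that counts how many frames a chain of wrappers adds between a caller and the function doing the work. The wrapper names (`parallel_like`, `dispatch`, `call_batch`) are hypothetical stand-ins, not joblib's actual internals; only the standard library is used.

```python
import traceback


def stack_depth():
    """Return the number of frames currently on the call stack."""
    return len(traceback.extract_stack())


def work(x):
    # Record how deep the stack is at the point where the real work runs.
    return stack_depth()


# Hypothetical dispatch layers, named for illustration only.
def call_batch(func, x):
    return func(x)


def dispatch(func, x):
    return call_batch(func, x)


def parallel_like(func, x):
    return dispatch(func, x)


direct = work(0)                  # caller -> work
wrapped = parallel_like(work, 0)  # caller -> 3 wrappers -> work
extra_frames = wrapped - direct
print(extra_frames)  # 3 with the three wrappers above
```

The same measurement could be taken inside a real estimator's fit method to see how many frames each nested parallel call contributes in a given setup.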
This is for n_jobs=None. I'm not sure if n_jobs=1 would make it better?
I suggest we either hard-code a separate path for n_jobs=1 into scikit-learn, or there could be a shortcut in joblib with a single level on the callstack.
Either way, right now this is very difficult to work with.
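As a rough illustration of what a separate path for n_jobs=1 could look like, here is a minimal sketch. It is not scikit-learn or joblib code: `run_tasks` is a hypothetical helper, and `ThreadPoolExecutor` from the standard library stands in for the parallel backend so the example is self-contained.

```python
from concurrent.futures import ThreadPoolExecutor


def run_tasks(func, items, n_jobs=1):
    """Hypothetical helper: apply func to each item, optionally in parallel."""
    items = list(items)
    if n_jobs == 1:
        # Fast path: a plain loop, so func is called directly from here
        # and a traceback shows only this one extra frame.
        results = []
        for item in items:
            results.append(func(item))
        return results
    # General path: hand the work to a pool (a stand-in for joblib here).
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        return list(pool.map(func, items))


def square(x):
    return x * x


sequential = run_tasks(square, range(5), n_jobs=1)
threaded = run_tasks(square, range(5), n_jobs=2)
print(sequential == threaded)
```

The trade-off is that the two branches have to be kept in sync by hand, and this sketch does not attempt to handle n_jobs=None or a backend selected through a context manager.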