Joblib makes callstacks impossible to decipher #12322

amueller · 2018-10-08T03:07:03Z

So I'm trying to work with a script of mine which is a pipeline containing a ColumnTransformer in cross-validation. I'm trying to do some profiling, but each of these three components calls into joblib, and each call adds 3-4 layers to the stack. Similarly, if I want to do interactive debugging, the extra frames make the stack pretty hard to navigate.

This is for n_jobs=None. I'm not sure if n_jobs=1 would make it better?

I suggest we either hard-code a separate path for n_jobs=1 into scikit-learn, or there could be a shortcut in joblib with a single level on the callstack.
Either way, right now this is very difficult to work with.


rth · 2018-10-08T05:40:48Z

> This is for n_jobs=None. I'm not sure if n_jobs=1 would make it better?
> I suggest we either hard-code a separate path for n_jobs=1 into scikit-learn,

It probably would, as then one wouldn't have to call the functions that determine the effective number of jobs in a given context. Though I have a hard time seeing how we could hard-code a path for n_jobs=1 and at the same time support the use cases of n_jobs=None with context managers.

> or there could be a shortcut in joblib with a single level on the callstack.

Maybe it could be worth opening an issue about it at joblib?

Another imperfect solution could be to make joblib 0.11, which did have a fast path for n_jobs=1 if I understand correctly, compatible with scikit-learn 0.20, and then use a site joblib 0.11: joblib/joblib#786

GaelVaroquaux · 2018-10-09T16:33:00Z

It's fun, because many years ago I wrote joblib because I kept having such pairs of paths in my code, and it turned out that it was a continuous source of bugs.

I would advise avoiding that, and trying to find a fix in joblib.

ogrisel · 2018-10-09T19:31:41Z

n_jobs=1 should behave exactly as n_jobs=None by default. We could probably make some effort to flatten the call trace when the SequentialBackend is active though (which is the case by default).

GaelVaroquaux · 2018-10-09T19:36:56Z

> We could probably make some effort to flatten the call trace when the SequentialBackend is active though

Yes, that's probably where there is room for improvement.

amueller · 2018-10-09T22:05:35Z

I'm not sure there's a way around the nesting, though?

So with the call to delayed there will always be at least 2 levels, right?
So if I have cross-validation and pipeline and column transformer I will have 6 levels from Parallel, and probably two more levels from joblib.memory, and then some sklearn indirections (_fit_transform_one etc)...
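To make that estimate concrete, here is a small stdlib-only sketch of how the frames add up. It does not use joblib at all: `through_layers` is a hypothetical stand-in for whatever dispatch frames a Parallel call adds, and the figure of 2 levels per call is just the assumption from the estimate above, not a measured value.

```python
import inspect


def stack_depth():
    """Return the number of frames currently on the call stack."""
    return len(inspect.stack(0))


def through_layers(func, n_layers):
    """Call func() beneath exactly n_layers wrapper frames (n_layers >= 1).

    Hypothetical stand-in for the dispatch frames a parallel runner adds.
    """
    if n_layers <= 1:
        return func()
    return through_layers(func, n_layers - 1)


LEVELS_PER_PARALLEL = 2  # assumption: the "at least 2 levels" estimate


def column_transformer_step():
    return through_layers(stack_depth, LEVELS_PER_PARALLEL)


def pipeline_step():
    return through_layers(column_transformer_step, LEVELS_PER_PARALLEL)


def cross_validation():
    return through_layers(pipeline_step, LEVELS_PER_PARALLEL)


baseline = stack_depth()     # depth when called directly
nested = cross_validation()  # depth beneath the three nested components
dispatch_frames = 3 * LEVELS_PER_PARALLEL  # frames added by the runner
own_frames = 3  # cross_validation, pipeline_step, column_transformer_step
print(nested - baseline, dispatch_frames + own_frames)
```

Under these assumptions, two thirds of the extra frames come from dispatch rather than from the components themselves, before counting any caching or helper indirections.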

hrm...

GaelVaroquaux · 2018-10-09T22:11:29Z

> I'm not sure there's a way around the nesting, though?

Yes, some nesting will be unavoidable, though delayed doesn't add a level of nesting currently. We might be able to boil down joblib.Parallel to only one level of nesting in the sequential case. I just found, back in the day, that writing my own code to special-case no parallelism would lead me to make mistakes, and the resulting bugs were hard to debug because they would happen only in one of the two situations.
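As a rough illustration of what "only one level of nesting" could look like, here is a minimal stdlib-only sketch. It is not joblib's implementation: `delayed` and `run_sequential` below are hypothetical toy versions written for this example, and the only thing they share with joblib is the idea that a delayed call is captured as a (function, args, kwargs) tuple and run later.

```python
import inspect


def delayed(func):
    """Capture a call as a (func, args, kwargs) tuple without running it."""
    def capture(*args, **kwargs):
        return func, args, kwargs
    return capture


def run_sequential(tasks):
    """Run captured tasks one after another, directly in this frame."""
    results = []
    for func, args, kwargs in tasks:
        # The task is called from here, so this runner contributes
        # exactly one frame between the caller and the task.
        results.append(func(*args, **kwargs))
    return results


def depth_plus(x):
    """Toy task: report the stack depth it sees, plus a computed value."""
    return len(inspect.stack(0)), x + 1


here = len(inspect.stack(0))  # depth at the call site
results = run_sequential(delayed(depth_plus)(i) for i in range(3))
for depth, value in results:
    print(depth - here, value)
```

Each task sees a depth of `here + 2`: one frame for the task itself and one for `run_sequential`. Note that the capture step adds nothing to the stack at execution time, since `capture` has already returned by the time the task runs.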

ogrisel mentioned this issue Oct 11, 2018: Make joblib sequential calls flatter joblib/joblib#790 (Open)

thomasjpfan added the Enhancement label Feb 27, 2022
