BUG: mne.parallel crashes when too much data #3190
Comments
In debug mode, when I run the function called by parallel (…
I broke the bug down to this:
So... not a GAT issue.
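As an illustration only (this is an assumed reconstruction, not the original snippet), the failing pattern amounts to handing a large NumPy array to joblib's `Parallel` with `n_jobs > 1`:

```python
import numpy as np
from joblib import Parallel, delayed

n_trial, n_chan, n_time = 600, 300, 500
X = np.random.randn(n_trial, n_chan, n_time)  # ~720 MB of float64

def _predict_chunk(data):
    # stand-in for the per-job work; just touch the data
    return data.mean()

# With n_jobs > 1, joblib may dump X to a disk-backed memmap before
# dispatching it; with n_jobs=1 (or a smaller X) that code path is skipped.
out = Parallel(n_jobs=4)(delayed(_predict_chunk)(X) for _ in range(4))
```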
Hm. I get a different error.
This is on a local Xeon with 270 GB of RAM.
(This is in IPython.)
@jona-sassenhagen aaaah what is this.
me too
Yep, same error.
Did you check if it's about MNE parallel, or if it can also be reproduced with just joblib?
FWIU, mne.parallel is trying to be too smart and wants to dump some data to disk or something.
Retrospectively it would have been better to simply use joblib, I think.
The error comes from the param …
… doesn't work but …
Any idea what I should do to fix this?
As a temporary fix, I do …
But I'll let whoever understands this caching approach write the fix. @jona-sassenhagen can you check that this also fixes your error?
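A minimal sketch of what such a workaround can look like, assuming it disables joblib's automatic memmapping via `max_nbytes=None` (the function and data below are placeholders, not the actual fix):

```python
import numpy as np
from joblib import Parallel, delayed

def _score_one(data):
    # placeholder for the real per-job work
    return data.mean()

X = np.random.randn(600, 300, 500)

# max_nbytes=None turns off joblib's automatic array-to-memmap conversion,
# so X is pickled and shipped to the workers instead of being dumped to a
# temporary file on disk first.
out = Parallel(n_jobs=4, max_nbytes=None)(
    delayed(_score_one)(X) for _ in range(4))
```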
Yup.
I can look tomorrow. FYI memmapping is in principle a powerful tool to reduce parallelization overhead by reducing pickling requirements, so we should try to get it working as well as possible. I suspect we might gain speed if we do it right.
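For context, a minimal sketch of the pattern memmapping enables, using joblib's public `dump`/`load` API (the array and path are arbitrary examples, not MNE internals):

```python
import numpy as np
from joblib import Parallel, delayed, dump, load

X = np.random.randn(1000, 64, 500)

# Dump once, then hand the workers a read-only memmap: every worker maps the
# same file instead of receiving its own pickled copy of X.
dump(X, '/tmp/X.joblib')                     # any fast location works
X_mm = load('/tmp/X.joblib', mmap_mode='r')

means = Parallel(n_jobs=4)(
    delayed(np.mean)(X_mm[i]) for i in range(len(X_mm)))
```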
I suspect that the GAT code isn't optimal in this regard, because I first initialize … This means that it's possible to get …
All this is pretty well in line with my observation that joblib with MNE …
But MKL only works for matrix operations, right? joblib is a bit more generic. Or am I mistaken? [On this topic, I went to a meetup in NYC where some people presented pyfora, which tries to compile functions to parallelize them, either locally or on a cluster. My fear is that they want to be a little too smart for it to be usable, but it might be worth checking out.]
it is probably a joblib bug, cc @ogrisel
MKL works on anything that can use SIMD instructions, which very often means linear algebra since they provide highly optimized BLAS and LAPACK -- but it can also work for e.g. …
On my machines, MKL with n_threads outperforms joblib with n_jobs most of the time in almost any scenario; filtering is maybe an exception.
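For reference, a sketch of how the two knobs are usually traded off; the portable way to cap MKL is the `MKL_NUM_THREADS` environment variable (the values and job counts below are arbitrary):

```python
import os

# Cap MKL's internal threading before NumPy is imported; this is the knob
# being compared against joblib's n_jobs above.
os.environ['MKL_NUM_THREADS'] = '1'    # let joblib own the parallelism
# os.environ['MKL_NUM_THREADS'] = '8'  # or let MKL thread the BLAS calls
#                                      # and keep n_jobs=1 in MNE/joblib

import numpy as np
from joblib import Parallel, delayed

X = np.random.randn(200, 500)
# With MKL_NUM_THREADS=1, the four jobs below do not oversubscribe the CPU.
out = Parallel(n_jobs=4)(delayed(np.dot)(X, X.T) for _ in range(4))
```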
Isn't MKL also usually much better memory-wise? Or am I misinterpreting what top tells me?
Not really. As explained by @lesteve in joblib/joblib#344, using joblib while disabling the memory mapping feature (by setting explicitly …) …
My suggestion would be to not set the …
If you face an issue with the memory mapping feature of joblib, please feel free to report it.
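For reference, a sketch of the `joblib.Parallel` memmapping keywords presumably being discussed here (defaults quoted from the joblib docs of that era, so double-check against your installed version):

```python
from joblib import Parallel

# joblib's memmapping knobs:
parallel = Parallel(
    n_jobs=4,
    max_nbytes='1M',   # arrays larger than this are dumped to a memmap
                       # ('1M' is the documented default; None disables it)
    mmap_mode='r',     # workers receive a read-only memory map of the dump
    temp_folder=None,  # None = JOBLIB_TEMP_FOLDER, then /dev/shm, then tempdir
)
```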
@agramfort ok to change the default …?
No, we can't change …
ok, so we need to add a function that checks that nbytes is not > 2 GB, and otherwise throws an explicit error
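A hypothetical sketch of such a guard; the name, the 2 GB limit, and the error message are illustrative, not an existing MNE function:

```python
import numpy as np

def _check_nbytes(arrays, limit=2e9):
    """Hypothetical guard: refuse to dispatch more than ~2 GB to workers."""
    total = sum(np.asarray(a).nbytes for a in arrays)
    if total > limit:
        raise ValueError(
            'Cannot parallelize %0.1f GB of data (limit: %0.1f GB); '
            'reduce the data size or use n_jobs=1.'
            % (total / 1e9, limit / 1e9))
```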
... although I just noticed that …
Writing to disk when you don't have an SSD can be pretty annoying, so I would keep the defaults and make this explicit in the docs.
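If the temporary folder ends up on a slow disk, joblib can be pointed elsewhere; a sketch, assuming a Linux machine where /dev/shm is available:

```python
import os
from joblib import Parallel

# Per call: dump memmapped arrays into RAM-backed storage instead of a
# slow spinning disk...
parallel = Parallel(n_jobs=4, temp_folder='/dev/shm')

# ...or globally via the environment, before any Parallel call is made.
os.environ['JOBLIB_TEMP_FOLDER'] = '/dev/shm'
```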
Whatever works, but please benchmark it if you change the default behavior. I trust the joblib folks about good defaults, though.
@kingjr I cannot reproduce with your snippet on Linux in Python or IPython. Can you try again and see if it works? Maybe it was some transient bug.
... in any case we should decide if we want to use the …
I think we probably won't change our default behavior, since we usually use …
I'm encountering some weird error when I try to decode too much data, e.g. … works fine but … returns … or …
There is no error if I set `n_jobs=1`, if I reduce the dimensionality of the data (either `n_chan`, `n_time` or `n_trial`), or if I predict a subset: e.g. … works fine, but from 597 it crashes.
I am running on an AWS m4.4xlarge with Ubuntu, Anaconda, and MNE dev; joblib is 0.9.4.
I have no idea what to do...