
BUG? epochs.resample takes forever #2445

Closed
kingjr opened this issue Sep 1, 2015 · 6 comments

kingjr commented Sep 1, 2015

epochs.resample seems unreasonably slow. Here's an example:

import time
import numpy as np
import mne
from mne.datasets import sample
data_path = sample.data_path()
raw_fname = data_path + '/MEG/sample/sample_audvis_filt-0-40_raw.fif'
events_fname = data_path + '/MEG/sample/sample_audvis_filt-0-40_raw-eve.fif'
raw = mne.io.Raw(raw_fname, preload=True)
picks = mne.pick_types(raw.info, meg=True, exclude='bads')  # Pick MEG channels
raw.filter(1, 30, method='fft')  # Band pass filtering signals
events = mne.read_events(events_fname)
event_id = {'AudL': 1, 'AudR': 2, 'VisL': 3, 'VisR': 4}
epochs = mne.Epochs(raw, events, event_id, -0.050, 0.400, proj=True,
                    picks=picks, baseline=None, preload=True,
                    reject=dict(mag=5e-12), verbose=False)

Critically:

start = time.time()
epochs.resample(64, n_jobs=-1)
print(time.time() - start)

83.4265270233 seconds

Also, n_jobs=-1 doesn't seem to use all available processes. Is this expected?


larsoner commented Sep 1, 2015

It is probably because scipy's resample function uses FFT-based resampling, which wraps fftpack and is slow for lengths that are not powers of two. We need to get polyphase filtering / upfirdn-style resampling working (scipy/scipy#5186). @rkmaddox or I (or someone else!) just need to find the time to work on it.
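
For reference, a minimal sketch of what the upfirdn-style approach looks like, using scipy.signal.resample_poly (which only landed in scipy later, via the work above); the poly_resample helper and the Fraction-based ratio here are illustrative, not MNE's API:

from fractions import Fraction
from scipy.signal import resample_poly

def poly_resample(data, sfreq_old, sfreq_new):
    """Resample along the last axis with a polyphase (upfirdn-style) filter."""
    # resample_poly needs an integer up/down ratio, so approximate one
    ratio = Fraction(sfreq_new / sfreq_old).limit_denominator(1000)
    return resample_poly(data, ratio.numerator, ratio.denominator, axis=-1)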

In the meantime, can you use CUDA? It should be an order of magnitude or more faster.
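
For the record, the CUDA path would look something like this, assuming a working pycuda setup (see the MNE docs); n_jobs='cuda' moves the FFT work to the GPU:

import mne
mne.set_config('MNE_USE_CUDA', 'true')  # remember the preference across sessions
epochs.resample(64, n_jobs='cuda')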

Closing as dup of #2035, where we should probably continue the discussion.

The n_jobs=-1 not using all processes does seem to be a separate issue. Feel free to reopen this and retitle it for that issue if you want. What does verbose=True say about the number of jobs that will be used? You should see updates from parallel with thread numbers, for example.
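
As a rough sketch of that check (assuming Epochs.resample accepts a verbose argument like other MNE methods):

epochs.copy().resample(64, n_jobs=-1, verbose=True)  # log should report the job count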

larsoner closed this as completed Sep 1, 2015

kingjr commented Sep 1, 2015

Ok, thanks for the quick update.

> In the meantime, can you use CUDA? It should be an order of magnitude or more faster.

No, I don't have the right hardware for that.

> What does verbose=True say about the number of jobs that will be used? You should see updates from parallel with thread numbers, for example.

It indicates the right number of threads (8), but only 2 are actually used. Unfortunately, I'm underwater work-wise and can't spend time on this.

For future ref, I use this quick and dirty alternative:

def resample_epochs(epochs, sfreq):
    """Fast MNE epochs resampling"""
    import numpy as np
    from scipy.signal import resample

    # resample along the time axis; scipy expects an integer sample count
    n_samples = int(round(epochs._data.shape[2] /
                          epochs.info['sfreq'] * sfreq))
    epochs._data = resample(epochs._data, n_samples, axis=2)
    # update metadata
    epochs.info['sfreq'] = sfreq
    epochs.times = (np.arange(epochs._data.shape[2],
                              dtype=float) / sfreq + epochs.times[0])
    return epochs
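
Usage would be something like this (copying first, since the helper modifies the epochs in place):

epochs_64 = resample_epochs(epochs.copy(), 64.)
print(epochs_64.info['sfreq'], epochs_64._data.shape)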


larsoner commented Sep 1, 2015

Is that actually faster? That is what epochs.resample should more or less be doing under the hood. If what you wrote is faster, I suspect we're doing something wrong in master.


larsoner commented Sep 1, 2015

If you could come up with some minimal example e.g. using EpochsArray or the sample data demonstrating the speed difference (with n_jobs=1), I'm happy to take a look.
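
Something along these lines, say (a rough sketch with synthetic data and arbitrary shapes, not a rigorous benchmark):

import time
import numpy as np
import mne
from scipy.signal import resample

rng = np.random.RandomState(0)
sfreq, n_epochs, n_channels, n_times = 150., 100, 60, 151
data = rng.randn(n_epochs, n_channels, n_times)
info = mne.create_info(['ch%03d' % ii for ii in range(n_channels)],
                       sfreq, 'mag')
events = np.c_[np.arange(n_epochs) * n_times,
               np.zeros(n_epochs, int), np.ones(n_epochs, int)]
epochs = mne.EpochsArray(data, info, events, verbose=False)

start = time.time()
epochs.copy().resample(64, n_jobs=1)
print('epochs.resample: %0.3f s' % (time.time() - start))

start = time.time()
resample(data, int(n_times / sfreq * 64), axis=-1)
print('scipy resample:  %0.3f s' % (time.time() - start))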


larsoner commented Sep 1, 2015

BTW it would be better to use epochs.decimate than epochs.resample if you know your data are sufficiently low-passed, because it will avoid edge artifacts that resampling will produce.
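
For instance (a sketch assuming the data are already low-passed well below the new Nyquist frequency, as with the 1-30 Hz filter above):

epochs.decimate(2)  # ~150 Hz -> ~75 Hz, no re-filtering, no edge artifacts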


larsoner commented Sep 1, 2015

Your example above could probably make use of decimation, but you would be a bit close to the Nyquist rate by going to sfreq=64 in this case; something like 90 Hz, i.e. 3x the highest frequency, would be safer.
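
The arithmetic of that rule of thumb, spelled out:

lowpass = 30.        # highest frequency kept by the band-pass above
print(64 / 2.)       # Nyquist at 64 Hz is 32 Hz -- barely above 30 Hz
print(3 * lowpass)   # ~90 Hz is a safer target rate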
