
MRG: refactor PSD functions #2710

Merged (21 commits) on Jan 17, 2016

Conversation

@choldgraf (Contributor)

This is a quick update that adds information to the docstring so people know where to look for the under-the-hood PSD estimation. It also adds code to keep the estimated PSD/frequencies on the object after the transform method is called.

@choldgraf mentioned this pull request Dec 15, 2015
@jasmainak (Member)

I can't seem to see your commit. Did you update the .rst file only, or did you add the output HTML files to the commit as well?

scikit-learn pipelines. It relies heavily on the multitaper_psd
function found in the time_frequency module. Running the transform
method will create attributes for the estimated psd as well as the
frequencies used, on top of return these values.
Member

can you update See Also?

Contributor

what does "on top of return these values." mean?

Contributor Author

It means that freqs and data will be created as attributes when you call transform, in addition to being returned by transform (though it's only data that is returned, not freqs... it's a poorly written phrase).

@choldgraf (Contributor Author)

I added a See Also section - let me know if that's what you meant. Regarding creating the attributes, I'm not sure how to do it differently (unless I shouldn't do it at all)


See Also
--------
multitaper_psd, compute_epochs_psd, compute_raw_psd
Member

did you try building the doc? does it hyperlink properly? You can do:

$ cd doc/
$ make dev-html_noplot

@jasmainak (Member)

I think the convention for attributes is that you add them in the fit method, not the transform method, and they end with an underscore. See here: http://scikit-learn.org/stable/tutorial/statistical_inference/settings.html#estimators-objects. I don't think you can add an attribute in a transform method.
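The convention being described can be sketched with a toy transformer. This is a minimal, hypothetical example (class and attribute names are illustrative, not MNE's actual API): parameters keep plain names in `__init__`, quantities estimated from data are set in `fit` with a trailing underscore, and `transform` creates no new public attributes.

```python
import numpy as np


class DemoPSDEstimator(object):
    """Toy transformer illustrating the sklearn fit/transform convention."""

    def __init__(self, sfreq=1000.):
        # Parameters passed by the user keep their plain names.
        self.sfreq = sfreq

    def fit(self, X, y=None):
        # Estimated state gets a trailing underscore, signalling it was
        # derived from the data rather than set by the user.
        self.n_channels_ = X.shape[1]
        return self

    def transform(self, X):
        # transform only uses existing state; it does not create attributes.
        return np.abs(np.fft.rfft(X, axis=-1)) ** 2


X = np.random.RandomState(0).randn(5, 3, 64)  # epochs x channels x times
est = DemoPSDEstimator().fit(X)
psd = est.transform(X)
print(est.n_channels_, psd.shape)  # 3 (5, 3, 33)
```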

@choldgraf (Contributor Author)

Ah, I see what you were referring to. Regarding underscores and such, that totally makes sense. As for putting it in fit vs. transform, this object already handles that strangely: the fit method currently does nothing, it only returns self... so I wasn't sure what to do there.

@jasmainak (Member)

For now, I'd say ... let's not add the attributes here. This calls for an update to the time-frequency module as we discussed here: #2290

@larsoner (Member)

FYI you have a flake error

@choldgraf (Contributor Author)

I just spent 5 minutes looking for an anti-cornflakes meme on the internet, no luck. Instead I will focus on doing actual work...

It sounds like this PR will change a bit either way. The best thing is probably to add epochs functionality to the time_frequency module, and then have this sklearn structure call that function instead.

So I guess the question is: does this warrant its own function (e.g., multitaper_psd_epochs), or should it be added as a flag to compute_epochs_psd, e.g. compute_epochs_psd(..., kind='mt')?

@jasmainak (Member)

I vote for a new argument called method, which is a string. So, no new function.
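The method-string dispatch being proposed might look roughly like this. Everything here is a hypothetical sketch, not MNE's real API: the name `compute_epochs_psd_sketch` is made up, and a plain periodogram stands in for both spectral estimators just to show the dispatch pattern.

```python
import numpy as np


def compute_epochs_psd_sketch(data, sfreq, method='welch'):
    """Dispatch on a `method` string instead of having two functions."""
    if method == 'welch':
        # placeholder: real code would average over overlapping segments
        psd = np.abs(np.fft.rfft(data, axis=-1)) ** 2 / sfreq
    elif method == 'multitaper':
        # placeholder: real code would average over DPSS tapers
        psd = np.abs(np.fft.rfft(data, axis=-1)) ** 2 / sfreq
    else:
        raise ValueError('method must be "welch" or "multitaper", '
                         'got %r' % (method,))
    freqs = np.fft.rfftfreq(data.shape[-1], 1. / sfreq)
    return psd, freqs


data = np.ones((2, 4, 128))  # epochs x channels x times
psd, freqs = compute_epochs_psd_sketch(data, sfreq=256., method='welch')
print(psd.shape, freqs[-1])  # (2, 4, 65) 128.0
```

An unknown method string raises immediately, which is friendlier than silently falling back to a default.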

@jasmainak (Member)

Also, please remember to update a couple of tests :)

@choldgraf (Contributor Author)

naturally :)


@agramfort (Member)

Let us know when you have made the necessary changes.

@choldgraf (Contributor Author)

will do - trying to finish a PR in h5io right now but I will try to get to this soon thereafter


@choldgraf (Contributor Author)

OK, there's a first step. I made 2 main changes:

  1. Added a method kw to the compute_epochs_psd function; this lets you do either Welch or multitaper PSD estimation.
  2. Added support for arrays instead of only Epochs objects. Since the PSDEstimator expects an array, this would allow it to call this function instead of doing its own multitaper estimation. Let me know if that's over-reaching and we shouldn't add the code to support arrays... it just wasn't much extra effort.

@choldgraf (Contributor Author)

More updates - the PSDEstimator should now work using compute_epochs_psd, so it's a pretty simple class at this point. Also fixed up the attribute creation, etc. If these look reasonable, then I'll write some tests as well.

@@ -143,6 +147,8 @@ def compute_epochs_psd(epochs, picks=None, fmin=0, fmax=np.inf, tmin=None,
        to be <= n_fft.
    proj : bool
        Apply SSP projection vectors.
    method : 'welch' | 'multitaper'
        The method to use in calculating the PSD.
Member

Can you explain in one line what each method does? For details, we should add references in the manual

@jasmainak (Member)

Added support for arrays instead of only Epochs objects. Since the PSDEstimator expects an array this would allow it to call this function instead of doing its own multitaper estimation. Let me know if that's over-reaching and we shouldn't add the code to support arrays...it just wasn't much extra effort...

I noticed that you added a new argument Fs to support this. I would instead support only Epochs objects in the public API, and have a private function that handles Fs + array + epochs. When called from compute_epochs_psd, it would pass the epochs; when called from the PSDEstimator, it would pass Fs + the array. I can't think of a good reason to support arrays in the public compute_epochs_psd API. Even the name of the function has epochs in it :)
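The public/private split being suggested can be sketched as follows. All names are hypothetical (`_psd_from_array`, `compute_epochs_psd_sketch`), and `FakeEpochs` stands in for mne.Epochs just to make the sketch self-contained: the public function takes only an Epochs-like object, while the private worker takes a plain array plus sampling frequency and can be called directly by the PSDEstimator.

```python
import numpy as np


def _psd_from_array(data, sfreq):
    """Private worker: operates on an array plus sampling frequency."""
    psd = np.abs(np.fft.rfft(data, axis=-1)) ** 2
    freqs = np.fft.rfftfreq(data.shape[-1], 1. / sfreq)
    return psd, freqs


def compute_epochs_psd_sketch(epochs):
    """Public API: Epochs in; arrays never appear in the signature."""
    return _psd_from_array(epochs.get_data(), epochs.info['sfreq'])


class FakeEpochs(object):
    """Stand-in for mne.Epochs, just enough for the sketch."""

    def __init__(self, data, sfreq):
        self._data, self.info = data, {'sfreq': sfreq}

    def get_data(self):
        return self._data


epochs = FakeEpochs(np.zeros((3, 2, 100)), sfreq=200.)
psd, freqs = compute_epochs_psd_sketch(epochs)
print(psd.shape)  # (3, 2, 51)
```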

if picks is not None:
    data = np.dot(proj[picks][:, picks], data)
else:
    data = np.dot(proj, data)
Member

I would do:

if picks is not None:
    proj = proj[picks][:, picks]
data = np.dot(proj, data)

Saves you one line ;)

@jasmainak (Member)

@choldgraf thanks for making this contribution. Please go ahead and add the tests :)

@@ -248,6 +248,12 @@ def inverse_transform(self, X, y=None):
class PSDEstimator(TransformerMixin):
"""Compute power spectrum density (PSD) using a multi-taper method

This structures data so that it can be easily incorporated into
scikit-learn pipelines. It relies heavily on the multitaper_psd
Member

This sentence will go away once we add the method argument to the init

Contributor Author

Agreed - I wasn't sure if that's something people would want but can do that for sure

Member

hmm ... actually, thinking about it some more -- the problem with a method argument is that you will have to pass the different options for multitaper / welch too. Maybe, we can leave it for now ...

Contributor Author

Yeah, that's also why I didn't put it in there in the first place. Functions like spectral_connectivity (and the newly-modified compute_epochs_psd) handle it by having a bunch of kwargs for the respective functions (e.g., mt_adaptive, mt_normalize). I decided to pass on that this time around, though it shouldn't be a big deal to extend the function to make it work with a method parameter.


@@ -33,6 +38,9 @@
epochs = mne.Epochs(raw, events, event_id, tmin, tmax,
                    proj=True, baseline=(None, 0), preload=True,
                    reject=dict(grad=4000e-13, eog=150e-6))
# Pull 2**n points to speed up computation
epochs = EpochsArray(epochs.get_data()[..., :1024], epochs.info,
                     epochs.events, tmin, epochs.event_id)
Member

why not?

tmax = tmin + 1023. / raw.info['sfreq']

and skip this line altogether?

Contributor Author

It seemed to me when I tried this before that there were some inconsistencies with the indexing using times (e.g., it would be off by 1 index in one direction or the other), so if I ever want my signal to be length 2**N I just index the data directly. I can use tmin/tmax if you think that's better.

Member

How long ago did you have that problem? We changed to using something that is a little bit more lenient with the floating point arithmetic that goes on under the hood about a year ago IIRC.

Contributor Author

It was a while ago - ok I'll give it a shot and see if it works.


Contributor Author

OK, so it sort of works, but there's still weird stuff that happens. For example:

tmin = -1.
tmax = tmin + 1023. / raw.info['sfreq']
epochs = mne.Epochs(raw, events, event_id, tmin, tmax)
print(epochs.get_data().shape)

will print shape (..., 1024)

but if I do

tmin, tmax = -1., 1.
raw.info['bads'] += ['MEG 2443']  # bads
epochs = mne.Epochs(raw, events, event_id, tmin, tmax,
                    proj=True, baseline=(None, 0), preload=True,
                    reject=dict(grad=4000e-13, eog=150e-6))
tmax = tmin + 1023. / raw.info['sfreq']
epochs.crop(tmin, tmax)
print(epochs.get_data().shape)

Now it prints shape (..., 1023)

So depending on whether you define tmin/tmax before creating the Epochs, vs. cropping the Epochs after the fact, you are off by 1. I think I ran into this inconsistency before and decided to index manually because I wanted to be sure it was correct.

Member

Can you open an issue with a small code snippet using RawArray or the sample dataset to reproduce? We should fix that if possible. For now I'm fine leaving this example to manually set the bounds.

Contributor Author

I think the relevant lines are in epochs.py: in the init at line 290, get_data at line 1154, and the cropping function at line 1435.

The issue is the first index. E.g.:

tmin = -1.
tmax = tmin + 1023. / raw.info['sfreq']
epochs = mne.Epochs(raw, events, event_id, tmin, tmax)
print(epochs.times)
epochs.crop(tmin, tmax)
print(epochs.times)

gives:
[-1.00064103 -0.99897607 -0.99731111 ..., 0.69928325 0.70094821
0.70261317]
[-0.99897607 -0.99731111 -0.99564615 ..., 0.69928325 0.70094821
0.70261317]

I feel like the outputs should be the same, but in the second case the first index gets lopped off.

Contributor Author

If this is worth making consistent between the two, my intuition is that we should use the behavior of the crop function. If a user wants a length-1024 array, to me it makes the most sense to do tmin + 1024. / sfreq, rather than doing tmin + 1023. / sfreq and having to remember that it's inclusive of tmin, so it keeps an extra index.

Contributor Author

Ah, just saw your comment - I'll open an issue and use the crop function here.

Member

Yeah we'll have to discuss further in the issue. We have to be careful about changes like this because they could break people's code.

@jasmainak (Member)

other than these minor comments, LGTM. I am offline tomorrow but if you fix these, I'll merge the day after.

@choldgraf (Contributor Author)

OK I think I've addressed all the comments there - see my changes to the docstrings/example/whatsnew and LMKWYT.

@jasmainak (Member)

great! ... but coveralls is complaining. @Eric89GXL is this something to be concerned about or is coveralls broken?

# Combining/reshaping to original data shape
psd = psd.reshape(np.hstack([dshape, -1]))
if ndim_in == 1:
    psd = psd.squeeze()
Member

squeeze is evil. It squeezes all the dimensions equal to 1; I've been bitten by this in the past. If you know it's the first dim that is 1, just do as it was before: psd = psd[0, :]
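A quick illustration of the pitfall: np.squeeze drops every singleton axis, not just the one you have in mind, so an array with one epoch and one channel loses two dimensions at once, while explicit indexing removes only the axis you intend.

```python
import numpy as np

psd = np.zeros((1, 1, 129))   # (n_epochs, n_channels, n_freqs)

print(np.squeeze(psd).shape)  # (129,) - the channel axis is gone too
print(psd[0].shape)           # (1, 129) - only the epoch axis removed
```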

Contributor Author

that's a good point, will revert it

@agramfort (Member)

done with my review. Looks really nice. thanks @choldgraf

let me know when you addressed my comments

@larsoner (Member)

larsoner commented Jan 16, 2016 via email

@jasmainak (Member)

@choldgraf don't give up. You are almost there :)

@choldgraf (Contributor Author)

Ummm, I think @agramfort's suggestions may have uncovered another bug. I tried creating sinusoidal data by basing a RawArray off of a Raw object, and ran into some strange behavior when creating Epochs. Basically, if I build an Epochs object off of a RawArray, no Epochs get pulled. I was able to recreate it with this code:

# These are paths to the test data
raw_fname = '/Users/choldgraf/src/python/mne-python/mne/io/tests/data/test_raw.fif'
ev_fname = '/Users/choldgraf/src/python/mne-python/mne/io/tests/data/test-eve.fif'

# Load data
raw = mne.io.Raw(raw_fname)
raw_arr = mne.io.RawArray(raw[:, :][0], raw.info)
ev = mne.read_events(ev_fname)

# Create epochs objects and print shape
ep = mne.Epochs(raw, ev, 1, -.5, 1, preload=True)
ep_arr = mne.Epochs(raw_arr, ev, 1, -.5, 1, preload=True)
for e in [ep, ep_arr]:
    print(e._data.shape)

This outputs:

(7, 376, 902)
(0, 376, 902)

So when Epochs is created using a RawArray as input, none of the events are being pulled...

I should have known that this PR would get more complicated ;)

@larsoner (Member)

larsoner commented Jan 16, 2016 via email

@choldgraf (Contributor Author)

Yeah - the differences between the drop logs are that all the non-ignored epochs are marked as "TOO_SHORT". And yep, the first_samp attributes are different between the two. I'm not really sure what first_samp is for... does it mark where the file starts reading in the data or something? And I guess that's not embedded in the 'info' file.

On Sat, Jan 16, 2016, Eric Larson wrote:

    The first_samp is probably different, and that gets factored into the events. Not really a bug, but it could be better documented. You can confirm by looking at the drop_log entries.

@larsoner (Member)

larsoner commented Jan 16, 2016 via email

@choldgraf (Contributor Author)

OK - in the latest commit it now does the same stuff but on a signal with sinusoidal waves. Rather than subtracting from the events file, I just copied over _first_samps, if that's ok. WDYT?

On Sat, Jan 16, 2016, Eric Larson wrote:

    Correct, it's not in info. It's a Neuromag thing having to do with when raw data recording actually started. Subtract the first_samp from the event times and it should work for the RawArray.

@larsoner (Member)

larsoner commented Jan 16, 2016 via email

@choldgraf (Contributor Author)

ok cool - see latest push. That look good? I'm just doing ev[:, 0] -= first_samp

On Sat, Jan 16, 2016, Eric Larson wrote:

    I think you'd also need to copy _last_samps and/or _raw_lengths to be complete. But subtracting from the events seems a bit cleaner to me, because it avoids private attributes entirely.

@larsoner (Member)

larsoner commented Jan 16, 2016 via email

@agramfort (Member)

I am happy! +1 for merge

thx heaps @choldgraf

@Eric89GXL or @jasmainak I'll let you merge if you're happy

jasmainak added a commit that referenced this pull request Jan 17, 2016
@jasmainak merged commit 8e7010f into mne-tools:master Jan 17, 2016
@jasmainak (Member)

Bravo @choldgraf ! congrats on the merge 🍻

@agramfort (Member)

🍻 !


@choldgraf (Contributor Author)

thanks for all the help @jasmainak @Eric89GXL @agramfort

@agramfort (Member)

agramfort commented Jan 17, 2016 via email
