[WIP] Support exporting mne objects to xarray DataArray objects #11464

mmagnuski · 2023-02-10T02:02:46Z

Adresses #7793

Part 1 of N parts:

tests adding xarray as dependency
adds .to_xarray() methods to Evoked and Epochs objects
adds a simple example of the functionality

other mne objects will be covered in future PRs (and also maybe exporting metadata to additional xarray coords).

TODO:

requires xarray decorator for pytest?
support picks
.to_xarray() for Evoked
.to_xarray() for Epochs
include 'bads' in .to_xarray()? (or just add "bad" channel coordinate?)
simple example

mmagnuski · 2023-02-10T14:25:33Z

Ok, so far so good, test for a to_xarray sketch work. I was thinking about channel picks and handling types:

I don't think adding picks to .to_xarray() method would be necessary, one can just do evoked.copy().pick(picks).to_xarray()
I'm not sure how to handle channel types. I can just assign additional coordinate that contains these types with a 'ch_type' label:
```
xarr = xarr.assign_coords(
    ch_type=('chan', evoked.get_channel_types())
)
```

larsoner · 2023-02-10T15:46:03Z

I'm not sure how to handle channel types. I can just assign additional coordinate that contains these types

I have no experience with xarray so no ideas here. Maybe @drammock or @hoechenberger ?

hoechenberger · 2023-02-10T16:00:19Z

mne/utils/xarray.py

+
+    if isinstance(mne_inst, Epochs):
+        data = mne_inst.get_data()
+        dims = ('chan', 'trial', 'time')


Can we call the dimension "channel"?

I prefer shorter dimension names as you use them often during various operations, but I will change the name if this would be the common preference. :)

I don't think we're internally consistent on questions like this. E.g., in function names sometimes we use channel and sometimes ch. Also when exporting to data frames, the spectrum class makes a column freq (not frequency). So I'm not really sure what to do WRT "channel". However, I would say that trial is not ideal, it should be epoch (for e.g. resting state recordings cut into sequential chunks "trial" doesn't make sense)

oh, yes I will definitelly change trial to epoch, especially that you can have multiple epochs from the same trial (for example successively presented items to remember).

drammock · 2023-02-10T21:29:26Z

I don't think adding picks to .to_xarray() method would be necessary, one can just do evoked.copy().pick(picks).to_xarray()

not necessary, but wouldn't it avoid an extra copy operation to have picks in the to_xarray method?

I'm not sure how to handle channel types. I can just assign additional coordinate that contains these types with a 'ch_type' label:
python xarr = xarr.assign_coords( ch_type=('chan', evoked.get_channel_types()) )

I think yes, an additional coordinate that is "parallel to" or "overlaid on" the channels coord makes sense here (not sure exactly what the right terminology is, I don't use xarray very much)

mmagnuski · 2023-02-10T23:29:57Z

not necessary, but wouldn't it avoid an extra copy operation to have picks in the to_xarray method?

It could lead to less characters used, for sure:

evoked.to_xarray(picks=picks)

vs

evoked.copy().pick(picks).to_xarray(copy=False)

I am still not sure, however. For operations like .to_data_frame() having picks seems a good idea as you frequently would like to export just a subset of channels. But for xarray, I actually expect you would frequently transform the whole object to xarray without any picking. And then at some point you can just use .sel(chan=ch_names) (or .sel(channel=ch_names) if we settle on such naming) or .isel(chan=picks) if you need a subset.

drammock · 2023-02-10T23:39:10Z

it's a good point that picks might not get used very frequently (though I can imagine maybe it getting used to pull out all channels of one type, e.g., all grads). But what I'm saying is that .to_xarray() is already going to need to make a copy of the data, and if someone wants to subset via picks then they would need another copy operation before the call to .to_xarray(). So it would be more memory efficient to include the picks as a parameter to .to_xarray() (provided that the approach indexes the data array internally to the .to_xarray method)

larsoner · 2023-02-11T00:15:10Z

FWIW internally to_xarray() should probably just call self.get_data(picks=picks) so I think it would be a pass-through argument / low effort to include it

mmagnuski · 2023-02-11T00:36:38Z

But what I'm saying is that .to_xarray() is already going to need to make a copy of the data, and if someone wants to subset via picks then they would need another copy operation before the call to .to_xarray().

That's why the example used .to_xarray(copy=False) to avoid the second copy (current mne.utils.xarray.to_xarray already has copy). But I agree picks should be an easy addition, so I'll add it.

larsoner · 2023-02-11T00:40:24Z

copy=False won't always even work, though -- sometimes the data isn't even loaded from disk. And if you .get_data(picks=[0]) it should read just the first channel's data from disk for example.

So we might need to think/plan out more deeply how all the copy stuff works with preloaded data vs not (at least for Epochs, where it matters).

larsoner · 2023-02-11T00:42:01Z

IIRC I think .get_data always returns a copy of the data, so if that's actually the case I would expect .to_xarray to internally always implicitly make a copy and have the returned xarray own that data (and any modifications to it not be reflected in the Epochs/Evoked object itself)

mmagnuski · 2023-02-11T01:08:01Z

IIRC I think .get_data always returns a copy of the data

The docs for Evoked claim that .get_data() returns a view, but the docs may be wrong. :)

mmagnuski · 2023-02-11T01:12:20Z

@larsoner You are right. The docs say that the data returned in .get_data() are A view on evoked data., but this seems incorrect:

import numpy as np
import mne

info = mne.create_info(list('abcd'), sfreq=250)
data = np.random.rand(4, 350)
erp = mne.EvokedArray(data, info, tmin=-0.5)

data = erp.get_data()
data[0, 0] = 23.

print(data[0, 0])
print(erp.data[0, 0])

into support_xarray

mmagnuski · 2023-04-25T11:19:23Z

(I still remember about this PR and should have some time to finish it soon)

mmagnuski added 5 commits February 10, 2023 02:59

MAINT: add xarray to requirements

200c121

ENH: firt sketch of xarray utils

8d13c3b

TST: first sketch of a test

3422d79

MAINT: should xarray be added this way to github actions?

3857b6f

MAINT: requires_xarray, add docs, remove support for AverageTFR for now

a933cdf

hoechenberger reviewed Feb 10, 2023

View reviewed changes

mmagnuski added 3 commits February 11, 2023 02:04

ENH: add picks etc.

b3fd2a2

ENH: add .to_xarray() to Evoked

5380036

TST: update test

273b1ed

STY: remove unused import

d7cbab6

mmagnuski added 5 commits February 11, 2023 02:13

Merge branch 'main' into support_xarray

ec25a36

FIX: add fill_doc, remove copy arg

ff379dc

TST: test picks

d0536c2

FIX: picks

0c5494e

Merge branch 'support_xarray' of https://github.com/mmagnuski/mne-python

f0ac552

into support_xarray

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] Support exporting mne objects to xarray DataArray objects #11464

[WIP] Support exporting mne objects to xarray DataArray objects #11464

mmagnuski commented Feb 10, 2023 •

edited

mmagnuski commented Feb 10, 2023

larsoner commented Feb 10, 2023

hoechenberger Feb 10, 2023

mmagnuski Feb 10, 2023 •

edited

drammock Feb 10, 2023

mmagnuski Feb 10, 2023

drammock commented Feb 10, 2023

mmagnuski commented Feb 10, 2023

drammock commented Feb 10, 2023

larsoner commented Feb 11, 2023

mmagnuski commented Feb 11, 2023

larsoner commented Feb 11, 2023

larsoner commented Feb 11, 2023

mmagnuski commented Feb 11, 2023

mmagnuski commented Feb 11, 2023

mmagnuski commented Apr 25, 2023

[WIP] Support exporting mne objects to xarray DataArray objects #11464

Are you sure you want to change the base?

[WIP] Support exporting mne objects to xarray DataArray objects #11464

Conversation

mmagnuski commented Feb 10, 2023 • edited

TODO:

mmagnuski commented Feb 10, 2023

larsoner commented Feb 10, 2023

hoechenberger Feb 10, 2023

Choose a reason for hiding this comment

mmagnuski Feb 10, 2023 • edited

Choose a reason for hiding this comment

drammock Feb 10, 2023

Choose a reason for hiding this comment

mmagnuski Feb 10, 2023

Choose a reason for hiding this comment

drammock commented Feb 10, 2023

mmagnuski commented Feb 10, 2023

drammock commented Feb 10, 2023

larsoner commented Feb 11, 2023

mmagnuski commented Feb 11, 2023

larsoner commented Feb 11, 2023

larsoner commented Feb 11, 2023

mmagnuski commented Feb 11, 2023

mmagnuski commented Feb 11, 2023

mmagnuski commented Apr 25, 2023

mmagnuski commented Feb 10, 2023 •

edited

mmagnuski Feb 10, 2023 •

edited