
Broader support of the SNIRF file format and enable reading GowerLab data #10555

Merged
merged 16 commits into mne-tools:main from the snirftimeunit branch on May 5, 2022

Conversation

@rob-luke commented Apr 21, 2022

Reference issue

Fixes #10538

What does this implement/fix?

The result of these three improvements is that we can read files from the manufacturer Gowerlabs.

Changes

  • I have standardised the way that channels are ordered across the different importers (NIRx, SNIRF). When I started writing these reader functions I only owned a NIRx device, so we used that manufacturer's arbitrary channel ordering. Now that we support many manufacturers, it seemed prudent to have a consistent and explainable default fNIRS channel ordering. We now sort according to np.argsort(raw.ch_names). This meant that many tests that addressed channels by index had to have their indices changed. However, I never address channels by index in my research code; I always use raw.pick(['some channel names']) (see the sketch below). Moving forward, I will use channel-name indexing rather than integer indexing in the tests I write.
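
A minimal sketch (with hypothetical channel names and file name) of why name-based picking is robust to the new ordering while integer indexing is not:

import mne

raw = mne.io.read_raw_snirf("example.snirf", preload=False)  # hypothetical file name
# Robust: channels are selected by name, so any importer-level reordering is harmless
named = raw.copy().pick(["S1_D1 760", "S1_D1 850"])
# Fragile: whichever channels live at positions 0 and 1 depends on the importer's ordering
indexed = raw.copy().pick([0, 1])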

Additional information

  • The software Homer exports files with time unit = "unknown". This isn't SNIRF compliant, but Homer is a big player, so we should ensure we can read their files (a rough sketch of the idea follows).
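
A hedged sketch (not the actual MNE implementation) of how a reader could tolerate a non-compliant time unit, assuming the tag lives at /nirs/metaDataTags/TimeUnit as in the SNIRF specification:

import h5py

with h5py.File("homer_export.snirf", "r") as f:  # hypothetical file name
    time_unit = f["/nirs/metaDataTags/TimeUnit"][()]
if isinstance(time_unit, bytes):
    time_unit = time_unit.decode()
# Fall back to seconds when the value is not a known unit (e.g. "unknown")
scale = {"s": 1.0, "ms": 1e-3, "us": 1e-6}.get(time_unit, 1.0)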

@rob-luke commented Apr 21, 2022

Thanks for reporting this issue @samuelpowell.
The file you shared was very helpful. Could you share a smaller file for adding to our test battery? The one you shared is 24 MB when extracted. Or is this not possible due to the issue you've raised at fNIRS/snirf#103?

@larsoner the file Sam shared is 4 MB when compressed, but 24 MB when expanded. It will be stored in the testing repo as an uncompressed file, but we distribute the data as a compressed file. So is it the compressed or the raw data size that we want to minimise?

@samuelpowell

Yes, I can make a smaller file by removing some tiles, but I'd just like to finish up the initial work on https://github.com/Gowerlabs/lumomat and issue a release. That way I can be sure that the test file has a defined link to a version of the conversion script, so that any future bugs or changes can be managed.

I will endeavour to get this done by the end of tomorrow.

Does that make sense?

@rob-luke commented Apr 21, 2022

Makes sense. Thanks @samuelpowell
And there is no rush from me. I can use the file you provided to continue working on other issues. And we can just merge when the test data is available. There are no releases coming up soon that we need to rush for.

@samuelpowell

Thanks @rob-luke, I'll come back to you with something more suitable for CI ASAP.

@rob-luke rob-luke changed the title Add support for SNIRF timeunit Broader support of the SNIRF file format and enable reading GowerLab data May 1, 2022
@rob-luke commented May 1, 2022

@samuelpowell here is the current status of reading the Gowerlabs data

Read the SNIRF file (using the Gowerlabs default output format, not mne-style)

import mne
import os.path as op

raw = mne.io.read_raw_snirf("lumomat-1-1-0.snirf", preload=True)
# <RawSNIRF | lumomat-1-1-0.snirf, 216 x 274 (27.3 s), ~669 kB, data loaded>

Visualise sensor locations

# Set some useful values
subjects_dir = op.join(mne.datasets.sample.data_path(), 'subjects')
plot_kwargs = dict(subjects_dir=subjects_dir,
                   surfaces="brain", dig=True, eeg=[],
                   fnirs=['sources', 'detectors'], show_axes=True,
                   coord_frame='head', mri_fiducials=True)
coreg = mne.coreg.Coregistration(raw.info, "fsaverage", subjects_dir, fiducials="estimated")
# This initialises the coregistration, but the transformation (coreg.trans) is just initialised to identity
mne.viz.plot_alignment(raw.info, trans=coreg.trans, subject="fsaverage", **plot_kwargs)

This figure shows the fsaverage head and its landmarks as red, blue, and green diamonds.
We can also see the landmarks from the Lumo data we imported as circles.
Finally, we can see the sources and detectors.
At this stage the sensors are not near the scalp, as the head landmarks have not yet been aligned to the MRI.

[figure: fsaverage head with its landmarks and the unaligned Lumo landmarks and optodes]

Coregistration

There is a coregistration GUI that can be used, but for simplicity and reproducibility I just use the command line:

coreg.fit_fiducials(lpa_weight=1., nasion_weight=1., rpa_weight=1.)
mne.viz.plot_alignment(raw.info, trans=coreg.trans, subject="fsaverage", **plot_kwargs)

[figure: optodes positioned over the brain after fiducial alignment to fsaverage]

The sensors now appear over the brain, rather than floating in space.
The landmarks of the MRI and of the Lumo data do not perfectly align (median distance of 6 mm). I'd like feedback on this, but I assume that's because the fsaverage head is slightly bigger than the head of the person this data was recorded from.

# Optional step: scale the generic MRI head to the participant; better to use the participant's real MRI if available
mne.scale_mri("fsaverage", "gowerlabsdemodata", 0.95, overwrite=True, subjects_dir=subjects_dir)
# Now rerun the coregistration on the scaled-down head
coreg = mne.coreg.Coregistration(raw.info, "gowerlabsdemodata", subjects_dir, fiducials="estimated")
coreg.fit_fiducials(lpa_weight=1., nasion_weight=1., rpa_weight=1.)

And indeed, the median distance has decreased. So scaling the generic MRI may be a useful step.

Aligning using fiducials
Start median distance: 121.62 mm
End median distance: 2.23 mm

mne.viz.plot_alignment(raw.info, trans=coreg.trans, subject="gowerlabsdemodata", **plot_kwargs)

We now see that the data is well aligned to the brain space.

[figure: optodes aligned to the scaled head model]

# I find this new way of plotting a little easier to use, so you can also plot the sensors using...
brain = mne.viz.Brain('gowerlabsdemodata', subjects_dir=subjects_dir, background='w', cortex='0.5', alpha=0.3)
brain.add_sensors(raw.info, trans=coreg.trans, fnirs=['sources', 'detectors'])

Apply this transformation to the underlying data structure

To be addressed in a follow-up PR.

Questions

@samuelpowell do the positions of these sensors look correct? Were they placed on the left front of the head in this formation? If so, I will request a review from the other devs.

@@ -238,11 +259,12 @@ def test_snirf_nirsport2_w_positions():
# nirsite https://github.com/mne-tools/mne-testing-data/pull/86
# figure 3
allowed_distance_error = 0.005
distances = source_detector_distances(raw.info)
assert_allclose(distances[::2][:14],
Member Author

As the ordering of the channels has now changed, I test specific source-detector pairs rather than channels at fixed indices (see the sketch below).
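
A minimal sketch (hypothetical channel name and expected distance) of checking a specific source-detector pair by name instead of by channel index:

from numpy.testing import assert_allclose
from mne.preprocessing.nirs import source_detector_distances

# `raw` and `allowed_distance_error` come from the surrounding test
pair = raw.copy().pick(["S1_D1 760"])
assert_allclose(source_detector_distances(pair.info), [0.03], atol=allowed_distance_error)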

@samuelpowell

@rob-luke bravo!

[figure: mne_3T_cap_layout]

@samuelpowell

Our caps have 'default' layouts, which are digitised on a phantom of an appropriate size. I have asked which phantom would have been used for this cap and will let you know here. But indeed I expect that it is simply a matter of scaling.

@rob-luke commented May 1, 2022

Excellent news @samuelpowell. Thanks for getting back to me so quickly.

But indeed I expect that it is simply a matter of scaling.

Great, we can address later how we want to deal with this. But from the example above we can see that it's something that can already be done within MNE.

I will request a review from the other devs.

First we need to wait until mne-tools/mne-testing-data#96 is merged, then I will ping the others (and you Sam) for a review.

@samuelpowell

I understand that @RJCooperUCL applies a simple affine transform with suitable results. It appears that coreg is happy to apply translation, rotation, and (generalised) scaling, so this should work. I will read up on the documentation to get a better understanding of this.

When pinged I'll pull the PR and attempt to break your code, and will subsequently remove 'mne-style' output. There may be a delay as I am away until Thursday.

Thanks again for this!

@@ -51,6 +51,8 @@ Enhancements

- Add ``'voronoi'`` as an option for the ``image_interp`` argument in :func:`mne.viz.plot_topomap` to plot a topomap without interpolation using a Voronoi parcelation (:gh:`10571` by `Alex Rockhill`_)

- Add support for reading data from Gowerlabs devices to :func:`mne.io.read_raw_snirf` (:gh:`10555` by `Robert Luke`_ and `Samuel Powell`_)
Member Author

@samuelpowell are you ok with acknowledging your assistance here?

@samuelpowell

Yes that’s fine, thank you.

Member

@rob-luke this should be :newcontrib: and moved to the top of the new list

@larsoner left a comment

Otherwise looks reasonable but tests are failing


@@ -85,33 +106,33 @@ def test_snirf_basic():
# Test channel naming
assert raw.info['ch_names'][:4] == ["S1_D1 760", "S1_D1 850",
"S1_D9 760", "S1_D9 850"]
assert raw.info['ch_names'][24:26] == ["S5_D13 760", "S5_D13 850"]
# assert raw.info['ch_names'][24:26] == ["S5_D8 760", "S5_D8 850"]
Member

cruft, remove?

Member Author

Gross 😞 , thanks for finding this mess.

@rob-luke commented May 5, 2022

@larsoner @agramfort could you please review? This is good from my end.

@agramfort agramfort marked this pull request as ready for review May 5, 2022 06:32
@@ -459,6 +459,9 @@ def __init__(self, fname, saturated, preload=False, verbose=None):
annot = Annotations(onset, duration, description, ch_names=ch_names)
self.set_annotations(annot)

sort_idx = np.argsort(self.ch_names)
self.pick(picks=sort_idx)
Member

Hum. This will make a copy of the data buffer, but more importantly it is not consistent with the other readers, which read the channels in the order they appear in the file on disk. I prefer to avoid doing magic in readers. Relevant functions down the road should deal with files whose channels are present in different orders, e.g. combine_evoked, grand_average, etc.

Member

This will make a copy of the data buffer

If preload=False it won't because we can pick without loaded data now. This means people who really need to save this 2x memory can do read_raw_nirx(...).load_data() instead. But really NIRS data tends to be pretty small so I don't think this will be an issue in practice.

more importantly this is not consistent with the other readers which read the channels in the order they appear in the file on disk.

I agree with this idea in principle, but it will require a big refactoring of our existing code, which assumes and requires that the frequency pairings are essentially properly "interlaced" to work correctly.

So I would like to add a comment # TODO: Remove this reordering, which would give us some months to properly craft order-agnostic code. I think that will be a bigger undertaking, involving coordination between MNE-Python and MNE-NIRS and changes to probably 5-10 public functions...

@larsoner commented May 13, 2022

@rob-luke I realized this is problematic. Let's say we have this arrangement with the more-or-less standard "logical channel numbering":

S1 -- 1 -- D1 -- 2 -- S2
|          |           |
3          4           5
|          |           |
D3 -- 6 -- S3 -- 7 -- D4

The native file will be written as ['S1_D1 760', 'S1_D1 850', 'S2_D1 760', 'S2_D1 850', 'S1_D3 760', 'S1_D3 850', ...], i.e., a pair ordering of [S1_D1, S2_D1, S1_D3, S3_D1, S2_D4, S3_D3, S3_D4], and people will naturally expect these pairs to be associated with channel numbers 1 through 7. But this pick(argsort(ch_names)) will result in a reordered pair ordering of [S1_D1, S1_D3, S2_D1, S2_D4, S3_D1, S3_D3, S3_D4], so channels 1-7 from before now show up in the order [1, 3, 2, 5, 4, 6, 7] (a quick verification sketch follows).
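
A quick sketch verifying that reordering (pair names only, wavelengths omitted for brevity):

import numpy as np

# On-disk pair order, i.e. "logical channels" 1-7
pairs = ["S1_D1", "S2_D1", "S1_D3", "S3_D1", "S2_D4", "S3_D3", "S3_D4"]
order = np.argsort(pairs)
print((order + 1).tolist())  # -> [1, 3, 2, 5, 4, 6, 7]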

Rather than this pick(argsort(...)) it seems like it would make sense to reorder the channels such that the original "logical channel" ordering is preserved, but such that pairs are still immediately adjacent. We could start by making sure that the first N channels are all fNIRS channels, and that they are paired. Non-fNIRS channels go after that in channels N+1:. Sound good?

Eventually, like we said before, we could/should relax this by making the channel logic smarter everywhere, but that's a bigger undertaking. All we need to do to fix this immediate/pressing "logical channel" reordering problem, I think, is .pick(a_better_order) where a_better_order comes from a few lines of pairing logic.
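
A minimal sketch (assuming names of the form "S1_D1 760") of one way such pairing logic could build a_better_order: keep source-detector pairs in their on-disk first-appearance order, with each pair's wavelengths adjacent and any non-fNIRS channels appended at the end:

def pair_preserving_order(ch_names):
    order, seen = [], set()
    for name in ch_names:
        pair = name.rsplit(" ", 1)[0]  # e.g. "S1_D1 760" -> "S1_D1"
        if "_D" not in pair or pair in seen:
            continue  # skip non-fNIRS names here and pairs already handled
        seen.add(pair)
        # gather every channel of this pair, preserving their on-disk relative order
        order.extend(i for i, n in enumerate(ch_names) if n.rsplit(" ", 1)[0] == pair)
    # append anything not yet included (e.g. non-fNIRS channels), keeping disk order
    order += [i for i in range(len(ch_names)) if i not in set(order)]
    return order

# raw.pick(pair_preserving_order(raw.ch_names))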

Member Author

I disagree that there is a standard logical channel ordering. Even for the trivial example which you illustrate I would assign the channel numbers differently to you.

You have assigned

  • Channel 1: s1 d1
  • Channel 2: s2 d1
  • Channel 3: s1 d3

I would have thought the logical ordering was

  • Channel 1: s1 d1
  • Channel 2: s1 d3
  • Channel 3: s2 d1

I also don't think that the channel number has any meaning. And it's the pair naming that should be used.

That's why I went for a simple sort. It will always be the same and can be easily described.

Member

Sure it can differ by system or setup, but to me the acquisition system sets the "standard order" when it writes data to disk in some order. So the idea is just to respect the order as written to disk as much as possible until we can remove any ordering restrictions whatsoever.

FWIW we used to have this order problem with MEEG data, too, and removing it made the code cleaner in the end. So I'm looking forward to removing the paired-order restriction for fNIRS, as the cleanest long-term solution is just to keep the order each manufacturer uses in the first place, or whatever order a user sets with reorder_channels, etc. Currently reorder_channels is problematic for fNIRS and it shouldn't be!

@agramfort commented May 5, 2022 via email

@larsoner commented May 5, 2022

There are channel pairs that are always processed jointly in a standard pipeline that goes from intensity -> optical density -> chromophore. For example, the first four channels of a raw intensity recording could be

['S1_D1 760', 'S1_D1 850', 'S1_D2 760', 'S1_D2 850']

then these get converted to optical density (using their pairings), and then finally to chromophore (which are still paired) like

['S1_D1 hbo', 'S1_D1 hbr', 'S1_D2 hbo', 'S1_D2 hbr']

So it would be possible in principle to deal with a non-paired ordering, for example one ordered by wavelength like

['S1_D1 760', 'S1_D2 760', 'S1_D2 850', 'S1_D1 850']

but in practice no manufacturer has done this (or anything more random), so the code in many places checks/assumes that channels are already in pair order (a sketch of that standard pipeline follows).
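
A minimal sketch of that standard paired pipeline (reusing the file name from the example above; not specific to this PR's changes):

import mne
from mne.preprocessing.nirs import optical_density, beer_lambert_law

raw = mne.io.read_raw_snirf("lumomat-1-1-0.snirf", preload=True)
raw_od = optical_density(raw)         # "S1_D1 760" / "S1_D1 850" -> optical density
raw_haemo = beer_lambert_law(raw_od)  # paired wavelengths -> "S1_D1 hbo" / "S1_D1 hbr"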

@agramfort left a comment

Fair enough …

@agramfort agramfort merged commit 7fcace5 into mne-tools:main May 5, 2022
@rob-luke commented May 5, 2022

Thank you @agramfort and @larsoner
Thanks for the good question @agramfort and thanks for the well-articulated response @larsoner
I agree with both of you: ideally, we would have order-agnostic code, but it will be a fairly large undertaking.
First I plan to add some of the last missing functionality (e.g. get_montage) before tackling such a refactoring. But I'm happy to review a PR if someone figures out a nice way to make the code order-agnostic.

@rob-luke rob-luke deleted the snirftimeunit branch May 5, 2022 21:33
@samuelpowell commented May 14, 2022

In relation to the Gowerlabs LUMO output:

  • there is an internal canonical ordering deep in our stack, but this is an implementation detail and is not specified; the output in our SNIRF files can be ordered arbitrarily (the SNIRF specification has no requirements in this regard)
  • generally we operate source- or detector-major, then by wavelength, e.g. ['S1_D1 735', 'S1_D2 735', ..., 'S1_D1 850', 'S1_D2 850'], but in any case our ordering wouldn't depend upon the geometry of the channels
  • large caps support over 50k channels, many of which will not contain useful data, so it should be anticipated that the channel output is sparse (i.e. channels may not be formed from every source/detector pair)

Also, the 'useful' ordering depends on how the data is used:

  • for spectroscopy one will group by wavelength as discussed above
  • for reconstruction (at least using e.g. FEM) it is common to arrange channels in the way they are produced when solving the linear system (m^T A q), where m is (n x n_detectors) and q is (n x n_sources); the actual ordering when flattened to a vector depends on the memory layout of the system, but it is consistent and appropriate because it avoids applying permutations repeatedly when you're trying to get on with the dirty business of linear algebra

Given the above I would suggest that there is no 'correct' ordering, and that one will have to apply a permutation depending on use (as indeed is being done here for spectroscopy).

I'm not yet familiar enough with the internal workings of MNE/MNE-NIRS to suggest how that is best implemented, but if you have an ordered, enumerated list of channels and their metadata (e.g. source index, detector index, and wavelength) then it's simple to sort as required. Doing so based upon the contents of a string descriptor may be more challenging (a small sketch of metadata-based sorting follows).
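
A small sketch (with hypothetical metadata arrays) of sorting channels from structured metadata rather than by parsing name strings:

import numpy as np

# Hypothetical per-channel metadata as parallel arrays
src_idx = np.array([2, 2, 1, 1])
det_idx = np.array([1, 1, 1, 1])
wavelength = np.array([850, 735, 850, 735])

# Source-major, then detector, then wavelength: each pair's wavelengths end up adjacent
order = np.lexsort((wavelength, det_idx, src_idx))
print(order.tolist())  # -> [3, 2, 1, 0]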

Successfully merging this pull request may close these issues:

SNIRF import ignores timeUnit field contrary to specification