[WIP] allow read_raw_bids to take additional input parameters #244

sappelhoff · 2019-07-22T08:18:15Z

closes #203

Merge checklist

Maintainer, please confirm the following before merging:

All comments resolved
This is not your own PR
All CIs are happy
PR title starts with [MRG]
whats_new.rst is updated
PR description includes phrase "closes <#issue-number>"
Commit history does not contain any merge commits

codecov-io · 2019-07-22T09:08:23Z

Codecov Report

Merging #244 into master will increase coverage by 0.02%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #244      +/-   ##
==========================================
+ Coverage   98.41%   98.43%   +0.02%     
==========================================
  Files          15       15              
  Lines        1573     1597      +24     
==========================================
+ Hits         1548     1572      +24     
  Misses         25       25

Impacted Files	Coverage Δ
mne_bids/tests/test_read.py	`100% <100%> (ø)`	⬆️
mne_bids/read.py	`99.26% <100%> (+0.09%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2547049...d94e7d3. Read the comment docs.

doc/whats_new.rst

jasmainak · 2019-07-22T21:14:47Z

I don't quite understand this PR. Shouldn't BIDS structure tell you where these additional files are? You shouldn't have to expose additional arguments for this.

jasmainak · 2019-07-22T21:15:32Z

cli/mne_bids_raw_to_bids.py

+    if ext in ['.con', '.sqd']:
+        read_params = dict(elp=opt.electrode, hsp=opt.hsp, hpi=opt.hpi)
+    elif ext == '.pdf':
+        read_params = dict(config_fname=opt.config, head_shape_fname=opt.hsp)


you shouldn't have to specify them. It should come from BIDS

this gets passed to _read_raw so it's not yet in BIDS AFAIK.

This is the write utility of the CLI ... so the data is not yet in BIDS and has to be supplied "manually"

jasmainak · 2019-07-22T21:16:52Z

mne_bids/read.py

+        parameters of MNE-Python's read_raw_* function for the data format you
+        intend to read. For example, when reading fif data you can specify
+        `read_params` as {allow_maxshield=False}. Non-fitting parameters will
+        be ignored. Not all parameters are currently supported. If None,


A dangerous route!

+100. Don't silently ignore things.

I am raising Userwarning in verbose mode ... should I

always raise a warning?

always throw an error?

... do something else? what?

jasmainak · 2019-07-22T21:20:24Z

mne_bids/read.py

+        The keys of the dictionary must correspond with potential input
+        parameters of MNE-Python's read_raw_* function for the data format you
+        intend to read. For example, when reading fif data you can specify
+        `read_params` as {allow_maxshield=False}. Non-fitting parameters will


I wouldn't bother adding a new argument just for this. Let's do use-case driven software engineering. Ask people around you to try mne-bids and raise issues. It will give you a sense of gratification but also give a sense of priority.

fully agreed. Also besides allow_maxshield all the other params contain info that should be present in the BIDS sidecar files.

see my response: #244 (comment)

agramfort

also I would have relied on something like

read_raw_foo(fname, ..., **read_params)

if you have a parameter that is not valid you get a clear Python exception

agramfort · 2019-07-23T05:38:13Z

cli/mne_bids_raw_to_bids.py

+    if ext in ['.con', '.sqd']:
+        read_params = dict(elp=opt.electrode, hsp=opt.hsp, hpi=opt.hpi)
+    elif ext == '.pdf':
+        read_params = dict(config_fname=opt.config, head_shape_fname=opt.hsp)


this gets passed to _read_raw so it's not yet in BIDS AFAIK.

agramfort · 2019-07-23T05:41:58Z

mne_bids/read.py

+        The keys of the dictionary must correspond with potential input
+        parameters of MNE-Python's read_raw_* function for the data format you
+        intend to read. For example, when reading fif data you can specify
+        `read_params` as {allow_maxshield=False}. Non-fitting parameters will


fully agreed. Also besides allow_maxshield all the other params contain info that should be present in the BIDS sidecar files.

agramfort · 2019-07-23T05:42:26Z

mne_bids/read.py

+        parameters of MNE-Python's read_raw_* function for the data format you
+        intend to read. For example, when reading fif data you can specify
+        `read_params` as {allow_maxshield=False}. Non-fitting parameters will
+        be ignored. Not all parameters are currently supported. If None,


+100. Don't silently ignore things.

sappelhoff · 2019-07-24T11:02:14Z

I don't quite understand this PR

I prepared this PR to target #203

In #203 I argued that our private function _read_raw supports digitization files as inputs ... but these are not exposed to its "parent read_raw_bids.

That is what I wanted to do with this PR (no more, no less).

sappelhoff · 2019-07-24T11:15:44Z

Shouldn't BIDS structure tell you where these additional files are? You shouldn't have to expose additional arguments for this.

I agree, we should support that in the future. Let me talk only about MEG first:

It will be a major effort: We have to rely on these BIDS files:

*_coordsystem.json
*_headshape.pos
- note: this is also in the coordsystem section
- note 2: the headshape business is a bit unclear: there is an ongoing discussion about it, which has been open for a long time; community interest seems low

and from these files, we have to extract the data that we need for a given data format:

fif --> only allow_maxshield should be a problem (rest contained in fif file)
bti --> needs headshape_fname
ctf --> should be no problem (all contained in .ds directory)
kit --> needs mrk, elp, hsp

(what about itab and kriss by the way? It seems like we are not supporting them at the moment? see BIDS MEG formats)

so we need to write reading functions for the BIDS files ... and then code to ...

get the relevant data from the BIDS files for a given data format
enrich the raw data of that format with the additional data (headshape, etc.)

I would not take this on currently, but rather wait for a feature request or user issue.

This present PR however provides an intermediate solution, until the "real deal" is implemented.

sappelhoff · 2019-07-24T11:19:37Z

One last comment: I would separate this discussion between MEG and EEG, because the situation seems much simpler in EEG.

in EEG, we have electrodes.tsv with x,y,z coordinates and coordsystem.json with info on how to interpret the electrodes.tsv

The coordsystem.json also contains coordinates for LPA, RPA, and Nasion.

Reading these would be quite simple, and if we have LPA, RPA, Nasion, and the electrode positions with respect to these three landmarks, we can easily convert this into a Neuromag coordsystem as used in FIF, and we'll get rid of all problems :-)

agramfort · 2019-07-24T13:15:41Z

FYI there is current a rewamping of the digitization / montage mechanism by @massich

the hsp, elp etc... should be handled outside of the read raw function. Vision is to be able to read them outside and then call a set_montage method on the raw.

sappelhoff · 2019-07-24T13:36:59Z

FYI there is current a rewamping of the digitization / montage mechanism by @massich

yes, I really like that - do we know something about the timescale when this will be finished?

agramfort · 2019-07-24T14:10:50Z

Hopefully in a few weeks And before the next release If you have some working code or not using eeg and montage with mne I am sure @massich would love to know

agramfort · 2019-07-25T05:32:35Z

@sappelhoff I would simplify this code. See something like this:

read_params = dict(elp=..., hsp=..., )
read_raw_bla(fname, **read_params)

by doing this no need to check for ignored params. It's easy to document as you need to point to mne-python documentation etc.

jasmainak · 2019-07-25T06:05:02Z

It will be a major effort: We have to rely on these BIDS files:
*_coordsystem.json
*_headshape.pos
note: this is also in the coordsystem section
note 2: the headshape business is a bit unclear: there is an ongoing discussion about it, which has been open for a long time; community interest seems low

yes, but if not a GSoC, when will we undertake such a major effort? :) Then again, I think you should focus your efforts on the study template so you are the user and you figure out what are the issues. We have so far not hit this issue in the study template, or have we?

and from these files, we have to extract the data that we need for a given data format:
fif --> only allow_maxshield should be a problem (rest contained in fif file)

agreed.

bti --> needs headshape_fname

No, your headshape.pos should do the job

ctf --> should be no problem (all contained in .ds directory)
kit --> needs mrk, elp, hsp

These follow the BIDS naming specifications, e.g., sub-control01_ses-001_task-rest_run-01_markers.<mrk,sqd> as in the specification example.

so we need to write reading functions for the BIDS files

you'd simply read in a text file, so that should not be too difficult? Based on the link you posted, it seems to me that you just need to write a reader for CTF. We can ignore KRISS for now and for the rest you already have the digitizer info in the files themselves.

This present PR however provides an intermediate solution, until the "real deal" is implemented.

hmm ... maybe, but I'm not too enthusiastic about this solution.

sappelhoff · 2019-07-25T08:55:41Z

yes, but if not a GSoC, when will we undertake such a major effort? :)

fair enough :-)

hen again, I think you should focus your efforts on the study template so you are the user and you figure out what are the issues.

ok

We have so far not hit this issue in the study template, or have we?

we hit it once, when we wanted to plot sensors, but Alex then fixed it by changing an argument and it's fine now.

This present PR however provides an intermediate solution, until the "real deal" is implemented.

hmm ... maybe, but I'm not too enthusiastic about this solution.

okay, let's be pragmatic: Should I invest some more work and incorporate alex's feedback and then we merge, or should we close this?

teonbrooks · 2019-07-25T22:31:03Z

drive by comment:

for read bids folders in, it should rely on the basic reader and the sidecar files.
for writing, hopefully the new digization refactoring will make it clean and straightforward for non-FIF system to load their dig points and a simple function and be added to the raw object to ultimately be written out to the sidecars.
I think the cosmetics things are fine to merge
I think the additional read_params param should be avoided (should be handled in other like above).

is there a way to set allow_maxshield with FIF objects without adding a param?

sappelhoff · 2019-07-29T08:39:01Z

Thanks for weighing in @teonbrooks :-)

I think the cosmetics things are fine to merge

are there any cosmetic things though? The main refactoring of this PR is the introduction of the _get_read_params function, which is then used for the read_params param.

However, if we are not exposing read_params, I don't think we need the _get_read_params function.

Please advise:

Is there something salvageable from this PR ... or
should we just close it and improve the text in read_raw_bids should read digitization files #203 so that nobody else interprets it in the way that I did 😉

agramfort · 2019-07-29T09:13:51Z

Again I would just add a read_params dict() param to read_raw_bids saying that these are kwargs that can be passed to mne read functions cf mne documentation

…

massich · 2019-07-29T14:09:34Z

yes, I really like that - do we know something about the timescale when this will be finished?

@sappelhoff this is my top priority right now. And the sooner I get done with it the better. Ping me anytime.

teonbrooks · 2019-07-29T16:27:22Z

Again I would just add a read_params dict() param to read_raw_bids saying that these are kwargs that can be passed to mne read functions cf mne documentation

that makes sense. this is me playing the 4 year long game of mne-tools/mne-python#1930 😄

jasmainak · 2019-07-30T03:43:22Z

Let me take over this PR for a moment, so that @sappelhoff you can concentrate on the study-template :) You can review this when you are bored.

jasmainak · 2019-07-30T04:52:10Z

mne_bids/read.py

-                              mrk=hpi, preload=False)
+        mrk = _find_matching_sidecar(bids_fname, bids_root,
+                                     '_markers.???')
+        raw = io.read_raw_kit(raw_fpath, elp=None, hsp=None,


mne-bids only copies the mrk file but mne-python expects also elp and hsp to be provided. This tells me that the writing of this file is already broken. cc @sappelhoff

jasmainak · 2019-07-30T05:47:15Z

@agramfort I'm fine with adding **kwargs if you insist ... so @teonbrooks finally has what he wished for 4 years ago ;-)

But, how do you propose to deal with conflicting kwargs? E.g.: read_raw_edf has an argument called eog and then we also read the channel types from the BIDS sidecar files. If you don't expose this parameter, it's clear that the BIDS sidecar should be used to set the EOG channel. However, now there is a conflict.

Also, I feel doing this will hide some of the issues that you have in BIDS and mne-bids. For example, when I wrote the tests for reading the hsp files in the KIT reader, it's when I realized that all the files necessary are not written by mne-bids.

@sappelhoff I left some todos for you. I think the PR is back to you as I won't have more time :-) Sorry I made a bit of mess while I was playing around, feel free to add issues and revert my commits if that's more efficient.

agramfort · 2019-07-30T07:26:02Z

@agramfort <https://github.com/agramfort> I'm fine with adding **kwargs if you insist ... so @teonbrooks <https://github.com/teonbrooks> finally has what he wished for 4 years ago ;-) But, how do you propose to deal with conflicting kwargs? E.g.: read_raw_edf has an argument called eog and then we also read the channel types from the BIDS sidecar files. If you don't expose this parameter, it's clear that the BIDS sidecar should be used to set the EOG channel. However, now there is a conflict.

use the **kwargs at your own risks and the BIDS files should always have the priority.

Also, I feel doing this will hide some of the issues that you have in BIDS and mne-bids. For example, when I wrote the tests for reading the hsp files in the KIT reader, it's when I realized that all the files necessary are not written by mne-bids.

you need to use the software your write to understand how / when it breaks :)

sappelhoff · 2019-08-14T16:07:25Z

mne_bids/write.py

+                        ('y', y),
+                        ('z', z)])
+    _write_tsv(fname, data, overwrite=overwrite, verbose=verbose)
+    return fname


I think this is slightly problematic, because like this, we will write electrodes.tsv data that might originate from an idealized Montage as opposed to a measured DigMontage 🤔

for BIDS data, we only want to write electrode positions (x,y,z) if these have been measured/digitized ... not some template or idealized positions

cc @jasmainak

I'll have to figure out if it's possible in MNE to distinguish between a DigMontage and a Montage ... currently it seems like once they are set, they are treated equally.

Ideally, I would vote for only writing electrodes.tsv files if we know that

the electrode positions are MEASURED

we have NAS, PRA, LPA to scale the positions to FIFF space

why is that? Does the BIDS specification make this mandate somewhere? I couldn't find it with a quick search ...

i don't think much consideration has been given toward template data in the digitization in the bids discussion. i think it would be nice to have this information but if the analysis ultimately rest on the template, the digpoint data should exist somewhere (in the electrode file or in the attached template). it would be incumbent on the researcher to describe it in their methods somehow.

To me either you have it described in your electrode file, if you have a digitization or in the template if you are using a template. (and the template structure should be a replica of electrode file).

I feel like researchers / data-providers should only provide measured data (=digitized, true sensor positions) ... template positions etc. should be provided by the analysis software.

Sharing template positions without an indicator that the positions in question are in fact just some "standard", "idealized" positions can easily lead to misunderstandings, where somebody thinks that these are true positions (...and this would lead to errors in source localization)

With this reasoning, a researcher would write an electrodes.tsv and accompanying coordsystem.json only if they actually have true positions ... else, they just omit this info and the analysis software can provide a template based on the data format.

Think about it this way: If researchers / data-providers also shared an electrodes.tsv for template positions for each dataset and datafile, we'd have a lot of duplication and clutter ... because the template positions are always the same, depending on the sensor name and manufacturer.

I assume this is true for all kinds of sensors (templates can almost always be unambiguously guessed)

This is not true because if this were true, you would not need raw.set_montage method or montage argument in the raw readers. As we are practical people, I invite you to try writing code to unambiguously guess it ;-) If it cannot be guessed in 5 tries, then the burden of sharing the electrode locations should lie on the researcher sharing the data, no?

If it cannot be guessed in 5 tries, then the burden of sharing the electrode locations should lie on the researcher sharing the data, no?

okay, fair enough --> I already agreed that this could/should be fixed in BIDS (by adding a key-vaul pair whether the electrode locations are template positions or digitized positions)

but not everybody is using BIDS, so we will also have to deal with this in MNE-Python. The suggestion by @larsoner seems good to me at a first glance (using BOTH his points)

just fyi, this is what Robert Oostenveld thinks about this: bids-standard/pyedf#7 (comment)

... the thread might be an interesting read for everybody working on digitization in MNE-Python

For me BIDS (MEG, EEG and iEEG) so far have always been on raw (i.e. measured and unprocessed) data, not on processed (template) or model data (like electrodes on a sphere). That might have had to be stressed more.

I would argue that this should be fixed in the BIDS specification not in MNE. If you share template positions, there should be clear indicator in the specification that it's a template. Ultimately, many researchers won't care as long as they can run their pipeline.

Agreed, I propose that the BIDS spec for raw data is updated to clarify that people should include measured data, not templates. Should that also be extended to template anatomical MRIs? I.e. would people right now also be including for example colin27 as anat/sub-01_T1w.nii, in case they don't have an anatomical MRI?

If you share the electrode locations only in some BIDS datasets, how is the researcher supposed to guess which template it belongs to?

That has been given explicit thought and is part of the specification: you SHOULD specify CapManufacturersModelName and EEGPlacementScheme. These are more precise than merely relying on electrode names, since cap designs vary between manufacturers and sometimes between Caucasian/Asian fit. If you (as experimenter) tell me what cap you used - I (as software developer) can provide you with the template - which I can also update and improve over the years. E.g. I might start with a spherical 10-20 template, then in a later version of the software update that for a more accurate one. Ideally EEG cap manufacturers would get themselves a digitizer, scan all of their cap designs and share those (as we are arguing for here).

I opened an issue for this on BIDS-Specification: bids-standard/bids-specification#318

for MNE-BIDS, we "solved" this in our last call, I summarize the solution here: #264

sappelhoff · 2019-08-23T15:51:48Z

closing, because part of the work will be done in #263

and the rest of this PR can be trashed because it's not (yet) reliant on BIDS.

Feel free to re-open if somebody plans to work on this in the near future - but I won't have the time for this particular thing

sappelhoff added the enhancement label Jul 22, 2019

sappelhoff marked this pull request as ready for review July 22, 2019 08:58

sappelhoff changed the title ~~allow read_raw_bids to take additional input parameters~~ [MRG] allow read_raw_bids to take additional input parameters Jul 22, 2019

sappelhoff requested a review from jasmainak July 22, 2019 09:00

agramfort reviewed Jul 22, 2019

View reviewed changes

doc/whats_new.rst Show resolved Hide resolved

jasmainak reviewed Jul 22, 2019

View reviewed changes

agramfort reviewed Jul 23, 2019

View reviewed changes

sappelhoff added 4 commits July 29, 2019 23:44

first draft

42c4348

default read_params to empty dict if None

ebf60a3

only warn if we have unused params

ab25f0a

simplify logic flow, repair dict key

76c6d02

sappelhoff added 3 commits July 29, 2019 23:47

adjust cli to work with new setup of _read_raw

27ee8c5

add whatsnew

4b667c1

improve testing coverage

4484f2b

jasmainak force-pushed the enh_reader branch from 0be43d3 to 4484f2b Compare July 30, 2019 03:49

TST: marker files are not copied during write

4b6a3c7

jasmainak force-pushed the enh_reader branch from 49da7b6 to 4b6a3c7 Compare July 30, 2019 04:30

jasmainak added 2 commits July 30, 2019 00:42

ENH: mrk is found but elp and hsp not read

d85814c

FIX make it work for BTi

400581e

jasmainak reviewed Jul 30, 2019

View reviewed changes

ENH: add a writer for the electrode positions

a7eb21a

jasmainak changed the title ~~[MRG] allow read_raw_bids to take additional input parameters~~ [WIP] allow read_raw_bids to take additional input parameters Jul 30, 2019

ENH: todo for @sappelhoff

d94e7d3

sappelhoff commented Aug 14, 2019

View reviewed changes

This was referenced Aug 15, 2019

Make it work for "inconsistent runs" (using ds001971) mne-tools/mne-bids-pipeline#41

Closed

[WIP] ENH: add func to write elec tsv #263

Closed

This was referenced Aug 22, 2019

Encoding electrode positions in EDF, EDF+, BDF bids-standard/pyedf#7

Open

Discussion: BIDS-RAW specification --> Sharing of template data ... yes/no? bids-standard/bids-specification#318

Open

sappelhoff closed this Aug 23, 2019

sappelhoff deleted the enh_reader branch March 22, 2020 07:17

[WIP] allow read_raw_bids to take additional input parameters #244

[WIP] allow read_raw_bids to take additional input parameters #244

Conversation

sappelhoff commented Jul 22, 2019

Merge checklist

codecov-io commented Jul 22, 2019 • edited Loading

Codecov Report

jasmainak commented Jul 22, 2019

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

agramfort left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sappelhoff commented Jul 24, 2019

sappelhoff commented Jul 24, 2019

sappelhoff commented Jul 24, 2019

agramfort commented Jul 24, 2019

sappelhoff commented Jul 24, 2019

agramfort commented Jul 24, 2019 via email

agramfort commented Jul 25, 2019

jasmainak commented Jul 25, 2019

sappelhoff commented Jul 25, 2019

teonbrooks commented Jul 25, 2019

sappelhoff commented Jul 29, 2019

agramfort commented Jul 29, 2019 via email

massich commented Jul 29, 2019

teonbrooks commented Jul 29, 2019

jasmainak commented Jul 30, 2019

Choose a reason for hiding this comment

jasmainak commented Jul 30, 2019

agramfort commented Jul 30, 2019 via email

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sappelhoff Aug 15, 2019 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sappelhoff commented Aug 23, 2019

codecov-io commented Jul 22, 2019 •

edited

Loading

sappelhoff Aug 15, 2019 •

edited

Loading