Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

.get_data()'s picks docstring does not explicitly state that bad channels are included by default #12197

Open
hoechenberger opened this issue Nov 12, 2023 · 8 comments
Labels

Comments

@hoechenberger
Copy link
Member

hoechenberger commented Nov 12, 2023

Proposed documentation enhancement

Currently, the doctoring for the picks parameter of .get_data() reads:

picksstr | array_like | slice | None

Channels to include. Slices and lists of integers will be interpreted as channel indices. In lists, channel type strings (e.g., ['meg', 'eeg']) will pick channels of those types, channel name strings (e.g., ['MEG0111', 'MEG2623'] will pick the given channels. Can also be the string values “all” to pick all channels, or “data” to pick data channels. None (default) will pick all channels. Note that channels in info['bads'] will be included if their names or indices are explicitly provided.

See e.g. https://mne.tools/dev/generated/mne.Epochs.html#mne.Epochs.get_data

I always assumed that bad channels are not included by default (None), as I didn't provide their names or indices. But that doesn't seem to be the case.

The doctoring should be updated to explicitly state that by default, bads will be included.

This also suggests that we should take a look at our tutorials that make use of .get_data() and update them to pass picks="data" or similar.

We may also need to take a look at the decoding steps of MNE-BIDS-Pipeline, which use .get_data(), too.

Edit: In most (all?) tutorials we run inst.pick(..., exclude="bads"), which avoids the issue mentioned above.

@hoechenberger hoechenberger changed the title .get_data()'s picks doctoring does not explicitly state that bad channels are included by default .get_data()'s picks docstring does not explicitly state that bad channels are included by default Nov 12, 2023
hoechenberger added a commit to hoechenberger/mne-bids-pipeline that referenced this issue Nov 13, 2023
@drammock
Copy link
Member

thank you for opening this. I think the docstring is confusing but technically not wrong:

None (default) will pick all channels

does imply that bad channels will be included. but the following sentence is the one that I think creates a misleading impression:

Note that channels in info['bads'] will be included if their names or indices are explicitly provided.

This is meant to say "if you're passing an explicit list of indices or strings to picks, then note that we don't automatically exclude channels in .info['bads'] in that case" or in other words: "warning, you might accidentally overrule .info['bads'] if you're not careful about which channel names/indices you pass in to picks"

That is definitely a mouthful and not easy to get from the docstring as it is currently written. So yes, we should try to make it less confusing / ambiguous.

But it also brings up for me a larger issue: as you know the default value of picks=None gets interpreted differently in different parts of the codebase --- sometimes None means "good data channels", sometimes "good and bad data channels", sometimes "good channels", and sometimes "good and bad channels" (AKA all channels). Also I think there are flavors like "good data channels except not ref-meg".

I think it would be a big improvement to both the user and developer experiences if the default in all of these cases was a string representing the flavor: i.e., in funcs/methods where we want the default to be "good data" channels, the default should be something like picks="good data" instead of picks=None. I.e., if you had seen that the default were picks="all data", and known from the docstring (or linked glossary entry?) how to interpret that, i.e., that "all" means "good and bad", then I think this also would have reduced the likelihood of confusion/mistake.

@larsoner
Copy link
Member

I think it would be a big improvement to both the user and developer experiences if the default in all of these cases was a string representing the flavor

+1 on getting rid of picks=None in favor of something like picks="data". But another option that I expect will be easier to implement and work with existing parts of our code and use cases would be to use the pair of picks and exclude to achive "all data" for example as:

picks="data", exclude=()

and "good data" as:

picks="data", exclude="bads"

In other words, we can split up the bad inclusion/exclusion using the exclude kwarg instead of packing it into picks as "good data". exclude=None or exclude="bads" is already used a bunch of places in the codebase:

$ git grep "exclude=\"bads\"" mne | wc -l
170
$ git grep "exclude=None" mne | wc -l
23

IIRC this is what we did recently with the Spectrum-including-bads PRs. Getting these uses to work nicely / as expected with picks="all data" when we already have exclude later in the signature I expect to be a lot messier than 1) adding exclude=... where we currently don't have it and 2) avoiding picks=None everywhere (replaced by explicit picks=<str>, exclude=None | "bads use).

@larsoner
Copy link
Member

larsoner commented Nov 14, 2023

... the other advantage of spltting in this way is things like picks="eeg", exclude=... can work and we don't have to implement stuff like "good eeg" and "all eeg" etc.

@drammock
Copy link
Member

+1 for @larsoner's proposal, on the assumption that we actually have an exclude param in all the relevant places (I think we do but haven't checked). And even if not, it feels cleaner to just add exclude where it's missing (in addition to changing the pick default to "all" or "data") rather than adding handling for a bunch of new strings like "good data" to picks

@cbrnr
Copy link
Contributor

cbrnr commented Nov 14, 2023

I'm also +1 for tackling this with picks and exclude. Regarding the initial comment, I think we actually used to exclude all bads when picks=None. When did this change?

@larsoner
Copy link
Member

I think we actually used to exclude all bads when picks=None. When did this change?

At least from a few years back when we added picks=<str> support I think it could mean "all channels", "all good channels", "all data channels", or "all good data channels" and which one it meant varied from function to function.

@hoechenberger
Copy link
Member Author

I like @larsoner's proposal

@drammock .get_data() unfortunately doesn't have an exclude parameter

@larsoner
Copy link
Member

on the assumption that we actually have an exclude param in all the relevant places (I think we do but haven't checked). And even if not, it feels cleaner to just add exclude where it's missing

.get_data() unfortunately doesn't have an exclude parameter

Indeed there are probably a number of places we'll want to add it. It should be easy enough to add them as we eradicate picks=None -- we're going to be modifying those signatures anyway so it's easy to check for exclude at the same time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants