Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace metadata eval with query #10705

Merged
merged 5 commits into from Jun 7, 2022
Merged

Conversation

cbrnr
Copy link
Contributor

@cbrnr cbrnr commented Jun 3, 2022

Fixes #10700. Let's see if all tests still pass.

mne/utils/mixin.py Outdated Show resolved Hide resolved
@agramfort
Copy link
Member

seams to work @cbrnr !

can you add a test with your toy example

@cbrnr
Copy link
Contributor Author

cbrnr commented Jun 3, 2022

OK, but we might have a problem here, because what should happen if an event_id has the same name as a metainfo column?

import mne
import numpy as np
from pandas import DataFrame, Series, NA

metadata = DataFrame(
    {"A": Series([True, True, True, False, False, NA], dtype="boolean")}
)
rng = np.random.default_rng()
epochs = mne.EpochsArray(
    data=rng.standard_normal(size=(6, 8, 500)),
    info=mne.create_info(8, 250, "eeg"),
    event_id={"A": 1},  # event ID "A" has the same name as metainfo column "A"
    metadata=metadata
)

# ambiguous: should epochs["A"] get epochs of type "A" or where metainfo["A"] matches?
assert len(epochs["A"]) == 3  # gets epochs for which column A == True

assert len(epochs["not A"]) == 2  # gets epochs for which column A == True
assert len(epochs["A.isna()"]) == 1  # gets epochs for NA in column A

@drammock
Copy link
Member

drammock commented Jun 3, 2022

what should happen if an event_id has the same name as a metainfo column?

We should clearly document which one has precedence. IIRC in the code we try event_id first so epochs['A'] should get epochs of type A, not epochs for which metadata column A is truthy. (anyway that's how I think it ought to be). If we want to be really nice we could detect such conflicts and print an info or warning when they occur.

@hoechenberger
Copy link
Member

Ohohoh, I'd like to ensure this doesn't break the MNE-BIDS-Pipeline before we merge this

@mmagnuski
Copy link
Member

@cbrnr @drammock
I also think event_id should have precedence. For boolean columns I'd generally recommend the slightly more verbose query "A == True" to avoid such conflicts.

@agramfort agramfort marked this pull request as ready for review June 6, 2022 20:05
@cbrnr
Copy link
Contributor Author

cbrnr commented Jun 7, 2022

@agramfort I was going to write a bit more compact test and haven't pushed it yet. Do you want to keep the test you added?

@agramfort
Copy link
Member

agramfort commented Jun 7, 2022 via email

@cbrnr
Copy link
Contributor Author

cbrnr commented Jun 7, 2022

Alright, so if there is an event_id labeled A, epochs['A'] will continue to get epochs with this ID (so the previous behavior does not change), even if there is also a metadata column A. If that column A is boolean, we can query all rows where this column is True with epochs["A == True"]. If there is no metadata column with the same name, then epochs['A'] will get all epochs where this column is True.

So I guess this is already what we want?

@hoechenberger
Copy link
Member

hoechenberger commented Jun 7, 2022

I have to say I'd like to avoid us having to make this kind of decision in the first place. Can we not raise a ValueError if the user tries to set metadata with column names equal to an event name?

Or, allow me to go crazy here ... do everything via metadata internally, as in always create metadata with columns event_id and event_name.... 🤔

@agramfort
Copy link
Member

agramfort commented Jun 7, 2022 via email

@cbrnr
Copy link
Contributor Author

cbrnr commented Jun 7, 2022

Building everything upon metadata would be cool, but this will be a huge change. It's probably not worth it unless someone really wants to do it.

In any case, @hoechenberger you said you wanted to verify that everything still works with BIDS – did you have a chance to do that? We should wait for you before we merge this PR.

@hoechenberger
Copy link
Member

In any case, @hoechenberger you said you wanted to verify that everything still works with BIDS – did you have a chance to do that? We should wait for you before we merge this PR.

Nope, will want to do that later today!!

Copy link
Member

@hoechenberger hoechenberger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to be working with the pipeline, hence all good from my end!

@hoechenberger hoechenberger merged commit b2652a6 into mne-tools:main Jun 7, 2022
@hoechenberger
Copy link
Member

Thanks @cbrnr and @agramfort!

@mmagnuski
Copy link
Member

🎉

@cbrnr cbrnr deleted the metadata-query branch June 8, 2022 07:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Missing values in epochs metadata are not handled correctly
5 participants