Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automated event renaming doesn't always do the right thing #936

Closed
hoechenberger opened this issue Jan 6, 2022 · 3 comments · Fixed by #937
Closed

Automated event renaming doesn't always do the right thing #936

hoechenberger opened this issue Jan 6, 2022 · 3 comments · Fixed by #937
Assignees
Labels

Comments

@hoechenberger
Copy link
Member

hoechenberger commented Jan 6, 2022

# %%
from pathlib import Path
from mne_bids import BIDSPath, read_raw_bids

root = Path('~/mne_data/eeg_matchingpennies')
bp = BIDSPath(
    subject='05',
    task='matchingpennies',
    suffix='eeg',
    extension='.vhdr',
    datatype='eeg',
    root=root
)
raw = read_raw_bids(bp)
The event "left" refers to multiple event values. Creating hierarchical event names.
    Renaming event: left -> left/n/a
    Renaming event: left -> left/n/a
    Renaming event: left -> left/1
    Renaming event: left -> left/n/a
    Renaming event: left -> left/n/a
    Renaming event: left -> left/1
    Renaming event: left -> left/n/a
    Renaming event: left -> left/n/a
    Renaming event: left -> left/1
...

The n/as clearly shoulnd't be there.

@hoechenberger
Copy link
Member Author

hoechenberger commented Jan 6, 2022

@sappelhoff (in case you have a minute to spare :))

It appears that the eeg_matchingpennies events TSV file for sub-05 starts out like this:

onset              	duration 	trial_type 	response_time      	stim_file      	value 	sample  	stage 	trial 	bci_prediction 	bci_prediction_valid 	n_repeated 	latency
15.038933333332999 	n/a      	right      	116.666666667      	left_hand.png  	n/a   	n/a     	1     	1     	right          	1                    	2          	-4.2
18.038933333333    	n/a      	right      	116.666666667      	left_hand.png  	n/a   	n/a     	1     	1     	right          	1                    	2          	-4.2
18.1556            	0        	right      	116.666666667      	left_hand.png  	2     	90778   	1     	1     	right          	1                    	2          	-4.2

IMHO this is the same event listed 3 times, and only once associated with a value and a sample, right?

The duration of the first two seems to be in violation of BIDS:

Screen Shot 2022-01-06 at 19 05 21

https://bids-specification.readthedocs.io/en/stable/04-modality-specific-files/05-task-events.html

Now I'm wondering if we should just explicitly skip rows that don't have a proper duration set , or issue a warning, or just remain silent and potentially produce garbage and blame it on the user, who fed us a dataset that's not fully BIDS-compliant?

cc @agramfort

@sappelhoff
Copy link
Member

sappelhoff commented Jan 6, 2022

The n/a durations indicate that they are NOT zero (not instantaneous events), but that they are also unknown --> this is data from back when I was a masters student and recording some things for an internship, so ... there are many things that didn't go perfect 🙂

The basic problem is that back then, I formatted the BIDS dataset in a "by trial" fashion --> each row in events.tsv corresponds to one event in a trial ... and many columns encode additional stuff that happened during the trial ... but not necessarily at the time of the event corresponding to that row

This is bad practice and I should have just recorded all events and organized events.tsv in a "by event" fashion, where each row is an actual event, and each trial has multiple rows (because typically there are a ton of events happening each trial).


The current "version" of matchingpennies was created after a long discussion with Kay because we wanted to add HED tags to the dataset. See: bids-standard/bids-examples#191

We then realized the "by trial" versus "by event" issue, and I pushed changes to make the dataset to be more in a "by event" style ... unfortunately I didn't have enough data to reproduce all the details, so that's where the n/as come from.

Fast forward: Meanwhile, HED has moved to version 3, and we've all gotten a little smarter along the way, so I have a PR open to "revert" the dataset back to the original "by trial" style ... because it's tidier, and HED should be able to deal with "by trial" event structures as well (it's just less detailed than "by events"). See: bids-standard/bids-examples#265

Unfortunately, that PR has stalled for quite a while now.


One last comment: I don't think n/a as duration should be invalid ... sometimes you just don't know the duration 🤔

@hoechenberger
Copy link
Member Author

Thanks @sappelhoff! I'll try to make things work better with the current version of the dataset. Don't bother touching it.

@hoechenberger hoechenberger self-assigned this Jan 6, 2022
hoechenberger added a commit to hoechenberger/mne-bids that referenced this issue Jan 6, 2022
hoechenberger added a commit that referenced this issue Jan 7, 2022
…ad (#937)

* Fix handling of n/a event values when auto-renaming events on read

Fixes #936

* Add tests

* Update changelog

* Nuke Circle cache

* Fix tests?

* Nuke Circle cache again…
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants