How to handle datasets with invalid info[meas_id][secs]? #7803

Open
hoechenberger opened this issue May 20, 2020 · 11 comments

@hoechenberger (Member) commented May 20, 2020

I'm working with the ds000246 OpenNeuro dataset:

```console
$ aws s3 sync --no-sign-request s3://openneuro.org/ds000246 ds000246
$ cd ds000246/sub-emptyroom/meg
```

Reading the data works as expected:

```python
import mne
raw = mne.io.read_raw_ctf('sub-emptyroom_task-noise_run-01_meg.ds')
```

Writing throws an exception:

```python
raw.save('/tmp/foo.fif')
```

Traceback:

```
RuntimeError                              Traceback (most recent call last)
<ipython-input-4-eb369e79ee42> in <module>
----> 1 raw.save('/tmp/foo.fif')

<decorator-gen-155> in save(self, fname, picks, tmin, tmax, buffer_size_sec, drop_small_buffer, proj, fmt, overwrite, split_size, split_naming, verbose)

~/Development/mne-python/mne/io/base.py in save(self, fname, picks, tmin, tmax, buffer_size_sec, drop_small_buffer, proj, fmt, overwrite, split_size, split_naming, verbose)
   1379                 "split_naming must be either 'neuromag' or 'bids' instead "
   1380                 "of '{}'.".format(split_naming))
-> 1381         _write_raw(fname, self, info, picks, fmt, data_type, reset_range,
   1382                    start, stop, buffer_size, projector, drop_small_buffer,
   1383                    split_size, split_naming, part_idx, None, overwrite)

~/Development/mne-python/mne/io/base.py in _write_raw(fname, raw, info, picks, fmt, data_type, reset_range, start, stop, buffer_size, projector, drop_small_buffer, split_size, split_naming, part_idx, prev_fname, overwrite)
   1844 
   1845     picks = _picks_to_idx(info, picks, 'all', ())
-> 1846     fid, cals = _start_writing_raw(use_fname, info, picks, data_type,
   1847                                    reset_range, raw.annotations)
   1848 

~/Development/mne-python/mne/io/base.py in _start_writing_raw(name, info, sel, data_type, reset_range, annotations)
   2018         cals.append(info['chs'][k]['cal'] * info['chs'][k]['range'])
   2019 
-> 2020     write_meas_info(fid, info, data_type=data_type, reset_range=reset_range)
   2021 
   2022     #

~/Development/mne-python/mne/io/meas_info.py in write_meas_info(fid, info, data_type, reset_range)
   1453     """
   1454     info._check_consistency()
-> 1455     _check_dates(info)
   1456 
   1457     # Measurement info

~/Development/mne-python/mne/io/meas_info.py in _check_dates(info, prepend_error)
   1411                 if (value[key_2] < np.iinfo('>i4').min or
   1412                         value[key_2] > np.iinfo('>i4').max):
-> 1413                     raise RuntimeError('%sinfo[%s][%s] must be between '
   1414                                        '"%r" and "%r", got "%r"'
   1415                                        % (prepend_error, key, key_2,

RuntimeError: info[meas_id][secs] must be between "-2147483648" and "2147483647", got "-5364633480"
```

What's the best way to deal with data like this? Can I simply set info[meas_id][secs] to an arbitrary (valid) value? Also, it seems a little odd that I can read (and work with) the data, but then cannot write it back to disk…

@larsoner (Member) commented:

> Also, it seems a little odd that I can read (and work with) the data, but then cannot write it back to disk…

The FIF format in particular has a limit on how large a span of dates it can write because it writes out seconds in int32. Other formats that use other methods (e.g., storing seconds in int64, or dates in a suitable string format) will not suffer from this problem.
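
For illustration, the bound being enforced (the `np.iinfo('>i4')` check visible in `_check_dates` in the traceback) is just the range of a big-endian signed 32-bit integer, and the value from this recording falls well outside it:

```python
import numpy as np

# FIF stores these timestamps as big-endian signed 32-bit integers ('>i4'),
# so the seconds value has to fit within this range.
i4 = np.iinfo('>i4')
print(i4.min, i4.max)            # -2147483648 2147483647

secs = -5364633480               # value reported in the traceback above
print(i4.min <= secs <= i4.max)  # False -> write_meas_info raises RuntimeError
```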

As to how to fix it, you can set it to zero and things will work (unless you have saved separate annotations you want to add), but be careful if you ever want to do something having to do with dates across multiple subjects or runs. Typically during anonymization you shift all subjects and runs by some fixed amount so that their relative timings stay fixed. Wiping out the meas_date will make this no longer be the case.
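
For concreteness, a minimal sketch of the "set it to zero" route (assuming `info['meas_id']` is the only out-of-range field; `info['file_id']` and `info['meas_date']` may need the same treatment if they carry the same timestamp, and `raw.anonymize()` is usually the cleaner way to achieve this):

```python
import mne

raw = mne.io.read_raw_ctf('sub-emptyroom_task-noise_run-01_meg.ds')

# The offending field from the traceback: seconds relative to the Unix epoch,
# which must fit into FIF's int32 representation when saving.
print(raw.info['meas_id']['secs'])   # -5364633480 for this recording

# Zero it out (this discards the original acquisition date).
raw.info['meas_id']['secs'] = 0
raw.info['meas_id']['usecs'] = 0

raw.save('/tmp/foo.fif', overwrite=True)
```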

@agramfort (Member) commented May 21, 2020 via email

@bloyl (Contributor) commented May 21, 2020 via email

@hoechenberger (Member, Author) commented May 21, 2020 via email

@agramfort (Member) commented May 22, 2020 via email

@hoechenberger (Member, Author) commented:

> bids validator cannot read meg files just the file names so he cannot detect these issues.

Wait, so you're saying there's BIDS-relevant metadata stored in a file format that the BIDS validator cannot read? Shouldn't this be stored in a sidecar file, like the events??

@hoechenberger (Member, Author) commented:

Thanks @larsoner for the explanation, and thanks @bloyl for the suggestion to try re-anonymizing; I will look into this and see how it goes!
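
For later readers, a rough sketch of what the re-anonymization route could look like with `raw.anonymize()` (a sketch under the assumption that resetting or shifting all date fields is acceptable for this empty-room recording, not the exact steps used here):

```python
import mne

raw = mne.io.read_raw_ctf('sub-emptyroom_task-noise_run-01_meg.ds')

# With no arguments, anonymize() resets meas_date (and the matching
# meas_id/file_id timestamps) to a fixed default date, which fits into
# FIF's int32 field. Passing daysback=<int> instead shifts all dates by
# the same amount; using one common daysback for every related recording
# preserves their relative timings, as noted above.
raw.anonymize()

raw.save('/tmp/foo.fif', overwrite=True)
```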

@bloyl (Contributor) commented May 24, 2020

This raises an interesting question.

What is the expectation if BIDS sidecar information differs from what is stored in the underlying imaging data headers?

@hoechenberger (Member, Author) commented:

> What is the expectation if BIDS sidecar information differs from what is stored in the underlying imaging data headers?

I believe the sidecar-based values always take precedence.

@agramfort (Member) commented May 25, 2020 via email

@davidcian commented:

Same issue here with the Temple University TUAR dataset. Ended up just dropping the meas_date.
