Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

write_raw_bids does not produce a valid SET file #1122

Closed
laemtl opened this issue Mar 7, 2023 · 20 comments · Fixed by #1126
Closed

write_raw_bids does not produce a valid SET file #1122

laemtl opened this issue Mar 7, 2023 · 20 comments · Fixed by #1126

Comments

@laemtl
Copy link
Contributor

laemtl commented Mar 7, 2023

Description of the problem

The function write_raw_bids does not produce a valid .set file that can be read by EEGLAB. When importing a .set file that was generated by write_raw_bids in EEGLAB, we systematically get the following error:

Screenshot (23)

Steps to reproduce

import mne
import os
from mne_bids import write_raw_bids, BIDSPath

file = r'C:\Users\Lae\Documents\test_data\DCC0003_444597_V01_FACE.set'
bids_directory = r'C:\Users\Lae\Documents\test_data'
raw = mne.io.read_raw_eeglab(input_fname=file, preload=False, verbose=True)

os.makedirs(bids_directory + os.path.sep, exist_ok=True)
bids_directory = bids_directory + os.path.sep
bids_root = bids_directory
bids_basename = BIDSPath(subject='001', task='test', root=bids_root, acquisition='eeg', run=None)
bids_basename.update(session='V1')
write_raw_bids(raw, bids_basename, allow_preload=False, overwrite=False, verbose=True)

Expected results

A valid EEGLAB file

Actual results

File can't be open, see above error.

Additional information

Platform: Windows-10-10.0.19041-SP0
Python: 3.8.5 (tags/v3.8.5:580fbb0, Jul 20 2020, 15:57:54) [MSC v.1924 64 bit (AMD64)]
Executable: C:\Users\Lae\Documents\hbcd-eeg2bids\Scripts\python.exe
CPU: Intel64 Family 6 Model 142 Stepping 9, GenuineIntel: 4 cores
Memory: Unavailable (requires "psutil" package)
mne: 1.2.0
numpy: 1.20.3 {OpenBLAS 0.3.13.dev with 4 threads}
scipy: 1.9.2
matplotlib: 3.6.1 {backend=TkAgg}

sklearn: 1.1.2
numba: 0.56.2
nibabel: Not found
nilearn: Not found
dipy: Not found
openmeeg: Not found
cupy: Not found
pandas: 1.5.0
pyvista: Not found
pyvistaqt: Not found
ipyvtklink: Not found
vtk: Not found
qtpy: Not found
ipympl: Not found
pyqtgraph: Not found
pooch: v1.6.0

mne_bids: 0.12
mne_nirs: Not found
mne_features: 0.2.1
mne_qt_browser: Not found
mne_connectivity: Not found
mne_icalabel: Not found

@laemtl laemtl added the bug label Mar 7, 2023
@agramfort
Copy link
Member

agramfort commented Mar 7, 2023 via email

@sappelhoff
Copy link
Member

cc @jackz314

@jackz314
Copy link

jackz314 commented Mar 7, 2023

I don't think mne-bids it uses eeglabio for exporting to set format, seems like it uses an internal function copyfile_eeglab and the bug should be fixed there.

@laemtl you might want to consider the Raw.export function instead for now, which uses eeglabio to export to eeglab format.

@laemtl
Copy link
Contributor Author

laemtl commented Mar 8, 2023

Thanks @jackz314, I will give it a try! I believe the issue happens here, where savemat is used:

savemat(file_name=dest, mdict=mdict, appendmat=False)

I also tracked down the issue. The error I'm experiencing is caused by EEG.srate, EEG.xmin being uint16 and uint8.

@sappelhoff
Copy link
Member

thanks @jackz314 I somehow forgot about that.

@laemtl -- interesting: Do you think we could fix that error using savemat?

@laemtl
Copy link
Contributor Author

laemtl commented Mar 8, 2023

@sappelhoff @jackz314 What about using eeglabio? It does a pretty good job in converting all the values back to float since its last release.

@sappelhoff
Copy link
Member

sappelhoff commented Mar 8, 2023

My principle with all the copyfile functions that we have is:

  • we should try to touch raw data as little as possible
  • if we can copy it, that's better than reading it in and writing it out, as during that stage, errors could occur

Having that said ... if we don't find an easy fix for our copyfile_eeglab, I would be in favor of switching our mne-bids code to use eeglabio

@laemtl I'd be grateful if you could spend some time to see whether copyfile_eeglab could be fixed for your needs:

def copyfile_eeglab(src, dest):
"""Copy an EEGLAB file to a new location.
If the EEGLAB ``.set`` file comes with an accompanying ``.fdt`` binary file
that contains the actual data, this function will copy this file, too, and
update all internal pointers in the new ``.set`` file.
Parameters
----------
src : path-like
Path to the source raw .set file.
dest : path-like
Path to the destination of the new .set file.
See Also
--------
copyfile_brainvision
copyfile_bti
copyfile_ctf
copyfile_edf
copyfile_kit
"""
if not mne.utils.check_version('scipy', '1.5.0'): # pragma: no cover
raise ImportError('SciPy >=1.5.0 is required handling EEGLAB data.')
# Get extension of the EEGLAB file
_, ext_src = _parse_ext(src)
fname_dest, ext_dest = _parse_ext(dest)
if ext_src != ext_dest:
raise ValueError(f'Need to move data with same extension'
f' but got {ext_src}, {ext_dest}')
# Load the EEG struct
# NOTE: do *not* simplify cells, because this changes the underlying
# structure and potentially breaks re-reading of the file
uint16_codec = None
eeg = loadmat(file_name=src, simplify_cells=False,
appendmat=False, uint16_codec=uint16_codec)
oldstyle = False
if 'EEG' in eeg:
eeg = eeg['EEG']
oldstyle = True
try:
# If the data field is a string, it points to a .fdt file in src dir
if isinstance(eeg['data'][0, 0][0], str):
has_fdt_link = True
except IndexError:
has_fdt_link = False
if has_fdt_link:
fdt_fname = eeg['data'][0, 0][0]
assert fdt_fname.endswith('.fdt'), f'Unexpected fdt name: {fdt_fname}'
head, _ = op.split(src)
fdt_path = op.join(head, fdt_fname)
# Copy the .fdt file and give it a new name
fdt_name_new = fname_dest + '.fdt'
sh.copyfile(fdt_path, fdt_name_new)
# Now adjust the pointer in the .set file
# NOTE: Clunky numpy code is to match MATLAB structure for "savemat"
_, tail = op.split(fdt_name_new)
new_value = np.empty((1, 1), dtype=object)
new_value[0, 0] = np.atleast_1d(np.array(tail))
eeg['data'] = new_value
# Save the EEG dictionary as a Matlab struct again
mdict = dict(EEG=eeg) if oldstyle else eeg
savemat(file_name=dest, mdict=mdict, appendmat=False)
else:
# If no .fdt file, simply copy the .set file, no modifications
# necessary
sh.copyfile(src, dest)

@laemtl
Copy link
Contributor Author

laemtl commented Mar 8, 2023

@sappelhoff My idea is to modify copyfiles.py and use eeglabio export_mne_raw instead of loadmat/savemat:

raw = mne.io.read_raw(...)
export_mne_raw(raw, "file_name.set").

So the logic in copyfiles.py will remain the same but the read/write logic will be more specific to the eeglab specs to prevent int conversion.

@sappelhoff
Copy link
Member

Okay, I guess that would be better. Would you be willing to submit a pull request to implement this behavior?

@laemtl
Copy link
Contributor Author

laemtl commented Mar 8, 2023

Sure :)

@sappelhoff
Copy link
Member

who knows, maybe this will also fix #1120

cc @kaare-mikkelsen

@sappelhoff sappelhoff added this to the 0.13 milestone Mar 8, 2023
@laemtl
Copy link
Contributor Author

laemtl commented Mar 8, 2023

@sappelhoff Given the side-effects of mne.io.read_raw and export_mne_raw that can lead to errors while reading/saving the data, I opted for a simpler solution. See #1126.

@jasmainak
Copy link
Member

The fact that the file could not be copied indicates a problem with the original file, no? I think that's outside the purview of MNE-BIDS

@laemtl
Copy link
Contributor Author

laemtl commented Mar 8, 2023

@jasmainak Are you talking about #1120?

@sappelhoff
Copy link
Member

The fact that the file could not be copied

if an fdt file is present, we don't strictly copy -- we read the file, edit the file pointer to the fdt file, and write it back (leaving everything else untouched). I think this is still where somehow the issue with different integer types occurs. Can you confirm @laemtl?

@jasmainak
Copy link
Member

I commented in #1126 . Maybe eeglabio is the way to go if it preserves integer types better than savemat ? We don't want to take up the maintenance burden of writing EEGLAB files, it's opening a can of worms

@laemtl
Copy link
Contributor Author

laemtl commented Mar 8, 2023

@sappelhoff yes, loadmat converts non-decimal numbers to int.

@laemtl
Copy link
Contributor Author

laemtl commented Mar 8, 2023

@jasmainak eeglabio implements the same logic that I did here, casting values in-between loadmat and savemat calls. The issue with eeglabio is that it requires the data to be loaded with mne.io.BaseRaw which parses/rejects the data according to some criterion, and I think it's preferable to minimize the dataset validation/alteration.

@jasmainak
Copy link
Member

okay I leave the decision to @sappelhoff ! Was just a drive-by comment :)

@laemtl
Copy link
Contributor Author

laemtl commented Mar 8, 2023

@jasmainak I edited #1126 based on your comment. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging a pull request may close this issue.

5 participants