
BUG: Reference run not guaranteed to be processed before noise runs #759

Closed
allermat opened this issue Jul 6, 2023 · 9 comments · Fixed by #761
Comments

@allermat
Contributor

allermat commented Jul 6, 2023

This has been discussed on the MNE Forum here.

I received an error when running the Maxwell filter step in the MNE-BIDS-Pipeline, saying that the rank of the data in the reference run does not match the empty-room data rank; see below:

[10:55:33] │ ❌ preprocessing/_02_maxfilter sub-09 run-01 A critical error occurred. The error message was: Reference run data rank 73.0 does not match empty-room data rank 71.0 after Maxwell filtering. This indicates that the data were processed  differently.

Aborting pipeline run. The full traceback is:

Traceback (most recent call last):

  File "/home/ma09/.conda/envs/my_mne1.4_fix/lib/python3.11/site-packages/mne_bids_pipeline/_run.py", line 54, in wrapper
    out = memory.cache(func)(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "/home/ma09/.conda/envs/my_mne1.4_fix/lib/python3.11/site-packages/mne_bids_pipeline/_run.py", line 263, in wrapper
    out_files = memorized_func(*args, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "/home/ma09/.conda/envs/my_mne1.4_fix/lib/python3.11/site-packages/joblib/memory.py", line 594, in __call__
    return self._cached_call(args, kwargs)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "/home/ma09/.conda/envs/my_mne1.4_fix/lib/python3.11/site-packages/joblib/memory.py", line 537, in _cached_call
    out, metadata = self.call(*args, **kwargs)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "/home/ma09/.conda/envs/my_mne1.4_fix/lib/python3.11/site-packages/joblib/memory.py", line 779, in call
    output = self.func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "/home/ma09/.conda/envs/my_mne1.4_fix/lib/python3.11/site-packages/mne_bids_pipeline/steps/preprocessing/_02_maxfilter.py", line 264, in run_maxwell_filter
    raise RuntimeError(msg)

RuntimeError: Reference run data rank 73.0 does not match empty-room data rank 71.0 after Maxwell filtering. This indicates that the data were processed  differently.

This occurred only with a specific subject in the dataset, and it has to do with which run I set as the reference. If I set mf_reference_run = '03', I get the error; however, if I set mf_reference_run = '01', the Maxwell filter step runs fine without an error.

@larsoner
Member

larsoner commented Jul 6, 2023

One solution that should work for this would be something like:

mf_bads: Literal["run", "union"] = "run"

where setting it to "union" uses the union of all bads across all of the data that will be Maxwell filtered (runs + noise + rest). In some sense this is the safer option, because autobad only sometimes finds the bad channels it should, so combining across runs with a union is probably the safest / most conservative approach.
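For illustration, a minimal sketch of what such a "union" strategy could compute (the loader and variable names here are placeholders, not actual pipeline code):

    # Collect the union of bad channels across everything that will be
    # Maxwell filtered (experimental runs + noise + rest), then apply the
    # same list to every recording so they all end up with the same rank.
    all_bads = set()
    for run in runs_to_filter:           # hypothetical list of run identifiers
        raw = load_raw_for_run(run)      # hypothetical loader
        all_bads |= set(raw.info["bads"])

    for run in runs_to_filter:
        raw = load_raw_for_run(run)
        raw.info["bads"] = sorted(all_bads)
        # ... Maxwell filter this run with the common bads list ...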

@hoechenberger
Member

hoechenberger commented Jul 6, 2023

Actually I thought this shouldn't happen, as we're injecting the reference run's bads into the empty-room recording:

# Load reference run plus its auto-bads
raw_ref = read_raw_bids(
    bids_path_ref_in,
    extra_params=cfg.reader_extra_params,
    verbose=cfg.read_raw_bids_verbose,
)
if bids_path_ref_bads_in is not None:
    bads = _read_bads_tsv(
        cfg=cfg,
        bids_path_bads=bids_path_ref_bads_in,
    )
    raw_ref.info["bads"] = bads
    raw_ref.info._check_consistency()
raw_ref.pick_types(meg=True, exclude=[])
if prepare_maxwell_filter:
    # We need to include any automatically found bad channels, if relevant.
    # TODO: This 'union' operation should affect the raw runs, too,
    # otherwise rank mismatches will still occur (eventually for some
    # configs). But at least using the union here should reduce them.
    raw_er = mne.preprocessing.maxwell_filter_prepare_emptyroom(
        raw_er=raw_er,
        raw=raw_ref,
        bads="union",
    )
else:
    # Take bads from the reference run
    raw_er.info["bads"] = raw_ref.info["bads"]

@larsoner
Member

larsoner commented Jul 6, 2023

I think the error message might be misleading/wrong here. This happens when processing run 01 when the ref run is 03, at least assuming @allermat is on the latest MNE-BIDS-Pipeline as of a week or so ago when I split the individual runs out from the empty-room ones. @allermat can you make sure you're on the latest dev version and double check?

Either way I think we should add this mf_bads option, though -- having each run have a different rank will probably be problematic eventually, and using the union of bads across all runs will make them all have the same correct info-based rank. We've always used a fixed-across-all-runs union-like bads list at UW partially for this reason.
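As a rough illustration of the rank argument (a hedged sanity check only, not pipeline code; the raw_*_sss variables are assumed to be already Maxwell-filtered recordings):

    import mne

    # If every recording was filtered with the same bads list, the info-based
    # MEG rank should agree across runs and the empty-room recording.
    ranks = {
        name: mne.compute_rank(raw, rank="info")["meg"]
        for name, raw in {"run-01": raw_01_sss, "run-03": raw_03_sss, "noise": raw_er_sss}.items()
    }
    assert len(set(ranks.values())) == 1, f"Rank mismatch: {ranks}"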

Also, separately, it looks like the Epochs we create will inherit the info from the first run rather than the reference run:

for run in cfg.runs:
    this_raw_fname = raw_fname.copy().update(run=run)
    this_raw_fname = _update_for_splits(this_raw_fname, None, single=True)
    raw_filt = mne.io.read_raw_fif(this_raw_fname)
    raws_filt.append(raw_filt)
    del this_raw_fname
# Concatenate the filtered raws and extract the events.
raw_filt_concat = mne.concatenate_raws(raws_filt, on_mismatch="warn")

The first run's .info will not necessarily have a rank that matches that of the reference run (and therefore the empty-room data). So I think we need to fix this as well as add the mf_bads option.
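One way the concatenation step could be made rank-consistent (an illustrative sketch only, assuming run_fnames holds the filtered run paths; this is not the actual fix):

    import mne

    # Give every run the same (union) bads before concatenating, so the
    # concatenated info -- which is taken from the first raw -- reflects the
    # rank we expect everywhere.
    raws_filt = [mne.io.read_raw_fif(fname) for fname in run_fnames]
    union_bads = sorted(set(sum((r.info["bads"] for r in raws_filt), [])))
    for r in raws_filt:
        r.info["bads"] = union_bads
    raw_filt_concat = mne.concatenate_raws(raws_filt, on_mismatch="warn")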

@allermat
Contributor Author

allermat commented Jul 6, 2023

@larsoner I ran this with a slightly older dev version: 1.4.0.dev3+gb0cf4a0. I'll re-run it with the latest and let you know.

@larsoner
Member

larsoner commented Jul 6, 2023

@allermat FYI you might need to clear your cache or derivatives because we changed how caching works

@allermat
Contributor Author

allermat commented Jul 6, 2023

Thanks, I always delete the derivatives and start from scratch whenever I re-run the pipeline

@allermat
Contributor Author

allermat commented Jul 6, 2023

Ok, so with the latest dev version (1.5.0.dev3+gccbac54) maxfilter ran without an issue on the subject I had trouble with. I'll re-run this on the whole dataset and report if I have any issues.

@allermat
Contributor Author

allermat commented Jul 7, 2023

I couldn't test this on all my subjects, but maxfilter seems to run fine now on other subjects as well (at least when I run the pipeline in series, so n_jobs = 1).

Whilst testing this I kept getting a weird missing-file error whenever I ran the pipeline with n_jobs > 1. The missing file is always ...run-03_proc-sss_raw.fif regardless of subject, which happens to be the mf_reference_run. But despite the error, the file is in the derivatives folder, so I'm not sure why it is not found. I use parallel_backend = 'loky', and I haven't encountered such an issue with previous versions of the pipeline.

The error message is:

[16:49:48] │ ❌ preprocessing/_03_maxfilter sub-03 run-noise A critical error occurred. The error message was: fname does not exist: /imaging/davis/Projects/SpeechMisperceptionMEEG/data/derivatives/mne-bids-pipeline_test/sub-03/meg/sub-03_task-main_run-03_proc-sss_raw.fif

Aborting pipeline run. The traceback is:

  File "/home/ma09/.conda/envs/my_mne1.4_fix/lib/python3.11/site-packages/mne_bids_pipeline/_run.py", line 55, in __mne_bids_pipeline_failsafe_wrapper__
    out = memory.cache(func)(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ma09/.conda/envs/my_mne1.4_fix/lib/python3.11/site-packages/mne_bids_pipeline/_run.py", line 267, in wrapper
    memorized_func(*args, **kwargs)
  File "/home/ma09/.conda/envs/my_mne1.4_fix/lib/python3.11/site-packages/joblib/memory.py", line 594, in __call__
    return self._cached_call(args, kwargs)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ma09/.conda/envs/my_mne1.4_fix/lib/python3.11/site-packages/joblib/memory.py", line 537, in _cached_call
    out, metadata = self.call(*args, **kwargs)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ma09/.conda/envs/my_mne1.4_fix/lib/python3.11/site-packages/joblib/memory.py", line 779, in call
    output = self.func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ma09/.conda/envs/my_mne1.4_fix/lib/python3.11/site-packages/mne_bids_pipeline/steps/preprocessing/_03_maxfilter.py", line 253, in run_maxwell_filter
    raw_exp = mne.io.read_raw_fif(bids_path_ref_sss)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ma09/.conda/envs/my_mne1.4_fix/lib/python3.11/site-packages/mne/io/fiff/raw.py", line 540, in read_raw_fif
    return Raw(
           ^^^^
  File "<decorator-gen-271>", line 12, in __init__
  File "/home/ma09/.conda/envs/my_mne1.4_fix/lib/python3.11/site-packages/mne/io/fiff/raw.py", line 93, in __init__
    raw, next_fname, buffer_size_sec = self._read_raw_file(
                                       ^^^^^^^^^^^^^^^^^^^^
  File "<decorator-gen-272>", line 12, in _read_raw_file
  File "/home/ma09/.conda/envs/my_mne1.4_fix/lib/python3.11/site-packages/mne/io/fiff/raw.py", line 175, in _read_raw_file
    fname = str(_check_fname(fname, "read", True, "fname"))
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<decorator-gen-0>", line 12, in _check_fname
  File "/home/ma09/.conda/envs/my_mne1.4_fix/lib/python3.11/site-packages/mne/utils/check.py", line 250, in _check_fname
    raise FileNotFoundError(f"{name} does not exist: {fname}")

@larsoner
Member

larsoner commented Jul 7, 2023

So not sure why it is not found.

I think I get it. We assume when processing empty-room and rest runs that the reference run exists / has finished processing. However, this isn't necessarily the case when n_jobs > 1 because the ref run could be being processed at the same time as an empty room or rest run.

We probably need to fix this by changing the parallelization to be over all non-rest runs, then process the empty room and rest runs (if applicable) afterward.
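Roughly, the ordering could look something like this (function and variable names are illustrative, not the actual pipeline implementation):

    from joblib import Parallel, delayed

    # Phase 1: Maxwell filter all experimental runs (including the reference
    # run) in parallel.
    Parallel(n_jobs=n_jobs)(
        delayed(run_maxwell_filter)(subject, run) for run in experimental_runs
    )

    # Phase 2: only now process empty-room and rest runs, which read the
    # already-written reference-run output.
    Parallel(n_jobs=n_jobs)(
        delayed(run_maxwell_filter)(subject, run) for run in noise_and_rest_runs
    )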

So to summarize I think two changes are needed:

  1. Change parallelization strategy to ensure the ref run is complete before the empty room and rest runs are processed
  2. Add an mf_bads: Literal["run", "union"] = "run" option

I'll re-title this issue for point (1) and then try to quickly fix the bug. Then I'll open a new issue for (2).

@larsoner larsoner changed the title from "Error due to mismatch between Maxwell filter reference run and empty room data rank" to "BUG: Reference run not guaranteed to be processed before noise runs" on Jul 7, 2023