BUG: Too many JSON files error #717

Closed
larsoner opened this issue Mar 1, 2023 · 7 comments · Fixed by #743

Comments

@larsoner
Member

larsoner commented Mar 1, 2023

https://mne.discourse.group/t/mne-bids-pipeline-too-many-json-files-error/6436

This is a MEG dataset which I converted to BIDS format using mne-bids. The json files which appear to cause the trouble are [_beh.json] sidecar files for the behavioural data I saved to the beh/ subfolder like so:

|MNE-BIDS_data/
|— README
|— dataset_description.json
|— participants.json
|— participants.tsv
|— sub-01/
|------ sub-01_scans.tsv
|------ beh/
|--------- sub-01_task-main_run-01_beh.json
|--------- sub-01_task-main_run-01_beh.tsv

@allermat
Contributor

allermat commented Mar 3, 2023

I think the issue is caused by this statement in mne_bids_pipeline/steps/init/_02_find_empty_room.py line 41:

if hasattr(bids_path_in, "find_matching_sidecar"):
    in_files["sidecar"] = (
        bids_path_in.copy()
        .update(datatype=None)
        .find_matching_sidecar(extension=".json")
    )

When I execute it on my data, datatype is set to 'meg' in bids_path_in, but this statement resets it to None before calling find_matching_sidecar(extension=".json"). As a result, it finds all matching .json files regardless of datatype. In my case, the sidecar files in beh/ also match the search criteria, because they follow the same naming pattern as the MEG files except for the _beh suffix (i.e., meg/sub-01_task-main_run-01_meg.json vs. beh/sub-01_task-main_run-01_beh.json).
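To illustrate the ambiguity, here is a minimal sketch that rebuilds the two sidecar files from the layout above in a temporary directory and searches for them with and without a datatype restriction. The helper find_json_sidecars and its glob pattern are hypothetical and only mimic the idea of the search; this is not how MNE-BIDS implements find_matching_sidecar.

```python
import tempfile
from pathlib import Path


def find_json_sidecars(root, basename, datatype=None):
    """Return JSON files under root whose name starts with basename.

    Hypothetical helper, not the MNE-BIDS implementation: with
    datatype=None, every datatype folder of every subject is searched.
    """
    folder = "*" if datatype is None else datatype
    return sorted(root.glob(f"sub-*/{folder}/{basename}_*.json"))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Recreate the meg/ and beh/ sidecars described in the issue.
    for datatype in ("meg", "beh"):
        folder = root / "sub-01" / datatype
        folder.mkdir(parents=True)
        (folder / f"sub-01_task-main_run-01_{datatype}.json").write_text("{}")

    basename = "sub-01_task-main_run-01"
    # No datatype restriction: both sidecars are candidates.
    all_names = [p.name for p in find_json_sidecars(root, basename)]
    # Restricting the search to meg/ leaves a single candidate.
    meg_names = [
        p.name for p in find_json_sidecars(root, basename, datatype="meg")
    ]

print(all_names)
print(meg_names)
```

With the datatype left open, the search has two candidates and cannot pick one, which matches the "too many JSON files" symptom described here.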

If I remove the call to update(datatype=None), the code moves on uninterrupted until it bumps into the same thing in mne_bids/path.py line 914:

sidecar_fname = \
    self.copy().update(datatype=None).find_matching_sidecar(
        extension='.json')

If I remove the call to update(datatype=None) here as well, the pipeline completes init/_02_find_empty_room without an error and moves on to the next step.

I'm not sure why the update of datatype to None is necessary, so I wanted to ask whether others have an idea, and whether removing it in both mne_bids_pipeline and mne_bids is a sensible solution.

@hoechenberger
Member

I'm not sure why the update of datatype to None is necessary,

Me neither. We should try dropping it and check whether the MNE-BIDS test suite still passes.

@allermat
Contributor

allermat commented Mar 6, 2023

Thanks, I'll try it as soon as I can.

@allermat
Contributor

Hi,

I'm trying to run the MNE-BIDS tests locally to verify that my changes did not break anything, but the run already fails at the test collection stage. See the error output here:

(mnedev) [ma09@login-j01 mne-bids]$ make test
Running tests
Test session starts (platform: linux, Python 3.10.9, pytest 7.2.2, pytest-sugar 0.9.6)
cachedir: .pytest_cache
rootdir: /imaging/davis/users/ma09/devel/mne-bids, configfile: setup.cfg
plugins: anyio-3.6.2, sugar-0.9.6, cov-4.0.0
collecting ... 
―――――――――――――――――――――――――――――― ERROR collecting test session ――――――――――――――――――――――――――――――
/home/ma09/.conda/envs/mnedev/lib/python3.10/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1050: in _gcd_import
    ???
<frozen importlib._bootstrap>:1027: in _find_and_load
    ???
<frozen importlib._bootstrap>:992: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:241: in _call_with_frames_removed
    ???
<frozen importlib._bootstrap>:1050: in _gcd_import
    ???
<frozen importlib._bootstrap>:1027: in _find_and_load
    ???
<frozen importlib._bootstrap>:1006: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:688: in _load_unlocked
    ???
<frozen importlib._bootstrap_external>:883: in exec_module
    ???
<frozen importlib._bootstrap>:241: in _call_with_frames_removed
    ???
mne_bids/__init__.py:12: in <module>
    from mne_bids.write import (make_dataset_description, write_anat,
mne_bids/write.py:21: in <module>
    from pkg_resources import parse_version
/home/ma09/.conda/envs/mnedev/lib/python3.10/site-packages/pkg_resources/__init__.py:121: in <module>
    warnings.warn("pkg_resources is deprecated as an API", DeprecationWarning)
E   DeprecationWarning: pkg_resources is deprecated as an API

----------------------------------------------------------------- generated xml file: /imaging/davis/users/ma09/devel/mne-bids/junit-results.xml ------------------------------------------------------------------

---------- coverage: platform linux, python 3.10.9-final-0 -----------
Coverage XML written to file coverage.xml

============================================================================================= short test summary info =============================================================================================
FAILED  - DeprecationWarning: pkg_resources is deprecated as an API
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Results (3.49s):
make: *** [test] Error 2

To me this looks like there's something wrong in my conda environment setup. I set up my development conda environment using the instructions found here.

Could anyone please help me sort this out?

Many thanks,
Máté
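The traceback above is consistent with warnings being escalated to errors while pytest imports the package, which pytest's filterwarnings option can be set to do. Whether this project's setup.cfg actually does so is an assumption, not something the log confirms. The sketch below reproduces the mechanism with the standard warnings module only; import_with_deprecated_dependency is a hypothetical stand-in for the real import.

```python
import warnings


def import_with_deprecated_dependency():
    """Hypothetical stand-in for an import that emits a DeprecationWarning."""
    warnings.warn("pkg_resources is deprecated as an API", DeprecationWarning)
    return "imported"


# With an "error" filter, the warning is raised as an exception.
with warnings.catch_warnings():
    warnings.simplefilter("error")
    try:
        strict_outcome = import_with_deprecated_dependency()
    except DeprecationWarning as exc:
        strict_outcome = f"failed: {exc}"

# Adding a targeted "ignore" filter for this message lets the call finish.
# filterwarnings() inserts at the front, so it takes precedence over "error".
with warnings.catch_warnings():
    warnings.simplefilter("error")
    warnings.filterwarnings(
        "ignore",
        message="pkg_resources is deprecated",
        category=DeprecationWarning,
    )
    relaxed_outcome = import_with_deprecated_dependency()

print(strict_outcome)
print(relaxed_outcome)
```

If that assumption holds, the failure would come from the installed setuptools version emitting a new warning rather than from a broken conda environment.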

@allermat
Contributor

Hi,

I'm back at this now. Since my last post, the code in mne_bids/path.py has been updated. The relevant bit is now at line 990:

sidecar_fname = (
    self.copy()
    .update(datatype=None, suffix="meg")
    .find_matching_sidecar(extension=".json")
)

I haven't tested this, but I think this also fixes the issue: the search is now restricted to files with suffix='meg', so the ones with suffix='beh' won't be selected.
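As a rough check of that reasoning, the sketch below filters the two sidecar paths from my dataset by datatype and by suffix. The function filter_sidecars is hypothetical and only models the selection logic as I understand it; it does not call MNE-BIDS.

```python
from pathlib import PurePosixPath

# The two sidecars that share everything except datatype folder and suffix.
candidates = [
    PurePosixPath("sub-01/meg/sub-01_task-main_run-01_meg.json"),
    PurePosixPath("sub-01/beh/sub-01_task-main_run-01_beh.json"),
]


def filter_sidecars(paths, datatype=None, suffix=None):
    """Hypothetical filter: None means 'do not constrain this entity'."""
    selected = []
    for path in paths:
        path_datatype = path.parent.name  # e.g. "meg"
        path_suffix = path.stem.rsplit("_", 1)[-1]  # e.g. "meg"
        if datatype is not None and path_datatype != datatype:
            continue
        if suffix is not None and path_suffix != suffix:
            continue
        selected.append(path)
    return selected


# datatype=None alone: both files remain candidates (the original problem).
unconstrained = filter_sidecars(candidates)
# datatype=None with suffix="meg": only the MEG sidecar is left.
by_suffix = filter_sidecars(candidates, datatype=None, suffix="meg")
# Keeping datatype="meg" instead gives the same result for this dataset.
by_datatype = filter_sidecars(candidates, datatype="meg")

print(len(unconstrained), by_suffix, by_datatype)
```

For this layout, constraining the suffix and keeping the datatype select the same single file, so either change would remove the ambiguity here.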

Just to recap, my suggested solution was to completely remove the call to .update() like so:

sidecar_fname = (
    self.copy()
    .find_matching_sidecar(extension=".json")
)

I checked the code in mne_bids_pipeline/steps/init/_02_find_empty_room.py, and there the code is still the same as in my original post. I am happy to update that and make a PR. Should I use the solution applied in mne_bids/path.py above?

Thanks,
Máté

@larsoner
Member Author

I am happy to update that and make a PR. Should I use the solution applied in mne_bids/path.py above?

Yes I would go with the mne-bids solution if it works, PR welcome!

allermat added a commit to allermat/mne-bids-pipeline that referenced this issue Jun 19, 2023
…oom() in steps/init/_02_find_empty_room.p
@allermat
Contributor

Hi, I just sent a PR, I hope I did it right (this was my first one). I couldn't run the test suite unfortunately as for some reason it doesn't run on my system (see above).
Let me know if I can do anything else.
