FileNotFoundError: [Errno 2] No such file or directory: '_task-rest_bold.json' #516

Closed
m-petersen opened this issue Jul 7, 2021 · 9 comments · Fixed by #523 or #564

m-petersen commented Jul 7, 2021

Summary

I am trying to implement BIDS conversion with heudiconv on our local HPC using a Singularity container (heudiconv 0.9.0). Since I will be working with a large cohort (>2000 subjects), I am currently setting up parallelization across nodes via SLURM and, within each node, via GNU parallel, using a test dataset (the same subject replicated 5 times). In doing so, a test run fails with the following error.
```
local:5/0/100%/0.0s 0: Traceback (most recent call last):
0:   File "/opt/miniconda-latest/bin/heudiconv", line 33, in <module>
0:     sys.exit(load_entry_point('heudiconv', 'console_scripts', 'heudiconv')())
0:   File "/src/heudiconv/heudiconv/cli/run.py", line 24, in main
0:     workflow(**kwargs)
0:   File "/src/heudiconv/heudiconv/main.py", line 351, in workflow
0:     grouping=grouping,)
0:   File "/src/heudiconv/heudiconv/convert.py", line 238, in prep_conversion
0:     getattr(heuristic, 'DEFAULT_FIELDS', {}))
0:   File "/src/heudiconv/heudiconv/bids.py", line 94, in populate_bids_templates
0:     populate_aggregated_jsons(path)
0:   File "/src/heudiconv/heudiconv/bids.py", line 126, in populate_aggregated_jsons
0:     json_ = load_json(fpath)
0:   File "/src/heudiconv/heudiconv/utils.py", line 177, in load_json
0:     with open(filename, 'r') as fp:
0: FileNotFoundError: [Errno 2] No such file or directory: '/bids/sub-ewgenia001/ses-1/func/sub-ewgenia001_ses-1_task-rest_bold.json'
```

Interestingly, this affects only 4 of 5 subjects; the remaining one (seemingly always the first subject to complete) finishes without issues. This sounds a bit like the race condition discussed in #362. However, as far as I understand, a fix for that has been implemented, and I am using datalad to create an ephemeral clone of the dataset on a scratch partition for each subject before applying heudiconv to it (scripts attached below, with a simplified sketch after them). Maybe I am misunderstanding something, but shouldn't the latter address the race condition, given that each process writes to its own cloned copy of the top-level files?

Any input would be highly appreciated. Happy to provide further details.

heudiconv_heuristic.txt
pipelines_parallelization.txt (batch script parallelizing pipelines_processing across subjects with GNU parallel)
pipelines_processing.txt
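
For orientation, here is a heavily simplified, hypothetical sketch of the per-subject flow described above (the real logic is in the attached scripts; every path, the image name, and the DICOM template below are placeholders):

```python
# Hypothetical per-subject worker: make an ephemeral datalad clone on the
# scratch partition, then run heudiconv for that single subject inside the
# Singularity container. All names/paths are illustrative.
import subprocess
import sys


def convert_subject(subject, dataset_url, scratch_dir):
    clone = f"{scratch_dir}/{subject}"
    # Ephemeral clone of the dataset, one per subject
    subprocess.run(
        ["datalad", "clone", "--reckless", "ephemeral", dataset_url, clone],
        check=True,
    )
    # BIDS conversion for this subject only; note the plain "-b" here --
    # this is what the workaround below changes to "--bids notop"
    subprocess.run(
        [
            "singularity", "run", "heudiconv_0.9.0.sif",
            "-d", "/dicoms/{subject}/*/*.dcm",  # heudiconv template, not an f-string
            "-s", subject,
            "-f", "/code/heudiconv_heuristic.py",
            "-c", "dcm2niix",
            "-b",
            "-o", f"{clone}/bids",
        ],
        check=True,
    )


if __name__ == "__main__":
    # GNU parallel runs this once per subject, several jobs per SLURM node
    convert_subject(sys.argv[1], sys.argv[2], sys.argv[3])
```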

Platform details:

- [x] Container (heudiconv 0.9.0)
@m-petersen (Author)

To follow up: running with --bids notop, heudiconv completes without issues.
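
In terms of the sketch above, that amounts to a one-flag change (illustrative only):

```python
# Replace the plain "-b" with "--bids notop" so each per-subject run skips
# (re)creating the shared top-level BIDS files that the parallel jobs race on
bids_args = ["--bids", "notop"]
```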

@yarikoptic (Member)

ha -- the situation is a bit different from #362 -- I guess that while the (locked) populate_aggregated_jsons is going through the long list of .json files it collected, some other parallel process managed to remove one of those files (perhaps just to recreate it with updated content or something like that). I do not see a clear way out yet besides introducing some load_json_wait to be used there, which would also wait a reasonable amount of time for the file to re-appear, so something along the lines of

```diff
diff --git a/heudiconv/utils.py b/heudiconv/utils.py
index f30a23e..7384cf7 100644
--- a/heudiconv/utils.py
+++ b/heudiconv/utils.py
@@ -14,6 +14,7 @@ from collections import namedtuple
 from glob import glob
 from subprocess import check_output
 from datetime import datetime
+from time import sleep
 
 from nipype.utils.filemanip import which
 
@@ -173,12 +174,19 @@ def load_json(filename):
     -------
     data : dict
     """
-    try:
-        with open(filename, 'r') as fp:
-            data = json.load(fp)
-    except JSONDecodeError:
-        lgr.error("{fname} is not a valid json file".format(fname=filename))
-        raise
+    for i in range(100):  # wait up to ~10 sec for the file to (re)appear
+        try:
+            with open(filename, 'r') as fp:
+                data = json.load(fp)
+                break
+        except JSONDecodeError:
+            lgr.error("{fname} is not a valid json file".format(fname=filename))
+            raise
+        except FileNotFoundError:
+            sleep(0.1)
+            continue
+    else:  # retries exhausted: re-raise instead of hitting a NameError below
+        raise FileNotFoundError(filename)
 
     return data
 
```

but I guess it is not something you could try out easily right?
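
For anyone wanting to exercise the retry idea outside the container, here is a self-contained sketch (load_json_retry and the simulated writer are illustrative, not heudiconv's API):

```python
# Simulate the race: a writer process briefly removes a JSON file while
# rewriting it, and a reader survives the gap by retrying, as in the diff
# above. Writes are atomic (os.replace) so only absence can be observed.
import json
import os
import tempfile
from multiprocessing import Process
from time import sleep


def load_json_retry(filename, retries=100, delay=0.1):
    """Load JSON, waiting up to retries*delay seconds for the file to (re)appear."""
    for _ in range(retries):
        try:
            with open(filename) as fp:
                return json.load(fp)
        except FileNotFoundError:
            sleep(delay)
    raise FileNotFoundError(filename)


def writer(filename, n=200):
    for i in range(n):
        os.unlink(filename)          # the window where a plain open() fails
        sleep(0.01)
        tmp = filename + ".tmp"
        with open(tmp, "w") as fp:
            json.dump({"iteration": i}, fp)
        os.replace(tmp, filename)    # atomic re-appearance


if __name__ == "__main__":
    fname = os.path.join(tempfile.mkdtemp(), "sidecar.json")
    with open(fname, "w") as fp:
        json.dump({}, fp)
    p = Process(target=writer, args=(fname,))
    p.start()
    for _ in range(50):
        load_json_retry(fname)       # retries carry it across the removals
    p.join()
    print("all reads survived the remove/rewrite window")
```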

@yarikoptic (Member)

Meanwhile, I thought to suggest that you could indeed run all individual conversions with notop and then "conclude" with a single run of --command populate-templates, but I saw that add_participant_record is not run in that case, so you would end up with an unfilled participants.tsv :-/

m-petersen commented Jul 9, 2021

Hi Yaroslav,

thanks for your help.

> but I guess it is not something you could try out easily right?

Indeed, it isn't something I can easily try with my setup. Nevertheless, --bids notop works fine, and I fill in the participants.tsv afterwards.
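
For completeness, one hypothetical way to do such an after-the-fact fill (assumes a bare participants.tsv with only a participant_id column; heudiconv's own add_participant_record may record additional columns):

```python
# Append any sub-* directory in the BIDS root that is not yet listed in
# participants.tsv. Purely illustrative; adapt if your file has more columns.
import csv
from pathlib import Path


def fill_participants_tsv(bids_dir):
    bids_dir = Path(bids_dir)
    tsv = bids_dir / "participants.tsv"
    known = set()
    if tsv.exists():
        with tsv.open(newline="") as fp:
            known = {row["participant_id"]
                     for row in csv.DictReader(fp, delimiter="\t")}
    else:
        tsv.write_text("participant_id\n")
    with tsv.open("a", newline="") as fp:
        out = csv.writer(fp, delimiter="\t", lineterminator="\n")
        for sub in sorted(bids_dir.glob("sub-*")):
            if sub.is_dir() and sub.name not in known:
                out.writerow([sub.name])


fill_participants_tsv("/bids")
```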

@burdinskid13

I ran into the same issue as the original poster @m-petersen -- has any solution for this been implemented yet?

mgxd (Member) commented Sep 13, 2021

@burdinskid13 it looks like this bug is still around - the best workaround for now seems to be #516 (comment)

yarikoptic added a commit to dbic/heudiconv that referenced this issue Sep 13, 2021
just to blindly counteract the effect which is likely to happen whenever
some per-subject process is converting (and thus loading/saving .json files)
while some other top-level populate_aggregated_jsons call reaches out to
load_json to "harvest" known information. There it should be safe to retry,
since the last one to load/save those top-level files will anyway produce
the correct one.

Hopefully closes nipy#516
@yarikoptic (Member)

sorry for the delay. I have now implemented the workaround from that comment as #523. I think it should be safe; a rapid review would be appreciated. If there are no objections etc., I will merge tomorrow and cut a fresh heudiconv release -- it has been a while

yarikoptic added a commit to dbic/heudiconv that referenced this issue Sep 14, 2021 (same commit message as above)
yarikoptic added a commit to dbic/heudiconv that referenced this issue Sep 15, 2021 (same commit message as above)
yarikoptic added a commit to dbic/heudiconv that referenced this issue Sep 15, 2021 (same commit message as above)
@github-actions

🚀 Issue was released in v0.11.1 🚀

@yarikoptic (Member)

sorry -- I referenced this issue incorrectly within #564, so it was actually released some time before (I guess in 0.10.0)
