FileNotFoundError: [Errno 2] No such file or directory: 'result_merge_confound_metadata.pklz' #1819

Closed
soyfrien opened this issue Oct 12, 2019 · 23 comments
@soyfrien

soyfrien commented Oct 12, 2019

Hello, we've been getting file-not-found errors with fmriprep-docker.

I've tried running without the wrapper:
sudo docker run -v /louis/raw_bids:/data -v /louis/derivatives:/out -v /louis/tmp:/scratch -v /usr/local/freesurfer:/freesurfer poldracklab/fmriprep:1.5.0 --fs-license-file=/freesurfer/license.txt --ignore slicetiming --output-spaces MNI152NLin2009cAsym:res-2 T1w fsaverage5 --participant-label 12345 -w /scratch /data /out participant

Rather late in processing the following error occurs:

Node: fmriprep_wf.single_subject_12345_wf.func_preproc_task_reapp_run_06_wf.bold_confounds_wf.merge_confound_metadata
Working directory: /scratch/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_06_wf/bold_confounds_wf/merge_confound_metadata

...

Traceback (most recent call last):
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/plugins/multiproc.py", line 316, in _send_procs_to_workers
    self.procs[jobid].run(updatehash=updatehash)
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 488, in run
    self, report_type='postexec', is_mapnode=isinstance(self, MapNode))
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/utils.py", line 152, in write_report
    result = node.result  # Locally cache result
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 198, in result
    op.join(self.output_dir(), 'result_%s.pklz' % self.name))[0]
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/utils.py", line 300, in load_resultfile
    result = loadpkl(results_file)
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/utils/filemanip.py", line 687, in loadpkl
    pkl_file = pklopen(infile.name, 'rb')
  File "/usr/local/miniconda/lib/python3.7/gzip.py", line 53, in open
    binary_file = GzipFile(filename, gz_mode, compresslevel)
  File "/usr/local/miniconda/lib/python3.7/gzip.py", line 163, in __init__
    fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'result_merge_confound_metadata.pklz'

A second error #1807 also occurs.

I will post the log of fmriprep-docker when that finishes.

I understand a fix is slated for #1807. Is this error related?

Many thanks,
Louis

@soyfrien
Author

Here is the output from fmriprep-docker.

$ sudo fmriprep-docker -i poldracklab/fmriprep:1.5.0 -u 12218:12218 -w /louis/temp /louis/raw_bids /louis/derivatives2 participant --fs-license-file=/usr/local/freesurfer/license.txt --ignore slicetiming --output-spaces MNI152NLin2009cAsym:res-2 T1w fsaverage5 --participant-label 12345

Error 1:

$ cat /louis/derivatives2/fmriprep/sub-12345/log/20191012-051737_a10fd8b5-3194-41dd-a507-ef4efb6afbd6/crash-20191012-111456-UID12218-targets-f5ad0e38-2a26-402b-9fb2-b943d7894878.txt 
Node: fmriprep_wf.single_subject_12345_wf.func_preproc_task_pass_run_02_wf.bold_surf_wf.targets
Working directory: /scratch/fmriprep_wf/single_subject_12345_wf/func_preproc_task_pass_run_02_wf/bold_surf_wf/targets

Node inputs:

function_str = def select_target(subject_id, space):
    """ Given a source subject ID and a target space, get the target subject ID """
    return subject_id if space == 'fsnative' else space

space = ['fsaverage5']
subject_id = <undefined>

Traceback (most recent call last):
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/plugins/multiproc.py", line 269, in _send_procs_to_workers
    num_subnodes = self.procs[jobid].num_subnodes()
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 1204, in num_subnodes
    self._get_inputs()
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 1219, in _get_inputs
    super(MapNode, self)._get_inputs()
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 521, in _get_inputs
    outputs = _load_resultfile(results_file)[0].outputs
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/utils.py", line 300, in load_resultfile
    result = loadpkl(results_file)
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/utils/filemanip.py", line 687, in loadpkl
    pkl_file = pklopen(infile.name, 'rb')
  File "/usr/local/miniconda/lib/python3.7/gzip.py", line 53, in open
    binary_file = GzipFile(filename, gz_mode, compresslevel)
  File "/usr/local/miniconda/lib/python3.7/gzip.py", line 163, in __init__
    fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'result_autorecon3.pklz'

Error 2:

$ cat /louis/derivatives2/fmriprep/sub-12345/log/20191012-051737_a10fd8b5-3194-41dd-a507-ef4efb6afbd6/crash-20191012-111903-UID12218-lta_ras2ras-fe8e298e-15dc-4b14-ac68-b9cdaa4b4cc7.txt 
Node: fmriprep_wf.single_subject_12345_wf.func_preproc_task_pass_run_02_wf.bold_reg_wf.bbreg_wf.lta_ras2ras
Working directory: /scratch/fmriprep_wf/single_subject_12345_wf/func_preproc_task_pass_run_02_wf/bold_reg_wf/bbreg_wf/lta_ras2ras

Node inputs:

args = <undefined>
environ = {}
in_fsl = <undefined>
in_itk = <undefined>
in_lta = <undefined>
in_mni = <undefined>
in_niftyreg = <undefined>
in_reg = <undefined>
invert = <undefined>
ltavox2vox = <undefined>
out_fsl = <undefined>
out_itk = <undefined>
out_lta = True
out_mni = <undefined>
out_reg = <undefined>
source_file = <undefined>
target_conform = <undefined>
target_file = <undefined>

Traceback (most recent call last):
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/plugins/multiproc.py", line 269, in _send_procs_to_workers
    num_subnodes = self.procs[jobid].num_subnodes()
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 1204, in num_subnodes
    self._get_inputs()
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 1219, in _get_inputs
    super(MapNode, self)._get_inputs()
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 521, in _get_inputs
    outputs = _load_resultfile(results_file)[0].outputs
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/utils.py", line 300, in load_resultfile
    result = loadpkl(results_file)
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/utils/filemanip.py", line 687, in loadpkl
    pkl_file = pklopen(infile.name, 'rb')
  File "/usr/local/miniconda/lib/python3.7/gzip.py", line 53, in open
    binary_file = GzipFile(filename, gz_mode, compresslevel)
  File "/usr/local/miniconda/lib/python3.7/gzip.py", line 163, in __init__
    fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'result_transforms.pklz'

@oesteban
Member

Hi @ppdac, although you are right that this is somewhat related to #1807, you are experiencing the worst case of these problems. We are still investigating the conditions that lead to the problem you've seen. Could you send us the full output log from fmriprep? Is the problem reliably reproducible (i.e., if you remove the work directory and rerun, does the same thing happen)?

@oesteban oesteban added the bug label Oct 14, 2019
@soyfrien
Author

soyfrien commented Oct 14, 2019 via email

@soyfrien
Author

Hi @oesteban,

Thanks for looking at this.

Here is the full output of the run:
https://gist.githubusercontent.com/ppdac/ca959d9c58c9e7e6bce995fad7f63d3b/raw/da9b2650dab63dd695cab6b6d4eb0504e92e3a1e/run1.log.txt

I will remove the work directory, rerun and post the output of that when it's finished.

@soyfrien
Author

Second run:

Full log:
https://gist.githubusercontent.com/ppdac/aca54c1c478acea168c92b912f820dc4/raw/b740e2c16a6c316c74f6b9f3e6dba87550dbd00f/run2.log.txt

Error 1

$ cat crash-20191014-111012-UID12218-ds_confounds-e3bac540-361c-43d4-98d2-42e4c55cc645.txt 
Node: fmriprep_wf.single_subject_12345_wf.func_preproc_task_reapp_run_02_wf.func_derivatives_wf.ds_confounds
Working directory: /scratch/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_02_wf/func_derivatives_wf/ds_confounds

Node inputs:

base_directory = /out
check_hdr = True
compress = <undefined>
desc = confounds
extra_values = <undefined>
in_file = <undefined>
keep_dtype = False
meta_dict = <undefined>
source_file = /data/sub-12345/func/sub-12345_task-reapp_run-02_bold.nii.gz
space = 
suffix = regressors

Traceback (most recent call last):
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/plugins/multiproc.py", line 316, in _send_procs_to_workers
    self.procs[jobid].run(updatehash=updatehash)
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 411, in run
    cached, updated = self.is_cached()
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 297, in is_cached
    hashed_inputs, hashvalue = self._get_hashval()
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 494, in _get_hashval
    self._get_inputs()
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 521, in _get_inputs
    outputs = _load_resultfile(results_file)[0].outputs
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/utils.py", line 311, in load_resultfile
    if resolve and result.outputs:
AttributeError: 'NoneType' object has no attribute 'outputs'

Error 2

$ cat crash-20191014-111016-UID12218-name_surfs-8692d605-0951-445c-854c-b53960100972.txt 
Node: fmriprep_wf.single_subject_12345_wf.func_preproc_task_reapp_run_06_wf.func_derivatives_wf.name_surfs
Working directory: /scratch/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_06_wf/func_derivatives_wf/name_surfs

Node inputs:

in_file = <undefined>
pattern = (?P<LR>[lr])h.(?P<space>\w+).gii
template = space-{space}_hemi-{LR}.func

Traceback (most recent call last):
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/plugins/multiproc.py", line 269, in _send_procs_to_workers
    num_subnodes = self.procs[jobid].num_subnodes()
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 1204, in num_subnodes
    self._get_inputs()
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 1219, in _get_inputs
    super(MapNode, self)._get_inputs()
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 521, in _get_inputs
    outputs = _load_resultfile(results_file)[0].outputs
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/utils.py", line 300, in load_resultfile
    result = loadpkl(results_file)
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/utils/filemanip.py", line 687, in loadpkl
    pkl_file = pklopen(infile.name, 'rb')
  File "/usr/local/miniconda/lib/python3.7/gzip.py", line 53, in open
    binary_file = GzipFile(filename, gz_mode, compresslevel)
  File "/usr/local/miniconda/lib/python3.7/gzip.py", line 163, in __init__
    fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'result_update_metadata.pklz'

@jeroenvanbaar

If it helps, I ran into a similar issue (referenced here) with this crash log:

https://github.com/poldracklab/fmriprep/files/3718387/crash-20191011-015931-jvanbaar-merge_confound_metadata-224ef30b-5a7f-43ef-b8f0-50c3a2f9167d.txt

@ppdac is it also true for you that the 'missing' files actually do exist in the working dir?

Original comment: #1697 (comment)

@soyfrien
Author

soyfrien commented Oct 16, 2019

Hi @jarodroland, would these be the working directories in the container or host machine? If it's on the container, can you help me understand how to take a look at that? The containers seem to remove themselves after running.

@oesteban
Member

@soyfrien
Author

Hi @oesteban, the unstable tag wound up producing semi-inconsistent errors too. 1.5.1rc1 was successful on the initial run. I'll let you know how future runs look.

@jarodroland
Contributor

> Hi @jarodroland, would these be the working directories in the container or host machine? If it's on the container, can you help me understand how to take a look at that? The containers seem to remove themselves after running.

@ppdac, Did you @ the wrong person?

@soyfrien
Author

soyfrien commented Oct 18, 2019

> > Hi @jarodroland, would these be the working directories in the container or host machine? If it's on the container, can you help me understand how to take a look at that? The containers seem to remove themselves after running.
>
> @ppdac, Did you @ the wrong person?

Yes, sorry. That was meant for @jeroenvanbaar.

@soyfrien
Author

soyfrien commented Oct 18, 2019

@jeroenvanbaar

@ppdac I can find these files in the working dir on my host machine. In my case, I manually set the working dir to a designated scratch directory on the computing cluster using the -w flag in the fmriprep command. It seems that all intermediate files remain there after fmriprep finishes.
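
To make the container-versus-host question concrete, here is a minimal sketch of translating a path from a crash log into the matching host path and then looking for the "missing" result file there. It assumes the bind mount from the docker run command earlier in this thread (-v /louis/tmp:/scratch); container_to_host and find_result_files are hypothetical helper names for illustration, not part of fMRIPrep or nipype.

```python
from pathlib import Path, PurePosixPath


def container_to_host(container_path, container_mount, host_mount):
    """Map a path inside the container to the bind-mounted host path.

    Hypothetical helper: the mount points are whatever was passed to
    docker via -v host_mount:container_mount.
    """
    # Paths inside the container are always POSIX paths.
    rel = PurePosixPath(container_path).relative_to(container_mount)
    return Path(host_mount).joinpath(*rel.parts)


def find_result_files(host_workdir, result_name):
    """Return every file called `result_name` below `host_workdir`, sorted."""
    return sorted(Path(host_workdir).rglob(result_name))


if __name__ == "__main__":
    # Working directory as printed in a crash file, and the mount from this thread.
    crashed_dir = (
        "/scratch/fmriprep_wf/single_subject_12345_wf/"
        "func_preproc_task_reapp_run_06_wf/bold_confounds_wf/merge_confound_metadata"
    )
    print(container_to_host(crashed_dir, "/scratch", "/louis/tmp"))
```

Note that the node named in the crash file is often the one reading the result, not the one that wrote it, so searching the whole work directory with find_result_files is usually more informative than checking only the crashed node's folder.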

@soyfrien
Author

soyfrien commented Oct 19, 2019

@jeroenvanbaar, thank you for the suggestion.

I will try it if 1.5.1rc1 starts to give my users problems. But I may be misunderstanding you; I don't think I have the correct files.

--

There are many .pklz files, but I don't see any in the working directories named in the error logs:

$ find /louis -type f -name "*.pklz" | grep result_update_metadata.pklz
/louis/temp-151rc1-run2/fmriprep_wf/single_subject_12345_wf/func_preproc_task_rest_run_01_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-151rc1-run2/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_03_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-151rc1-run2/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_04_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-151rc1-run2/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_06_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-151rc1-run2/fmriprep_wf/single_subject_12345_wf/func_preproc_task_pass_run_02_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-151rc1-run2/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_01_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-151rc1-run2/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_05_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-151rc1-run2/fmriprep_wf/single_subject_12345_wf/func_preproc_task_pass_run_01_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-151rc1-run2/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_02_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-unstable/fmriprep_wf/single_subject_12345_wf/func_preproc_task_rest_run_01_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-unstable/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_03_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-unstable/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_04_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-unstable/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_06_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-unstable/fmriprep_wf/single_subject_12345_wf/func_preproc_task_pass_run_02_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-unstable/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_01_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-unstable/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_05_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-unstable/fmriprep_wf/single_subject_12345_wf/func_preproc_task_pass_run_01_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-unstable/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_02_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-151rc1/fmriprep_wf/single_subject_12345_wf/func_preproc_task_rest_run_01_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-151rc1/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_03_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-151rc1/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_04_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-151rc1/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_06_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-151rc1/fmriprep_wf/single_subject_12345_wf/func_preproc_task_pass_run_02_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-151rc1/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_01_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-151rc1/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_05_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-151rc1/fmriprep_wf/single_subject_12345_wf/func_preproc_task_pass_run_01_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-151rc1/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_02_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp/fmriprep_wf/single_subject_12345_wf/func_preproc_task_rest_run_01_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_03_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_04_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_06_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_01_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_05_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp/fmriprep_wf/single_subject_12345_wf/func_preproc_task_pass_run_01_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_02_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-run2/fmriprep_wf/single_subject_12345_wf/func_preproc_task_rest_run_01_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-run2/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_03_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-run2/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_04_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-run2/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_06_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-run2/fmriprep_wf/single_subject_12345_wf/func_preproc_task_pass_run_02_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-run2/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_01_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-run2/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_05_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-run2/fmriprep_wf/single_subject_12345_wf/func_preproc_task_pass_run_01_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz
/louis/temp-run2/fmriprep_wf/single_subject_12345_wf/func_preproc_task_reapp_run_02_wf/bold_surf_wf/update_metadata/result_update_metadata.pklz

@mangstad

So does the current release candidate solve this issue? I'm trying to run a large number of subjects through fmriprep on our HPC. So far a few hundred succeeded, but a few hundred others had errors along the lines of any of the following:
FileNotFoundError: [Errno 2] No such file or directory: 'result_transforms.pklz'
FileNotFoundError: [Errno 2] No such file or directory: 'result_name_surfs.pklz'
FileNotFoundError: [Errno 2] No such file or directory: 'result_update_metadata.pklz'
FileNotFoundError: [Errno 2] No such file or directory: 'result_ds_confounds.pklz'
FileNotFoundError: [Errno 2] No such file or directory: 'result_merger.pklz'
FileNotFoundError: [Errno 2] No such file or directory: 'result_t1_template_dimensions.pklz'
FileNotFoundError: [Errno 2] No such file or directory: 'result_fix_surfs.pklz'
FileNotFoundError: [Errno 2] No such file or directory: 'result_autorecon3.pklz'
FileNotFoundError: [Errno 2] No such file or directory: 'result_concat_affines.pklz'

@oesteban
Member

We've just seen this problem with fMRIPrep 1.5.1rc1 on our HPC:

191018-14:50:28,445 nipype.workflow ERROR:
         Node update_metadata failed to run on host nid00907.
191018-14:50:28,453 nipype.workflow ERROR:
         Saving crash info to /derivatives/fmriprep-1.5.1rc1/fmriprep/sub-MSC05/log/20191017-172719_f2854d59-8ad4-4717-838f-d365acd754cc/crash-20191018-145028-oesteban-update_metadata-3b3dbe6c-383c-45a0-bb48-18cfacf7e22a.txt
Traceback (most recent call last):
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/plugins/multiproc.py", line 267, in _send_procs_to_workers
    num_subnodes = self.procs[jobid].num_subnodes()
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 1217, in num_subnodes
    self._get_inputs()
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 1232, in _get_inputs
    super(MapNode, self)._get_inputs()
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 526, in _get_inputs
    outputs = _load_resultfile(results_file).outputs
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/utils.py", line 293, in load_resultfile
    if not results_file.exists():
  File "/usr/local/miniconda/lib/python3.7/pathlib.py", line 1329, in exists
    self.stat()
  File "/usr/local/miniconda/lib/python3.7/pathlib.py", line 1151, in stat
    return self._accessor.stat(self)
OSError: [Errno 5] Input/output error: '/scratch/03763/oesteban/fmriprep-work/ds000224-sub-MSC05/fmriprep_wf/single_subject_MSC05_wf/func_preproc_ses_func08_task_memoryfaces_wf/bold_surf_wf/merger/result_merger.pklz'


When creating this crashfile, the results file corresponding
to the node could not be found.

@effigies
Member

Ugh. I was really hoping this was resolved. Does the file actually exist? Since we at least have a full path now, perhaps it is just a FS synchronization issue?

@effigies
Member

I guess in the worst case, we could do os.sync(). I can't really decide whether I expect that to significantly affect performance.
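
The os.sync() idea could look roughly like the sketch below: flush pending writes, then poll for the result file a few times before giving up. This is an illustration only, not the change that went into nipype. wait_for_file is a hypothetical helper, the retry count and delay are arbitrary, and os.sync() only flushes buffers on the local machine, so it may not help when the file was written by another node on a shared filesystem.

```python
import os
import time
from pathlib import Path


def wait_for_file(path, retries=5, delay=0.5):
    """Return True as soon as `path` is visible, False after `retries` attempts.

    Hypothetical helper for illustration; not part of nipype.
    """
    path = Path(path)
    for attempt in range(retries):
        try:
            if path.exists():
                return True
        except OSError:
            # For example "[Errno 5] Input/output error" from a struggling
            # filesystem; treat it as "not visible yet" and try again.
            pass
        if hasattr(os, "sync"):
            os.sync()  # Unix only: ask the kernel to flush pending writes
        time.sleep(delay * (attempt + 1))  # back off a little more each time
    return False
```

A caller would check wait_for_file(results_file) before unpickling and raise a clear error if it returns False. Each retry costs one os.sync() call plus the sleep, which is the performance trade-off raised above.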

@oesteban
Member

pinging @satra for him to check on the latest 3 posts.

> perhaps it is just a FS synchronization issue?

It is shaping up as such.

> I guess in the worst case, we could do os.sync(). I can't really decide whether I expect that to significantly affect performance.

We could call os.sync() only before _get_inputs when MapNodes are calculating num_subnodes(). It's not ideal, but at least we shave off a few calls from regular Nodes.

@mangstad as you can see in my traceback, I am getting one of those FileNotFoundErrors, but the node that actually breaks is not the one that corresponds to the missing file; it is a different one (called update_metadata). You can extract the name of the node from the file name given in the line that starts with Saving crash info to....
In other words, find <path/to/output/folder> -name "crash-*" should give you a list of the nodes that failed. Let's check whether update_metadata is a habitual suspect or not.
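
For anyone who prefers Python to shell one-liners, the tally could be sketched as follows. The file-name layout (crash-DATE-TIME-USER-NODE-UUID.txt) is inferred from the crash files quoted in this thread and is an assumption, as is a user name without hyphens. node_from_crashfile and tally_crashes are hypothetical helper names.

```python
import re
from collections import Counter
from pathlib import Path

# Assumed layout: crash-<YYYYMMDD>-<HHMMSS>-<user>-<node>-<uuid>.txt
CRASH_RE = re.compile(
    r"^crash-\d{8}-\d{6}-(?P<user>[^-]+)-(?P<node>.+)-"
    r"[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}\.txt$"
)


def node_from_crashfile(filename):
    """Return the node name encoded in a crash file name, or None if it does not match."""
    match = CRASH_RE.match(Path(filename).name)
    return match.group("node") if match else None


def tally_crashes(output_dir):
    """Count crash files per node name below `output_dir`."""
    counts = Counter()
    for path in Path(output_dir).rglob("crash-*.txt"):
        node = node_from_crashfile(path)
        if node is not None:
            counts[node] += 1
    return counts
```

Calling tally_crashes on the output folder and then most_common() on the result lists the nodes that crash most often first. Keep in mind that the count covers every crash file, including ones unrelated to missing result files.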

@mangstad

Ran that find on my output directory, then a few sed's to trim off everything but the node name. Looks like in my case it's not usually update_metadata. This is over everyone in my output folder, not just those with the No such file or directory: 'result_...pklz' errors, so it's definitely possible there are some other unrelated crashes in there as well.

$ find ./ -name "crash-*" | sed 's/.*mangstad//' | sed 's/^-//' | sed 's/-.*//' | sort | uniq -c
1 about
1 add_motion_headers
1 check_hdr
2 ds_bold_surfs
1 ds_confounds
1 ds_t1_template_transforms
2 inu_n4
29 lta_ras2ras
1 lta_to_itk
13 name_surfs
2 sampler.aI.a1
1 t1_conform
2 targets
5 update_metadata

@bpinsard
Collaborator

What is the status of this bug? Is it solved with rc1 or rc2?
It is happening a lot, even when the work-dir is on a local hard drive.
Thanks!

@oesteban
Member

We are pushing to cut a new full-release ASAP.

@soyfrien soyfrien closed this as completed Mar 6, 2020
@soyfrien
Author

soyfrien commented Mar 6, 2020

About

fMRIPrep version: 20.0.1
fMRIPrep command: /usr/local/miniconda/bin/fmriprep /data /out participant --ignore slicetiming --participant-label sub-150 --output-spaces MNI152NLin2009cAsym:res-2 T1w fsaverage5
Date preprocessed: 2020-03-04 01:55:36 +0000

Errors
No errors to report!

Thank you all for the hard work!

To other users, sorry, I don't know which patch resolved this issue.
