zran_seek returned error: -1 #1387

Closed
Irisfee opened this issue Nov 13, 2018 · 23 comments · Fixed by #1421

Comments

@Irisfee

Irisfee commented Nov 13, 2018

Hello!

I am using the same command to run 4 subjects with fMRIPrep. Three of them were preprocessed successfully, but one subject stopped partway through.

The error is:

Traceback (most recent call last):
File "/usr/local/miniconda/bin/fmriprep", line 11, in <module>
sys.exit(main())
File "/usr/local/miniconda/lib/python3.6/site-packages/fmriprep/cli/run.py", line 342, in main
fmriprep_wf.run(**plugin_settings)
File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/pipeline/engine/workflows.py", line 595, in run
runner.run(execgraph, updatehash=updatehash, config=self.config)
File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/pipeline/plugins/base.py", line 162, in run
self._clean_queue(jobid, graph, result=result))
File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/pipeline/plugins/base.py", line 224, in _clean_queue
raise RuntimeError("".join(result['traceback']))
RuntimeError: Traceback (most recent call last):
File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/pipeline/plugins/multiproc.py", line 69, in run_node
result['result'] = node.run(updatehash=updatehash)
File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 471, in run
result = self._run_interface(execute=True)
File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 555, in _run_interface
return self._run_command(execute)
File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 635, in _run_command
result = self._interface.run(cwd=outdir)
File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/interfaces/base/core.py", line 522, in run
runtime = self._run_interface(runtime)
File "/usr/local/miniconda/lib/python3.6/site-packages/fmriprep/interfaces/nilearn.py", line 126, in _run_interface
new_nii = concat_imgs(self.inputs.in_files, dtype=self.inputs.dtype)
File "/usr/local/miniconda/lib/python3.6/site-packages/nilearn/_utils/niimg_conversions.py", line 449, in concat_niimgs
niimg = check_niimg(niimg, ensure_ndim=ndim)
File "/usr/local/miniconda/lib/python3.6/site-packages/nilearn/_utils/niimg_conversions.py", line 271, in check_niimg
niimg = load_niimg(niimg, dtype=dtype)
File "/usr/local/miniconda/lib/python3.6/site-packages/nilearn/_utils/niimg.py", line 116, in load_niimg
dtype = _get_target_dtype(niimg.get_data().dtype, dtype)
File "/usr/local/miniconda/lib/python3.6/site-packages/nibabel/dataobj_images.py", line 202, in get_data
data = np.asanyarray(self._dataobj)
File "/usr/local/miniconda/lib/python3.6/site-packages/numpy/core/numeric.py", line 544, in asanyarray
return array(a, dtype, copy=False, order=order, subok=True)
File "/usr/local/miniconda/lib/python3.6/site-packages/nibabel/arrayproxy.py", line 356, in __array__
raw_data = self.get_unscaled()
File "/usr/local/miniconda/lib/python3.6/site-packages/nibabel/arrayproxy.py", line 351, in get_unscaled
mmap=self._mmap)
File "/usr/local/miniconda/lib/python3.6/site-packages/nibabel/volumeutils.py", line 525, in array_from_file
infile.seek(offset)
File "indexed_gzip/indexed_gzip.pyx", line 430, in indexed_gzip.indexed_gzip._IndexedGzipFile.seek
indexed_gzip.indexed_gzip.ZranError: zran_seek returned error: -1

The command I used to run it:

singularity run -e --bind /projects/kuhl_lab/yzhao17:/projects/kuhl_lab/yzhao17 /projects/kuhl_lab/yzhao17/Image/fmriprep.simg /projects/kuhl_lab/yzhao17/DIIN/bids_data /projects/kuhl_lab/yzhao17/DIIN/derivatives participant --participant_label 01 --output-space T1w template fsaverage6 --medial-surface-nan -w /projects/kuhl_lab/yzhao17/DIIN/bids_data/works --resource-monitor --notrack --stop-on-first-crash

The data passed the BIDS format check, and I don't think there is any difference between the subjects. I am wondering what is going on.

Thank you so much!

@effigies
Member

Hi, what version of fMRIPrep are you using? This may have been addressed in #1356, which was included in the 1.2.1 release.

Related: #801.

@Irisfee
Author

Irisfee commented Nov 13, 2018

Hi, I am using 1.2.1. I have looked at #1356, but I am still unsure how to solve this. Is it possible for me to change that setting?

@effigies
Member

Ah, then it didn't fix it. That's a shame. Have you tried re-running? Our best guess has been that it's an NFS cache-related issue, and sometimes re-running resolves it.

cc @pauldmccarthy Just a heads up that this appears under default nibabel settings (we no longer set KEEP_FILE_OPEN_DEFAULT='auto').

@Irisfee
Author

Irisfee commented Nov 13, 2018

I have rerun it, but I got the same error again.

@Irisfee
Author

Irisfee commented Nov 13, 2018

I am wondering if I should clean up the work dir when re-running?

@effigies
Member

That sometimes helps, but I'm not sure that it would, here. Can you determine the node in which it failed? There are several nodes that use that interface; if we can identify which one, we can try deleting just its inputs, to save you some time.

@Irisfee
Author

Irisfee commented Nov 13, 2018

Sorry, I am a complete newbie to the nipype workflow and I don't know how to find the node that failed. Here is the output file:
df_job001.txt

Looking forward to your further suggestions!

@emdupre
Collaborator

emdupre commented Nov 13, 2018

It looks like the error is on the merge node in the bold_t1_trans_wf:

181113-18:49:55,165 nipype.workflow WARNING:
[Node] Error on "fmriprep_wf.single_subject_01_wf.func_preproc_task_retrieval_run_02_wf.bold_t1_trans_wf.merge" (/projects/kuhl_lab/yzhao17/DIIN/bids_data/works/fmriprep_wf/single_subject_01_wf/func_preproc_task_retrieval_run_02_wf/bold_t1_trans_wf/merge)

Could you look in that folder in the working directory, @Irisfee, and remove its listed inputs before re-running?
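
For anyone else unsure how to locate the failing node: the log excerpt above shows nipype writing a `[Node] Error on "<node name>" (<working directory>)` line. A minimal sketch for pulling those lines out of a saved log follows. The helper name `find_failed_nodes` is hypothetical, and the pattern is an assumption based only on the log lines quoted in this thread, not on a documented nipype log format.

```python
import re

# Pattern assumed from the log line quoted in this thread:
#   [Node] Error on "<dotted node name>" (<node working directory>)
NODE_ERROR = re.compile(r'\[Node\] Error on "([^"]+)" \(([^)]+)\)')


def find_failed_nodes(log_text):
    """Return (node_name, work_dir) pairs for every node error in log_text."""
    return NODE_ERROR.findall(log_text)


if __name__ == "__main__":
    # Log line copied from the excerpt above.
    sample = (
        '181113-18:49:55,165 nipype.workflow WARNING:\n'
        '[Node] Error on "fmriprep_wf.single_subject_01_wf.'
        'func_preproc_task_retrieval_run_02_wf.bold_t1_trans_wf.merge" '
        '(/projects/kuhl_lab/yzhao17/DIIN/bids_data/works/fmriprep_wf/'
        'single_subject_01_wf/func_preproc_task_retrieval_run_02_wf/'
        'bold_t1_trans_wf/merge)\n'
    )
    for name, work_dir in find_failed_nodes(sample):
        print(name)
        print(work_dir)
```

The second element of each pair is the directory to inspect or delete before re-running.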

@effigies
Member

	 [Node] Error on "fmriprep_wf.single_subject_01_wf.func_preproc_task_retrieval_run_02_wf.bold_t1_trans_wf.merge" (/projects/kuhl_lab/yzhao17/DIIN/bids_data/works/fmriprep_wf/single_subject_01_wf/func_preproc_task_retrieval_run_02_wf/bold_t1_trans_wf/merge)

So perhaps just delete /projects/kuhl_lab/yzhao17/DIIN/bids_data/works/fmriprep_wf/single_subject_01_wf/func_preproc_task_retrieval_run_02_wf/bold_t1_trans_wf

@Irisfee
Author

Irisfee commented Nov 14, 2018

I re-ran it once more, and this time I got the results successfully. Thank you very much, @effigies and @emdupre.

I am just wondering whether this bug happens completely at random, or whether it is more likely to happen in some situations. I ran 4 subjects together, and I got this problem twice, both times on the same subject.

@effigies
Member

I believe it's actually an NFS issue. We can't reliably reproduce it. If deleting the inputs resolved the issue, I'd guess that perhaps it's actually the step before merge that's producing malformed files. Next time you see this, could you zip up that entire directory, and we can try to see if there are some malformed files there?

@Irisfee
Author

Irisfee commented Nov 15, 2018

For unrelated reasons, I reran all the subjects, and I still have the same trouble with that subject. This time, deleting the failed node's folder did not solve the problem. I can share the functional data, but to protect the subject I can't share the anatomical data without skull-stripping it first. Would that help?

@effigies
Member

No, we can't reproduce without the anatomicals. But if you're able to tar the bold_t1_trans_wf directory, I could at least inspect the intermediate results.
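
If the command-line `tar` is not convenient, packing the node directory can also be done from Python. This is only a sketch using the standard library; `archive_node_dir` and the output base name are hypothetical, and the archive should be written outside the directory being packed.

```python
import shutil
from pathlib import Path


def archive_node_dir(node_dir, out_base):
    """Pack node_dir into <out_base>.tar.gz and return the archive path.

    The archive keeps the directory's own name as its top-level folder.
    """
    node_dir = Path(node_dir)
    return shutil.make_archive(
        str(out_base),
        "gztar",
        root_dir=str(node_dir.parent),
        base_dir=node_dir.name,
    )
```

For example, `archive_node_dir(".../bold_t1_trans_wf", "/tmp/bold_t1_trans_wf")` would write `/tmp/bold_t1_trans_wf.tar.gz`, where the first argument stands for the full working-directory path reported in the log.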

@Irisfee
Author

Irisfee commented Nov 15, 2018

I have sent the files to your Gmail address. Thanks!

@effigies
Member

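import nibabel as nb
import glob

# Try to load every volume in the node's output directory
# and report the ones that cannot be read.
for img in glob.glob('bold_to_t1w_transform/*.nii.gz'):
    try:
        nb.load(img).get_data()  # forces the full array to be read from disk
    except Exception as err:
        print(img)
        print(err)

Output:

bold_to_t1w_transform/vol0109_xform-00109.nii.gz
zran_seek returned error: -1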

However, if I uninstall indexed_gzip, I can read it fine. Reinstalling, I get the error again.

@pauldmccarthy Here's a failing file: vol0109_xform-00109.nii.gz

I can open an issue over in indexed_gzip, if you like.
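
One way to separate a genuinely corrupt file from a reader-specific failure is to decompress the whole stream with the standard-library `gzip` module, which does not go through indexed_gzip. A minimal sketch follows; `gzip_is_readable` is a hypothetical helper name, and passing this check only shows that the gzip stream is intact, not that the NIfTI contents are valid.

```python
import gzip
import zlib


def gzip_is_readable(path, chunk_size=1 << 20):
    """Return True if the whole gzip stream decompresses without error."""
    try:
        with gzip.open(path, "rb") as f:
            # Read to the end in chunks so large volumes are not held in memory.
            while f.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # OSError covers bad headers and CRC mismatches,
        # EOFError covers truncated files.
        return False
    return True
```

A file that passes here but still raises `zran_seek returned error: -1` through nibabel points at the indexed reader rather than at the file on disk.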

@effigies
Member

I wonder if we can do any kind of fallback, so that we can use IndexedGzipFile when it works, and use GzipFile when we hit a zran_seek error.
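
The fallback idea could look roughly like the sketch below. It is not how nibabel or fMRIPrep implement anything: `read_at` and its `fast_opener` parameter are hypothetical, indexed_gzip is treated as optional, and any exception from the fast path (a `ZranError`, for instance) is assumed to be a reason to retry with the standard-library reader.

```python
import gzip

try:
    import indexed_gzip  # optional: the fast, seekable reader
except ImportError:
    indexed_gzip = None


def read_at(path, offset, size, fast_opener=None):
    """Read `size` decompressed bytes starting at `offset`.

    Tries the fast opener first and falls back to gzip.open if it fails.
    """
    if fast_opener is None and indexed_gzip is not None:
        fast_opener = indexed_gzip.IndexedGzipFile
    if fast_opener is not None:
        try:
            with fast_opener(path) as f:
                f.seek(offset)
                return f.read(size)
        except Exception:
            # Deliberately broad for this sketch: any failure in the fast
            # path triggers the slower but simpler reader below.
            pass
    with gzip.open(path, "rb") as f:
        f.seek(offset)  # gzip seeks by decompressing from the start
        return f.read(size)
```

The trade-off is speed: the fallback decompresses from the beginning of the file on every seek, so it only makes sense as a safety net.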

@pauldmccarthy

@effigies thanks - super busy right now, but I will look at this as soon as I get a chance.

@effigies
Member

@Irisfee Do you think you could just delete /projects/kuhl_lab/yzhao17/DIIN/bids_data/works/fmriprep_wf/single_subject_01_wf/func_preproc_task_retrieval_run_04_wf/bold_t1_trans_wf, for now? Until we can either get a fix into indexed_gzip or nibabel, I think this is just going to be a stochastic bug.

@pauldmccarthy Sounds good. I opened an issue on your page, so that you don't have to dig around over here to find it.

@Irisfee
Author

Irisfee commented Nov 15, 2018

I have deleted that and rerun twice, but I got the same error at this node both times. So it seems that deleting the working directory no longer solves the issue.

@hstojic
Contributor

hstojic commented Nov 21, 2018

Hello,

I got the same error with one of my datasets, although in a slightly different part of the pipeline:

181121-02:02:30,129 nipype.workflow WARNING:
	 [Node] Error on "fmriprep_wf.single_subject_s016_wf.func_preproc_task_fnclearning_run_04_wf.bold_mni_trans_wf.bold_to_mni_transform" (/scratch/scratch/ucjttoj/fnclearning_fmri/dProcessed/fmriprep_work/fmriprep_wf/single_subject_s016_wf/func_preproc_task_fnclearning_run_04_wf/bold_mni_trans_wf/bold_to_mni_transform)
181121-02:02:30,690 nipype.workflow ERROR:
	 Node bold_to_mni_transform failed to run on host node-i00a-002.myriad.ucl.ac.uk.
181121-02:02:30,692 nipype.workflow ERROR:
	 Saving crash info to /home/ucjttoj/Scratch/fnclearning_fmri/dProcessed/fmriprep/sub-s016/log/20181120-233348_48bec329-0a82-4f31-be7a-de197281ba15/crash-20181121-020230-ucjttoj-bold_to_mni_transform-a49ac52c-7129-4aae-a683-b36f43774280.txt
Traceback (most recent call last):
  File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/pipeline/plugins/multiproc.py", line 69, in run_node
    result['result'] = node.run(updatehash=updatehash)
  File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 471, in run
    result = self._run_interface(execute=True)
  File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 555, in _run_interface
    return self._run_command(execute)
  File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 635, in _run_command
    result = self._interface.run(cwd=outdir)
  File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/interfaces/base/core.py", line 522, in run
    runtime = self._run_interface(runtime)
  File "/usr/local/miniconda/lib/python3.6/site-packages/fmriprep/interfaces/itk.py", line 143, in _run_interface
    for i, (in_file, in_xfm) in enumerate(zip(in_files, xfms_list))]
  File "/usr/local/miniconda/lib/python3.6/concurrent/futures/_base.py", line 586, in result_iterator
    yield fs.pop().result()
  File "/usr/local/miniconda/lib/python3.6/concurrent/futures/_base.py", line 425, in result
    return self.__get_result()
  File "/usr/local/miniconda/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/usr/local/miniconda/lib/python3.6/concurrent/futures/thread.py", line 56, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/miniconda/lib/python3.6/site-packages/fmriprep/interfaces/itk.py", line 262, in _applytfms
    runtime = xfm.run().runtime
  File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/interfaces/base/core.py", line 522, in run
    runtime = self._run_interface(runtime)
  File "/usr/local/miniconda/lib/python3.6/site-packages/niworkflows/interfaces/fixes.py", line 29, in _run_interface
    self.__class__.__name__, __version__))
  File "/usr/local/miniconda/lib/python3.6/site-packages/niworkflows/interfaces/utils.py", line 179, in _copyxform
    newimg = resampled.__class__(resampled.get_data(), orig.affine, header)
  File "/usr/local/miniconda/lib/python3.6/site-packages/nibabel/dataobj_images.py", line 202, in get_data
    data = np.asanyarray(self._dataobj)
  File "/usr/local/miniconda/lib/python3.6/site-packages/numpy/core/numeric.py", line 544, in asanyarray
    return array(a, dtype, copy=False, order=order, subok=True)
  File "/usr/local/miniconda/lib/python3.6/site-packages/nibabel/arrayproxy.py", line 356, in __array__
    raw_data = self.get_unscaled()
  File "/usr/local/miniconda/lib/python3.6/site-packages/nibabel/arrayproxy.py", line 351, in get_unscaled
    mmap=self._mmap)
  File "/usr/local/miniconda/lib/python3.6/site-packages/nibabel/volumeutils.py", line 525, in array_from_file
    infile.seek(offset)
  File "indexed_gzip/indexed_gzip.pyx", line 430, in indexed_gzip.indexed_gzip._IndexedGzipFile.seek
indexed_gzip.indexed_gzip.ZranError: zran_seek returned error: -1

Here is the crash file:
crash-s016.txt

I ran it twice, failing the same way both times - but now I will try deleting the files and see whether it succeeds.

@effigies
Member

The quick fix will be to remove indexed_gzip from our Docker images. Given that this is popping up so frequently now, I think we'll need to in the short term.

@effigies
Member

@hstojic If you can find a failing file, could you send it along to @pauldmccarthy in pauldmccarthy/indexed_gzip#15?

@pauldmccarthy

indexed_gzip 0.8.8 is now available, which should hopefully fix this issue - let me know if you are still having problems after upgrading.
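
To check whether an environment already has the release mentioned above, something like the following sketch may help. The helper names are hypothetical, the version parsing is deliberately simple (it only looks at the first three numeric fields), and `importlib.metadata` requires Python 3.8 or later, which is newer than the Python 3.6 shown in the tracebacks.

```python
import re
from importlib import metadata

FIXED_IN = (0, 8, 8)  # release named in the comment above


def parse_version(text):
    """Turn '0.8.8' into (0, 8, 8); only the first three numeric fields count."""
    return tuple(int(n) for n in re.findall(r"\d+", text)[:3])


def has_fix(version_text):
    """Return True if version_text is at least the release named above."""
    return parse_version(version_text) >= FIXED_IN


if __name__ == "__main__":
    try:
        installed = metadata.version("indexed_gzip")
    except metadata.PackageNotFoundError:
        print("indexed_gzip is not installed")
    else:
        status = "is 0.8.8 or newer" if has_fix(installed) else "predates 0.8.8"
        print(installed, status)
```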
