nu_correct: disk i/o issues (singularity) #1308
Comments
Is there any chance you're hitting a quota? Unlike most of fMRIPrep, FreeSurfer will store all data in the output directory. I'm seeing lines like the following:
I'm interpreting this as the temporary directory is
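To check whether a quota is the culprit, something like the following could help (a sketch only; the exact commands and paths vary by site, and `/scratch` here is a placeholder):

```shell
# Sketch: check free space and any filesystem quotas on the locations
# fMRIPrep/FreeSurfer write to. /scratch is a placeholder path.
df -h "$HOME" /scratch 2>/dev/null || true
# 'quota' may not be installed or configured on every cluster
quota -s 2>/dev/null || true
```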
There are no quotas on
Are you able to get support from your sysadmin? It seems like an odd error. You could also try the FreeSurfer list, to see if they've seen this show up for other reasons. Note that this looks the same as freesurfer/freesurfer#462. We may be hitting an HPC edge case. Finally, did you try re-running? FreeSurfer should try to pick back up where it left off. If it's a timing-related bug, it may be resolved by a second pass.
This may also be a Singularity issue, where for some reason you're using space inside the container: https://groups.google.com/a/lbl.gov/forum/#!topic/singularity/eq-tLo2SewM I don't know off-hand how to test that hypothesis, though.
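One possible way to probe that hypothesis (the image name is a placeholder; this assumes the same `-c -e` flags as the failing run):

```shell
# Hypothetical check: with a contained (-c), clean-environment (-e) container,
# compare the free space the container sees for common temp locations
# against what the host sees for the same paths.
if command -v singularity >/dev/null 2>&1; then
  singularity exec -c -e fmriprep.simg df -h /tmp /var/tmp
fi
df -h /tmp   # host-side view, for comparison
```

If the container reports a tiny tmpfs where the host has terabytes, writes inside the container are landing on the in-memory overlay rather than a bind mount.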
Yeah, we are requesting additional local /scratch space from them, but I doubt they'll give us much more (if network connectivity is even the issue).
Indeed. But isn't this a rather common scenario?
Is there perhaps a way to tell FreeSurfer to wait longer or retry? Rerunning using the temporary working directory seems like asking for hard-to-reproduce scenarios.
Yes, I saw that thread. Unfortunately there is no way for us to allow arbitrary writing inside the container, so we'd have to figure out where freesurfer is writing files (if it's not within the working directory).
Yes, HPC is a common environment, but the heterogeneity of clusters, as well as users' individual environments, makes it pretty difficult for us to reproduce issues. Also, the fact that everybody creates their own Singularity image from our Docker images is probably an unnecessary source of variation.
Not very easily under the current Nipype framework.
If you use a persistent working directory location, you'll have an easier time picking up where you left off.
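For instance (paths and image name are placeholders, not the actual call from this report):

```shell
# Sketch: bind a persistent working directory on shared storage and pass it
# to fMRIPrep via -w, so a re-run can reuse intermediate results instead of
# starting from scratch.
singularity run -c -e \
  -B /shared/scratch/work:/work \
  -B /shared/data/bids:/data:ro \
  -B /shared/data/out:/out \
  fmriprep.simg /data /out participant -w /work
```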
That makes sense. I'll try some more things and report back :) |
Got some excellent help from Michael Krause figuring out where freesurfer is trying to write: https://mail.nmr.mgh.harvard.edu/pipermail//freesurfer/2018-October/058813.html turns out it uses

Should this be added to the documentation?

PS: I still got some disk-full errors earlier in the pipeline, but that was because all my fmriprep processes were accessing the same mail report file at the same time, so a random sleep at the start of the HPC job fixed it.
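The random-sleep workaround mentioned above could look something like this at the top of the batch script (the window size here is arbitrary; widen it for many concurrent jobs):

```shell
# Desynchronize concurrent fmriprep jobs so they don't touch the shared
# report file at the same instant: sleep a random number of seconds first.
delay=$(( RANDOM % 10 ))   # widen the window (e.g. % 60) for many jobs
echo "Sleeping ${delay}s before launching fmriprep"
sleep "$delay"
```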
Seems like you finally figured it out. Please reopen if I'm wrong.
We are trying to get fmriprep running on our HPC (Slurm-based) in a singularity container.
We run the container with the -c and -e flags and mount a working directory and the data and output directories. The process runs fine until freesurfer's nu_correct says:

This doesn't seem to make sense, as the mounted drive has terabytes available. Is it possible that freesurfer is trying to write to a directory that is not defined as the working/data/output directory?
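One generic way to find out where a process actually writes is to trace its file-open syscalls (a hypothetical diagnostic, assuming strace is available; the recon-all subject and flags below are placeholders):

```shell
# Hypothetical diagnostic: log the paths a FreeSurfer run opens, then list
# the unique ones. Requires strace; arguments are placeholders.
strace -f -e trace=openat,creat -o /tmp/fs_writes.log \
  recon-all -s sub-01 -autorecon1
grep -o '"[^"]*"' /tmp/fs_writes.log | sort -u | head
```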
Singularity call:
Attached: crash-20181004-145951-vandejjf-autorecon1-881688dd-5cd7-4498-b890-80ae08380433.txt