Error in finding parcellation atlases when running pipeline in parallel #1064
Comments
This seems like it could be stemming from a race condition. Namely, XCP-D might be in the process of writing out a new copy of the atlas in one node while another node is trying to access it. I will look into it tomorrow, but my guess is that we can fix this by checking if the output atlas already exists in whichever node typically writes it out. If the atlas exists, then that node should just not try to write out a new copy at all.
That is probably correct. I think we can make XCP-D first check whether the file exists and, if so, skip writing and use the existing copy directly?
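As a rough illustration of the check-then-skip idea discussed above, here is a minimal Python sketch. It is not XCP-D's actual code; `copy_atlas_if_missing` and its `src`/`dst` arguments are hypothetical names. Because a bare existence check can still race (two processes may both see the file as missing), this version also copies to a temporary file in the destination folder and renames it into place, so a reader never sees a half-written atlas.

```python
import os
import shutil
import tempfile


def copy_atlas_if_missing(src, dst):
    """Copy src to dst unless dst already exists (hypothetical helper)."""
    # Skip the write entirely if another process already produced the file.
    if os.path.exists(dst):
        return dst

    dst_dir = os.path.dirname(os.path.abspath(dst))
    os.makedirs(dst_dir, exist_ok=True)

    # Write to a unique temporary file in the same directory, so the final
    # rename stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=dst_dir, suffix=".tmp")
    os.close(fd)
    try:
        shutil.copyfile(src, tmp)
        # os.replace swaps the file in with a single rename, so concurrent
        # readers see either no file or the complete file.
        os.replace(tmp, dst)
    finally:
        # Clean up the temporary file if the copy or rename failed.
        if os.path.exists(tmp):
            os.remove(tmp)
    return dst
```

If two processes reach the copy step at the same time, both write identical content and the later rename simply replaces the earlier one, which should be harmless for static atlas files.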
Maybe a related but not the same issue is with the
|
The atlas race condition falls under XCP-D's purview, because I use a custom function to copy the files, but the new error stems from the DerivativesDataSink class imported from Niworkflows. I think the best move forward would be to open an issue in the Niworkflows repo (https://github.com/nipreps/niworkflows) or open a NeuroStars post with the |
I have filed a new issue there, but I am not sure whether I provided enough details.
It is a little strange. I have tried
|
@tsalo Given my last comment, should we reopen this issue? Or just wait for Niworkflows?
It turns out both of the failures you mentioned (first the tsv, then the json) were due to the DerivativesDataSink, so it seems like #1066 didn't fix anything (though it's entirely possible the atlas files were also susceptible to the race condition). I will reopen this and attempt to use a similar file-copying approach instead of DerivativesDataSink for these two outputs.
Summary
The output atlases folder has now been moved out of the subject directory, which is very reasonable. However, when running the pipeline for several subjects in parallel, some sub-tasks fail with an error finding the corresponding parcellation atlas file, for reasons that are unclear. The following is the crash report:
Additional details
What were you trying to do?
Running several subjects in parallel.
What did you expect to happen?
The atlases should be found for all subjects.
What actually happened?
Some subjects' pipelines cannot find the atlas file.
Reproducing the bug
None. The pipeline was run with the default settings.