Race condition in ".pathnames.iv" file in position command #493
Comments
Hi @ASLeonard, the code link you sent is not run in parallel, as far as I can see. The construction of the XP index is single threaded, with the exception being the https://github.com/ekg/mmmulti we use. |
Sorry for not being clear. The command I was running was something like
So the parallelism comes from running multiple calls of odgi position. And this is one of the errors when running in the directory
Since there are multiple instances of |
I see. From what I understand, the problem does not originate from the XP index, because here we generate the file paths randomly. Also, none of the code of the XP index is invoked within position_main.cpp. |
Which version of ODGI are you using @ASLeonard ? |
I built odgi from tip (a054641).
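I'm less sure this is the reason then, based on your experience, but the problem only appears when calling multiple odgi positions in parallel and never occurs when running multiple odgi position calls sequentially, so I still think the issue is with different calls racing to the "/cluster/work/alex/KIT/.pathnames.iv" file. |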
We could fix the problem by generating files with random names. Strangely, your files have "no name", that is they have names starting with a dot.
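To make the suggestion concrete, here is a minimal Python sketch of the two patterns being discussed. It is not odgi's code (odgi is written in C++), and the function names `write_fixed` and `write_unique` are hypothetical; it only illustrates why a fixed file name is racy and how a randomly named temporary file avoids the collision.

```python
import os
import tempfile
from typing import Sequence


def write_fixed(directory: str, names: Sequence[str]) -> str:
    """Racy pattern: every caller writes to the same fixed file name."""
    path = os.path.join(directory, ".pathnames.iv")
    with open(path, "w") as handle:
        handle.write("\n".join(names))
    return path


def write_unique(directory: str, names: Sequence[str]) -> str:
    """Safer pattern: mkstemp creates a new file with a unique random name."""
    # The prefix and suffix are illustrative assumptions, not odgi's naming.
    fd, path = tempfile.mkstemp(prefix="pathnames.", suffix=".iv", dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write("\n".join(names))
    return path
```

With `write_fixed`, two callers in the same directory get the same path and the later write replaces the earlier one. With `write_unique`, each call gets its own file, so concurrent processes cannot clobber each other's data.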
On Wednesday, April 12, 2023 3:44:56 PM, Alex Leonard wrote:
I built odgi from tip (a054641).
From what I understand the problem is not originating from the XP index, because here we generate the file paths randomly. Also none of the code of the XP index is invoked within position_main.cpp.
I'm less sure this is the reason then based on your experience, but the problem only appears when calling multiple odgi positions in parallel and never occurs when running multiple odgi position calls sequentially, so I still think the issue is with different calls racing to the "/cluster/work/alex/KIT/.pathnames.iv" file.
|
I understand the problem, but I don't understand how it can happen 🍿 |
Ah, the problem could be that you are writing to |
I was writing to stdout to further postprocess the output. I can try writing to separate files (based on the unique paths of interest), so that would at least test if the
I'm not sure why the files have "no name" and are just dot names in the working directory. $TMPDIR is unset (but I'm pretty sure I saw this issue on the compute nodes, where $TMPDIR is set). |
So my current assumption is that when
I am not familiar enough with the |
I am confused by this command: parallel -j 4 odgi position -i pggb.og -p HER:{1}-{2} -E -d 1000 -o /dev/stdout ::: <paths of interest>
|
Hi,
Probably not a common case, but querying positions in a graph in an outer parallel loop is buggy, because of this line writing a local pathname file with a fixed name.
odgi/src/algorithms/xp.cpp
Line 113 in 2c9a17f
Running
parallel -j 1 ...
fixed the issue for me, so pretty sure it is exactly due to a race condition on this line. It seems there is another similar case here.
Best,
Alex
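Besides dropping to a single job, one possible workaround is to give each parallel call its own working directory, so that a fixed-name file written relative to the current directory cannot collide. The Python sketch below assumes that the file really is written into the working directory, as the reported path suggests; `run_isolated` and `run_all` are hypothetical helper names, not part of odgi or GNU parallel.

```python
import subprocess
import tempfile
from concurrent.futures import ThreadPoolExecutor


def run_isolated(command):
    """Run one command inside its own scratch directory and return its stdout."""
    with tempfile.TemporaryDirectory() as scratch:
        # cwd=scratch keeps any fixed-name file private to this call.
        result = subprocess.run(
            command, cwd=scratch, capture_output=True, text=True, check=True
        )
    return result.stdout


def run_all(commands, jobs=4):
    """Run the commands concurrently; outputs come back in input order."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(run_isolated, commands))
```

Because the working directory changes for each call, any input file (for example the graph passed with -i) would have to be given as an absolute path.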