Make fqgrep more grep-like in its options #13
Conversation
Weird, there seems to be some issue with the thread pools stalling when threads <= 2.
@sstadick totally fine if you have zero time, but I am stuck with a weird deadlock when using exactly two threads. I think there's a thread for reading, writing, and processing, so something is stalling or deadlocking. I am suspicious that my use of a …
@nh13 I didn't see the notification for this until today!
So the number of threads you need is equal to:
- 2 for reading if paired (1 if not paired)
- 1 for writing
- 1 for the `pool.install`

Otherwise, as you noted, it will deadlock.
Example: paired input, 3 threads given.
- 1 thread is immediately started for writing, and blocks waiting on the channel.
- 1 thread is used as the context within `pool.install`.
- A thread for reader 1 is created on the threadpool, sending chunks and then blocking once the channel is full.
- A task is added to the pool for reader 2 that wants to use a thread, but all 3 threads are in a blocked state:
  - the `pool.install` thread is waiting on reader 2 to start sending over its channel
  - the reader 1 thread is blocked waiting for its channel to be less full to start sending again
  - the writer is blocked and waiting for anything to show up over the rx channel.
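The stall above can be modeled with only the standard library. In this sketch (names, record counts, and channel capacities are illustrative, not fqgrep's actual code), a fixed "pool" of threads runs reader tasks to completion, like rayon does: a task blocked on a full channel never yields its thread. The writer needs a record from both readers to emit an interleaved pair, so with one pool thread it stalls:

```rust
use std::sync::mpsc::sync_channel;
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

// Returns true if the writer saw every record pair, false if it
// timed out, i.e. the pipeline deadlocked.
fn completes(pool_size: usize) -> bool {
    const RECORDS: usize = 10;
    let (tx1, rx1) = sync_channel::<usize>(2); // bounded: senders block when full
    let (tx2, rx2) = sync_channel::<usize>(2);

    // One reader task per end of the pair; each stops early if the
    // writer has already given up and dropped its receiver.
    let tasks: Vec<Box<dyn FnOnce() + Send>> = vec![
        Box::new(move || {
            for i in 0..RECORDS {
                if tx1.send(i).is_err() { return; }
            }
        }),
        Box::new(move || {
            for i in 0..RECORDS {
                if tx2.send(i).is_err() { return; }
            }
        }),
    ];
    let queue = Arc::new(Mutex::new(tasks));

    // The "pool": each thread pops a task and runs it to completion.
    for _ in 0..pool_size {
        let queue = Arc::clone(&queue);
        thread::spawn(move || loop {
            let task = queue.lock().unwrap().pop();
            match task {
                Some(t) => t(),
                None => break,
            }
        });
    }

    // Writer (a dedicated thread in the real tool): needs one record
    // from each reader to emit an interleaved pair.
    for _ in 0..RECORDS {
        for rx in [&rx1, &rx2] {
            if rx.recv_timeout(Duration::from_millis(200)).is_err() {
                return false; // stalled: one reader never got a thread
            }
        }
    }
    true
}
```

With one pool thread, whichever reader runs first fills its channel while the other never starts, and the writer times out; with two, both readers get a thread and the pipeline drains.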
What I'd recommend is spawning threads not tied to the threadpool to do the reading and writing.
I'd not count the reader and writer threads toward the total number of threads used, since they don't really use CPU and mostly just sleep-wait on sys calls. That's how gzp (and by association crabz) do their accounting for threads.
Alternatively you could switch to an async model which adds a lot of complexity, but specifically solves this scenario where you want to context switch between tasks with a limited threadpool.
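A minimal sketch of that recommendation, with plain `std::thread` workers standing in for rayon's pool and illustrative names (not fqgrep's actual API): the reader and writer get dedicated spawned threads, and only the matching workers count toward `--threads`:

```rust
use std::sync::mpsc::sync_channel;
use std::sync::{Arc, Mutex};
use std::thread;

// Count records containing `pattern`, using exactly `num_workers`
// matching threads plus uncounted reader/writer IO threads.
fn count_matches(num_workers: usize, records: Vec<String>, pattern: &str) -> usize {
    let pattern = pattern.to_string();
    let (read_tx, read_rx) = sync_channel::<String>(4);
    let (write_tx, write_rx) = sync_channel::<String>(4);

    // Dedicated reader thread: mostly sleep-waits on the channel,
    // so it is not counted toward num_workers.
    let reader = thread::spawn(move || {
        for r in records {
            if read_tx.send(r).is_err() { break; }
        }
    });

    // Dedicated writer thread: drains results ("writes" by counting here).
    let writer = thread::spawn(move || write_rx.iter().count());

    // The worker pool does the matching and is the only part sized
    // by the user-facing thread count.
    let read_rx = Arc::new(Mutex::new(read_rx));
    let mut workers = Vec::new();
    for _ in 0..num_workers {
        let rx = Arc::clone(&read_rx);
        let tx = write_tx.clone();
        let pat = pattern.clone();
        workers.push(thread::spawn(move || loop {
            let rec = rx.lock().unwrap().recv();
            match rec {
                Ok(r) if r.contains(pat.as_str()) => { let _ = tx.send(r); }
                Ok(_) => {}
                Err(_) => break, // reader finished and channel closed
            }
        }));
    }
    drop(write_tx); // writer finishes once all workers drop their senders

    reader.join().unwrap();
    for w in workers {
        w.join().unwrap();
    }
    writer.join().unwrap()
}
```

Even with a single worker this cannot deadlock, because the reader and writer always make progress on their own threads rather than competing for pool slots.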
```rust
progress.record();
progress.record();
}
assert!(
```
Note that this compares the full read header, which will likely differ between R1 and R2 since it contains the `<read>` element. I had to remove this to run locally on the datasets I had on hand.
I pushed a branch that has debug statements that demonstrate exactly how / why the deadlock happens: https://github.com/fulcrumgenomics/fqgrep/tree/demo/deadlock. After building, run with …
@sstadick the pony I want is "I have N tasks, M processors, where N >> M, and M is a thread pool". I want the user to say "use M threads", and not have the tool actually use M + 2 threads, if that makes sense. I thought that spawning threads and installing closures would support that? If a thread is blocked, can't another thread be executed?
For a number of complicated reasons, the pony you are asking for is … As far as I can tell, … Part of the reason rayon can't do this is that suspending the execution of a task and restarting it on another thread means moving memory around, which may or may not be safe.
Lastly, with regards to not counting IO threads: that is what pigz does as well; see sstadick/gzp#11 (comment).
@sstadick thanks for the explanation, that makes a lot of sense. In a lot of contexts, like you have for …, I do think the right answer is to spawn separate reader/writer threads and call it a day, like you have in …
This looks awesome!!
@tfenne in case you want to review it, here it is. I think there are probably non-trivial improvements that can be made to the implementation, but I've stared at it long enough I can no longer see them.
This adds a ton of changes:

1. `--threads` is respected. Previously reader and writer threads were always allocated outside the match pool, and there were specific arguments for the latter and for compressing the output (the latter feature has been removed; plaintext FASTQ is the only output format, just pipe it if you need to). See `-f` below.
2. Files are treated as uncompressed unless the `--decompress` option is given, in which case they're assumed to be GZIP compressed. This includes standard input. The exceptions are `.gz`/`.bgz` and `.fastq`/`.fq`, which are always treated as GZIP compressed and plain text respectively.
3. New grep-like options:
   - `-c, --count`: simply return the count of matching records
   - `-F, --fixed-strings`: interpret the pattern as a set of fixed strings
   - `-v`: selected records are those not matching any of the specified patterns
   - `--color <color>`: color the output records with ANSI color codes
   - `-e, --regexp <regexp>...`: specify the pattern used during the search; can be specified multiple times
   - `-f, --file <file>`: read one or more newline-separated patterns from a file
   - `-Z, --decompress`: treat all non-`.gz`, `.bgz`, `.fastq`, and `.fq` files as GZIP compressed (default: treat as uncompressed)
   - `--paired`: if one file or standard input is used, treat the input as an interleaved paired-end FASTQ. If more than one file is given, ensure that the number of files is a multiple of two, and treat each consecutive pair of files as R1 and R2 respectively. If the pattern matches either R1 or R2, output both (interleaved FASTQ).
   - `--reverse-complement`: also search the reverse complement of the read sequences
   - `--progress`: write progress (counts of records searched)
   - `-t, --threads <threads>`: see (1) above

This could use a lot of unit tests and some tests on large datasets (to ensure correctness and performance relative to the previous version).