Question about using the C API to access a POD5 file efficiently using multiple threads #1
Comments
Hi @hasindu2008, Great shouts on the above, I'll try to answer in order:
I will endeavour to push a release shortly that at least makes getting signal in a single-threaded way easier, then move on to making it faster using multiple threads. Thanks,
|
Hi George, thank you for the quick response. (1) is not a big requirement as I can use g++ for now. For (2), how do you envision the MinKNOW output, and what would the default batch size be? Also, will MinKNOW output one large POD5 file per sequencing run, or multiple POD5 files as is currently done with FAST5? (3) is what I am mostly interested in. Isn't Arrow capable of internally utilising threads for decompression and/or parsing?
|
@jorj1988
Is this multi-threading-related crash in POD5 fixed now?
Hello @hasindu2008, my apologies, this issue slipped by me. Yes, it is now resolved: it is safe to read POD5 files from as many threads as you like. For cache efficiency I would recommend reading one batch at a time in each thread; however, this is not required by the API.
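The "one batch at a time in each thread" pattern could look roughly like the sketch below. It is only an illustration, not code from POD5 or Dorado: `for_each_batch` and the `process_batch` callback are hypothetical names, and the callback stands in for whatever POD5 calls read and decode a single batch. Worker threads claim whole batch indices from a shared atomic counter, so each batch is handled by exactly one thread.

```cpp
#include <atomic>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Calls process_batch(i) exactly once for every i in [0, batch_count),
// spreading the batches over thread_count worker threads.
// process_batch is a hypothetical stand-in for the per-batch POD5 work.
void for_each_batch(std::size_t batch_count, unsigned thread_count,
                    const std::function<void(std::size_t)>& process_batch) {
    if (thread_count == 0) thread_count = 1;  // always run at least one worker

    std::atomic<std::size_t> next_batch{0};   // next unclaimed batch index
    std::vector<std::thread> workers;
    workers.reserve(thread_count);

    for (unsigned t = 0; t < thread_count; ++t) {
        workers.emplace_back([&] {
            for (;;) {
                // Claim one whole batch; fetch_add returns the previous value.
                const std::size_t i = next_batch.fetch_add(1);
                if (i >= batch_count) break;  // nothing left to claim
                process_batch(i);
            }
        });
    }
    for (auto& w : workers) w.join();         // wait for all batches to finish
}
```

Claiming batches dynamically, rather than splitting the index range into fixed chunks up front, keeps threads busy when batches differ in size. The callback must be safe to call concurrently for different batch indices.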
|
Dear POD5 developers,
I have been trying to use the POD5 C API to write a simple example that converts raw signal data to picoamperes. The input is a single POD5 file containing a large number of reads, and I want to iterate through all the reads while exploiting as many threads as possible. Learning from the Dorado code, I have written something, and a code snippet is given below. I have a few questions.
pod5_get_signal_row_info()
without using a C++ vector (by using pure C structs)? See comment on the code below.
Is the above implementation the most efficient way to use POD5 on a multi core system?
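For readers without the original snippet, the conversion step itself can be sketched on plain C-style buffers, with no C++ containers. This is a minimal illustration and not the code from the issue: `to_picoamperes` is a hypothetical helper, and it assumes the per-read calibration is applied as pA = (raw + offset) * scale, with `offset` and `scale` taken from the read's calibration fields.

```cpp
#include <cstddef>
#include <cstdint>

// Converts n raw 16-bit samples to picoamperes and writes them to out,
// which must have room for n floats.
// Assumption: pA = (raw + offset) * scale, using per-read calibration values.
void to_picoamperes(const std::int16_t* raw, std::size_t n,
                    float offset, float scale, float* out) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = (static_cast<float>(raw[i]) + offset) * scale;
    }
}
```

Because the function only touches caller-owned buffers, each thread can call it on its own reads without any shared state.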