POD5 streaming data functionality #2

Closed
harisankarsadasivan opened this issue May 23, 2022 · 2 comments

@harisankarsadasivan

harisankarsadasivan commented May 23, 2022

Hello,

Could you please explain the streaming functionality in C/C++ if I were to extract raw data for Read Until/selective sequencing?
I know a FAST5 has to be completely written before it can be read. Does POD5 have any advantages for Read Until?
How would future chunks of a read be handled? Appended to the same file or new file?

@0x55555555
Collaborator

Hello,

Could you please explain the streaming functionality in C/C++ if I were to extract raw data for Read Until/selective sequencing?

POD5 is an on-disk file format for nanopore read data. Its two-table approach supports writing partial read data to disk, then appending to and completing reads later. You could extract the data for downstream analysis, but I would not expect you to do this in an application that feeds back into the sequencing process, as adaptive sampling does.
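
As a rough illustration of the two-table idea, here is a toy in-memory model in Python. It is not the POD5 library or its on-disk layout: the names `ToyPod5Writer`, `add_signal_chunk`, `complete_read`, and `signal_for` are hypothetical. It only sketches how signal chunks could be written as they arrive, with the read-table row added once the read is finished.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyPod5Writer:
    """Toy model of a two-table layout (hypothetical, not the POD5 API)."""

    # Signal table: one row per chunk of raw samples.
    signal_rows: List[List[int]] = field(default_factory=list)
    # Read table: read_id -> indices of that read's rows in the signal table.
    read_rows: Dict[str, List[int]] = field(default_factory=dict)
    # Bookkeeping for reads that are still in progress.
    _pending: Dict[str, List[int]] = field(default_factory=dict)

    def add_signal_chunk(self, read_id: str, samples: List[int]) -> None:
        """Append a chunk of signal for a read that may not be finished yet."""
        self.signal_rows.append(list(samples))
        self._pending.setdefault(read_id, []).append(len(self.signal_rows) - 1)

    def complete_read(self, read_id: str) -> None:
        """Write the read-table row that points at all chunks seen so far."""
        self.read_rows[read_id] = self._pending.pop(read_id)

    def signal_for(self, read_id: str) -> List[int]:
        """Reassemble a completed read's signal from its chunks."""
        return [s for row in self.read_rows[read_id] for s in self.signal_rows[row]]


writer = ToyPod5Writer()
writer.add_signal_chunk("read-a", [10, 11])
writer.add_signal_chunk("read-b", [20])
writer.add_signal_chunk("read-a", [12, 13])
writer.complete_read("read-a")
print(writer.signal_for("read-a"))  # [10, 11, 12, 13]
```

In this sketch, chunks from different reads interleave in the signal table, and `read-b` has signal stored but no read-table row until it is completed.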

I know a FAST5 has to be completely written before it can be read. Does POD5 have any advantages for Read Until?

I don't expect this to have any direct benefits for Read Until, which requires much lower-latency access to small amounts of read data. This is why we provide a specific API for Read Until.

How would future chunks of a read be handled? Appended to the same file or new file?

One read is always contained within one file. It would be technically possible to spread data across multiple files, but I don't feel this would be useful to readers. Right now, signal data without a record in the read table is conceptually considered "orphaned" and would not be read by the tools we provide.
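
To make the "orphaned" notion concrete, here is a minimal Python sketch under the assumption that each read-table record lists the signal rows it owns. The function name `orphaned_signal_rows` and the dictionary layout are hypothetical and are not part of the POD5 tools.

```python
from typing import Dict, List, Set


def orphaned_signal_rows(signal_row_count: int, read_table: Dict[str, List[int]]) -> List[int]:
    """Return indices of signal rows that no read-table record refers to."""
    referenced: Set[int] = {row for rows in read_table.values() for row in rows}
    return [row for row in range(signal_row_count) if row not in referenced]


# Four signal rows exist; only "read-a" has a read-table record.
read_table = {"read-a": [0, 2]}
print(orphaned_signal_rows(4, read_table))  # [1, 3]
```

A reader that iterates only over the read table would never visit rows 1 and 3 in this example, which matches the behaviour described above.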

Hope that helps,

George

@iiSeymour
Member

You might also be interested in the docs and C++ examples here @harisankarsadasivan
