Q: support for PacBio "native" format #27

Closed
ptrebert opened this issue Dec 5, 2019 · 2 comments

@ptrebert

ptrebert commented Dec 5, 2019

Hi,
is elPrep also a drop-in replacement for processing BAM files in "PacBio native" format, i.e., BAM files containing unaligned long reads plus additional quality information that is used for downstream steps such as alignment with pbmm2?
Thanks for the info.
+Peter

@pcostanza
Contributor

Hi,

Unfortunately, I don't believe this will work. elPrep supports the steps between alignment and variant calling (sorting, marking duplicates, BQSR, etc.) of a typical GATK best-practices pipeline, and these steps require BAM files that are already aligned. If you can align the PacBio reads first, it may be possible to use elPrep afterwards, and we have heard of other people using elPrep in that way. However, we have not tested elPrep with long reads ourselves yet, so we cannot guarantee that it will work as expected.

If you want to discuss this in more detail, please let us know.

Cheers,
Pascal

@ptrebert
Author

ptrebert commented Dec 6, 2019

Hi Pascal,
thanks for the info. Depending on where the future development of elPrep is going, please consider this a feature request. My main reason for asking was that current tools that handle these unaligned/PacBio "native" BAM files seem to be quite slow for tasks such as merging sub-parts into a larger file or dumping from BAM to FASTQ/FASTA. I would expect these tasks to be I/O-limited, but they seem to be CPU-limited in current implementations, so this is something elPrep could do wonders for. Anyway, I am closing this now. If, at some point, elPrep does support these PacBio BAM files, please announce it loud and clear.
Best,
Peter
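
For readers unfamiliar with the "dumping from BAM to FASTQ" task mentioned above, here is a minimal Python sketch of the record-level conversion. It is not how elPrep or any PacBio tool does it. It assumes the BAM has already been decoded to SAM text (for example with `samtools view -h`) and that all records are unaligned, so no reverse-complementing is needed. The function names `sam_record_to_fastq` and `sam_to_fastq` are hypothetical, and PacBio-specific tags are simply dropped.

```python
from typing import Iterable, Iterator


def sam_record_to_fastq(line: str) -> str:
    """Convert one SAM text record into a four-line FASTQ entry.

    Assumes an unaligned record: SEQ and QUAL are used as stored.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM record needs at least 11 tab-separated fields")
    name, seq, qual = fields[0], fields[9], fields[10]
    if qual == "*":
        # SAM uses "*" when qualities are absent; FASTQ needs one
        # character per base, so fill with the lowest Phred+33 value.
        qual = "!" * len(seq)
    return f"@{name}\n{seq}\n+\n{qual}\n"


def sam_to_fastq(lines: Iterable[str]) -> Iterator[str]:
    """Yield FASTQ entries for every record, skipping header and blank lines."""
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue  # "@" marks SAM header lines; read names cannot start with it
        yield sam_record_to_fastq(line)
```

The per-record work here is trivial string handling, which is consistent with the expectation that such a dump should be bound by I/O and BAM decompression rather than by the conversion itself.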

@ptrebert ptrebert closed this as completed Dec 6, 2019