wholegenome.interval_list not recognized by CreateIntervalsBed process #56
Comments
Hi, |
Hi @maxulysse, thanks for the prompt reply, I indeed missed that particular documentation file, my bad. To clarify, the
|
Thanks a lot for the link @klmr, I'll look more into this format to make sure that it works with sarek. |
By the way, which genome are you working with? |
I’m using GRCh38, and the file in question can be found in the GATK resource bundle on GCP (requires login) or as a download via HTTP. |
I do think it's the file that we used to create the intervals file that we host on AWS iGenomes. |
I think I have a solution.
|
Yes, that works, thanks! I’ve added a review comment on your changeset. |
PR has been created ;-) |
I may be overlooking something but Sarek does not seem to document the input file formats/purposes of the genome files. For most files, the purpose is obvious but some, at least for me, aren’t. And for others it isn’t clear what file format is expected.
For instance, I had assumed that the Picard-formatted wholegenome.interval_list file from the GATK resource bundle would be valid as a genomes.intervals file, but the result is a cryptic error message, as well as a work directory full of weird BED files:
It turns out that the relevant Sarek process only supports two of the three formats described by the GATK documentation; notably, it does not support the Picard-format file, which is included in the official GATK bundle.
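As a possible workaround until the Picard format is handled, the file can be converted to BED before passing it in. Below is a minimal sketch, not anything taken from Sarek itself: the function name interval_list_to_bed and the sample lines are made up for illustration. It assumes the usual Picard interval_list layout, that is, a SAM-style header whose lines start with "@", followed by tab-separated chromosome, start, end, strand and name columns with 1-based inclusive coordinates.

```python
def interval_list_to_bed(lines):
    """Convert Picard interval_list lines to 3-column BED lines.

    Hypothetical helper for illustration only; not part of Sarek or GATK.
    """
    bed = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and the SAM-style header (@HD, @SQ, ...).
        if not line or line.startswith("@"):
            continue
        fields = line.split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        # Picard intervals are 1-based inclusive; BED is 0-based half-open,
        # so only the start coordinate shifts by one.
        bed.append(f"{chrom}\t{start - 1}\t{end}")
    return bed


# Made-up sample input in the assumed layout.
example = [
    "@HD\tVN:1.5\tSO:coordinate",
    "@SQ\tSN:chr1\tLN:248956422",
    "chr1\t1\t248956422\t+\t.",
]
print("\n".join(interval_list_to_bed(example)))
```

The strand and name columns are dropped here, on the assumption that only the coordinates matter for splitting the genome into intervals.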