Error when running variant calling: "Expected three tokens in header line, got 2" #67
Hi Kiran, can you attach more of the log file before the error? Specifically, beginning from the invocation of margin. This is an issue in margin's merge function which I've seen when the file margin uses to track read ids per chunk is corrupted, but that shouldn't happen during execution in this workflow. Also, if you can send the command you're using to run this and the specific version of the docker container, this would help us debug.
Sure thing. I've attached the full log below. I'm using the Docker image "kishwars/pepper_deepvariant:r0.4.1". The specific command I'm running is:
Also, if it helps, I've put the chr22 BAM file (~5 GB) in Dropbox.
Thanks Kiran, I don't see anything out of place in the log file. I'm downloading the BAM now, can you send PEPPER_SNP_OUPUT.vcf.gz so I can debug this?
Hi @kvg, just checking in to see if you have resolved this issue and whether you can supply the VCF.
Hi - sorry for my delay. I shall get you the PEPPER_SNP_OUPUT.vcf.gz today.
Thanks,
-Kiran
Hi @tpesout - I've now generated the PEPPER_SNP_OUPUT.vcf.gz for the same sample and chromosome and attached it here. Thanks!
Kiran, It looks like the issue is arising because your BAM has duplicate reads:
I'll need some time to discuss if or how we should handle this in the pipeline, but for now removing duplicates from your BAM should fix this issue.
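For anyone wanting to check their own BAM for this, here is a minimal sketch of one way to spot repeated alignment records. It is not the check used by the maintainers. It assumes the input is SAM text lines (for example the output of `samtools view -h`), and the choice of key (QNAME, FLAG, RNAME, POS, CIGAR) is an assumption about what counts as "the same" record. The function name `find_duplicate_records` is hypothetical.

```python
from collections import Counter


def find_duplicate_records(sam_lines):
    """Return {key: count} for alignment records seen more than once.

    The key (QNAME, FLAG, RNAME, POS, CIGAR) is an assumed definition of
    a duplicate record; it is not taken from PEPPER or margin.
    """
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        # SAM columns: 1=QNAME, 2=FLAG, 3=RNAME, 4=POS, 6=CIGAR
        key = (fields[0], fields[1], fields[2], fields[3], fields[5])
        counts[key] += 1
    return {key: n for key, n in counts.items() if n > 1}


# Tiny made-up example: read1 appears twice with identical fields.
example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "read1\t0\tchr22\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\t*",
    "read1\t0\tchr22\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\t*",
    "read2\t16\tchr22\t250\t60\t10M\t*\t0\t0\tTTTTACGTAC\t*",
]
duplicates = find_duplicate_records(example)
print(duplicates)
```

On a real file the lines could be read from `sys.stdin` with the BAM piped through `samtools view -h`; for a whole-genome BAM the counter will hold one entry per distinct record, so memory use may be significant.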
@kvg As this is an issue with bad data, we're not going to make changes to the pipeline to support this case. I'm going to close this issue, but please feel free to reopen or reply if you have further issues.
Sounds great - thanks very much for your help in diagnosing the issue! Now to investigate why my data has duplicate reads in the first place.
@kvg if it's helpful, I found that for the two that I checked there were three alignments which were exactly the same, and one which had something different (as far as "uniq" thinks).
Hi, I am having the same issue. I am working with data coming from cancer chromothripsis (i.e. highly broken-up chromosomes with lots of structural variants), hence supplementary alignments (and consequently multiple records with the same QNAME in the BAM file) are expected. If I filtered out all these reads I would lose a substantial part of the information.
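To tell expected supplementary alignments apart from true duplicates, one option is to look at the SAM FLAG bits: 0x100 marks a secondary alignment and 0x800 a supplementary one. The sketch below groups records by QNAME and reports names with more than one primary record. It assumes unpaired reads (as with ONT data), so that each QNAME should have at most one primary alignment; the function names are hypothetical and this is not part of PEPPER or margin.

```python
SECONDARY = 0x100       # SAM FLAG bit: secondary alignment
SUPPLEMENTARY = 0x800   # SAM FLAG bit: supplementary alignment


def classify_flag(flag):
    """Label a record as primary, secondary, or supplementary."""
    if flag & SUPPLEMENTARY:
        return "supplementary"
    if flag & SECONDARY:
        return "secondary"
    return "primary"


def group_by_qname(sam_lines):
    """Map each QNAME to the list of record types seen for it."""
    groups = {}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        groups.setdefault(fields[0], []).append(classify_flag(int(fields[1])))
    return groups


def qnames_with_multiple_primaries(groups):
    """Assumes unpaired reads: more than one primary record is suspicious."""
    return sorted(q for q, kinds in groups.items() if kinds.count("primary") > 1)


# Made-up records: readA has a supplementary alignment, readB is repeated.
example = [
    "readA\t0\tchr22\t100\t60\t5M5S\t*\t0\t0\tACGTACGTAC\t*",
    "readA\t2048\tchr22\t900\t60\t5H5M\t*\t0\t0\tCGTAC\t*",
    "readB\t0\tchr22\t300\t60\t10M\t*\t0\t0\tACGTACGTAC\t*",
    "readB\t0\tchr22\t300\t60\t10M\t*\t0\t0\tACGTACGTAC\t*",
]
groups = group_by_qname(example)
suspicious = qnames_with_multiple_primaries(groups)
print(groups)
print(suspicious)
```

Whether the pipeline tolerates QNAMEs that repeat only because of supplementary records is not established in this thread, so this only helps describe the data, not predict the error.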
Have you solved it? I am facing the same issue, but I do not know whether I need to remove the duplicates.
Hi, I have not looked into it yet. Let me know if you worked it out somehow. All I can think of right now is appending some random strings to the read names, as a quick workaround.
Yes, I tried to use the command
Here is a quick workaround to append random UMI-like sequences to the read names; after that, PEPPER does not complain:
You may want to delete the escaping \
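The original command is not preserved in this copy of the thread. As a separate illustration of the same idea, here is a minimal Python sketch that makes repeated read names unique in SAM text. Note the differences: it appends a deterministic counter (`_1`, `_2`, ...) rather than a random UMI-like string, it leaves the first occurrence of each name untouched, and the function name `make_qnames_unique` is hypothetical. Whether PEPPER accepts the result, and how renaming affects tools that link supplementary alignments by QNAME, has not been tested here.

```python
def make_qnames_unique(sam_lines):
    """Append a counter suffix to the 2nd, 3rd, ... record of each QNAME.

    Header lines (starting with '@') are passed through unchanged.
    Assumption: no existing read name already ends in the same suffix.
    """
    seen = {}
    out = []
    for line in sam_lines:
        if line.startswith("@"):
            out.append(line)
            continue
        fields = line.split("\t")
        qname = fields[0]
        n = seen.get(qname, 0)
        seen[qname] = n + 1
        if n > 0:
            fields[0] = f"{qname}_{n}"  # rename only the repeats
        out.append("\t".join(fields))
    return out


# Made-up records: read1 occurs twice, read2 once.
example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "read1\t0\tchr22\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\t*",
    "read1\t2048\tchr22\t900\t60\t10M\t*\t0\t0\tACGTACGTAC\t*",
    "read2\t0\tchr22\t300\t60\t10M\t*\t0\t0\tACGTACGTAC\t*",
]
renamed = make_qnames_unique(example)
names = [l.split("\t")[0] for l in renamed if not l.startswith("@")]
print(names)
```

A counter keeps the renaming reproducible between runs, which a random suffix would not; the trade-off is the collision assumption noted in the docstring.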
Hello,
I'm running into a strange problem when running PEPPER-Margin-DeepVariant on ONT data for a human whole-genome sample covered to ~30x. About an hour into the run, I get an error message stating, "Expected three tokens in header line, got 2".
The log file (excerpted) is as follows:
I'm not sure which file it's saying is effectively malformed - is this an error on my end or is this perhaps a bug in PEPPER-Margin-DeepVariant?
Thanks,
-Kiran