Error with VCF file no sample found #176
Comments
The errors point to missing sample information in the VCF files generated by the MANTA SV caller. Truvari expects the VCF header to list at least one sample, and it reports that it cannot find any in the files you specified.
Let's address the issues one by one:
No SAMPLE columns found in VCF:
The error indicates that the VCF file candidateSV.vcf.gz does not have any sample columns. Tools like Truvari that compare variants between call sets rely on them, so you need to make sure the MANTA VCF output includes sample information.
You can check the header of your VCF file using tools like bcftools:
bcftools view -h candidateSV.vcf.gz
Look at the line starting with #CHROM, which is the last line of the header. After the eight fixed columns (#CHROM through INFO) it should list a FORMAT column followed by one column per sample.
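The same check can be scripted. The following is a minimal sketch, assuming a gzip- or bgzip-compressed VCF that Python's gzip module can read; sample_names and sample_names_from_file are hypothetical helper names, not part of bcftools or Truvari.

```python
import gzip


def sample_names(lines):
    """Return the sample names listed on the #CHROM line of a VCF header.

    An empty list means the file has no sample columns.
    None means no #CHROM line was found before the first record.
    """
    for line in lines:
        if line.startswith("##"):
            continue  # metadata line, keep scanning
        if line.startswith("#CHROM"):
            fields = line.rstrip("\n").split("\t")
            # Fields 0-7 are the fixed columns, field 8 is FORMAT,
            # and everything after that is a sample name.
            return fields[9:]
        break  # reached a record without seeing #CHROM
    return None


def sample_names_from_file(path):
    """Convenience wrapper for a gzip/bgzip-compressed VCF on disk."""
    with gzip.open(path, "rt") as handle:
        return sample_names(handle)
```

Calling sample_names_from_file("candidateSV.vcf.gz") on a file like the one in this thread would be expected to return an empty list, which matches the "no sample columns" error.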
No sample line / ValueError: cannot create VariantHeader:
The second error indicates a problem with reading the VCF file, and it's likely related to the absence of a valid header or sample information. Ensure that the VCF file is correctly formatted and has the necessary header lines.
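bcftools has no dedicated validation subcommand, but reading the whole file with bcftools view makes it report header and record parsing problems:
bcftools view candidateSV.vcf.gz > /dev/null
Any warnings or errors printed to stderr indicate where the file is malformed.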
ValueError: file does not have a valid header:
The last error indicates that Truvari is unable to recognize a valid header in the VCF file. Make sure that the file specified (candidateSV.vcf.gz) is indeed a valid VCF file.
Check the file by using zcat or zless to inspect its content:
zcat candidateSV.vcf.gz | less
Ensure that the file contains a valid VCF header: lines starting with ## for metadata, followed by a single #CHROM line that names the columns, including any samples.
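For a quick structural check without paging through the file, here is a rough sketch. It only looks at the two things named above (the ## metadata block and the #CHROM line) and is not a full VCF validator; header_problems and REQUIRED_COLUMNS are hypothetical names introduced for illustration.

```python
REQUIRED_COLUMNS = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def header_problems(lines):
    """Return a list of problems found in the header of a VCF given as text lines."""
    problems = []
    saw_fileformat = False
    saw_chrom = False
    for index, line in enumerate(lines):
        if line.startswith("##"):
            # The VCF specification requires ##fileformat as the first line.
            if index == 0 and line.startswith("##fileformat=VCF"):
                saw_fileformat = True
            continue
        if line.startswith("#CHROM"):
            saw_chrom = True
            columns = line.rstrip("\n").split("\t")
            if columns[:8] != REQUIRED_COLUMNS:
                problems.append("#CHROM line does not start with the 8 fixed columns")
        break  # the header ends at the #CHROM line or the first record
    if not saw_fileformat:
        problems.append("first line is not a ##fileformat=VCF... line")
    if not saw_chrom:
        problems.append("no #CHROM line before the first record")
    return problems
```

To run it on a compressed file, the lines can come from gzip.open(path, "rt"); an empty result means only that these two checks passed.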
If MANTA does not include sample information in this output, check MANTA's documentation for an output file or option that does. If the issue persists, you may need to ask the MANTA support community or consider a structural variant caller that writes sample columns to its VCF output.
From: ***@***.***>
Sent: Friday, 24 November 2023 14:40
To: ***@***.***>
Cc: ***@***.***>
Subject: [ACEnglish/truvari] Error with VCF file no sample found (Issue #176)
Hello,
I am trying to run truvari bench on VCF files generated by the MANTA SV caller, but I am getting errors saying that no samples were found. Could you please suggest some ideas?
[Screenshot from 2023-11-24 12-39-29]<https://user-images.githubusercontent.com/45700858/285446480-00370a52-e9c5-4c45-9242-623e5fca146b.png>
candidateSV.vcf.gz
There must have been an error in how the FORMAT and sample columns were added. I ran your file through the following script (fixer.py) and Truvari was able to run on the output VCF.
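import sys

# Pass metadata lines through, add a GT FORMAT definition and
# FORMAT/SAMPLE columns to the #CHROM line, and give every
# record a missing genotype (".").
for line in sys.stdin:
    if line.startswith("##"):
        sys.stdout.write(line)
    elif line.startswith("#"):
        sys.stdout.write('##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n')
        sys.stdout.write(line.strip() + '\tFORMAT\tSAMPLE\n')
    else:
        sys.stdout.write(line.strip() + '\tGT\t.\n')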
gunzip -c candidateSV.vcf.gz | python fixer.py | bgzip > cand_fix.vcf.gz
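To make the effect of the script concrete, here is the same transformation wrapped in a function and applied to a tiny made-up VCF. This is only an illustrative sketch: add_placeholder_sample and the three example lines are hypothetical, and the record is not taken from the real candidateSV.vcf.gz.

```python
GT_HEADER = '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n'


def add_placeholder_sample(lines, sample="SAMPLE"):
    """Yield VCF lines with a FORMAT column and one placeholder sample added."""
    for line in lines:
        if line.startswith("##"):
            yield line  # metadata is passed through unchanged
        elif line.startswith("#"):
            # Declare GT just before the #CHROM line, then extend that line.
            yield GT_HEADER
            yield line.strip() + "\tFORMAT\t" + sample + "\n"
        else:
            # Every record gets FORMAT=GT and a missing genotype.
            yield line.strip() + "\tGT\t.\n"


# Made-up input: one metadata line, a sites-only #CHROM line, one record.
example = [
    "##fileformat=VCFv4.1\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\tsv1\tN\t<DEL>\t.\t.\tSVTYPE=DEL;END=200\n",
]
fixed = list(add_placeholder_sample(example))
print("".join(fixed), end="")
```

The output has one extra header line (the GT definition), and both the #CHROM line and the record gain two tab-separated columns. The genotype stays missing, so the sample column carries no real genotype information.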
Thank you so much, it's finally working for me now.