Add ID column to SV BED file #10

Closed
SHuang-Broad opened this issue Dec 9, 2020 · 4 comments

Comments

@SHuang-Broad

Hi,

I constantly make use of the GIAB SV callset and really appreciate the effort of curating all of these.

I do have one feature request:

The SV BED file currently contains only coordinates. It does not give the type of variant each interval is associated with, or the originating variant ID available from the VCF (in hg19).
An IGV trick I use constantly is to pack the information I want to see at a glance from the source VCF into the ID (4th) column of the BED file, which IGV displays. This way one doesn't need to click on a VCF record just for a quick look.

I'd appreciate it if the VCF ID records are copied into the BED file.
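
To illustrate the kind of output I mean, here is a minimal sketch, not how the GIAB BED is actually produced. It emits one BED line per VCF record and packs the VCF ID plus two INFO tags into the 4th column. It assumes the VCF carries `SVTYPE`, `SVLEN`, and `END` in INFO; the function names and the `|` separator are my own choices for illustration.

```python
from typing import Dict, Iterable, Iterator


def parse_info(info: str) -> Dict[str, str]:
    """Turn a VCF INFO string such as 'SVTYPE=DEL;END=200' into a dict."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value  # flag-style entries get an empty string
    return fields


def vcf_to_bed(vcf_lines: Iterable[str]) -> Iterator[str]:
    """Yield 4-column BED lines whose name column is 'ID|SVTYPE|SVLEN'."""
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        cols = line.rstrip("\n").split("\t")
        chrom, pos, vid, ref, info = cols[0], cols[1], cols[2], cols[3], cols[7]
        tags = parse_info(info)
        # VCF POS is 1-based; BED start is 0-based and the end is exclusive.
        start = int(pos) - 1
        # Assumption: use INFO/END when present, else the span of REF.
        end = int(tags["END"]) if tags.get("END") else start + len(ref)
        end = max(end, start + 1)  # keep at least a 1 bp interval
        name = "|".join([vid, tags.get("SVTYPE", "NA"), tags.get("SVLEN", "NA")])
        yield f"{chrom}\t{start}\t{end}\t{name}"


# Made-up record, only to show the shape of the output.
example = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1\t1000\tsv_1\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=1500;SVLEN=-500\n",
]
for bed_line in vcf_to_bed(example):
    print(bed_line)
```

For a real file, the lines could come from `gzip.open(path, "rt")`. Note that this produces per-variant intervals rather than annotating the existing benchmark regions.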

Thank you!
Steve

@jzook
Contributor

jzook commented Dec 9, 2020

Thanks for your suggestion! We generally recommend using the Tier 1 VCF file for this information; the Tier 1 BED describes the regions in which we've made (almost) all the SV calls in the VCF. We don't have an easy way to add annotations to the Tier 2 BED, since many of the variants are complex, but we are working towards new assembly-based benchmarks to describe these, including one focused on medically relevant genes, for which we'll post a draft very soon. In the meantime, you could use a whole-genome hifiasm/dipcall VCF to get one estimate of the potential SV call in HG002: ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/AshkenazimTrio/analysis/NIST_HG002_medical_genes_benchmark_v0.01.00/GRCh37/hifiasm_v0.11.

@SHuang-Broad
Author

Thanks Justin!

My assumption was that the BED file was generated from the VCF
ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/AshkenazimTrio/analysis/NIST_SVs_Integration_v0.6/HG002_SVs_Tier1_v0.6.vcf.gz
But that might not be true based on your reply.

The reason I'm working with BED is that we typically work with GRCh38: lifting over the BED is easy, but lifting over the VCF itself is not.

@jzook
Contributor

jzook commented Dec 10, 2020 via email

@SHuang-Broad
Author

Thanks for the information Justin!
