Hi,
If I import a VCF file containing many samples into GenomicsDB, say something from the 1000 Genomes Project, are the fields that are common to all samples (the ones mentioned in the subject line) stored in each TileDB cell? Or are they stored just once per variant?
Regards,
Shubham
Hello Shubham,
Yes, all the common fields are stored in every TileDB cell for each sample.
GenomicsDB was developed primarily for storing variant data from many individual samples (many VCFs, each VCF with 1 sample, say from the output of a variant caller) and then jointly querying/processing the data. It doesn't work well when the variant data from multiple samples is already combined into a single VCF.
We have seen this issue before and have given it some thought. In the end, I gave up trying to make multi-sample VCF import into GenomicsDB efficient.
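To make the duplication concrete, here is a minimal toy sketch in Python. It is not the GenomicsDB or TileDB API; it only models the idea described above, assuming one cell per (sample row, genomic position) pair. The function name `import_multisample_record`, the choice of `REF`, `ALT`, and `FILTER` as the common fields, and the example position and genotypes are all hypothetical and chosen for illustration.

```python
# Toy model only: this is NOT the GenomicsDB or TileDB API.
# Assumption: one cell per (sample_row, position), each holding its own
# copy of the site-level fields plus that sample's genotype.
from typing import Dict, List, Tuple

Cell = Dict[str, str]


def import_multisample_record(
    cells: Dict[Tuple[int, int], Cell],
    position: int,
    site_fields: Dict[str, str],
    genotypes: List[str],
) -> None:
    """Write one cell per sample, copying the common fields into each cell."""
    for sample_row, genotype in enumerate(genotypes):
        cell = dict(site_fields)  # a separate copy of the common fields
        cell["GT"] = genotype     # the only per-sample value in this toy model
        cells[(sample_row, position)] = cell


cells: Dict[Tuple[int, int], Cell] = {}
import_multisample_record(
    cells,
    position=10177,  # made-up position
    site_fields={"REF": "A", "ALT": "AC", "FILTER": "PASS"},
    genotypes=["0|1", "0|0", "1|1"],  # three made-up samples
)

# Count how many cells carry their own copy of REF for this one variant.
ref_copies = sum(1 for cell in cells.values() if cell["REF"] == "A")
print(len(cells), ref_copies)
```

In this toy model, one variant with three samples produces three cells and three copies of each common field, so the overhead grows linearly with the number of samples in a multi-sample VCF.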