VariantRDD union creates multiple records for the same SNP ID #1644
We already split alternate alleles at the same position into multiple
+1 @heuermh, the union method specifies no contract WRT "duplicate" records.
Logically, in variant space, "duplicate" is not clearly defined, since two records with the same chr/pos/ref/alt could carry conflicting annotations. I don't believe the VCF spec says that duplicate records are illegal, as long as both records are well formed and the data is properly sorted.
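To make the ambiguity concrete, here is a minimal sketch in plain Python, not ADAM's API. It assumes variant records are simple dicts, and the field names (`chrom`, `pos`, `ref`, `alt`, `id`, `annotation`) and helper functions are hypothetical, chosen only for illustration. A union that concatenates without deduplicating keeps both records for the same site, and those records can disagree on their annotations:

```python
from collections import defaultdict


def variant_key(record):
    # Hypothetical site key: chromosome, position, reference and alternate allele.
    return (record["chrom"], record["pos"], record["ref"], record["alt"])


def union(a, b):
    # Plain concatenation: nothing is deduplicated, so records sharing a key survive.
    return list(a) + list(b)


def group_by_site(records):
    # Collect every record under its chrom/pos/ref/alt key.
    groups = defaultdict(list)
    for record in records:
        groups[variant_key(record)].append(record)
    return dict(groups)


def conflicting_sites(records, field="annotation"):
    # Keep only the sites whose records disagree on the given field.
    return {
        key: recs
        for key, recs in group_by_site(records).items()
        if len({rec.get(field) for rec in recs}) > 1
    }


left = [
    {"chrom": "1", "pos": 1000, "ref": "A", "alt": "T",
     "id": "rs1", "annotation": "benign"},
]
right = [
    {"chrom": "1", "pos": 1000, "ref": "A", "alt": "T",
     "id": "rs1", "annotation": "pathogenic"},
    {"chrom": "1", "pos": 2000, "ref": "G", "alt": "C",
     "id": "rs2", "annotation": "benign"},
]

merged = union(left, right)
conflicts = conflicting_sites(merged)
```

Here `merged` holds two records for `rs1`, and `conflicts` flags that site because the annotations differ. Picking a "winner" would need a merge policy that the union itself does not define.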
I suggest we close as