Expose sorting by reference index #952
As I interpret it, the goal of this issue is to add a CLI option flag
The result would be a VCF or SAM/BAM file with records sorted first by the referenceIndex of the contig and then by genomic position. The rows of the header's sequence dictionary would also follow the referenceIndex ordering.
Is this correct?
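To make the proposed ordering concrete, here is a minimal sketch in plain Python. It is not ADAM code: the `Contig` and `Record` types, their field names, and the `sort_by_reference_index` function are all hypothetical stand-ins, and the sketch assumes every record's contig appears in the sequence dictionary.

```python
from typing import List, NamedTuple, Tuple


class Contig(NamedTuple):
    # Hypothetical stand-in for one sequence dictionary row.
    name: str
    reference_index: int


class Record(NamedTuple):
    # Hypothetical stand-in for a read or variant record.
    contig_name: str
    start: int


def sort_by_reference_index(
    records: List[Record], contigs: List[Contig]
) -> Tuple[List[Contig], List[Record]]:
    """Order header rows by reference index, and records by
    (reference index, genomic position)."""
    index_of = {c.name: c.reference_index for c in contigs}
    sorted_header = sorted(contigs, key=lambda c: c.reference_index)
    sorted_records = sorted(
        records, key=lambda r: (index_of[r.contig_name], r.start)
    )
    return sorted_header, sorted_records


contigs = [Contig("chr1", 1), Contig("chrM", 0), Contig("chr2", 2)]
records = [
    Record("chr2", 50),
    Record("chr1", 300),
    Record("chrM", 10),
    Record("chr1", 100),
]
header, ordered = sort_by_reference_index(records, contigs)
```

Here the header comes out as chrM, chr1, chr2 (indices 0, 1, 2), which differs from a lexicographic sort on contig name, and the records follow the same contig order with positions ascending inside each contig.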
Question: I am not clear on whether we want this command to also sort by genomic position within each reference/contig group. The original SAM/BAM or VCF may not have been sorted by position, but we have no way to recover its original order, and there is no guarantee that the order ADAM outputs would match that of the original VCF or BAM.
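The two options in this question can be compared on a toy input. The sketch below is plain Python with hypothetical names (`Record`, `index_of`), not ADAM code. It relies on Python's `sorted` being stable, so sorting on reference index alone keeps the input order within a contig; whether a distributed sort in ADAM could offer the same guarantee is a separate matter that this sketch does not address.

```python
from typing import NamedTuple


class Record(NamedTuple):
    # Hypothetical stand-in for a read or variant record.
    contig_name: str
    start: int


# Assumed mapping from contig name to referenceIndex.
index_of = {"chrM": 0, "chr1": 1}

# Input that is not position-sorted within chr1.
records = [Record("chr1", 300), Record("chrM", 10), Record("chr1", 100)]

# Option 1: sort on reference index only. Within chr1 the input order
# (300 before 100) is kept because sorted() is stable.
by_index_only = sorted(records, key=lambda r: index_of[r.contig_name])

# Option 2: sort on reference index, then genomic position.
by_index_and_position = sorted(
    records, key=lambda r: (index_of[r.contig_name], r.start)
)
```

With option 1 the chr1 records stay as 300 then 100; with option 2 they become 100 then 300. Only option 2 gives an output that does not depend on the order in which the records arrived.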
Also - I need to look at code to understand more about the cases where
As long as a SequenceDictionary exists whose contigs have a referenceIndex, I guess a, b, c above are upstream of the concern in this issue, but I just wanted to check.