Introduction

This RFC lays out the Data Versioning process for St. Jude Cloud.

Motivation

To provide the community with harmonized data from experiments performed at St. Jude Children's Research Hospital, the St. Jude Cloud team has developed datatype-specific harmonization workflows. Over time, it is necessary to incorporate bug fixes and other changes in those workflows and the underlying tools. In order to make these changes with minimal impact, we seek to define "data versions", within which, we have evidence that the workflow results are compatible and not "substantially" different. This RFC will lay out the methods by which we determine what constitutes a significant difference in the resulting data.

Pre-Discussion

We have previously had short discussions regarding changes to our data versioning scheme. Currently we attach the pipeline version used to generate the data to the file's metadata. This has left us in a position where we are hesitant to increment pipelines to new versions or update to new versions of underlying tools. We generally agree that there are many pipeline-level changes that do not materially impact the output results. The goal of this RFC is to determine what changes we can be made without creating a new data version. To do so, we also need to generate metrics and data that we can use to justify to users that the results are not different.

There are a number of approaches that could be taken to do this. We could look at alignment information, in which case, we may want to use results from the QC pipeline to justify. Another option is to examine the results from an analysis perspective. If the variant calls are the same (or reasonably the same), that may be sufficient. These are just two limited examples of options we could use.

Currently we have the RNA-Seq workflow v2.0.0 in use for St. Jude Cloud. We have subsequently released improvements, including the integration of XenoCP. This is not in use for production work as we include the pipeline version with the data files in St. Jude Cloud. Creation of a data versioning standard will enable us to roll out bug fixes and new features on an appropriate timeline and provide confidence to ourselves and our users that data is consistent, hopefully, without requiring complete reruns of all data for any change.

We also have issues with the msgen pipeline that is used for WGS and WES. Microsoft has made changes to the pipeline and process throughout the time that we have been using it for production data releases. We have not incremented any versions in that time. So as it stands, we are trusting, without any verification, that any changes introduced are acceptable.

A data versioning process should help us find a reasonable balance that doesn't stop us from making any changes, ever, but also we want to avoid incorporating any and all changes without any assurances, or insight in msgen's case, that those changes are reasonable and not meaningfully impactful to the data results.

Discussion

Whole-Genome and Whole-Exome Sequencing

We propose to evaluate whole-genome (WGS) and whole-exome (WES) sequencing by running well characterized samples through each iteration of the analysis pipeline. The resulting variant calls (gVCFs) will be compared to existing high-quality variant calls. This comparison will be conducted using Illumina's hap.py comparison tool as recommended by the Global Alliance for Genomics and Health (GA4GH) Benchmarking Team. Specifically, we propose to run samples from the National Institute for Standards and Technology (NIST)'s Genome in a Bottle (GIAB) project. We will perform analysis using samples HG002, HG003, HG004, HG006, and HG007 for WGS. For WES, we will use samples HG002, HG003, HG004, and HG005. The results from prior iterations of the pipeline will be supplied as the truth set. The variant calls from the new workflow version will be treated as the query. The confident call sets from GIAB will be provided as the truth dataset for a second comparison. In this way we can track divergences in the pipeline and also determine the significance of any discovered divergence.

RNA-Seq

We propose to evaluate RNA sequencing (RNA-Seq) by running well characterized samples through each iteration of the analysis pipeline. Specifically, we propose to run samples from the National Institute for Standards and Technology (NIST)'s Genome in a Bottle (GIAB) project. We will use RNA-Seq data from HG002, HG004, and HG005 samples. These three samples will be run through new iterations of the St. Jude Cloud RNA-Seq harmonization workflow. The pipeline will output aligned BAMs and feature count files for each sample. We will then generate variant calls using the GATK RNA-Seq short variant discovery best practices workflow. We will use Illumina's hap.py comparison tool, along with the high confidence variant calls from GIAB, to compare output from prior versions of the RNA-Seq pipeline to the proposed version.

GATK RNA-Seq short variant discovery best practices workflow

The GATK best practices workflows are de facto standards for data processing in bioinformatics. Therefore we will reuse their existing workflow for generating variant calls from RNA-Seq data. The workflow accepts a STAR-aligned BAM file, such as those produced by our RNA-Seq alignment harmonization workflow, and produces a set of variant calls by running a number of GATK steps.

Workflow

Run Picard MarkDuplicates

    picard -Xmx~{java_heap_size}g MarkDuplicates \
        -I ~{bam} \
        -O ~{if create_bam then prefix + ".bam" else "/dev/null"} \
        --VALIDATION_STRINGENCY SILENT \
        --CREATE_INDEX ~{create_bam} \
        --CREATE_MD5_FILE ~{create_bam} \
        --COMPRESSION_LEVEL 5 \
        --METRICS_FILE ~{prefix}.metrics.txt

Run GATK SplitNCigarReads

    gatk \
        SplitNCigarReads \
        -R ~{fasta} \
        -I ~{bam} \
        -O ~{prefix}.bam \
        -OBM true
   # GATK is unreasonable and uses the plain ".bai" suffix.
   mv ~{prefix}.bai ~{prefix}.bam.bai

Run GATK Spark-enabled BaseRecalibrator (BaseRecalibratorSpark).

     gatk \
         --java-options "-Xms4000m" \
         BaseRecalibratorSpark \
         -R ~{fasta} \
         -I ~{bam} \
         ~{if use_original_quality_scores then "--use-original-qualities" else "" } \
         -O ~{prefix}.txt \
         -known-sites ~{dbSNP_vcf} \
         -known-sites ~{sep=" --known-sites " known_indels_sites_VCFs} \
         --spark-master local[4]

Run GATK Spark-enabled ApplyBQSR (ApplyBQSRSpark).

        gatk \
        --java-options "-Xms3000m" \
        ApplyBQSRSpark \
        --spark-master local[4] \
        -I ~{bam} \
        ~{if use_original_quality_scores then "--use-original-qualities" else "" } \
        -O ~{prefix}.bam \
        --bqsr-recal-file ~{recalibration_report}

Run Picard IntervalListTools to generate list of regions to scatter over.

    picard -Xms1g \
        IntervalListTools \
        SCATTER_COUNT=~{scatter_count} \
        SUBDIVISION_MODE=BALANCING_WITHOUT_INTERVAL_SUBDIVISION_WITH_OVERFLOW \
        UNIQUE=~{unique} \
        SORT=~{sort} \
        INPUT=~{interval_list} \
        OUTPUT=out

Run GATK Spark-enabled HaplotypeCaller (HaplotypeCallerSpark) on each region.

    gatk \
       --java-options "-Xms6000m" \
        HaplotypeCallerSpark \
        -R ~{fasta} \
        -I ~{bam} \
        -L ~{interval_list} \
        -O ~{prefix}.vcf.gz \
        ~{if use_soft_clipped_bases then "" else "--dont-use-soft-clipped-bases"} \
        --standard-min-confidence-threshold-for-calling ~{stand_call_conf} \
       --spark-master local[4]

Run Picard MergeVcfs.

    picard -Xms2000m \
        MergeVcfs \
        ~{sep(' ', prefix('--INPUT=', vcfs))} \
        --OUTPUT ~{output_vcf_name}

Run GATK VariantFiltration.

    gatk \
        VariantFiltration \
            --R ~{fasta} \
            --V ~{vcf} \
            --window ~{window} \
            --cluster ~{cluster} \
             ~{sep(' ', prefix('--filter-name ', filter_name))} \
             ~{sep(' ', prefix('--filter-expression ', squote(filter_expression)))} \
            -O ~{prefix}.vcf.gz

Validation and Evaluation

After generation of RNA-Seq variants, we will run Illumina's hap.py comparison tool, as recommended by GA4GH, to evaluate the variant calls produced by the newest version of the pipeline compared with those produced by the prior version of the pipeline and compared against the high-quality variant call benchmark set published by GIAB.

We will run multiple comparisons. The first is to compare the prior version of the pipeline and the proposed version of the pipeline, using the confident call set from GIAB as an additional filter. We will use the Gencode CDS regions to restrict analysis to only those expected to be covered by RNA-Seq.

    hap.py \
        ~{latest release VCF} \
        ~{new version VCF} \
        -o ~{report_name} \
        -r ~{reference_fa} \
        -T ~{gencode_CDS_bed} \
        -f ~{sample_confident_call_bed}

We will also run without the GIAB high-confidence callset.

    hap.py \
        ~{latest release VCF} \
        ~{new version VCF} \
        -o ~{report_name} \
        -r ~{reference_fa} \
        -T ~{gencode_CDS_bed}

We will run both because using a confident call set to restrict the comparison causes variants outside that region (in either the TRUTH or QUERY sample) to be excluded and marked as unknown (UNK). As can be seen below, when comparing the first release of our RNA-Seq workflow (v2.0.0) with the latest stable release (v3.0.1), there are a significant number of additional variants found in the new version. This suggests that our decision to increment the major revision was correct. Ultimately, most of those new variants are filtered out with standard quality filters applied during the variant calling pipeline. When looking at the result with the GIAB confident call set applied, there are a large number of variants (1939 SNPs that passed filtering for HG004 below) included and marked UNK. Upon further investigation, there are only 16 SNPs passing filtering that are not called in v2.0.0. The remaining variants are called in both versions, but labeled as UNK due to being outside the confident call regions (that is, not found in WGS).

HG002 v2.0.0 vs. v3.0.1 with GIAB confidence set

Type	Filter	TRUTH.TOTAL	TRUTH.TP	TRUTH.FN	QUERY.TOTAL	QUERY.FP	QUERY.UNK	FP.gt	FP.al	METRIC.Recall	METRIC.Precision	METRIC.Frac_NA	METRIC.F1_Score	TRUTH.TOTAL.TiTv_ratio	QUERY.TOTAL.TiTv_ratio	TRUTH.TOTAL.het_hom_ratio	QUERY.TOTAL.het_hom_ratio
INDEL	ALL	606	604	2	1,026	235	186	1	15	0.997…	0.720…	0.181…	0.836…			3.102…	4.115…
INDEL	PASS	606	601	5	769	7	160	1	0	0.992…	0.989…	0.208…	0.990…			3.102…	2.948…
SNP	ALL	15,278	15,270	8	21,339	3,092	2,977	4	107	0.999…	0.832…	0.140…	0.908…	3.200…	2.806…	1.295…	1.474…
SNP	PASS	15,278	15,266	12	16,929	18	1,645	4	1	0.999…	0.999…	0.097…	0.999…	3.200…	3.090…	1.295…	1.282…

HG002 v2.0.0 vs. v3.0.1 without GIAB confidence set

Type	Filter	TRUTH.TOTAL	TRUTH.TP	TRUTH.FN	QUERY.TOTAL	QUERY.FP	FP.gt	FP.al	METRIC.Recall	METRIC.Precision	METRIC.F1_Score	TRUTH.TOTAL.TiTv_ratio	QUERY.TOTAL.TiTv_ratio	TRUTH.TOTAL.het_hom_ratio	QUERY.TOTAL.het_hom_ratio
INDEL	ALL	760	756	4	1,026	268	1	25	0.995…	0.739…	0.848…			2.892…	4.115…
INDEL	PASS	760	752	8	769	15	1	6	0.989…	0.980…	0.985…			2.892…	2.948…
SNP	ALL	16,924	16,912	12	21,339	4,427	5	183	0.999…	0.793…	0.884…	3.088…	2.806…	1.282…	1.474…
SNP	PASS	16,924	16,901	23	16,929	28	5	2	0.999…	0.998…	0.998…	3.088…	3.090…	1.282…	1.282…

HG004 v2.0.0 vs. v3.0.1 with GIAB confidence set

Type	Filter	TRUTH.TOTAL	TRUTH.TP	TRUTH.FN	QUERY.TOTAL	QUERY.FP	QUERY.UNK	FP.gt	FP.al	METRIC.Recall	METRIC.Precision	METRIC.Frac_NA	METRIC.F1_Score	TRUTH.TOTAL.TiTv_ratio	QUERY.TOTAL.TiTv_ratio	TRUTH.TOTAL.het_hom_ratio	QUERY.TOTAL.het_hom_ratio
INDEL	ALL	656	652	4	1,157	298	199	1	16	0.994…	0.689…	0.172…	0.814…			3.257…	4.662…
INDEL	PASS	656	649	7	822	6	159	1	1	0.989…	0.991…	0.193…	0.990…			3.257…	3.100…
SNP	ALL	15,872	15,864	8	22,068	2,791	3,413	3	128	0.999…	0.850…	0.155…	0.919…	3.160…	2.769…	1.314…	1.467…
SNP	PASS	15,872	15,855	17	17,817	23	1,939	3	2	0.999…	0.999…	0.109…	0.999…	3.160…	3.027…	1.314…	1.295…

HG004 v2.0.0 vs. v3.0.1 without GIAB confidence set

Type	Filter	TRUTH.TOTAL	TRUTH.TP	TRUTH.FN	QUERY.TOTAL	QUERY.FP	FP.gt	FP.al	METRIC.Recall	METRIC.Precision	METRIC.F1_Score	TRUTH.TOTAL.TiTv_ratio	QUERY.TOTAL.TiTv_ratio	TRUTH.TOTAL.het_hom_ratio	QUERY.TOTAL.het_hom_ratio
INDEL	ALL	811	803	8	1,157	346	3	28	0.990…	0.701…	0.821…			2.955…	4.662…
INDEL	PASS	811	800	11	822	14	3	5	0.986…	0.983…	0.985…			2.955…	3.100…
SNP	ALL	17,810	17,796	14	22,068	4,272	6	226	0.999…	0.806…	0.893…	3.033…	2.769…	1.294…	1.467…
SNP	PASS	17,810	17,782	28	17,817	35	6	3	0.998…	0.998…	0.998…	3.033…	3.027…	1.294…	1.295…

HG005 v2.0.0 vs. v3.0.1 with GIAB confidence set

Type	Filter	TRUTH.TOTAL	TRUTH.TP	TRUTH.FN	QUERY.TOTAL	QUERY.FP	QUERY.UNK	FP.gt	FP.al	METRIC.Recall	METRIC.Precision	METRIC.Frac_NA	METRIC.F1_Score	TRUTH.TOTAL.TiTv_ratio	QUERY.TOTAL.TiTv_ratio	TRUTH.TOTAL.het_hom_ratio	QUERY.TOTAL.het_hom_ratio
INDEL	ALL	595	591	4	1,049	229	226	1	16	0.993…	0.722…	0.215…	0.836…			2.868…	4.005…
INDEL	PASS	595	585	10	776	7	181	1	3	0.983…	0.988…	0.233…	0.986…			2.868…	2.752…
SNP	ALL	15,301	15,295	6	22,779	2,962	4,522	4	129	1.000…	0.838…	0.199…	0.912…	3.139…	2.662…	1.176…	1.423…
SNP	PASS	15,301	15,283	18	17,391	18	2,090	4	3	0.999…	0.999…	0.120…	0.999…	3.139…	3.025…	1.176…	1.178…

HG005 v2.0.0 vs. v3.0.1 without GIAB confidence set

Type	Filter	TRUTH.TOTAL	TRUTH.TP	TRUTH.FN	QUERY.TOTAL	QUERY.FP	FP.gt	FP.al	METRIC.Recall	METRIC.Precision	METRIC.F1_Score	TRUTH.TOTAL.TiTv_ratio	QUERY.TOTAL.TiTv_ratio	TRUTH.TOTAL.het_hom_ratio	QUERY.TOTAL.het_hom_ratio
INDEL	ALL	775	769	6	1,049	276	1	30	0.992…	0.737…	0.846…			2.755…	4.005…
INDEL	PASS	775	762	13	776	10	1	6	0.983…	0.987…	0.985…			2.755…	2.752…
SNP	ALL	17,378	17,363	15	22,779	5,416	10	263	0.999…	0.762…	0.865…	3.026…	2.662…	1.177…	1.423…
SNP	PASS	17,378	17,341	37	17,391	50	10	8	0.998…	0.997…	0.997…	3.026…	3.025…	1.177…	1.178…

HG002 v3.0.1 vs `main` with GIAB confidence set

Type	Filter	TRUTH.TOTAL	TRUTH.TP	TRUTH.FN	QUERY.TOTAL	QUERY.FP	QUERY.UNK	FP.gt	FP.al	METRIC.Recall	METRIC.Precision	METRIC.Frac_NA	METRIC.F1_Score	TRUTH.TOTAL.TiTv_ratio	QUERY.TOTAL.TiTv_ratio	TRUTH.TOTAL.het_hom_ratio	QUERY.TOTAL.het_hom_ratio
INDEL	ALL	608	601	7	1,018	232	185	1	15	0.988…	0.721…	0.182…	0.834…			3.088…	4.126…
INDEL	PASS	608	597	11	756	3	156	1	1	0.982…	0.995…	0.206…	0.988…			3.088…	2.942…
SNP	ALL	15,284	15,272	12	21,328	3,086	2,970	3	99	0.999…	0.832…	0.139…	0.908…	3.202…	2.806…	1.297…	1.471…
SNP	PASS	15,284	15,260	24	16,926	21	1,645	3	0	0.998…	0.999…	0.097…	0.999…	3.202…	3.091…	1.297…	1.282…

HG002 v3.0.1 vs `main` without GIAB confidence set

Type	Filter	TRUTH.TOTAL	TRUTH.TP	TRUTH.FN	QUERY.TOTAL	QUERY.FP	FP.gt	FP.al	METRIC.Recall	METRIC.Precision	METRIC.F1_Score	TRUTH.TOTAL.TiTv_ratio	QUERY.TOTAL.TiTv_ratio	TRUTH.TOTAL.het_hom_ratio	QUERY.TOTAL.het_hom_ratio
INDEL	ALL	764	752	12	1,018	265	2	23	0.984…	0.740…	0.845…			2.907…	4.126…
INDEL	PASS	764	747	17	756	8	2	5	0.978…	0.989…	0.984…			2.907…	2.942…
SNP	ALL	16,929	16,906	23	21,328	4,422	4	184	0.999…	0.793…	0.884…	3.090…	2.806…	1.282…	1.471…
SNP	PASS	16,929	16,887	42	16,926	39	4	7	0.998…	0.998…	0.998…	3.090…	3.091…	1.282…	1.282…

HG004 v3.0.1 vs `main` with GIAB confidence set

Type	Filter	TRUTH.TOTAL	TRUTH.TP	TRUTH.FN	QUERY.TOTAL	QUERY.FP	QUERY.UNK	FP.gt	FP.al	METRIC.Recall	METRIC.Precision	METRIC.Frac_NA	METRIC.F1_Score	TRUTH.TOTAL.TiTv_ratio	QUERY.TOTAL.TiTv_ratio	TRUTH.TOTAL.het_hom_ratio	QUERY.TOTAL.het_hom_ratio
INDEL	ALL	950	917	33	1,147	23	199	2	1	0.965…	0.976…	0.173…	0.970…			5.032…	4.558…
INDEL	PASS	950	651	299	823	6	158	2	1	0.685…	0.991…	0.192…	0.810…			5.032…	3.064…
SNP	ALL	18,648	18,601	47	22,089	71	3,417	7	7	0.997…	0.996…	0.155…	0.997…	2.962…	2.765…	1.451…	1.469…
SNP	PASS	18,648	15,875	2,773	17,823	8	1,940	1	1	0.851…	0.999…	0.109…	0.919…	2.962…	3.026…	1.451…	1.295…

HG004 v3.0.1 vs `main` without GIAB confidence set

Type	Filter	TRUTH.TOTAL	TRUTH.TP	TRUTH.FN	QUERY.TOTAL	QUERY.FP	FP.gt	FP.al	METRIC.Recall	METRIC.Precision	METRIC.F1_Score	TRUTH.TOTAL.TiTv_ratio	QUERY.TOTAL.TiTv_ratio	TRUTH.TOTAL.het_hom_ratio	QUERY.TOTAL.het_hom_ratio
INDEL	ALL	1,145	1,106	39	1,147	33	4	7	0.966…	0.971…	0.969…			4.549…	4.558…
INDEL	PASS	1,145	803	342	823	12	4	5	0.701…	0.985…	0.819…			4.549…	3.064…
SNP	ALL	22,057	21,985	72	22,089	104	14	13	0.997…	0.995…	0.996…	2.769…	2.765…	1.465…	1.469…
SNP	PASS	22,057	17,809	4,248	17,823	14	3	2	0.807…	0.999…	0.893…	2.769…	3.026…	1.465…	1.295…

HG005 v3.0.1 vs `main` with GIAB confidence set

Type	Filter	TRUTH.TOTAL	TRUTH.TP	TRUTH.FN	QUERY.TOTAL	QUERY.FP	QUERY.UNK	FP.gt	FP.al	METRIC.Recall	METRIC.Precision	METRIC.Frac_NA	METRIC.F1_Score	TRUTH.TOTAL.TiTv_ratio	QUERY.TOTAL.TiTv_ratio	TRUTH.TOTAL.het_hom_ratio	QUERY.TOTAL.het_hom_ratio
INDEL	ALL	592	585	7	1,052	234	230	0	22	0.988…	0.715…	0.219…	0.830…			2.799…	4.063…
INDEL	PASS	592	580	12	775	7	185	0	5	0.980…	0.988…	0.239…	0.984…			2.799…	2.779…
SNP	ALL	15,299	15,292	7	22,769	2,969	4,508	3	132	1.000…	0.837…	0.198…	0.911…	3.139…	2.661…	1.176…	1.419…
SNP	PASS	15,299	15,281	18	17,382	23	2,078	3	2	0.999…	0.998…	0.120…	0.999…	3.139…	3.024…	1.176…	1.176…

HG005 v3.0.1 vs `main` without GIAB confidence set

Type	Filter	TRUTH.TOTAL	TRUTH.TP	TRUTH.FN	QUERY.TOTAL	QUERY.FP	FP.gt	FP.al	METRIC.Recall	METRIC.Precision	METRIC.F1_Score	TRUTH.TOTAL.TiTv_ratio	QUERY.TOTAL.TiTv_ratio	TRUTH.TOTAL.het_hom_ratio	QUERY.TOTAL.het_hom_ratio
INDEL	ALL	769	761	8	1,052	287	1	37	0.990…	0.727…	0.838…			2.689…	4.063…
INDEL	PASS	769	756	13	775	15	1	9	0.983…	0.981…	0.982…			2.689…	2.779…
SNP	ALL	17,387	17,362	25	22,769	5,407	9	266	0.999…	0.763…	0.865…	3.025…	2.661…	1.178…	1.419…
SNP	PASS	17,387	17,337	50	17,382	45	8	4	0.997…	0.997…	0.997…	3.025…	3.024…	1.178…	1.176…

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

data_versioning.md

data_versioning.md

Table of Contents

Introduction

Motivation

Pre-Discussion

Discussion

Whole-Genome and Whole-Exome Sequencing

RNA-Seq

GATK RNA-Seq short variant discovery best practices workflow

Workflow

Validation and Evaluation

HG002 v2.0.0 vs. v3.0.1 with GIAB confidence set

HG002 v2.0.0 vs. v3.0.1 without GIAB confidence set

HG004 v2.0.0 vs. v3.0.1 with GIAB confidence set

HG004 v2.0.0 vs. v3.0.1 without GIAB confidence set

HG005 v2.0.0 vs. v3.0.1 with GIAB confidence set

HG005 v2.0.0 vs. v3.0.1 without GIAB confidence set

HG002 v3.0.1 vs `main` with GIAB confidence set

HG002 v3.0.1 vs `main` without GIAB confidence set

HG004 v3.0.1 vs `main` with GIAB confidence set

HG004 v3.0.1 vs `main` without GIAB confidence set

HG005 v3.0.1 vs `main` with GIAB confidence set

HG005 v3.0.1 vs `main` without GIAB confidence set

Files

data_versioning.md

Latest commit

History

data_versioning.md

File metadata and controls

Table of Contents

Introduction

Motivation

Pre-Discussion

Discussion

Whole-Genome and Whole-Exome Sequencing

RNA-Seq

GATK RNA-Seq short variant discovery best practices workflow

Workflow

Validation and Evaluation

HG002 v2.0.0 vs. v3.0.1 with GIAB confidence set

HG002 v2.0.0 vs. v3.0.1 without GIAB confidence set

HG004 v2.0.0 vs. v3.0.1 with GIAB confidence set

HG004 v2.0.0 vs. v3.0.1 without GIAB confidence set

HG005 v2.0.0 vs. v3.0.1 with GIAB confidence set

HG005 v2.0.0 vs. v3.0.1 without GIAB confidence set

HG002 v3.0.1 vs main with GIAB confidence set

HG002 v3.0.1 vs main without GIAB confidence set

HG004 v3.0.1 vs main with GIAB confidence set

HG004 v3.0.1 vs main without GIAB confidence set

HG005 v3.0.1 vs main with GIAB confidence set

HG005 v3.0.1 vs main without GIAB confidence set

HG002 v3.0.1 vs `main` with GIAB confidence set

HG002 v3.0.1 vs `main` without GIAB confidence set

HG004 v3.0.1 vs `main` with GIAB confidence set

HG004 v3.0.1 vs `main` without GIAB confidence set

HG005 v3.0.1 vs `main` with GIAB confidence set

HG005 v3.0.1 vs `main` without GIAB confidence set