major refactor for scalability
v0.2.0
This was a large rewrite of somalier. The command-line usage is backward-incompatible with earlier
releases (but should not change moving forward). There is now a per-sample extract step:
somalier extract -d extracted/ -s $sites_vcf -f $fasta $sample.cram
followed by a relate step:
somalier relate --ped $ped extracted/*.somalier
This enables parallelization by sample across nodes. The resulting extracted binary "somalier"
files are only ~220KB per sample, so reading them is nearly instant, and the relate step
runs in 10 seconds for my 603-sample test case. That makes adjusting pedigree files, or removing
samples and re-running, a much faster process.
This means we can add a single new (n+1) sample and, once it is extracted, compare it to an entire cohort in a few seconds.
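As a rough sketch of how the two steps fit together, the script below prints one extract command per CRAM followed by a single relate command, using only the flags shown above. The default file names (`sites.vcf.gz`, `reference.fa`, `cohort.ped`) and the helper functions `extract_cmd` and `relate_cmd` are placeholders for illustration, not part of somalier.

```shell
#!/usr/bin/env bash
# Placeholder inputs (assumptions): override via environment variables.
sites_vcf="${sites_vcf:-sites.vcf.gz}"
fasta="${fasta:-reference.fa}"
ped="${ped:-cohort.ped}"
outdir="${outdir:-extracted}"

# Print the extract command for a single alignment file.
extract_cmd() {
    printf 'somalier extract -d %s/ -s %s -f %s %s\n' \
        "$outdir" "$sites_vcf" "$fasta" "$1"
}

# Print the relate command over everything extracted so far.
relate_cmd() {
    printf 'somalier relate --ped %s %s/*.somalier\n' "$ped" "$outdir"
}

# One extract command per CRAM given on the command line, then a single relate.
for cram in "$@"; do
    extract_cmd "$cram"
done
relate_cmd
```

Each printed extract line is independent of the others, so they could be handed to separate nodes or a job scheduler; the relate line should run only after all of them finish. Adding an (n+1)th sample would then mean running one more extract command and repeating the relate step.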
somalier extract can also take a (multi-sample) VCF and create an identical "somalier" file
for cases when a VCF is available.
The sites files (linked below) are also greatly improved in this release (fewer sites, better accuracy).
For example, here is the output from the previous version:

compared to this version:

Note how on the bottom figure for this version, like colors (relationships indicated from a pedigree file) cluster more tightly than in the previous version.
This release also reports values for the X and Y chromosomes. These make it possible to compare observed and expected sex, which can help resolve sample swaps.
Install
This release comes with 2 linux binaries:
- somalier_static is a completely static binary and the recommended way to run somalier; just wget it, chmod +x it, get a sites file, and go.
- somalier_shared requires htslib (and libhts.so). Use this binary if you need to access S3 or https files.